Here are some small bash scripts that I use to aid in data management.

md5dir

Recursively prints the MD5 checksum of every file in a directory tree. The output can be redirected into a text file; files whose names begin with "md5sum" are skipped, so an output file named that way (for example md5sum.txt) is not hashed itself.

#!/bin/bash
find ./ -type f ! -iname "md5sum*" -exec md5sum {} \;
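
As a rough usage sketch, the command above can be wrapped so that it writes its output into the tree it describes. The function name make_manifest and the output name md5sum.txt are my own choices for illustration; any name starting with "md5sum" would be skipped by the find filter.

```shell
#!/bin/bash
# make_manifest is a hypothetical helper name; it runs the md5dir command
# inside the given directory and saves the result as md5sum.txt there.
make_manifest() {
    local dir="$1"
    # Use a subshell so the caller's working directory is left unchanged.
    (
        cd "$dir" || exit 1
        # md5sum.txt is created by the redirect before find runs, but the
        # -iname filter keeps it out of its own listing.
        find ./ -type f ! -iname "md5sum*" -exec md5sum {} \; > md5sum.txt
    )
}

# Usage sketch (the path is a placeholder):
#   make_manifest /data/photos
#   cd /data/photos && md5sum --quiet -c md5sum.txt
```

Because the recorded paths are relative (they start with ./), the checksum file has to be verified from the directory it was written in.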

md5check

Recursively finds all checksum files in the current directory tree (md5sum*txt, *.md5, or a file named md5sum) and verifies the files they list.

#!/bin/bash

# Generate a temp file to hold the list of checksum files
MD5LIST="$(mktemp)"

echo "Finding all md5sums in target dir"
find "$PWD" -type f \( -iname "md5sum*txt" -o -iname "*.md5" -o -iname "md5sum" \) > "$MD5LIST"

while IFS= read -r md5path; do

	md5name="$(basename "$md5path")"

	# This part is here because an earlier md5dir script did not do this check.
	if grep -q -- "$md5name" "$md5path"; then
		echo "Stripping self-reference from $md5path"
		sed -i "/$md5name/d" "$md5path"
	fi

	# Paths inside the checksum file are relative, so verify from its directory.
	cd "$(dirname "$md5path")" || continue
	echo "Checking: $md5path"
	echo -en "\033[0;31m"	# print any failures in red
	md5sum --quiet -c "$md5name"
	echo -en "\033[0m"	# reset the colour

done < "$MD5LIST"

rm "$MD5LIST"
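
The loop changes directory in the main shell, which is safe only because find is given "$PWD" and so emits absolute paths. A variation, sketched here with a hypothetical helper name (verify_manifest), does the cd inside a subshell and hands back md5sum's exit status, so a caller could count or log failures. This is one possible approach, not part of the script above.

```shell
#!/bin/bash
# verify_manifest is a hypothetical helper, not part of md5check.
# It verifies one checksum file from inside that file's directory and
# returns md5sum's exit status (0 means every listed file matched).
verify_manifest() {
    local md5path="$1"
    # The subshell keeps the cd from leaking into the calling shell.
    (
        cd "$(dirname "$md5path")" || exit 1
        md5sum --quiet -c "$(basename "$md5path")"
    )
}

# Usage sketch (the path is a placeholder):
#   if ! verify_manifest /data/photos/md5sum.txt; then
#       echo "Verification failed for /data/photos" >&2
#   fi
```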

rsync-cp

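Copies directory trees with rsync, showing an overall progress indicator. The -u flag skips files that are newer at the destination, and --info=progress2 needs rsync 3.1 or later.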

#!/bin/bash
rsync -uav --info=progress2 "$1" "$2"
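
One detail worth illustrating is rsync's trailing-slash rule for the source argument. The sketch below wraps the same command in a function (rsync_cp is a hypothetical name, and the paths are placeholders) and shows the two forms side by side.

```shell
#!/bin/bash
# rsync_cp is a hypothetical function form of the rsync-cp script.
rsync_cp() {
    rsync -uav --info=progress2 "$1" "$2"
}

# Usage sketch (paths are placeholders):
#   rsync_cp photos  /mnt/backup   # copies the directory: /mnt/backup/photos/...
#   rsync_cp photos/ /mnt/backup   # copies only the contents into /mnt/backup
```

Without the trailing slash the source directory itself is recreated under the destination; with it, only the directory's contents are copied.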

hotremove

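Hot-removes a SATA drive. Pass the kernel device name (for example sdb) as the argument, and run it as root after unmounting the drive.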

#!/bin/bash
echo 1 > "/sys/block/$1/device/delete"
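
Since the one-liner does no checking, a more defensive variant might look like the sketch below. The names safe_hotremove, SYSFS_BLOCK and MOUNTS_FILE are hypothetical; the two variables exist only so the paths can be overridden for a dry run. The mount check is deliberately simple and, as an assumption of this sketch, does not detect drives in use through LVM, RAID, or device-mapper.

```shell
#!/bin/bash
# Sketch only: safe_hotremove, SYSFS_BLOCK and MOUNTS_FILE are hypothetical names.
SYSFS_BLOCK="${SYSFS_BLOCK:-/sys/block}"
MOUNTS_FILE="${MOUNTS_FILE:-/proc/mounts}"

safe_hotremove() {
    local dev="$1"
    local node="$SYSFS_BLOCK/$dev/device/delete"

    # Refuse if no name was given or the device has no delete node.
    if [ -z "$dev" ] || [ ! -e "$node" ]; then
        echo "No such device: $dev" >&2
        return 1
    fi

    # Refuse if the disk or one of its partitions is listed as a mount source.
    if grep -q "^/dev/$dev" "$MOUNTS_FILE"; then
        echo "/dev/$dev is still mounted; unmount it first" >&2
        return 1
    fi

    # Flush pending writes, then ask the kernel to drop the device.
    sync
    echo 1 > "$node"
}

# Usage sketch (run as root; sdb is a placeholder):
#   safe_hotremove sdb
```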