Here are some small bash scripts that I use to aid in data management.


Recursively prints MD5 checksums for every file in a directory tree. The output can be redirected into a text file; if that file's name begins with "md5sum", it is left out of the listing.

find ./ -type f ! -iname "md5sum*" -exec md5sum {} \;
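
As a usage sketch, the command can be wrapped in a small function that writes the list to a file at the top of the tree. The function name `md5tree` and the output file name `md5sums.txt` are assumptions made for illustration; any name beginning with "md5sum" would be skipped in the same way.

```shell
#!/usr/bin/env bash
# Hypothetical helper: write checksums for everything under a directory
# into md5sums.txt at the top of that directory.
md5tree() {
	local dir="${1:-.}"
	(
		cd "$dir" || exit 1
		# The output file name starts with "md5sum", so find skips it
		# even though the redirect creates it before find runs.
		find ./ -type f ! -iname "md5sum*" -exec md5sum {} \; > md5sums.txt
	)
}
```

Something like `md5tree ~/data` followed later by `cd ~/data && md5sum --quiet -c md5sums.txt` should then report only the files whose contents have changed.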


Recursively finds all checksum files (md5sum*txt, *.md5, or md5sum) under the current directory, and verifies the data they list.


# Generate a temp file to hold the list of checksum files
MD5LIST="$(mktemp "$HOME/md5list-XXXXXXXX")"

echo "Finding all md5sums in target dir"
find "$PWD" -type f \( -iname "md5sum*txt" -o -iname "*.md5" -o -iname "md5sum" \) > "$MD5LIST"

while IFS= read -r md5path; do
	md5name="$(basename "$md5path")"

	# This part is here because an earlier md5dir script did not do this check.
	if grep -qF -- "$md5name" "$md5path"; then
		echo "Stripping self-reference from $md5path"
		grep -vF -- "$md5name" "$md5path" > "$md5path.tmp"
		mv "$md5path.tmp" "$md5path"
	fi

	# Run in a subshell so the cd does not carry over to the next iteration
	(
		cd "$(dirname "$md5path")" || exit
		echo "Checking: $md5path"
		echo -en "\033[0;31m"	# print any failures in red
		md5sum --quiet -c "$md5name"
		echo -en "\033[0m"	# reset the color
	)
done < "$MD5LIST"

# Clean up the temp file
rm -f "$MD5LIST"



Copy directory trees with a progress bar.

rsync -uav --info=progress2 "$1" "$2"
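
A minimal sketch of the same call wrapped in a function with an argument check. The name `copytree` is an assumption for illustration, and `--info=progress2` needs rsync 3.1.0 or later.

```shell
#!/usr/bin/env bash
# Hypothetical wrapper around the rsync call above; the name "copytree"
# is an assumption for illustration.
copytree() {
	if [ "$#" -ne 2 ]; then
		echo "usage: copytree SOURCE DEST" >&2
		return 2
	fi
	# -u skips files that are newer at the destination, -a preserves
	# permissions and times, -v lists files as they are transferred.
	rsync -uav --info=progress2 "$1" "$2"
}
```

Note how rsync treats a trailing slash on the source: `copytree photos /mnt/backup` creates /mnt/backup/photos, while `copytree photos/ /mnt/backup` copies the contents of photos directly into /mnt/backup.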


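Hot-remove a SATA drive. The argument is the kernel device name (for example sdb). Run it as root, after unmounting the drive's file systems.

echo 1 > "/sys/block/$1/device/delete"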
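
Because the write takes effect immediately, a guarded version may be safer. The sketch below refuses to run if the device does not exist or still appears in /proc/mounts. The function name `hot_remove` is hypothetical, and the mount check is an assumption about what is worth guarding against; it only catches direct mounts, not devices used through LVM or RAID.

```shell
#!/usr/bin/env bash
# Hypothetical wrapper; the name "hot_remove" is an assumption for illustration.
hot_remove() {
	local dev="$1"
	local node="/sys/block/$dev/device/delete"

	if [ -z "$dev" ] || [ ! -e "$node" ]; then
		echo "hot_remove: no such block device: ${dev:-<none>}" >&2
		return 1
	fi
	# Refuse to continue while the device or one of its partitions is
	# mounted. This is a prefix match, so it errs on the side of refusing.
	if grep -q "^/dev/$dev" /proc/mounts; then
		echo "hot_remove: /dev/$dev is still mounted" >&2
		return 1
	fi
	sync	# flush pending writes before detaching the device
	echo 1 > "$node"
}
```

It would be called as root with the bare device name, for example `hot_remove sdb`.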