gzip or bzip2

Switching from gzip to bzip2 sometimes saves valuable storage space, but not for every file.

Quite some time ago I wrote a simple script that replaces a gzip archive with a bzip2 one whenever bzip2 compresses it better.

Extended with verbose output and an additional MD5 check of the result, the current script looks like this:

#!/bin/bash
################################################################################
# Convert gzip'ped files to bzip2 format, if that saves space.
################################################################################

VERBOSE="1"

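# Keep the arguments in an array so file names with spaces survive intact.
files=("$@")
# Without arguments, fall back to all .gz files in the current directory.
[ "${#files[@]}" -eq 0 ] && files=(*.gz)

for file_gz in "${files[@]}"; do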
        [[ "$file_gz" == *.gz ]] || continue
        [ -r "$file_gz" ] || continue
        [ -w "$(dirname "$file_gz")" ] || continue

        file_bz2="$(dirname "$file_gz")/$(basename "$file_gz" .gz).bz2"
        if [ -e "$file_bz2" ]; then
                echo "Cowardly refusing to overwrite $file_bz2."
                continue
        fi

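        # bzip2 compression (pipefail in the subshell also catches a failing zcat):
        [ -n "$VERBOSE" ] && echo "Compressing $file_gz with bzip2..."
        if ! (set -o pipefail; zcat "$file_gz" | nice bzip2 >"$file_bz2"); then
                # Do not leave a partial .bz2 file behind.
                rm -f "$file_bz2"
                continue
        fi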

        # Check size (keep the bz2 copy only if it is smaller).
        size_gz="$(stat -c "%s" "$file_gz")"
        size_bz2="$(stat -c "%s" "$file_bz2")"
        if [[ -z "$size_bz2" || "$size_bz2" -eq 0 || "$size_gz" -le "$size_bz2" ]]; then
                [ -n "$VERBOSE" ] && echo "Result is not smaller."
                rm -f "$file_bz2"
                continue
        fi
        [ -n "$VERBOSE" ] && echo "bzip2 compression wins benchmark: $size_gz > $size_bz2"

        # Additional md5 check.
        md5_gz="$(zcat "$file_gz" | md5sum)"
        md5_bz2="$(bzcat "$file_bz2" | md5sum)"
        if [ "$md5_gz" != "$md5_bz2" ]; then
                [ -n "$VERBOSE" ] && echo "MD5 check failed."
                rm -f "$file_bz2"
                continue
        fi
        [ -n "$VERBOSE" ] && echo "MD5 check passed."

        # Size is better, md5 is ok, then drop the original file.
        [ -n "$VERBOSE" ] && echo "Drop original file: $file_gz"
        rm -f "$file_gz"
done
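
If you first want to see whether a conversion would pay off without touching any files, a dry run along the following lines may help. This is only a sketch: report_bz2_size is a hypothetical helper that is not part of the script above, and it assumes zcat and bzip2 are installed. It recompresses the archive on the fly and only counts bytes.

```shell
#!/bin/bash
# Dry run: report how large a .gz file would be as .bz2, without writing anything.
# report_bz2_size is a hypothetical helper, not part of the conversion script.
report_bz2_size() {
        local file_gz="$1" size_gz size_bz2
        [ -r "$file_gz" ] || return 1

        # Size of the existing gzip archive in bytes.
        size_gz="$(wc -c <"$file_gz")" || return 1

        # Recompress on the fly and count the bytes; pipefail makes a
        # failing zcat or bzip2 abort the report.
        size_bz2="$(set -o pipefail; zcat "$file_gz" | nice bzip2 | wc -c)" || return 1

        # $(( )) strips any padding that wc may add around the number.
        echo "$file_gz: gzip=$((size_gz)) bzip2=$((size_bz2))"
}

# Report on every file given on the command line.
for f in "$@"; do
        report_bz2_size "$f"
done
```

Assuming the conversion script itself is saved as gz2bz2.sh (a made-up name), a whole directory tree could then be converted with something like: find /some/dir -name '*.gz' -exec ./gz2bz2.sh {} +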