Thursday, February 03, 2005

ITworld.com - Zipping your way to free space: Part 2

bash-2.03$ ./compressTest

real 1:55.6
user 1:41.1
sys 11.2
compress 57% reduction

real 54.6
user 29.1
sys 22.1
pack 26% reduction

real 2:33.0
user 2:23.6
sys 5.7
gzip 71% reduction

real 2:39.5
user 2:29.2
sys 6.1
zip 71% reduction

real 11:35.1
user 11:14.3
sys 4.5
bzip2 74% reduction

As you can see from these results, the best compression ratio came from bzip2, though that command took considerably longer to run than any of the others. Whether compression time deserves serious weight in choosing a tool probably depends on how many files you need to compress and whether you have to compress them within a narrow window of time.

The gzip and zip commands were neck and neck, both with respect to compression and with respect to running time. Since my calculation ignored decimal places, any small difference in the performance of these two tools has been lost to the simplicity of my math.
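If you want to see that fractional difference, one small tweak (just a sketch, not part of the script as published) is to let awk do the division in floating point instead of leaving it to expr's integer arithmetic. The variable names here are the same ones used in the script below:

reduction=`echo $orig_sz $comp_sz | awk '{printf "%.2f", 100 - $2 * 100 / $1}'`
echo $tool ${reduction}% reduction

Fed the gzip and zip sizes from the listing at the end of this column, both come out to roughly 70.58%; the two archives differ by only 119 bytes.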

The script that I used to generate these timing and compression results is shown below. Notice that I set the original file aside so that a fresh copy is available for each compression operation. Also, because the syntax of the zip command differs from the others -- it requires the name of the archive to be given along with the file being compressed -- I put the compression commands in a case statement.


#!/bin/bash

# prompt for a file name if one wasn't supplied on the command line
if [ "$1" == "" ]; then
    echo -n 'file> '
    read file
else
    file=$1
fi

orig_sz=`ls -l $file | awk '{print $5}'`

# set the original aside so that each tool starts from a fresh copy
mv $file $file.$$

for tool in compress pack gzip zip bzip2
do
    cp $file.$$ $file
    case $tool in
        compress) time compress $file
                  comp_sz=`ls -l $file.Z | awk '{print $5}'`
                  ;;
        pack)     time pack $file > /dev/null
                  comp_sz=`ls -l $file.z | awk '{print $5}'`
                  ;;
        gzip)     time gzip $file
                  comp_sz=`ls -l $file.gz | awk '{print $5}'`
                  ;;
        zip)      time zip $file.zip $file > /dev/null
                  comp_sz=`ls -l $file.zip | awk '{print $5}'`
                  ;;
        bzip2)    time bzip2 $file
                  comp_sz=`ls -l $file.bz2 | awk '{print $5}'`
                  ;;
    esac

    # compressed size as an integer percentage of the original, then the reduction
    percent=`expr $comp_sz \* 100 / $orig_sz`
    reduction=`expr 100 - $percent`
    echo $tool ${reduction}% reduction
done

# restore the original file
mv $file.$$ $file
# rm -i $file.*
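
If you save the script as compressTest and make it executable, you can run it with the file name as an argument, or with no argument at all (as in the run at the top of this column), in which case it prompts you for one:

chmod +x compressTest
./compressTest myfile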


Starting with a file named "myfile", you will end up with a series of files like those shown below. If you don't want to keep the compressed files around after you've completed your timing and efficiency tests, uncomment the rm command at the bottom of the script.

-rw-r--r-- 1 shs staff 291209216 Feb 3 10:14 myfile
-rw-r--r-- 1 shs staff 127372957 Feb 3 10:16 myfile.Z
-rw-r--r-- 1 shs staff 76466288 Feb 3 10:27 myfile.bz2
-rw-r--r-- 1 shs staff 85682214 Feb 3 10:20 myfile.gz
-rw-r--r-- 1 shs staff 217362875 Feb 3 10:18 myfile.z
-rw-r--r-- 1 shs staff 85682333 Feb 3 10:26 myfile.zip
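As a quick sanity check, you can feed any of these sizes back through the same integer arithmetic the script uses. The gzip figure, for example, should give back the 71% reported above (assuming your expr copes with numbers this large, as it evidently did when the script ran):

expr 100 - 85682214 \* 100 / 291209216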

Next week, we will examine the remaining compression considerations -- whether the compression algorithm is patented and whether the compression commands have other limitations that you need to consider.

"
