Archiving and Compression in Linux

Archiving and compression are commonly used to store data in less space and to bundle many files into a single distribution package.  For example, the "tar" program is often used for software distribution.  If we had 100 files to distribute to the public, sending them one by one would be very tedious.  Using "tar" we can combine all 100 files into one file.  Here are some of the common archiving and compression utilities in Linux:

Directory Archive with tar

The tar command creates a single archive file, conventionally with the extension .tar, called a tarball.  The archive records the files together with their directory structure; tar by itself does not compress them.  The following command makes a tarball from all files in the specified directory, including its sub-directories.  The tarball is created in the current directory.

$ tar -cf myball.tar /home/dave/mefiles/

To extract the tarball, use the command:

$ tar -xf myball.tar

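The files in the myball.tar tarball will be extracted into the current directory, recreating the stored directory structure.  GNU tar strips the leading "/" from stored paths, so with the archive above the files land under home/dave/mefiles/ relative to the current directory.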

To list the contents of an archive without extracting it, use the -t switch together with -f:

$ tar -tf myball.tar
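
The three tar operations above can be tried end to end in a scratch directory.  This is only a minimal sketch: the directory and file names (mefiles, notes.txt, restore) are made up for illustration, and -C is used to choose the directory that tar extracts into.

```shell
#!/bin/sh
set -e

# Work in a scratch directory so nothing real is touched.
work=$(mktemp -d)
cd "$work"

# Build a small sample tree (hypothetical names).
mkdir -p mefiles/sub
echo "hello" > mefiles/notes.txt
echo "world" > mefiles/sub/more.txt

# c = create, f = the archive file name follows.
tar -cf myball.tar mefiles

# t = list the contents without extracting.
listing=$(tar -tf myball.tar)
echo "$listing"

# x = extract; -C switches to the target directory first.
mkdir restore
tar -xf myball.tar -C restore
restored=$(cat restore/mefiles/sub/more.txt)
echo "$restored"
```

Archiving a relative path such as mefiles, rather than an absolute one, keeps the stored names short and makes it obvious where the files will land on extraction.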

Directory Archive with cpio

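The cpio format is used as the basis for RPM packages.  Unlike tar, cpio does not recurse into sub-directories by itself: it archives only the names it reads from standard input, so it must be passed the full list of files and directories.  It is said to be more robust than tar when media errors are encountered.

The following command creates a cpio archive of the files in the current directory (top level only, since ls does not descend into sub-directories):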

$ ls | cpio -o > mybackup.cpio

Use the -t option to list the contents before extracting files:

$ cpio -i -t < mybackup.cpio

To extract all the files from the archive and restore the directory structure stored inside it, navigate to the desired top-level directory and run:

$ cpio -id < mybackup.cpio

The gzip Compression Utility

The gzip command is an improvement over the older "compress" utility and achieves higher compression ratios.  It is often used to distribute and archive files.

The following command compresses the file mydata into mydata.gz with the highest compression level; the original file is replaced by the compressed one:

$ gzip -9 mydata

The following command uncompresses a .gz file and restores the original mydata:

$ gunzip mydata.gz
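
Since gzip replaces its input by default, it can be handy to write the compressed data to standard output with -c and keep the original.  The following is a sketch of such a round trip; the sample contents are made up and deliberately repetitive so that they compress well.

```shell
#!/bin/sh
set -e

work=$(mktemp -d)
cd "$work"

# Highly repetitive sample data (hypothetical contents).
i=0
while [ "$i" -lt 2000 ]; do
    echo "the same line again and again"
    i=$((i + 1))
done > mydata

# -9 selects the best (slowest) compression; -c writes to stdout,
# so mydata itself is left in place.
gzip -9 -c mydata > mydata.gz

orig_size=$(wc -c < mydata)
gz_size=$(wc -c < mydata.gz)
echo "original: $orig_size bytes, gzip -9: $gz_size bytes"

# Decompress to a new name and confirm the contents are identical.
gunzip -c mydata.gz > mydata.out
cmp mydata mydata.out && echo "round trip OK"
```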

The bzip2 Compression Utility

bzip2 is a newer compression program than gzip.  It uses a different compression algorithm and often achieves noticeably higher compression ratios, usually at the cost of speed.

The following command compresses the file mydata to mydata.bz2.  bzip2 uses its highest compression setting (-9) by default, so no extra option is needed:

$ bzip2 mydata

To uncompress the file, use the command:

$ bunzip2 mydata.bz2
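
How much bzip2 gains over gzip depends on the data, so it is worth measuring on a sample of your own files.  The sketch below compresses one made-up file both ways and prints the sizes; it assumes both tools are installed and makes no claim about which one wins on any particular input.

```shell
#!/bin/sh
set -e

work=$(mktemp -d)
cd "$work"

# Sample text input (hypothetical); replace with a real file to compare.
i=0
while [ "$i" -lt 5000 ]; do
    echo "record $i: status=ok user=dave path=/home/dave/mefiles"
    i=$((i + 1))
done > mydata

# -c writes to stdout so the original is kept for both runs.
gzip -9 -c mydata > mydata.gz
bzip2 -9 -c mydata > mydata.bz2

orig_size=$(wc -c < mydata)
gz_size=$(wc -c < mydata.gz)
bz_size=$(wc -c < mydata.bz2)
echo "original: $orig_size  gzip: $gz_size  bzip2: $bz_size"

# Verify the bzip2 file decompresses back to the original bytes.
bunzip2 -c mydata.bz2 | cmp - mydata && echo "round trip OK"
```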
