File compression methods

3.1 Archiving Files on the Command Line (Weight: 2)

📘 Linux Essentials (LPI 010-160)


1. What is File Compression?

File compression is the process of reducing the size of a file by encoding its data more efficiently.

When a file is compressed:

  • The file size becomes smaller
  • It takes less disk space
  • It can be transferred faster across networks

The compressed file must later be decompressed to return to its original form.

Common Uses in IT Environments

Compression is frequently used in:

  • System backups to reduce backup storage size
  • Log file archiving on servers
  • Transferring files between servers
  • Packaging software source code
  • Storing historical system data

For example, a system administrator may compress large log files before storing them in a backup archive.


2. Compression vs Archiving

It is important to understand the difference:

Operation     Purpose
Archiving     Combines multiple files into one file
Compression   Reduces the file size

Example workflow:

multiple files → archive (.tar) → compress (.tar.gz)

Result:

backup.tar.gz

This process is very common in Linux systems.
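
The workflow above can be sketched as two explicit steps. This is a minimal illustration, assuming GNU tar and gzip are installed; the directory and file names (data, a.txt, b.txt) are hypothetical sample data created only for the demonstration.

```shell
# Work in a throwaway directory so nothing real is touched.
workdir=$(mktemp -d)
cd "$workdir" || exit 1

# Create a few sample files (names are illustrative only).
mkdir data
printf 'alpha\n' > data/a.txt
printf 'beta\n'  > data/b.txt

# Step 1: archive. tar combines the files into one uncompressed file.
tar -cf backup.tar data/

# Step 2: compress. gzip replaces backup.tar with backup.tar.gz.
gzip backup.tar

ls -l backup.tar.gz
```

Keeping the two steps separate makes the distinction visible: tar only bundles the files, and gzip only shrinks the resulting single file.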


3. Common Linux Compression Tools

The Linux Essentials exam mainly focuses on these compression methods:

Tool    File Extension   Description
gzip    .gz              Most common compression method
bzip2   .bz2             Higher compression ratio than gzip
xz      .xz              Very high compression ratio
zip     .zip             Common cross-platform format

Each method has its own commands for compressing and decompressing files.


4. gzip Compression

Overview

gzip is the most widely used compression tool in Linux.

It provides:

  • Good compression
  • Fast performance
  • Simple usage

It is often used with tar archives.

Compressed files usually end with:

.gz

Example:

backup.tar.gz

Compressing a File

Basic syntax:

gzip filename

Example:

gzip server.log

Result:

server.log.gz

Important behavior:

  • By default, gzip removes the original file once compression succeeds
  • Only the compressed file remains, unless the -k option is used

Decompressing a File

Use:

gunzip filename.gz

Example:

gunzip server.log.gz

This restores:

server.log

Viewing Compressed Files

You can view the contents of a compressed text file without decompressing it on disk by using:

zcat filename.gz

Example:

zcat server.log.gz

This displays the content in the terminal.
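
The gzip commands in this section can be tried together as one round trip. This is a sketch under assumptions: the log contents are invented sample data, and grep is used only to show that zcat output can be piped into other tools.

```shell
workdir=$(mktemp -d)
cd "$workdir" || exit 1

# Build a small sample log (contents are made up for illustration).
printf 'INFO start\nERROR disk full\nINFO stop\n' > server.log

gzip server.log            # server.log is replaced by server.log.gz
ls                         # lists only server.log.gz

# zcat writes the decompressed text to standard output;
# the .gz file on disk is left unchanged.
zcat server.log.gz | grep ERROR

gunzip server.log.gz       # server.log is back, server.log.gz is gone
ls
```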


Useful gzip Options

Option   Function
-d       Decompress
-k       Keep original file
-r       Recurse into directories, compressing each file individually
-v       Show the file name and compression ratio

Example:

gzip -v logfile
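
A short sketch of how these options behave in practice. The file and directory names are hypothetical, and the -k option assumes a reasonably recent gzip (older releases may not support it).

```shell
workdir=$(mktemp -d)
cd "$workdir" || exit 1

mkdir logs
printf 'line one\n' > logs/app.log
printf 'line two\n' > logs/db.log
printf 'standalone\n' > logfile

# -k keeps the original next to the compressed copy;
# -v reports the compression ratio for each file.
gzip -kv logfile

# -r walks the directory and compresses each file separately.
# It does not produce a single archive of the directory.
gzip -r logs

ls logfile* logs
```

Note that gzip -r leaves the directory in place with one .gz file per original file. To get a single file for a whole directory, tar is needed.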

5. bzip2 Compression

Overview

bzip2 usually achieves a higher compression ratio than gzip, but it is slower at both compressing and decompressing.

Compressed files use the extension:

.bz2

Example:

backup.tar.bz2

Compressing a File

bzip2 filename

Example:

bzip2 database.sql

Result:

database.sql.bz2

Decompressing a File

bunzip2 filename.bz2

Example:

bunzip2 database.sql.bz2

Viewing Without Extracting

bzcat filename.bz2

Example:

bzcat database.sql.bz2

Useful Options

Option   Description
-d       Decompress
-k       Keep original file
-v       Verbose output

Example:

bzip2 -k logfile
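
The bzip2 commands follow the same pattern as gzip. The sketch below assumes bzip2 is installed; the SQL line is placeholder content standing in for a real database dump.

```shell
workdir=$(mktemp -d)
cd "$workdir" || exit 1

# A stand-in for a database dump (contents are illustrative).
printf 'CREATE TABLE users (id INT);\n' > database.sql

bzip2 -k database.sql          # creates database.sql.bz2, keeps the original
bzcat database.sql.bz2         # prints the decompressed SQL to the terminal

rm database.sql                # remove the original to show a real restore
bunzip2 database.sql.bz2       # recreates database.sql, removes the .bz2
ls
```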

6. xz Compression

Overview

xz is a modern compression method that provides very high compression ratios.

It is commonly used for:

  • Linux distribution packages
  • Source code archives
  • Software repositories

Compressed files use:

.xz

Example:

backup.tar.xz

Compressing a File

xz filename

Example:

xz system_backup.tar

Result:

system_backup.tar.xz

Decompressing

unxz filename.xz

Example:

unxz system_backup.tar.xz

Viewing Content

xzcat filename.xz

Example:

xzcat system.log.xz

Useful Options

Option   Function
-d       Decompress
-k       Keep original file
-v       Verbose output
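
A possible way to try the xz commands and options above, assuming the xz utilities are installed. The sample log is generated with yes and head purely for illustration, and wc -c is used here only to compare sizes in bytes.

```shell
workdir=$(mktemp -d)
cd "$workdir" || exit 1

# Highly repetitive sample data compresses well.
yes 'sample log line' | head -n 1000 > system.log

xz -kv system.log                 # creates system.log.xz, keeps system.log
xzcat system.log.xz | head -n 3   # peek at the first lines

# Compare sizes in bytes before and after compression.
wc -c system.log system.log.xz

rm system.log
unxz system.log.xz                # restores system.log, removes the .xz
```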

7. zip Compression

Overview

zip is commonly used when files need to be shared between Linux, Windows, and macOS systems.

Unlike gzip, zip is both an archiver and a compressor. It can:

  • Combine multiple files into a single compressed archive
  • Store directory structures

Compressed files use:

.zip

Creating a Zip Archive

Syntax:

zip archive_name.zip files

Example:

zip project.zip report.txt script.sh config.cfg

Result:

project.zip

Compressing a Directory

zip -r archive.zip directory/

Example:

zip -r logs.zip logs/

The -r option makes zip recurse into the directory and include everything beneath it.


Extracting Zip Files

Use:

unzip archive.zip

Example:

unzip project.zip
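
The zip and unzip commands can be combined into a small round trip. This is a sketch under assumptions: zip and unzip may not be installed by default on some systems, the directory layout is invented sample data, and the -l and -d options of unzip are additions not covered above (they list an archive and choose a target directory, respectively).

```shell
workdir=$(mktemp -d)
cd "$workdir" || exit 1

mkdir -p logs/2024
printf 'jan\n' > logs/2024/jan.log
printf 'feb\n' > logs/2024/feb.log

# -r descends into logs/ and stores the whole tree in one archive.
# Unlike gzip, zip leaves the original files in place.
zip -r logs.zip logs/

unzip -l logs.zip              # list the contents without extracting

# Extract into a separate directory; -d names the target directory.
unzip logs.zip -d restored
ls -R restored
```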

8. Compression with tar

Compression is often used together with the tar archiving command.

Common compressed archive formats include:

Format     Command          tar option
.tar.gz    tar with gzip    -z
.tar.bz2   tar with bzip2   -j
.tar.xz    tar with xz      -J

Examples (-c creates an archive, -x extracts one, and -f names the archive file):

Create gzip archive:

tar -czf backup.tar.gz folder/

Create bzip2 archive:

tar -cjf backup.tar.bz2 folder/

Create xz archive:

tar -cJf backup.tar.xz folder/

Extract gzip archive:

tar -xzf backup.tar.gz

Extract bzip2 archive:

tar -xjf backup.tar.bz2

Extract xz archive:

tar -xJf backup.tar.xz
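
The create and extract commands above can be checked end to end. The sketch below uses the gzip variant and assumes GNU tar; the folder contents are sample data, and the -t (list) and -C (change directory) options are additions not covered above.

```shell
workdir=$(mktemp -d)
cd "$workdir" || exit 1

mkdir folder
printf 'hello\n' > folder/readme.txt

# c = create, z = filter through gzip, f = archive file name follows.
tar -czf backup.tar.gz folder/

# t = list the contents without extracting.
tar -tzf backup.tar.gz

# x = extract; -C switches to the given directory first.
mkdir restore
tar -xzf backup.tar.gz -C restore
cat restore/folder/readme.txt
```

Swapping z for j or J in these commands should give the bzip2 and xz variants, provided those tools are installed.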

9. Comparison of Compression Methods

Tool    Compression Speed   Compression Ratio   Typical Use
gzip    Fast                Medium              General use
bzip2   Slower              Higher              Large archives
xz      Slowest             Highest             Distribution packages
zip     Medium              Medium              Cross-platform sharing
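
The ratios in the table are general tendencies, and actual results depend on the data. One rough way to compare the tools on a given file is sketched below, assuming gzip, bzip2, and xz are all installed. The sample log is synthetic and highly repetitive, so its results will not match real-world files.

```shell
workdir=$(mktemp -d)
cd "$workdir" || exit 1

# Generate repetitive sample text; real data will behave differently.
yes 'GET /index.html 200' | head -n 20000 > sample.log

# -c writes the compressed stream to standard output,
# so sample.log itself is left untouched.
gzip  -c sample.log > sample.log.gz
bzip2 -c sample.log > sample.log.bz2
xz    -c sample.log > sample.log.xz

# Print the size of each file in bytes.
wc -c sample.log sample.log.gz sample.log.bz2 sample.log.xz
```

Prefixing each compression command with time gives a rough speed comparison as well.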

10. Important Exam Points (LPI 010-160)

Students preparing for the exam should remember:

  • Compression reduces file size
  • Archiving combines multiple files
  • gzip, bzip2, and xz compress single files
  • zip can archive and compress multiple files in one step
  • Common compressed file extensions include:
    • .gz
    • .bz2
    • .xz
    • .zip
  • tar is commonly used together with compression tools

Important commands to remember:

gzip
gunzip
bzip2
bunzip2
xz
unxz
zip
unzip
zcat
bzcat
xzcat

11. Summary

File compression is widely used in Linux to reduce storage usage and improve file transfer efficiency.

Linux provides several command-line compression tools:

  • gzip – the most commonly used compression tool
  • bzip2 – higher compression than gzip
  • xz – very high compression ratio
  • zip – cross-platform compressed archive format

System administrators frequently use compression when archiving logs, creating backups, distributing software packages, and transferring files between systems.

Understanding these tools and their commands is essential for working with files in Linux and for successfully passing the Linux Essentials (LPI 010-160) exam.
