Master tar, gzip, bzip2, xz, and zip. Understand the difference between archiving and compression, when to use each tool, and how to automate backup pipelines from the command line.
Archiving bundles multiple files and directories into a single file, preserving structure, permissions, and timestamps. The classic tool is tar (Tape ARchive). By itself, tar does not compress; the result is usually somewhat larger than the source because each entry adds a header and is padded to a block boundary.
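The overhead is easy to see with a tiny file. This is a minimal sketch, assuming a POSIX shell with tar and mktemp available; the name one-byte.txt and the scratch directory are illustrative only. The exact archive size depends on the tar implementation and its block padding, so no specific number is claimed.

```shell
set -eu
work=$(mktemp -d)                 # scratch directory for the demo
trap 'rm -rf "$work"' EXIT        # clean up when the script exits
cd "$work"

printf 'x' > one-byte.txt         # a 1-byte source file
tar cf one.tar one-byte.txt       # archive only, no compression

# Compare sizes in bytes: the archive carries a header plus padding.
wc -c one-byte.txt one.tar
```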
Compression reduces file size by encoding data more efficiently using algorithms (LZ77, Huffman coding, etc.). Linux tools include gzip, bzip2, xz, and zip. Of these, gzip, bzip2, and xz work on single files and do not bundle directories; zip is the exception, since it archives and compresses in one step.
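To make the single-file behaviour concrete, here is a small sketch, assuming gzip, awk, and a POSIX shell; notes.txt and project/ are made-up names. It compresses one file, then points gzip at a directory, which it is expected to skip with a warning rather than bundle.

```shell
set -eu
work=$(mktemp -d)
trap 'rm -rf "$work"' EXIT
cd "$work"

# Build a repetitive text file, which compresses well.
awk 'BEGIN { for (i = 1; i <= 5000; i++) print "the same line again" }' > notes.txt

gzip -c notes.txt > notes.txt.gz  # -c writes to stdout, so notes.txt is kept
wc -c notes.txt notes.txt.gz      # compare sizes in bytes

mkdir project
cp notes.txt project/
# gzip handles files, not directories: it is expected to skip "project"
# with a warning and produce no project.gz.
gzip project 2>/dev/null || true
ls
```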
Combining both is the standard practice: use tar to bundle the files, then run the archive through a compressor. The resulting .tar.gz, .tar.bz2, or .tar.xz is both archived and compressed.
tar cf archive.tar folder/ creates an uncompressed archive.
tar czf archive.tar.gz folder/ creates a gzip-compressed archive.
The z flag tells tar to pipe through gzip. Use j for bzip2, J for xz.
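The commands above can be tried end to end in a throwaway directory. The sketch below assumes tar and gzip are installed; the file contents and the restore/ directory are illustrative. The bzip2 and xz variants are left commented out because those tools may not be present on every system.

```shell
set -eu
work=$(mktemp -d)
trap 'rm -rf "$work"' EXIT
cd "$work"

mkdir -p folder/sub
echo "hello" > folder/a.txt
echo "world" > folder/sub/b.txt

tar cf archive.tar folder/           # c = create, f = archive file name
tar czf archive.tar.gz folder/       # z = filter through gzip
# tar cjf archive.tar.bz2 folder/    # j = bzip2 (needs bzip2 installed)
# tar cJf archive.tar.xz folder/     # J = xz (needs xz installed)

tar tzf archive.tar.gz               # t = list contents without extracting

mkdir restore
tar xzf archive.tar.gz -C restore    # x = extract, -C = target directory
cat restore/folder/sub/b.txt
```

Extracting with -C into a separate directory avoids overwriting the original folder/ while experimenting.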
gzip: Fastest compression. Moderate ratio. Extension: .gz. Ubiquitous: installed on virtually every Unix-like system. The most common compression used with tar (tar itself applies none unless asked). Use when speed matters more than size.
bzip2: Better compression than gzip, slower speed. Extension: .bz2. Uses the Burrows-Wheeler transform. Good for distributing source code archives where size matters.
xz: Best compression ratio of the three. Slowest to compress, though decompression is typically faster than bzip2. Extension: .xz. LZMA2 algorithm. Used for kernel source and major software distributions. RAM-intensive during compression, especially at high levels.
zip: Cross-platform (Windows/Linux/Mac). Archives AND compresses in one step. Extension: .zip. Good for sharing with Windows users. Usually slightly less efficient than tar.gz because each file is compressed separately.
tar: No compression. Bundles files and preserves Unix metadata (permissions, ownership, symlinks). Extension: .tar. Foundation of most Linux backup pipelines.
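The metadata point can be checked directly. The following is a sketch under stated assumptions: a POSIX shell, tar, and a filesystem that supports symlinks. The names app/, run.sh, and latest are hypothetical. Ownership is not exercised here because restoring it generally requires root.

```shell
set -eu
work=$(mktemp -d)
trap 'rm -rf "$work"' EXIT
cd "$work"

mkdir app
printf '#!/bin/sh\necho hi\n' > app/run.sh
chmod 750 app/run.sh              # owner rwx, group r-x, others none
ln -s run.sh app/latest           # relative symlink inside the tree

tar cf app.tar app                # archive only, no compression
mkdir restore
tar xpf app.tar -C restore        # p = restore the stored permission bits

# Permission strings should match, and the symlink should survive.
ls -l app restore/app
```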
All of the tools covered here are lossless: decompression reconstructs the original data bit for bit. Lossy compression (JPEG, MP3) is for media where some quality loss is acceptable.
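A round trip shows what lossless means in practice. This sketch assumes gzip, dd, cmp, and /dev/urandom are available; the file names are illustrative. Random data barely compresses, but that does not matter here: the point is only that the bytes come back unchanged.

```shell
set -eu
work=$(mktemp -d)
trap 'rm -rf "$work"' EXIT
cd "$work"

# 64 KiB of random bytes as the test payload.
dd if=/dev/urandom of=original.bin bs=1024 count=64 2>/dev/null

gzip -c original.bin > original.bin.gz      # compress, keep the original
gzip -dc original.bin.gz > roundtrip.bin    # decompress to a new file

# cmp exits 0 only if the two files are byte-for-byte identical.
cmp original.bin roundtrip.bin && echo "round trip is identical"
```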
| Format | Extension | Archive? | Compress? | Speed | Use When |
|---|---|---|---|---|---|
| tar | .tar | Yes | No | Fast | Archiving with metadata |
| tar+gzip | .tar.gz | Yes | Yes | Fast | General purpose backup |
| tar+bzip2 | .tar.bz2 | Yes | Yes | Medium | Source code distributions |
| tar+xz | .tar.xz | Yes | Yes | Slow | Maximum compression needed |
| zip | .zip | Yes | Yes | Fast | Cross-platform sharing |
| gzip | .gz | No | Yes | Fast | Single file compression |
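
The size trade-offs in the table can be measured on your own data. The sketch below assumes tar, gzip, and awk; bzip2 and xz are used only if they are installed. The generated log file is a hypothetical stand-in, and results on highly repetitive text like this will not match results on real backups.

```shell
set -eu
work=$(mktemp -d)
trap 'rm -rf "$work"' EXIT
cd "$work"

mkdir data
awk 'BEGIN { for (i = 1; i <= 5000; i++) print "log line", i, "status=ok" }' > data/app.log

tar cf data.tar data                  # uncompressed baseline
gzip -c data.tar > data.tar.gz
# bzip2 and xz are optional here; skip them if they are not installed.
if command -v bzip2 >/dev/null 2>&1; then bzip2 -c data.tar > data.tar.bz2; fi
if command -v xz >/dev/null 2>&1; then xz -c data.tar > data.tar.xz; fi

wc -c data.tar*                       # sizes in bytes, one line per format
```

Prefixing each compression command with time gives a rough view of the speed column as well.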
Always verify your archive immediately after creation with tar tzf archive.tar.gz. A corrupted or incomplete archive discovered weeks later during a crisis is worthless. Test extraction to a temp directory at least monthly.
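One way to script that verification is sketched below, assuming tar, gzip, diff, and mktemp. The src/ tree and the date-based backup name are hypothetical placeholders for a real source directory and naming scheme. It runs three checks: the gzip stream's integrity, a full listing, and a test extraction compared against the source.

```shell
set -eu
work=$(mktemp -d)
restore=$(mktemp -d)                      # temp directory for the test restore
trap 'rm -rf "$work" "$restore"' EXIT
cd "$work"

mkdir -p src/etc
echo "setting=1" > src/etc/app.conf

backup="backup-$(date +%Y%m%d).tar.gz"    # hypothetical naming scheme
tar czf "$backup" -C src .                # archive the contents of src/

gzip -t "$backup"                         # check the compressed stream
tar tzf "$backup" > /dev/null             # listing reads the whole archive
tar xzf "$backup" -C "$restore"           # test extraction
diff -r src "$restore"                    # restored tree must match the source

echo "backup verified: $backup"
```

Because of set -e, the script stops at the first failed check, so the final message only prints when all three pass.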