================
 rtar.py Manual
================

:author: Marc 'BlackJack' Rintsch
:contact: marc@rintsch.de
:date: $Date: 2005-08-26 15:15:22 +0200 (Fri, 26 Aug 2005) $
:version: 0.3
:revision: $Rev: 755 $
:copyright: This document has been placed in the public domain.

.. meta::
   :description: Manual for the rtar.py script.
   :keywords: compression, solid, archive, RAR, tar, bzip2, gzip, Python

.. contents::
.. sectnum::

Name
====

rtar.py -- an archiver.


Synopsis
========
::

  rtar.py [-h|--help|--version]
  rtar.py [options] file(s)


Description
===========

The program creates compressed `tar` archives from files and
directories.  In contrast to the original ``tar``, it first builds a
list of file names and sorts it in a way that should give a better
compression ratio.

It also makes the common task of archiving one directory a bit easier
by providing a short option that infers the file name of the archive
from the name of the directory and the compression algorithm.  See
Examples_ for details.


Why sort the names?
-------------------

Way back in the old DOS days the RAR_ archiver had, and still has,
three advantages over ZIP archives when it comes to compression ratio:

1. RAR_ uses a slower but better compression algorithm than the
   standard `deflate` algorithm of ZIP archives,

2. it creates `solid` archives instead of compressing each file
   separately, so it benefits from redundancy between files,

3. and files are grouped by file name extensions in order to have
   files with similar contents close to each other and benefit from
   2. even more.

With `tar` archives, 1. holds if `bzip2` compression is used, and
2. always holds because a `tar` archive is created first and then
compressed as a whole.

But standard `tar` programs do not group files by file name extension.
This is what ``rtar.py`` does.
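
The grouping can be sketched in a few lines of Python.  This is only an
illustration and not the code of ``rtar.py``: the name
``extension_sort_key`` and the exact order used to break ties
(extension, then base name, then directory) are assumptions::

  import os

  def extension_sort_key(path):
      """Return a sort key that groups paths by file name extension."""
      directory, name = os.path.split(path)
      stem, extension = os.path.splitext(name)
      # Compare the extension first so that similar files become
      # neighbours; base name and directory only break ties.
      return (extension.lower(), stem.lower(), directory)

  names = ["a/x.py", "b/y.c", "a/z.c", "README"]
  sorted_names = sorted(names, key=extension_sort_key)

In this sketch files without an extension sort first, and the two
``.c`` files end up next to each other although they live in different
directories.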


.. Put some figures to compare ``tar`` and ``rtar.py`` on kernel, python
   documentation, etc. ::

     bj@s8n:~/src> l test-*    # About 38 MB source code.
     -rw-r--r--  1 bj users 8484733 2005-01-22 13:56 test-rtar.tar.bz2
     -rw-r--r--  1 bj users 8850233 2005-01-22 13:55 test-tar.tar.bz2


Requirements
============

The script requires Python_ 2.4 or higher.


Commandline Options
===================

  --version             show program's version number and exit
  -h, --help            show this help message and exit
  -o FILENAME, -f FILENAME, --file=FILENAME
                        write archive to this file instead of STDOUT.
  -a, --auto-name       infer archive file name from the first given directory
                        name.  This only works if there is just one directory
                        name given as argument.  The archive is named:
                        ``<directory_name>.tar[.<algorithm>]``
  --list                just dump the sorted file names to STDOUT -- don't
                        create archive.
  --compression=ALGORITHM
                        select compression algorithm from none, gzip or bzip2.
                        [bzip2]
  -z, --gzip            use gzip compression.
  -j, --bzip2           use bzip2 compression. [default]
  -b BLOCKS, --blocking-factor=BLOCKS
                        BLOCKS x 512 bytes per record [20]
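
The arithmetic behind ``-b`` can be sketched as follows.  The helper
names are hypothetical, and the sketch assumes the usual `tar`
convention that an archive is padded with zero bytes up to a whole
number of records::

  BLOCK_SIZE = 512  # bytes per tar block

  def record_size(blocking_factor=20):
      """Return the record size in bytes for a blocking factor."""
      return blocking_factor * BLOCK_SIZE

  def padded_size(size, blocking_factor=20):
      """Round ``size`` up to a whole number of records."""
      record = record_size(blocking_factor)
      remainder = size % record
      if remainder:
          return size + record - remainder
      return size

With the default factor of 20 a record is 10240 bytes.  With ``-b 1`` a
record is a single 512 byte block, so data that already ends on a block
boundary needs no further padding.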


Examples
========

Compress the contents of directories and all their subdirectories::

  rtar.py foo/ > foo.tar.bz2
  rtar.py -o foo_and_bar.tar.bz2 foo/ bar/
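
How a sorted list of names could be turned into such an archive is
sketched below with Python's standard ``tarfile`` module.  The function
``write_archive`` is hypothetical and written for Python 3; it is not
taken from ``rtar.py``::

  import tarfile

  def write_archive(names, fileobj, compression="bz2"):
      """Write the named files to ``fileobj`` as a tar stream.

      ``compression`` is "bz2", "gz" or "" for no compression.
      """
      # "w|..." selects the stream mode of tarfile, which does not
      # need to seek and therefore works with pipes and redirection.
      with tarfile.open(fileobj=fileobj, mode="w|" + compression) as archive:
          for name in names:
              # The list is already complete, so do not recurse.
              archive.add(name, recursive=False)

  # Possible use: write_archive(sorted_names, sys.stdout.buffer)

Passing an open file instead of ``sys.stdout.buffer`` would correspond
to the ``-o`` option.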

Create the archives `foo.tar.bz2` and `bar.tar.gz` with the auto
naming option::

  rtar.py -a foo/
  rtar.py --auto-name --gzip bar/
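
The naming rule ``<directory_name>.tar[.<algorithm>]`` can be sketched
like this.  The function name, the suffix table and the choice to use
only the last path component are assumptions made for the
illustration::

  import os

  # Assumed mapping from algorithm name to file name suffix.
  SUFFIXES = {"none": "", "gzip": ".gz", "bzip2": ".bz2"}

  def infer_archive_name(directory, algorithm="bzip2"):
      """Build an archive file name from a directory name."""
      # normpath() drops a trailing slash, so "foo/" becomes "foo".
      base = os.path.basename(os.path.normpath(directory))
      return base + ".tar" + SUFFIXES[algorithm]

Under these assumptions ``infer_archive_name("foo/")`` gives
``foo.tar.bz2`` and ``infer_archive_name("bar/", "gzip")`` gives
``bar.tar.gz``, matching the two commands above.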


History
=======

0.3.0: 2005-08-26
  Added ``-b``/``--blocking-factor`` option.  Setting it to 1 prevents
  some blocks full of zero bytes from being appended to the archive.
  This may save some bytes, but generally those blocks are compressed
  very effectively anyway.

  The program does not crash anymore if it comes across files that
  can't be read.  A warning is printed instead.

  Directories given at the command line are archived now too.  Before
  this fix only the contents of the directory were archived but not
  the top level directory name itself.

0.2a : 2005-01-23
  Fixed a really stupid bug that made it impossible to create
  archives by redirecting the output into a file.

  While compressing, each file name is written to `stderr`, prefixed
  with the percentage of files already processed.

  The user can select the compression algorithm (`none`, `gzip` or
  `bzip2`) and the `auto naming` feature (``-a``) was implemented.

0.1a : 2005-01-22
  Initial release.  Can be used to create archives but has a severe
  bug: it silently ignores problems while creating the file list.


ToDo
====

- Sort extensions by extension list instead of alphabetically.
- Group backup files (\*{~,.bck,.bak}) with their "master" files and
  sort just by name without extension.

- Exclude list/patterns.
- Option to add a prefix to every file.
- Change attributes like uid/gid, uname/gname.
- Color output with ANSI escape sequences if `stderr` is a `tty`.


Bugs
====

- Silently ignores problems while creating the file list.


.. See also
.. ========


Copyright
=========

Copyright © 2005 Marc 'BlackJack' Rintsch <marc@rintsch.de>

This is free software; see the source for copying conditions. There is
NO warranty; not even for MERCHANTABILITY or FITNESS FOR A PARTICULAR
PURPOSE.

.. _Python: http://www.python.org/
.. _RAR: http://www.rarsoft.com/
