Date: Fri, 19 Mar 1999 12:23:47 +0100
From: Hans-Bernhard Broeker
Message-Id: <199903191123.MAA08469@acp3bf.physik.rwth-aachen.de>
To: djgpp AT delorie DOT com
Subject: Re: (fwd) Compression
Newsgroups: comp.os.msdos.djgpp
Organization: RWTH Aachen, III. physikalisches Institut B
X-Newsreader: TIN [version 1.2 PL2]
Reply-To: djgpp AT delorie DOT com

In article <36F17BC3 DOT 78B926E4 AT cableol DOT co DOT uk> you wrote:

> I'm sure I heard somewhere that tgz's are based around the same
> algorithm as zips, so why the mega space saving? (Perhaps because
> they use a different algorithm?)

No, the packing algorithm itself is 100% identical. The difference
between .zip and .tgz is in the stuff it's packing: single files *in*
the archive, or the whole archive as one.

[...]
> DJ Delorie wrote:
[...]
> > file    zip     tgz
> > djdev   1.42M   1.36M
> > djlsr   1.45M   0.87M

Just to complement what DJ already answered to this: note the
difference between the given examples. djdev gains much less than
djlsr does from the use of the tgz format. In essence, that's because
djlsr contains a really enormous number of quite similar, and very
*small*, files. That's exactly the situation where zip's approach of
packing each file individually is rather inefficient. Packing works,
roughly, by finding and exploiting repetitions in the input, but
inside a single small file there's not much repetition to be found,
and thus little to be gained from exploiting it.

Some people have reported that you can get even a bit better than
.tar.gz by using .zip.gz or .zip.zip, i.e.:

	zip -0 temparchive contained_files...
	zip -9 archive temparchive.zip

(or, to much the same effect, run 'gzip -9' on the temporary archive
instead of the 'zip -9' step). The trick is that zip -0 makes a
slightly smaller and more easily packable package file than tar does.

-- 
Hans-Bernhard Broeker (broeker AT physik DOT rwth-aachen DOT de)
Even if all the snow were burnt, ashes would remain.
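
The effect described in the post (packing many small, similar files one by one versus packing them all as a single stream) can be illustrated with a rough sketch. This is not how zip or tar/gzip are actually implemented: it only uses Python's zlib module, which provides the same deflate method, and it ignores all archive headers. The file contents and the names make_files, per_file_size and solid_size are made up for the illustration.

```python
import zlib


def make_files(count=200):
    """Build hypothetical stand-ins for many small, similar source files."""
    template = (
        "/* Copyright notice shared by every file */\n"
        "#include <stdio.h>\n"
        "int func_{n}(int x)\n"
        "{{\n"
        "    return x + {n};\n"
        "}}\n"
    )
    return [template.format(n=n).encode("ascii") for n in range(count)]


def per_file_size(files):
    """zip-like approach: deflate every file on its own and add up the sizes."""
    return sum(len(zlib.compress(data, 9)) for data in files)


def solid_size(files):
    """tgz-like approach: concatenate everything, then deflate once."""
    return len(zlib.compress(b"".join(files), 9))


files = make_files()
raw = sum(len(data) for data in files)
individual = per_file_size(files)
solid = solid_size(files)

print("uncompressed total:     ", raw)
print("packed file by file:    ", individual)
print("packed as one stream:   ", solid)
```

With input like this, the single-stream figure should come out far below the file-by-file figure, because the repetition the packer can exploit lies between the files rather than inside any one of them. Real archives add per-member headers on top, so the exact ratios for djdev or djlsr cannot be read off this sketch.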