Re: [debian-knoppix] Improved cloop driver



Hi Michael,

On Thu, Jul 01, 2004 at 10:08:01PM +1000, Michael Cohen wrote:
> Hi Everyone,
>   The cloop driver is the core behind knoppix. The problem with the existing 
> cloop is that it saves the index for the compressed blocks at the start of 
> the file, then followed by the compressed chunks. This requires the 
> compressor to store the compressed file in virtual memory until the final 
> size of the index is known so it can be saved in the end. This is the warning 
> in the man page: 
> 
> create_compressed_fs(1) Make sure you have  enough  swap  to  hold  the
>        entire  compressed image in virtual memory! 
> 
> I have modified the cloop driver to produce a new loop driver that creates a 
> slightly different file format. There are basically 2 improvements:
> 1) the index is placed at the end, so the compressor just writes compressed 
> blocks until it gets to the end, and then dumps the index.

This is a disadvantage, in my opinion, because some CD-R media tend to
become unreliable, or at least trigger more CRC re-checks, toward the
end of the medium. It is better to place the index at the front of the
file, which is what the current cloop version does.

Plus, I'm not especially happy with frequent changes to the internal
cloop file format, of course. ... :-/

> 2) The sizes of the compressed blocks are interleaved between the blocks. This 
> allows the index to be reconstructed in the event that the index is missing 
> (i.e. the file is truncated for example). This makes the compressed format 
> much more robust.


That adds overhead, and it is only necessary if the index is NOT in the
header. ;-)
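
To make the trade-off concrete, here is a minimal sketch of the
interleaved-size idea. It assumes a hypothetical layout in which every
zlib-compressed block is preceded by its length as a 4-byte big-endian
integer. This is NOT the actual sgzip or cloop on-disk format, and the
names write_blocks and rebuild_index are made up for illustration.

```python
import io
import struct
import zlib


def write_blocks(dst, data, block_size=65536):
    """Write each compressed block preceded by its 4-byte length.

    Hypothetical layout for illustration only, not the sgzip format.
    """
    for i in range(0, len(data), block_size):
        chunk = zlib.compress(data[i:i + block_size])
        dst.write(struct.pack(">I", len(chunk)))
        dst.write(chunk)


def rebuild_index(img):
    """Walk the length prefixes and return (offset, length) per block.

    Only complete blocks are listed, so a truncated image still yields
    an index for everything before the damage.
    """
    index = []
    img.seek(0)
    while True:
        header = img.read(4)
        if len(header) < 4:
            break  # clean end of file, or a truncated length field
        (length,) = struct.unpack(">I", header)
        start = img.tell()
        if len(img.read(length)) < length:
            break  # the final block is truncated
        index.append((start, length))
    return index
```

In this assumed layout the overhead is 4 bytes per block, and rebuilding
the index means touching the whole file once.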

> The reason for these changes is because this compressed format is used in a 
> digital forensic package called pyflag (http://pyflag.sourceforge.net). In 
> this context, compressed images can be taken of entire hard disks 
> (multi-gigabytes) and then the modified cloop driver can be used to mount 
> them in ro mode. The new compressed format is called "sgzip", but it may be 
> best if the cloop format was changed in future to incorporate the changes.

A better approach would be, IMHO, to first write an empty index into
the header (this is possible if you know in advance how many blocks must
be compressed), then write the compressed blocks, and finally seek back
to the header and fill in the index with the computed offsets. This
would only require keeping the index pointers in memory, not the entire
compressed image, and would need no changes to the compressed file
format.
Disadvantage: images can no longer be written "on the fly", because you
need a seek()able storage device in order to rewrite the header.
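
Roughly, the reserve-then-seek-back idea could look like the sketch
below. It assumes a simplified, hypothetical layout (an index of 64-bit
big-endian offsets followed by zlib-compressed blocks), not the real
cloop header, and the names write_image and read_block are invented for
this example.

```python
import io
import struct
import zlib

BLOCK_SIZE = 65536  # assumed uncompressed block size for this sketch


def write_image(src, dst, total_size, block_size=BLOCK_SIZE):
    """Compress src into dst, keeping the offset index at the front.

    Only the list of offsets is held in memory, never the compressed
    image. dst must be seekable.
    """
    num_blocks = (total_size + block_size - 1) // block_size
    index_start = dst.tell()
    # Reserve room for num_blocks + 1 offsets; the extra entry marks
    # the end of the last block.
    dst.write(b"\0" * 8 * (num_blocks + 1))
    offsets = []
    for _ in range(num_blocks):
        offsets.append(dst.tell())
        dst.write(zlib.compress(src.read(block_size)))
    offsets.append(dst.tell())
    # Seek back and overwrite the placeholder with the real offsets.
    dst.seek(index_start)
    dst.write(struct.pack(">%dQ" % len(offsets), *offsets))
    dst.seek(offsets[-1])
    return offsets


def read_block(img, offsets, n):
    """Decompress block n using the offsets from the index."""
    img.seek(offsets[n])
    return zlib.decompress(img.read(offsets[n + 1] - offsets[n]))
```

The final dst.seek(index_start) is exactly the step that fails on a pipe
or any other non-seekable output.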


Regards
-Klaus Knopper
_______________________________________________
debian-knoppix mailing list
debian-knoppix@linuxtag.org
http://mailman.linuxtag.org/mailman/listinfo/debian-knoppix



