
dvd md5sum conundrum



Hi,

> Using that, I read it with dd if=/dev/cdrom bs=2048 count=595536|md5sum
> which gave the wrong answer, so I tried it with 595535 and also got the
> wrong answer.
> 
> Then I fired up kcalc ... READ DVD STRUCTURE ... Legacy lead-out ...
> ... 26624 bytes, or 13 2048 byte blocks, is there a std for this
> 'padding' that we could subtract, or are we doomed to use the iso's
> actual size to determine the amount of data to feed to md5sum to make it
> work in this scenario?  How about subtracting a (MOD(64k)/size)/2048 on
> the mediainfo returned size?
> 

Your goal is quite demanding.

You compute the checksum of a certain number of bytes, then write
them to a storage medium which is known to be fuzzy about the exact
number of stored bytes. Then you start puzzling over how much of the
fuzzy end belongs to your checksummed data and where the trailing
garbage might begin.


> I'm open for ideas and insights.  This is a problem that the broken
> dvd filesystem gives us, and it needs to be fixed in a joe six-pack can
> use it manner.

One has to distinguish between filesystem and DVD.

Currently you are exploring the DVD aspect which is responsible
for storing a byte array on media. That byte array does not
necessarily have to be a filesystem.
One has to be aware that the writing process is free to append
readable data to the end of the byte array. It is also possible
that old data or even virgin blocks are readable after the end
of the array.

Your original checksum was made from a filesystem image.
Probably you should rather query the filesystem image on
DVD in order to learn about the original image file's size.

But that querying will make your method prone to small
changes in the behavior of the image formatter or the writer. 
(Do mkisofs -pad bytes count as part of the filesystem?
 They are part of the resulting file, at least.)
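
For an ISO 9660 image, the size query mentioned above could be
sketched roughly like this. It reads the Primary Volume Descriptor
at sector 16 and multiplies the recorded volume space size by the
logical block size. The function name iso9660_size is only an
illustration of mine, and the sketch assumes a plain ISO 9660
filesystem. Whether the recorded size covers any padding depends on
the formatter, which is exactly the caveat above.

```python
import struct

SECTOR = 2048               # ISO 9660 sector size in bytes
PVD_OFFSET = 16 * SECTOR    # Primary Volume Descriptor sits at sector 16


def iso9660_size(path):
    """Return the filesystem size in bytes as recorded in the PVD.

    `path` may be an image file or a readable device node.
    """
    with open(path, "rb") as f:
        f.seek(PVD_OFFSET)
        pvd = f.read(SECTOR)
    # Descriptor type 1 plus the standard identifier "CD001"
    if len(pvd) < SECTOR or pvd[0] != 1 or pvd[1:6] != b"CD001":
        raise ValueError("no ISO 9660 Primary Volume Descriptor found")
    # Both fields are stored twice; use the little-endian copies.
    (blocks,) = struct.unpack_from("<I", pvd, 80)        # volume space size
    (block_size,) = struct.unpack_from("<H", pvd, 128)   # logical block size
    return blocks * block_size
```

The result is a byte count that can be compared with the size of
the original image file, if that is still known.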


Since you have to memorize the original MD5 for comparison
with the DVD's MD5 anyway, why not just memorize the size
of the original file too?
The pair (size,MD5) is an unambiguous fingerprint for
media which deliver the original byte array plus some
trailing garbage.
The comparison relies entirely on information which is
easy to determine at the time of writing the media. It does
not rely on inner details of the particular byte array's
semantics. This method is also independent of the writer
software (growisofs, cdrecord-ProDVD, dvdrecord, whatever). 
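
A minimal sketch of the (size,MD5) comparison, assuming the size
and the MD5 were recorded when the image was written. The names
md5_of_prefix and matches are made up for this example. The point
is that exactly `size` bytes are hashed, so whatever the drive
delivers after them is never looked at.

```python
import hashlib


def md5_of_prefix(path, size, chunk=64 * 1024):
    """Return the hex MD5 of exactly the first `size` bytes of `path`.

    `path` may be a regular file or a device node such as /dev/cdrom.
    """
    md5 = hashlib.md5()
    remaining = size
    with open(path, "rb") as f:
        while remaining > 0:
            data = f.read(min(chunk, remaining))
            if not data:
                # The medium delivers fewer bytes than were recorded.
                raise EOFError("medium is shorter than the recorded size")
            md5.update(data)
            remaining -= len(data)
    return md5.hexdigest()


def matches(path, size, expected_md5):
    """Compare the medium against the memorized (size, MD5) pair."""
    return md5_of_prefix(path, size) == expected_md5.lower()
```

Usage would be something like matches("/dev/cdrom", size, md5)
with the two values noted down at burn time.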

Another approach would be to add some recognizable bytes as
an end mark to the payload data when writing to media.

I myself consider a combination of both approaches (size +
end mark) to be very useful.
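
One possible shape of such a combination, purely as a sketch: a
trailer block which carries a tag plus the payload size, appended
right after the payload. A reader scans backwards from the readable
end and accepts only a trailer which records its own offset. The
tag ENDMARK1, the trailer layout and the function names are all
invented here and not any standard. The sketch also assumes that
the payload size is a multiple of 2048 bytes.

```python
import os
import struct

MAGIC = b"ENDMARK1"              # hypothetical 8-byte tag
BLOCK = 2048                     # media block size in bytes
TRAILER_LEN = len(MAGIC) + 8     # tag + 64-bit payload size


def make_trailer(payload_size):
    """Build one block: tag, big-endian payload size, zero padding."""
    return (MAGIC + struct.pack(">Q", payload_size)).ljust(BLOCK, b"\0")


def find_payload_size(path, max_blocks=1024):
    """Scan backwards for a trailer that records its own offset.

    Returns the payload size in bytes, or None if no valid trailer is
    found within `max_blocks` blocks of the readable end.
    """
    with open(path, "rb") as f:
        end = f.seek(0, os.SEEK_END)
        offset = end - end % BLOCK      # align down to a block boundary
        for _ in range(max_blocks):
            offset -= BLOCK
            if offset < 0:
                break
            f.seek(offset)
            head = f.read(TRAILER_LEN)
            if len(head) == TRAILER_LEN and head.startswith(MAGIC):
                (size,) = struct.unpack(">Q", head[len(MAGIC):])
                # The trailer sits directly after the payload, so the
                # recorded size must equal the trailer's own offset.
                if size == offset:
                    return size
    return None
```

The size found this way can then be used to decide how many bytes
to feed to the checksum, and a memorized size serves as a cross
check.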


Have a nice day :)

Thomas


