
Re: dvd md5sum conundrum



On Sunday 20 November 2005 12:35, scdbackup@gmx.net wrote:
>Hi,
>
>> Using that, I read it with dd if=/dev/cdrom bs=2048
>> count=595536|md5sum which gave the wrong answer, so I tried it with
>> 595535 and also got the wrong answer.
>>
>> Then I fired up kcalc ... READ DVD STRUCTURE ... Legacy lead-out ...
>> ... 26624 bytes, or 13 2048 byte blocks, is there a std for this
>> 'padding' that we could subtract, or are we doomed to use the iso's
>> actual size to determine the amount of data to feed to md5sum to make
>> it work in this scenario?  How about subtracting a
>> (MOD(64k)/size)/2048 on the mediainfo returned size?
>
>Your goal is quite demanding.
>
>You compute the checksum of a certain number of bytes which you then
>convert into a storage representation which is known to be fuzzy with
> the exact number of stored bytes. Then you begin to riddle how much of
> the fuzzy end belongs to your checksummed data and where the trailing
> garbage possibly might begin.
>
>> I'm open for ideas and insights.  This is a problem that the broken
>> dvd filesystem gives us, and it needs to be fixed in a joe six-pack
>> can use it manner.
>
>One has to distinguish between filesystem and DVD.
>
>Currently you are exploring the DVD aspect which is responsible
>for storing a byte array on media. That byte array does not
>necessarily have to be a filesystem.
>One has to be aware that the writing process is free to append
>readable data to the end of the byte array. It is also possible
>that old data or even virgin blocks are readable after the end
>of the array.
>
>Your original checksum was made from a filesystem image.
>Probably you should rather query the filesystem image on
>DVD in order to learn about the original image file's size.
>
>But that querying will make your method prone to small
>changes in the behavior of the image formatter or the writer.
>(Do mkisofs -pad bytes count as part of the filesystem ?
> They are part of the resulting file, at least.)
>
>
>Since you have to memorize the original MD5 for comparison
>with the DVD's MD5 anyway, why not just memorize the size
>of the original file too ?
>The pair (size,MD5) is an unambiguous fingerprint for
>media which deliver the original byte array plus some
>trailing garbage.
>The comparison relies entirely on information which is
>easy to determine at the time of writing the media. It does
>not rely on inner details of the particular byte array's
>semantics. This method is also independent of the writer
>software (growisofs, cdrecord-ProDVD, dvdrecord, whatever).
>
>Another approach would be to add some recognizable bytes as
>an end mark to the payload data when writing to media.

And that would no doubt require growisofs to be modified in some way to
achieve this.

>I myself consider a combination of both approaches (size +
>end mark) to be very useful.
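
As an illustration of the (size, MD5) fingerprint idea quoted above, here is a minimal Python sketch. The function names md5_of_prefix and matches_fingerprint are made up for this example, and an in-memory stream stands in for the disc. It hashes only the first `size` bytes of a stream and ignores whatever trails behind them.

```python
import hashlib
import io


def md5_of_prefix(stream, size, chunk_size=64 * 1024):
    """Return the MD5 hex digest of the first `size` bytes of `stream`.

    Raises ValueError if the stream ends before `size` bytes were read.
    """
    digest = hashlib.md5()
    remaining = size
    while remaining > 0:
        chunk = stream.read(min(chunk_size, remaining))
        if not chunk:
            raise ValueError("stream ended %d bytes early" % remaining)
        digest.update(chunk)
        remaining -= len(chunk)
    return digest.hexdigest()


def matches_fingerprint(stream, size, expected_md5):
    """Compare the stream's first `size` bytes against a recorded (size, MD5) pair."""
    return md5_of_prefix(stream, size) == expected_md5.lower()


if __name__ == "__main__":
    # Stand-in data: a small "image" plus 26624 bytes of trailing padding.
    payload = b"image data" * 1000
    recorded = (len(payload), hashlib.md5(payload).hexdigest())
    media = io.BytesIO(payload + b"\0" * 26624)
    print(matches_fingerprint(media, *recorded))
```

For a real disc the stream would presumably be something like open("/dev/cdrom", "rb"), with the size and MD5 noted down when the image was written.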

Well, considering that I did get the correct md5sum return when md5sum's
input was restricted to the exact size of the .iso image on the hard
drive, what we need is the ability to get, from the dvd, the size of
the image.  That doesn't appear to be available, or is this something
that is there, but just not read by the *info utils?  I have problems
with depending on the availability of the original .iso image for this.
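
For what it is worth, if the disc carries an ISO 9660 filesystem, its Primary Volume Descriptor records a block count and a block size, so a size can be read back from the media itself. The sketch below is only that, a sketch: iso9660_image_size is a name invented here, it assumes a single-session ISO 9660 image starting at block 0, and it does not settle whether the declared size covers any mkisofs -pad bytes.

```python
import io
import struct

SECTOR = 2048  # ISO 9660 volume descriptors occupy 2048-byte sectors


def iso9660_image_size(stream):
    """Return the size in bytes declared by the ISO 9660 Primary Volume Descriptor.

    Volume descriptors start at sector 16; type 1 is the primary one and
    type 255 terminates the descriptor set.
    """
    sector = 16
    while True:
        stream.seek(sector * SECTOR)
        desc = stream.read(SECTOR)
        if len(desc) < SECTOR or desc[1:6] != b"CD001":
            raise ValueError("no ISO 9660 volume descriptor at sector %d" % sector)
        if desc[0] == 1:
            # Volume space size: little-endian 32-bit block count at offset 80.
            (block_count,) = struct.unpack_from("<I", desc, 80)
            # Logical block size: little-endian 16-bit value at offset 128.
            (block_size,) = struct.unpack_from("<H", desc, 128)
            return block_count * block_size
        if desc[0] == 255:
            raise ValueError("descriptor set ended without a primary descriptor")
        sector += 1


if __name__ == "__main__":
    # Stand-in image: 16 empty sectors followed by a hand-built descriptor
    # that declares 1000 blocks of 2048 bytes (hypothetical numbers).
    pvd = bytearray(SECTOR)
    pvd[0] = 1
    pvd[1:6] = b"CD001"
    struct.pack_into("<I", pvd, 80, 1000)
    struct.pack_into("<H", pvd, 128, 2048)
    image = io.BytesIO(bytes(16 * SECTOR) + bytes(pvd))
    print(iso9660_image_size(image))
```

The returned number could then be used as the byte count fed to md5sum, provided it really equals the size of the original .iso file, which would have to be checked against a known image first.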

Bear in mind also that a 'cmp' between the .iso and the burnt disk
returns no differences up until it hits EOF on the src .iso.  This could be
used as an alternative check method as it seems to assure that they are
indeed identical up to that point.  My main concern there is that cmp
dies on the first error rather than listing them all until it hits the
EOF on one or the other input streams.  The fact that there is an error
at all is the important part, but it would be nice to know if it's a
repeatable pattern such as a scratched disk might output.
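
A comparison that keeps going after the first mismatch might look like the Python sketch below. differing_blocks is a hypothetical helper and the data is made up. It walks both streams in 2048-byte blocks, stops at the end of the reference image, and reports every block that differs. It does not handle a drive that returns a read error instead of bad data; that would surface as an exception. cmp's -l option, which prints every differing byte, may also be worth a look.

```python
import io


def differing_blocks(reference, copy, block_size=2048):
    """Yield the index of every block in which `copy` differs from `reference`.

    Comparison stops at the end of `reference`, so data trailing behind
    the image on `copy` is ignored.
    """
    index = 0
    while True:
        expected = reference.read(block_size)
        if not expected:
            return
        actual = copy.read(len(expected))
        if actual != expected:
            yield index
        index += 1


if __name__ == "__main__":
    # Stand-in data: a four-block "image" and a copy with two damaged
    # blocks plus 26624 bytes of trailing padding.
    good = bytes(range(256)) * 8 * 4
    bad = bytearray(good)
    bad[2048 + 5] ^= 0xFF      # damage block 1
    bad[3 * 2048] ^= 0xFF      # damage block 3
    burnt = io.BytesIO(bytes(bad) + b"\0" * 26624)
    print(list(differing_blocks(io.BytesIO(good), burnt)))  # prints [1, 3]
```

A list of block numbers like this would show at a glance whether the errors cluster or repeat at regular intervals.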

Thanks Thomas.

-- 
Cheers, Gene
"There are four boxes to be used in defense of liberty:
 soap, ballot, jury, and ammo. Please use in that order."
-Ed Howdershelt (Author)
99.36% setiathome rank, not too shabby for a WV hillbilly
Yahoo.com and AOL/TW attorneys please note, additions to the above
message by Gene Heskett are:
Copyright 2005 by Maurice Eugene Heskett, all rights reserved.


