
Re: checking integrity of already written CD/DVD

On 2009-03-31_09:53:51, Matus UHLAR - fantomas wrote:
> > >On Sun,29.Mar.09, 20:28:44, Angelin Lalev wrote:
> > >> Is there a way to check a written DVD against the checksum of the iso
> > >> image written on it?
> > In <[🔎] 20090329202842.GA3540@think.homelan>, Andrei Popescu wrote:
> > >$ md5sum /dev/dvd
> > >
> > >This should result in *exactly* the same checksum as the iso
> On 29.03.09 16:27, Boyd Stephen Smith Jr. wrote:
> > Not in my experience.  Both DVDs and CDs have a physical sector size. If the 
> > image is not a multiple of that sector size, the md5sum of the block device 
> > and the image will differ, because of the extra bits in the last physical 
> > sector.
> afaik, if the same image is written to multiple CDs/DVDs, they all should
> have the same md5sum, independently of its size. That is the one md5sum 
> should report. The same for sha1sum. 

Are you saying that two files that have different lengths (sizes)
should have the same md5 or sha1 digest? If you mean something else by
size, ignore this post, but ...

The design goal of both md5 and sha1 is to produce, for any file, a
message digest that differs from that of any other file that differs
from the first file in any way. If the file that is read from the
CD/DVD device is longer, or shorter, than the iso that was used to
burn the CD/DVD, then the two files differ in a way that is
significant to the message digest idea. For two files to have the same
message digest, they must be bit-for-bit identical. That means, at a
very minimum, they must have the same length. Indeed, the motivation
for the invention of sha1 was growing evidence that md5 was failing to
meet the design goal of "identical message digest only if the files
are identical".

IOW, a message digest algorithm must NOT ignore trailing zero bytes,
or trailing "garbage" bytes that have no effect on the meaning of the
file in its intended use.
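
A quick shell demonstration of that point (file names a.bin and b.bin
are just placeholders for illustration):

```shell
# Two files identical except for one trailing zero byte
# produce different digests.
printf 'hello' > a.bin
cp a.bin b.bin
printf '\0' >> b.bin      # b.bin is a.bin plus one trailing NUL byte
md5sum a.bin b.bin        # the two digests differ
```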

Trailing zero bytes are easy to truncate. If the truncated file has a
matching message digest, one can be reasonably confident that the file
with the trailing zeros will function properly. For trailing "garbage"
bytes it is difficult to assert that the bytes lost in truncation
really are garbage that may safely be ignored. 
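
The truncate-and-compare step might be sketched roughly as follows
(image.iso and dvd.raw are placeholder names for the original iso and
the longer read-back file; stat -c %s is the GNU coreutils form):

```shell
# Truncate a longer read-back file to the iso's exact length,
# then compare digests.
iso=image.iso
raw=dvd.raw
size=$(stat -c %s "$iso")            # length of the original iso in bytes
head -c "$size" "$raw" > truncated.bin
md5sum "$iso" truncated.bin          # should match if the burn was good
```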

I have two CD/DVD devices; one consistently reports longer files than
the other when reading the same CD/DVD. Luckily for me, the device
reporting the longer file always reads more than the length of the iso
that I used to burn. I truncate the longer file to match the iso and I
always get a matching message digest. If the message digest of the
leading part of the file matches that of the whole iso, and if the iso
is a well constructed image of what should be on a CD/DVD, then a
CD/DVD reader should never look at those extra bytes that dd reports
at the end. 

A second comment: In my experience, the iso files that I download from
Debian always have lengths that are integral multiples of 1024 bytes.
I think there is already some padding going on in the creation of these
files, so partial sectors in the iso are probably not an explanation
for whatever difficulties one may be having in verifying a CD/DVD.
(On doing a little quick research, I think the sector size on CD/DVD
may be 2048 bytes. I don't claim an integral multiple of 2048 because
that is not what I actually tested. I don't remember whether the
integer was odd or even, just that there was no remainder.)
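
Checking the alignment yourself is a one-liner (image.iso is a
placeholder name; adjust 2048 if your medium's sector size differs):

```shell
# Report the remainder when the iso's length is divided by the
# 2048-byte CD/DVD sector size; 0 means the image is sector-aligned.
size=$(stat -c %s image.iso)
echo $((size % 2048))
```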

My problems with the beautiful one- or two-line checking script are
indicative of a little extra complexity here. Manufacturers apparently
don't guarantee reliable, accurate end-of-file detection. If you are
unlucky and have hardware with unreliable EOF sensing, you need to
take extra measures to verify the accuracy of a burn.
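
One common workaround, which sidesteps EOF detection entirely, is to
read exactly as many 2048-byte sectors from the device as the iso
contains (a sketch; /dev/dvd and image.iso are placeholders, and this
assumes the iso's length is a whole number of sectors):

```shell
# Read exactly the iso's length from the device, so the drive's
# unreliable end-of-file reporting never comes into play.
iso=image.iso
blocks=$(( $(stat -c %s "$iso") / 2048 ))
dd if=/dev/dvd bs=2048 count="$blocks" 2>/dev/null | md5sum
md5sum "$iso"        # the two digests should agree
```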

Paul E Condon           
