Description of libisofs MD5 checksumming

by Thomas Schmitt - mailto:scdbackup@gmx.net
Libburnia project - mailto:libburn-hackers@pykix.org
13 Aug 2009


MD5 is a 128 bit message digest with a very low probability to be the same for
any pair of differing data files. It is described in RFC 1321 and can be
computed e.g. by program md5sum.

libisofs can equip its images with MD5 checksums for the whole session and
for each single data file. See libisofs.h, iso_write_opts_set_record_md5().
The checksums get loaded together with the directory tree if this is enabled by
iso_read_opts_set_no_md5(). Loaded checksums can be inquired by
iso_image_get_session_md5() and iso_file_get_md5().
libisofs has its own MD5 computation functions: iso_md5_start(),
iso_md5_compute(), iso_md5_clone(), iso_md5_end().
See iso_file_get_stream(), iso_stream_open() et al. for reading file content
from the loaded image.


Representation in the Image

The checksums are stored in an area at the end of the session, in order to
allow quick loading from media with slow random access.
There is an array of MD5 entries and a single block with a checksum tag.

Location and layout of the checksum area are recorded as AAIP attribute
"isofs.ca" of the root node.
See doc/susp_aaip_2_0.txt for a general description of AAIP and
doc/susp_aaip_isofs_names.txt for the layout of "isofs.ca".

Because the inquiry of this attribute demands loading of the image tree,
there is also a checksum tag after the checksum area.
This tag can be detected on the fly when reading and checksumming the session
from the start point as learned from a media table of contents. It covers not
only the payload of the session but also the checksum area.

The single data files hold an index to their MD5 checksum in individual AAIP
attributes "isofs.cx". Index I means: array base address + 16 * I.

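The index arithmetic can be sketched in Python as follows. This is only an
illustration: the function names are hypothetical, and both the block size of
2048 bytes and the idea that the array base is known as a block address are
assumptions. The authoritative layout of "isofs.ca" is the one in
doc/susp_aaip_isofs_names.txt.

```python
BLOCK_SIZE = 2048   # assumption: usual ISO 9660 logical block size
MD5_SIZE = 16       # bytes per array entry, as stated in the text


def md5_entry_offset(array_start_block, index):
    """Byte offset of array entry `index`, counted from block 0 of the image.

    `array_start_block` is assumed to be the block address of the array.
    """
    if index < 0:
        raise ValueError("index must not be negative")
    return array_start_block * BLOCK_SIZE + MD5_SIZE * index


def read_md5_entry(image, array_start_block, index):
    """Read one 16-byte entry from a seekable binary file object."""
    image.seek(md5_entry_offset(array_start_block, index))
    entry = image.read(MD5_SIZE)
    if len(entry) != MD5_SIZE:
        raise EOFError("checksum array entry is truncated")
    return entry
```

With these assumptions, a file whose "isofs.cx" index is 3 would find its MD5
at 48 bytes after the start of the array.
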
The checksums cover the data content as it was actually written into the ISO
image stream, not necessarily as it was on hard disk before or afterwards.
This implies that content-filtered files bear the MD5 of the filtered data
and not of the original files on disk. When checkreading, one has to avoid
any filtering. Dig out the stream which directly reads image data by calling
iso_stream_get_input_stream() until it returns NULL and use
iso_stream_get_size() rather than iso_file_get_size().


The MD5 Array

If there are N checksummed data files then the array consists of N + 2 entries
with 16 bytes each.

Entry number 0 holds a session checksum which covers the range from the session
start block up to (but not including) the start block of the checksum area.
This range is described by attribute "isofs.ca" of the root node.

Entries 1 to N hold the checksums of individual data files.

Entry number N + 1 holds the MD5 checksum of entries 0 to N.

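The array layout can be checked with a short Python sketch. It assumes that
"the MD5 checksum of entries 0 to N" is computed over the concatenated 16-byte
binary entries, and that the raw array bytes and N have already been obtained
by other means. The function names are hypothetical.

```python
import hashlib

MD5_SIZE = 16  # bytes per array entry


def split_md5_array(raw, file_count):
    """Split the raw array into (session MD5, list of file MD5s, array MD5)."""
    entries = file_count + 2
    if len(raw) < entries * MD5_SIZE:
        raise ValueError("array is shorter than N + 2 entries")
    items = [raw[i * MD5_SIZE:(i + 1) * MD5_SIZE] for i in range(entries)]
    return items[0], items[1:-1], items[-1]


def array_is_consistent(raw, file_count):
    """Check that entry N + 1 equals the MD5 of entries 0 to N.

    Assumption: the checksum is taken over the binary entries as stored.
    """
    _, _, stored = split_md5_array(raw, file_count)
    covered = raw[:(file_count + 1) * MD5_SIZE]
    return hashlib.md5(covered).digest() == stored
```

A failed check indicates that the array itself is damaged, so the individual
file checksums should not be trusted either.

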
The Checksum Tag

The next block after the array begins with the checksum tag and is padded
by 0-bytes. The tag is a single line of printable text and has the following
format:

 libisofs_checksum_tag_v1 pos=# range_start=# range_size=# md5=# self=#\n

Example:
 libisofs_checksum_tag_v1 pos=81552 range_start=32 range_size=81520 md5=f172b994e8eb565a011d220b2a8b7a19 self=020975b2aa1189d455db2c09560b8732

There are five parameters. The first three are decimal numbers, the others
are strings of 32 hex digits.

pos=
  The block address where the tag supposes itself to be stored.
  If this does not match the block address where the tag is found then this
  either indicates that the tag is payload of the image or that the image has
  been relocated. (The latter makes the image unusable.)

range_start=
  The block address where the session is supposed to start. If this does not
  match the session start on media then the volume descriptors of the
  image have been relocated. (This can happen with overwriteable media. If
  checksumming started at LBA 0 and finds range_start=32, then one has to
  restart checksumming at LBA 32. See libburn/doc/cookbook.txt
  "ISO 9660 multi-session emulation on overwriteable media" for background
  information.)

range_size=
  The number of blocks beginning at range_start which are covered by the
  checksum of the tag.

md5=
  The checksum payload of the tag as lower case hex digits.

self=
  The MD5 checksum of the tag itself up to and including the last hex digit of
  parameter "md5=".

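Parsing the tag and checking its "self=" parameter can be sketched in Python
as follows. The function names are hypothetical. The regular expression
assumes exactly one blank between parameters and lower case hex digits for
"self=" as well, which matches the example above but is not stated as a rule.

```python
import hashlib
import re

TAG_RE = re.compile(
    rb"libisofs_checksum_tag_v1 pos=(\d+) range_start=(\d+) "
    rb"range_size=(\d+) md5=([0-9a-f]{32}) self=([0-9a-f]{32})\n"
)


def parse_tag(block):
    """Return the tag parameters as a dict, or None if the block does not
    begin with a tag whose self= checksum is valid."""
    m = TAG_RE.match(block)
    if m is None:
        return None
    # self= covers the text up to and including the last hex digit of md5=
    covered = block[:m.end(4)]
    if hashlib.md5(covered).hexdigest().encode("ascii") != m.group(5):
        return None
    return {
        "pos": int(m.group(1)),
        "range_start": int(m.group(2)),
        "range_size": int(m.group(3)),
        "md5": m.group(4).decode("ascii"),
    }


def make_tag(pos, range_start, range_size, md5_hex):
    """Build a tag line. Only meant for producing test input."""
    head = ("libisofs_checksum_tag_v1 pos=%d range_start=%d range_size=%d md5=%s"
            % (pos, range_start, range_size, md5_hex)).encode("ascii")
    return head + b" self=" + hashlib.md5(head).hexdigest().encode("ascii") + b"\n"
```

The caller still has to compare "pos=" and "range_start=" with the addresses
it actually read from, as described for these parameters.
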
The newline character at the end is mandatory. For now all bytes of the
block after that newline shall be zero. Future extensions may arise.
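
The on-the-fly check described under "Representation in the Image" can be
sketched in Python as follows. It rests on assumptions which are not
guaranteed by this text: a block size of 2048 bytes, a covered range that ends
directly before the tag block, and the same strict tag syntax as in the
example. The function name is hypothetical. The sketch does not implement the
restart at range_start which is needed with overwriteable media.

```python
import hashlib
import re

BLOCK_SIZE = 2048  # assumption: usual ISO 9660 logical block size

TAG_RE = re.compile(
    rb"libisofs_checksum_tag_v1 pos=(\d+) range_start=(\d+) "
    rb"range_size=(\d+) md5=([0-9a-f]{32}) self=([0-9a-f]{32})\n"
)


def check_session(stream, start_lba):
    """Read blocks from `stream`, which is positioned at block `start_lba`.

    Returns True if a fitting tag confirms the data read so far, False if it
    contradicts them, None if the stream ends without a fitting tag.
    """
    md5 = hashlib.md5()
    lba = start_lba
    while True:
        block = stream.read(BLOCK_SIZE)
        if len(block) < BLOCK_SIZE:
            return None
        m = TAG_RE.match(block)
        if m and hashlib.md5(block[:m.end(4)]).hexdigest().encode() == m.group(5):
            pos, start, size = (int(m.group(i)) for i in (1, 2, 3))
            # A tag which does not fit position and range is treated as payload
            if pos == lba and start == start_lba and size == lba - start_lba:
                return md5.hexdigest().encode() == m.group(4)
        md5.update(block)
        lba += 1
```

Because the data are checksummed while they are read sequentially, no seeking
and no loading of the directory tree is needed for this check.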