diff --git a/zisofs.md b/zisofs.md new file mode 100644 index 0000000..99a60cf --- /dev/null +++ b/zisofs.md @@ -0,0 +1,203 @@ +`zisofs` is a compression format which is recognized by some Linux kernels. +Traditionally it is produced by program `mkzftree` from package +[zisofs-tools](http://freshmeat.net/projects/zisofs-tools/) and announced to +program `mkisofs` by option `-z`. + +A Linux kernel will transparently decompress file content if it +is modern enough and configured to do so. For checking the configuration, +try these shell commands: +``` +gunzip < /proc/config.gz | fgrep CONFIG_ZISOFS= +fgrep CONFIG_ZISOFS= /boot/config-$(uname -r) +``` + +libisofs can produce zisofs compressed content if it was linked with +`zlib` (`-lz`) at compile time. This is implemented by filter streams which +can be applied to the file nodes in an emerging ISO image. +With xorriso this can be controlled for individual files or +directory trees by option +``` +-set_filter_r --zisofs ...iso_rr_paths... -- +``` + +Already compressed files (e.g. produced by `mkzftree`) can be detected +automatically by an expensive test at image generation time like with +`mkisofs -z`. One may enable this by: +``` +-zisofs by_magic=on +``` + +By the appropriate uncompression filter it is possible to retrieve +the original file content via libisofs. +This filter is automatically applied to files from the loaded ISO image +which bear the ZF entry announcing zisofs compression. +If one wants to copy the files in compressed state, one may remove +that filter by +``` +-set_filter_r --remove-all-filters ...iso_rr_paths... -- +``` + +By default zlib compression level 6 is used. This may be changed to slowest +and most compact compression by xorriso option +``` +-zisofs level=9 +``` + +Relevant libisofs API functions (for details see `libisofs/libisofs.h`): +``` +iso_file_add_zisofs_filter() +iso_node_zf_by_magic() +iso_zisofs_get_refcounts() +iso_zisofs_set_params() +iso_zisofs_get_params() +iso_file_remove_filter() +iso_stream_get_input_stream() +``` + +------------ + +The following text is also available as libisofs/doc/zisofs_format.txt + +``` + + Description of the zisofs Format + + as of zisofs-tools-1.0.8 by H. Peter Anvin + and cdrtools-2.01.01a39 by Joerg Schilling + + For libburnia-project.org by Thomas Schmitt + - distribute freely , please report any errors or ambiguities - + + Apr 11 2009 + + +The zisofs format was invented by H. Peter Anvin. It compresses data file +content, marks it by a header and provides a pointer array for coarse random +access. Within a RRIP enhanced ISO 9660 image the format is additionally marked +by a System Use entry with signature "ZF". + +The uncompressed size of a single zisofs compressed file is restricted +to 4 GiB - 1. Larger files shall not be compressed. + + + File Header + +The file header has this layout (quoted from zisofs-tools-1.0.8/mkzftree.c): + Byte offset iso9660 type Contents + 0 (8 bytes) Magic number (37 E4 53 96 C9 DB D6 07) + 8 7.3.1 Uncompressed file size + 12 7.1.1 header_size >> 2 (currently 4) + 13 7.1.1 log2(block_size) + 14 (2 bytes) Reserved, must be zero +So its size is 16. +7.3.1 means little endian 4-byte words. 7.1.1. means unsigned single bytes. + +Readers shall be able to handle log2(block_size) values 15, 16 and 17 +i.e. block sizes 32 kB, 64 kB, and 128 kB. Writers must not use other sizes. + + + Block Pointers + +There are ceil(input_size / block_size) input and output blocks. +Each input block is of fixed size whereas the output blocks have varying +size (down to 0). For each output block there is an offset pointer giving +its byte address in the overall file content. The next block pointer in the +array tells the start of the next block which begins immediately after the +end of its predecessor. A final pointer gives the first invalid byte address +and thus marks the end of the last block. + +So there are ceil(input_size / block_size) + 1 block pointers. +They are stored as an array of 4-byte values which are in ISO 9660:7.3.1 format +directly after the file header, i.e. beginning at byte 16. + + + Data Part + +The data part begins immediately after the pointer array. In principle it +consists of the variable length output blocks as delivered by zlib function +compress2() when fed with the fixed size input blocks. + +A special case of input and output block is defined: +Zero-length blocks represent a block full of 0-bytes. +Such input blocks do not get processed by compress2() but shall be mapped to +0-sized output directly. Vice versa 0-sized blocks have to bypass uncompress() +when being read. + + + ZF System Use Entry Format + +ZF may only be applied to files with a single extent and less than 4 GiB of +uncompressed size. + +The ZF entry follows the general layout of SUSP and RRIP. +Its fields are: + + [1] "BP 1 to BP 2 - Signature Word" shall be (5A)(46) ("ZF"). + + [2] "BP 3 - Length" shall specify as an 8-bit number the length in bytes of + the ZF entry recorded according to ISO 9660:7.1.1. + This length is 16 decimal. + + [3] "BP 4 - System Use Entry Version" shall be 1 as in ISO 9660:7.1.1. + + [4] "BP 5 to BP 6 - Algorithm" shall be (70)(7A) ("pz") to indicate + "paged zlib". + + [5] "BP 7 - Header Size Div 4" shall specify as an 8-bit number the number of + 4-byte words in the header part of the file data recorded according + to ISO 9660:7.1.1. + (This is a copy of header byte 12 / BP 13). + + [6] "BP 8 - Log2 of Block Size" shall specify as an 8-bit number the binary + logarithm of the compression block size recorded according to + ISO 9660:7.1.1. + (This is a copy of header byte 13 / BP 14. + The value has to be 15, 16 or 17 i.e. 32 kiB, 64 kiB, or 128 kiB.) + + [7] "BP 9 to BP 16 - Uncompressed Size" shall tell the number of uncompressed + bytes represented by the given extent. This field shall be recorded + according to ISO 9660:7.3.3. + (This number is the same as in header bytes 8 to 11 / BP 9 to BP 12.) + + | 'Z' | 'F' | LENGTH | 1 | 'p' | 'z' | HEADER SIZE DIV 4 | LOG2 BLOCK SIZE + | UNCOMPRESSED SIZE | + +ISO 9660:7.3.3 means 4-byte word in both byte orders, first little endian, then +big endian. +Example (block size 32 kiB, uncompressed file size = 1,234,567 bytes): + { 'Z', "F', 16, 1, 'p', 'z', 4, 15, + 0x87, 0xD6, 0x12, 0x00, 0x00, 0x12, 0xD6, 0x87 } + + +------------------------------------------------------------------------------- +Revoked specification aspects: + +A comment in zisofs-tools-1.0.8 indicates a special case of output block: + "a block the length of which is equal to the block size is unencoded." +This is not implemented in zisofs-tools and in the Linux kernel. Existing +zisofs enhanced ISO images might contain encoded blocks which could be +mistaken for unencoded blocks. +Therefore this rule is not part of this description and must not be +implemented. + +------------------------------------------------------------------------------- +References: + +zisofs-tools + http://freshmeat.net/projects/zisofs-tools/ + +zlib: + /usr/include/zlib.h + +cdrtools with mkisofs + ftp://ftp.berlios.de/pub/cdrecord/alpha + +ECMA-119 aka ISO 9660 + http://www.ecma-international.org/publications/files/ECMA-ST/Ecma-119.pdf + +SUSP 1.12 + ftp://ftp.ymi.com/pub/rockridge/susp112.ps + +RRIP 1.12 + ftp://ftp.ymi.com/pub/rockridge/rrip112.ps +```