Description of CD-TEXT
Guided by Leon Merten Lohse via libcdio-devel@gnu.org
by reading mmc3r10g.pdf from http://www.t10.org/ftp/t10/drafts/mmc3/
by docs and results of cdtext.zip from http://www.sonydadc.com/file/
by reading source of libcdio from http://www.gnu.org/s/libcdio
which quotes source of cdrecord from ftp://ftp.berlios.de/pub/cdrecord/alpha
Language codes were learned from http://tech.ebu.ch/docs/tech/tech3264.pdf
Genre codes were learned from libcdio and confirmed by
http://helpdesk.audiofile-engineering.com/index.php?pg=kb.page&id=123
For libburnia-project.org by Thomas Schmitt <scdbackup@gmx.net>
Content:
- CD-TEXT from the view of the user
- Content specifications of particular pack types
- Format of a CD-TEXT packs array
- Overview of libburn API calls for CD-TEXT
- Sony Text File Format (Input Sheet Version 0.7T)
-------------------------------------------------------------------------------
CD-TEXT from the view of the user:
CD-TEXT records attributes of disc and tracks on audio CD.
The attributes are grouped into blocks which represent particular languages.
Up to 8 blocks are possible.
There are 13 defined attribute categories, which are called Pack Types and are
identified by a single-byte code:
0x80 = Title
0x81 = Names of Performers
0x82 = Names of Songwriters
0x83 = Names of Composers
0x84 = Names of Arrangers
0x85 = Messages
0x86 = text-and-binary: Disc Identification
0x87 = text-and-binary: Genre Identification
0x88 = binary: Table of Content information
0x89 = binary: Second Table of Content information
(0x8a to 0x8c are reserved.)
0x8d = Closed Information
0x8e = UPC/EAN code of the album and ISRC code of each track
0x8f = binary: Size Information of the Block
Some of these categories apply to the whole disc only:
0x86, 0x87, 0x88, 0x89, 0x8d
Some have to be additionally attributed to each track, if they are present for
the whole disc:
0x80, 0x81, 0x82, 0x83, 0x84, 0x85, 0x8e
One describes the overall content of a block and in part of all other blocks:
0x8f
The total size of a block's attribute set is restricted by the fact that it
has to be stored in at most 253 records with 12 bytes of payload. These records
are called Text Packs.
A shortcut for repeated identical track texts is provided, so that a text
that is identical to the one of the previous track occupies only 2 or 4 bytes.
-------------------------------------------------------------------------------
Content specification of particular pack types:
Pack types 0x80 to 0x85 and 0x8e contain 0-terminated cleartext. If double byte
characters are used, then two 0-bytes terminate the cleartext.
The meaning of 0x80 to 0x85 should be clear from the list above. They are
encoded according to the Character Code of their block: either as ISO-8859-1
single byte characters, or as 7-bit ASCII single byte characters, or as MS-JIS
double byte characters.
More information about 0x8e is given below.
Pack type 0x86 (Disc Identification) is documented by Sony as "Catalog Number:
(use ASCII Code) Catalog Number of the album". So it is not really binary
but might be non-printable, and should contain only bytes with bit7 = 0.
Pack type 0x87 contains 2 binary bytes, followed by 0-terminated cleartext.
The two binary bytes form a big-endian index to the following list.
0x0000 = "Not Used" (Sony prescribes to use this if no genre applies)
0x0001 = "Not Defined"
0x0002 = "Adult Contemporary"
0x0003 = "Alternative Rock"
0x0004 = "Childrens Music"
0x0005 = "Classical"
0x0006 = "Contemporary Christian"
0x0007 = "Country"
0x0008 = "Dance"
0x0009 = "Easy Listening"
0x000a = "Erotic"
0x000b = "Folk"
0x000c = "Gospel"
0x000d = "Hip Hop"
0x000e = "Jazz"
0x000f = "Latin"
0x0010 = "Musical"
0x0011 = "New Age"
0x0012 = "Opera"
0x0013 = "Operetta"
0x0014 = "Pop Music"
0x0015 = "Rap"
0x0016 = "Reggae"
0x0017 = "Rock Music"
0x0018 = "Rhythm & Blues"
0x0019 = "Sound Effects"
0x001a = "Spoken Word"
0x001b = "World Music"
Sony documents the cleartext part as "Genre information that would supplement
the Genre Code, such as 'USA Rock music in the 60s'". Always ASCII encoded.
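The split of a 0x87 payload into genre code and supplementary text can be
sketched as follows. This is an illustration in Python, not libburn code.
GENRES and decode_genre() are names made up for this sketch. It assumes that
the payloads of all 0x87 packs of a block were already concatenated and that
the two binary bytes occur only once, at the start.

```python
# Genre names in index order, copied from the list above.
GENRES = [
    "Not Used", "Not Defined", "Adult Contemporary", "Alternative Rock",
    "Childrens Music", "Classical", "Contemporary Christian", "Country",
    "Dance", "Easy Listening", "Erotic", "Folk", "Gospel", "Hip Hop",
    "Jazz", "Latin", "Musical", "New Age", "Opera", "Operetta",
    "Pop Music", "Rap", "Reggae", "Rock Music", "Rhythm & Blues",
    "Sound Effects", "Spoken Word", "World Music",
]


def decode_genre(data: bytes):
    """Split a concatenated 0x87 payload into (code, name, info text)."""
    code = int.from_bytes(data[:2], "big")          # big-endian index
    name = GENRES[code] if code < len(GENRES) else None
    # The cleartext part ends at the first 0-byte and is always ASCII.
    text = data[2:].split(b"\x00", 1)[0].decode("ascii")
    return code, name, text
```

For example, decode_genre(b"\x00\x05Feline classic music\x00") returns
(5, "Classical", "Feline classic music").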
Pack type 0x88 records information from the CD's Table of Content, as of
READ PMA/TOC/ATIP Format 0010b (mmc5r03c.pdf, table 490 TOC Track Descriptor
Format, Q Sub-channel).
See below, Format of CD-TEXT packs, for more details about the content of
pack type 0x88.
Pack type 0x89 is still quite unclear. See below, Format of CD-TEXT packs, for
an example of this pack type.
Pack type 0x8d is documented by Sony as "Closed Information: (use 8859-1 Code)
Any information can be recorded on disc as memorandum. Information in this
field will not be read by CD TEXT players available to the public."
Always ISO-8859-1 encoded.
Pack type 0x8e is documented by Sony as "UPC/EAN Code (POS Code) of the album.
This field typically consists of 13 characters." Always ASCII encoded.
It applies to tracks as "ISRC code [which] typically consists of 12 characters"
and is always ISO-8859-1 encoded.
Pack type 0x8f summarizes the whole list of text packs of a block.
See below, Format of CD-TEXT packs, for details.
-------------------------------------------------------------------------------
Format of a CD-TEXT packs array:
The attributes are represented on CD as Text Packs in the sub-channel of
the Lead-in of the disc. See doc/cookbook.txt for a description of how to write
the readily formatted CD-TEXT pack array to CD, and how to read CD-TEXT packs
from CD.
The format is explained in part in MMC-3 (mmc3r10g.pdf, Annex J) and in part by
the documentation in Sony's cdtext.zip :
Each pack consists of a 4-byte header, 12 bytes of payload, and 2 bytes of CRC.
The first byte of each pack tells the pack type. See above for a list of types.
The second byte tells the track number to which the first text piece in
a pack is associated. Number 0 means the whole album. Higher numbers are
valid for types 0x80 to 0x85, and 0x8e. With these types, there should be
one text for the disc and one for each track.
With types 0x88 and 0x89, the second byte bears a track number, too.
With type 0x8f, the second byte counts the record parts from 0 to 2.
The third byte is a sequential counter.
The fourth byte is the Block Number and Character Position Indicator.
It consists of three bit fields:
bit7 = Double Bytes Character Code (0= single byte characters)
bit4-6 = Block Number (groups text packs in language blocks)
bit0-3 = Character position. Either the number of characters which
the current text inherited from the previous pack, or
15 if the current text started before the previous pack.
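A minimal sketch of how the four header bytes might be taken apart, assuming
an 18-byte pack as described above. parse_pack_header() and its key names are
hypothetical and not part of the libburn API.

```python
def parse_pack_header(pack: bytes) -> dict:
    """Return the header fields of one 18-byte text pack."""
    indicator = pack[3]  # Block Number and Character Position Indicator
    return {
        "pack_type": pack[0],                     # 0x80 to 0x8f
        "track": pack[1],                         # 0 = whole album
        "sequence": pack[2],                      # sequential counter
        "double_byte": bool(indicator & 0x80),    # bit7
        "block": (indicator >> 4) & 0x07,         # bit4-6
        "char_pos": indicator & 0x0F,             # bit0-3
    }
```

A fourth byte of 0x9f would thus mean: double byte characters, block 1,
character position 15.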
The 12 payload bytes contain pieces of 0-terminated texts or binary data.
A text may span over several packs. Unused characters in a pack are used for
the next text of the same pack type. If no text of the same type follows,
then the remaining text bytes are set to 0.
The CRC algorithm uses divisor 0x11021. The resulting 16-bit residue of the
polynomial division gets exored with 0xffff and written as big-endian
number to bytes 16 and 17 of the pack.
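The CRC rule can be sketched bit by bit as follows. This is an illustrative
Python version, not the libburn implementation. The function name
cdtext_crc() is made up here, and the start value 0 of the register is an
assumption of this sketch which is consistent with the example packs in this
document.

```python
def cdtext_crc(data: bytes) -> int:
    """CRC over the first 16 bytes of a text pack (header and payload)."""
    crc = 0                          # assumed start value of the register
    for byte in data:
        crc ^= byte << 8             # feed next byte into the upper half
        for _ in range(8):
            crc <<= 1
            if crc & 0x10000:        # bit 16 set: subtract the divisor
                crc ^= 0x11021
    return crc ^ 0xFFFF              # residue gets exored with 0xffff


def crc_bytes(data: bytes) -> bytes:
    """The two CRC bytes in the big-endian order used in the pack."""
    return cdtext_crc(data).to_bytes(2, "big")
```

Applied to the first 16 bytes of the third 0x8f example pack shown further
below (8f 02 2c 00 00 00 00 00 09 00 00 00 00 00 00 00), this yields 0x1145,
which matches the CRC bytes 11 45 of that pack.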
The text packs are grouped in up to 8 blocks of at most 256 packs. Each block
is in charge of one language. Sequence numbers of each block are counted
separately. All packs of block 0 come before the packs of block 1.
The limits on block number and sequence number imply that there are at
most 2048 text packs possible. (READ TOC/PMA/ATIP could retrieve 3640 packs,
as it is limited to 64 kB - 2.)
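The two numbers in the preceding paragraph follow from simple arithmetic,
shown here as a small Python calculation. The variable names are chosen for
illustration only, and the sketch assumes that the whole 64 kB - 2 bytes are
filled with 18-byte packs.

```python
PACK_SIZE = 4 + 12 + 2        # header + payload + CRC = 18 bytes

# 3 bits of block number, 8 bits of sequence number
max_by_numbering = 8 * 256

# a reply of at most 64 kB - 2 bytes, filled with whole packs
max_by_transfer = (64 * 1024 - 2) // PACK_SIZE

print(max_by_numbering, max_by_transfer)   # 2048 3640
```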
If a text of a track (pack types 0x80 to 0x85 and 0x8e) repeats identically
for the next track, then it may be represented by a TAB character (ASCII 9)
for single byte texts, or by two TAB characters for double byte texts.
(This should be used because 256 * 12 bytes is little space for 99 tracks.)
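A reader of text packs has to undo the TAB shortcut. The following Python
sketch shows the idea for single byte texts only. expand_repeats() is a
hypothetical helper. It assumes that the track texts of one pack type are
already split at their 0-bytes and given in track order, and that a TAB
always refers to the text of the directly preceding track.

```python
def expand_repeats(track_texts):
    """Replace each lone TAB by the text of the previous track."""
    expanded = []
    for text in track_texts:
        if text == "\t" and expanded:
            expanded.append(expanded[-1])   # same text as previous track
        else:
            expanded.append(text)
    return expanded
```

For example, expand_repeats(["Tom Cat", "\t", "Mia Kitten"]) returns
["Tom Cat", "Tom Cat", "Mia Kitten"].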
The two binary bytes of pack type 0x87 are written to the first 0x87 pack of
a block. They may or may not be repeated at the start of the follow-up packs
of type 0x87.
The first pack of type 0x88 in a block records in its payload bytes:
0 : PMIN of POINT A0 = First Track Number
1 : PMIN of POINT A1 = Last Track Number
2 : unknown, 0 in Sony example
3 : PMIN of POINT A2 = Start position of Lead-Out
4 : PSEC of POINT A2 = Start position of Lead-Out
5 : PFRAME of POINT A2 = Start position of Lead-Out
6 to 11 : unknown, 0 in Sony example
The following packs record PMIN, PSEC, PFRAME of the POINTs between the
lowest track number (min 01h) and the highest track number (max 63h).
The payload of the last pack is padded by 0s.
The Sony .TOC example:
A0 01
A1 14
A2 63:02:18
01 00:02:00
02 04:11:25
03 08:02:50
04 11:47:62
...
13 53:24:25
14 57:03:25
yields
88 00 23 00 01 0e 00 3f 02 12 00 00 00 00 00 00 12 00
88 01 24 00 00 02 00 04 0b 19 08 02 32 0b 2f 3e 67 2d
...
88 0d 27 00 35 18 19 39 03 19 00 00 00 00 00 00 ea af
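The decoding of 0x88 packs can be sketched in Python as follows. This is an
illustration based on the Sony example above, not libburn code.
decode_toc_packs() and its key names are made up for this sketch. It assumes
that the second header byte of each follow-up pack is the track number of the
first time point in that pack, as the example suggests.

```python
def decode_toc_packs(packs):
    """Decode a list of 18-byte 0x88 packs of one block."""
    first = packs[0][4:16]                 # payload of the first 0x88 pack
    info = {
        "first_track": first[0],
        "last_track": first[1],
        "lead_out": tuple(first[3:6]),     # (PMIN, PSEC, PFRAME)
        "tracks": {},
    }
    for pack in packs[1:]:
        track = pack[1]                    # track of first time point
        payload = pack[4:16]
        for i in range(0, 12, 3):          # four time points per pack
            if track > info["last_track"]:
                break                      # rest of payload is padding
            info["tracks"][track] = tuple(payload[i:i + 3])
            track += 1
    return info
```

With the three packs shown above, this yields first track 1, last track 14,
lead-out (63, 2, 18), and for example (11, 47, 62) as start of track 4.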
Pack type 0x89 is still quite unclear, especially what the information shall
mean to the user of the CD. The time points in the Sony example are in the
time range of the track numbers that are given before the time points:
01 02:41:48 01 02:52:58
06 23:14:25 06 23:29:60
07 28:30:39 07 28:42:30
13 55:13:26 13 55:31:50
yields
89 01 28 00 01 04 00 00 00 00 02 29 30 02 34 3a f3 0c
89 06 29 00 02 04 00 00 00 00 17 0e 19 17 1d 3c 73 92
89 07 2a 00 03 04 00 00 00 00 1c 1e 27 1c 2a 1e 72 20
89 0d 2b 00 04 04 00 00 00 00 37 0d 1a 37 1f 32 0b 62
The track numbers are stored in the track number byte of the packs. The two
time points are stored in byte 6 to 11 of the payload. Byte 0 of the payload
seems to be a sequential counter. Byte 1 always 4 ? Byte 2 to 5 always 0 ?
Pack type 0x8f summarizes the whole list of text packs of a block.
So there is one group of three 0x8f packs per block.
Nevertheless each 0x8f group tells the highest sequence number and the
language code of all blocks.
The payload bytes of three 0x8f packs form a 36 byte record. The track number
bytes of the three packs have the values 0, 1, 2.
Byte :
0 : Character code for pack types 0x80 to 0x85:
0x00 = ISO-8859-1
0x01 = 7 bit ASCII
0x80 = MS-JIS (Japanese Kanji, double byte characters)
1 : Number of first track
2 : Number of last track
3 : libcdio source states: "cd-text information copyright byte"
Probably 3 means "copyrighted", 0 means "not copyrighted".
4 - 19 : Pack count of the various types 0x80 to 0x8f.
Byte number N tells the count of packs of type 0x80 + (N - 4).
I.e. the first byte in this field of 16 counts packs of type 0x80.
20 - 27 : Highest sequence byte number of blocks 0 to 7.
28 - 35 : Language code for blocks 0 to 7 (tech3264.pdf appendix 3)
Not all of these Codes have ever been seen with CD-TEXT, though.
0x00 = Unknown
0x01 = Albanian
0x02 = Breton
0x03 = Catalan
0x04 = Croatian
0x05 = Welsh
0x06 = Czech
0x07 = Danish
0x08 = German
0x09 = English
0x0a = Spanish
0x0b = Esperanto
0x0c = Estonian
0x0d = Basque
0x0e = Faroese
0x0f = French
0x10 = Frisian
0x11 = Irish
0x12 = Gaelic
0x13 = Galician
0x14 = Icelandic
0x15 = Italian
0x16 = Lappish
0x17 = Latin
0x18 = Latvian
0x19 = Luxembourgian
0x1a = Lithuanian
0x1b = Hungarian
0x1c = Maltese
0x1d = Dutch
0x1e = Norwegian
0x1f = Occitan
0x20 = Polish
0x21 = Portuguese
0x22 = Romanian
0x23 = Romansh
0x24 = Serbian
0x25 = Slovak
0x26 = Slovenian
0x27 = Finnish
0x28 = Swedish
0x29 = Turkish
0x2a = Flemish
0x2b = Wallon
0x45 = Zulu
0x46 = Vietnamese
0x47 = Uzbek
0x48 = Urdu
0x49 = Ukrainian
0x4a = Thai
0x4b = Telugu
0x4c = Tatar
0x4d = Tamil
0x4e = Tadzhik
0x4f = Swahili
0x50 = Sranan Tongo
0x51 = Somali
0x52 = Sinhalese
0x53 = Shona
0x54 = Serbo-croat
0x55 = Ruthenian
0x56 = Russian
0x57 = Quechua
0x58 = Pushtu
0x59 = Punjabi
0x5a = Persian
0x5b = Papamiento
0x5c = Oriya
0x5d = Nepali
0x5e = Ndebele
0x5f = Marathi
0x60 = Moldavian
0x61 = Malaysian
0x62 = Malagasay
0x63 = Macedonian
0x64 = Laotian
0x65 = Korean
0x66 = Khmer
0x67 = Kazakh
0x68 = Kannada
0x69 = Japanese
0x6a = Indonesian
0x6b = Hindi
0x6c = Hebrew
0x6d = Hausa
0x6e = Gurani
0x6f = Gujurati
0x70 = Greek
0x71 = Georgian
0x72 = Fulani
0x73 = Dari
0x74 = Churash
0x75 = Chinese
0x76 = Burmese
0x77 = Bulgarian
0x78 = Bengali
0x79 = Bielorussian
0x7a = Bambora
0x7b = Azerbaijani
0x7c = Assamese
0x7d = Armenian
0x7e = Arabic
0x7f = Amharic
E.g. these three packs
42 : 8f 00 2a 00 01 01 03 00 06 05 04 05 07 06 01 02 48 65
43 : 8f 01 2b 00 00 00 00 00 00 00 06 03 2c 00 00 00 c0 20
44 : 8f 02 2c 00 00 00 00 00 09 00 00 00 00 00 00 00 11 45
decode to
Byte :Value Meaning
0 : 01 = ASCII 7-bit
1 : 01 = first track is 1
2 : 03 = last track is 3
3 : 00 = copyright (0 = public domain, 3 = copyrighted ?)
4 : 06 = 6 packs of type 0x80
5 : 05 = 5 packs of type 0x81
6 : 04 = 4 packs of type 0x82
7 : 05 = 5 packs of type 0x83
8 : 07 = 7 packs of type 0x84
9 : 06 = 6 packs of type 0x85
10 : 01 = 1 pack of type 0x86
11 : 02 = 2 packs of type 0x87
12 : 00 = 0 packs of type 0x88
13 : 00 = 0 packs of type 0x89
14 : 00 00 00 00 = 0 packs of types 0x8a to 0x8d
18 : 06 = 6 packs of type 0x8e
19 : 03 = 3 packs of type 0x8f
20 : 2c = last sequence for block 0
This matches the sequence number of the last text pack (0x2c = 44)
21 : 00 00 00 00 00 00 00 = last sequence numbers for block 1..7 (none)
28 : 09 = language code for block 0: English
29 : 00 00 00 00 00 00 00 = language codes for block 1..7 (none)
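The decoding above can be reproduced by a short Python sketch. It is meant as
an illustration, not as libburn code. decode_size_info() and its key names
are hypothetical. The input is assumed to be the three complete 18-byte 0x8f
packs of one block.

```python
def decode_size_info(packs):
    """Join the payloads of three 0x8f packs into the 36 byte record."""
    ordered = sorted(packs, key=lambda p: p[1])   # track bytes 0, 1, 2
    rec = b"".join(p[4:16] for p in ordered)
    if len(rec) != 36:
        raise ValueError("expected three packs with 12 payload bytes each")
    return {
        "char_code": rec[0],
        "first_track": rec[1],
        "last_track": rec[2],
        "copyright": rec[3],
        # bytes 4 to 19: pack counts of types 0x80 to 0x8f
        "pack_counts": {0x80 + i: rec[4 + i] for i in range(16)},
        "last_sequence": list(rec[20:28]),        # blocks 0 to 7
        "languages": list(rec[28:36]),            # blocks 0 to 7
    }
```

For the three example packs, the pack counts sum up to 45, which is the
highest sequence number 0x2c = 44 plus one.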
-------------------------------------------------------------------------------
libburn API calls for CD-TEXT (see libburn/libburn.h for details):
libburn can retrieve the set of text packs from a CD:
int burn_disc_get_leadin_text(struct burn_drive *d,
unsigned char **text_packs, int *num_packs,
int flag);
It can write a text pack set with a CD SAO session.
This set may be attached as array of readily formatted text packs by:
int burn_write_opts_set_leadin_text(struct burn_write_opts *opts,
unsigned char *text_packs,
int num_packs, int flag);
Alternatively it may be defined by attaching CD-TEXT attributes to burn_session
and burn_track:
int burn_session_set_cdtext_par(struct burn_session *s,
int char_codes[8], int copyrights[8],
int languages[8], int flag);
int burn_session_set_cdtext(struct burn_session *s, int block,
int pack_type, char *pack_type_name,
unsigned char *payload, int length, int flag);
int burn_track_set_cdtext(struct burn_track *t, int block,
int pack_type, char *pack_type_name,
unsigned char *payload, int length, int flag);
Macros list the texts for genre and language codes:
BURN_CDTEXT_LANGUAGES_0X00
BURN_CDTEXT_FILLER
BURN_CDTEXT_LANGUAGES_0X45
BURN_CDTEXT_GENRE_LIST
BURN_CDTEXT_NUM_GENRES
There is a reader for Sony Input Sheet Version 0.7T:
int burn_session_input_sheet_v07t(struct burn_session *session,
char *path, int block, int flag);
These attributes can then be converted into an array of text packs by:
int burn_cdtext_from_session(struct burn_session *s,
unsigned char **text_packs, int *num_packs,
int flag);
or they can be written as array of text packs to CD when burning begins and
no array of pre-formatted packs was attached to the write options by
burn_write_opts_set_leadin_text().
There are calls for inspecting the attached attributes:
int burn_session_get_cdtext_par(struct burn_session *s,
int char_codes[8], int copyrights[8],
int block_languages[8], int flag);
int burn_session_get_cdtext(struct burn_session *s, int block,
int pack_type, char *pack_type_name,
unsigned char **payload, int *length, int flag);
int burn_track_get_cdtext(struct burn_track *t, int block,
int pack_type, char *pack_type_name,
unsigned char **payload, int *length, int flag);
and for removing attached attributes:
int burn_session_dispose_cdtext(struct burn_session *s, int block);
int burn_track_dispose_cdtext(struct burn_track *t, int block);
-------------------------------------------------------------------------------
Sony Text File Format (Input Sheet Version 0.7T):
This text file format provides comprehensive means to define the text
attributes of session and tracks for a single block. More than one
such file has to be read to form an attribute set with multiple blocks.
The information is given by text lines of the following form:
purpose specifier [whitespace] = [whitespace] content text
[whitespace] is zero or more ASCII 32 (space) or ASCII 9 (tab) characters.
The purpose specifier tells the meaning of the content text.
Empty content text does not cause a CD-TEXT attribute to be attached.
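The line format can be illustrated by a small Python sketch. This is not the
parser of libburn. parse_sheet_line() and read_sheet() are hypothetical
helpers. Splitting at the first "=", skipping lines without "=", and
stripping whitespace at the outer ends of the line are assumptions of this
sketch.

```python
def parse_sheet_line(line):
    """Split one line at the first '=' into (specifier, content)."""
    if "=" not in line:
        return None
    spec, content = line.split("=", 1)
    return spec.strip(" \t"), content.strip(" \t\r\n")


def read_sheet(text):
    """Collect the attributes of an input sheet text in a dictionary."""
    attrs = {}
    for line in text.splitlines():
        parsed = parse_sheet_line(line)
        if parsed and parsed[1]:       # empty content attaches nothing
            attrs[parsed[0]] = parsed[1]
    return attrs
```

For example, parse_sheet_line("Album Title = Joyful Nights") returns
("Album Title", "Joyful Nights"), and a line like "Track 03 Message =" does
not produce an entry in the result of read_sheet().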
The following purpose specifiers apply to the session as a whole:
Specifier = Meaning
-------------------------------------------------------------------------
Text Code = Character code for pack type 0x8f
"ASCII", "8859"
Language Code = One of the language names for pack type 0x8f
Album Title = Content of pack type 0x80
Artist Name = Content of pack type 0x81
Songwriter = Content of pack type 0x82
Composer = Content of pack type 0x83
Arranger = Content of pack type 0x84
Album Message = Content of pack type 0x85
Catalog Number = Content of pack type 0x86
Genre Code = One of the genre names for pack type 0x87
Genre Information = Cleartext part of pack type 0x87
Closed Information = Content of pack type 0x8d
UPC / EAN = Content of pack type 0x8e
Text Data Copy Protection = Copyright value for pack type 0x8f
"ON" = 0x03, "OFF" = 0x00
First Track Number = The lowest track number used in the file
Last Track Number = The highest track number used in the file
The following purpose specifiers apply to particular tracks:
Track NN Title = Content of pack type 0x80
Track NN Artist = Content of pack type 0x81
Track NN Songwriter = Content of pack type 0x82
Track NN Composer = Content of pack type 0x83
Track NN Arranger = Content of pack type 0x84
Track NN Message = Content of pack type 0x85
ISRC NN = Content of pack type 0x8e
The following purpose specifiers have no effect on CD-TEXT:
Remarks = Comments with no influence on CD-TEXT
Disc Information NN = Supplementary information for use by record companies.
ISO-8859-1 encoded. NN ranges from 01 to 04.
Input Sheet Version = "0.7T"
libburn peculiarities:
libburn may read files of the described format by
burn_session_input_sheet_v07t()
after the burn_session has been established and all burn_track objects have
been added.
The following purpose specifiers accept byte values of the form 0xXY.
Text Code , Language Code , Genre Code , Text Data Copy Protection
E.g. to indicate MS-JIS character code (of which the exact name is unknown):
Text Code = 0x80
Genre Code is settable by 0xXY or 0xXYZT or 0xXY 0xZT.
Genre Code = 0x001b
Purpose specifiers which have the meaning "Content of pack type 0xXY"
may be replaced by the pack type codes. E.g.:
0x80 = Session content of pack type 0x80
Track 02 0x80 = Track content of pack type 0x80 for track 2.
Applicable are pack types 0x80 to 0x86, 0x8d, 0x8e.
Text Code may be specified only once. It gets set to "ISO-8859-1"
automatically as soon as content is defined which depends on the text
encoding of the block, i.e. with pack types 0x80 to 0x85.
If a track attribute is set, but the corresponding session attribute is not
defined or defined with empty text, then the session attribute gets attached
as empty text. (Normally empty content is ignored.)
libburn will always start track numbering by 1. So it adjusts all track
numbers from the input sheet file by subtracting (First Track Number - 1).
libburn ignores Last Track Number because it will always write its own first
and last track numbers to pack type 0x8f.
Example cdrskin run with three tracks:
$ cdrskin dev=/dev/sr0 -v input_sheet_v07t=NIGHTCATS.TXT \
-audio track_source_1 track_source_2 track_source_3
----------------------------------------------------------
Content of file NIGHTCATS.TXT :
----------------------------------------------------------
Input Sheet Version = 0.7T
Text Code = 8859
Language Code = English
Album Title = Joyful Nights
Artist Name = United Cat Orchestra
Songwriter = Various Songwriters
Composer = Various Composers
Arranger = Tom Cat
Album Message = For all our fans
Catalog Number = 1234567890
Genre Code = Classical
Genre Information = Feline classic music
Closed Information = This is not to be shown by CD players
UPC / EAN = 1234567890123
Text Data Copy Protection = OFF
First Track Number = 1
Last Track Number = 3
Track 01 Title = Song of Joy
Track 01 Artist = Felix and The Purrs
Track 01 Songwriter = Friedrich Schiller
Track 01 Composer = Ludwig van Beethoven
Track 01 Arranger = Tom Cat
Track 01 Message = Fritz and Louie once were punks
ISRC 01 = XYBLG1101234
Track 02 Title = Humpty Dumpty
Track 02 Artist = Catwalk Beauties
Track 02 Songwriter = Mother Goose
Track 02 Composer = unknown
Track 02 Arranger = Tom Cat
Track 02 Message = Pluck the goose
ISRC 02 = XYBLG1100005
Track 03 Title = Mee Owwww
Track 03 Artist = Mia Kitten
Track 03 Songwriter = Mia Kitten
Track 03 Composer = Mia Kitten
Track 03 Arranger = Mia Kitten
Track 03 Message =
ISRC 03 = XYBLG1100006
----------------------------------------------------------
-------------------------------------------------------------------------------
This text is copyright 2011 Thomas Schmitt <scdbackup@gmx.net>.
Permission is granted to copy, modify, and distribute it, as long as the
references to the original information sources are maintained.
There is NO WARRANTY, to the extent permitted by law.
-------------------------------------------------------------------------------