Description of CD-TEXT
|
|
|
|
Guided by Leon Merten Lohse via libcdio-devel@gnu.org
|
|
by reading mmc3r10g.pdf from http://www.t10.org/ftp/t10/drafts/mmc3/
|
|
by docs and results of cdtext.zip from http://www.sonydadc.com/file/
|
|
by reading source of libcdio from http://www.gnu.org/s/libcdio
|
|
which quotes source of cdrecord from ftp://ftp.berlios.de/pub/cdrecord/alpha
|
|
|
|
Language codes were learned from http://tech.ebu.ch/docs/tech/tech3264.pdf
|
|
Genre codes were learned from libcdio and confirmed by
|
|
http://helpdesk.audiofile-engineering.com/index.php?pg=kb.page&id=123
|
|
|
|
For libburnia-project.org by Thomas Schmitt <scdbackup@gmx.net>
|
|
|
|
Content:
|
|
- CD-TEXT from the view of the user
|
|
- Content specifications of particular pack types
|
|
- Format of a CD-TEXT packs array
|
|
- Overview of libburn API calls for CD-TEXT
|
|
|
|
-------------------------------------------------------------------------------
|
|
CD-TEXT from the view of the user:
|
|
|
|
CD-TEXT records attributes of disc and tracks on audio CD.
|
|
|
|
The attributes are grouped into blocks which represent particular languages.
|
|
Up to 8 blocks are possible.
|
|
|
|
There are 13 defined attribute categories, which are called Pack Types and are
identified by a single-byte code:
  0x80 = Title
  0x81 = Names of Performers
  0x82 = Names of Songwriters
  0x83 = Names of Composers
  0x84 = Names of Arrangers
  0x85 = Messages
  0x86 = text-and-binary: Disc Identification
  0x87 = text-and-binary: Genre Identification
  0x88 = binary: Table of Content information
  0x89 = binary: Second Table of Content information
  (0x8a to 0x8c are reserved.)
  0x8d = Closed Information
  0x8e = UPC/EAN code of the album and ISRC code of each track
  0x8f = binary: Size Information of the Block
|
|
|
|
Some of these categories apply to the whole disc only:
|
|
0x86, 0x87, 0x88, 0x89, 0x8d
|
|
Some have to be additionally attributed to each track, if they are present for
|
|
the whole disc:
|
|
0x80, 0x81, 0x82, 0x83, 0x84, 0x85, 0x8e
|
|
One describes the overall content of a block and in part of all other blocks:
|
|
0x8f
|
|
|
|
The total size of a block's attribute set is restricted by the fact that it
|
|
has to be stored in at most 253 records with 12 bytes of payload. These records
|
|
are called Text Packs.
|
|
A shortcut for repeated identical track texts is provided, so that a text
|
|
that is identical to the one of the previous track occupies only 2 or 4 bytes.
|
|
|
|
|
|
-------------------------------------------------------------------------------
|
|
Content specification of particular pack types:
|
|
|
|
Pack types 0x80 to 0x85 and 0x8e contain 0-terminated cleartext. If double byte
characters are used, then two 0-bytes terminate the cleartext.
The meaning of 0x80 to 0x85 should be clear from the list above. They are
encoded according to the Character Code of their block: either as ISO-8859-1
single byte characters, or as 7-bit ASCII single byte characters, or as MS-JIS
double byte characters.
More information about 0x8e is given below.
|
|
|
|
Pack type 0x86 (Disc Identification) is documented by Sony as "Catalog Number:
|
|
(use ASCII Code) Catalog Number of the album". So it is not really binary
|
|
but might be non-printable, and should contain only bytes with bit7 = 0.
|
|
|
|
Pack type 0x87 contains 2 binary bytes, followed by 0-terminated cleartext.
|
|
The two binary bytes form a big-endian index to the following list.
|
|
0x0000 = "Not Used" (Sony prescribes to use this if no genre applies)
|
|
0x0001 = "Not Defined"
|
|
0x0002 = "Adult Contemporary"
|
|
0x0003 = "Alternative Rock"
|
|
0x0004 = "Childrens Music"
|
|
0x0005 = "Classical"
|
|
0x0006 = "Contemporary Christian"
|
|
0x0007 = "Country"
|
|
0x0008 = "Dance"
|
|
0x0009 = "Easy Listening"
|
|
0x000a = "Erotic"
|
|
0x000b = "Folk"
|
|
0x000c = "Gospel"
|
|
0x000d = "Hip Hop"
|
|
0x000e = "Jazz"
|
|
0x000f = "Latin"
|
|
0x0010 = "Musical"
|
|
0x0011 = "New Age"
|
|
0x0012 = "Opera"
|
|
0x0013 = "Operetta"
|
|
0x0014 = "Pop Music"
|
|
0x0015 = "Rap"
|
|
0x0016 = "Reggae"
|
|
0x0017 = "Rock Music"
|
|
0x0018 = "Rhythm & Blues"
|
|
0x0019 = "Sound Effects"
|
|
0x001a = "Spoken Word"
|
|
0x001b = "World Music"
|
|
Sony documents the cleartext part as "Genre information that would supplement
|
|
the Genre Code, such as 'USA Rock music in the 60s'". Always ASCII encoded.
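The layout of a 0x87 payload described above can be sketched in C as follows.
This is only an illustration: the function name cdtext_make_genre_payload() and
its buffer convention are assumptions made for this example, not part of
libburn.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical helper, not a libburn call.
   Writes the two big-endian genre code bytes followed by the 0-terminated
   ASCII supplement text into buf.
   Returns the number of bytes written, or -1 if buf is too small. */
int cdtext_make_genre_payload(unsigned int genre_code, const char *text,
                              unsigned char *buf, size_t buf_size)
{
    size_t len = strlen(text);

    if (buf_size < len + 3)
        return -1;
    buf[0] = (unsigned char) ((genre_code >> 8) & 0xff); /* high byte first */
    buf[1] = (unsigned char) (genre_code & 0xff);
    memcpy(buf + 2, text, len + 1);          /* text plus terminating 0 */
    return (int) (len + 3);
}
```

With genre code 0x0017 ("Rock Music") and the text "60s" this yields the six
bytes 00 17 36 30 73 00.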
|
|
|
|
Pack type 0x88 records information from the CD's Table of Content, as of
READ TOC/PMA/ATIP Format 0010b (mmc5r03c.pdf, table 490 TOC Track Descriptor
Format, Q Sub-channel).
See below, Format of CD-TEXT packs, for more details about the content of
pack type 0x88.

Pack type 0x89 is still quite unclear. See below, Format of CD-TEXT packs, for
an example of this pack type.
|
|
|
|
Pack type 0x8d is documented by Sony as "Closed Information: (use 8859-1 Code)
|
|
Any information can be recorded on disc as memorandum. Information in this
|
|
field will not be read by CD TEXT players available to the public."
|
|
Always ISO-8859-1 encoded.
|
|
|
|
Pack type 0x8e is documented by Sony as "UPC/EAN Code (POS Code) of the album.
|
|
This field typically consists of 13 characters." Always ASCII encoded.
|
|
|
|
Pack type 0x8f summarizes the whole list of text packs of a block.
|
|
See below, Format of CD-TEXT packs, for details.
|
|
|
|
|
|
-------------------------------------------------------------------------------
|
|
Format of a CD-TEXT packs array:
|
|
|
|
The attributes are represented on CD as Text Packs in the sub-channel of
|
|
the Lead-in of the disc.
|
|
The format is explained in part in MMC-3 (mmc3r10g.pdf, Annex J) and in part by
|
|
the documentation of Sony's cdtext.zip.
|
|
|
|
Each pack consists of a 4-byte header, 12 bytes of payload, and 2 bytes of CRC.
|
|
|
|
The first byte of each pack tells the pack type. See above for a list of types.
|
|
|
|
The second byte tells the track number to which the first text piece in
|
|
a pack is associated. Number 0 means the whole album. Higher numbers are
|
|
valid for types 0x80 to 0x85, and 0x8e. With these types, there should be
|
|
one text for the disc and one for each track.
|
|
With types 0x88 and 0x89, the second byte bears a track number, too.
|
|
With type 0x8f, the second byte counts the record parts from 0 to 2.
|
|
|
|
The third byte is a sequential counter.
|
|
|
|
The fourth byte is the Block Number and Character Position Indicator.
It consists of three bit fields:
  bit7   = Double Bytes Character Code (0= single byte characters)
  bit4-6 = Block Number (groups text packs in language blocks)
  bit0-3 = Character position. Either the number of characters which
           the current text inherited from the previous pack, or
           15 if the current text started before the previous pack.
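The four header bytes can be taken apart as in the following C sketch. The
struct and function names are invented for this illustration and do not exist
in libburn.

```c
/* Hypothetical helper type, not part of libburn: a decoded pack header. */
struct cdtext_pack_header {
    int pack_type;    /* byte 0: 0x80 to 0x8f */
    int track_no;     /* byte 1: track number, or record part with 0x8f */
    int sequence;     /* byte 2: sequential counter */
    int double_byte;  /* byte 3, bit7: 1 = double byte characters */
    int block;        /* byte 3, bit4-6: block number 0 to 7 */
    int char_pos;     /* byte 3, bit0-3: character position 0 to 15 */
};

/* Decodes the first four bytes of an 18-byte text pack. */
void cdtext_decode_header(const unsigned char *pack,
                          struct cdtext_pack_header *h)
{
    h->pack_type = pack[0];
    h->track_no = pack[1];
    h->sequence = pack[2];
    h->double_byte = (pack[3] >> 7) & 1;
    h->block = (pack[3] >> 4) & 7;
    h->char_pos = pack[3] & 15;
}
```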
|
|
|
|
The 12 payload bytes contain pieces of 0-terminated texts or binary data.
|
|
A text may span over several packs. Unused characters in a pack are used for
|
|
the next text of the same pack type. If no text of the same type follows,
|
|
then the remaining text bytes are set to 0.
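How a sequence of single byte texts of one pack type spreads over several
packs may be sketched like this. The function cdtext_split_texts() is a
hypothetical helper for illustration, not a libburn call. It assumes that
texts[0] belongs to the album and texts[i] to track i, it handles neither
double byte texts nor the TAB shortcut, and it leaves the CRC bytes 16 and 17
as 0.

```c
#include <stddef.h>
#include <string.h>

#define CDTEXT_PACK_SIZE    18
#define CDTEXT_PAYLOAD_SIZE 12

/* Hypothetical helper, not part of libburn.
   Fills packs[] with the 0-terminated texts of one pack type and block.
   Returns the number of packs used, or -1 if max_packs is too small. */
int cdtext_split_texts(int pack_type, int block, int first_seq,
                       const char *const *texts, int num_texts,
                       unsigned char (*packs)[CDTEXT_PACK_SIZE],
                       int max_packs)
{
    int num_packs = 0;
    int fill = 0;            /* payload bytes used in the current pack */

    for (int t = 0; t < num_texts; t++) {
        size_t len = strlen(texts[t]) + 1;   /* include terminating 0 */

        for (size_t pos = 0; pos < len; pos++) {
            if (fill == 0) {
                /* Begin a new pack. Its header describes the text piece
                   which occupies the first payload byte. */
                unsigned char *p;

                if (num_packs >= max_packs)
                    return -1;
                p = packs[num_packs++];
                memset(p, 0, CDTEXT_PACK_SIZE);  /* unused bytes stay 0 */
                p[0] = (unsigned char) pack_type;
                p[1] = (unsigned char) t;
                p[2] = (unsigned char) (first_seq + num_packs - 1);
                /* Character position: bytes of this text in the previous
                   pack, or 15 if the text began before the previous pack */
                p[3] = (unsigned char) (((block & 7) << 4) |
                                        (pos <= 12 ? (int) pos : 15));
            }
            packs[num_packs - 1][4 + fill] = (unsigned char) texts[t][pos];
            fill = (fill + 1) % CDTEXT_PAYLOAD_SIZE;
        }
    }
    return num_packs;
}
```

For the texts "Album" and "A track title" this produces two packs. The second
one has track number 1 and character position 6, because six characters of the
track title already sit in the first pack.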
|
|
|
|
The CRC algorithm uses divisor 0x11021. The resulting 16-bit residue of the
|
|
polynomial division gets exored with 0xffff and written as big-endian
|
|
number to bytes 16 and 17 of the pack.
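The CRC rule can be written down as a plain bitwise polynomial division in C.
This is a minimal sketch with function names chosen for this example. It
assumes a start value of 0, which is what "residue of the polynomial division"
suggests and which reproduces the CRC bytes of the example packs shown in this
document.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper, not part of libburn.
   Divides the data bits by 0x11021, beginning with residue 0,
   and returns the residue exored with 0xffff. */
uint16_t cdtext_crc(const unsigned char *data, size_t count)
{
    uint16_t acc = 0;

    for (size_t i = 0; i < count; i++) {
        acc ^= (uint16_t) (data[i] << 8);
        for (int bit = 0; bit < 8; bit++) {
            if (acc & 0x8000)
                /* The bit shifted out is the 0x10000 part of the divisor */
                acc = (uint16_t) ((acc << 1) ^ 0x1021);
            else
                acc = (uint16_t) (acc << 1);
        }
    }
    return (uint16_t) (acc ^ 0xffff);
}

/* Computes the CRC over bytes 0 to 15 of a pack and stores it
   as big-endian number in bytes 16 and 17. */
void cdtext_set_crc(unsigned char *pack)
{
    uint16_t crc = cdtext_crc(pack, 16);

    pack[16] = (unsigned char) (crc >> 8);
    pack[17] = (unsigned char) (crc & 0xff);
}
```

Applied to the first 16 bytes of the pack
  8f 00 2a 00 01 01 03 00 06 05 04 05 07 06 01 02 48 65
the function cdtext_crc() returns 0x4865, which matches the last two bytes.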
|
|
|
|
|
|
The text packs are grouped in up to 8 blocks of at most 256 packs. Each block
is dedicated to one language. Sequence numbers of each block are counted
separately. All packs of block 0 come before the packs of block 1.
|
|
|
|
The limits on block number and sequence number imply that there are at
most 2048 text packs possible (8 blocks of 256 packs). (READ TOC/PMA/ATIP
could retrieve 3640 packs, as it is limited to 64 kB - 2.)
|
|
|
|
|
|
If a text of a track (pack types 0x80 to 0x85 and 0x8e) repeats identically
for the next track, then it may be represented by a TAB character (ASCII 9)
for single byte texts, or by two TAB characters for double byte texts.
(This should be used because 256 * 12 bytes is little space for 99 tracks.)
|
|
|
|
The two binary bytes of pack type 0x87 are written to the first 0x87 pack of
|
|
a block. They may or may not be repeated at the start of the follow-up packs
|
|
of type 0x87.
|
|
|
|
The first pack of type 0x88 in a block records in its payload bytes:
  0 : PMIN of POINT A0 = First Track Number
  1 : PMIN of POINT A1 = Last Track Number
  2 : unknown, 0 in Sony example
  3 : PMIN of POINT A2 = Start position of Lead-Out
  4 : PSEC of POINT A2 = Start position of Lead-Out
  5 : PFRAME of POINT A2 = Start position of Lead-Out
  6 to 11 : unknown, 0 in Sony example
The following packs record PMIN, PSEC, PFRAME of the POINTs between the
lowest track number (min 01h) and the highest track number (max 63h).
The payload of the last pack is padded by 0s.
|
|
The Sony .TOC example:
|
|
A0 01
|
|
A1 14
|
|
A2 63:02:18
|
|
01 00:02:00
|
|
02 04:11:25
|
|
03 08:02:50
|
|
04 11:47:62
|
|
...
|
|
13 53:24:25
|
|
14 57:03:25
|
|
yields
|
|
88 00 23 00 01 0e 00 3f 02 12 00 00 00 00 00 00 12 00
|
|
88 01 24 00 00 02 00 04 0b 19 08 02 32 0b 2f 3e 67 2d
|
|
...
|
|
88 0d 27 00 35 18 19 39 03 19 00 00 00 00 00 00 ea af
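The payload of the first 0x88 pack in the Sony example can be reproduced by a
few lines of C. The function name and its parameters are assumptions made for
this sketch. The bytes which are marked as unknown above are simply set to 0,
as in the Sony example.

```c
#include <string.h>

/* Hypothetical helper, not part of libburn.
   Fills the 12 payload bytes of the first 0x88 pack of a block. */
void cdtext_make_toc_head(int first_track, int last_track,
                          int leadout_min, int leadout_sec,
                          int leadout_frame, unsigned char payload[12])
{
    memset(payload, 0, 12);       /* bytes 2 and 6 to 11: unknown, 0 */
    payload[0] = (unsigned char) first_track;    /* PMIN of POINT A0 */
    payload[1] = (unsigned char) last_track;     /* PMIN of POINT A1 */
    payload[3] = (unsigned char) leadout_min;    /* PMIN of POINT A2 */
    payload[4] = (unsigned char) leadout_sec;    /* PSEC of POINT A2 */
    payload[5] = (unsigned char) leadout_frame;  /* PFRAME of POINT A2 */
}
```

Called with first track 1, last track 14, and Lead-Out start 63:02:18 it
yields 01 0e 00 3f 02 12 00 00 00 00 00 00, which is the payload of the pack
with sequence number 0x23 above.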
|
|
|
|
Pack type 0x89 is still quite unclear, especially what the information means
to the user of the CD. The time points in the Sony example are in the
time range of the track numbers that are given before the time points:
|
|
01 02:41:48 01 02:52:58
|
|
06 23:14:25 06 23:29:60
|
|
07 28:30:39 07 28:42:30
|
|
13 55:13:26 13 55:31:50
|
|
yields
|
|
89 01 28 00 01 04 00 00 00 00 02 29 30 02 34 3a f3 0c
|
|
89 06 29 00 02 04 00 00 00 00 17 0e 19 17 1d 3c 73 92
|
|
89 07 2a 00 03 04 00 00 00 00 1c 1e 27 1c 2a 1e 72 20
|
|
89 0d 2b 00 04 04 00 00 00 00 37 0d 1a 37 1f 32 0b 62
|
|
The track numbers are stored in the track number byte of the packs. The two
time points are stored in bytes 6 to 11 of the payload. Byte 0 of the payload
seems to be a sequential counter. Byte 1 is 4 and bytes 2 to 5 are 0 in all
example packs. It is unknown whether this is always the case.
|
|
|
|
Pack type 0x8f summarizes the whole list of text packs of a block.
|
|
So there is one group of three 0x8f packs per block.
|
|
Nevertheless each 0x8f group tells the highest sequence number and the
|
|
language code of all blocks.
|
|
The payload bytes of three 0x8f packs form a 36 byte record. The track number
|
|
bytes of the three packs have the values 0, 1, 2.
|
|
Byte :
|
|
0 : Character code for pack types 0x80 to 0x85:
|
|
0x00 = ISO-8859-1
|
|
0x01 = 7 bit ASCII
|
|
0x80 = MS-JIS (japanese Kanji, double byte characters)
|
|
1 : Number of first track
|
|
2 : Number of last track
|
|
3 : libcdio source states: "cd-text information copyright byte"
|
|
Probably 3 means "copyrighted", 0 means "not copyrighted".
|
|
4 - 19 : Pack count of the various types 0x80 to 0x8f.
|
|
Byte number N tells the count of packs of type 0x80 + (N - 4).
|
|
I.e. the first byte in this field of 16 counts packs of type 0x80.
|
|
20 - 27 : Highest sequence byte number of blocks 0 to 7.
|
|
28 - 35 : Language code for blocks 0 to 7 (tech3264.pdf appendix 3)
|
|
Not all of these Codes have ever been seen with CD-TEXT, though.
|
|
0x00 = Unknown
|
|
0x01 = Albanian
|
|
0x02 = Breton
|
|
0x03 = Catalan
|
|
0x04 = Croatian
|
|
0x05 = Welsh
|
|
0x06 = Czech
|
|
0x07 = Danish
|
|
0x08 = German
|
|
0x09 = English
|
|
0x0a = Spanish
|
|
0x0b = Esperanto
|
|
0x0c = Estonian
|
|
0x0d = Basque
|
|
0x0e = Faroese
|
|
0x0f = French
|
|
0x10 = Frisian
|
|
0x11 = Irish
|
|
0x12 = Gaelic
|
|
0x13 = Galician
|
|
0x14 = Icelandic
|
|
0x15 = Italian
|
|
0x16 = Lappish
|
|
0x17 = Latin
|
|
0x18 = Latvian
|
|
0x19 = Luxembourgian
|
|
0x1a = Lithuanian
|
|
0x1b = Hungarian
|
|
0x1c = Maltese
|
|
0x1d = Dutch
|
|
0x1e = Norwegian
|
|
0x1f = Occitan
|
|
0x20 = Polish
|
|
0x21 = Portuguese
|
|
0x22 = Romanian
|
|
0x23 = Romansh
|
|
0x24 = Serbian
|
|
0x25 = Slovak
|
|
0x26 = Slovenian
|
|
0x27 = Finnish
|
|
0x28 = Swedish
|
|
0x29 = Turkish
|
|
0x2a = Flemish
|
|
0x2b = Wallon
|
|
0x45 = Zulu
|
|
0x46 = Vietnamese
|
|
0x47 = Uzbek
|
|
0x48 = Urdu
|
|
0x49 = Ukrainian
|
|
0x4a = Thai
|
|
0x4b = Telugu
|
|
0x4c = Tatar
|
|
0x4d = Tamil
|
|
0x4e = Tadzhik
|
|
0x4f = Swahili
|
|
0x50 = Sranan Tongo
|
|
0x51 = Somali
|
|
0x52 = Sinhalese
|
|
0x53 = Shona
|
|
0x54 = Serbo-croat
|
|
0x55 = Ruthenian
|
|
0x56 = Russian
|
|
0x57 = Quechua
|
|
0x58 = Pushtu
|
|
0x59 = Punjabi
|
|
0x5a = Persian
|
|
0x5b = Papamiento
|
|
0x5c = Oriya
|
|
0x5d = Nepali
|
|
0x5e = Ndebele
|
|
0x5f = Marathi
|
|
0x60 = Moldavian
|
|
0x61 = Malaysian
|
|
0x62 = Malagasay
|
|
0x63 = Macedonian
|
|
0x64 = Laotian
|
|
0x65 = Korean
|
|
0x66 = Khmer
|
|
0x67 = Kazakh
|
|
0x68 = Kannada
|
|
0x69 = Japanese
|
|
0x6a = Indonesian
|
|
0x6b = Hindi
|
|
0x6c = Hebrew
|
|
0x6d = Hausa
|
|
0x6e = Gurani
|
|
0x6f = Gujurati
|
|
0x70 = Greek
|
|
0x71 = Georgian
|
|
0x72 = Fulani
|
|
0x73 = Dari
|
|
0x74 = Churash
|
|
0x75 = Chinese
|
|
0x76 = Burmese
|
|
0x77 = Bulgarian
|
|
0x78 = Bengali
|
|
0x79 = Bielorussian
|
|
0x7a = Bambora
|
|
0x7b = Azerbaijani
|
|
0x7c = Assamese
|
|
0x7d = Armenian
|
|
0x7e = Arabic
|
|
0x7f = Amharic
|
|
E.g. these three packs
  42 : 8f 00 2a 00 01 01 03 00 06 05 04 05 07 06 01 02 48 65
  43 : 8f 01 2b 00 00 00 00 00 00 00 06 03 2c 00 00 00 c0 20
  44 : 8f 02 2c 00 00 00 00 00 09 00 00 00 00 00 00 00 11 45
decode to
  Byte :Value  Meaning
   0 : 01 = ASCII 7-bit
   1 : 01 = first track is 1
   2 : 03 = last track is 3
   3 : 00 = copyright (0 = public domain, 3 = copyrighted ?)
   4 : 06 = 6 packs of type 0x80
   5 : 05 = 5 packs of type 0x81
   6 : 04 = 4 packs of type 0x82
   7 : 05 = 5 packs of type 0x83
   8 : 07 = 7 packs of type 0x84
   9 : 06 = 6 packs of type 0x85
  10 : 01 = 1 pack of type 0x86
  11 : 02 = 2 packs of type 0x87
  12 : 00 = 0 packs of type 0x88
  13 : 00 = 0 packs of type 0x89
  14 : 00 00 00 00 = 0 packs of types 0x8a to 0x8d
  18 : 06 = 6 packs of type 0x8e
  19 : 03 = 3 packs of type 0x8f
  20 : 2c = last sequence for block 0
       This matches the sequence number of the last text pack (0x2c = 44)
  21 : 00 00 00 00 00 00 00 = last sequence numbers for block 1..7 (none)
  28 : 09 = language code for block 0: English
  29 : 00 00 00 00 00 00 00 = language codes for block 1..7 (none)
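The decoding of a 0x8f group can be sketched in C as follows. The struct and
the function are hypothetical names for this illustration and not part of
libburn. The sketch assumes that the three 18-byte packs are stored one after
the other in memory, and it does not check pack types or CRC.

```c
#include <string.h>

/* Hypothetical helper type, not part of libburn: the 36 byte record. */
struct cdtext_size_info {
    int char_code;       /* byte 0 */
    int first_track;     /* byte 1 */
    int last_track;      /* byte 2 */
    int copyright;       /* byte 3 */
    int pack_count[16];  /* bytes 4 to 19: counts of types 0x80 to 0x8f */
    int last_seq[8];     /* bytes 20 to 27: blocks 0 to 7 */
    int language[8];     /* bytes 28 to 35: blocks 0 to 7 */
};

/* packs must point to three consecutive 18-byte packs of type 0x8f. */
void cdtext_decode_size_info(const unsigned char *packs,
                             struct cdtext_size_info *info)
{
    unsigned char rec[36];
    int i;

    /* Join the three 12-byte payloads, skipping headers and CRCs */
    for (i = 0; i < 3; i++)
        memcpy(rec + 12 * i, packs + 18 * i + 4, 12);

    info->char_code = rec[0];
    info->first_track = rec[1];
    info->last_track = rec[2];
    info->copyright = rec[3];
    for (i = 0; i < 16; i++)
        info->pack_count[i] = rec[4 + i];
    for (i = 0; i < 8; i++) {
        info->last_seq[i] = rec[20 + i];
        info->language[i] = rec[28 + i];
    }
}
```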
|
|
|
|
|
|
-------------------------------------------------------------------------------
|
|
libburn API calls for CD-TEXT (see libburn/libburn.h for details):
|
|
|
|
libburn can retrieve the set of text packs from a CD:

  int burn_disc_get_leadin_text(struct burn_drive *d,
                                unsigned char **text_packs, int *num_packs,
                                int flag);
|
|
|
|
|
|
It can write a text pack set with a CD SAO session.

This set may be attached as an array of readily formatted text packs by:

  int burn_write_opts_set_leadin_text(struct burn_write_opts *opts,
                                      unsigned char *text_packs,
                                      int num_packs, int flag);

Alternatively it may be defined by attaching CD-TEXT attributes to burn_session
and burn_track:

  int burn_session_set_cdtext_par(struct burn_session *s,
                                  int char_codes[8], int copyrights[8],
                                  int languages[8], int flag);

  int burn_session_set_cdtext(struct burn_session *s, int block,
                              int pack_type, char *pack_type_name,
                              unsigned char *payload, int length, int flag);

  int burn_track_set_cdtext(struct burn_track *t, int block,
                            int pack_type, char *pack_type_name,
                            unsigned char *payload, int length, int flag);

These attributes can then be converted into an array of text packs by:

  int burn_cdtext_from_session(struct burn_session *s,
                               unsigned char **text_packs, int *num_packs,
                               int flag);

or they can be written as an array of text packs to CD when burning begins and
no array of pre-formatted packs was attached to the write options by
burn_write_opts_set_leadin_text().
|
|
|
|
There are calls for inspecting the attached attributes:

  int burn_session_get_cdtext_par(struct burn_session *s,
                                  int char_codes[8], int copyrights[8],
                                  int block_languages[8], int flag);

  int burn_session_get_cdtext(struct burn_session *s, int block,
                              int pack_type, char *pack_type_name,
                              unsigned char **payload, int *length, int flag);

  int burn_track_get_cdtext(struct burn_track *t, int block,
                            int pack_type, char *pack_type_name,
                            unsigned char **payload, int *length, int flag);

and for removing attached attributes:

  int burn_session_dispose_cdtext(struct burn_session *s, int block);

  int burn_track_dispose_cdtext(struct burn_track *t, int block);
|
|
|