Clarified in man xorriso the roles of character sets

Thomas Schmitt 2014-01-02 13:58:58 +00:00
parent 3e03b758c3
commit d25733f677
3 changed files with 126 additions and 85 deletions

@@ -9,7 +9,7 @@
.\" First parameter, NAME, should be all caps .\" First parameter, NAME, should be all caps
.\" Second parameter, SECTION, should be 1-8, maybe w/ subsection .\" Second parameter, SECTION, should be 1-8, maybe w/ subsection
.\" other parameters are allowed: see man(7), man(1) .\" other parameters are allowed: see man(7), man(1)
.TH XORRISO 1 "Version 1.3.5, Dec 28, 2013" .TH XORRISO 1 "Version 1.3.5, Jan 02, 2014"
.\" Please adjust this date whenever revising the manpage. .\" Please adjust this date whenever revising the manpage.
.\" .\"
.\" Some roff macros, for reference: .\" Some roff macros, for reference:
@@ -3308,30 +3308,42 @@ on differently nationalized terminals.
The meanings of byte codes are defined in \fBcharacter sets\fR which have The meanings of byte codes are defined in \fBcharacter sets\fR which have
names. Shell command iconv \-l lists them. names. Shell command iconv \-l lists them.
.br .br
Character sets should not matter as long as only english alphanumeric The file names on hard disk are assumed to be encoded by the
\fBlocal character set\fR which is also used for the communication
with the user.
Byte codes 32 to 126 of the local character set must match the US\-ASCII
characters of the same code. ISO\-8859 and UTF\-8 fulfill this demand.
.br
By default, \fBxorriso\fR uses the character set as told by
shell command "locale" with argument "charmap". This may be influenced
by environment variables LC_ALL, LC_CTYPE, or LANG and should match the
expectations of the terminal.
In some situations it may be necessary to set it by command \-local_charset.
.br
Local character sets should not matter as long as only english alphanumeric
characters are used for file names or as long as all writers and readers characters are used for file names or as long as all writers and readers
of the media use the same character set. of the media use the same local character set.
Outside these constraints it may be necessary to let \fBxorriso\fR Outside these constraints it may be necessary to let \fBxorriso\fR
convert byte codes. convert byte codes from and to other character sets.
.br .br
There is an input conversion from input character set to the local character The Rock Ridge file names in ISO filesystems are assumed to be
set which applies when an ISO image gets loaded. A conversion from local encoded by the \fBinput character set\fR.
character set to the output character set is performed when an The Rock Ridge file names which get written with ISO filesystems will be
image tree gets written. The sets can be defined independently by commands encoded by the \fBoutput character set\fR.
.br
The sets can be defined independently by commands
\-in_charset and \-out_charset. Normally one will have both identical, if ever. \-in_charset and \-out_charset. Normally one will have both identical, if ever.
Other than the local character set, these two character sets may deviate
from US\-ASCII.
.br .br
If conversions are desired then \fBxorriso\fR needs to know the name of the The output character sets for Joliet and HFS+ are not influenced by these
local character set. \fBxorriso\fR can inquire the same info as commands. Joliet uses output character set UCS\-2 or UTF\-16. HFS+ uses UTF\-16.
shell command
"locale" with argument "charmap". This may be influenced by environment
variables LC_ALL, LC_CTYPE, or LANG and should match the expectations of
the terminal.
.br .br
The default output charset is the local character set of the terminal where The default output charset is the local character set of the terminal where
\fBxorriso\fR runs. So by default no conversion happens between local \fBxorriso\fR runs. So by default no conversion happens between local
filesystem filesystem
names and emerging names in the image. The situation stays ambigous and the names and emerging Rock Ridge names in the image. The situation stays
reader has to riddle what character set was used. ambigous and the reader has to riddle what character set was used.
.br .br
By command \-auto_charset it is possible to attribute the output charset name By command \-auto_charset it is possible to attribute the output charset name
to the image. This makes the situation unambigous. But if your terminal to the image. This makes the situation unambigous. But if your terminal

@@ -2897,25 +2897,36 @@ the same byte string may appear as different peculiar national
characters on differently nationalized terminals. The meanings of byte characters on differently nationalized terminals. The meanings of byte
codes are defined in *character sets* which have names. Shell command codes are defined in *character sets* which have names. Shell command
iconv -l lists them. iconv -l lists them.
Character sets should not matter as long as only english alphanumeric The file names on hard disk are assumed to be encoded by the *local
characters are used for file names or as long as all writers and readers character set* which is also used for the communication with the user.
of the media use the same character set. Outside these constraints it Byte codes 32 to 126 of the local character set must match the US-ASCII
may be necessary to let `xorriso' convert byte codes. characters of the same code. ISO-8859 and UTF-8 fulfill this demand.
There is an input conversion from input character set to the local By default, `xorriso' uses the character set as told by shell command
character set which applies when an ISO image gets loaded. A conversion "locale" with argument "charmap". This may be influenced by environment
from local character set to the output character set is performed when variables LC_ALL, LC_CTYPE, or LANG and should match the expectations
an image tree gets written. The sets can be defined independently by of the terminal. In some situations it may be necessary to set it by
commands -in_charset and -out_charset. Normally one will have both command -local_charset.
identical, if ever. Local character sets should not matter as long as only english
If conversions are desired then `xorriso' needs to know the name of the alphanumeric characters are used for file names or as long as all
local character set. `xorriso' can inquire the same info as shell writers and readers of the media use the same local character set.
command "locale" with argument "charmap". This may be influenced by Outside these constraints it may be necessary to let `xorriso' convert
environment variables LC_ALL, LC_CTYPE, or LANG and should match the byte codes from and to other character sets.
expectations of the terminal. The Rock Ridge file names in ISO filesystems are assumed to be encoded
by the *input character set*. The Rock Ridge file names which get
written with ISO filesystems will be encoded by the *output character
set*.
The sets can be defined independently by commands -in_charset and
-out_charset. Normally one will have both identical, if ever. Other
than the local character set, these two character sets may deviate from
US-ASCII.
The output character sets for Joliet and HFS+ are not influenced by
these commands. Joliet uses output character set UCS-2 or UTF-16. HFS+
uses UTF-16.
The default output charset is the local character set of the terminal The default output charset is the local character set of the terminal
where `xorriso' runs. So by default no conversion happens between local where `xorriso' runs. So by default no conversion happens between local
filesystem names and emerging names in the image. The situation stays filesystem names and emerging Rock Ridge names in the image. The
ambigous and the reader has to riddle what character set was used. situation stays ambigous and the reader has to riddle what character
set was used.
By command -auto_charset it is possible to attribute the output charset By command -auto_charset it is possible to attribute the output charset
name to the image. This makes the situation unambigous. But if your name to the image. This makes the situation unambigous. But if your
terminal character set does not match the character set of the local terminal character set does not match the character set of the local
@@ -4902,7 +4913,7 @@ File: xorriso.info, Node: CommandIdx, Next: ConceptIdx, Prev: Legal, Up: Top
* -cd sets working directory in ISO: Navigate. (line 7) * -cd sets working directory in ISO: Navigate. (line 7)
* -cdx sets working directory on disk: Navigate. (line 16) * -cdx sets working directory on disk: Navigate. (line 16)
* -changes_pending overrides change status: Writing. (line 13) * -changes_pending overrides change status: Writing. (line 13)
* -charset sets input/output character set: Charset. (line 43) * -charset sets input/output character set: Charset. (line 54)
* -check_md5 verifies file checksum: Verify. (line 154) * -check_md5 verifies file checksum: Verify. (line 154)
* -check_md5_r verifies file tree checksums: Verify. (line 170) * -check_md5_r verifies file tree checksums: Verify. (line 170)
* -check_media reads media block by block: Verify. (line 21) * -check_media reads media block by block: Verify. (line 21)
@@ -4991,7 +5002,7 @@ File: xorriso.info, Node: CommandIdx, Next: ConceptIdx, Prev: Legal, Up: Top
* -list_speeds lists available write speeds: Writing. (line 146) * -list_speeds lists available write speeds: Writing. (line 146)
* -lns creates ISO symbolic link: Insert. (line 176) * -lns creates ISO symbolic link: Insert. (line 176)
* -load addresses a particular session as input: Loading. (line 35) * -load addresses a particular session as input: Loading. (line 35)
* -local_charset sets terminal character set: Charset. (line 47) * -local_charset sets terminal character set: Charset. (line 58)
* -logfile logs output channels to file: Frontend. (line 20) * -logfile logs output channels to file: Frontend. (line 20)
* -ls lists files in ISO image: Navigate. (line 26) * -ls lists files in ISO image: Navigate. (line 26)
* -lsd lists files in ISO image: Navigate. (line 34) * -lsd lists files in ISO image: Navigate. (line 34)
@@ -5138,10 +5149,10 @@ File: xorriso.info, Node: ConceptIdx, Prev: CommandIdx, Up: Top
* cdrecord, Emulation: Emulation. (line 116) * cdrecord, Emulation: Emulation. (line 116)
* Character Set, _definition: Charset. (line 6) * Character Set, _definition: Charset. (line 6)
* Character Set, for input, -in_charset: Loading. (line 116) * Character Set, for input, -in_charset: Loading. (line 116)
* Character Set, for input/output, -charset: Charset. (line 43) * Character Set, for input/output, -charset: Charset. (line 54)
* Character Set, for output, -out_charset: SetWrite. (line 276) * Character Set, for output, -out_charset: SetWrite. (line 276)
* Character set, learn from image, -auto_charset: Loading. (line 122) * Character set, learn from image, -auto_charset: Loading. (line 122)
* Character Set, of terminal, -local_charset: Charset. (line 47) * Character Set, of terminal, -local_charset: Charset. (line 58)
* CHRP partition, _definition: Bootable. (line 158) * CHRP partition, _definition: Bootable. (line 158)
* Closed media, _definition: Media. (line 43) * Closed media, _definition: Media. (line 43)
* Comment, #: Scripting. (line 173) * Comment, #: Scripting. (line 173)
@@ -5227,6 +5238,7 @@ File: xorriso.info, Node: ConceptIdx, Prev: CommandIdx, Up: Top
* Image, set volume set id, -volset_id: SetWrite. (line 185) * Image, set volume set id, -volset_id: SetWrite. (line 185)
* Image, set volume timestamp, -volume_date: SetWrite. (line 212) * Image, set volume timestamp, -volume_date: SetWrite. (line 212)
* Image, show id strings, -pvd_info: Inquiry. (line 115) * Image, show id strings, -pvd_info: Inquiry. (line 115)
* Input Character Set, _definition: Charset. (line 25)
* Insert, enable overwriting, -overwrite: SetInsert. (line 127) * Insert, enable overwriting, -overwrite: SetInsert. (line 127)
* Insert, file exclusion absolute, -not_paths: SetInsert. (line 55) * Insert, file exclusion absolute, -not_paths: SetInsert. (line 55)
* Insert, file exclusion from file, -not_list: SetInsert. (line 67) * Insert, file exclusion from file, -not_list: SetInsert. (line 67)
@@ -5255,6 +5267,7 @@ File: xorriso.info, Node: ConceptIdx, Prev: CommandIdx, Up: Top
* Jigdo Template Extraction, _definition: Jigdo. (line 6) * Jigdo Template Extraction, _definition: Jigdo. (line 6)
* LBA, _definition: Drives. (line 17) * LBA, _definition: Drives. (line 17)
* List delimiter, _definition: Processing. (line 9) * List delimiter, _definition: Processing. (line 9)
* Local Character Set, _definition: Charset. (line 11)
* MBR, _definition: Extras. (line 26) * MBR, _definition: Extras. (line 26)
* MBR, set, -boot_image system_area=: Bootable. (line 126) * MBR, set, -boot_image system_area=: Bootable. (line 126)
* MD5, control handling, -md5: Loading. (line 183) * MD5, control handling, -md5: Loading. (line 183)
@@ -5284,6 +5297,7 @@ File: xorriso.info, Node: ConceptIdx, Prev: CommandIdx, Up: Top
* Navigate, tell disk working directory, -pwdx: Navigate. (line 23) * Navigate, tell disk working directory, -pwdx: Navigate. (line 23)
* Navigate, tell ISO working directory, -pwd: Navigate. (line 20) * Navigate, tell ISO working directory, -pwd: Navigate. (line 20)
* Next writeable address, -grow_blindly: AqDrive. (line 46) * Next writeable address, -grow_blindly: AqDrive. (line 46)
* Output Character Set, _definition: Charset. (line 26)
* Overwriteable media, _definition: Media. (line 14) * Overwriteable media, _definition: Media. (line 14)
* Ownership, global in ISO image, -uid: SetWrite. (line 282) * Ownership, global in ISO image, -uid: SetWrite. (line 282)
* Ownership, in ISO image, -chown: Manip. (line 49) * Ownership, in ISO image, -chown: Manip. (line 49)
@@ -5426,37 +5440,37 @@ Node: SetWrite107824
Node: Bootable128409 Node: Bootable128409
Node: Jigdo144799 Node: Jigdo144799
Node: Charset149046 Node: Charset149046
Node: Exception151808 Node: Exception152361
Node: DialogCtl157928 Node: DialogCtl158481
Node: Inquiry160526 Node: Inquiry161079
Node: Navigate166843 Node: Navigate167396
Node: Verify175141 Node: Verify175694
Node: Restore184173 Node: Restore184726
Node: Emulation191260 Node: Emulation191813
Node: Scripting201562 Node: Scripting202115
Node: Frontend209333 Node: Frontend209886
Node: Examples218940 Node: Examples219493
Node: ExDevices220118 Node: ExDevices220671
Node: ExCreate220777 Node: ExCreate221330
Node: ExDialog222062 Node: ExDialog222615
Node: ExGrowing223327 Node: ExGrowing223880
Node: ExModifying224132 Node: ExModifying224685
Node: ExBootable224636 Node: ExBootable225189
Node: ExCharset225188 Node: ExCharset225741
Node: ExPseudo226080 Node: ExPseudo226633
Node: ExCdrecord226978 Node: ExCdrecord227531
Node: ExMkisofs227295 Node: ExMkisofs227848
Node: ExGrowisofs228635 Node: ExGrowisofs229188
Node: ExException229770 Node: ExException230323
Node: ExTime230224 Node: ExTime230777
Node: ExIncBackup230683 Node: ExIncBackup231236
Node: ExRestore234663 Node: ExRestore235216
Node: ExRecovery235596 Node: ExRecovery236149
Node: Files236166 Node: Files236719
Node: Seealso237465 Node: Seealso238018
Node: Bugreport238188 Node: Bugreport238741
Node: Legal238769 Node: Legal239322
Node: CommandIdx239780 Node: CommandIdx240333
Node: ConceptIdx256442 Node: ConceptIdx256995
 
End Tag Table End Tag Table

@@ -50,7 +50,7 @@
@c man .\" First parameter, NAME, should be all caps @c man .\" First parameter, NAME, should be all caps
@c man .\" Second parameter, SECTION, should be 1-8, maybe w/ subsection @c man .\" Second parameter, SECTION, should be 1-8, maybe w/ subsection
@c man .\" other parameters are allowed: see man(7), man(1) @c man .\" other parameters are allowed: see man(7), man(1)
@c man .TH XORRISO 1 "Version 1.3.5, Dec 28, 2013" @c man .TH XORRISO 1 "Version 1.3.5, Jan 02, 2014"
@c man .\" Please adjust this date whenever revising the manpage. @c man .\" Please adjust this date whenever revising the manpage.
@c man .\" @c man .\"
@c man .\" Some roff macros, for reference: @c man .\" Some roff macros, for reference:
@@ -3861,30 +3861,45 @@ on differently nationalized terminals.
The meanings of byte codes are defined in @strong{character sets} which have The meanings of byte codes are defined in @strong{character sets} which have
names. Shell command iconv -l lists them. names. Shell command iconv -l lists them.
@* @*
Character sets should not matter as long as only english alphanumeric @cindex Local Character Set, _definition
The file names on hard disk are assumed to be encoded by the
@strong{local character set} which is also used for the communication
with the user.
Byte codes 32 to 126 of the local character set must match the US-ASCII
characters of the same code. ISO-8859 and UTF-8 fulfill this demand.
@*
By default, @command{xorriso} uses the character set as told by
shell command "locale" with argument "charmap". This may be influenced
by environment variables LC_ALL, LC_CTYPE, or LANG and should match the
expectations of the terminal.
In some situations it may be necessary to set it by command -local_charset.
@*
Local character sets should not matter as long as only english alphanumeric
characters are used for file names or as long as all writers and readers characters are used for file names or as long as all writers and readers
of the media use the same character set. of the media use the same local character set.
Outside these constraints it may be necessary to let @command{xorriso} Outside these constraints it may be necessary to let @command{xorriso}
convert byte codes. convert byte codes from and to other character sets.
@* @*
There is an input conversion from input character set to the local character @cindex Input Character Set, _definition
set which applies when an ISO image gets loaded. A conversion from local The Rock Ridge file names in ISO filesystems are assumed to be
character set to the output character set is performed when an encoded by the @strong{input character set}.
image tree gets written. The sets can be defined independently by commands @cindex Output Character Set, _definition
The Rock Ridge file names which get written with ISO filesystems will be
encoded by the @strong{output character set}.
@*
The sets can be defined independently by commands
-in_charset and -out_charset. Normally one will have both identical, if ever. -in_charset and -out_charset. Normally one will have both identical, if ever.
Other than the local character set, these two character sets may deviate
from US-ASCII.
@* @*
If conversions are desired then @command{xorriso} needs to know the name of the The output character sets for Joliet and HFS+ are not influenced by these
local character set. @command{xorriso} can inquire the same info as commands. Joliet uses output character set UCS-2 or UTF-16. HFS+ uses UTF-16.
shell command
"locale" with argument "charmap". This may be influenced by environment
variables LC_ALL, LC_CTYPE, or LANG and should match the expectations of
the terminal.
@* @*
The default output charset is the local character set of the terminal where The default output charset is the local character set of the terminal where
@command{xorriso} runs. So by default no conversion happens between local @command{xorriso} runs. So by default no conversion happens between local
filesystem filesystem
names and emerging names in the image. The situation stays ambigous and the names and emerging Rock Ridge names in the image. The situation stays
reader has to riddle what character set was used. ambigous and the reader has to riddle what character set was used.
@* @*
By command -auto_charset it is possible to attribute the output charset name By command -auto_charset it is possible to attribute the output charset name
to the image. This makes the situation unambigous. But if your terminal to the image. This makes the situation unambigous. But if your terminal