Clarified in man xorriso the roles of character sets

Thomas Schmitt 2014-01-02 13:58:58 +00:00
parent f58ee1db6b
commit 565be458c7
3 changed files with 126 additions and 85 deletions
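
Illustration (not part of the commit): a minimal Python sketch of the two properties the revised text describes, namely that byte codes 32 to 126 of the local character set must match US-ASCII, and that file names may have to be converted between the local character set and another one. The helper names ascii_compatible and convert_name are hypothetical and the sample file name is made up. This is not xorriso code and says nothing about how xorriso implements the conversion.

```python
# Hypothetical helpers for illustration only; not taken from xorriso.

def ascii_compatible(charset: str) -> bool:
    """Return True if byte codes 32..126 decode to the same characters as US-ASCII."""
    raw = bytes(range(32, 127))
    try:
        return raw.decode(charset) == raw.decode("ascii")
    except (UnicodeDecodeError, LookupError):
        # Undecodable bytes or unknown charset name: treat as not compatible.
        return False


def convert_name(name_bytes: bytes, from_charset: str, to_charset: str) -> bytes:
    """Re-encode a file name from one character set to another."""
    return name_bytes.decode(from_charset).encode(to_charset)


# Assumed scenario: local character set ISO-8859-1, output character set UTF-8.
local_name = "März.txt".encode("iso-8859-1")
rr_name = convert_name(local_name, "iso-8859-1", "utf-8")

print(ascii_compatible("iso-8859-1"), ascii_compatible("utf-8"), ascii_compatible("cp037"))
print(local_name, rr_name)
```

The same name occupies different byte strings in the two character sets, which is why a reader who assumes the wrong set sees other characters than the writer intended.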


@@ -9,7 +9,7 @@
.\" First parameter, NAME, should be all caps
.\" Second parameter, SECTION, should be 1-8, maybe w/ subsection
.\" other parameters are allowed: see man(7), man(1)
.TH XORRISO 1 "Version 1.3.5, Dec 28, 2013"
.TH XORRISO 1 "Version 1.3.5, Jan 02, 2014"
.\" Please adjust this date whenever revising the manpage.
.\"
.\" Some roff macros, for reference:
@@ -3308,30 +3308,42 @@ on differently nationalized terminals.
The meanings of byte codes are defined in \fBcharacter sets\fR which have
names. Shell command iconv \-l lists them.
.br
Character sets should not matter as long as only english alphanumeric
The file names on hard disk are assumed to be encoded by the
\fBlocal character set\fR which is also used for the communication
with the user.
Byte codes 32 to 126 of the local character set must match the US\-ASCII
characters of the same code. ISO\-8859 and UTF\-8 fulfill this demand.
.br
By default, \fBxorriso\fR uses the character set as told by
shell command "locale" with argument "charmap". This may be influenced
by environment variables LC_ALL, LC_CTYPE, or LANG and should match the
expectations of the terminal.
In some situations it may be necessary to set it by command \-local_charset.
.br
Local character sets should not matter as long as only english alphanumeric
characters are used for file names or as long as all writers and readers
of the media use the same character set.
of the media use the same local character set.
Outside these constraints it may be necessary to let \fBxorriso\fR
convert byte codes.
convert byte codes from and to other character sets.
.br
There is an input conversion from input character set to the local character
set which applies when an ISO image gets loaded. A conversion from local
character set to the output character set is performed when an
image tree gets written. The sets can be defined independently by commands
The Rock Ridge file names in ISO filesystems are assumed to be
encoded by the \fBinput character set\fR.
The Rock Ridge file names which get written with ISO filesystems will be
encoded by the \fBoutput character set\fR.
.br
The sets can be defined independently by commands
\-in_charset and \-out_charset. Normally one will have both identical, if ever.
Other than the local character set, these two character sets may deviate
from US\-ASCII.
.br
If conversions are desired then \fBxorriso\fR needs to know the name of the
local character set. \fBxorriso\fR can inquire the same info as
shell command
"locale" with argument "charmap". This may be influenced by environment
variables LC_ALL, LC_CTYPE, or LANG and should match the expectations of
the terminal.
The output character sets for Joliet and HFS+ are not influenced by these
commands. Joliet uses output character set UCS\-2 or UTF\-16. HFS+ uses UTF\-16.
.br
The default output charset is the local character set of the terminal where
\fBxorriso\fR runs. So by default no conversion happens between local
filesystem
names and emerging names in the image. The situation stays ambigous and the
reader has to riddle what character set was used.
names and emerging Rock Ridge names in the image. The situation stays
ambigous and the reader has to riddle what character set was used.
.br
By command \-auto_charset it is possible to attribute the output charset name
to the image. This makes the situation unambigous. But if your terminal


@@ -2897,25 +2897,36 @@ the same byte string may appear as different peculiar national
characters on differently nationalized terminals. The meanings of byte
codes are defined in *character sets* which have names. Shell command
iconv -l lists them.
Character sets should not matter as long as only english alphanumeric
characters are used for file names or as long as all writers and readers
of the media use the same character set. Outside these constraints it
may be necessary to let `xorriso' convert byte codes.
There is an input conversion from input character set to the local
character set which applies when an ISO image gets loaded. A conversion
from local character set to the output character set is performed when
an image tree gets written. The sets can be defined independently by
commands -in_charset and -out_charset. Normally one will have both
identical, if ever.
If conversions are desired then `xorriso' needs to know the name of the
local character set. `xorriso' can inquire the same info as shell
command "locale" with argument "charmap". This may be influenced by
environment variables LC_ALL, LC_CTYPE, or LANG and should match the
expectations of the terminal.
The file names on hard disk are assumed to be encoded by the *local
character set* which is also used for the communication with the user.
Byte codes 32 to 126 of the local character set must match the US-ASCII
characters of the same code. ISO-8859 and UTF-8 fulfill this demand.
By default, `xorriso' uses the character set as told by shell command
"locale" with argument "charmap". This may be influenced by environment
variables LC_ALL, LC_CTYPE, or LANG and should match the expectations
of the terminal. In some situations it may be necessary to set it by
command -local_charset.
Local character sets should not matter as long as only english
alphanumeric characters are used for file names or as long as all
writers and readers of the media use the same local character set.
Outside these constraints it may be necessary to let `xorriso' convert
byte codes from and to other character sets.
The Rock Ridge file names in ISO filesystems are assumed to be encoded
by the *input character set*. The Rock Ridge file names which get
written with ISO filesystems will be encoded by the *output character
set*.
The sets can be defined independently by commands -in_charset and
-out_charset. Normally one will have both identical, if ever. Other
than the local character set, these two character sets may deviate from
US-ASCII.
The output character sets for Joliet and HFS+ are not influenced by
these commands. Joliet uses output character set UCS-2 or UTF-16. HFS+
uses UTF-16.
The default output charset is the local character set of the terminal
where `xorriso' runs. So by default no conversion happens between local
filesystem names and emerging names in the image. The situation stays
ambigous and the reader has to riddle what character set was used.
filesystem names and emerging Rock Ridge names in the image. The
situation stays ambigous and the reader has to riddle what character
set was used.
By command -auto_charset it is possible to attribute the output charset
name to the image. This makes the situation unambigous. But if your
terminal character set does not match the character set of the local
@@ -4902,7 +4913,7 @@ File: xorriso.info, Node: CommandIdx, Next: ConceptIdx, Prev: Legal, Up: Top
* -cd sets working directory in ISO: Navigate. (line 7)
* -cdx sets working directory on disk: Navigate. (line 16)
* -changes_pending overrides change status: Writing. (line 13)
* -charset sets input/output character set: Charset. (line 43)
* -charset sets input/output character set: Charset. (line 54)
* -check_md5 verifies file checksum: Verify. (line 154)
* -check_md5_r verifies file tree checksums: Verify. (line 170)
* -check_media reads media block by block: Verify. (line 21)
@@ -4991,7 +5002,7 @@ File: xorriso.info, Node: CommandIdx, Next: ConceptIdx, Prev: Legal, Up: Top
* -list_speeds lists available write speeds: Writing. (line 146)
* -lns creates ISO symbolic link: Insert. (line 176)
* -load addresses a particular session as input: Loading. (line 35)
* -local_charset sets terminal character set: Charset. (line 47)
* -local_charset sets terminal character set: Charset. (line 58)
* -logfile logs output channels to file: Frontend. (line 20)
* -ls lists files in ISO image: Navigate. (line 26)
* -lsd lists files in ISO image: Navigate. (line 34)
@@ -5138,10 +5149,10 @@ File: xorriso.info, Node: ConceptIdx, Prev: CommandIdx, Up: Top
* cdrecord, Emulation: Emulation. (line 116)
* Character Set, _definition: Charset. (line 6)
* Character Set, for input, -in_charset: Loading. (line 116)
* Character Set, for input/output, -charset: Charset. (line 43)
* Character Set, for input/output, -charset: Charset. (line 54)
* Character Set, for output, -out_charset: SetWrite. (line 276)
* Character set, learn from image, -auto_charset: Loading. (line 122)
* Character Set, of terminal, -local_charset: Charset. (line 47)
* Character Set, of terminal, -local_charset: Charset. (line 58)
* CHRP partition, _definition: Bootable. (line 158)
* Closed media, _definition: Media. (line 43)
* Comment, #: Scripting. (line 173)
@@ -5227,6 +5238,7 @@ File: xorriso.info, Node: ConceptIdx, Prev: CommandIdx, Up: Top
* Image, set volume set id, -volset_id: SetWrite. (line 185)
* Image, set volume timestamp, -volume_date: SetWrite. (line 212)
* Image, show id strings, -pvd_info: Inquiry. (line 115)
* Input Character Set, _definition: Charset. (line 25)
* Insert, enable overwriting, -overwrite: SetInsert. (line 127)
* Insert, file exclusion absolute, -not_paths: SetInsert. (line 55)
* Insert, file exclusion from file, -not_list: SetInsert. (line 67)
@@ -5255,6 +5267,7 @@ File: xorriso.info, Node: ConceptIdx, Prev: CommandIdx, Up: Top
* Jigdo Template Extraction, _definition: Jigdo. (line 6)
* LBA, _definition: Drives. (line 17)
* List delimiter, _definition: Processing. (line 9)
* Local Character Set, _definition: Charset. (line 11)
* MBR, _definition: Extras. (line 26)
* MBR, set, -boot_image system_area=: Bootable. (line 126)
* MD5, control handling, -md5: Loading. (line 183)
@@ -5284,6 +5297,7 @@ File: xorriso.info, Node: ConceptIdx, Prev: CommandIdx, Up: Top
* Navigate, tell disk working directory, -pwdx: Navigate. (line 23)
* Navigate, tell ISO working directory, -pwd: Navigate. (line 20)
* Next writeable address, -grow_blindly: AqDrive. (line 46)
* Output Character Set, _definition: Charset. (line 26)
* Overwriteable media, _definition: Media. (line 14)
* Ownership, global in ISO image, -uid: SetWrite. (line 282)
* Ownership, in ISO image, -chown: Manip. (line 49)
@@ -5426,37 +5440,37 @@ Node: SetWrite107824
Node: Bootable128409
Node: Jigdo144799
Node: Charset149046
Node: Exception151808
Node: DialogCtl157928
Node: Inquiry160526
Node: Navigate166843
Node: Verify175141
Node: Restore184173
Node: Emulation191260
Node: Scripting201562
Node: Frontend209333
Node: Examples218940
Node: ExDevices220118
Node: ExCreate220777
Node: ExDialog222062
Node: ExGrowing223327
Node: ExModifying224132
Node: ExBootable224636
Node: ExCharset225188
Node: ExPseudo226080
Node: ExCdrecord226978
Node: ExMkisofs227295
Node: ExGrowisofs228635
Node: ExException229770
Node: ExTime230224
Node: ExIncBackup230683
Node: ExRestore234663
Node: ExRecovery235596
Node: Files236166
Node: Seealso237465
Node: Bugreport238188
Node: Legal238769
Node: CommandIdx239780
Node: ConceptIdx256442
Node: Exception152361
Node: DialogCtl158481
Node: Inquiry161079
Node: Navigate167396
Node: Verify175694
Node: Restore184726
Node: Emulation191813
Node: Scripting202115
Node: Frontend209886
Node: Examples219493
Node: ExDevices220671
Node: ExCreate221330
Node: ExDialog222615
Node: ExGrowing223880
Node: ExModifying224685
Node: ExBootable225189
Node: ExCharset225741
Node: ExPseudo226633
Node: ExCdrecord227531
Node: ExMkisofs227848
Node: ExGrowisofs229188
Node: ExException230323
Node: ExTime230777
Node: ExIncBackup231236
Node: ExRestore235216
Node: ExRecovery236149
Node: Files236719
Node: Seealso238018
Node: Bugreport238741
Node: Legal239322
Node: CommandIdx240333
Node: ConceptIdx256995

End Tag Table


@@ -50,7 +50,7 @@
@c man .\" First parameter, NAME, should be all caps
@c man .\" Second parameter, SECTION, should be 1-8, maybe w/ subsection
@c man .\" other parameters are allowed: see man(7), man(1)
@c man .TH XORRISO 1 "Version 1.3.5, Dec 28, 2013"
@c man .TH XORRISO 1 "Version 1.3.5, Jan 02, 2014"
@c man .\" Please adjust this date whenever revising the manpage.
@c man .\"
@c man .\" Some roff macros, for reference:
@@ -3861,30 +3861,45 @@ on differently nationalized terminals.
The meanings of byte codes are defined in @strong{character sets} which have
names. Shell command iconv -l lists them.
@*
Character sets should not matter as long as only english alphanumeric
@cindex Local Character Set, _definition
The file names on hard disk are assumed to be encoded by the
@strong{local character set} which is also used for the communication
with the user.
Byte codes 32 to 126 of the local character set must match the US-ASCII
characters of the same code. ISO-8859 and UTF-8 fulfill this demand.
@*
By default, @command{xorriso} uses the character set as told by
shell command "locale" with argument "charmap". This may be influenced
by environment variables LC_ALL, LC_CTYPE, or LANG and should match the
expectations of the terminal.
In some situations it may be necessary to set it by command -local_charset.
@*
Local character sets should not matter as long as only english alphanumeric
characters are used for file names or as long as all writers and readers
of the media use the same character set.
of the media use the same local character set.
Outside these constraints it may be necessary to let @command{xorriso}
convert byte codes.
convert byte codes from and to other character sets.
@*
There is an input conversion from input character set to the local character
set which applies when an ISO image gets loaded. A conversion from local
character set to the output character set is performed when an
image tree gets written. The sets can be defined independently by commands
@cindex Input Character Set, _definition
The Rock Ridge file names in ISO filesystems are assumed to be
encoded by the @strong{input character set}.
@cindex Output Character Set, _definition
The Rock Ridge file names which get written with ISO filesystems will be
encoded by the @strong{output character set}.
@*
The sets can be defined independently by commands
-in_charset and -out_charset. Normally one will have both identical, if ever.
Other than the local character set, these two character sets may deviate
from US-ASCII.
@*
If conversions are desired then @command{xorriso} needs to know the name of the
local character set. @command{xorriso} can inquire the same info as
shell command
"locale" with argument "charmap". This may be influenced by environment
variables LC_ALL, LC_CTYPE, or LANG and should match the expectations of
the terminal.
The output character sets for Joliet and HFS+ are not influenced by these
commands. Joliet uses output character set UCS-2 or UTF-16. HFS+ uses UTF-16.
@*
The default output charset is the local character set of the terminal where
@command{xorriso} runs. So by default no conversion happens between local
filesystem
names and emerging names in the image. The situation stays ambigous and the
reader has to riddle what character set was used.
names and emerging Rock Ridge names in the image. The situation stays
ambigous and the reader has to riddle what character set was used.
@*
By command -auto_charset it is possible to attribute the output charset name
to the image. This makes the situation unambigous. But if your terminal