Clarified in man xorriso the roles of character sets

2014-01-02 13:58:58 +00:00
parent 3e03b758c3
commit d25733f677
3 changed files with 126 additions and 85 deletions

@@ -50,7 +50,7 @@
@c man .\" First parameter, NAME, should be all caps
@c man .\" Second parameter, SECTION, should be 1-8, maybe w/ subsection
@c man .\" other parameters are allowed: see man(7), man(1)
@c man .TH XORRISO 1 "Version 1.3.5, Dec 28, 2013"
@c man .TH XORRISO 1 "Version 1.3.5, Jan 02, 2014"
@c man .\" Please adjust this date whenever revising the manpage.
@c man .\"
@c man .\" Some roff macros, for reference:
@@ -3861,30 +3861,45 @@ on differently nationalized terminals.
The meanings of byte codes are defined in @strong{character sets} which have
names. Shell command iconv -l lists them.
@*
Character sets should not matter as long as only English alphanumeric
@cindex Local Character Set, _definition
The file names on hard disk are assumed to be encoded by the
@strong{local character set} which is also used for the communication
with the user.
Byte codes 32 to 126 of the local character set must match the US-ASCII
characters of the same code. ISO-8859 and UTF-8 fulfill this demand.
@*
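The US-ASCII requirement on byte codes 32 to 126 can be checked mechanically. The following is a minimal sketch in Python, not part of xorriso; the helper name `is_ascii_compatible` is hypothetical, and the charset names are those known to Python's codec registry, which may differ from the names that `iconv -l` lists.

```python
def is_ascii_compatible(charset: str) -> bool:
    """Return True if byte codes 32..126 decode to the US-ASCII
    characters of the same code in the given character set."""
    for code in range(32, 127):
        try:
            ch = bytes([code]).decode(charset)
        except (UnicodeDecodeError, LookupError):
            # Byte is not a complete character, or the charset is unknown.
            return False
        if ch != chr(code):
            return False
    return True


if __name__ == "__main__":
    # ISO-8859-1 and UTF-8 are expected to pass; UTF-16 and EBCDIC (cp037) are not.
    for name in ("iso-8859-1", "utf-8", "utf-16", "cp037"):
        print(name, is_ascii_compatible(name))
```

A character set that fails this check would not be usable as local character set under the rule stated above.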
By default, @command{xorriso} uses the character set as told by
shell command "locale" with argument "charmap". This may be influenced
by environment variables LC_ALL, LC_CTYPE, or LANG and should match the
expectations of the terminal.
In some situations it may be necessary to set it by command -local_charset.
@*
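As an illustration of where this default comes from, the sketch below asks the C library for the codeset of the current locale, which is the same information that `locale charmap` prints. It uses Python's standard `locale` module on a POSIX system and is only an analogy; it does not show how xorriso itself performs the inquiry. The function name `local_charset_name` is hypothetical.

```python
import locale


def local_charset_name() -> str:
    """Return the character set name of the current locale,
    as influenced by LC_ALL, LC_CTYPE, or LANG."""
    try:
        # Adopt the locale settings from the environment.
        locale.setlocale(locale.LC_CTYPE, "")
    except locale.Error:
        # Unsupported settings in the environment: stay with the current locale.
        pass
    return locale.nl_langinfo(locale.CODESET)


if __name__ == "__main__":
    print(local_charset_name())
```

If the printed name does not match what the terminal actually displays, that is a situation where setting the name explicitly with -local_charset may be needed.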
Local character sets should not matter as long as only English alphanumeric
characters are used for file names or as long as all writers and readers
of the media use the same character set.
of the media use the same local character set.
Outside these constraints it may be necessary to let @command{xorriso}
convert byte codes.
convert byte codes from and to other character sets.
@*
There is an input conversion from input character set to the local character
set which applies when an ISO image gets loaded. A conversion from local
character set to the output character set is performed when an
image tree gets written. The sets can be defined independently by commands
@cindex Input Character Set, _definition
The Rock Ridge file names in ISO filesystems are assumed to be
encoded by the @strong{input character set}.
@cindex Output Character Set, _definition
The Rock Ridge file names which get written with ISO filesystems will be
encoded by the @strong{output character set}.
@*
The sets can be defined independently by commands
-in_charset and -out_charset. Normally one will have both identical, if ever.
Other than the local character set, these two character sets may deviate
from US-ASCII.
@*
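The effect of such a conversion on a single file name can be sketched as follows. This is an assumption-laden illustration in Python, comparable to what the `iconv` program does, and not xorriso's implementation; `convert_name` and the example names are hypothetical.

```python
def convert_name(raw: bytes, from_charset: str, to_charset: str) -> bytes:
    """Re-encode the byte codes of a file name from one character set
    to another, going through Unicode as intermediate form."""
    return raw.decode(from_charset).encode(to_charset)


if __name__ == "__main__":
    # A name with u-umlaut and sharp s, stored as ISO-8859-1 byte codes.
    iso_name = b"Gr\xfc\xdfe.txt"
    print(convert_name(iso_name, "iso-8859-1", "utf-8"))

    # A name made only of US-ASCII characters keeps its byte codes.
    print(convert_name(b"plain.txt", "iso-8859-1", "utf-8"))
```

The second call shows why character sets do not matter for names restricted to US-ASCII characters: their byte codes are identical before and after conversion.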
If conversions are desired then @command{xorriso} needs to know the name of the
local character set. @command{xorriso} can inquire the same info as
shell command
"locale" with argument "charmap". This may be influenced by environment
variables LC_ALL, LC_CTYPE, or LANG and should match the expectations of
the terminal.
The output character sets for Joliet and HFS+ are not influenced by these
commands. Joliet uses output character set UCS-2 or UTF-16. HFS+ uses UTF-16.
@*
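The difference between UCS-2 and UTF-16 only shows with characters outside the Basic Multilingual Plane. The sketch below is a hedged illustration in Python, assuming the big-endian byte order that Joliet prescribes; the helper `fits_ucs2` is hypothetical and nothing here is taken from xorriso's code.

```python
def fits_ucs2(name: str) -> bool:
    """Return True if every character of the name is representable in UCS-2,
    i.e. as a single 16-bit unit without a surrogate pair."""
    return all(ord(ch) <= 0xFFFF for ch in name)


if __name__ == "__main__":
    # U+00FC (u-umlaut) needs one 16-bit unit in both UCS-2 and UTF-16.
    print("\u00fc".encode("utf-16-be"), fits_ucs2("\u00fc"))

    # U+1D11E (musical G clef) needs a surrogate pair, which only UTF-16 offers.
    print("\U0001D11E".encode("utf-16-be"), fits_ucs2("\U0001D11E"))
```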
The default output charset is the local character set of the terminal where
@command{xorriso} runs. So by default no conversion happens between local
filesystem
names and emerging names in the image. The situation stays ambiguous and the
reader has to riddle what character set was used.
names and emerging Rock Ridge names in the image. The situation stays
ambiguous and the reader has to riddle what character set was used.
@*
By command -auto_charset it is possible to attribute the output charset name
to the image. This makes the situation unambiguous. But if your terminal