Bug#372095: xdvi: should paste ligatures as such when UTF-8 is available



Samuel Thibault wrote:

> When pasting ligatures, they are expanded, which is fine for non-UTF-8
> environments. But when a UTF-8 transfer is possible, maybe they should rather
> be transmitted as such? For instance,
> http://dept-info.labri.fr/~thibault/tmp/ligature.dvi
[...]
> Locale: LANG=fr_FR@euro, LC_CTYPE=fr_FR@euro (charmap=ISO-8859-15)

When do you think a UTF-8 transfer would be possible? You are not
working in a UTF-8 locale. However, I am actually surprised that you
get something reasonable at all. After all, in place of the 'ffi'
ligature the DVI file contains only the character code 0x0E for OT1
encoding (as in your example) or 0x1E for T1 encoding. In both cases
copy & paste gives the correct sequence of characters.
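
To make the point about the slot codes concrete, here is a minimal
sketch of how such a code could be turned back into plain characters.
The tables list the f-ligature slots of the OT1 and T1 encodings; the
function name `decompose` and the fallback to `chr()` are my own
simplifications for illustration and are not taken from xdvi's sources.

```python
# Hypothetical illustration only; xdvi's real implementation may differ.
# Slots of the f-ligatures in the two TeX font encodings.
OT1_LIGATURES = {0x0B: "ff", 0x0C: "fi", 0x0D: "fl", 0x0E: "ffi", 0x0F: "ffl"}
T1_LIGATURES = {0x1B: "ff", 0x1C: "fi", 0x1D: "fl", 0x1E: "ffi", 0x1F: "ffl"}


def decompose(codes, table):
    """Turn DVI character codes into text, expanding ligature slots.

    Codes not found in the table are passed through chr(), which is a
    simplification: it is only right for slots that coincide with ASCII.
    """
    return "".join(table.get(code, chr(code)) for code in codes)


# The word "office" as it might be set in an OT1 font: o, <ffi>, c, e.
print(decompose([0x6F, 0x0E, 0x63, 0x65], OT1_LIGATURES))
```

The same word set in a T1 font would carry 0x1E in place of 0x0E and
would need the second table.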

BTW, I don't know how xdvi works internally here, but generally the
same text extraction functions are used for searching and copying. For
text searching, however, it is important that a search for the plain
letters also matches text set with the ligature.

In addition, AFAIK the only reason a very small set of ligatures is
part of Unicode is that they were defined in legacy encodings like
AdobeExpert. Normally Unicode wants to encode characters, not glyphs,
and the 'ffi' ligature is just a special glyph for the character
sequence 'ffi'. So from my understanding of the intention of Unicode,
xdvi's behaviour of decomposing ligatures is correct even in a locale
using Unicode. See
<URL:http://www.unicode.org/versions/Unicode4.0.0/ch02.pdf>
for more details.
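
As a small illustration of this view, the Unicode character database
gives the ligature code point U+FB03 a compatibility decomposition to
the three letters, so compatibility normalization recovers 'ffi'. The
sketch below uses Python's standard `unicodedata` module; the helper
`contains` is a hypothetical name of my own and only shows why
searching benefits from the decomposed form.

```python
import unicodedata

LIGATURE_FFI = "\uFB03"  # LATIN SMALL LIGATURE FFI


def contains(text, needle):
    """Search after compatibility normalization, so that text containing
    ligature code points still matches a needle typed as plain letters."""
    norm = lambda s: unicodedata.normalize("NFKD", s)
    return norm(needle) in norm(text)


# The ligature decomposes to the plain character sequence.
print(unicodedata.normalize("NFKD", LIGATURE_FFI))

# A naive substring test misses the word, the normalized one finds it.
pasted = "o" + LIGATURE_FFI + "ce"
print("office" in pasted, contains(pasted, "office"))
```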

cheerio
ralf



