Bug#448184: Eterm and UTF-8



I looked at this, prompted by Jose Antonio Jimenez Madrid.

I discovered that:

  * configure.ac appears to have support for the utf-8 multibyte
    encoding, calling it `unicode', but in upstream it tries to detect
    whether to enable it by calling xlsfonts and seeing if the output
    contains the string `iso-10646'!  In the Debian package this is
    overridden.

  * In the source code this encoding is spelled variously utf-8,
    UTF-8, UTF8, iso-10646, etc.  Sometimes as strings, which are then
    compared.

  * There is one function which compares the system locale's encoding
    with a list of strings including "UTF8".  Of course the locale's
    name for it would be "UTF-8".  But fixing that does not help
    because:

  * Eventually we find this code:

        if (!strcasecmp(str, "utf8") || !strcasecmp(str, "ucs2")) {
            encoding_method = UCS2;
            multichar_decode = latin1;
        } else if (..

  * encoding_method is used by a state machine in
    screen.c:scr_add_lines, to decode terminal characters for writing
    into the screen array.  The state machine is an interleaved
    mixture of different multibyte character decoders, and it does not
    contain a UTF-8 decoder.

  * After all that I decided not to look at the encoding side.
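
To illustrate the naming problem in the list above: a minimal sketch of
a comparison that treats "UTF-8", "utf8" and "UTF_8" as the same name.
The function encoding_name_eq is hypothetical and does not exist in
Eterm; it only shows one way the spellings could be compared
consistently.  It deliberately does not treat iso-10646 as an alias,
since that would be a policy decision.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>

/* Hypothetical helper, not part of Eterm: compare two encoding names,
 * ignoring case and any '-' or '_' separators.
 * Returns 1 if the names match, 0 otherwise. */
static int encoding_name_eq(const char *a, const char *b)
{
    assert(a != NULL && b != NULL);

    for (;;) {
        /* Skip separators on both sides before comparing. */
        while (*a == '-' || *a == '_')
            a++;
        while (*b == '-' || *b == '_')
            b++;

        if (tolower((unsigned char)*a) != tolower((unsigned char)*b))
            return 0;
        if (*a == '\0')
            return 1;   /* both strings ended together */
        a++;
        b++;
    }
}
```

With something like this, a check such as
encoding_name_eq(str, "utf8") would accept whichever spelling the
locale or the configuration happens to use.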

I think that to fix this one would have to, _at least_:

  * Sort out all of the names of the encodings so that the plumbing
    and configuration work

  * Add a UTF-8 decoder to screen.c
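
For the second item, a rough sketch of what a byte-at-a-time UTF-8
decoder looks like as a state machine.  This is not Eterm code: the
names utf8_state and utf8_feed are made up for illustration, and how it
would fit into scr_add_lines (and whether the screen array can hold the
resulting code points) is a separate question.  Malformed input is
turned into U+FFFD, which is one common choice, not the only one.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define UTF8_REPLACEMENT 0xFFFDu

/* Hypothetical decoder state; zero-initialise before first use. */
typedef struct {
    uint32_t codepoint;  /* bits accumulated so far */
    uint32_t min;        /* smallest value allowed for this length */
    int      remaining;  /* continuation bytes still expected */
} utf8_state;

/* Feed one byte.  Stores 0, 1 or 2 code points in out[] and returns
 * how many were stored.  Two are produced only when a truncated
 * sequence is followed by a byte that is itself a complete character
 * or is invalid. */
static size_t utf8_feed(utf8_state *st, unsigned char byte, uint32_t out[2])
{
    size_t n = 0;

    assert(st != NULL && out != NULL);

    if (st->remaining > 0) {
        if ((byte & 0xC0) == 0x80) {
            st->codepoint = (st->codepoint << 6) | (byte & 0x3Fu);
            if (--st->remaining > 0)
                return 0;
            /* Reject overlong forms, surrogates and values > U+10FFFF. */
            if (st->codepoint < st->min || st->codepoint > 0x10FFFFu ||
                (st->codepoint >= 0xD800u && st->codepoint <= 0xDFFFu))
                out[n++] = UTF8_REPLACEMENT;
            else
                out[n++] = st->codepoint;
            return n;
        }
        /* Sequence cut short: report it, then reprocess this byte. */
        st->remaining = 0;
        out[n++] = UTF8_REPLACEMENT;
    }

    if (byte < 0x80) {
        out[n++] = byte;                       /* plain ASCII */
    } else if ((byte & 0xE0) == 0xC0) {
        st->codepoint = byte & 0x1Fu; st->min = 0x80u;    st->remaining = 1;
    } else if ((byte & 0xF0) == 0xE0) {
        st->codepoint = byte & 0x0Fu; st->min = 0x800u;   st->remaining = 2;
    } else if ((byte & 0xF8) == 0xF0) {
        st->codepoint = byte & 0x07u; st->min = 0x10000u; st->remaining = 3;
    } else {
        out[n++] = UTF8_REPLACEMENT;           /* stray or invalid byte */
    }
    return n;
}
```

The caller would keep one utf8_state per terminal and pass each
incoming byte through utf8_feed, handling whatever code points come
back.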

Very likely there is no UTF-8 encoder either.

It would probably be easier and more fruitful to add the wanted
features (or UI frills) from eterm to another terminal emulator.

Ian.

-- 
Ian Jackson <ijackson@chiark.greenend.org.uk>   These opinions are my own.

If I emailed you from an address @fyvzl.net or @evade.org.uk, that is
a private address which bypasses my fierce spamfilter.
