Re: [Nmh-workers] colorized/highlighted scan output?

2012-11-01 21:38:41
> I was thinking of looking for ANSI sequences and not counting
> them.  But I don't know if that could get into trouble with
> multibyte characters.  mbtowc() is too much of a mystery to me.

Well, this is where things get "funky".

In the particular case of UTF-8, the only magical bytes are ones
with the high bit set; bytes < 128 are handled "normally".
So assuming you're using the "normal" ANSI escape sequences (and
you're not using 0x9b as a CSI), the multibyte routines will treat
each byte of them as an ordinary single-byte character.
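
To make the byte-level claim concrete, here is a minimal sketch. The helper
names all_ascii() and all_high_bit() are made up for illustration and are not
part of nmh. It only inspects the high bit of each byte, so it does not
depend on the locale.

```c
#include <stddef.h>

/* Hypothetical helper: 1 if every byte in s[0..n) has the high bit clear. */
static int all_ascii(const char *s, size_t n)
{
    for (size_t i = 0; i < n; i++)
        if ((unsigned char)s[i] & 0x80)
            return 0;
    return 1;
}

/* Hypothetical helper: 1 if every byte in s[0..n) has the high bit set. */
static int all_high_bit(const char *s, size_t n)
{
    for (size_t i = 0; i < n; i++)
        if (!((unsigned char)s[i] & 0x80))
            return 0;
    return 1;
}
```

With these, all_ascii("\x1b[31m", 5) is true for the 7-bit "ESC [ 3 1 m"
sequence, all_high_bit("\xc3\xa9", 2) is true for the UTF-8 encoding of
U+00E9, and a sequence introduced by the single byte 0x9b fails all_ascii().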

If you care, what we do in fmt_scan with the multibyte routines is this:

- Use mbtowc() to convert a possible multibyte character (example: anything
  in UTF-8 U+0080 or greater) into a "wide" character.

- mbtowc() tells us the number of bytes that character consumed.  For ASCII,
  it's always 1.  For UTF-8, sometimes it's > 1.  If we don't have enough
  room in the buffer for a complete character, we stop.

- We use wcwidth() to see how many columns that character consumes, and
  use that to make sure we don't overrun our field width.

- We then copy the bytes over for that character (using the byte count we
  got from mbtowc()).
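
The steps above can be sketched roughly as follows. This is not the actual
fmt_scan() code: copy_columns() and its parameters are hypothetical, and
treating non-printable characters (where wcwidth() returns -1) as zero
columns is an assumption made only for this sketch. For UTF-8 input the
caller would need to have called setlocale(LC_ALL, "") under a UTF-8 locale
first.

```c
#define _XOPEN_SOURCE 700   /* expose wcwidth() */
#include <stdlib.h>         /* mbtowc */
#include <string.h>         /* memcpy, strlen */
#include <wchar.h>          /* wcwidth */

/*
 * Hypothetical helper, not nmh code: copy src into dst, stopping before
 * the output would exceed `width` columns or dstsize-1 bytes.
 * Returns the number of columns used; dst is always NUL-terminated.
 */
static int copy_columns(char *dst, size_t dstsize, const char *src, int width)
{
    size_t remaining = strlen(src);
    size_t used = 0;
    int cols = 0;

    if (dstsize == 0)
        return 0;

    mbtowc(NULL, NULL, 0);              /* reset the conversion state */

    while (remaining > 0) {
        wchar_t wc;
        int nbytes = mbtowc(&wc, src, remaining);

        if (nbytes <= 0)
            break;                      /* invalid or incomplete sequence */
        if (used + (size_t)nbytes >= dstsize)
            break;                      /* no room for a complete character */

        int w = wcwidth(wc);
        if (w < 0)
            w = 0;                      /* assumption: non-printable = 0 columns */
        if (cols + w > width)
            break;                      /* would overrun the field width */

        memcpy(dst + used, src, (size_t)nbytes);
        used += (size_t)nbytes;
        src += nbytes;
        remaining -= (size_t)nbytes;
        cols += w;
    }
    dst[used] = '\0';
    return cols;
}
```

For example, in the default "C" locale, copy_columns(buf, sizeof buf,
"hello world", 5) stores "hello" in buf and returns 5.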

But it occurs to me that we shouldn't actually do any of this for a "don't
count this" format escape, because that stuff should live outside of
the normal string handling routines in fmt_scan().  Also, I'm with Tom that
I'm not so crazy about putting knowledge of ANSI escape sequences directly
into fmt_scan(), because who knows if your terminal supports them?

David, do you want to implement this?

--Ken
