nmh-workers

Re: [Nmh-workers] nmh internals: full MIME integration

2014-07-29 21:42:06
Bio's formatted I/O routines are fully utf-8 aware.  E.g. when looking
to consume or output a character they know how many octets are required
to form any given unicode character.  The upside is you never have to
think at all about processing utf-8 -- it just happens.

So here's the thing ... right now we (mostly) don't have to think about
processing UTF-8.  We get bytes in from decoding and squirt them out.
There's no processing; we leave that up to the terminal.
We're essentially UTF-8 ignorant, the same way we're ISO-8859 ignorant.

Now it does matter when we're doing stuff like scan(1); we need to know
how many bytes have been consumed, and how many column positions we've
moved so we can format things correctly.  But ... I'm looking at Bio(2)
again, and I don't see how that helps us (we cannot assume 1 rune =
1 column position).  For this we use the POSIX wcwidth() routine, and
I don't see a Plan 9 equivalent.
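
To make the bytes-versus-columns distinction concrete, here is a minimal sketch of the kind of loop this requires, using the standard mbrtowc() and POSIX wcwidth(). The name display_width() is hypothetical and this is not the code nmh actually uses; it also assumes the caller has already done setlocale(LC_CTYPE, "") so that multibyte decoding follows the user's locale.

```c
#define _XOPEN_SOURCE 700   /* expose wcwidth() in <wchar.h> */
#include <assert.h>
#include <string.h>
#include <wchar.h>

/*
 * Hypothetical helper (not nmh code): return the number of terminal
 * columns occupied by the first `len` bytes of `s`, decoding them as
 * multibyte characters in the current locale.
 */
static int
display_width(const char *s, size_t len)
{
    mbstate_t st;
    int cols = 0;

    assert(s != NULL);
    memset(&st, 0, sizeof st);

    while (len > 0) {
        wchar_t wc;
        size_t n = mbrtowc(&wc, s, len, &st);

        if (n == 0)
            break;                      /* embedded NUL: stop counting */

        if (n == (size_t)-1 || n == (size_t)-2) {
            /* Invalid or truncated sequence: reset the decoder state,
             * count one column, and skip a single byte. */
            memset(&st, 0, sizeof st);
            n = 1;
            cols += 1;
        } else {
            int w = wcwidth(wc);
            if (w > 0)                  /* -1: non-printable, 0: combining */
                cols += w;
        }

        s += n;                         /* bytes consumed */
        len -= n;
    }
    return cols;
}
```

The point is that the byte count (n) and the column count (w) advance independently, which is exactly the bookkeeping a rune-aware I/O layer does not do for you.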

Another benefit is that print() and friends let you install custom
formatting verbs.  So you can do things like:

 int to_qp(Fmt *fmt) { /* convert data to quoted-printable */ }
 fmtinstall('Q', to_qp);

and then

 char *text = "some string with non-ascii text";
 print("%Q\n", text);

and the %Q conversion formats its output as quoted-printable on the
fly.  Similarly, you could define verbs that know how to encode stuff
in headers, escape and quote addresses as necessary, etc.  In some
situations this can really help improve the code's readability.
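
For illustration only, the conversion that such a verb would wrap might look roughly like the sketch below. It is written in portable C rather than against the Plan 9 Fmt interface, the name qp_encode() is hypothetical, and it is deliberately simplified: it does not insert the 76-column soft line breaks, and it encodes tabs and newlines as =09 and =0A instead of treating them specially.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical helper: write a quoted-printable encoding of the
 * NUL-terminated string `src` into `dst` (capacity `dstlen` bytes,
 * including the terminating NUL).
 * Returns the number of bytes written, not counting the NUL, or
 * (size_t)-1 if `dst` is too small.
 */
static size_t
qp_encode(const char *src, char *dst, size_t dstlen)
{
    static const char hex[] = "0123456789ABCDEF";
    size_t o = 0;

    assert(src != NULL && dst != NULL);

    for (; *src != '\0'; src++) {
        unsigned char c = (unsigned char)*src;

        if (c == ' ' || (c >= 33 && c <= 126 && c != '=')) {
            /* Printable ASCII other than '=' is copied through. */
            if (o + 1 >= dstlen)
                return (size_t)-1;
            dst[o++] = (char)c;
        } else {
            /* Everything else becomes "=XX" with uppercase hex. */
            if (o + 3 >= dstlen)
                return (size_t)-1;
            dst[o++] = '=';
            dst[o++] = hex[c >> 4];
            dst[o++] = hex[c & 0x0F];
        }
    }

    if (o >= dstlen)
        return (size_t)-1;
    dst[o] = '\0';
    return o;
}
```

A custom verb would pull its argument from the format state, run something like this over it, and hand the result back to the formatter.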

That might be interesting ... although I found out the hard way that
when it comes to RFC-2047 encoding, you need to keep a lot of state
around (see sbr/encode_rfc2047.c).  It's interesting, but I don't see
it as a good enough feature to switch to Bio(2) on its own.

--Ken

