On Fri, 10 Aug 2001, Martin Duerst wrote:
> At 12:17 01/08/08 -0700, Benjamin Franz wrote:
>> In UTF8 the 'frame' problem doesn't exist because character start
>> bytes _ALWAYS_ have bit eight set to 0 while continuation bytes _ALWAYS_
>> have bit eight set to 1. 'quotemeta' works fine if you use UTF8 as your
>> working encoding.
> Small correction: start bytes have the most significant bit as 0 or
> the two most significant bits as 11. Continuation bytes have the two
> most significant bits as 10.
Right. I got sloppy (fortunately not while actually writing code) - I
blame fatigue. :)
The self-framing property remains valid.
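
To make the bit patterns concrete, here is a minimal sketch in Python. It is
an illustration only, not code from this thread: the names byte_kind and
char_start are made up, and it assumes the input is already well-formed
UTF-8 (it does not reject bytes that can never appear in valid UTF-8).

```python
def byte_kind(b: int) -> str:
    """Classify one byte of a UTF-8 stream by its high bits."""
    if b & 0x80 == 0x00:
        return "ascii"         # 0xxxxxxx: a complete one-byte character
    if b & 0xC0 == 0x80:
        return "continuation"  # 10xxxxxx: never the first byte of a character
    return "lead"              # 11xxxxxx: first byte of a multi-byte character


def char_start(data: bytes, i: int) -> int:
    """Back up from offset i to the first byte of the character containing it."""
    while i > 0 and byte_kind(data[i]) == "continuation":
        i -= 1
    return i


# 'a', e-acute, euro sign: 61 | C3 A9 | E2 82 AC
data = "a\u00e9\u20ac".encode("utf-8")
print([byte_kind(b) for b in data])
print(char_start(data, 5))  # offset 5 is the last byte of the euro sign
```

Because an ASCII byte such as a backslash can never occur inside a multi-byte
sequence, a byte-oriented escaper cannot split a character, and a reader
dropped at an arbitrary offset can find the nearest boundary by skipping
continuation bytes.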
--
Benjamin Franz
Programs must be written for people to read, and only
incidentally for machines to execute.
---Abelson and Sussman