>> Par was a problem for a while, until I finally figured out that the
>> current round of 8-bit patches was the whole problem.
> I don't understand. Are you saying that unpatched par works for you on
> anything but ASCII? Unpatched par fails utterly for me on UTF-8.
Not exactly. I am saying that the various i18n patches were an utter failure
for me. I use the par out of MacPorts, and the key patch is this:
--- ./par.c.orig 2001-04-01 23:25:57.000000000 -0500
+++ ./par.c 2012-04-15 13:56:42.000000000 -0500
@@ -403,7 +403,8 @@
}
continue;
}
- if (isspace(c)) ch = ' ';
+ // Exclude non-breaking space from the class of space chars
+ if (isspace(c) && isascii(c)) ch = ' ';
else blank = 0;
additem(cbuf, &ch, errmsg);
if (*errmsg) goto rlcleanup;
Your problem is probably the same one I ran into: isspace() would classify
the byte 0xA0 as whitespace (a non-breaking space), and par would then
replace it with a "real" space, corrupting any multi-byte UTF-8 sequence
that contains 0xA0.
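To make the failure mode concrete, here is a minimal standalone sketch of the same guard, separate from par itself. The helper names is_ascii_space() and normalize_spaces() are hypothetical, and the explicit range test stands in for the isascii(c) call in the patch. The sample bytes rely on the fact that "à" (U+00E0) is encoded in UTF-8 as 0xC3 0xA0 and the non-breaking space (U+00A0) as 0xC2 0xA0, so both contain the byte in question.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>

/* Hypothetical helper: true only for whitespace in the 7-bit ASCII range.
 * The range test plays the role of isascii(c) in the patch. */
static int is_ascii_space(int c)
{
    return c >= 0 && c < 0x80 && isspace(c);
}

/* Hypothetical helper: replace each ASCII whitespace byte with ' ',
 * leaving bytes >= 0x80 (UTF-8 lead and continuation bytes) untouched. */
static void normalize_spaces(unsigned char *buf, size_t len)
{
    assert(buf != NULL);
    for (size_t i = 0; i < len; i++) {
        if (is_ascii_space(buf[i]))
            buf[i] = ' ';
    }
}
```

Fed the bytes 'a', 0xC3, 0xA0, '\t', 'b', the tab becomes a space while the 0xC3 0xA0 pair survives intact. Without the range test, the result would depend on whether the current locale's isspace() accepts 0xA0.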
This isn't perfect; in a perfect world you'd decode the bytes into
characters and call wcwidth() on each one, but it works well enough for me
in practice.
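For illustration only, the "perfect world" approach might look roughly like the sketch below. It is not part of the patch or of par. The helper name display_width() is hypothetical, and the code assumes the caller has already selected a UTF-8 locale with setlocale(LC_CTYPE, ""). It decodes with the standard mbrtowc() and sums the POSIX wcwidth() of each character.

```c
#define _XOPEN_SOURCE 700   /* exposes wcwidth() on glibc */
#include <assert.h>
#include <string.h>
#include <wchar.h>

/* Hypothetical helper: display width, in columns, of a multibyte string
 * decoded according to the current LC_CTYPE locale.
 * Returns -1 on an invalid sequence or a non-printable character. */
static int display_width(const char *s)
{
    assert(s != NULL);
    mbstate_t st;
    memset(&st, 0, sizeof st);      /* start in the initial shift state */
    size_t left = strlen(s);
    int cols = 0;

    while (left > 0) {
        wchar_t wc;
        size_t n = mbrtowc(&wc, s, left, &st);
        if (n == (size_t)-1 || n == (size_t)-2)
            return -1;              /* invalid or truncated sequence */
        if (n == 0)
            break;                  /* decoded a NUL; nothing more to do */
        int w = wcwidth(wc);
        if (w < 0)
            return -1;              /* control or non-printable character */
        cols += w;
        s += n;
        left -= n;
    }
    return cols;
}
```

A line-filling routine could then measure each word with display_width() instead of counting bytes. Which characters decode successfully, and how wide they are reported to be, depends on the active locale and the C library's tables.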
--Ken
_______________________________________________
Nmh-workers mailing list
Nmh-workers(_at_)nongnu(_dot_)org
https://lists.nongnu.org/mailman/listinfo/nmh-workers