[bug #12314] linebreak not utf-8 aware


Follow-up Comment #2, bug #12314 (project mhonarc):

Applied a fix that works with versions of Perl that have
the Encode module installed (i.e., Perl >= 5.8).  The fix converts
the input to Perl's internal utf-8 so that line breaking is done on
characters rather than bytes.  The data is translated back after the
operation.
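
For illustration only, the approach described above might look roughly
like the sketch below.  This is not the code that was committed:
break_lines() and its parameters are hypothetical names, and the
breaking rule (a hard cut at maxwidth characters) is a simplifying
assumption.  Only decode() and encode() from the Encode module are
real calls.

```perl
use strict;
use warnings;
use Encode qw(decode encode);

# Hypothetical helper, not mhonarc's actual routine: break $bytes
# (encoded in $charset) so no line exceeds $maxwidth characters.
sub break_lines {
    my ($bytes, $charset, $maxwidth) = @_;

    # Bytes -> Perl's internal character string, so length() and
    # substr() count characters instead of bytes.
    my $text = decode($charset, $bytes);

    my @out;
    for my $line (split /\n/, $text, -1) {
        # Cut off $maxwidth characters at a time (assumed hard break).
        while (length($line) > $maxwidth) {
            push @out, substr($line, 0, $maxwidth, '');
        }
        push @out, $line;
    }

    # Characters -> bytes in the original charset.
    return encode($charset, join("\n", @out));
}
```

Because the cut is made on the decoded string, a multi-byte character
is never split across two lines, which is the failure the bug report
describes.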

To avoid unnecessary translations, translation is not done for
us-ascii or the iso-8859 sets, the common 7/8-bit charsets.  Of course,
conversion could be avoided for other 8-bit sets as well, but the charset
check may cause more overhead than the translation, so only the common
ones are checked for.
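
A minimal sketch of such a check, assuming the charset name is
available as a plain string.  The function name and the exact pattern
are hypothetical and may differ from what was committed.

```perl
use strict;
use warnings;

# Hypothetical predicate: true for the common single-byte charsets,
# where breaking on bytes is already the same as breaking on
# characters, so the utf-8 round trip can be skipped.
sub is_single_byte_charset {
    my ($charset) = @_;
    return $charset =~ /\A(?:us-ascii|iso-8859-\d+)\z/i ? 1 : 0;
}
```

A single case-insensitive match keeps the test cheap, in line with the
concern that a longer list of charsets could cost more than the
translation it avoids.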

Anyone who can, please test the changes to make sure nothing
broke.  Note that translation is only done when maxwidth is set.
The changes will be available in the next snapshot build.

Side Comment: For the future, it is worth considering having
all textual content normalized to Perl's internal utf-8
format.  Of course, such a change would make mhonarc
incompatible with versions of Perl < 5.8.


    _______________________________________________________

Reply to this item at:

  <http://savannah.nongnu.org/bugs/?func=detailitem&item_id=12314>
