Re: utf8 pragma, lexical scope

Jan Dubois wrote on 09.09.2010 at 13:13 (-0700):

> > Without the utf8 pragma, identifiers are not allowed to have
> > funny characters. (Yes, it was a stupid exercise.)
>
> The Perl parser is internally not UTF8-clean, so I would recommend
> not to use non-ASCII characters in variable names for now, even if
> it looks like it mostly works under "utf8".


Okay. I can certainly get by without non-ASCII variable names.
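
For illustration, a minimal sketch of what "getting by" could look like:
the identifier stays ASCII-only and the non-ASCII text lives in the data.
The variable name $groesse_label is made up for this example, and the
sketch assumes the file is saved as UTF-8.

```perl
use strict;
use warnings;
use utf8;    # the source file is UTF-8, so string literals decode to characters

# ASCII-only identifier (hypothetical name); the non-ASCII text is in the
# string value, not in the variable name.
my $groesse_label = "Größe";

# Under "use utf8" the literal counts as five characters. Without the
# pragma the same literal would be seven raw bytes.
print length($groesse_label), "\n";
```

This keeps "use utf8" for what it reliably does, namely decoding literals
in the source, without depending on how the parser stores variable names.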

From perltodo.pod:


| =head2 Properly Unicode safe tokeniser and pads.
|
| The tokeniser isn't actually very UTF-8 clean. C<use utf8;> is a
| hack - variable names are stored in stashes as raw bytes, without
| the utf-8 flag set. The pad API only takes a C<char *> pointer,
| so that's all bytes too. The tokeniser ignores the UTF-8-ness of
| C<PL_rsfp>, or any SVs returned from source filters.  All this
| could be fixed.


Thanks - I didn't know this doc.
-- 
Michael Ludwig