Re: BOM and principle of least surprise

At 8:56 AM +0300 5/5/04, Jarkko Hietaniemi wrote:

Paul Hoffman wrote:

 At 2:17 PM -0800 3/31/04, Larry Wall wrote:

Perl 6 will assume that script is in some kind of recognizable Unicode
encoding, any of:

    UTF-8
    UTF-16
    UTF-32
    SCSU

Of those, probably only SCSU requires a BOM, since Perl scripts are almost
certain to be strict ASCII in the first few bytes where it matters.

If it starts parsing as UTF-8, and runs into trouble, it might or might
not try to intuit the real encoding.  Haven't really decided that yet.

You can always explicitly switch the encoding with "use encoding" or
some such.


 Is it too late in the Perl 6 process to ask for fewer options here?


Ummm, why?  Giving fewer options to users has never been a strong Perl
tradition :-)  Besides, recognizing the various Unicode encodings is
pretty trivial, especially if we know something that's likely to be
present in the first line, like "perl".

My mistake. When I saw "script" in the first line, I assumed we were talking about subsections of Unicode, not "script as in a program". You're right about doing a guess based on looking for "perl" being pretty definitive.
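For illustration only, here is a rough sketch of that kind of guess, written in Python rather than anything Perl 6 actually does. It assumes the script starts with ASCII characters (for example "#!"), so the position of the zero bytes in the first four bytes gives away UTF-16 and UTF-32 even without a BOM. The function name `guess_encoding` and the returned labels are made up for this example; "scsu" is only a label here, since Python ships no SCSU codec.

```python
import codecs

def guess_encoding(head: bytes) -> str:
    """Guess the encoding of a script from its first few bytes.

    Assumption: the first two characters of the script are ASCII.
    """
    # Check byte order marks first. UTF-32 comes before UTF-16 because
    # the UTF-32LE BOM begins with the same two bytes as the UTF-16LE BOM.
    boms = [
        (codecs.BOM_UTF32_LE, "utf-32-le"),
        (codecs.BOM_UTF32_BE, "utf-32-be"),
        (codecs.BOM_UTF8, "utf-8"),
        (b"\x0e\xfe\xff", "scsu"),  # SCSU signature; label only
        (codecs.BOM_UTF16_LE, "utf-16-le"),
        (codecs.BOM_UTF16_BE, "utf-16-be"),
    ]
    for bom, name in boms:
        if head.startswith(bom):
            return name

    # No BOM: look at where the zero bytes fall around the ASCII characters.
    if len(head) >= 4:
        a, b, c, d = head[:4]
        if a == 0 and b == 0 and c == 0 and d != 0:
            return "utf-32-be"
        if a != 0 and b == 0 and c == 0 and d == 0:
            return "utf-32-le"
        if a == 0 and b != 0 and c == 0 and d != 0:
            return "utf-16-be"
        if a != 0 and b == 0 and c != 0 and d == 0:
            return "utf-16-le"

    # Plain ASCII is valid UTF-8, so that is the fallback.
    return "utf-8"
```

A caller would read the first handful of bytes of the file, pass them to `guess_encoding`, and then reopen or decode the file with the result. SCSU without its signature is not detected by this sketch, which matches the remark above that SCSU is the one case that probably needs a BOM.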

My hope for fewer options is for reading input. That is, I'd like the default encoding for all inputs and outputs to be UTF8, unless it has been converted and that conversion is somehow flagged.
