Re: Encode UTF-8 optimizations

On 08/22/2016 02:47 PM, pali(_at_)cpan(_dot_)org wrote:

> And I think you misunderstand when is_utf8_char_slow() is called.  It is
> called only when the next byte in the input indicates that the only
> legal UTF-8 that might follow would be for a code point that is at least
> U+200000, almost twice as high as the highest legal Unicode code point.
> It is a Perl extension to handle such code points, unlike other
> languages.  But the Perl core is not optimized for them, nor will it be.
>   My point is that is_utf8_char_slow() will only be called in very
> specialized cases, and we need not make those cases have as good a
> performance as normal ones.

In strict mode, there is absolutely no need to call is_utf8_char_slow(),
because in strict mode such a sequence must always be invalid (it is above
the last valid Unicode character). This is what I was trying to say.

And currently is_strict_utf8_string_loc() first calls isUTF8_CHAR() (which
could call is_utf8_char_slow()) and only after that checks UTF8_IS_SUPER().


I only have time to respond to this portion just now.

The code could be tweaked to call UTF8_IS_SUPER first, but I'm asserting that an optimizing compiler will see that any call to is_utf8_char_slow() is pointless, and will optimize it out.
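
To make the two orderings concrete, here is a minimal, self-contained sketch in C. It is not the Perl core code: every name in it (sketch_char_slow, sketch_char_fast, sketch_is_super, strict_len_check_after, strict_len_check_first) is a hypothetical stand-in, the slow path only handles 5- and 6-byte sequences, and the strict check ignores surrogates and non-characters. It only illustrates that rejecting above-Unicode sequences first means the slow path is never reached in strict mode.

```c
#include <assert.h>
#include <stddef.h>

/* Counter so callers can observe whether the slow path was reached. */
static int slow_calls = 0;

/* Hypothetical stand-in for the slow path: start bytes 0xF8 and up, i.e.
 * sequences for code points of at least U+200000. Simplified to 5- and
 * 6-byte forms; anything longer is rejected in this sketch. */
static size_t sketch_char_slow(const unsigned char *s, const unsigned char *e)
{
    size_t need, i;
    slow_calls++;
    if (s[0] >= 0xFE) return 0;
    need = (s[0] < 0xFC) ? 5 : 6;
    if ((size_t)(e - s) < need) return 0;
    for (i = 1; i < need; i++)
        if ((s[i] & 0xC0) != 0x80) return 0;   /* not a continuation byte */
    return need;
}

/* Fast path for start bytes below 0xF8 (sequences of 1 to 4 bytes). */
static size_t sketch_char_fast(const unsigned char *s, const unsigned char *e)
{
    size_t len, i;
    if (s[0] < 0x80) return 1;
    if (s[0] < 0xC2) return 0;      /* continuation or overlong start byte */
    len = (s[0] < 0xE0) ? 2 : (s[0] < 0xF0) ? 3 : 4;
    if ((size_t)(e - s) < len) return 0;
    for (i = 1; i < len; i++)
        if ((s[i] & 0xC0) != 0x80) return 0;
    if (s[0] == 0xE0 && s[1] < 0xA0) return 0; /* overlong 3-byte form */
    if (s[0] == 0xF0 && s[1] < 0x90) return 0; /* overlong 4-byte form */
    return len;
}

/* True if the sequence starting at s would encode a code point above
 * U+10FFFF: start byte 0xF5 or higher, or 0xF4 followed by 0x90 or higher. */
static int sketch_is_super(const unsigned char *s, const unsigned char *e)
{
    if (s[0] >= 0xF5) return 1;
    return s[0] == 0xF4 && (e - s) >= 2 && s[1] >= 0x90;
}

/* Ordering described in the thread: validate first (possibly through the
 * slow path), then reject above-Unicode sequences. */
static size_t strict_len_check_after(const unsigned char *s,
                                     const unsigned char *e)
{
    size_t len;
    assert(s < e);
    len = (s[0] >= 0xF8) ? sketch_char_slow(s, e) : sketch_char_fast(s, e);
    if (len == 0 || sketch_is_super(s, e)) return 0;
    return len;
}

/* Reordered variant: reject above-Unicode sequences first, so only start
 * bytes below 0xF5 remain and the slow path cannot be reached. */
static size_t strict_len_check_first(const unsigned char *s,
                                     const unsigned char *e)
{
    assert(s < e);
    if (sketch_is_super(s, e)) return 0;
    return sketch_char_fast(s, e);
}
```

Both functions return the same result for any input in this sketch; they differ only in whether slow_calls is incremented for start bytes of 0xF8 and up. Whether a compiler removes that call on its own in the first ordering depends on what it can inline and prove, which the sketch does not try to demonstrate.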
