On 08/22/2016 02:47 PM, pali(_at_)cpan(_dot_)org wrote:
> And I think you misunderstand when is_utf8_char_slow() is called. It is
> called only when the next byte in the input indicates that the only
> legal UTF-8 that might follow would be for a code point that is at least
> U+200000, almost twice as high as the highest legal Unicode code point.
> It is a Perl extension to handle such code points, unlike other
> languages. But the Perl core is not optimized for them, nor will it be.
> My point is that is_utf8_char_slow() will only be called in very
> specialized cases, and we need not make those cases have as good a
> performance as normal ones.
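To make the quoted condition concrete: on an ASCII platform, start bytes 0xF0 through 0xF7 introduce 4-byte sequences, which top out at U+1FFFFF, so only a start byte of 0xF8 or higher can stand for U+200000 and up. A minimal sketch in C, assuming that is the test that routes a character to the slow path; the helper name is hypothetical and is not a Perl core function:

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical helper for illustration only, not part of the Perl core.
 * Assumes ASCII-platform UTF-8 as extended by Perl (no EBCDIC handling). */
static bool start_byte_needs_slow_path(uint8_t start)
{
    /* 0xF0..0xF7 begin 4-byte sequences, which encode at most U+1FFFFF.
     * 0xF8 and above begin sequences of 5 or more bytes, so the code
     * point would be U+200000 or higher. */
    return start >= 0xF8;
}
```

Every legal Unicode code point (at most U+10FFFF) has a start byte of 0xF4 or lower, so well-formed Unicode text never satisfies this test.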
In strict mode, there is absolutely no need to call is_utf8_char_slow(),
because in strict mode such a sequence must always be invalid (it is above
the last valid Unicode character). This is what I was trying to say.
Currently, is_strict_utf8_string_loc() first calls isUTF8_CHAR() (which
could call is_utf8_char_slow()), and only after that checks UTF8_IS_SUPER().
I only have time to respond to this portion just now.
The code could be tweaked to call UTF8_IS_SUPER first, but I'm asserting
that an optimizing compiler will see that any call to
is_utf8_char_slow() is pointless, and will optimize it out.
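The tweak could look roughly like the following. This is a minimal sketch in C, not the core code: is_super(), char_len(), char_len_slow() and strict_char_len() are hypothetical stand-ins for UTF8_IS_SUPER(), isUTF8_CHAR(), is_utf8_char_slow() and the per-character step of is_strict_utf8_string_loc(). It assumes an ASCII platform, models only 5- and 6-byte extended sequences, and skips the overlong, surrogate and noncharacter checks. The slow_calls counter exists only to show that the slow path is never reached in strict mode.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

static int slow_calls = 0;   /* instrumentation for this sketch only */

/* True if bytes s[1..len-1] exist and are all continuation bytes. */
static bool tail_ok(const uint8_t *s, const uint8_t *e, size_t len)
{
    size_t i;
    if ((size_t)(e - s) < len) return false;
    for (i = 1; i < len; i++)
        if ((s[i] & 0xC0) != 0x80) return false;
    return true;
}

/* Stand-in for UTF8_IS_SUPER(): nothing acceptable in strict mode can
 * start here because the code point would exceed U+10FFFF. After 0xF4,
 * a second byte >= 0x90 is either above U+10FFFF or malformed; both are
 * rejected in strict mode. */
static bool is_super(const uint8_t *s, const uint8_t *e)
{
    if (s >= e) return false;
    if (s[0] > 0xF4) return true;
    return s[0] == 0xF4 && e - s > 1 && s[1] >= 0x90;
}

/* Stand-in for is_utf8_char_slow(): extended sequences of 5+ bytes. */
static size_t char_len_slow(const uint8_t *s, const uint8_t *e)
{
    size_t len;
    slow_calls++;
    if (s[0] < 0xFC) len = 5;
    else if (s[0] < 0xFE) len = 6;
    else return 0;            /* longer forms are not modeled here */
    return tail_ok(s, e, len) ? len : 0;
}

/* Stand-in for isUTF8_CHAR(): length of the character at s, or 0. */
static size_t char_len(const uint8_t *s, const uint8_t *e)
{
    size_t len;
    if (s >= e) return 0;
    if (s[0] < 0x80) return 1;
    if (s[0] < 0xC2) return 0;     /* continuation or overlong start */
    if (s[0] < 0xE0) len = 2;
    else if (s[0] < 0xF0) len = 3;
    else if (s[0] < 0xF8) len = 4;
    else return char_len_slow(s, e);
    return tail_ok(s, e, len) ? len : 0;
}

/* Strict step with the checks reordered: reject above-Unicode input
 * before the general validator can reach the slow path. */
static size_t strict_char_len(const uint8_t *s, const uint8_t *e)
{
    if (is_super(s, e)) return 0;
    return char_len(s, e);
}
```

With this ordering, any start byte of 0xF5 or higher is rejected by is_super() before char_len() runs, so the branch that calls char_len_slow() is unreachable from strict_char_len().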