On 08/21/2016 02:34 AM, pali(_at_)cpan(_dot_)org wrote:
> On Sunday 21 August 2016 03:10:40 Karl Williamson wrote:
>> Top posting.
>>
>> Attached is my alternative patch. It effectively uses a different
>> algorithm to avoid decoding the input into code points, and to copy
>> all spans of valid input at once, instead of a character at a time.
>> And it uses only currently available functions.
>
> And that's the problem. As I already wrote in a previous email, a call
> to a function in a shared library cannot be optimized as heavily as an
> inlined function, and so causes a slowdown. You are calling
> is_utf8_string_loc() for non-strict mode, which is not inlined, so
> encode/decode in non-strict mode will be slower...
>
> And also in is_strict_utf8_string_loc() you are calling isUTF8_CHAR,
> which calls _is_utf8_char_slow, which in turn calls utf8n_to_uvchr,
> none of which can be inlined either...
>
> Therefore I think this is not a good approach...
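For readers following along, the span-copying approach under discussion can be sketched roughly as below. This is a hypothetical, self-contained simplification: `valid_span_loc()` is a stand-in for Perl's `is_utf8_string_loc()` (here it accepts only ASCII and 2-byte sequences, to keep the sketch short), and the substitution of `?` for a bad byte merely illustrates "handle the invalid part, then resume"; the real patch differs in detail.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for is_utf8_string_loc(): returns 1 if s..s+len
 * is entirely valid; on failure returns 0 and sets *ep to the first
 * invalid byte.  Only ASCII and 2-byte sequences are recognized here. */
static int
valid_span_loc(const unsigned char *s, size_t len, const unsigned char **ep)
{
    const unsigned char *p = s, *end = s + len;
    while (p < end) {
        if (*p < 0x80) { p++; continue; }              /* ASCII byte */
        if ((*p & 0xE0) == 0xC0 && *p >= 0xC2          /* 2-byte lead */
            && p + 1 < end && (p[1] & 0xC0) == 0x80) {
            p += 2;
            continue;
        }
        *ep = p;                                       /* first bad byte */
        return 0;
    }
    *ep = end;
    return 1;
}

/* Copy whole valid spans with one bulk memcpy each, instead of a
 * character at a time; each invalid byte becomes '?'.  Returns the
 * number of bytes written (caller sizes dst generously). */
static size_t
copy_spans(const unsigned char *s, size_t len, unsigned char *dst)
{
    const unsigned char *end = s + len, *bad;
    size_t out = 0;
    while (s < end) {
        if (valid_span_loc(s, (size_t)(end - s), &bad))
            bad = end;
        memcpy(dst + out, s, (size_t)(bad - s));   /* one copy per span */
        out += (size_t)(bad - s);
        if (bad == end)
            break;
        dst[out++] = '?';                          /* substitute bad byte */
        s = bad + 1;
    }
    return out;
}
```

The point of the structure is that on fully valid input the validator runs once over the string and `memcpy` runs once, so no per-character decode-to-code-point work happens at all.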
Then you should run your benchmarks to find out the actual performance.

On valid input, is_utf8_string_loc() is called once per string. The
function-call overhead and lack of inlining should not be noticeable.
On valid input, is_utf8_char_slow() is never called. The parts that are
used can be inlined.
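The fast-path/slow-path split behind that claim typically looks like the following sketch (hypothetical names and a deliberately minimal slow path; Perl's actual isUTF8_CHAR macro is more elaborate): the common case is decided inline with one comparison, and the out-of-line function is only reached for multi-byte or invalid input.

```c
#include <assert.h>
#include <stddef.h>

/* Out-of-line slow path: the full multi-byte validation lives here.
 * This hypothetical version accepts only 2-byte sequences, for brevity.
 * Returns the character's length in bytes, or 0 if invalid. */
static size_t
utf8_char_slow(const unsigned char *p, size_t avail)
{
    if (avail >= 2 && (p[0] & 0xE0) == 0xC0 && p[0] >= 0xC2
        && (p[1] & 0xC0) == 0x80)
        return 2;
    return 0;
}

/* Inline fast path: an ASCII byte is classified with one comparison and
 * never reaches the slow path, so on mostly-ASCII input the function-call
 * cost is paid rarely. */
static inline size_t
is_utf8_char_inline(const unsigned char *p, size_t avail)
{
    if (avail && p[0] < 0x80)
        return 1;                        /* decided inline, no call */
    return utf8_char_slow(p, avail);     /* rare out-of-line call */
}
```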
On invalid input, performance should be a minor consideration.
The inner loop is much tighter in both functions; likely it can be held
in the cache. The algorithm avoids a bunch of work compared to the
previous one, so I doubt it will be slower. The only way to know in any
performance situation is to actually test. And bear in mind that results
will differ depending on the underlying hardware, so only large
differences are really significant.
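A minimal timing harness for the kind of comparison suggested above might look like this. It is a sketch under stated assumptions: the buffer is all-ASCII (hence all-valid), validation cost is elided, and the two strategies reduce to a byte-at-a-time loop versus one bulk memcpy per span; real Encode benchmarks would exercise the actual functions on mixed input.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>
#include <time.h>

enum { BUF = 1 << 20, REPS = 50 };

/* Character-at-a-time copy, as the previous algorithm effectively did. */
static void
copy_bytewise(const unsigned char *src, unsigned char *dst, size_t len)
{
    for (size_t i = 0; i < len; i++)
        dst[i] = src[i];
}

/* Span-at-a-time copy: validation done once up front (elided here),
 * then one bulk memcpy for the whole valid span. */
static void
copy_span(const unsigned char *src, unsigned char *dst, size_t len)
{
    memcpy(dst, src, len);
}

/* Times both strategies and prints the results; returns 0 if the two
 * copies were byte-identical. */
static int
run_bench(void)
{
    static unsigned char src[BUF], a[BUF], b[BUF];
    memset(src, 'x', BUF);                 /* all-ASCII, hence all-valid */

    clock_t t0 = clock();
    for (int i = 0; i < REPS; i++) copy_bytewise(src, a, BUF);
    clock_t t1 = clock();
    for (int i = 0; i < REPS; i++) copy_span(src, b, BUF);
    clock_t t2 = clock();

    printf("bytewise %.3fs, span %.3fs\n",
           (double)(t1 - t0) / CLOCKS_PER_SEC,
           (double)(t2 - t1) / CLOCKS_PER_SEC);
    return memcmp(a, b, BUF);
}
```

As the text says, absolute numbers vary with hardware (and compiler flags: at -O2 a compiler may vectorize the bytewise loop, narrowing the gap), so only large differences should be treated as meaningful.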