perl-unicode

Re: Encode UTF-8 optimizations

2016-08-31 16:44:15
On Monday 29 August 2016 17:00:00 Karl Williamson wrote:
> If you'd be willing to test this out, especially the performance
> parts that would be great!
[snip]
> There are 2 experimental performance commits.  If you want to see if
> they actually improve performance by doing a before/after compare
> that would be nice.

So here are my results:

strict = bless({strict_utf8 => 1}, "Encode::utf8")->encode_xs/decode_xs
lax    = bless({strict_utf8 => 0}, "Encode::utf8")->encode_xs/decode_xs
int    = utf8::encode/decode

all    = join "", map { chr } 0 .. 0x10FFFF
short  = "žluťoučký kůň pěl ďábelské ódy " x 45
long   = $short x 1000
ishort = "\xA0" x 1000
ilong  = "\xA0" x 1000000

your   = 9c03449800417dd02cc1af613951a1002490a52a
orig   = f16e7fa35c1302aa056db5d8d022b7861c1dd2e8
my     = orig without c8247c27c13d1cf152398e453793a91916d2185d
your1  = your without b65e9a52d8b428146ee554d724b9274f8e77286c
your2  = your without 9ccc3ecd1119ccdb64e91b1f03376916aa8cc6f7
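
For anyone who wants to reproduce a comparison of this kind, here is a minimal sketch of a decode benchmark. It is not the script used for the numbers below. It assumes the public Encode::find_encoding objects ('UTF-8' for strict, 'utf8' for lax) as stand-ins for the hand-blessed Encode::utf8 objects and their encode_xs/decode_xs methods, covers only the short and ishort inputs, and runs each case for about one CPU second via Benchmark's cmpthese.

```perl
use strict;
use warnings;
use utf8;                        # this source file contains literal non-ASCII text
use Benchmark qw(cmpthese);
use Encode ();

# Assumption: public encoding objects instead of bless({strict_utf8 => ...}, "Encode::utf8")
my $strict = Encode::find_encoding('UTF-8');   # strict variant
my $lax    = Encode::find_encoding('utf8');    # lax variant

my $short = "žluťoučký kůň pěl ďábelské ódy " x 45;

# Octet strings to decode: one valid, one consisting only of stray continuation bytes
my %input = (
    short  => Encode::encode('UTF-8', $short),
    ishort => "\xA0" x 1000,
);

for my $name (sort keys %input) {
    my $bytes = $input{$name};
    print "decode, input '$name':\n";
    cmpthese(-1, {
        strict => sub { my $s = $strict->decode($bytes) },
        lax    => sub { my $s = $lax->decode($bytes) },
        # utf8::decode works in place, so this case also pays for a copy
        int    => sub { my $s = $bytes; utf8::decode($s) },
    });
}
```

Running the same script against each build of Encode (your, orig, my, ...) and comparing the reported rates gives a before/after picture; the absolute numbers will of course differ from machine to machine.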


decode
                             all          ilong         ishort           long          short
    my: - int           285.94/s     14988.61/s   4694109.54/s       704.15/s    599678.93/s
  orig: - int           292.41/s     15121.98/s   4782883.50/s       494.33/s    553182.28/s
 your1: - int           271.21/s     14232.25/s   4706722.93/s       599.68/s    554941.90/s
 your2: - int           280.85/s     14090.33/s   4210573.40/s       593.93/s    558487.86/s
  your: - int           283.23/s     15121.98/s   4500252.51/s       691.95/s    678859.55/s

                             all          ilong         ishort           long          short
    my: - lax            83.28/s       202.22/s    142049.67/s       181.82/s    163352.41/s
  orig: - lax            53.49/s       201.58/s    152422.11/s       147.13/s    133974.37/s
 your1: - lax           255.13/s        53.75/s     47590.82/s       560.34/s    431447.77/s
 your2: - lax           281.71/s        48.41/s     43260.19/s       634.16/s    445365.29/s
  your: - lax           286.96/s        46.35/s     42848.40/s       632.20/s    442546.52/s

                             all          ilong         ishort           long          short
    my: - strict         90.48/s       200.00/s    143081.15/s       197.53/s    175800.00/s
  orig: - strict         49.21/s       202.22/s    149447.34/s       142.81/s    128290.63/s
 your1: - strict        154.94/s        48.16/s     44237.93/s       191.36/s    169228.16/s
 your2: - strict        158.75/s        40.06/s     37244.06/s       195.95/s    173588.68/s
  your: - strict        158.26/s        38.54/s     36898.14/s       195.95/s    172504.61/s


encode
                             all          ilong         ishort           long          short
    my: - int       5197722.67/s   5227338.26/s   5210583.97/s   5163520.62/s   5227338.26/s
  orig: - int       5449888.54/s   5381336.48/s   5370254.05/s   5449888.54/s   5301624.60/s
 your1: - int       5244200.62/s   5293830.28/s   5277183.02/s   5361483.07/s   5260640.13/s
 your2: - int       5435994.67/s   5432587.30/s   5398312.30/s   5487602.22/s   5606457.74/s
  your: - int       5261172.17/s   5327441.90/s   5310582.91/s   5310582.91/s   5361483.07/s

                             all          ilong         ishort           long          short
    my: - lax          2442.24/s     15084.08/s   2882995.00/s      7993.15/s   2716293.65/s
  orig: - lax          2438.39/s     15121.98/s   2933419.33/s      7965.22/s   2665521.81/s
 your1: - lax          2229.94/s     14908.60/s   2117316.51/s      7428.89/s   2011133.75/s
 your2: - lax          2400.92/s     15121.98/s   3046739.87/s      8065.41/s   2742961.18/s
  your: - lax          2368.00/s     15168.94/s   2862328.67/s      8090.85/s   2685694.50/s

                             all          ilong         ishort           long          short
    my: - strict         92.16/s       204.81/s    157772.05/s       200.00/s    190344.59/s
  orig: - strict         49.04/s       202.22/s    160767.72/s       142.81/s    133548.90/s
 your1: - strict        147.75/s        46.91/s     46095.57/s       194.36/s    176949.84/s
 your2: - strict        159.25/s        40.19/s     38034.59/s       196.20/s    185166.45/s
  your: - strict        158.26/s        38.54/s     37012.73/s       196.20/s    186357.23/s


So it looks like the experimental commits did not speed up the encoder or
the decoder.

What is relevant from these tests is that your patches slow down lax and
strict decoding, and strict encoding, of illegal sequences like
"\xA0" x 1000000 by a factor of about 4-5.
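
The slowdown factor can be read off the ilong and ishort columns. As a small worked check, the sketch below divides the "orig" rate by the "your" rate for the affected cases; the numbers are copied from the tables above, and taking "orig" as the baseline is my choice for the illustration ("my" gives very similar ratios).

```perl
use strict;
use warnings;

# Rates in iterations per second, copied from the tables: [ orig, your ]
my %rate = (
    'decode lax    ilong'  => [ 201.58,    46.35    ],
    'decode lax    ishort' => [ 152422.11, 42848.40 ],
    'decode strict ilong'  => [ 202.22,    38.54    ],
    'decode strict ishort' => [ 149447.34, 36898.14 ],
    'encode strict ilong'  => [ 202.22,    38.54    ],
    'encode strict ishort' => [ 160767.72, 37012.73 ],
);

# Slowdown factor: how many times slower "your" is than "orig"
my %slowdown = map { $_ => $rate{$_}[0] / $rate{$_}[1] } keys %rate;

for my $case (sort keys %slowdown) {
    printf "%-22s orig/your = %.2f\n", $case, $slowdown{$case};
}
```

The encode lax and int cases are left out because their ilong and ishort rates are essentially the same for "orig" and "your".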
