perl-unicode

Re: Segfault using HTML::Entities

2004-06-30 15:30:05
On Wed, Jun 30, 2004 at 10:15:13PM +0100, Richard Jolly wrote:

> On 30 Jun 2004, at 17:52, Nicholas Clark wrote:
>
> > On Tue, Jun 29, 2004 at 06:49:16PM +0100, Richard Jolly wrote:
> > > #### Script
> >
> > Could you resend the script/data test case as an attachment please?
>
> Attached.

Thanks.

Looks like a core bug, as it's all going pear-shaped somewhere in the regexp
engine. You need a UTF-8 locale to provoke it:

$ LC_ALL=en_GB.utf8 PERL_UNICODE= valgrind /home/nick/Sandpit/-i-g/bin/perl5.9.2 old.pl
==11515== Memcheck, a memory error detector for x86-linux.
==11515== Copyright (C) 2002-2003, and GNU GPL'd, by Julian Seward.
==11515== Using valgrind-2.1.0, a program supervision framework for x86-linux.
==11515== Copyright (C) 2000-2003, and GNU GPL'd, by Julian Seward.
==11515== Estimated CPU clock rate is 2808 MHz
==11515== For more details, rerun with: -v
==11515== 
==11515== warning: Valgrind's siglongjmp is incomplete
==11515==          (it ignores cleanup handlers)
==11515==          your program may misbehave as a result
The Modern Résumé
Malformed UTF-8 character (unexpected end of string) at /home/nick/Sandpit/-i-g/lib/perl5/site_perl/5.9.2/i686-linux-thread-multi/HTML/Entities.pm line 435, <DATA> line 1.
==11515== Invalid read of size 1
==11515==    at 0x817005A: Perl_utf8n_to_uvuni (utf8.c:418)
==11515==    by 0x816E6DD: S_reginclass (regexec.c:4364)
==11515==    by 0x81610A5: S_find_byclass (regexec.c:968)
==11515==    by 0x8165259: Perl_regexec_flags (regexec.c:1945)
==11515==  Address 0x42475ED6 is 0 bytes after a block of size 18 alloc'd
==11515==    at 0x40027C66: malloc (vg_replace_malloc.c:160)
==11515==    by 0x80C9F31: Perl_safesysmalloc (util.c:67)
==11515==    by 0x80CB5DD: Perl_savepvn (util.c:780)
==11515==    by 0x8165718: Perl_regexec_flags (regexec.c:2053)
==11515== 
==11515== Invalid read of size 1
==11515==    at 0x817008F: Perl_utf8n_to_uvuni (utf8.c:425)
==11515==    by 0x816E6DD: S_reginclass (regexec.c:4364)
==11515==    by 0x81610A5: S_find_byclass (regexec.c:968)
==11515==    by 0x8165259: Perl_regexec_flags (regexec.c:1945)
==11515==  Address 0x42475ED6 is 0 bytes after a block of size 18 alloc'd
==11515==    at 0x40027C66: malloc (vg_replace_malloc.c:160)
==11515==    by 0x80C9F31: Perl_safesysmalloc (util.c:67)
==11515==    by 0x80CB5DD: Perl_savepvn (util.c:780)
==11515==    by 0x8165718: Perl_regexec_flags (regexec.c:2053)
==11515== 
==11515== Invalid read of size 1
==11515==    at 0x817005A: Perl_utf8n_to_uvuni (utf8.c:418)
==11515==    by 0x816E6DD: S_reginclass (regexec.c:4364)
==11515==    by 0x8166CB6: S_regmatch (regexec.c:2542)
==11515==    by 0x8165E1A: S_regtry (regexec.c:2198)
==11515==  Address 0x42475ED6 is 0 bytes after a block of size 18 alloc'd
==11515==    at 0x40027C66: malloc (vg_replace_malloc.c:160)
==11515==    by 0x80C9F31: Perl_safesysmalloc (util.c:67)
==11515==    by 0x80CB5DD: Perl_savepvn (util.c:780)
==11515==    by 0x8165718: Perl_regexec_flags (regexec.c:2053)
==11515== 
==11515== Invalid read of size 1
==11515==    at 0x817008F: Perl_utf8n_to_uvuni (utf8.c:425)
==11515==    by 0x816E6DD: S_reginclass (regexec.c:4364)
==11515==    by 0x8166CB6: S_regmatch (regexec.c:2542)
==11515==    by 0x8165E1A: S_regtry (regexec.c:2198)
==11515==  Address 0x42475ED6 is 0 bytes after a block of size 18 alloc'd
==11515==    at 0x40027C66: malloc (vg_replace_malloc.c:160)
==11515==    by 0x80C9F31: Perl_safesysmalloc (util.c:67)
==11515==    by 0x80CB5DD: Perl_savepvn (util.c:780)
==11515==    by 0x8165718: Perl_regexec_flags (regexec.c:2053)
==11515== 
==11515== Invalid read of size 1
==11515==    at 0x8166D13: S_regmatch (regexec.c:2547)
==11515==    by 0x8165E1A: S_regtry (regexec.c:2198)
==11515==    by 0x816113D: S_find_byclass (regexec.c:972)
==11515==    by 0x8165259: Perl_regexec_flags (regexec.c:1945)
==11515==  Address 0x42475ED7 is 1 bytes after a block of size 18 alloc'd
==11515==    at 0x40027C66: malloc (vg_replace_malloc.c:160)
==11515==    by 0x80C9F31: Perl_safesysmalloc (util.c:67)
==11515==    by 0x80CB5DD: Perl_savepvn (util.c:780)
==11515==    by 0x8165718: Perl_regexec_flags (regexec.c:2053)
Malformed UTF-8 character (unexpected non-continuation byte 0x73, immediately after start byte 0xe9) in substitution iterator at /home/nick/Sandpit/-i-g/lib/perl5/site_perl/5.9.2/i686-linux-thread-multi/HTML/Entities.pm line 435, <DATA> line 1.
==11515== 
==11515== Invalid read of size 1
==11515==    at 0x42082515: memmove (in /lib/i686/libc-2.2.5.so)
==11515==    by 0x80FC5F4: Perl_sv_setpvn (sv.c:4790)
==11515==    by 0x80D3CA9: Perl_magic_get (mg.c:753)
==11515==    by 0x80D23FA: Perl_mg_get (mg.c:156)
==11515==  Address 0x42475ED6 is 0 bytes after a block of size 18 alloc'd
==11515==    at 0x40027C66: malloc (vg_replace_malloc.c:160)
==11515==    by 0x80C9F31: Perl_safesysmalloc (util.c:67)
==11515==    by 0x80CB5DD: Perl_savepvn (util.c:780)
==11515==    by 0x8165718: Perl_regexec_flags (regexec.c:2053)
==11515== 
==11515== Invalid read of size 4
==11515==    at 0x42082542: memmove (in /lib/i686/libc-2.2.5.so)
==11515==    by 0x80FD023: Perl_sv_catpvn_flags (sv.c:5137)
==11515==    by 0x812CE63: Perl_pp_substcont (pp_ctl.c:193)
==11515==    by 0x80C9844: Perl_runops_debug (dump.c:1564)
==11515==  Address 0x42475EC2 is 2 bytes before a block of size 18 alloc'd
==11515==    at 0x40027C66: malloc (vg_replace_malloc.c:160)
==11515==    by 0x80C9F31: Perl_safesysmalloc (util.c:67)
==11515==    by 0x80CB5DD: Perl_savepvn (util.c:780)
==11515==    by 0x8165718: Perl_regexec_flags (regexec.c:2053)
==11515== 
==11515== More than 30000 total errors detected.  I'm not reporting any more.
==11515== Final error counts will be inaccurate.  Go fix your program!
==11515== Rerun with --error-limit=no to disable this cutoff.  Note
==11515== that errors may occur in your program without prior warning from
==11515== Valgrind, because errors are no longer being displayed.
==11515== 
==11515== 
==11515== Process terminating with default action of signal 11 (SIGSEGV): dumping core
==11515==  Invalid permissions for mapped object at address 0x42D94FFC
==11515==    at 0x42082542: memmove (in /lib/i686/libc-2.2.5.so)
==11515==    by 0x80FD023: Perl_sv_catpvn_flags (sv.c:5137)
==11515==    by 0x812CE63: Perl_pp_substcont (pp_ctl.c:193)
==11515==    by 0x80C9844: Perl_runops_debug (dump.c:1564)
==11515== 
==11515== ERROR SUMMARY: 30000 errors from 7 contexts (suppressed: 20 from 2)
==11515== malloc/free: in use at exit: 2548607 bytes in 55034 blocks.
==11515== malloc/free: 109625 allocs, 54591 frees, 20232500 bytes allocated.
==11515== For a detailed leak analysis,  rerun with: --leak-check=yes
==11515== For counts of detected errors, rerun with: -v
Segmentation fault

It also segfaults the same way without ithreads, and on maint (i.e. 5.8.4999).

(That happened to be on an x86 Redhat box)
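
As an aside on the two "Malformed UTF-8 character" warnings in the log: both are what you would expect if the title line holds a single byte 0xe9 for each "é" (Latin-1) while the regexp engine reads the buffer as UTF-8. The Python sketch below is only an illustration of that reading, not Perl's actual logic in utf8.c. The Latin-1 encoding of the data is an assumption on my part, and the helper names utf8_expected_length and check are made up for the example.

```python
# Illustration only: assume the title line is stored as Latin-1 bytes
# and then scanned as if it were UTF-8.
data = "The Modern Résumé".encode("latin-1")


def utf8_expected_length(lead: int) -> int:
    """Sequence length implied by a UTF-8 lead byte (0 if not a valid lead)."""
    if lead < 0x80:
        return 1
    if 0xC2 <= lead <= 0xDF:
        return 2
    if 0xE0 <= lead <= 0xEF:
        return 3
    if 0xF0 <= lead <= 0xF4:
        return 4
    return 0


def check(buf: bytes):
    """Scan buf as UTF-8 and return (offset, problem) pairs, never reading past the end."""
    problems = []
    i = 0
    while i < len(buf):
        need = utf8_expected_length(buf[i])
        if need == 0:
            problems.append((i, "invalid start byte"))
            i += 1
            continue
        # Bounds check before touching any continuation byte.
        if i + need > len(buf):
            problems.append((i, "unexpected end of string"))
            break
        bad = next(
            (j for j in range(i + 1, i + need) if buf[j] & 0xC0 != 0x80),
            None,
        )
        if bad is not None:
            problems.append((i, "unexpected non-continuation byte 0x%02x" % buf[bad]))
            i += 1
            continue
        i += need
    return problems


for offset, problem in check(data):
    print(offset, problem)
```

Under that assumption, 0xe9 is a lead byte that promises a three-byte sequence. The first "é" is followed by "s" (0x73), which is not a continuation byte, and the second "é" is the last byte of the string, so the sequence runs off the end of the buffer. The sketch refuses to read past the end; the invalid reads valgrind reports just after the 18-byte block suggest the engine does not stop there.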

Nicholas Clark
