perl-unicode

should a non-breaking space character be treated as whitespace in perl source?

2005-10-05 14:20:54
Should a non-breaking space character be treated as whitespace in
perl source code?  It doesn't appear to be:

$ xxd foo.pl 
0000000: 2321 2f75 7372 2f62 696e 2f70 6572 6c0a  #!/usr/bin/perl.
0000010: 7573 6520 7574 6638 3b0a 6d79 2024 6e6f  use utf8;.my $no
0000020: 6272 6561 6b73 7061 6365 203d 2022 c2a0  breakspace = "..
0000030: 223b 0a70 7269 6e74 2022 6d61 7463 6822  ";.print "match"
0000040: 2069 6620 246e 6f62 7265 616b 7370 6163   if $nobreakspac
0000050: 6520 3d7e 202f 5c73 2f3b 0a65 7869 743b  e =~ /\s/;.exit;
0000060: 0ac2 a00a                                ....

$ perl foo.pl 
Unrecognized character \xC2 at foo.pl line 6.

Delete the last line (the one containing only the non-breaking
space) and the program works:

$ perl foo.pl 
match
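
For anyone without the attachment, here is a minimal sketch of why the
shortened program prints "match". It assumes that under "use utf8" the
literal bytes C2 A0 end up as the single character U+00A0, which I
imitate here with Encode's decode(). The helper names nbsp_char and
has_whitespace are made up for this example.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use Encode qw(decode);

# Decode the UTF-8 bytes C2 A0 into the single character U+00A0.
# Assumption: this approximates what "use utf8" does to the string
# literal in foo.pl.
sub nbsp_char {
    return decode('UTF-8', "\xC2\xA0");
}

# Return 1 if the character string contains anything \s matches, else 0.
sub has_whitespace {
    my ($text) = @_;
    return $text =~ /\s/ ? 1 : 0;
}

my $nbsp = nbsp_char();
printf "length=%d ord=U+%04X\n", length($nbsp), ord($nbsp);
print "match\n" if has_whitespace($nbsp);
```

So the regex engine seems happy to call U+00A0 whitespace inside a
decoded string, while the parser rejects the same character when it
appears bare in the source.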

Version of perl I'm using:

$ perl -v

This is perl, v5.8.7 built for cygwin-thread-multi-64int

foo.pl should be attached.

Cheers,
Stephen

PS.  The reason I ask is that UltraEdit inserts Unicode code
point U+FEFF (ZERO WIDTH NO-BREAK SPACE, used as a byte order mark)
at the beginning of UTF-8 encoded files *by default*.  I happen to
think that doing this by default is criminal, but it occurred to me
that maybe perl shouldn't care.
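
In the meantime, a workaround sketch: strip the leading U+FEFF from the
file's bytes before handing it to perl. In UTF-8 that code point is the
three bytes EF BB BF. The helper name strip_utf8_bom and the sample
input are hypothetical, just for illustration.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Remove a leading UTF-8 encoded U+FEFF (bytes EF BB BF) from a byte
# string. Only a mark at the very start is removed; any later
# occurrence is left alone.
sub strip_utf8_bom {
    my ($bytes) = @_;
    $bytes =~ s/\A\xEF\xBB\xBF//;
    return $bytes;
}

# Hypothetical sample: a tiny script saved with a leading mark.
my $with_bom = "\xEF\xBB\xBF#!/usr/bin/perl\nprint \"ok\\n\";\n";
my $clean    = strip_utf8_bom($with_bom);

printf "removed %d bytes\n", length($with_bom) - length($clean);
```

To apply it to a real file, read and write the file with the :raw
layer so the bytes pass through unchanged.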

Attachment: foo.pl
Description: Perl program
