perl-unicode

Re: Website encoding

2004-11-27 10:30:05
At 10:33 am +1100 18/11/04, Rick Measham wrote:

That being the case, I grab the charset and use Encode's decode function
to turn it into 'perl's internal format' .. which in 5.8.5 is utf8
right? I then store that in the db.

What happens if you do something like this?


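use strict;
use warnings;

my $uri        = 'http://www.lemonde.fr';
my $fin        = '/tmp/latin1.html';
my $fout       = '/tmp/utf8.html';
my $charsetin  = "text/html; charset=iso-8859-1";
my $charsetout = "text/html; charset=UTF-8";

# Fetch the page as raw bytes.
system('curl', '-o', $fin, $uri) == 0 or die "curl failed: $?";

# Decode Latin-1 on input, encode UTF-8 on output.
open(my $in,  '<:encoding(iso-8859-1)', $fin)  or die "Can't read $fin: $!";
open(my $out, '>:encoding(UTF-8)',      $fout) or die "Can't write $fout: $!";
while (<$in>) {
  chomp;
  $_ .= $/;
  # Update the declared charset to match the new encoding.
  s~\Q$charsetin\E~$charsetout~ig;
  print $out $_;
  print;   # echo to STDOUT (no encoding layer set there)
}
close($in);
close($out) or die "Can't close $fout: $!";

# Open the result in the default viewer (Mac OS X 'open' command).
system('open', $fout);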
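
For comparison, the in-memory route described in the quoted message (grab the charset, then call Encode's decode) might look roughly like the sketch below. The helper names charset_of and to_characters are made up for illustration, and the iso-8859-1 fallback is an assumption, not something from the original setup. One point worth hedging on: what decode returns is best treated as a string of characters rather than as "utf8", so the sketch encodes explicitly before the data would leave the program.

```perl
use strict;
use warnings;
use Encode qw(decode encode);

# Hypothetical helper: pull the charset out of a Content-Type value,
# falling back to iso-8859-1 when none is declared (an assumption).
sub charset_of {
    my ($content_type) = @_;
    return $content_type =~ /charset=["']?([\w.:-]+)/i ? lc $1 : 'iso-8859-1';
}

# Hypothetical helper: turn raw bytes into a Perl character string.
sub to_characters {
    my ($bytes, $content_type) = @_;
    return decode(charset_of($content_type), $bytes);
}

my $bytes = "caf\xE9";   # "cafe" with e-acute, as Latin-1 bytes
my $text  = to_characters($bytes, 'text/html; charset=ISO-8859-1');

# Encode explicitly when the string leaves the program (file, socket, db).
my $utf8 = encode('UTF-8', $text);
printf "%d characters, %d UTF-8 bytes\n", length $text, length $utf8;
```

Whether to hand the database the encoded bytes or the character string depends on the driver and how the connection is configured, so that part is left open here.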

JD
