At 10:33 am +1100 18/11/04, Rick Measham wrote:
> That being the case, I grab the charset and use Encode's decode function
> to turn it into 'perl's internal format' .. which in 5.8.5 is utf8
> right? I then store that in the db.
What happens if you do something like this?
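use strict;
use warnings;

my $uri        = 'http://www.lemonde.fr';
my $fin        = '/tmp/latin1.html';
my $fout       = '/tmp/utf8.html';
my $charsetin  = "text/html; charset=iso-8859-1";
my $charsetout = "text/html; charset=UTF-8";

# Fetch the page as raw bytes (assumed here to be served as ISO-8859-1).
system('curl', '-o', $fin, $uri) == 0 or die "curl failed: $?";

# Decode Latin-1 on the way in, encode UTF-8 on the way out.
open(my $in,  '<:encoding(iso-8859-1)', $fin)  or die "Can't read $fin: $!";
open(my $out, '>:encoding(UTF-8)',      $fout) or die "Can't write $fout: $!";
while (<$in>) {
    chomp;
    $_ .= $/;    # make sure every line, including the last, ends in a newline
    s~\Q$charsetin\E~$charsetout~ig;    # update the charset declared in the page
    print $out $_;
    print;       # echo to STDOUT, which has no encoding layer here
}
close $in;
close $out or die "Can't close $fout: $!";    # flush before viewing

system('open', $fout);    # on Mac OS X, opens the file in its default application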
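
For comparison, here is a minimal sketch of the same conversion done in memory with Encode's decode and encode, which is closer to what you describe. The sample bytes and the variable names are made up for illustration, and it assumes the input really is ISO-8859-1.

```perl
use strict;
use warnings;
use Encode qw(decode encode);

# Hypothetical sample: "café" as ISO-8859-1 bytes (0xE9 is e-acute).
my $latin1_bytes = "caf\xE9";

# decode() turns bytes in a named charset into a Perl character string.
my $chars = decode('iso-8859-1', $latin1_bytes);

# encode() turns the character string back into bytes, here UTF-8.
my $utf8_bytes = encode('UTF-8', $chars);

# prints "4 characters, 5 UTF-8 bytes"
printf "%d characters, %d UTF-8 bytes\n", length($chars), length($utf8_bytes);
```

Whether the database wants the character string or the encoded bytes depends on the driver, so that part is left out.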
JD