perl-i18n

GB2312 Encoding and File Names

2011-11-17 12:20:39
Hello all, 

I do hope I am in the right place for some help! I am working on a project that 
requires email attachments to be extracted to the file system. All was working 
well until one of our kind testers tried attachments with normal and simplified 
Chinese file names, and I ended up with files named ?????.txt. 

I am using the module MIME::Parser to extract the files, and after some great 
help from the developer I have realized that one needs to override a method in 
MIME::Parser::Filer so that the correct file names are generated. 

One of the attachments in the test email is shown below: 

360新闻监测-12-01-Chi Simp.txt 

I have tried to use MIME::EncWords and MIME::Charset to extract the correct 
name from the MIME entity using: 

my $fname = decode_mimewords($head->recommended_filename); 

but this still does not work :( so I tried to compare what the file name looks 
like with LANG set with and without UTF8:

With LANG en_GB.UTF8 

360新闻监测-12-01-Chi Simp.txt

With LANG en_GB 

360�?��?��??��?-12-01-Chi Simp.txt

Now this is what happens when I extract the file with my new method: 

With LANG en_GB 

360���ż���-12-01-Chi Simp.txt

With LANG en_GB.UTF8 

360???ż???-12-01-Chi Simp.txt

The MIME file name appears as 
=?gb2312?B?MzYw0MLChLFPnHktMTItMDEtQ2hpIFRyYWQudHh0?=
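
For reference, here is a minimal sketch of how an encoded word of this shape 
might be decoded using only core modules (Encode and MIME::Base64). It is not 
what MIME::Parser does internally. The helper names decode_filename_word and 
filename_bytes_for_disk are hypothetical. It rests on two assumptions: that 
mail labelled gb2312 may really carry GBK bytes (the base64 payload above 
appears to contain bytes outside the strict GB2312 range), and that the file 
system expects UTF-8 names. Only the B (base64) form is handled.

```perl
use strict;
use warnings;
use Encode qw(decode encode);
use MIME::Base64 qw(decode_base64);

# Hypothetical helper: decode one B-encoded MIME word into Perl characters.
sub decode_filename_word {
    my ($raw) = @_;

    # Only the simple "=?charset?B?data?=" form is handled here;
    # anything else is returned unchanged.
    return $raw unless $raw =~ /\A=\?([^?]+)\?[Bb]\?([^?]*)\?=\z/;
    my ($charset, $data) = ($1, $2);

    # Assumption: a "gb2312" label may hide GBK bytes, so decode as GBK,
    # which is a superset of GB2312.
    $charset = 'gbk' if lc($charset) eq 'gb2312';

    return decode($charset, decode_base64($data));
}

# Hypothetical helper: turn the decoded name into bytes for the file system.
# Assumption: the file system (and the LANG in use) expects UTF-8.
sub filename_bytes_for_disk {
    my ($raw) = @_;
    return encode('UTF-8', decode_filename_word($raw));
}
```

Inside an overridden MIME::Parser::Filer method, one could then try something 
like filename_bytes_for_disk($head->recommended_filename), assuming that call 
returns the raw encoded word.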

This is not my area of expertise, so I am reaching out to you for some help. 
How can one extract the file name from an email and have it reflect its real 
Chinese name? I hope this makes sense!
-- 
Thanks, Phil
