perl-unicode

Re: Help slurping a file?

2005-10-28 09:10:50

reneeh(_at_)stanford(_dot_)edu said:
I'm not sure if this is the correct group to post this question to. If
there is a better forum for this kind of question, please let me know. 

There doesn't seem to be any reference to Unicode in your question, so 
it probably lacks relevance to the perl-unicode list... (You can try 
www.perlmonks.org -- they love the kind of stuff you're describing.)

In any case, I have found that "typical" line-termination patterns on 
Mac OS X depend on the application that creates the file.  I use mostly 
Unix-based apps, so on my PowerBook, most of my text files have just "\n",
but I have seen both "\r" and "\r\n" as well.
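
One way to check which terminators a given file actually uses is to count
them in the slurped text. This is only a rough sketch: count_terminators is
a made-up helper name and the sample string is invented for illustration.

```perl
use strict;
use warnings;

# Count each kind of line terminator in a string.
# count_terminators is a hypothetical helper, not part of any module.
sub count_terminators {
    my ($text) = @_;
    my %count = ( CRLF => 0, CR => 0, LF => 0 );
    # Try "\r\n" first so a CRLF pair is not counted as a separate CR and LF.
    while ( $text =~ /(\r\n|\r|\n)/g ) {
        my $key = $1 eq "\r\n" ? "CRLF" : $1 eq "\r" ? "CR" : "LF";
        $count{$key}++;
    }
    return \%count;
}

# Invented sample text mixing all three conventions.
my $sample = "one\ntwo\r\nthree\rfour\n";
my $c = count_terminators($sample);
printf "LF=%d CRLF=%d CR=%d\n", $c->{LF}, $c->{CRLF}, $c->{CR};
```

If only one of the three counts is non-zero, the file uses a single
convention and splitting it is straightforward.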

Have you tried something like this:

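#!/usr/local/bin/perl
use strict;
use warnings;

my $fname = "path/name_of.file";
open( IN, "<", $fname ) or die "Cannot open $fname: $!";
{
    local $/ = undef;   # slurp mode; restored when this block exits
    $_ = <IN>;          # read the whole file into $_
}
close IN;

printf("File size = %d, slurped string size = %d\n", -s $fname, length());

__END__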

If things are kosher, the two numbers shown by the printf should be equal.
If that's the case, the next question is figuring out how to split $_ 
into lines.  This should work for just about every case:

  my @lines = split /[\r\n]+/;

(That will obliterate blank lines.  If it's important to keep track of
blank lines, put parens around the regex to capture the line termination
characters -- each string of ([\r\n]+) will be saved in @lines, interleaved
between the non-empty lines that they separate.)
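
Here is a small sketch of the capturing split described above, next to an
alternative that splits on one terminator at a time so that blank lines
show up as empty strings. The sample text is invented, and the second
pattern is my own suggestion rather than anything from the original
question.

```perl
use strict;
use warnings;

# Invented sample: a blank line after "first", and mixed terminators.
my $text = "first\r\n\r\nthird\rfourth\n";

# Capturing split: each run of terminators is kept as its own element,
# interleaved between the non-empty lines.
my @with_seps = split /([\r\n]+)/, $text;

# Split on a single terminator at a time ("\r\n" is tried first), so a
# blank line becomes an empty string in the list.  Without a negative
# limit, split drops trailing empty fields, so blank lines at the very
# end of the text are still lost.
my @lines = split /\r\n|\r|\n/, $text;

printf "%d pieces with separators, %d lines\n",
    scalar(@with_seps), scalar(@lines);
```

With the second form, the position of each blank line is preserved, which
is usually easier to work with than picking through the separator strings.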

        Dave Graff

