rac Bodhisattva
Joined: 30 May 2002 Posts: 6553 Location: Japanifornia
Posted: Thu Nov 14, 2002 10:59 pm Post subject: Perl Line Endings Redux
I have lots of CLI and CGI programs that take files people have made and do something with them. I have no idea what character encoding the files arrive in (SJIS, EUC, ISO-2022, UTF-8, UCS-2), but that's a separate issue. I also have no idea what line endings are being used, and that causes problems when I'm feeding the input line by line to a CSV parser or something similar.
CGI.pm's file upload feature gives me some lightweight filehandle object that isn't a true IO::File. If I'm reading filenames from the command line, I usually use IO::File. Whether I'm calling getline or using the old skool <$fh> syntax, if the record separator isn't set up correctly first, bad things happen.
So here's a subroutine that I just wrote. Pass it a filehandle of some sort - I've tested it with both the things that CGI.pm gives you for file uploads, and with IO::File. If you're using it with IO::File, you need to call IO::File->input_record_separator( $/ ) afterwards. One could take care of that in the subroutine, but if one isn't using IO::File at all, that might be unwanted.
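Code:
sub autors {
    my ($fh) = @_;
    my $buf = '';
    # Peek at the first 2048 bytes, then rewind to where we started.
    my $pos = tell $fh;
    read( $fh, $buf, 2048 );
    seek( $fh, $pos, 0 );
    # Check CRLF before bare CR, since CRLF data also contains CR.
    for my $tle ( "\015\012", "\015" ) {
        if( $buf =~ /$tle/ ) {
            $/ = $tle;
            return;
        }
    }
    # Neither found: fall back to LF.
    $/ = "\012";
}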
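For anyone who wants to try it, here's a rough sketch of how the subroutine might be used. It repeats autors so the snippet runs on its own. The in-memory filehandles and the read_records helper are my own assumptions for illustration, not part of the original code; real input would come from CGI.pm or IO::File as described above.

```perl
use strict;
use warnings;

# Same logic as autors above, repeated so this snippet is standalone.
sub autors {
    my ($fh) = @_;
    my $buf = '';
    my $pos = tell $fh;
    read( $fh, $buf, 2048 );
    seek( $fh, $pos, 0 );
    for my $tle ( "\015\012", "\015" ) {
        if ( $buf =~ /$tle/ ) {
            $/ = $tle;
            return;
        }
    }
    $/ = "\012";
}

# Hypothetical helper: set $/ for this handle, then return the chomped records.
sub read_records {
    my ($fh) = @_;
    autors($fh);
    my @records;
    while ( defined( my $line = <$fh> ) ) {
        chomp $line;    # chomp removes whatever $/ currently holds
        push @records, $line;
    }
    return @records;
}

# Feed the same two records through with each of the three line endings.
for my $eol ( "\015\012", "\015", "\012" ) {
    my $data = join '', map { "$_$eol" } 'a,b', 'c,d';
    open my $fh, '<', \$data or die "cannot open in-memory handle: $!";
    my @records = read_records($fh);
    print scalar(@records), " records: @records\n";
}
```

Each pass should come back with the same two records regardless of which line ending the data used.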
Some notes:
- the 2048 can be shrunk considerably for most uses
- explicit use of \015 and \012 instead of \r and \n attempts to make this portable to MacPerl
- the changes made by autors to $/ persist until you either call the function with a different filehandle or set $/ to something else. It would be nice if Perl had per-filehandle $/, but it doesn't, AFAIK.
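On the last note, one possible workaround is to have the detection return the separator and let the caller localize $/ itself. This is only a sketch of that idea: detect_rs and slurp_records are hypothetical names, and it uses index where autors uses a regex match.

```perl
use strict;
use warnings;

# Hypothetical variant of autors: returns the detected separator
# instead of assigning to the global $/.
sub detect_rs {
    my ($fh) = @_;
    my $buf = '';
    my $pos = tell $fh;
    read( $fh, $buf, 2048 );
    seek( $fh, $pos, 0 );
    for my $tle ( "\015\012", "\015" ) {
        return $tle if index( $buf, $tle ) >= 0;
    }
    return "\012";
}

# Hypothetical helper: read all records with $/ changed for this call only.
sub slurp_records {
    my ($fh) = @_;
    local $/ = detect_rs($fh);    # previous $/ is restored when the sub returns
    my @records = <$fh>;
    chomp @records;
    return @records;
}

my $data = "one\015two\015";
open my $fh, '<', \$data or die "cannot open in-memory handle: $!";
my @records = slurp_records($fh);
print "got: @records\n";
print "\$/ is back to its default\n" if $/ eq "\n";
```

The tradeoff is that every reader has to go through a wrapper like slurp_records, whereas autors lets existing <$fh> loops keep working unchanged.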
_________________ For every higher wall, there is a taller ladder