Text::CSV(3pm) User Contributed Perl Documentation Text::CSV(3pm)
NAME
Text::CSV - comma-separated values manipulator (using XS or PurePerl)
SYNOPSIS
use Text::CSV;
my @rows;
my $csv = Text::CSV->new ( { binary => 1 } ) # should set binary attribute.
or die "Cannot use CSV: ".Text::CSV->error_diag ();
open my $fh, "<:encoding(utf8)", "test.csv" or die "test.csv: $!";
while ( my $row = $csv->getline( $fh ) ) {
$row->[2] =~ m/pattern/ or next; # 3rd field should match
push @rows, $row;
}
$csv->eof or $csv->error_diag();
close $fh;
$csv->eol ("\r\n");
open $fh, ">:encoding(utf8)", "new.csv" or die "new.csv: $!";
$csv->print ($fh, $_) for @rows;
close $fh or die "new.csv: $!";
#
# parse and combine style
#
$status = $csv->combine(@columns); # combine columns into a string
$line = $csv->string(); # get the combined string
$status = $csv->parse($line); # parse a CSV string into fields
@columns = $csv->fields(); # get the parsed fields
$status = $csv->status (); # get the most recent status
$bad_argument = $csv->error_input (); # get the most recent bad argument
$diag = $csv->error_diag (); # if an error occurred, explains WHY
$status = $csv->print ($io, $colref); # Write an array of fields
# immediately to a file $io
$colref = $csv->getline ($io); # Read a line from file $io,
# parse it and return an array
# ref of fields
$csv->column_names (@names); # Set column names for getline_hr ()
$ref = $csv->getline_hr ($io); # getline (), but returns a hashref
$eof = $csv->eof (); # Indicate if last parse or
# getline () hit End Of File
$csv->types(@t_array); # Set column types
DESCRIPTION
Text::CSV provides facilities for the composition and decomposition of comma-separated values using Text::CSV_XS or its pure Perl version.
An instance of the Text::CSV class can combine fields into a CSV string and parse a CSV string into fields.
The module accepts either strings or files as input and can utilize any user-specified characters as delimiters, separators, and escapes so
it is perhaps better called ASV (anything separated values) rather than just CSV.
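As a minimal sketch of the point above, the following round-trips one record through "combine ()" and "parse ()" using a semicolon as the separator. The separator and the sample fields are illustrative choices only, and the sketch assumes the "sep_char" attribute as documented for Text::CSV_XS.

```perl
use strict;
use warnings;
use Text::CSV;

# Use ";" as the separator instead of the default ",".
my $csv = Text::CSV->new ({ binary => 1, sep_char => ";" })
    or die "Cannot use CSV: " . Text::CSV->error_diag ();

# Compose: a list of fields becomes one line of text.
my @columns = ("id", "free text; with separator", "42");
$csv->combine (@columns) or die "combine failed: " . $csv->error_input ();
my $line = $csv->string ();

# Decompose: the same line becomes a list of fields again.
$csv->parse ($line) or die "parse failed: " . $csv->error_diag ();
my @fields = $csv->fields ();

print "$line\n";
print join ("|", @fields), "\n";
```

The field that contains the separator is quoted on output, so it survives the round trip as a single field.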
VERSION
1.21
This module is compatible with Text::CSV_XS 0.80 and later.
Embedded newlines
Important Note: The default behavior is to accept only ASCII characters. This means that fields cannot contain newlines. If your data
contains newlines embedded in fields, or characters above 0x7e (tilde), or binary data, you *must* set "binary => 1" in the call to "new
()". To cover the widest range of parsing options, you will always want to set binary.
But you still have the problem that you have to pass a correct line to the "parse ()" method, which is more complicated than it looks in
typical usage:
my $csv = Text::CSV->new ({ binary => 1, eol => $/ });
while (<>) { # WRONG!
$csv->parse ($_);
my @fields = $csv->fields ();
will break, because the "while (<>)" loop reads physical lines without regard to quoting and so might hand "parse ()" an incomplete record.
If you need to support embedded newlines, the way to go is either
my $csv = Text::CSV->new ({ binary => 1, eol => $/ });
while (my $row = $csv->getline (*ARGV)) {
my @fields = @$row;
or, more safely in perl 5.6 and up
my $csv = Text::CSV->new ({ binary => 1, eol => $/ });
open my $io, "<", $file or die "$file: $!";
while (my $row = $csv->getline ($io)) {
my @fields = @$row;
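The difference between physical lines and CSV records can be seen in a small self-contained sketch. It assumes an in-memory file handle as a stand-in for a real file, and the sample data is made up for illustration.

```perl
use strict;
use warnings;
use Text::CSV;

# Two records; the second field of the first record contains a newline.
my $data = qq{1,"first line\nsecond line",x\n2,plain,y\n};

my $csv = Text::CSV->new ({ binary => 1 })
    or die "Cannot use CSV: " . Text::CSV->error_diag ();

# An in-memory file handle stands in for a real file here.
open my $io, "<", \$data or die "in-memory handle: $!";

my @rows;
while (my $row = $csv->getline ($io)) {
    push @rows, $row;
}
$csv->eof or $csv->error_diag ();
close $io;

# Count records returned by getline () versus newline characters in the input.
printf "%d records, %d physical lines\n", scalar @rows, ($data =~ tr/\n//);
```

Because "getline ()" reads from the handle itself, it can keep reading until the quoted field is closed, which a line-by-line "while (<>)" loop cannot do.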
Unicode (UTF8)
On parsing (both for "getline ()" and "parse ()"), if the source is marked as being UTF8, then all fields that are marked binary will also
be marked UTF8.
For complete control over encoding, please use Text::CSV::Encoded:
use Text::CSV::Encoded;
my $csv = Text::CSV::Encoded->new ({
encoding_in => "iso-8859-1", # the encoding comes into Perl
encoding_out => "cp1252", # the encoding comes out of Perl
});
$csv = Text::CSV::Encoded->new ({ encoding => "utf8" });
# combine () and print () accept *literally* utf8 encoded data
# parse () and getline () return *literally* utf8 encoded data
$csv = Text::CSV::Encoded->new ({ encoding => undef }); # default
# combine () and print () accept UTF8 marked data
# parse () and getline () return UTF8 marked data
On combining ("print ()" and "combine ()"), if any of the combining fields was marked UTF8, the resulting string will be marked UTF8.
Note however that if the backend module is Text::CSV_XS, any fields *before* the first field that was marked UTF8 that contain 8-bit
characters not upgraded to UTF8 will remain bytes in the resulting string too, causing errors. If you pass data of different
encodings, or you don't know whether the encodings differ, force the data to be upgraded before you pass it on:
# backend = Text::CSV_XS
$csv->print ($fh, [ map { utf8::upgrade (my $x = $_); $x } @data ]);
SPECIFICATION
See "SPECIFICATION" in Text::CSV_XS.
FUNCTIONS
These methods are common to the XS and pure Perl versions. Most of the documentation was shamelessly copied and adapted from Text::CSV_XS.
version ()
(Class method) Returns the current backend module version. If you want the module version, you can use the "VERSION" method:
print Text::CSV->VERSION; # This module version
print Text::CSV->version; # The version of the worker module
# same as Text::CSV->backend->version
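To see which worker module was actually loaded, the two version calls can be combined with the "backend" class method referred to in the comment above. This is only a sketch; the output format is an arbitrary choice.

```perl
use strict;
use warnings;
use Text::CSV;

# Report the wrapper version, then the name and version of the worker module.
my $backend = Text::CSV->backend;
printf "Text::CSV %s, backend %s %s\n",
    Text::CSV->VERSION, $backend, Text::CSV->version;
```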
new (\%attr)
(Class method) Returns a new instance of Text::CSV_XS. The object's attributes are described by the (optional) hash ref "\%attr". Currently
the following attributes are available:
eol An end-of-line string to add to rows. "undef" is replaced with an empty string. The default is "$\". Common values for "eol" are "