Difference between two huge .csv files Post: 302711659


CSV(3pm)						User Contributed Perl Documentation						  CSV(3pm)

NAME

       Class::CSV - Class based CSV parser/writer

SYNOPSIS

	 use Class::CSV;

	 my $csv = Class::CSV->parse(
	   filename => 'test.csv',
	   fields   => [qw/item qty sub_total/]
	 );

	 foreach my $line (@{$csv->lines()}) {
	   $line->sub_total('$'. sprintf("%0.2f", $line->sub_total()));

	   print 'Item:     '. $line->item(). "\n".
		 'Qty:      '. $line->qty(). "\n".
		 'SubTotal: '. $line->sub_total(). "\n";
	 }

	 my $csv_as_string = $csv->string();

	 $csv->print();

	 my $csv = Class::CSV->new(
	   fields	  => [qw/userid username/],
	   line_separator => "\n"
	 );

	 $csv->add_line([2063, 'testuser']);
	 $csv->add_line({
	   userid   => 2064,
	   username => 'testuser2'
	 });

DESCRIPTION

       This module can be used to create objects from CSV files, or to create CSV files from objects. Text::CSV_XS is used for parsing and
       creating CSV file lines, so any limitations in Text::CSV_XS will of course be inherent in this module.

       EXPORT

       None by default.

METHOD

       CONSTRUCTOR

       parse
	   the parse constructor takes a hash as its parameter; the options that can appear in this hash are detailed below.

	   Required Options
	       o   fields - an array ref containing the list of field names to use for each row.  there are some reserved words that cannot be
		   used as field names, there is no checking done for this at the moment but it is something to be aware of. the reserved field
		   names are as follows: "string", "set", "get". also field names cannot contain whitespace or any characters that would not be
		   allowed in a method name.
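
	   Because the module does not check field names itself, a caller may want to validate them before calling parse or new. The
	   following is a minimal sketch and not part of Class::CSV: the helper name invalid_field_names is hypothetical, and the
	   pattern used for "allowed in a method name" is an assumption (a plain Perl identifier).

```perl
use strict;
use warnings;

# Reserved words taken from the list above.
my %RESERVED = map { $_ => 1 } qw(string set get);

# Hypothetical helper, not part of Class::CSV: returns the field names that
# clash with a reserved word or are not plain Perl identifiers (assumed here
# to mean a letter or underscore followed by letters, digits or underscores).
sub invalid_field_names {
    my @fields = @_;
    return grep { $RESERVED{$_} || $_ !~ /\A[A-Za-z_][A-Za-z0-9_]*\z/ } @fields;
}

# Example: one reserved word and one name containing whitespace.
my @bad = invalid_field_names(qw(item qty sub_total get), 'unit price');
print "invalid field names: ", join(', ', @bad), "\n" if @bad;
```

	   Running such a check before construction gives an early, readable error instead of a surprising accessor clash later on.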

	   Source Options (only one of these is needed)
	       o   filename - the path of the CSV file to be opened and parsed.

	       o   filehandle - the file handle of the CSV file to be parsed.

	       o   objects - an array ref of objects (e.g. Class::DBI objects). for this to work properly the field names provided in fields need
		   to correspond to the field names of the objects in the array ref.

	       o   classdbi_objects - deprecated, use objects instead. using classdbi_objects will still work but it's advisable to update your
		   code.

	   Optional Options
	       o   line_separator - the line separator to be appended to the end of every line. defaults to "\n" (Unix newline).

       new the new constructor takes a hash as its parameter. the same options detailed in parse apply to new, however no Source Options can
	   be used. this constructor creates a blank CSV object to which lines can be added via add_line.

       ACCESSING

       lines
	   returns an array ref containing objects of each CSV line (made via Class::Accessor). the field names given upon construction are
	   available as accessors and can be used to get or set values. for more information please see the notes below or the perldoc for
	   Class::Accessor. the lines accessor itself can also be updated/retrieved in the same way as an individual line's fields (examples
	   below).

	   Example
	       retrieving the lines:

		 my @lines = @{$csv->lines()};

	       removing the first line:

		 shift @lines;

		 $csv->lines(@lines);

	       sorting the lines:

		 @lines = sort { $a->userid() <=> $b->userid() } @lines;

		 $csv->lines(@lines);

	       sorting the lines (all-in-one way):

		 $csv->lines([ sort { $a->userid() <=> $b->userid() } @{$csv->lines()} ]);

	   Retrieving a field's value
	       there are two ways to retrieve a field's value (as documented in Class::Accessor). firstly you can call the field name on the
	       object and secondly you can call "get" on the object with the field name as the argument (multiple field names can be specified
	       to retrieve an array of values). examples are below.

		 my $value = $line->test();

	       OR

		 my $value = $line->get('test');

	       OR

		 my @values = $line->get(qw/test test2 test3/);

	   Setting a field's value
	       setting a field's value is similar to getting a field's value. there are two ways to set a field's value (as documented in
	       Class::Accessor).  firstly you can simply call the field name on the object with the value as the argument or secondly you can call
	       "set" on the object with a hash of fields and their values to set (this isn't standard in Class::Accessor, I have overloaded the
	       "set" method to allow this). examples are below.

		 $line->test('123');

	       OR

		 $line->set( test => '123' );

	       OR

		 $line->set(
		   test  => '123',
		   test2 => '456'
		 );

	   Retrieving a line as a string
	       to retrieve a line as a string simply call "string" on the object.

		 my $string = $line->string();

       new_line
	   returns a new line object. this can be useful to "splice" a line into lines (see example below). you can pass the values of the
	   line as an ARRAY ref or a HASH ref.

	   Example
		 my $line = $csv->new_line({ userid => 123, domainname => 'splicey.com' });
		 my @lines = @{$csv->lines()};
		 splice(@lines, 1, 0, $line);
		 $csv->lines(\@lines);

	       OR

		 splice(@{$csv->lines()}, 1, 0, $csv->new_line({ userid => 123, domainname => 'splicey.com' }));

       add_line
	   adds a line to the lines stack. this is mainly useful when the new constructor is used but can of course be used with any constructor.
	   it will add a new line to the end of the lines stack. you can pass the values of the line as an ARRAY ref or a HASH ref. examples of
	   how to use this are below.

	   Example
		 $csv->add_line(['house', 100000, 4]);

		 $csv->add_line({
		   item     => 'house',
		   cost     => 100000,
		   bedrooms => 4
		 });

       OUTPUT

       string
	   returns the object as a string (CSV file format).

       print
	   calls "print" on string (prints the CSV to STDOUT).
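
	   Since string returns the whole CSV document, saving it to disk amounts to printing that string to a file handle. The following
	   is a minimal sketch, assuming an object that provides the string method described above; the helper name write_csv_file is
	   hypothetical and not part of Class::CSV.

```perl
use strict;
use warnings;

# Hypothetical helper, not part of Class::CSV: writes whatever $csv->string()
# returns to the file at $path. $csv is assumed to be a Class::CSV object (or
# any object offering a string method).
sub write_csv_file {
    my ($csv, $path) = @_;
    open(my $out, '>', $path) or die "cannot open $path: $!";
    print {$out} $csv->string();
    close($out) or die "cannot close $path: $!";
    return $path;
}
```

	   With a $csv object built as in the SYNOPSIS, a call such as write_csv_file($csv, 'out.csv') would write the file; the file name
	   here is only an illustration.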

SEE ALSO

       Text::CSV_XS, Class::Accessor

AUTHOR

       David Radunz, <david@boxen.net>

COPYRIGHT AND LICENSE

       Copyright 2004 by David Radunz

       This library is free software; you can redistribute it and/or modify it under the same terms as Perl itself.

perl v5.10.0							    2007-02-08								  CSV(3pm)