Duplicate rows in CSV files based on values
Post 302514439 by DGPickett on Friday 15th of April 2011 04:55:45 PM
Sort the file, then use sed with two lines in the buffer (the pattern space) to fold adjacent lines together.
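
A minimal sketch of that idea, under some assumptions the post does not state: the key is the first comma-separated field, "fold" means appending the remaining fields of a line onto the previous line when both share that key, and the function name fold_by_key is made up for illustration.

```shell
#!/bin/sh
# fold_by_key is a hypothetical helper name, not something from the thread.
# It reads comma-separated lines on stdin and assumes the key is field 1.
#
# How the sed part works:
#   :a    label to loop back to after a successful merge
#   $!N   unless on the last input line, append the next line to the pattern space
#   s///  if both lines start with the same first field, join them into one line
#   ta    if a merge happened, loop and try to merge the following line too
#   P;D   otherwise print the first line, drop it, and restart with the second
fold_by_key() {
    LC_ALL=C sort | sed -e ':a' \
        -e '$!N' \
        -e 's/^\([^,]*\),\(.*\)\n\1,\(.*\)$/\1,\2,\3/' \
        -e 'ta' \
        -e 'P' \
        -e 'D'
}
```

Called as `fold_by_key < input.csv`, this merges every run of lines that share a first field into a single line and passes other lines through unchanged. Sorting first is what makes the two-line window sufficient, since equal keys end up adjacent.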
 

CSV(3pm)              User Contributed Perl Documentation              CSV(3pm)

NAME
       Class::CSV - Class based CSV parser/writer

SYNOPSIS
         use Class::CSV;

         my $csv = Class::CSV->parse(
           filename => 'test.csv',
           fields   => [qw/item qty sub_total/]
         );

         foreach my $line (@{$csv->lines()}) {
           $line->sub_total('$'. sprintf("%0.2f", $line->sub_total()));

           print 'Item:     '. $line->item(). "\n".
                 'Qty:      '. $line->qty(). "\n".
                 'SubTotal: '. $line->sub_total(). "\n";
         }

         my $cvs_as_string = $csv->string();

         $csv->print();

         my $csv = Class::CSV->new(
           fields         => [qw/userid username/],
           line_separator => "\r\n"
         );

         $csv->add_line([2063, 'testuser']);
         $csv->add_line({
           userid   => 2064,
           username => 'testuser2'
         });

DESCRIPTION
       This module can be used to create objects from CSV files, or to create
       CSV files from objects. Text::CSV_XS is used for parsing and creating
       CSV file lines, so any limitations in Text::CSV_XS will of course be
       inherent in this module.

EXPORT
       None by default.

METHOD
   CONSTRUCTOR
       parse
           The parse constructor takes a hash as its parameter. The options
           that can be in this hash are detailed below.

           Required Options
               o   fields - an array ref containing the list of field names
                   to use for each row. Some reserved words cannot be used as
                   field names; no checking is done for this at the moment,
                   but it is something to be aware of. The reserved field
                   names are "string", "set" and "get". Field names also
                   cannot contain whitespace or any characters that would not
                   be allowed in a method name.

           Source Options (only one of these is needed)
               o   filename - the path of the CSV file to be opened and
                   parsed.

               o   filehandle - the file handle of the CSV file to be parsed.

               o   objects - an array ref of objects (e.g. Class::DBI
                   objects). For this to work properly, the field names
                   provided in fields need to correspond to the field names
                   of the objects in the array ref.

               o   classdbi_objects - deprecated, use objects instead. Using
                   classdbi_objects will still work, but it is advisable to
                   update your code.

           Optional Options
               o   line_separator - the line separator to be included at the
                   end of every line, defaulting to "\n" (Unix newline).

       new
           The new constructor takes a hash as its parameter. The same
           options detailed in parse apply to new, except that no Source
           Options can be used. This constructor creates a blank CSV object
           to which lines can be added via add_line.

   ACCESSING
       lines
           Returns an array ref containing objects of each CSV line (made via
           Class::Accessor). The field names given upon construction are
           available as accessors and can be set or get. For more information
           please see the notes below or the perldoc for Class::Accessor. The
           lines accessor can also be updated or retrieved in the same way as
           an individual line's fields (examples below).

           Example

           Retrieving the lines:

             my @lines = @{$csv->lines()};

           Removing the first line:

             shift @lines;
             $csv->lines(@lines);

           Sorting the lines:

             @lines = sort { $a->userid() <=> $b->userid() } @lines;
             $csv->lines(@lines);

           Sorting the lines (all-in-one way):

             $csv->lines([ sort { $a->userid() <=> $b->userid() } @{$csv->lines()} ]);

           Retrieving a field's value
           There are two ways to retrieve a field's value (as documented in
           Class::Accessor). Firstly, you can call the field name on the
           object. Secondly, you can call "get" on the object with the field
           name as the argument (multiple field names can be specified to
           retrieve an array of values). Examples are below.

             my $value = $line->test();

           OR

             my $value = $line->get('test');

           OR

             my @values = $line->get(qw/test test2 test3/);

           Setting a field's value
           Setting a field's value is similar to getting a field's value.
           There are two ways to set a field's value (as documented in
           Class::Accessor). Firstly, you can simply call the field name on
           the object with the value as the argument. Secondly, you can call
           "set" on the object with a hash of fields and their values to set
           (this isn't standard in Class::Accessor; I have overloaded the
           "set" method to allow this). Examples are below.

             $line->test('123');

           OR

             $line->set( test => '123' );

           OR

             $line->set(
               test  => '123',
               test2 => '456'
             );

           Retrieving a line as a string
           To retrieve a line as a string, simply call "string" on the
           object.

             my $string = $line->string();

       new_line
           Returns a new line object. This can be useful to "splice" a line
           into lines (see example below). You can pass the values of the
           line as an ARRAY ref or a HASH ref.

           Example

             my $line = $csv->new_line({ userid => 123, domainname => 'splicey.com' });
             my @lines = @{$csv->lines()};
             splice(@lines, 1, 0, $line);

           OR

             splice(@{$csv->lines()}, 1, 0, $csv->new_line({ userid => 123, domainname => 'splicey.com' }));

       add_line
           Adds a line to the lines stack. This is mainly useful when the new
           constructor is used, but it can of course be used with any
           constructor. It will add a new line to the end of the lines stack.
           You can pass the values of the line as an ARRAY ref or a HASH ref.
           Examples of how to use this are below.

           Example

             $csv->add_line(['house', 100000, 4]);
             $csv->add_line({
               item     => 'house',
               cost     => 100000,
               bedrooms => 4
             });

   OUTPUT
       string
           Returns the object as a string (CSV file format).

       print
           Calls "print" on string (prints the CSV to STDOUT).

SEE ALSO
       Text::CSV_XS, Class::Accessor

AUTHOR
       David Radunz, <david@boxen.net>

COPYRIGHT AND LICENSE
       Copyright 2004 by David Radunz

       This library is free software; you can redistribute it and/or modify it
       under the same terms as Perl itself.

perl v5.10.0                      2007-02-08                          CSV(3pm)