CSV File parse help in Perl
Post 302179964 by lodey on Saturday 29th of March 2008 08:06:16 AM

Folks,

I have a bit of an issue trying to obtain some data from a CSV file using Perl. I can sort the file and remove any duplicates, leaving only 4 or 5 rows of data. My problem is that the original file contains a lot more columns, and when I run this script against it, it finds that every row is unique.

I have the following fields within the original file:
PROGRAM,ID,OP,PROBE_CARD,DEVREVSTEP,TEST_START,TESTER_ID

The data which I need to obtain and sort is within the OP, PROBE_CARD and TESTER_ID fields.

How can I go about this?

The code that I use after manually deleting the fields that I do not require is as follows:

Code:
#!/usr/bin/perl
use strict;
use warnings;

my $csvfile    = 'probecards.csv';
my $newfile    = 'sorted.csv';
my $fieldnames = 1;    # set to 0 if the file has no header row

# Three-argument open with lexical filehandles
open(my $in,  '<', $csvfile) or die "Couldn't open input CSV file: $!";
open(my $out, '>', $newfile) or die "Couldn't open output file: $!";

# Keep the header aside so it is not sorted in with the data
my $header;
$header = <$in> if $fieldnames;
my @data = sort <$in>;
print {$out} $header if defined $header;

# After sorting, duplicates are adjacent: skip any line equal to the previous one
my $n = 0;
my $lastline = '';
foreach my $currentline (@data) {
  next if $currentline eq $lastline;
  print {$out} $currentline;
  $lastline = $currentline;
  $n++;
}
close $in;
close $out;
print "Processing complete. In = " . scalar(@data) . " records, Out = $n records\n";

Sample CSV File:
Code:
Original File:

PROGRAM	ID	OP	PROBE_CARD	DEVREVSTEP	TEST_START	TESTER_ID
12630M196	139	2660	S25E3N36	88BCRA	16/03/2008 12:05	IN01
12630M196	1	2660	S25E3N36	88BLBHD	16/03/2008 13:04	IN04
12630M196	508	2660	S25E3N36	88BCRA	16/03/2008 13:41	IN01
12630M196	437	2660	S25E3N36	88CLNHC	16/03/2008 14:18	IN01
12630M196	465	2660	S25E3N36	88BCRA	16/03/2008 15:34	IN02
12630M196	27	2660	S25E3N36	88BCRA	16/03/2008 18:00	IN01
12630M196	18	2660	S25E3N36	88BCRA	16/03/2008 19:03	IN03


Output:

OPERATION	PROBE_CARD_ID	TESTER_ID
2660	S25E3N21	IN04
2660	S25E3N27	IN02
2660	S25E3N36	IN01
2660	S25E3N39	IN03
2660	S25E3N40	IN04

Any pointers on how I could go about this would be greatly appreciated.

Rgds

Colin
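
One possible approach, as a rough sketch rather than a definitive answer: instead of deleting columns by hand, look up the wanted columns by name in the header row, build a key from just those fields, and de-duplicate on that key with a hash. Rows that differ only in ID or TEST_START then collapse into one. The sub name `unique_rows`, the `$sep` and `@wanted` variables, and the in-memory sample are illustrative assumptions, not part of the original script. The sketch also assumes a plain delimiter with no quoted fields; the sample in the post looks tab-separated, in which case `$sep` would be "\t".

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Assumptions (not from the original post): a plain delimiter with no quoted
# fields, and header names matching the field list given in the question.
my $sep    = ',';                           # use "\t" for tab-separated input
my @wanted = qw(OP PROBE_CARD TESTER_ID);   # columns to keep, in output order

# Hypothetical helper: returns the reduced header line followed by the
# unique rows (wanted columns only), sorted as strings.
sub unique_rows {
    my ($fh) = @_;

    # Map each header name to its column index, trimming stray whitespace.
    my $header = <$fh>;
    defined $header or die "Input is empty\n";
    $header =~ s/\r?\n\z//;
    my @names = map { my $n = $_; $n =~ s/^\s+|\s+$//g; $n }
                split /\Q$sep\E/, $header;
    my %index;
    @index{@names} = 0 .. $#names;
    for my $name (@wanted) {
        die "Column '$name' not found in header\n" unless exists $index{$name};
    }
    my @cols = @index{@wanted};

    # Key each row on the wanted columns only; the hash drops duplicates.
    my %seen;
    while (my $line = <$fh>) {
        $line =~ s/\r?\n\z//;
        next unless length $line;
        my @fields = split /\Q$sep\E/, $line, -1;
        my $key = join $sep,
                  map { my $f = $fields[$_] // ''; $f =~ s/^\s+|\s+$//g; $f } @cols;
        $seen{$key}++;
    }
    return (join($sep, @wanted), sort keys %seen);
}

# Demo on a few rows taken from the sample in the post, read from an
# in-memory string so the sketch runs without any input file.
my $sample = <<'END';
PROGRAM,ID,OP,PROBE_CARD ,DEVREVSTEP,TEST_START,TESTER_ID
12630M196,139,2660,S25E3N36,88BCRA,16/03/2008 12:05,IN01
12630M196,1,2660,S25E3N36,88BLBHD,16/03/2008 13:04,IN04
12630M196,508,2660,S25E3N36,88BCRA,16/03/2008 13:41,IN01
12630M196,465,2660,S25E3N36,88BCRA,16/03/2008 15:34,IN02
END

open(my $in, '<', \$sample) or die "Couldn't open in-memory sample: $!";
print "$_\n" for unique_rows($in);
close $in;
```

To run it against the real data, open 'probecards.csv' for reading in place of the in-memory sample and print to a handle opened on 'sorted.csv'. If the file can contain quoted fields with embedded delimiters, a plain split will not be enough and a proper CSV parser such as Text::CSV_XS would be the safer choice.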
 
