Visit Our UNIX and Linux User Community


Parse csv file


 
Thread Tools Search this Thread
Top Forums Shell Programming and Scripting Parse csv file
# 1  
Old 08-17-2009
Parse csv file

Hi,

Our requirement is to parse the input file(.csv format). The each column in the file is delimited with comma. We need to take each column and apply some business validation rule.

If data itself contains comma, then those fields are enclosed with double quotes ("). We can see this double quotes symbols by opening the csv file in notepad or wordpad.

Some times, data itself contains double quotes, in those cases the data is enclosed in double quotes also for each double quotes, one extra double quotes is added. You can see them if we open in notepad or word pad.

Sample example of the csv file when opened in notepad is as shown below
Test.csv
========
This,test is,with out comma or double quotes
This,"test, contains",comma
This,"test contains ""double quotes"" ","and , comma"

Output required is:
1st Line:
Field1:This
Field2:test is
Field3:with out comma or double quotes

2nd Line:
Field1:This
Field2:test, contains
Field3:comma

3rd Line:
Field1:This
Field2:test contains "double quotes"
Field3:and , comma


If you save the above file contents into a csv file and open in excel, then you will see column values in each line matches my required output.

I tried hard and could not get right way to do this. Please help me in achieving the above.

I have attached test.csv for the reference purpose.

Thanks in advance,
Venkat

Last edited by vfrg; 08-17-2009 at 07:39 AM..
# 2  
Old 08-17-2009
Could any one please help in writing script..I am somewhat stuck with this script.

Thanks.
Venkat
# 3  
Old 08-17-2009
Bumping up posts or double posting is not permitted in these forums.

Please read the rules, which you agreed to when you registered, if you have not already done so.

You may receive an infraction for this. If so, don't worry, just try to follow the rules more carefully. The infraction will expire in the near future

Thank You.

The UNIX and Linux Forums.
# 4  
Old 08-17-2009
# 5  
Old 08-17-2009
I am sorry for this. Henceforth, I would be more careful.

Thanks
Venkat

Quote:
Originally Posted by Franklin52
Bumping up posts or double posting is not permitted in these forums.

Please read the rules, which you agreed to when you registered, if you have not already done so.

You may receive an infraction for this. If so, don't worry, just try to follow the rules more carefully. The infraction will expire in the near future

Thank You.

The UNIX and Linux Forums.
# 6  
Old 08-17-2009
Code:
while(<DATA>){
	my @tmp = split(/,(?=(?:[^"]*$)|(?:(?:[^"]*"[^"]*"[^"]*)*$))/,$_);
	print "$. Line:\n";
	for(my $i=1;$i<=$#tmp+1;$i++){
		$tmp[$i-1] =~ s/(^"|"$)//g;
		$tmp[$i-1] =~ s/""/"/g;
		print "Field($i+1):$tmp[$i-1]\n";
	}
	print "\n";
}
__DATA__
This,test is,with out comma or double quotes
This,"test, contains",comma
This,"test contains ""double quotes"" ","and , comma"

# 7  
Old 08-18-2009
Hi summer_cherry,

Thank you very much for the code.

I have written small shell scripts using basic commands, hence my expertsie in Unix shell is limited.

Could you please let me know what should be passed to <Data> in while loop. I assume it should be input file. I tried passing that, but program did not work.

I know, i am missing something here. It would be really great if this script can be changed by substituting with right values for <Data> and so on.

Thanks.
Venkat


Quote:
Originally Posted by summer_cherry
Code:
while(<DATA>){
    my @tmp = split(/,(?=(?:[^"]*$)|(?:(?:[^"]*"[^"]*"[^"]*)*$))/,$_);
    print "$. Line:\n";
    for(my $i=1;$i<=$#tmp+1;$i++){
        $tmp[$i-1] =~ s/(^"|"$)//g;
        $tmp[$i-1] =~ s/""/"/g;
        print "Field($i+1):$tmp[$i-1]\n";
    }
    print "\n";
}
__DATA__
This,test is,with out comma or double quotes
This,"test, contains",comma
This,"test contains ""double quotes"" ","and , comma"



---------- Post updated 08-18-09 at 12:25 AM ---------- Previous update was 08-17-09 at 11:43 PM ----------

Hi summer_cherry,

Please ignore my previous request. I was able to execute this script.

Thank you very much for your help.

Thanks,
Venkat


Quote:
Originally Posted by vfrg
Hi summer_cherry,

Thank you very much for the code.

I have written small shell scripts using basic commands, hence my expertsie in Unix shell is limited.

Could you please let me know what should be passed to <Data> in while loop. I assume it should be input file. I tried passing that, but program did not work.

I know, i am missing something here. It would be really great if this script can be changed by substituting with right values for <Data> and so on.

Thanks.
Venkat

Previous Thread | Next Thread
Test Your Knowledge in Computers #68
Difficulty: Easy
Awk is abbreviated from the names of the developers - Aho, Weinberger, and Kernighan.
True or False?

10 More Discussions You Might Find Interesting

1. Shell Programming and Scripting

How to parse this file using awk and output in CSV format?

My source file looks like this: Cust-Number = "101" Cust-Name="Joe" Cust-Town="London" Cust-hobby="tennis" Cust-purchase="200" Cust-Number = "102" Cust-Name="Mary" Cust-Town="Newyork" Cust-hobby="reading" Cust-purchase="125" Now I want to parse this file (leaving out hobby) and... (10 Replies)
Discussion started by: Balav
10 Replies

2. Shell Programming and Scripting

Consolidate several lines of a CSV file with firewall rules, in order to parse them easier?

Consolidate several lines of a CSV file with firewall rules Hi guys. I have a CSV file, which I created using an HTML export from a Check Point firewall policy. Each rule is represented as several lines, in some cases. That occurs when a rule has several address sources, destinations or... (4 Replies)
Discussion started by: starriol
4 Replies

3. UNIX for Dummies Questions & Answers

Help to parse csv file with shell script

Hello ! I am very aware that this is not the first time this question is asked here, because I have already read a lot of previous answers, but none of them worked, so... As said in the title, I want to read a csv file with a bash script. Here is a sample of the file: ... (4 Replies)
Discussion started by: Grhyll
4 Replies

4. Shell Programming and Scripting

Korn shell program to parse CSV text file and insert values into Oracle database

Enclosed is comma separated text file. I need to write a korn shell program that will parse the text file and insert the values into Oracle database. I need to write the korn shell program on Red Hat Enterprise Linux server. Oracle database is 10g. (15 Replies)
Discussion started by: shellguy
15 Replies

5. Shell Programming and Scripting

how to parse this file and obtain a .csv or .xls

Hello Expert, I have a file in the following format: SYNTAX_VERSION 5 MONITOR "NAME_TEMPLATES" DESCRIPTION "Monitors for contents of error " INTERVAL "1m" MONPROG "script.sh NAME_TEMPLATES" MAXTHRESHOLD GEN_BELOW_RESET SEVERITY Major ... (17 Replies)
Discussion started by: Ant-one
17 Replies

6. Shell Programming and Scripting

How to read and parse the content of csv file containing # as delimeter into fields using Bash?

#!/bin/bash i=0 cat 1.csv | while read fileline do echo "$fileline" IFS="#" flds=( $fileline ) nrofflds=${#flds} echo "noof fields$nrofflds" fld=0 while do echo "noof counter$fld" echo "$nrofflds" #fld1="${flds}" trying to store the content of line to fields but i... (4 Replies)
Discussion started by: barani75
4 Replies

7. Shell Programming and Scripting

Parse XML file into CSV with shell?

Hi, It's been a few years since college when I did stuff like this all the time. Can someone help me figure out how to best tackle this problem? I need to parse a file full of entries that look like this: <eq action="A" sectyType="0" symbol="PGR" exch="CA" curr="VEF" sess="NORM"... (7 Replies)
Discussion started by: Pcushing
7 Replies

8. Shell Programming and Scripting

parse csv file, sha1 hash and output

I have a file, not really a csv, but containing delineated data just the same. Lets call that file "raw_data.txt". It contains data in the format of company name:fein number like this: first company name:123456789 second company name:987654321 what i need to do is read this file, apply... (11 Replies)
Discussion started by: FreddyG
11 Replies

9. Shell Programming and Scripting

CSV File parse help in Perl

Folks, I have a bit of an issue trying to obtain some data from a csv file using PERL. I can sort the file and remove any duplicates leaving only 4 or 5 rows containing data. My problem is that the data contained in the original file contains a lot more columns and when I try ro run this script... (13 Replies)
Discussion started by: lodey
13 Replies

10. UNIX for Advanced & Expert Users

How to Parse a CSV file into a Different Format

Hi I have a CSV file with me in this format Currency, USD, EUR, USD, 1.00, 1.32, EUR, 0.66, 1.00, How do I transpose the file to get to the format below. currency, currency, rate USD, USD, 1.00 USD, EUR, 1.32 EUR, USD, 0.66 EUR, EUR, 1.00 Thanks for your help We are using... (2 Replies)
Discussion started by: cdesiks
2 Replies

Featured Tech Videos