Csv file parsing and validating


 
# 15  
Old 04-21-2014
No, the above code does not work as required.
The result I'm getting is:
goodrec:
Code:
abc,john,MI,201

badrec:
Code:
abc,john,MI,201
22,Lilly,CA,405
33,Taylor,NY,415
44,Reet5,NY,258
55,Richard,CA,21Q

---------- Post updated at 04:09 AM ---------- Previous update was at 12:57 AM ----------

Hi,
Are there any updates on the above problem?
I also found a similar post, but there the structure of the file is not passed directly. Instead, a separate file called pat is created and the patterns are specified in it. I don't want to do that; I want to pass the definition (the structure of the file) directly to validate length and datatype.

Thanks,
Shree
# 16  
Old 04-21-2014
Hey Shree, you can't keep changing the requirement.
Your first sample had a header, so the code was written to meet that requirement.

With the data you gave most recently, I ran the code below; it ran fine and produced the expected results.
Code:
awk -F "," 'NR == FNR {a[NR] = $2; next}
  {for(i = 1; i <= NF; i++)
  {if((a[i] == "Numeric" && ($i + 0) == $i) || (a[i] == "String" && $i != "")) {f = 1} else {f = 0};
  if(f == 0) {print $0 > "badrec"; next}} print $0 > "goodrec"}' conf.csv file.csv

Code:
$ cat conf.csv
id,Numeric
name,String
state,String
street_No,Numeric
$ cat file.csv
abc,john,MI,201
22,Lilly,CA,405
33,Taylor,NY,415
44,Reet5,NY,258
55,Richard,CA,21Q
$ awk -F "," 'NR == FNR {a[NR] = $2; next}
  {for(i = 1; i <= NF; i++)
  {if((a[i] == "Numeric" && ($i + 0) == $i) || (a[i] == "String" && $i != "")) {f = 1} else {f = 0};
  if(f == 0) {print $0 > "badrec"; next}} print $0 > "goodrec"}' conf.csv file.csv
$ cat goodrec
22,Lilly,CA,405
33,Taylor,NY,415
44,Reet5,NY,258
$ cat badrec
abc,john,MI,201
55,Richard,CA,21Q
$


Last edited by SriniShoo; 04-21-2014 at 06:47 AM..
# 17  
Old 04-21-2014
Sorry for changing the requirement.

When I run the script below, it writes all the records (good and bad) to the badrec file and does not create the goodrec file at all.

Code:
awk -F "," 'NR == FNR {a[NR] = $2; next}{for(i = 1; i <= NF; i++){if((a[i] == "Numeric" && ($i + 0) == $i) || (a[i] == "String" && $i != "")) {f = 1} else {f = 0};if(f == 0) {print $0 > "badrec"; next}} print $0 > "goodrec"}' cf df

I don't know what's going wrong here.
# 18  
Old 04-21-2014
Are you sure cf and df have the same content as provided in the previous post?
# 19  
Old 04-21-2014
Yes. My cf and df are given below:
cf
Code:
id,Numeric
name,String
state,String
street_No,Numeric

df
Code:
abc,john,MI,201
22,Lilly,CA,405
33,Taylor,NY,415
44,Reet5,NY,258
55,Richard,CA,21Q

---------- Post updated at 05:23 AM ---------- Previous update was at 05:13 AM ----------

Hi, now it's working fine. I deleted the existing cf and df files and re-created them. I also specified cf.txt and df.txt at the end of the script.
I want to add some more requirements to the working solution. Shall I open a new thread or continue in this one?

Thanks for your guidance and help.

Regards,
Shree
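
A note on the failure reported in post # 17: the thread does not say what was wrong with the original cf and df files. One cause that would match the symptom (every record in badrec, no goodrec at all) is a configuration file with Windows line endings, because the type is then read as "Numeric" followed by a carriage return and never equals "Numeric". This is an assumption, not something confirmed above. A minimal sketch for checking and cleaning such a file, using the hypothetical file names cf_crlf and cf_clean:

```shell
# Hypothetical reproduction: write a config file with Windows (CRLF) line endings
printf 'id,Numeric\r\nname,String\r\n' > cf_crlf

# Count the lines that end in a carriage return; 0 means the file is clean
crlf_lines=$(awk '/\r$/ {n++} END {print n + 0}' cf_crlf)
echo "lines ending in CR: $crlf_lines"

# Strip the carriage returns into a clean copy (tr is POSIX)
tr -d '\r' < cf_crlf > cf_clean
clean_lines=$(awk '/\r$/ {n++} END {print n + 0}' cf_clean)
echo "after cleanup: $clean_lines"
```

If the count is non-zero for either input file, running the validation against the cleaned copies should behave the same as re-creating the files by hand.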
# 20  
Old 04-21-2014
Post your questions here.
# 21  
Old 04-21-2014
The above solution works fine for the configuration file (cf.txt) and the data file (df.txt): it validates the datatypes in the data file against the configuration file. My cf.txt was:
Code:
id,Numeric
name,String
state,String
street_No,Numeric

Suppose I have a configuration file cf.txt that looks like this:
Code:
id,Numeric(2),NOT NULL
name,String(20)
state,String(10),NOT NULL
street_No,Numeric(4)

Here I want to validate the datatype, the field length and whether a field may be empty.

And suppose my data file df.txt looks like this:
Code:
abc,john,MI,201
22,Lilly,CA,405
33,Richard,CA,21Q
444,Reet5,NY,258
55,Taylor,GI,3333
66,Merry,,3333
77,,FL,22

The expected result in goodrec is:
Code:
22,Lilly,CA,405
55,Taylor,GI,3333
77,,FL,22

In badrec I should get:
Code:
abc,john,MI,201
33,Richard,CA,21Q
444,Reet5,NY,258
66,Merry,,3333

How can I modify the code below to check the datatype, field length and nullable values?
Code:
awk -F "," 'NR == FNR {a[NR] = $2; next} {for(i = 1; i <= NF; i++) {if((a[i] == "Numeric" && ($i + 0) == $i) || (a[i] == "String" && $i != "")) {f = 1} else {f = 0};if(f == 0) {print $0 > "badrec"; next}} print $0 > "goodrec"}' cf.txt df.txt
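
One possible way to extend the one-liner is sketched below. It is not an answer from the thread, and it rests on a few assumptions that are also marked in the comments: "Numeric" means digits only (no sign or decimal point), a record whose field count differs from the number of configuration lines is bad, and an empty field passes unless its configuration line ends in NOT NULL. The array names type, maxlen and notnull are made up for this sketch.

```shell
# Recreate the sample files from the post above
cat > cf.txt <<'EOF'
id,Numeric(2),NOT NULL
name,String(20)
state,String(10),NOT NULL
street_No,Numeric(4)
EOF

cat > df.txt <<'EOF'
abc,john,MI,201
22,Lilly,CA,405
33,Richard,CA,21Q
444,Reet5,NY,258
55,Taylor,GI,3333
66,Merry,,3333
77,,FL,22
EOF

# Start from empty output files so both exist even if one gets no records
: > goodrec
: > badrec

awk -F "," '
# Pass 1 (cf.txt): split "Numeric(2)" into type and maximum length,
# and note whether the third column says NOT NULL
NR == FNR {
  split($2, p, /[()]/)
  type[FNR] = p[1]
  maxlen[FNR] = p[2] + 0          # 0 means no length given
  notnull[FNR] = ($3 == "NOT NULL")
  ncol = FNR
  next
}
# Pass 2 (df.txt): check every configured column of the record
{
  ok = (NF == ncol)               # assumption: wrong field count is bad
  for (i = 1; ok && i <= ncol; i++) {
    if ($i == "") {               # empty field: bad only if NOT NULL
      if (notnull[i]) ok = 0
      continue
    }
    if (maxlen[i] && length($i) > maxlen[i]) ok = 0               # too long
    else if (type[i] == "Numeric" && $i !~ /^[0-9]+$/) ok = 0     # assumption: digits only
    else if (type[i] != "Numeric" && type[i] != "String") ok = 0  # unknown type
  }
  print > (ok ? "goodrec" : "badrec")
}' cf.txt df.txt
```

With the sample files above, goodrec and badrec should contain the records listed as expected in the post. Because a missing length is treated as "no limit", the older two-column cf.txt (id,Numeric and so on) should still work with the same script.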
