Field validations in multiple files CSV


 
# 1  
Old 01-30-2012

Hi,

I am a regular reader of this forum. Thanks in advance to everyone.

The sample files are given below.

INDATA (main data)
Code:
fld1Çfld2Çfld3…..
fld1Çfld2Çfld3…..
fld1Çfld2Çfld3…..
fld1Çfld2Çfld3…..
fld1Çfld2Çfld3…..
.
.
N records (e.g. 140,000)

GRPDATA (reference file)
Code:
fld1Çfld2Çfld3…..
fld1Çfld2Çfld3…..
fld1Çfld2Çfld3…..
.
.
100 or 150 records (small file)

I have to prepare one output file from INDATA by looking up fields in the GRPDATA file, which means I have to parse both files simultaneously.
It is like a transformation process.

The script I prepared without awk works perfectly, but it takes a huge amount of time, about 2 or 3 hours, to generate the output file.
When I use awk, I am unable to parse both files simultaneously. It processes them sequentially, i.e. IN_DATA and then GRPDATA.
I have to open both files at the same time: the first record in IN_DATA should be checked against all the records in GRPDATA (until a match is found).
The second record in IN_DATA should be checked against all the records in GRPDATA (until a match is found),
and so on.

Please shed some light on this; the solution should not hurt performance.
It is very urgent, as I am almost at the deadline.


My coding
=========
Code:
# Include file for environment variables
. /opt/hyperion/Payer_Transformation/Scripts/PayersTrnEnv.env

awk 'BEGIN {
    FS  = "Ç"
    OFS = "|"
}
{
    # (attempt) handle the IN_DATA records here
    FILENAME == "$IN_DATA"
    if ($7 == "RETAIL") {
        Tar_Loc = $7
        if ($9 <= 83)
            Tar_Lob = "RETAIL30"
        else
            Tar_Lob = "RETAIL90"
    }

    if ($7 == "MAIL") {
        Tar_Loc = $7
        Tar_Lob = "MAIL"
        if ($2 ~ /PCS/) {
            if ($3 ~ /V/)
                Group_ID = "V"
            else if ($3 ~ /2407/ || $3 ~ /2428/)
                Group_ID = "HME"
        }
    }
    Tar_Year    = $1
    Tar_Num_Rxs = $11
    Tar_Tot_Rev = $12
    Tar_GP_Wac  = $13

    # (attempt) look up the group name in GRP_DATA
    FILENAME == "$GRP_DATA"
    while ((getline grp < "$GRP_DATA") > 0) {
        split(grp, grpfield, "Ç")
        ## if (grpfield[2] == $2 && grpfield[3] == Group_ID && grpfield[4] == $7)
        ##     Tar_Grp_Nam = grpfield[1]
        if (Tar_Grp_Nam == "" && grpfield[2] == $2 && grpfield[3] == Group_ID && grpfield[4] == "MAIL")
            Tar_Grp_Nam = grpfield[1]
    }
    print Tar_Year, Tar_Loc, Tar_Lob, Tar_Grp_Nam, Tar_Num_Rxs, Tar_Tot_Rev, Tar_GP_Wac
}' $IN_DATA $GRP_DATA

Moderator's Comments:
Please use code tags next time for your code and data, and avoid fancy colors and fonts.

Last edited by vbe; 01-30-2012 at 11:12 AM..
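
Two things keep the attempt above from doing a per-record lookup: shell variables such as $IN_DATA are not expanded inside a single-quoted awk program, and awk reads the files named on its command line one after the other. A common way around this is to read the small reference file first, keep it in an array, and then stream the large file with one array lookup per record. The following is only a sketch under assumptions taken from the attempt: fields 2, 3 and 4 of GRP_DATA are compared with $2, Group_ID and $7 of IN_DATA, field 1 of GRP_DATA is the group name, and a MAIL entry is used as a fallback. The function name `transform` and the sample rows are made up for illustration.

```shell
#!/bin/sh
# transform GRP_FILE IN_FILE   (hypothetical helper name)
transform() {
    awk '
    BEGIN { FS = "Ç"; OFS = "|" }

    # Pass 1 (reference file): NR == FNR holds only while the first file is read.
    NR == FNR {
        key = $2 SUBSEP $3 SUBSEP $4
        if (!(key in grp_name))        # keep the first match only
            grp_name[key] = $1
        next
    }

    # Pass 2 (main data): reset per-record values so nothing leaks from the previous line.
    {
        tar_loc = tar_lob = group_id = ""
        if ($7 == "RETAIL") {
            tar_loc = $7
            tar_lob = ($9 + 0 <= 83) ? "RETAIL30" : "RETAIL90"
        } else if ($7 == "MAIL") {
            tar_loc = $7
            tar_lob = "MAIL"
            if ($2 ~ /PCS/) {
                if ($3 ~ /V/) group_id = "V"
                else if ($3 ~ /2407|2428/) group_id = "HME"
            }
        }
        # Assumed rule: try the exact location first, then the MAIL fallback.
        key = $2 SUBSEP group_id SUBSEP $7
        if (!(key in grp_name))
            key = $2 SUBSEP group_id SUBSEP "MAIL"
        tar_grp = (key in grp_name) ? grp_name[key] : ""
        print $1, tar_loc, tar_lob, tar_grp, $11, $12, $13
    }' "$1" "$2"
}

# Made-up sample data, only to show the call.
tmp=$(mktemp -d)
printf '%s\n' 'GRP_AÇPCSÇVÇMAIL' 'GRP_BÇPCSÇHMEÇMAIL' > "$tmp/grp"
printf '%s\n' \
    '2011ÇPCSÇV123ÇxÇxÇxÇMAILÇxÇ90ÇxÇ10Ç200.50Ç35.25' \
    '2011ÇPCSÇ2407ÇxÇxÇxÇRETAILÇxÇ30ÇxÇ5Ç50.00Ç9.10' > "$tmp/in"
transform "$tmp/grp" "$tmp/in"
# 2011|MAIL|MAIL|GRP_A|10|200.50|35.25
# 2011|RETAIL|RETAIL30||5|50.00|9.10
rm -rf "$tmp"
```

With about 150 reference records held in memory, each of the 140,000 main records costs one or two array lookups instead of a scan of the whole reference file. Note that the `NR == FNR` test assumes the reference file is not empty.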
# 2  
Old 01-30-2012
You should have shown the desired output; that would make your requirement easier to understand.

In any event, an easy way to find which records of FileA match a line of FileB:
Code:
grep -f FileB FileA
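
For reference, `grep -f FileB FileA` reads one pattern per line from FileB and prints every line of FileA that matches any of them. The sketch below uses made-up, comma-separated sample lines and adds `-F` (not part of the suggestion above) so that the patterns are treated as fixed strings rather than regular expressions. The match can occur anywhere in the line, so this does not compare specific fields.

```shell
#!/bin/sh
# Hypothetical sample files, only for illustration.
tmp=$(mktemp -d)
printf '%s\n' 'PCS,V' 'ABC,X' > "$tmp/FileB"                               # patterns, one per line
printf '%s\n' '2011,PCS,V123' '2011,XYZ,9' '2012,ABC,X77' > "$tmp/FileA"   # data to search

# Print every line of FileA that contains any line of FileB as a substring.
matches=$(grep -F -f "$tmp/FileB" "$tmp/FileA")
printf '%s\n' "$matches"
# 2011,PCS,V123
# 2012,ABC,X77
rm -rf "$tmp"
```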

# 3  
Old 01-31-2012
Thanks, Shell Life.

I am struggling to use grep inside awk here.

Code:
awk '
BEGIN {
    FS = "Ç"
    OFS = "|"
}
{
    ## Need to compare (pseudocode):
    ## if (indata.$2 == grpdata.col2 && indata.$7 == grpdata.col4)
    ##     groupname = grpdata.col3

    ##### not working ##### grep "^[^,]*,$2,[^,]*,$7,[^,]*,[^,]*," "'$GRP_DATA'"
    print groupname
}' $IN_DATA

If I can compare fields in IN_DATA with fields in GRP_DATA (the reference file), the problem will be solved.

Thanks in advance.
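
awk has no `indata.` / `grpdata.` record syntax, and running grep once per record would start 140,000 extra processes. One alternative is to pass the reference file name into awk with `-v` and load it once in the `BEGIN` block with `getline`, then look each IN_DATA record up in memory. This is a minimal sketch, assuming the column positions from the pseudocode above (GRP_DATA columns 2 and 4 form the key, column 3 is the group name). The function name `lookup_group`, the variable `grpfile` and the sample rows are hypothetical, and the key fields are printed next to the group name only to make the result easier to read.

```shell
#!/bin/sh
# lookup_group GRP_FILE IN_FILE   (hypothetical helper name)
lookup_group() {
    awk -v grpfile="$1" '
    BEGIN {
        FS = "Ç"; OFS = "|"
        # Read the small reference file once; "getline var < file" leaves $0 untouched.
        while ((getline line < grpfile) > 0) {
            split(line, g, FS)
            if (!((g[2], g[4]) in groupname))    # first matching record wins
                groupname[g[2], g[4]] = g[3]
        }
        close(grpfile)
    }
    {
        # One in-memory lookup per IN_DATA record instead of one grep per record.
        name = (($2, $7) in groupname) ? groupname[$2, $7] : ""
        print $2, $7, name
    }' "$2"
}

# Made-up sample data, only to show the call.
tmp=$(mktemp -d)
printf '%s\n' '1ÇPCSÇGRP_AÇMAIL' '2ÇXYZÇGRP_BÇRETAIL' > "$tmp/grp"
printf '%s\n' '2011ÇPCSÇV1ÇaÇbÇcÇMAIL' '2011ÇXYZÇ9ÇaÇbÇcÇMAIL' > "$tmp/in"
lookup_group "$tmp/grp" "$tmp/in"
# PCS|MAIL|GRP_A
# XYZ|MAIL|
rm -rf "$tmp"
```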