Now I want to check whether each key from file 1 is present in file 2. If it is, I want to create a new file with the following headers and data:
Code:
Key, diff columns, f1_value1,f2_value1,value1_diff, f1_value2,f2_value2, value2_diff......
A,"Value2,Value3,Value4",50,50,0,100,80,20,50,45,5 and so on.
I have a file with more than 50k lines and around 60 columns.
Can someone suggest how to achieve this? I am new to shell.
Then, once you have the new array the way you want it, you can simply convert your temporary array back to CSV, for example:
Code:
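<?php
// $out (an open file handle) and $rows (your array) are placeholder names
foreach ($rows as $row)
    fputcsv($out, $row);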
In a nutshell, it is easy to process CSV files with PHP, whether from a script, directly from the command line, or interactively from the command line: most notably converting CSV files to arrays, doing array operations, and converting back to a CSV file.
So personally I would do this in PHP and not use shell scripts because PHP is built to do this kind of processing easily.
OBTW, these days I tend to quickly prototype and test my PHP ideas interactively in the shell as follows:
Code:
php -a
Then in the shell in interactive mode, I test and debug logic quickly and easily.
This is how I process CSV files. You can also do this same type of CSV processing easily in Python, BTW.
Others may have more "shell script-like" approaches for you which do not use PHP or Python; I am only describing how I approach these types of issues in CSV, JSON or other standard file formats. Since most of my work touches the Internet somehow (web servers), and those servers are mostly PHP based, I like to stick to code I can reuse and debug together, which is why I tend to use PHP over Python. Actually, if my apps were not mostly PHP based, I would use Python more.
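As a rough illustration of the Python route mentioned above, here is a minimal sketch of the same cycle: read CSV into a list of rows, operate on the rows, and write CSV back. The function name `add_total_column`, the added `total` column, and the sample data are made up for the example and are not part of the original task; the values are assumed to be integers.

```python
import csv
import io

def add_total_column(csv_text):
    """Parse CSV text into rows, append a row-sum column, return CSV text."""
    rows = list(csv.reader(io.StringIO(csv_text)))       # CSV -> list of lists
    header, data = rows[0], rows[1:]
    header.append("total")                                # hypothetical extra column
    for row in data:
        # Sum every column after the key (assumes integer values).
        row.append(str(sum(int(v) for v in row[1:])))
    out = io.StringIO()
    csv.writer(out, lineterminator="\n").writerows([header] + data)  # rows -> CSV
    return out.getvalue()

# Made-up sample data, only for illustration.
sample = "Key,Value1,Value2\nA,50,100\nB,7,8\n"
print(add_total_column(sample), end="")
```

With real files you would pass `open(path, newline="")` handles to `csv.reader` and `csv.writer` instead of `io.StringIO` objects; the in-memory version is used here only to keep the sketch self-contained.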
awk -F, -v OFS=, '
FNR == 1 {FCNT++                                    # inc file counter with every new file
         }
FNR > 1  {KEYS[$1]                                  # keep $1 in an array; duplicates collapse into one entry
          for (i=2; i<=NF; i++) W[$1,FCNT,i] = $i   # keep fields in array indexed by key, file No., field No.
         }
END      {printf "Key, diff columns"                # start printing header
          for (i=1; i<NF; i++) printf ",f1_value%d,f2_value%d,value%d_diff", i, i, i    # complete header line for all fields
          printf RS                                 # NF here is the field count of the last line read
          for (k in KEYS) {
              for (i=NF; i>1; i--) {                # for all keys, for all fields, get values
                  N1 = W[k,1,i]
                  N2 = W[k,2,i]                     # for both files,
                  D1 = N1 - N2                      # and calc difference (file 1 minus file 2, as in the sample row)
                  OUT = sprintf (",%s,%s,%s", N1, N2, D1) OUT    # collect all those in temp var OUT
                  if (D1) COLS = ",Value" (i-1) COLS             # if diff exists, collect fields in temp var COLS
              }
              print k, "\"" substr (COLS,2) "\"", substr (OUT, 2)    # print all those, cutting off leading comma
              OUT = COLS = ""                       # reset temp vars
          }
         }
' file[12]                                          # the shell expands file[12] to file1 file2; OFS=, makes print join with commas
awk '
  NR==FNR {                                         # When reading the first file (then NR is equal to FNR)
    A[$1]=$0                                        # Store the first file in array A with key $1
    next
  }
  FNR==1 {                                          # On the first line of the second file
    split($0,Header)                                # Split the header labels in array "Header"
    $1=$1 OFS "diff columns"                        # Create the first 2 field headers
    for(i=2; i<=NF; i++)
      $i=sprintf("f1_%s,f2_%s,%s_diff",$i, $i, $i)  # Create the rest of the field headers
    print                                           # Print the field headers
  }
  FNR>1 {                                           # Processing the content of file 2
    diffs=""                                        # Set the differences to ""
    if($1 in A) {                                   # If the key in $1 of file2 also occurs in file1
      split(A[$1], F)                               # Split the corresponding line of file 1 into fields in array F
      for(i=2; i<=NF; i++) {                        # For field 2 until the last field
        if($i!=F[i])                                # If there is a value difference for that field
          diffs=diffs (diffs?OFS:"") Header[i]      # Add the corresponding header label to the differences
        $i=F[i] OFS $i OFS (F[i]-$i)                # Prepend the value of file1 and append file1 value minus file2 value
      }
      $1=$1 OFS "\"" diffs "\""                     # When all differences are found, append them to field 1
      print                                         # Print the result
    }
  }
' FS=', *' OFS=, file1 file2                        # Set FS to a comma followed by optional spaces, set OFS to a comma, and read file1 and file2
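To cross-check the awk output on a small sample, the same join can be sketched in Python. This is only a sketch under assumptions, not a replacement for the scripts above: the function name `compare` and the sample inputs are invented for illustration, values are assumed to be integers, and if a key is duplicated in file 1 the last occurrence wins. `skipinitialspace=True` plays roughly the role of `FS=', *'`.

```python
import csv
import io

def compare(file1_text, file2_text):
    """Return comparison rows (header first) for keys present in both inputs."""
    rows1 = list(csv.reader(io.StringIO(file1_text), skipinitialspace=True))
    rows2 = list(csv.reader(io.StringIO(file2_text), skipinitialspace=True))
    first = {row[0]: row for row in rows1[1:]}           # file 1 indexed by key
    names = rows2[0][1:]                                  # value column names
    header = [rows2[0][0], "diff columns"]
    for name in names:
        header += [f"f1_{name}", f"f2_{name}", f"{name}_diff"]
    result = [header]
    for row in rows2[1:]:
        old = first.get(row[0])
        if old is None:                                   # key missing from file 1
            continue
        changed, cells = [], []
        for name, v1, v2 in zip(names, old[1:], row[1:]):
            if v1 != v2:
                changed.append(name)                      # remember which columns differ
            cells += [v1, v2, str(int(v1) - int(v2))]     # file 1 minus file 2
        result.append([row[0], ",".join(changed)] + cells)
    return result

# Made-up sample inputs, only for illustration.
f1 = "Key,Value1,Value2,Value3\nA,50,100,50\nB,1,2,3\n"
f2 = "Key,Value1,Value2,Value3\nA,50,80,45\nC,9,9,9\n"
out = io.StringIO()
csv.writer(out, lineterminator="\n").writerows(compare(f1, f2))
print(out.getvalue(), end="")
```

For this sample, only key `A` occurs in both inputs, so the single data row is `A,"Value2,Value3",50,50,0,100,80,20,50,45,5`. The `csv` writer adds the quotes around the diff-columns field by itself because that field contains commas.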