Filter using awk in CSV files


 
# 1  
Old 11-25-2017

Hello Gentlemen,

I am having difficulty processing my input files. Your guidance will certainly help, as always.
After converting an XLSM file to CSV, I get some extra "" (double quote) characters, which I want to strip inside a shell script before processing the data further.

Ex: a single line after conversion to CSV
Code:
0),0,,,extW_2x32_2x32,targetServer_500,,,,,,command execution server,root,extW_2x32_2x32_Multi=200_cont=1_1exist=500use=100exec=openssl to=10,60,"./call_command_local.sh  -c extW_2x32_2x32  -t targetServer_500  -X 1  -N 1  -D 2 -a 100 wac sample exec ""\""hostname=XXXXX commandline='timeout 10 openssl  speed -multi 2 ; exit 0'\"""" ",1,1,0,200,1,1,"32G, t2.2xlarge","32G, t2.2xlarge",5000,,500,100,openssl ,10,timeout 10 openssl  speed -multi 2 ; exit 0,,160,30,161,4.183333333,42.55,0,

First Step:
I want to extract only the quoted command in field 16 of the line above (the part starting with ./call_command_local.sh, highlighted in red in the original post) and run it 3 consecutive times inside a shell script.
I tried something like the command below, but I still cannot figure out how to remove the extra trailing " (double quote) characters that were generated while converting to CSV.
Code:
awk -F, '{gsub(/^"/, "", $16); print $16}' AWS_V1.7.5.csv 


Second Step:
Run the parsed string above 3 times in a shell script. At the same time, record the execution times for later use. Ex:
${start_time} = 1st Execution
${end_time} = 3rd Execution

Code:
seq 3 | xargs -I -- (Pass the above whole shell script along with parameters)

The above command will generate 3 XXX.csv files with a date-time stamp in the name, in the format shown below. These files need to be processed further by the next command.
CSV File Ex:
Code:
exec=1_host_count=100_dup=4_NameTag=targetServer_500_20171122112108.csv

Third Step:
Once the above commands have finished, run the command below. The XXX.csv files generated above are its input.
Code:
java -classpath ./:${CLASSPATH}  execGetResult XXX.csv

I need to run it 3 times, once for each CSV file generated in step 2.
It will give me 3 output files, as below.
Ex:
Code:
out.exec=1_host_count=100_dup=4_NameTag=targetServer_500_20171122112108.csv



Fourth Step :

Run the command below:
Code:
./AWS_get_statistics.sh -t Manager -s ${start_time} -e ${end_time}

where:
${start_time} = ${start_time} - 1 min
${end_time} = ${end_time} + 1 min

Fifth Step :

Code:
tar cvf AWS.${date}.tar ./AWS

Thanks in advance
PD
# 2  
Old 11-25-2017
First step, try:
Code:
awk -F, '{gsub(/^"|"" "$/, "", $16); print $16}' file
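To see what the two alternatives in that regex remove, here is a minimal sketch on a shortened, made-up 16-field line (the sample data, the `extract_cmd` name and `./run.sh` are assumptions for illustration, not the real AWS_V1.7.5.csv). Splitting on plain commas only works here because neither field 16 nor any field before it contains a comma; the quoted "32G, t2.2xlarge" fields come later in the line.

```shell
# Hypothetical sample: 15 plain fields, then the quoted command as field 16.
sample='a,b,c,d,e,f,g,h,i,j,k,l,m,n,60,"./run.sh -x 1 exec ""\""host=XXXXX\"""" ",1,1'

# ^"      strips the quote that opens field 16
# "" "$   strips the doubled quote, the space and the closing quote at its end
extract_cmd() {
  awk -F, '{gsub(/^"|"" "$/, "", $16); print $16}'
}

cleaned=$(printf '%s\n' "$sample" | extract_cmd)
printf '%s\n' "$cleaned"
```

The quotes that remain in the middle of the field are left untouched, so they are still there for a shell to interpret later.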

# 3  
Old 11-25-2017
Parse one line and execute 3 times each line

Hi Scrutinizer,

Thank you for the help. The first step is now working as expected.

For the 2nd step:
Do you suggest writing the output of the 1st step to a file before proceeding, or can we run each line directly through a pipe?

Rgds,
PD
# 4  
Old 11-25-2017
That spec is beyond me. However, to execute the result of the awk script, pipe it into a shell:
Code:
awk '...' file | sh

For executing it thrice, try
Code:
awk '... print $16; print $16; print $16; ...
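As a rough, harmless illustration of both ideas together, the sketch below uses `echo` commands in place of the real extracted command lines (`emit_commands` is a made-up stand-in for the awk extraction step). Each input line is printed three times and the resulting text is executed by `sh`.

```shell
# Stand-in for the awk output: one command line per input row (hypothetical).
emit_commands() {
  printf '%s\n' 'echo run-A' 'echo run-B'
}

# Repeat every line three times, then let sh execute the resulting script.
output=$(emit_commands | awk '{print; print; print}' | sh)
printf '%s\n' "$output"
```

Because `sh` parses the text as a script, any quoting embedded in the extracted command is interpreted the way it would be on a command line.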

# 5  
Old 11-25-2017
parse all csv files in directory and process it for Java command

Hi Rudic,

I know this must be silly for you, but I may have over-explained somewhere or misled you. Sorry about that.
I have now tried all your suggestions and got past Step 1 and Step 2. Of course, one of your expert one-liners, which I love, is most welcome.
Code:
#!/bin/bash

#Step 1
awk -F, '{gsub(/^"|"" "$/, "", $16); print $16}' AWS.CSV > out

#Step 2

while IFS= read -r line; do
  # Process each line 3 times (a first try with xargs failed)
  # seq 3 | xargs -I{} $line

  for i in $(seq 1 3); do
    # Record the start time on the first run
    if [ "$i" -eq 1 ]; then
      start_time=$(date)  # need to change later to get certain date format
    fi
    # Record the end time on the third run
    if [ "$i" -eq 3 ]; then
      end_time=$(date)
    fi

    $line
  done                    # closes the for loop
done < out                # closes the while loop, which reads from "out"

#Step 3
#Process all the CSV files starting with exec=1 run and run those files with java command
find . -maxdepth 1 -type f -name "exec=1*.csv" -exec 'java -classpath ./:${CLASSPATH} execGetResult {}' \;

I somehow managed to do step 3, but I am not sure about the syntax.
Step 3:
Find all the CSV files whose names match exec=1*.csv and execute the java command for each file found, as below.
Ex:
Code:
java -classpath ./:${CLASSPATH}  execGetResult exec=1_one.csv
java -classpath ./:${CLASSPATH}  execGetResult exec=1_two.csv

I am doing something like the below to search for the CSV files in the current directory, but I am struggling with the syntax for embedding the java command for each CSV file.
Code:
find . -maxdepth 1 -type f -name "exec=1*.csv" -exec 'java -classpath ./:${CLASSPATH} execGetResult {}' \;

Rgds,
PD

Last edited by pradyumnajpn10; 11-25-2017 at 01:38 PM..
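On the `find` syntax asked about above: `-exec` expects the command and each of its arguments as separate words, ending with `\;`. If the whole command is wrapped in one pair of quotes, `find` looks for a program whose name is that entire string. A minimal sketch in a scratch directory, with `echo` standing in for the java call and made-up file names:

```shell
# Scratch directory with hypothetical file names; echo stands in for java.
workdir=$(mktemp -d)
touch "$workdir/exec=1_one.csv" "$workdir/exec=1_two.csv" "$workdir/other.csv"

# Command and arguments are separate words; {} is replaced by each file found.
found=$(cd "$workdir" && find . -maxdepth 1 -type f -name 'exec=1*.csv' \
          -exec echo execGetResult {} \; | sort)
printf '%s\n' "$found"
rm -rf "$workdir"
```

With the real command, the words after `-exec` would be `java -classpath "./:${CLASSPATH}" execGetResult {} \;`, with the shell expanding `${CLASSPATH}` before `find` starts.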
# 6  
Old 11-25-2017
For step 2, how about something like:
Code:
cmd=$(awk -F, '{gsub(/^"|"" "$/, "", $16); print $16}' AWS.CSV)
start_time=$(date -d '-1 minute')
$cmd; $cmd; $cmd
end_time=$(date -d '+1 minute')

Note 1: Running a command like that is dangerous, unless you have total control over the content of $cmd.

Note 2: The date command is for GNU date.

Last edited by Scrutinizer; 11-25-2017 at 01:42 PM..
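A variation on the timing idea, as a sketch only: take both timestamps as epoch seconds first, so the one-minute padding becomes plain arithmetic, and format them afterwards. This still needs GNU date for `-d`, and the `%Y-%m-%dT%T` output format is an assumption about what AWS_get_statistics.sh accepts.

```shell
# Requires GNU date (-d). Epoch seconds make the padding simple arithmetic.
start_epoch=$(date +%s)
# ... the three runs of the extracted command would go here ...
end_epoch=$(date +%s)

# Pad by one minute on each side, then format (format string is an assumption).
start_time=$(date -d "@$((start_epoch - 60))" +%Y-%m-%dT%T)
end_time=$(date -d "@$((end_epoch + 60))" +%Y-%m-%dT%T)
printf '%s %s\n' "$start_time" "$end_time"
```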
# 7  
Old 11-25-2017
Awk read from shell environment variable

Hi Scrutinizer,

That is actually the sort of suggestion I wanted, and I like it. I have no control over the commands. Each command runs for an hour or so.
As long as it serves the purpose and is safe, I am happy to adopt any suggestion from unix.com as a newcomer to Linux.

Issue:
In this case the problem is that it only processes the first line of the AWS.CSV file and then exits.
It should actually run all the steps for a single line from the AWS.CSV input file, then
fetch the next line from the AWS.csv file and repeat all 5 steps.
I solved this with the code below, using a while loop. Is there any risk in doing it like this? If there is a better idea, please suggest it.
Code:
#!/bin/bash

while read -r line
do

  #Step 1
  # No spaces around = in an assignment; quote "$line" so its whitespace survives
  cmd=$(printf '%s\n' "$line" | awk -F, '{gsub(/^"|"" "$/, "", $16); print $16}')

  #Step 2
  start_time=$(date +"%Y-%m-%dT%T" -d '-1 minute')
  $cmd; $cmd; $cmd
  end_time=$(date +"%Y-%m-%dT%T" -d '+1 minute')

  #Step 3
  #Process all the CSV files starting with exec=1 run and run those files with java command
  find . -maxdepth 1 -type f -name "exec=1*.csv" -exec java -classpath ./:${CLASSPATH} execGetResult {} \;

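  #Step 4
  ./AWS_get_statistics.sh -t Manager -s "${start_time}" -e "${end_time}"

  #Step 5
  # Subshell: archive from the parent directory without changing the
  # loop's working directory (${date} must be set before this point)
  (cd .. && tar cvf "AWS.${date}.tar" ./AWS)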

done < AWS.csv


Rgds,
PD

Last edited by pradyumnajpn10; 11-25-2017 at 06:31 PM.. Reason: read Awk read from shell environment variable
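On the question of risk: one thing worth checking with a `while read` loop is that any command started inside it inherits the loop's standard input and may swallow the remaining lines. A minimal sketch of one way around this, reading the list on file descriptor 3; the input file and its two command lines are made up, and redirecting the commands from /dev/null assumes they need no input of their own.

```shell
# Hypothetical input: each line is a command; cat would swallow stdin if it could.
input=$(mktemp)
printf '%s\n' 'cat' 'echo second' > "$input"

count=0
# Read the list on file descriptor 3 so commands started inside the loop
# cannot consume the remaining lines through stdin.
while IFS= read -r line <&3; do
  count=$((count + 1))
  sh -c "$line" </dev/null
done 3< "$input"

printf 'processed %d lines\n' "$count"
rm -f "$input"
```

Both lines are processed even though the first command is one that reads everything it can from its standard input.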