Sponsored Content
Operating Systems Linux Filter a .CSV file based on the 5th column values Post 302866383 by dhruuv369 on Monday 21st of October 2013 11:32:25 PM
Old 10-22-2013
Filter a .CSV file based on the 5th column values

I have a .CSV file with the below format:

Code:
"column 1","column 2","column 3","column 4","column 5","column 6","column 7","column 8","column 9","column 10
"12310","42324564756","a simple string with a , comma","string with or, without commas","string 1","USD","12","70%","08/01/2013",""
"23455","12312255564","string, with, multiple, commas","string with or, without commas","string 2","USD","433","70%","07/15/2013",""
"23525","74535243123","string , with commas, and - hypens and: semicolans","string with or, without commas","string 1","CAND","744","70%","05/06/2013",""
"46476","15467534544","lengthy string, with commas, multiple: colans","string with or, without commas","string 2","CAND","388","70%","09/21/2013",""

5th column of the file has different strings. I need to filter out the file based on the 5th column value. Lets say, I need a new file from the current file which has records only with the value "string 1" in its fifth field.

For this I tried the below command,

Code:
awk -F"," ' { if toupper($5) == "STRING 1") PRINT  }' file1.csv > file2.csv

but it was throwing me an error as following:

Code:
awk:  { if toupper($5) == "STRING 1") PRINT  }
awk:       ^ syntax error
awk:  { if toupper($5) == "STRING 1") PRINT  }
awk:                                ^ syntax error

I also tried the following but it did not help much:

Code:
awk -F"," ' { if ($5 == "string 1") print  }' file1.csv > file2.csv

I then used the following which gives me an odd output.

Code:
awk -F"," '$5="string 1" {print}' file1.csv > file2.csv

Output:

Code:
"column 1" "column 2" "column 3" "column 4" string 1 "column 6" "column 7" "column 8" "column 9" "column 10
"12310" "42324564756" "a simple string with a   comma" string 1  without commas" "string 1" "USD" "12" "70%" "08/01/2013" ""
"23455" "12312255564" "string  with string 1  commas" "string with or  without commas" "string 2" "USD" "433" "70%" "07/15/2013" ""
"23525" "74535243123" "string   with commas string 1 "string with or  without commas" "string 1" "CAND" "744" "70%" "05/06/2013" ""
"46476" "15467534544" "lengthy string  with commas string 1 "string with or  without commas" "string 2" "CAND" "388" "70%" "09/21/2013" ""

P.S: I used toupper command to be on the safe side, as I am not sure if the string will be in lower or higher case. Also, Please advise if the space in the string matters while searching for a pattern using AWK... Thanks in advance.

Last edited by Don Cragun; 10-22-2013 at 01:35 AM.. Reason: Use CODE tags for input and output samples as well as for code samples.
 

9 More Discussions You Might Find Interesting

1. Shell Programming and Scripting

Script for extracting data from csv file based on column values.

Hi all, I am new to shell script.I need your help to write a shell script. I need to write a shell script to extract data from a .csv file where columns are ',' separated. The file has 5 columns having values say column 1,column 2.....column 5 as below along with their valuesm.... (3 Replies)
Discussion started by: Vivekit82
3 Replies

2. Shell Programming and Scripting

Fetching values in CSV file based on column name

input.csv: Field1,Field2,Field3,Field4,Field4 abc ,123 ,xyz ,000 ,pqr mno ,123 ,dfr ,111 ,bbb output: Field2,Field4 123 ,000 123 ,111 how to fetch the values of Field4 where Field2='123' I don't want to fetch the values based on column position. Instead want to... (10 Replies)
Discussion started by: bharathbangalor
10 Replies

3. Shell Programming and Scripting

Remove the values from a certain column without deleting the Column name in a .CSV file

(14 Replies)
Discussion started by: dhruuv369
14 Replies

4. Linux

To get all the columns in a CSV file based on unique values of particular column

cat sample.csv ID,Name,no 1,AAA,1 2,BBB,1 3,AAA,1 4,BBB,1 cut -d',' -f2 sample.csv | sort | uniq this gives only the 2nd column values Name AAA BBB How to I get all the columns of CSV along with this? (1 Reply)
Discussion started by: sanvel
1 Replies

5. Shell Programming and Scripting

UNIX command -Filter rows in fixed width file based on column values

Hi All, I am trying to select the rows in a fixed width file based on values in the columns. I want to select only the rows if column position 3-4 has the value AB I am using cut command to get the column values. Is it possible to check if cut -c3-4 = AB is true then select only that... (2 Replies)
Discussion started by: ashok.k
2 Replies

6. Shell Programming and Scripting

Filter file to remove duplicate values in first column

Hello, I have a script that is generating a tab delimited output file. num Name PCA_A1 PCA_A2 PCA_A3 0 compound_00 -3.5054 -1.1207 -2.4372 1 compound_01 -2.2641 0.4287 -1.6120 3 compound_03 -1.3053 1.8495 ... (3 Replies)
Discussion started by: LMHmedchem
3 Replies

7. Shell Programming and Scripting

Filter tab file based on column value

Hello I have a tab text file with many columns and have to filter rows ONLY if column 22 has the value of '0', '1', '2' or '3' (out of 0-5). If Column 22 has value '0','1', '2' or '3' (highlighted below), then remove anything less than 10 and greater 100 (based on column 5) AND remove anything... (1 Reply)
Discussion started by: nans
1 Replies

8. Shell Programming and Scripting

Get maximum per column from CSV file, based on date column

Hello everyone, I am using ksh on Solaris 10 and I'm gathering data in a CSV file that looks like this: 20170628-23:25:01,1,0,0,1,1,1,1,55,55,1 20170628-23:30:01,1,0,0,1,1,1,1,56,56,1 20170628-23:35:00,1,0,0,1,1,2,1,57,57,2 20170628-23:40:00,1,0,0,1,1,1,1,58,58,2... (6 Replies)
Discussion started by: ejianu
6 Replies

9. Shell Programming and Scripting

Filter duplicate records from csv file with condition on one column

I have csv file with 30, 40 columns Pasting just three column for problem description I want to filter record if column 1 matches CN or DN then, check for values in column 2 if column contain 1235, 1235 then in column 3 values must be sequence of 2345, 2345 and if column 2 contains 6789, 6789... (5 Replies)
Discussion started by: as7951
5 Replies
All times are GMT -4. The time now is 08:16 AM.
Unix & Linux Forums Content Copyright 1993-2022. All Rights Reserved.
Privacy Policy