Extract data based on specific search criteria


 
# 1  
Old 10-06-2010

I have a huge file (about 2 million records) containing comma-separated data. As part of the requirement, I can't change the format. The objective is to remove some of the records under the following condition: if the 23rd field on a line starts with 302, that line must be removed from the original file. A simple grep command like "grep -v ^302" would work if 302 were at the start of the line, but here it is the 23rd comma-separated field. Please see the sample input and expected output. Your immediate help is really appreciated.
Code:
Data,4l4680,71130,2010,277,01/03/2011,1,1,2,,,2,0,01/11/2010,,,,0,0,,0,,302619988771130,0,4l4680,Call,302619988771130,99988771130,1,
Data,4l4680,1132,2010,176,01/03/2011,1,1,2,,,2,0,01/11/2010,,,,0,0,,0,,302619988771132,0,14680,Call,302619988771132,99988771132,1,
Data,4l3689,1133,2010,1574,,1,1,1,,,2,0,,,,,0,0,,0,,302619988871133,0,12689,_Call,302619988871133,99988871133,1,
Data,05678,9131,2010,18,17/01/2011,2,1,2,DPE,TEST,2,0,18/12/2010,,,,1286200,0,09/08/2010,-2949,,1131,00,1678,all,131,99998881131,1,
Data,6909,289,2010,031,,1,1,1,Irvin,Andé,2,0,,520007980,ON,BH,0,0,,0,,000569,0,1909,CEST,56909,932356909,1,


Output:
Code:
Data,05678,9131,2010,18,17/01/2011,2,1,2,DPE,TEST,2,0,18/12/2010,,,,1286200,0,09/08/2010,-2949,,1131,00,1678,all,131,99998881131,1,
Data,6909,289,2010,031,,1,1,1,Irvin,Andé,2,0,,520007980,ON,BH,0,0,,0,,000569,0,1909,CEST,56909,932356909,1,



Moderator's Comments:
Please use code tags for your data and code.

Last edited by Franklin52; 10-06-2010 at 01:20 PM..
# 2  
Old 10-06-2010
Try this
Code:
awk -F, '$23!~/^302/' file
Data,05678,9131,2010,18,17/01/2011,2,1,2,DPE,TEST,2,0,18/12/2010,,,,1286200,0,09/08/2010,-2949,,1131,00,1678,all,131,99998881131,1,
Data,6909,289,2010,031,,1,1,1,Irvin,Andé,2,0,,520007980,ON,BH,0,0,,0,,000569,0,1909,CEST,56909,932356909,1,
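
The awk command above prints the surviving records to standard output and leaves the input file untouched. Since the question asks for the records to be removed from the original file, here is a minimal sketch of one way to write the result back, assuming a POSIX shell with mktemp available. The sample file and the names workdir, input and tmp are made up for illustration.

```shell
#!/bin/sh
# Hypothetical demo: build a small sample file, then filter it "in place".
workdir=$(mktemp -d)
trap 'rm -rf "$workdir"' EXIT
input="$workdir/file"

# Three records from the sample data; field 23 of the first two starts with 302.
cat > "$input" <<'EOF'
Data,4l4680,71130,2010,277,01/03/2011,1,1,2,,,2,0,01/11/2010,,,,0,0,,0,,302619988771130,0,4l4680,Call,302619988771130,99988771130,1,
Data,4l4680,1132,2010,176,01/03/2011,1,1,2,,,2,0,01/11/2010,,,,0,0,,0,,302619988771132,0,14680,Call,302619988771132,99988771132,1,
Data,05678,9131,2010,18,17/01/2011,2,1,2,DPE,TEST,2,0,18/12/2010,,,,1286200,0,09/08/2010,-2949,,1131,00,1678,all,131,99998881131,1,
EOF

# Write the kept records to a temporary file and replace the original
# only if awk finished successfully.
tmp="$input.tmp"
if awk -F, '$23 !~ /^302/' "$input" > "$tmp"; then
    mv "$tmp" "$input"
fi

# Inspect what is left in the file.
kept=$(wc -l < "$input" | tr -d ' ')
first_field23=$(awk -F, 'NR == 1 { print $23 }' "$input")
echo "records kept: $kept (field 23 of first record: $first_field23)"
```

With about 2 million records it is probably worth keeping a backup copy of the original until the filtered file has been checked.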

# 3  
Old 10-06-2010
Or Perl -

Code:
$ perl -F, -lane 'print if substr($F[22],0,3) ne "302"' file
Data,05678,9131,2010,18,17/01/2011,2,1,2,DPE,TEST,2,0,18/12/2010,,,,1286200,0,09/08/2010,-2949,,1131,00,1678,all,131,99998881131,1,
Data,6909,289,2010,031,,1,1,1,Irvin,Andé,2,0,,520007980,ON,BH,0,0,,0,,000569,0,1909,CEST,56909,932356909,1,

$ perl -F, -lane 'print if not $F[22] =~ /^302/' file
Data,05678,9131,2010,18,17/01/2011,2,1,2,DPE,TEST,2,0,18/12/2010,,,,1286200,0,09/08/2010,-2949,,1131,00,1678,all,131,99998881131,1,
Data,6909,289,2010,031,,1,1,1,Irvin,Andé,2,0,,520007980,ON,BH,0,0,,0,,000569,0,1909,CEST,56909,932356909,1,

$ perl -F, -lane 'print if $F[22] !~ /^302/' file
Data,05678,9131,2010,18,17/01/2011,2,1,2,DPE,TEST,2,0,18/12/2010,,,,1286200,0,09/08/2010,-2949,,1131,00,1678,all,131,99998881131,1,
Data,6909,289,2010,031,,1,1,1,Irvin,Andé,2,0,,520007980,ON,BH,0,0,,0,,000569,0,1909,CEST,56909,932356909,1,

tyler_durden
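
For readers staying with awk, the literal comparison used in the first Perl variant (no regular expression) has a direct counterpart. A small sketch, assuming any POSIX awk; the two-record sample and the variable names are illustrative only.

```shell
#!/bin/sh
# Illustrative input taken from the sample data: one record to drop, one to keep.
sample='Data,4l4680,1132,2010,176,01/03/2011,1,1,2,,,2,0,01/11/2010,,,,0,0,,0,,302619988771132,0,14680,Call,302619988771132,99988771132,1,
Data,05678,9131,2010,18,17/01/2011,2,1,2,DPE,TEST,2,0,18/12/2010,,,,1286200,0,09/08/2010,-2949,,1131,00,1678,all,131,99998881131,1,'

# substr($23, 1, 3) takes the first three characters of field 23 (awk strings
# are 1-based); comparing with the constant "302" is a string comparison.
by_substr=$(printf '%s\n' "$sample" | awk -F, 'substr($23, 1, 3) != "302"')

# index() returns the position of the first occurrence (0 if absent), so any
# value other than 1 means the field does not start with 302.
by_index=$(printf '%s\n' "$sample" | awk -F, 'index($23, "302") != 1')

printf '%s\n' "$by_substr"
```

Both forms treat 302 as plain text, which avoids surprises if the prefix ever contains characters that are special in a regular expression.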
# 4  
Old 10-06-2010
Code:
ruby -F"," -ane  'print if $F[22]!~/^302/' file
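
The original question started from "grep -v ^302". As an alternative to the answers above (not something the posters proposed), that idea can be anchored to the 23rd field with an extended regular expression. A hedged sketch, assuming a grep that supports -E; the sample and variable names are illustrative.

```shell
#!/bin/sh
# Illustrative input taken from the sample data: one record to drop, one to keep.
sample='Data,4l4680,71130,2010,277,01/03/2011,1,1,2,,,2,0,01/11/2010,,,,0,0,,0,,302619988771130,0,4l4680,Call,302619988771130,99988771130,1,
Data,05678,9131,2010,18,17/01/2011,2,1,2,DPE,TEST,2,0,18/12/2010,,,,1286200,0,09/08/2010,-2949,,1131,00,1678,all,131,99998881131,1,'

# ([^,]*,){22} consumes exactly the first 22 comma-terminated fields, so the
# literal 302 that follows must sit at the start of field 23.
# -v inverts the match: matching lines are dropped, all others are printed.
kept=$(printf '%s\n' "$sample" | grep -v -E '^([^,]*,){22}302')

printf '%s\n' "$kept"
```

Like the awk, Perl and Ruby answers, this assumes that no field contains an embedded or quoted comma.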

# 5  
Old 10-06-2010
Greatly appreciate your quick response. Keep up your good work guys.