Awk/sed/cut to filter out records from a file based on criteria


 
# 8  
Old 06-27-2017
Put the data in a file and attach it to the thread.
The current implementation is based on your "mock-up" and might not be relevant at all to the issue at hand.
# 9  
Old 06-27-2017
I have the two files zipped. When you unzip, you will see prod.dat and prod_last.dat; they are different. Records 1-3 have different department numbers (columns 361-369). All 3 of those records, matched by item number (columns 1-9) against the other file, need to be filtered out of the output. Thanks again.
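For readers without the attachment, the layout described above can be checked on a single made-up record. This is only a sketch: the record content, the padding, and the temporary file name are assumptions; only the column positions (item number in 1-9, department in 361-369) come from the post.

```shell
# Work in a scratch directory so no real data files are touched.
tmp=$(mktemp -d)

# One mock 369-character record: the item number is left-justified and padded
# to 360 characters (the padding stands in for the real fields), followed by
# the department number in columns 361-369.
printf '%-360s%-9s\n' 100000001 000000010 > "$tmp/sample.dat"

# Show only the two key fields.
awk '{ print substr($0, 1, 9), substr($0, 361, 9) }' "$tmp/sample.dat"
# prints: 100000001 000000010
```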
# 10  
Old 06-28-2017
Not sure of the desired output... try switching the order of the prod.dat and prod_last.dat files on the command line.
Code:
awk '
{s=substr($0,361,9)}
FNR==NR {
  f1[s]=$1
  next
}
NF && s in f1 && f1[s]==$1' prod.dat  prod_last.dat


Last edited by vgersh99; 06-28-2017 at 02:54 PM..
# 11  
Old 06-28-2017
Quote:
Originally Posted by vgersh99
not sure of the desired output... try switching the order of the prod.dat and prod_last.dat files on a cli.
Code:
awk '
{s=substr($0,361,9)}
FNR==NR {
  f1[s]=$1
  next
}
NF && s in f1 && f1[s]==$1' prod.dat  prod_last.dat

Sorry for bothering you again, but this returned 176 records out of 1885. The desired output is 1882 records, since only 3 items have different department IDs between the two files.
# 12  
Old 06-28-2017
there you go:
Code:
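awk '
# department number: columns 361-369 of the fixed-width record
{dep=substr($0,361,9)}
# first file (prod.dat): remember the department of each item number
FNR==NR {
  f1[$1]=dep
  next
}
# second file: print non-empty records whose item has the same department
NF && $1 in f1 && f1[$1]==dep' prod.dat  prod_last.dat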
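A quick way to see what this does is to run it on a few made-up records. The sketch below is not the real data: the `rec` helper, the item and department values, and the scratch directory are all assumptions for illustration. Only the column positions and the awk logic follow the post.

```shell
# Work in a scratch directory so the real prod.dat is never overwritten.
tmp=$(mktemp -d)

# Hypothetical helper: item number at column 1, department in columns 361-369.
rec() { printf '%-360s%-9s\n' "$1" "$2"; }

{ rec 100000001 000000010; rec 100000002 000000020; rec 100000003 000000030; } > "$tmp/prod.dat"
# Second file: different order, and item 100000002 has a new department.
{ rec 100000003 000000030; rec 100000002 000000099; rec 100000001 000000010; } > "$tmp/prod_last.dat"

awk '
{dep=substr($0,361,9)}
FNR==NR { f1[$1]=dep; next }
NF && $1 in f1 && f1[$1]==dep' "$tmp/prod.dat" "$tmp/prod_last.dat" > "$tmp/filtered.dat"

# Print just the key fields of the surviving records.
awk '{ print $1, substr($0,361,9) }' "$tmp/filtered.dat"
# prints:
# 100000003 000000030
# 100000001 000000010
```

Only the record whose department changed is dropped, and the order of the lines in either file makes no difference.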

# 13  
Old 06-28-2017
Now, when I replace the item number (col 1-9) or change the order of the items, those records are filtered out as well. The items won't always be in the same order, and new items might come in. I just need the records with different dept numbers (col 361-369) to be filtered out. Any way that can be done?
# 14  
Old 06-28-2017
The order of the lines in ANY of the 2 files doesn't matter.
I'm keying on item num (1-9) with the value of dep (361-369).

If an item num (1-9) exists in file1 and file2 but has a different dep value (361-369) in file2, that record/line will be filtered out.

That's the basics of the script. What am I missing?
Maybe providing small (manageable) snippets of both files covering all expected scenarios, plus the expected output, would help...
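One possible reading of the problem: the condition `$1 in f1` also drops any item that appears only in the second file, so replaced or brand-new item numbers disappear from the output. If those records should pass through, a variant along the following lines might do it. This is a sketch under that assumption, not a confirmed fix; the function name `keep_unless_dept_changed`, the `rec` helper, and the sample values are made up.

```shell
# Print records of the second file unless the same item number exists in the
# first file with a different department. Items found only in the second
# file are kept (assumed to be the desired behaviour).
keep_unless_dept_changed() {
  awk '
  {dep=substr($0,361,9)}
  FNR==NR { f1[$1]=dep; next }
  NF && (!($1 in f1) || f1[$1]==dep)' "$1" "$2"
}

# Mock data in a scratch directory: item at column 1, department in 361-369.
tmp=$(mktemp -d)
rec() { printf '%-360s%-9s\n' "$1" "$2"; }
{ rec 100000001 000000010; rec 100000002 000000020; } > "$tmp/old.dat"
# 100000002 changed department, 100000003 is new, 100000001 is unchanged.
{ rec 100000002 000000099; rec 100000003 000000030; rec 100000001 000000010; } > "$tmp/new.dat"

keep_unless_dept_changed "$tmp/old.dat" "$tmp/new.dat" | awk '{ print $1, substr($0,361,9) }'
# prints:
# 100000003 000000030
# 100000001 000000010
```

Here the new item 100000003 survives, while 100000002 is still filtered out because its department differs between the two files.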

Last edited by vgersh99; 06-28-2017 at 05:21 PM..