Print rows, having pattern in specific column...


 
# 1  
Old 10-04-2009

Hello all,
I have a pattern file somewhat like this:

cd003
cd005
cd007
cd008


and an input file like this:

abc cd001 cd002 zca
bca cd002 cd003 cza
cba cd003 cd004 zca
bac cd004 cd005 zac
cba cd005 cd006 acz
acb cd006 cd007 caz
cab cd007 cd008 azc


I want to print all the rows that have one of the patterns in column 3.
Is it possible to give the patterns from a file in awk?
How can it be done in shell scripting (without using awk)?

Thanks in advance.

Last edited by admax; 10-04-2009 at 04:38 PM..
# 2  
Old 10-04-2009
Hi.

Something like:
Code:
awk 'NR == FNR { A[$1] = $1; next } A[$3]' pattern_file input_file
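
To check this against the sample data from the first post, the two files can be recreated and the command run as below. This is only a minimal sketch: the names pattern_file and input_file come from the command above, while the scratch directory and the result variable are assumptions made for illustration.

```shell
# Recreate the sample files from the first post in a scratch directory.
tmp=$(mktemp -d)
printf '%s\n' cd003 cd005 cd007 cd008 > "$tmp/pattern_file"
cat > "$tmp/input_file" <<'EOF'
abc cd001 cd002 zca
bca cd002 cd003 cza
cba cd003 cd004 zca
bac cd004 cd005 zac
cba cd005 cd006 acz
acb cd006 cd007 caz
cab cd007 cd008 azc
EOF

# Print the rows of input_file whose third column is listed in pattern_file.
result=$(awk 'NR == FNR { A[$1] = $1; next } A[$3]' "$tmp/pattern_file" "$tmp/input_file")
printf '%s\n' "$result"

# Clean up the scratch directory.
rm -rf "$tmp"
```

With that data, the rows printed should be the four whose third column is cd003, cd005, cd007 or cd008.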

# 3  
Old 10-04-2009
Hi scottn,

Very nice code. The only thing I can't figure out is why you need the next statement in this case.
Code:
awk 'NR == FNR { A[$1] = $1 } A[$3]' pattern_file input_file

seems to produce the same result.

S.
# 4  
Old 10-04-2009
The idea is to read the pattern file completely before doing anything else.

But you're right: in this case A[$3] would never be true for lines of the pattern file, as there is no field three in that file. If that were to change, using next means you wouldn't have to worry about it later!
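
A small sketch of the case where next would matter, assuming a hypothetical pattern file with extra columns (this data is made up for illustration and is not from the thread). Without next, a pattern-file line whose third field is an already-loaded pattern gets printed as well:

```shell
# Hypothetical three-column pattern file; the input rows are taken from the first post.
tmp=$(mktemp -d)
printf '%s\n' 'cd003 x y' 'cd005 x cd003' > "$tmp/pattern_file"
printf '%s\n' 'bca cd002 cd003 cza' 'cba cd003 cd004 zca' > "$tmp/input_file"

# With next: lines of the pattern file are never tested against A[$3].
with_next=$(awk 'NR == FNR { A[$1] = $1; next } A[$3]' "$tmp/pattern_file" "$tmp/input_file")

# Without next: the line "cd005 x cd003" from the pattern file is printed too,
# because A["cd003"] was already set by the line before it.
without_next=$(awk 'NR == FNR { A[$1] = $1 } A[$3]' "$tmp/pattern_file" "$tmp/input_file")

printf 'with next:\n%s\n\nwithout next:\n%s\n' "$with_next" "$without_next"
rm -rf "$tmp"
```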

Last edited by Scott; 10-04-2009 at 03:53 PM..
# 5  
Old 10-05-2009
Hi,

Can you explain how the two parts of NR == FNR { A[$1] = $1; next } A[$3] relate to each other?
I am quite confused about why A[$X] is used to print the rows that have a pattern in a specific column.
Thanks for your reply.

# 6  
Old 10-05-2009
The quick answer:

First, realize that awk arrays are indexed by strings, not numbers. An index may look like a number, but awk still treats it as a string.

The first clause sets up an array (A) indexed by values from the pattern file.
Indexes to A are cd003, cd005, etc... so A["cd003"] is a valid entry in the A array.
NR is the total number of records awk has read so far, across all files. It is 1 for the first record read.
FNR is the number of records read from the current file. It starts again at 1 for the first record of each new file.
Both are incremented each time a record is read.

So, if NR is equal to FNR, then we are reading from the first file (pattern file) since the record counts are the same.
If NR is not equal to FNR, then we are reading from a subsequent file (i.e. data file).
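
To see the two counters side by side, something like the following can be run. It is just a sketch: the two small files are cut-down copies of the thread's sample data, and the labels "first file" and "later file" are made up for the output.

```shell
tmp=$(mktemp -d)
printf '%s\n' cd003 cd005 > "$tmp/pattern_file"
printf '%s\n' 'bca cd002 cd003 cza' 'cba cd003 cd004 zca' > "$tmp/input_file"

# For every record, show both counters and which way the NR == FNR test goes.
trace=$(awk '{ print "NR=" NR, "FNR=" FNR, (NR == FNR ? "first file" : "later file") }' \
    "$tmp/pattern_file" "$tmp/input_file")
printf '%s\n' "$trace"
rm -rf "$tmp"
```

NR keeps counting through the second file (3, 4) while FNR restarts at 1, so the two are only equal while the first file is being read.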

The A[$3] pattern (where $3 is the third field from the data file) says: if that entry was set from the pattern file (e.g. A["cd003"]), do the default action (print the line); otherwise ignore that line.

This clause is not executed on the pattern file because the next statement says "skip all following code, read the next line, and start processing clauses from the top". The 'next' statement adds to the robustness of the code.
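
Putting the pieces together, the one-liner can be spread out and commented roughly as below. This is a sketch of the same logic, not a different method; the wrapper name filter_col3 and the shortened sample files are assumptions made for the example.

```shell
# filter_col3 PATTERN_FILE DATA_FILE  (hypothetical helper name, used only in this sketch)
filter_col3() {
    awk '
        # First file only (NR equals FNR): remember each pattern as an index of A.
        NR == FNR {
            A[$1] = $1   # index and value are both the pattern, e.g. A["cd003"] = "cd003"
            next         # skip the test below and move on to the next record
        }
        # Later files: a non-empty A[$3] is true, so the default action prints the line.
        A[$3]
    ' "$1" "$2"
}

tmp=$(mktemp -d)
printf '%s\n' cd003 cd005 > "$tmp/pattern_file"
printf '%s\n' 'bca cd002 cd003 cza' 'cba cd003 cd004 zca' 'bac cd004 cd005 zac' > "$tmp/input_file"

matched=$(filter_col3 "$tmp/pattern_file" "$tmp/input_file")
printf '%s\n' "$matched"
rm -rf "$tmp"
```

Storing the pattern itself as the value is what makes A[$3] non-empty, and therefore true, for listed patterns.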

Last edited by jp2542a; 10-05-2009 at 10:20 PM..
# 7  
Old 10-06-2009
Hi,

Thank you very much for your explanation.
It is excellent.
Besides that, can you tell me what the meaning of A[$1] = $1 is?
Thanks for your help again.


