Print rows, having pattern in specific column...


 
# 8  
Old 10-06-2009
The A[$1] = $1 statement simply makes A[$1] exist by giving it a value. Assigning the value to the same index is just a convenience.
# 9  
Old 10-06-2009
I tried the commands below:
awk 'NR == FNR { A[$1] = $2; next } A[$3]'
awk 'NR == FNR { A[$1] = $3; next } A[$3]'
awk 'NR == FNR { A[$1] = $4; next } A[$3]'
None of them gives my desired output.
So I'd like to know why it has to be set as A[$1] = $1.

Quote:
Originally Posted by jp2542a
The quick answer:

First realize that awk arrays are indexed by a value not a number. While array indexes may look like a number, that is not how awk sees them.

The first clause sets up an array (A) indexed by values from the pattern file.
Indexes to A are cd003, cd005, etc... so A["cd003"] is a valid entry in the A array.
NR is the number of records awk has read so far, across all input files; it is 1 once the first record has been read.
FNR is the number of records read from the current file; it starts over at 1 with the first record of each new file.
Both are incremented each time a record is read.

So, if NR is equal to FNR, then we are reading from the first file (pattern file) since the record counts are the same.
If NR is not equal to FNR, then we are reading from a subsequent file (i.e. data file).

The A[$3] pattern (where $3 is the third field of the data file) says: if that entry holds a true (non-empty, non-zero) value, e.g. A["cd003"], then do the default action (print the line); otherwise skip the line.

This clause is not executed on the pattern file because the next statement says "skip all following code, read the next line, and start processing clauses from the top". The next statement also makes the code more robust.
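
A small sketch of how NR and FNR behave across two files. The temp files and their contents are made up for illustration; only NR, FNR and the NR == FNR test come from the explanation above.

```shell
# Two throwaway input files (hypothetical contents).
first=$(mktemp)
second=$(mktemp)
printf 'a\nb\n' > "$first"
printf 'c\nd\ne\n' > "$second"

# Print both counters for every record and say which branch NR == FNR selects.
counts=$(awk '{ print NR, FNR, (NR == FNR ? "first file" : "later file") }' "$first" "$second")
echo "$counts"

rm -f "$first" "$second"
```

NR keeps counting through the second file while FNR starts over, so the two are equal only while the first file is being read. One caveat: if the first file is empty, NR == FNR also holds for the whole second file.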


Thanks a lot.
I'm new to awk ^^
But I like awk :)
# 10  
Old 10-06-2009
Scottn's script works fine based on your spec. What exactly are you trying to do?

Changing the A[$1] rvalue will not change the program's execution. If what you posted is the command line you gave to the shell, then you forgot to specify the pattern and data files.
# 11  
Old 10-06-2009
Hi,
Sorry, I left out some details.
As you said, "If what you posted represents the command line you put to the shell, then you forgot to specify the pattern and data files"
How do I specify the pattern and data files?

My working script:

[patrick@home]$ awk 'NR == FNR { A[$1] = $1; next } A[$3]' pattern_file data_file
bca cd002 cd003 cza
bac cd004 cd005 zac
acb cd006 cd007 caz
cab cd007 cd008 azc
[patrick@home]$ awk 'NR == FNR { A[$1] = $2; next } A[$3]' pattern_file data_file
[patrick@home]$ awk 'NR == FNR { A[$1] = $3; next } A[$3]' pattern_file data_file
[patrick@home]$ awk 'NR == FNR { A[$1] = $4; next } A[$3]' pattern_file data_file
[patrick@home]$ awk 'NR == FNR { A[$1] = $5; next } A[$3]' pattern_file data_file

If I change A[$1] = $1 to use $2, $3, $4 or $5 instead, I get no output at all.
Thanks a lot for your guidance and sharing :)

# 12  
Old 10-06-2009
pattern_file and data_file are just placeholders for the actual paths to your files.

For instance if /export/home/user/real.txt is the path to the file with the patterns and /export/home/user/thedata.txt is the path to the data file, then the command line would be:

Code:
awk 'NR == FNR { A[$1] = $1; next } A[$3]' /export/home/user/real.txt /export/home/user/thedata.txt

And I just noticed something. In the pattern file, $2..$n don't exist. So A[$1] is assigned an empty string, which awk treats as false, and the A[$3] test never succeeds.
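
A quick way to check this, as a sketch: the two-line, one-column pattern file below is made up for illustration. It stores $2 as the value and then reports, for each key, whether the stored value counts as true and how long it is.

```shell
# One-column pattern file (hypothetical contents), so $2 is always empty.
pattern_file=$(mktemp)
printf 'cd003\ncd005\n' > "$pattern_file"

# Each A[$1] is created, but its value is the empty string.
result=$(awk '{ A[$1] = $2 }
              END { for (k in A) print k, (A[k] ? "true" : "false"), length(A[k]) }' \
         "$pattern_file" | sort)
echo "$result"

rm -f "$pattern_file"
```

Both keys are reported as false with length 0: the entries are there, but a pattern such as A[$3] looks at the value, and an empty value never selects a line.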
# 13  
Old 10-06-2009
Hi,
thanks for your explanation :)
Actually I'm just confused about this "A[$1] = $1".
I'm not sure how to fill in the $1, and what does it refer to?!
# 14  
Old 10-06-2009
A[$1] = $1 is used to set up the array of patterns. It is executed when NR is equal to FNR, in other words while the first file (the pattern file) is being read. It means: in the associative array A, set the element indexed by $1 to the value of $1, where $1 is the first field of each line of your pattern file. So with the OP's provided inputs it gets filled like so:
Code:
A[cd003]=cd003
A[cd005]=cd005
A[cd007]=cd007
A[cd008]=cd008

Once the pattern file is done, awk starts reading the input file. FNR is no longer equal to NR, so only the "A[$3]" part is executed for each line, which means: print the current line if A[$3] holds a true (non-empty, non-zero) value, which here is the case exactly when field 3 was stored as a key.

The script only tests whether the array elements are set, not what they contain, so any true value will do. IMO the use of $1 is therefore a tiny bit superfluous. I think we could also just set it to 1 instead:

Code:
awk 'NR == FNR { A[$1]=1; next } A[$3]' pattern_file input_file

or, since the pattern file does not have a third column, even without the next statement:
Code:
awk 'NR == FNR { A[$1]=1 } A[$3]' pattern_file input_file
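
Another way to write the same filter, offered as a sketch rather than as the script from this thread: awk's `in` operator tests whether a key is present in the array without looking at the stored value, so nothing needs to be assigned at all. The temp files are hypothetical; the pattern keys and the first four data rows are the ones shown earlier in the thread, and the last data row is invented as a non-matching line.

```shell
pattern_file=$(mktemp)
data_file=$(mktemp)
printf 'cd003\ncd005\ncd007\ncd008\n' > "$pattern_file"
printf '%s\n' \
  'bca cd002 cd003 cza' \
  'bac cd004 cd005 zac' \
  'acb cd006 cd007 caz' \
  'cab cd007 cd008 azc' \
  'xyz cd009 cd010 zyx' > "$data_file"   # last row is invented, should not match

# Referencing A[$1] is enough to create the key; "$3 in A" tests membership only.
matched=$(awk 'NR == FNR { A[$1]; next } $3 in A' "$pattern_file" "$data_file")
echo "$matched"

rm -f "$pattern_file" "$data_file"
```

Here the next statement is kept on purpose: on the pattern file lines $3 is empty, and without next the membership test would also run on those lines.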


Last edited by Scrutinizer; 10-06-2009 at 03:28 AM..