Selecting specific 'id's from lines and columns using 'SED' or 'AWK'


 
# 1  
Old 10-11-2010

Hello experts,
I am new to this group and to 'SED' and 'AWK'. I have a text file with 5 columns (C_1 to C_5) and hundreds of lines (only 10 lines are shown below as an example). I need to select the id numbers (C_1) of only those lines that have '90' in C_3 AND '20' in C_3 of the previous line AND '40' in C_3 of the next line. That is, the id-list should contain only the lines that satisfy all three conditions together.

My Data File:

C_1 C_2 C_3 C_4 C_5
1 1 90 0 406
2 0 20 -1 1500
3 1 90 0 377
4 0 60 -1 1500
5 4 90 1 275
6 0 40 -1 1500
7 4 90 1 228
8 0 80 -1 1500
9 1 90 0 414
10 0 60 -1 1500
-- - -- -- ---

Could any 'SED' or 'AWK' command(s) do this task..?
I would greatly appreciate your help..!

Thanking you in advance...!
Kamu
# 2  
Old 10-11-2010
Hi, welcome to the Unix forum. Your sample file does not contain such a pattern.
What have you tried so far?
# 3  
Old 10-11-2010
Hello Scrutinizer,
Thank you for your quick response, and for welcoming me!
Sorry for the sample that did not contain the pattern.
Actually, other portions of the file have the pattern.
I am appending some more lines that have the pattern for your kind notice.

I searched for similar threads in this group, but nothing I found worked.
Here is the data (sample) with additional lines:

C_1 C_2 C_3 C_4 C_5
1 1 90 0 406
2 0 20 -1 1500
3 1 90 0 377
4 0 60 -1 1500
5 4 90 1 275
6 0 40 -1 1500
7 4 90 1 228
8 0 80 -1 1500
9 1 90 0 414
10 0 60 -1 1500
11 1 90 0 406
12 0 20 -1 1500
13 1 90 0 377
14 0 40 -1 1500
15 4 90 1 275
16 0 20 -1 1500
17 4 90 1 228
18 0 40 -1 1500
19 1 90 0 414
20 0 60 -1 1500
21 1 90 0 406
22 0 40 -1 1500
23 1 90 0 377
24 0 20 -1 1500
25 4 90 1 275
26 0 40 -1 1500
27 4 90 1 228
28 0 20 -1 1500
29 1 90 0 414
30 0 40 -1 1500
-- - -- -- ---
I expect the command to give output like this: (13, 17, 25, 29), because these line-ids satisfy all 3 conditions (based on the values in C_3): 1) '90' in the same line, 2) '20' in the previous line, and 3) '40' in the next line. However, I could not arrive at a command that yields the correct result (although I searched for similar threads that might give clues to this problem).

It would be great if anyone could solve this...!
Thanks again,
Kamu
===============
# 4  
Old 10-11-2010
Hi Kamkamu, try this
Code:
awk 'q==20&&p==90&&$3==40{print id}{q=p;id=$1;p=$3}' infile
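
Read step by step, the one-liner keeps a sliding window over C_3: the value two lines back, the value one line back, and the current value, plus the id of the line one back. Below is a commented sketch of the same logic. The longer variable names (prev2, prev1, prev_id) and the temporary sample file are only for illustration and are not part of the original command.

```shell
# Build a small sample in the thread's layout (header line plus data rows).
sample=$(mktemp)
cat > "$sample" <<'EOF'
C_1 C_2 C_3 C_4 C_5
11 1 90 0 406
12 0 20 -1 1500
13 1 90 0 377
14 0 40 -1 1500
15 4 90 1 275
16 0 20 -1 1500
17 4 90 1 228
18 0 40 -1 1500
EOF

ids=$(awk '
  # prev2 = C_3 two lines back, prev1 = C_3 one line back,
  # prev_id = C_1 one line back (illustrative names).
  # When the window reads 20, 90, 40, the middle line is the one wanted.
  prev2 == 20 && prev1 == 90 && $3 == 40 { print prev_id }
  # Shift the window; this block runs on every line, after the test above.
  { prev2 = prev1; prev_id = $1; prev1 = $3 }
' "$sample")

echo "$ids"
rm -f "$sample"
```

Because the id is taken from C_1 of the remembered line rather than from the line number, the header line does not need special handling here.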

# 5  
Old 10-11-2010
Hello Scrutinizer,
What a superb expert you are...! It worked perfectly..!
In a flash, you have provided the solution!
Thanks a ton, for your help..!

Sincerely,
Kamu
# 6  
Old 10-11-2010
Code:
awk '{a[NR]=$3} a[NR-2]==20&&a[NR-1]==90&&a[NR]==40 {print NR-2}' infile

# 7  
Old 10-11-2010
Hello rdcwayx,
Thank you so much for your help.
When I tried it with my long sample (as in my earlier post), it gave this output:
12
16
24
28
Each id in this list is one less than the actual id that satisfies those 3 conditions. The correct id-list should be as follows:
13
17
25
29
I will try to modify your code slightly and see whether it works perfectly.

Thanks again,
Kamu

---------- Post updated at 12:28 PM ---------- Previous update was at 12:07 PM ----------

Hello rdcwayx,
When I made a minor change in your code (from {print NR-2} to {print NR-1}), it worked perfectly. Now, your code looks like this:

Code:
awk '{a[NR]=$3} a[NR-2]==20&&a[NR-1]==90&&a[NR]==40 {print NR-1}' infile

Do you agree with me..?
Thanks for your expert help.

Sincerely,
Kamu

---------- Post updated at 01:06 PM ---------- Previous update was at 12:28 PM ----------

Hello rdcwayx,
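On rechecking: when I run your original code on my data file with the column titles on the first line, it gives the correct line-ids, because the header line shifts every line number by one. Only with the data alone (without the header) did I have to modify your code. So your original code is right for a file with a header: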
Code:
awk '{a[NR]=$3} a[NR-2]==20&&a[NR-1]==90&&a[NR]==40 {print NR-2}' infile

On the other hand, the code provided by 'Scrutinizer' (as below) worked even when the id-lines were jumbled up.
Code:
awk 'q==20&&p==90&&$3==40{print id}{id=$1;q=p;p=$3}' infile
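
A quick way to see the difference between the two commands is to run both on the same data with and without the header line. This is only a sketch: the three-row sample and the shell variable names are made up for the comparison, and the two awk programs are copied from the posts above.

```shell
# Tiny sample: one 20/90/40 window whose middle line has id 2.
with_header=$(mktemp)
no_header=$(mktemp)
cat > "$with_header" <<'EOF'
C_1 C_2 C_3 C_4 C_5
1 0 20 -1 1500
2 1 90 0 377
3 0 40 -1 1500
EOF
# Same rows without the header line.
tail -n +2 "$with_header" > "$no_header"

# The two programs from the thread, unchanged.
nr_based='{a[NR]=$3} a[NR-2]==20&&a[NR-1]==90&&a[NR]==40 {print NR-2}'
id_based='q==20&&p==90&&$3==40{print id}{id=$1;q=p;p=$3}'

nr_hdr=$(awk "$nr_based" "$with_header")
nr_nohdr=$(awk "$nr_based" "$no_header")
id_hdr=$(awk "$id_based" "$with_header")
id_nohdr=$(awk "$id_based" "$no_header")

echo "NR-based: header=$nr_hdr no-header=$nr_nohdr"
echo "id-based: header=$id_hdr no-header=$id_nohdr"
rm -f "$with_header" "$no_header"
```

The NR-based command prints a line number, so its result depends on whether a header is present and on the ids running 1, 2, 3, ... in file order. The id-based command prints the value stored from C_1, so it gives the same id either way.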

Thank you so much both of you...again!

Kamu