Filtering text with awk


 
# 1  
Old 02-06-2019
Filtering text with awk

I need to filter a file that looks like this:
Code:
>Cluster 0
0	292nt, >last294258;size=1;... *
>Cluster 1
0	292nt, >last111510;size=1;... *
1	290nt, >last136280;size=1;... at -/98.62%
2	292nt, >last217336;size=1;... at +/99.66%
3	292nt, >last280937;size=1;... at -/99.32%
>Cluster 2
0	292nt, >last355423;size=1;... *

I need the output to contain only the lines that contain the "*" pattern, and I also need to add to each output line the number of lines before and after the match, like this:
Code:
>last294258;size=1;... *nr=1
>last111510;size=1;... *nr=4
>last355423;size=1;... *nr=1


Last edited by vgersh99; 02-06-2019 at 11:38 AM.. Reason: Code tags, please!
# 2  
Old 02-06-2019
The 'after' of one is just the 'before' of the other, so:

Code:
$ awk '/[*]/ { print $0";nr=" N ; N=0 ; next } { N++ }' data

0       292nt, >last294258;size=1;... *;nr=1
0       292nt, >last111510;size=1;... *;nr=1
0       292nt, >last355423;size=1;... *;nr=4

$

# 3  
Old 02-06-2019
Sorry, maybe I was not clear, but that doesn't work correctly on the whole big file.

For each line with * I need to add nr= followed by the number of lines belonging to that group (group = cluster).

So like this:
Code:
>last294258;size=1;... *nr=1
>last111510;size=1;... *nr=4
>last355423;size=1;... *nr=1

--- Post updated at 04:25 PM ---

nr=4 is because the Cluster 1 group consists of 4 lines.




Moderator's Comments:
Mod Comment Seriously: Please use CODE tags as required by forum rules!

Last edited by RudiC; 02-06-2019 at 12:57 PM.. Reason: Added CODE tags.
# 4  
Old 02-06-2019
Something to start with (a bit verbose):
awk -f pedro.awk myFile
where pedro.awk is:
Code:
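BEGIN {
  FS="[>;]"   # split on ">" or ";": $2 is the sequence id, $3 the size=... field
  OFS=";"
}

# print every saved key of the finished cluster with the line count ln
# (i is listed as a parameter only to make it local)
function p(a, i)
{
   for(i in a)
     print ">" i, "*nr=" ln
}
# a ">Cluster" header: flush the previous cluster, then reset count and keys
/^>/ {p(out);ln=0;split("",out);next}
# a line marked with "*": remember "id;size=..." as a key of out
/[*]/  {idx=$2 OFS $3; out[idx]}
# count every non-header line of the current cluster
{ln++}
END {
  if (ln) p(out)   # flush the last cluster
}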
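
To see why idx=$2 OFS $3 rebuilds the "last294258;size=1" key, it may help to look at how FS="[>;]" splits one data line. This is only a small sketch: the variable names line and fields are made up for the illustration, and a single space stands in for the tab of the sample file.

```shell
# One data line from the sample in post #1 (space instead of tab).
line='0 292nt, >last294258;size=1;... *'

# Split on ">" or ";" and show every field in brackets.
fields=$(printf '%s\n' "$line" |
  awk -F'[>;]' '{ for (i = 1; i <= NF; i++) printf "$%d=[%s]\n", i, $i }')

printf '%s\n' "$fields"
# $1=[0 292nt, ]
# $2=[last294258]
# $3=[size=1]
# $4=[... *]
```

The "... *" part ends up in $4, which is why it is missing from the key and has to be added back in the print if the exact requested format is wanted.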


Last edited by vgersh99; 02-06-2019 at 01:22 PM.. Reason: * -> [*] : to be treated a single char, instead of RE
# 5  
Old 02-06-2019
It is common practice in these forums to show what you've tried and where you got stuck when posting a problem. Try



Code:
awk '
/>Cluster/      {if (CNT) print CNT
                 CNT = 0
                 next
                }
                {CNT++
                }
/\*/            {printf "%snr=", $NF
                }
END             {print CNT
                }
' FS=, file
 >last294258;size=1;... *nr=1
 >last111510;size=1;... *nr=4
 >last355423;size=1;... *nr=1

# 6  
Old 02-06-2019
Just a slight modification for the case where a "block" has no line marked with *:
Code:
>Cluster 0
0       292nt, >last294258;size=1;... *
>Cluster 1
0       292nt, >last111510;size=1;... *
1       290nt, >last136280;size=1;... at -/98.62%
2       292nt, >last217336;size=1;... at +/99.66%
3       292nt, >last280937;size=1;... at -/99.32%
>Cluster 2
0       292nt, >last355423;size=1;...

Code:
BEGIN {
  FS=","
}
/>Cluster/      {if (flg) print CNT
                 CNT=flg = 0
                 next
                }
                {CNT++
                }
/\*/            {printf "%snr=", $NF;flg++
                }
END             {if (flg) print CNT
                }
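
A quick way to check the flag logic is to run it on the sample where Cluster 2 has no "*" line. This is a sketch only: the temporary file and the shell function name with_flag are made up for the test, and the awk program is the same logic as above passed inline with -F, instead of the BEGIN block.

```shell
# Sample with a last block that has no line marked with "*".
sample=$(mktemp)
cat > "$sample" <<'EOF'
>Cluster 0
0       292nt, >last294258;size=1;... *
>Cluster 1
0       292nt, >last111510;size=1;... *
1       290nt, >last136280;size=1;... at -/98.62%
2       292nt, >last217336;size=1;... at +/99.66%
3       292nt, >last280937;size=1;... at -/99.32%
>Cluster 2
0       292nt, >last355423;size=1;...
EOF

# flg records whether the current block has printed a "*" line
# that is still waiting for its count.
with_flag() {
    awk -F, '
    />Cluster/ { if (flg) print CNT; CNT = flg = 0; next }
               { CNT++ }
    /\*/       { printf "%snr=", $NF; flg++ }
    END        { if (flg) print CNT }
    ' "$1"
}

with_flag "$sample"
```

This should print only the lines for Cluster 0 (nr=1) and Cluster 1 (nr=4), each with the leading blank that follows the comma; without the flag, the END block would print a stray count for Cluster 2.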

# 7  
Old 02-06-2019
The following skips empty lines (where NF is 0) and saves the 1st line after ">Cluster"
Code:
awk '
function prt(){ if (run==1) print (save1 "nr=" nr); else run=1 }
$1~/^>Cluster/ { prt(); nr=0; next }
(NF>0 && ++nr==1) { $1=$2=""; save1=$0 }
END { prt() }
' data

At each ">Cluster" line and in the END block it calls prt(), which prints the saved line and its count (except on its first invocation, when nothing has been collected yet).
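
The same print-at-the-next-header pattern can also be keyed on the "*" line instead of the first line, which gives the requested format without leading blanks. This is a hedged sketch, not the code from the post above: the names cluster_sizes, emit, rep and n are made up, and the sample file is rebuilt from post #1 with spaces instead of tabs.

```shell
# Sample rebuilt from post #1.
sample=$(mktemp)
cat > "$sample" <<'EOF'
>Cluster 0
0 292nt, >last294258;size=1;... *
>Cluster 1
0 292nt, >last111510;size=1;... *
1 290nt, >last136280;size=1;... at -/98.62%
2 292nt, >last217336;size=1;... at +/99.66%
3 292nt, >last280937;size=1;... at -/99.32%
>Cluster 2
0 292nt, >last355423;size=1;... *
EOF

cluster_sizes() {
    awk '
    # print the pending cluster, but only if it had a line marked with "*"
    function emit() { if (rep != "") print rep "nr=" n }

    /^>Cluster/ { emit(); rep = ""; n = 0; next }   # header closes the previous cluster
    NF == 0     { next }                            # skip empty lines
                { n++ }                             # count every member line
    /[*]/       { rep = substr($0, index($0, ">")) } # keep the text from ">" onward
    END         { emit() }                          # flush the last cluster
    ' "$1"
}

cluster_sizes "$sample"
# >last294258;size=1;... *nr=1
# >last111510;size=1;... *nr=4
# >last355423;size=1;... *nr=1
```

If a cluster had several lines with "*", only the last one would be kept; clusters without any "*" line print nothing.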