Filtering text with awk


# 1  

I need to filter a file that is structured like this:
Code:
>Cluster 0
0	292nt, >last294258;size=1;... *
>Cluster 1
0	292nt, >last111510;size=1;... *
1	290nt, >last136280;size=1;... at -/98.62%
2	292nt, >last217336;size=1;... at +/99.66%
3	292nt, >last280937;size=1;... at -/99.32%
>Cluster 2
0	292nt, >last355423;size=1;... *

I need the output to contain only the lines that match the "*" pattern, and I also need to add to the output file the number of lines before and after the match, like this:
Code:
>last294258;size=1;... *nr=1
>last111510;size=1;... *nr=4
>last355423;size=1;... *nr=1


Last edited by vgersh99; 02-06-2019 at 12:38 PM.. Reason: Code tags, please!
# 2  
The 'after' of one is just the 'before' of the other, so:

Code:
$ awk '/[*]/ { print $0";nr=" N ; N=0 ; next } { N++ }' data

0       292nt, >last294258;size=1;... *;nr=1
0       292nt, >last111510;size=1;... *;nr=1
0       292nt, >last355423;size=1;... *;nr=4

$

# 3  
Sorry, maybe I was not clear enough, but that doesn't work correctly on the whole big file.

For each line with * I need to add nr= followed by the number of lines belonging to that group (group = cluster),

like this:
Code:
>last294258;size=1;... *nr=1
>last111510;size=1;... *nr=4
>last355423;size=1;... *nr=1

nr=4 because the Cluster 1 subgroup consists of 4 lines.




Last edited by RudiC; 02-06-2019 at 01:57 PM.. Reason: Added CODE tags.
# 4  
Something to start with (a bit verbose):
awk -f pedro.awk myFile
where pedro.awk is:
Code:
BEGIN {
  FS="[>;]"
  OFS=";"
}

function p(a, i)
{
   for(i in a)
     print ">" i, "*nr=" ln

}
/^>/ {p(out);ln=0;split("",out);next}
/[*]/  {idx=$2 OFS $3; out[idx]}
{ln++}
END {
  if (ln) p(out)
}


Last edited by vgersh99; 02-06-2019 at 02:22 PM.. Reason: * -> [*] : to be treated a single char, instead of RE
# 5  
It is common practice in these forums to show what you've tried and where you got stuck when posting a problem. Try

Code:
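awk '
/>Cluster/      {if (CNT) print CNT     # new cluster: finish the pending "nr=" line
                 CNT = 0
                 next
                }
                {CNT++                  # count every member line of the cluster
                }
/\*/            {printf "%snr=", $NF    # "*" line: print the last comma-separated field, no newline
                }
END             {print CNT              # finish the last cluster
                }
' FS=, file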
 >last294258;size=1;... *nr=1
 >last111510;size=1;... *nr=4
 >last355423;size=1;... *nr=1

# 6  
Just a slight modification of RudiC's script from post #5, for the case where a "block" does not have anything marked with *:
Code:
>Cluster 0
0       292nt, >last294258;size=1;... *
>Cluster 1
0       292nt, >last111510;size=1;... *
1       290nt, >last136280;size=1;... at -/98.62%
2       292nt, >last217336;size=1;... at +/99.66%
3       292nt, >last280937;size=1;... at -/99.32%
>Cluster 2
0       292nt, >last355423;size=1;...

Code:
BEGIN {
  FS=","
}
/>Cluster/      {if (flg) print CNT
                 CNT=flg = 0
                 next
                }
                {CNT++
                }
/\*/            {printf "%snr=", $NF;flg++
                }
END             {if (flg) print CNT
                }
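
To see what the flg guard changes, the script above can be run on the modified sample with a here-document. This is only a sketch for trying it out: the shell variable out is introduced here for illustration and is not part of the original post.

```shell
# Run the flg-guarded script on the sample where Cluster 2 has no "*" line.
out=$(awk '
BEGIN { FS = "," }
/>Cluster/ { if (flg) print CNT      # finish the pending "nr=" line, if any
             CNT = flg = 0
             next }
           { CNT++ }                 # count member lines of the cluster
/\*/       { printf "%snr=", $NF; flg++ }
END        { if (flg) print CNT }
' <<'EOF'
>Cluster 0
0       292nt, >last294258;size=1;... *
>Cluster 1
0       292nt, >last111510;size=1;... *
1       290nt, >last136280;size=1;... at -/98.62%
2       292nt, >last217336;size=1;... at +/99.66%
3       292nt, >last280937;size=1;... at -/99.32%
>Cluster 2
0       292nt, >last355423;size=1;...
EOF
)
printf '%s\n' "$out"
```

Only Cluster 0 and Cluster 1 are reported (with nr=1 and nr=4); Cluster 2 has no line marked with * and therefore produces no output, rather than a stray count on a line of its own. Each output line keeps the leading space of the last comma-separated field.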

# 7  
The following skips empty lines (where NF is 0) and saves the 1st line after ">Cluster"
Code:
awk '
function prt(){ if (run==1) print (save1 "nr=" nr); else run=1 }
$1~/^>Cluster/ { prt(); nr=0; next }
(NF>0 && ++nr==1) { $1=$2=""; save1=$0 }
END { prt() }
' data

At each ">Cluster" and at the END it calls prt() that prints the collected values (but not at its first invocation).
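
A possible variation on the same idea, sketched under the assumption that each cluster has at most one line marked with "*": instead of saving the first line after ">Cluster", save the line that carries the "*" and strip everything before its ">". The names emit, hit, n and out are illustrative only and do not come from the posts above.

```shell
# Sample data from post #1, fed through a here-document.
out=$(awk '
# Print the saved "*" line (if any) with the size of the cluster just finished.
function emit() { if (hit != "") print hit "nr=" n; hit = ""; n = 0 }
/^>Cluster/ { emit(); next }                  # a header closes the previous cluster
NF          { n++ }                           # count non-empty member lines
/[*]/       { sub(/^[^>]*/, ""); hit = $0 }   # keep the text from ">" onward
END         { emit() }
' <<'EOF'
>Cluster 0
0       292nt, >last294258;size=1;... *
>Cluster 1
0       292nt, >last111510;size=1;... *
1       290nt, >last136280;size=1;... at -/98.62%
2       292nt, >last217336;size=1;... at +/99.66%
3       292nt, >last280937;size=1;... at -/99.32%
>Cluster 2
0       292nt, >last355423;size=1;... *
EOF
)
printf '%s\n' "$out"
```

On this sample it prints the three lines requested in post #1, without leading whitespace. Clusters that contain no "*" line are skipped, because hit stays empty for them.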