Find a matched pattern and perform comparison on numbers next to it


Login or Register for Dates, Times and to Reply

 
Thread Tools Search this Thread
# 1  
Find a matched pattern and perform comparison on numbers next to it

Hi,

I have been trying to extract rows that match pattern "cov" with the value next to it to be > 3. The 'cov' pattern may appear either in $3 or $4 (if using ";" as field separator). Below is the example:-

input file
Code:
ENST00000652609.1|ENSG00000230590.10|OTTHUMG00000021850.6|OTTHUMT00000503925.1|FTX-232|FTX|2334|	StringTie	exon	385	622	1000	.	.	gene_id "SRR5206792.443"; transcript_id "SRR5206792.443.1"; exon_number "1"; cov "2.749580";
lnc-NAA50-3:1	StringTie	transcript	1	859	1000	.	.	gene_id "SRR5206792.445"; transcript_id "SRR5206792.445.1"; cov "25.269176"; FPKM "81.295151"; TPM "72.137390";
lnc-NAA50-3:1	StringTie	exon	1	859	1000	.	.	gene_id "SRR5206792.445"; transcript_id "SRR5206792.445.1"; exon_number "1"; cov "25.269176";
lnc-DCAF1-1:1	StringTie	transcript	1	2446	1000	.	.	gene_id "SRR5206792.446"; transcript_id "SRR5206792.446.1"; cov "19.228128"; FPKM "61.860096"; TPM "54.891655";
lnc-DCAF1-1:1	StringTie	exon	1	2446	1000	.	.	gene_id "SRR5206792.446"; transcript_id "SRR5206792.446.1"; exon_number "1"; cov "19.228128";

From the sample file, the "cov" value for the first row is less than 3. Therefore, it should be excluded and the output should be like below.
Code:
lnc-NAA50-3:1	StringTie	transcript	1	859	1000	.	.	gene_id "SRR5206792.445"; transcript_id "SRR5206792.445.1"; cov "25.269176"; FPKM "81.295151"; TPM "72.137390";
lnc-NAA50-3:1	StringTie	exon	1	859	1000	.	.	gene_id "SRR5206792.445"; transcript_id "SRR5206792.445.1"; exon_number "1"; cov "25.269176";
lnc-DCAF1-1:1	StringTie	transcript	1	2446	1000	.	.	gene_id "SRR5206792.446"; transcript_id "SRR5206792.446.1"; cov "19.228128"; FPKM "61.860096"; TPM "54.891655";
lnc-DCAF1-1:1	StringTie	exon	1	2446	1000	.	.	gene_id "SRR5206792.446"; transcript_id "SRR5206792.446.1"; exon_number "1"; cov "19.228128";

I know how to search the pattern but do not know how to compare the value to be > 3. Below is one of the sample codes that i did:-
Code:
awk 'BEGIN{FS=";"] $0~/cov/ && $3 || $4 >3 {print}' input file

tried couple of times to do the comparison by combining with pattern matching but failed. appreciate your kind help and advise. thanks

Last edited by bunny_merah19; 10-01-2019 at 03:39 AM..
# 2  
Try (untested)
Code:
awk -F";" 'match ($0, /cov[ ".0-9]*;/) {split (substr ($0, RSTART, RLENGTH), T, "\""); if (T[2] <= 3) next} 1' file


Last edited by RudiC; 10-01-2019 at 11:11 AM..
This User Gave Thanks to RudiC For This Post:
# 3  
Quote:
Originally Posted by RudiC
Try (untested)
Code:
awk -F";" 'match ($0, /cov[ ".0-9]*;/) {split (substr ($0, RSTART, RLENGTH), T, "\""); if (T[2] <= 3) next} 1' file

Thanks so much. It works perfectly!

I am looking into your codes.
Code:
{split (substr ($0, RSTART, RLENGTH), T, "\""); if (T[2] <= 3) next}

This is so good to know as I have many more data with this kind of almost similar condition to work on. Again, thanks so much. Really appreciate it. Smilie
# 4  
Quote:
Originally Posted by bunny_merah19
This is so good to know as I have many more data with this kind of almost similar condition to work on. Again, thanks so much. Really appreciate it. Smilie
With a slight change it should also work nicely for string data values:

Code:
$ echo ' wrongkey  "some data"; key "more data";' | 
     awk 'match($0, /[ ;]key +"[^"]*" *-;/) {split(substr($0,RSTART,RLENGTH), T, "\""); print T[2]}'
more data

This User Gave Thanks to Chubler_XL For This Post:
# 5  
Quote:
Originally Posted by RudiC
Try (untested)
Code:
awk -F";" 'match ($0, /cov[ ".0-9]*;/) {split (substr ($0, RSTART, RLENGTH), T, "\""); if (T[2] <= 3) next} 1' file

Quote:
Originally Posted by Chubler_XL
With a slight change it should also work nicely for string data values:

Code:
$ echo ' wrongkey  "some data"; key "more data";' | 
     awk 'match($0, /[ ;]key +"[^"]*" *-;/) {split(substr($0,RSTART,RLENGTH), T, "\""); print T[2]}'
more data

This is great! thanks so much Smilie
Login or Register for Dates, Times and to Reply

Previous Thread | Next Thread
Thread Tools Search this Thread
Search this Thread:
Advanced Search

Test Your Knowledge in Computers #264
Difficulty: Easy
Alan Turing worked at MIT where he designed the Automatic Computing Engine, which was one of the first designs for a stored-program computer.
True or False?

10 More Discussions You Might Find Interesting

1. UNIX for Beginners Questions & Answers

Find matched pattern and print all based on certain conditions

Hi, I am trying to extract data based on certain conditions. My sample input file as below:- lnc-2:1 OnePiece tra_law 500 688 1 . . g_id "R792.8417"# tra_law_id "R792.8417.1"# g_line "2.711647"# KM "8.723820"# lnc-2:1 OnePiece room 500 510 1 . . g_id "R792.8417"# tra_law_id "R792.8417.1"#... (4 Replies)
Discussion started by: bunny_merah19
4 Replies

2. Shell Programming and Scripting

How to use sed to search a particular pattern in a file backward after a pattern is matched.?

Hi, I have two files file1.txt and file2.txt. Please see the attachments. In file2.txt (which actually is a diff output between two versions of file1.txt.), I extract the pattern corresponding to 1172c1172. Now ,In file1.txt I have to search for this pattern 1172c1172 and if found, I have to... (9 Replies)
Discussion started by: saurabh kumar
9 Replies

3. Shell Programming and Scripting

Insert certain field of matched pattern line above pattern

Hello every, I am stuck in a problem. I have file like this. I want to add the fifth field of the match pattern line above the lines starting with "# @D". The delimiter is "|" eg > # @D0.00016870300|0.05501020000|12876|12934|3||Qp||Pleistocene||"3 Qp Pleistocene"|Q # @P... (5 Replies)
Discussion started by: jyu3
5 Replies

4. Homework & Coursework Questions

[solved]Perl: Printing line numbers to matched strings and hashes.

Florida State University, Tallahassee, FL, USA, Dr. Whalley, COP4342 Unix Tools. This program takes much of my previous assignment but adds the functionality of printing the concatenated line numbers found within the input. Sample input from <> operator: Hello World This is hello a sample... (2 Replies)
Discussion started by: D2K
2 Replies

5. Shell Programming and Scripting

Awk to match a pattern and perform a search after the first pattern

Hello Guyz I have been following this forum for a while and the solutions provided are super useful. I currently have a scenario where i need to search for a pattern and start searching by keeping the first pattern as a baseline ABC DEF LMN EFG HIJ LMN OPQ In the above text i need to... (8 Replies)
Discussion started by: RickCharles
8 Replies

6. Shell Programming and Scripting

How to find the matched numbers between 2 text file using perl program??

hi dudes, I nee you kind assistance, I have to find the matched numbers from 2 text files and output of matched numbers should be in another text file.. I do have text files like this , for example File 1 787 665*5-p 5454 545-p 445-p 5454*-p File 2 5455 787 445-p 4356 2445 144 ... (3 Replies)
Discussion started by: sureshraj
3 Replies

7. Shell Programming and Scripting

Can sed perform editing operations ONLY in the matched region?

Hi: Let's suppose I want to replace all the | by > ONLY when | is between . Usually (and it works) I would do something like sed -e 's/\(\*\)|\(*\]\)/\1>\2/g' where I have to "save" some portions of the matched region and use them with the \n metacharacter. I was wondering if I could... (2 Replies)
Discussion started by: islegmar
2 Replies

8. Shell Programming and Scripting

HELP! PERL script to find matched pattern

Hi all, I just learnt Perl and I encountered a problem in my current project. For a verilog file, i am required to write a PERL script that could match pattern to output nitrolink and nitropack. I wont know what name to grep except the pattern below. the verilog file: nitrolink nitrolink... (1 Reply)
Discussion started by: kimhuat
1 Replies

9. Solaris

How to perform addition of two numbers in shell scripting in Solaris-10

Hi, I have a sh script which contains the following line TOTAL=$((e4-s4)) -> Where e4 and s4 are input got from the user. At the time of execution of this line the following error occurs test.sh: syntax error at line 8: `TOTAL=$' unexpected How to solve this issue?. Can any... (9 Replies)
Discussion started by: krevathi1912
9 Replies

10. Shell Programming and Scripting

How to perform calculations using numbers greater than 2150000000.

Could someone tell me how to perform calculations using numbers greater than 2150000000 in Korn Shell? When I tried to do it it gave me the wrong answer. e.g. I have a ksh file with the contents below: --------------------------------- #!/bin/ksh SUM=`expr 2150000000 + 2` PRODUCT=`expr... (3 Replies)
Discussion started by: stevefox
3 Replies

Featured Tech Videos