Extract specific content from a file


 
# 8  
Old 10-09-2009
The idea is very simple:

Print from sequence_2 onward, and exit as soon as the sequence_3 pattern is found.

I would recommend this approach because it is efficient too: once sequence_3 is reached, sed quits and the rest of your file is never read.

In sed, this technique is called pattern addressing.
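For readers who want to try the idea, here is a minimal sketch. The sed command posted earlier in the thread is not reproduced here, so this is only one way of writing "print from sequence_2, quit at sequence_3"; the sample records and the variable names are made up for illustration.

```shell
# Hypothetical sample input in the same ">label" layout as the thread's data.
infile=$(mktemp)
printf '%s\n' '>sequence_1' 'AAAA' '>sequence_2' 'CCCC' 'GGGG' '>sequence_3' 'TTTT' > "$infile"

# First expression: quit when the sequence_3 header is read. With -n nothing is
# auto-printed, so that header itself is not output.
# Second expression: print every line from the sequence_2 header to the last line ($).
sed_out=$(sed -n -e '/^>sequence_3$/q' -e '/^>sequence_2$/,$p' "$infile")

printf '%s\n' "$sed_out"
rm -f "$infile"
```

Because sed quits at the sequence_3 header, the lines after it are never read, which is the efficiency point made above.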
# 9  
Old 10-09-2009
Thanks a lot, thegeek.
I understand it now :D
hehe...
Do you have any idea how to solve this thread:
https://www.unix.com/shell-programmin...#post302360533
It seems more difficult and complicated.
Thanks a lot for your advice.
# 10  
Old 10-10-2009
Hi Radoulov

Once again I am baffled by the brevity of your code!

You explained this one to me a few days ago in another post:

Code:
awk '/_3$/{exit}/_2$/{f=1}f' file

I just don't get these two at all, though. Why do they work?

Code:
awk '/_3$/{exit}/_2$/,0' infile

Or:

Code:
awk '/_3$/{exit}/_2$/,_' infile


What is the ,0 and ,_ about?

Last edited by steadyonabix; 10-10-2009 at 03:55 AM.. Reason: code tags
# 11  
Old 10-10-2009
Quote:
Originally Posted by steadyonabix
What is the ,0 and ,_ about?
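0 is false, and the variable _ is never set, so it is empty and therefore false as well.
In effect, awk prints from the /_2$/ match to the end of the input, because the end condition is never true, but it exits as soon as it reaches the /_3$/ match.

That's why I like radoulov's solutions: you have to ask yourself why.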
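To see that the explicit flag and the two "never-ending range" forms behave the same, here is a small sketch. The three awk programs are the ones quoted in this thread; the sample records and the shell variable names are made up for illustration.

```shell
# Hypothetical sample input in the same ">label" layout as the thread's data.
infile=$(mktemp)
printf '%s\n' '>sequence_1' 'AAAA' '>sequence_2' 'CCCC' 'GGGG' '>sequence_3' 'TTTT' > "$infile"

# Explicit flag: f is set at the sequence_2 header, and the bare pattern "f"
# prints every line while it is non-zero.
flag_out=$(awk '/_3$/{exit} /_2$/{f=1} f' "$infile")

# Range whose end condition is the constant 0: never true, so the range never closes.
zero_out=$(awk '/_3$/{exit} /_2$/,0' "$infile")

# Range whose end condition is the unset variable _: empty, so also never true.
unset_out=$(awk '/_3$/{exit} /_2$/,_' "$infile")

printf '%s\n' "$zero_out"
rm -f "$infile"
```

In all three cases the `/_3$/{exit}` rule comes first, so awk stops before the sequence_3 header can be printed.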
# 12  
Old 10-10-2009
Another road to Rome:
Code:
mawk 'BEGIN {RS="\n>"; printf">"} /_2/' infile

The following is more generic: it also works when the actual label is not "sequence_2" but the OP wants the second record, with a ">" at the beginning of a line marking the start of each new record's label:
Code:
mawk 'BEGIN {RS="\n>"; printf">"} NR==2' infile

This also works with gawk. As danmero pointed out, the code does not work with standard awk, nawk, or POSIX awk; those versions accept only a single character for RS.
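If only a single-character RS is available, one possible workaround is to split on ">" alone and put the ">" back when printing. This is a sketch, not something posted in the thread: the `label` variable and the sample records are made up, and it assumes ">" never occurs inside the sequence data.

```shell
# Hypothetical sample input in the same ">label" layout as the thread's data.
infile=$(mktemp)
printf '%s\n' '>sequence_1' 'AAAA' '>sequence_2' 'CCCC' 'GGGG' '>sequence_3' 'TTTT' > "$infile"

# RS=">" is a single character, so each record is "label\ndata...\n".
# The text before the first ">" forms an empty first record, which never matches.
# With the default FS, $1 is the label; printf restores the leading ">".
portable_out=$(awk -v label='sequence_2' '
  BEGIN { RS = ">" }
  $1 == label { printf ">%s", $0 }
' "$infile")

printf '%s\n' "$portable_out"
rm -f "$infile"
```

Matching on `$1` rather than on `NR` avoids having to account for the empty first record.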

Last edited by Scrutinizer; 10-10-2009 at 10:06 AM..
# 13  
Old 10-11-2009
Quote:
Originally Posted by danmero
0 is false, and the variable _ is never set, so it is empty and therefore false as well.
In effect, awk prints from the /_2$/ match to the end of the input, because the end condition is never true, but it exits as soon as it reaches the /_3$/ match.

That's why I like radoulov's solutions: you have to ask yourself why.
I see. But why the single comma before the 0? What does it mean in this context?

Code:
awk '/_3$/{exit}/_2$/,0' infile
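For context, the comma in awk builds a range pattern of the form start,end: lines are selected from a line matching start through the next line matching end, inclusive. A small sketch with an ordinary end pattern (the sample records and variable names are made up for illustration):

```shell
# Hypothetical sample input in the same ">label" layout as the thread's data.
infile=$(mktemp)
printf '%s\n' '>sequence_1' 'AAAA' '>sequence_2' 'CCCC' 'GGGG' '>sequence_3' 'TTTT' > "$infile"

# Range pattern with no action: print from the line ending in _2
# through the line ending in _3, both included.
incl_out=$(awk '/_2$/,/_3$/' "$infile")

printf '%s\n' "$incl_out"
rm -f "$infile"
```

Here the sequence_3 header is part of the output, because the end line of a range is included. That is presumably why the one-liner above uses a separate exit rule and an end condition of 0 instead of /_3$/.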

# 14  
Old 10-12-2009
Code:
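use strict; use warnings;
local $/ = ">";                      # read one ">"-delimited record at a time
open my $fh, '<', 'a.txt' or die "a.txt: $!";
while (<$fh>) {
  chomp;                             # drop the trailing ">" separator
  print ">$_" if /^sequence_2\b/;    # restore the leading ">" on output
}
close $fh;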
