Extract specific content from a file: Post 302360751 by Scrutinizer on Saturday 10th of October 2009 07:09:20 AM
Old 10-10-2009
Another road to Rome:
Code:
mawk 'BEGIN {RS="\n>"; printf">"} /_2/' infile

The following is more generic: it also works if the actual label is not "sequence_2" and the OP simply means the second record, where a ">" at the beginning of a line marks the label of a new record:
Code:
mawk 'BEGIN {RS="\n>"; printf">"} NR==2' infile

The same commands also work with gawk. As danmero pointed out, this code does not work with standard awk, nawk, or POSIX awk, because those versions only accept a single character for RS.
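For awk versions that only honor a single-character RS, one possible workaround is to split on ">" alone. This is a minimal sketch and not part of the original suggestion: the function name nth_record is hypothetical, and it assumes the file begins with a ">" label and that ">" never appears inside the sequence data.

```shell
# Hypothetical helper: print the n-th ">"-labelled record of a file
# using only a single-character RS.
nth_record() {
    # $1 = record number (1-based), $2 = input file
    awk -v n="$1" '
        BEGIN { RS = ">" }               # every ">" ends a record
        # Record 1 is the empty text before the first ">",
        # so the n-th labelled record is awk record n + 1.
        NR == n + 1 { printf ">%s", $0 } # put back the ">" that RS consumed
    ' "$2"
}
```

Calling `nth_record 2 infile` should then print the second record, label included. Because each record keeps its own trailing newline, printf is used instead of print to avoid emitting an extra blank line.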

Last edited by Scrutinizer; 10-10-2009 at 10:06 AM..
 

AWK(1)							      General Commands Manual							    AWK(1)

NAME
       awk - pattern scanning and processing language

SYNOPSIS
       awk [ -Fc ] [ prog ] [ file ] ...

DESCRIPTION
       Awk scans each input file for lines that match any of a set of patterns specified in prog. With each pattern in prog there can be an associated action that will be performed when a line of a file matches the pattern. The set of patterns may appear literally as prog, or in a file specified as -f file.

       Files are read in order; if there are no files, the standard input is read. The file name `-' means the standard input. Each line is matched against the pattern portion of every pattern-action statement; the associated action is performed for each matched pattern.

       An input line is made up of fields separated by white space. (This default can be changed by using FS, vide infra.) The fields are denoted $1, $2, ... ; $0 refers to the entire line.

       A pattern-action statement has the form

              pattern { action }

       A missing { action } means print the line; a missing pattern always matches.

       An action is a sequence of statements. A statement can be one of the following:

              if ( conditional ) statement [ else statement ]
              while ( conditional ) statement
              for ( expression ; conditional ; expression ) statement
              break
              continue
              { [ statement ] ... }
              variable = expression
              print [ expression-list ] [ >expression ]
              printf format [ , expression-list ] [ >expression ]
              next # skip remaining patterns on this input line
              exit # skip the rest of the input

       Statements are terminated by semicolons, newlines or right braces. An empty expression-list stands for the whole line. Expressions take on string or numeric values as appropriate, and are built using the operators +, -, *, /, %, and concatenation (indicated by a blank). The C operators ++, --, +=, -=, *=, /=, and %= are also available in expressions. Variables may be scalars, array elements (denoted x[i]) or fields. Variables are initialized to the null string. Array subscripts may be any string, not necessarily numeric; this allows for a form of associative memory. String constants are quoted "...".

       The print statement prints its arguments on the standard output (or on a file if >file is present), separated by the current output field separator, and terminated by the output record separator. The printf statement formats its expression list according to the format (see printf(3)).

       The built-in function length returns the length of its argument taken as a string, or of the whole line if no argument. There are also built-in functions exp, log, sqrt, and int. The last truncates its argument to an integer. substr(s, m, n) returns the n-character substring of s that begins at position m. The function sprintf(fmt, expr, expr, ...) formats the expressions according to the printf(3) format given by fmt and returns the resulting string.

       Patterns are arbitrary Boolean combinations (!, ||, &&, and parentheses) of regular expressions and relational expressions. Regular expressions must be surrounded by slashes and are as in egrep. Isolated regular expressions in a pattern apply to the entire line. Regular expressions may also occur in relational expressions.

       A pattern may consist of two patterns separated by a comma; in this case, the action is performed for all lines between an occurrence of the first pattern and the next occurrence of the second.

       A relational expression is one of the following:

              expression matchop regular-expression
              expression relop expression

       where a relop is any of the six relational operators in C, and a matchop is either ~ (for contains) or !~ (for does not contain). A conditional is an arithmetic expression, a relational expression, or a Boolean combination of these.

       The special patterns BEGIN and END may be used to capture control before the first input line is read and after the last. BEGIN must be the first pattern, END the last.

       A single character c may be used to separate the fields by starting the program with

              BEGIN { FS = "c" }

       or by using the -Fc option.

       Other variable names with special meanings include NF, the number of fields in the current record; NR, the ordinal number of the current record; FILENAME, the name of the current input file; OFS, the output field separator (default blank); ORS, the output record separator (default newline); and OFMT, the output format for numbers (default "%.6g").

EXAMPLES
       Print lines longer than 72 characters:

              length > 72

       Print first two fields in opposite order:

              { print $2, $1 }

       Add up first column, print sum and average:

              { s += $1 }
              END { print "sum is", s, " average is", s/NR }

       Print fields in reverse order:

              { for (i = NF; i > 0; --i) print $i }

       Print all lines between start/stop pairs:

              /start/, /stop/

       Print all lines whose first field is different from previous one:

              $1 != prev { print; prev = $1 }

SEE ALSO
       lex(1), sed(1)
       A. V. Aho, B. W. Kernighan, P. J. Weinberger, Awk - a pattern scanning and processing language

BUGS
       There are no explicit conversions between numbers and strings. To force an expression to be treated as a number add 0 to it; to force it to be treated as a string concatenate "" to it.
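
The conversion idiom described under BUGS can be illustrated with a short sketch. It is not part of the manual page: the function name compare_demo is hypothetical, and the behavior shown is that of current awk implementations, where values assigned from string constants compare as strings until a number is forced.

```shell
# Hypothetical demo of forcing numeric comparison by adding 0.
compare_demo() {
    awk 'BEGIN {
        a = "10"; b = "9"          # both assigned from string constants
        print (a < b)              # string comparison: "10" sorts before "9"
        print (a + 0 < b + 0)      # adding 0 forces numeric comparison: 10 < 9
    }'
}
```

Running `compare_demo` prints 1 for the string comparison and 0 for the numeric one.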