Sponsored Content
Top Forums UNIX for Dummies Questions & Answers Extracting the two lines where the first line is matched Post 302932773 by senhia83 on Saturday 24th of January 2015 09:13:21 AM
Old 01-24-2015
Hope this helps, a generalization of your post.

Here is a script that I use to parse one or more sequences from a multiFASTA file. This should also work if your sequences span across multiple lines.

You need to store the sequence ids you want in one file, 1 id per line..

Code:
#!/usr/bin/perl -w

use strict;

my $idsfile = "ids.txt";
my $seqfile = "seqs.fa";
my %ids  = ();

open FILE, $idsfile;
while(<FILE>) {
  chomp;
  $ids{$_} += 1;
}
close FILE;

local $/ = "\n>";  # read by FASTA record

open FASTA, $seqfile;
while (<FASTA>) {
    chomp;
    my $seq = $_;
    my ($id) = $seq =~ /^>*(\S+)/;  # parse ID as first word in FASTA header
    if (exists($ids{$id})) {
        $seq =~ s/^>*.+\n//;  # remove FASTA header
        $seq =~ s/\n//g;  # remove endlines
        print "$seq\n";
    }
}
close FASTA;

 

10 More Discussions You Might Find Interesting

1. Shell Programming and Scripting

sed, grep, awk, regex -- extracting a matched substring from a file/string

Ok, I'm stumped and can't seem to find relevant info. (I'm not even sure, I might have asked something similar before.): I'm trying to use shell scripting/UNIX commands to extract URLs from a fairly large web page, with a view to ultimately wrapping this in PHP with exec() and including the... (2 Replies)
Discussion started by: ropers
2 Replies

2. Shell Programming and Scripting

AWK - Extracting matched line

Hi all, I have one more query related to AWK. I have the following csv data: ,qwertyA, field1, field2, field3, field4, field5, field6 ,,,,,,,,,,,,,,,,,,,100,200 ,,,,,,,,,,,,,,,,,,,300,400 ,qwertyB, field1, field2, field3, field4, field5, field6 ,,,,,,,,,,,,,,,,,,,100,200... (9 Replies)
Discussion started by: not4google
9 Replies

3. Shell Programming and Scripting

Logfile - extracting certain lines to concatenate into 1 line

I've got a log file from automatic diagnostic runs. The log file is appended to each time an automatic log is run. I'd like to just pull certain lines from each run in the log file, and concatenate them into 1 comma delimited line (for export into excel or an html table). Each diagnostic run... (3 Replies)
Discussion started by: BecTech
3 Replies

4. Shell Programming and Scripting

extracting matched pattern from a line using sed

I am trying to pull certain pieces of data out of a line of a file that matches a certain pattern: The three pieces that I want to pull out of this line are the only occurrences of that pattern within the line, but the rest of the line is not consistent in each file. Basically the line is... (3 Replies)
Discussion started by: ellhef
3 Replies

5. Shell Programming and Scripting

Extracting line matching a phrase and then the next lines after it

Hi all, I was wondering if someone could tell me a way to extract from a file lines where you search for a phrase and then also extract the next X lines after it (i.e. take a block of text from the file)? Example { id=123 time=10:00:00 date=12/12/09 { ........ ... (6 Replies)
Discussion started by: muay_tb
6 Replies

6. Shell Programming and Scripting

awk script to move a line after the matched pattern line

I have the following text format in a file which lists the question first and then 5 choices after that the explanantion and finally the answer. 1.The amount of time it takes for most of a worker’s occupational knowledge and skills to become obsolete has been declining because of the... (2 Replies)
Discussion started by: nanchil_guy
2 Replies

7. Shell Programming and Scripting

Help required on joining one line above & below to the pattern matched string line.

Hi Experts, Help needed on joining one line above & below to the pattern matched string line. The input file, required output is mentioned below Input file ABCD DEFG5 42.0.1-63.38.31 KKKK iokl IP Connection Available ABCD DEFG5 42.0.1-63.38.31 ... (7 Replies)
Discussion started by: krao
7 Replies

8. Shell Programming and Scripting

Extracting lines after nth LINE from an output

Hi all, Here is my problem for which i am breaking my head for past three days.. I have parted command output as follows.. Model: ATA WDC WD5000AAKS-0 (scsi) Disk /dev/sdb: 500GB Sector size (logical/physical): 512B/512B Partition Table: msdos Number Start End Size Type ... (3 Replies)
Discussion started by: selvarajvs
3 Replies

9. Shell Programming and Scripting

Sed: how to merge two lines moving matched pattern to end of previous line

hello everyone, im new here, and also programming with awk, sed and grep commands on linux. In my text i have many lines with this config: 1 1 4 3 1 1 2 5 2 2 1 1 1 3 1 2 1 3 1 1 1 2 2 2 5 2 4 1 3 2 1 1 4 1 2 1 1 1 3 2 1 1 5 4 1 3 1 1... (3 Replies)
Discussion started by: satir
3 Replies

10. Shell Programming and Scripting

How to print previous line of multiple pattern matched line?

Hello, I have below format log file, Comparing csv_converted_files/2201/9747.1012H67126.5077292103609547345.csv and csv_converted_files/22019/97447.1012H67126.5077292103609547345.csv Comparing csv_converted_files/2559/9447.1012H67126.5077292103609547345.csv and... (6 Replies)
Discussion started by: arvindshukla81
6 Replies
fmt(1)								   User Commands							    fmt(1)

NAME
fmt - simple text formatters SYNOPSIS
fmt [-cs] [-w width | -width] [inputfile]... DESCRIPTION
fmt is a simple text formatter that fills and joins lines to produce output lines of (up to) the number of characters specified in the -w width option. The default width is 72. fmt concatenates the inputfiles listed as arguments. If none are given, fmt formats text from the standard input. Blank lines are preserved in the output, as is the spacing between words. fmt does not fill nor split lines beginning with a `.' (dot), for compatibility with nroff(1). Nor does it fill or split a set of contiguous non-blank lines which is determined to be a mail header, the first line of which must begin with "From". Indentation is preserved in the output, and input lines with differing indentation are not joined (unless -c is used). fmt can also be used as an in-line text filter for vi(1). The vi command: !}fmt reformats the text between the cursor location and the end of the paragraph. OPTIONS
-c Crown margin mode. Preserve the indentation of the first two lines within a paragraph, and align the left margin of each subsequent line with that of the second line. This is useful for tagged paragraphs. -s Split lines only. Do not join short lines to form longer ones. This prevents sample lines of code, and other such for- matted text, from being unduly combined. -w width | -width Fill output lines to up to width columns. OPERANDS
inputfile Input file. ENVIRONMENT VARIABLES
See environ(5) for a description of the LC_CTYPE environment variable that affects the execution of fmt. ATTRIBUTES
See attributes(5) for descriptions of the following attributes: +-----------------------------+-----------------------------+ | ATTRIBUTE TYPE | ATTRIBUTE VALUE | +-----------------------------+-----------------------------+ |Availability |SUNWcsu | +-----------------------------+-----------------------------+ SEE ALSO
nroff(1), vi(1), attributes(5), environ(5) NOTES
The -width option is acceptable for BSD compatibility, but it may go away in future releases. SunOS 5.11 9 May 1997 fmt(1)
All times are GMT -4. The time now is 08:39 AM.
Unix & Linux Forums Content Copyright 1993-2022. All Rights Reserved.
Privacy Policy