Removing several lines from a file
Post 302503543 by kkohl78 on Thursday 10th of March 2011 11:22:46 PM

Hi folks, I have a long file of DNA sequences, and I need to remove several header lines, as well as the line directly following each of them. For example, here is a sample of my starting material:

Code:
>548::GY31UMJ02DLYEH rank=0007170 x=1363.5 y=471.0 length=478
AAAAACTGGAGTTTGATCATGGCTCAGGATGAACGCTGGCGGCATGCTTTACACATGCAAGTCGAACGGGAAGTG
>548::GY31UMJ02DMQQU rank=0007956 x=1372.0 y=340.0 length=447
AAAAACTGGAGTTTGATCATGGCTCAGGATGAACGCTGGCGGCATGCTTAACACATGCAAGTCGGACGGGAAGTG
>548::GY31UMJ02DIC9K rank=0008157 x=1322.0 y=1046.0 length=465
AAAAACTGGAGTTTGATCATGGCTCAGGATGAACGCTGGCGGCATGCTTTACACATGCAAGTCGAACGGGAAGTG
>548::GY31UMJ02C9BJX rank=0008439 x=1219.5 y=811.0 length=486
AAAAACTGGAGTTTGATCATGGCTCAGGATGAACGCTGGCGGCATGCTTTACACATGCAAGTCGAACGGGAAGTG
>548::GY31UMJ02DII82 rank=0008459 x=1324.0 y=612.0 length=524
AAAAACTGGAGTTTGATCATGGCTCAGGATGAACGCTGGCGGCGTGCCTAACACATGCAAGTCGAGCGAGAAGCA
>548::GY31UMJ02D0CW0 rank=0008480 x=1527.0 y=722.0 length=482
AAAAACTGGAGTTTGATCATGGCTCAGGATGAACGCTGGCGGCATGCTTTACACATGCAAGTCGAACGGGAAGTG
>548::GY31UMJ02C5JCT rank=0008979 x=1176.0 y=427.0 length=464
AAAAACTGGAGTTTGATCATGGCTCAGGATGAACGCTGGCGGCATGCTTTACACATGCAAGTCGAACGGGAAGTG
>548::GY31UMJ02EWIQD rank=0008983 x=1893.5 y=2115.0 length=481
AAAAACTGGAGTTTGATCATGGCTCAGGATGAACGCTGGCGGCATGCTTTACACATGCAAGTCGAACGGGAAGTG
>548::GY31UMJ02DMVNJ rank=0009035 x=1373.0 y=2605.0 length=392
AAAAACTGGAGTTTGATCATGGCTCAGGATGAACGCTGGCGGCATGCTTTACACATGCAAGTCGAACGGGAAGTG

I also have a list of the IDs whose header lines I need to remove (along with the line after each one, which contains the sequence):

Code:
548::GY31UMJ02DLYEH, 548::GY31UMJ02EWIQD, 548::GY31UMJ02C9BJX

so that the output ends up like this:

Code:
>548::GY31UMJ02DMQQU rank=0007956 x=1372.0 y=340.0 length=447
AAAAACTGGAGTTTGATCATGGCTCAGGATGAACGCTGGCGGCATGCTTAACACATGCAAGTCGGACGGGAAGTG
>548::GY31UMJ02DIC9K rank=0008157 x=1322.0 y=1046.0 length=465
AAAAACTGGAGTTTGATCATGGCTCAGGATGAACGCTGGCGGCATGCTTTACACATGCAAGTCGAACGGGAAGTG
>548::GY31UMJ02DII82 rank=0008459 x=1324.0 y=612.0 length=524
AAAAACTGGAGTTTGATCATGGCTCAGGATGAACGCTGGCGGCGTGCCTAACACATGCAAGTCGAGCGAGAAGCA
>548::GY31UMJ02D0CW0 rank=0008480 x=1527.0 y=722.0 length=482
AAAAACTGGAGTTTGATCATGGCTCAGGATGAACGCTGGCGGCATGCTTTACACATGCAAGTCGAACGGGAAGTG
>548::GY31UMJ02C5JCT rank=0008979 x=1176.0 y=427.0 length=464
AAAAACTGGAGTTTGATCATGGCTCAGGATGAACGCTGGCGGCATGCTTTACACATGCAAGTCGAACGGGAAGTG
>548::GY31UMJ02DMVNJ rank=0009035 x=1373.0 y=2605.0 length=392
AAAAACTGGAGTTTGATCATGGCTCAGGATGAACGCTGGCGGCATGCTTTACACATGCAAGTCGAACGGGAAGTG

Hopefully there is a somewhat easy solution to this?

Thanks so much!!
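
One possible approach, sketched in Python rather than taken from the thread: read the ID list into a set, then walk the sequence file and drop every header line whose ID is in that set, together with the single line that follows it. The sketch assumes each record is exactly one `>` header line plus one sequence line (as in the sample above), and that the ID is the first whitespace-separated token after `>`. The function names `parse_ids` and `filter_records` are made up for illustration, and the sequences in the demo are shortened for brevity.

```python
import re


def parse_ids(id_text):
    """Split a comma- and/or whitespace-separated ID list into a set."""
    return {tok for tok in re.split(r"[,\s]+", id_text) if tok}


def filter_records(lines, remove_ids):
    """Yield lines, skipping each listed header and the one line after it.

    Assumption: every record is a '>' header followed by exactly one
    sequence line, as in the sample data.
    """
    skip_next = False
    for line in lines:
        if skip_next:
            # This is the sequence line belonging to a removed header.
            skip_next = False
            continue
        if line.startswith(">"):
            fields = line[1:].split()
            if fields and fields[0] in remove_ids:
                skip_next = True
                continue
        yield line


# Small demo using two headers from the sample (sequences shortened).
sample = [
    ">548::GY31UMJ02DLYEH rank=0007170 x=1363.5 y=471.0 length=478",
    "AAAAACTGGAG",
    ">548::GY31UMJ02DMQQU rank=0007956 x=1372.0 y=340.0 length=447",
    "AAAAACTGGAG",
]
ids = parse_ids("548::GY31UMJ02DLYEH, 548::GY31UMJ02EWIQD, 548::GY31UMJ02C9BJX")
kept = list(filter_records(sample, ids))
print("\n".join(kept))
```

Because `filter_records` accepts any iterable of lines, an open file object can be passed in directly and the result written out with `writelines`, so the whole file never has to be held in memory. Matching the exact ID token, rather than searching for a substring, avoids accidentally removing a record whose ID merely contains one of the listed IDs.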
 
