Removing specific lines using Unix


 
# 1  
08-10-2010

Hello everyone,

I have a FASTA file in the following format:

Code:
>Sic_7657.x01 bhg|7675859 info:546474
ATGCTAGATGCTAGCTAGCTAGCTGCT
CGTAGCTAGTCGTAGCTGATGCTAGGC
CGATG
>Sic_7657.x1 bhg|76675 info:546474
CGATGCTGATGCTGATCGTGATCTGTC
CAGTCGAGCTGATGTCGTATGCGGGTG
GCTAGCTA
>Sic_7658.y1 bhg|76675 info:546474
CGATGCTGATGCTGATCGTGATCTGTC
CAGTCGAGCTGATGTCGTATGCGGGTG
GCTAGCTA

I want to read the lines starting with ">" and remove those lines, together with the sequence data that follows them (e.g. ATGCTATC), when the title looks like this:
Code:
>Sic_7657.x01

I do not want sequences whose titles contain ".x01" in my result file. The digits before the '.' (dot) in the title can be anything; they do not follow any specific format.

Is there an easy way to do it?

Thanks!
# 2  
08-11-2010
I'm not sure I understand what you want to do.
The following script removes each '>' line whose title ends in '.x01', together with the sequence lines that follow it.
Code:
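awk '
# Split the input into one record per ">" and print records back to back.
BEGIN { RS=">" ; ORS="" }
# Skip the empty record before the first ">" and any title ending in ".x01".
NF && $1 !~ /\.x01$/ { print ">" $0 }
' inputfile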

Result:
Code:
>Sic_7657.x1 bhg|76675 info:546474
CGATGCTGATGCTGATCGTGATCTGTC
CAGTCGAGCTGATGTCGTATGCGGGTG
GCTAGCTA
>Sic_7658.y1 bhg|76675 info:546474
CGATGCTGATGCTGATCGTGATCTGTC
CAGTCGAGCTGATGTCGTATGCGGGTG
GCTAGCTA
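
The same filter can also be sketched line by line, without changing RS. This is only a sketch under assumptions: the sample data is a shortened copy of the example from post #1, and `sample`, `filtered` and the awk flag `keep` are hypothetical names chosen for illustration.

```shell
# Build a small sample file (same layout as the example in post #1).
sample=$(mktemp)
cat > "$sample" <<'EOF'
>Sic_7657.x01 bhg|7675859 info:546474
ATGCTAGATGCTAGCTAGCTAGCTGCT
CGATG
>Sic_7657.x1 bhg|76675 info:546474
CGATGCTGATGCTGATCGTGATCTGTC
GCTAGCTA
>Sic_7658.y1 bhg|76675 info:546474
CAGTCGAGCTGATGTCGTATGCGGGTG
EOF

# On each header line, decide whether the record is kept: keep is false
# when the title (first field) ends in ".x01". Every line, header or
# sequence, is printed only while keep is true.
filtered=$(awk '/^>/ { keep = ($1 !~ /\.x01$/) } keep' "$sample")
printf '%s\n' "$filtered"
rm -f "$sample"
```

Because the decision is made once per header and then reused for the sequence lines below it, this variant should not depend on the ".x01" record being first in the file.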

Jean-Pierre.
# 3  
08-11-2010
Code:
# sed -e '/^>/{x;s/.*//;x;/^>[^ ]*\.x01/{x;s/.*/skip/;x;};}' -e 'x;/^skip$/{x;d;};x' infile
>Sic_7657.x1 bhg|76675 info:546474
CGATGCTGATGCTGATCGTGATCTGTC
CAGTCGAGCTGATGTCGTATGCGGGTG
GCTAGCTA
>Sic_7658.y1 bhg|76675 info:546474
CGATGCTGATGCTGATCGTGATCTGTC
CAGTCGAGCTGATGTCGTATGCGGGTG
GCTAGCTA

# 4  
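08-11-2010
sed '/>/d' < input > outputfile

Note that this deletes every line containing ">", i.e. all of the title lines, and keeps all of the sequence data, so it does not remove the ".x01" records.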
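
Whichever command is used, a quick sanity check is to compare the number of title lines before and after filtering. The sketch below assumes a shortened copy of the sample from post #1 and uses the awk filter from post #2; `src`, `dst`, `before`, `after` and `dropped` are hypothetical names.

```shell
# Hypothetical temporary files; the sample mirrors the layout from post #1.
src=$(mktemp)
dst=$(mktemp)
cat > "$src" <<'EOF'
>Sic_7657.x01 bhg|7675859 info:546474
ATGCTAGATGCTAGCTAGCTAGCTGCT
>Sic_7657.x1 bhg|76675 info:546474
CGATGCTGATGCTGATCGTGATCTGTC
>Sic_7658.y1 bhg|76675 info:546474
GCTAGCTA
EOF

# Filter with the awk command from post #2.
awk 'BEGIN { RS=">" ; ORS="" } NF && $1 !~ /\.x01$/ { print ">" $0 }' "$src" > "$dst"

# Count title lines before and after, and how many titles contain ".x01".
before=$(grep -c '^>' "$src")
after=$(grep -c '^>' "$dst")
dropped=$(grep -c '^>[^ ]*\.x01' "$src")
echo "titles before: $before, after: $after, expected to drop: $dropped"
rm -f "$src" "$dst"
```

If `before - dropped` does not equal `after`, the filter removed too many or too few records.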