Editing headers


 
# 1  
Old 12-02-2010

Hi,
I have a folder that contains many files:

1.fasta
2.fasta
3.fasta
4.fasta
5.fasta
...
(hundreds of files)

Each file has data in the following format. For example:
vi 1.fasta
Code:
>AB_1 200bp
MLKKPIIIGVTGGSGGGKTSVSRAILDSFPNARIAMIQHDSYYKDQSHMSFEERVKTNYDHPLAFDTDFM
IQQLKELLAGRPVDIPIYDYKKHTRSNTTFRQDPQDVIIVEGILVLEDERLRDLMDIKLFVDTDDDIRII
RRIKRDMMERGRSLESIIDQYTSVVKPMYHQFIEPSKRYADIVIPEGVSNVVAIDVINSKIASILGEV
>AB_2 200bp
MRARLIYNPTSGQELMRKSVPEVLDILEGFGYETSAFQTTAKKNSALNEARRAAKAGFDLLIAAGGDGTI
NEVVNGIAPLKKRPKMAIIPTGTTNDFARALKVPRGNPSQAAKLIGKNQTIQMDIGRAKKDTYFINIAAA
GSLTELTYSVPSQLKTMFGYLAYLAKGVELLPRVSNVPVKITHDKGVFEGQVSMIFAAITNSVGGFEMIA
PDAKLDDGMFTLILIKTANLFEIVHLLRLILDGGKHITDRRVEYIKTSKIVIEPQCGKRMMINLDGEYGG
DAPITLENLKNHITFFADTDLISDDALVLDQDELEIEEIVKKFAHEVEDLEQELEE
>AB_3 200bp
MTGYDDFNYALSALKLGADDYLLKPFSKADVEDMLGKLRKKLELSKKTETIQELVEQPQKEVSAIAMAIH
ERLADSDLTLKSLAQQLGFSPNYLSVLIKKELGMPFQDYLVQERLKKAKLFLLTSNLKIYEIAEQVGFED
MNYFSQRFKQLVGVTPSQYKKGGQA
>AB_4 200bp
MRARLIYNPTSGQELMRKSVPEVLDILEGFGYETSAFQTTAKKNSALNEARRAAKAGFDLLIAAGGDGTI
NEVVNGIAPLKKRPKMAIIPTGTTNDFARALKVPRGNPSQAAKLIGKNQTIQMDIGRAKKDTYFINIAAA
GSLTELTYSVPSQLKTMFGYLAYLAKGVELLPRVSNVPVKITHDKGVFEGQVSMIFAAITNSVGGFEMIA
PDAKLDDGMFTLILIKTANLFEIVHLLRLILDGGKHITDRRVEYIKTSKIVIEPQCGKRMMINLDGEYGG
DAPITLENLKNHITFFADTDLISDDALVLDQDELEIEEIVKKFAHEVEDLEQELEE
>AB_5 200bp
MRARLIYNPTSGQELMRKSVPEVLDILEGFGYETSAFQTTAKKNSALNEARRAAKAGFDLLIAAGGDGTI
NEVVNGIAPLKKRPKMAIIPTGTTNDFARALKVPRGNPSQAAKLIGKNQTIQMDIGRAKKDTYFINIAAA
GSLTELTYSVPSQLKTMFGYLAYLAKGVELLPRVSNVPVKITHDKGVFEGQVSMIFAAITNSVGGFEMIA
PDAKLDDGMFTLILIKTANLFEIVHLLRLILDGGKHITDRRVEYIKTSKIVIEPQCGKRMMINLDGEYGG
DAPITLENLKNHITFFADTDLISDDALVLDQDELEIEEIVKKFAHEVEDLEQELEE

I would like to edit these files so that each one looks like this:
Code:
>AB_1
MLKKPIIIGVTGGSGGGKTSVSRAILDSFPNARIAMIQHDSYYKDQSHMSFEERVKTNYDHPLAFDTDFM
IQQLKELLAGRPVDIPIYDYKKHTRSNTTFRQDPQDVIIVEGILVLEDERLRDLMDIKLFVDTDDDIRII
RRIKRDMMERGRSLESIIDQYTSVVKPMYHQFIEPSKRYADIVIPEGVSNVVAIDVINSKIASILGEV
>AB_2
MRARLIYNPTSGQELMRKSVPEVLDILEGFGYETSAFQTTAKKNSALNEARRAAKAGFDLLIAAGGDGTI
NEVVNGIAPLKKRPKMAIIPTGTTNDFARALKVPRGNPSQAAKLIGKNQTIQMDIGRAKKDTYFINIAAA
GSLTELTYSVPSQLKTMFGYLAYLAKGVELLPRVSNVPVKITHDKGVFEGQVSMIFAAITNSVGGFEMIA
PDAKLDDGMFTLILIKTANLFEIVHLLRLILDGGKHITDRRVEYIKTSKIVIEPQCGKRMMINLDGEYGG
DAPITLENLKNHITFFADTDLISDDALVLDQDELEIEEIVKKFAHEVEDLEQELEE
>AB_3
MTGYDDFNYALSALKLGADDYLLKPFSKADVEDMLGKLRKKLELSKKTETIQELVEQPQKEVSAIAMAIH
ERLADSDLTLKSLAQQLGFSPNYLSVLIKKELGMPFQDYLVQERLKKAKLFLLTSNLKIYEIAEQVGFED
MNYFSQRFKQLVGVTPSQYKKGGQA
>AB_4
MRARLIYNPTSGQELMRKSVPEVLDILEGFGYETSAFQTTAKKNSALNEARRAAKAGFDLLIAAGGDGTI
NEVVNGIAPLKKRPKMAIIPTGTTNDFARALKVPRGNPSQAAKLIGKNQTIQMDIGRAKKDTYFINIAAA
GSLTELTYSVPSQLKTMFGYLAYLAKGVELLPRVSNVPVKITHDKGVFEGQVSMIFAAITNSVGGFEMIA
PDAKLDDGMFTLILIKTANLFEIVHLLRLILDGGKHITDRRVEYIKTSKIVIEPQCGKRMMINLDGEYGG
DAPITLENLKNHITFFADTDLISDDALVLDQDELEIEEIVKKFAHEVEDLEQELEE
>AB_5
MRARLIYNPTSGQELMRKSVPEVLDILEGFGYETSAFQTTAKKNSALNEARRAAKAGFDLLIAAGGDGTI
NEVVNGIAPLKKRPKMAIIPTGTTNDFARALKVPRGNPSQAAKLIGKNQTIQMDIGRAKKDTYFINIAAA
GSLTELTYSVPSQLKTMFGYLAYLAKGVELLPRVSNVPVKITHDKGVFEGQVSMIFAAITNSVGGFEMIA
PDAKLDDGMFTLILIKTANLFEIVHLLRLILDGGKHITDRRVEYIKTSKIVIEPQCGKRMMINLDGEYGG
DAPITLENLKNHITFFADTDLISDDALVLDQDELEIEEIVKKFAHEVEDLEQELEE

That is, remove everything after >AB_1 on the header line, and the same for >AB_2, and so on.
Please let me know the best way to do this using awk or sed, as I have to edit hundreds of files.
LA
# 2  
Old 12-02-2010
Code:
ls *.fasta | xargs sed -i.b 's/ [^ ]*bp//'

This edits each file in place and keeps a backup copy of the original with a .b suffix (for example 3.fasta.b).
You can then rm *.b if you want to remove the backups,
or you can roll back to the original file from its backup copy:
Code:
cat 3.fasta.b >3.fasta
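
To try the edit and the rollback safely before touching real data, something like the following could be used. This is only a sketch: the scratch directory, the sample record, and the variable names are made up for illustration, and it assumes mktemp is available.

```shell
# Work in a throwaway directory so no real data is touched (assumes mktemp exists).
workdir=$(mktemp -d)
cd "$workdir" || exit 1

# Hypothetical sample file in the format shown in the first post.
printf '>AB_1 200bp\nMLKKPIIIGV\n' > 1.fasta

# Edit in place, keeping the original as 1.fasta.b.
# The suffix is attached to -i, a form that both GNU and BSD sed accept.
sed -i.b 's/ [^ ]*bp//' 1.fasta
edited=$(cat 1.fasta)

# Roll every file back from its backup copy.
for backup in *.fasta.b; do
    [ -e "$backup" ] || continue    # skip if the glob matched nothing
    mv "$backup" "${backup%.b}"     # 1.fasta.b -> 1.fasta
done
restored=$(cat 1.fasta)
```

The loop restores all backups at once, which may be handier than running cat once per file when there are hundreds of them.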

# 3  
Old 12-02-2010
Hi,
Actually, there is a space between 200 and bp. The code you suggested only removed the bp. I would like to remove the entire thing (200 bp), that is, anything after >AB_1, >AB_2, etc.
LA

---------- Post updated at 06:00 PM ---------- Previous update was at 05:50 PM ----------

Hi
I think I got it now. I first ran your code, then ran the same code again while replacing *bp with *\d*.
Code:
ls *.fasta | xargs sed -i .b 's/ [^ ]*\d*//'

I think it worked
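
For what it's worth, both spellings ("200bp" and "200 bp") could probably be handled in a single pass. The sketch below uses the bracket expression [0-9] for the digits, since \d is a Perl-style shorthand that POSIX sed patterns do not define. The sample lines and variable names are invented for illustration.

```shell
# Hypothetical lines covering both header spellings, plus a line with a space
# that is not a header and should be left alone.
headers=$(printf '>AB_1 200bp\n>AB_2 200 bp\nMLKK PIII\n')

# [0-9]* matches the digits and " *" the optional space before "bp".
# The /^>/ address restricts the substitution to header lines.
cleaned=$(printf '%s\n' "$headers" | sed '/^>/ s/ [0-9]* *bp$//')
printf '%s\n' "$cleaned"
```

Once the output looks right on a sample, the same expression can be used with sed -i.b on the real files.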
# 4  
Old 12-02-2010
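I'm not sure that is safe if your files have spaces in the sequence content and not only in the ">" header lines.

Here is a command that removes everything after the first space, but only on lines that start with ">". It also keeps a .b backup of each file:

Code:
ls *.fasta | xargs sed -i.b '/^>/ s/ .*//'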

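You can also use awk or cut, but neither can update the file directly. You need to write the output to a temporary file and then rename it over the original.

Code:
awk 'NF=1' 1.fasta
# cut splits on tabs by default, so set the delimiter to a space
cut -d' ' -f1 1.fasta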
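
The temp-file-and-rename step could look roughly like this for a whole folder. It is a sketch only: the scratch directory, the sample files, and the .tmp suffix are assumptions for illustration, and the awk program is a variant that only touches header lines.

```shell
# Throwaway directory with two hypothetical sample files (assumes mktemp exists).
workdir=$(mktemp -d)
printf '>AB_1 200 bp\nMLKKPIIIGV\n' > "$workdir/1.fasta"
printf '>AB_2 150 bp\nMRARLIYNPT\n' > "$workdir/2.fasta"

for f in "$workdir"/*.fasta; do
    tmp="$f.tmp"
    # Print only the first field of header lines; pass other lines through unchanged.
    # The original is replaced only if awk succeeded.
    awk '/^>/ { print $1; next } { print }' "$f" > "$tmp" && mv "$tmp" "$f"
done

first=$(cat "$workdir/1.fasta")
second=$(cat "$workdir/2.fasta")
```

Writing to a separate file first means a failed run leaves the original untouched.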

# 5  
Old 12-02-2010
In your file, is the following:

Code:
>AB_1 200bp
MLKKPIIIGVTGGSGGGKTSVSRAILDSFPNARIAMIQHDSYYKDQSHMSFEERVKTNYDHPLAFDTDFM
IQQLKELLAGRPVDIPIYDYKKHTRSNTTFRQDPQDVIIVEGILVLEDERLRDLMDIKLFVDTDDDIRII
RRIKRDMMERGRSLESIIDQYTSVVKPMYHQFIEPSKRYADIVIPEGVSNVVAIDVINSKIASILGEV

one long line or four lines?
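
One way to check this would be to count how many sequence lines follow each header. The sketch below does that with awk on a made-up sample; the sample data and variable names are assumptions for illustration.

```shell
# Hypothetical sample: one record wrapped over three lines, one on a single line.
sample=$(printf '>AB_1 200bp\nMLKK\nIQQL\nRRIK\n>AB_2 200bp\nMRAR\n')

# For each header, report how many sequence lines follow it.
counts=$(printf '%s\n' "$sample" | awk '
    /^>/ { if (name != "") print name, n; name = $1; n = 0; next }
    { n++ }
    END  { if (name != "") print name, n }')
printf '%s\n' "$counts"
```

A count above 1 for a record means its sequence is wrapped over several lines.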