Replacing headers based on a second file


 
Thread Tools Search this Thread
Top Forums Shell Programming and Scripting Replacing headers based on a second file
# 1  
Old 07-28-2011
Replacing headers based on a second file

I have a file with thousands of sequences that looks like this:
Quote:
>SeqID1
AACCATGACAGAGGAGATGTGAACAGATAGAGGGATGACAGATGACAGATAGACCCAGAC
>Sequence 22
AACCATGACAGAGGAGATGTGAACAGATAGAGGGATGACAGATGACAGATAGACCCAGAC
>Seq-39
AACCATGACAGAGGAGATGTGAACAGATAGAGGGATGACAGATGACAGATAGACCCAGAC
I need to replace the headers using a second file
Quote:
HIV
WNV
Flu
Thus, I will end up having the following file:
Quote:
>HIV
AACCATGACAGAGGAGATGTGAACAGATAGAGGGATGACAGATGACAGATAGACCCAGAC
>WNV
AACCATGACAGAGGAGATGTGAACAGATAGAGGGATGACAGATGACAGATAGACCCAGAC
>Flu
AACCATGACAGAGGAGATGTGAACAGATAGAGGGATGACAGATGACAGATAGACCCAGAC
I am looking for an AWK script that I can easily plug in my current pipeline.
Any help will be greatly appreciated!
# 2  
Old 07-28-2011
Hi Xterra Smilie Try this:
Code:
awk 'NR==FNR{a[NR]=$0;next}FNR%2{$0=">"a[int(FNR/2+1)]}1' file2 file1


Last edited by bartus11; 07-28-2011 at 02:50 PM..
This User Gave Thanks to bartus11 For This Post:
# 3  
Old 07-28-2011
This will work:
Code:
#!/usr/bin/ksh
sed -n 's/^>\(.*\)/s#>\1#/p' Large_File > Tmp1
paste -d'>' Tmp1 String_File > Tmp2
sed 's/$/#/' Tmp2 > Sed_File
sed -f Sed_File Large_File

# 4  
Old 07-28-2011
bartus11

As usual, a great solution!
Thank you very, very much!
# 5  
Old 07-28-2011
Code:
awk '/>/ {getline a<"file2";$0=">" a}1' file1

This User Gave Thanks to rdcwayx For This Post:
# 6  
Old 07-29-2011
Yet another approach:
Code:
paste -d\\n file1 <(sed G file2) | sed -n 'n; N; p; n'

Regards,
Alister
# 7  
Old 07-29-2011
rdcwayx

Very nice!
Login or Register to Ask a Question

Previous Thread | Next Thread

10 More Discussions You Might Find Interesting

1. UNIX for Beginners Questions & Answers

Is it possible to rename fasta headers based on its position specified in another file?

I have 5 sequences in a fasta file namely gene1.fasta as follows, gene1.fasta >1256 ATGTAGC >GEP TAGAG >GTY578 ATGCATA >67_iga ATGCTGA >90_ld ATGCTG I need to rename the gene1.fasta file based on the sequence position specified in list.txt as follows, list.txt position1=org5... (5 Replies)
Discussion started by: dineshkumarsrk
5 Replies

2. Shell Programming and Scripting

Replacing 12 columns of one file by second file based on mapping in third file

i have a real data prod file with 80+ fields containing 1k -2k records. i have to extract say 12 columns out of this which are sensitive fields along with one primary key say SEQ_ID (like DOB,account no, name, SEQ_ID, govtid etc) in a lookup file. i have to replace these sensitive fields in... (11 Replies)
Discussion started by: megh12
11 Replies

3. Shell Programming and Scripting

Help with awk replacing identical columns based on another file

Hello, I am using Awk in UBUNTU 12.04. I have a file like following with three fields and 44706 rows. F1 A A F2 G G F3 A T I have another file like this: AL_1 F1 A A AL_2 F1 A T AL_3 F1 A A AL_1 F2 G G AL_2 F2 G A AL_3 F2 G G BO_1 F1 A A BO_2 F1 A T... (6 Replies)
Discussion started by: Homa
6 Replies

4. Shell Programming and Scripting

Finding/replacing strings in some files based on a file

Hi, We have a file (e.g. a .csv file, but could be any other format), with 2 columns: the old value and the new value. We need to modify all the files within the current directory (including subdirectories), so find and replace the contents found in the first column within the file, with the... (9 Replies)
Discussion started by: Talkabout
9 Replies

5. UNIX for Dummies Questions & Answers

Script for replacing text in a file based on list

Hi All, I am fairly new to the world of Unix, and I am looking for a way to replace a line of text in a file with a delimited array of values. I have an aliases file that is currently in use on our mail server that we are migrating off of. Until the migration is complete, the server must stay... (8 Replies)
Discussion started by: phoenixjc
8 Replies

6. Shell Programming and Scripting

Matching words based on column headers

Hi , Pls help on this. Input file: NAME1 BSC1 TEXT ID 1 MAINSFAIL TEXT ID 2 DGON TEXT ID 3 lOADONDG NAME2 BSC2 TEXT ID 1 DGON TEXT ID 3 lOADONG (1 Reply)
Discussion started by: bha148
1 Replies

7. Shell Programming and Scripting

Replacing values in a file based on values in another file

Hi I have 2 files:- 1. List of files which consists of names of some output files. 2. A delimited file; delimted by "|" I want to replace the value of the $23 (23rd column) in the delimited file with name in the first file. It is always position to position. Meaning first row of the first... (5 Replies)
Discussion started by: pparthiv
5 Replies

8. Shell Programming and Scripting

Merging of files with different headers to make combined headers file

Hi , I have a typical situation. I have 4 files and with different headers (number of headers is varible ). I need to make such a merged file which will have headers combined from all files (comman coluns should appear once only). For example - File 1 H1|H2|H3|H4 11|12|13|14 21|22|23|23... (1 Reply)
Discussion started by: marut_ashu
1 Replies

9. Shell Programming and Scripting

Replacing Character in a file based on element

Hi, I have file like below. Unix:/pclls/turc>cat tibc.property executeReceiver=Y executeSender=Y I want to replace executeSender=N in the file. My file should be like below. executeReceiver=Y executeSender=N I tried with the below command, its giving error. cat tibc.property |... (2 Replies)
Discussion started by: senthil_is
2 Replies

10. Shell Programming and Scripting

Remove text between headers while leaving headers intact

Hi, I'm trying to strip all lines between two headers in a file: ### BEGIN ### Text to remove, contains all kinds of characters ... Antispyware-Downloadserver.com (Germany)=http://www.antispyware-downloadserver.c om/updates/ Antispyware-Downloadserver.com #2... (3 Replies)
Discussion started by: Trones
3 Replies
Login or Register to Ask a Question