Help parsing and replacing text with file name
Hi everyone,
I'm having trouble figuring this one out. I have ~100 *.fa files, each containing multiple FASTA sequences, like this:

file1.fa
    >xyzsequence
    atcatgcacac......
    ataccgagagg.....
    atataccagag.....
    >abcsequence
    atgagatatat.....
    acacacggd.....
    atcgaacac....
    agttccagat....

Each sequence name is preceded by a ">" and followed by a newline. I'm trying to figure out how to iterate through all of my files with a ".fa" extension and create a single tab-delimited table with the name of the sequence, a tab, and the name of the file it came from. Like so:

    xyzsequence	file1
    abcsequence	file1
    somsequence	file2
    etc...

Can anyone point me in the right direction?

Many thanks,
Hi Zaxxon,
Thanks a million! I didn't want the actual sequence, just the sequence name, so I used some of your code and bits of other things that I pieced together. This is hideous and long (I know) but it works. Next week I'll try to learn to pipe.
Code:
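# List header lines as "file.fa:>name", one per sequence
grep '^>' *.fa > new
# Turn the ".fa:>" separator into a tab (\t in sed needs GNU sed)
sed -e 's/\.fa:>/\t/' new > new2
# Swap the two columns so the sequence name comes first
perl -e '
  @cols = (1, 0);
  while (<>) {
    s/\r?\n//;
    @F = split /\t/, $_;
    print join("\t", @F[@cols]), "\n";
  }
  warn "\nChose columns ", join(", ", @cols), " for $. lines\n\n";
' new2 > new3
rm new new2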
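For comparison, the same three steps (pick out the header lines, drop the ".fa:>" separator, swap the columns) can probably be chained with pipes so that no temporary files are needed. The following is only a sketch under a few assumptions: the function name fa_table is made up for illustration, grep is assumed to support -H (GNU and BSD grep do, but it is not in POSIX), and sequence names are assumed not to contain ":>" themselves.

```shell
# Hypothetical helper (not from the thread): print "name<TAB>file" for every
# FASTA header line in the files given as arguments.
fa_table() {
  # -H makes grep prefix each match with its file name even for a single
  # file, so every line looks like "file1.fa:>xyzsequence".
  grep -H '^>' "$@" |
    # Split on ":>", strip the ".fa" suffix from the file name and any
    # trailing carriage return from the name, then print name first.
    awk -F':>' '{
      sub(/\.fa$/, "", $1)
      sub(/\r$/, "", $2)
      print $2 "\t" $1
    }'
}

# Assumed usage, run from the directory holding the files:
#   fa_table *.fa > new3
```

Because awk does both the splitting and the column swap, the separate sed and perl steps are not needed in this variant.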