12-02-2010
bash script to parse sequence...
Hi,
I have 4000 list files and 4000 sequence data files. Each list file contains a number of headers, and each data file contains headers, each followed by its sequence data. I would like to extract the records named in a list file from the data file and write them to a new file. As each file is quite large, an efficient script (preferably bash) would be much appreciated. Example below:
Example list file:
contig00002 length=653 numreads=34
contig00005 length=636 numreads=21
contig00015 length=662 numreads=51
contig00033 length=584 numreads=24
contig00045 length=539 numreads=19
contig00073 length=454 numreads=67
contig00046 length=660 numreads=27
contig00014 length=746 numreads=18
contig00089 length=298 numreads=19
.....
.....
Example data file:
>contig00001 length=477 numreads=22
GGGGCTGACGTGGCCGCTAATACGACTCACTATAGGGAGAGTAAGTGAAT
GTCACATCGTTTGGATCAAGACCCATTTGCAGCACAAGCCCTGTTTTGTT
>contig00002 length=530 numreads=27
GGGCTGACGTGGCCGCTAATACGACTCACTATAGGGAGAGGAGGATAGGG
AGCTGAGCAGCCAGTGACAGGATCCAGCTCCAGGGGGTGAATGGGGATGG
>contig00004 length=670 numreads=22
GGGGCTGACGTGGCCGCTAATACGACTCACTATAGGGAGAGATTGTTGAA
GTGGAAAGCCATTTTGACTATTACCGCCCGGTGGCAGAAACCAAACCTGG
.....
....
Example output file:
>contig00002 length=653 numreads=34
GGGCAGCTGCGGCCGCTAATACGACTCACTATAGGGAGAGGCTTGCTCAA
ATCCGCGTTCAAGGATTTCCAGATTGGTAAGAACTTCAGATTCCTTGACG
>contig00005 length=636 numreads=21
GGGCAGCTGCGGCCGCTAATACGACTCACTATAGGGAGAGATCGTGGCGA
TCGCCAATCACCCAGGTGCCGTTAGCCAGAGCTGGTTTGATGACCGTTTC
>contig00015 length=662 numreads=51
GGGCAGCTGCGGCCGCTAATACGACTCACTATAGGGAGAGAGCTCCAGCA
GAATGGACACGCCTCCTGAGCTGTGATAGGGAGAGCATAAACACGCCTCC
.....
.....
Thanks in advance.
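One possible approach, as a minimal sketch rather than a definitive answer: read the list file into an awk array, then stream the data file once and print only the records whose header is in that array. The function name extract_records is made up for illustration. It assumes that each line of the list file is exactly the header line of the data file without the leading ">", and that the list file fits in memory (the data file is only streamed, so its size should not matter).

```shell
#!/usr/bin/env bash
# extract_records LIST DATA   (hypothetical helper name)
# Prints every record of DATA whose header line, minus the leading ">",
# appears verbatim as a line of LIST.
# Assumption: the whole header line must match, not just the contig ID.
extract_records() {
    awk '
        # While reading the first file (the list): remember each header.
        FILENAME == ARGV[1] { if ($0 != "") wanted[$0] = 1; next }
        # In the data file, a ">" line starts a new record;
        # decide once per record whether to keep it.
        /^>/ { keep = (substr($0, 2) in wanted) }
        # Print header and sequence lines of kept records.
        keep
    ' "$1" "$2"
}

# Example call for one pair of files (file names are placeholders):
#   extract_records sample.list sample.fna > sample.out
```

For the 4000 file pairs this could be wrapped in a for loop over the list files, provided the name of each data file can be derived from the name of its list file. If only the contig ID should decide the match, the array could be keyed on the first field ($1) instead of the whole line.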