Post 302476661 by Fahmida on Thursday 2nd of December 2010 07:50:00 AM
bash script to parse sequence...

Hi,

I have 4000 list files and 4000 sequence data files. Each list file contains a number of headers, and each data file contains headers followed by sequence data. I would like to extract the records named in the list file from the data file and write them to a new file. As the files are quite large, an efficient script (preferably bash) would be much appreciated. Example below:

Example list file:
contig00002 length=653   numreads=34
contig00005 length=636   numreads=21
contig00015 length=662   numreads=51
contig00033 length=584   numreads=24
contig00045 length=539   numreads=19
contig00073 length=454   numreads=67
contig00046 length=660   numreads=27
contig00014 length=746   numreads=18
contig00089 length=298   numreads=19
.....
.....
Example data file:
>contig00001 length=477   numreads=22
GGGGCTGACGTGGCCGCTAATACGACTCACTATAGGGAGAGTAAGTGAAT
GTCACATCGTTTGGATCAAGACCCATTTGCAGCACAAGCCCTGTTTTGTT
>contig00002 length=530   numreads=27
GGGCTGACGTGGCCGCTAATACGACTCACTATAGGGAGAGGAGGATAGGG
AGCTGAGCAGCCAGTGACAGGATCCAGCTCCAGGGGGTGAATGGGGATGG
>contig00004 length=670   numreads=22
GGGGCTGACGTGGCCGCTAATACGACTCACTATAGGGAGAGATTGTTGAA
GTGGAAAGCCATTTTGACTATTACCGCCCGGTGGCAGAAACCAAACCTGG
.....
....
Example output file:
>contig00002 length=653   numreads=34
GGGCAGCTGCGGCCGCTAATACGACTCACTATAGGGAGAGGCTTGCTCAA
ATCCGCGTTCAAGGATTTCCAGATTGGTAAGAACTTCAGATTCCTTGACG
>contig00005 length=636   numreads=21
GGGCAGCTGCGGCCGCTAATACGACTCACTATAGGGAGAGATCGTGGCGA
TCGCCAATCACCCAGGTGCCGTTAGCCAGAGCTGGTTTGATGACCGTTTC
>contig00015 length=662   numreads=51
GGGCAGCTGCGGCCGCTAATACGACTCACTATAGGGAGAGAGCTCCAGCA
GAATGGACACGCCTCCTGAGCTGTGATAGGGAGAGCATAAACACGCCTCC
.....
.....
Thanks in advance.
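
One possible approach, offered as a sketch rather than a definitive answer: read the list file into an awk array, then stream the data file once and print only the records whose header matches. The function name `extract_records` and the sample file names are made up for illustration. The sketch assumes that a record should be selected when the contig ID (the first field of the header, without the leading `>`) appears in the list file; if the whole header line has to match instead, the key would need to change. It prints the header as it appears in the data file.

```shell
#!/usr/bin/env bash
# Sketch only: pull the records named in a list file out of a FASTA-style data file.
# Assumption: a record is selected when the first field of its ">" header
# (the contig ID) equals the first field of some line in the list file.

extract_records() {
    # $1 = list file, $2 = data file; selected records are written to stdout.
    awk '
        # List file: remember every contig ID, skipping blank lines.
        FILENAME == ARGV[1] { if ($1 != "") wanted[$1] = 1; next }
        # Data file header: strip the ">" and decide whether to keep this record.
        /^>/ { id = substr($1, 2); keep = (id in wanted) }
        # Print the header and the sequence lines of a kept record.
        keep
    ' "$1" "$2"
}

# Small demo built from the sample lines in the post (hypothetical file names).
tmp=$(mktemp -d)
printf '%s\n' \
    'contig00002 length=653   numreads=34' \
    'contig00005 length=636   numreads=21' > "$tmp/sample.list"
printf '%s\n' \
    '>contig00001 length=477   numreads=22' \
    'GGGGCTGACGTGGCCGCTAATACGACTCACTATAGGGAGAGTAAGTGAAT' \
    'GTCACATCGTTTGGATCAAGACCCATTTGCAGCACAAGCCCTGTTTTGTT' \
    '>contig00002 length=530   numreads=27' \
    'GGGCTGACGTGGCCGCTAATACGACTCACTATAGGGAGAGGAGGATAGGG' \
    'AGCTGAGCAGCCAGTGACAGGATCCAGCTCCAGGGGGTGAATGGGGATGG' \
    '>contig00004 length=670   numreads=22' \
    'GGGGCTGACGTGGCCGCTAATACGACTCACTATAGGGAGAGATTGTTGAA' \
    'GTGGAAAGCCATTTTGACTATTACCGCCCGGTGGCAGAAACCAAACCTGG' > "$tmp/sample.fna"

result=$(extract_records "$tmp/sample.list" "$tmp/sample.fna")
printf '%s\n' "$result"
rm -rf "$tmp"

# For many file pairs, a loop along these lines could be used. The naming
# scheme (NAME.list next to NAME.fna) is an assumption, not from the post:
#   for list in *.list; do
#       extract_records "$list" "${list%.list}.fna" > "${list%.list}.out"
#   done
```

In the demo only the contig00002 record is printed, because contig00005 does not occur in the sample data. Each data file is read once and each header costs a single array lookup, so the running time should grow roughly linearly with file size.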
 
