Sponsored Content
Top Forums Shell Programming and Scripting Splitting large file and renaming based on field Post 302637143 by fozrun on Tuesday 8th of May 2012 11:38:19 AM
Old 05-08-2012
Splitting large file and renaming based on field

I am trying to update an older program on a small cluster. It uses individual files to send jobs to each node. However the newer database comes as one large file, containing over 10,000 records. I therefore need to split this file. It looks like this:

Code:
HMMER3/b [3.0 | March 2010]
NAME  1-cysPrx_C
ACC   PF10417.4
DESC  C-terminal domain of 1-Cys peroxiredoxin
LENG  40
ALPH  amino
RF    no
CS    yes
MAP   yes
.....more data...
          0.00103  6.88015        *  0.61958  0.77255  0.00000        *
//
HMMER3/b [3.0 | March 2010]
NAME  120_Rick_ant
ACC   PF12574.3
DESC  120 KDa Rickettsia surface antigen
LENG  255
ALPH  amino
RF    no
CS    no
MAP   yes
DATE  Tue Sep 27 11:43:56 2011
NSEQ  7
... etc..

Each record starts with HMMER3/b and ends with //

I would like each individual file named after the ACC field, such as PF10417.4 or PF10417 (the . doesn't matter)

Any clues?
 

10 More Discussions You Might Find Interesting

1. UNIX for Dummies Questions & Answers

Splitting a large log file

Okay, absolute newbie here... I'm on a Mac trying to split an almost 2 Gig log file on a Unix box into manageable chunks for my web-based log analysis tool. What do I need to do, what programs do I need to do it? All and any help appreciated/needed :-) Cheers (8 Replies)
Discussion started by: simmonet
8 Replies

2. Shell Programming and Scripting

split large file based on field criteria

I have a file containing date/time sorted data of the form ... 2009/06/10,20:59:59.950,XAG/USD,Q,1,1115, 14.3025,100,1,1 2009/06/10,20:59:59.950,XAG/USD,Q,1,1116, 14.3026,125,1,1 2009/06/10,20:59:59.950,XAG/USD,R,0,0, , 0,0,0 2009/06/10,20:59:59.950,XAG/USD,R,1,0, 14.1910,100,1,1... (6 Replies)
Discussion started by: asriva
6 Replies

3. Shell Programming and Scripting

awk - splitting 1 large file into multiple based on same key records

Hello gurus, I am new to "awk" and trying to break a large file having 4 million records into several output files each having half million but at the same time I want to keep the similar key records in the same output file, not to exist accross the files. e.g. my data is like: Row_Num,... (6 Replies)
Discussion started by: kam66
6 Replies

4. Shell Programming and Scripting

Splitting large file into multiple files in unix based on pattern

I need to write a shell script for below scenario My input file has data in format: qwerty0101TWE 12345 01022005 01022005 datainala alanfernanded 26 qwerty0101mXZ 12349 01022005 06022008 datainalb johngalilo 28 qwerty0101TWE 12342 01022005 07022009 datainalc hitalbert 43 qwerty0101CFG 12345... (19 Replies)
Discussion started by: jimmy12
19 Replies

5. Shell Programming and Scripting

Problem with splitting large file based on pattern

Hi Experts, I have to split huge file based on the pattern to create smaller files. The pattern which is expected in the file is: Master..... First... second.... second... third.. third... Master... First.. second... third... Master... First... second.. second.. second..... (2 Replies)
Discussion started by: saisanthi
2 Replies

6. Shell Programming and Scripting

Sed: Splitting A large File into smaller files based on recursive Regular Expression match

I will simplify the explaination a bit, I need to parse through a 87m file - I have a single text file in the form of : <NAME>house........ SOMETEXT SOMETEXT SOMETEXT . . . . </script> MORETEXT MORETEXT . . . (6 Replies)
Discussion started by: sumguy
6 Replies

7. UNIX for Dummies Questions & Answers

[Solved] File Splitting And Renaming Problem

OK So I Recently Bought A whatbox Seed-box Act!!:cool: I am connected to whatbox via SSH!!! Now i have downloaded a movie and renamed it to 2yify.mp4 (800MB):o When I TYPE the command to split it which is:) split -b 400m 2yify.mp4 It gets renamed into two parts with different names... (4 Replies)
Discussion started by: anime12345
4 Replies

8. Shell Programming and Scripting

Help with Splitting a Large XML file based on size AND tags

Hi All, This is my first post here. Hoping to share and gain knowledge from this great forum !!!! I've scanned this forum before posting my problem here, but I'm afraid I couldn't find any thread that addresses this exact problem. I'm trying to split a large XML file (with multiple tag... (7 Replies)
Discussion started by: Aviktheory11
7 Replies

9. Shell Programming and Scripting

Splitting file into multiple files and renaming them

Hi all, Newbie here. First of all, sorry if I made any mistakes while posting this question in terms of rules. Correct me if I am wrong. :b: I have a .dat file whose name is in the format of 20170311_abc_xyz.dat. The file consists of records whose first column consists of multiple dates in... (2 Replies)
Discussion started by: chanduris
2 Replies

10. UNIX for Beginners Questions & Answers

Splitting the XML file and renaming the files

Hello Gurus, I have a requirement to split the xml file into different xml files. Can you please help me with that? Here is my Source XML file <jms-system-resource> <name>PS6SOAJMSModule</name> <target>soa_server1</target> <sub-deployment> ... (3 Replies)
Discussion started by: Siv51427882
3 Replies
hmmconvert(1)							   HMMER Manual 						     hmmconvert(1)

NAME
hmmconvert - convert profile file to a HMMER format SYNOPSIS
hmmconvert [options] <hmmfile> DESCRIPTION
The hmmconvert utility converts an input profile file to different HMMER formats. By default, the input profile can be in any HMMER format, including old/obsolete formats from HMMER2, ASCII or binary; the output profile is a current HMMER3 ASCII format. OPTIONS
-h Help; print a brief reminder of command line usage and all available options. -a Output profiles in ASCII text format. This is the default. -b Output profiles in binary format. -2 Output in legacy HMMER2 ASCII text format, in ls (glocal) mode. This allows HMMER3 models to be converted back to a close approxima- tion of HMMER2, for comparative studies. --outfmt <s> Output in a HMMER3 ASCII text format other then the most current one. Valid choices for <s> are 3/b or 3/a. The current format is 3/b, and this is the default. There is a slightly different format 3/a that was used in some alpha test code. SEE ALSO
See hmmer(1) for a master man page with a list of all the individual man pages for programs in the HMMER package. For complete documentation, see the user guide that came with your HMMER distribution (Userguide.pdf); or see the HMMER web page (@HMMER_URL@). COPYRIGHT
@HMMER_COPYRIGHT@ p@HMMER_LICENSE@ For additional information on copyright and licensing, see the file called COPYRIGHT in your HMMER source distribution, or see the HMMER web page (@HMMER_URL@). AUTHOR
Eddy/Rivas Laboratory Janelia Farm Research Campus 19700 Helix Drive Ashburn VA 20147 USA http://eddylab.org " HMMER
@HMMER_VERSION@ @HMMER_DATE@ hmmconvert(1)
All times are GMT -4. The time now is 11:09 AM.
Unix & Linux Forums Content Copyright 1993-2022. All Rights Reserved.
Privacy Policy