Post 302637143 by fozrun on Tuesday 8th of May 2012 11:38:19 AM
Splitting large file and renaming based on field

I am trying to update an older program on a small cluster. It uses individual files to send jobs to each node. However, the newer database comes as one large file containing over 10,000 records, so I need to split it into one file per record. It looks like this:

Code:
HMMER3/b [3.0 | March 2010]
NAME  1-cysPrx_C
ACC   PF10417.4
DESC  C-terminal domain of 1-Cys peroxiredoxin
LENG  40
ALPH  amino
RF    no
CS    yes
MAP   yes
.....more data...
          0.00103  6.88015        *  0.61958  0.77255  0.00000        *
//
HMMER3/b [3.0 | March 2010]
NAME  120_Rick_ant
ACC   PF12574.3
DESC  120 KDa Rickettsia surface antigen
LENG  255
ALPH  amino
RF    no
CS    no
MAP   yes
DATE  Tue Sep 27 11:43:56 2011
NSEQ  7
... etc..

Each record starts with a HMMER3/b line and ends with a line containing only //

I would like each output file to be named after the ACC field of its record, such as PF10417.4 or PF10417 (it doesn't matter whether the part after the . is kept)

Any clues?
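
One possible approach, as a minimal sketch rather than a tested production script: buffer each record in awk, remember the value of its ACC line, and write the buffer to a file named after that value when the closing // line is reached. The function name split_hmm, the .hmm extension, and the unnamed_N fallback for records with no ACC line are assumptions made for illustration; none of them come from the original post.

```shell
#!/bin/sh
# split_hmm INPUT [OUTDIR]
# Writes each record of INPUT (terminated by a "//" line) to OUTDIR/<ACC>.hmm.
# The ".hmm" extension and the "unnamed_N" fallback are assumptions.
split_hmm() {
    input=$1
    outdir=${2:-.}
    mkdir -p "$outdir" || return 1
    awk -v outdir="$outdir" '
        # Buffer every line: the ACC line comes after the record header,
        # so the output file name is not known when the record starts.
        { rec = rec $0 "\n" }

        # Remember the first accession seen in this record, e.g. PF10417.4
        $1 == "ACC" && acc == "" { acc = $2 }

        # A line holding only "//" closes the record: flush it to its own file.
        /^\/\/[ \t]*$/ {
            if (acc == "") acc = "unnamed_" (++missing)
            out = outdir "/" acc ".hmm"
            printf "%s", rec > out
            close(out)          # avoid running out of file descriptors
            rec = ""; acc = ""
        }

        # Anything left over is an unterminated record; report it, do not guess.
        END {
            if (rec ~ /[^ \t\n]/)
                print "warning: trailing lines without // were not written" | "cat 1>&2"
        }
    ' "$input"
}
```

With a hypothetical input file called Pfam-A.hmm, running `split_hmm Pfam-A.hmm parts` would leave files such as parts/PF10417.4.hmm and parts/PF12574.3.hmm. Calling close() after each record matters here because, with over 10,000 records, many awk implementations would otherwise hit the limit on simultaneously open files. To drop the version suffix from the file name, `sub(/\..*/, "", acc)` could be applied before `out` is built.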
 

hmmpress(1)                        HMMER Manual                        hmmpress(1)

NAME
       hmmpress - prepare an HMM database for hmmscan

SYNOPSIS
       hmmpress [options] <hmmfile>

DESCRIPTION
       Starting from a profile database <hmmfile> in standard HMMER3 format,
       construct binary compressed datafiles for hmmscan. The hmmpress step
       is required for hmmscan to work.

       Four files are created: <hmmfile>.h3m, <hmmfile>.h3i, <hmmfile>.h3f,
       and <hmmfile>.h3p. The <hmmfile>.h3m file contains the profile HMMs
       and their annotation in a binary format. The <hmmfile>.h3i file is an
       SSI index for the <hmmfile>.h3m file. The <hmmfile>.h3f file contains
       precomputed data structures for the fast heuristic filter (the MSV
       filter). The <hmmfile>.h3p file contains precomputed data structures
       for the rest of each profile.

OPTIONS
       -h     Help; print a brief reminder of command line usage and all
              available options.

       -f     Force; overwrites any previous hmmpress'ed datafiles. The
              default is to bitch about any existing files and ask you to
              delete them first.

SEE ALSO
       See hmmer(1) for a master man page with a list of all the individual
       man pages for programs in the HMMER package.

       For complete documentation, see the user guide that came with your
       HMMER distribution (Userguide.pdf); or see the HMMER web page
       (@HMMER_URL@).

COPYRIGHT
       @HMMER_COPYRIGHT@
       @HMMER_LICENSE@

       For additional information on copyright and licensing, see the file
       called COPYRIGHT in your HMMER source distribution, or see the HMMER
       web page (@HMMER_URL@).

AUTHOR
       Eddy/Rivas Laboratory
       Janelia Farm Research Campus
       19700 Helix Drive
       Ashburn VA 20147 USA
       http://eddylab.org

HMMER @HMMER_VERSION@              @HMMER_DATE@                        hmmpress(1)