Post 302637143 by fozrun on Tuesday 8th of May 2012 11:38:19 AM
Splitting large file and renaming based on field

I am trying to update an older program on a small cluster. It uses individual files to send jobs to each node. However, the newer database comes as one large file containing over 10,000 records, so I need to split it into one file per record. The file looks like this:

Code:
HMMER3/b [3.0 | March 2010]
NAME  1-cysPrx_C
ACC   PF10417.4
DESC  C-terminal domain of 1-Cys peroxiredoxin
LENG  40
ALPH  amino
RF    no
CS    yes
MAP   yes
.....more data...
          0.00103  6.88015        *  0.61958  0.77255  0.00000        *
//
HMMER3/b [3.0 | March 2010]
NAME  120_Rick_ant
ACC   PF12574.3
DESC  120 KDa Rickettsia surface antigen
LENG  255
ALPH  amino
RF    no
CS    no
MAP   yes
DATE  Tue Sep 27 11:43:56 2011
NSEQ  7
... etc..

Each record starts with a line beginning HMMER3/b and ends with a line containing only //.

I would like each individual file to be named after that record's ACC field, such as PF10417.4 or PF10417 (it doesn't matter whether the part after the dot is kept).

Any clues?
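
One possible approach, offered as a minimal sketch rather than a tested solution for the full database: let awk buffer each record, remember the value on the ACC line, and write the buffer to a file of that name when the // terminator is reached. The function name split_by_acc and its two arguments (input file, output directory) are hypothetical names chosen for illustration. The sketch assumes every record has an ACC line and ends with //, as described above.

```shell
#!/bin/sh
# Hypothetical helper (name and arguments are assumptions, not from the post):
#   $1 = the large concatenated file, $2 = an existing output directory.
split_by_acc() {
    awk -v outdir="$2" '
        # Collect every line of the current record in a buffer.
        { buf = buf $0 "\n" }

        # Remember the accession when the ACC line goes by.
        $1 == "ACC" { acc = $2 }

        # A line holding only "//" closes the record: write it out and reset.
        NF == 1 && $1 == "//" {
            if (acc != "") {
                out = outdir "/" acc
                printf "%s", buf > out
                close(out)   # avoid running out of open files with 10,000+ records
            }
            # Records with no ACC line are silently skipped in this sketch.
            buf = ""
            acc = ""
        }
    ' "$1"
}
```

It could be called as, for example, split_by_acc bigfile outdir, where both names are placeholders. To name the files PF10417 instead of PF10417.4, one option is to strip the suffix with sub(/\.[0-9]+$/, "", acc) just before the filename is built.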
 
