Newer versions of awk (GNU awk, for example) let you set a multi-character record separator, so "//" followed by a newline can mark the end of each record. After that it is just a matter of finding the "ACC" field and using the field after it as the name of the file to print the record into.
Code:
$ cat hmmer.awk
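# Records end with "//" on its own line; ORS writes that terminator back out.
BEGIN { RS="//\n"; ORS="//\n" }
{
        # Fields are split on whitespace, newlines included, so scan them all for "ACC".
        # Stopping at NF-1 guarantees a field after "ACC" exists.
        for(N=1; N<NF; N++)
                if($N == "ACC")
                {
                        OUT=$(N+1);     # the field after ACC is the output file name
                        printf("Send this record to %s\n", OUT);
                        print > OUT;
                        close(OUT);     # avoid running out of open files
                        break;
                }
}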
$ awk -f hmmer.awk data
Send this record to PF10417.4
Send this record to PF12574.3
$ cat PF10417.4
HMMER3/b [3.0 | March 2010]
NAME 1-cysPrx_C
ACC PF10417.4
DESC C-terminal domain of 1-Cys peroxiredoxin
LENG 40
ALPH amino
RF no
CS yes
MAP yes
.....more data...
0.00103 6.88015 * 0.61958 0.77255 0.00000 *
//
$ cat PF12574.3
HMMER3/b [3.0 | March 2010]
NAME 120_Rick_ant
ACC PF12574.3
DESC 120 KDa Rickettsia surface antigen
LENG 255
ALPH amino
RF no
CS no
MAP yes
DATE Tue Sep 27 11:43:56 2011
NSEQ 7
//
$
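If your awk only honors the first character of RS, the same split can be done line by line. What follows is a minimal sketch, not a drop-in replacement: split_records is a name I made up for illustration, and it assumes every record has a line whose first field is "ACC" and ends with a line that is exactly "//".

```shell
# split_records FILE
# Hypothetical helper: writes each "//"-terminated record of FILE to a file
# named after the value on its ACC line. Uses only single-character-RS awk features.
split_records() {
    awk '
    $1 == "ACC" { out = $2 }              # remember the accession as the file name
    { rec = rec $0 "\n" }                 # buffer every line, the "//" line included
    $0 == "//" {
        if (out != "") {
            printf "%s", rec > out        # write the buffered record in one go
            close(out)                    # keep the number of open files small
        }                                 # records without an ACC line are skipped
        rec = ""; out = ""                # reset for the next record
    }
    ' "$1"
}
```

Run it as `split_records data` in the directory where the output files should land. Like the version above, a repeated ACC value overwrites the earlier file.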