how to use split command in unix shell with a condition


 
# 1  
Old 09-12-2012

Hi all,
I have a file that I want to split into several files based on a condition. The file contains several records, and I want one record per file. Each record ends with a //, so that is where the files should be separated. I want each split file to be named after the value in the ID field (for example, AACL2_BRAJA in the record given below). Also, I want the main file to remain as it is.
Code:
Input
  ID   AACL2_BRAJA             Reviewed;         334 AA.
AC   Q89GR3;
DT   30-NOV-2010, integrated into UniProtKB/Swiss-Prot.
DT   01-JUN-2003, sequence version 1.
DT   11-JUL-2012, entry version 42.
DE   RecName: Full=Amino acid--[acyl-carrier-protein] ligase 2;
RP   FUNCTION, CATALYTIC ACTIVITY, SUBSTRATE SPECIFICITY, COFACTOR, KINETIC
RP   PARAMETERS, AND SUBUNIT.
RC   STRAIN=USDA 110;
RX   PubMed=20663952; DOI=10.1073/pnas.1007470107;
RA   Mocibob M., Ivic N., Bilokapic S., Maier T., Luic M., Ban N.,
RA   Weygand-Durasevic I.;
RT   "Homologs of aminoacyl-tRNA synthetases acylate carrier proteins and
RT   provide a link between ribosomal and nonribosomal peptide synthesis.";
RL   Proc. Natl. Acad. Sci. U.S.A. 107:14585-14590(2010).
CC   -!- FUNCTION: Catalyzes the ATP-dependent activation of L-glycine and
CC       its transfer to the phosphopantetheine prosthetic group covalently
CC       attached to the vicinal carrier protein blr6284 of yet unknown
CC       function. May participate in nonribosomal peptide synthesis or
CC       related processes. L-alanine is a poor substrate whereas L-serine
MNLAIVEAPA DSTPPPADPL DHLADALFHE MGSPGVYGRT ALYEDVVERI AAVISRNREP
     NTEVMRFPPV MNRAQLERSG YLKSFPNLLG CVCGLHGIES EIDAAISRFD AGGDWTESLS
     PADLVLSPAA CYPLYPIAAS RGPVPAAGWS FDVAADCFRR EPSRHLDRLQ SFRMREFVCI
     GSADHVSAFR ERWIIRAQKI ARDLGLTFRI DHANDPFFGR VGQMMAVSQK QLSLKFELLV
     PLRSEERPTA CMSFNYHRDH FGTTWGIVDA AGEPAHTACV AFGMDRLAVA MFHTHGKDVA
     LWPIAVRDLL GLAQTDRGAP SAFEEYRCAK EAGS
//

Code:
Expected output
One split file per record, named after its ID (in this case AACL2_BRAJA.txt)
A new file starts after each //

Any help would be much appreciated. Thanks in advance.

Last edited by kaav06; 09-12-2012 at 11:09 PM..
# 2  
Old 09-12-2012
I think this will do what you are wanting:

Code:
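awk '
    # An ID line starts a new record: close the previous output file
    # and name the next one after the second field (e.g. AACL2_BRAJA.txt).
    $1 == "ID" {
        if( fname )
            close( fname );
        fname = $2 ".txt";
    }
    # Write every line, including the closing //, to the current file.
    # Lines that appear before the first ID line are skipped.
    fname { print >fname; }
' input-file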


It assumes that "ID" appears as the first token only on the lines that begin a record and supply the next filename. The input file is only read, so it stays unchanged.
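
If you would rather end each file explicitly at the // line, as the question describes, something along these lines might work. It is only a sketch under a few assumptions: the sample data is abbreviated, the second record (DEMO2_TEST, X00000) is made up for illustration, and the split_demo directory and the outdir variable are names chosen here, not anything from the original file.

```shell
#!/bin/sh
# Build a tiny two-record sample in ./split_demo.
# The second record is hypothetical and exists only to show the split.
mkdir -p split_demo
cat > split_demo/sample.dat <<'EOF'
ID   AACL2_BRAJA             Reviewed;         334 AA.
AC   Q89GR3;
     MNLAIVEAPA DSTPPPADPL
//
ID   DEMO2_TEST              Reviewed;          10 AA.
AC   X00000;
     MNLAIVEAPA
//
EOF

# Take the file name from the ID line, write each line to the current
# file, and close that file as soon as the // terminator has been written.
awk -v outdir=split_demo '
    $1 == "ID" { fname = outdir "/" $2 ".txt" }
    fname      { print > fname }
    $1 == "//" { if (fname) close(fname); fname = "" }
' split_demo/sample.dat

# List the results; sample.dat itself is only read, never modified.
ls split_demo
```

Closing each file at // keeps only one output file open at a time, and anything that sits between a // and the next ID line is simply not written anywhere.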
This User Gave Thanks to agama For This Post:
# 3  
Old 09-13-2012
Thanks. Worked like a wonder.

Last edited by kaav06; 09-13-2012 at 02:27 AM..