Sponsored Content
Top Forums Shell Programming and Scripting how to use split command in unix shell with a condition Post 302700039 by kaav06 on Wednesday 12th of September 2012 10:04:19 PM
Old 09-12-2012
how to use split command in unix shell with a condition

Hi all,
I have a file which I want to split into several files based on a condition. This files has several records. I want one record per file. Each record ends with a //. So, I want to separate files based on this condition. I want split files to be named with the name across the field ID (for example: AACL2_BRAJA in the name give below). Also, I want the main file to be as such.
Code:
Input
  ID   AACL2_BRAJA             Reviewed;         334 AA.
AC   Q89GR3;
DT   30-NOV-2010, integrated into UniProtKB/Swiss-Prot.
DT   01-JUN-2003, sequence version 1.
DT   11-JUL-2012, entry version 42.
DE   RecName: Full=Amino acid--[acyl-carrier-protein] ligase 2;
RP   FUNCTION, CATALYTIC ACTIVITY, SUBSTRATE SPECIFICITY, COFACTOR, KINETIC
RP   PARAMETERS, AND SUBUNIT.
RC   STRAIN=USDA 110;
RX   PubMed=20663952; DOI=10.1073/pnas.1007470107;
RA   Mocibob M., Ivic N., Bilokapic S., Maier T., Luic M., Ban N.,
RA   Weygand-Durasevic I.;
RT   "Homologs of aminoacyl-tRNA synthetases acylate carrier proteins and
RT   provide a link between ribosomal and nonribosomal peptide synthesis.";
RL   Proc. Natl. Acad. Sci. U.S.A. 107:14585-14590(2010).
CC   -!- FUNCTION: Catalyzes the ATP-dependent activation of L-glycine and
CC       its transfer to the phosphopantetheine prosthetic group covalently
CC       attached to the vicinal carrier protein blr6284 of yet unknown
CC       function. May participate in nonribosomal peptide synthesis or
CC       related processes. L-alanine is a poor substrate whereas L-serine
MNLAIVEAPA DSTPPPADPL DHLADALFHE MGSPGVYGRT ALYEDVVERI AAVISRNREP
     NTEVMRFPPV MNRAQLERSG YLKSFPNLLG CVCGLHGIES EIDAAISRFD AGGDWTESLS
     PADLVLSPAA CYPLYPIAAS RGPVPAAGWS FDVAADCFRR EPSRHLDRLQ SFRMREFVCI
     GSADHVSAFR ERWIIRAQKI ARDLGLTFRI DHANDPFFGR VGQMMAVSQK QLSLKFELLV
     PLRSEERPTA CMSFNYHRDH FGTTWGIVDA AGEPAHTACV AFGMDRLAVA MFHTHGKDVA
     LWPIAVRDLL GLAQTDRGAP SAFEEYRCAK EAGS
//

Code:
Expected output
split files named as  their ID (in this case AACL2_BRAJA.txt)
files separated when there is //

Any help would be much appreciated. Thanks in advance.

Last edited by kaav06; 09-12-2012 at 11:09 PM..
 

10 More Discussions You Might Find Interesting

1. Shell Programming and Scripting

awk script to split a file based on the condition

I have the file with the records like 4234234 US phone 3244234 US cup 2342342 CA phone 8947234 US phone 2389472 CA cup 2348972 US maps 3894234 CA phone I want the records with (US,phone) as record to be in one file, (Us, cup) in another file and (CA,cup) to be in another I mean all... (12 Replies)
Discussion started by: superprogrammer
12 Replies

2. Shell Programming and Scripting

How to split the String based on condition?

hi , I have a String str="/opt/ibm/lotus/ibw/latest" or ="/opt/lotus/ibw/latest" this value is dynamic..I want to split this string into 2 strings 1. /opt/ibm/lotus(/opt/lotus) this string must ends with "lotus" 2./ibw/latest can any body help me on this? Regards, sankar (2 Replies)
Discussion started by: sankar reddy
2 Replies

3. Shell Programming and Scripting

split file with condition

$ cat file H1:12:90 k:12:b n:22:i k:54:b k:42:b s:48:s a:41:b t:18:n c:77:a I am trying to split above file based on $2 such that if $2 is rounded to nearest 10's multiple (e.g. 10,20,30 etc), each sub file should contain 3 multiples and so on (also I want to keep header i.e. NR==1, in... (6 Replies)
Discussion started by: uwork72
6 Replies

4. Shell Programming and Scripting

Error while using sqlplus command inside 'if' condition in an unix shell script

Hi all, I am using the below given sqlplus command in my unix script to invoke a stored procedure which returns a value .It works fine. RET_CODE=$(/opt/oracle/product/10.2.0.4.CL/bin/sqlplus -S $USER/$PASSWD@$DB_NAME <<EOF EXEC MY_PKG.MY_SP (:COUNT); PRINT COUNT; commit; ... (6 Replies)
Discussion started by: Shri123
6 Replies

5. Shell Programming and Scripting

redirect stdout echo command in condition A run in condition B

hi, I have some problems in my simple script about the redirect echo stdout command inside a condition. Why is the echo command inside the elif still execute in the else command Here are my simple script After check on the two diff output the echo stdout redirect is present in two diff... (3 Replies)
Discussion started by: jao_madn
3 Replies

6. Shell Programming and Scripting

If else condition inside for loop of awk command in UNIX shell scripting

Hi , Please excuse me for opening a new thread i am unable to find out the syntax error in my if else condition inside for loop in awk command , my actual aim is to print formatted html td tag when if condition (True) having string as "failed", could anyone please advise what is the right... (2 Replies)
Discussion started by: karthikram
2 Replies

7. Shell Programming and Scripting

How to Split File to 2 depending on condition?

Hi , cat myfile.txt ! 3100.2.0.5 ! 3100.2.22.4 ! 3100.2.30.33 ! 3100.2.4.1 ! ! 3100.2.0.5 ! 3100.2.22.4 ! 3100.2.22.11 ! 3100.2.4.1 ! ! 3100.2.0.5 ! 3100.2.2.50 ! 3100.2.22.11 ! 3100.2.4.1 ! ! 3100.2.0.5 ! 3100.2.22.4 ! 3100.2.30.33 ! 3100.2.4.1 ! ! 3100.2.0.5 ! 3100.2.22.4 !... (6 Replies)
Discussion started by: OTNA
6 Replies

8. UNIX for Advanced & Expert Users

How to split a large file with the first 100 lines of each condition?

I have a huge file with the following input: Case1 Specific_Info Specific_Info Case1 Specific_Info Specific_Info Case3 Specific_Info Specific_Info Case4 Specific_Info Specific_Info Case1 Specific_Info Specific_Info Case2 Specific_Info Specific_Info Case2 Specific_Info Specific_Info... (2 Replies)
Discussion started by: laurigo
2 Replies

9. Shell Programming and Scripting

Split a content in a file with specific interval base on the delimited values using UNIX command

Hi All, we have a requirement to split a content in a text file every 5 rows and write in a new file . conditions: if 5th line falls between center of the statement . it should look upto after ";" files are below format: 1 UPDATE TABLE TEST1 SET VALUE ='AFDASDFAS' 2 WHERE... (3 Replies)
Discussion started by: KK230689
3 Replies

10. UNIX for Beginners Questions & Answers

awk do not split if condition is meet

Trying to use awk to format the input based on the filed count being 5. Most lines are fine using the awk below, except the first two lines. I know the reason is the -1 in green and -2 in blue. But can not figure out how to not split on the - if it is followed by a digit then letter. Thank you :).... (1 Reply)
Discussion started by: cmccabe
1 Replies
SPLIT(1)						    BSD General Commands Manual 						  SPLIT(1)

NAME
split -- split a file into pieces SYNOPSIS
split -d [-l line_count] [-a suffix_length] [file [prefix]] split -d -b byte_count[K|k|M|m|G|g] [-a suffix_length] [file [prefix]] split -d -n chunk_count [-a suffix_length] [file [prefix]] split -d -p pattern [-a suffix_length] [file [prefix]] DESCRIPTION
The split utility reads the given file and breaks it up into files of 1000 lines each (if no options are specified), leaving the file unchanged. If file is a single dash ('-') or absent, split reads from the standard input. The options are as follows: -a suffix_length Use suffix_length letters to form the suffix of the file name. -b byte_count[K|k|M|m|G|g] Create split files byte_count bytes in length. If k or K is appended to the number, the file is split into byte_count kilobyte pieces. If m or M is appended to the number, the file is split into byte_count megabyte pieces. If g or G is appended to the num- ber, the file is split into byte_count gigabyte pieces. -d Use a numeric suffix instead of a alphabetic suffix. -l line_count Create split files line_count lines in length. -n chunk_count Split file into chunk_count smaller files. -p pattern The file is split whenever an input line matches pattern, which is interpreted as an extended regular expression. The matching line will be the first line of the next output file. This option is incompatible with the -b and -l options. If additional arguments are specified, the first is used as the name of the input file which is to be split. If a second additional argument is specified, it is used as a prefix for the names of the files into which the file is split. In this case, each file into which the file is split is named by the prefix followed by a lexically ordered suffix using suffix_length characters in the range ``a-z''. If -a is not speci- fied, two letters are used as the suffix. If the prefix argument is not specified, the file is split into lexically ordered files named with the prefix ``x'' and with suffixes as above. ENVIRONMENT
The LANG, LC_ALL, LC_CTYPE and LC_COLLATE environment variables affect the execution of split as described in environ(7). EXIT STATUS
The split utility exits 0 on success, and >0 if an error occurs. SEE ALSO
csplit(1), re_format(7) STANDARDS
The split utility conforms to IEEE Std 1003.1-2001 (``POSIX.1''). HISTORY
A split command appeared in Version 3 AT&T UNIX. BUGS
The maximum line length for matching patterns is 65536. BSD
May 9, 2013 BSD
All times are GMT -4. The time now is 04:45 PM.
Unix & Linux Forums Content Copyright 1993-2022. All Rights Reserved.
Privacy Policy