Split a file into multiple files based on first two digits of file.
Post 302621533 by okkadu on Tuesday 10th of April 2012 02:57:25 PM

Hi,
I have a fixed-width flat file that holds data for 10 different datasets, each identified by the first two digits of every record.
01 in the first two positions refers to Set A,
02 in the first two positions refers to Set B, and so on.
I want to generate 10 different files from my fixed-width file, each containing only its own record set:
File A.dat has only the 01 records.
File B.dat has only the 02 records, and so on.

Please let me know how we can achieve this.
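
One possible approach, as a minimal sketch rather than a definitive answer: let awk read the first two characters of each record and write the record to a file named after the matching set letter. The post only states 01 -> A and 02 -> B, so the rest of the mapping (03 -> C up to 10 -> J) is an assumption here, as are the input file name flatfile.txt and the sample records.

```shell
#!/bin/sh
# Sample fixed-width input (hypothetical data; the original post shows none).
cat > flatfile.txt <<'EOF'
01AAAA0001
02BBBB0002
01AAAA0003
10JJJJ0004
EOF

# Split on the first two characters of each record.
# Assumed mapping: 01 -> A.dat, 02 -> B.dat, ... 10 -> J.dat.
awk '
{
    code = substr($0, 1, 2)                  # first two characters of the record
    if (code !~ /^[0-9][0-9]$/) next         # skip lines without a two-digit prefix
    n = code + 0                             # "01" -> 1, "02" -> 2, ...
    if (n < 1 || n > 10) next                # skip prefixes outside 01..10
    out = substr("ABCDEFGHIJ", n, 1) ".dat"  # pick the set letter for this prefix
    print > out                              # awk keeps each output file open
}' flatfile.txt
```

The output name is built in a variable first because `print > expr` with a concatenation on the right-hand side is parsed differently by different awk implementations. With only 10 output files there is no need to close() them along the way.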
 

Boulder::Unigene(3pm)					User Contributed Perl Documentation				     Boulder::Unigene(3pm)

NAME
    Boulder::Unigene - Fetch Unigene data records as parsed Boulder Stones

SYNOPSIS
    # parse a file of Unigene records
    $ug = new Boulder::Unigene(-accessor => 'File',
                               -param    => '/data/unigene/Hs.dat');
    while (my $s = $ug->get) {
        print $s->Identifier;
        print $s->Gene;
    }

    # parse flatfile records yourself
    open (UG, "/data/unigene/Hs.dat");
    local $/ = "*RECORD*";
    while (<UG>) {
        my $s = Boulder::Unigene->parse($_);
        # etc.
    }

DESCRIPTION
    Boulder::Unigene provides retrieval and parsing services for NCBI
    Unigene records. It returns Unigene entries in Stone format, allowing
    easy access to the various fields and values. Boulder::Unigene is a
    descendant of Boulder::Stream, and provides a stream-like interface to
    a series of Stone objects.

    Access to Unigene is provided by a single accessor, which gives access
    to a local Unigene database. When you create a new Boulder::Unigene
    stream, you provide the accessor, along with accessor-specific
    parameters that control what entries to fetch. The accessor is:

    File
        This provides access to local Unigene entries by reading from a
        flat file (typically the Hs.dat file downloadable from NCBI's FTP
        site). The stream will return a Stone corresponding to each of the
        entries in the file, starting from the top of the file and working
        downward. The parameter is the path to the local file.

    It is also possible to parse a single Unigene entry from a text string
    stored in a scalar variable, returning a Stone object.

  Boulder::Unigene methods
    This section lists the public methods that the Boulder::Unigene class
    makes available.

    new()
            # Local fetch via File
            $ug = new Boulder::Unigene(-accessor => 'File',
                                       -param    => '/data/unigene/Hs.dat');

        The new() method creates a new Boulder::Unigene stream on the
        accessor provided. The only possible accessor is File. If
        successful, the method returns the stream object. Otherwise it
        returns undef.

        new() takes the following arguments:

            -accessor   Name of the accessor to use
            -param      Parameters to pass to the accessor

        Specify the accessor to use with the -accessor argument. If not
        specified, it defaults to File.

        -param is an accessor-specific argument. The possibilities are:

        For File, the -param argument must point to a string-valued scalar,
        which will be interpreted as the path to the file to read Unigene
        entries from.
    get()
        The get() method is inherited from Boulder::Stream, and simply
        returns the next parsed Unigene Stone, or undef if there is nothing
        more to fetch. It has the same semantics as the parent class,
        including the ability to restrict access to certain top-level tags.

    put()
        The put() method is inherited from the parent Boulder::Stream
        class, and will write the passed Stone to standard output in
        Boulder format. This means that it is currently not possible to
        write a Boulder::Unigene object back into Unigene flatfile form.

OUTPUT TAGS
    The tags returned by the parsing operation are taken from the names
    shown in the flat file Hs.dat, since no better description of them is
    provided yet by the database source producer.

  Top-Level Tags
    These are tags that appear at the top level of the parsed Unigene
    entry.

    Identifier
        The Unigene identifier of this entry. Identifier is a single-value
        tag.

        Example:
            my $identifierNo = $s->Identifier;

    Title
        The Unigene title for this entry.

        Example:
            my $titledef = $s->Title;

    Gene
        The gene associated with this Unigene entry.

        Example:
            my $thegene = $s->Gene;

    Cytoband
        The cytological band position of this entry.

        Example:
            my $thecytoband = $s->Cytoband;

    Counts
        The number of ESTs in this record.

        Example:
            my $thecounts = $s->Counts;

    LocusLink
        The id of the LocusLink entry associated with this record.

        Example:
            my $thelocuslink = $s->LocusLink;

    Chromosome
        This field contains a list of the chromosome numbers to which this
        entry has been linked.

        Example:
            my @theChromosome = $s->Chromosome;

  STS
    Multiple records in the form ^STS ACC=XXXXXX NAME=YYYYYY

    ACC
    NAME

  TXMAP
    Multiple records in the form ^TXMAP XXXXXXX; MARKER=YYYYY; RHPANEL=ZZZZ

    The TXMAP tag points to a Stone record that contains multiple subtags.
    Each subtag is the name of a feature which points, in turn, to a Stone
    that describes the feature's location and other attributes.

    Each feature will contain one or more of the following subtags:

    MARKER
    RHPANEL

  PROTSIM
    Multiple records in the form ^PROTSIM ORG=XXX; PROTID=DBID:YYY;
    PCT=ZZZ; ALN=QQQQ

    where DBID is PID to indicate a GenPept identifier, SP to indicate a
    SWISSPROT identifier, PIR to indicate a PIR identifier, and PRF to
    indicate ???

    ORG
    PROTID
    PCT
    ALN

  SEQUENCE
    Multiple records in the form ^SEQUENCE ACC=XXX; NID=YYYY; PID = CLONE=
    END= LID=

    ACC
    NID
    PID
    CLONE
    END
    LID

SEE ALSO
    Boulder, Boulder::Blast, Boulder::Genbank

AUTHOR
    Lincoln Stein <lstein@cshl.org>.
    Luca I.G. Toldo <luca.toldo@merck.de>

    Copyright (c) 1997 Lincoln D. Stein
    Copyright (c) 1999 Luca I.G. Toldo

    This library is free software; you can redistribute it and/or modify it
    under the same terms as Perl itself. See DISCLAIMER.txt for disclaimers
    of warranty.

perl v5.10.1                      2000-01-18                Boulder::Unigene(3pm)