Split one file to many based on pattern
Post 302925054 by deal1dealer on Thursday 13th of November 2014 04:23:37 PM
I am sorry for any confusion; this is my first time posting here. Below is what the main file looks like:

Code:
A200198565634
B769348348547
B837563487567
K656895565906
A387562985749
B893745647875
B394857348957
K734564735644
A893745634785
B938457348953
K783456347856
A890345765875
B378945634789
B934785643534
K378945634764

Desired Output:

File1:
Code:
A200198565634
B769348348547
B837563487567
K656895565906

File2:
Code:
A387562985749
B893745647875
B394857348957
K734564735644

File3:
Code:
A893745634785
B938457348953
K783456347856

File4:
Code:
A890345765875
B378945634789
B934785643534
K378945634764
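
For what it's worth, here is a minimal awk sketch of one way to get that output. It assumes every record begins with a line starting with `A` (as in the sample above), so each such line opens the next output file. The input name `mainfile.txt` and the scratch directory are placeholders for the demo, not something from the post.

```shell
#!/bin/sh
# Work in a scratch directory so the demo does not overwrite real files.
cd "$(mktemp -d)" || exit 1

# Sample input copied from the post; "mainfile.txt" is a placeholder name.
cat > mainfile.txt <<'EOF'
A200198565634
B769348348547
B837563487567
K656895565906
A387562985749
B893745647875
B394857348957
K734564735644
A893745634785
B938457348953
K783456347856
A890345765875
B378945634789
B934785643534
K378945634764
EOF

awk '
/^A/ {                          # assumption: an "A" line starts a new record
    if (out != "") close(out)   # close the previous file to avoid running out of descriptors
    n++
    out = "File" n              # File1, File2, ...
}
out != "" { print > out }       # lines before the first "A" line are skipped
' mainfile.txt

ls File*
```

With the sample data this creates File1 through File4. Closing each file before opening the next matters on large inputs, since some awk implementations limit the number of simultaneously open files.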


Last edited by Corona688; 11-13-2014 at 05:36 PM..
 
