File Splitter output filename


 
# 8  
Old 10-11-2012
try something like this...


Code:
awk '{a++;if($0 ~ /^101/){if(s){ 
if(a>=20){a=0;x++;fn="file__"x;print s > fn;s=$0" "a}else{print s > fn;s=$0" "a}}
else{s=$0" "a;x++;fn="file__"x;}}
else{s=s"\n"$0" "a;}}END{print s > fn}' file


Last edited by pamu; 10-12-2012 at 02:34 AM..
# 9  
Old 10-12-2012
@pamu, I was not able to get the right results with the code above. Any other ideas?
# 10  
Old 10-12-2012
Quote:
Originally Posted by santosh2k2
@pamu, I was not able to get the right results with the code above. Any other ideas?
OK, try this:

Code:
awk '{a++;if($0 ~ /^101/){if(s){ 
if(a>=20){a=0;x++;fn="file__"x;print s > fn;s=$0}else{print s > fn;s=$0}}
else{s=$0;x++;fn="file__"x;}}
else{s=s"\n"$0;}}END{if(a>=20){x++;fn="file__"x};print s > fn}' file


Last edited by pamu; 10-12-2012 at 09:02 AM.. Reason: edited, removed a
# 11  
Old 10-12-2012
@pamu, it is still not giving correct results in some scenarios. Let me restate the requirements, as I am now told that it is OK to include the next set of 104 data even if the record count goes beyond 20.

Requirement:
- The source file is a set of customer data. A customer set has a 101 header record followed by its child 104 records. The 101 record is always the first record in a set.
- A target file should include all records of a customer set and start with a 101 record.
- A target file can contain many customer sets.
- The number of records in each target file should equal splitCount, or go just above it when that is needed to fit in a complete customer set.

Let's take splitCount=10 as an example. The code below just splits the file into sets of 10 records and assigns the correct name to each output file. Can someone please extend this logic to cover the target file requirements?

Code:
awk 'NR%"'"${splitCount}"'"==1{x="'"${SrcFileName}_"'" sprintf("%04d",++i) ".txt"}{print > x}' $SrcFileName.txt
 
Variables assigned to run command
SrcFileName=SS
splitCount=10
 
Source file = SS.txt
101|M|28854| 
104|28854| I|
101|M|30854| MER
104|30854| S|
104|30854| C|
104|30854| I|
101|M|30855| SG
104|30855| I|
104|30855| S|
104|30855| C|
104|30855| S|
101|M|30856| 
104|30856| I|
104|30856| S|
104|30856| S|
104|30856| S|
104|30856| C|
104|30856| S|
101|M|30857| 
104|30857| I|
104|30857| S|
104|30857| S|
104|30857| S|
104|30857| C|
104|30857| S|
101|M|30858| 
104|30858| I|
104|30858| S|
 
Target Files
SS_0001.txt has 11 records, as we cannot move the pending 30855 records to the next file
101|M|28854| 
104|28854| I|
101|M|30854| MER
104|30854| S|
104|30854| C|
104|30854| I|
101|M|30855| SG
104|30855| I|
104|30855| S|
104|30855| C|
104|30855| S|
 
SS_0002.txt has more than 10 records, as we cannot move the pending 30857 records to the next file
101|M|30856| 
104|30856| I|
104|30856| S|
104|30856| S|
104|30856| S|
104|30856| C|
104|30856| S|
101|M|30857| 
104|30857| I|
104|30857| S|
104|30857| S|
104|30857| S|
104|30857| C|
104|30857| S|
 
SS_0003.txt
101|M|30858| 
104|30858| I|
104|30858| S|


Last edited by santosh2k2; 10-12-2012 at 10:59 AM..
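One way to read the restated requirements: keep writing to the current target until it holds at least splitCount records, then switch files at the next 101 header. The following is a minimal sketch under that reading, not code posted in the thread. The awk names `limit`, `prefix`, `count`, `seq` and `out` are made up for illustration, and the input is generated rather than copied: it has the same set sizes as the sample (2, 4, 5, 7, 7 and 3 records) but invented IDs.

```shell
# Work in a scratch directory so the demo leaves no stray files behind.
cd "$(mktemp -d)" || exit 1

SrcFileName=SS
splitCount=10

# Synthetic input: six customer sets of 2, 4, 5, 7, 7 and 3 records.
id=30001
for size in 2 4 5 7 7 3; do
  echo "101|M|$id|"
  n=1
  while [ "$n" -lt "$size" ]; do
    echo "104|$id|S|"
    n=$((n + 1))
  done
  id=$((id + 1))
done > "$SrcFileName.txt"

awk -v limit="$splitCount" -v prefix="${SrcFileName}_" '
  # Open a new target on the first line, or on a 101 header once the
  # current target already holds at least "limit" records.
  NR == 1 || (/^101/ && count >= limit) {
    if (out != "") close(out)
    out = sprintf("%s%04d.txt", prefix, ++seq)
    count = 0
  }
  # Every record goes to the current target.
  {
    print > out
    count++
  }
' "$SrcFileName.txt"

# Show how many records landed in each target.
wc -l "$SrcFileName"_0*.txt
```

On this input the three targets come out at 11, 14 and 3 records, which matches the target files listed above. Passing the values with -v also avoids splicing shell quotes into the awk program.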
# 12  
Old 10-12-2012
Quote:
Originally Posted by santosh2k2
@pamu, it is still not giving correct results in some scenarios. Let me restate the requirements, as I am now told that it is OK to include the next set of 104 data even if the record count goes beyond 20.

Requirement:
- The source file is a set of customer data. A customer set has a 101 header record followed by its child 104 records. The 101 record is always the first record in a set.
- A target file should include all records of a customer set and start with a 101 record.
- A target file can contain many customer sets.
- The number of records in each target file should equal splitCount, or go just above it when that is needed to fit in a complete customer set.

Let's take splitCount=10 as an example. The code below just splits the file into sets of 10 records and assigns the correct name to each output file. Can someone please extend this logic to cover the target file requirements?
Your requirements are changing with every post.

Quote:
Originally Posted by santosh2k2
- Records in each file should not be more than 20.
- Each file should start with a 101 record. This ensures that all associated 101 and 104 records are in the same file. Hence, in the example below, since the count including the next 101 set goes beyond 20, the first file is cut at 18. The rest of the records are pushed to the next file, and so on.
See below.
The number compared against a is the threshold, and you can set it to whatever you want.
If you use 20 (or 10), that is meant to be the maximum record count, so a file should not contain more than 20 (or 10) records.

Code:
awk '{a++;if($0 ~ /^101/){if(s){ 
if(a>=20){a=0;x++;fn="file__"x;print s > fn;s=$0}else{print s > fn;s=$0}}
else{s=$0;x++;fn="file__"x;}}
else{s=s"\n"$0;}}END{if(a>=20){x++;fn="file__"x};print s > fn}' file

I have tested with thresholds of 10 and 20.

With a threshold of 20:

Code:
$ ls file__*
file__1  file__2
$ wc -l file__1
18 file__1
$ wc -l file__2
10 file__2

With a threshold of 10:

Code:
$ wc -l file_*
  6 file__1
 12 file__2
 10 file__3
 28 total

Please let me know if you still have any doubts.
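The one-liner is dense, so here is the same logic laid out over several lines with comments, as a reading aid rather than a replacement. It is meant to behave like the command above with the threshold set to 10. The input is generated, with the set sizes of the sample in post #11 (2, 4, 5, 7, 7 and 3 records) and invented IDs, and is written to a file named `file` to match the original command.

```shell
# Work in a scratch directory so the demo leaves no stray files behind.
cd "$(mktemp -d)" || exit 1

# Synthetic input: six customer sets of 2, 4, 5, 7, 7 and 3 records.
id=30001
for size in 2 4 5 7 7 3; do
  echo "101|M|$id|"
  n=1
  while [ "$n" -lt "$size" ]; do
    echo "104|$id|S|"
    n=$((n + 1))
  done
  id=$((id + 1))
done > file

awk '
{
  a++                          # lines read since the last reset
  if ($0 ~ /^101/) {           # header: a new customer set starts here
    if (s) {                   # a buffered set is waiting to be written
      if (a >= 10) {           # threshold reached: move on to a new file
        a = 0
        x++
        fn = "file__" x
      }
      print s > fn             # flush the buffered set
    } else {                   # very first header: open the first file
      x++
      fn = "file__" x
    }
    s = $0                     # start buffering the new set
  } else {
    s = s "\n" $0              # child record: append it to the buffer
  }
}
END {
  if (a >= 10) {               # last set would push the count past the threshold
    x++
    fn = "file__" x
  }
  print s > fn                 # flush the last buffered set
}' file

wc -l file__*
```

On this input it produces 6, 12 and 10 records, the same counts as the run with 10 shown above. file__2 ends up above 10 because a counts lines read, and it is reset when the header that triggers the switch arrives, so the set flushed at that moment is not counted towards the new file.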
# 13  
Old 10-12-2012
Thanks @pamu. The code works fine as it is.

I am trying to assign variables and still can't get it right. Could you please help?


Can you assign variables for the count and the output file name,
in the case below for 20 and file__?

Code:
awk '{a++;if($0 ~ /^101/){if(s){ 
if(a>=20){a=0;x++;fn="file__"x;print s > fn;s=$0}else{print s > fn;s=$0}}
else{s=$0;x++;fn="file__"x;}}
else{s=s"\n"$0;}}END{if(a>=20){x++;fn="file__"x};print s > fn}' file

# 14  
Old 10-12-2012
try this...


Code:
awk -v CN="20" -v File_name="file__" '{a++;if($0 ~ /^101/){if(s){ 
if(a>=CN){a=0;x++;fn=File_name""x;print s > fn;s=$0}else{print s > fn;s=$0}}
else{s=$0;x++;fn=File_name""x;}}
else{s=s"\n"$0;}}END{if(a>=CN){x++;fn=File_name""x};print s > fn}' file


Last edited by pamu; 10-12-2012 at 02:18 PM..