How to Split a source file in specified format?


 
# 1  
Old 03-05-2013

Requirement: I need to split a source file, say a1.txt, which can be up to 150 MB in size, into 25 target files, each with a maximum size of 25 MB and each starting with the header line.

NOTE: A few target files can be empty, but 25 files must be generated for each source file (I can expect up to 80 source files).

The target name format must be:
Code:
XXX_YYY1001.CSV, XXX_YYY1002.CSV, …, XXX_YYY1025.CSV

For the 2nd source file it must be:
Code:
XXX_YYY2001…XXX_YYY2025

and so on; for the 80th source file it should be:
Code:
XXX_YYY8001…XXX_YYY8025


Last edited by Scrutinizer; 03-05-2013 at 09:57 AM.. Reason: Cleaned font tags
# 2  
Old 03-05-2013
So this would be 6 files of 25 MB each and 19 empty files?
How do you differentiate between 8th and 80th target files?
What have you tried so far?
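
The count in the first question follows from a ceiling division. A minimal sketch, assuming the worst case of a 150 MB source and 25 MB per target as stated in the requirement (the variable names are only illustrative):

```shell
#!/bin/sh
# Worst case from the requirement: 150 MB source, 25 MB per target, 25 targets.
size_mb=150
chunk_mb=25
total_files=25

# Ceiling division: number of targets that actually receive data.
full_files=$(( (size_mb + chunk_mb - 1) / chunk_mb ))
empty_files=$(( total_files - full_files ))

echo "$full_files files with data, $empty_files empty"
```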
# 3  
Old 03-05-2013
Hi Rudic, we can drop the 25 MB maximum requirement.

Code used:
Code:
awk 'BEGIN{getline f; }{ print > "'$val'" 1 + int (NR%25) ".csv" } ' $original_file

where
Code:
$original_file -> source file
getline f      -> reads the header line, which is appended later to all target files
$val           -> name of the target file (fixed part of the file name, stored in a table, e.g. XXX_YYY1)

The above command generates target files named XXX_YYY11, XXX_YYY12, …, XXX_YYY125 (I need the names to be XXX_YYY1001, …, XXX_YYY1025).

If I modify the above code to:
Code:
awk 'BEGIN{getline f; }{ print > "'$val'" 1001 + int (NR%25) ".csv" } ' $original_file

the target files generated are XXX_YYY1001, …, XXX_YYY1025. But I cannot hard-code this value, since the target file names differ for each of the 80 source files, like XXX_YYY2001…XXX_YYY2025 or XXX_YYY8001…XXX_YYY8025.
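
One possible way around the hard-coded number is to pass the fixed part and the source number into awk with -v and zero-pad the part number with sprintf. This is only a sketch under assumptions, not a tested solution for the real data: the values of prefix and src_no, the temporary directory, and the small demo source file are all made up for illustration, and lines are distributed round-robin as in the command above. It also assumes an awk that supports user-defined functions and can hold 25 output files open at once (for example gawk or mawk).

```shell
#!/bin/sh
# Hypothetical values; in the real job they would come from the table lookup.
workdir=$(mktemp -d)
original_file="$workdir/a1.txt"
prefix="XXX_YYY"   # fixed part of the target name
src_no=2           # number of the source file being split
parts=25           # number of target files per source file

# Demo source: one header line plus 60 data lines.
{
    echo "id,value"
    i=1
    while [ "$i" -le 60 ]; do
        echo "$i,row$i"
        i=$((i + 1))
    done
} > "$original_file"

awk -v dir="$workdir" -v prefix="$prefix" -v src="$src_no" -v parts="$parts" '
# Build a name such as XXX_YYY2001.CSV: prefix, source number, 3-digit part.
function target(n) { return sprintf("%s/%s%d%03d.CSV", dir, prefix, src, n) }

NR == 1 {
    # Write the header to every target, so all 25 files exist
    # even when some of them receive no data lines.
    for (n = 1; n <= parts; n++) { f = target(n); print > f }
    next
}
{
    # Round-robin: the first data line goes to part 001, the next to 002, ...
    f = target((NR - 2) % parts + 1)
    print > f
}
' "$original_file"

ls "$workdir"
```

Because the header is read as line 1 of the file itself, no getline in BEGIN is needed; a getline in BEGIN without a file redirection reads from standard input rather than from $original_file.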

Last edited by Scrutinizer; 03-05-2013 at 10:02 AM.. Reason: code tag / font tag kludge untangled
# 4  
Old 03-05-2013
That post is absolutely unreadable!
# 5  
Old 03-05-2013
I'm confused by this:

Quote:
Target name format must be : XXX_YYY1001.CSV , XXX_YYY1002.CSV,.... XXX_YYY1025.CSV
for 2nd source file it must be
XXX_YYY2001...XXX_YYY2025
and so on for 80 th source file it should be
XXX_YYY8001...XXX_YYY8025
Shouldn't the first source file be XXX_YYY0101.CSV,
so that the tenth source file can be XXX_YYY1001.CSV,
the 20th source file can be XXX_YYY2001.CSV,
and the 80th source file can be XXX_YYY8001.CSV?


Not sure if I understand much else, but your example naming convention is quite confusing.
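
The scheme suggested above (two digits for the source file, two digits for the part) could be sketched like this. The helper name target_name and the prefix value are assumptions for illustration only:

```shell
#!/bin/sh
prefix="XXX_YYY"   # assumed fixed part of the name

# target_name SOURCE_NO PART_NO -> zero-padded, unambiguous file name
target_name() {
    printf '%s%02d%02d.CSV\n' "$prefix" "$1" "$2"
}

first=$(target_name 1 1)         # 1st source file, part 1
eighth=$(target_name 8 1)        # 8th source file, part 1
eightieth=$(target_name 80 25)   # 80th source file, part 25

echo "$first $eighth $eightieth"
```

With fixed-width fields the 8th and the 80th source files can no longer collide: the 8th source file gets names starting with XXX_YYY08 and the 80th gets names starting with XXX_YYY80.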