Help on Splitting files - urgent


 
# 1  
Old 03-06-2008

Hi Script Masters, I have a strange requirement. Please help.
I am using the C shell.
I have a file like the one below, in sorted order:
22
23
25
34
37
45
67
342
456
476
543
677
789
Now I have to split the file so that the first 5 two-digit numbers are saved as aaa.in, the next 5 two-digit numbers as aab.in, and so on for all remaining two-digit numbers (however many there are). Then the first 5 three-digit numbers should be saved as aac.in, the next 5 as aad.in, and so on.

In a nutshell:

file aaa.in should contain
22
23
25
34
37
and file aab.in should contain
45
67
and file aac.in
342
456
476
543
677
and file aad.in should be
789

The split of 5 numbers per file is constant; everything else will vary.

If you have any questions, please ask. Any help or clue will be greatly appreciated.
I know the split command can create multiple files like aaa, aab, but the problem I see is that the file naming does not carry over between runs. If I use split twice, the second run overwrites aaa instead of creating aac.
# 2  
Old 03-06-2008
Code:
awk 'BEGIN { split("a b c d e f", f) }
# start a new file on every 5th line of a given length (count 0, 5, 10, ...)
!(x[length]++ % 5) { close(fn); fn = "aa" f[++c] ".in" }
{ print > fn }' file

Use nawk or /usr/xpg4/bin/awk on Solaris.
Add more letters if needed.
# 3  
Old 03-06-2008
Re: Help on Splitting files - urgent

Thanks for the reply, rad.
When I tried the code, I got:
Awk :Syntax error near line 2
Awk : bailing out near line 2.
Can you please tell me what could be wrong?

Actually, the file might contain around 500,000 lines of numbers, so when I split it the names might run through aaa, aab, aac and so on to aba, abb, abc, then aca, acb, acc, up to a maximum of zzz. From your code, I believe it can only go from aaa to aaz.

The file naming convention is cinaaa.in cinaab.in ...cinbaa...cinzzz.in

What changes do I have to make in your code to accommodate the above?
# 4  
Old 03-07-2008
Hope this works!

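#!/usr/bin/ksh
# Split the two-digit numbers in "samp" into aa0.txt, aa1.txt, ... (5 per file)
j=0    # lines written to the current file
k=0    # index of the current output file
while read -r line
do
    if [ "$line" -lt 100 ]
    then
        echo "$line"                  # progress output
        echo "$line" >> "aa$k.txt"
        j=$((j + 1))
        # after 5 lines, move on to the next file
        if [ "$j" -eq 5 ]
        then
            j=0
            k=$((k + 1))
        fi
    fi
done < samp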

The above script will split the file for the two-digit numbers.
You can complete the rest.
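One possible way to complete the loop for numbers of any length, as a sketch only. It assumes the input is sorted as in the first post and that the shell supports POSIX arithmetic expansion. The `samp` name and the `aaN.txt` pattern follow the script above; the variable names `count` and `prev_len` are made up for illustration. A new file starts whenever the digit count changes or the current file already holds 5 lines.

```shell
#!/bin/sh
# Sample data from the first post, written to "samp" in the current directory.
printf '%s\n' 22 23 25 34 37 45 67 342 456 476 543 677 789 > samp

count=0      # lines written to the current file
k=-1         # index of the current output file
prev_len=0   # digit count of the previous line

while read -r line
do
    len=${#line}
    # New file on a change of digit count or after 5 lines.
    if [ "$len" -ne "$prev_len" ] || [ "$count" -eq 5 ]
    then
        k=$((k + 1))
        count=0
        prev_len=$len
        : > "aa$k.txt"    # start the new file empty
    fi
    echo "$line" >> "aa$k.txt"
    count=$((count + 1))
done < samp

wc -l aa[0-9]*.txt
```

With the sample data this produces aa0.txt to aa3.txt holding 5, 2, 5 and 1 lines. As noted later in the thread, a shell loop like this is likely to be slower than awk on very large inputs.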
# 5  
Old 03-07-2008
Quote:
Originally Posted by rajee
Thanks for the reply, rad.
When I tried the code, I got:
Awk :Syntax error near line 2
Awk : bailing out near line 2.
Can you please tell me what could be wrong?
[...]
You should use nawk as suggested in my first reply.

Quote:
Actually, the file might contain around 500,000 lines of numbers, so when I split it the names might run through aaa, aab, aac and so on to aba, abb, abc, then aca, acb, acc, up to a maximum of zzz. From your code, I believe it can only go from aaa to aaz.

The file naming convention is cinaaa.in cinaab.in ...cinbaa...cinzzz.in

What changes do I have to make in your code to accommodate the above?
OK,
try this:

Code:
nawk 'BEGIN {
# f holds the alphabet; c2, c1 and c index the three letters of the name
n = split("a b c d e f g h i j k l m n o p q r s t u v w x y z", f)
c2 = c1 = c = 1 }
# true on every 5th line of a given length (count 0, 5, 10, ...)
!(x[length]++ % 5) {
close(fn); fn = "cin"f[c2] f[c1] f[c]".in"
if ((c1 == n) && (c == n))
   c2 = c2 >= n ? 1 : ++c2
c = c >= n ? 1 : ++c
if (c == 1)
  c1 = c1 >= n ? 1 : ++c1 }
{ print > fn }' file

# 6  
Old 03-07-2008
Re: Help on Splitting files - Thanks Rad

Hi radoulov,
I agree you are a great script master and a specialist in nawk. Thanks for the code; it works fine. If you have time, please explain the code. Thanks for the effort and time you took.


Aajan,
Thanks for your effort too. My requirement is actually to create file names like cinaaa, aab, aac and so on, but your code creates aa0, aa1, aa2 and so on. Still, thank you for the different approach. Your code might also be a little slower for huge data.

Thanks to all who took the time to view my problem and made the effort to solve it.

Great Forum!
# 7  
Old 03-10-2008
Quote:
Originally Posted by rajee
Hi radoulov
[...]
If you have time, please explain the code.
[...]
Code:
BEGIN {
n = split("a b c d e f g h i j k l m n o p q r s t u v w x y z", f)
c2 = c1 = c = 1 }

Prepare the f array (the alphabet), set c, c1 and c2 to 1, and set n to the number of elements in the f array.

Code:
!(x[length]++ % 5)

I think an example illustrates this expression best:

Code:
% awk '{print x[length]++%5==0?"here -->":"--------",$0}' file
here --> 22
-------- 23
-------- 25
-------- 34
-------- 37
here --> 45
-------- 67
here --> 342
-------- 456
-------- 476
-------- 543
-------- 677
here --> 789

In other words, we change the filename every time the expression x[length]++ % 5 is 0.
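A note on operator precedence, since it affects how the pattern has to be written: in awk, unary `!` binds more tightly than `%`, so `!count % 5` means `(!count) % 5`, while `!(count % 5)` is the form that is true on every 5th count. The following is just a small sketch to show the difference; the loop variable `i` stands in for the value that `x[length]++` would return.

```shell
#!/bin/sh
# Columns: count, (!count) % 5, !(count % 5)
result=$(awk 'BEGIN {
    for (i = 0; i <= 10; i += 5)
        printf "%d %d %d\n", i, !i % 5, !(i % 5)
}')
echo "$result"
```

The second column is 1 only for a count of 0, whereas the third column is 1 for 0, 5 and 10, which is the behaviour needed to switch files every 5 lines.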

Now, the filename generation.

Code:
close(fn); fn = "cin"f[c2] f[c1] f[c]".in"
if ((c1 == n) && (c == n))
   c2 = c2 >= n ? 1 : ++c2
c = c >= n ? 1 : ++c
if (c == 1)
  c1 = c1 >= n ? 1 : ++c1 }

We need to close the previous file because some Awk implementations (such as Awk on Solaris) can keep only a limited number of files open at the same time.
Then we compose the filename from the f array with rotating keys (from 1 to n, where n is the number of elements in the f array).
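As an alternative to the three rotating counters, the suffix could also be derived from a single file counter by treating it as a base-26 number. This is only a sketch, not the method used above: the `suffix` helper and the `nfile` counter are names made up here, user-defined functions need nawk or a POSIX awk (plain `awk` below is assumed to be one), and the sample data is written to `file` in the current directory.

```shell
#!/bin/sh
# Sample data from the first post, written to "file" in the current directory.
printf '%s\n' 22 23 25 34 37 45 67 342 456 476 543 677 789 > file

awk '
# suffix(i) is a hypothetical helper: 0 -> "aaa", 1 -> "aab", 26 -> "aba", ...
function suffix(i,    abc, s) {
    abc = "abcdefghijklmnopqrstuvwxyz"
    s = substr(abc, int(i / 676) % 26 + 1, 1)
    s = s substr(abc, int(i / 26) % 26 + 1, 1)
    s = s substr(abc, i % 26 + 1, 1)
    return s
}
# Every 5th line of a given length (count 0, 5, 10, ...) starts a new file.
x[length]++ % 5 == 0 {
    if (fn != "") close(fn)
    fn = "cin" suffix(nfile++) ".in"
}
{ print > fn }
' file

ls cin*.in
```

With the sample data this creates cinaaa.in, cinaab.in, cinaac.in and cinaad.in. Like the original, the suffix wraps around after zzz (17,576 files), so a larger input would start overwriting earlier files.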

Code:
{ print > fn }

This simply prints into the changing filename.

Hope this helps.