Split file based on file size in Korn script


 
# 1  
Old 06-25-2012
Split file based on file size in Korn script

I need to split a file if it is over 2 GB in size (or any given size), preferably on line boundaries. I have figured out how to get the file size using awk, and I can split the file based on the number of lines (which I got with wc -l), but I can't figure out how to connect the two in a script.

So the command
Code:
ls -l>mylist.txt

gives me the file listing in a file, and the command
Code:
awk < mylist.txt '{if ($5>75000000) print $5 " " $NF}'

gives me a list of all the files that are larger than that size, along with their sizes, and the command
Code:
wc -l myfile.txt

gives me the number of lines in the file (call it 50000), and if I manually put them together, the command
Code:
split -l 25000 myfile.txt myfile.txt

gives me two files, myfile.txtaa and myfile.txtab, each with 25000 lines.

The problem is how to get them together in one script....
Thank you.
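
One way the pieces above might be connected, as a minimal sketch rather than a definitive answer: capture each command's output in a shell variable with command substitution, then pass the computed value to split -l. The function name split_if_larger is hypothetical, and the sketch assumes an awk that accepts -v (nawk or a POSIX awk). It uses plain POSIX syntax, so it should also run under ksh.

```shell
# split_if_larger FILE LIMIT  (hypothetical helper, not from the thread)
# Splits FILE into two pieces on line boundaries (FILEaa, FILEab)
# when its size in bytes exceeds LIMIT; otherwise leaves it alone.
split_if_larger() {
    file=$1
    limit=$2
    # Field 5 of `ls -l` is the size in bytes.
    size=$(ls -l "$file" | awk '{ print $5 }')
    # Compare inside awk, whose numbers handle sizes past 2 GB even
    # where shell arithmetic is limited to 32 bits.
    if awk -v s="$size" -v l="$limit" 'BEGIN { exit !(s + 0 > l + 0) }'
    then
        # Redirecting into wc keeps the file name out of its output.
        lines=$(wc -l < "$file")
        # Round up so an odd line count still gives exactly two pieces.
        half=$(( (lines + 1) / 2 ))
        if [ "$half" -gt 0 ]; then
            split -l "$half" "$file" "$file"
        fi
    fi
}
```

It would be called as, for example, split_if_larger myfile.txt 75000000. The original file is left in place next to the pieces.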


Last edited by radoulov; 06-25-2012 at 06:35 PM..
# 2  
Old 06-25-2012
Code:
man split

see -l option
# 3  
Old 06-25-2012
split file based on file size in Korn script

Thanks, but I already know how to use split -l. What I'm looking for is how to get the size of the file and pass the number of lines (or half the number of lines) to the split -l command in the script.

I was thinking this might work, but it doesn't:
awk < mylist.txt '{if ($5>78000000) "split -l $5/2 $NF $NF"; rm $NF}'

So I need to know how to get the $5 and $NF values into the shell script so I can run it....

Thank you.
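
A note on why the attempt above likely does nothing: inside awk, the quoted text is only a string constant, so it is never handed to the shell. One possible way around that, sketched here under assumptions, is to let awk print the two fields and let the shell read them back with a while read loop. The function names list_big_files and split_listed are hypothetical; the sketch assumes an awk that supports -v and file names without spaces.

```shell
# list_big_files LIMIT < listing   (hypothetical helper)
# Reads `ls -l` style lines on standard input and prints "size name"
# for every entry whose 5th field (size in bytes) exceeds LIMIT.
list_big_files() {
    awk -v limit="$1" '$5 + 0 > limit + 0 { print $5, $NF }'
}

# split_listed LIMIT < listing   (hypothetical helper)
# Splits every file reported by list_big_files into two halves by line count.
split_listed() {
    list_big_files "$1" | while read -r size name
    do
        # $size and $name now hold awk's $5 and $NF as shell variables.
        lines=$(wc -l < "$name")
        half=$(( (lines + 1) / 2 ))
        if [ "$half" -gt 0 ]; then
            split -l "$half" "$name" "$name"
        fi
    done
}
```

Usage would look like ls -l | split_listed 78000000. Removing the original afterwards (the rm in the attempt above) is left out on purpose, so nothing is deleted before the pieces have been checked.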
# 4  
Old 06-25-2012
This may help get you started:

Code:
#!/bin/ksh

# Build an array of the regular files in the current directory that are
# larger than 2 GB. find's -size counts 512-byte blocks by default, so
# 2 GB is 4194304 blocks. Note that -maxdepth is not available in every
# version of find, and that file names containing spaces will be broken
# apart by the unquoted command substitution.
set -A files $(find . -maxdepth 1 -type f -size +4194304 | sed 's/^\.\///')

# index into the array
counter=0

# number of files in the array
numfiles=${#files[*]}

# Iterate through the array: get each file's line count, halve it
# (rounding up so an odd count still yields two pieces), and feed
# the result to the split command.
while [ "$counter" -lt "$numfiles" ]
do
    file=${files[$counter]}
    linecount=$(wc -l < "$file")
    numlines=$(( (linecount + 1) / 2 ))
    split -l "$numlines" "$file" "$file"
    counter=$((counter + 1))
done

exit 0

# 5  
Old 06-25-2012
Thank you. However, when I run the script, I get the error
find: bad option -maxdepth
What does the -maxdepth option do? I don't see it when I man find, but there is a -depth option.

Thanks again.

---------- Post updated at 04:36 PM ---------- Previous update was at 03:36 PM ----------

I think what -maxdepth 1 is supposed to do is keep find from searching subdirectories. If so, the find on my version of Unix does not have that option. At any rate, I removed the -maxdepth parameter and the script is working. Thank you, thank you, thank you!
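
For a find without -maxdepth, a commonly used substitute is the -prune idiom, sketched below. Without it, or without -maxdepth, find also descends into subdirectories. The function name big_files_here is hypothetical, and the size argument is given in 512-byte blocks, the default unit for -size.

```shell
# big_files_here BLOCKS   (hypothetical helper)
# Prints the regular files in the current directory, without descending
# into subdirectories, that are larger than BLOCKS 512-byte blocks.
big_files_here() {
    # "! -name . -prune" matches everything except "." itself and stops
    # find from descending into it, which limits the search to one level.
    find . ! -name . -prune -type f -size +"$1" -print | sed 's/^\.\///'
}
```

For the 2 GB case this would be called as big_files_here 4194304, since 4194304 blocks of 512 bytes make 2 GB.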
# 6  
Old 06-25-2012
Why not just use -n 2 with split?

Code:
split -n 2 myfile.txt myfile.txt

# 7  
Old 06-25-2012
I guess because the man page for split on my version of Unix doesn't list an option of -n for split. It would be more convenient, but the option isn't there.

Thanks anyway.
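
Where split has no -n option, the same effect can be approximated with split -l by working out the lines per piece first. The sketch below is one possible way to do that; the function name split_into is hypothetical. As far as I know, plain -n 2 in GNU split divides by bytes and can cut a line in two, so a line-based calculation may be the better fit here in any case.

```shell
# split_into PIECES FILE   (hypothetical helper)
# Splits FILE on line boundaries into at most PIECES files of roughly
# equal line count, using only split -l.
split_into() {
    pieces=$1
    file=$2
    lines=$(wc -l < "$file")
    # Ceiling division: the smallest per-piece line count that still
    # fits the whole file into PIECES files.
    per_piece=$(( (lines + pieces - 1) / pieces ))
    if [ "$per_piece" -gt 0 ]; then
        split -l "$per_piece" "$file" "$file"
    fi
}
```

For example, split_into 2 myfile.txt on a 50000-line file would produce myfile.txtaa and myfile.txtab with 25000 lines each.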