split files based on size


 
# 1  
Old 07-15-2009
split files based on size

I have a few .txt files in a directory and I need to check their sizes one by one. If any of them is larger than 5 MB, I need to split that file in two.

Can someone help?

Thanks.
# 2  
Old 07-16-2009
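Luckily I had some large files around. The loop below only prints the split commands (the lines after "done" are its output):
Code:
find . -type f -size +5M -print 2>/dev/null | while IFS= read -r file; do
  # half the size, rounded up, so an odd-sized file still gives exactly two parts
  size=$( ls -l "$file" | awk '{print int(($5+1)/2)}' )
  echo split -b "$size" "$file" "${file}_"
done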
split -b 536870912 ./test/copytrunc/xdev/size1G.log ./test/copytrunc/xdev/size1G.log_
split -b 536870912 ./test/copy/local/size1G.log ./test/copy/local/size1G.log_
split -b 536870912 ./test/copy/xdev/size1G.log ./test/copy/xdev/size1G.log_
split -b 536870912 ./test/move/xdev/size1G.log ./test/move/xdev/size1G.log_

This splits each file in two, naming the parts ${file}_aa and ${file}_ab.
The echo is a safety: the commands are only printed. To actually run them, remove the echo.
# 3  
Old 07-16-2009
Can you please explain just a little bit how this works, for my understanding?

I can see that it looks in the current directory and its subdirectories for files over 5M in size. What does -type f do? I am fairly new to Unix, so just a general idea would be fine.

Thanks.

---------- Post updated at 05:55 PM ---------- Previous update was at 03:26 PM ----------

Also, it echoes a split command for every file in the directory, although I know only one of them is over 5MB. However, if I change the size in find to +5242800c (bytes), it echoes only the right file. So what could be wrong with +5M?
# 4  
Old 07-17-2009
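'-type f' means "only look for regular files". Other options are 'd' for directories, 'l' for symbolic links, ...
The '+5M' doesn't work because the M suffix is apparently only implemented in GNU find. Sorry for that. The portable form is a byte count with the 'c' suffix, as you found; without a suffix the number is counted in 512-byte blocks, which is probably why every file matched.
The awk line takes the 5th field of the output from 'ls -l' (which is the size in bytes) and prints it divided by 2.
-b tells split to start a new piece after that many bytes (the value from the awk above). The two other arguments are the file to split and the prefix for the names of the pieces. The prefix has to include the path, otherwise split writes the pieces to the current directory.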
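
For a find without the M suffix, the same idea can be sketched roughly as below. This is only an illustration, not the script from post #2: LIMIT, split_in_two and split_large_files are made-up names, and the size is read with wc -c instead of parsing ls -l.

```shell
#!/bin/sh
# Sketch: split every regular file larger than LIMIT bytes into two halves.
# LIMIT, split_in_two and split_large_files are illustrative names only.
LIMIT=5242880   # 5 MiB in bytes; the "c" suffix used below means bytes

split_in_two() {
    file=$1
    # wc -c reports the size in bytes without parsing ls output
    size=$(wc -c < "$file")
    # round up so that an odd size still gives exactly two pieces
    half=$(( (size + 1) / 2 ))
    split -b "$half" "$file" "${file}_"
}

split_large_files() {
    dir=$1
    find "$dir" -type f -size +"${LIMIT}c" -print 2>/dev/null |
    while IFS= read -r file; do
        split_in_two "$file"
    done
}
```

Calling split_large_files /some/dir leaves the original files in place and writes file_aa and file_ab next to each one. Note that a piece which is itself larger than LIMIT would be picked up again on a later run.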
# 5  
Old 07-17-2009
Cool...thanks

---------- Post updated at 11:56 AM ---------- Previous update was at 01:05 AM ----------

OK, I am almost done here, but there is a small issue with the way the files are named after the split. Here is my final script:
Code:
#!/bin/bash
splitFileDir=/home/cognosdev/texas

find "$splitFileDir" -type f -size +5242800c -print 2>/dev/null | while IFS= read -r file
do
  # remove the original only if split succeeded
  split -b 5242800 "$file" "${file}_" && rm "$file"
  echo 'splitting completed'
done

So a file like 99I999877_ABCD.txt gets split and the pieces are named:
Code:
99I999877_ABCD.txt_aa
99I999877_ABCD.txt_ab

I would like for them to be something like:
Code:
99I999877_ABCD_aa.txt
99I999877_ABCD_ab.txt

How can I achieve this?

Thanks.

Last edited by vgersh99; 07-17-2009 at 05:28 PM.. Reason: code tags, PLEASE!
# 6  
Old 07-18-2009
Note: I haven't tested this completely, only the filename/extension extraction.

Step 1: Save the filename without the extension:
Code:
filename=${file%.*}

Step 2: Save the extension:
Code:
extension=${file#"$filename"}

Step 3: Split:
Code:
split -b 5242800 "$file" "${filename}_"

Step 4: Append the extension again:
Code:
for splits in "${filename}"_*
do
    mv "$splits" "${splits}${extension}"
done
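
Put together, the four steps might look something like the sketch below. It is untested beyond small files; split_keep_ext is a hypothetical helper name, and it assumes the file name itself contains a dot (an extension). It also matches only the two-letter suffixes that split uses by default, so other files starting with the same prefix are left alone.

```shell
#!/bin/sh
# Sketch combining steps 1-4; split_keep_ext is a hypothetical helper name.
# Assumes the file name has an extension (contains a dot).
split_keep_ext() {
    file=$1
    chunk=$2                          # piece size in bytes
    filename=${file%.*}               # step 1: path without the last extension
    extension=${file#"$filename"}     # step 2: the extension, e.g. ".txt"
    # step 3: split; stop here if split fails
    split -b "$chunk" "$file" "${filename}_" || return 1
    # step 4: rename only the pieces with split's default two-letter suffix
    for piece in "${filename}"_[a-z][a-z]; do
        [ -e "$piece" ] || continue
        mv "$piece" "${piece}${extension}"
    done
}
```

For example, split_keep_ext 99I999877_ABCD.txt 5242800 should leave pieces named 99I999877_ABCD_aa.txt, 99I999877_ABCD_ab.txt and so on, with the original file untouched.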

# 7  
Old 07-23-2009
thanks
 