Merge files into groups of 10000


 
# 1  
Old 04-27-2011

Hi Guys,

First post! I've seen a few options but don't know which is the most efficient.

I have a directory with 150,000+ text files in it.

I want to merge them into larger files, each containing the contents of 10,000 of the original files, with a carriage return between them.


Thanks

P

The following is an example, but it doesn't limit the number of files to merge:

$ paste -d "\n" * >> x.txt
# 2  
Old 04-28-2011
Code:
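#!/bin/ksh
# Concatenate the files in the current directory into bigfile.dat1,
# bigfile.dat2, ... with 10000 source files per output file.
cnt=0
i=1
base=bigfile.dat
# No glob here: "ls -1 *" can fail with "argument list too long"
ls -1 > "$HOME/listfile"
while read -r fname
do
     cat "$fname" >> "${base}${i}"
     echo "" >> "${base}${i}"        # blank line between source files
     cnt=$(( cnt + 1 ))
     # after every 10000 files, move on to the next output file
     [ $(( cnt % 10000 )) -eq 0 ] && i=$(( i + 1 ))
done < "$HOME/listfile"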

This will make the files bigfile.dat1 ... bigfile.dat15 (plus bigfile.dat16 if there are more than 150,000 input files).

Most of your problems with 'efficiency' come from having far too many files in one directory. ls takes a long time to return, so there is a delay before the loop starts. You should rebuild the directory after you consolidate the files into fifteen or so big files.
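If ls is the slow part, the same loop can be fed by find, which streams the names instead of building one huge argument list. The following is only a rough sketch of that idea, not a tested replacement: merge_in_groups and its three arguments are made-up names, it recurses into subdirectories, and it assumes no file name contains a newline. Point the output prefix at a location outside the source directory so the merged files are not picked up as input.

```shell
#!/bin/sh
# merge_in_groups SRC_DIR OUT_PREFIX GROUP_SIZE
# Hypothetical helper (not from the thread): appends every regular file
# under SRC_DIR to OUT_PREFIX1, OUT_PREFIX2, ... with GROUP_SIZE source
# files per output file and a blank line after each source file.
merge_in_groups() {
    src=$1
    prefix=$2
    size=$3
    cnt=0
    i=1
    # find streams the names; sort makes the grouping repeatable
    find "$src" -type f | sort | while IFS= read -r fname
    do
        cat "$fname" >> "${prefix}${i}"
        echo "" >> "${prefix}${i}"
        cnt=$((cnt + 1))
        # switch to the next output file after every $size inputs
        if [ $((cnt % size)) -eq 0 ]; then
            i=$((i + 1))
        fi
    done
}
```

For the case in this thread it would be called from outside the source directory as something like: merge_in_groups /path/to/textfiles /path/to/output/bigfile.dat 10000 (both paths are placeholders).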
# 3  
Old 05-02-2011
After you run the awk command below, files named bigfile.NUMBER are created under /tmp. Change /tmp in the command if you want the output somewhere else.

Code:
awk -v file=1 'FNR==1{if (cnt && cnt%10000==0) {close(out); file++}   # next group after every 10000 files
                out="/tmp/bigfile." file; cnt++; print "" > out}     # blank line before each source file
        {print > out}' *
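With 150,000+ names, the * on the awk command line may be rejected by the system with "argument list too long". One possible way around that, sketched here under assumptions, is to pipe the names into awk and let awk open each file itself with getline. merge_with_awk and its arguments are hypothetical names, the blank line is written after each source file rather than before it, and file names containing newlines are not handled.

```shell
#!/bin/sh
# merge_with_awk SRC_DIR OUT_PREFIX GROUP_SIZE
# Hypothetical helper (not from the thread): awk reads one file NAME per
# input line and copies that file's contents to OUT_PREFIX1, OUT_PREFIX2, ...
merge_with_awk() {
    find "$1" -type f | sort | awk -v prefix="$2" -v size="$3" '
        {
            # start a new output file every `size` input files
            if ((NR - 1) % size == 0) {
                if (out != "") close(out)
                out = prefix (++n)
            }
            src = $0
            # copy the named file line by line, then add a blank separator
            while ((getline line < src) > 0) print line > out
            close(src)
            print "" > out
        }'
}
```

Example call, with placeholder paths: merge_with_awk /path/to/textfiles /tmp/bigfile. 10000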

 