Copy files in Parallel


 
# 8  
Old 01-25-2010
Hi,

setsize is the total number of files divided by the number of parallel processes, plus 1. I added 1 so that you never end up with more batches than the number of parallel processes. In this case the workload is split into 3 and 1, but for larger numbers it works out more evenly.

wc -w counts the entries, although wc -l is perhaps more appropriate in this case.
xargs cuts the set of files into more or less equal parts.
Each cp process copies the files in one of these parts to the target directory.
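
The script itself is on the previous page of the thread, so the following is only a minimal sketch of the splitting step, not the original code. It assumes 4 hypothetical file names and 2 parallel processes, which would give the 3-and-1 split mentioned above; echo stands in for cp so that each batch is printed on its own line.

```shell
#!/bin/sh
# Hypothetical values for illustration: 4 files, 2 parallel processes.
files="a.txt b.txt c.txt d.txt"
nproc=2

# wc -w counts the whitespace-separated entries.
total=$(echo $files | wc -w)

# Integer division plus 1, as described in the post.
setsize=$(( $total / $nproc + 1 ))

# xargs -n passes at most $setsize arguments to each command it runs.
# echo stands in for cp, so every output line is one batch.
batches=$(echo $files | xargs -n "$setsize" echo)
echo "setsize=$setsize"
echo "$batches"
```

Note that xargs -n on its own only groups the arguments; it runs the batches one after another, so the parallelism has to come from somewhere else, such as the & described next.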

The & puts the copy processes in the background, and the wait command waits for them to finish. So they are not multiple threads but multiple processes.
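
As a rough illustration of the & and wait pattern (again a sketch, not the original script), the following copies two batches with two background cp processes. The scratch directories and file names are made up for the example.

```shell
#!/bin/sh
# Hypothetical scratch directories and files, for illustration only.
src=$(mktemp -d)
dst=$(mktemp -d)
touch "$src/a" "$src/b" "$src/c" "$src/d"

# The trailing & starts each cp as a separate background process.
cp "$src/a" "$src/b" "$src/c" "$dst" &
pid1=$!
cp "$src/d" "$dst" &
pid2=$!

# wait blocks until all background children of this shell have exited.
wait

copied=$(ls "$dst" | wc -l)
echo "copied $copied files using processes $pid1 and $pid2"
```

Each cp has its own process ID, which is what makes these processes rather than threads.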

The script breaks with file names that contain spaces. I'll fix that when I have more time.
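
Until then, one common workaround (not necessarily the fix intended here) is to pass the names NUL-delimited. This sketch assumes a find with -print0 and an xargs with -0 and -P, which GNU and BSD versions provide but POSIX does not require; the directories, file names, and batch sizes are hypothetical.

```shell
#!/bin/sh
# Hypothetical scratch directories and file names, for illustration only.
src=$(mktemp -d)
dst=$(mktemp -d)
touch "$src/plain.txt" "$src/with space.txt" "$src/two  spaces.txt"

nproc=2     # assumed number of parallel cp processes
setsize=2   # assumed number of files per batch

# -print0 and -0 delimit names with NUL bytes, so spaces are preserved.
# -P runs up to $nproc batches at once; xargs waits for all of them.
# Inside sh -c, $0 is the target directory and "$@" is one batch of files.
find "$src" -type f -print0 |
  xargs -0 -n "$setsize" -P "$nproc" sh -c 'cp "$@" "$0"' "$dst"

count=$(ls "$dst" | wc -l)
echo "copied $count files"
```

Here xargs does both the splitting and the parallel launching, so no explicit & or wait is needed.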

S.
# 9  
Old 01-26-2010
Code:
wc -w

is used to get total number of files.

The tricky part I did not understand and I am waiting for Scrutinizer is the code

Code:
| xargs -n $setsize |

How does this divide the file set and spawn the appropriate parallel threads?

I am learning how to use this forum. I can now use code tags

---------- Post updated at 02:46 PM ---------- Previous update was at 02:40 PM ----------

Scrutinizer,
Thanks for your explanation. I just saw your response; it flipped over to page two and I was waiting like a fool on page one.

Question: can you please explain this?
So they are not multiple threads but multiple processes.

So what is the difference between a thread and a process ? Are they not the same ?
# 10  
Old 01-26-2010
Quote:
Originally Posted by simonsimon
So what is the difference between a thread and a process ? Are they not the same ?
Multiple threads means multiple execution contexts inside the same process. This puts less weight on the operating system, since it doesn't have to switch process contexts as often, but it is harder on the programmer, who has to worry about race conditions etc. You can't do threads from the shell.

Multiple processes are just that: separate processes, each with its own address space.

They can both run several things in parallel.
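
A small sketch of what that separation means in the shell: a background subshell is its own process, so a variable it changes is only changed in its own copy. The variable name is arbitrary.

```shell
#!/bin/sh
counter=0

# The parentheses plus & run this assignment in a separate background process.
( counter=99 ) &

# Wait for the background process to finish.
wait

# The parent's copy is untouched, because processes do not share memory.
echo "counter is still $counter"
```

Threads inside one process would see the same variable, which is where the race conditions come from.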
# 11  
Old 01-26-2010
I would use a tool like rsync or unison, which are optimized for this kind of use and for making backups.

Try this:

rsync -avz /data4/dbx /backup/dbx

Last edited by jostber; 01-26-2010 at 06:47 PM..
 