Copy files in Multiple Threads


 
02-18-2010

Hello all,

I have a directory of files of varying sizes.

I want to copy all of these files to another directory using n threads, such that each thread's copy set is more or less the same total size.

Example:

Say /mydirA has around 23 files of various sizes.

Number of copy threads = 3

Total size of /mydirA = 25 GB

So each thread should copy files whose sizes add up to about 25/3 ~ 8 GB.

So I need to group the files by size for each thread, such that each group adds up to about 8 GB:

Thread 1 --> 8 GB (could be 11 files that add up to 8 GB)

Thread 2 --> 8 GB (could be 5 files that add up to 8 GB)

Thread 3 --> 9 GB (could be 8 files that add up to 8 or 9 GB)

I want roughly equal copy sets per thread. It is also possible that, even though I ask for 3 threads of equal size, there are too few files for all 3 threads to reach the 8 GB copy set size. In that case, at least try to fill each thread's copy set as closely as possible.

All files need to go from /mydirA to /mydirB in N threads, where the target size for each thread is (total size of directory)/N. Each thread may end up with a different number of files, depending on which file sizes add up to its target.
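
One possible way to approach the grouping described above is a greedy heuristic: sort the files from largest to smallest, then hand each file to whichever thread's set is currently the lightest. The Python sketch below is only an illustration under stated assumptions. The function names (partition_by_size, copy_set, threaded_copy) are made up for this example, it only handles regular files directly under the source directory, and the greedy split is not guaranteed to be the most even one possible.

```python
import heapq
import shutil
from concurrent.futures import ThreadPoolExecutor
from pathlib import Path


def partition_by_size(files, n):
    """Split (path, size) pairs into n sets of roughly equal total size.

    Greedy heuristic: take files largest first and always add the next
    file to the set whose running total is currently the smallest.
    """
    # Heap entries are (total_bytes, set_index); the lightest set is on top.
    heap = [(0, i) for i in range(n)]
    sets = [[] for _ in range(n)]
    for path, size in sorted(files, key=lambda f: f[1], reverse=True):
        total, idx = heapq.heappop(heap)
        sets[idx].append(path)
        heapq.heappush(heap, (total + size, idx))
    return sets


def copy_set(paths, dest):
    """Copy one set of files into dest, one file after another."""
    for p in paths:
        shutil.copy2(p, dest / p.name)
    return len(paths)


def threaded_copy(src, dest, n):
    """Copy the regular files directly under src into dest using n threads.

    Returns the number of files copied by each thread.
    """
    src, dest = Path(src), Path(dest)
    dest.mkdir(parents=True, exist_ok=True)
    files = [(p, p.stat().st_size) for p in src.iterdir() if p.is_file()]
    sets = partition_by_size(files, n)
    with ThreadPoolExecutor(max_workers=n) as pool:
        # list() waits for every thread and re-raises any copy error.
        return list(pool.map(lambda s: copy_set(s, dest), sets))
```

With the example from the post, the call would look like threaded_copy("/mydirA", "/mydirB", 3). If there are fewer files than threads, some sets simply stay empty, which matches the "fill as far as possible" requirement.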
 
cvcp(1)                                                                cvcp(1)

NAME
       cvcp - Xsan Copy Utility

SYNOPSIS
       cvcp [options] Source Destination

DESCRIPTION
       cvcp provides a high speed, multi-threaded copy mechanism for copying directories onto and off of an Xsan volume. The utility uses IO strategies and multi-threading techniques that exploit the Xsan IO model.

       cvcp can work in many modes:

              Directory-to-directory copies of regular files.
              Directory copy of regular files to a Vtape virtual sub-directory.
              Single File-to-File copy.

       In terms of functionality for regular files, cvcp is much like the tar(1) utility. However, when copying a directory to a Vtape virtual directory, cvcp can rename and renumber the source images as they are being transferred. The files in the <Source> directory must have a decipherable numeric sequence imbedded in their names.

       The cvcp utility was written to provide high performance data movement, therefore, unlike utilities such as rsync, it does not write data to temporary files or manipulate the target files' modification times to allow recovery of partially-copied files when interrupted. Because of this, cvcp may leave partially-copied files if interrupted by signals such as SIGINT, SIGTERM, or SIGHUP. Partially-copied target files will be of the same size as source files; however, the data will be only partially copied into them.

USAGE
       The <Source> parameter determines whether to copy a single file or use a directory scan. <Source> must be a directory or file name. Using cvcp for directory copies is best accomplished by cd'ing to the <Source> directory and using the dot (.) as the <Source>. This has been shown to improve performance since fewer paths are searched in the directory tree scan.

       The <Destination> parameter determines the target file or directory.

OPTIONS
       -A
              If specified, will turn off the pre-allocation feature. This feature looks at the size of the source file and then makes an ALLOCSPACE call to the file system. This pre-allocation is a performance advantage as the file will only contain a single extent. It also promotes volume space savings since files that are dynamically expanded do so in a more coarse manner. Up to 30% savings in physical disk space can be seen using the pre-allocation feature. NOTE: Non-Xsan file systems that do not support pre-allocation will turn pre-allocation off when writing. The default is to have the pre-allocation feature on.

       -b <buffers>
              Set the number of IO buffers to <buffers>. The default is two times the number of copy threads started (see the -t option). Experimenting with other values between 1 and 2 times the number of copy threads may yield performance improvements.

       -d
              Changes directory-to-directory mode to work more like cp -R. Without -d, cvcp copies the files and sub-directories under Source to the Destination directory. With -d, cvcp first creates a sub-directory called Source in the Destination directory, then copies the files and sub-directories under Source to that new sub-directory.

       -k <buffer_size>
              Set the IO buffer size to <buffer_size> bytes. The default buffer size is 4MB.

       -l
              If set, copy the target of symbolic links rather than copying the link itself.

       -n
              If set, do not recurse into any sub-directories.

       -p <source_prefix>
              If set, only copy files whose beginning file name characters match <source_prefix>. The matching test only checks starting at character one.

       -s
              The -s option forces allocations to line up on the beginning block modulus of the storage pool. This can help performance in situations where the I/O size perfectly spans the width of the storage pool's disks.

       -t <num_threads>
              Set the number of copy threads to <num_threads>. The default is 4 copy threads. This option may have a significant impact on speed and resource consumption. The total copy buffer pool size is calculated by multiplying the number of buffers (-b) by the buffer size (-k). Experimenting with the -t option along with the -b and -k options is encouraged.

       -u
              Update only. If set, copies only when the source file is newer than the destination file or the destination file does not exist. Note that file access times have a granularity of only one second, so it is possible for a source file to be copied over a destination file even though -u is used. -u cannot be used with tar files or with -z.

       -v
              Be verbose about the files being copied.

       -x
              If set, ignore umask(1) and retain original permissions from the source file. If the super-user, set sticky and setuid/gid bits as well.

       -y
              If set by the super-user, retain ownership and group information. If the user is not the super-user then this option is silently ignored.

       -z
              If set, retain original modification times. Cannot be used with -u.

EXAMPLES
       Copy directory abc and its sub-directories to directory /usr/clips/foo. This copy will use the default number of copy threads and buffers. The total buffer pool size will total 24MB (6 buffers @ 4MB each). Retain all permissions and ownerships. Show all files being copied.

              rock% cvcp -vxy abc /usr/clips/foo

       Copy the same directory the same way, but only those files that start with mumblypeg.

              rock# cvcp -vxy -p mumblypeg abc /usr/clips/foo

       Copy a single file def to the directory /usr/clips/foo/

              rock# cvcp def /usr/clips/foo

       Copy a file sequence in the current directory prefixed with secta with a range from 200 to 300. Place the files into the Vtape /usr/clips/n8 YUV sub-directory. Set the target frame to 500. Use the verbose option.

              rock% cvcp -v -f 500 -p secta -r 200-300 . /usr/clips/n8/yuv

CVCP TUNING
       cvcp can be tuned to improve performance and resource utilization. By adjusting the -t, -k and -b options cvcp can be optimized for any number of different environments.

       -t
              Increasing the number of copy threads will increase the number of concurrent copies. This option is useful when copying large directory structures. Single file copies are not affected by the number of copy threads.

       -b
              The number of copy buffers should be set to a number between 1 and 3 times the number of copy threads. Increasing the number of copy buffers increases the amount of work that is queued up waiting for an available copy thread, but also increases resource consumption.

       -k
              The size of the copy buffer may be tuned to fit the I/O characteristics of a copy. If files smaller than 4MB are being copied, performance may be improved by reducing the size of copy buffers to more closely match the source file sizes.

       NOTE: It is important to ensure that the resource consumption of cvcp is tuned to minimize the effects of system memory pressure. On systems with limited available physical memory, performance may be increased by reducing the resource consumption of cvcp.

SEE ALSO
       cvfs(1)

Xsan File System                  March 2008                           cvcp(1)
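
The buffer pool rule stated in the cvcp page above (number of buffers given by -b, multiplied by the buffer size given by -k) can be checked with a few lines of arithmetic. This is a rough Python sketch, not part of cvcp itself: the function names are invented for illustration, and it assumes that "4MB" means 4 * 1024 * 1024 bytes.

```python
MIB = 1024 * 1024  # assumption: the page's "MB" is treated as 2**20 bytes


def default_buffers(num_threads):
    """Default -b value as described under OPTIONS: two buffers per copy thread."""
    return 2 * num_threads


def pool_size_bytes(num_buffers, buffer_size=4 * MIB):
    """Total copy buffer pool: number of buffers (-b) times buffer size (-k)."""
    return num_buffers * buffer_size


# The figure quoted under EXAMPLES: 6 buffers at 4MB each.
print(pool_size_bytes(6) // MIB, "MB")
```

Note that the documented defaults (-t 4, hence 8 buffers) give a larger pool under the same formula than the 6 buffers quoted under EXAMPLES, so the sketch takes the buffer count as a parameter instead of assuming one.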