Copying Thousands of Tiny or Empty Files? Post 302311391 by jim mcnamara on Tuesday 28th of April 2009 01:48:52 PM
Can you run several rsync processes in parallel, dividing the source and destination trees among them?

Code:
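# Trailing slash on the source copies dir1's contents, not dir1 itself.
rsync -auvlxHS /source_dir/dir1/ /dest_dir/dir1/ &
rsync -auvlxHS /source_dir/dir2/ /dest_dir/dir2/ &
rsync -auvlxHS /source_dir/dir3/ /dest_dir/dir3/ &
wait    # block until all three background copies finish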
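
If there are too many subdirectories to list by hand, something along these lines might do it. This is only a sketch: sync_subdirs is a made-up helper name, the default of 4 jobs is a guess, and it assumes a find and xargs that support -print0, -0 and -P (GNU and BSD do, strict POSIX does not).

```shell
#!/bin/sh
# Sketch: one rsync per top-level subdirectory, a limited number at a time.
# sync_subdirs is a hypothetical helper, not part of rsync.
# usage: sync_subdirs SRC DEST [JOBS]
sync_subdirs() {
    src=$1 dest=$2 jobs=${3:-4}    # 4 parallel jobs is an assumed default
    mkdir -p "$dest" || return 1
    # List the immediate subdirectories of $src, NUL-separated so odd
    # names survive, and give each one to its own rsync process.
    ( cd "$src" && find . -mindepth 1 -maxdepth 1 -type d -print0 ) |
        xargs -0 -P "$jobs" -I{} rsync -axHS "$src/{}/" "$dest/{}/"
    # Files sitting directly in $src are not covered by the step above;
    # copy them in one final pass that skips every top-level directory.
    rsync -axHS --exclude='/*/' "$src/" "$dest/"
}
```

You would call it as: sync_subdirs /source_dir /dest_dir 4. Whether more processes actually help depends on where the bottleneck is, so try a couple of values for the job count.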

Creating lots of files and directories costs substantially more filesystem overhead (inode allocation, directory updates) than writing the same data to an existing file. You may want to do some serious filesystem tuning on the destination box, particularly on the /dest_dir filesystem.

Huge numbers of files in a single directory really bog things down, too. For example, readdir() takes a lot longer to complete a full scan of such a directory.
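
To see whether any one directory is the problem, you could count entries per directory with something like the following. It is a rough sketch: largest_dirs is a made-up name, and the count is off for file names that contain newlines.

```shell
#!/bin/sh
# Sketch: print the directories under ROOT that hold the most entries,
# as "count<TAB>path", largest first. largest_dirs is a hypothetical helper.
# usage: largest_dirs ROOT [HOW_MANY]
largest_dirs() {
    root=$1 top=${2:-10}
    # -xdev keeps find on the filesystem that ROOT lives on.
    find "$root" -xdev -type d -exec sh -c '
        for d do
            # ls -A lists dotfiles too but leaves out . and ..
            n=$(ls -A "$d" | wc -l | tr -d " ")
            printf "%s\t%s\n" "$n" "$d"
        done' sh {} + | sort -rn | head -n "$top"
}
```

For example: largest_dirs /source_dir 5. Note that the scan itself has to read every directory, so it will be slow on exactly the directories you are worried about.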

What OS?
 

PARALLEL-RSYNC(1)                                            PARALLEL-RSYNC(1)

NAME
       parallel-rsync - deploy files to listed hosts

SYNOPSIS
       parallel-rsync [OPTIONS] -h hosts.txt local remote

DESCRIPTION
       pssh provides a number of commands for executing against a group of
       computers, using SSH. It's most useful for operating on clusters of
       homogeneously-configured hosts. parallel-rsync deploys files to all
       hosts you listed.

OPTIONS
       -r --recursive   recursively copy directories (OPTIONAL)
       -a --archive     use rsync -a (archive mode) (OPTIONAL)
       -z --compress    use rsync compression (OPTIONAL)
       -h --hosts       hosts file (each line "host[:port] [user]")
       -l --user        username (OPTIONAL)
       -p --par         max number of parallel threads (OPTIONAL)
       -o --outdir      output directory for stdout files (OPTIONAL)
       -e --errdir      output directory for stderr files (OPTIONAL)
       -t --timeout     timeout (secs) (-1 = no timeout) per host (OPTIONAL)
       -O --options     SSH options (OPTIONAL)
       -v --verbose     turn on warning and diagnostic messages (OPTIONAL)

EXAMPLE
       # parallel-rsync -r -h hosts.txt -l irb2 foo /home/irb2/foo

ENVIRONMENT
       All four programs take similar sets of options. All of these options
       can be set using the following environment variables:

       o PSSH_HOSTS
       o PSSH_USER
       o PSSH_PAR
       o PSSH_OUTDIR
       o PSSH_VERBOSE
       o PSSH_OPTIONS

SEE ALSO
       parallel-ssh(1), parallel-scp(1), parallel-slurp(1), parallel-nuke(1),
       ssh(1), rsync(1)

AUTHOR
       Brent N. Chun <bnc@theether.org>

COPYING
       Copyright: 2003, 2004, 2005, 2006, 2007 Brent N. Chun

NOTES
        1. bnc@theether.org
           mailto:bnc@theether.org

03/30/2009                                                   PARALLEL-RSYNC(1)