UNIX for Advanced & Expert Users: Copying Thousands of Tiny or Empty Files? Post 302311412 by deckard on Tuesday 28th of April 2009 02:46:04 PM
I could try running multiple instances; that's a good idea, at least to test whether it gives any speed increase over the single rsync process. The OS itself is HP-UX 11.11, but we expect to be moving to 11.30 soon-ish. The filesystem is vxfs, and it was created with the 'largefiles' option because we also have files that are 8 to 12 gigs in size.

The application uses the small/empty files as some kind of "label" for information in a database that needs to be changed in an indexing process. I'm not clear on the details, as that portion isn't my responsibility, but I've been told that the files are necessary. So I'm hoping to increase the speed of the transfer instead. However, tuning the FS might not be workable, since I need both large files and these small/empty ones.

To add to that, even when it is a true sync rather than a full copy, these empty files are always different, so it winds up being a full copy anyway. The files are deleted and new ones are created daily during the week.

Quote:
Originally Posted by jim mcnamara
Can you run multiple 'threads' of rsync - divide up the source tree and dest tree among several rsync processes?

Code:
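# Trailing slashes copy each directory's contents instead of nesting dir1 inside dir1.
# '&' runs the three copies concurrently; 'wait' blocks until all of them finish.
rsync -auvlxHS /source_dir/dir1/ /dest_dir/dir1/ &
rsync -auvlxHS /source_dir/dir2/ /dest_dir/dir2/ &
rsync -auvlxHS /source_dir/dir3/ /dest_dir/dir3/ &
wait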

When you create lots of files and directories there is substantially more filesystem overhead than just writing to an existing file. You may want to do some serious filesystem tuning on the destination box, particularly the /dest_dir filesystem.

Also, having huge numbers of files in a single directory really bogs things down as well. readdir() takes a lot longer to complete a full scan of a directory for example...

What OS?
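
The suggestion quoted above, dividing the tree among several rsync processes, could be scripted roughly as follows. This is only a sketch under assumptions: the function name parallel_rsync, the paths in the usage line, and the default of 3 concurrent jobs are illustrative and not from the thread. It keeps the quoted flags except -v and -l (-l is already implied by -a, and -v prints a line per file, which adds up with thousands of files). Whether running copies in parallel actually helps on a given disk and filesystem is something to measure, not assume.

```shell
#!/bin/sh
# Sketch: run one rsync per top-level subdirectory of a source tree,
# at most max_jobs at a time. All names and defaults are illustrative.

parallel_rsync() {
    src=$1
    dest=$2
    max_jobs=${3:-3}    # assumed default, tune by measurement
    pids=""
    count=0
    status=0

    for dir in "$src"/*/; do
        [ -d "$dir" ] || continue    # glob matched nothing, or not a directory
        name=$(basename "$dir")
        mkdir -p "$dest/$name"
        # "$dir" ends in a slash, so the directory's contents are copied
        # into $dest/$name rather than creating $dest/$name/$name.
        rsync -auxHS "$dir" "$dest/$name/" &
        pids="$pids $!"
        count=$((count + 1))

        # Once a batch is full, wait for every job in it before starting more.
        if [ "$count" -ge "$max_jobs" ]; then
            for pid in $pids; do
                wait "$pid" || status=1
            done
            pids=""
            count=0
        fi
    done

    # Wait for the last, possibly partial, batch.
    for pid in $pids; do
        wait "$pid" || status=1
    done

    # Note: files sitting directly in $src (outside any subdirectory)
    # are not handled by this sketch.
    return "$status"
}
```

It would be called as, for example, parallel_rsync /source_dir /dest_dir 3, and it returns non-zero if any of the rsync processes failed. Batching is the simplest way to cap concurrency in plain sh; one slow directory holds up its whole batch, so a tree with very uneven directory sizes may need a different split.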
 
