Post 302311383 by deckard on Tuesday 28th of April 2009 01:31:03 PM
Copying Thousands of Tiny or Empty Files?

At work I regularly have to synchronize file systems. The source file system always has three or four directories containing hundreds of thousands of tiny (1 KB or smaller) or empty files. Whenever my rsync command reaches these directories, I wait for hours for those files to finish copying. Is there any way to reduce the time it takes to copy them?

The files are generated by an application that definitely needs them, and I'm in no position to dispense with them. I wondered about trying to 'tar' the directories first, but I suspect that if I do, I'll merely be moving the time spent copying them during rsync to the time spent creating the archive in the first place.

My rsync command is pretty basic:

Code:
rsync -auvlxHS /source_dir/ /dest_dir/

Usually /dest_dir/ is a new, empty file system so it really is a full copy, but sometimes there are actual synchronizations done. However, if there's a better approach than my rsync, I'd like to know.
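
For the full-copy case, here is a minimal sketch of the 'tar' idea that streams the tree through a pipe, so no intermediate archive is ever written to disk. The function name copy_tree is made up for illustration, and whether this actually beats rsync on these directories is an assumption that would need to be measured on the real data.

```shell
#!/bin/sh
# Sketch only: stream a directory tree from one place to another with tar,
# without creating an archive file in between.
# copy_tree is a hypothetical helper name, not part of rsync or tar.
copy_tree() {
    src=$1
    dst=$2

    # Make sure the destination exists before unpacking into it.
    mkdir -p "$dst" || return 1

    # The first tar writes the archive to stdout ("-f -");
    # the second tar reads it from stdin and unpacks it.
    # -p asks the extracting tar to preserve file permissions.
    (cd "$src" && tar -cf - .) | (cd "$dst" && tar -xpf -)
}
```

This would be called as copy_tree /source_dir /dest_dir when /dest_dir/ is new and empty. The sketch does not try to reproduce every option of the rsync command above (for example -x or -S), so running the original rsync afterwards is a reasonable way to verify the result and handle the real synchronizations.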
 

MRB(8)                      System Manager's Manual                      MRB(8)

NAME
       mrb - manage incremental snapshots with rsync/make.

SYNOPSIS
       mrb command

DESCRIPTION
       mrb is a simple aid to creating efficient incremental snapshots of a
       set, or sets, of directories whenever that may be required. It may be
       used as part of a regular automated backup regime, or for manually
       checkpointing changes at convenient points in time.

COMMANDS
       The following commands are recognised (where 'MODULE' is the name of
       one of your snapshot definitions):

       new-MODULE
              Create a skeleton definition for a new snapshot 'MODULE'.

       dest-MODULE
              Create the destination dir for 'MODULE'. This directory must
              exist to create a snapshot.

       snap-MODULE
              Create a snapshot of 'MODULE'.

       sync   Create snapshots of all defined modules. If run as root this
              may be configured to include the modules of other users too
              (see MRB_SYNC_USERS in ~/.mrb/defaults).

       help   Show mrb's own help text.

CONFIGURATION FILES
   Per-user configuration
       The following files may be used to specify global and local
       configuration options.

       /etc/default/mrb
              system default configuration.

       ~/.mrb/defaults
              per-user configuration.

   Per-user options
       The following options control behaviour for all of a user's modules.

       MRB_SNAPSHOT_LOG
              An optional file path where transfer details will be recorded.
              If unset these details will not be logged.

       MRB_SYNC_USERS
              A space separated list of users whose modules should be
              included in a sync. This is mostly only useful for root, as mrb
              will assume the identity of each user before creating snapshots
              of their modules. If unset, only the invoking user's modules
              will be sync'ed.

       MRB_CONFDIR
              A space separated list of the directories to search for module
              definition (*.mrc) files. They will be searched in the order
              given, with new modules added by default to the last one
              listed. There should be few reasons to change the default
              value.

   Per-module configuration
       The default MRB_CONFDIR value will search for module definitions in:

              /etc/mrb/*.mrc
              ~/.mrb/*.mrc

       Those created by new-MODULE will be placed in this latter location by
       default.

   Per-module options
       In each case module below is the name of the particular module that
       the value set should apply to. These options should be defined in a
       file named module.mrc.

       module_SRC
              A space separated list of the files and (top level) directories
              to include in the snapshots for this module.

       module_DEST
              The directory root where snapshots of module should be stored.

       module_INCLUDE
              An optional list of rsync(1) include patterns.

       module_EXCLUDE
              An optional list of rsync(1) exclude patterns.

       module_FILTER
              An optional list of rsync(1) filter patterns.

       module_FILTER_FILE
              An optional filename for rsync(1) dir-merge filtering support.

       module_RSYNC_OPTIONS
              Optional additional rsync(1) options to pass verbatim when it
              is invoked.

       module_PRECOMMAND
              An optional shell command to invoke just prior to creating a
              new snapshot. If the command does not return a successful exit
              status, then the snapshot creation will be aborted before it
              begins. It may be used to mount removable media or similar.

       module_POSTCOMMAND
              An optional shell command to execute after making the snapshot.
              It will not be called if the snapshot creation failed at an
              earlier stage, and its return status may halt a sync operation
              if it fails with more modules still to process. It may be used,
              for example, to unmount removable media again.

       module_USER
              An optional user name to check that mrb is running as before
              performing a snapshot. This can be used to ensure you have the
              correct permission to access the files being mirrored before
              you get too far.

SEE ALSO
       rsync(1), make(1).

AUTHOR
       mrb was written by Ron <ron@debian.org>.

                                  May 9, 2006                           MRB(8)