Sponsored Content
Top Forums UNIX for Advanced & Expert Users Copying Thousands of Tiny or Empty Files? Post 302311383 by deckard on Tuesday 28th of April 2009 01:31:03 PM
Old 04-28-2009
Copying Thousands of Tiny or Empty Files?

There is a procedure I do here at work where I have to synchronize file systems. The source file system always has three or four directories of hundreds of thousands of tiny (1k or smaller) or empty files. Whenever my rsync command reaches these directories, I'm waiting for hours for those files to finish copying. Is there any way to decrease the time it takes for those files to be copied?

The files are generated by an application that definitely needs them, and I'm in no position to dispense with them. I wondered about trying to 'tar' the directories first, but I suspect that if I do, I'll merely be moving the time spent copying them during rsync to the time spent to create the archive in the first place.

My rsync command is pretty basic:

Code:
rsync -auvlxHS /source_dir/ /dest_dir/

Usually /dest_dir/ is a new, empty file system so it really is a full copy, but sometimes there are actual synchronizations done. However, if there's a better approach than my rsync, I'd like to know.
 

5 More Discussions You Might Find Interesting

1. Shell Programming and Scripting

Finding a specific pattern from thousands of files ????

Hi All, I want to find a specific pattern from approximately 400000 files on solaris platform. Its very heavy for me to grep that pattern to each file individually. Can anybody suggest me some way to search for specific pattern (alpha numeric) from these forty thousand files. Please note that... (6 Replies)
Discussion started by: aarora_98
6 Replies

2. Shell Programming and Scripting

trnsmiting thousands ftp files and get an error message

Im transmiting thousands ftp files to a server, when type the command mput *, an error comes and say. args list to long. set to I. So ihave to transmit them in batch or blocks, but its too sloww. what shoul i do?. i need to do a program, or with a simple command i could solve the problem? (3 Replies)
Discussion started by: alexcol
3 Replies

3. Shell Programming and Scripting

help to parallelize work on thousands of files

I need to find a smarter way to process about 60,000 files in a single directory. Every night a script runs on each file generating a output on another directory; this used to take 5 hours, but as the data grows it is taking 7 hours. The files are of different sizes, but there are 16 cores... (10 Replies)
Discussion started by: vhope07
10 Replies

4. Shell Programming and Scripting

Search for patterns in thousands of files

Hi All, I want to search for a certain string in thousands of files and these files are distributed over different directories created daily. For that I created a small script in bash but while running it I am getting the below error: /ms.sh: xrealloc: subst.c:5173: cannot allocate... (17 Replies)
Discussion started by: danish0909
17 Replies

5. Shell Programming and Scripting

Bash-awk to process thousands of files

Hi to all, I have thousand of files in a folder with names with format "FILE-YYYY-MM-DD-HHMM" for what I want to send the following AWK command awk '/Code.*/' FILE-2014* I'd like to separate all files that have the same date to a folder named with the corresponding date. For example, if I... (7 Replies)
Discussion started by: Ophiuchus
7 Replies
rsync_selinux(8)					rsync Selinux Policy documentation					  rsync_selinux(8)

NAME
rsync_selinux - Security Enhanced Linux Policy for the rsync daemon DESCRIPTION
Security-Enhanced Linux secures the rsync server via flexible mandatory access control. FILE_CONTEXTS SELinux requires files to have an extended attribute to define the file type. Policy governs the access daemons have to these files. If you want to share files using the rsync daemon, you must label the files and directories public_content_t. So if you created a special directory /var/rsync, you would need to label the directory with the chcon tool. chcon -t public_content_t /var/rsync To make this change permanent (survive a relabel), use the semanage command to add the change to file context configuration: semanage fcontext -a -t public_content_t "/var/rsync(/.*)?" This command adds the following entry to /etc/selinux/POLICYTYPE/contexts/files/file_contexts.local: /var/rsync(/.*)? system_u:object_r:publix_content_t:s0 Run the restorecon command to apply the changes: restorecon -R -v /var/rsync/ SHARING FILES
If you want to share files with multiple domains (Apache, FTP, rsync, Samba), you can set a file context of public_content_t and pub- lic_content_rw_t. These context allow any of the above domains to read the content. If you want a particular domain to write to the pub- lic_content_rw_t domain, you must set the appropriate boolean. allow_DOMAIN_anon_write. So for rsync you would execute: setsebool -P allow_rsync_anon_write=1 BOOLEANS
system-config-selinux is a GUI tool available to customize SELinux policy settings. AUTHOR
This manual page was written by Dan Walsh <dwalsh@redhat.com>. SEE ALSO
selinux(8), rsync(1), chcon(1), setsebool(8), semanage(8) dwalsh@redhat.com 17 Jan 2005 rsync_selinux(8)
All times are GMT -4. The time now is 02:23 PM.
Unix & Linux Forums Content Copyright 1993-2022. All Rights Reserved.
Privacy Policy