UNIX for Dummies Questions & Answers: Using UNIX Commands with Larger number of Files
Post 302729585 by Don Cragun on Saturday 10th of November 2012 12:07:07 PM
Quote:
Originally Posted by RudiC
May I advise against executing a script within a find command? This would include, for every file found, creating a shell to run the script, implying a huge overhead, esp. in this case with "Large Number of Files". Why not collect all filenames in a working file and then work on that?
The whole point of my suggestion was to use the script to rearrange the arguments passed to a simple command, so that find's -exec primary can exec the underlying UNIX command with a large number of file operands instead of a single file operand. With -exec ... {} +, the script is started once per batch of files, not once per file.

I tried an experiment on my relatively old MacBook Pro running Mac OS X Version 10.7.5, which sets {ARG_MAX} to 262144 bytes (i.e., 256 KiB). I used cp as the test command to copy all of the PDF files found in and under my home directory to /tmp/pdfdest. /tmp/pdfdest was a new directory when I started this test, but I did not empty and recreate it between tests.
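
To see the {ARG_MAX} limit on another system, something like the following minimal sketch should work. It assumes the getconf utility is available (it is specified by POSIX); the variable name arg_max is only illustrative.

```shell
#!/bin/sh
# Ask the system for {ARG_MAX}: the limit on the combined size of the
# arguments and environment passed to an exec'd program.
arg_max=$(getconf ARG_MAX)
echo "ARG_MAX is $arg_max bytes"
```

The larger this value, the more pathnames find can pack into each invocation when -exec is terminated with +.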

I used the command:
Code:
time find "$HOME" -name '*.pdf' -exec cp -f {} /tmp/pdfdest \;

three times and ignored the first run (which was much slower than the other two because it cleared out the various caches and loaded my home directory's file hierarchy). The two remaining runs averaged 2 minutes 27.45 seconds wall clock time, 0.85 seconds user time, and 5.56 seconds system time to copy 881 files by invoking cp 881 times.
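
The difference in invocation counts between the two -exec terminators can be checked with a small sketch like the one below. It uses echo in place of cp and a scratch directory with five made-up file names, so nothing here comes from the original test.

```shell
#!/bin/sh
# Scratch tree holding five hypothetical PDF files.
tree=$(mktemp -d)
for n in 1 2 3 4 5; do : > "$tree/file$n.pdf"; done

# Terminated with \; the command runs once per file,
# so echo prints one line per file.
per_file=$(find "$tree" -name '*.pdf' -exec echo {} \; | wc -l | tr -d ' ')

# Terminated with + find packs as many pathnames as fit into each
# invocation, so echo prints one line per batch (one batch here).
batched=$(find "$tree" -name '*.pdf' -exec echo {} + | wc -l | tr -d ' ')

printf 'per-file invocations: %s, batched invocations: %s\n' "$per_file" "$batched"
rm -rf "$tree"
```

Each line of output corresponds to one run of the command, which is why the per-file form costs one process per file found.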

Adding the following Korn shell script (named CpDest1st):
Code:
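#!/bin/ksh
# Usage: CpDest1st options destdir srcfile...
# Moves destdir to the end of cp's argument list so that find can
# append many source files with "-exec ... {} +".
opts="$1"    # option string for cp, e.g. -f
dest="$2"    # destination directory
shift 2      # "$@" now holds only the source files
# $opts is left unquoted, so a string such as "-f -p" is split into separate options.
exec cp $opts "$@" "$dest"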

and using the command:
Code:
time find "$HOME" -name '*.pdf' -exec CpDest1st -f /tmp/pdfdest {} +

three times, I again ignored the first set of results and averaged the other two. Copying the same 881 files this way, invoking the shell script and the cp utility once each, took 2 minutes 3.02 seconds wall clock time, 0.42 seconds user time, and 3.76 seconds system time.
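
The same argument rearrangement can also be done without a separate script file by passing a short inline script to sh -c. This is only a sketch of that alternative, not the method timed above; the scratch directories and file names are hypothetical.

```shell
#!/bin/sh
# Scratch source and destination directories (hypothetical test data).
src=$(mktemp -d)
dest=$(mktemp -d)
mkdir -p "$src/sub"
: > "$src/a.pdf"
: > "$src/sub/b.pdf"
: > "$src/notes.txt"

# The inline script receives $0=sh, $1=destination, then the files found.
# It saves the destination, shifts it away, and puts it last so that cp
# can be given many source files in one invocation.
find "$src" -name '*.pdf' -exec sh -c '
    dest=$1
    shift
    exec cp -f "$@" "$dest"
' sh "$dest" {} +

copied=$(find "$dest" -type f | wc -l | tr -d ' ')
echo "copied $copied files"
rm -rf "$src" "$dest"
```

Where GNU cp is available, its -t option names the destination directory first and makes the wrapper unnecessary, but that option is not portable to every system's cp.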

This isn't a statistically valid comparison, and your mileage will vary depending on the value of {ARG_MAX} on your system, the number and sizes of the PDF files being copied, and so on. But it does show that using a shell script to reduce the number of invocations of cp (or mv or many other utilities) may actually reduce both the elapsed time and the system resources used, despite the overhead of running the script. In this simple test, using the shell script reduced wall clock time by about 17%, system time by about 32%, and user time by about 50%.
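
The percentage reductions can be recomputed from the averaged timings reported above (147.45 s and 123.02 s wall clock, 0.85 s and 0.42 s user, 5.56 s and 3.76 s system). The helper name pct_reduction below is made up for this sketch.

```shell
#!/bin/sh
# pct_reduction BEFORE AFTER
# Print the percentage by which AFTER is lower than BEFORE.
pct_reduction() {
    awk -v before="$1" -v after="$2" \
        'BEGIN { printf "%.1f\n", (before - after) / before * 100 }'
}

wall=$(pct_reduction 147.45 123.02)   # 2m27.45s versus 2m3.02s
user=$(pct_reduction 0.85 0.42)
sys=$(pct_reduction 5.56 3.76)

echo "wall clock: $wall%  user: $user%  system: $sys%"
# prints: wall clock: 16.6%  user: 50.6%  system: 32.4%
```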
 

Unix & Linux Forums Content Copyright 1993-2022. All Rights Reserved.