While find . -name "star_st*" -exec head -1 {} + | grep "1175 876330" helped cut the run time in half, it is still considered very slow.
I am answering the requested questions in the hope that they help find a clue and a solution.
If . is on a remote filesystem, network issues could also have a significant impact? It's on the same local file system, not remote.
Is there other heavy load on the system? Yes. Here is the output of top showing high CPU usage.
However, one of my colleagues says that high CPU is common in this case and shouldn't affect the find command.
How big is the file hierarchy rooted in .? There is no hierarchy; I'm in the same directory in which the find command runs.
How many files have names starting with star_st? 180954
Last edited by mohtashims; 05-19-2015 at 06:43 AM..
Hi, I have 99583 files in the directory starting with star_st.
Also, I tried the suggestion, but it looks like there is a syntax error:
I'm on Linux.
Is there other heavy load on the system? Answer: No.
How big is the file hierarchy rooted in .? Answer: All the files are in the same directory; there are NO subdirectories.
If . is on a remote filesystem, network issues could also have a significant impact. Answer: It is on the same local file system.
There is a HUGE difference between your command above:
and the command I suggested:
If you put in the double-quotes I suggested (or the single-quotes agent.kgb suggested), it should work.
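To illustrate why the quoting matters, here is a minimal sketch. The file names and directory are hypothetical stand-ins for the real ~100,000 star_st* files; the pattern and search string are taken from the commands earlier in the thread.

```shell
# Sample setup (hypothetical names standing in for the real files):
cd "$(mktemp -d)"
printf '1175 876330 x\nrest\n' > star_st_a
printf 'nothing\nhere\n'       > star_st_b

# Unquoted, the shell expands star_st* BEFORE find runs, so find receives
# the expanded file names (or the shell overflows ARG_MAX) instead of the
# pattern:
#   find . -name star_st* -exec head -1 {} + | grep "1175 876330"

# Quoted, the pattern reaches find intact and find does the matching:
find . -name "star_st*" -exec head -1 {} + | grep "1175 876330"
```

With the quotes, find itself walks the directory and matches names, so the command works no matter how many files match.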
But if the ls you showed us above worked and all of the files are in a single directory, try just using:
And, if I'm reading your code correctly, RavinderSingh13's awk script can be modified to be more efficient than the above suggestion:
The nextfile command in awk is an extension to the standards, but I believe it is present in awk on Linux systems. If your awk doesn't include nextfile and your star_st* files are small, you could try:
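A minimal sketch of such a fallback, consistent with the description above (FNR == 1, no nextfile); the exact command posted is not shown here, and the sample file names are hypothetical:

```shell
# Sample setup (hypothetical names standing in for the real star_st* files):
cd "$(mktemp -d)"
printf '1175 876330 first\nsecond line\n' > star_st_a
printf 'no match here\n1175 876330\n'     > star_st_b

# Reads every line of every file, but only tests the 1st line of each;
# works in any POSIX awk, no nextfile extension needed:
awk 'FNR == 1 && /1175 876330/' star_st*
# -> 1175 876330 first
```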
The head and grep pipeline above may be faster if your files are larger than one block, depending on your average file size and the block size used on the filesystem containing your files. (Note that the above awk uses FNR == 1, not the NR == 1 in RavinderSingh13's script, which would only look at the 1st line in the 1st file instead of looking at the 1st line in each file.)
Could you please try the following command and let me know if it helps.
According to your statement, all the files are in the same directory without any subdirectories, so the following may help:
Thanks,
R. Singh
I get an error while running your suggestion: bash: /bin/awk: Argument list too long
And, if I'm reading your code correctly, RavinderSingh13's awk script can be modified to be more efficient than the above suggestion:
Code:
cd /directory/containing/your/files
awk '/1175 876330/
{nextfile}' star_st*
Hello Don,
Thank you for correcting me; I think your code should have included !~ instead of ~, as follows.
Please do correct me if I am wrong here.
Thanks,
R. Singh
Sorry, but I think you're wrong. The first line of the script, /1175 876330/, is a pattern with no action part, so awk uses the default action: print the current line when it contains the string 1175 876330. This simulates the action of the grep. With !~ instead of ~, it would simulate grep -v ....
The second line of the script, {nextfile}, has no condition, so it applies to every input line; it skips to the next input file after processing the 1st line in a file, which duplicates the action of head -1 on each file processed.
And, when we're processing almost 100,000 files, we need to run this command in the directory where the files are located and just pass filenames as operands. Passing the absolute pathnames of 100,000 files runs a MUCH higher chance of exceeding ARG_MAX limits when execing awk. (Which mohtashims reported as a problem in post #11 in this thread.)
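The limit can be inspected directly, and find's "-exec ... {} +" form batches file names so that no single exec exceeds it. A sketch, with hypothetical sample files standing in for the real ~100,000:

```shell
# The kernel's limit on the combined size of argv + environment (bytes):
getconf ARG_MAX

# Sample setup (hypothetical names standing in for the real files):
cd "$(mktemp -d)"
printf '1175 876330 hit\nmore lines\n' > star_st_1
printf 'miss\n1175 876330 later\n'     > star_st_2

# find batches the filenames passed to each awk invocation, so no single
# exec exceeds ARG_MAX, unlike letting the shell expand star_st* itself:
find . -maxdepth 1 -name 'star_st*' -exec awk '/1175 876330/
{nextfile}' {} +
```

Running the command from inside the directory with bare filename operands keeps each argument short; combined with batching, "Argument list too long" goes away.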
Last edited by Don Cragun; 05-19-2015 at 07:24 AM..
Reason: Fix typo caused by auto spellcheck corrections.