Speed test: 20,000+ file existence checks too slow


 
# 1  
Old 12-13-2008

I need to make a very fast file existence checker that is passed 20-50K file names.


In the code below, ${file} is a file containing a listing of 20,000+ files, and test_speed is the script. The results of <time test_speed try> are included as comments.

The normal "test -f" is much too slow when it is run as a system call from inside awk or perl. A basic grep over the 20,000+ line listing is super fast, so why does adding a file existence test slow it down so much?

Yes, I am on try 55, and I still cannot get this thing to go faster. I think try 55 would be very fast, but I cannot actually pass a file listing of 20,000+ entries into a for loop because I run out of memory. Does anyone have any ideas on how to speed up a file check inside awk, perl, or shell?

This would be fast if it actually worked.

How can I pipe into the parameter $1?

awk '{print $10}' ${file} | if [ -f $1 ];then echo 1; else echo 0; fi

How can you pipe into an if statement?


Quote:
#!/bin/ksh

file=spySD.Dec10_aha~
u=aha2231

user=$USER

## No file existence test.

## time test_speed 1
## real 0m3.32s
## user 0m0.68s
## sys 0m0.19s

if [[ $1 = 1 ]]; then
    awk -v u="${u}" '$5 ~ u {print}' "${file}" > "/tmp/junk_${user}_f1"
fi

## With existence test: Try 22

## time test_speed 22
##
## real 3h13m25.76s
## user 1h14m20.86s
## sys 52m23.13s

if [[ $1 = 22 ]]; then

    awk -v u="${u}" '
    $5 ~ u {
        # one shell process is started per matching record
        sysA = "if [[ -f " $10 " ]]; then echo 1; else echo 0; fi"
        sysA | getline chk
        close(sysA)
        if (chk == "1") { print }
    }
    ' "${file}" > "/tmp/junk_${user}_f2"

fi

## With existence test: Try 3
## This is slow too....

if [[ $1 = 3 ]]; then

    awk -v u="${u}" '
    $5 ~ u {
        # ls and grep are both started once per matching record
        sysA = "ls " $10 " | grep -c " $10 " 2>/dev/null"
        sysA | getline chk
        close(sysA)
        if (chk == "1") { print }
    }
    ' "${file}" > "/tmp/junk_${user}_f3"

fi


## With existence test: Try 55
if [[ $1 = 55 ]]; then
    for i in `awk '{print $10}' "${file}"`
    do
        [ -f "$i" ] && echo 1 || echo 0
    done > "/tmp/junk_${user}_f55"
fi
# 2  
Old 12-15-2008
Quote:
Originally Posted by nullwhat
Need to make a very fast file existence checker. Passing in 20-50K num of files

Write it in C.
Quote:
how can you pipe into an if statement?

You can't; an if statement is not a loop and it doesn't read standard input.
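
A while read loop, on the other hand, does read standard input, so the path list can be piped into a loop that wraps the if. Below is a minimal sketch, assuming one path per line on stdin; the function name check_paths is made up for illustration and is not part of the original script. Because [ -f ] is a shell builtin, no extra process should be started per path, and nothing has to be expanded into one huge for-loop word list.

```shell
# check_paths (hypothetical name): read one path per line on stdin and
# print 1 if the path is a regular file, 0 otherwise.
check_paths() {
    # IFS= and -r keep leading blanks and backslashes in the path intact.
    while IFS= read -r path; do
        if [ -f "$path" ]; then
            echo 1
        else
            echo 0
        fi
    done
}

# Usage with the listing from the original post (not run here):
#   awk '{print $10}' "$file" | check_paths > "/tmp/junk_${USER}_f55"
```

The loop handles the list one line at a time, so the size of the listing should not matter for memory.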
# 3  
Old 12-18-2008
Don't pipe "it" into an if statement; create the shell script on the fly and pipe that into sh:

Code:
awk '{print "if [ -f \"" $10 "\" ]; then echo 1; else echo 0; fi"}' ${file} | sh
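
If the goal is to keep the matching records, as tries 22 and 3 do, rather than print 1 or 0, the same builtin test can also run in a read loop instead of a generated script. This is only a sketch under assumptions: existing_records is a hypothetical name, and field 10 is assumed to hold a path with no whitespace in it (the awk field split assumes the same).

```shell
# existing_records (hypothetical name): read whitespace-separated records
# on stdin and print only those whose 10th field is an existing regular file.
existing_records() {
    while IFS= read -r line; do
        # Split the record into positional parameters. Globbing is turned
        # off during the split so a field containing * or ? is not expanded.
        set -f
        set -- $line
        set +f
        # Skip records that have fewer than 10 fields.
        if [ "$#" -lt 10 ]; then
            continue
        fi
        shift 9    # $1 is now the original 10th field
        if [ -f "$1" ]; then
            printf '%s\n' "$line"
        fi
    done
}

# Usage with the variables from the original script (not run here):
#   awk -v u="$u" '$5 ~ u' "$file" | existing_records > "/tmp/junk_${USER}_f2"
```

awk still does the cheap $5 filter in a single process, and the shell does the existence test without forking per record.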
