I need to make a very fast file existence checker, passing in 20,000 to 50,000 files.
In the code below, ${file} is a file containing a listing of 20,000+ files, and test_speed is the script. The results of "time test_speed <try>" are included as comments above each try.
The normal "test -f" is much too slow when it is run as a system call inside awk or perl. A basic grep on 20,000+ files is super fast, so why does a file existence test slow things down so much?
Yes, I am on try 55, and I still cannot get this thing to go faster. I think try 55 would be very fast, but I cannot actually pass a file listing of 20,000+ entries into a for loop because I run out of memory. Does anyone have any ideas on how to speed up a file check inside awk, perl, or shell?
This would be fast if it actually worked, but how can I pipe into parameter $1?
awk '{print $10}' ${file} | if [ -f $1 ];then echo 1; else echo 0; fi
In other words, how can you pipe into an if statement?
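For reference, an if statement does not read its test argument from a pipe, but a while/read loop does take piped input one line at a time. The following is only a minimal sketch, assuming one path per line; check_files is a hypothetical name, not something from the script below.

```shell
# check_files (hypothetical helper): read one path per line from stdin
# and print 1 if it is an existing regular file, otherwise 0.
check_files() {
    while IFS= read -r path; do
        # [ -f ] is a shell built-in, so no new process is started per path
        if [ -f "$path" ]; then
            echo 1
        else
            echo 0
        fi
    done
}

# Possible usage with the listing from this post:
#   awk '{print $10}' "${file}" | check_files
```

Because the paths are streamed through the pipe rather than expanded into a word list, this should also avoid the memory problem of the for loop.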
Quote:
#!/bin/ksh
file=spySD.Dec10_aha~
u=aha2231
user=$USER
## No file existence test.
## time test_speed 1
## real 0m3.32s
## user 0m0.68s
## sys 0m0.19s
if [[ $1 = 1 ]];then
awk -v u=${u} '$5~u {print}' ${file} > /tmp/junk_${user}_f1
fi
## With existence test: Try 22
## time test_speed 22
##
## real 3h13m25.76s
## user 1h14m20.86s
## sys 52m23.13s
if [[ $1 = 22 ]];then
awk -v u=${u} '
$5~u {
sysA="if [[ -f " $10 " ]] ;then echo 1;else echo 0;fi"
sysA | getline chk
close(sysA)
if(chk=="1") {print}
}
' ${file} > /tmp/junk_${user}_f2
fi
## With existence test: Try 3
## This is slow too....
if [[ $1 = 3 ]];then
awk -v u=${u} '
$5~u {
sysA="ls " $10 " | grep -c " $10 " 2>/dev/null"
sysA | getline chk
close(sysA)
if(chk=="1") {print}
}
' ${file} > /tmp/junk_${user}_f3
fi
## With existence test: Try 55
if [[ $1 = 55 ]];then
for i in `awk '{print $10}' ${file}`
do
[ -f $i ] && echo 1 || echo 0
done > /tmp/junk_${user}_f55
fi
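
A note on why tries 22 and 3 are so slow: every "command | getline" in awk starts a new shell process for each matching line, so 20,000+ lines means 20,000+ process launches. One possible alternative, shown only as a sketch: let awk do the field-5 match and let the shell's built-in [ -f ] do the existence test. filter_existing is a hypothetical name, and the sketch assumes the path is in field 10 and contains no whitespace, as the awk scripts above already do.

```shell
# filter_existing (hypothetical helper): read records from stdin, keep those
# whose field 5 matches the pattern in $1 and whose field 10 is an existing file.
filter_existing() {
    awk -v u="$1" '$5 ~ u' | while IFS= read -r line; do
        set -f            # turn off globbing while the line is split
        set -- $line      # split the line into fields on whitespace
        set +f
        # ${10} is the tenth field; [ -f ] is a built-in, so nothing is forked
        if [ -f "${10}" ]; then
            printf '%s\n' "$line"
        fi
    done
}

# Possible usage with the variables from the script above:
#   filter_existing "${u}" < "${file}" > /tmp/junk_${user}_f2
```

This uses one awk process for the whole listing instead of one shell per line.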