You might want to build an "alternation regex" (all keywords joined with "|"), provided there are not too many keywords, and modify the matching slightly. Compare the performance of
Code:
time awk '
NR==FNR            {SRCH = SRCH DL $0     # file3: join the keywords into "kw1|kw2|..."
                    DL = "|"
                    next
                   }
tolower($0) ~ SRCH {print > "'"$PAD/removed_woord.csv"'"     # one regex test per line
                    next
                   }
                   {print > "'"$PAD/filtered_winnaar_2.csv"'"
                   }
' file3 file4
real 0m2,328s
user 0m2,318s
sys 0m0,005s
to that of this version, which loops over all keywords for every line:
Code:
time awk '
NR==FNR {id[$0]                           # file3: store each keyword as an array index
         next
        }
        {for (SP in id) if (tolower($0) ~ SP) {print > "'"$PAD/removed_woord.csv"'"
                                               next
                                              }
        }
        {print > "'"$PAD/filtered_winnaar_2.csv"'"
        }
' file3 file4
real 0m17,038s
user 0m16,995s
sys 0m0,025s
That is a speedup of roughly a factor of 7 (about 2.3 s versus 17 s). The output appears to be identical. Please try it and report back.
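For anyone who wants to try the alternation approach on a toy input first, here is a minimal sketch. The file names and sample data are made up for illustration, and the output paths are passed in with -v instead of being spliced into the awk program through shell quoting; that is just one possible way to do it, not necessarily what the script above should use.

```shell
# Sketch only: the directory, file names and sample data are hypothetical.
dir=$(mktemp -d)
printf '%s\n' 'foo' 'bar' > "$dir/keywords.txt"     # one lowercase keyword per line
printf '%s\n' '1;Some FOO text' '2;nothing here' '3;a bar b' > "$dir/data.csv"

awk -v removed="$dir/removed.csv" -v kept="$dir/kept.csv" '
NR == FNR          { srch = srch dl $0; dl = "|"; next }   # first file: build "foo|bar"
tolower($0) ~ srch { print > removed; next }               # some keyword matches the line
                   { print > kept }                        # no keyword matches
' "$dir/keywords.txt" "$dir/data.csv"
```

Two things to keep in mind: the keywords are used as regular expressions, so characters like "." or "(" in a keyword are not matched literally, and the keywords must be lowercase because the line is lowercased before matching. An empty keyword file also breaks the NR==FNR test, since the data file is then the first file with any records.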
I just tried this one again and got it working. I noticed the -F";" was missing, so I added it and it worked flawlessly. The complete script now runs in about 20 seconds, down from more than 7 minutes.
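The -F";" only matters when the match is done on a single field rather than on $0. A minimal sketch of that variant follows, assuming purely for illustration that the text to search sits in the second semicolon-separated field; the field number, file names and sample data are all hypothetical.

```shell
# Sketch only: field position, file names and sample data are assumptions.
dir2=$(mktemp -d)
printf '%s\n' 'foo' > "$dir2/keywords.txt"
printf '%s\n' 'foo1;harmless' 'id2;has FOO inside' > "$dir2/data.csv"

awk -F';' -v removed="$dir2/removed.csv" -v kept="$dir2/kept.csv" '
NR == FNR          { srch = srch dl $0; dl = "|"; next }   # build the alternation regex
tolower($2) ~ srch { print > removed; next }               # test field 2 only
                   { print > kept }
' "$dir2/keywords.txt" "$dir2/data.csv"
```

Here the first data line is kept even though its first field contains "foo", because only the second field is tested.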