Help 'speeding' up this 'parsing' script - taking 24+ hours to run
Hi,
I've written a ksh script that reads a file and parses, filters and formats each line. The script works as expected, but it runs for 24+ hours on a file that has 2 million lines. Sometimes the input file has 10 million lines, which means it can run for more than 2 days and still not finish. And of course, the SAs have been chasing me because it shows up in top as running forever.
I need some advice: instead of reading one line at a time, maybe I can run an awk one-liner. I wish I could code it in Perl, but I'm not sure how. Most people say Perl is faster, but I don't know how to use the Perl equivalents of the UNIX commands other than calling them through system.
Anyway, hopefully I can interest someone in looking into this.
Below is the excerpt / part of the script that is taking the most time:
Below are example entries from the input file that the script reads; it has at least 2 million lines and can go up to 10 million. I've changed the entries as they are customer data.
What I want to do, in the simplest terms, is as below:
Change the date format to YYYY-MM-DD, mainly because this format is the most convenient for sorting.
Filter some information from each line, i.e. host name, IP, program name, service name, return code, etc.
I then redirect these formatted lines/records to a file that I can group by return code value, or simply run through sort | uniq -c to show a count of occurrences.
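For illustration only, here is a rough sketch of how the three steps above could be done in a single awk pass instead of a line-by-line ksh loop. A read loop that starts external commands for every line is a common cause of multi-hour runtimes on files this size, whereas awk handles the whole file in one process. The input layout is an assumption, since the real entries are not shown: each line is taken to look like `DD-Mon-YYYY HH:MM:SS host=... ip=... program=... service=... rc=...`. The function name `reformat_log` and the key names are hypothetical and would need adapting to the actual data.

```shell
# Sketch only: the input layout below is ASSUMED, not taken from the real file:
#   DD-Mon-YYYY HH:MM:SS host=NAME ip=ADDR program=PROG service=SVC rc=CODE
# reformat_log is a hypothetical helper name.
reformat_log() {
    awk '
    BEGIN {
        # Build the month-name lookup once, before any input is read.
        n = split("JAN FEB MAR APR MAY JUN JUL AUG SEP OCT NOV DEC", names, " ")
        for (i = 1; i <= n; i++) month[names[i]] = sprintf("%02d", i)
    }
    {
        # Field 1 is assumed to be DD-Mon-YYYY; rebuild it as YYYY-MM-DD.
        split($1, d, "-")
        mon = month[toupper(d[2])]
        if (mon == "") next              # skip lines whose date does not parse
        date = d[3] "-" mon "-" sprintf("%02d", d[1])

        # Pick the wanted key=value pairs out of the remaining fields.
        host = ip = prog = svc = rc = "-"
        for (i = 3; i <= NF; i++) {
            eq = index($i, "=")
            if (eq == 0) continue
            key = substr($i, 1, eq - 1)
            val = substr($i, eq + 1)
            if      (key == "host")    host = val
            else if (key == "ip")      ip   = val
            else if (key == "program") prog = val
            else if (key == "service") svc  = val
            else if (key == "rc")      rc   = val
        }

        # One output record per input line, return code last.
        print date, $2, host, ip, prog, svc, rc
    }' "$@"
}
```

With the return code as the last field, something like `reformat_log input.log | awk '{print $NF}' | sort | uniq -c` would then give the count per return code (`input.log` is a placeholder file name).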