Full Discussion: Optimizing awk script
Post 302840085 by SkySmart on Sunday 4th of August 2013 11:00:48 AM

Can this awk statement be optimized? I ask because log.txt is a giant file with several hundred thousand lines of records.

Code:
myscript.sh:

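# For each record in log.txt, count matches of the search term past the last known line.
while read -r line
do
        searchterm="${1}"
        datecurr=$(date +%s)
        # Split the record on commas: field 1 is the file name, field 2 the last known line count.
        file=$(awk 'BEGIN{split(ARGV[1],var,",");print var[1]}' "$line")
        llnum=$(awk 'BEGIN{split(ARGV[1],var,",");print var[2]}' "$line")
        # Count the lines after line llnum that match the search term.
        termcount=$(awk -v llnum="${llnum}" 'NR>llnum' "$file" | egrep -c "${searchterm}")
        # Redirect into wc so it prints only the count, not "count filename".
        newfilelinecount=$(wc -l < "$file")
        echo "${file},${newfilelinecount},${termcount},${datecurr}" >> /tmp/log.txt_2
done < log.txt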

Code:
log.txt:

/tmp/text1.txt,343,193,833
/tmp/text2.txt,43,93,533

The first column in "log.txt" contains the file name.
The second column in "log.txt" contains the last known total number of lines for each file.
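
The record layout described above could also be split by the shell itself, which would avoid two awk calls per record. This is only a minimal sketch, assuming every record has at least two comma-separated fields and that file names contain no commas. The names parse_record, rec_file, rec_lines and rest are hypothetical and not part of myscript.sh.

```shell
#!/bin/sh
# Hypothetical helper: split one log.txt record into its first two fields
# using the shell's own field splitting instead of spawning awk.
parse_record() {
    # IFS=, applies only to this read; "rest" absorbs any remaining columns.
    IFS=, read -r rec_file rec_lines rest <<EOF
$1
EOF
}

parse_record "/tmp/text1.txt,343,193,833"
echo "file=${rec_file} lines=${rec_lines}"
```

With several hundred thousand records, dropping two process launches per record is likely to matter more than anything done inside awk itself.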

myscript.sh reads "log.txt" and, for each file listed there, takes the last known line count from the second column, scans the file starting after that line, and counts the lines that match the search term provided by the user.

Can this be optimized?
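
One possible direction, sketched under assumptions rather than tested against the real data: let a single awk pass per file produce both the total line count and the match count, and let the shell split each record. The function names count_after and process_log are hypothetical. The sketch assumes the search term is an extended regular expression without backslashes (awk -v interprets escape sequences) and that one timestamp for the whole run is acceptable. On SunOS, nawk or /usr/xpg4/bin/awk may be needed for -v.

```shell
#!/bin/sh
# Hypothetical helper: print "<total lines>,<matching lines after line start>"
# for one file in a single awk pass (replaces awk | egrep -c plus wc -l).
count_after() {
    # $1 = file, $2 = last known line count, $3 = search term (an ERE)
    awk -v start="$2" -v term="$3" '
        NR > start && $0 ~ term { hits++ }   # count matching lines past the offset
        END { print NR "," (hits + 0) }      # NR holds the total line count at EOF
    ' "$1"
}

# Hypothetical driver: one awk process per record instead of four processes.
process_log() {
    # $1 = log file, $2 = search term
    now=$(date +%s)                          # taken once, not once per record
    while IFS=, read -r file llnum rest
    do
        [ -r "$file" ] || continue           # skip records whose file is unreadable
        echo "${file},$(count_after "$file" "$llnum" "$2"),${now}"
    done < "$1"
}
```

It could be called as process_log log.txt "$1" >> /tmp/log.txt_2. Unlike wc -l, awk's NR also counts a final line that lacks a trailing newline, so totals may differ by one for such files.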

OS:
Linux (Red Hat, CentOS, Ubuntu)/SunOS

Last edited by SkySmart; 08-04-2013 at 12:07 PM..
 
