faster way to loop?


 
# 1  
Old 05-09-2006
faster way to loop?

Sample Log file

IP.address Date&TimeStamp GET/POST URL ETC
123.45.67.89 MMDDYYYYHHMM GET myURL http://ABC.com
123.45.67.90 MMDDYYYYHHMM GET myURL http://XYZ.com

I have a very large web server log file (about 1.3GB) that contains entries like the ones above. I need to get the last entry for each distinct IP that has myURL in it. Is there a quick way of looping? My idea was:

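# Get all the unique IP addresses, then check each one
awk '{print $1}' weblog | sort -u > ip.list

: > lastpages.lis    # start with an empty output file
for i in `cat ip.list`
do
    # keep only the last line for this IP that also mentions myURL
    grep "$i" weblog | grep myURL | tail -1 >> lastpages.lis
done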


Each day has 3000+ unique IPs, and a day's log is about 48MB. With this process, it takes around 30 minutes to process a day's worth of data. Is there a faster way to do this?
# 2  
Old 05-09-2006
This requires ksh and a lot of memory, but if it runs, it will be rather fast.

Code:
#! /usr/bin/ksh

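# Read the log on stdin; an empty IFS keeps each line whole.
exec < weblog
IFS=""
while read -r line ; do
        # Turn the leading IP a.b.c.d into a_b_c_d so it can be part of a variable name.
        ip=${line%% *}
        octet4=${ip##*.}
        ip=${ip%.$octet4}
        octet3=${ip##*.}
        ip=${ip%.$octet3}
        octet2=${ip##*.}
        octet1=${ip%.$octet2}
        ip=${octet1}_${octet2}_${octet3}_${octet4}
        # One variable per IP; later lines overwrite earlier ones, so the last entry wins.
        var=array_$ip
        eval $var=\$line
done
# Dump all shell variables and print the values of the array_* ones.
IFS="\="
set | while read  variable value ; do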
        if [[ $variable = array_+([0-9])_+([0-9])_+([0-9])_+([0-9]) ]] ; then
                echo "$value"
        fi
done
exit 0

# 3  
Old 05-09-2006
Unless I misunderstand, you want the last entry for each distinct IP. Since it is a log file, it is already in date order, so the last entry for an IP is simply the last line on which it appears. Correct? Try:
Code:
awk '{arr[$1]=$0 }
        END{for (i in arr )
                  print arr[i] } '  myweblog > somefile
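
To see why this works, here is a minimal sketch on a made-up four-line sample (the file names sample.log and last_per_ip.txt are placeholders, not from the thread). Every line overwrites arr[$1], so by the time END runs, arr holds only the final line seen for each IP. Note that `for (i in arr)` visits keys in an unspecified order, so the sketch pipes the result through sort for readability.

```shell
# Hypothetical sample data in the thread's log layout.
cat > sample.log <<'EOF'
123.45.67.89 050920061200 GET myURL http://ABC.com
123.45.67.90 050920061201 GET myURL http://XYZ.com
123.45.67.89 050920061205 GET other http://ABC.com
123.45.67.90 050920061207 GET myURL http://DEF.com
EOF

# arr[$1] is overwritten on every line, so the last line per IP survives.
awk '{ arr[$1] = $0 }
     END { for (i in arr) print arr[i] }' sample.log | sort > last_per_ip.txt

cat last_per_ip.txt
```

The whole log is read once, which is where the speedup over a grep per IP comes from. The memory cost is one stored line per distinct IP.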

# 4  
Old 05-10-2006
thanks! it worked!

Here's a followup question...
I have a file that contains 1500+ IPs, and I want to get the last 5 entries for each of these IPs from the huge web log. How can I modify it to get only the last 5 entries for a specific IP address?

thanks for your help!
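
One possible approach, as a rough sketch only: read the IP list first, then keep a small ring buffer of the most recent lines per wanted IP while scanning the log once. The file names (ip.list, sample.log, last5.txt), the `keep` variable, and the generated sample data are all assumptions for illustration. IPs come out in an unspecified order, but the lines for each IP stay in log order.

```shell
# Hypothetical list of IPs of interest.
cat > ip.list <<'EOF'
10.0.0.1
10.0.0.2
EOF

# Hypothetical log: 7 hits from .1, 3 from .2, 1 from an unlisted IP.
awk 'BEGIN {
    for (t = 1; t <= 7; t++) print "10.0.0.1 T" t " GET /a"
    for (t = 1; t <= 3; t++) print "10.0.0.2 T" t " GET /b"
    print "10.0.0.3 T1 GET /c"
}' > sample.log

awk -v keep=5 '
    NR == FNR { want[$1]; next }           # first file: the IP list
    $1 in want {                           # second file: the log
        n[$1]++
        buf[$1, n[$1] % keep] = $0         # ring buffer of the last "keep" lines
    }
    END {
        for (ip in n) {
            start = (n[ip] > keep) ? n[ip] - keep + 1 : 1
            for (j = start; j <= n[ip]; j++) print buf[ip, j % keep]
        }
    }
' ip.list sample.log > last5.txt

cat last5.txt
```

Memory use is bounded by `keep` lines per listed IP, and the log is still read only once.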
# 5  
Old 05-10-2006
If your original solution is working, then I'm proposing an optimization which should reduce your time by about a third.

# Get all the unique IP addresses and then check each one
awk '{print $1}' weblog | sort -u > ip.list

while read i
do
    # one grep per IP: lines that start with this IP and also contain myURL
    grep "^$i .*myURL" weblog | tail -1
done < ip.list > lastpages.lis
# 6  
Old 05-10-2006
Quote:
Originally Posted by jim mcnamara
Unless I misunderstand, you want the last entry for each distinct IP. Since it is a log file, it is already in date order, so the last entry for an IP is simply the last line on which it appears. Correct? Try:
Code:
awk '{arr[$1]=$0 }
        END{for (i in arr )
                  print arr[i] } '  myweblog > somefile


Thanks! Could you explain how this works? It gets the last entry for each IP. Is there a way to include a grep in this? I want the last entry containing myURL for each IP. Thanks!
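
A minimal sketch of one way to fold the grep into the quoted awk command: put a /myURL/ pattern in front of the action so that only matching lines update the array. The sample data and the file names sample.log and last_myurl.txt are made up for illustration. If myURL must appear in one specific column, a field test such as `$4 ~ /myURL/` could be used instead, assuming the layout shown in the first post.

```shell
# Hypothetical sample data in the thread's log layout.
cat > sample.log <<'EOF'
123.45.67.89 050920061200 GET myURL http://ABC.com
123.45.67.90 050920061201 GET myURL http://XYZ.com
123.45.67.89 050920061205 GET other http://ABC.com
123.45.67.90 050920061207 GET myURL http://DEF.com
EOF

# Only lines matching myURL are stored; each one overwrites the previous
# entry for the same IP, so the last matching line per IP remains.
awk '/myURL/ { arr[$1] = $0 }
     END { for (i in arr) print arr[i] }' sample.log | sort > last_myurl.txt

cat last_myurl.txt
```

Here 123.45.67.89 reports its 1200 entry rather than the later 1205 one, because the 1205 line does not contain myURL.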
# 7  
Old 05-10-2006
Dear tads98,
People are here to help you out; they are not here to work for you. You got some good hints on how to achieve this and on shell scripting best practices.
Please respond after doing some extra work on your side.
Good luck and happy messaging!