Please help on "if" statement.

01-19-2013

Help on Loops to Grep logs

Hi,

Thanks RudiC for bringing up the order of the timestamp. Yes it does matter. Please help.

Current Setup:
Logs are rotated every 4 hours and compressed. If a log exceeds 10 GB, it is compressed (.Z); otherwise the script just gzips it (.gz). So the archive directory sometimes contains both *.gz and *.Z files, and sometimes only *.gz files.

Goal:
I need help with a loop that greps the logs per domain into a single text file for each domain, regardless of whether the logs are *gz or *Z.

domainlist="4prd 5prd 6prd 7prd 8prd"

expected output will be:

4prd_host.txt
5prd_host.txt
6prd_host.txt
7prd_host.txt
8prd_host.txt

I tried to write one, but I am having a problem with the if statement.

Code:

domainlist="4prd 5prd 6prd 7prd 8prd"
for domain in $domainlist
do
    if [ ? ] ; then    # <-- this is the test I cannot work out
        # for gzip files (e.g. log.gz)
        gzgrep "$domain" $dir_arch/logs/*gz >> $dir_arch/logs/${domain}_${myhost}.txt
        gzip -f $dir_arch/logs/${domain}_${myhost}.txt
    else
        # for compress files (e.g. log.Z)
        zcat $dir_arch/logs/*Z | grep "$domain" >> $dir_arch/logs/${domain}_${myhost}.txt
        gzip -f $dir_arch/logs/${domain}_${myhost}.txt
    fi
done

Kindly help. Thanks!

Regards,
Choco

Last edited by chococrunch6; 01-21-2013 at 01:52 PM.. Reason: stating clearer problem.

chococrunch6

01-20-2013

I'm not sure I understand either your problem or your code snippet.
Let's assume you have a directory $dir_arch/logs containing .gz and .Z files, all of which you want to search for the 5 items in domainlist. The order of the resulting output does not seem to matter, does it? So something like this should do the job for you:

Code:

domainlist="4prd|5prd|6prd|7prd|8prd"
#               ^--- "or" in EREs 
gzgrep -E "$domainlist" $dir_arch/logs/*.gz >> $dir_arch/logs/${domain}_${myhost}.txt
#       ^--- tell grep to use ERE        ^--- run this on all gzipped files
zcat $dir_arch/logs/*.Z | grep -E "$domainlist" >> $dir_arch/logs/${domain}_${myhost}.txt
#                     ^--- run this on all compressed files
gzip -f $dir_arch/logs/${domain}_${myhost}.txt

RudiC

01-21-2013

Hi RudiC,

Thanks for your reply. I rephrased the problem above; I hope it is clearer this time. Kindly help.

chococrunch6

01-22-2013

Let me paraphrase your problem: in one single directory, $dir_arch/logs, you have a bunch of gzipped/compressed log files that you want to scan line by line for a set of domains, writing each matching line to the respective $domain_$myhost.txt file, so that all log files' entries for a domain end up concatenated in a single file per domain.
I still don't see how the TIMESTAMP you refer to is reflected in your code snippet.

For performance reasons, I consider it unwise to gzgrep/zcat the files several times, once per domain. (g)unzip once to a tmp dir, order the files according to the timestamp as required, and run something like an awk script over all files that distributes the lines to the respective output files.
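A minimal sketch of the single-pass idea, with one deliberate change: instead of unpacking to a tmp dir, it streams each file through the matching decompressor, so nothing uncompressed is stored on disk. The helper names has_match, stream_logs and split_by_domain are hypothetical, and the sketch assumes the files are named log.<timestamp>.gz or log.<timestamp>.Z so that the shell's sorted glob order is also the timestamp order within one day's directory.

```shell
#!/bin/sh
# Sketch only: has_match, stream_logs and split_by_domain are hypothetical names.

# Succeed if at least one argument is an existing file. An unmatched glob is
# passed through literally, so it fails the -e test and the function returns 1.
has_match() {
    for f in "$@"; do
        [ -e "$f" ] && return 0
    done
    return 1
}

# Write every log in directory $1 to stdout, decompressed, in file-name order.
# The decompressor is chosen per file from its extension.
stream_logs() {
    for f in "$1"/log.*; do
        case "$f" in
            *.Z)  zcat "$f" ;;       # compress(1) archives
            *.gz) gzip -dc "$f" ;;   # gzip archives
        esac
    done
}

# Read lines on stdin; append each line that matches the alternation regex $1
# to <dir $2>/<matched domain>_<host $3>.txt.
split_by_domain() {
    awk -v re="$1" -v dir="$2" -v host="$3" '
        match($0, re) {
            out = dir "/" substr($0, RSTART, RLENGTH) "_" host ".txt"
            print >> out
        }'
}

# Possible use (the paths and $myhost are taken from the thread, untested there):
#   if has_match /archive/"$date"/log.*; then
#       stream_logs "/archive/$date" |
#           split_by_domain "4prd|5prd|6prd|7prd|8prd" "/archive/$date" "$myhost"
#   else
#       echo ">>Logs not found, kindly check archive directory.."
#   fi
```

Each log is read once for all domains, and the per-file case statement takes the place of an if that tests which kind of archive is present.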

RudiC

01-22-2013

Thanks for the clarification, RudiC.

Actually, there are only 6 log files, which cover a whole day of logs (sometimes a combination of *gz and *Z files, sometimes all *gz or all *Z files, depending on the log size at rotation).
The logs will be huge once (g)unzipped, and I am afraid of running into a file system space issue, so I decided to use zcat | grep and gzgrep. I'm sorry, but I am a newbie and not very familiar with the "awk" command you recommended.

Code:

Example logs on the archive:

/archive/2013-Jan-10$
        364M Jan 10 00:02 log.2013-Jan-10.00-00-50.Z
         75M Jan 10 04:00 log.2013-Jan-10.04-00-23.gz
         98M Jan 10 08:01 log.2013-Jan-10.08-00-32.gz
        174M Jan 10 12:02 log.2013-Jan-10.12-01-08.gz
        176M Jan 10 16:02 log.2013-Jan-10.16-01-23.gz
        354M Jan 10 20:02 log.2013-Jan-10.20-01-23.Z

/archive/2013-Jan-11$
        373M Jan 11 00:02 log.2013-Jan-11.00-00-53.Z
         83M Jan 11 04:01 log.2013-Jan-11.04-00-26.gz
        100M Jan 11 08:02 log.2013-Jan-11.08-00-31.gz
        344M Jan 11 12:02 log.2013-Jan-11.12-01-07.Z
        340M Jan 11 16:02 log.2013-Jan-11.16-01-23.Z
        362M Jan 11 20:02 log.2013-Jan-11.20-01-27.Z

/archive/2013-Jan-18$
        371M Jan 18 00:02 log.2013-Jan-18.00-00-52.Z
         91M Jan 18 04:01 log.2013-Jan-18.04-00-27.gz
        119M Jan 18 08:01 log.2013-Jan-18.08-00-31.gz
        154M Jan 18 12:02 log.2013-Jan-18.12-01-18.gz
         87M Jan 18 16:02 log.2013-Jan-18.16-01-40.gz
        105M Jan 18 20:02 log.2013-Jan-18.20-01-07.gz

The script looks like this:

Code:

date=`date +"%Y-%h-%d"`
domainlist="4prd 5prd 6prd 7prd 8prd"
for domain in $domainlist
do
    if ls /archive/$date/log.* > /dev/null 2>&1 ; then
        # ^--- check if logs are present in the archive dir
        #      ([ -f /archive/$date/* ] breaks when the glob matches several files)
        if ls /archive/$date/log.*.Z > /dev/null 2>&1 ; then
            # ^--- check for *Z files by name, without printing the listing
            zcat /archive/$date/log.*.Z | grep "$domain" >> /archive/$date/$domain.txt
            # ^--- grep "domain" for *Z files
            gzgrep "$domain" /archive/$date/log.*.gz >> /archive/$date/$domain.txt
            # ^--- grep "domain" for *gz files; log.* keeps earlier $domain.txt.gz outputs out of the search
            gzip -f /archive/$date/$domain.txt
            # ^--- archive has *Z files: run the commands above, but the TIMESTAMP order in the output file may sometimes be wrong
        else
            gzgrep "$domain" /archive/$date/log.*.gz >> /archive/$date/$domain.txt
            gzip -f /archive/$date/$domain.txt
            # ^--- archive has no *Z files
        fi
    else
        echo ">>Logs not found, kindly check archive directory.."
    fi
done

Kindly let me know if you have a better solution for this. I would really appreciate your response. Thanks!

Regards,
Choco

chococrunch6

01-23-2013

No matter what you do, disk space will come into play. zcat and gzgrep will need space to uncompress, probably temporary files somewhere on disk, or swap file space. So find a disk that can accommodate your huge files and uncompress to there, if possible maintaining the file time stamps. Then run something like

Code:

$ domainlist="4prd|5prd|6prd|7prd|8prd"
$ awk 'match($0, D){ print $0 > substr ($0,RSTART,RLENGTH)}' D=$domainlist *

If * does not supply your files in the correct order, try to rename the files so they show up as needed in e.g. ls. The pipes in domainlist are mandatory for the regex to work!
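To make the one-liner easier to follow, here is a small sketch that prints what match() finds instead of writing files. The helper name show_matches and the three log lines are made up for illustration; only the domain list comes from this thread.

```shell
#!/bin/sh
# Sketch only: show_matches is a hypothetical helper, the sample lines are invented.

# For each input line that matches the regex in $1, report where the match
# starts (RSTART), how long it is (RLENGTH), and the matched text. In the
# one-liner above, that matched text is what gets used as the output file name.
show_matches() {
    awk -v D="$1" '
        match($0, D) {
            printf "start=%d length=%d file=%s\n", RSTART, RLENGTH, substr($0, RSTART, RLENGTH)
        }'
}

printf '%s\n' \
    '00:00:01 host=web1.4prd GET /' \
    '00:00:02 host=web2.7prd GET /x' \
    '00:00:03 host=other.9dev GET /y' |
show_matches "4prd|5prd|6prd|7prd|8prd"
```

The first two lines report file=4prd and file=7prd; the third matches none of the alternatives, so awk skips it. In the one-liner, print $0 > file sends the whole line to a file named after the matched domain, which awk keeps open, so later lines for the same domain are added to the same file.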

RudiC

01-23-2013

Can you please explain what this awk code does?

Code:

$ awk 'match($0, D){ print $0 > substr ($0,RSTART,RLENGTH)}' D=$domainlist *

chococrunch6
