UNIX for Dummies Questions & Answers: If you're not sure where to post a UNIX or Linux question, post it here. All UNIX and Linux newbies welcome!
#1
Help on Loops to Grep logs
Hi, thanks RudiC for bringing up the order of the timestamps. Yes, it does matter. Please help.

Current setup: Logs are rotated every 4 hours and then compressed. If a log exceeds 10 GB, it is compressed with compress (.Z); otherwise the script just gzips it (.gz). So the archive directory sometimes contains both *.gz and *.Z files, and sometimes only *.gz files.

Goal: A loop that greps the logs per domain into a single text file per domain, regardless of whether the logs are *gz or *Z.

domainlist="4prd 5prd 6prd 7prd 8prd"

Expected output:
4prd_host.txt
5prd_host.txt
6prd_host.txt
7prd_host.txt
8prd_host.txt

I tried to write one, but I am having a problem with the if statement. Code:
domainlist="4prd 5prd 6prd 7prd 8prd"
for domain in $domainlist
do
    if [ ? ] ; then
        # for gzip files (e.g. log.gz)
        gzgrep "$domain" $dir_arch/logs/*gz >> $dir_arch/logs/${domain}_${myhost}.txt
        gzip -f $dir_arch/logs/${domain}_${myhost}.txt
    else
        # for compress files (e.g. log.Z)
        zcat $dir_arch/logs/*Z | grep "$domain" >> $dir_arch/logs/${domain}_${myhost}.txt
        gzip -f $dir_arch/logs/${domain}_${myhost}.txt
    fi
done
Kindly help. Thanks! Regards, Choco
Last edited by chococrunch6; 01-21-2013 at 12:52 PM. Reason: stating clearer problem.
#2
I'm not sure I understand your problem or your code snippet. Let's assume you have a directory $dir_arch/logs containing .gz and .Z files, all of which you want to search for the 5 items in domainlist. The order of the resulting output does not seem to matter, does it? Something like this should do the job for you: Code:
domainlist="4prd|5prd|6prd|7prd|8prd"
#               ^--- "or" in EREs
gzgrep -E "$domainlist" $dir_arch/logs/*.gz >> $dir_arch/logs/$domain_$myhost.txt
#      ^--- tell grep to use ERE       ^--- run this on all gzipped files
zcat $dir_arch/logs/*.Z | grep -E "$domainlist" >> $dir_arch/logs/$domain_$myhost.txt
#                   ^--- run this on all compressed files
gzip -f $dir_arch/logs/$domain_$myhost.txt
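A side note on file names written in the form `$domain_$myhost.txt`: the shell takes the longest valid variable name after the `$`, so `$domain_` is read as a variable called `domain_`, which is normally unset. Braces mark where each name ends. A minimal sketch; the values given to `domain` and `myhost` here are assumptions for illustration, since the thread never shows how `myhost` is set:

```shell
# Illustrative values only (assumptions, not from the thread).
domain=4prd
myhost=host
unset domain_    # make sure no variable named "domain_" exists

# Without braces the shell looks up "domain_" (unset), so the domain part
# silently disappears from the file name.
unbraced="$domain_$myhost.txt"

# With braces each variable name ends where intended.
braced="${domain}_${myhost}.txt"

echo "unbraced: $unbraced"
echo "braced:   $braced"
```

With these values the first name comes out as `host.txt` and the second as `4prd_host.txt`, which matches the file names the original question expects.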
#4
Let me paraphrase your problem: you have, in one single directory $dir_arch/logs, a bunch of gzipped/compressed logfiles that you want to scan line by line for a set of domains, writing each matching line to the respective $domain_$myhost.txt file, so that all logfiles' entries for a domain end up concatenated in one file per domain.

I still don't see how the TIMESTAMP you refer to is reflected in your code snippet. For performance reasons, I consider it unwise to gzgrep/zcat the files several times, once for each domain. (g)unzip once to a tmp dir, order the files by timestamp as required, and run something like an awk script over all files that distributes the lines to the respective files.
#5
Thanks for the clarification, RudiC. Actually there are only 6 log files, which cover a whole day of logs (sometimes a mix of *gz and *Z files, sometimes all *gz or all *Z files, depending on the log size at rotation). The logs will be huge when (g)unzipped, and I'm afraid of running into a file system space issue, so I decided to use zcat | grep and gzgrep. I'm sorry, but I'm a newbie and not familiar with the awk command you recommended. Code:
Example logs on the archive:
/archive/2013-Jan-10$
364M Jan 10 00:02 log.2013-Jan-10.00-00-50.Z
75M Jan 10 04:00 log.2013-Jan-10.04-00-23.gz
98M Jan 10 08:01 log.2013-Jan-10.08-00-32.gz
174M Jan 10 12:02 log.2013-Jan-10.12-01-08.gz
176M Jan 10 16:02 log.2013-Jan-10.16-01-23.gz
354M Jan 10 20:02 log.2013-Jan-10.20-01-23.Z
/archive/2013-Jan-11$
373M Jan 11 00:02 log.2013-Jan-11.00-00-53.Z
83M Jan 11 04:01 log.2013-Jan-11.04-00-26.gz
100M Jan 11 08:02 log.2013-Jan-11.08-00-31.gz
344M Jan 11 12:02 log.2013-Jan-11.12-01-07.Z
340M Jan 11 16:02 log.2013-Jan-11.16-01-23.Z
362M Jan 11 20:02 log.2013-Jan-11.20-01-27.Z
/archive/2013-Jan-18$
371M Jan 18 00:02 log.2013-Jan-18.00-00-52.Z
91M Jan 18 04:01 log.2013-Jan-18.04-00-27.gz
119M Jan 18 08:01 log.2013-Jan-18.08-00-31.gz
154M Jan 18 12:02 log.2013-Jan-18.12-01-18.gz
87M Jan 18 16:02 log.2013-Jan-18.16-01-40.gz
105M Jan 18 20:02 log.2013-Jan-18.20-01-07.gz
The script will look like this: Code:
date=`date +"%Y-%h-%d"`
domainlist="4prd 5prd 6prd 7prd 8prd"
for domain in $domainlist
do
    if ls /archive/$date/* > /dev/null 2>&1 ; then
    # ^--- check if logs are present in the archive dir
    #      ([ -f ... ] fails when the glob expands to more than one file)
        if ls /archive/$date/*Z > /dev/null 2>&1 ; then
        # ^--- check for *Z files
            zcat /archive/$date/*Z | grep "$domain" >> /archive/$date/$domain.txt
            # ^--- grep "domain" for *Z files
            gzgrep "$domain" /archive/$date/*gz >> /archive/$date/$domain.txt
            # ^--- grep "domain" for *gz files
            gzip -f /archive/$date/$domain.txt
            # ^--- archive has *Z files: run the commands above, but the TIMESTAMPs in the output file may not be in order
        else
            gzgrep "$domain" /archive/$date/*gz >> /archive/$date/$domain.txt
            gzip -f /archive/$date/$domain.txt
            # ^--- archive has no *Z files
        fi
    else
        echo ">>Logs not found, kindly check archive directory.."
    fi
done
Kindly let me know if you have a better solution for this. I would really appreciate your response. Thanks! Regards, Choco
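One way to keep each domain's output in timestamp order is to read the archives one at a time in name order and pick the decompressor per file, rather than handling all *Z files first and all *gz files second. The following is only a sketch under assumptions: the function name `cat_log` and the scratch directory with its two sample files are made up for illustration, the file names follow the `log.YYYY-Mon-DD.HH-MM-SS` pattern from the listing above (so within one day's directory the sorted glob order is also time order), and `gzip -dc` and `uncompress -c` are assumed to be available on the host.

```shell
#!/bin/sh
# Hypothetical helper: write one archived log to stdout, choosing the
# decompressor from the file extension.
cat_log() {
    case "$1" in
        *.gz) gzip -dc "$1" ;;
        *.Z)  uncompress -c "$1" ;;
        *)    cat "$1" ;;
    esac
}

# Sample data (assumption): two small gzipped logs in a scratch directory,
# standing in for /archive/$date.
logdir=$(mktemp -d)
printf '00:00:01 4prd start\n00:00:02 5prd start\n' |
    gzip -c > "$logdir/log.2013-Jan-10.00-00-50.gz"
printf '04:00:01 4prd stop\n' |
    gzip -c > "$logdir/log.2013-Jan-10.04-00-23.gz"

domainlist="4prd 5prd"
for domain in $domainlist
do
    out="$logdir/$domain.txt"
    : > "$out"                      # start with an empty output file
    # The glob expands in sorted name order, which here is also time order.
    for f in "$logdir"/log.*
    do
        [ -f "$f" ] || continue     # the glob matched nothing
        cat_log "$f" | grep "$domain" >> "$out"
    done
done

cat "$logdir/4prd.txt"
```

The decompressed data only travels through the pipe, so the loop itself writes nothing but the matching lines to disk. Each archive is still decompressed once per domain, so this addresses the ordering but not the performance concern raised earlier in the thread.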
#6
No matter what you do, disk space will come into play. zcat and gzgrep will need space to uncompress, probably temporary files somewhere on disk, or swap space. So: find a disk that can accommodate your huge files and uncompress to there, maintaining the file time stamps if possible. Then run something like Code:
$ domainlist="4prd|5prd|6prd|7prd|8prd"
$ awk 'match($0, D){ print $0 > substr ($0,RSTART,RLENGTH)}' D="$domainlist" *
If * does not supply your files in the correct order, try to rename the files so they show up as needed in e.g. ls. The pipes in domainlist are mandatory for the regex to work!
#7
Can you please explain what this awk code does? Code:
$ awk 'match($0, D){ print $0 > substr ($0,RSTART,RLENGTH)}' D=$domainlist *
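A possible reading of the one-liner, spelled out with comments. `match($0, D)` tests whether the current line contains any of the alternatives in `D` and sets `RSTART` and `RLENGTH` to the position and length of the match; `substr($0, RSTART, RLENGTH)` is therefore the matched domain name itself, which is used as the name of the output file. The sample lines, the file name `sample.log`, the variable `out` and the scratch directory below are assumptions made for illustration only.

```shell
# Scratch directory with one small, uncompressed sample log (assumption).
workdir=$(mktemp -d)
printf 'req 1 for 4prd ok\nreq 2 for 9dev ok\nreq 3 for 5prd ok\nreq 4 for 4prd ok\n' \
    > "$workdir/sample.log"

domainlist="4prd|5prd|6prd|7prd|8prd"

# Run awk inside the scratch directory, because the output files are
# created relative to the current directory.
(
    cd "$workdir" || exit 1
    awk '
    # The pattern: match() returns non-zero when the regex in D occurs in
    # the line, and records where (RSTART) and how long (RLENGTH) it is.
    match($0, D) {
        # The matched text itself, e.g. "4prd", becomes the file name.
        out = substr($0, RSTART, RLENGTH)
        # In awk, ">" truncates the file on first use and then keeps
        # appending for as long as the file stays open.
        print $0 > out
    }' D="$domainlist" sample.log
)

cat "$workdir/4prd"
```

Lines that match none of the domains (the `9dev` line here) are skipped. The output files are named after the bare domain (`4prd`, `5prd`, ...) and land in the current directory, so a second run with `*` as the input would read them as well.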