Find keywords in multiple log files (Shell Programming and Scripting)
Post 302975064 by Don Cragun on Tuesday 7th of June 2016 09:52:33 PM
I'm disappointed that you have chosen not to answer any of my questions (the answers would have helped me give you code that might work for you), but maybe this will give you something you can adapt to your needs. It makes some wild assumptions based on the sample input files, sample output files, and sample code segments you have provided in this thread, on statements you have made in this thread, and on my reading a lot between the lines:
  1. The input files you want to process are in the directory /tmp/log.
  2. The output file you want to produce should be placed in the directory /tmp/log.
  3. The name of the output file you want to produce is either file.txt or progflag.txt. (The following script uses the name progflag.txt.)
  4. You do not want to process your output file as an input file. (The following script ignores both file.txt and progflag.txt as input files.)
  5. All files in the directory /tmp/log whose names end with the string .txt (other than the two mentioned possible output files) are to be processed as input files.
  6. Your input files might or might not have DOS (CR-LF) line terminators instead of UNIX (LF) line terminators. If CR-LF line terminators are present, the CR should be removed before further processing an input line.
  7. Your input files might not have a line terminator on the last line. If an input file does not have a line terminator on the last line, a UNIX line terminator should be added.
  8. Your output file should be a properly formatted text file with UNIX line terminators.
  9. If an input file contains the string /SASFoundation/, an output line should be created in your output file with the string SASFoundation as the 3rd field in that line.
  10. If an input file contains a line matching the ERE ^MEMSIZE *= *[^;]*;{0,1}, an output line should be created in your output file with the string matched by the [^;]* portion of that ERE as the 1st field in that line.
  11. If an input line contains exactly three words, the 1st word is real, the 2nd word is time, the 3rd word matches the ERE [0-9]+:[0-9]{2}:[0-9]{2} (where the leading digit(s) represent hours, the middle digits represent minutes, and the last digits represent seconds), and the elapsed time represented by the 3rd word is greater than 5 hours, an output line should be created in your output file with the 3rd word (with a leading zero prepended if there is only one leading digit in that word) as the 2nd field in that line.
  12. If more than one line in an input file matches any one of the above three criteria, the last line encountered in that input file meeting that criterion is the one used to determine what appears in the output line.
  13. If more than one of the criteria is found in a single input file, only one line of output should be produced for that input file and the 4th field in that output line should be the name of the input file from which that data was extracted.
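
The elapsed-time test in assumption 11 is the easiest one to get wrong, so it may help to see it in isolation. The following is only a sketch and is not part of the script below; the function name exceeds_five_hours is hypothetical. It converts hours:minutes:seconds to seconds and compares the result against 5 hours:

```shell
#!/bin/sh
# Hypothetical helper: succeed (exit status 0) if an elapsed time of the form
# hours:minutes:seconds is strictly greater than 5 hours.
exceeds_five_hours() {
	printf '%s\n' "$1" | awk -F: '
	NF == 3 {
		# Adding 0 forces numeric interpretation, so "05" and "5" agree.
		secs = ($1 + 0) * 3600 + ($2 + 0) * 60 + ($3 + 0)
		ok = (secs > 5 * 3600)
	}
	END { exit(ok ? 0 : 1) }'
}

exceeds_five_hours 6:00:00 && echo "6:00:00 is over the limit"
exceeds_five_hours 5:00:00 || echo "5:00:00 is not over the limit"
```

Comparing total seconds gives the same answer as comparing the hours, minutes, and seconds fields separately, but there is only one comparison to reason about.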
Code:
#!/bin/bash
cd /tmp/log || exit 1	# Give up if the log directory is not accessible
for f in *.txt
do	# Skip output files
	[ "$f" = "file.txt" ] && continue
	[ "$f" = "progflag.txt" ] && continue

	# Add a header line for each remaining file to be processed, copy the
	# file to awk's standard input, and add a line terminator to the end of
	# each input file...
	printf '***File=%s\n' "$f"	# Header
	cat "$f"			# File contents
	echo				# Terminate last incomplete line
done | awk '
BEGIN {	FMT[0] = "%-9s%08s  %-15s%s\n"	# SECOND field format for HH:MM:SS
	FMT[1] = "%-9s%-10s%-15s%s\n"	# SECOND field format for other values
}
# Function to print the data gathered from one input file (preceded by the
# output file header if this is the first output produced).
function pr() {
	if(ms || rt || se) {
		# If we have not printed a header yet...
		if(!header) {
			# print a header.
			header = 1
			printf(FMT[1], "MEMSIZE", "SECOND", "SASEXE",
			    "Filename")
		}
		# Print data gathered from this input file...
		printf(FMT[length(rt) == 0], ms, rt, se, fn)
		ms = rt = se = ""
	}
}
{	# Convert DOS line terminators to UNIX line terminators.
	sub(/\r$/, "")
}
/^\*\*\*File=/ {
	# File header found for a new input file...
	# Print data from previous file.
	pr()

	# Grab filename from this line.
	fn = substr($0, 9)
#	printf("fn=\"%s\" extracted from \"%s\"\n", fn, $0)
	next
}
/^MEMSIZE *=/ {
	# Grab MEMSIZE field data.
	split($0, fields, / *= *|;/)
	ms = fields[2]
#	printf("ms=\"%s\" extracted from \"%s\"\n", ms, $0)
	next
}
/\/SASFoundation\// {
	# If any line contains the literal string "/SASFoundation/", set se to
	# "SASFoundation".
	se = "SASFoundation"
#	printf("se=\"%s\" extracted from \"%s\"\n", se, $0)
	next
}
$1 == "real" && $2 == "time" && NF == 3 && split($3, fields, /:/) == 3 {
	# We have found a "real time" line with 3 fields and the 3rd field is of
	# the form hours:minutes:seconds.  Set rt to $3 if hours > 5 OR
	# (hours == 5 AND (minutes > 0 || seconds > 0)).  Adding 0 forces
	# numeric comparisons.  A leading zero is prepended here for single
	# digit hours because the 0 flag in %08s is not portable for strings.
	if(fields[1] + 0 > 5 ||
	    (fields[1] + 0 == 5 && (fields[2] + 0 > 0 || fields[3] + 0 > 0)))
		rt = (length(fields[1]) == 1 ? "0" : "") $3
#	printf("rt=\"%s\" extracted from \"%s\"\n", rt, $0)
	next
}
END {	# Print results from last input file.
	pr()
}' > progflag.txt

# Send mail if output was produced.
[ -s progflag.txt ] && echo "Code Need to be Evaluated" |
    mailx -s "subject text" -a  progflag.txt receiver@domain.com

This script was written using a Korn shell and tested with a Korn shell and with bash. It should work with any POSIX-conforming shell. If you want to try this on a Solaris/SunOS system, change awk in this script to /usr/xpg4/bin/awk or nawk. If the files you uploaded as sample data for this thread are located in the directory /tmp/log, this script creates a file named progflag.txt containing:
Code:
MEMSIZE  SECOND    SASEXE         Filename
400      06:00:00  SASFoundation  GT_5hr.txt
200                               SASFoundation_MEMSIZE.txt
400      06:00:00  SASFoundation  more_than_5_hr.txt
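
The awk substitution suggested above for Solaris/SunOS could be selected automatically instead of by editing the script. This is a minimal sketch, assuming the XPG4 awk is installed at its usual location /usr/xpg4/bin/awk; the function name pick_awk and the variable AWK are hypothetical and not part of the script:

```shell
#!/bin/sh
# Hypothetical helper: print the name of the awk to use.  On Solaris/SunOS
# the default awk is an old pre-POSIX version, so prefer the XPG4 awk and
# fall back to nawk.  On other systems plain "awk" is assumed to be adequate.
pick_awk() {
	case $(uname -s) in
	SunOS)	if [ -x /usr/xpg4/bin/awk ]
		then	echo /usr/xpg4/bin/awk
		else	echo nawk
		fi;;
	*)	echo awk;;
	esac
}

AWK=$(pick_awk)
"$AWK" 'BEGIN { print "using an awk that understands BEGIN" }'
```

With this in place, the pipeline in the script would end in done | "$AWK" '...' rather than done | awk '...'.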

Of course, the script won't work if receiver@domain.com is not a valid e-mail address or if your system's version of mailx does not include a -a file option to include file as an attachment to your mail message. (The POSIX standards do not include a mailx -a file option.)
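
If your mailx has no -a option, one fallback is to put the report in the body of the message, which only needs the POSIX mailx -s subject option. This is a sketch under that assumption, not a drop-in replacement; the function name send_report and the MAILER variable are hypothetical:

```shell
#!/bin/sh
# Hypothetical helper: mail a report file inline instead of as an attachment.
# MAILER is an assumption made for this sketch so that the mail command can
# be swapped out (for example, for testing); it defaults to mailx.
send_report() {
	report=$1
	subject=$2
	recipient=$3
	# Nothing to send if the report is missing or empty.
	[ -s "$report" ] || return 0
	{
		echo "Code Need to be Evaluated"
		echo
		cat "$report"
	} | ${MAILER:-mailx} -s "$subject" "$recipient"
}

# Example call (uses the placeholder address from the script above):
# send_report progflag.txt "subject text" receiver@domain.com
```

The report is plain text with fixed-width columns, so it stays readable in the message body as long as the recipient's mail reader uses a monospaced font.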
 
