Find keywords in multiple log files
Post 302974898 by Don Cragun on Sunday 5th of June 2016 10:30:32 PM
Quote:
Originally Posted by dellanicholson
The OS is AIX 7.1.

My program searches multiple text files for certain keywords and their values, writes the information to a text file, and sends that file as an email attachment. One of the keywords is named real time. If the real time value in a text file is greater than 5:00:00, then the column name, its value, and the name of the text file that contains it are written to progflag.txt.

Another keyword included in the search is an assignment named Memsize and its value. Memsize, its value, and the name of the text file that contains them are written to progflag.txt.

The last keyword included in the search is the directory name SASFoundation. SASFoundation and the name of the text file that contains it are written to progflag.txt.

My problem is that progflag.txt contains the headers with no column values. Below is the output when I run the code:

Code:
MEMSIZE SECOND   SASEXE   FILENAME

Here is what the output results need to show in progflag.txt
Code:
MEMSIZE   SECOND     SASEXE                     Filename
200                                                        SASFoundation_MEMSIZE.txt
400       06:00:00         SASFoundation        GT_5hr.txt

In the below example, there should be only 2 filenames in the progflag.txt and not three. For example, no_SASFoundation_no_MEMSIZE.txt doesn't meet the criteria so there shouldn't be any data for this file in progflag.txt.


Here is my code:
Code:
#!/bin/bash


cd /log/tmp/*.txt | awk -F '[=:]' '
  function pr() {printf FORMAT, K[1],K[2],K[3],K[0]}
  BEGIN {FORMAT="%s\t%s\t%16s\t%s\n"
      printf FORMAT, "MEMSIZE","SECOND","SASEXE","Filename\n"
        for(i=split("/Memsize/ $2, ,/Real Time/ $2 ,/SASFoundation/ $3",A,",");i;i--) L[A[i]]=i
      FORMAT="%s\t%.1f\t%16s\t%s\n"
  }
  FNR==1 {
      if(K[1] || K[2]>'5:00:00' || K[3]) pr()
       K[0]=FILENAME
      K[1]=K[2]=K[3]=x
  }
  $1 in L {v=$2;gsub("^[/ ]*","",v);gsub(/ *$/,"",v);K[L[$1]]=v}
  END{if(K[1] || K[2]>'5:00:00' || K[3]) pr()}' *.txt > progflag.txt

[ -s progflag.txt ] && mailx -s "subject text" -a  progflag.txt receiver@domain.com < "Code Need to be Evaluated"

I'm going to ignore most of your sample shell script for the moment because it doesn't seem to match any of your stated requirements. But, it is the only thing we have where you state what the explicit key words are that you are looking for in your text file. The key words your script defines are the literal strings: /Memsize/ $2, a literal single space character, /Real Time/ $2 , and /SASFoundation/ $3. Except for the second keyword in this (the single <space> character), I have not been able to find any of these key words in any of your sample files.

Searching through your sample input files for the data shown in your desired output above, I can find a line that would be matched by the ERE *real time * on a line that does NOT also contain the string seconds. Note that regular expressions and filename pattern matches are case-sensitive on UNIX and UNIX-like systems. Real Time and real time are NOT the same! Note that printing the value 6:00:00 from the input line:
Code:
      real time     6:00:00

(which does not contain the word seconds like other "real time" values:
Code:
      real time         0.06 seconds
      real time     3.01  seconds
      real time     0.3  seconds
      real time     3.0   seconds

under the heading SECOND) is highly counterintuitive, and will NOT be displayed as you have requested using the printf format string %.1f. (Using that format with the input 6:00:00 would produce the output 6.0.) The string 6:00:00 seems to be hours, minutes, and seconds; not just seconds. And the test you're using to determine if a line should be printed is a string comparison; not a numeric comparison. With your test, a value of 6:00 (six minutes) would compare greater than 5:00:00 and a value of 10:00:01 (more than 10 hours) would compare less than 5:00:00. Please provide a much clearer description of which lines containing real time should be reported and explain what should happen if more than one of those lines in a single input file are selected. (Your code would only report the last selected line, if your code actually selected any lines matching this pattern. Is that what you want?)
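
To illustrate the difference between the string comparison and a numeric one, here is a minimal sketch that converts both values to seconds before comparing them. The function name longer_than is hypothetical (it is not in your script), and I am assuming the values are colon-separated hours, minutes, and seconds; that assumption still needs to be confirmed against your real data.

```shell
# Hypothetical helper: succeed (exit status 0) when duration $1 is strictly
# longer than limit $2. Both may be H:MM:SS, MM:SS, or plain seconds.
longer_than() {
    awk -v dur="$1" -v lim="$2" '
        # Convert a colon-separated time to seconds (base-60 positional sum).
        function to_secs(t,    part, n, i, s) {
            n = split(t, part, ":")
            s = 0
            for (i = 1; i <= n; i++)
                s = s * 60 + part[i]
            return s
        }
        BEGIN { exit !(to_secs(dur) > to_secs(lim)) }'
}

# 10:00:01 sorts before 5:00:00 as a string, but is longer as a duration.
longer_than 10:00:01 5:00:00 && echo "10:00:01 exceeds 5:00:00"
# 51:00 is 51 minutes, so it does not exceed five hours.
longer_than 51:00 5:00:00 || echo "51:00 does not exceed 5:00:00"
```

Doing the conversion inside awk keeps the comparison numeric and also copes with fractional values such as 0.06.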


The ERE MEMSIZE *= * seems to match the lines you are trying to grab from your input files:
Code:
MEMSIZE = 200;
MEMSIZE= 400;
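
As a rough sketch of how that ERE could be used to pull out the value, the following prints each MEMSIZE value together with the name of the file it came from. The function name memsize_values is hypothetical, and I am assuming the value runs from the = up to the trailing semicolon as in the two lines above. Note that the match is case-sensitive, so it looks for MEMSIZE, not Memsize.

```shell
# Hypothetical helper: print "<value><TAB><filename>" for every line that
# matches the ERE "MEMSIZE *= *" in the files named as arguments.
memsize_values() {
    awk '
        /MEMSIZE *= */ {
            v = $0
            sub(/.*MEMSIZE *= */, "", v)   # drop everything through the "="
            sub(/ *;.*$/, "", v)           # drop the trailing ";" and anything after it
            printf "%s\t%s\n", v, FILENAME
        }' "$@"
}
```

It would be called with the input files as operands, for example memsize_values *.txt from the directory containing the logs.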

The only line in any of your input files containing the string SASFoundation is:
Code:
z=/SAS/SAS94/SASFoundation/9.4;

which seems to have the key word z which is not mentioned anywhere in your description. Why is the value to be placed in your output under the heading SASEXE just the 3rd of the three or four directories named in the z key word's value?
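
To make the question concrete, this small sketch lists the directory components of the value on that line by position. The function name path_components is hypothetical; it only assumes the name=/path; layout shown above.

```shell
# Hypothetical helper: number the directory components of the value on a
# "name=/path;" line.
path_components() {
    printf '%s\n' "$1" | awk -F= '{
        v = $2
        sub(/;.*$/, "", v)          # strip the trailing semicolon
        n = split(v, dir, "/")      # dir[1] is empty because the path starts with "/"
        for (i = 2; i <= n; i++)
            printf "%d %s\n", i - 1, dir[i]
    }'
}

path_components 'z=/SAS/SAS94/SASFoundation/9.4;'
# prints:
# 1 SAS
# 2 SAS94
# 3 SASFoundation
# 4 9.4
```

SASFoundation is component 3 of 4 here, so a rule is needed that says why that component, and not another, is the one to report.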

The final field in your output is described in your explanation above as "the text filename that stores the information", and the MEMSIZE = 200; data in your output file comes from a file named SASFoundation_MEMSIZE.txt. But, the data for the last line of your sample output file comes from a file named more_than_5_hr.txt not from the file listed in your sample output: GT_5hr.txt.
 
