Full Discussion: Parsing syslog from Linux
Post 303037068 by Chubler_XL on Monday 22nd of July 2019 06:40:00 PM
You could try using index and substr instead of match to avoid the regex overhead. This takes about 2 minutes for a 2 GB file on my system:

Code:
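awk '
BEGIN   {
    HDLN = "eventtime|srcip|dstip|srcport|dstport|transip|transport|" \
           "action|sessionid"
    MX = split (HDLN, HD, "|")
    print HDLN
}
{
  DL = ""
  for (i=1; i<=MX; i++)  {
      v = ""
      s = index($0, HD[i] "=")          # position of "key=", 0 if absent
      if (s) {
          s += length(HD[i]) + 1        # first character of the value
          v = substr($0, s)
          e = index(v, " ")             # value ends at the next space...
          if (e) v = substr(v, 1, e-1)  # ...or at end of line if there is none
      }
      printf "%s%s", DL, v              # explicit format: a "%" in the data is safe
      DL = "|"
  }
  printf "\n"
}' infile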

--- Post updated at 09:40 AM ---

As a further test I implemented the same logic in C, and it finished in 1 min 20 sec on my system. This should be close to the fastest you can expect:

Code:
#include <stdio.h>
#include <string.h>

int main(void)
{
   char line_buff[1024];   /* lines longer than this are processed in pieces */
   int i;
   char *s;
   const char *dl;
   const char *match[] = {
     "eventtime=",
     "srcip=",
     "dstip=",
     "srcport=",
     "dstport=",
     "transip=",
     "transport=",
     "action=",
     "sessionid=",
      NULL };

   /* Header: the key names without the trailing '=' */
   printf("%.*s", (int)strlen(match[0])-1, match[0]);
   for(i=1;match[i];i++) printf("|%.*s", (int)strlen(match[i])-1, match[i]);
   printf("\n");

   /* Loop on fgets itself rather than testing feof() first */
   while (fgets(line_buff, sizeof line_buff, stdin)) {
       dl = "";
       for(i=0;match[i];i++) {
          fputs(dl, stdout);
          s=strstr(line_buff, match[i]);
          if(s) {
            s+=strlen(match[i]);
            /* Copy the value up to the next space or the end of the line */
            while(*s && *s!=' ' && *s!='\n') putchar(*(s++));
          }
          dl = "|";
        }
       printf("\n");
   }
   return 0;
}
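
One caveat with a plain substring search: strstr (like index in the awk version) also finds a key embedded in a longer one, so "action=" would match inside a hypothetical "utmaction=" field if that field came first on the line. The sketch below is one possible way to guard against that, by accepting a match only at the start of the line or right after a space. The helper names find_field and get_value are made up for illustration and are not part of the code above, and the sketch assumes fields are separated by single spaces and values contain no spaces.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper (not from the post): locate "key=" only where it
   starts a field, i.e. at the start of the line or right after a space.
   Returns a pointer to the first character of the value, or NULL. */
const char *find_field(const char *line, const char *key_eq)
{
    const char *p = line;
    while ((p = strstr(p, key_eq)) != NULL) {
        if (p == line || p[-1] == ' ')
            return p + strlen(key_eq);
        p++;                      /* embedded match: keep searching */
    }
    return NULL;
}

/* Copy the value for key_eq into out (capacity cap, always NUL-terminated).
   Returns the number of characters copied; 0 if the key is absent. */
size_t get_value(const char *line, const char *key_eq, char *out, size_t cap)
{
    const char *v = find_field(line, key_eq);
    size_t n;

    assert(cap > 0);
    out[0] = '\0';
    if (v == NULL)
        return 0;
    n = strcspn(v, " \n");        /* value ends at a space or newline */
    if (n >= cap)
        n = cap - 1;              /* truncate rather than overflow */
    memcpy(out, v, n);
    out[n] = '\0';
    return n;
}
```

In the main loop, a call such as get_value(line_buff, match[i], buf, sizeof buf) followed by fputs(buf, stdout) could stand in for the strstr call and the character-copy loop. The extra boundary check and copy will cost some time, so it is only worth it if your logs really do contain overlapping key names.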


Last edited by Chubler_XL; 07-22-2019 at 07:38 PM.. Reason: Fix indenting