Grok filter to extract substring from path and add to host field in logstash


 
# 1  
Old 04-23-2015

Hi,

I am reading data from log files, with the input path defined as a glob such as *.log.

The file names look like app1a_test2_heep.log, cdc2a_test3_heep.log, etc.

How can I configure logstash so that the part of the file name before the first underscore (app1a, cdc2a, ...) is extracted and stored in the host field, replacing the default host?

Example:

fileName: app1a_test2_heep.log

host => app1a

My path field looks like this:
path => /data/app1a_test2_heep.log
What filter would do this?

Thanks in advance,
Regards,
Ravi
# 2  
Old 04-23-2015
I am unable to decipher what you are trying to do.
Are you trying to get a list of files to read?
Are you trying to extract a list of hosts from a list of files?
Are you trying to create a list of pathnames to process from a list of files?
Is the list of files in a file, or the current directory, or some other directory?

Please clearly explain what you are trying to do, show us what you have done (using CODE tags), show us the output you're getting from what you have done (using CODE tags), and show us the output you're trying to get (using CODE tags).

What operating system are you using?
What shell are you using?
What tools are you trying to use?
# 3  
Old 04-24-2015
I am trying to extract the host name from the file names.
# 4  
Old 04-24-2015
Quote:
Originally Posted by Ravi Kishore
How to configure logstash so that the part of the file name before the first underscore (app1a, cdc2a, ...) is added to the host field, replacing the default host?

Hello Ravi,

Could you please try the following and let me know if it helps?
Code:
echo "/data/app1a_test2_heep.log" | awk '{match($0,/\/.*_/);gsub(/.*\//,X,$0);gsub(/_.*/,Y,$0);print $0}'
OR
echo "cdc2a_test3_heep.log" | awk '{match($0,/\/.*_/);gsub(/_.*/,Y,$0);print $0}'

You can use either of the above as needed. Let me know if you have any questions.

Thanks,
R. Singh
# 5  
Old 04-24-2015
Quote:
Originally Posted by RavinderSingh13
Code:
echo "/data/app1a_test2_heep.log" | awk '{match($0,/\/.*_/);gsub(/.*\//,X,$0);gsub(/_.*/,Y,$0);print $0}'

Could you please explain why you used the match function here? What does it do?
# 6  
Old 04-24-2015
You refused to answer my questions about where your filenames are located and what you're really trying to do. Maybe this will help a little bit:
Code:
#!/bin/ksh
for path in /data/*.log *.log
do	host=${path##*/}
	host=${host%%[_.]*}
	printf 'pathname: %s\nhost: %s\n\n' "$path" "$host"
done

This was written and tested using the Korn shell, but will work with any shell that performs basic parameter substitutions as required by the POSIX standards. Depending on what files are present in /data and in the current directory, it produces output similar to the following:
Code:
pathname: /data/app1a_test2_heep.log
host: app1a

pathname: /data/cdc2a_test3_heep.log
host: cdc2a

pathname: abc_xyz.log
host: abc

pathname: xyz.log
host: xyz
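
For readers unfamiliar with the two parameter expansions used in the loop above, here is a minimal sketch that wraps them in a function and applies them to single pathnames. The function name host_from_path is made up for illustration and is not part of any tool; the expansions themselves are the standard POSIX ones the script already relies on.

```shell
#!/bin/sh
# host_from_path is a hypothetical helper name used only for this sketch.
host_from_path() {
	name=${1##*/}          # drop the longest prefix ending in "/" (the directory part)
	name=${name%%[_.]*}    # drop the longest suffix starting at the first "_" or "."
	printf '%s\n' "$name"
}

host_from_path /data/app1a_test2_heep.log   # prints app1a
host_from_path cdc2a_test3_heep.log         # prints cdc2a
host_from_path xyz.log                      # prints xyz
```

Because %% removes the longest matching suffix, the cut happens at the first underscore or dot, which is why a name with no underscore such as xyz.log still yields xyz.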

# 7  
Old 04-24-2015
Thanks, Akshay, for pointing it out. I was first trying something else with match and later switched to gsub; while posting I forgot to remove match. It can also be written as follows.

Code:
echo "cdc2a_test3_heep.log" | awk '{gsub(/.*\//,X,$0);gsub(/_.*/,Y,$0);print $0}'
OR
echo "/data/app1a_test2_heep.log" | awk '{gsub(/.*\//,X,$0);gsub(/_.*/,Y,$0);print $0}'

Thanks,
R. Singh
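
As a rough illustration of why the match call could be dropped: in awk, match() only reports where the regular expression matched, through the RSTART and RLENGTH variables, and does not change $0. The sketch below prints those values, then shows an alternative extraction that uses the field separator instead of gsub. The function names show_match and host_with_awk are assumptions made for this example.

```shell
#!/bin/sh
# show_match and host_with_awk are hypothetical names used only for this sketch.

# Print RSTART, RLENGTH and the unchanged record after calling match().
show_match() {
	printf '%s\n' "$1" | awk '{ match($0, /\/.*_/); print RSTART, RLENGTH, $0 }'
}

# Split the record on "/" to get the last path component,
# then split that component on "_" and keep the first piece.
host_with_awk() {
	printf '%s\n' "$1" | awk -F/ '{ split($NF, parts, "_"); print parts[1] }'
}

show_match /data/app1a_test2_heep.log      # prints: 1 18 /data/app1a_test2_heep.log
host_with_awk /data/app1a_test2_heep.log   # prints: app1a
host_with_awk cdc2a_test3_heep.log         # prints: cdc2a
```

The record printed by show_match is identical to the input, which is why the gsub calls that follow produce the same result with or without match.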