Grep command to search a regular expression in a line and only print the string after the match


 
# 8  
Old 08-25-2016
The pattern uses no ERE-only syntax, so -E is not needed; the default (BRE) is fine for getting the timestamp:
Code:
timestamp='[0-9][0-9][0-9][0-9]-[0-9][0-9]-[0-3][0-9][ ][0-9][0-9]:[0-9][0-9]:[0-9][0-9],[0-9][0-9][0-9]'
echo "$line" | grep -o "$timestamp"

sed uses BRE by default as well. To get rid of the timestamp, substitute it with nothing (this keeps the text on both sides of it):
Code:
timestamp='[0-9][0-9][0-9][0-9]-[0-9][0-9]-[0-3][0-9][ ][0-9][0-9]:[0-9][0-9]:[0-9][0-9],[0-9][0-9][0-9]'
echo "$line" | sed "s/$timestamp//"

You probably also want to cut everything before the timestamp and any spaces after it:
Code:
 echo "$line" | sed "s/.*$timestamp *//"
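
As a quick check, the snippet below runs the grep and sed commands from this post against one sample line (copied from the Input_file posted later in this thread). The variable names `ts` and `msg` are only for illustration, and `grep -o` is assumed to be available (GNU and BSD grep both provide it).

```shell
# Sample line copied from the Input_file shown later in the thread.
line='/logs/GRAS/LGT/applogs/lgt-2016-08-24/2016-08-24.8.log.zip:2016-08-24 19:12:48,602 [ttp-/57.20.70.159:8111-35] ERROR com.lufthansa.lgt.exception.filter.LGTExceptionResolver - The error is : For input string: " S"'
timestamp='[0-9][0-9][0-9][0-9]-[0-9][0-9]-[0-3][0-9][ ][0-9][0-9]:[0-9][0-9]:[0-9][0-9],[0-9][0-9][0-9]'

# Only the timestamp: -o prints just the matched part of the line.
ts=$(printf '%s\n' "$line" | grep -o "$timestamp")

# Only the text after the timestamp: delete everything up to and
# including the timestamp and the spaces that follow it.
msg=$(printf '%s\n' "$line" | sed "s/.*$timestamp *//")

printf '%s\n%s\n' "$ts" "$msg"
```

The dates inside the file path do not match because the pattern requires a space and a time right after the date.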

# 9  
Old 08-26-2016
Quote:
Originally Posted by RavinderSingh13
Hello Ramneekgupta91,

Could you please try the following (tested with GNU awk).
Code:
awk --re-interval '{match($0,/[0-9]{4}-[0-2][0-9]-[0-9]{2} [0-2][0-9]:[0-5][0-9]:[0-5][0-9].*/);print substr($0,RSTART+24,RLENGTH-24)}'  Input_file
OR
awk --re-interval '{match($0,/[0-9]{4}-[0-2][0-9]-[0-9]{2} [0-2][0-9]:[0-5][0-9]:[0-5][0-9],[0-9]{3} .*/);print substr($0,RSTART+24,RLENGTH-24)}'  Input_file

Output will be as follows.
Code:
[ttp-/57.20.70.159:8111-35] ERROR com.lufthansa.lgt.exception.filter.LGTExceptionResolver - The error is : For input string: " S"

Adding one more solution here.
Code:
awk --re-interval '{sub(/.*[0-9]{4}-[0-2][0-9]-[0-9]{2} [0-2][0-9]:[0-5][0-9]:[0-5][0-9],[0-9]{3} /,X,$0);print}'  Input_file

EDIT: The above solutions may also accept a month greater than 12 or an hour greater than 23 (one could trust that the data is in a correct time format, but an exact match is safer), so I have tightened the regex as follows.
So let's say we have the following Input_file:
Code:
/logs/GRAS/LGT/applogs/lgt-2016-08-24/2016-08-24.8.log.zip:2016-08-24 19:12:48,602 [ttp-/57.20.70.159:8111-35] ERROR com.lufthansa.lgt.exception.filter.LGTExceptionResolver - The error is : For input string: " S"
/logs/GRAS/LGT/applogs/lgt-2016-08-24/2016-08-24.8.log.zip:2016-18-24 19:12:48,602 [ttp-/57.20.70.159:8111-35] ERROR com.lufthansa.lgt.exception.filter.LGTExceptionResolver - The error is : For input string: " S$$$$$$$"
/logs/GRAS/LGT/applogs/lgt-2016-28-24/2016-08-24.8.log.zip:2016-28-24 59:12:48,602 [ttp-/57.20.70.159:8111-35] ERROR com.lufthansa.lgt.exception.filter.LGTExceptionResolver - The error is : For input string: " S"
/logs/GRAS/LGT/applogs/lgt-2016-29-24/2016-08-24.8.log.zip:2016-29-24 59:12:48,602 [ttp-/57.20.70.159:8111-35] ERROR com.lufthansa.lgt.exception.filter.LGTExceptionResolver - The error is : For input string: " S"
/logs/GRAS/LGT/applogs/lgt-2016-29-24/2016-08-24.8.log.zip:2016-12-24 59:12:48,602 [ttp-/57.20.70.159:8111-35] ERROR com.lufthansa.lgt.exception.filter.LGTExceptionResolver - The error is : For input string: " S"
/logs/GRAS/LGT/applogs/lgt-2016-29-24/2016-08-24.8.log.zip:2016-11-24 59:12:48,602 [ttp-/57.20.70.159:8111-35] ERROR com.lufthansa.lgt.exception.filter.LGTExceptionResolver - The error is : For input string: " S"
/logs/GRAS/LGT/applogs/lgt-2016-29-24/2016-08-24.8.log.zip:2016-13-24 59:12:48,602 [ttp-/57.20.70.159:8111-35] ERROR com.lufthansa.lgt.exception.filter.LGTExceptionResolver - The error is : For input string: " S"
/logs/GRAS/LGT/applogs/lgt-2016-29-24/2016-08-24.8.log.zip:2016-13-24 79:12:48,602 [ttp-/57.20.70.159:8111-35] ERROR com.lufthansa.lgt.exception.filter.LGTExceptionResolver - The error is : For input string: " S"

I have added some more scenarios to it, such as a 29th month and hour values such as 59 and 79. The following code should handle those cases.
Code:
awk --re-interval '{match($0,/[0-9]{4}-(0[1-9]|1[0-2])-(0[1-9]|1[0-9]|2[0-9]|3[0-1]) ([0-1][1-9]|[0-2][0-3]):[0-5][0-9]:[0-5][0-9],[0-9]{3} .*/);if(substr($0,RSTART+24,RLENGTH-24)){print substr($0,RSTART+24,RLENGTH-24)}}'  Input_file

Output will be as follows.
Code:
[ttp-/57.20.70.159:8111-35] ERROR com.lufthansa.lgt.exception.filter.LGTExceptionResolver - The error is : For input string: " S"

Only the 1st line meets the criteria, so only that one is printed.

Adding a multi-line form of the solution too.
Code:
awk --re-interval '{
    match($0,/[0-9]{4}-(0[1-9]|1[0-2])-(0[1-9]|1[0-9]|2[0-9]|3[0-1]) ([0-1][1-9]|[0-2][0-3]):[0-5][0-9]:[0-5][0-9],[0-9]{3} .*/);
    if(substr($0,RSTART+24,RLENGTH-24)){
        print substr($0,RSTART+24,RLENGTH-24)
    }
}' Input_file

Thanks,
R. Singh

Thanks R. Singh for the information.

I have used
Code:
awk --re-interval '{sub(/.*[0-9]{4}-[0-2][0-9]-[0-9]{2} [0-2][0-9]:[0-5][0-9]:[0-5][0-9],[0-9]{3} /,X,$0);print}'  Input_file

and it works perfectly fine.

Though I could only understand the regex part of the command.
Could you explain the meaning of each part of the command apart from the regex?

Also, could you guide me on how I can learn awk and sed?

Thanks
Ramneek
# 10  
Old 08-26-2016
Hello Ramneekgupta91,

The following may help you with that.
Code:
awk --re-interval   #### Enables interval expressions such as {4} in regular expressions (a GNU awk option; gawk 4.0 and later enable them by default).
'{sub(              #### sub() is awk's built-in function that replaces the first match of a regex with the given text, either in a variable or in the whole line (whichever target you name).
/.*[0-9]{4}         #### .* matches everything from the start of the line up to the timestamp; [0-9] is any digit and {4} means exactly 4 of them, e.g. the year 2016.
-[0-2][0-9]         #### A literal - (dash), then a digit from 0 to 2 and a digit from 0 to 9, for the month.
-[0-9]{2}           #### A literal - (dash), then {2} means 2 consecutive digits from 0 to 9, for the day.
 [0-2][0-9]         #### A space, then a digit from 0 to 2 and a digit from 0 to 9, e.g. 09 or 07 or 21, for the hours.
:[0-5][0-9]         #### A : (colon), then a digit from 0 to 5 and a digit from 0 to 9, e.g. 51 or 02, for the minutes.
:[0-5][0-9]         #### A : (colon), then a digit from 0 to 5 and a digit from 0 to 9, e.g. 51, 23 or 02, for the seconds.
,[0-9]{3} /         #### A comma, then 3 consecutive digits (the milliseconds), then a space (as in your Input_file); the / closes the regex.
,X,                 #### The replacement text. X is a variable that was never assigned, so awk treats it as an empty string and the matched text is simply deleted.
$0);                #### The target of the substitution: $0 is the complete current line.
print}              #### Print the line after the substitution.
'  Input_file       #### The file to read.
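
The annotated listing above is for reading, not for pasting, because the comments sit inside the quoted awk program. A runnable sketch of the same substitution follows. It makes a few assumptions of its own: the sample line is fed on stdin rather than from Input_file, the replacement is written as an explicit `""` instead of the unset variable X, and the `{4}`-style intervals are spelled out as repeated `[0-9]` so the command does not depend on interval support in the awk at hand.

```shell
# One sample line from the thread, passed on stdin instead of a file.
line='/logs/GRAS/LGT/applogs/lgt-2016-08-24/2016-08-24.8.log.zip:2016-08-24 19:12:48,602 [ttp-/57.20.70.159:8111-35] ERROR com.lufthansa.lgt.exception.filter.LGTExceptionResolver - The error is : For input string: " S"'

out=$(printf '%s\n' "$line" |
  awk '{
    # Delete everything up to and including the timestamp and the space after it.
    # With no third argument, sub() works on $0, the whole current line.
    sub(/.*[0-9][0-9][0-9][0-9]-[0-2][0-9]-[0-9][0-9] [0-2][0-9]:[0-5][0-9]:[0-5][0-9],[0-9][0-9][0-9] /, "")
    print
  }')

printf '%s\n' "$out"
```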

But I would suggest using the other solution for stricter regex matching.
Code:
awk --re-interval '{match($0,/[0-9]{4}-(0[1-9]|1[0-2])-(0[1-9]|1[0-9]|2[0-9]|3[0-1]) ([0-1][1-9]|[0-2][0-3]):[0-5][0-9]:[0-5][0-9],[0-9]{3} .*/);if(substr($0,RSTART+24,RLENGTH-24)){print substr($0,RSTART+24,RLENGTH-24)}}'  Input_file
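
A possible variation on the command above, offered only as a sketch: it uses the return value of match() as the condition and computes the cut point from RSTART and RLENGTH instead of the hard-coded offset 24. The three input lines are shortened, made-up versions of the thread's sample data, and the ranges are written with a single | between alternatives and without `{n}` intervals so that no awk option is needed.

```shell
out=$(awk '
  # match() returns the position of the match, or 0 when there is none,
  # so it can be used directly as the pattern that gates the print.
  match($0, /[0-9][0-9][0-9][0-9]-(0[1-9]|1[0-2])-(0[1-9]|[12][0-9]|3[01]) ([01][0-9]|2[0-3]):[0-5][0-9]:[0-5][0-9],[0-9][0-9][0-9] /) {
    # RSTART + RLENGTH points at the first character after the timestamp and its trailing space.
    print substr($0, RSTART + RLENGTH)
  }
' <<'EOF'
a.log:2016-08-24 19:12:48,602 [t-1] ERROR valid timestamp
a.log:2016-18-24 19:12:48,602 [t-2] ERROR month 18
a.log:2016-12-24 59:12:48,602 [t-3] ERROR hour 59
EOF
)

printf '%s\n' "$out"
```

Only the first line has a month in 01-12 and an hour in 00-23, so only its message is printed.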

The best way to learn anything is practice and reading, so keep asking good questions (and do make a few attempts of your own before posting, for the sake of learning), and
try to learn from the forum's posts (this is one of the BEST forums for learning UNIX/LINUX/Scripting/Admin), good books, man pages, etc.


Thanks,
R. Singh

Last edited by RavinderSingh13; 08-26-2016 at 09:30 AM..