Replacing lines matching a multi-line pattern (sed/perl/awk)
Post 302890120 by thefang on Tuesday 25th of February 2014 09:32:29 AM
Thanks Klashxx,

Your Python script works!
I changed the pattern to...
Code:
>>> pattern = re.compile(r'''
... ^[^\n]+@CAL\sRtlInitAnsiString\s@PA1\s0x0012f740[^\n]+\n
... (?:(?!^[^\n]+RtlInitAnsiString)[^\n]+\n){0,3}
... ^[^\n]+@CAL\smemmove\s@PA1\s0x0012f740[^\n]+
... ''', re.X|re.M|re.S)

...so it would also match adjacent lines. Now I have to figure out two things: how to turn this into a "one-liner" (I currently use "eval" to loop through a file containing pattern-matching commands, mostly "sed"), and what each part of the expression does. Up until now, my scripting endeavors were limited to rather basic stuff :)
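As a rough aid for the "what does each part do" question, here is the same pattern with a comment on each part, applied to a few made-up lines. The sample lines, the `replace_blocks` helper and the `<<matched block>>` replacement text are assumptions for illustration only; the real log format and the desired replacement are not shown in this post.

```python
import re

# Same pattern and flags as above. re.X allows whitespace and comments inside
# the pattern; re.M makes ^ match at the start of every line.
pattern = re.compile(r'''
    ^[^\n]+@CAL\sRtlInitAnsiString\s@PA1\s0x0012f740[^\n]+\n  # first line: the RtlInitAnsiString call
    (?:(?!^[^\n]+RtlInitAnsiString)[^\n]+\n){0,3}             # 0 to 3 lines in between, none of them another RtlInitAnsiString line
    ^[^\n]+@CAL\smemmove\s@PA1\s0x0012f740[^\n]+              # last line: the memmove call on the same address
''', re.X | re.M | re.S)

# Hypothetical input lines, only loosely modelled on the pattern.
sample = (
    "0001 @CAL RtlInitAnsiString @PA1 0x0012f740 @PA2 0x00401000\n"
    "0002 @CAL strlen @PA1 0x00401000 @RET 5\n"
    "0003 @CAL memmove @PA1 0x0012f740 @PA2 0x00401000\n"
    "0004 @CAL free @PA1 0x00500000\n"
)

def replace_blocks(text, replacement="<<matched block>>"):
    """Replace every block matched by the multi-line pattern."""
    return pattern.sub(replacement, text)

result = replace_blocks(sample)
print(result)
```

Because of `{0,3}`, the block is also matched when the memmove line directly follows the RtlInitAnsiString line, which is the "adjacent lines" case mentioned above.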

Does anyone know how Python compares to other approaches (awk, etc.) in terms of performance? The files I plan to analyze have upwards of 50,000 lines each and are matched against hundreds of single-line and multi-line patterns.
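One way to get a number for the Python side is to time it directly on data of the expected size. The sketch below is only that, a sketch: the two rules, the synthetic 50,000-line input and the helper names (`make_sample`, `apply_rules`, `timed_run`) are all made up for illustration. It reads the whole text into one string, compiles every pattern once, and applies the rules in order. It says nothing about how awk or sed would do on the same job.

```python
import re
import time

# Hypothetical (pattern, replacement) rules; not taken from this thread.
RAW_RULES = [
    (r'^[^\n]+@CAL\sfree\s[^\n]+$', 'FREE_CALL'),                                # single-line rule
    (r'^[^\n]+@CAL\sopen\s[^\n]+\n^[^\n]+@CAL\sclose\s[^\n]+$', 'OPEN_CLOSE'),   # two-line rule
]
# Compile once, up front, so the cost is not paid again for every file.
RULES = [(re.compile(p, re.M), r) for p, r in RAW_RULES]

def make_sample(n_blocks):
    """Build synthetic input: n_blocks repetitions of a 4-line block."""
    block = (
        "0001 @CAL open @PA1 0x1\n"
        "0002 @CAL close @PA1 0x1\n"
        "0003 @CAL free @PA1 0x2\n"
        "0004 @CAL other @PA1 0x3\n"
    )
    return block * n_blocks

def apply_rules(text, rules):
    """Apply each compiled rule to the whole text, in order."""
    for compiled, replacement in rules:
        text = compiled.sub(replacement, text)
    return text

def timed_run(text, rules):
    """Return the rewritten text and the elapsed wall-clock seconds."""
    start = time.perf_counter()
    result = apply_rules(text, rules)
    return result, time.perf_counter() - start

text = make_sample(12500)  # 4 lines per block, so 50,000 lines
result, seconds = timed_run(text, RULES)
print("%d lines in, %d lines out, %.3f s"
      % (text.count("\n"), result.count("\n"), seconds))
```

Swapping the synthetic text for a real file and the two rules for the real pattern list would give a figure that can be compared against the existing sed/eval loop on the same input.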

Cheers

Last edited by thefang; 02-25-2014 at 10:37 AM.. Reason: python<>perl mixup
 
