Deleting lines on matching certain pattern


 
# 8  
Old 05-23-2015
Try
Code:
awk '
/<\/invoices>/  {SLINV = $0
                 getline SB
                 getline INV
                 getline
                 if     (!(/<invoiceSerialNo/ &&
                         SB ~ /<\/shippingBill>/))      {print SLINV
                                                         print SB
                                                        }
                 print INV}
1
' file

# 9  
Old 05-24-2015
The script is working superbly, and only those tags are removed, as expected. I have also removed the blank lines by piping the output through

Code:
 awk '!/^$/'

Thanks a lot. You've saved me a lot of time, as I have to submit this on a daily basis. Thanks again.
Because my awk knowledge is basic, I could not fully understand the code. Could you please explain it, for my benefit and that of other users?
# 10  
Old 05-24-2015
Why pipe through awk once more if that can be done in one go? Try
Code:
awk '
/^ *$/          {next}                                          # remove blank lines
/<\/invoices>/  {SLINV = $0                                     # if $0 matches </invoices>, save it in SLINV
                 getline SB                                     # read three more lines, save in variables
                 getline INV                              
                 getline                                        # do not save this: it is needed in $0 for matching and later printing
                 if     (!(/<invoiceSerialNo/ &&                # test for the conditions 2 and 3; if both met, do not print
                         SB ~ /<\/shippingBill>/))      {print SLINV
                                                         print SB
                                                        }
                 print INV}                                     # print regardless of conditions 
1                                                               # default: print $0
' file
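
As a small illustration of the two getline forms used above, here is a sketch on made-up input (the four lines "one" to "four" and the variable names first and second are assumptions for the demo, not taken from the XML file). It shows that getline with a variable fills only that variable, while a bare getline replaces $0:

```shell
out=$(printf 'one\ntwo\nthree\nfour\n' | awk '
/one/ {
	first = $0        # save the matching line
	getline second    # read "two" into second; $0 is still "one"
	getline           # read "three" into $0 (NF and the fields are reset too)
	print first, second, $0
}')
printf '%s\n' "$out"
```

This should print "one two three". The line "four" is read as an ordinary record afterwards and is not printed, because this demo has no trailing 1 rule.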

# 11  
Old 05-24-2015
Quote:
Originally Posted by sunnyboy
The script is working superbly, and only those tags are removed, as expected. I have also removed the blank lines by piping the output through

Code:
 awk '!/^$/'

[..]
Note: this only works reliably if the blank lines are always completely empty. If an otherwise empty line can contain a space character and should still count as blank, that line will not be removed.

A more reliable method then would be to use the NF variable:
Code:
awk NF
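
The difference between the two filters can be checked on a small made-up input (three lines, the middle one holding only spaces; the variable names are just for this sketch). The awk programs count the lines that survive each filter:

```shell
# Lines kept by the regex filter: the whitespace-only line is not zero-length, so it stays.
kept_by_regex=$(printf 'first\n   \nsecond\n' | awk '!/^$/ { n++ } END { print n + 0 }')

# Lines kept by the NF filter: NF is 0 for empty and whitespace-only lines alike.
kept_by_nf=$(printf 'first\n   \nsecond\n' | awk 'NF { n++ } END { print n + 0 }')

echo "regex filter keeps $kept_by_regex lines, NF filter keeps $kept_by_nf lines"
```

With the default field separator this should report 3 lines for the regex filter and 2 for NF.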


Last edited by Scrutinizer; 05-24-2015 at 07:00 AM..
# 12  
Old 05-24-2015
Thanks, Rudi.
It is working superbly, and faster as well. Thanks for explaining the code.

Thanks to Scrutinizer as well for improving the code.
# 13  
Old 05-25-2015
When I try RudiC's code with Scrutinizer's suggestion added in:
Code:
awk '
!NF             {next}
/<\/invoices>/  {SLINV = $0
                 getline SB
                 getline INV
                 getline
                 if     (!(/<invoiceSerialNo/ &&
                         SB ~ /<\/shippingBill>/))      {print SLINV
                                                         print SB
                                                        }
                 print INV}
1
' file

with file containing the sample XML file from post #1 in this thread, I get the following two lines added to the end of the output I would expect from that input:
Code:
   </shippingBill>
     </invoices>

This appears to be because the return values of the getline calls are not checked for EOF before the data is used. (At least that is what I get with awk on OS X. The standards do not specify whether the variable in getline variable is cleared or left unchanged when EOF is encountered.) Furthermore, the first rule in the script (which throws away blank lines) does not catch blank lines that might appear among the three lines read by:
Code:
                 getline SB
                 getline INV
                 getline

after a line containing </invoices> is found.
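
To make the EOF point concrete, here is a minimal sketch on made-up two-line input (the names status and nxt are assumptions for the demo). getline returns 1 when it reads a line, 0 at end of file, and -1 on a read error, so the status can be tested before the variable is used:

```shell
out=$(printf 'a\nb\n' | awk '
/b/ {
	status = (getline nxt)          # 1 = line read, 0 = EOF, -1 = error
	if (status == 1)
		print "next line: " nxt
	else
		print "no more input, status " status
}')
printf '%s\n' "$out"
```

Because "b" is the last line, the getline call fails and the else branch runs, printing "no more input, status 0".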

If I understand what is desired here, you might want to try:
Code:
awk '
function get3lines(	status) {
	while((status = getline) == 1 && !NF) {}
	if(status != 1) {
		print SLINV
		exit
	}
	SB = $0
	while((status = getline) == 1 && !NF) {}
	if(status != 1) {
		print SLINV
		print SB
		exit
	}
	INV = $0
	while((status = getline) == 1 && !NF) {}
	if(status != 1) {
		print SLINV
		print SB
		print INV
		exit
	}
}
!NF	{ next
}
/<\/invoices>/  {
	SLINV = $0
	get3lines()
	if(!(/<invoiceSerialNo/ &&
	    SB ~ /<\/shippingBill>/)) {
		print SLINV
		print SB
	}
	print INV
}
1
' file
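
The loop that get3lines() repeats three times could also be factored into a helper of its own. The following is only a sketch of that idea on made-up input, not a replacement for the script above; the function name next_nonblank is hypothetical:

```shell
out=$(printf 'a\n\n  \nb\n' | awk '
# Hypothetical helper: move $0 to the next line that has at least one field.
# Returns the getline status: 1 = line read, 0 = EOF, -1 = read error.
function next_nonblank(    status) {
	while ((status = getline) == 1 && !NF) {}
	return status
}
/^a$/ {
	if (next_nonblank() == 1)
		print "line after a: " $0
	else
		print "input ended after a"
}')
printf '%s\n' "$out"
```

Here the empty line and the whitespace-only line are skipped, so the output should be "line after a: b". The caller decides what to print when the status is not 1, which is the part get3lines() handles separately for each of its three reads.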

# 14  
Old 05-25-2015
Absolutely! Thanks for pointing that out.