Match paragraph between two patterns, delete the duplicate paragraphs | Unix Linux Forums | Shell Programming and Scripting

    #1  
07-10-2013
sb245

Hello all

I have a file on my DNS server that contains duplicate paragraphs like the ones below. How can I remove the duplicates so that only one copy of each paragraph remains?


Code:
BEGIN;
replace into domains (name,type) values ('225.168.192.in-addr.arpa','MASTER');
replace into records (domain_id, name,type,content,ttl,prio) select id ,'225.168.192.in-addr.arpa', 'SOA', 'ns1.test.com. hostmaster.test.com. 2013030511 600 7200 604800 345600', 86400, 0 from domains where name='225.168.192.in-addr.arpa';
replace into records (domain_id, name,type,content,ttl,prio) select id ,'225.168.192.in-addr.arpa', 'NS', 'ns1.test.com', 86400, 0 from domains where name='225.168.192.in-addr.arpa';
replace into records (domain_id, name,type,content,ttl,prio) select id ,'225.168.192.in-addr.arpa', 'NS', 'ns2.test.com', 86400, 0 from domains where name='225.168.192.in-addr.arpa';
replace into records (domain_id, name,type,content,ttl,prio) select id ,'54.225.168.192.in-addr.arpa', 'PTR', 'somedomain1.test.com', 86400, 0 from domains where name='225.168.192.in-addr.arpa';
replace into records (domain_id, name,type,content,ttl,prio) select id ,'117.225.168.192.in-addr.arpa', 'PTR', 'somedomain2.test.com', 86400, 0 from domains where name='225.168.192.in-addr.arpa';
COMMIT;
BEGIN;
replace into domains (name,type) values ('225.168.192.in-addr.arpa','MASTER');
replace into records (domain_id, name,type,content,ttl,prio) select id ,'225.168.192.in-addr.arpa', 'SOA', 'ns1.test.com. hostmaster.test.com. 2013030511 600 7200 604800 345600', 86400, 0 from domains where name='225.168.192.in-addr.arpa';
replace into records (domain_id, name,type,content,ttl,prio) select id ,'225.168.192.in-addr.arpa', 'NS', 'ns1.test.com', 86400, 0 from domains where name='225.168.192.in-addr.arpa';
replace into records (domain_id, name,type,content,ttl,prio) select id ,'225.168.192.in-addr.arpa', 'NS', 'ns2.test.com', 86400, 0 from domains where name='225.168.192.in-addr.arpa';
replace into records (domain_id, name,type,content,ttl,prio) select id ,'54.225.168.192.in-addr.arpa', 'PTR', 'somedomain1.test.com', 86400, 0 from domains where name='225.168.192.in-addr.arpa';
replace into records (domain_id, name,type,content,ttl,prio) select id ,'117.225.168.192.in-addr.arpa', 'PTR', 'somedomain2.test.com', 86400, 0 from domains where name='225.168.192.in-addr.arpa';
COMMIT;
BEGIN;
replace into domains (name,type) values ('225.168.192.in-addr.arpa','MASTER');
replace into records (domain_id, name,type,content,ttl,prio) select id ,'225.168.192.in-addr.arpa', 'SOA', 'ns1.test.com. hostmaster.test.com. 2013030511 600 7200 604800 345600', 86400, 0 from domains where name='225.168.192.in-addr.arpa';
replace into records (domain_id, name,type,content,ttl,prio) select id ,'225.168.192.in-addr.arpa', 'NS', 'ns1.test.com', 86400, 0 from domains where name='225.168.192.in-addr.arpa';
replace into records (domain_id, name,type,content,ttl,prio) select id ,'225.168.192.in-addr.arpa', 'NS', 'ns2.test.com', 86400, 0 from domains where name='225.168.192.in-addr.arpa';
replace into records (domain_id, name,type,content,ttl,prio) select id ,'54.225.168.192.in-addr.arpa', 'PTR', 'somedomain1.test.com', 86400, 0 from domains where name='225.168.192.in-addr.arpa';
replace into records (domain_id, name,type,content,ttl,prio) select id ,'117.225.168.192.in-addr.arpa', 'PTR', 'somedomain2.test.com', 86400, 0 from domains where name='225.168.192.in-addr.arpa';
COMMIT;

Note: pattern begins with "BEGIN;" and ends with "COMMIT;"

Any help, please? I have changed the actual IP addresses and domain names. This is just one example. There are other duplicate entries in the same format; only the IP addresses and domain names differ.

TIA
sb
    #2  
07-10-2013
vidyadhar85
You basically want to remove duplicate lines, right?


Code:
 
 awk '!x[$0]++'  filename
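
For reference, `!x[$0]++` prints a line only the first time its exact text is seen, so it deduplicates per line rather than per paragraph. A small sketch, assuming made-up input (the `row a` lines and the variable names are illustrative only, not from the original file):

```shell
# Illustrative input (not from the original file): the same block twice.
input=$(printf '%s\n' 'BEGIN;' 'row a' 'COMMIT;' 'BEGIN;' 'row a' 'COMMIT;')

# x[$0]++ evaluates to 0 the first time a line is seen, so !x[$0]++ is
# true exactly once per distinct line and awk prints only that first copy.
line_dedup=$(printf '%s\n' "$input" | awk '!x[$0]++')
printf '%s\n' "$line_dedup"
```

Because every `BEGIN;` and `COMMIT;` line is textually identical, only the first of each survives when a file holds several different paragraphs.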

    #3  
07-10-2013
sb245
Thanks vidyadhar85,

It removed all the duplicate lines, but it also removed the repeated "BEGIN;" and "COMMIT;" lines, which are needed in the file. The output looked like this:


Code:
BEGIN;
replace into domains (name,type) values ('blah blah 1','MASTER');
COMMIT;
replace into domains (name,type) values ('blah blah 2','MASTER');
replace into domains (name,type) values ('blah blah 3','MASTER');
replace into domains (name,type) values ('blah blah 4','MASTER');

What I need is the following:



Code:
BEGIN;
replace into domains (name,type) values ('blah blah 1','MASTER');
COMMIT;
BEGIN;
replace into domains (name,type) values ('blah blah 2','MASTER');
COMMIT;
BEGIN;
replace into domains (name,type) values ('blah blah 3','MASTER');
replace into domains (name,type) values ('blah blah 3','MASTER');
replace into domains (name,type) values ('blah blah 3','MASTER');
COMMIT;
BEGIN;
replace into domains (name,type) values ('blah blah 4','MASTER');
replace into domains (name,type) values ('blah blah 4','MASTER');
COMMIT;

Note: In the original file, some paragraphs contain a single line and some contain multiple lines. The example I pasted had multiple lines per paragraph. Your script works perfectly for that example.

TIA
sb
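
One possible way to deduplicate whole paragraphs instead of single lines, as a minimal sketch. It assumes every paragraph starts with a line beginning `BEGIN;`, ends with a line beginning `COMMIT;`, and that duplicate paragraphs are byte-for-byte identical. The sample rows and the names `block`, `inblock`, `seen` and `para_dedup` are illustrative choices, not anything from the original file:

```shell
# Illustrative input: two distinct paragraphs, each appearing twice.
input=$(printf '%s\n' \
  'BEGIN;' 'row 1' 'COMMIT;' \
  'BEGIN;' 'row 1' 'COMMIT;' \
  'BEGIN;' 'row 3' 'row 3' 'COMMIT;' \
  'BEGIN;' 'row 3' 'row 3' 'COMMIT;')

para_dedup=$(printf '%s\n' "$input" | awk '
  /^BEGIN;/  { block = ""; inblock = 1 }   # start collecting a new paragraph
  inblock    { block = block $0 "\n" }     # append the current line to it
  inblock && /^COMMIT;/ {                  # paragraph complete
    if (!seen[block]++)                    # print only the first identical copy
      printf "%s", block
    inblock = 0
    next
  }
  !inblock   { print }                     # lines outside any paragraph pass through
')
printf '%s\n' "$para_dedup"
```

The whole paragraph text is used as the array key, so repeated lines inside a paragraph are kept, and every surviving paragraph retains its own `BEGIN;` and `COMMIT;`. On a real file, the same awk program can be given the filename as an argument and its output redirected to a new file.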