Match paragraph between two patterns, delete the duplicate paragraphs


 
# 1  
Old 07-10-2013

Hello all

I have a file on my DNS server that contains duplicate paragraphs like the ones below. How can I remove the duplicates so that only one copy of each paragraph remains?

Code:
BEGIN;
replace into domains (name,type) values ('225.168.192.in-addr.arpa','MASTER');
replace into records (domain_id, name,type,content,ttl,prio) select id ,'225.168.192.in-addr.arpa', 'SOA', 'ns1.test.com. hostmaster.test.com. 2013030511 600 7200 604800 345600', 86400, 0 from domains where name='225.168.192.in-addr.arpa';
replace into records (domain_id, name,type,content,ttl,prio) select id ,'225.168.192.in-addr.arpa', 'NS', 'ns1.test.com', 86400, 0 from domains where name='225.168.192.in-addr.arpa';
replace into records (domain_id, name,type,content,ttl,prio) select id ,'225.168.192.in-addr.arpa', 'NS', 'ns2.test.com', 86400, 0 from domains where name='225.168.192.in-addr.arpa';
replace into records (domain_id, name,type,content,ttl,prio) select id ,'54.225.168.192.in-addr.arpa', 'PTR', 'somedomain1.test.com', 86400, 0 from domains where name='225.168.192.in-addr.arpa';
replace into records (domain_id, name,type,content,ttl,prio) select id ,'117.225.168.192.in-addr.arpa', 'PTR', 'somedomain2.test.com', 86400, 0 from domains where name='225.168.192.in-addr.arpa';
COMMIT;
BEGIN;
replace into domains (name,type) values ('225.168.192.in-addr.arpa','MASTER');
replace into records (domain_id, name,type,content,ttl,prio) select id ,'225.168.192.in-addr.arpa', 'SOA', 'ns1.test.com. hostmaster.test.com. 2013030511 600 7200 604800 345600', 86400, 0 from domains where name='225.168.192.in-addr.arpa';
replace into records (domain_id, name,type,content,ttl,prio) select id ,'225.168.192.in-addr.arpa', 'NS', 'ns1.test.com', 86400, 0 from domains where name='225.168.192.in-addr.arpa';
replace into records (domain_id, name,type,content,ttl,prio) select id ,'225.168.192.in-addr.arpa', 'NS', 'ns2.test.com', 86400, 0 from domains where name='225.168.192.in-addr.arpa';
replace into records (domain_id, name,type,content,ttl,prio) select id ,'54.225.168.192.in-addr.arpa', 'PTR', 'somedomain1.test.com', 86400, 0 from domains where name='225.168.192.in-addr.arpa';
replace into records (domain_id, name,type,content,ttl,prio) select id ,'117.225.168.192.in-addr.arpa', 'PTR', 'somedomain2.test.com', 86400, 0 from domains where name='225.168.192.in-addr.arpa';
COMMIT;
BEGIN;
replace into domains (name,type) values ('225.168.192.in-addr.arpa','MASTER');
replace into records (domain_id, name,type,content,ttl,prio) select id ,'225.168.192.in-addr.arpa', 'SOA', 'ns1.test.com. hostmaster.test.com. 2013030511 600 7200 604800 345600', 86400, 0 from domains where name='225.168.192.in-addr.arpa';
replace into records (domain_id, name,type,content,ttl,prio) select id ,'225.168.192.in-addr.arpa', 'NS', 'ns1.test.com', 86400, 0 from domains where name='225.168.192.in-addr.arpa';
replace into records (domain_id, name,type,content,ttl,prio) select id ,'225.168.192.in-addr.arpa', 'NS', 'ns2.test.com', 86400, 0 from domains where name='225.168.192.in-addr.arpa';
replace into records (domain_id, name,type,content,ttl,prio) select id ,'54.225.168.192.in-addr.arpa', 'PTR', 'somedomain1.test.com', 86400, 0 from domains where name='225.168.192.in-addr.arpa';
replace into records (domain_id, name,type,content,ttl,prio) select id ,'117.225.168.192.in-addr.arpa', 'PTR', 'somedomain2.test.com', 86400, 0 from domains where name='225.168.192.in-addr.arpa';
COMMIT;

Note: each paragraph begins with "BEGIN;" and ends with "COMMIT;".

Any help, please? I have changed the actual IP addresses and domain names. This is just one example. There are other duplicate entries in the same format; only the IP addresses and domain names differ.

TIA
sb
# 2  
Old 07-10-2013
You basically want to remove duplicate lines, right?

Code:
 
 awk '!x[$0]++'  filename
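
For readers unfamiliar with the idiom, here is a minimal sketch of what that one-liner does, written out long-hand. The function name dedupe_lines and the array name seen are made up for illustration; the behaviour should match the one-liner: every line is compared against all earlier lines and printed only on its first appearance. Because it works on single lines rather than whole paragraphs, a repeated "BEGIN;" or "COMMIT;" is treated as a duplicate too.

```shell
# Hypothetical long-hand version of: awk '!x[$0]++' filename
dedupe_lines() {
  awk '
    {
      if (!($0 in seen)) print   # first time this exact line appears
      seen[$0] = 1               # remember the line for later comparisons
    }
  ' "$@"
}
```

Called as dedupe_lines filename, it reads the named file; with no argument it reads standard input.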

# 3  
Old 07-10-2013
Thanks vidyadhar85,

It removed all the duplicate lines, but it also removed the repeated "BEGIN;" and "COMMIT;" lines, which are needed in the file. The output came out in the format below.

Code:
BEGIN;
replace into domains (name,type) values ('blah blah 1','MASTER');
COMMIT;
replace into domains (name,type) values ('blah blah 2','MASTER');
replace into domains (name,type) values ('blah blah 3','MASTER');
replace into domains (name,type) values ('blah blah 4','MASTER');

What I need is the following:


Code:
BEGIN;
replace into domains (name,type) values ('blah blah 1','MASTER');
COMMIT;
BEGIN;
replace into domains (name,type) values ('blah blah 2','MASTER');
COMMIT;
BEGIN;
replace into domains (name,type) values ('blah blah 3','MASTER');
replace into domains (name,type) values ('blah blah 3','MASTER');
replace into domains (name,type) values ('blah blah 3','MASTER');
COMMIT;
BEGIN;
replace into domains (name,type) values ('blah blah 4','MASTER');
replace into domains (name,type) values ('blah blah 4','MASTER');
COMMIT;

Note: In the original file, some paragraphs contain a single line and some contain multiple lines. The example I pasted had multiple lines. Your script works perfectly for the example I pasted.

TIA
sb
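
One possible approach to the requirement above, offered as a sketch rather than a tested answer for the real file: collect each paragraph from "BEGIN;" through "COMMIT;" into a single string and use that whole string as the key for duplicate detection, so the "BEGIN;" and "COMMIT;" lines stay with every paragraph that is kept. The function name dedupe_paragraphs and the variables block, inblock and seen are hypothetical. It assumes each paragraph starts with a line beginning "BEGIN;" and ends with a line beginning "COMMIT;", that paragraphs are not nested, and that two paragraphs count as duplicates only if they match exactly, line for line.

```shell
# Hypothetical helper: print each BEGIN;...COMMIT; paragraph only the first time it is seen.
dedupe_paragraphs() {
  awk '
    /^BEGIN;/ { block = ""; inblock = 1 }        # start collecting a new paragraph
    inblock   { block = block $0 ORS }           # append the current line to the paragraph
    /^COMMIT;/ && inblock {
      if (!seen[block]++) printf "%s", block     # print the paragraph on first sighting only
      inblock = 0
      next
    }
    !inblock                                     # lines outside any paragraph pass through unchanged
  ' "$@"
}
```

Usage would look like dedupe_paragraphs dump.sql > dump.dedup.sql, where both file names are placeholders. Lines repeated inside a single paragraph are left alone, since only whole paragraphs are compared.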