The UNIX and Linux Forums  
  #1 (permalink)  
Old 09-20-2006
happyv
Registered User
  
 

Join Date: Sep 2006
Posts: 209
remove duplicated xml record in a file under unix

Hi,

If I have a file in XML format, I would like to remove the duplicated records and save the result to a new file. Is it possible to write a script to do this?
  #2 (permalink)  
Old 09-20-2006
tayyabq8, Forum Advisor
Moderator
  
 

Join Date: Nov 2004
Location: Bahrain
Posts: 578
Try
Code:
uniq inputfile
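One caveat worth checking before relying on this: uniq compares single lines and only drops a line when it repeats the line directly above it. A minimal sketch with made-up input (the letters are placeholders, not data from this thread):

```shell
# uniq removes a line only if it is identical to the line just before it.
# The first "a" survives, and so does the first "a" of the later pair,
# because a "b" sits between them.
out=$(printf 'a\nb\na\na\n' | uniq)
printf '%s\n' "$out"
```

So for records that span several lines, some grouping step is probably needed first.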
  #3 (permalink)  
Old 09-20-2006
Yogesh Sawant, Forum Staff
Part Time Moderator and Full Time Dad
  
 

Join Date: Sep 2006
Location: Rossem, Tazenda
Posts: 1,086
I don't know if it's possible in shell or not, but it's possible in Perl. Do consider that option if you can.
  #4 (permalink)  
Old 09-20-2006
happyv
Registered User
  
 

Join Date: Sep 2006
Posts: 209
Can Perl run under ksh on Unix?

Also, the records are a bit different. They look like this:

record1:
this is testing
my id is 2001
end:
record2:
this is testing2
my id is 2002
end:
record3:
this is testing
my id is 2002
end:
record4:
this is testing2
my id is 2002
end:

In the example above, records 2 and 4 are duplicates, because both the "testing2" line and the "id" line are the same. If only one of the lines matches, the records do not count as duplicates.

Can anyone help with a script, in Perl or anything else?
  #5 (permalink)  
Old 09-20-2006
ranj@chn, Forum Advisor
Playing with Ubuntu Now!
  
 

Join Date: Oct 2005
Location: Chennai
Posts: 365
check this

I haven't tested this, but please check it:
Code:
paste -s -d"\t\t\t\n" filename|sort -u |tr "\t" "\n"

Last edited by ranj@chn; 09-20-2006 at 07:54 AM.. Reason: mistake in command
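Since every record carries a different label (record2: versus record4:), comparing the whole joined row would probably not treat records 2 and 4 as equal. The sketch below keeps the same paste idea but compares only the two body lines. It assumes every record is exactly four lines and contains no tab characters; the temp file name is hypothetical.

```shell
# Sample data in the layout from post #4 (four lines per record).
tmp=$(mktemp -d)
cat > "$tmp/records.txt" <<'EOF'
record1:
this is testing
my id is 2001
end:
record2:
this is testing2
my id is 2002
end:
record3:
this is testing
my id is 2002
end:
record4:
this is testing2
my id is 2002
end:
EOF

# 1. paste joins every four lines into one tab-separated row.
# 2. awk keeps a row only the first time its body (fields 2 and 3) is seen,
#    so the differing record labels in field 1 are ignored.
# 3. tr turns the tabs back into newlines.
deduped=$(paste -s -d '\t\t\t\n' "$tmp/records.txt" |
  awk -F '\t' '!seen[$2 FS $3]++' |
  tr '\t' '\n')
printf '%s\n' "$deduped"
rm -rf "$tmp"
```

Unlike sort -u, the awk filter leaves the surviving records in their original order.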
  #6 (permalink)  
Old 09-20-2006
aigles, Forum Advisor
Registered User
  
 

Join Date: Apr 2004
Location: Bordeaux, France
Posts: 1,414
You can try awk.
Create the following awk script, uniq.awk:
Code:
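# A line starting with "end:" closes the current record.
/^end:/ {
   # Print the record only if its body has not been seen before.
   if (! (Record in Records)) {
      Records[Record];
      print RecordLabel ":";
      print Record;
      print $0;
   }
   Record = "";   # reset for every record, duplicate or not
   next;
}
# A line whose first field contains a colon is a record label, e.g. "record1:".
$1 ~ /^.*:/ {
   sub(/:.*/, "", $1);
   RecordLabel = $1;
   next;
}
# Any other line belongs to the record body; join body lines with newlines.
{
   Record = (Record ? Record "\n" : "") $0;
}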
and execute it:
Code:
$ awk -f uniq.awk filename
record1:
this is testing
my id is 2001
end:
record2:
this is testing2
my id is 2002
end:
record3:
this is testing
my id is 2002
end:
$
jean-Pierre.
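To save the result to a new file, as asked in post #1, the output can be redirected. The sketch below is one possible self-contained variant, not necessarily the exact script above: the file names are hypothetical, the sample puts the duplicate in the middle, and the collected body is cleared at every end: line (including skipped duplicates), which is an assumption about the intended behaviour.

```shell
# Sketch: de-duplicate records and save the result to a new file.
# File names are hypothetical; a temp dir keeps the example self-contained.
tmp=$(mktemp -d)
cat > "$tmp/input.txt" <<'EOF'
record1:
this is testing
my id is 2001
end:
record2:
this is testing2
my id is 2002
end:
record3:
this is testing2
my id is 2002
end:
record4:
this is testing
my id is 2002
end:
EOF

awk '
/^end:/ {                       # end of a record
   if (!(body in seen)) {       # first time this body appears
      seen[body]
      print label ":"
      print body
      print
   }
   body = ""                    # always start the next record empty
   next
}
/^[^ ]*:$/ { label = $0; sub(/:$/, "", label); next }   # label line such as "record1:"
{ body = (body == "" ? "" : body "\n") $0 }              # collect body lines
' "$tmp/input.txt" > "$tmp/output.txt"

kept=$(grep -c '^end:' "$tmp/output.txt")
cat "$tmp/output.txt"
rm -rf "$tmp"
```

Here record3 repeats the body of record2 and is dropped, while record4, which follows the duplicate, is still written out.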
  #7 (permalink)  
Old 09-20-2006
nervous
Registered User
  
 

Join Date: Sep 2006
Posts: 55
Dear Sir,

It would be a great help if you could describe the code below in detail. I have just started to learn awk, and understanding this code clearly would help me a lot in the future.
Quote:
/^end:/ {
   if (! (Record in Records)) {
      Records[Record];
      print RecordLabel ":";
      print Record;
      print $0;
   }
   Record = "";
   next;
}
$1 ~ /^.*:/ {
   sub(/:.*/, "", $1);
   RecordLabel = $1;
   next;
}
{
   Record = (Record ? Record "\n" : "") $0;
}
Thanks in advance.
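Two awk idioms carry most of the quoted script, and they may be easier to follow in isolation. The sketch below uses made-up input and hypothetical names (seen, s): it remembers lines as array indexes and builds a multi-line string from the ones it has not met before.

```shell
# Idiom 1: "key in array" tests membership without creating the element;
#          a bare reference such as seen[key] creates it.
# Idiom 2: (s ? s "\n" : "") $0 appends the current line to s, adding a
#          newline separator only when s already holds text.
result=$(printf 'x\ny\nx\n' | awk '
{
   if (!($0 in seen)) {          # not stored yet?
      seen[$0]                   # store it as an array index
      s = (s ? s "\n" : "") $0   # append to the multi-line string
   }
}
END { print s }')
printf '%s\n' "$result"
```

In the quoted script the same two steps are applied to a whole record body instead of a single line: the body is collected line by line, and at each end: line it is looked up in the array to decide whether to print it.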
All times are GMT -4.


The UNIX and Linux Forums Content Copyright ©1993-2009. All Rights Reserved.
