Removing Duplicate Variables : SED?


 
# 8  
Old 10-14-2011
Quote:
Originally Posted by Blue Solo
I opened terminal.app and pasted:

Code:
awk '!a[$NF]++' RS='' /Users/user/Desktop/LevelIndices2.mtl
newmtl m2
Kd 1.000000 1.000000 1.000000
Ka 0 0 0
illum 2
Ns 64
d 1.000000
map_Kd 951E01Other4040.bmp

Just copy/paste the awk line. The rest was provided as an illustration of the output, given your sample input.
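To see the idiom in isolation, here is a toy sample (made-up material names, not your real file). RS='' puts awk in paragraph mode, so each blank-line-separated block is one record and $NF is that record's last field, the .bmp name; a[$NF]++ is 0 only the first time a value is seen, so !a[$NF]++ keeps just the first block per .bmp:

```shell
# Paragraph mode: each blank-line-separated block is one record.
# !a[$NF]++ prints only the first record ending in each value.
printf 'newmtl m2\nmap_Kd a.bmp\n\nnewmtl m3\nmap_Kd a.bmp\n\nnewmtl m4\nmap_Kd b.bmp\n' |
awk '!a[$NF]++' RS=''
```

The m3 block is dropped because its last field, a.bmp, was already seen in the m2 block.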
# 9  
Old 10-15-2011
I installed nawk and ran this in terminal:
Code:
nawk '!a[$NF]++' RS='' /Users/user/Desktop/LevelIndices2.mtl

It returned every line of text that was in the file LevelIndices2.mtl. Am I supposed to fill in NF or RS='' with something?

By the way, thanks for trying to help so far!

UPDATE:
Now I see what is going on. I ran this:
Code:
nawk '!a[$NF]++' /Users/user/Desktop/LevelIndices2.mtl

And I turned up these results:
Code:
newmtl m2
map_Kd 951E01Other4040.bmp
newmtl m3
newmtl m4
newmtl m5
newmtl m6
newmtl m7
newmtl m8
newmtl m9
map_Kd 952C57w20h200xC6COLORS.bmp
newmtl m10
map_Kd A1C039w20h200x1CCOLORS.bmp
newmtl m11
newmtl m12
newmtl m13
newmtl m14
newmtl m15
map_Kd 946418w20h200x87COLORS.bmp
newmtl m16
newmtl m17

and so on throughout the whole document. So the first newmtl mXXX preceding each .bmp is the one that needs to be kept (m2, m9, m10, m15, ...).
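For anyone puzzling over the difference from the earlier RS='' run: without RS='', awk reads line by line, so each line is deduplicated by its own last field. The repeated Kd/Ka/illum/Ns/d values vanish after the first block, every unique mXXX survives, and a map_Kd line is kept only for the first occurrence of each .bmp. A toy sketch:

```shell
# Default line mode: every line is its own record, so a duplicate
# last field anywhere in the file suppresses the later line.
printf 'newmtl m2\nd 1.000000\nmap_Kd a.bmp\nnewmtl m3\nd 1.000000\nmap_Kd a.bmp\n' |
awk '!a[$NF]++'
```

Here the second "d 1.000000" and second "map_Kd a.bmp" disappear, but "newmtl m3" survives because m3 is a new last field.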

- The next step is to keep those newmtl mXXX entries and delete the rest.
- Then, in another file, every mXXX I deleted needs to be found and replaced with the corresponding one kept in this file.

UPDATE 2:
There are a few imperfections: when a different .bmp sits between two blocks that share the same .bmp, the duplicate isn't grouped with the first occurrence.
Example:
Code:
newmtl m2
Kd 1.000000 1.000000 1.000000
Ka 0 0 0
illum 2
Ns 64
d 1.000000
map_Kd image.bmp

newmtl m3
Kd 1.000000 1.000000 1.000000
Ka 0 0 0
illum 2
Ns 64
d 1.000000
map_Kd anotherImage.bmp

newmtl m4
Kd 1.000000 1.000000 1.000000
Ka 0 0 0
illum 2
Ns 64
d 1.000000
map_Kd image.bmp

When the command is run it will show:
Code:
newmtl m2
map_Kd image.bmp
newmtl m3
map_Kd anotherImage.bmp
newmtl m4

Instead it should show:
Code:
newmtl m2
map_Kd image.bmp
newmtl m4
newmtl m3
map_Kd anotherImage.bmp

This is because there is a separate .bmp in between them.

Last edited by Blue Solo; 10-15-2011 at 06:41 PM.. Reason: Update
# 10  
Old 10-15-2011
Is there a way to order all the newmtl mXXX based on map_Kd XXX.bmp, so that blocks sharing the same .bmp are grouped right next to each other? That should fix the problem with:
Code:
nawk '!a[$NF]++' /Users/user/Desktop/LevelIndices2.mtl

If it is any help, I am attaching the files as .txt: Step1.txt is what should be processed first; Step2.txt is what needs to be edited second.
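For what it's worth, the grouping itself can be sketched in awk's paragraph mode; the sample input here stands in for the real attachments:

```shell
# Collect each paragraph record under its last field (the .bmp
# name), remembering the order in which each .bmp first appears,
# then print the groups back out.
printf 'newmtl m2\nmap_Kd image.bmp\n\nnewmtl m3\nmap_Kd other.bmp\n\nnewmtl m4\nmap_Kd image.bmp\n' |
awk '{
  if (!($NF in g)) order[++n] = $NF   # first time this .bmp is seen
  g[$NF] = g[$NF] $0 "\n\n"          # append the block to its group
}
END { for (i = 1; i <= n; i++) printf "%s", g[order[i]] }' RS=''
```

This prints the m2 and m4 blocks together, then the m3 block, so duplicates end up adjacent before the !a[$NF]++ pass.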
# 11  
Old 10-15-2011
According to your original post, this may be what you wanted:
Code:
awk '!RS {
  if ($1 != "newmtl") next
  if ($NF in p) {
    d["\\<" $2 "\\>"] = p[$NF] #gawk
    #d[$2] = p[$NF] # too loose
  } else {
    p[$NF] = $2
    print $0 "\n" > "step1.new"
  }
  next
}
{
  for (i in d) gsub(i, d[i])
  print
}' RS='' step1 RS='\n' step2 > step2.new

By the way, the files you attached were DOS files. You may need to convert them into Unix files.
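If dos2unix isn't installed, tr can strip the carriage returns; the file names here are just examples:

```shell
# Simulate a DOS-format file, then strip the CR bytes (CRLF -> LF).
printf 'newmtl m2\r\nmap_Kd image.bmp\r\n' > step1.dos
tr -d '\r' < step1.dos > step1
```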
# 12  
Old 10-15-2011
Quote:
Originally Posted by binlib
According to your original post, this may be what you wanted:
Code:
Code

By the way, the files you attached were DOS files. You may need to convert them into Unix files.
Oh wow, thanks! The tip about converting DOS to Unix was great.
That deleted all the duplicates, but it did not rename the duplicates in step2 to their kept counterparts.

Thanks so far!

Last edited by Blue Solo; 10-16-2011 at 12:43 AM..
# 13  
Old 10-16-2011
I guess you are not using gawk. The problem with the replacement is that you can't just replace m2 with m1, for example, because it will replace m20 with m10. Since your step2 file always has the names as the last field, you can change
Code:
d["\\<" $2 "\\>"] = p[$NF] #gawk
to
d[" " $2 "$" ] = " " p[$NF]

for other awks that don't have \< and \> for beginning/end of word.
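The difference is easy to see on a toy line (usemtl is just an illustrative field name; any line ending in the material name behaves the same way):

```shell
# Unanchored, the pattern m2 also matches inside m20:
echo 'usemtl m20' | awk '{ gsub(/m2/, "m1"); print }'      # usemtl m10
# Anchored on a leading space and end of line, it matches only
# a whole name in the last field:
echo 'usemtl m20' | awk '{ gsub(/ m2$/, " m1"); print }'   # usemtl m20
echo 'usemtl m2'  | awk '{ gsub(/ m2$/, " m1"); print }'   # usemtl m1
```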
# 14  
Old 10-16-2011
Quote:
Originally Posted by binlib
I guess you are not using gawk. The problem with the replacement is that you can't just replace m2 with m1, for example, because it will replace m20 with m10. Since your step2 file always has the names as the last field, you can change
Code:
d["\\<" $2 "\\>"] = p[$NF] #gawk
to
d[" " $2 "$" ] = " " p[$NF]

for other awks that don't have \< and \> for beginning/end of word.
Thank you so much! It finally works!