need help sorting/deleting non-unique things

09-07-2009

I don't really know much about UNIX commands, so if someone could help me understand how to do this, I'd really appreciate it.

I have a text file with data that looks like this (filename: numbers.txt):
1 1 1 1 1 1 1 1 1 2 1 1_2 2_1
1 1 1 1 1 1 1 1 2 1 2 1_2 2_1
1 1 1 1 1 1 1 1 2 1 2 1_2 2_1
1 1 1 1 1 1 1 1 3 1 1 1_3 3_1
1 1 1 1 1 1 1 2 1 2 1 1_2 2_1
1 1 1 1 1 1 1 2 2 1 2 1_2 2_1
1 1 1 1 1 1 1 3 1 3 1 1_3 3_1
1 1 1 1 1 1 1 3 1 1 3 1_3 3_1
1 1 1 1 1 1 1 4 1 1 1 1_4 4_1
1 1 1 1 1 1 2 1 2 1 1 1_2 2_1

Each line has 11 integers followed by an unspecified number of entries of the form "x_y".

What I want to do is this:
1) Look at ONLY the x_y portion of each line (everything AFTER the 11 integers) and determine which lines have an x_y portion that appears nowhere else in the file. In the above text, only the second-to-last line meets that criterion (1_4 4_1 doesn't appear anywhere else in the list).
2) Take those (partially) unique lines and write them to a new text file called new_numbers.txt.

(In my above example, new_numbers.txt would have only one line of text: 1 1 1 1 1 1 1 4 1 1 1 1_4 4_1)

If anyone can help me understand how to do this, I'd be very grateful! Thank you so much for your time and help!

---------- Post updated at 05:02 PM ---------- Previous update was at 04:59 PM ----------

If it's helpful, I should mention that the file (numbers.txt) is a file I've created myself, so if it would be easier to complete my task if the text were formatted differently, I can do that easily. (Like, if it would be better to have some sort of special character between the 11 integers and the x_y numbers, or if the x_y numbers should come at the beginning of the line, etc)

Thanks!

zac100
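
On the reformatting question above: a separator is not required, but it would make a sort-free approach easy to write. What follows is only a minimal sketch under assumptions that are not from the thread: the `|` separator and the file name `numbers_sep.txt` are hypothetical, and the sample is a shortened version of the posted data. awk reads the file twice, counting each x_y tail on the first pass and printing the lines whose tail was counted once on the second.

```shell
# Hypothetical reformatted input: a "|" between the 11 integers and the x_y entries.
cat > numbers_sep.txt <<'EOF'
1 1 1 1 1 1 1 1 1 2 1|1_2 2_1
1 1 1 1 1 1 1 1 3 1 1|1_3 3_1
1 1 1 1 1 1 1 3 1 3 1|1_3 3_1
1 1 1 1 1 1 1 4 1 1 1|1_4 4_1
1 1 1 1 1 1 2 1 2 1 1|1_2 2_1
EOF

# Pass 1 (NR == FNR): count how often each tail ($2) occurs.
# Pass 2: print only the lines whose tail was seen exactly once.
awk -F'|' 'NR == FNR { seen[$2]++; next } seen[$2] == 1' \
    numbers_sep.txt numbers_sep.txt > new_numbers.txt

cat new_numbers.txt
```

Because nothing is sorted, the surviving lines keep their original order, which may or may not matter for this data.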

09-07-2009

Code:

sort -ozac100.out -k12 -k11 -u zac100.in

cat zac100.out
1 1 1 1 1 1 1 1 1 2 1 1_2 2_1
1 1 1 1 1 1 1 1 2 1 2 1_2 2_1
1 1 1 1 1 1 2 1 2 1 1 1_2 2_1 
1 1 1 1 1 1 1 1 3 1 1 1_3 3_1
1 1 1 1 1 1 1 3 1 1 3 1_3 3_1
1 1 1 1 1 1 1 4 1 1 1 1_4 4_1

is that it?

daPeach

09-07-2009

Quote:

Originally Posted by daPeach

Code:

sort -ozac100.out -k12 -k11 -u zac100.in

cat zac100.out
1 1 1 1 1 1 1 1 1 2 1 1_2 2_1
1 1 1 1 1 1 1 1 2 1 2 1_2 2_1
1 1 1 1 1 1 2 1 2 1 1 1_2 2_1 
1 1 1 1 1 1 1 1 3 1 1 1_3 3_1
1 1 1 1 1 1 1 3 1 1 3 1_3 3_1
1 1 1 1 1 1 1 4 1 1 1 1_4 4_1

is that it?

Unfortunately not. The only line that should be in the output file is the one that ends in 1_4 4_1. I want it to interpret all the lines ending in 1_2 2_1 as duplicates (even though literally they're only partial duplicates).

Thanks for the effort, though! Any other ideas?

zac100

09-08-2009

Code:

> cat sorttest
1 1 1 1 1 1 1 1 1 2 1 1_2 2_1
1 1 1 1 1 1 1 1 2 1 2 1_2 2_1
1 1 1 1 1 1 1 1 2 1 2 1_2 2_1
1 1 1 1 1 1 1 1 3 1 1 1_3 3_1
1 1 1 1 1 1 1 2 1 2 1 1_2 2_1
1 1 1 1 1 1 1 2 2 1 2 1_2 2_1
1 1 1 1 1 1 1 3 1 3 1 1_3 3_1
1 1 1 1 1 1 1 3 1 1 3 1_3 3_1
1 1 1 1 1 1 1 4 1 1 1 1_4 4_1
1 1 1 1 1 1 2 1 2 1 1 1_2 2_1

Code:

sort -k12 sorttest | uniq -c -f11 | perl -nle 'print $2 if /^(\s*1 )(.+)/'

Code:

> sort -k12 sorttest | uniq -c -f11 | perl -nle 'print $2 if /^(\s*1 )(.+)/'
1 1 1 1 1 1 1 4 1 1 1 1_4 4_1
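
For anyone following along, the pipeline above can be taken apart one stage at a time. This is a sketch rather than the exact command from the post: the intermediate file names `sorted.txt` and `counted.txt` are made up for illustration, and the last stage uses `sed` in place of the Perl one-liner, on the assumption that `uniq -c` pads its count with spaces.

```shell
# Recreate the sample input from the first post.
cat > numbers.txt <<'EOF'
1 1 1 1 1 1 1 1 1 2 1 1_2 2_1
1 1 1 1 1 1 1 1 2 1 2 1_2 2_1
1 1 1 1 1 1 1 1 2 1 2 1_2 2_1
1 1 1 1 1 1 1 1 3 1 1 1_3 3_1
1 1 1 1 1 1 1 2 1 2 1 1_2 2_1
1 1 1 1 1 1 1 2 2 1 2 1_2 2_1
1 1 1 1 1 1 1 3 1 3 1 1_3 3_1
1 1 1 1 1 1 1 3 1 1 3 1_3 3_1
1 1 1 1 1 1 1 4 1 1 1 1_4 4_1
1 1 1 1 1 1 2 1 2 1 1 1_2 2_1
EOF

# Stage 1: sort on field 12 through end of line, so that lines with the
# same x_y tail end up next to each other (uniq only compares neighbours).
sort -k12 numbers.txt > sorted.txt

# Stage 2: -f11 ignores the first 11 fields when comparing lines,
# -c prefixes each surviving line with the size of its group.
uniq -c -f11 sorted.txt > counted.txt

# Stage 3: keep only groups of size 1 and strip the count prefix.
sed -n 's/^ *1 //p' counted.txt > new_numbers.txt

cat new_numbers.txt
```

Looking at `counted.txt` is a quick way to see how many lines share each tail before anything is thrown away.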

Vi-Curious

09-09-2009

You can use this:
sort -k 12 numbers.txt|uniq -f 11 -c|awk -F " " '$1==1{print}'|cut -f 2- >new_file

If you'd like an explanation of how it works, just ask.

regards,
Sanjay

Last edited by sanjay.login; 09-09-2009 at 06:07 PM..

sanjay.login

09-09-2009

Quote:

Originally Posted by sanjay.login

the code you can use:
sort -k 12 numbers.txt|uniq -f 11 -c|awk -f " " '$1==1{print}'|cut -f 2- >new_file

regards,
Sanjay

Have you tried your solution?
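
Two details in the quoted command are worth checking. `awk -f` (lowercase) names a program file, so `awk -f " "` would try to read the script from a file called " " rather than set a field separator. And `cut` splits on tabs by default and passes lines that contain no tab through unchanged, so with a `uniq -c` that pads its count with spaces, `cut -f 2-` would leave the count at the front of the line. A minimal sketch of one way around both, letting awk filter on the count and strip it in the same stage (the file name `cut_check.txt` is only there for the demonstration):

```shell
# Recreate the sample input from the first post.
cat > numbers.txt <<'EOF'
1 1 1 1 1 1 1 1 1 2 1 1_2 2_1
1 1 1 1 1 1 1 1 2 1 2 1_2 2_1
1 1 1 1 1 1 1 1 2 1 2 1_2 2_1
1 1 1 1 1 1 1 1 3 1 1 1_3 3_1
1 1 1 1 1 1 1 2 1 2 1 1_2 2_1
1 1 1 1 1 1 1 2 2 1 2 1_2 2_1
1 1 1 1 1 1 1 3 1 3 1 1_3 3_1
1 1 1 1 1 1 1 3 1 1 3 1_3 3_1
1 1 1 1 1 1 1 4 1 1 1 1_4 4_1
1 1 1 1 1 1 2 1 2 1 1 1_2 2_1
EOF

# A line with no tab in it comes out of "cut -f 2-" untouched,
# so a space-padded count would survive this stage.
printf '      1 a b c\n' | cut -f 2- > cut_check.txt

# Keep lines whose count is 1, remove the count prefix, then print.
sort -k 12 numbers.txt | uniq -f 11 -c |
  awk '$1 == 1 { sub(/^ *1 +/, ""); print }' > new_numbers.txt

cat new_numbers.txt
```

If the count really must be removed with `cut`, an explicit `-d ' '` is needed, and then the number of padding spaces matters, which is why the awk form is used in this sketch.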

vgersh99

09-09-2009

Yes, vgersh99, it is working fine and gives the correct output:
1 1 1 1 1 1 1 4 1 1 1 1_4 4_1

sanjay.login
