How can I remove partial duplicates and manipulate text?

 
# 1  
Old 12-30-2017

Hello,

How can I remove partial duplicates and manipulate text in bash using either awk, grep or sed? Thanks.


Input:

Code:
ted,"foo,bar,zoo"
john-son,"foot,ben,zoo"
bob,"bar,foot"

Expected Output:

Code:
foo,ted
bar,ted
zoo,ted
foot,john-son
ben,john-son

# 2  
Old 12-30-2017
What have you tried so far?
# 3  
Old 12-30-2017
I tried this Perl one-liner, but it did not work:

Code:
perl -lpe 's/\s\K\S+/join ",", grep {!$seen{$_}++} split ",", $&/e'

# 4  
Old 12-30-2017
It is interesting that you want code written in awk, grep, or sed but show us non-working perl code.

You might be able to use something like:
Code:
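awk -F, -v OFS=, '
{	gsub(/"/, "")			# strip the double quotes; $0 is re-split on commas
	for(i = 2; i <= NF; i++)	# $1 is the name, $2..$NF are the values
		if(!($i in seen)) {	# true only the first time a value appears
			seen[$i]	# a bare reference creates the array element
			print $i, $1
		}
}' file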

which, if file contains your sample input, produces the output you said you wanted.

If you want to try this on a Solaris/SunOS system, change awk to /usr/xpg4/bin/awk or nawk.
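
To try it without creating a file first, here is a minimal sketch that wraps the same awk program in a shell function and feeds it the sample input from the first post through a here-document. The function name dedupe_pairs and the variable result are made up for this illustration; they are not part of any tool. It assumes the values never contain embedded commas or double quotes that need to be preserved.

```shell
#!/bin/sh
# dedupe_pairs is a hypothetical helper name; it only wraps the awk program above.
dedupe_pairs() {
    awk -F, -v OFS=, '
    {
        gsub(/"/, "")                 # drop the double quotes; awk re-splits $0 on commas
        for (i = 2; i <= NF; i++)     # $1 is the name, $2..$NF are the values
            if (!($i in seen)) {      # true only the first time a value appears
                seen[$i]              # a bare reference creates the array element
                print $i, $1
            }
    }' "$@"
}

# Sample input from the first post, read from stdin instead of a file.
result=$(dedupe_pairs <<'EOF'
ted,"foo,bar,zoo"
john-son,"foot,ben,zoo"
bob,"bar,foot"
EOF
)

printf '%s\n' "$result"
```

The seen array is shared across all input lines, so a value is printed only for the first name it appears under. That is why zoo is not repeated for john-son, and why bob produces no output at all: both bar and foot were already seen on earlier lines.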
# 5  
Old 12-31-2017
Thank you very much. It worked.