Remove duplicate words from column 1


 
# 1  
Old 06-29-2015

I tried using sed and uniq, but both remove the entire line, and I can't figure out a way to remove just the word. Any help is appreciated. I have a file:
Code:
dog, text1, text2, text3
dog, text1, text2, text3
dog, text1, text2, text3
cat, text1, text2, text3

I'm trying to remove the repeated instances of dog from column 1 while keeping the rest of each line and its formatting:
Code:
dog, text1, text2, text3
    text1, text2, text3
    text1, text2, text3
cat, text1, text2, text3

Thanks.
# 2  
Old 06-29-2015
Code:
awk 'F[$1]++ {$1=OFS}1' test.file
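
A note on how this one-liner behaves, as a sketch run against the sample from post #1 (the temporary file and the names `sample` and `result` are only for illustration). `F[$1]++` is false the first time a first-column value is seen and true on every later occurrence anywhere in the file, so only repeats get their first field replaced by OFS; the trailing `1` prints every line. Because assigning to `$1` makes awk rebuild the line with single-space separators, the repeated lines come out indented by two spaces rather than by the full width of the removed word.

```shell
# Sample input from post #1, written to a temporary file for this demo.
sample=$(mktemp)
printf '%s\n' \
    'dog, text1, text2, text3' \
    'dog, text1, text2, text3' \
    'dog, text1, text2, text3' \
    'cat, text1, text2, text3' > "$sample"

# F[$1]++ is 0 (false) on the first sighting of a key and non-zero after that,
# so the action only runs on repeats; the trailing 1 prints every record.
result=$(awk 'F[$1]++ {$1=OFS}1' "$sample")
printf '%s\n' "$result"
rm -f "$sample"
```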

# 3  
Old 06-29-2015
Or try:
Code:
awk 'F[$1]++ {p=$1; gsub(/./,FS,p); sub($1,p)}1' file

or, if spacing is always exactly one space:
Code:
awk 'F[$1]++ {gsub(/./,FS,$1)}1'  file

Both produce:
Code:
dog, text1, text2, text3
     text1, text2, text3
     text1, text2, text3
cat, text1, text2, text3
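
One caveat with `sub($1,p)`: the first field is used as a regular expression, so a word containing characters such as `+` or `.` may not match itself literally. A possible variant, assuming the first field starts at the very beginning of the line, builds the padding with `sprintf` and cuts the word off with `substr` instead. The function name `blank_repeats` is made up for this sketch.

```shell
# Hypothetical helper: blank out repeated first-column words, keeping their width.
# Assumes the first field starts in column 1 (no leading whitespace).
blank_repeats() {
    awk 'F[$1]++ {
             pad = sprintf("%" length($1) "s", "")   # as many spaces as the word is wide
             $0 = pad substr($0, length($1) + 1)     # keep the rest of the line untouched
         }
         1' "$@"
}

# A key containing a regex metacharacter is treated literally here:
printf '%s\n' 'a+b, text1' 'a+b, text1' | blank_repeats
```

It reads files given as arguments (`blank_repeats file`) or standard input when none are given.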

# 4  
Old 06-29-2015
Can the text be rearranged from vertical to horizontal in the output, so that there is only one row per first-column word?
Code:
dog, text1, text2, text3, text1, text2, text3, text1, text2, text3
cat, text1, text2, text3

# 5  
Old 06-29-2015
Try
Code:
awk '$1 != L {printf "%s%s", DL, $0; DL=RS; L = $1; next} {printf "%s%s%s%s%s%s", OFS, $2, OFS, $3, OFS, $4} END {print ""}' FS="," OFS="," file

# 6  
Old 06-29-2015
Thanks again RudiC!
# 7  
Old 06-30-2015
Code:
awk 'p!=$1{if(NR>1) print s; p=s=$1} {$1=x; s=s $0} END{print s}' FS=, OFS=, file
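
Note that comparing each line's key with the previous one only joins adjacent lines, so this and the approach in post #5 expect the input to be grouped by the first column already. If the keys may be interleaved, one possible variant collects the rows in an array and prints them in first-seen order. This is a sketch rather than one of the posted solutions; `join_by_key`, `row`, and `order` are made-up names, and it keeps the whole file in memory.

```shell
# Hypothetical helper: join all rows sharing a first-column key onto one line,
# even when the rows for a key are not adjacent in the input.
join_by_key() {
    awk -F, '
        {
            key  = $1
            rest = substr($0, length(key) + 1)    # text after the key, leading comma included
            if (!(key in row)) order[++n] = key   # remember the order keys first appear in
            row[key] = row[key] rest
        }
        END { for (i = 1; i <= n; i++) print order[i] row[order[i]] }
    ' "$@"
}

printf '%s\n' 'dog, text1' 'cat, text2' 'dog, text3' | join_by_key
```

On that interleaved input the two dog rows end up on one line, `dog, text1, text3`, followed by `cat, text2`.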

 