using sed to get rid of duplicated columns...


 
# 1
04-10-2008

I cannot figure this one out, so I turn to unix.com for help. I have a file in which some lines contain consecutive duplicate columns, like the following:

adb abc abc asd adfj
123 123 123 345
234 444 444 444 444 444 23

The output I want is:

adb abc asd adfj
123 345
234 444 23

Is it possible to do this using sed?

Oh, by the way, I thought this should work, but it does not:

sed 's/(\([^ ]\)+ )+/\1/' file

Last edited by fedora; 04-10-2008 at 04:53 PM..
# 2  
04-10-2008
You need a backreference in the pattern itself (the first part of the substitution) to make sure you are actually replacing repeats; otherwise it will simply reduce every line to one token (or two, if you require a trailing space after the replacement). You also need to make up your mind on whether your sed requires a backslash before grouping parentheses or not.

Code:
sed 's/\([^ ]* \)\1*/\1/g' file

I used * instead of +; if your sed understands the plus, then by all means use that.
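
To make the point about backreferences concrete, here is a small sketch that runs two patterns over the sample lines from the first post. The function names are made up for this illustration, and the first pattern is only a stand-in for "a repeated group with no backreference in the pattern", not the exact command from the question.

```shell
#!/bin/sh
# Hypothetical helper names; both read lines on stdin.

# No backreference inside the pattern: the group just repeats over any
# space-terminated tokens, so unrelated columns get swallowed as well.
collapse_without_backref() {
    sed 's/\([^ ]* \)*/\1/'
}

# Backreference inside the pattern: \1* only matches further copies of the
# token that was just captured, so only real repeats are collapsed.
collapse_with_backref() {
    sed 's/\([^ ]* \)\1*/\1/g'
}

sample='adb abc abc asd adfj
123 123 123 345
234 444 444 444 444 444 23'

echo "without a backreference in the pattern:"
printf '%s\n' "$sample" | collapse_without_backref

echo "with a backreference in the pattern:"
printf '%s\n' "$sample" | collapse_with_backref
```

With the second function the three sample lines come out as `adb abc asd adfj`, `123 345` and `234 444 23`, which is the output asked for in the first post.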
# 3  
04-10-2008
Thanks, I messed up; time to go over sed and regexps again.


Quote:
Originally Posted by era
Code:
sed 's/\([^ ]* \)\1*/\1/g' file
# 4  
04-10-2008
Hmm, on second thought, it seems that something is still wrong:

>cat /tmp/test
123 123 123 345
akljsdfaljskd 7878 7878 7878 7878 123
akljsdfaljskd 7878 7878 7878 7878 123 123 123 345 234 345 345

>sed 's/\([^ ]* \)\1*/\1/g' /tmp/test
123 345
akljsdfaljskd 7878 123
akljsdfaljskd 7878 123 345 234 345 345

Note the last line: the duplicate "345" columns are still there.
# 5  
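04-10-2008
Hmm, the last column is not followed by a space, which is why: the pattern requires a space after every token, so a repeat at the very end of the line never matches.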
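
One possible way around this, sketched under the assumption that columns are separated by single spaces: append a space to each line so the final column can match too, collapse the repeats, then strip the added space again. The function name `dedupe_columns` is hypothetical, and the pattern is tightened to `[^ ][^ ]*` so that a token must be non-empty.

```shell
#!/bin/sh
# dedupe_columns is a made-up name for this sketch; it reads lines on stdin.
# Assumes columns are separated by single spaces.
dedupe_columns() {
    sed -e 's/$/ /' \
        -e 's/\([^ ][^ ]* \)\1*/\1/g' \
        -e 's/ $//'
}

printf '%s\n' 'akljsdfaljskd 7878 7878 7878 7878 123 123 123 345 234 345 345' | dedupe_columns
# prints: akljsdfaljskd 7878 123 345 234 345
```

The three expressions run in order on every line, so the temporary space never reaches the output.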
# 6  
11-28-2008
Can you explain the syntax you have given in detail?

sed 's/\([^ ]* \)\1*/\1/g' file

Thanks
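
The command can be read piece by piece. The annotated sketch below restates it with comments and shows the effect of the trailing `g` flag on one of the sample lines from the first post; the variable name `line` is only for illustration.

```shell
#!/bin/sh
# s/PATTERN/REPLACEMENT/g   substitute every match on each line
#
# PATTERN      \([^ ]* \)\1*
#   [^ ]*      any run of characters that are not spaces (one column)
#   ' '        the single space that follows the column
#   \( ... \)  group the column and its space so it can be referred to as \1
#   \1*        zero or more further copies of exactly that same text
# REPLACEMENT  \1
#   keep one copy of the column (with its space), dropping the repeats

line='234 444 444 444 444 444 23'

# Without g only the first match ("234 ") is processed, so nothing changes.
printf '%s\n' "$line" | sed 's/\([^ ]* \)\1*/\1/'

# With g every column is visited in turn and each run of repeats collapses.
printf '%s\n' "$line" | sed 's/\([^ ]* \)\1*/\1/g'
```

The first command prints the line unchanged; the second prints `234 444 23`.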