Command to remove duplicate lines with perl, sed, awk


 
# 8  
Old 10-18-2010
Hello Murphy,

Could you recommend us any page that explains how the N, P, and D commands work in sed?

Code:
sed '$!N; /^\(.*\)\n\1$/!P; D'

thanks in advance
# 9  
Old 10-18-2010
Quote:
Originally Posted by EAGL€
recommend us any page
man page
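Short of reading the whole man page, here is a minimal sketch of what the three commands do in that one-liner. The function name dedup_adjacent is made up for illustration. Note that the one-liner removes only adjacent duplicates, the way uniq does, so a repeated line that comes back later in the file survives.

```shell
# Hypothetical helper name; it only wraps the one-liner from the question.
dedup_adjacent() {
  # $!N              : unless this is the last line, append the next input line
  #                    to the pattern space (the two are joined by a newline)
  # /^\(.*\)\n\1$/!P : unless both lines are identical, print the first one
  # D                : delete up to the first newline and restart the cycle
  #                    with whatever is left, without reading new input
  sed '$!N; /^\(.*\)\n\1$/!P; D' "$@"
}

# The adjacent pair of "a" collapses; the later "a" is kept.
printf 'a\na\nb\na\n' | dedup_adjacent
```

This prints a, b, a on separate lines.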
# 10  
Old 10-19-2010
Quote:
Originally Posted by cola
Anybody knows the solution with sed,perl?
Actually, sed has no ready-made function for this kind of problem.

But for me, sed is still more powerful than the others.
I can try to write something sed-specific for sed lovers :)

Code:
# cat file
hello hello
hello hello
monkey
donkey
hello hello
drink
vay
dance
drink

Code:
# ./fsed.sedv1.uniq file
hello hello
monkey
donkey
drink
vay
dance

Code:
#!/bin/bash
# fsed-Sedv1-Uniq
xsed="";sedarr=""
while read -r l
 do
  # x: number of the first line that matches $l (only its first digit is captured)
  x=( $( echo $(sed '=' "$1" | sed -n 'N;s/\n/ /;p' | sed -n "s/^\(.\).*$l/\1/p") | sed 's/ .*//') );
  xsed=("$xsed $x" )
 done <"$1"
 
fsed=( $(echo ${xsed[@]}|sed 's/ /\n/g' | sed -n '/^1/p'|sed -n '1p') )
sedarr=("$fsed" )
 
for i in ${xsed[@]}
 do
  sedarr=( "$sedarr $( echo ${xsed[@]}|sed 's/ /\n/g' | sed -ne "/^$i/p"| sed -n '1p' | sed -e "/[${sedarr[@]}]/d" )" )
 done
 
for i in ${sedarr[@]}
 do
  sed -n "$i p" "$1"
 done


Correction for a little (or big) problem:
I discovered that this script cannot process a file that has 10 or more lines, because it captures only the first character of each line number.
I tried to rewrite it to fix this problem.
Let's try this:

Code:
# cat newfile
hello hello
hello hello
monkey
donkey
hello hello4
drink
dance2
dance
drink4
hello hello1
donkey2
hello hello1
hello hello2
hello hello5
donkey3
donkey2
hello hello3
hello hello3
hello hello5
monkey3
dance3
dance3
monkey3
dance3


Code:
# ./fsed.sedv2.uniq newfile
hello hello
monkey
donkey
hello hello4
drink
dance2
dance
drink4
hello hello1
donkey2
hello hello2
hello hello5
donkey3
hello hello3
monkey3
dance3

Code:
#!/bin/bash
# fsed-Sedv2-Uniq
xsed="" ;uniq="" ;sedarr="" ;fsed=""
while read -r l
 do
  # x: numbers of the lines that match $l at word boundaries
  x=( $( echo $(sed '=' "$1" | sed -n 'N;s/\n/ /;p' | sed -n "s/\(.*\) \b$l\b/\1/p")  ) );
  xsed=("$xsed ${x}\b\|" )
 done <"$1"
 
fsed=( $(echo ${xsed[@]}|sed 's/ /\n/g' | sed -n '/^1/p'|sed -n '1p') )
sedar=("\b$fsed" )
 
for i in ${xsed[@]}
 do
  newi=$(echo $i | sed 's/..$//')
  sedar=( $(echo $sedar|sed 's/..$//') )
  sedax=$(echo "${xsed[@]}"|sed 's/ /\n/g' | sed -ne "/^${newi}/p"| sed -n '1p'|sed -e "/${sedar[@]}/d" )
  x=("$(echo ${sedar[@]}|sed 's/\\|/\\b&\\b/g')" )
  sedar=("${x}\|${sedax}" )
 done
 
for i in $(echo ${sedar[@]} | sed 's/[^0-9]/ /g')
 do
  sed -n "$i p" "$1"
 done
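To show the difference between the two versions, here is a small sketch of the line-numbering step they share, followed by the two ways of capturing the number. The helper name number_lines and the twelve-word sample are made up for illustration, and the second pattern is an anchored simplification of the idea, not the exact v2 expression.

```shell
# Hypothetical helper: prefix every line with its number, as both scripts do.
number_lines() { sed '=' "$@" | sed -n 'N;s/\n/ /;p'; }

# Twelve lines; the text "twelve" sits on line 12.
sample=$(printf '%s\n' one two three four five six seven eight nine ten eleven twelve)

# v1 style: \(.\) keeps only the first character of the number, so 12 becomes 1.
printf '%s\n' "$sample" | number_lines | sed -n 's/^\(.\).*twelve/\1/p'

# Capturing everything before the separating space keeps the whole number.
printf '%s\n' "$sample" | number_lines | sed -n 's/^\(.*\) twelve$/\1/p'
```

The first command prints 1 and the second prints 12, which is why the first version picks the wrong line once a file reaches ten lines.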

PS: There may be some bugs! I don't guarantee that it works very well (for example, it may be slow).
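For checking the output of the scripts above, a one-pass awk filter is a handy reference. This is only a sketch of the common awk idiom, not part of the sed exercise, and the function name dedup_keep_first is made up for illustration.

```shell
# Hypothetical helper name; keeps the first occurrence of every line, in order.
dedup_keep_first() {
  # seen[$0]++ evaluates to 0 the first time a line appears, so !seen[$0]++
  # is true only for first occurrences; awk's default action prints them.
  awk '!seen[$0]++' "$@"
}

# Same content as the first sample file in this post.
printf '%s\n' 'hello hello' 'hello hello' monkey donkey 'hello hello' \
  drink vay dance drink | dedup_keep_first
```

It prints the same six lines as the fsed.sedv1.uniq run above. It keeps every distinct line in memory, so memory use grows with the number of unique lines.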

Regards
ygemici

---------- Post updated at 11:57 AM ---------- Previous update was at 11:54 AM ----------

Quote:
Originally Posted by EAGL€
Hello Murphy,

Could you recommend us any page that explains how the N, P, and D commands work in sed?

Code:
sed '$!N; /^\(.*\)\n\1$/!P; D'

thanks in advance
This source is very useful and excellent for sed lovers.
Thank you, Bruce Barnett, for this:

Sed - An Introduction and Tutorial