Remove lines containing 2 or more duplicate strings Post: 302964682

10 More Discussions You Might Find Interesting

1. Shell Programming and Scripting

how to remove duplicate lines

I have the following file content (3 fields per line): 23 888 10.0.0.1 dfh 787 10.0.0.2 dssf dgfas 10.0.0.3 dsgas dg 10.0.0.4 df dasa 10.0.0.5 df dag 10.0.0.5 dfd dfdas 10.0.0.5 dfd dfd 10.0.0.6 daf nfd 10.0.0.6 ... As can be seen, the third field is an IP address and is sorted, but...

2. UNIX for Dummies Questions & Answers

Delete lines with duplicate strings based on date

Hey all, a relative bash/script newbie trying to solve a problem. I've got a text file with lots of lines that I've been able to clean up and format with awk/sed/cut, but now I'd like to remove the lines with duplicate usernames based on timestamp. Here's what the data looks like 2007-11-03...

3. Shell Programming and Scripting

Remove duplicate lines

Hi, I have a huge file which is about 50GB. There are many lines. The file format looks like 21 rs885550 0 9887804 C C T C C C C C C C 21 rs210498 0 9928860 0 0 C C 0 0 0 0 0 0 21 rs303304 0 9941889 A A A A A A A A A A 22 rs303304 0 9941890 0 A A A A A A A A A The question is that there are a few...

4. Shell Programming and Scripting

Delete lines in file containing duplicate strings, keeping longer strings

The question is not as simple as the title... I have a file, it looks like this <string name="string1">RZ-LED</string> <string name="string2">2.0</string> <string name="string2">Version 2.0</string> <string name="string3">BP</string> I would like to check for duplicate entries of...

5. Shell Programming and Scripting

Need to remove the duplicate lines from a log!!

Hello Folks, Can someone help me with the removal of duplicate lines from a log file and send it to another log file. It's a bit complicated, as two lines are the same except for the timestamp, but some lines are unique. Each line is colon-separated. Log file:...

6. Shell Programming and Scripting

remove duplicate lines with condition

hi to all Does anyone know if there's a way to remove duplicate lines which we consider the same only if they have the first and the second column the same? For example I have : us2333 bbb 5 us2333 bbb 3 us2333 bbb 2 and I want to get us2333 bbb 10 The thing is I cannot...

7. UNIX for Dummies Questions & Answers

Remove Duplicate Lines

Hi I need this output. Thanks. Input: TAZ YET FOO FOO VAK TAZ BAR Output: YET VAK BAR

8. Shell Programming and Scripting

Getting lines between two strings with duplicate set of data

if I have the following lines in a file app.log some lines here <AAAA> abc <id>123456789</id> ddd </AAAA>some lines here too <BBBB> abc <id>123456789</id> ddd </BBBB>some lines here too <AAAA> xyz <id>987654321</id> ssss </AAAA>some lines here again... How do I get the...

9. Shell Programming and Scripting

Remove duplicate lines from a file

Hi, I have a csv file which contains some millions of lines in it. The first line (header) repeats at every 50000th line. I want to remove all the duplicate headers from the second occurrence onward (the first line should not be removed). I don't want to use any pattern from the header as I have some...

10. Shell Programming and Scripting

How to remove duplicate lines?

Hi All, I am storing the result in the variable result_text using the below code. result_text=$(printf "$result_text\t\n$name") The result_text is having the below text. Which is having duplicate lines. file and time for the interval 03:30 - 03:45 file and time for the interval 03:30 - 03:45 ...
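The request in item 7 above (keep only the entries that are never repeated) can be sketched in a few lines. This is a minimal Python sketch, not taken from any of the threads; it assumes each token in the quoted Input (TAZ, YET, FOO, ...) sits on its own line, and the function name `lines_seen_once` is hypothetical.

```python
from collections import Counter


def lines_seen_once(lines):
    """Return the lines that occur exactly once, in their original order."""
    # First pass: count how often each line appears.
    counts = Counter(lines)
    # Second pass: keep only the lines whose count is exactly one.
    return [line for line in lines if counts[line] == 1]


# Sample data from item 7, assuming one token per line.
sample = ["TAZ", "YET", "FOO", "FOO", "VAK", "TAZ", "BAR"]
print(lines_seen_once(sample))  # ['YET', 'VAK', 'BAR']
```

The same two-pass idea is commonly written in awk by reading the file twice: `awk 'NR==FNR {c[$0]++; next} c[$0]==1' file file`.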

LEARN ABOUT DEBIAN

swiss::textfunc

SWISS::TextFunc(3pm)					User Contributed Perl Documentation				      SWISS::TextFunc(3pm)

NAME

       SWISS::TextFunc

DESCRIPTION

       This module is designed to be a repository of functions that are repeatedly used during parsing and formatting of SWISS-PROT/TREMBL lines.
       If more than two line types need to do approximately the same thing then it is probably in here.

       All functions expect to be called as package->function(param list)

       listFromText
	   Takes a piece of text, a separator regex and a separator that may appear at the end.  Returns an array of items that were separated in
	   the text by that separator.  Takes care of null items (drops them for you).

       textFromList
	   Takes an array of items, a separator, a terminating string, and a line width.  Returns an array of strings, each ending with the
	   separator or the terminator with a width less than or equal to the width specified.

	   Seems to do the wrong thing for references - not sure why.  Don't use it for that.

       wrapText
	   Takes a string and a length.  Returns an array of strings, each no longer than the given length, splitting the string on white
	   space.

       wrapOn ($firstLinePrefix, $linePrefix, $colums, $text[, @separators])
	   Wraps $text into lines with at most $colums columns. Prepends the prefixes to the lines. @separators is a list of expressions on which
	   to wrap. The expression itself is part of the upper line.

	   If no @separators are provided, the $text is wrapped at whitespace except in EC/TC numbers or at dashes that separate words.

	   First tries to wrap on the first item of @separators, then the next etc.  If no wrap on any element of @separators or whitespaces is
	   possible, wraps into lines of exactly length $colums.

	   A special case is that the first item of @separators may be a reference to an array. This is used internally for wrapping FT VARIANT-
	   like lines.

	   Example:

	    wrapOn('DE ', 'DE ', 40,
		   '14-3-3 PROTEIN BETA/ALPHA (PROTEIN KINASE C INHIBITOR PROTEIN-1)',
		   '\s+')
	    returns ['14-3-3 PROTEIN BETA/ALPHA (PROTEIN ',
		     'KINASE C INHIBITOR PROTEIN-1)']
	    wrapOn('DE ', 'DE ', 40,
		   '14-3-3 PROTEIN BETA/ALPHA (PROTEIN KINASE C INHIBITOR PROTEIN-1)',
		   ' (?=\()', '\s+')
	    returns ['14-3-3 PROTEIN BETA/ALPHA ',
		     '(PROTEIN KINASE C INHIBITOR PROTEIN-1)']

       cleanLine
	   Remove the leading line Identifier and three blanks and trailing spaces from an SP line.

       joinWith ($text, $with, $noAddAfter, @list)
	   Concatenates $text and @list into one string. Adds $with between the original elements, unless the postfix of the current string is
	   $noAddAfter.  This is used to avoid inserting blanks after hyphens during concatenation.  So unpleasant strings like 'CALMODULIN-
	   DEPENDENT' are avoided. Unfortunately a correct reassembly of strings like 'CARBON-DIOXIDE' is not done.

       insertLineGroup ($textRef, $text, $pattern)
	   Inserts text block $text into the text referred to by $textRef. $text will replace the text block in $textRef matched by $pattern.

       uniqueList (@list)
	   Returns a list in which all duplicates from @list have been removed.

       currentSpDate
	   Returns the current date in SWISS-PROT format.

       toMixedCase($text, @regexps)
	   Convert a text to mixed case, according to one or more regular expressions.	In scalar context, returns the new text; in array context,
	   also returns the regexp with which the change was performed, or undef on failure.  See corresponding item in SWISS::GN for more
	   details.
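The behaviour described for uniqueList and listFromText can be illustrated with a short sketch. This is Python, not the module's Perl code, and it only follows the descriptions above. The names `unique_list` and `list_from_text` are hypothetical. Keeping the first occurrence of each item in its original order is an assumption, since the text does not say which order uniqueList returns.

```python
import re


def unique_list(items):
    """Drop duplicates, keeping the first occurrence of each item.

    Assumption: original order is preserved (the man page does not say).
    """
    # dict keys are unique and keep insertion order (Python 3.7+).
    return list(dict.fromkeys(items))


def list_from_text(text, separator, trailing=None):
    """Split text on a separator regex and drop empty items.

    `trailing` is an optional regex for a separator that may end the text.
    """
    if trailing is not None:
        # Remove the optional terminator before splitting.
        text = re.sub(f"(?:{trailing})$", "", text)
    # Empty strings stand in for the "null items" the description mentions.
    return [item for item in re.split(separator, text) if item]


print(unique_list(["a", "b", "a", "c", "b"]))        # ['a', 'b', 'c']
print(list_from_text("A; B;; C.", r";\s*", r"\."))   # ['A', 'B', 'C']
```

In a line-oriented script, a helper like `unique_list` is the usual way to remove repeated entries while leaving the first one in place.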

perl v5.10.1							    2006-08-31						      SWISS::TextFunc(3pm)
