Sponsored Content
Top Forums Shell Programming and Scripting remove consecutive duplicate rows Post 302529187 by ctsgnb on Wednesday 8th of June 2011 06:03:20 PM
Old 06-08-2011
Make all lines uniq (make duplicate consecutive line appear only once):
Code:
awk '{sub(".*"$2,$2)}1' yourfile | uniq

Log duplicate lines in a file (will be logged only lines that consecutively appear twice or more):
Code:
awk '{sub(".*"$2,$2)}1' yourfile | uniq -d >duplicate.txt

Log lines that appear not more than once consecutively in a file :
Code:
awk '{sub(".*"$2,$2)}1' yourfile | uniq -u >single.txt

Log all lines prefixed by the number of times they appear consecutively:
Code:
awk '{sub(".*"$2,$2)}1' yourfile | uniq -c >count.txt

You can renumber the lines if necessary with ... | cat -b or ... | cat -n if necessary


By the way, do you really care about the first field (line number) or can we get rid of it ?

Last edited by ctsgnb; 06-08-2011 at 07:14 PM..
 

10 More Discussions You Might Find Interesting

1. Shell Programming and Scripting

How to capture 2 consecutive rows when a condition is true ?

Hi All, i have an input below. As long as "x= 1" , i would want to capture 2 lines using sed or awk for eg : 0001 x= 1 $---------------------------------..-.--.. 0001 tt= 137 171 423 1682 2826 0 Pls help. Thanks in advance. Note that the number of lines in each block do... (37 Replies)
Discussion started by: Raynon
37 Replies

2. UNIX for Dummies Questions & Answers

Remove duplicate rows of a file based on a value of a column

Hi, I am processing a file and would like to delete duplicate records as indicated by one of its column. e.g. COL1 COL2 COL3 A 1234 1234 B 3k32 2322 C Xk32 TTT A NEW XX22 B 3k32 ... (7 Replies)
Discussion started by: risk_sly
7 Replies

3. Shell Programming and Scripting

awk script to remove duplicate rows in line

i have the long file more than one ns and www and mx in the line like . i need the first ns record and first www and first mx from line . the records are seperated with tthe ; i am try ing in awk scripting not getiing the solution. ... (4 Replies)
Discussion started by: kiranmosarla
4 Replies

4. Shell Programming and Scripting

To remove date and duplicate rows from a log file using unix commands

Hi, I have a log file having size of 48mb. For such a large log file. I want to get the message in a particular format which includes only unique error and exception messages. The following things to be done : 1) To remove all the date and time from the log file 2) To remove all the... (1 Reply)
Discussion started by: Pank10
1 Replies

5. Shell Programming and Scripting

remove html tags,consecutive duplicate lines

I need help with a script that will remove all HTML tags from an HTML document and remove any consecutive duplicate lines, and save it as a text document. The user should have the option of including the name of an html file as an argument for the script, but if none is provided, then the script... (7 Replies)
Discussion started by: clicstic
7 Replies

6. UNIX for Dummies Questions & Answers

Remove duplicate rows when >10 based on single column value

Hello, I'm trying to delete duplicates when there are more than 10 duplicates, based on the value of the first column. e.g. a 1 a 2 a 3 b 1 c 1 gives b 1 c 1 but requires 11 duplicates before it deletes. Thanks for the help Video tutorial on how to use code tags in The UNIX... (11 Replies)
Discussion started by: informaticist
11 Replies

7. Shell Programming and Scripting

Remove duplicate rows based on one column

Dear members, I need to filter a file based on the 8th column (that is id), and does not mather the other columns, because I want just one id (1 line of each id) and remove the duplicates lines based on this id (8th column), and does not matter wich duplicate will be removed. example of my file... (3 Replies)
Discussion started by: clarissab
3 Replies

8. Shell Programming and Scripting

Compute value from more than three consecutive rows

Hello all, I am working on a file like below: site Date time value1 value2 0023 2014-01-01 00:00 32.0 23.7 0023 2014-01-01 01:00 38.0 29.9 0023 2014-01-01 02:00 85.0 26.6 0023 2014-01-01 03:00 34.0 25.3 0023 2014-01-01 04:00 37.0 23.8 0023 2014-01-01 05:00 80.0 20.3 0023 2014-01-01 06:00... (16 Replies)
Discussion started by: kathy wang
16 Replies

9. Shell Programming and Scripting

Check/print missing number in a consecutive range and remove duplicate numbers

Hi, In an ideal scenario, I will have a listing of db transaction log that gets copied to a DR site and if I have them all, they will be numbered consecutively like below. 1_79811_01234567.arc 1_79812_01234567.arc 1_79813_01234567.arc 1_79814_01234567.arc 1_79815_01234567.arc... (3 Replies)
Discussion started by: newbie_01
3 Replies

10. Shell Programming and Scripting

Remove duplicate consecutive lines with specific string

Hello, I'm trying to remove the duplicate consecutive lines with specific string "WARNING". File.txt abc; WARNING 2345 WARNING 2345 WARNING 2345 WARNING 2345 WARNING 2345 bcd; abc; 123 123 123 WARNING 1234 WARNING 2345 WARNING 2345 efgh; (6 Replies)
Discussion started by: Mannu2525
6 Replies
uniq(1) 							   User Commands							   uniq(1)

NAME
uniq - report or filter out repeated lines in a file SYNOPSIS
uniq [-c | -d | -u] [-f fields] [-s char] [ input_file [output_file]] uniq [-c | -d | -u] [-n] [ + m] [ input_file [output_file]] DESCRIPTION
The uniq utility will read an input file comparing adjacent lines, and write one copy of each input line on the output. The second and suc- ceeding copies of repeated adjacent input lines will not be written. Repeated lines in the input will not be detected if they are not adjacent. OPTIONS
The following options are supported: -c Precedes each output line with a count of the number of times the line occurred in the input. -d Suppresses the writing of lines that are not repeated in the input. -f fields Ignores the first fields fields on each input line when doing comparisons, where fields is a positive decimal integer. A field is the maximal string matched by the basic regular expression: [[:blank:]]*[^[:blank:]]* If fields specifies more fields than appear on an input line, a null string will be used for comparison. -s chars Ignores the first chars characters when doing comparisons, where chars is a positive decimal integer. If specified in con- junction with the -f option, the first chars characters after the first fields fields will be ignored. If chars specifies more characters than remain on an input line, a null string will be used for comparison. -u Suppresses the writing of lines that are repeated in the input. -n Equivalent to -f fields with fields set to n. +m Equivalent to -s chars with chars set to m. OPERANDS
The following operands are supported: input_file A path name of the input file. If input_file is not specified, or if the input_file is -, the standard input will be used. output_file A path name of the output file. If output_file is not specified, the standard output will be used. The results are unspeci- fied if the file named by output_file is the file named by input_file. EXAMPLES
Example 1: Using the uniq command The following example lists the contents of the uniq.test file and outputs a copy of the repeated lines. example% cat uniq.test This is a test. This is a test. TEST. Computer. TEST. TEST. Software. example% uniq -d uniq.test This is a test. TEST. example% The next example outputs just those lines that are not repeated in the uniq.test file. example% uniq -u uniq.test TEST. Computer. Software. example% The last example outputs a report with each line preceded by a count of the number of times each line occurred in the file: example% uniq -c uniq.test 2 This is a test. 1 TEST. 1 Computer. 2 TEST. 1 Software. example% ENVIRONMENT VARIABLES
See environ(5) for descriptions of the following environment variables that affect the execution of uniq: LANG, LC_ALL, LC_CTYPE, LC_MES- SAGES, and NLSPATH. EXIT STATUS
The following exit values are returned: 0 Successful completion. >0 An error occurred. ATTRIBUTES
See attributes(5) for descriptions of the following attributes: +-----------------------------+-----------------------------+ | ATTRIBUTE TYPE | ATTRIBUTE VALUE | +-----------------------------+-----------------------------+ |Availability |SUNWesu | +-----------------------------+-----------------------------+ |CSI |Enabled | +-----------------------------+-----------------------------+ |Interface Stability |Standard | +-----------------------------+-----------------------------+ SEE ALSO
comm(1), pack(1), pcat(1), sort(1), uncompress(1), attributes(5), environ(5), standards(5) SunOS 5.10 20 Dec 1996 uniq(1)
All times are GMT -4. The time now is 02:02 PM.
Unix & Linux Forums Content Copyright 1993-2022. All Rights Reserved.
Privacy Policy