awk to combine matching lines in file Post: 302981241


Posted by Don Cragun on Thursday 8th of September 2016 06:55:10 PM


Hi cmccabe,
The code you were using:

Code:

awk '!(NR){print$0p}{p=$0}' input

only prints anything when the condition !(NR) evaluates to a non-zero value. But awk sets NR to 1 when it reads the first record from your input files and increments it by 1 every time another input record is read, so NR is never zero when that condition is tested and !NR ALWAYS evaluates to zero. Therefore, the above script is logically equivalent to:

Code:

awk '{p=$0}' input

which, as you said, produces no output.
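If you want to convince yourself of this, here is a small sketch that prints NR and !NR for every record. The three-line input is made up for illustration and the variable name nr_demo is just a placeholder:

```shell
# Hypothetical three-line input; print NR and the value of !NR for each record.
nr_demo=$(printf 'a\nb\nc\n' | awk '{ print NR, !NR }')

# Show what awk produced: the second column is the value the condition sees.
printf '%s\n' "$nr_demo"
```

The second column is 0 on every line, which is why the print action in the original script never runs.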

If you are just trying to remove duplicated adjacent lines in a file (and the first line in your file is never an empty line), you could try:

Code:

awk '$0 != p {print;p = $0}' input

If you could have an empty line as the first line in your file (and you want to keep that empty line in the output), you would need to make it a little more complicated:

Code:

awk '$0 != p || NR == 1 {print;p = $0}' input
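As a rough illustration of the difference between the two commands above, the following sketch feeds both of them a made-up input whose first line is empty. The function name gen_blank_first and the sed bracket wrapping (used only to make the empty line visible) are my own additions:

```shell
# Made-up sample: an empty first line, then x, x, y, x.
gen_blank_first() { printf '\nx\nx\ny\nx\n'; }

# Without the NR == 1 test: the uninitialized p compares equal to the
# empty first line, so that line is not printed.
without_nr=$(gen_blank_first | awk '$0 != p {print;p = $0}' | sed 's/.*/[&]/')

# With the NR == 1 test: the first line is always printed.
with_nr=$(gen_blank_first | awk '$0 != p || NR == 1 {print;p = $0}' | sed 's/.*/[&]/')

printf 'without NR == 1:\n%s\n' "$without_nr"
printf 'with NR == 1:\n%s\n' "$with_nr"
```

The first command prints three lines (x, y, x) and the second prints four, with the empty line showing up as [] at the top.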

The code Yoda suggested removes duplicated lines even if they are not adjacent. If you just need to worry about adjacent lines, Yoda's code does that as well but takes more time and memory to get the job done. For a small file like your sample, it doesn't matter. For a file with a huge number of lines with different contents, the code above should run considerably faster.
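To make the adjacent versus non-adjacent distinction concrete, here is a sketch on a made-up four-line input. Yoda's code is not quoted in this post, so the second command uses the common !seen[$0]++ idiom purely as an assumed stand-in for a "remove all duplicates" approach; the names gen_nonadjacent and seen are placeholders:

```shell
# Made-up sample: "a" shows up again after "b" (a non-adjacent duplicate).
gen_nonadjacent() { printf 'a\na\nb\na\n'; }

# Adjacent-only removal: remembers just the previous line in p.
adjacent=$(gen_nonadjacent | awk '$0 != p || NR == 1 {print;p = $0}')

# Assumed stand-in for removing all duplicates (not necessarily what Yoda
# posted): keeps every distinct line as a key in the array seen.
global=$(gen_nonadjacent | awk '!seen[$0]++')

printf 'adjacent-only:\n%s\n' "$adjacent"
printf 'all duplicates:\n%s\n' "$global"
```

The adjacent-only version keeps the final a (output a, b, a) while the array version drops it (output a, b). The array version stores one entry per distinct line, which is where the extra memory goes on a large file.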

Hope this helps.



