Comparing alternate lines of code


 
# 1  
Old 11-12-2018

Hi gents,

I have only a passing familiarity with Linux/shell at this point, so please forgive the simple question.

I have text files that have lines something like the following:

Code:
a
b
c
d
d
d
e
f
e
f
e
f
a
b
c
d
e
f
etc

I'm trying to remove 2 types of duplicates while preserving line order/format.
1) consecutive duplicate lines
2) alternate lines if they are duplicate

For removing type 1 lines,
Code:
uniq "$file" > ./output/"$file"

gives me an output file that looks like

Code:
a
b
c
d
e
f
e
f
e
f
a
b
c
d
e
f
etc

which is fine.

I'm kinda stumped about type 2 duplicates though...

Ideally I'd like to get:

Code:
a
b
c
d
e
f
a
b
c
d
e
f

I'm not entirely sure how to compare alternate lines... Any assistance is appreciated.

Last edited by RudiC; 11-12-2018 at 11:06 AM..
# 2  
Old 11-12-2018
How about

Code:
awk '!($0 == LAST1 || $0 == LAST2); {LAST2 = LAST1; LAST1 = $0}' file
a
b
c
d
e
f
a
b
c
d
e
f

# 3  
Old 11-12-2018
Hi RudiC,

Thanks for the assistance. That works wonderfully. May I ask for some further guidance breaking down the command so I may understand it?

Are we outputting if the current line does not equal the immediately preceding 2 lines (LAST1 || LAST2) and then incrementing the lines for the next iteration?

By extension, this also deals with the type 1 duplicates yes?
# 4  
Old 11-12-2018
Your analysis / function description is correct. Lines are printed only if they did not show up in the recent two lines.

Type 1 duplicates are handled by the LAST1 comparison, type 2 by LAST2. Then the two variables are sort of cycled through.
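
For anyone who finds the one-liner dense, here is a minimal sketch of the same logic spread over several lines with comments. The function name dedupe_recent2 is made up for this illustration and is not from the thread; the awk program itself is the one posted above.

```shell
# dedupe_recent2 is a hypothetical helper name, not something from the thread.
dedupe_recent2() {
  awk '
    # Pattern with no action: the default action prints the line.
    # It fires only when the line differs from the previous line (LAST1)
    # and from the line before that (LAST2).
    !($0 == LAST1 || $0 == LAST2)

    # Action with no pattern: runs for every line, printed or not,
    # and shifts the two-line history along by one.
    { LAST2 = LAST1; LAST1 = $0 }
  ' "$@"
}

# Sample input from post #1; prints a b c d e f a b c d e f, one per line.
printf '%s\n' a b c d d d e f e f e f a b c d e f | dedupe_recent2
```

The two rules are independent: the first decides whether to print, the second always updates the history, which is why both kinds of duplicates are handled in one pass.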
# 5  
Old 11-12-2018
Here is my effort to walk through it using the input.
The input lines are numbered here for easier reading; the numbers are not in the actual file the program is processing.
Code:
#
# Condition construct is met on line 1
# LAST2 is empty, LAST1 is defined as current processing line, or $0
#

1 a

#
# Condition construct is met on line 2
# LAST2 is defined as LAST1 (previous line), LAST1 as current processing line, or $0
# We do that till line 6, since condition is met, replacing the values of LAST1 / LAST2 accordingly.
#

2 b
3 c
4 d
5 e
6 f

#
# At this moment, on line 7, the value of LAST1 is "f", while LAST2 is "e".
# The condition is not met for lines 7 to 10, so those lines are not in the output.
# LAST1 / LAST2 are still updated on every line; here they just keep swapping between "e" and "f".
#

7 e
8 f
9 e
10 f

#
# On line 11 LAST1 or LAST2 condition construct is met again.
# LAST2 is declared as "f", and LAST1 as "a" or $0 or current processing line
# The program continues to operate as above.
#

11 a
12 b
13 c
14 d
15 e
16 f

Hopefully that is correct.
Regards
Peasant.
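
If you want to check a walk-through like this yourself, a rough sketch is to let awk report its own state on each line. The helper name trace_recent2 and the output format are invented for this illustration; the test is the same one RudiC's command uses.

```shell
# trace_recent2 is a hypothetical helper used only for this illustration.
trace_recent2() {
  awk '
    {
      # Same test as the one-liner, stored so it can be reported.
      keep = !($0 == LAST1 || $0 == LAST2)
      printf "%d %s LAST1=%s LAST2=%s %s\n", NR, $0, LAST1, LAST2, (keep ? "print" : "skip")
      # The history shifts on every line, including skipped ones.
      LAST2 = LAST1
      LAST1 = $0
    }
  ' "$@"
}

# The values shown for LAST1 / LAST2 are those seen before the line is stored.
printf '%s\n' a b c d e f e f e f a b | trace_recent2
```

On this input, line 8 reports LAST1=e LAST2=f and is skipped, and line 11 reports LAST1=f LAST2=e and is printed, so the two variables keep moving even while nothing is being output.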
# 6  
Old 11-12-2018
Code:
awk '$0 != line[NR-2] && $0 != line[NR-1]; {line[NR]=$0}' infile
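
A side note on the array version: as posted it keeps every line of the file in the line array, which probably does not matter for small files. A possible variation, offered only as a sketch, deletes the entry that can no longer be compared against. The function name dedupe_window and the delete statement are additions, not part of the original post.

```shell
# dedupe_window is a hypothetical name; the delete is an addition to the posted one-liner.
dedupe_window() {
  awk '
    # Print when the line differs from the two lines before it.
    $0 != line[NR-2] && $0 != line[NR-1]

    # Remember the current line, then drop the entry that the next
    # line will no longer look at.
    { line[NR] = $0; delete line[NR-2] }
  ' "$@"
}

printf '%s\n' a b c d d d e f e f e f a b c d e f | dedupe_window
```

The output should match the LAST1 / LAST2 version; only the memory use differs.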

# 7  
Old 11-14-2018
The solution is a lookup buffer of two, implemented by the two variables LAST1 and LAST2.
The following has a configurable buffer depth
Code:
awk '
{
  # preset: print
  prt=1
  # dont print if found in buf
  for (i=1; i<=d; i++) if (buf[i%d]==$0) {
    prt=0
    break
  }
  if (prt==1) print $0
  buf[NR%d]=$0
}
' d=2 file

With d=1 it will detect the repetition d d, but not e f e f.
With d=3 it will also detect a repetition such as g h i g h i...
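
To try the different depths side by side, here is a small sketch that wraps the same idea in a shell function. The name dedupe_depth is made up, and it passes d with awk's -v option instead of the trailing d=2 operand so that it also works when reading from a pipe; the buffer logic follows the script above.

```shell
# dedupe_depth is a hypothetical wrapper; its first argument is the depth d.
dedupe_depth() {
  depth=$1
  shift
  awk -v d="$depth" '
    {
      prt = 1
      # Slots 0..d-1 of buf hold the last d input lines.
      for (i = 0; i < d; i++) if (buf[i] == $0) { prt = 0; break }
      if (prt) print
      # Overwrite the oldest slot with the current line.
      buf[NR % d] = $0
    }
  ' "$@"
}

input() { printf '%s\n' d d e f e f g h i g h i; }

input | dedupe_depth 1   # keeps: d e f e f g h i g h i
input | dedupe_depth 2   # keeps: d e f g h i g h i
input | dedupe_depth 3   # keeps: d e f g h i
```

Note that a larger d also suppresses any line that matches one of the last d lines, not only exact repeating groups, so pick the smallest depth that covers the pattern in your files.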