Thanks for the assistance. That works wonderfully. May I ask for some further guidance breaking down the command so I may understand it?
Are we printing the current line only if it differs from both of the two remembered lines (LAST1 || LAST2), and then shifting those two values along for the next iteration?
By extension, this also deals with the type 1 duplicates, yes?
Here is my attempt to trace the command against some input.
The input lines are numbered here only to make them easier to follow; the numbers are not in the actual file / input the program is processing.
Code:
#
# The condition is met on line 1.
# LAST2 is empty; LAST1 is set to the current line ($0).
#
1 a
#
# The condition is met on line 2.
# LAST2 takes the value of LAST1 (the previous line); LAST1 is set to the current line ($0).
# The same happens through line 6: the condition is met each time, and LAST1 / LAST2 are updated accordingly.
#
2 b
3 c
4 d
5 e
6 f
#
# At this point, on reaching line 7, LAST1 is "f" and LAST2 is "e".
# The condition is not met for lines 7 to 10.
# LAST1 / LAST2 do not change, and those lines do not appear in the output.
#
7 e
8 f
9 e
10 f
#
# On line 11 the condition on LAST1 / LAST2 is met again.
# LAST2 is set to "f", and LAST1 to "a" (the current line, $0).
# The program continues to operate as above.
#
11 a
12 b
13 c
14 d
15 e
16 f
The solution is a lookup buffer of two, implemented by the two variables LAST1 and LAST2.
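The original one-liner is not quoted in this post, so the following is only a sketch of what a two-variable version consistent with the walk-through above might look like. The function name dedup2 and the exact form of the condition are assumptions. The point to notice is that LAST1 and LAST2 are updated only when a line is printed.

```shell
# Hypothetical reconstruction, not the command from the earlier post:
# print a line only if it differs from both remembered lines,
# then shift the two-slot buffer.
dedup2() {
  awk '
    $0 != LAST1 && $0 != LAST2 {
      print
      LAST2 = LAST1   # older remembered line
      LAST1 = $0      # most recent remembered line
    }
  ' "$@"
}

# Sample run: the repeated "e f" pairs are dropped.
# Prints a b c d e f a b, one per line.
printf '%s\n' a b c d e f e f e f a b | dedup2
```

Run against the numbered sample above (without the numbers), this should reproduce the trace: lines 7 to 10 are suppressed and output resumes at line 11.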
The following variant has a configurable buffer depth d:
Code:
awk '
{
  # preset: print
  prt=1
  # do not print if the line is found in the buffer
  for (i=1; i<=d; i++) if (buf[i%d]==$0) {
    prt=0
    break
  }
  if (prt==1) print $0
  # remember the current line; the slot index cycles through 0..d-1
  buf[NR%d]=$0
}
' d=2 file
With d=1 it detects an immediate repetition such as d d, but not an alternating one such as e f e f.
With d=3 it also detects a three-line repetition such as g h i g h i...
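To see the effect of the depth, the script can be wrapped in a small shell function and run on one input with different values of d. This is a sketch only: the function name dedup_n and the sample input are made up for illustration, and the awk body is the one posted above.

```shell
# Hypothetical wrapper: dedup_n DEPTH [file...]
dedup_n() {
  depth=$1
  shift
  awk -v d="$depth" '
  {
    prt = 1
    # do not print if the line is found in the buffer
    for (i = 1; i <= d; i++) if (buf[i % d] == $0) {
      prt = 0
      break
    }
    if (prt == 1) print $0
    buf[NR % d] = $0
  }
  ' "$@"
}

sample() { printf '%s\n' d d e f e f g h i g h i; }

sample | dedup_n 1   # prints d e f e f g h i g h i (one per line)
sample | dedup_n 2   # prints d e f g h i g h i
sample | dedup_n 3   # prints d e f g h i
```

Note that, unlike a version that updates its variables only on output, this buffer stores every input line, printed or not.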