This seemed to work, but I noticed that a few duplicates seem to be left behind. How does the array know what the delimiter is? $3 is the field, but I'm not clear on the delimiter. Would the same work with tabs as the delimiter?
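On the delimiter question: awk splits each line into fields using its FS variable, which defaults to runs of spaces and tabs, so $3 is the third whitespace-separated field and the array is keyed by that string. A minimal sketch, assuming the script in question is the usual `!seen[$3]++` idiom (the original command is not shown here); the function name `dedupe_field3_tab` is hypothetical:

```shell
# Keep the first line seen for each distinct value of field 3.
# -F '\t' sets the field separator to a single tab; without -F, awk
# splits on any run of spaces or tabs.
# Assumption: the original script keys its array on $3 only.
dedupe_field3_tab() {
    awk -F '\t' '!seen[$3]++' "$@"
}
```

With tab splitting, a field may contain spaces, and stray characters such as trailing spaces or carriage returns become part of the key. That is one possible reason for duplicates appearing to be left behind.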
I have a file which looks like
AA BB CC DD EE FF GG HH KK
AA BB GG HH KK FF CC DD EE
AA BB CC DD EE UU VV XX ZZ
AA BB VV XX ZZ UU CC DD EE
....
I want the script to give me only one line based on duplicate contents:
AA BB CC DD EE FF GG HH KK
AA BB CC DD EE UU VV XX ZZ (7 Replies)
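One way to approach this, sketched under the assumption that two lines count as duplicates when they contain the same fields in any order: sort each line's fields to build a canonical key and print a line only the first time its key appears. The function name `dedupe_any_order` is hypothetical.

```shell
# Print only the first line for each distinct set of fields, regardless
# of field order. The key is the line's fields, sorted and rejoined.
dedupe_any_order() {
    awk '{
        n = split($0, f, " ")          # " " means default whitespace splitting
        # insertion sort of f[1..n], forcing string comparison
        for (i = 2; i <= n; i++) {
            v = f[i]
            for (j = i - 1; j >= 1 && (f[j] "") > (v ""); j--)
                f[j + 1] = f[j]
            f[j + 1] = v
        }
        key = ""
        for (i = 1; i <= n; i++) key = key f[i] SUBSEP
        if (!(key in seen)) { seen[key] = 1; print }
    }' "$@"
}
```

On the four sample lines above, this keeps the first and third lines.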
Hi Guys...
Could you please help me with the following?
aaaa bbbb cccc sdsd
aaaa bbbb cccc qwer
As you can see, the two lines match in three fields...
How can I delete this duplicate? I mean, how can I delete the second line if three fields are duplicated?
Thanks (14 Replies)
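A minimal sketch for this case, assuming the three matching fields are the first three whitespace-separated fields and that the first occurrence is the one to keep (`dedupe_first3` is a hypothetical name):

```shell
# Keep a line only the first time its ($1, $2, $3) combination appears.
# awk joins the comma-separated subscripts with SUBSEP to form one key.
dedupe_first3() {
    awk '!seen[$1, $2, $3]++' "$@"
}
```

For the two sample lines, only `aaaa bbbb cccc sdsd` is printed.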
I have millions of records, each containing exactly 50 characters, and I have to check the uniqueness of a 4-character substring of each record (position known in advance) and report any duplicates found.
Eg. data...
AAAA00000000000000XXXX0000 0000000000... upto50 chars... (2 Replies)
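A sketch of one possible approach: extract the substring with awk's `substr` and count how often each value occurs. The start position and length are passed in as arguments because the post does not state them; `report_dup_substr` is a hypothetical name.

```shell
# Usage: report_dup_substr START LEN [FILE...]
# Prints each substring value that occurs on more than one record.
report_dup_substr() {
    start=$1; len=$2; shift 2
    awk -v s="$start" -v l="$len" '
        { k = substr($0, s, l) }
        count[k]++ == 1 { print k }   # printed once, on the second sighting
    ' "$@"
}
```

Only the distinct substring values are held in memory, not the records themselves, so this should remain practical for millions of lines.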
Hi team,
I have CSV files with 20 columns. I want to find the duplicates in a file based on column1, column10, column4, column6, column8 and column2. If those columns have the same values, the record should be treated as a duplicate.
Can anyone help me find the duplicates?
Thanks in advance.
... (2 Replies)
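A minimal sketch, assuming a plain comma-separated file with no quoted fields that contain commas (awk's `-F ','` does not understand CSV quoting); `find_csv_dups` is a hypothetical name:

```shell
# Print every record whose key (columns 1, 10, 4, 6, 8, 2) has already
# appeared on an earlier line, i.e. the second and later occurrences.
find_csv_dups() {
    awk -F ',' 'seen[$1, $10, $4, $6, $8, $2]++' "$@"
}
```

Prefixing the pattern with `!` would instead print the first record for each key and drop the repeats.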
I have an input file abc.txt with info like:
abcd
rateuse
inklite
robet
rateuse
abcd
I need to remove the duplicates (e.g. abcd, rateuse) from the file and write the result back to the same file, abc.txt; if needed, it can be written to another file instead.
Can anyone help me with this? :( (4 Replies)
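A sketch of one way to do this, assuming the first occurrence of each line should be kept and the original order preserved. awk cannot safely write to the file it is reading, so the result goes through a temporary file; `dedupe_in_place` is a hypothetical name, and `mktemp` is assumed to be available.

```shell
# Usage: dedupe_in_place FILE
# Rewrites FILE with only the first occurrence of each line.
dedupe_in_place() {
    file=$1
    tmp=$(mktemp) || return 1
    # Copy the result back with cat rather than mv so that FILE keeps
    # its original permissions.
    awk '!seen[$0]++' "$file" > "$tmp" && cat "$tmp" > "$file"
    rm -f "$tmp"
}
```

For the sample above, abc.txt would end up containing abcd, rateuse, inklite and robet, in that order.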
Hi All ,
I have a requirement where I need to remove duplicates from a fixed-width file which has multiple key columns. I also need to capture the duplicate records in another file.
File has 8 columns.
Key columns are col1 and col2.
Col1 has a length of 8; col2 has a length of 3.
... (5 Replies)
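A sketch under a stated assumption: col1 and col2 are the first two columns, so the key is characters 1 to 11 of each record (8 + 3). The post does not give the column positions, so adjust the `substr` arguments if the layout differs. `split_fixed_dups` is a hypothetical name.

```shell
# Usage: split_fixed_dups DUPFILE [FILE...]
# Writes the first record for each key to stdout and every later record
# with the same key to DUPFILE.
split_fixed_dups() {
    dupfile=$1; shift
    : > "$dupfile"                       # start with an empty duplicates file
    awk -v dups="$dupfile" '
        { key = substr($0, 1, 11) }      # assumed: col1 = 1-8, col2 = 9-11
        seen[key]++ { print >> dups; next }
        { print }
    ' "$@"
}
```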
Hi,
I have a requirement. For example, I have a text file with the pipe symbol (|) as delimiter and 4 columns: a, b, c, d. Here a and b are the primary key columns.
I want to process that file to find duplicates and null values in the primary key columns (a, b). I want to write the unique records in which... (5 Replies)
Hi guys, I'm in a bit of a bind. I'm looking to remove duplicates from a pipe-delimited file, but do so based on 2 columns. Sounds easy enough, but here's the kicker...
Column #1 is a simple ID, which is used to identify the duplicate.
Once dups are identified, I need to only keep the one... (2 Replies)
Hello Gurus,
I have multiple pipe-separated files with records that span multiple lines. The end-of-record separator is \n, and records spanning multiple lines have <CR> as the separator. Below is an example from one file.
1|ABC DEF|100|10
2|PQ
RS
T|200|20
3| UVWXYZ|300|30
4| GHIJKL|400|40... (7 Replies)
Discussion started by: dJHa
regexp
REGEXP(6) Games Manual REGEXP(6)
NAME
regexp - regular expression notation
DESCRIPTION
A regular expression specifies a set of strings of characters. A member of this set of strings is said to be matched by the regular
expression. In many applications a delimiter character, commonly /, bounds a regular expression. In the following specification for regular
expressions the word `character' means any character (rune) but newline.
The syntax for a regular expression e0 is
e3: literal | charclass | '.' | '^' | '$' | '(' e0 ')'
e2: e3
| e2 REP
REP: '*' | '+' | '?'
e1: e2
| e1 e2
e0: e1
| e0 '|' e1
A literal is any non-metacharacter, or a metacharacter (one of .*+?[]()|\^$), or the delimiter preceded by \.
A charclass is a nonempty string s bracketed [s] (or [^s]); it matches any character in (or not in) s. A negated character class never
matches newline. A substring a-b, with a and b in ascending order, stands for the inclusive range of characters between a and b. In s,
the metacharacters -, ], an initial ^, and the regular expression delimiter must be preceded by a \; other metacharacters have no special meaning and
may appear unescaped.
A . matches any character.
A ^ matches the beginning of a line; $ matches the end of the line.
The REP operators match zero or more (*), one or more (+), zero or one (?), instances respectively of the preceding regular expression e2.
A concatenated regular expression, e1e2, matches a match to e1 followed by a match to e2.
An alternative regular expression, e0|e1, matches either a match to e0 or a match to e1.
A match to any part of a regular expression extends as far as possible without preventing a match to the remainder of the regular
expression.
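The longest-match rule can be observed with awk's `match` function. Note the assumption: awk uses POSIX extended regular expressions rather than the Plan 9 library described here, but the two agree on simple patterns such as these. `longest_match` is a hypothetical helper name.

```shell
# Usage: longest_match REGEX TEXT
# Prints the part of TEXT matched by REGEX (leftmost, longest match).
longest_match() {
    awk -v re="$1" -v s="$2" 'BEGIN {
        if (match(s, re)) print substr(s, RSTART, RLENGTH)
    }'
}

longest_match 'a*'   'aaab'   # prints aaa: a* extends as far as possible
longest_match 'a*ab' 'aaab'   # prints aaab: a* stops early so that ab can match
```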
SEE ALSO
awk(1), ed(1), sam(1), sed(1), regexp(2)
REGEXP(6)