Search for duplicates of a specific pattern and delete them, keeping the first occurrence
Hi all,
I have been trying to delete duplicates of one particular pattern, without success. Several patterns are duplicated in the file, but I want to remove the duplicates of only one of them and leave the rest untouched. I cannot use awk '!x[$0]++' inputfile.txt, sort with uniq, or sed '/pattern/d': the first two remove every duplicated line in the file, and the sed command deletes every matching line, including the first. A sample follows:
inputfile.txt
I want to remove the duplicates for pattern "FUNC" only, where the output should look like this:
output.txt
I have thousands of records like this, and I need to deduplicate a different pattern each time. I also tried specifying the column number, but that affects other duplicated values that I want to leave alone. I would appreciate your help with this. Thanks.
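One possible approach, offered as a sketch rather than a definitive answer: the sample files are not shown above, so this assumes a "duplicate" is a whole line repeated verbatim and that only lines containing the chosen pattern should be deduplicated. The function name dedupe_pattern is hypothetical.

```shell
# Hypothetical helper: print every line, but for lines matching PATTERN
# keep only the first occurrence of each distinct line.
# Usage: dedupe_pattern PATTERN [FILE...]
dedupe_pattern() {
    pat=$1
    shift
    # $0 !~ pat    : lines without the pattern always pass through
    # !seen[$0]++  : lines with the pattern pass only the first time seen
    awk -v pat="$pat" '$0 !~ pat || !seen[$0]++' "$@"
}

# Example: dedupe_pattern FUNC inputfile.txt > output.txt
```

Other duplicated lines are left alone because the duplicate test only applies when the pattern matches. If duplicates should be judged on one column instead of the whole line, indexing the seen array by that field rather than by $0 would be the natural change.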
Hello Friend,
I have the following command to delete the 4th field and shift the rest forward. Can I delete all the fields and keep just the first 2?
sed -e "/^*<Number/s/\(\) \(\)/\1\2/g" -e "/^*<Number/s/\(\)./\1/" -e "/^*<Number/s/\(\)/\1 /g" -e "/^*<Number/s/0</</" file
input
<Number>00000000<Number>... (5 Replies)
Hi,
I want to search for a certain pattern in a file with the less command. For example, I have a file with these entries:
POLAR xx
POLARX xc
POLARXI x1
POLARZZZY vb
POLARLLLLLLL ee... (1 Reply)
My data looks something like this:
(08/03/2009 22:57:42.414)(:) king aaaaaaaaaaaaaaaa bbbbbbbbbbbbbbbbbbbbbb
(08/03/2009 22:57:42.416)(:) John cccccccccccc cccccvssssssssss baaaaa
(08/03/2009 22:57:42.417)(:) Michael ddddddd tststststtststts
(08/03/2009 22:57:42.425)(:) Ravi... (11 Replies)
I have a file with the following content:
0513468211,,,,20091208,084005,5,,2,3699310,
0206554475,,,,20090327,123634,85,,2,15615533
0206554475,,,,20090327,134431,554,,2,7246177
0103000300,,,,20090523,115501,89,,2,3869929
0736454328,,,,20091208,084005,75,,2,3699546... (7 Replies)
My files look like this
I need to cut the sequences at the last "A" found in the following pattern (highlighted here for easier identification; the pattern in the actual file is not highlighted).
The expected result should look like this
Thus, all the sequences would end with AGCCCTA... (2 Replies)
Hi,
I am unable to find the duplicates in a file based on the 1st, 2nd, 4th, and 5th columns and remove them from the same file.
Source filename: Filename.csv
"1","ccc","information","5000","temp","concept","new"
"1","ddd","information","6000","temp","concept","new"... (2 Replies)
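A minimal sketch of one way to do this with awk, assuming the file is simple CSV in which no quoted field contains a comma (awk splits on every comma). The function name dedupe_by_cols is hypothetical. Since awk does not edit files in place, the result goes through a temporary file.

```shell
# Hypothetical helper: keep the first row for each distinct combination
# of columns 1, 2, 4 and 5 (assumes no commas inside quoted fields).
dedupe_by_cols() {
    awk -F, '!seen[$1, $2, $4, $5]++' "$@"
}

# To rewrite the file itself, go through a temporary file:
# dedupe_by_cols Filename.csv > Filename.csv.tmp && mv Filename.csv.tmp Filename.csv
```

The two sample rows above differ in columns 2 and 4, so both would be kept. A later row repeating the same values in all four key columns would be dropped.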
Hi all,
I am trying to extract the values (the text between the XML tags) based on the Order Number.
Here is the sample input:
<?xml version="1.0" encoding="UTF-8"?>
<NJCustomer>
<Header>
<MessageIdentifier>Y504173382</MessageIdentifier>
... (13 Replies)
Given a file such as this I need to remove the duplicates.
00060011 PAUL BOWSTEIN ad_waq3_921_20100826_010517.txt
00060011 PAUL BOWSTEIN ad_waq3_921_20100827_010528.txt
0624-01 RUT CORPORATION ad_sade3_10_20100827_010528.txt
0624-01 RUT CORPORATION ... (13 Replies)
Hi Unix gurus,
I am trying to remove files based on the MMDDYYYY date in their names, so that the directory always keeps the 3 most recent files by MMDDYYYY. "HHMM" is just a dummy in this case; there will never be two files with different HHMM values on the same day.
For example in a... (4 Replies)
I have a Unix file like the one below:
>newuser
newuser
<hello
hello
newone
I want to find the unique values in the file (ignoring the < and > characters), so that the output is:
>newuser
<hello
newone
Can anybody tell me which command produces this new file? (7 Replies)
Discussion started by: shiva2985
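A possible awk sketch for this, assuming the < or > marker only ever appears as the first character of a line and that the first line seen for each value is the one to keep. The function name unique_ignoring_marks is hypothetical.

```shell
# Hypothetical helper: treat ">newuser" and "newuser" as the same value
# and print only the first line seen for each value.
unique_ignoring_marks() {
    # Build the comparison key by stripping one leading < or >,
    # then print the original line only the first time the key appears.
    awk '{ key = $0; sub(/^[<>]/, "", key) } !seen[key]++' "$@"
}

# Example: unique_ignoring_marks file > newfile
```

The original line is printed unchanged, so the marker survives on whichever copy comes first in the file.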
regexp
Regexp(3I) InterViews Reference Manual Regexp(3I)
NAME
Regexp - regular expression searching
SYNOPSIS
#include <InterViews/regexp.h>
DESCRIPTION
A Regexp encapsulates a regular expression pattern and defines operations for searching and matching the pattern against a string. The
syntax of the regular expression pattern is the same as that for ed(1). Information can be obtained about the most recent match of the
regular expression (and its sub-expressions).
PUBLIC OPERATIONS
Regexp(const char* pattern)
Regexp(const char* pattern, int length)
Construct a new Regexp for pattern.
int Match(const char* text, int length, int index)
Attempt a match against text (of length length) at position index. The return value is the length of the matching string, or a
negative number if the match failed.
int Search(const char* text, int length, int index, int range)
Search for a match in the string text (of length length). Matches are attempted starting at positions between index and index plus
range. If range is positive the first match after index is reported. If range is negative the first match before index is
reported. The return value is the index of the starting position of the match, or a negative number if there is no match in the
specified range.
int BeginningOfMatch(int subexp)
int EndOfMatch(int subexp)
Return information about the most recent match. If subexp is zero (the default), information is reported for the complete regular
expression. Other values of subexp refer to sub-expressions in the pattern. For example, if subexp is 2, information is returned
for the sub-expression specified by the second pair of ( and ) delimiters in the pattern.
SEE ALSO
ed(1)
InterViews 23 May 1989 Regexp(3I)