Search for duplicates and delete them, keeping the first occurrence, based on a specific pattern
Hi all,
I have been trying to delete duplicates based on a certain pattern but have failed to make it work. More than one pattern is duplicated, but I want to remove the duplicates of one pattern only and keep the rest. I cannot use awk '!x[$0]++' inputfile.txt, sed '/pattern/d', or the sort and uniq commands, as they would delete all the duplicated patterns in the file. A sample follows:
inputfile.txt
I want to remove the duplicates for the pattern "FUNC" only, so that the output looks like this:
output.txt
I have thousands of records like this, and I need to delete a different pattern each time. I also tried specifying the column number, but that affects other duplicated values that I don't want touched. I'd appreciate your help with this. Thanks
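One awk sketch for this: print every non-FUNC line unconditionally, and print a line containing FUNC only the first time that exact line appears. The sample data below is invented, since the post's actual input was not shown.

```shell
# Hypothetical sample input; the post's actual data was not shown.
cat > inputfile.txt <<'EOF'
FUNC alpha
DATA one
FUNC alpha
DATA one
FUNC beta
EOF

# Print non-FUNC lines unconditionally; print a FUNC line only the
# first time that exact line is seen.
awk '!/FUNC/ || !seen[$0]++' inputfile.txt > output.txt
cat output.txt
```

The duplicated DATA lines survive untouched; only the repeated FUNC line is dropped.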
Hello Friend,
I have the following command to delete the 4th field onward. Can I delete all the fields and keep just the first 2?
sed -e "/^*<Number/s/\(\) \(\)/\1\2/g" -e "/^*<Number/s/\(\)./\1/" -e "/^*<Number/s/\(\)/\1 /g" -e "/^*<Number/s/0</</" file
input
<Number>00000000<Number>... (5 Replies)
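If the goal is simply to keep the first two whitespace-separated fields and drop everything after them, awk is much simpler than chained sed expressions. The sample line below is invented, since the real <Number> records were truncated in the post.

```shell
# Invented sample line; the real records were truncated in the post.
printf '<Number>00000000 field2 field3 field4 field5\n' |
awk '{print $1, $2}'
```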
Hi,
I want to search for a certain pattern with the less command in a file. For example, I have a file with these entries:
POLAR xx
POLARX xc
POLARXI x1
POLARZZZY vb
POLARLLLLLLL ee... (1 Reply)
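To match POLAR on its own rather than as a prefix of POLARX, POLARXI, and so on, anchor the search at a word boundary. With grep that is the -w flag; inside less, searching for the pattern followed by a space has a similar effect. A sketch:

```shell
cat > file.txt <<'EOF'
POLAR xx
POLARX xc
POLARXI x1
POLARZZZY vb
EOF

# -w matches POLAR only as a whole word, so POLARX etc. are skipped.
grep -w 'POLAR' file.txt
```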
I have my data something like this
(08/03/2009 22:57:42.414)(:) king aaaaaaaaaaaaaaaa bbbbbbbbbbbbbbbbbbbbbb
(08/03/2009 22:57:42.416)(:) John cccccccccccc cccccvssssssssss baaaaa
(08/03/2009 22:57:42.417)(:) Michael ddddddd tststststtststts
(08/03/2009 22:57:42.425)(:) Ravi... (11 Replies)
I have one file with content like the following...
0513468211,,,,20091208,084005,5,,2,3699310,
0206554475,,,,20090327,123634,85,,2,15615533
0206554475,,,,20090327,134431,554,,2,7246177
0103000300,,,,20090523,115501,89,,2,3869929
0736454328,,,,20091208,084005,75,,2,3699546... (7 Replies)
My files look like this
I need to cut the sequences at the last "A" found in the following pattern (highlighted here for easier identification; the pattern in the actual file is not highlighted).
The expected result should look like this
Thus, all the sequences would end with AGCCCTA... (2 Replies)
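A sed sketch: because .* is greedy, the match stretches to the last occurrence of AGCCCTA on the line, and everything after it is deleted; lines without the pattern pass through unchanged. The sample sequences below are invented, since the post's file was not shown.

```shell
# Invented sample sequences; the post's file was not shown.
cat > seqs.txt <<'EOF'
GGGAGCCCTATTTAGCCCTACCGT
AAAGCCCTAGG
EOF

# Greedy .* extends the match to the LAST AGCCCTA on the line, so
# only the characters after that occurrence are removed.
sed 's/\(.*AGCCCTA\).*/\1/' seqs.txt
```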
Hi,
I am unable to find the duplicates in a file based on the 1st, 2nd, 4th and 5th columns, and then remove those duplicates from the same file.
Source filename: Filename.csv
"1","ccc","information","5000","temp","concept","new"
"1","ddd","information","6000","temp","concept","new"... (2 Replies)
Hi all,
I am trying to extract the values (the text between the XML tags) based on the Order Number.
Here is the sample input:
<?xml version="1.0" encoding="UTF-8"?>
<NJCustomer>
<Header>
<MessageIdentifier>Y504173382</MessageIdentifier>
... (13 Replies)
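The robust tool for this kind of extraction is an XPath query (e.g. via xmllint), but for simple one-tag-per-line XML like the sample, a sed sketch can pull the text between a named pair of tags. The tag name below comes from the sample; the Order Number selection logic from the truncated post is not reproduced here.

```shell
cat > input.xml <<'EOF'
<?xml version="1.0" encoding="UTF-8"?>
<NJCustomer>
  <Header>
    <MessageIdentifier>Y504173382</MessageIdentifier>
  </Header>
</NJCustomer>
EOF

# Print only the text between the tags; works when each open/close
# tag pair sits on a single line.
sed -n 's:.*<MessageIdentifier>\(.*\)</MessageIdentifier>.*:\1:p' input.xml
```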
Given a file such as this I need to remove the duplicates.
00060011 PAUL BOWSTEIN ad_waq3_921_20100826_010517.txt
00060011 PAUL BOWSTEIN ad_waq3_921_20100827_010528.txt
0624-01 RUT CORPORATION ad_sade3_10_20100827_010528.txt
0624-01 RUT CORPORATION ... (13 Replies)
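If the leading ID field identifies a record, awk '!seen[$1]++' keeps just the first line per ID. The key choice is an assumption here, since the post does not say which fields define a duplicate.

```shell
cat > report.txt <<'EOF'
00060011 PAUL BOWSTEIN ad_waq3_921_20100826_010517.txt
00060011 PAUL BOWSTEIN ad_waq3_921_20100827_010528.txt
0624-01 RUT CORPORATION ad_sade3_10_20100827_010528.txt
EOF

# Keep only the first line seen for each value of the first field.
awk '!seen[$1]++' report.txt
```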
Hi Unix gurus,
I am trying to remove files based on the MMDDYYYY in the physical name, so that the directory always keeps the 3 most recent files by MMDDYYYY. "HHMM" is just a dummy here; you won't have two files with different HHMM on the same day.
For example in a... (4 Replies)
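A sketch under an assumed naming scheme of prefix_MMDDYYYY_HHMM.ext (the real layout was truncated in the post): since MMDDYYYY does not sort chronologically as text, prefix each name with a rearranged YYYYMMDD key, sort newest first, and list everything beyond the newest three.

```shell
# Hypothetical filenames of the form prefix_MMDDYYYY_HHMM.ext; the
# real naming scheme was truncated in the post.
touch file_01052010_1200.txt file_02152010_1200.txt \
      file_03252010_1200.txt file_04012010_1200.txt file_05102010_1200.txt

# Prepend a sortable YYYYMMDD key, sort newest first, and print the
# names of every file beyond the newest three.
ls file_*.txt |
  sed 's/^\(.*_\)\([0-9]\{2\}\)\([0-9]\{2\}\)\([0-9]\{4\}\)_.*$/\4\2\3 &/' |
  sort -r |
  awk 'NR > 3 {print $2}'
```

Once the listing looks right, pipe it to xargs rm to actually delete the old files.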
I have unix file like below
>newuser
newuser
<hello
hello
newone
I want to find the unique values in the file (ignoring the leading < and >), so that the output should be:
>newuser
<hello
newone
Can anybody tell me what command produces this new file? (7 Replies)
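An awk sketch: strip a leading < or > from each line to form the comparison key, and print only the first line seen for each key, so the original prefix is preserved in the output.

```shell
cat > file.txt <<'EOF'
>newuser
newuser
<hello
hello
newone
EOF

# Use the line with any leading < or > removed as the dedup key;
# print only the first line seen for each key.
awk '{key = $0; sub(/^[<>]/, "", key)} !seen[key]++' file.txt
```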