07-25-2012
Request to check: remove duplicates only in first column
Hi all,
I have an input file like this:
Quote:
machr fgf djfh dfhdj
machr fdj hdf hfdshf
machr dfg
nachr fdjk
nachr usd
nachr yeuio
Now I have to remove duplicate values only in the first column; the remaining columns must stay unchanged. The output would be:
Quote:
machr fgf djfh dfhdj
fdj hdf hfdshf
dfg
nachr fdjk
usd
yeuio
Please let me know how to script this.
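One possible approach is sketched below with awk. It assumes the fields are separated by spaces or tabs and that a first-column value should be dropped on every occurrence after its first, whether or not the repeats are adjacent. The function name `blank_repeated_first_column` is made up for illustration, and the sample data is the input quoted above.

```shell
# Hypothetical helper: drop the first field on every line whose
# first-column value has already appeared earlier in the input.
# Reads the files given as arguments, or stdin if none are given.
blank_repeated_first_column() {
    awk '
        # seen[$1]++ is 0 (false) the first time a value appears,
        # and non-zero on every later occurrence.
        seen[$1]++ {
            # Remove leading blanks, the first field, and the blanks after it.
            sub(/^[ \t]*[^ \t]+[ \t]*/, "")
        }
        1   # print every line, modified or not
    ' "$@"
}

# Usage with the sample input from the post:
printf '%s\n' \
    'machr fgf djfh dfhdj' \
    'machr fdj hdf hfdshf' \
    'machr dfg' \
    'nachr fdjk' \
    'nachr usd' \
    'nachr yeuio' |
    blank_repeated_first_column
```

The remaining columns are passed through untouched because only the leading field is stripped with `sub()` and no fields are reassigned. If the repeated value should be replaced by padding rather than removed, the replacement string in `sub()` could be changed to spaces of a suitable width.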