09-26-2008
Remove duplicate rows of a file based on a value of a column
Hi,
I am processing a file and would like to delete duplicate records, as determined by the value of one of its columns, e.g.
COL1 COL2 COL3
A 1234 1234
B 3k32 2322
C Xk32 TTT
A NEW XX22
B 3k32 2322
I want the file to contain no duplicate COL1 values, i.e. the file should contain only the following:
COL1 COL2 COL3
A 1234 1234
B 3k32 2322
C Xk32 TTT
The later records with a duplicate COL1 value were deleted, and the first record for each value was kept.
Does anybody have suggestions on how to do this?
Thank you.
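One common way to get this result is awk's associative-array idiom. The following is only a minimal sketch, assuming the columns are separated by whitespace, that the first record for each COL1 value is the one to keep, and that the file names input.txt and output.txt (both hypothetical) are acceptable:

```shell
# Sample data from the post, written to a hypothetical file name.
cat > input.txt <<'EOF'
COL1 COL2 COL3
A 1234 1234
B 3k32 2322
C Xk32 TTT
A NEW XX22
B 3k32 2322
EOF

# seen[$1]++ evaluates to 0 the first time a COL1 value appears,
# so !seen[$1]++ is true only for the first row carrying that value.
# awk prints a line whenever the pattern is true.
awk '!seen[$1]++' input.txt > output.txt

cat output.txt
```

The header line is kept because its first field, COL1, also appears only once. To key on a different column, replace $1 with that field number; for a delimiter other than whitespace, set it with awk's -F option.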