Remove duplicates according to their frequency in column
Hi all,
I have a huge tab-delimited file with the following format, and I want to remove the duplicates according to their frequency based on Column2 and Column3.
In this case, the result should be:
because user1 with access1 occurs twice. Moreover, if the original list contains the following entry:
The result should be
because user1 with access1 and user2 with access2 occur twice, so the smaller numbers of Column6 and Column7 should be taken into consideration.
Thanks in advance for your time and consideration.
Hi,
How can I write the duplicate records to another file? A record counts as a duplicate based on a key column that starts at position 2 and is 11 characters long.
The file is a fixed width file.
Example record:
DTYU12333567opert tjhi kkklTRG9012
The data in bold is the key on which... (1 Reply)
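One way to approach the fixed-width question above is sketched here with awk. It assumes the key is the 11 characters starting at position 2 of each line, that the first record seen for a key is kept, and that every later record with the same key is the "duplicate" to be written elsewhere. The function name `split_dups` and its three file arguments are hypothetical, not taken from the thread.

```shell
# split_dups INPUT UNIQUE_OUT DUP_OUT   (hypothetical helper)
# The first record for each key goes to UNIQUE_OUT; every later record
# with the same key goes to DUP_OUT.
split_dups() {
    awk -v uniq="$2" -v dups="$3" '
        {
            key = substr($0, 2, 11)   # 11 characters starting at position 2
            if (seen[key]++)          # key was already seen: a duplicate
                print > dups
            else
                print > uniq
        }
    ' "$1"
}
```

Note that the duplicates file is only created when at least one duplicate is found. If every record sharing a key should go to the duplicates file (including the first), the file would need to be read twice; this sketch does not do that.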
Given a file such as this I need to remove the duplicates.
00060011 PAUL BOWSTEIN ad_waq3_921_20100826_010517.txt
00060011 PAUL BOWSTEIN ad_waq3_921_20100827_010528.txt
0624-01 RUT CORPORATION ad_sade3_10_20100827_010528.txt
0624-01 RUT CORPORATION ... (13 Replies)
Hello,
I am new to shell scripting. I have a huge file with multiple columns, for example the 5 columns below:
HWUSI-EAS000_29:1:105 + chr5 76654650 AATTGGAA HHHHG
HWUSI-EAS000_29:1:106 + chr5 76654650 AATTGGAA B@HYL
HWUSI-EAS000_29:1:108 + ... (4 Replies)
Hi all,
I have an input file like this
Now I have to remove duplicates only in the first column, leaving the second and third columns unchanged, so that the output would be
Please let me know how to script this. (20 Replies)
Hi all
I have the following kind of input file:
ESR1 PA156 leflunomide PA450192 leflunomide
CHST3 PA26503 docetaxel Pa4586; thalidomide Pa34958; decetaxel docetaxel docetaxel
I want to remove duplicates and I want to separate anything before and after PAxxxx entry into columns or... (1 Reply)
Hi Experts ,
We have a CDC file from which we need the latest record for each combination of the key columns.
The key columns are CDC_FLAG and SRC_PMTN_I,
and the latest record is determined by CDC_PRCS_TS.
Can we do it with a single awk command?
Please help.... (3 Replies)
Hi, I have tab-delimited data similar to the following:
dot is-big 2
dot is-round 3
dot is-gray 4
cat is-big 3
hot in-summer 5
I want to count the frequency of each individual "unique" value in the 1st column. Thus, the desired output would be as follows:
dot 3
cat 1
hot 1
is... (5 Replies)
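The count described in this post can be sketched with a short awk program. It assumes tab-separated input, prints the values in order of first appearance (which matches the desired output shown), and separates value and count with a tab. The function name `count_first_col` is hypothetical.

```shell
# count_first_col [FILE...]   (hypothetical helper; reads stdin if no file)
# Prints each distinct first-column value with the number of lines that
# carry it, in order of first appearance.
count_first_col() {
    awk -F'\t' '
        !($1 in count) { order[++n] = $1 }   # remember first-seen order
        { count[$1]++ }                      # tally every line
        END {
            for (i = 1; i <= n; i++)
                print order[i] "\t" count[order[i]]
        }
    ' "$@"
}
```

If output order does not matter, `cut -f1 file | sort | uniq -c` is a common alternative, though it prints the count first and sorts the values.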
I have a file with the following format:
fields separated by "|"
title1|something class|long...content1|keys
title2|somhing class|log...content1|kes
title1|sothing class|lon...content1|kes
title3|shing cls|log...content1|ks
I want to remove all duplicates with the same "title field"(the... (3 Replies)
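A minimal sketch for this case, assuming the title is the first "|"-separated field and that the first line seen for each title is the one to keep. The post is truncated, so which duplicate should survive is an assumption here. The function name `dedupe_by_title` is hypothetical.

```shell
# dedupe_by_title [FILE...]   (hypothetical helper; reads stdin if no file)
# Keeps only the first line seen for each value of field 1.
dedupe_by_title() {
    # seen[$1]++ is 0 (false) the first time a title appears, so the
    # negation is true and awk prints the line; later repeats are skipped.
    awk -F'|' '!seen[$1]++' "$@"
}
```

Applied to the four sample lines, this would keep the first title1 line along with the title2 and title3 lines, and drop the second title1 line.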
Hi Experts,
Please bear with me, I need help.
I am learning awk and am stuck on one issue.
First point: I want to sum the values in columns 7, 9, 11, 13 and 15 for rows whose value in column 5 is duplicated. No action is needed for rows whose value in column 5 is unique.
Second point : For... (1 Reply)
Discussion started by: as7951
libbash
LIBBASH(7) libbash Manual LIBBASH(7)
NAME
libbash -- A bash shared libraries package.
DESCRIPTION
libbash is a package that enables bash dynamic-like shared libraries. In practice, it is a tool for managing bash scripts whose functions you may
want to load and use in scripts of your own.
It contains a 'dynamic loader' for the shared libraries ( ldbash(1)), a configuration tool (ldbashconfig(8)), and some libraries.
Using ldbash(1) you are able to load loadable bash libraries, such as getopts(1) and hashstash(1). A bash shared library that can be loaded
using ldbash(1) must meet 4 requirements:
1. It must be installed in $LIBBASH_PREFIX/lib/bash (default is /usr/lib/bash).
2. It must contain a line that begins with '#EXPORT='. That line will contain (after the '=') a list of functions that the library
exports. I.e. all the functions that will be usable after loading that library will be listed in that line.
3. It must contain a line that begins with '#REQUIRE='. That line will contain (after the '=') a list of bash libraries that are
required for our library. I.e. every bash library that is in use in our bash library must be listed there.
4. The library must be listed (For more information, see ldbashconfig(8)).
Basic guidelines for writing a library of your own:
1. Be aware that your library will actually be sourced. So, basically, it should contain (i.e. define) only functions.
2. Try to declare all variables intended for internal use as local.
3. Global variables and functions that are intended for internal use (i.e. are not listed in '#EXPORT=') should begin with:
__<library_name>_
For example, the internal function myfoosort of the hashstash library should be named
__hashstash_myfoosort
This helps to avoid conflicts in global name space when using libraries that come from different vendors.
4. See html manual for full version of this guide.
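The requirements and guidelines above can be illustrated with a small library file. This is only a sketch: the library name `mylib` and both function names are hypothetical, and leaving the '#REQUIRE=' list empty for a library with no dependencies is an assumption. Requirements 1 and 4 (installing the file under $LIBBASH_PREFIX/lib/bash and listing it, see ldbashconfig(8)) are not shown.

```shell
#EXPORT=mylib_shout
#REQUIRE=

# Hypothetical library "mylib": the file defines only functions, because
# it will be sourced (guideline 1).

# Internal helper: not listed in #EXPORT=, so it carries the
# __<library_name>_ prefix (guideline 3).
__mylib_upper() {
    printf '%s\n' "$1" | tr '[:lower:]' '[:upper:]'
}

# Exported function: the only name listed in the #EXPORT= line above.
mylib_shout() {
    local text="$1"          # internal variable declared local (guideline 2)
    __mylib_upper "$text!"
}
```

Once the file is sourced, calling `mylib_shout hello` prints the argument in upper case followed by an exclamation mark.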
AUTHORS
Hai Zaar <haizaar@haizaar.com>
Gil Ran <ril@ran4.net>
SEE ALSO
ldbash(1), ldbashconfig(8), getopts(1), hashstash(1), colors(1), messages(1), urlcoding(1), locks(1)