Sponsored Content
Top Forums UNIX for Beginners Questions & Answers Remove duplicates in a dataframe (table) keeping all the different cells of just one of the columns Post 303032289 by RavinderSingh13 on Thursday 14th of March 2019 11:18:06 PM
Old 03-15-2019
Hello pedro88,

Could you please try following too, I am reading Input_file 2 times here and output will be in same sequence in which $1 appears to be in Input_file.

Code:
awk '
FNR==NR{
  a[$1]=a[$1]>$2?a[$1]:$2
  b[$1]=a[$1]>$2?b[$1]?b[$1]:$0:$0
  next
}
($1 in a){
  print b[$1]
  delete a[$1]
}
'   Input_file  Input_file

Output will be as follows.

Code:
A                           B        C                  D
apple                  25        bbb         acgacgacgcga
banana               36        ffff         cacacgtgtgct
cherry                 36        ddd        actgctgtcgagtag
berry                   55        eee        gactgatgctgtcgtc

Thanks,
R. Singh
 

10 More Discussions You Might Find Interesting

1. Shell Programming and Scripting

Deleting table cells in a script

I'd like to use sed or awk to do this but I'm weak on both along with RE. Looking for a way with sed or awk to count for the 7th table data within a table row and if the condition is met to delete "<td>and everything in between </td>". Since the table header start on a specific line each time, that... (15 Replies)
Discussion started by: phpfreak
15 Replies

2. Shell Programming and Scripting

Remove duplicates based on the two key columns

Hi All, I needs to fetch unique records based on a keycolumn(ie., first column1) and also I needs to get the records which are having max value on column2 in sorted manner... and duplicates have to store in another output file. Input : Input.txt 1234,0,x 1234,1,y 5678,10,z 9999,10,k... (7 Replies)
Discussion started by: kmsekhar
7 Replies

3. Shell Programming and Scripting

Search based on 1,2,4,5 columns and remove duplicates in the same file.

Hi, I am unable to search the duplicates in a file based on the 1st,2nd,4th,5th columns in a file and also remove the duplicates in the same file. Source filename: Filename.csv "1","ccc","information","5000","temp","concept","new" "1","ddd","information","6000","temp","concept","new"... (2 Replies)
Discussion started by: onesuri
2 Replies

4. UNIX for Dummies Questions & Answers

Two files; if cells match then copy over other columns

My current issue is dealing with two space delimited files. The first file has column 1 as the sample ID's, then columns 2 - n as the observations. The second file has column 1 as the sample ID's, column 2 as the mother ID's, column 3 as the father ID's, column 4 as the gender, and column 5... (3 Replies)
Discussion started by: Renyulb28
3 Replies

5. UNIX Desktop Questions & Answers

Using grep to remove cells instead of lines

I would like to use grep to remove certain strings from a text file but I can't use the grep -v option because it removes the whole line that includes the string whereas I just want to remove the string. How do I go about doing that? My input file: Magmas CEU rs12542019 CPNE1 RBM12 CEU... (1 Reply)
Discussion started by: evelibertine
1 Replies

6. Shell Programming and Scripting

CSV with commas in field values, remove duplicates, cut columns

Hi Description of input file I have: ------------------------- 1) CSV with double quotes for string fields. 2) Some string fields have Comma as part of field value. 3) Have Duplicate lines 4) Have 200 columns/fields 5) File size is more than 10GB Description of output file I need:... (4 Replies)
Discussion started by: krishnix
4 Replies

7. Shell Programming and Scripting

Remove Duplicates on multiple Key Columns and get the Latest Record from Date/Time Column

Hi Experts , we have a CDC file where we need to get the latest record of the Key columns Key Columns will be CDC_FLAG and SRC_PMTN_I and fetch the latest record from the CDC_PRCS_TS Can we do it with a single awk command. Please help.... (3 Replies)
Discussion started by: vijaykodukula
3 Replies

8. Shell Programming and Scripting

Remove duplicates by keeping the order intact

Hello friends, I have a file with duplicate lines. I could eliminate duplicate lines by running sort <file> |uniq >uniq_file and it works fine BUT it changes the order of the entries as it we did "sort". I need to remove duplicates and also need to keep the order/sequence of entries. I... (1 Reply)
Discussion started by: magnus29
1 Replies

9. UNIX for Beginners Questions & Answers

Merge cells in all rows of a HTML table dynamically.

Hello All, I have visited many pages in Unix.com and could find out one solution for merging the HTML cells in the 1st row. (Unable to post the complete URL as I should not as per website rules). But, however I try, I couldn't achieve this merging to happen for all other rows of HTML... (17 Replies)
Discussion started by: Mounika
17 Replies

10. UNIX for Beginners Questions & Answers

Sort and remove duplicates in directory based on first 5 columns:

I have /tmp dir with filename as: 010020001_S-FOR-Sort-SYEXC_20160229_2212101.marker 010020001_S-FOR-Sort-SYEXC_20160229_2212102.marker 010020001-S-XOR-Sort-SYEXC_20160229_2212104.marker 010020001-S-XOR-Sort-SYEXC_20160229_2212105.marker 010020001_S-ZOR-Sort-SYEXC_20160229_2212106.marker... (4 Replies)
Discussion started by: gnnsprapa
4 Replies
Devel::Refcount(3pm)					User Contributed Perl Documentation				      Devel::Refcount(3pm)

NAME
"Devel::Refcount" - obtain the REFCNT value of a referent SYNOPSIS
use Devel::Refcount qw( refcount ); my $anon = []; print "Anon ARRAY $anon has " . refcount($anon) . " reference "; my $otherref = $anon; print "Anon ARRAY $anon now has " . refcount($anon) . " references "; DESCRIPTION
This module provides a single function which obtains the reference count of the object being pointed to by the passed reference value. FUNCTIONS
$count = refcount($ref) Returns the reference count of the object being pointed to by $ref. COMPARISON WITH SvREFCNT This function differs from "Devel::Peek::SvREFCNT" in that SvREFCNT() gives the reference count of the SV object itself that it is passed, whereas refcount() gives the count of the object being pointed to. This allows it to give the count of any referent (i.e. ARRAY, HASH, CODE, GLOB and Regexp types) as well. Consider the following example program: use Devel::Peek qw( SvREFCNT ); use Devel::Refcount qw( refcount ); sub printcount { my $name = shift; printf "%30s has SvREFCNT=%d, refcount=%d ", $name, SvREFCNT($_[0]), refcount($_[0]); } my $var = []; printcount 'Initially, $var', $var; my $othervar = $var; printcount 'Before CODE ref, $var', $var; printcount '$othervar', $othervar; my $code = sub { undef $var }; printcount 'After CODE ref, $var', $var; printcount '$othervar', $othervar; This produces the output Initially, $var has SvREFCNT=1, refcount=1 Before CODE ref, $var has SvREFCNT=1, refcount=2 $othervar has SvREFCNT=1, refcount=2 After CODE ref, $var has SvREFCNT=2, refcount=2 $othervar has SvREFCNT=1, refcount=2 Here, we see that SvREFCNT() counts the number of references to the SV object passed in as the scalar value - the $var or $othervar respectively, whereas refcount() counts the number of reference values that point to the referent object - the anonymous ARRAY in this case. Before the CODE reference is constructed, both $var and $othervar have SvREFCNT() of 1, as they exist only in the current lexical pad. The anonymous ARRAY has a refcount() of 2, because both $var and $othervar store a reference to it. After the CODE reference is constructed, the $var variable now has an SvREFCNT() of 2, because it also appears in the lexical pad for the new anonymous CODE block. PURE-PERL FALLBACK An XS implementation of this function is provided, and is used by default. If the XS library cannot be loaded, a fallback implementation in pure perl using the "B" module is used instead. This will behave identically, but is much slower. Rate pp xs pp 225985/s -- -66% xs 669570/s 196% -- SEE ALSO
o Test::Refcount - assert reference counts on objects AUTHOR
Paul Evans <leonerd@leonerd.org.uk> perl v5.14.2 2011-11-15 Devel::Refcount(3pm)
All times are GMT -4. The time now is 03:56 AM.
Unix & Linux Forums Content Copyright 1993-2022. All Rights Reserved.
Privacy Policy