Remove duplicates according to their frequency in column
Post 302958067 by corfuitl on Monday 19th of October 2015 06:16:22 AM
Hi,

Thank you for your reply. Lines 4 and 5 are identical, so it is fine if only line 4 is extracted.

I am not familiar with awk. I found the following command in a similar post, but it does not seem to work in my case.

Code:
awk '(NR==1);a[$2]<$3||d[$2]<$4{a[$2]=$3;d[$2]=$4;b[$2]=$0};END{for(i in b)if(b[i] !~ /ID/){print b[i]}}'
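
For readers who are also new to awk, here is the same one-liner unpacked into a commented form. It is only a reading aid, not a fix: the wrapper name keep_max_per_key and the sample data are hypothetical, and the layout it assumes (a header line containing "ID", the grouping key in column 2, numeric values in columns 3 and 4) is inferred from the command itself, not from the actual input file.

```shell
# Expanded form of the one-liner above; the awk logic is unchanged.
# keep_max_per_key is a hypothetical wrapper name used for illustration.
keep_max_per_key() {
    awk '
        NR == 1                               # print the first (header) line as-is
        a[$2] < $3 || d[$2] < $4 {            # column 3 OR column 4 beats the stored value for this key
            a[$2] = $3; d[$2] = $4            # remember both values from this line
            b[$2] = $0                        # remember the whole line for this key
        }
        END {
            for (i in b)                      # iteration order is unspecified in awk
                if (b[i] !~ /ID/) print b[i]  # skip any stored line containing "ID" (the header)
        }
    ' "$@"
}

# Made-up sample, one key only:
printf 'ID key v1 v2\nr1 A 1 5\nr2 A 3 2\nr3 A 2 9\n' | keep_max_per_key
# prints:
# ID key v1 v2
# r3 A 2 9
```

Because the two tests are joined with ||, the line kept for a key is the last one that raised either column over the previously stored line, which is not necessarily the line with the largest column 3. In the sample, r3 is kept even though r2 has the larger v1. That may be one reason the command behaves unexpectedly on other data.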

Thanks
 
