Count and keep duplicates in Column
Post 302969510 by RudiC on Wednesday 23rd of March 2016 12:05:57 PM
Please use code tags as required by forum rules!

I guess the second column header should go with the column, no? Having the field separator inside a field (as in "Column A") doesn't really help processing. Try:
Code:
awk 'NR == FNR {T[$1]++; next} FNR == 1 {print $1, $2, "CNT", $3, $4; next} {print $1, T[$1], $2}' file file
Column A CNT Column B
Apples 2 1900
Apples 2 1901
Pears 2 1902
Pears 2 1903
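For anyone reading along, here is a minimal sketch of the same two-pass idea, spelled out with comments. The sample data, the extra "Plums" row, the function name count_keep_dups and the `cnt[$1] > 1` filter are my assumptions for illustration, not part of the command above; the filter is only useful if unique keys should be dropped from the output. Note that the header "Column A Column B" splits into four fields under the default separator, which is why the header line is rebuilt from $1 through $4.

```shell
#!/bin/sh
# Sketch only: the sample file contents below are made up for illustration.
tmp=$(mktemp)
cat > "$tmp" <<'EOF'
Column A Column B
Apples 1900
Apples 1901
Pears 1902
Pears 1903
Plums 1904
EOF

count_keep_dups() {
    # $1 = input file. It is read twice, so it must be a regular file, not a pipe.
    awk '
        # Pass 1 (NR == FNR only while reading the first operand): count each key.
        NR == FNR { if (FNR > 1) cnt[$1]++; next }
        # Pass 2, header line: four whitespace-separated words, CNT goes in the middle.
        FNR == 1  { print $1, $2, "CNT", $3, $4; next }
        # Pass 2, data lines: keep only keys seen more than once (assumed requirement).
        cnt[$1] > 1 { print $1, cnt[$1], $2 }
    ' "$1" "$1"
}

result=$(count_keep_dups "$tmp")
printf '%s\n' "$result"
rm -f "$tmp"
```

Dropping the `cnt[$1] > 1` pattern (leaving a bare action block) prints every row with its count, which is what the one-liner above does.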


Last edited by RudiC; 03-24-2016 at 08:11 AM.. Reason: typo
RTBL(3)                  BSD Library Functions Manual                  RTBL(3)

NAME
     rtbl_create, rtbl_destroy, rtbl_set_flags, rtbl_get_flags,
     rtbl_set_prefix, rtbl_set_separator, rtbl_set_column_prefix,
     rtbl_set_column_affix_by_id, rtbl_add_column, rtbl_add_column_by_id,
     rtbl_add_column_entry, rtbl_add_column_entry_by_id, rtbl_new_row,
     rtbl_format -- format data in simple tables

LIBRARY
     The roken library (libroken, -lroken)

SYNOPSIS
     #include <rtbl.h>

     int rtbl_add_column(rtbl_t table, const char *column_name, unsigned int flags);
     int rtbl_add_column_by_id(rtbl_t table, unsigned int column_id, const char *column_header, unsigned int flags);
     int rtbl_add_column_entry(rtbl_t table, const char *column_name, const char *cell_entry);
     int rtbl_add_column_entry_by_id(rtbl_t table, unsigned int column_id, const char *cell_entry);
     rtbl_t rtbl_create(void);
     void rtbl_destroy(rtbl_t table);
     int rtbl_new_row(rtbl_t table);
     int rtbl_set_column_affix_by_id(rtbl_t table, unsigned int column_id, const char *prefix, const char *suffix);
     int rtbl_set_column_prefix(rtbl_t table, const char *column_name, const char *prefix);
     unsigned int rtbl_get_flags(rtbl_t table);
     void rtbl_set_flags(rtbl_t table, unsigned int flags);
     int rtbl_set_prefix(rtbl_t table, const char *prefix);
     int rtbl_set_separator(rtbl_t table, const char *separator);
     int rtbl_format(rtbl_t table, FILE *file);

DESCRIPTION
     This set of functions assembles a simple table consisting of rows and
     columns, allowing it to be printed with certain options. Typical use
     would be output from tools such as ls(1) or netstat(1), where you have a
     fixed number of columns, but don't know the column widths beforehand.

     A table is created with rtbl_create() and destroyed with rtbl_destroy().

     Global flags on the table are set with rtbl_set_flags() and retrieved
     with rtbl_get_flags(). At present the only defined flag is
     RTBL_HEADER_STYLE_NONE, which suppresses printing the header.

     Before adding data to the table, one or more columns need to be created.
     This would normally be done with rtbl_add_column_by_id(): column_id is
     any number of your choice (it's used only to identify columns),
     column_header is the header to print at the top of the column, and flags
     are flags specific to this column. Currently the only defined flag is
     RTBL_ALIGN_RIGHT, aligning column entries to the right. Columns are
     printed in the order they are added.

     There's also a way to add columns by column name with rtbl_add_column(),
     but this is less flexible (you need unique header names), and is
     considered deprecated.

     To add data to a column you use rtbl_add_column_entry_by_id(), where the
     column_id is the same as when the column was added (adding data to a
     non-existent column is undefined), and cell_entry is whatever string you
     wish to include in that cell. It should not include newlines. For
     columns added with rtbl_add_column() you must use
     rtbl_add_column_entry() instead.

     rtbl_new_row() fills all columns with blank entries until they all have
     the same number of rows.

     Each column can have a separate prefix and suffix, set with
     rtbl_set_column_affix_by_id(); rtbl_set_column_prefix() allows setting
     the prefix only, by column name. In addition to this, columns may be
     separated by a string set with rtbl_set_separator() (by default columns
     are not separated by anything).

     The finished table is printed to file with rtbl_format().

EXAMPLES
     This program:

         #include <stdio.h>
         #include <rtbl.h>

         int
         main(int argc, char **argv)
         {
             rtbl_t table;

             table = rtbl_create();
             rtbl_set_separator(table, " ");
             rtbl_add_column_by_id(table, 0, "Column A", 0);
             rtbl_add_column_by_id(table, 1, "Column B", RTBL_ALIGN_RIGHT);
             rtbl_add_column_by_id(table, 2, "Column C", 0);
             rtbl_add_column_entry_by_id(table, 0, "A-1");
             rtbl_add_column_entry_by_id(table, 0, "A-2");
             rtbl_add_column_entry_by_id(table, 0, "A-3");
             rtbl_add_column_entry_by_id(table, 1, "B-1");
             rtbl_add_column_entry_by_id(table, 2, "C-1");
             rtbl_add_column_entry_by_id(table, 2, "C-2");
             rtbl_add_column_entry_by_id(table, 1, "B-2");
             rtbl_add_column_entry_by_id(table, 1, "B-3");
             rtbl_add_column_entry_by_id(table, 2, "C-3");
             rtbl_add_column_entry_by_id(table, 0, "A-4");
             rtbl_new_row(table);
             rtbl_add_column_entry_by_id(table, 1, "B-4");
             rtbl_new_row(table);
             rtbl_add_column_entry_by_id(table, 2, "C-4");
             rtbl_new_row(table);
             rtbl_format(table, stdout);
             rtbl_destroy(table);
             return 0;
         }

     will output the following:

         Column A Column B Column C
         A-1           B-1 C-1
         A-2           B-2 C-2
         A-3           B-3 C-3
         A-4
                       B-4
                           C-4

HEIMDAL                          June 26, 2004                         HEIMDAL