Awk: count unique elements in a field and sum their occurrences across the entire file Post: 303016007

10 More Discussions You Might Find Interesting

1. Shell Programming and Scripting

Getting Sum, Count and Distinct Count of a file

Hi all, this is a UNIX question. I have a large flat file with millions of records:

    col1|col2|col3
    1|a|b
    2|c|d
    3|e|f
    3|g|h
    footer****

I am supposed to calculate the sum of col1 (1+2+3+3=9), the count of col1 (1,2,3,3=4), and the distinct count of col1 (1,2,3=3). I would like it if you avoid...
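A minimal sketch of how all three figures could be collected in a single awk pass. It assumes the first line is a header, the trailer line starts with "footer", and the delimiter is `|`; the function name `col1_stats` is made up for illustration and does not come from the thread.

```shell
# Sketch: sum, count and distinct count of col1 in one pass.
# Assumes line 1 is a header and the trailer starts with "footer".
col1_stats() {
  awk -F'|' '
    NR == 1 || /^footer/ { next }      # skip the assumed header and footer
    {
      sum += $1                        # running total of col1
      count++                          # number of data records
      if (!seen[$1]++) distinct++      # first time this value appears
    }
    END { printf "sum=%d count=%d distinct=%d\n", sum, count, distinct }
  '
}

printf '%s\n' 'col1|col2|col3' '1|a|b' '2|c|d' '3|e|f' '3|g|h' 'footer****' | col1_stats
# prints: sum=9 count=4 distinct=3
```

The `seen` array holds one entry per distinct value, so memory grows with the number of distinct keys rather than with the number of records.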

2. UNIX for Dummies Questions & Answers

How to search unique occurrence in a file?

Hi, I have to search and count unique occurrences of DE numbers in bold below in a file which has content like below. Proc Tran F-BUY Item Tkey Q5JV Item Tsid JTIZ9 Item Tdat 20091001 Item Tset 20091001 Item Tbkr 5 Item Tshs 2 Item Tprc 897.0 Item Tcom 2000.0 Item Tcm1 20091001...

3. Shell Programming and Scripting

Printing entire field, if at least one row is matching by AWK

Dear all, I have been trying to print an entire field, if the first line of the field is matching. For example, my input looks something like this. aaa ddd zzz 123 987 126 24 0.650 985 354 9864 0.32 0.333 4324 000 I am looking for a pattern,...

4. Shell Programming and Scripting

awk and count sum ?

I have an input.txt file which has 3 fields separated by commas: place, os and timediff in seconds.

    tampa,win7, 2575
    tampa,win7, 157619
    tampa,win7, 3352
    dallas,vista,604799
    greenbay,winxp, 14400
    greenbay,win7 , 518400
    san jose,winxp, 228121
    san jose,winxp, 70853
    san jose,winxp, 193514...

5. Shell Programming and Scripting

awk if statement not printing entire field

I have an input that looks like this: chr1 mm9_knownGene utr3 3204563 3206102 0 - . gene_id "Xkr4"; transcript_id "uc007aeu.1"; chr1 mm9_knownGene utr3 4280927 4283061 0 - . gene_id "Rp1"; transcript_id "uc007aew.1"; chr1 mm9_knownGene ...

6. Shell Programming and Scripting

awk sum entire string

Hi, I am trying to carry out a sum on a file (totals.txt). The file looks like: So far I have this command, which returns 20610. However, I want it to return 000000206100. Any help would be great, thanks!

7. Shell Programming and Scripting

Looping through entire directory and count unique values

Hello, I'm a complete newbie to coding, please help with this problem. I have multiple files in a directory, and I have to loop through the contents of each file and extract the number of unique isoforms in that file. Each file is tab delimited and only the line with the first parent (column 3)...

8. Shell Programming and Scripting

awk to count using each unique value

I'm looking for an awk script that will take the unique values in column 5, then print and count the unique values in column 6.

    CA001011500 11111 11111 -9999 201301 AAA
    CA001012040 11111 11111 -9999 201301 AAA
    CA001012573 11111 11111 -9999 201301 BBB
    CA001012710 11111 11111 -9999 201301...
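One possible reading of this request is "for each value in column 5, count how often each value in column 6 occurs". The sketch below follows that reading and uses only the three complete sample rows from the post; the function name `pair_counts` is hypothetical.

```shell
# Sketch: count each (column 5, column 6) pair.
# Fields are whitespace-separated, so a space is a safe key separator.
pair_counts() {
  awk '
    { count[$5 " " $6]++ }                        # one counter per pair
    END { for (k in count) print k, count[k] }    # iteration order is unspecified
  ' | sort                                        # sort for a stable listing
}

printf '%s\n' \
  'CA001011500 11111 11111 -9999 201301 AAA' \
  'CA001012040 11111 11111 -9999 201301 AAA' \
  'CA001012573 11111 11111 -9999 201301 BBB' | pair_counts
# prints:
# 201301 AAA 2
# 201301 BBB 1
```

The pipe through `sort` is there because awk does not define the order in which `for (k in count)` visits the keys.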

9. Shell Programming and Scripting

Count of unique lines in field 4

When I use the below awk to count the unique lines in $4 for the input it seems to work. The answer is 3 because $4 is only unique 3 times in all the entries. However, when I use the same on actual data I get 56,536 and I know the answer should be 56,548. My question: is there a better way to...
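"Unique" in field 4 can mean either the number of distinct values or the number of values that occur exactly once. The two readings can give different totals, which may be one source of a mismatch like the one described. The sketch below reports both; the sample rows and the name `field4_counts` are invented for illustration and are not the poster's data.

```shell
# Sketch: report both readings of "unique" for field 4.
field4_counts() {
  awk '
    { n[$4]++ }                          # occurrences per value of field 4
    END {
      for (v in n) {
        distinct++                       # every key is one distinct value
        if (n[v] == 1) once++            # value seen exactly one time
      }
      printf "distinct=%d once=%d\n", distinct, once
    }
  '
}

# Hypothetical input: x appears twice, y and z once each.
printf '%s\n' 'a b c x' 'a b c y' 'a b c x' 'a b c z' | field4_counts
# prints: distinct=3 once=2
```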

10. UNIX for Beginners Questions & Answers

Use strings from nth field from one file to match strings in entire line in another file, awk

I cannot seem to get what should be a simple awk one-liner to work correctly and cannot figure out why. I would like to use patterns from a specific field in one file as regex to search for matching strings in the entire line ($0) of another file. I would like to output the lines of File2 which...
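A common two-file awk pattern for this kind of task is sketched below: read the patterns from the chosen field of the first file, then test every line of the second file against them. The helper name `match_lines`, its argument order, and the sample files are assumptions made for illustration. The post is truncated, so the exact output wanted is not known.

```shell
# match_lines PATTERN_FILE FIELD_NO TARGET_FILE   (hypothetical helper)
match_lines() {
  awk -v f="$2" '
    NR == FNR { if ($f != "") pat[$f] = 1; next }   # first file: collect patterns
    { for (p in pat) if ($0 ~ p) { print; next } }  # second file: print on first hit
  ' "$1" "$3"
}

tmpdir=$(mktemp -d)
printf '%s\n' 'id1 foo' 'id2 ba[rz]' > "$tmpdir/File1"
printf '%s\n' 'xx foo yy' 'nothing here' 'a baz b' > "$tmpdir/File2"

match_lines "$tmpdir/File1" 2 "$tmpdir/File2"
# prints the "xx foo yy" and "a baz b" lines
rm -rf "$tmpdir"
```

Two caveats apply to this sketch. The `NR == FNR` test misfires if the first file is empty, and each field value is treated as a regular expression, so metacharacters in the data change what matches. For literal substring matching, `index($0, p)` could replace `$0 ~ p`.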

locale::codes::langext

Locale::Codes::LangExt(3pm)				 Perl Programmers Reference Guide			       Locale::Codes::LangExt(3pm)

NAME

       Locale::Codes::LangExt - standard codes for language extension identification

SYNOPSIS

	  use Locale::Codes::LangExt;

	  $lext = code2langext('acm');		       # $lext gets 'Mesopotamian Arabic'
	  $code = langext2code('Mesopotamian Arabic'); # $code gets 'acm'

	  @codes   = all_langext_codes();
	  @names   = all_langext_names();

DESCRIPTION

       The "Locale::Codes::LangExt" module provides access to standard codes used for identifying language extensions, such as those defined in
       the IANA language registry.

       Most of the routines take an optional additional argument which specifies the code set to use. If not specified, the default IANA language
       registry codes will be used.

SUPPORTED CODE SETS

       There are several different code sets you can use for identifying language extensions. A code set may be specified using either a name, or
       a constant that is automatically exported by this module.

       For example, the following two calls are equivalent:

	  $lext = code2langext('acm','alpha');
	  $lext = code2langext('acm',LOCALE_LANGEXT_ALPHA);

       The codesets currently supported are:

       alpha
	   This is the set of three-letter (lowercase) codes from the IANA language registry, such as 'acm' for Mesopotamian Arabic.

	   This is the default code set.

ROUTINES

       code2langext ( CODE [,CODESET] )
       langext2code ( NAME [,CODESET] )
       langext_code2code ( CODE ,CODESET ,CODESET2 )
       all_langext_codes ( [CODESET] )
       all_langext_names ( [CODESET] )
       Locale::Codes::LangExt::rename_langext  ( CODE ,NEW_NAME [,CODESET] )
       Locale::Codes::LangExt::add_langext  ( CODE ,NAME [,CODESET] )
       Locale::Codes::LangExt::delete_langext  ( CODE [,CODESET] )
       Locale::Codes::LangExt::add_langext_alias  ( NAME ,NEW_NAME )
       Locale::Codes::LangExt::delete_langext_alias  ( NAME )
       Locale::Codes::LangExt::rename_langext_code  ( CODE ,NEW_CODE [,CODESET] )
       Locale::Codes::LangExt::add_langext_code_alias  ( CODE ,NEW_CODE [,CODESET] )
       Locale::Codes::LangExt::delete_langext_code_alias  ( CODE [,CODESET] )
	   These routines are all documented in the Locale::Codes::API man page.

SEE ALSO

       Locale::Codes
	   The Locale-Codes distribution.

       Locale::Codes::API
	   The list of functions supported by this module.

       http://www.iana.org/assignments/language-subtag-registry
	   The IANA language subtag registry.

AUTHOR

       See Locale::Codes for full author history.

       Currently maintained by Sullivan Beck (sbeck@cpan.org).

COPYRIGHT

	  Copyright (c) 2011-2013 Sullivan Beck

       This module is free software; you can redistribute it and/or modify it under the same terms as Perl itself.

perl v5.18.2							    2013-11-04					       Locale::Codes::LangExt(3pm)
