Speed : awk command to count the occurrences of fields from one file present in the other file Post: 302915633

9 More Discussions You Might Find Interesting

1. Shell Programming and Scripting

To find the count of records from tables present inside a file.

hi gurus, I am having a file containing a list of tables.i want to find the count of records inside thes tables. for this i have to connect into database and i have to put the count for all the tables inside another file i used the following loop once all the tablenames are inside the file. ...

2. UNIX for Dummies Questions & Answers

Search and Count Occurrences of Pattern in a File

I need to search and count the occurrences of a pattern in a file. The catch here is it's a pattern and not a word ( not necessarily delimited by spaces). For eg. if ABCD is the pattern I need to search and count, it can come in all flavors like (ABCD, ABCD), XYZ.ABCD=100, XYZ.ABCD>=500,...

3. Shell Programming and Scripting

Help with Unix and Awk to count number of occurrences

Hi, I have a file (movies.sh), this file contains list of movies such as I want to redirect the movies from movies.sh to file_to_process to allow me process the file with out losing anything. I have tried Movies.sh >> file_to_process But I want to add the row number to the data...

4. Shell Programming and Scripting

Count occurrences in awk

Hello, I have an output from GDB with many entries that looks like this 0x00007ffff7dece94 39 in dl-fini.c 0x00007ffff7dece97 39 in dl-fini.c 0x00007ffff7ab356c 50 in exit.c 0x00007ffff7aed9db in _IO_cleanup () at genops.c:1022 115 in dl-fini.c 0x00007ffff7decf7b in _dl_sort_fini (l=0x0,...

5. Shell Programming and Scripting

Looping over a file to count common fields from another file

Hi, I would like to know how can I get the number of rows in file1 that: - the 1st and 2nd field should be the same (text) - the 3rd field should be less or equal (numeric) when comparing to file2. So for each row of file1, I would like to have the number of rows in file2 that follow the...

6. Shell Programming and Scripting

awk Group By and count string occurrences

Hi Gurus, I'm scratching my head over and over and couldn't find the the right way to compose this AWK properly - PLEASE HELP :confused: Input: c,d,e,CLICK a,b,c,CLICK a,b,c,CONV c,d,e,CLICK a,b,c,CLICK a,b,c,CLICK a,b,c,CONV b,c,d,CLICK c,d,e,CLICK c,d,e,CLICK b,c,d,CONV...

7. Shell Programming and Scripting

Total record count of all the file present in a directory

Hi All , We need one help on the below requirement.We have multiple pipe delimited .txt file(around 100 .txt files) present on one directory.We need the total record count of all the files present in that directory without header.File format as below : ...

8. UNIX for Beginners Questions & Answers

awk or sed script to count number of occurrences and creating an average

Hi Friends , I am having one problem as stated file . Having an input CSV file as shown in the code U_TOP_LOGIC/U_HPB2/U_HBRIDGE2/i_core/i_paddr_reg_2_/Q,1,1,1,0,0,1,1,0,0,1,1,0,0,1,1,0,0,1,1,0,0,1,1,0,0,1,1,0,0,1,1,0,0,1,1,0,0,1,1,0,0,1,1,0,0,1,1,0,0,1,1,0,0,1,1,0,0,1,1,0,0,1,1,0,0,0,0...

9. UNIX for Beginners Questions & Answers

awk code to find difference in second file which is not present in first file .

Hi All, I want to find difference between two files and output only lines which are not present in second file .I am using awk and I am getting only the first difference but I want to get all the lines which are not present in file2 .Below is the code I am using . Please help to get the desired...

LEARN ABOUT DEBIAN

bup-margin

bup-margin(1)						      General Commands Manual						     bup-margin(1)

NAME

       bup-margin - figure out your deduplication safety margin

SYNOPSIS

       bup margin [options...]

DESCRIPTION

       bup margin  iterates  through  all  objects  in	your  bup repository, calculating the largest number of prefix bits shared between any two
       entries.  This number, n, identifies the longest subset of SHA-1 you could use and still encounter a collision between your object ids.

       For example, one system that was tested had a collection of 11 million objects (70 GB), and bup margin returned 45.  That  means  a  46-bit
       hash  would be sufficient to avoid all collisions among that set of objects; each object in that repository could be uniquely identified by
       its first 46 bits.

       The number of bits needed seems to increase by about 1 or 2 for every doubling of the number of objects.  Since SHA-1 hashes have 160 bits,
       that  leaves 115 bits of margin.  Of course, because SHA-1 hashes are essentially random, it's theoretically possible to use many more bits
       with far fewer objects.

       If you're paranoid about the possibility of SHA-1 collisions, you can monitor your repository by running bup margin occasionally to see	if
       you're getting dangerously close to 160 bits.

OPTIONS

       --predict
	      Guess  the offset into each index file where a particular object will appear, and report the maximum deviation of the correct answer
	      from the guess.  This is potentially useful for tuning an interpolation search algorithm.

       --ignore-midx
	      don't use .midx files, use only .idx files.  This is only really useful when used with --predict.

EXAMPLE

	      $ bup margin
	      Reading indexes: 100.00% (1612581/1612581), done.
	      40
	      40 matching prefix bits
	      1.94 bits per doubling
	      120 bits (61.86 doublings) remaining
	      4.19338e+18 times larger is possible

	      Everyone on earth could have 625878182 data sets
	      like yours, all in one repository, and we would
	      expect 1 object collision.

	      $ bup margin --predict
	      PackIdxList: using 1 index.
	      Reading indexes: 100.00% (1612581/1612581), done.
	      915 of 1612581 (0.057%)

SEE ALSO

       bup-midx(1), bup-save(1)

BUP

       Part of the bup(1) suite.

AUTHORS

       Avery Pennarun <apenwarr@gmail.com>.

Bup unknown-															     bup-margin(1)