Sponsored Content
Top Forums Shell Programming and Scripting Comparison and editing of files using awk.(And also a possible bug in awk for loop?) Post 302439185 by radoulov on Thursday 22nd of July 2010 04:37:57 AM
Old 07-22-2010
As far as the first question is concerned:
Code:
awk 'NR == FNR {
  for (i = 1; (i += 2) <= NF;) {
    r[$1, $(i - 1)] = $i 
    v[$1] = v[$1] ? v[$1] FS $(i - 1) : $(i - 1)
    }
  next    
  }
{ 
  n = split(v[$1], t)
  printf "%s ", $1
  for (i = 0; ++i <= n;) {
    if ($2 <= t[i] && t[i] <= $3) 
      printf "%s %s", t[i], (r[$1, t[i]] (i <= n ? FS : x))
      }
    print (($1, $2) in r) ? x : $2 FS r[$1, $2]
  }' file2  file1

As far as the second one is concerned: the floating arithmetic in computers works like that, it's not a bug (1.2 is stored as an approximation, it could in reality be 1.20001 or even 1.19998)).

Last edited by radoulov; 07-22-2010 at 03:35 PM.. Reason: Corrected.
This User Gave Thanks to radoulov For This Post:
 

10 More Discussions You Might Find Interesting

1. Shell Programming and Scripting

String Comparison between two files using awk

I have two files with field seperator as "~". File A: 12~13~14~15 File B: 22~22~32~11 i want to calculate the difference between two files and than calculate the percentage difference and output it to a new file. How do i do this using awk. Also please suggest GOOD awk tutorials. Thank... (7 Replies)
Discussion started by: rudoraj
7 Replies

2. Shell Programming and Scripting

Comparison of two files in awk

Hi, I have two files file1 and file2 delimited by semicolon, And I want to compare column 2 and column3 of file1 to column3 and column 4 in file2. file1 -------- abc;cef;155.67;143_34; def;fgh;146.55;123.3; frg;hff;134.67;; yyy;fgh;134.78;35_45; file 2 --------- abc;cef;155.09;;... (12 Replies)
Discussion started by: jerome Sukumar
12 Replies

3. Shell Programming and Scripting

Awk Comparison of 2 specific files

Hi Everybody, I know the topic sounds familiar but I just couldn't adapt or find the right code that solves my particular issue. I really hope you can help. I would like to compare 2 files in an awk script. Both files have different paths. The awk script call should look like that awk -f... (7 Replies)
Discussion started by: hhoosscchhii
7 Replies

4. Shell Programming and Scripting

For loop in awk, a bug or something else?

Seems for loop in awk has bug? For first awk, i expect to get 4 output: 1.5, 1.4, 1.3, 1.2, but there are only 3. $ awk 'BEGIN {for (i=1.5;i>=1.2;i-=0.1) print i}' 1.5 1.4 1.3 $ awk 'BEGIN {for (i=15;i>=12;i-=1) print i}' 15 14 13 12 (1 Reply)
Discussion started by: rdcwayx
1 Replies

5. UNIX for Dummies Questions & Answers

df -> output files; comparison using awk or...

:wall: I am trying to do the following using awk (is that the best way?): Read 2 files created from the output of df (say, on different days) and compare the entries using the 1st (FileSys) and 6th (Mount) fields to see if the size has changed. Output (at least), to a new file (some header... (2 Replies)
Discussion started by: renata
2 Replies

6. Shell Programming and Scripting

comparison of 2 files using unix or awk

Hello, I have 2 files and I want them to be compared in a specific fashion file1: A_1200_1250 A_1251_1300 B_1301_1350 B_1351_1400 B_1401_1450 C_1451_1500 and so on... file2: 1210 1305 1260 1295 1400 1500 1450 1495 Now The script should look for "1200" from A_1200_1250 of... (8 Replies)
Discussion started by: Diya123
8 Replies

7. Shell Programming and Scripting

For Loop Field editing - without using "awk"

Hi, I'm using Linux and bash shell. I have a file (F1.txt) with contents like Table1 Column1 123abc Table1 Column2 xyz Table2 Column1 543 Now, I would like to get the output as UPDATE Table1 SET Column1='123abc'; UPDATE Table1 SET Column2='xyz'; UPDATE Table2 SET Column1='543';... (3 Replies)
Discussion started by: Dev_Dev
3 Replies

8. Shell Programming and Scripting

Awk: Replacement using 2 diff files input and comparison

Requirement: If $5(date field) in ipfile is less than $7(date field) in deact file & $1 of ipfile is present in deactfile then $1 to be replaced by $2,$3,$4,$5,$6 of deact file else if $5(date field) in ipfile is greater than $7(date field) in actfile & $1 of ipfile is present in actfile then... (5 Replies)
Discussion started by: siramitsharma
5 Replies

9. Shell Programming and Scripting

awk - 2 files comparison without for loop - multi-line issue

Greetings Experts, I need to handle the views created over monthly retention tables for which every new table in YYYYMMDD format, there is equivalent view created and the older table which might be dropped, the view over it has to be re-created over a dummy table so that it doesn't fail.... (2 Replies)
Discussion started by: chill3chee
2 Replies

10. UNIX for Beginners Questions & Answers

awk comparison using multiple files

Hi, I have 2 files, I need to use column of file1 and do a comparison on file2 column 1 and print the mismatch is file3 as mentioned below. Kindly consider that file 1 is having uniq key(column) whereas in file2 we have multiple duplicates (like 44). These duplicates should not come in... (2 Replies)
Discussion started by: grv
2 Replies
bup-margin(1)						      General Commands Manual						     bup-margin(1)

NAME
bup-margin - figure out your deduplication safety margin SYNOPSIS
bup margin [options...] DESCRIPTION
bup margin iterates through all objects in your bup repository, calculating the largest number of prefix bits shared between any two entries. This number, n, identifies the longest subset of SHA-1 you could use and still encounter a collision between your object ids. For example, one system that was tested had a collection of 11 million objects (70 GB), and bup margin returned 45. That means a 46-bit hash would be sufficient to avoid all collisions among that set of objects; each object in that repository could be uniquely identified by its first 46 bits. The number of bits needed seems to increase by about 1 or 2 for every doubling of the number of objects. Since SHA-1 hashes have 160 bits, that leaves 115 bits of margin. Of course, because SHA-1 hashes are essentially random, it's theoretically possible to use many more bits with far fewer objects. If you're paranoid about the possibility of SHA-1 collisions, you can monitor your repository by running bup margin occasionally to see if you're getting dangerously close to 160 bits. OPTIONS
--predict Guess the offset into each index file where a particular object will appear, and report the maximum deviation of the correct answer from the guess. This is potentially useful for tuning an interpolation search algorithm. --ignore-midx don't use .midx files, use only .idx files. This is only really useful when used with --predict. EXAMPLE
$ bup margin Reading indexes: 100.00% (1612581/1612581), done. 40 40 matching prefix bits 1.94 bits per doubling 120 bits (61.86 doublings) remaining 4.19338e+18 times larger is possible Everyone on earth could have 625878182 data sets like yours, all in one repository, and we would expect 1 object collision. $ bup margin --predict PackIdxList: using 1 index. Reading indexes: 100.00% (1612581/1612581), done. 915 of 1612581 (0.057%) SEE ALSO
bup-midx(1), bup-save(1) BUP
Part of the bup(1) suite. AUTHORS
Avery Pennarun <apenwarr@gmail.com>. Bup unknown- bup-margin(1)
All times are GMT -4. The time now is 01:12 AM.
Unix & Linux Forums Content Copyright 1993-2022. All Rights Reserved.
Privacy Policy