compare columns from 2 files and merge
Post 302347361 by rubin on Tuesday 25th of August 2009 12:44:03 PM
Quote:
Originally Posted by samwilkinson
...
File 2 is longer than File 1. I wish to compare the contents of the two files by column 2. Where there is a match, I wish to extract the line from File 2 and place it after the corresponding line from File 1, thus creating a new file.

e.g.
118,1,0,2,3,0,5,0.3,0,0.3,0.6,1 118,1,BFGL-NGS-109695,3610326,0,18,1...

Code:
awk -F, 'BEGIN { while ((getline < "file_2") > 0) a[$2] = $0 }   # preload file_2, indexed by column 2
         a[$2] { print $0, a[$2] }' file_1 > newfile             # file_1 line, then the matching file_2 line


Or, in the more traditional (NR==FNR) way:

Code:
awk -F, 'NR==FNR { a[$2] = $0; next }    # first file on the command line (file_2): index lines by column 2
           a[$2] { print $0, a[$2] }     # second file (file_1): append the matching file_2 line
        ' file_2 file_1 > newfile

Note that the order of the file arguments matters: file_2 has to be read first in both versions. Also use getline with care; if file_2 cannot be opened, the getline call simply returns -1, the while loop never runs, and you end up with an empty newfile and no error message.
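
To see it in action, here is a minimal sketch with two tiny throw-away files. The data below is made up for illustration (it is not the original poster's data); only the column-2 key layout matches the examples above.

Code:
# hypothetical sample data; column 2 is the key in both files
printf '118,1,0,2,3,0,5\n119,2,0,4,1,0,6\n' > file_1
printf '118,1,BFGL-NGS-109695,3610326\n120,3,XYZ,42\n' > file_2

awk -F, 'NR==FNR { a[$2]=$0; next }
           a[$2] { print $0, a[$2] }
        ' file_2 file_1
# prints only the line whose column 2 ("1") appears in both files:
# 118,1,0,2,3,0,5 118,1,BFGL-NGS-109695,3610326

The second file_1 line (key 2) has no match in file_2, so it is not printed, which is exactly what the original request asks for.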
 
