Sponsored Content
Top Forums Shell Programming and Scripting Left Join in Unix based on Key? Post 302562570 by birei on Friday 7th of October 2011 09:40:12 AM
Old 10-07-2011
Hi stack,

Using 'Perl':

Code:
$ cat file1
111,Mike,Stipe
222,Peter,Buck
333,Mike,Mills
$ cat file2
222,Mr,Bono
444,Mr,Edge
$ cat stack.pl
use warnings;
use strict;

die qq[Usage: perl $0 file1 file2\n] if @ARGV != 2;

my ( %code );

while ( <> ) {
        chomp;
        my @f = split /,/, $_, 2;
        next if @f != 2;
        push @{ $code{ $f[0] } }, $f[1];
}

for ( sort keys %code ) {
        printf qq[%d,%s\n],
                $_,
                join ",", @{ $code{ $_ } };
}
$ perl stack.pl file1 file2
111,Mike,Stipe
222,Peter,Buck,Mr,Bono
333,Mike,Mills
444,Mr,Edge

Regards,
Birei
This User Gave Thanks to birei For This Post:
 

10 More Discussions You Might Find Interesting

1. Shell Programming and Scripting

Left join on files using awk

nawk 'NR==FNR{a;next} {if($1 in a) print $1,"Found" else print}' OFS="," File_B File_A The above code is not working help is appreciated (6 Replies)
Discussion started by: pinnacle
6 Replies

2. Shell Programming and Scripting

Join 3 files using key column in a mapping file

I'm new of UNIX shell scripting. I'm recently generating a excel report in UNIX(file with delimiter is fine). How should I make a script to do it? 1 file to join comes from output of one UNIX command, the second from another UNIX command, and third from a database query. The key columes of all... (7 Replies)
Discussion started by: bigsmile
7 Replies

3. UNIX for Advanced & Expert Users

Relational Join (Composite Key)

hi, i have file 1: ====== 0501000|X1 0502000|X2 0501231|X3 0981222|X4 0502000|X6 0503000|X7 0932322|X8 file 2: ======= 050 0501 0502 09 098 (1 Reply)
Discussion started by: magedfawzy
1 Replies

4. Shell Programming and Scripting

"Join" or "Merge" more than 2 files into single output based on common key (column)

Hi All, I have working (Perl) code to combine 2 input files into a single output file using the join function that works to a point, but has the following limitations: 1. I am restrained to 2 input files only. 2. Only the "matched" fields are written out to the "matched" output file and... (1 Reply)
Discussion started by: Katabatic
1 Replies

5. Programming

LEFT JOIN issue in Mysql

I have a data table as follows: mysql> select * from validations where source = "a03"; +------------+-------+--------+ | date | price | source | +------------+-------+--------+ | 2001-01-03 | 80 | a03 | | 2001-01-04 | 82 | a03 | | 2001-01-05 | 84 | a03 | | 2001-01-06... (2 Replies)
Discussion started by: figaro
2 Replies

6. Shell Programming and Scripting

left join using awk

Hi guys, I need to use awk to join 2 files file_1 A 001 B 002 C 003 file_2 A XX1 B XX2 output desired A 001 XX1 B 002 missing C 003 XX2 thank you! (2 Replies)
Discussion started by: g1org1o
2 Replies

7. Shell Programming and Scripting

left join using awk

Hi guys, I need AWK to merge the following 2 files: file1 1 a 1 1 2 b 2 2 3 c 3 3 4 d 4 4 file2 a a/a c/c a/c c/c a/a c/t c c/t c/c a/t g/g c/c c/t desired output: 1 a 1 1 a/a c/c a/c c/c a/a c/t 2 b 2 2 x x x x x x 3 c 3 3 c/t c/c a/t g/g c/c c/t 4 d 4 4 x x x x x x (2 Replies)
Discussion started by: g1org1o
2 Replies

8. Shell Programming and Scripting

Generate Join clause based on key data

Hi, I have a file pk.txt which has pk data in following format TableName | PK Employee | id Contact|name,country My Output should be Employee | t1.id=s.id Contact| t1.name=s.name AND t1.country=s.country I started of like this: for LIST in `cat pk.txt` do... (5 Replies)
Discussion started by: wahi80
5 Replies

9. Shell Programming and Scripting

Join 2 CSVs based on 1 key

Hello, I have 2 csv as follows: a.csv: name,phone,adress,car xy,1234,asbd yz,2134,asbdf tc,6789,salkdur b.csv: telphone,vehicle 2134,toyota 6789,bmw 1234,honda What is need is this: output.csv: name,phone,adres,car xy,1234,asbd,honda yz,2134,asbdf,toyota (7 Replies)
Discussion started by: Zam_1234
7 Replies

10. Shell Programming and Scripting

Join, merge, fill NULL the void columns of multiples files like sql "LEFT JOIN" by using awk

Hello, This post is already here but want to do this with another way Merge multiples files with multiples duplicates keys by filling "NULL" the void columns for anothers joinning files file1.csv: 1|abc 1|def 2|ghi 2|jkl 3|mno 3|pqr file2.csv: 1|123|jojo 1|NULL|bibi... (2 Replies)
Discussion started by: yjacknewton
2 Replies
bup-margin(1)						      General Commands Manual						     bup-margin(1)

NAME
bup-margin - figure out your deduplication safety margin SYNOPSIS
bup margin [options...] DESCRIPTION
bup margin iterates through all objects in your bup repository, calculating the largest number of prefix bits shared between any two entries. This number, n, identifies the longest subset of SHA-1 you could use and still encounter a collision between your object ids. For example, one system that was tested had a collection of 11 million objects (70 GB), and bup margin returned 45. That means a 46-bit hash would be sufficient to avoid all collisions among that set of objects; each object in that repository could be uniquely identified by its first 46 bits. The number of bits needed seems to increase by about 1 or 2 for every doubling of the number of objects. Since SHA-1 hashes have 160 bits, that leaves 115 bits of margin. Of course, because SHA-1 hashes are essentially random, it's theoretically possible to use many more bits with far fewer objects. If you're paranoid about the possibility of SHA-1 collisions, you can monitor your repository by running bup margin occasionally to see if you're getting dangerously close to 160 bits. OPTIONS
--predict Guess the offset into each index file where a particular object will appear, and report the maximum deviation of the correct answer from the guess. This is potentially useful for tuning an interpolation search algorithm. --ignore-midx don't use .midx files, use only .idx files. This is only really useful when used with --predict. EXAMPLE
$ bup margin Reading indexes: 100.00% (1612581/1612581), done. 40 40 matching prefix bits 1.94 bits per doubling 120 bits (61.86 doublings) remaining 4.19338e+18 times larger is possible Everyone on earth could have 625878182 data sets like yours, all in one repository, and we would expect 1 object collision. $ bup margin --predict PackIdxList: using 1 index. Reading indexes: 100.00% (1612581/1612581), done. 915 of 1612581 (0.057%) SEE ALSO
bup-midx(1), bup-save(1) BUP
Part of the bup(1) suite. AUTHORS
Avery Pennarun <apenwarr@gmail.com>. Bup unknown- bup-margin(1)
All times are GMT -4. The time now is 05:53 AM.
Unix & Linux Forums Content Copyright 1993-2022. All Rights Reserved.
Privacy Policy