Sponsored Content
Top Forums UNIX for Beginners Questions & Answers Data match 2 files based on first 2 columns matching only and join if match Post 303041396 by axis88 on Friday 22nd of November 2019 07:07:38 AM
Old 11-22-2019
Hi,

ok thanks for the reply,

here is an example of the match file and the master file

so what needs to happen is if the first 2 columns of the file match then the script would pull out the entire row with the mobile number added to the match file

so basically i need the mobiles added to field 30 on the match file as all the mobiles would be in the master file

hope this makes sense and again thanks for this

Code:
match file

surname	dob	f3	f4	f5	f6	f7	f8	f9	f10	f11	f12	f13	f14	f15	f16	f17	f18	f19	f20	f21	f22	f23	f24	f25	f26	f27	f28	f29
jones	        22/03/1988	  xxxx	xxxx	xxxx	xxxx	xxxx	xxxx	xxxx	xxxx	xxxx	xxxx	xxxx	xxxx	xxxx	xxxx	xxxx	xxxx	xxxx	xxxx	xxxx	xxxx	xxxx	xxxx	xxxx	xxxx	xxxx	xxxx	xxxx
smith	19/03/1977	xxxx	xxxx	xxxx	xxxx	xxxx	xxxx	xxxx	xxxx	xxxx	xxxx	xxxx	xxxx	xxxx	xxxx	xxxx	xxxx	xxxx	xxxx	xxxx	xxxx	xxxx	xxxx	xxxx	xxxx	xxxx	xxxx	xxxx
john	22/07/1955	xxxx	xxxx	xxxx	xxxx	xxxx	xxxx	xxxx	xxxx	xxxx	xxxx	xxxx	xxxx	xxxx	xxxx	xxxx	xxxx	xxxx	xxxx	xxxx	xxxx	xxxx	xxxx	xxxx	xxxx	xxxx	xxxx	xxxx

so first 2 colums of both files are surname and DOB

Code:
master file with the mobile in the 3rd

f1	f2	mobile
john	22/02/1988	7777777777
jones	19/05/1975	7888888888
etc	22/07/1990	7878787878
etc	01/01/1959	7878787878


hope this makes more sense


thanks
 

10 More Discussions You Might Find Interesting

1. Shell Programming and Scripting

Matching same columns and finding the smallest match

Hi all, I am wondering if its possible to solve my problem with a simple code. Basically I have a file that looks like this (tab delimited) bob 8 250 tina 8 225 sam 8 225 ellen 9 315 kyle 9 275 sally 9 135 So what I want to do is match columns 2 and 5. If columns 2 and 5... (2 Replies)
Discussion started by: phil_heath
2 Replies

2. Shell Programming and Scripting

Extract data based on match against one column data from a long list data

My input file: data_5 Ali 422 2.00E-45 102/253 140/253 24 data_3 Abu 202 60.00E-45 12/23 140/23 28 data_1 Ahmad 256 7.00E-45 120/235 140/235 22 data_4 Aman 365 8.00E-45 15/65 140/65 20 data_10 Jones 869 9.00E-45 65/253 140/253 18... (12 Replies)
Discussion started by: patrick87
12 Replies

3. Shell Programming and Scripting

Matching string on two files based on match rules.

Hi, How to check if a string on file2 exactly matches with a part or complete string on file1, and return a match indicator based on some match rules. 1) only records on file1 with category A should be matched. for other category, the output match indicator should default to 'N' 2) on file2... (13 Replies)
Discussion started by: effay
13 Replies

4. Shell Programming and Scripting

Match files based on either of the two columns awk

Dear Shell experts, I have 2 files with structure: File 1: ID and count head test_GI_count1.txt 1000094 2 10039307 1 10039641 1 10047177 11 10047359 1 1008555 2 10120302 1 10120672 13 10121776 1 10121865 32 And 2nd file: head Protein_gi_GeneID_symbol.txt protein_gi GeneID... (11 Replies)
Discussion started by: smitra
11 Replies

5. Shell Programming and Scripting

Join lines from two files based on match

I have two files. File1 >gi|11320906|gb|AF197889.1|_Buchnera_aphidicola ATGAAATTTAAGATAAAAAATAGTATTTT >gi|11320898|gb|AF197885.1|_Buchnera_aphidicola ATGAAATTTAATATAAACAATAAAA >gi|11320894|gb|AF197883.1|_Buchnera_aphidicola ATGAAATTTAATATAAACAATAAAATTTTT File2 AF197885 Uroleucon aeneum... (2 Replies)
Discussion started by: pathunkathunk
2 Replies

6. UNIX for Advanced & Expert Users

Match and print based on columns

HI, I have 2 different questions in this thread. Consider 2 files as input (input file have different line count ) File 1 1 1 625 56 1 12 657 34 1 9 25 45 1 2 20 54 67 3 25 35 27 4 45 73 36 5 125 56 45 File2 1 1 878 76 1 9 83 67 2 20 73 78 4 47 22 17 3 25 67 99 (4 Replies)
Discussion started by: rossi
4 Replies

7. Shell Programming and Scripting

Join two files combining multiple columns and produce mix and match output

I would like to join two files when two columns in each file matches with each other and then produce an output when taking multiple columns. Like I have file A 1234,ABCD,23,JOHN,NJ,USA 2345,ABCD,24,SAM,NY,USA 5678,GHIJ,24,TOM,NY,USA 5678,WXYZ,27,MAT,NJ,USA and file B ... (2 Replies)
Discussion started by: mady135
2 Replies

8. Shell Programming and Scripting

New files based off match or no match

Trying to match $2 in original_targets with $2 of new_targets . If the two numbers match exactly then a match.txt file is outputted using the information in the new_targets in the beginning 4 fields $1, $2, $3, $4 and value of $4 in the original_targets . If there is "No Match" then a no... (2 Replies)
Discussion started by: cmccabe
2 Replies

9. Shell Programming and Scripting

awk to update file based on partial match in field1 and exact match in field2

I am trying to create a cronjob that will run on startup that will look at a list.txt file to see if there is a later version of a database using database.txt as the source. The matching lines are written to output. $1 in database.txt will be in list.txt as a partial match. $2 of database.txt... (2 Replies)
Discussion started by: cmccabe
2 Replies

10. Shell Programming and Scripting

Comparing two columns in two files and printing a third based on a match

Hello all, First post here. I did not notice a previous post to help me down the right path. I am looking to compare a column in a CSV file against another file (which is not a column match one for one) but more or less when a match is made, I would like to append a third column that contains a... (17 Replies)
Discussion started by: dis0wned
17 Replies
funjoin(1)							SAORD Documentation							funjoin(1)

NAME
funjoin - join two or more FITS binary tables on specified columns SYNOPSIS
funjoin [switches] <ifile1> <ifile2> ... <ifilen> <ofile> OPTIONS
-a cols # columns to activate in all files -a1 cols ... an cols # columns to activate in each file -b 'c1:bvl,c2:bv2' # blank values for common columns in all files -bn 'c1:bv1,c2:bv2' # blank values for columns in specific files -j col # column to join in all files -j1 col ... jn col # column to join in each file -m min # min matches to output a row -M max # max matches to output a row -s # add 'jfiles' status column -S col # add col as status column -t tol # tolerance for joining numeric cols [2 files only] DESCRIPTION
funjoin joins rows from two or more (up to 32) FITS Binary Table files, based on the values of specified join columns in each file. NB: the join columns must have an index file associated with it. These files are generated using the funindex program. The first argument to the program specifies the first input FITS table or raw event file. If "stdin" is specified, data are read from the standard input. Subsequent arguments specify additional event files and tables to join. The last argument is the output FITS file. NB: Do not use Funtools Bracket Notation to specify FITS extensions and row filters when running funjoin or you will get wrong results. Rows are accessed and joined using the index files directly, and this bypasses all filtering. The join columns are specified using the -j col switch (which specifies a column name to use for all files) or with -j1 col1, -j2 col2, ... -jn coln switches (which specify a column name to use for each file). A join column must be specified for each file. If both -j col and -jn coln are specified for a given file, then the latter is used. Join columns must either be of type string or type numeric; it is illegal to mix numeric and string columns in a given join. For example, to join three files using the same key column for each file, use: funjoin -j key in1.fits in2.fits in3.fits out.fits A different key can be specified for the third file in this way: funjoin -j key -j3 otherkey in1.fits in2.fits in3.fits out.fits The -a "cols" switch (and -a1 "col1", -a2 "cols2" counterparts) can be used to specify columns to activate (i.e. write to the output file) for each input file. By default, all columns are output. If two or more columns from separate files have the same name, the second (and subsequent) columns are renamed to have an underscore and a numeric value appended. The -m min and -M max switches specify the minimum and maximum number of joins required to write out a row. The default minimum is 0 joins (i.e. all rows are written out) and the default maximum is 63 (the maximum number of possible joins with a limit of 32 input files). For example, to write out only those rows in which exactly two files have columns that match (i.e. one join): funjoin -j key -m 1 -M 1 in1.fits in2.fits in3.fits ... out.fits A given row can have the requisite number of joins without all of the files being joined (e.g. three files are being joined but only two have a given join key value). In this case, all of the columns of the non-joined file are written out, by default, using blanks (zeros or NULLs). The -b c1:bv1,c2:bv2 and -b1 'c1:bv1,c2:bv2' -b2 'c1:bv1,c2 - bv2' ... switches can be used to set the blank value for columns common to all files and/or columns in a specified file, respectively. Each blank value string contains a comma-separated list of col- umn:blank_val specifiers. For floating point values (single or double), a case-insensitive string value of "nan" means that the IEEE NaN (not-a-number) should be used. Thus, for example: funjoin -b "AKEY:???" -b1 "A:-1" -b3 "G:NaN,E:-1,F:-100" ... means that a non-joined AKEY column in any file will contain the string "???", the non-joined A column of file 1 will contain a value of -1, the non-joined G column of file 3 will contain IEEE NaNs, while the non-joined E and F columns of the same file will contain values -1 and -100, respectively. Of course, where common and specific blank values are specified for the same column, the specific blank value is used. To distinguish which files are non-blank components of a given row, the -s (status) switch can be used to add a bitmask column named "JFILES" to the output file. In this column, a bit is set for each non-blank file composing the given row, with bit 0 corresponds to the first file, bit 1 to the second file, and so on. The file names themselves are stored in the FITS header as parameters named JFILE1, JFILE2, etc. The -S col switch allows you to change the name of the status column from the default "JFILES". A join between rows is the Cartesian product of all rows in one file having a given join column value with all rows in a second file having the same value for its join column and so on. Thus, if file1 has 2 rows with join column value 100, file2 has 3 rows with the same value, and file3 has 4 rows, then the join results in 2*3*4=24 rows being output. The join algorithm directly processes the index file associated with the join column of each file. The smallest value of all the current columns is selected as a base, and this value is used to join equal-valued columns in the other files. In this way, the index files are traversed exactly once. The -t tol switch specifies a tolerance value for numeric columns. At present, a tolerance value can join only two files at a time. (A completely different algorithm is required to join more than two files using a tolerance, somethng we might consider implementing in the future.) The following example shows many of the features of funjoin. The input files t1.fits, t2.fits, and t3.fits contain the following columns: [sh] fundisp t1.fits AKEY KEY A B ----------- ------ ------ ------ aaa 0 0 1 bbb 1 3 4 ccc 2 6 7 ddd 3 9 10 eee 4 12 13 fff 5 15 16 ggg 6 18 19 hhh 7 21 22 fundisp t2.fits AKEY KEY C D ----------- ------ ------ ------ iii 8 24 25 ggg 6 18 19 eee 4 12 13 ccc 2 6 7 aaa 0 0 1 fundisp t3.fits AKEY KEY E F G ------------ ------ -------- -------- ----------- ggg 6 18 19 100.10 jjj 9 27 28 200.20 aaa 0 0 1 300.30 ddd 3 9 10 400.40 Given these input files, the following funjoin command: funjoin -s -a1 "-B" -a2 "-D" -a3 "-E" -b "AKEY:???" -b1 "AKEY:XXX,A:255" -b3 "G:NaN,E:-1,F:-100" -j key t1.fits t2.fits t3.fits foo.fits will join the files on the KEY column, outputting all columns except B (in t1.fits), D (in t2.fits) and E (in t3.fits), and setting blank values for AKEY (globally, but overridden for t1.fits) and A (in file 1) and G, E, and F (in file 3). A JFILES column will be output to flag which files were used in each row: AKEY KEY A AKEY_2 KEY_2 C AKEY_3 KEY_3 F G JFILES ------------ ------ ------ ------------ ------ ------ ------------ ------ -------- ----------- -------- aaa 0 0 aaa 0 0 aaa 0 1 300.30 7 bbb 1 3 ??? 0 0 ??? 0 -100 nan 1 ccc 2 6 ccc 2 6 ??? 0 -100 nan 3 ddd 3 9 ??? 0 0 ddd 3 10 400.40 5 eee 4 12 eee 4 12 ??? 0 -100 nan 3 fff 5 15 ??? 0 0 ??? 0 -100 nan 1 ggg 6 18 ggg 6 18 ggg 6 19 100.10 7 hhh 7 21 ??? 0 0 ??? 0 -100 nan 1 XXX 0 255 iii 8 24 ??? 0 -100 nan 2 XXX 0 255 ??? 0 0 jjj 9 28 200.20 4 SEE ALSO
See funtools(7) for a list of Funtools help pages version 1.4.2 January 2, 2008 funjoin(1)
All times are GMT -4. The time now is 01:58 AM.
Unix & Linux Forums Content Copyright 1993-2022. All Rights Reserved.
Privacy Policy