Post 302525498 by jmeasel7 on Thursday 26th of May 2011 11:18:11 PM
AWK help, matching 2 files into one

I'm new to AWK. I'm trying to merge file1 and file2 into a file3, with the fields of each record separated by a pipe (|) delimiter.

The thing is, file1 has thousands of records while file2 has very few. I want file3 to show every record from file1, with the data from file2 appended to the records that also exist in file2, and a single space left for the records that don't exist in file2. Each matched record should end up on one line. Know what I mean?

I looked all over for an AWK script that does this. I also tried the join command, but it needs both files sorted on the matching field, and it did not work for me with so many records in file1 and so few in file2.

Appreciate any help!!

Jmeasel7

---------- Post updated at 11:18 PM ---------- Previous update was at 11:09 PM ----------

What I need is something like this:

file1:
Code:
rec1|11111|11111|11111
rec2|22222|22222|22222
rec3|33333|33333|33333
rec4|44444|44444|44444

file2:
Code:
rec2|55555
rec4|77777

result I want for file3:
Code:
rec1|11111|11111|11111|
rec2|22222|22222|22222|55555
rec3|33333|33333|33333|
rec4|44444|44444|44444|77777
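
One common way to get this kind of output is awk's two-file idiom: read the small file into an array first, then walk the large file and append whatever was stored for each key. The sketch below is only that, a sketch: it assumes the key is the first field in both files, that only the second field of file2 is wanted, and that file2 is not empty (with an empty file2 the `NR == FNR` test would treat file1 as the lookup file). The array name `extra` is made up for the example.

```shell
# Recreate the sample input files from the post.
cat > file1 <<'EOF'
rec1|11111|11111|11111
rec2|22222|22222|22222
rec3|33333|33333|33333
rec4|44444|44444|44444
EOF

cat > file2 <<'EOF'
rec2|55555
rec4|77777
EOF

# file2 is listed first, so NR == FNR is true only while reading it:
# store its second field under the key in its first field.
# For every file1 line, print the line, a pipe, and the stored value
# (an empty string when the key never appeared in file2).
awk -F'|' '
    NR == FNR { extra[$1] = $2; next }
    { print $0 "|" extra[$1] }
' file2 file1 > file3

cat file3
```

No sorting is needed, and only file2 is held in memory, so the size of file1 should not matter. To get a literal space instead of an empty last field for unmatched records, the print line could be changed to `print $0 "|" (($1 in extra) ? extra[$1] : " ")`.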


Last edited by Franklin52; 05-30-2011 at 06:02 AM.. Reason: Please use code tags
 
