The UNIX and Linux Forums  
03-29-2009
Michelangelo
Registered User
Join Date: Mar 2009
Posts: 2
AWK Matching Fields and Combining Files

Hello!

I am writing a program to run through two large lists of data (~300,000 rows), find where rows in one file match rows in the other, and combine them based on the matching fields. Because the files are so large, I'm guessing AWK will be the most efficient way to do this. Overall, the input and output I'm looking for are similar to this:

File1: *first three columns are coordinates in (x, y, z)*
123 456 678 A B C
234 345 567 D F B
234 456 324 H J K
765 432 987 M N K


File2: *the last three columns are coordinates in (x, y, z)*
45 234 345 567
46 765 432 987
47 111 222 333
48 234 345 567
49 987 765 432
50 444 555 666
51 765 432 987
... and so on

Output file:
45 234 345 567 D F B
46 765 432 987 M N K
48 234 345 567 D F B
51 765 432 987 M N K

File2 has many more entries than File1, and every coordinate in File1 appears somewhere in File2. The problem I am having is how to search through all of File2 to find each place a File1 coordinate is listed, along with the number in column 1 of File2 that corresponds to that coordinate.

In a nutshell:
Make new file3
Find where File2($2, $3, $4) is equal to File1($1, $2, $3)
print to file3 File2($1, $2, $3, $4), File1($4, $5, $6)

Thank you!
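
One possible approach to the steps listed in the post, offered as a minimal sketch rather than a definitive solution: read File1 first and store its last three columns in an AWK array keyed by the coordinate triple, then read File2 and print every row whose columns 2-4 match a stored key. The file names File1, File2 and file3 follow the wording of the post; the array name labels is made up for illustration. The sketch assumes whitespace-separated columns, a non-empty File1, and coordinates written identically in both files (the comparison is on text, so 234 and 234.0 would not match).

```shell
# Sample data copied from the post. File names follow the post's wording.
cat > File1 <<'EOF'
123 456 678 A B C
234 345 567 D F B
234 456 324 H J K
765 432 987 M N K
EOF

cat > File2 <<'EOF'
45 234 345 567
46 765 432 987
47 111 222 333
48 234 345 567
49 987 765 432
50 444 555 666
51 765 432 987
EOF

# Pass 1 (NR == FNR is true only while reading the first file):
#   index File1 by its (x, y, z) triple; "labels" is a hypothetical name.
# Pass 2: for each File2 row whose columns 2-4 form a known triple,
#   print the File2 row followed by the stored File1 columns 4-6.
awk '
    NR == FNR { labels[$1, $2, $3] = $4 " " $5 " " $6; next }
    ($2, $3, $4) in labels { print $1, $2, $3, $4, labels[$2, $3, $4] }
' File1 File2 > file3

cat file3
```

Because only the smaller File1 is held in memory and File2 is streamed once, each file is read a single time. If the same coordinate appears more than once in File1, this sketch keeps only the last occurrence.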
 
