Need Help with Joining Command or AWK


 
# 1  
Old 04-14-2009

Hello. I have 2 files.

File 1:

1692 @00000000201@00000205@20090402@000015000
1692 @00000000202@00000206@20090402@000010000
263157760@00000000401@00000201@20090402@000056250
510584583@00000000300@00000207@20090402@000030000
571764164@00000000100@00000128@20090402@000000500
571764164@00000000202@00000147@20090402@000005000
770000464@00000000100@00000125@20090402@000005000
770000464@00000000201@00000000@ @000005000


File 2:

1692 @RONALD F. SZOPA, DMD
004408365@STRESS REDUCTION SYSTEMS
010555926@B. FAREID, DDS A PROFESSIONAL CORP.
571764164@METABOLIC IMAGING MEDICAL GROUP
033464787@DAVID YUNG NING LI
770000464@COMPREHENSIVE MEDICAL G

I tried to do a join, but it isn't working. I am trying to match the first field of File 1 against the first field of File 2. When they match, I want to output the full record from File 1 together with the matching record from File 2.

I tried:

awk -F\@ 'NR == FNR {i _[$1,$2]; next } ($1,$2) in _' file1 file2

but it doesn't work.

Any help is much appreciated. Thank you.
# 2  
Old 04-14-2009
For your example files, what should the output look like?
# 3  
Old 04-14-2009
You can use join to find all matching lines in both files and then output the fields you want from both.
Code:
join -1 1 -2 1 -t@ -o 1.1,1.2,1.3,1.4,1.5,2.2 file1 file2

#-1 1  file1 field1
#-2 1  file2 field1
#-t@   @ delimiter
#-o format output with fields from each file where 1.*=file1 and 2.*=file2

# 4  
Old 04-14-2009
I thought awk scanned files sequentially? I'm not sure what your script will do.
Consider using Python...
Code:
import sys
l1 = open(sys.argv[1]).readlines()
l2 = open(sys.argv[2]).readlines()
if (l1[0].split()[0] == l2[0].split()[0]):
        for x in range(len(l1)):
                print "%s,%s" % (l1[x][:-1],l2[x][:-1])

or Ruby...
Code:
l1 = open(ARGV[0]).readlines; l2 = open(ARGV[1]).readlines
if (l1[0].split[0] == l2[0].split[0])
        l1.each_with_index do |l,x| puts l.strip + "," + l2[x]; end
end

# 5  
Old 04-14-2009
Quote:
Originally Posted by qneill
I thought awk scanned files sequentially? I'm not sure what your script will do.
Consider using Python...
Code:
import sys
l1 = open(sys.argv[1]).readlines()
l2 = open(sys.argv[2]).readlines()
if (l1[0].split()[0] == l2[0].split()[0]):
        for x in range(len(l1)):
                print "%s,%s" % (l1[x][:-1],l2[x][:-1])

Please test your code.
# 6  
Old 04-15-2009
Quote:
Originally Posted by ghostdog74
For your example files, what should the output look like?
Sorry, the output would look like this:

File 1:

1692 @00000000201@00000205@20090402@000015000
1692 @00000000202@00000206@20090402@000010000
263157760@00000000401@00000201@20090402@000056250
510584583@00000000300@00000207@20090402@000030000
571764164@00000000100@00000128@20090402@000000500
571764164@00000000202@00000147@20090402@000005000
770000464@00000000100@00000125@20090402@000005000
770000464@00000000201@00000000@ @000005000


File 2:

1692 @RONALD F. SZOPA, DMD
004408365@STRESS REDUCTION SYSTEMS
010555926@B. FAREID, DDS A PROFESSIONAL CORP.
571764164@METABOLIC IMAGING MEDICAL GROUP
033464787@DAVID YUNG NING LI
770000464@COMPREHENSIVE MEDICAL G
263157760@TEST
510584583@PHOTO

Output:

1692 @00000000201@00000205@20090402@000015000@RONALD F. SZOPA, DMD
1692 @00000000202@00000206@20090402@000010000@RONALD F. SZOPA, DMD
263157760@00000000401@00000201@20090402@000056250@TEST
510584583@00000000300@00000207@20090402@000030000@PHOTO
571764164@00000000100@00000128@20090402@000000500@METABOLIC IMAGING MEDICAL GROUP
571764164@00000000202@00000147@20090402@000005000@METABOLIC IMAGING MEDICAL GROUP
770000464@00000000100@00000125@20090402@000005000@COMPREHENSIVE MEDICAL G
770000464@00000000201@00000000@ @000005000@COMPREHENSIVE MEDICAL G

I need all of the data from File 1. File 2, however, contains many records I do not need: any File 2 record whose first field does not match the first field of a File 1 record should be left out. Thank you for your help.
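
One way to get this output with awk is to read File 2 first, remember the name for each key, and then look up the key of every File 1 line. The following is only a minimal sketch, not an answer posted in this thread. The array name `name`, the output file `joined_awk.txt`, and the cut-down sample data are assumptions for illustration. It also assumes each key appears at most once in File 2 and that the names contain no `@`.

```shell
# Hypothetical sample data, cut down from the files shown in this thread.
cat > file1 <<'EOF'
1692 @00000000201@00000205@20090402@000015000
263157760@00000000401@00000201@20090402@000056250
571764164@00000000100@00000128@20090402@000000500
EOF

cat > file2 <<'EOF'
1692 @RONALD F. SZOPA, DMD
004408365@STRESS REDUCTION SYSTEMS
571764164@METABOLIC IMAGING MEDICAL GROUP
263157760@TEST
EOF

# First pass (file2, while NR == FNR): store the name under its key.
# Second pass (file1): print each line whose key was stored, with the name appended.
awk -F'@' '
    NR == FNR { name[$1] = $2; next }
    $1 in name { print $0 "@" name[$1] }
' file2 file1 > joined_awk.txt

cat joined_awk.txt
```

Because the lookup goes through an array, neither file has to be sorted, and File 2 records with no match in File 1 are simply never printed.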
# 7  
Old 04-15-2009
Quote:
Originally Posted by ldapswandog
You can use join to find all matching lines in both files and then output the fields you want from both.
Code:
join -1 1 -2 1 -t@ -o 1.1,1.2,1.3,1.4,1.5,2.2 file1 file2

#-1 1  file1 field1
#-2 1  file2 field1
#-t@   @ delimiter
#-o format output with fields from each file where 1.*=file1 and 2.*=file2

I tried that, but it only picks up the first match. With the example I just gave you, it only returns the first 2 lines:

1692 @00000000201@00000205@20090402@000015000@RONALD F. SZOPA, DMD

1692 @00000000202@00000206@20090402@000010000@RONALD F. SZOPA, DMD
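
A likely cause is that join expects both inputs to be sorted on the join field, and File 2 as posted is not. The sketch below sorts both files on the first `@`-delimited field and then runs the same join. It was not posted in this thread: the `.sorted` file names, the output file `joined_sorted.txt`, and the cut-down sample data are assumptions, and `LC_ALL=C` is set only so that sort and join agree on the collation order.

```shell
# Hypothetical sample data, cut down from the files shown in this thread.
cat > file1 <<'EOF'
1692 @00000000201@00000205@20090402@000015000
263157760@00000000401@00000201@20090402@000056250
571764164@00000000100@00000128@20090402@000000500
EOF

cat > file2 <<'EOF'
1692 @RONALD F. SZOPA, DMD
004408365@STRESS REDUCTION SYSTEMS
571764164@METABOLIC IMAGING MEDICAL GROUP
263157760@TEST
EOF

# Sort both inputs on field 1, using the same locale for sort and join.
LC_ALL=C sort -t'@' -k1,1 file1 > file1.sorted
LC_ALL=C sort -t'@' -k1,1 file2 > file2.sorted

# Same join as suggested above, now on sorted input.
LC_ALL=C join -t'@' -1 1 -2 1 -o 1.1,1.2,1.3,1.4,1.5,2.2 \
    file1.sorted file2.sorted > joined_sorted.txt

cat joined_sorted.txt
```

Note that the result comes out in sorted key order, which may differ from the original order of File 1.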