Help with merging 2 files into 1


 
# 1  
Old 07-14-2010

:::::::::
::FileA::
:::::::::
A1-------A2--------A3---A4---A5--
=================================
AC5VXVLT-XX----------------------
B57E434--XXXX1-----MMMM-ZZZ--111-
C325G20--XXXXX3----CCCC------3332
DC35S51--XXXXY1----DDDD------44X-
DC35S52--XXXXY2----DDDD------44Y-
DC35S53--XXXXY3----DDDD------44Z-
EDMDAPP--XX----------------------
FHSMJDP--XX----------------------
GFH2543--XXXXX6----LLLL------666-
I278997--XXXXXY1---HHHH------88X-
J278998--XXXXXY2---HHHH------88Y-
KIXH5DM--XXXXXX9-----------------
NKK7021--XXXXX12-------------112-
OC30H0D--XXXXXXX13-FFFF-YYY--1134
PKD30571-XXX14-----AAAA------114-
=================================

:::::::::
::FileB::
:::::::::
B1------B2-------------------B3--------B4--------B5----------
=============================================================
AAAA----+NDNXN.XNYMT---------88/29J----NX/927----00970-------
GGGGGG1-+NDNXX.TUDTON------------------TX/088----555.555.7901
CCCC----+NDMNNNZ.XNNX--------77/190--------------99178-------
DDDD----+NDMMN.BMXXMNX-------27/98B----NX/778----93010-------
EEEEEEE1+NDMMN.OONTNOT-----------------NX/T80----10238-------
FFFF----+NDRMN.MNTTNY--------08/000----DN/088----555.8180----
JJJJ----+NDRMNN.ONTTNXXON----J0/MNO----NX/798----555.555.2797
HHHH----+NDRMNN.JNOKTON------39/870--------------------------
IIII----+NDRMNN.XNYRNNON-----08/98N----TX/088----------------
BBBB----+NDRMNN.MOOXNRKXMN---72/08K----DN/088----555.555.9270
MMMM----+NDRMNN.YMXXMNMT-----27/992--------------97913-------
LLLL----+NDRMNNN.NRD-------------------OJ/880----------------
=============================================================

:::::::::::::::
::Merged File::
:::::::::::::::
A1--------A3----B2----------------B3------B4------B5------
==========================================================
B57E434---MMMM--+NDRMNN.YMXXMNMT--27/992----------97913---
C325G20---CCCC--+NDMNNNZ.XNNX-----77/190----------99178---
DC35S51---DDDD--+NDMMN.BMXXMNX----27/98B--NX/778--93010---
DC35S52---DDDD--+NDMMN.BMXXMNX----27/98B--NX/778--93010---
DC35S53---DDDD--+NDMMN.BMXXMNX----27/98B--NX/778--93010---
GFH2543---LLLL--+NDRMNNN.NRD--------------OJ/880----------
I278997---HHHH--+NDRMNN.JNOKTON---39/870------------------
J278998---HHHH--+NDRMNN.JNOKTON---39/870------------------
OC30H0D---FFFF--+NDRMN.MNTTNY-----08/000--DN/088--555.8180
PKD30571--AAAA--+NDNXN.XNYMT------88/29J--NX/927--00970---
==========================================================

Notes:
-The real files do not contain any dashes. The dashes stand in for spaces here so that the columns stay aligned in the post. Just ignore them.
-FileA is sorted by A1.
-FileB is not sorted.
-A1 is unique, but multiple A1's can have the same A3.
-A3 is not required.
-B1 is unique.
-B2 field starts with a "+".

Requirements:
-The merged output file should be joined on columns A3 and B1.
-FileA is the master and should only be matched to FileB if A3 is present (i.e., drop FileA lines with an empty A3).

I have tried multiple avenues for this one: sed, awk, join, paste. Frankly, my head is starting to hurt from looking at the crazy awk and sed scripts out there. How should this be handled? In multiple steps? I am pretty convinced there isn't going to be one long magical sed/awk command that brings this together for me. I keep getting stumped by the way these commands recognize columns (fields). For example, if B3 is missing, the command seems to treat B4 as field 3 ($3) for that line. Does this mean I need to reference columns by byte position instead?

Anybody got any ideas on how to attack this?
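
On the byte-position question: if the real files are space-padded to fixed column widths, substr() in awk can pick fields by position, so a missing B3 no longer shifts B4 into $3. The sketch below is only an illustration. The widths (8, 21, 10, 10, 12) are read off the dashed FileB sample and are an assumption about the real file, and fileb_fields is a made-up helper name.

```shell
#!/bin/sh
# Sketch only: assumes FileB is space-padded to fixed widths
# (B1=8, B2=21, B3=10, B4=10, B5=12), as the dashed sample suggests.

# Hypothetical helper: print B1..B5 of a fixed-width file, pipe-delimited.
fileb_fields() {
    awk '
    # Strip leading and trailing spaces from one column.
    function trim(s) { sub(/^ +/, "", s); sub(/ +$/, "", s); return s }
    {
        b1 = trim(substr($0,  1,  8))
        b2 = trim(substr($0,  9, 21))
        b3 = trim(substr($0, 30, 10))
        b4 = trim(substr($0, 40, 10))
        b5 = trim(substr($0, 50, 12))
        print b1 "|" b2 "|" b3 "|" b4 "|" b5
    }' "$1"
}

# Demo: the LLLL row, which has no B3 and no B5.
tmp=$(mktemp)
printf '%-8s%-21s%-10s%-10s%-12s\n' 'LLLL' '+NDRMNNN.NRD' '' 'OJ/880' '' > "$tmp"
fileb_fields "$tmp"
# prints: LLLL|+NDRMNNN.NRD||OJ/880|
rm -f "$tmp"
```

The empty third column stays empty instead of being filled by B4. Position-based parsing also copes with rows such as EEEEEEE1, where B1 fills its whole column and runs straight into B2 with no space between them.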
# 2  
Old 07-14-2010
try this:
Code:
nawk 'FNR==NR{fA[$3]=$1;next} $1 in fA {print fA[$1], $0}' FileA FileB
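
For readers new to the FNR==NR idiom, here is the same logic spread over several lines with comments. It is a sketch, not a replacement: it uses plain awk rather than nawk, wraps the program in a made-up function name (merge_last_wins), and assumes whitespace-separated fields with A3 in field 3 of FileA.

```shell
#!/bin/sh
# Expanded form of the one-liner above, plus a tiny demo.
# Assumption: whitespace-separated fields, A3 is field 3 of FileA.

merge_last_wins() {
    awk '
    # First file (FileA): FNR equals NR only while reading it.
    # Store A1 under the key A3; a later row with the same A3 overwrites.
    FNR == NR { fA[$3] = $1; next }
    # Second file (FileB): if B1 is a known A3, print A1 and the whole line.
    $1 in fA  { print fA[$1], $0 }
    ' "$1" "$2"
}

dir=$(mktemp -d)
printf '%s\n' 'DC35S51 XXXXY1 DDDD 44X' \
              'DC35S52 XXXXY2 DDDD 44Y' \
              'DC35S53 XXXXY3 DDDD 44Z' > "$dir/FileA"
printf '%s\n' 'DDDD +NDMMN.BMXXMNX 27/98B NX/778 93010' > "$dir/FileB"
merge_last_wins "$dir/FileA" "$dir/FileB"
# prints: DC35S53 DDDD +NDMMN.BMXXMNX 27/98B NX/778 93010
rm -rf "$dir"
```

Because each A3 key holds a single A1, only the last FileA row for a given A3 survives. Also note that $3 is whatever the third whitespace-separated token happens to be, so on a FileA line with no A3 it can be a later column such as A5.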

# 3  
Old 07-14-2010
Wow, that works great. Looks like I need to read up some more on nawk.

One thing, though: this drops rows from FileA that share an A3 value but have distinct A1 values. The merged file should contain one row for every A1 that has an A3, even if that A3 is the same as the one above it. As it stands, the first two "DDDD" lines from FileA are dropped. The merged output needs three "DDDD" lines, since FileA has three different A1 values with that A3. "HHHH" behaves the same way.

How would you alter this to catch those cases?


:::::::::
::FileA::
:::::::::
A1-------A2--------A3---A4---A5--
=================================
DC35S51--XXXXY1----DDDD------44X-
DC35S52--XXXXY2----DDDD------44Y-
DC35S53--XXXXY3----DDDD------44Z-
=================================

:::::::::
::FileB::
:::::::::
B1------B2-------------------B3--------B4--------B5----------
=============================================================
DDDD----+NDMMN.BMXXMNX-------27/98B----NX/778----93010-------
=============================================================

It should come out to this:

:::::::::::::::
::Merged File::
:::::::::::::::
A1--------A3----B2----------------B3------B4------B5------
==========================================================
DC35S51---DDDD--+NDMMN.BMXXMNX----27/98B--NX/778--93010---
DC35S52---DDDD--+NDMMN.BMXXMNX----27/98B--NX/778--93010---
DC35S53---DDDD--+NDMMN.BMXXMNX----27/98B--NX/778--93010---
==========================================================

But instead it's coming out like this:

A1--------A3----B2----------------B3------B4------B5------
==========================================================
DC35S53---DDDD--+NDMMN.BMXXMNX----27/98B--NX/778--93010---
==========================================================


The DC35S51 and DC35S52 rows are missing.
# 4  
Old 07-14-2010
try this - not tested:
Code:
nawk 'FNR==NR{fA[$3]=($3 in fA)?fA[$3] FS $1:$1;next} $1 in fA {n=split(fA[$1],t, FS); for(i=1;i<=n;i++) print t[i], $0}' FileA FileB
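
Spelled out, the one-liner above collects every A1 that shares an A3 into one space-separated string, then splits that string again when the matching FileB line turns up. The version below is a commented sketch of the same idea using plain awk. The function name merge_keep_dups and the array names are made up for illustration, and whitespace-separated fields are assumed.

```shell
#!/bin/sh
# Expanded form of the one-liner above, plus a tiny demo.
# Assumption: whitespace-separated fields, A3 is field 3 of FileA.

merge_keep_dups() {
    awk '
    # FileA: append each A1 to the list stored under its A3.
    FNR == NR {
        if ($3 in ids) ids[$3] = ids[$3] " " $1
        else           ids[$3] = $1
        next
    }
    # FileB: print one output row per A1 collected for this B1.
    $1 in ids {
        n = split(ids[$1], a1, " ")
        for (i = 1; i <= n; i++) print a1[i], $0
    }
    ' "$1" "$2"
}

dir=$(mktemp -d)
printf '%s\n' 'DC35S51 XXXXY1 DDDD 44X' \
              'DC35S52 XXXXY2 DDDD 44Y' \
              'DC35S53 XXXXY3 DDDD 44Z' > "$dir/FileA"
printf '%s\n' 'DDDD +NDMMN.BMXXMNX 27/98B NX/778 93010' > "$dir/FileB"
merge_keep_dups "$dir/FileA" "$dir/FileB"
# prints:
# DC35S51 DDDD +NDMMN.BMXXMNX 27/98B NX/778 93010
# DC35S52 DDDD +NDMMN.BMXXMNX 27/98B NX/778 93010
# DC35S53 DDDD +NDMMN.BMXXMNX 27/98B NX/778 93010
rm -rf "$dir"
```

Since B1 equals A3 on every matched line, printing A1 followed by the FileB line gives the A1, A3, B2 to B5 layout. Rows come out in FileB order, so pipe the result through sort if the output should follow A1 order like FileA.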

# 5  
Old 07-14-2010

Most excellent!

That works. Thanks!
# 6  
Old 07-14-2010
Quote:
Originally Posted by lordsmiter
Most excellent!

That works. Thanks!
You're welcome.
In the future, please use code tags when posting code or data samples.