Matching the substring and join two files


 
# 1  
Old 11-22-2011

Hi,

I have two files like the ones below.

file-1
Code:
 
101001234567890 
101001234567891 
101001234567892 
101001234567893 
101001234567894 
101001234567895 
101001234567896 
101001234567897 
101001234567898 
101001234567899


file-2

Code:
 
1234567890 
1234567891 
1234567892 
1234567893 
1234567894 
1234567895 
1234567896 
1234567897 
1234567898 
1234567899

I want an output file (file-3) like the one below.

File-3

Code:
 
1234567890 101001234567890 
1234567891 101001234567891 
1234567892 101001234567892 
1234567893 101001234567893 
1234567894 101001234567894 
1234567895 101001234567895 
1234567896 101001234567896 
1234567897 101001234567897 
1234567898 101001234567898 
1234567899 101001234567899

Please help me with the substring comparison and the join that follows it. Can this be done with awk?

Last edited by p_sai_ias; 11-22-2011 at 09:48 AM.. Reason: sorry files are not appearing in proper format
# 2  
Old 11-22-2011
Like this?
Code:
paste file2 file1 > file3

--ahamed

Last edited by ahamed101; 11-22-2011 at 10:03 AM..
# 3  
Old 11-22-2011
That works when the two files line up one to one. But some entries in file1 may have no counterpart in file2. Here is an example.

file1
Code:
101001234567890
101001234567891
101001234567892
101001234567893
101001234567894
101001234567895
101001234567896
101001234567897
101001234567898
101001234567899

file2
Code:
1234567890
1234567892
1234567894
1234567895
1234567896
1234567898
1234567899

file3

Code:
1234567890 101001234567890
1234567892 101001234567892
1234567894 101001234567894
1234567895 101001234567895
1234567896 101001234567896
1234567898 101001234567898
1234567899 101001234567899

How can I implement this?
# 4  
Old 11-22-2011
Code:
nawk 'FNR==NR{f2[$0];if(!l)l=length;next}((s=substr($0,length-l+1)) in f2) {print s,$0}' file2 file1
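
For readers who want to see what the one-liner is doing, here is the same logic spread over several lines with comments. It is only a sketch under a few assumptions: plain `awk` stands in for `nawk`, the sample files from post #3 are recreated in a scratch directory, and the function name `join_by_suffix` is made up for illustration.

```shell
#!/bin/sh
# Recreate the sample files from post #3 in a scratch directory (hypothetical setup).
dir=$(mktemp -d)
for i in 0 1 2 3 4 5 6 7 8 9; do echo "10100123456789$i"; done > "$dir/file1"
for i in 0 2 4 5 6 8 9;       do echo "123456789$i";      done > "$dir/file2"

# join_by_suffix KEYFILE LONGFILE (name invented for this example)
join_by_suffix() {
    awk '
        # FNR equals NR only while the first file on the command line is read.
        FNR == NR {
            keys[$0]                      # referencing the element stores the key
            if (!len) len = length($0)    # key length comes from the first line
            next                          # do not fall through to the second rule
        }
        # Second file: cut off the last len characters and look them up.
        {
            tail = substr($0, length($0) - len + 1)
            if (tail in keys) print tail, $0
        }
    ' "$1" "$2"
}

join_by_suffix "$dir/file2" "$dir/file1"
```

The key file is named first so that its lines are already in the array when the long file is scanned. Because whole lines (`$0`) are compared, any stray blank at the end of a line becomes part of the key.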

# 5  
Old 11-23-2011
It works properly, except that the last line is missing even though it matches. The output is below.

Code:
 
1234567890 101001234567890
1234567892 101001234567892
1234567894 101001234567894
1234567895 101001234567895
1234567896 101001234567896
1234567898 101001234567898

The line below is missing.

Code:
 
1234567899 101001234567899

Also, can you please explain this script briefly?
# 6  
Old 11-23-2011
This is the output I get with your sample files:
Code:
1234567890 101001234567890
1234567892 101001234567892
1234567894 101001234567894
1234567895 101001234567895
1234567896 101001234567896
1234567898 101001234567898
1234567899 101001234567899

You must have some leading or trailing blanks in one or both of your files that affect the last line.
Try this version:
Code:
nawk 'FNR==NR{f2[$1];if(!l)l=length($1);next}((s=substr($1,length($1)-l+1)) in f2) {print s,$1}' file2 file1
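
One way to check the trailing-blank theory before changing the script is to list any lines that end in whitespace. This is a minimal sketch, assuming a POSIX `grep`. The scratch file and the name `show_trailing_blanks` are invented for the example, and a trailing space is planted on the first line only to mimic the suspected input.

```shell
#!/bin/sh
# Hypothetical sample: the first line carries a trailing space, the last does not.
sample=$(mktemp)
printf '1234567898 \n1234567899\n' > "$sample"

# show_trailing_blanks FILE... (name invented for this example)
show_trailing_blanks() {
    # -n prefixes each hit with its line number; [[:space:]]$ matches
    # a space, tab, or carriage return sitting at the end of a line.
    grep -n '[[:space:]]$' "$@"
}

show_trailing_blanks "$sample"
```

Run against both file1 and file2, it prints the file name, line number, and text of each offending line. No output means neither file has trailing whitespace.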

# 7  
Old 11-24-2011
Thank you. It's working fine.

Can you please explain how this script works?