Combining information from Excel files


 
Thread Tools Search this Thread
Top Forums UNIX for Dummies Questions & Answers Combining information from Excel files
# 1  
Old 09-24-2011
Combining entries from Excel files

Hi,

I am looking for an AWK or grep script (join will not work here since the data is not sorted) to combine two Excel files that look lke this:

Infile1:
 
Georgia Atlanta 1234 1234
Georgia Marrieta 2134 2134
Georgia Scottdale 3414 3414
Georgia Clarkston 2321 2321

Infile2:
 
Georgia Acworth 12300 1474 1374
Georgia Marrieta 3245 1134 3234
Georgia Scottdale 23456 5614 8714
Georgia Clarkston 34212 2341 4621
Georgia Cadwell 21222 6781 5521

I need to find the entries (states and cities; first and second columns) in file 1 that are also present in file 2 and combine them in such way that I can recover certain information from each city (I have highlighted the info that I need to retrieve). Thus, the outfile should look somelike this:

Outfile:
 
Georgia Marrieta 2134 3245
Georgia Scottdale 3414 23456
Georgia Clarkston 2321 34212

Aditionally, the entries from file 1 and 2 that were not found should be place in two different extra files.

Outfile1:
 
Georgia Atlanta 1234 1234

Outfile2:
 
Georgia Acworth 12300 1474 1374
Georgia Cadwell 21222 6781 5521

Any help will be greatly appreciated!

Last edited by Xterra; 09-25-2011 at 01:08 PM..
# 2  
Old 09-26-2011
Code:
$ cat f1
Georgia Atlanta 1234 1234
Georgia Marrieta 2134 2134
Georgia Scottdale 3414 3414
Georgia Clarkston 2321 2321

Code:
$ cat f2
Georgia Acworth 12300 1474 1374
Georgia Marrieta 3245 1134 3234
Georgia Scottdale 23456 5614 8714
Georgia Clarkston 34212 2341 4621
Georgia Cadwell 21222 6781 5521

Code:
$ ls outfile*
ls: outfile*: No such file or directory

Code:
$ nawk '{i=$1 FS $2}NR==FNR{A[i]=$3;B[i]=$4;next}
{C[i];if(i in A) {print i FS A[i] FS $3 > "outfile"}
else{print $0 > "outfile2"}next}
END {for(k in A) if (!(k in C)) print k FS A[k] FS B[k] > "outfile1" }' f1 f2

Code:
$ ls outfile*
outfile         outfile1        outfile2

Code:
$ cat outfile
Georgia Marrieta 2134 3245
Georgia Scottdale 3414 23456
Georgia Clarkston 2321 34212

Code:
$ cat outfile1
Georgia Atlanta 1234 1234

Code:
$ cat outfile2
Georgia Acworth 12300 1474 1374
Georgia Cadwell 21222 6781 5521


Last edited by ctsgnb; 09-26-2011 at 05:36 PM..
This User Gave Thanks to ctsgnb For This Post:
# 3  
Old 09-26-2011
ctsgnb

Thank you very, very much!
# 4  
Old 09-26-2011
Quote:
Originally Posted by Xterra
Thank you very, very much!
Did you understand the code ?
# 5  
Old 09-26-2011
I am actually dissecting the script now.
Thanks!
# 6  
Old 09-27-2011
nawk '{i=$1 FS $2}build an indice with $1 and $2 for further use in associative array. "FS" stand for Field Separator (by default : a space)
NR==FNR{if scanning the first file
A[i]=$3;retain 3rd field in array A whose indice has previously been build with $1 and $2
B[i]=$4;retain 4th field in array B whose indice has previously been build with $1 and $2
next}skip further instructions and process next line from the beginning of awk instructions
{C[i];retain $1 and $2 of the second file in the indice of an associative array C
if(i in A) if $1 and $2 are found found in first file
{print i FS A[i] FS $3 > "outfile"}log the expected output into outfile
else{print $0 > "outfile2"}otherwise log output in outfile2
next}skip further instructions and reprocess (from the beginning of awk instructions), the next line of input file
END {After both file have been processed
for(k in A)for all key pair $1 $2 of the first file
if (!(k in C))if the key pair does not exist in second file
print k FS A[k] FS B[k] > "outfile1"then log the entry into outfile1
}' f1 f2arguments (input files) of nawk command

Last edited by ctsgnb; 09-27-2011 at 08:45 AM..
This User Gave Thanks to ctsgnb For This Post:
# 7  
Old 09-27-2011
That's very nice!

I really appreciate it! One pretty quick question, what if $2 is not a single word, for instance El Paso?

Last edited by Xterra; 09-27-2011 at 08:54 AM..
 
Login or Register to Ask a Question

Previous Thread | Next Thread

10 More Discussions You Might Find Interesting

1. Shell Programming and Scripting

Extracting information from XML to excel

Hi, I am trying to extract information from a XML file and write it to a excel sheet. I am not sure where to start from. Here is the content from my input XML file. <com.cloudbees.hudson.plugins.folder.properties.FolderProxyGroupContainer plugin="nectar-rbac@4.5"> <groups> ... (4 Replies)
Discussion started by: Sajjadmehdi
4 Replies

2. Shell Programming and Scripting

Perl script to Merge contents of 2 different excel files in a single excel file

All, I have an excel sheet Excel1.xls that has some entries. I have one more excel sheet Excel2.xls that has entries only in those cells which are blank in Excel1.xls These may be in different workbooks. They are totally independent made by 2 different users. I have placed them in a... (1 Reply)
Discussion started by: Anamika08
1 Replies

3. Shell Programming and Scripting

Combining files

Hi I have about 108 files (text files) that end with .avg and each one of these files have a distinct name that describes what is in the file. In each file there is a set of 80 values that are tab separated. I want to combine all 108 files into ONE main file. So each file is named: 1.avg... (5 Replies)
Discussion started by: phil_heath
5 Replies

4. UNIX for Dummies Questions & Answers

Need Help in reading N days files from a Directory & combining the files

Hi All, Request your expertise in tackling one requirement in my project,(i dont have much expertise in Shell Scripting). The requirement is as below, 1) We store the last run date of a process in a file. When the batch run the next time, it should read this file, get the last run date from... (1 Reply)
Discussion started by: dsfreddie
1 Replies

5. UNIX for Dummies Questions & Answers

combining two files

Hi Gurus, I have 2 files: File1 Filename1 xx Filename1 yy Filename1 Total Filename2 xx Filename2 yy Filename2 zz Filename2 Total Filename3 xx Filename3 Total and File2: Filename1 10296 xxx Date: 09/01/08 Filename2 10296 xxx Date: 09/05/08... (36 Replies)
Discussion started by: rock1
36 Replies

6. Shell Programming and Scripting

PERL: Split Excel Workbook to Indiv Excel files

Hi, I am trying to find a way to read an excel work book with multiple worksheets. And write each worksheet into a new excel file using perl. My environment is Unix. For example: I have an excel workbook TEST.xls and it has Sheet1, Sheet2, Sheet3 worksheets. I would like to create... (2 Replies)
Discussion started by: sandeep78
2 Replies

7. UNIX for Dummies Questions & Answers

Combining information from a comma delimited file

I have a comma delimited file which also contains commas in the text. I was wondering how I can combine one whole column of that file with another file. In essence I'm trying to do an excel vlookup in UNIX to return the information from one column in the comma delimited file(containing text commas... (1 Reply)
Discussion started by: vzismann
1 Replies

8. Shell Programming and Scripting

Combining Two Files

Could someone help me reduce the number of runs for a shell program I created? I have two text files below: $ more list1.txt 01 AAA 02 BBB 03 CCC 04 DDD $ more list2.txt 01 EEE 02 FFF 03 GGG I want to combine the lines with the same number to get the below: 01 AAA 01 EEE 02... (4 Replies)
Discussion started by: stevefox
4 Replies

9. Shell Programming and Scripting

Combining Two Files

I have two files which contain data from two different transactions in the same format: <Name> - <Count> My goal is to end up with data in this format after combining the two: <Name> - <Count1> - <Count2> Is this possible to do with awk, or is there something better? Thanks... (3 Replies)
Discussion started by: bat711
3 Replies

10. UNIX for Dummies Questions & Answers

combining files

how will i combine these 2 files below, with the desired output specified below: file1: one two three four file2: red blue yellow green file3: aaa bbb ccc ddd (3 Replies)
Discussion started by: apalex
3 Replies
Login or Register to Ask a Question