Merge Multiple Files and Transpose


 
Thread Tools Search this Thread
Top Forums Shell Programming and Scripting Merge Multiple Files and Transpose
# 1  
Old 07-09-2010
Merge Multiple Files and Transpose

Looking to join three files and then transpose some columns from multiple rows into a single row.

Code:
File Info:

FIELD TERMINATED BY '^'
ENCLOSED BY '~'
LINE TERMINATED BY '\r\n'

FIRST FILE (FOOD_DES.txt)
Code:
~01001~^~0100~^~Butter, salted~^~BUTTER,WITH SALT~^~~^~~^~Y~^~~^0^~~^6.38^4.27^8.79^3.87
~01002~^~0100~^~Butter, whipped, with salt~^~BUTTER,WHIPPED,WITH SALT~^~~^~~^~Y~^~~^0^~~^6.38^4.27^8.79^3.87
~01003~^~0100~^~Butter oil, anhydrous~^~BUTTER OIL,ANHYDROUS~^~~^~~^~Y~^~~^0^~~^6.38^4.27^8.79^3.87
~01004~^~0100~^~Cheese, blue~^~CHEESE,BLUE~^~~^~~^~Y~^~~^0^~~^6.38^4.27^8.79^3.87
~01005~^~0100~^~Cheese, brick~^~CHEESE,BRICK~^~~^~~^~Y~^~~^0^~~^6.38^4.27^8.79^3.87
~01006~^~0100~^~Cheese, brie~^~CHEESE,BRIE~^~~^~~^~Y~^~~^0^~~^6.38^4.27^8.79^3.87
~01007~^~0100~^~Cheese, camembert~^~CHEESE,CAMEMBERT~^~~^~~^~Y~^~~^0^~~^6.38^4.27^8.79^3.87
~01008~^~0100~^~Cheese, caraway~^~CHEESE,CARAWAY~^~~^~~^~~^~~^0^~~^6.38^4.27^8.79^3.87
~01009~^~0100~^~Cheese, cheddar~^~CHEESE,CHEDDAR~^~~^~~^~Y~^~~^0^~~^6.38^4.27^8.79^3.87
~01010~^~0100~^~Cheese, cheshire~^~CHEESE,CHESHIRE~^~~^~~^~~^~~^0^~~^6.38^4.27^8.79^3.87

Matches to Second File by first field of each

SECOND FILE (NUT_DATA.txt)
Code:
~01001~^~203~^0.85^16^0.074^~1~^~~^~~^~~^^^^^^^~~^
~01001~^~204~^81.11^580^0.065^~1~^~~^~~^~~^^^^^^^~~^
~01001~^~205~^0.06^0^^~4~^~NC~^~~^~~^^^^^^^~~^
~01001~^~207~^2.11^35^0.054^~1~^~~^~~^~~^^^^^^^~~^
~01002~^~203~^0.85^16^0.074^~1~^~~^~~^~~^^^^^^^~~^
~01002~^~204~^81.11^580^0.065^~1~^~~^~~^~~^^^^^^^~~^
~01002~^~205~^0.06^0^^~4~^~NC~^~~^~~^^^^^^^~~^
~01002~^~207~^2.11^35^0.054^~1~^~~^~~^~~^^^^^^^~~^
~01003~^~203~^0.28^1^^~1~^~~^~~^~~^^^^^^^~~^
~01003~^~204~^99.48^1^^~1~^~~^~~^~~^^^^^^^~~^
~01003~^~205~^0.00^0^^~1~^~~^~~^~~^^^^^^^~~^
~01003~^~207~^0.00^1^^~1~^~~^~~^~~^^^^^^^~~^
~01004~^~203~^21.40^19^0.339^~1~^~~^~~^~~^^^^^^^~~^
~01004~^~204~^28.74^20^0.399^~1~^~~^~~^~~^^^^^^^~~^
~01004~^~205~^2.34^0^^~4~^~NC~^~~^~~^^^^^^^~~^
~01004~^~207~^5.11^15^0.138^~1~^~~^~~^~~^^^^^^^~~^
~01005~^~203~^23.24^14^0.607^~1~^~~^~~^~~^^^^^^^~~^
~01005~^~204~^29.68^16^0.319^~1~^~~^~~^~~^^^^^^^~~^
~01005~^~205~^2.79^0^^~4~^~NC~^~~^~~^^^^^^^~~^
~01005~^~207~^3.18^14^0.178^~1~^~~^~~^~~^^^^^^^~~^
~01006~^~203~^20.75^7^0.210^~1~^~~^~~^~~^^^^^^^~~^
~01006~^~204~^27.68^4^1.841^~1~^~~^~~^~~^^^^^^^~~^
~01006~^~205~^0.45^0^^~4~^~NC~^~~^~~^^^^^^^~~^
~01006~^~207~^2.70^1^^~1~^~~^~~^~~^^^^^^^~~^
~01007~^~203~^19.80^12^0.367^~1~^~~^~~^~~^^^^^^^~~^
~01007~^~204~^24.26^13^0.608^~1~^~~^~~^~~^^^^^^^~~^
~01007~^~205~^0.46^0^^~4~^~NC~^~~^~~^^^^^^^~~^
~01007~^~207~^3.68^8^0.135^~1~^~~^~~^~~^^^^^^^~~^
~01008~^~203~^25.18^4^0.189^~1~^~~^~~^~~^^^^^^^~~^
~01008~^~204~^29.20^4^1.389^~1~^~~^~~^~~^^^^^^^~~^
~01008~^~205~^3.06^0^^~4~^~NC~^~~^~~^^^^^^^~~^
~01008~^~207~^3.28^4^0.155^~1~^~~^~~^~~^^^^^^^~~^
~01009~^~203~^24.90^42^0.275^~1~^~~^~~^~~^^^^^^^~~^
~01009~^~204~^33.14^106^0.206^~1~^~~^~~^~~^^^^^^^~~^
~01009~^~205~^1.28^0^^~4~^~NC~^~~^~~^^^^^^^~~^
~01009~^~207~^3.93^79^0.146^~1~^~~^~~^~~^^^^^^^~~^
~01010~^~203~^23.37^12^^~1~^~~^~~^~~^^^^^^^~~^
~01010~^~204~^30.60^4^^~1~^~~^~~^~~^^^^^^^~~^
~01010~^~205~^4.78^0^^~4~^~NC~^~~^~~^^^^^^^~~^
~01010~^~207~^3.60^3^0.101^~1~^~~^~~^~~^^^^^^^~~^

Matches to Third File by Second Field of Second File to First Field of Third File

THIRD FILE (NUTR_DEF.txt)
Code:
~203~^~g~^~PROCNT~^~Protein~^~2~^~600~
~204~^~g~^~FAT~^~Total lipid (fat)~^~2~^~800~
~205~^~g~^~CHOCDF~^~Carbohydrate, by difference~^~2~^~1100~
~207~^~g~^~ASH~^~Ash~^~2~^~1000~

DESIRED OUTPUT (only going to put together the first two as an example)
Code:
1^~01001~^~0100~^~Butter, salted~^0.85^~g~^81.11^~g~^0.06^~g~^2.11^~g~
2^~01002~^~0100~^~Butter, whipped, with salt~^0.85^~g~^81.11^~g~^0.06^~g~^2.11^~g~

EXPLANATION OF DESIRED OUTPUT
Code:
counter^~first column, first file~^~second column, first file~^~third column, first file~^~third column, second file~^~second column, third file~

Hope you have fun with this one...

---------- Post updated at 01:21 PM ---------- Previous update was at 01:11 PM ----------

I have this data in a MySQL database. Ideally I'd like to have this as a view, but I'm not very good with SQL and I have no idea how to do the transpose portion of this merge.

Anyway if you're interested in working this out in sql I will include a script to create the tables and import the data.

Code:
CREATE DATABASE NDB IF NOT EXISTS;

DROP TABLE IF EXISTS FOOD_DES, NUT_DATA, NUTR_DEF;

CREATE TABLE FOOD_DES
 (
  NDB_No      INT,
  FdGrp_Cd      VARCHAR(8),
  Long_Desc     VARCHAR(400),
  Shrt_Desc     VARCHAR(120),
  ComName     VARCHAR(200),
  ManufacName     VARCHAR(130),
  Survey      VARCHAR(2),
  Ref_Desc      VARCHAR(270),
  Refuse      INTEGER,
  SciName     VARCHAR(130),
  N_Factor      FLOAT,
  Pro_Factor      FLOAT,
  Fat_Factor      FLOAT,
  CHO_Factor      DOUBLE,
    INDEX (NDB_No)
);

CREATE TABLE NUT_DATA
 (
  NDB_No      INT,
  Nutr_No     INT,
  Nutr_Val      DOUBLE,
  Num_Data_Pts      INTEGER,
  Std_Error     DOUBLE,
  Src_Cd      VARCHAR(4),
  Deriv_Cd      VARCHAR(8),
  Ref_NDB_No      INT,
  Add_Nutr_Mark     VARCHAR(2),
  Num_Studies     INTEGER,
  Min     DOUBLE,
  Max     DOUBLE,
  DF      DOUBLE,
  Low_EB      DOUBLE,
  Up_EB     DOUBLE,
  Stat_Cmt      VARCHAR(20),
    INDEX (NDB_No)
);


CREATE TABLE NUTR_DEF
 (
  Nutr_No     INT,
  Units     VARCHAR(14),
  Tagname     VARCHAR(40),
  NutrDesc      VARCHAR(120),
  Num_Dec     VARCHAR(2),
  SR_Order      FLOAT,
     INDEX (Nutr_No)
);

LOAD DATA INFILE 'FOOD_DES.txt' INTO TABLE FOOD_DES FIELDS TERMINATED BY '^' ENCLOSED BY '~' LINES TERMINATED BY '\r\n';

LOAD DATA INFILE 'NUT_DATA.txt' INTO TABLE NUT_DATA FIELDS TERMINATED BY '^' ENCLOSED BY '~' LINES TERMINATED BY '\r\n';

LOAD DATA INFILE 'NUTR_DEF.txt' INTO TABLE NUTR_DEF FIELDS TERMINATED BY '^' ENCLOSED BY '~' LINES TERMINATED BY '\r\n';

# 2  
Old 07-09-2010
Hello, mkastin:

f=file being processed
nunit=unit used for this component of nutritional data (i assume 'g' is for grams)
ndata=nutritional data

The order of the file arguments at the end of the command is critical.

Code:
awk '
  BEGIN {FS=OFS="^"}
  FNR==1 {++f}
  f==1 {nunit[$1]=$2}
  f==2 {ndata[$1]=ndata[$1] FS $3 FS nunit[$2]}
  f==3 {print FNR,$1,$2,$3 ndata[$1]}' \
NUTR_DEF.txt NUT_DATA.txt FOOD_DES.txt



Test run using your data:
Code:
$ awk '
>   BEGIN {FS=OFS="^"}
>   FNR==1 {++f}
>   f==1 {nunit[$1]=$2}
>   f==2 {ndata[$1]=ndata[$1] FS $3 FS nunit[$2]}
>   f==3 {print FNR,$1,$2,$3 ndata[$1]}' \
> NUTR_DEF.txt NUT_DATA.txt FOOD_DES.txt
1^~01001~^~0100~^~Butter, salted~^0.85^~g~^81.11^~g~^0.06^~g~^2.11^~g~
2^~01002~^~0100~^~Butter, whipped, with salt~^0.85^~g~^81.11^~g~^0.06^~g~^2.11^~g~
3^~01003~^~0100~^~Butter oil, anhydrous~^0.28^~g~^99.48^~g~^0.00^~g~^0.00^~g~
4^~01004~^~0100~^~Cheese, blue~^21.40^~g~^28.74^~g~^2.34^~g~^5.11^~g~
5^~01005~^~0100~^~Cheese, brick~^23.24^~g~^29.68^~g~^2.79^~g~^3.18^~g~
6^~01006~^~0100~^~Cheese, brie~^20.75^~g~^27.68^~g~^0.45^~g~^2.70^~g~
7^~01007~^~0100~^~Cheese, camembert~^19.80^~g~^24.26^~g~^0.46^~g~^3.68^~g~
8^~01008~^~0100~^~Cheese, caraway~^25.18^~g~^29.20^~g~^3.06^~g~^3.28^~g~
9^~01009~^~0100~^~Cheese, cheddar~^24.90^~g~^33.14^~g~^1.28^~g~^3.93^~g~
10^~01010~^~0100~^~Cheese, cheshire~^23.37^~g~^30.60^~g~^4.78^~g~^3.60^~g~

Regards,
Alister
This User Gave Thanks to alister For This Post:
# 3  
Old 07-09-2010
Many thanks Alister.

Your guess about the units is correct.

These files are actually excerpts for the USDA Nutrient Database available here:

http://www.nal.usda.gov/fnic/foodcomp/search/

Last edited by mkastin; 07-09-2010 at 05:36 PM..
Login or Register to Ask a Question

Previous Thread | Next Thread

10 More Discussions You Might Find Interesting

1. UNIX for Beginners Questions & Answers

Merge Multiple html files into one

Hi all I have written some code to write my output in html. As i have multiple servers, need to generate single html file. but my code is generating html file for each server. I have merged the files using below code. cat /home/*_FinalData.html > /home/MergedFinalData.html But how to... (1 Reply)
Discussion started by: Snehasish
1 Replies

2. Shell Programming and Scripting

Merge multiple files with common header

Hi all, Say i have multiple files x1 x2 x3 x4, all with common header (date, time, year, age),, How can I merge them to one singe file "X" in shell scripting Thanks for your suggestions. (2 Replies)
Discussion started by: msarguru
2 Replies

3. Shell Programming and Scripting

Merge columns from multiple files

Hello and Good day I have a lot of files with same number of rows and columns.$2 and $3 are the same in all files . I need to merge $2,$3,$6 from first file and $6 from another files. File1: $1 $2 $3 $4 $5 $6... (8 Replies)
Discussion started by: ali.seifaddini
8 Replies

4. Shell Programming and Scripting

Merge the multiple text files into one file

Hi All, I am trying to merge all the text files into one file using below snippet cat /home/Temp/Test/Log/*.txt >> all.txt But it seems it is not working. I have multiple files like Output_ServerName1.txt, Output_ServreName2.txt I want to merge each file into one single file and... (6 Replies)
Discussion started by: sharsour
6 Replies

5. UNIX for Dummies Questions & Answers

Merge columns from multiple files

Hi all, I've searched the web for a long time trying to figure out how to merge columns from multiple files. I know paste will append columns like so: paste file1 file2 file3 file4 file5 ... But this becomes inconvenient when you want to append a large number of files into a single file. ... (2 Replies)
Discussion started by: torchij
2 Replies

6. Shell Programming and Scripting

Merge column headers and transpose

Hello Everyone! I am new on this forum and this is my first post. I wish to apologize for my, not canonical, English. I would like to solve this problem but I have no clue of how do it!I will be grateful if someone could help me! I have a table like this: gene TF1 TF2 TF3 TF4 gene1 1 2 3 4... (5 Replies)
Discussion started by: giuliangiuseppe
5 Replies

7. UNIX for Dummies Questions & Answers

Merge multiple files

Hi All, How can I merge 3rd column of multiple files into 1 file, the column header in the merged file being the name of the file from which the 3rd column was taken. The first 2 columns of all the files are exactly same. Thanks for your help ! (3 Replies)
Discussion started by: newbie83
3 Replies

8. UNIX for Advanced & Expert Users

merge two column multiple files into one

Hi I have multiple files each with two columns and I need to combine all those file into a tab delimited file. (multiple entry with same name separated by a comma) The content of the files are as follows: --- file1.txt: name var1 aaa xx aaa gg bbb yy ddd zz --- file2.txt ... (8 Replies)
Discussion started by: mary271
8 Replies

9. Shell Programming and Scripting

Help to merge multiple .sql files

Hello all, I have a shell script that uses multiple .sql files. These .sql files mainly contain Oracle SQL queries to pull fields from the database. I want to place all the contents of these .sql files in one .sql file and have some parameter sent based on which the respective block or query... (6 Replies)
Discussion started by: snvniranjanrao
6 Replies

10. UNIX for Advanced & Expert Users

Merge multiple .so files

Hi all, I am developing an application in Tcl, inwhich i have to load many modules written in C. I am converting those C modules into shared object(.so) files, and wrap it with my application using SWIG, for which i had the interface file. Now my question is, i have two different... (2 Replies)
Discussion started by: senthilvnr
2 Replies
Login or Register to Ask a Question