awk concatenation issue - SQL generation


 
Thread Tools Search this Thread
Top Forums Shell Programming and Scripting awk concatenation issue - SQL generation
# 1  
Old 04-11-2016
awk concatenation issue - SQL generation

Greetings Experts,

I have an excel file and I am unable to read it directly into awk (contains , " etc); So, I cleansed and copied the data into notepad.

I need to generate a script that generates the SQL.

Requirement:
1. Filter and select only the data that has the "mapping" as "direct"
2. Add COALESCE and NVL (with sample value of 11111) for all columns according to datatype. I will change it appropriately wherever applicable
3. Whenever the source and target table names are same the queries need to be concatenated with "," so that I validate all relative columns at one pass for each combination of source and target.
4. I will add the required join conditions based on the source and target table names manually;

From the intended output, t1c4 column query is separated as the source table is different; t2c2 is separated as the target table is different;

Input File:
Code:
SRC_TABLE  SRC_COL  SRC_DATATYPE  MAPPING  TGT_TABLE  TGT_COL  TGT_DATATYPE
SRC_TBL1    S1C1        DATE        DIRECT     TGT_TBL1   T1C1  CHAR(6)YYYYMM
SRC_TBL1    S1C2        VARCHAR     DIRECT     TGT_TBL1   T1C2  INTEGER
SRC_TBL1    S1C3        DATE        OTHER      TGT_TBL1   T1C3  INTEGER
SRC_TBL2    S2C4        INTEGER     DIRECT     TGT_TBL1   T1C4  INTEGER
SRC_TBL2    S2C2        INTEGER     DIRECT     TGT_TBL2   T2C2  INTEGER

Intended Output:
Code:
SUM(CASE WHEN COALESCE(TO_CHAR(SRC_TBL1.S1C1,'YYYYMMDD'), '111101' = COALESCE(TGT_TBL1.T1C1,'111101') THEN 0 ELSE 1 END) COL_T1C1,
SUM(CASE WHEN COALESCE(SRC_TBL1.S1C2,'11111')=COALESCE(TGT_TBL1.T1C2,'11111') THEN 0 ELSE 1 END) COL_T1C2

Code:
SUM(CASE WHEN NVL(SRC_TBL2.S2C4,11111)=NULL(TGT_TBL1.T1C4,11111) THEN 0 ELSE 1 END) COL_T1C4

Code:
SUM(CASE WHEN NVL(SRC_TBL2.S2C2,11111)=NVL(TGT_TBL2.T2C2,11111) THEN 0 ELSE 1 END) COL_T2C1

What I have done:
Code:
awk -F " " ' {
if((tolower($4) ~ /direct/) && (toupper($3) ~ /DATE/) && (toupper($7) ~ /CHAR\(6/)) {
str="SUM(CASE WHEN COALESCE(TO_CHAR("$1"."$2",\47YYYYMM\47), \47777707\47)=COALESCE("$5"."$6 ",\47777707\47) THEN 0 ELSE 1 END) COL_"$6
a[$1 OFS $5]=(!a[$1 OFS $5])?a[$1 OFS $5]","str:str

if((tolower($4) ~ /direct/) && (toupper($3) ~ /CHAR/)) && (toupper($7) ~ /CHAR\(6/) {
str="SUM(CASE WHEN COALESCE("$1"."$2",\47777707\47)=COALESCE("$5"."$6 ",\47777707\47) THEN 0 ELSE 1 END) COL_"$6
a[$1 OFS $5]=(!a[$1 OFS $5])?a[$1 OFS $5]","str:str
#.......similarly handling other data types

}
}
END {
for (i in a)
{
print "source and target table names are " i
print "--------------"
print "excerpt of the query is" a[i]
print "%%%%%%%%%%%%%%%%%%%%"
}
}' < input_file.txt > output_file.txt

Excuse any syntax issues as I was not able to copy/paste the code;
My output doesn't seem to concatenate when there are 2 or more than eligible records for the same source and target table name; It has only one target_column no matter how many eligible records are there; For eg: the first source and target table names gives this result
My output through code:
Code:
SUM(CASE WHEN COALESCE(SRC_TBL1.S1C2,'11111')=COALESCE(TGT_TBL1.T1C2,'11111') THEN 0 ELSE 1 END) COL_T1C2

I am not able to figure it out as I am learning awk recently through the forums posts. Also, I am not able to have single quote (') as part of the query directly using ' (tried with backslash,'',''','''') still not working; after searching, used \47 and it breaks when used with numeric chars as \47777777\47
Can you please tell me how to have ' with out using awk variables at the command line or BEGIN and with using awk variables;
Many thanks for your time..
# 2  
Old 04-11-2016
Hmmm - feeling a bit lost here. Nevertheless, some hints:
- for the single quote, either use \047 or terminate the string and immediately reopen it: "...\47" "777707\47"
- as the input consistently seems to be upper case, you might be able to drop the to_lower(...) and to_upper(...) conversions.
- it would concatenate if it got there, but it can't as the respective branch is chosen max once. There IS just nothing to concatenate.

Last edited by RudiC; 04-11-2016 at 04:05 PM..
# 3  
Old 04-11-2016
This may point you in a direction:
Code:
awk '
$4 == "DIRECT"  {IX = $1 OFS $5
                 if ($3 $7 ~ /DATE.*CHAR\(6/)
                        str="SUM(CASE WHEN COALESCE(TO_CHAR("$1"."$2",\47""YYYYMM\47), \47""777707\47)=COALESCE("$5"."$6 ",\47""777707\47) THEN 0 ELSE 1 END) COL_"$6

                 if ($3 $7 ~ /CHAR.*INTEGER/)
                        str="SUM(CASE WHEN COALESCE("$1"."$2",\47""777707\47)=COALESCE("$5"."$6 ",\47""777707\47) THEN 0 ELSE 1 END) COL_"$6

                 a[IX] = (a[IX])?a[IX] "," str:str
                }

END     {for (i in a)   {print "source and target table names are " i
                         print "--------------"
                         print "excerpt of the query is " a[i]
                         print "%%%%%%%%%%%%%%%%%%%%"
                        }
        }
' < file

This User Gave Thanks to RudiC For This Post:
# 4  
Old 04-12-2016
Thank you RudiC; that worked
# 5  
Old 04-12-2016
Glad it helped - but it was not meant to work but to indicate how you (c|sh)ould continue your development efforts. Anyhow: welcome!
Login or Register to Ask a Question

Previous Thread | Next Thread

10 More Discussions You Might Find Interesting

1. Shell Programming and Scripting

Using awk and grep for sql generation

Hi, I have a file pk.txt which has table related data in following format TableName | PK Employee | id Contact|name,phone,country I have another file desc.txt which lists datatype of each field like this: Table|Field|Type Employee|id|int Contact|name|string Contact|country|string... (7 Replies)
Discussion started by: wahi80
7 Replies

2. Shell Programming and Scripting

Issue in Concatenation/Joining of lines in a dynamically generated file

Hi, I have a file containing many records delimited by pipe (|). Each record should contain 17 columnns/fields. there are some fields having fields less than 17.So i am extracting those records to a file using the below command awk 'BEGIN {FS="|"} NF !=17 {print}' feedfile.txt... (8 Replies)
Discussion started by: TomG
8 Replies

3. Shell Programming and Scripting

Concatenation in awk not working

Hello I want to achieve the following. However the concatenation is not working mv `ls -ltr *myfile*.log|awk '{print $9}'` `ls -ltr *myfile*.log|awk '{print `date +'%d%m%y%k%M%S'` $9}'` I tried awk '{x=`date +'%d%m%y%k%M%S'` print $x "" $9}' awk '{x=`date +'%d%m%y%k%M%S'`... (2 Replies)
Discussion started by: Chetanz
2 Replies

4. UNIX for Dummies Questions & Answers

awk for concatenation of column values

Hello, I have a table as shown below. I want to concatenate values in col2 and col3 based on a value in col4. 1 X Y A 3 Y Z B 4 A W B 5 T W A If col4 is A, then I want to concatenate col3 with itself. Otherwise it should concateneate col2 with col3. 1 X Y YY 3 Y Z YZ... (10 Replies)
Discussion started by: Gussifinknottle
10 Replies

5. Shell Programming and Scripting

Random word generation with AWK

Hi - I have a word GTTCAGAGTTCTACAGTCCGACGAT I need to extract all the possible "chunks" of 7 or above letter "words" from this. SO, my out put should be GTTCAGA TTCAGAG TCAGAGT CAGAGTTCT TCCGACGAT CAGTCCGACG etc. How can I do that with awk or any other language? I have no... (2 Replies)
Discussion started by: polsum
2 Replies

6. Shell Programming and Scripting

Awk concatenation in different lines

Hi All I have the data as id-number 01 name-id x0 input-id x0 output-id x0 name-id x0 input-id x0 output-id x0 name-id x0 input-id x0 output-id x0 id-number 02 name-id x0 input-id x0 output-id x0 name-id x0 input-id x0 output-id x0 name-id x0 input-id x0 output-id x0 . . I... (4 Replies)
Discussion started by: posner
4 Replies

7. Shell Programming and Scripting

Dynamic command line generation with awk

Hi, I'm not an expert in awk but i need a simple script to do this: I'd like to AutoCrop PDF files. I found 2 simple script that combined together could help me to automatize :) The first utiliti is "pdfinfo" that it gives the MediaBox and TrimBox values from the pdf. The pdfinfo output... (8 Replies)
Discussion started by: gbagagli
8 Replies

8. Shell Programming and Scripting

String concatenation issue in ksh

Hello All, I'm tryying to concatenate string and variables value in ksh, but i'm unable to do it, can someone please help in rectifying my error, here is the code i have written, #!/usr/bin/ksh -x cat $1 | while read fileline do val1= echo $fileline | awk -F, '{print $1}' val2= echo... (3 Replies)
Discussion started by: arvindcgi
3 Replies

9. Shell Programming and Scripting

cannot get logic for concatenation awk

Hello friends, I have a problem in printing an array.. Example if my array line contains 4 elements like following line=0002 , line=202200, line=200002, line= 300313 Now one = sprintf line line line line will concatenate my whole array to one. But I am not sure about the... (7 Replies)
Discussion started by: user_prady
7 Replies

10. UNIX for Dummies Questions & Answers

SQL Loader Auto Number Generation

Hi all, I have a doubt in SQL Loader. We have SEQUENCE function in SQL Loader or can create Sequence in Oracle database for generating a number sequence for a column while loading data using SQL Loader into table or multiple tables. My requirment is this. For the first run in SQL... (2 Replies)
Discussion started by: vinoth_kumar
2 Replies
Login or Register to Ask a Question