DB2 - Performance Issue using MERGE option


 
# 1  
Old 06-04-2018

Dear Team,
I am using DB2 v10.5 and trying to load a large volume of data (10-12 million rows) from a staging table into a target table using the MERGE statement below. The staging table schemaname.Customer_Staging has 12 million records, and the SELECT on its own runs for about 25 minutes. On the first load the target is empty, so the MERGE has to insert every record, but the process hangs after running for 1.5-2 hours without inserting anything.


Code:
MERGE INTO schemaname.Customer_Table A
USING (
    SELECT DISTINCT B.Name, B.CUST_ID, B.ORG_ID
    FROM schemaname.Customer_Staging B
    WHERE (B.Name IS NOT NULL AND LENGTH(B.Name) > 0)
      AND (B.CUST_ID IS NOT NULL AND LENGTH(B.CUST_ID) > 0)
      AND (B.ORG_ID IS NOT NULL AND LENGTH(B.ORG_ID) > 0)
) B
ON B.Name = A.Name AND B.CUST_ID = A.CUST_ID AND B.ORG_ID = A.ORG_ID AND A.Load_dt IS NULL
WHEN NOT MATCHED THEN
    INSERT (Name, CUST_ID, Load_dt, ORG_ID)
    VALUES (B.Name, B.CUST_ID, CURRENT TIMESTAMP - CURRENT TIMEZONE, B.ORG_ID);

The columns are the same in the DDL for both the staging and the target table.
Each table has a unique index on the three columns Name, CUST_ID and ORG_ID, created with the option:
Code:
ALLOW REVERSE SCANS PAGE SPLIT SYMMETRIC


Can someone please help me tune this query or suggest a better approach?
Any help is appreciated.
Thanks
# 2  
Old 06-04-2018
I expect that this is doing what you asked. Although the query looks sensible, it is not very practical at this volume. I think that your request is going to build some huge temporary tables to work out the various clauses before applying any inserts/updates. You may even fill your temporary tablespace before your MERGE completes this step.


You might have some success with indices though. What indices do you have on tables schemaname.Customer_Staging and schemaname.Customer_Table?
  • If you don't have a single index for the three columns you mention in the SELECT DISTINCT, then you might do a full table scan of schemaname.Customer_Staging.
  • If you don't have a single index for all the columns in the whole query (e.g. A.Name, B.Name, A.Load_dt etc.) then you might do a full table scan of the appropriate table.
Anything that forces a full table scan at this scale will be bad. An index might take a while to build and needs space, but then your MERGE should run better. I once spent 20 minutes building an index, after which the process I wanted ran in about an hour, and then I dropped the index again because it was a one-off report. The run without the index was abandoned after over 22 hours (someone set it off and went home).
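As a rough sketch of that build-then-drop pattern in DB2: the index name IX_CUSTOMER_TABLE_MERGE is made up for illustration, and including Load_dt is only an assumption based on the ON clause of the MERGE. Whether the optimizer actually picks the index up is something you would have to check on your own system.

```sql
-- Hypothetical temporary index covering every target column used in the ON clause.
CREATE INDEX schemaname.IX_CUSTOMER_TABLE_MERGE
    ON schemaname.Customer_Table (Name, CUST_ID, ORG_ID, Load_dt)
    ALLOW REVERSE SCANS;

-- Refresh statistics so the optimizer knows about the new index.
CALL SYSPROC.ADMIN_CMD('RUNSTATS ON TABLE schemaname.Customer_Table AND INDEXES ALL');

-- ... run the MERGE here ...

-- Drop the index again if it was only needed for the one-off load.
DROP INDEX schemaname.IX_CUSTOMER_TABLE_MERGE;
```

Keeping the index only for the duration of the load avoids paying for its maintenance on every later insert.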

Is there some sort of query profiler you can use to investigate this? Microsoft SQL Server has one and Oracle has one, so I'm sure that DB2 will have one too, but I've never used it. You need to avoid a full table scan when dealing with this volume of data.
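For what it's worth, DB2 does ship an explain facility. Here is a minimal sketch, assuming the explain tables can be created under your user and that you can run db2exfmt from a shell on a machine with the DB2 client installed. The database name MYDB and the output file name are placeholders.

```sql
-- One-time setup: create the explain tables (default tablespace and schema).
CALL SYSPROC.SYSINSTALLOBJECTS('EXPLAIN', 'C',
     CAST(NULL AS VARCHAR(128)), CAST(NULL AS VARCHAR(128)));

-- Capture the access plan without executing the statement.
EXPLAIN PLAN FOR
MERGE INTO schemaname.Customer_Table A
USING (
    SELECT DISTINCT B.Name, B.CUST_ID, B.ORG_ID
    FROM schemaname.Customer_Staging B
    WHERE (B.Name IS NOT NULL AND LENGTH(B.Name) > 0)
      AND (B.CUST_ID IS NOT NULL AND LENGTH(B.CUST_ID) > 0)
      AND (B.ORG_ID IS NOT NULL AND LENGTH(B.ORG_ID) > 0)
) B
ON B.Name = A.Name AND B.CUST_ID = A.CUST_ID AND B.ORG_ID = A.ORG_ID AND A.Load_dt IS NULL
WHEN NOT MATCHED THEN
    INSERT (Name, CUST_ID, Load_dt, ORG_ID)
    VALUES (B.Name, B.CUST_ID, CURRENT TIMESTAMP - CURRENT TIMEZONE, B.ORG_ID);

-- Then, from a shell, format the most recent plan into a text file:
--   db2exfmt -d MYDB -1 -o merge_plan.txt
```

In the formatted plan, a TBSCAN operator on either table marks a full table scan, while IXSCAN marks index access.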



I hope that this helps,
Robin

Last edited by rbatte1; 06-04-2018 at 08:03 AM.. Reason: Spelling correction.
# 3  
Old 06-06-2018
Hi Robin
First of all, thank you for looking into this and for the helpful suggestions.

I made a few changes:
1. I removed DISTINCT from the select query.
2. I already have indexes on the required columns of both tables, as below.

Code:
-- staging
CREATE UNIQUE INDEX schemaname.PK_Customer_Staging
    ON schemaname.Customer_Staging (Name, CUST_ID, ORG_ID)
    ALLOW REVERSE SCANS PAGE SPLIT SYMMETRIC;

-- target
CREATE UNIQUE INDEX schemaname.PK_Customer_Table
    ON schemaname.Customer_Table (Name, CUST_ID, ORG_ID)
    ALLOW REVERSE SCANS PAGE SPLIT SYMMETRIC;

3. I observed that my database is configured with MAX_LOG = 32 and LOGPRIMARY = 12 (this cannot be increased, as it is the maximum provided by the client's tool), and I do get a transaction log space error on the first load, when no data exists in the target table.
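For a first-time load into an empty table, one option that may sidestep the transaction log limit is the LOAD utility reading from a cursor, since LOAD does not log each inserted row the way INSERT and MERGE do. This is a sketch under assumptions: it needs LOAD authority, it is run from the DB2 command line processor rather than as plain SQL through a client tool, and the cursor name c1 is arbitrary. NONRECOVERABLE means the table cannot be rolled forward from the logs, so it should be backed up afterwards.

```sql
-- Check the current log-related settings first.
SELECT NAME, VALUE
FROM SYSIBMADM.DBCFG
WHERE NAME IN ('logprimary', 'logsecond', 'logfilsiz', 'max_log');

-- Cursor over the staging rows, with the same filters as the MERGE source.
DECLARE c1 CURSOR FOR
    SELECT S.Name, S.CUST_ID, CURRENT TIMESTAMP - CURRENT TIMEZONE, S.ORG_ID
    FROM schemaname.Customer_Staging S
    WHERE (S.Name IS NOT NULL AND LENGTH(S.Name) > 0)
      AND (S.CUST_ID IS NOT NULL AND LENGTH(S.CUST_ID) > 0)
      AND (S.ORG_ID IS NOT NULL AND LENGTH(S.ORG_ID) > 0);

-- CLP command, not SQL: bulk load the cursor's rows into the target table.
LOAD FROM c1 OF CURSOR
    INSERT INTO schemaname.Customer_Table (Name, CUST_ID, Load_dt, ORG_ID)
    NONRECOVERABLE;
```

This only fits the initial load. Later incremental runs, where some rows already exist in the target, would still need MERGE or an INSERT with a NOT EXISTS check.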

Is there any way to split the logic by row count and insert the data in two parts, e.g. the first 5 million rows, then from 5 million up to the max count?
Or is there any other approach to load this much data (8 million rows)? Any help is appreciated.
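One way to split the load by row count is to insert in fixed-size batches and commit after each one, so that no single transaction has to fit the whole load into the log. The sketch below is not a tested solution. The batch size of 500000 is a guess to tune against your log space, v_rows is a made-up variable name, and the NOT EXISTS check matches on the three key columns only (it leaves out the Load_dt IS NULL condition from the original ON clause), which is an assumption about the intent. It uses an anonymous compound statement, so it needs a statement terminator other than the semicolon (@ here).

```sql
--#SET TERMINATOR @
BEGIN
    DECLARE v_rows INTEGER DEFAULT 1;

    -- Keep inserting batches until a pass inserts nothing.
    WHILE v_rows > 0 DO
        INSERT INTO schemaname.Customer_Table (Name, CUST_ID, Load_dt, ORG_ID)
        SELECT S.Name, S.CUST_ID, CURRENT TIMESTAMP - CURRENT TIMEZONE, S.ORG_ID
        FROM schemaname.Customer_Staging S
        WHERE (S.Name IS NOT NULL AND LENGTH(S.Name) > 0)
          AND (S.CUST_ID IS NOT NULL AND LENGTH(S.CUST_ID) > 0)
          AND (S.ORG_ID IS NOT NULL AND LENGTH(S.ORG_ID) > 0)
          -- Skip rows already loaded by an earlier batch.
          AND NOT EXISTS (
              SELECT 1
              FROM schemaname.Customer_Table T
              WHERE T.Name = S.Name
                AND T.CUST_ID = S.CUST_ID
                AND T.ORG_ID = S.ORG_ID
          )
        FETCH FIRST 500000 ROWS ONLY;

        -- Number of rows the INSERT above just added.
        GET DIAGNOSTICS v_rows = ROW_COUNT;

        -- Commit each batch so the log space can be reused.
        COMMIT;
    END WHILE;
END@
```

Because each batch re-checks the target through the unique index on (Name, CUST_ID, ORG_ID), the loop can be restarted after a failure without inserting duplicates.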