06-04-2018
I expect that this is doing what you asked, and although it seems a sensible query, it is not very practical. I think that your request is going to build some huge temporary tables to work out the various clauses before applying any inserts or updates. You may even fill your temporary tablespace before your MERGE completes this step.
You might have some success with indices though. What indices do you have on table
schemaname.Customer_Staging and
schemaname.Customer_Staging?
- If you don't have a single index for the three columns you mention in the SELECT DISTINCT, then you might do a full table scan of schemaname.Customer_Staging.
- If you don't have a single index for all the columns in the whole query (e.g. A.Name, B.Name, A.Load_dt, etc.) then you might do a full table scan of the appropriate table.
Anything that forces a full table scan at this scale will be bad. An index might take a while to build and needs space, but your MERGE should then run better. I once spent 20 minutes building an index, after which the process I wanted ran in about an hour; I then dropped the index again because it was a one-off report. The run without the index was abandoned after more than 22 hours (someone set it off and went home).
Is there some sort of query profiler you can use to check this? Microsoft SQL Server has one and Oracle has one, so I'm sure that DB2 will have one, but I have never used it. You need to avoid a full table scan when dealing with this volume of data.
I hope that this helps,
Robin
Last edited by rbatte1; 06-04-2018 at 08:03 AM..
Reason: Spelling correction.
CLUSTER(7) SQL Commands CLUSTER(7)
NAME
CLUSTER - cluster a table according to an index
SYNOPSIS
CLUSTER [VERBOSE] tablename [ USING indexname ]
CLUSTER [VERBOSE]
DESCRIPTION
CLUSTER instructs PostgreSQL to cluster the table specified by tablename based on the index specified by indexname. The index must already
have been defined on tablename.
When a table is clustered, it is physically reordered based on the index information. Clustering is a one-time operation: when the table is
subsequently updated, the changes are not clustered. That is, no attempt is made to store new or updated rows according to their index
order. (If one wishes, one can periodically recluster by issuing the command again. Also, setting the table's FILLFACTOR storage parameter
to less than 100% can aid in preserving cluster ordering during updates, since updated rows are preferentially kept on the same page.)
When a table is clustered, PostgreSQL remembers which index it was clustered by. The form CLUSTER tablename reclusters the table using the
same index as before.
CLUSTER without any parameter reclusters all the previously-clustered tables in the current database that the calling user owns, or all
such tables if called by a superuser. This form of CLUSTER cannot be executed inside a transaction block.
When a table is being clustered, an ACCESS EXCLUSIVE lock is acquired on it. This prevents any other database operations (both reads and
writes) from operating on the table until the CLUSTER is finished.
PARAMETERS
tablename
The name (possibly schema-qualified) of a table.
indexname
The name of an index.
VERBOSE
Prints a progress report as each table is clustered.
NOTES
In cases where you are accessing single rows randomly within a table, the actual order of the data in the table is unimportant. However, if
you tend to access some data more than others, and there is an index that groups them together, you will benefit from using CLUSTER. If
you are requesting a range of indexed values from a table, or a single indexed value that has multiple rows that match, CLUSTER will help
because once the index identifies the table page for the first row that matches, all other rows that match are probably already on the same
table page, and so you save disk accesses and speed up the query.
During the cluster operation, a temporary copy of the table is created that contains the table data in the index order. Temporary copies of
each index on the table are created as well. Therefore, you need free space on disk at least equal to the sum of the table size and the
index sizes.
Because CLUSTER remembers the clustering information, one can cluster the tables one wants clustered manually the first time, and set up a
timed event similar to VACUUM so that the tables are periodically reclustered.
Because the planner records statistics about the ordering of tables, it is advisable to run ANALYZE [analyze(7)] on the newly clustered
table. Otherwise, the planner might make poor choices of query plans.
There is another way to cluster data. The CLUSTER command reorders the original table by scanning it using the index you specify. This can
be slow on large tables because the rows are fetched from the table in index order, and if the table is disordered, the entries are on
random pages, so there is one disk page retrieved for every row moved. (PostgreSQL has a cache, but the majority of a big table will not fit
in the cache.) The other way to cluster a table is to use:
CREATE TABLE newtable AS
SELECT * FROM table ORDER BY columnlist;
which uses the PostgreSQL sorting code to produce the desired order; this is usually much faster than an index scan for disordered data.
Then you drop the old table, use ALTER TABLE ... RENAME to rename newtable to the old name, and recreate the table's indexes. The big
disadvantage of this approach is that it does not preserve OIDs, constraints, foreign key relationships, granted privileges, and other
ancillary properties of the table -- all such items must be manually recreated. Another disadvantage is that this way requires a sort temporary
file about the same size as the table itself, so peak disk usage is about three times the table size instead of twice the table size.
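The copy-sort-rename sequence just described can be sketched end to end as follows. This is an illustration only: it runs against an in-memory SQLite database through Python's sqlite3 module instead of PostgreSQL, and the emp_id and dept columns are assumptions added so the example has something to sort. The names employees and employees_ind are borrowed from the EXAMPLES section.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Hypothetical table and index; the columns are invented for this sketch.
conn.execute("CREATE TABLE employees (emp_id INTEGER, dept TEXT)")
conn.execute("CREATE INDEX employees_ind ON employees (dept)")
conn.executemany(
    "INSERT INTO employees VALUES (?, ?)",
    [(1, "sales"), (2, "dev"), (3, "sales"), (4, "admin"), (5, "dev")],
)

# Step 1: write a sorted copy of the table.
conn.execute(
    "CREATE TABLE newtable AS SELECT * FROM employees ORDER BY dept, emp_id"
)

# Step 2: drop the old table (in SQLite its indexes are dropped with it).
conn.execute("DROP TABLE employees")

# Step 3: give the sorted copy the old name.
conn.execute("ALTER TABLE newtable RENAME TO employees")

# Step 4: recreate the index; constraints, privileges and so on would
# also have to be recreated by hand at this point.
conn.execute("CREATE INDEX employees_ind ON employees (dept)")

# Read the rows back in physical (rowid) order to inspect the layout.
stored_order = [
    row[0] for row in conn.execute("SELECT dept FROM employees ORDER BY rowid")
]
index_names = [
    row[0]
    for row in conn.execute("SELECT name FROM sqlite_master WHERE type = 'index'")
]
print(stored_order)
print(index_names)
```

On PostgreSQL the same four statements would be issued through psql or a client library, followed by ANALYZE on the rebuilt table as advised above.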
EXAMPLES
Cluster the table employees on the basis of its index employees_ind:
CLUSTER employees USING employees_ind;
Cluster the employees table using the same index that was used before:
CLUSTER employees;
Cluster all tables in the database that have previously been clustered:
CLUSTER;
COMPATIBILITY
There is no CLUSTER statement in the SQL standard.
The syntax
CLUSTER indexname ON tablename
is also supported for compatibility with pre-8.3 PostgreSQL versions.
SEE ALSO
clusterdb [clusterdb(1)]
SQL - Language Statements 2010-05-14 CLUSTER(7)