Go back to OCM 12c Preparation Project
Hands On Lab – Index
[1] Star schema configuration
1.1 STAR_TRANSFORMATION_ENABLED Behaviour with False and True
1.2 We can force the optimization processing using the Hint star STAR_TRANSFORMATION.
1.3 STAR_TRANSFORMATION_ENABLED Behaviour with temp_disable
[2] Optimizing Star Queries
2.1 Tuning Star Queries
2.2 How Oracle Chooses to Use Star Transformation
2.3 Star Transformation Restrictions
2.4 Optimizing Star Queries Using VECTOR GROUP BY Aggregation
—————————————————————————-
HOL: Create two dimension and one fact table and create query that
uses start transformation and optimize the star transformation query.
—————————————————————————-
[1] Star schema configuration It is quite common in type DB Data Warehouse have star schema. A fact table (FACT TABLE) relates to several tables of dimensions (DIMENSION TABLES). About Facts and Dimensions in Star Schemas Star schemas divide data into facts and dimensions Facts are the measurements of some event such as a sale and are typically numbers. Dimensions are the categories you use to identify facts, such as date, location, and product. The fact table is sales. sales shows columns amount_sold and quantity_sold. times,channels,products and customers are Dimensions The important guidelines that must be followed to optimize sentences star are using Bitmap indexes for each of the foreign keys of the fact table and enable STAR_TRANSFORMATION_ENABLED parameter, which is set by default to FALSE. Let's see how to enable and check. 1.1 STAR_TRANSFORMATION_ENABLED Syntax : STAR_TRANSFORMATION_ENABLED = { FALSE | TRUE | TEMP_DISABLE } Default value : FALSE Modifiable : ALTER SESSION, ALTER SYSTEM Modifiable in a PDB : Yes STAR_TRANSFORMATION_ENABLED determines whether a cost-based query transformation will be applied to star queries. Values FALSE : The transformation will not be applied. TRUE : The optimizer will consider performing a cost-based query transformation on the star query. TEMP_DISABLE : The optimizer will consider performing a cost-based query transformation on the star query but will not use temporary tables in the star transformation. - Before enabling optimization STAR QUERIES let's take an example query - We check the value of the parameter STAR_TRANSFORMATION_ENABLED (default FALSE) -bash-4.1$ sqlplus sys/oracle_4U@pdb1 as sysdba SQL> SHOW PARAMETER STAR_TRANSFORMATION_ENABLED NAME TYPE VALUE ------------------------------------ ----------- ------------------------------ star_transformation_enabled string FALSE SQL> SET AUTOT TRACE EXPLAIN SQL> SQL> SELECT ch.channel_class, c.cust_city, t.calendar_quarter_desc, SUM (s.amount_sold) sales_amount FROM sh.sales s, sh.times t, sh.customers c, sh.channels ch WHERE s.time_id = t.time_id AND s.cust_id = c.cust_id AND s.channel_id = ch.channel_id AND c.cust_state_province = 'CA' AND ch.channel_desc in ('Internet', 'Catalog') AND T.calendar_quarter_desc IN ('1999 to 1901', '1999 to 1902') GROUP BY ch.channel_class, c.cust_city, t.calendar_quarter_desc; Execution Plan ---------------------------------------------------------- Plan hash value: 819542696 -------------------------------------------------------------------------------------------------------------------------- | Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time | Pstart| Pstop | -------------------------------------------------------------------------------------------------------------------------- | 0 | SELECT STATEMENT | | 286 | 24024 | 1807 (1)| 00:00:01 | | | | 1 | HASH GROUP BY | | 286 | 24024 | 1807 (1)| 00:00:01 | | | |* 2 | HASH JOIN | | 1558 | 127K| 1806 (1)| 00:00:01 | | | |* 3 | TABLE ACCESS FULL | CUSTOMERS | 383 | 9958 | 423 (1)| 00:00:01 | | | | 4 | NESTED LOOPS | | 28730 | 1627K| 1383 (0)| 00:00:01 | | | | 5 | NESTED LOOPS | | 28730 | 1627K| 1383 (0)| 00:00:01 | | | | 6 | MERGE JOIN CARTESIAN | | 183 | 6771 | 37 (0)| 00:00:01 | | | |* 7 | TABLE ACCESS FULL | CHANNELS | 2 | 42 | 3 (0)| 00:00:01 | | | | 8 | BUFFER SORT | | 91 | 1456 | 34 (0)| 00:00:01 | | | |* 9 | TABLE ACCESS FULL | TIMES | 91 | 1456 | 17 (0)| 00:00:01 | | | | 10 | PARTITION RANGE ITERATOR | | | | | | KEY | KEY | | 11 | BITMAP CONVERSION TO ROWIDS | | | | | | | | | 12 | BITMAP AND | | | | | | | | |* 13 | BITMAP INDEX SINGLE VALUE | SALES_TIME_BIX | | | | | KEY | KEY | |* 14 | BITMAP INDEX SINGLE VALUE | SALES_CHANNEL_BIX | | | | | KEY | KEY | | 15 | TABLE ACCESS BY LOCAL INDEX ROWID| SALES | 157 | 3297 | 1383 (0)| 00:00:01 | 1 | 1 | -------------------------------------------------------------------------------------------------------------------------- Predicate Information (identified by operation id): --------------------------------------------------- 2 - access("S"."CUST_ID"="C"."CUST_ID") 3 - filter("C"."CUST_STATE_PROVINCE"='CA') 7 - filter("CH"."CHANNEL_DESC"='Catalog' OR "CH"."CHANNEL_DESC"='Internet') 9 - filter("T"."CALENDAR_QUARTER_DESC"='1999 to 1901') 13 - access("S"."TIME_ID"="T"."TIME_ID") 14 - access("S"."CHANNEL_ID"="CH"."CHANNEL_ID") Note ----- - this is an adaptive plan SQL> ALTER SYSTEM SET STAR_TRANSFORMATION_ENABLED = TRUE; System altered. SQL> SELECT ch.channel_class , c.cust_city , t.calendar_quarter_desc , SUM ( s.amount_sold ) sales_amount FROM sh.sales s , sh.times t, sh.customers c , sh.channels ch 2 3 4 WHERE s.time_id = t.time_id AND s.cust_id = c.cust_id AND s.channel_id = ch.channel_id AND c.cust_state_province = ' CA' AND ch.channel_desc in ( 'Internet' , ' Catalog' ) AND T.calendar_quarter_desc IN (' 1999 to 1901 ', ' 1999 to 1902 ' ) GROUP BY ch.channel_class , c.cust_city , t.calendar_quarter_desc ; 5 6 7 8 9 10 Execution Plan ---------------------------------------------------------- Plan hash value: 3089720908 ------------------------------------------------------------------------------------------------------------------------------- | Id | Operation | Name | Rows |Bytes | Cost (%CPU)| Time | Pstart| Pstop | ------------------------------------------------------------------------------------------------------------------------------- | 0 | SELECT STATEMENT | | 1 | 81 | 495 (1)| 00:00:01 | | | | 1 | TEMP TABLE TRANSFORMATION | | | | | | | | | 2 | LOAD AS SELECT | SYS_TEMP_0FD9D6611_6575DF | | | | | | | |* 3 | TABLE ACCESS FULL | CUSTOMERS | 1 | 26 | 423 (1)| 00:00:01 | | | | 4 | HASH GROUP BY | | 1 | 81 | 72 (2)| 00:00:01 | | | |* 5 | HASH JOIN | | 1 | 81 | 71 (0)| 00:00:01 | | | | 6 | MERGE JOIN CARTESIAN | | 1 | 52 | 23 (0)| 00:00:01 | | | | 7 | MERGE JOIN CARTESIAN | | 1 | 31 | 20 (0)| 00:00:01 | | | |* 8 | TABLE ACCESS FULL | TIMES | 1 | 16 | 18 (0)| 00:00:01 | | | | 9 | BUFFER SORT | | 1 | 15 | 2 (0)| 00:00:01 | | | | 10 | TABLE ACCESS FULL | SYS_TEMP_0FD9D6611_6575DF | 1 | 15 | 2 (0)| 00:00:01 | | | | 11 | BUFFER SORT | | 2 | 42 | 21 (0)| 00:00:01 | | | |* 12 | TABLE ACCESS FULL | CHANNELS | 2 | 42 | 3 (0)| 00:00:01 | | | | 13 | VIEW | VW_ST_62EEF96F | 1 | 29 | 48 (0)| 00:00:01 | | | | 14 | NESTED LOOPS | | 1 | 57 | 25 (0)| 00:00:01 | | | | 15 | PARTITION RANGE SUBQUERY | | | 12 | 25 (0)| 00:00:01 |KEY(SQ)|KEY(SQ)| | 16 | BITMAP CONVERSION TO ROWIDS| | | 12 | 25 (0)| 00:00:01 | | | | 17 | BITMAP AND | | | | | | | | | 18 | BITMAP MERGE | | | | | | | | | 19 | BITMAP KEY ITERATION | | | | | | | | | 20 | BUFFER SORT | | | | | | | | | 21 | TABLE ACCESS FULL | SYS_TEMP_0FD9D6611_6575DF | 1 | 5 | 2 (0)| 00:00:01 | | | |* 22 | BITMAP INDEX RANGE SCAN| SALES_CUST_BIX | | | | |KEY(SQ)|KEY(SQ)| | 23 | BITMAP MERGE | | | | | | | | | 24 | BITMAP KEY ITERATION | | | | | | | | | 25 | BUFFER SORT | | | | | | | | |* 26 | TABLE ACCESS FULL | CHANNELS | 2 | 26 | 3 (0)| 00:00:01 | | | |* 27 | BITMAP INDEX RANGE SCAN| SALES_CHANNEL_BIX | | | | |KEY(SQ)|KEY(SQ)| | 28 | BITMAP MERGE | | | | | | | | | 29 | BITMAP KEY ITERATION | | | | | | | | | 30 | BUFFER SORT | | | | | | | | |* 31 | TABLE ACCESS FULL | TIMES | 1 | 16 | 18 (0)| 00:00:01 | | | |* 32 | BITMAP INDEX RANGE SCAN| SALES_TIME_BIX | | | | |KEY(SQ)|KEY(SQ)| | 33 | TABLE ACCESS BY USER ROWID | SALES | 1 | 29 | 23 (0)| 00:00:01 | ROWID | ROWID | -------------------------------------------------------------------------------- ----------------------------------------------- Predicate Information (identified by operation id): --------------------------------------------------- 3 - filter("C"."CUST_STATE_PROVINCE"=' CA') 5 - access("ITEM_3"="T"."TIME_ID" AND "ITEM_2"="C0" AND "ITEM_1"="CH"."CHANNEL_ID") 8 - filter("T"."CALENDAR_QUARTER_DESC"=' 1999 to 1901 ') 12 - filter("CH"."CHANNEL_DESC"=' Catalog' OR "CH"."CHANNEL_DESC"='Internet') 22 - access("S"."CUST_ID"="C0") 26 - filter("CH"."CHANNEL_DESC"=' Catalog' OR "CH"."CHANNEL_DESC"='Internet') 27 - access("S"."CHANNEL_ID"="CH"."CHANNEL_ID") 31 - filter("T"."CALENDAR_QUARTER_DESC"=' 1999 to 1901 ') 32 - access("S"."TIME_ID"="T"."TIME_ID") Note ----- - star transformation used for this statement - this is an adaptive plan 1.2 We can force the optimization processing using the Hint star STAR_TRANSFORMATION. SQL> ALTER SYSTEM SET STAR_TRANSFORMATION_ENABLED = FALSE; System altered. SQL> SET AUTOT TRACE EXPLAIN SQL> SELECT /*+ STAR_TRANSFORMATION */ ch.channel_class , c.cust_city , t.calendar_quarter_desc , SUM ( s.amount_sold ) sales_amount FROM sh.sales s , sh.times t, sh.customers c , sh.channels ch WHERE s.time_id = t.time_id AND s.cust_id = c.cust_id AND s.channel_id = ch.channel_id AND c.cust_state_province = ' CA' AND ch.channel_desc in ( 'Internet' , ' Catalog' ) AND T.calendar_quarter_desc IN (' 1999 to 1901 ', ' 1999 to 1902 ' ) GROUP BY ch.channel_class , c.cust_city , t.calendar_quarter_desc ; Execution Plan ---------------------------------------------------------- Plan hash value: 3089720908 ------------------------------------------------------------------------------------------------------------------------------- | Id | Operation | Name | Rows |Bytes | Cost (%CPU)| Time | Pstart| Pstop | ------------------------------------------------------------------------------------------------------------------------------- | 0 | SELECT STATEMENT | | 1 | 81 | 495 (1)| 00:00:01 | | | | 1 | TEMP TABLE TRANSFORMATION | | | | | | | | | 2 | LOAD AS SELECT | SYS_TEMP_0FD9D6613_6575DF | | | | | | | |* 3 | TABLE ACCESS FULL | CUSTOMERS | 1 | 26 | 423 (1)| 00:00:01 | | | | 4 | HASH GROUP BY | | 1 | 81 | 72 (2)| 00:00:01 | | | |* 5 | HASH JOIN | | 1 | 81 | 71 (0)| 00:00:01 | | | | 6 | MERGE JOIN CARTESIAN | | 1 | 52 | 23 (0)| 00:00:01 | | | | 7 | MERGE JOIN CARTESIAN | | 1 | 31 | 20 (0)| 00:00:01 | | | |* 8 | TABLE ACCESS FULL | TIMES | 1 | 16 | 18 (0)| 00:00:01 | | | | 9 | BUFFER SORT | | 1 | 15 | 2 (0)| 00:00:01 | | | | 10 | TABLE ACCESS FULL | SYS_TEMP_0FD9D6613_6575DF | 1 | 15 | 2 (0)| 00:00:01 | | | | 11 | BUFFER SORT | | 2 | 42 | 21 (0)| 00:00:01 | | | |* 12 | TABLE ACCESS FULL | CHANNELS | 2 | 42 | 3 (0)| 00:00:01 | | | | 13 | VIEW | VW_ST_62EEF96F | 1 | 29 | 48 (0)| 00:00:01 | | | | 14 | NESTED LOOPS | | 1 | 57 | 25 (0)| 00:00:01 | | | | 15 | PARTITION RANGE SUBQUERY | | | 12 | 25 (0)| 00:00:01 |KEY(SQ)|KEY(SQ)| | 16 | BITMAP CONVERSION TO ROWIDS| | | 12 | 25 (0)| 00:00:01 | | | | 17 | BITMAP AND | | | | | | | | | 18 | BITMAP MERGE | | | | | | | | | 19 | BITMAP KEY ITERATION | | | | | | | | | 20 | BUFFER SORT | | | | | | | | | 21 | TABLE ACCESS FULL | SYS_TEMP_0FD9D6613_6575DF | 1 | 5 | 2 (0)| 00:00:01 | | | |* 22 | BITMAP INDEX RANGE SCAN| SALES_CUST_BIX | | | | |KEY(SQ)|KEY(SQ)| | 23 | BITMAP MERGE | | | | | | | | | 24 | BITMAP KEY ITERATION | | | | | | | | | 25 | BUFFER SORT | | | | | | | | |* 26 | TABLE ACCESS FULL | CHANNELS | 2 | 26 | 3 (0)| 00:00:01 | | | |* 27 | BITMAP INDEX RANGE SCAN| SALES_CHANNEL_BIX | | | | |KEY(SQ)|KEY(SQ)| | 28 | BITMAP MERGE | | | | | | | | | 29 | BITMAP KEY ITERATION | | | | | | | | | 30 | BUFFER SORT | | | | | | | | |* 31 | TABLE ACCESS FULL | TIMES | 1 | 16 | 18 (0)| 00:00:01 | | | |* 32 | BITMAP INDEX RANGE SCAN| SALES_TIME_BIX | | | | |KEY(SQ)|KEY(SQ)| | 33 | TABLE ACCESS BY USER ROWID | SALES | 1 | 29 | 23 (0)| 00:00:01 | ROWID | ROWID | ------------------------------------------------------------------------------------------------------------------------------- Predicate Information (identified by operation id): --------------------------------------------------- 3 - filter("C"."CUST_STATE_PROVINCE"=' CA') 5 - access("ITEM_3"="T"."TIME_ID" AND "ITEM_2"="C0" AND "ITEM_1"="CH"."CHANNE L_ID") 8 - filter("T"."CALENDAR_QUARTER_DESC"=' 1999 to 1901 ') 12 - filter("CH"."CHANNEL_DESC"=' Catalog' OR "CH"."CHANNEL_DESC"='Internet') 22 - access("S"."CUST_ID"="C0") 26 - filter("CH"."CHANNEL_DESC"=' Catalog' OR "CH"."CHANNEL_DESC"='Internet') 27 - access("S"."CHANNEL_ID"="CH"."CHANNEL_ID") 31 - filter("T"."CALENDAR_QUARTER_DESC"=' 1999 to 1901 ') 32 - access("S"."TIME_ID"="T"."TIME_ID") Note ----- - star transformation used for this statement - this is an adaptive plan SQL> 1.3 STAR_TRANSFORMATION_ENABLED Behaviour with temp_disable sqlplus sys/oracle_4U@pdb1 as sysdba SQL> alter system set star_transformation_enabled=temp_disable; System altered. SQL> SELECT /*+ STAR_TRANSFORMATION */ ch.channel_class , c.cust_city , t.calendar_quarter_desc , SUM ( s.amount_sold ) sales_amount FROM sh.sales s , sh.times t, sh.customers c , sh.channels ch WHERE s.time_id = t.time_id AND s.cust_id = c.cust_id AND s.channel_id = ch.channel_id AND c.cust_state_province = ' CA' AND ch.channel_desc in ( 'Internet' , ' Catalog' ) AND T.calendar_quarter_desc IN (' 1999 to 1901 ', ' 1999 to 1902 ' ) GROUP BY ch.channel_class , c.cust_city , t.calendar_quarter_desc ; 2 3 4 5 6 7 8 9 10 Execution Plan ---------------------------------------------------------- Plan hash value: 3210422895 ---------------------------------------------------------------------------------------------------------------------- | Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time | Pstart| Pstop | ---------------------------------------------------------------------------------------------------------------------- | 0 | SELECT STATEMENT | | 1 | 92 |1335 (1)| 00:00:01 | | | | 1 | HASH GROUP BY | | 1 | 92 |1335 (1)| 00:00:01 | | | |* 2 | HASH JOIN | | 1 | 92 |1334 (1)| 00:00:01 | | | | 3 | MERGE JOIN CARTESIAN | | 1 | 63 | 444 (1)| 00:00:01 | | | | 4 | MERGE JOIN CARTESIAN | | 1 | 37 | 21 (0)| 00:00:01 | | | |* 5 | TABLE ACCESS FULL | TIMES | 1 | 16 | 18 (0)| 00:00:01 | | | | 6 | BUFFER SORT | | 2 | 42 | 3 (0)| 00:00:01 | | | |* 7 | TABLE ACCESS FULL | CHANNELS | 2 | 42 | 3 (0)| 00:00:01 | | | | 8 | BUFFER SORT | | 1 | 26 | 441 (1)| 00:00:01 | | | |* 9 | TABLE ACCESS FULL | CUSTOMERS | 1 | 26 | 423 (1)| 00:00:01 | | | | 10 | VIEW | VW_ST_34C376F1 | 1 | 29 | 890 (1)| 00:00:01 | | | | 11 | NESTED LOOPS | | 1 | 57 | 446 (1)| 00:00:01 | | | | 12 | PARTITION RANGE SUBQUERY | | | 12 | 446 (1)| 00:00:01 |KEY(SQ)|KEY(SQ)| | 13 | BITMAP CONVERSION TO ROWIDS| | | 12 | 446 (1)| 00:00:01 | | | | 14 | BITMAP AND | | | | | | | | | 15 | BITMAP MERGE | | | | | | | | | 16 | BITMAP KEY ITERATION | | | | | | | | | 17 | BUFFER SORT | | | | | | | | |* 18 | TABLE ACCESS FULL | CUSTOMERS | 1 | 16 | 423 (1)| 00:00:01 | | | |* 19 | BITMAP INDEX RANGE SCAN| SALES_CUST_BIX | | | | |KEY(SQ)|KEY(SQ)| | 20 | BITMAP MERGE | | | | | | | | | 21 | BITMAP KEY ITERATION | | | | | | | | | 22 | BUFFER SORT | | | | | | | | |* 23 | TABLE ACCESS FULL | CHANNELS | 2 | 26 | 3 (0)| 00:00:01 | | | |* 24 | BITMAP INDEX RANGE SCAN| SALES_CHANNEL_BIX | | | | |KEY(SQ)|KEY(SQ)| | 25 | BITMAP MERGE | | | | | | | | | 26 | BITMAP KEY ITERATION | | | | | | | | | 27 | BUFFER SORT | | | | | | | | |* 28 | TABLE ACCESS FULL | TIMES | 1 | 16 | 18 (0)| 00:00:01 | | | |* 29 | BITMAP INDEX RANGE SCAN| SALES_TIME_BIX | | | | |KEY(SQ)|KEY(SQ)| | 30 | TABLE ACCESS BY USER ROWID | SALES | 1 | 29 | 444 (1)| 00:00:01 | ROWID | ROWID | ---------------------------------------------------------------------------------------------------------------------- Predicate Information (identified by operation id): --------------------------------------------------- 2 - access("ITEM_3"="T"."TIME_ID" AND "ITEM_2"="C"."CUST_ID" AND "ITEM_1"="CH "."CHANNEL_ID") 5 - filter("T"."CALENDAR_QUARTER_DESC"=' 1999 to 1901 ') 7 - filter("CH"."CHANNEL_DESC"=' Catalog' OR "CH"."CHANNEL_DESC"='Internet') 9 - filter("C"."CUST_STATE_PROVINCE"=' CA') 18 - filter("C"."CUST_STATE_PROVINCE"=' CA') 19 - access("S"."CUST_ID"="C"."CUST_ID") 23 - filter("CH"."CHANNEL_DESC"=' Catalog' OR "CH"."CHANNEL_DESC"='Internet') 24 - access("S"."CHANNEL_ID"="CH"."CHANNEL_ID") 28 - filter("T"."CALENDAR_QUARTER_DESC"=' 1999 to 1901 ') 29 - access("S"."TIME_ID"="T"."TIME_ID") Note ----- - star transformation used for this statement - this is an adaptive plan SQL> 2. Optimizing Star Queries A star query is a join between a fact table and a number of dimension tables. Each dimension table is joined to the fact table using a primary key to foreign key join, but the dimension tables are not joined to each other. The optimizer recognizes star queries and generates efficient execution plans for them. "Tuning Star Queries" describes how to improve the performance of star queries. 2.1 Tuning Star Queries To get the best possible performance for star queries, it is important to follow some basic guidelines: [1] A bitmap index should be built on each of the foreign key columns of the fact table or tables. #Query to find constraint name and column name of foreign key SET PAGESIZE 1000 COL CONSTRAINT_NAME FORMAT A20 COL TABLE_NAME FORMAT A15 COL COLUMN_NAME FORMAT A15 SELECT A.CONSTRAINT_NAME,B.TABLE_NAME,B.COLUMN_NAME FROM USER_CONSTRAINTS A,USER_CONS_COLUMNS B WHERE A.CONSTRAINT_NAME = B.CONSTRAINT_NAME AND A.CONSTRAINT_TYPE='R' AND A.TABLE_NAME='SALES'; CONSTRAINT_NAME TABLE_NAME COLUMN_NAME -------------------- --------------- --------------- SALES_PROMO_FK SALES PROMO_ID SALES_CUSTOMER_FK SALES CUST_ID SALES_PRODUCT_FK SALES PROD_ID SALES_TIME_FK SALES TIME_ID SALES_CHANNEL_FK SALES CHANNEL_ID #Query to find index_name and column_name for bitmap indexes SELECT A.INDEX_NAME,A.COLUMN_NAME FROM USER_IND_COLUMNS A,USER_INDEXES B WHERE A.INDEX_NAME = B.INDEX_NAME AND A.TABLE_NAME='SALES' AND B.INDEX_TYPE='BITMAP'; INDEX_NAME COLUMN_NAME -------------------- --------------- SALES_PROD_BIX PROD_ID SALES_CUST_BIX CUST_ID SALES_TIME_BIX TIME_ID SALES_CHANNEL_BIX CHANNEL_ID SALES_PROMO_BIX PROMO_ID [2] The initialization parameter STAR_TRANSFORMATION_ENABLED should be set to TRUE. This enables an important optimizer feature for star-queries. It is set to FALSE by default for backward-compatibility. If your environment meets these two criteria, your star queries should use a powerful optimization technique that rewrites or transforms your SQL called star transformation. Star transformation executes the query in two phases: Phase1 -> Retrieves the necessary rows from the fact table (row set). Phase2 -> Joins this row set to the dimension tables. Example 5-2 Star Transformation Provides the step by step process to use STAR_TRANSFORMATION to optimize a star query. A business question that could be asked against the star schema in Figure 3-1 would be "What was the total number of umbrellas sold in Boston during the month of May 2008?" 1. The original query. SELECT ch.channel_class, c.cust_city, t.calendar_quarter_desc, SUM(s.amount_sold) sales_amount FROM sales s, times t, customers c, channels ch WHERE s.time_id = t.time_id AND s.cust_id = c.cust_id AND s.channel_id = ch.channel_id AND c.cust_state_province = 'CA' AND ch.channel_desc in ('Internet','Catalog') AND t.calendar_quarter_desc IN ('1999-Q1','1999-Q2') GROUP BY ch.channel_class, c.cust_city, t.calendar_quarter_desc; As you can see all of the where clause predicates are on the dimension tables and the fact table (Sales) is joined to each of the dimensions using their foreign key, primary key relationship. 2. Take the following actions: 2.1 Create a bitmap index on each of the foreign key columns in the fact table or tables. 2.2 Set the initialization parameter STAR_TRANSFORMATION_ENABLED to TRUE. 3. The rewritten query. Oracle rewrites and transfers the query to retrieve only the necessary rows from the fact table using bitmap indexes on the foreign key columns SELECT ... FROM sales WHERE time_id IN (SELECT time_id FROM times WHERE calendar_quarter_desc IN('1999-Q1','1999-Q2')) AND cust_id IN (SELECT cust_id FROM customers WHERE cust_state_province='CA') AND channel_id IN (SELECT channel_id FROM channels WHERE channel_desc IN('Internet','Catalog')); By rewriting the query in this fashion you can now leverage the strengths of bitmap indexes. Bitmap indexes provide set based processing within the database, allowing you to use various fact methods for set operations such as AND, OR, MINUS, and COUNT. So, you use the bitmap index on time_id to identify the set of rows in the fact table corresponding to sales in May 2008. In the bitmap the set of rows are actually represented as a string of 1's and 0's. A similar bitmap is retrieved for the fact table rows corresponding to the sale of umbrellas and another is accessed for sales made in Boston. At this point there are three bitmaps, each representing a set of rows in the fact table that satisfy an individual dimension constraint. The three bitmaps are then combined using a bitmap AND operation and this newly created final bitmap is used to extract the rows from the fact table needed to evaluate the query. 4. Using the rewritten query, Oracle joins the rows from fact tables to the dimension tables. The join back to the dimension tables is normally done using a hash join, but the Oracle Optimizer selects the most efficient join method depending on the size of the dimension tables. -> The rows from the fact table are retrieved by using bitmap joins between the bitmap indexes on all of the foreign key columns. -> The end user never needs to know any of the details of STAR_TRANSFORMATION, as the optimizer automatically chooses STAR_TRANSFORMATION when it is appropriate. Execution Plan for a Star Transformation with a Bitmap Index SQL> SELECT ch.channel_class, c.cust_city, t.calendar_quarter_desc, SUM(s.amount_sold) sales_amount FROM sales s, times t, customers c, channels ch WHERE s.time_id = t.time_id AND s.cust_id = c.cust_id AND s.channel_id = ch.channel_id AND c.cust_state_province = 'CA' AND ch.channel_desc in ('Internet','Catalog') AND t.calendar_quarter_desc IN ('1999-Q1','1999-Q2') GROUP BY ch.channel_class, c.cust_city, t.calendar_quarter_desc; Execution Plan ---------------------------------------------------------- Plan hash value: 3070449514 --------------------------------------------------------------------------------------------------------------------------------- | Id | Operation | Name | Rows| Bytes | Cost (%CPU)| Time | Pstart| Pstop | --------------------------------------------------------------------------------------------------------------------------------- | 0 | SELECT STATEMENT | | 766| 62046 | 579 (1)| 00:00:01 | | | | 1 | TEMP TABLE TRANSFORMATION | | | | | | | | | 2 | LOAD AS SELECT | SYS_TEMP_0FD9D6622_671609 | | | | | | | |* 3 | TABLE ACCESS FULL | CUSTOMERS | 383 | 9958 | 423 (1)| 00:00:01 | | | | 4 | HASH GROUP BY | | 766 | 62046 | 157 (2)| 00:00:01 | | | |* 5 | HASH JOIN | | 3117 | 246K| 155 (0)| 00:00:01 | | | | 6 | TABLE ACCESS FULL | SYS_TEMP_0FD9D6622_671609 | 383 | 5745 | 2 (0)| 00:00:01 | | | |* 7 | HASH JOIN | | 3117 | 200K| 153 (0)| 00:00:01 | | | |* 8 | TABLE ACCESS FULL | CHANNELS | 2 | 42 | 3 (0)| 00:00:01 | | | |* 9 | HASH JOIN | | 3117 | 136K| 150 (0)| 00:00:01 | | | |* 10 | TABLE ACCESS FULL | TIMES | 183 | 2928 | 18 (0)| 00:00:01 | | | | 11 | VIEW | VW_ST_62EEF96F | 3124 | 90596 | 132 (0)| 00:00:01 | | | | 12 | NESTED LOOPS | | 3124 | 173K| 109 (0)| 00:00:01 | | | | 13 | PARTITION RANGE SUBQUERY | | 3124 | 87482 | 55 (2)| 00:00:01 |KEY(SQ)|KEY(SQ)| | 14 | BITMAP CONVERSION TO ROWIDS| | 3124 | 87482 | 55 (2)| 00:00:01 | | | | 15 | BITMAP AND | | | | | | | | | 16 | BITMAP MERGE | | | | | | | | | 17 | BITMAP KEY ITERATION | | | | | | | | | 18 | BUFFER SORT | | | | | | | | |* 19 | TABLE ACCESS FULL | CHANNELS | 2 | 26 | 3 (0)| 00:00:01 | | | |* 20 | BITMAP INDEX RANGE SCAN| SALES_CHANNEL_BIX | | | | |KEY(SQ)|KEY(SQ)| | 21 | BITMAP MERGE | | | | | | | | | 22 | BITMAP KEY ITERATION | | | | | | | | | 23 | BUFFER SORT | | | | | | | | |* 24 | TABLE ACCESS FULL | TIMES | 183 | 2928 | 18 (0)| 00:00:01 | | | |* 25 | BITMAP INDEX RANGE SCAN| SALES_TIME_BIX | | | | |KEY(SQ)|KEY(SQ)| | 26 | BITMAP MERGE | | | | | | | | | 27 | BITMAP KEY ITERATION | | | | | | | | | 28 | BUFFER SORT | | | | | | | | | 29 | TABLE ACCESS FULL | SYS_TEMP_0FD9D6622_671609 | 383 | 1915 | 2 (0)| 00:00:01 | | | |* 30 | BITMAP INDEX RANGE SCAN| SALES_CUST_BIX | | | | |KEY(SQ)|KEY(SQ)| | 31 | TABLE ACCESS BY USER ROWID | SALES | 1 | 29 | 78 (0)| 00:00:01 | ROWID | ROWID | --------------------------------------------------------------------------------------------------------------------------------- Predicate Information (identified by operation id): --------------------------------------------------- 3 - filter("C"."CUST_STATE_PROVINCE"='CA') 5 - access("ITEM_2"="C0") 7 - access("ITEM_1"="CH"."CHANNEL_ID") 8 - filter("CH"."CHANNEL_DESC"='Catalog' OR "CH"."CHANNEL_DESC"='Internet') 9 - access("ITEM_3"="T"."TIME_ID") 10 - filter("T"."CALENDAR_QUARTER_DESC"='1999-Q1' OR "T"."CALENDAR_QUARTER_DES C"='1999-Q2') 19 - filter("CH"."CHANNEL_DESC"='Catalog' OR "CH"."CHANNEL_DESC"='Internet') 20 - access("S"."CHANNEL_ID"="CH"."CHANNEL_ID") 24 - filter("T"."CALENDAR_QUARTER_DESC"='1999-Q1' OR "T"."CALENDAR_QUARTER_DES C"='1999-Q2') 25 - access("S"."TIME_ID"="T"."TIME_ID") 30 - access("S"."CUST_ID"="C0") Note ----- - star transformation used for this statement - this is an adaptive plan SQL> Star Transformation with a Bitmap Join Index <- In addition to bitmap indexes, you can use a bitmap join index during star transformations. Assume you have the following additional index structure: CREATE BITMAP INDEX sales_c_state_bjix ON sales(customers.cust_state_province) FROM sales, customers WHERE sales.cust_id = customers.cust_id LOCAL NOLOGGING COMPUTE STATISTICS; SQL> CREATE BITMAP INDEX sales_c_state_bjix ON sales(customers.cust_state_province) FROM sales, customers WHERE sales.cust_id = customers.cust_id 2 3 4 5 / FROM sales, customers * ERROR at line 3: ORA-25954: missing primary key or unique constraint on dimension SQL> !oerr ora 25954 25954, 00000, "missing primary key or unique constraint on dimension\n" // *Cause: An attempt to create a join index was made, which failed // because one or more dimensions did not have an appropriate // constraint matching the join conditions. // *Action: Ensure that the where clause is correct (contains all of the // constraint columns) and that an enforced constraint is on // each dimension table. SQL> SELECT dbms_metadata.get_dependent_ddl('INDEX','CUSTOMERS') from dual; DBMS_METADATA.GET_DEPENDENT_DDL('INDEX','CUSTOMERS') -------------------------------------------------------------------------------- CREATE BITMAP INDEX "SH"."CUSTOMERS_YOB_BIX" ON "SH"."CUS TOMERS" ("CUST_YEAR_OF_BIRTH") PCTFREE 10 INITRANS 2 MAXTRANS 255 COMPUTE STA TISTICS NOLOGGING STORAGE(INITIAL 65536 NEXT 1048576 MINE XTENTS 1 MAXEXTENTS 2147483645 PCTINCREASE 0 FREELISTS 1 FREELIST GROUPS 1 BUFFER_POOL DEFAULT FLASH_CACHE DEFAULT C ELL_FLASH_CACHE DEFAULT) TABLESPACE "EXAMPLE" CREATE BITMAP INDEX "SH"."CUSTOMERS_MARITAL_BIX" ON "SH"."CUSTOMERS" ("CUST_MARITAL_STATU S") PCTFREE 10 INITRANS 2 MAXTRANS 255 COMPUTE STATISTICS NOLOGGING STORAGE(INITIAL 65536 NEXT 1048576 MINEXTENTS 1 MAXEXTENTS 2147483645 PCTINCREASE 0 FREELISTS 1 FREELIST GROUPS 1 BUFFER_POOL DEFAULT FLASH_CACHE DEFAULT CELL_FLA SH_CACHE DEFAULT) TABLESPACE "EXAMPLE" CREATE BITMAP INDEX "SH"."CUSTOMERS_GENDER_BIX" ON "SH". "CUSTOMERS" ("CUST_GENDER") PCTFREE 10 INITRANS 2 MAXTRANS 255 COMPUTE STATIS TICS NOLOGGING STORAGE(INITIAL 65536 NEXT 1048576 MINEXTE NTS 1 MAXEXTENTS 2147483645 PCTINCREASE 0 FREELISTS 1 FREELIST GROUPS 1 BUFFER_POOL DEFAULT FLASH_CACHE DEFAULT CELL _FLASH_CACHE DEFAULT) TABLESPACE "EXAMPLE" CREATE UNIQUE INDEX "SH"."CUSTOMERS_PK" ON "SH"."CUS TOMERS" ("CUST_ID") PCTFREE 10 INITRANS 2 MAXTRANS 255 COMPUTE STATISTICS STORAGE(INITIAL 65536 NEXT 1048576 MINEX TENTS 1 MAXEXTENTS 2147483645 PCTINCREASE 0 FREELISTS 1 FREELIST GROUPS 1 BUFFER_POOL DEFAULT FLASH_CACHE DEFAULT CE LL_FLASH_CACHE DEFAULT) TABLESPACE "EXAMPLE" SQL> select dbms_metadata.get_ddl('TABLE','CUSTOMERS') from dual; DBMS_METADATA.GET_DDL('TABLE','CUSTOMERS') -------------------------------------------------------------------------------- CREATE TABLE "SH"."CUSTOMERS" ( "CUST_ID" NUMBER NOT NULL ENABLE, "CUST_FIRST_NAME" VARCHAR2(20) NOT NULL ENABL E, "CUST_LAST_NAME" VARCHAR2(40) NOT NULL ENABLE, "CUST_GENDER" CHAR(1) NOT NULL ENABLE, "CUST_YEAR_OF_BIRTH" NUMBER(4,0) NOT NULL ENA BLE, "CUST_MARITAL_STATUS" VARCHAR2(20), "CUST_STREET_ADDRESS" VARCHAR2(40) NOT NULL ENABLE, "CUST_POSTAL_CODE" VARCHAR2(10) NOT NULL ENABLE, "CUST_CITY" VARCHAR2(30) NOT NULL ENABLE, "CUST_CITY_ID" NUMBER NOT NULL ENABLE, "CUST_STATE_PROVINCE" VARCHAR2(40) NOT NULL E NABLE, "CUST_STATE_PROVINCE_ID" NUMBER NOT NULL ENABLE, "COUNTRY_ID" NUMBER NOT NULL ENABLE, "CUST_MAIN_PHONE_NUMBER" VARCHAR2(25) NOT NULL ENABLE, "CUST_INCOME_LEVEL" VARCHAR2(30), "CUST_CREDIT_LIMIT" NUMBER, "CUST_EMAIL" VARCHAR2(50), "CUST_TOTAL" VARCHAR2(14) NOT NULL ENABLE, "CUST_TOTAL_ID" NUMBER NOT NULL ENABLE, "CUST_SRC_ID" NUMBER, "CUST_EFF_FROM" DATE, "CUST_EFF_TO" DATE, "CUST_VALID" VARCHAR2(1), CONSTRAINT "CUSTOMERS_PK" PRIMARY KEY ("CUST_ID") USING INDEX PCTFREE 10 INITRANS 2 MAXTRANS 255 COMP UTE STATISTICS STORAGE(INITIAL 65536 NEXT 1048576 MINEXTE NTS 1 MAXEXTENTS 2147483645 PCTINCREASE 0 FREELISTS 1 FREELIST GROUPS 1 BUFFER_POOL DEFAULT FLASH_CACHE DEFAULT CELL _FLASH_CACHE DEFAULT) TABLESPACE "EXAMPLE" ENABLE NOVALIDATE, CONSTRAINT "CUSTOMERS_COUNTRY_FK" FOREIGN KEY ("COUNTRY_ID") REFERENCES "SH"."COUNTRIES" ("COUNTRY_ID") ENABLE NOVALIDATE ) SEGMENT CREATION IMMEDIATE PCTFREE 10 PCTUSED 40 INITRANS 1 MAXTRANS 255 NOCOMPRESS LOGGING STORAGE(INITIAL 65536 NEXT 1048576 MINEXTEN TS 1 MAXEXTENTS 2147483645 PCTINCREASE 0 FREELISTS 1 FREELIST GROUPS 1 BUFFER_POOL DEFAULT FLASH_CACHE DEFAULT CELL_ FLASH_CACHE DEFAULT) TABLESPACE "EXAMPLE" SQL> alter table sales 2 drop constraint SALES_CUSTOMER_FK; Table altered. SQL> alter table customers drop primary key; Table altered. SQL> alter table customers add primary key(cust_id); Table altered. SQL> alter table sales 2 add constraint SALES_CUSTOMER_FK FOREIGN KEY(CUST_ID) REFERENCES CUSTOMERS(CUST_ID); Table altered. SQL> CREATE BITMAP INDEX sales_c_state_bjix ON sales(customers.cust_state_province) FROM sales, customers WHERE sales.cust_id = customers.cust_id LOCAL NOLOGGING COMPUTE STATISTICS; Index created. SQL> set autot trace exp SQL> SELECT ch.channel_class, c.cust_city, t.calendar_quarter_desc, SUM(s.amount_sold) sales_amount FROM sales s, times t, customers c, channels ch 2 3 4 WHERE s.time_id = t.time_id AND s.cust_id = c.cust_id AND s.channel_id = ch.channel_id AND c.cust_state_province = 'CA' AND ch.channel_desc in ('Internet','Catalog') AND t.calendar_quarter_desc IN ('1999-Q1','1999-Q2') GROUP BY ch.channel_class, c.cust_city, t.calendar_quarter_desc; 5 6 7 8 9 10 Execution Plan ---------------------------------------------------------- Plan hash value: 3705392714 ------------------------------------------------------------------------------------------------------------------------------------- | Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time | Pstart| Pstop | ------------------------------------------------------------------------------------------------------------------------------------- | 0 | SELECT STATEMENT | | 1236 | 101K| 552 (1)| 00:00:01 | | | | 1 | HASH GROUP BY | | 1236 | 101K| 552 (1)| 00:00:01 | | | |* 2 | HASH JOIN | | 27195 | 2230K| 530 (1)| 00:00:01 | | | |* 3 | TABLE ACCESS FULL | CHANNELS | 2 | 42 | 3 (0)| 00:00:01 | | | |* 4 | HASH JOIN | | 27195 | 1673K| 527 (1)| 00:00:01 | | | |* 5 | TABLE ACCESS FULL | TIMES | 183 | 2928 | 18 (0)| 00:00:01 | | | |* 6 | HASH JOIN | | 27255 | 1250K| 509 (1)| 00:00:01 | | | |* 7 | TABLE ACCESS FULL | CUSTOMERS | 3341 | 86866 | 423 (1)| 00:00:01 | | | | 8 | PARTITION RANGE SUBQUERY | | 57585 | 1180K| 86 (2)| 00:00:01 |KEY(SQ)|KEY(SQ)| | 9 | TABLE ACCESS BY LOCAL INDEX ROWID BATCHED| SALES | 57585 | 1180K| 86 (2)| 00:00:01 |KEY(SQ)|KEY(SQ)| | 10 | BITMAP CONVERSION TO ROWIDS | | | | | | | | | 11 | BITMAP AND | | | | | | | | | 12 | BITMAP MERGE | | | | | | | | | 13 | BITMAP KEY ITERATION | | | | | | | | | 14 | BUFFER SORT | | | | | | | | |* 15 | TABLE ACCESS FULL | CHANNELS | 2 | 26 | 3 (0)| 00:00:01 | | | |* 16 | BITMAP INDEX RANGE SCAN | SALES_CHANNEL_BIX | | | | |KEY(SQ)|KEY(SQ)| | 17 | BITMAP MERGE | | | | | | | | | 18 | BITMAP KEY ITERATION | | | | | | | | | 19 | BUFFER SORT | | | | | | | | |* 20 | TABLE ACCESS FULL | TIMES | 183 | 2928 | 18 (0)| 00:00:01 | | | |* 21 | BITMAP INDEX RANGE SCAN | SALES_TIME_BIX | | | | |KEY(SQ)|KEY(SQ)| |* 22 | BITMAP INDEX SINGLE VALUE | SALES_C_STATE_BJIX | | | | |KEY(SQ)|KEY(SQ)| ------------------------------------------------------------------------------------------------------------------------------------- Predicate Information (identified by operation id): --------------------------------------------------- 2 - access("S"."CHANNEL_ID"="CH"."CHANNEL_ID") 3 - filter("CH"."CHANNEL_DESC"='Catalog' OR "CH"."CHANNEL_DESC"='Internet') 4 - access("S"."TIME_ID"="T"."TIME_ID") 5 - filter("T"."CALENDAR_QUARTER_DESC"='1999-Q1' OR "T"."CALENDAR_QUARTER_DESC"='1999-Q2') 6 - access("S"."CUST_ID"="C"."CUST_ID") 7 - filter("C"."CUST_STATE_PROVINCE"='CA') 15 - filter("CH"."CHANNEL_DESC"='Catalog' OR "CH"."CHANNEL_DESC"='Internet') 16 - access("S"."CHANNEL_ID"="CH"."CHANNEL_ID") 20 - filter("T"."CALENDAR_QUARTER_DESC"='1999-Q1' OR "T"."CALENDAR_QUARTER_DESC"='1999-Q2') 21 - access("S"."TIME_ID"="T"."TIME_ID") 22 - access("S"."SYS_NC00008$"='CA') Note ----- - star transformation used for this statement SQL> The difference between this plan as compared to the previous one is that the inner part of the bitmap index scan for the customer dimension has no subselect. This is because the join predicate information on customer.cust_state_province can be satisfied with the bitmap join index sales_c_state_bjix. 2.2 How Oracle Chooses to Use Star Transformation The optimizer generates and saves the best plan it can produce without the transformation. If the transformation is enabled, the optimizer then tries to apply it to the query and, if applicable, generates the best plan using the transformed query. Based on a comparison of the cost estimates between the best plans for the two versions of the query, the optimizer then decides whether to use the best plan for the transformed or untransformed version. If the query requires accessing a large percentage of the rows in the fact table, it might be better to use a full table scan and not use the transformations. However, if the constraining predicates on the dimension tables are sufficiently selective that only a small portion of the fact table must be retrieved, the plan based on the transformation will probably be superior. Note that the optimizer generates a subquery for a dimension table only if it decides that it is reasonable to do so based on a number of criteria. There is no guarantee that subqueries will be generated for all dimension tables. The optimizer may also decide, based on the properties of the tables and the query, that the transformation does not merit being applied to a particular query. In this case, the best regular plan will be used. 2.3 Star Transformation Restrictions Star transformation is not supported for tables with any of the following characteristics: Queries with a table hint that is incompatible with a bitmap access path Tables with too few bitmap indexes. There must be a bitmap index on a fact table column for the optimizer to generate a subquery for it. Remote fact tables. However, remote dimension tables are allowed in the subqueries that are generated. Anti-joined tables Tables that are already used as a dimension table in a subquery Tables that are really unmerged views, which are not view partitions Tables where the fact table is an unmerged view Tables where a partitioned view is used as a fact table The star transformation may not be chosen by the optimizer for the following cases: Tables that have a good single-table access path Tables that are too small for the transformation to be worthwhile In addition, temporary tables will not be used by star transformation under the following conditions: The database is in read-only mode The star query is part of a transaction that is in serializable mode 2.4 Optimizing Star Queries Using VECTOR GROUP BY Aggregation Documentation: 7 Optimizing Joins with In-Memory Aggregation 7.1 About IM Aggregation Starting with 12.1.0.2, In-Memory Aggregation (IM aggregation) enables queries to aggregate while scanning. IM aggregation optimizes query blocks involving aggregation and joins from a large table to multiple small tables. The KEY VECTOR and VECTOR GROUP BY operations use efficient arrays for joins and aggregation. The optimizer chooses VECTOR GROUP BY for GROUP BY operations based on cost. The optimizer does not choose VECTOR GROUP BY aggregations for GROUP BY ROLLUP, GROUPING SETS, or CUBE operations. Note: IM aggregation is also called vector aggregation and VECTOR GROUP BY aggregation. IM aggregation requires INMEMORY_SIZE to be set to a nonzero value. However, IM aggregation does not require that the referenced tables be populated in the IM column store. 7.2 Purpose of IM Aggregation IM aggregation pre-processes the small tables to accelerate the per-row work performed on the large table. A typical analytic query aggregates from a fact table, and joins it to dimension tables. This type of query scans a large volume of data, with optional filtering, and performs a GROUP BY of between 1 and 40 columns. The first aggregation on the fact table processes the most rows. Before Oracle Database 12c, the only GROUP BY operations were HASH and SORT. The VECTOR GROUP BY is an additional cost-based transformation that transforms a join between a dimension and fact table into a filter. The database can apply this filter during the fact table scan. The joins use key vectors, which are similar to Bloom filters, and the aggregation uses a VECTOR GROUP BY. Note: Although vector transformations are independent of the IM column store, they can be applied very efficiently to In-Memory data through SIMD vector processing. IM aggregation enables vector joins and GROUP BY operations to occur simultaneously with the scan of the large table. Thus, these operations aggregate as they scan, and do not need to wait for table scans and join operations to complete. IM aggregation optimizes CPU usage, especially the CPU cache. IM aggregation can greatly improve query performance. The database can create a report outline dynamically, and then fill in report details during the scan of the fact table. 7.2.1 When IM Aggregation Is Useful -> IM aggregation improves performance of queries that join relatively small tables to a relatively large fact table, and aggregate data in the fact table. This typically occurs in a star or snowflake query. Consider the following query, which performs a join of the customers dimension table with the sales fact table: ----------------- ENABLE In-Memory ----------------- SQL> alter system set inmemory_size=200M scope=spfile; System altered. SQL> startup force -bash-4.1$ sqlplus sh/oracle_4U@pdb1 SQL> alter table sales INMEMORY; Table altered. SQL> alter table customers INMEMORY; Table altered. SQL> set autot trace exp SQL> SQL> set timing on SQL> select count(*) from sales; Elapsed: 00:00:00.16 SQL> select count(*) from sales; Elapsed: 00:00:00.00 SQL> select count(*) from customers; Elapsed: 00:00:00.09 SQL> SELECT c.CUST_ID, s.quantity_sold, s.amount_sold FROM customers c, sales s WHERE c.CUST_ID = s.CUST_ID AND c.country_id = 'FR'; 2 3 4 Elapsed: 00:00:00.08 Execution Plan ---------------------------------------------------------- Plan hash value: 910832015 ----------------------------------------------------------------------------------------------------------- | Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time | Pstart| Pstop | ----------------------------------------------------------------------------------------------------------- | 0 | SELECT STATEMENT | | 380K| 8540K| 4398 (1)| 00:00:01 | | | |* 1 | HASH JOIN | | 380K| 8540K| 4398 (1)| 00:00:01 | | | | 2 | JOIN FILTER CREATE | :BF0000 | 380K| 8540K| 4398 (1)| 00:00:01 | | | |* 3 | TABLE ACCESS INMEMORY FULL | CUSTOMERS | 2921 | 29210 | 17 (6)| 00:00:01 | | | | 4 | JOIN FILTER USE | :BF0000 | 918K| 11M| 4378 (1)| 00:00:01 | | | | 5 | PARTITION RANGE ALL | | 918K| 11M| 4378 (1)| 00:00:01 | 1 | 28 | |* 6 | TABLE ACCESS INMEMORY FULL| SALES | 918K| 11M| 4378 (1)| 00:00:01 | 1 | 28 | ----------------------------------------------------------------------------------------------------------- Predicate Information (identified by operation id): --------------------------------------------------- 1 - access("C"."CUST_ID"="S"."CUST_ID") 3 - inmemory("C"."COUNTRY_ID"=TO_NUMBER('FR')) filter("C"."COUNTRY_ID"=TO_NUMBER('FR')) 6 - inmemory(SYS_OP_BLOOM_FILTER(:BF0000,"S"."CUST_ID")) filter(SYS_OP_BLOOM_FILTER(:BF0000,"S"."CUST_ID")) Note ----- - this is an adaptive plan SQL> When both tables are populated in the IM column store, the database can use SIMD vector processing to scan the row sets and apply filters. The optimizer converts the predicate on the customers table, c.country_id='FR' into a filter on the sales fact table. The filter is country_id='FR'. Because sales is stored in columnar format, the query only needs to scan one column to determine the result. Figure 7-1 Vector Joins Using In-Memory Column Store 7.2.2 When IM Aggregation Is Not Beneficial IM aggregation benefits certain star queries when sufficient system resources exist. Other queries may receive little or no benefit. Situations Where VECTOR GROUP BY Aggregation Is Not Advantageous Specifically, VECTOR GROUP BY aggregation does not benefit performance in the following scenarios: 1. Joins are performed between two very large tables. By default, the optimizer chooses a VECTOR GROUP BY transformation only if a relatively small table is joined to a relatively large table. Dimensions contain more than 2 billion rows. 2. The VECTOR GROUP BY transformation is not used if a dimension contains more than 2 billion rows. 3. The system does not have sufficient memory. Most databases that use the IM column store benefit from IM aggregation. ---------------------------------------------------------------------------- HOL: Create two dimension and one fact table and create query that uses start transformation and optimize the star transformation query. ---------------------------------------------------------------------------- SQL> create table tbl_dimension1 2 ( r_num number constraint tbl_dimension1_pk primary key, 3 day_of_week varchar2(20) 4 ); Table created. SQL> 1 1* create table tbl_dimension1 SQL> c/1/2 1* create table tbl_dimension2 SQL> 2 2* ( r_num number constraint tbl_dimension1_pk primary key, SQL> c/1/2 2* ( r_num number constraint tbl_dimension2_pk primary key, SQL> r 1 create table tbl_dimension2 2 ( r_num number constraint tbl_dimension2_pk primary key, 3 day_of_week varchar2(20) 4* ) Table created. SQL> create table tbl_fact 2 ( r_num number, 3 r1 number constraint fk_fact_dim1 references tbl_dimension1, 4 r2 number constraint fk_fact_dim2 references tbl_dimension2); Table created. SQL> INSERT INTO tbl_dimension1 2 SELECT ROWNUM,TRIM(TO_CHAR(SYSDATE+ROWNUM,'DAY')) FROM DUAL 3 CONNECT BY LEVEL <= 10; 10 rows created. SQL> 1 1* INSERT INTO tbl_dimension1 SQL> c/1/2 1* INSERT INTO tbl_dimension2 SQL> r 1 INSERT INTO tbl_dimension2 2 SELECT ROWNUM,TRIM(TO_CHAR(SYSDATE+ROWNUM,'DAY')) FROM DUAL 3* CONNECT BY LEVEL <= 10 10 rows created. SQL> INSERT INTO tbl_fact 2 SELECT ROWNUM,ROUND(DBMS_RANDOM.VALUE(1,10)),ROUND(DBMS_RANDOM.VALUE(1,10)) 3 FROM DUAL 4 CONNECT BY LEVEL <= 50000; 50000 rows created. SQL> COMMIT; Commit complete. SQL> CREATE BITMAP INDEX idx_b1 ON tbl_fact(r1); Index created. SQL> CREATE BITMAP INDEX idx_b2 ON tbl_fact(r2); Index created. SQL> EXEC DBMS_STATS.GATHER_TABLE_STATS(USER,'TBL_FACT',CASCADE=>TRUE); PL/SQL procedure successfully completed. SQL> EXEC DBMS_STATS.GATHER_TABLE_STATS(USER,'TBL_DIMENSION1',CASCADE=>TRUE); PL/SQL procedure successfully completed. SQL> EXEC DBMS_STATS.GATHER_TABLE_STATS(USER,'TBL_DIMENSION2',CASCADE=>TRUE); PL/SQL procedure successfully completed. SQL> SELECT COUNT(*) 2 FROM TBL_FACT f,TBL_DIMENSION1 d1,TBL_DIMENSION2 d2 3 WHERE f.r1 = d1.r_num 4 AND f.r2 = d2.r_num 5 AND d1.day_of_week = 'MONDAY' 6 AND d2.day_of_week = 'THURSDAY'; COUNT(*) ---------- 940 SQL> SET AUTOT TRACE EXP SQL> SELECT COUNT(*) 2 FROM TBL_FACT f,TBL_DIMENSION1 d1,TBL_DIMENSION2 d2 3 WHERE f.r1 = d1.r_num 4 AND f.r2 = d2.r_num 5 AND d1.day_of_week = 'MONDAY' 6 AND d2.day_of_week = 'THURSDAY'; Execution Plan ---------------------------------------------------------- Plan hash value: 3421921432 ----------------------------------------------------------------------------------------- | Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time | ----------------------------------------------------------------------------------------- | 0 | SELECT STATEMENT | | 1 | 28 | 38 (0)| 00:00:01 | | 1 | SORT AGGREGATE | | 1 | 28 | | | |* 2 | HASH JOIN | | 1020 | 28560 | 38 (0)| 00:00:01 | | 3 | MERGE JOIN CARTESIAN| | 2 | 44 | 6 (0)| 00:00:01 | |* 4 | TABLE ACCESS FULL | TBL_DIMENSION1 | 1 | 11 | 3 (0)| 00:00:01 | | 5 | BUFFER SORT | | 1 | 11 | 3 (0)| 00:00:01 | |* 6 | TABLE ACCESS FULL | TBL_DIMENSION2 | 1 | 11 | 3 (0)| 00:00:01 | | 7 | TABLE ACCESS FULL | TBL_FACT | 50000 | 292K| 32 (0)| 00:00:01 | ----------------------------------------------------------------------------------------- Predicate Information (identified by operation id): --------------------------------------------------- 2 - access("F"."R1"="D1"."R_NUM" AND "F"."R2"="D2"."R_NUM") 4 - filter("D1"."DAY_OF_WEEK"='MONDAY') 6 - filter("D2"."DAY_OF_WEEK"='THURSDAY') Without using star transformation. Oracle will access both dimension tables and use cartesian join to join them. as no join condition found then the result is hash joined with the full table scan of the fact table, returning 50k rows and accessing around 292k data. SQL> ALTER SESSION SET STAR_TRANSFORMATION_ENABLED=TRUE; Session altered. SQL> SELECT COUNT(*) 2 FROM TBL_FACT f,TBL_DIMENSION1 d1,TBL_DIMENSION2 d2 3 WHERE f.r1 = d1.r_num 4 AND f.r2 = d2.r_num 5 AND d1.day_of_week = 'MONDAY' 6 AND d2.day_of_week = 'THURSDAY'; Execution Plan ---------------------------------------------------------- Plan hash value: 1426459311 ----------------------------------------------------------------------------------------------- | Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time | ----------------------------------------------------------------------------------------------- | 0 | SELECT STATEMENT | | 1 | 6 | 12(0)| 00:00:01 | | 1 | SORT AGGREGATE | | 1 | 6 | | | | 2 | BITMAP CONVERSION COUNT | | 500 | 3000 | 6(0)| 00:00:01 | | 3 | BITMAP AND | | | | | | | 4 | BITMAP MERGE | | | | | | | 5 | BITMAP KEY ITERATION | | | | | | |* 6 | TABLE ACCESS FULL | TBL_DIMENSION1 | 1 | 11 | 3(0)| 00:00:01 | |* 7 | BITMAP INDEX RANGE SCAN| IDX_B1 | | | | | | 8 | BITMAP MERGE | | | | | | | 9 | BITMAP KEY ITERATION | | | | | | |* 10 | TABLE ACCESS FULL | TBL_DIMENSION2 | 1 | 11 | 3(0)| 00:00:01 | |* 11 | BITMAP INDEX RANGE SCAN| IDX_B2 | | | | | ----------------------------------------------------------------------------------------------- Predicate Information (identified by operation id): --------------------------------------------------- 6 - filter("D1"."DAY_OF_WEEK"='MONDAY') 7 - access("F"."R1"="D1"."R_NUM") 10 - filter("D2"."DAY_OF_WEEK"='THURSDAY') 11 - access("F"."R2"="D2"."R_NUM") Note ----- - star transformation used for this statement As it is seen from the execution plan and from the node section at the end of the plan transformation used for this statement. The above query was rewrite as below. select count(*) from tbl_fact where r1 in (select r_num from tbl_dimension1 where day_of_week='MONDAY') and r2 in (select r_num from tbl_dimension2 where day_of_week='THURSDAY'); Cost of the first statement was 38 and decreased to 12 in the second query. The retrieve data is also decreased from 292k to 11k by using star transformation. 2. You can also use bitmap join indexes which store the result of a join and thus the join operation can be avoided. As We have a primary key - foreign key relationship between fact and dimension tables. Let's create bitmap join indexes as follows: SQL> CREATE BITMAP INDEX idx_bt_join ON tbl_fact(tbl_dimension1.day_of_week) FROM tbl_fact,tbl_dimension1 WHERE tbl_fact.r1=tbl_dimension1.r_num; Index created. SQL> CREATE BITMAP INDEX idx_bt_join2 ON tbl_fact(tbl_dimension2.day_of_week) FROM tbl_fact,tbl_dimension2 WHERE tbl_fact.r2=tbl_dimension2.r_num; Index created. SQL> select count(*) 2 FROM TBL_FACT f,TBL_DIMENSION1 d1,TBL_DIMENSION2 d2 3 WHERE f.r1 = d1.r_num 4 AND f.r2 = d2.r_num 5 AND d1.day_of_week = 'MONDAY' 6 AND d2.day_of_week = 'THURSDAY'; Execution Plan ---------------------------------------------------------- Plan hash value: 2125110096 --------------------------------------------------------------------------------------------- | Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time | --------------------------------------------------------------------------------------------- | 0 | SELECT STATEMENT | | 1 | 6 | 4 (0)| 00:00:01 | | 1 | SORT AGGREGATE | | 1 | 6 | | | | 2 | BITMAP CONVERSION COUNT | | 7143 | 42858 | 4 (0)| 00:00:01 | | 3 | BITMAP AND | | | | | | |* 4 | BITMAP INDEX SINGLE VALUE| IDX_BT_JOIN | | | | | |* 5 | BITMAP INDEX SINGLE VALUE| IDX_BT_JOIN2 | | | | | --------------------------------------------------------------------------------------------- Predicate Information (identified by operation id): --------------------------------------------------- 4 - access("F"."SYS_NC00004$"='MONDAY') 5 - access("F"."SYS_NC00005$"='THURSDAY') SQL> Note: Do not use alias with table during bitmap join index. As you see, two bitmap join indexes were used to retrieve the data without even joining fact and dimension tables. >> Enable InMemory SQL> show parameter memory NAME TYPE VALUE ------------------------------------ ----------- ------------------------------ hi_shared_memory_address integer 0 inmemory_clause_default string inmemory_force string DEFAULT inmemory_max_populate_servers integer 1 inmemory_query string ENABLE inmemory_size big integer 200M inmemory_trickle_repopulate_servers_ integer 1 percent memory_max_target big integer 1G memory_target big integer 1G optimizer_inmemory_aware boolean TRUE shared_memory_address integer 0 SQL> >> IM aggregation improves performance of queries that join relatively small tables to a relatively large fact table, and aggregate data in the fact table. This typically occurs in a star or snowflake query. SQL> alter table tbl_fact inmemory; Table altered. SQL> alter table tbl_dimension1 inmemory; Table altered. SQL> alter table tbl_dimension2 inmemory; Table altered. SQL> select count(*) from tbl_fact; SQL> select count(*) from tbl_dimension1; SQL> select count(*) from tbl_dimension2; SQL> select count(*) FROM TBL_FACT f,TBL_DIMENSION1 d1,TBL_DIMENSION2 d2 WHERE f.r1 = d1.r_num AND f.r2 = d2.r_num AND d1.day_of_week = 'MONDAY' AND d2.day_of_week = 'THURSDAY'; Execution Plan ---------------------------------------------------------- Plan hash value: 2602122435 -------------------------------------------------------------------------------------------------- | Id | Operation | Name | Rows | Bytes | Cost(%CPU)| Time | -------------------------------------------------------------------------------------------------- | 0 | SELECT STATEMENT | | 1 | 28 | 3 (0)| 00:00:01 | | 1 | SORT AGGREGATE | | 1 | 28 | | | |* 2 | HASH JOIN | | 1020 | 28560 | 3 (0)| 00:00:01 | | 3 | JOIN FILTER CREATE | :BF0000 | 2 | 44 | 1 (0)| 00:00:01 | | 4 | MERGE JOIN CARTESIAN | | 2 | 44 | 1 (0)| 00:00:01 | |* 5 | TABLE ACCESS INMEMORY FULL | TBL_DIMENSION1 | 1 | 11 | 1 (0)| 00:00:01 | | 6 | BUFFER SORT | | 1 | 11 | 1 (0)| 00:00:01 | |* 7 | TABLE ACCESS INMEMORY FULL| TBL_DIMENSION2 | 1 | 11 | 1 (0)| 00:00:01 | | 8 | JOIN FILTER USE | :BF0000 | 50000 | 292K| 2 (0)| 00:00:01 | |* 9 | TABLE ACCESS INMEMORY FULL | TBL_FACT | 50000 | 292K| 2 (0)| 00:00:01 | -------------------------------------------------------------------------------------------------- Predicate Information (identified by operation id): --------------------------------------------------- 2 - access("F"."R1"="D1"."R_NUM" AND "F"."R2"="D2"."R_NUM") 5 - inmemory("D1"."DAY_OF_WEEK"='MONDAY') filter("D1"."DAY_OF_WEEK"='MONDAY') 7 - inmemory("D2"."DAY_OF_WEEK"='THURSDAY') filter("D2"."DAY_OF_WEEK"='THURSDAY') 9 - inmemory(SYS_OP_BLOOM_FILTER(:BF0000,"F"."R1")) filter(SYS_OP_BLOOM_FILTER(:BF0000,"F"."R1")) SQL> SQL> alter session set star_transformation_enabled=true; Session altered. SQL> SQL> select count(*) FROM TBL_FACT f,TBL_DIMENSION1 d1,TBL_DIMENSION2 d2 WHERE f.r1 = d1.r_num AND f.r2 = d2.r_num AND d1.day_of_week = 'MONDAY' AND d2.day_of_week = 'THURSDAY'; Execution Plan ---------------------------------------------------------- Plan hash value: 1426459311 -------------------------------------------------------------------------------------------------- | Id | Operation | Name | Rows | Bytes | Cost(%CPU)| Time | -------------------------------------------------------------------------------------------------- | 0 | SELECT STATEMENT | | 1 | 6 | 2 (0)| 00:00:01 | | 1 | SORT AGGREGATE | | 1 | 6 | | | | 2 | BITMAP CONVERSION COUNT | | 500 | 3000 | 1 (0)| 00:00:01 | | 3 | BITMAP AND | | | | | | | 4 | BITMAP MERGE | | | | | | | 5 | BITMAP KEY ITERATION | | | | | | |* 6 | TABLE ACCESS INMEMORY FULL| TBL_DIMENSION1 | 1 | 11 | 1 (0)| 00:00:01 | |* 7 | BITMAP INDEX RANGE SCAN | IDX_B1 | | | | | | 8 | BITMAP MERGE | | | | | | | 9 | BITMAP KEY ITERATION | | | | | | |* 10 | TABLE ACCESS INMEMORY FULL| TBL_DIMENSION2 | 1 | 11 | 1 (0)| 00:00:01 | |* 11 | BITMAP INDEX RANGE SCAN | IDX_B2 | | | | | -------------------------------------------------------------------------------------------------- Predicate Information (identified by operation id): --------------------------------------------------- 6 - inmemory("D1"."DAY_OF_WEEK"='MONDAY') filter("D1"."DAY_OF_WEEK"='MONDAY') 7 - access("F"."R1"="D1"."R_NUM") 10 - inmemory("D2"."DAY_OF_WEEK"='THURSDAY') filter("D2"."DAY_OF_WEEK"='THURSDAY') 11 - access("F"."R2"="D2"."R_NUM") Note ----- - star transformation used for this statement :) ===================== Path to Documentation: ===================== Database Data Warehousing Guide -> 21 SQL for Modeling -> Star Schemas Database Data Warehousing Guide 4 Data Warehousing Optimizations and Techniques -> Optimizing Star Queries
Thank you for visiting this blog…