Using High-Speed Data Loading and Rolling Window Operations with Partitioning

Purpose

In this tutorial, you learn how to use Oracle Database for high-speed data loading and leverage Oracle Partitioning for a rolling window operation.

Time to Complete

Approximately 2 hours

Topics

This tutorial covers the following topics:

	Overview
	Scenario
	Prerequisites
	Implement Schema Changes for the Sales History Schema
	Load Data by Using External Tables
	Compare SQL*Loader to the External Table Loading and Transformation Process
	Leverage Table Compression to Save Disk Space and Reduce the TCO
	Perform a Rolling Window Operation Using Oracle Partitioning
	Summary

Viewing Screenshots

Place the cursor over this icon to load and view all the screenshots for this tutorial. (Caution: This action loads all screenshots simultaneously, so response time may be slow depending on your Internet connection.)

Note: Alternatively, you can place the cursor over an individual icon in the following steps to load and view only the screenshot associated with that step. You can hide an individual screenshot by clicking it.

Overview

Most of the time, online transaction processing (OLTP) source systems feeding a data warehouse are not directly connected to the data warehousing system for extracting new data. Commonly, the OLTP systems send data feeds in the form of external files. This data must be loaded into the data warehouse, preferably in parallel, thus leveraging the existing resources.

For example, due to the business needs and disk space constraints of the sample company used in this tutorial (MyCompany), only the data from the last three years is relevant for analysis. This means that with the insertion of new data, disk space has to be freed, either by purging the old data or by leveraging Oracle Database table compression. The maintenance of this so-called rolling window operation is performed by using Oracle Partitioning.

Back to Topic List

Prerequisites

Before starting this tutorial, you should:

1.	Install Oracle Database 11g.
2.	Create a directory named wkdir. Download and unzip etl.zip into the wkdir directory.

Back to Topic List

Scenario

External Tables

To load external files into their data warehouse, MyCompany uses the Oracle Database external table feature, which allows external data such as flat files to be exposed within the database just like regular database tables. External tables can be accessed by SQL, so that external files can be queried directly and in parallel using the full power of SQL, PL/SQL, and Java. External tables are often used in the extraction, transformation, and loading (ETL) process to combine data-transformation (through SQL) with data-loading in a single step. External tables are a very powerful feature with many possible applications in ETL and other database environments where flat files are processed. External tables are an alternative to using SQL*Loader.

Parallel Execution

Parallel execution dramatically reduces response time for data-intensive operations on large databases typically used with decision support systems (DSS) and data warehouses. You can also implement parallel execution on certain types of OLTP and hybrid systems. Simply expressed, parallelism is the idea of breaking down a task so that instead of one process doing all of the work in a query, many processes do parts of the work at the same time. For example, parallel execution can be used when four processes handle four different quarters in a year instead of one process handling all four quarters by itself.

Rolling Window Operations Using Oracle Partitioning

A very important task in the back office of a data warehouse is to keep the data synchronized with the various changes that are taking place in the OLTP (source) systems. In addition, the life span of the data from an analysis perspective is very often limited, so that older data must be purged from the target system while new data is loaded; this operation is often called a rolling window operation. Ideally, this operation should be done as fast as possible without any implication for the concurrent online access of the data warehousing system.

Back to Topic List

Implement Schema Changes for the Sales History Schema

Before starting the tasks for this OBE, you need to implement some changes to the existing Sales History (SH) schema. You need to create additional objects in the SH schema. In addition, you need to grant additional system privileges to the SH user. The SQL file for making these changes is modifySH_11g.sql. Perform the following steps:

Open a terminal window. Change your working directory to /home/oracle/wkdir by executing the following command from your terminal session:

cd wkdir

(Note: This tutorial assumes you have a /home/oracle/wkdir folder. If you do not, you will need to create one and unzip the contents of etl.zip into this folder.)

Start a SQL*Plus session and log in as the SH user with a password of SH.

Execute the modifySH_11g.sql script in your SQL*Plus session as follows:

@modifySH_11g.sql

The end of your output should match the image below.

Back to Topic List

Load Data by Using External Tables

In this section of the tutorial, you load data into the data warehouse using external tables.

To create and use external tables, perform the following steps:

1.	Create the necessary directory objects.
2.	Create the external table.
3.	Select from the external table.
4.	Provide transparent high-speed parallel access of external tables.
5.	Review Oracle's parallel insert capabilities.
6.	Perform parallel insert.

Back to Topic List

1. Create the Necessary Directory Objects

Before you create the external table, you need to create a directory object in the database that points to the directory on the file system where the data files will reside. Optionally, you can separate the location for the log, bad and discard files from the location of the data files. To create the directory, perform the following steps:

In a SQL*Plus session logged on as the SH user, execute the create_directory.sql script or copy the following SQL statements into your SQL*Plus session:

DROP DIRECTORY data_dir;
DROP DIRECTORY log_dir;
CREATE DIRECTORY data_dir AS '/home/oracle/wkdir';
CREATE DIRECTORY log_dir AS '/home/oracle/wkdir';

Move your mouse over this icon to see the image

The scripts are set up for a Linux system and assume that the files were extracted into /home/oracle/wkdir. Note that due to security reasons, symbolic links are not supported as DIRECTORY objects within the database.

1.	The metadata information for the table representation inside the database
2.	The HOW access parameter definition to extract the data from the external file

1.	Create a staging table.
2.	Load the data into the staging table by using SQL*Loader.
3.	Load the staging table into the target database.
4.	Drop the staging table.

1.	Prepare a stand-alone table with the new data
2.	Add the new data to the fact table
3.	Delete old data from the fact table

1.1	Modify the external table to use the Sales Q1 data.
1.2	Create the table for the new Sales Q1 data.
1.3	Load this table.
1.4	Create bitmap indexes for this table.
1.5	Create constraints for this table.

2.1	Create a new partition, if one does not already exist.
2.2	Exchange the partition. This is only a data dictionary operation and does not touch any data.
2.3	Select from the partition to control the success.
2.4	Split the most recent partition to ensure (business) data integrity .

3.1	Create an empty stand-alone table.
3.2	Create bitmap indexes for this table.
3.3	Create constraints for this table.
3.4	Show the data in the partition before the exchange.
3.5	Exchange the empty new table with the existing Q1-1998 partition.
3.6	Show the data in the partition after the exchange.

1.	Utilize Oracle Database 10g Enhancements for Local Index Maintenance
2.	Utilize Oracle's Global Index Maintenance

1.1	Split the most recent partition by using the default placement rules.
1.2	Split a partition by using the extended SQL syntax for local index maintenance.
1.3	Clean up.

2.1	Prepare for global index maintenance.
2.2	Build a global index.
2.3	Exchange a partition with global index maintenance and experience its effect on global indexes.
2.4	Exchange a partition without global index maintenance and experience its effect on global indexes.
2.5	Drop the global index and exchange back (clean up).

	Load data using external tables
	Compare usage of SQL*Loader to external tables
	Perform a table compression to save disk space
	Perform a rolling window operation using Oracle partitioning

Purpose

Topics

Viewing Screenshots

Overview

Scenario

External Tables

Parallel Execution

Rolling Window Operations Using Oracle Partitioning

Load Data by Using External Tables

1. Create the Necessary Directory Objects

2. Create the External Table

3. Select From the External Table

4. Provide Transparent High-Speed Parallel Access of External Tables

5. Review Oracle's Parallel Insert Capabilities

6. Perform Parallel Insert

Compare SQL*Loader to the External Table Loading and Transformation Process

1. Create a Staging Table

2. Load the Data into the Staging Table by Using SQL*Loader

3. Load the Staging Table into the Target Database

4. Drop the Staging Table

Leverage Table Compression to Save Disk Space and Reduce the TCO

1. Compress the Most Recent Partition

Perform a Rolling Window Operation Using Oracle Partitioning

Perform the Steps of the Rolling Window Operation:

1. Prepare a Stand-Alone Table with the New Data

1.1 Modify the External Table to use the Sales Q1 Data

1.2 Create the Table for the New Sales Q1 Data

1.3 Load This Table

1.4 Create Bitmap Indexes for This Table

1.5 Create Constraints for This Table

2. Add the New Data to the Fact Table

2.1 Create a New Partition

2.2 Exchange the Partition

2.3 Select From the Partition

2.4 Split the Most Recent Partition to Ensure (Business) Data Integrity

3. Delete Old Data from the Fact Table

3.1 Create an Empty Stand-Alone Table

3.2 Create Bitmap Indexes for This Table

3.3 Create Constraints for This Table

3.4 Show the Data in the Partition Before the Exchange

3.5 Exchange the Partition

3.6 Show the Data in the Partition After the Exchange

Oracle Database 10g Enhancements for Local Index Maintenance

1.1 Split the Most Recent Partition

1.2 Split a Partition Using the Extended SQL Syntax

1.3 Clean Up

2. Utilize Oracle's Global Index Maintenance

2.1 Prepare for Global Index Maintenance

2.2 Build a Global Index

2.3 Exchange a Partition with Global Index Maintenance

2.4 Exchange a Partition Without Global Index Maintenance

2.5 Drop Global Index and Exchange Back

Summary