4 Using Indexes in Database Applications

This chapter explains how to use indexes in database applications.

Topics:

Privileges Needed to Create Indexes
Guidelines for Application-Specific Indexes
Examples of Creating Basic Indexes
When to Use Domain Indexes
When to Use Function-Based Indexes
See Also:
- Oracle Database Administrator's Guide for information about creating and managing indexes
- Oracle Database Performance Tuning Guide for detailed information about using indexes
- Oracle Database SQL Language Reference for the syntax of statements to work with indexes
- Oracle Database Administrator's Guide for information about creating hash clusters to improve performance, as an alternative to indexing

Privileges Needed to Create Indexes

When using indexes in an application, you might need to ask the DBA to grant privileges or make changes to initialization parameters.

To create an index, you must own, or have the INDEX object privilege for, the corresponding table. The schema that contains the index must also have a quota for the tablespace intended to contain the index, or the UNLIMITED TABLESPACE system privilege. To create an index in another user's schema, you must have the CREATE ANY INDEX system privilege.

Guidelines for Application-Specific Indexes

You can create indexes on columns to speed up queries. Indexes provide faster access to data for operations that return a small portion of a table's rows.

In general, create an index on a column in any of these situations:

The column is queried frequently.
A referential constraint exists on the column.
A UNIQUE key constraint exists on the column.

You can create an index on any column; however, if the column is not used in any of these situations, creating an index on the column does not increase performance and the index takes up resources unnecessarily.

Although the database creates an index for you on a column with a constraint, explicitly creating an index on such a column is recommended.

You can use these techniques to determine which columns are best candidates for indexing:

Use the EXPLAIN PLAN feature to show a theoretical execution plan of a given query statement.
Use the dynamic performance view V$SQL_PLAN to determine the actual execution plan used for a given query statement.

Sometimes, if an index is not being used by default and it would be more efficient to use that index, you can use a query hint so that the index is used.

See Also:

Oracle Database Performance Tuning Guide for information about using the V$SQL_PLAN view, the EXPLAIN PLAN statement, query hints, and measuring the performance benefits of indexes
Oracle Database Reference for general information about the V$SQL_PLAN view

Topics:

Which Come First, Data or Indexes?
Create a Temporary Table Space Before Creating Indexes
Index the Correct Tables and Columns
Limit the Number of Indexes for Each Table
Choose Column Order in Composite Indexes
Gather Index Statistics
Drop Unused Indexes

Which Come First, Data or Indexes?

Typically, you insert or load data into a table (using SQL*Loader or Import) before creating indexes. Otherwise, the overhead of updating the index slows down the insert or load operation. The exception to this rule is that you must create an index for a cluster before you insert any data into the cluster.

Create a Temporary Table Space Before Creating Indexes

When you create an index on a table that has data, Oracle Database must use sort space to create the index. The database uses the sort space in memory allocated for the creator of the index (the amount for each user is determined by the initialization parameter SORT_AREA_SIZE), but the database must also swap sort information to and from temporary segments allocated on behalf of the index creation. If the index is extremely large, Oracle recommends following these steps:

Create a temporary tablespace using the CREATE TABLESPACE statement.
Use the TEMPORARY TABLESPACE option of the ALTER USER statement to make this your temporary tablespace.
Create the index using the CREATE INDEX statement.
Drop this tablespace using the DROP TABLESPACE statement. Then use the ALTER USER statement to reset your temporary tablespace to your original temporary tablespace.

Under certain conditions, you can load data into a table with the SQL*Loader "direct path load", and an index can be created as data is loaded.

See Also:

Oracle Database Utilities for information about direct path load

Index the Correct Tables and Columns

Use these guidelines for determining when to create an index:

Create an index if you frequently want to retrieve less than about 15% of the rows in a large table. This threshold percentage varies greatly, however, according to the relative speed of a table scan and how clustered the row data is about the index key. The faster the table scan, the lower the percentage; the more clustered the row data, the higher the percentage.
Index columns that are used for joins to improve join performance.
Primary and unique keys automatically have indexes, but you might want to create an index on a foreign key; see Chapter 5, "Maintaining Data Integrity in Database Applications," for more information.
Small tables do not require indexes; if a query is taking too long, then the table might have grown from small to large.

Some columns are strong candidates for indexing. Columns with one or more of these characteristics are good candidates for indexing:

Values are unique in the column, or there are few duplicates.
There is a wide range of values (good for regular indexes).
There is a small range of values (good for bitmap indexes).
The column contains many nulls, but queries often select all rows having a value. In this case, a comparison that matches all the non-null values, such as:
```
WHERE COL_X >= -9.99 *power(10,125)
```
is preferable to
```
WHERE COL_X IS NOT NULL
```
This is because the first uses an index on COL_X (if COL_X is a numeric column).

Columns with these characteristics are less suitable for indexing:

There are many nulls in the column and you do not search on the non-null values.

LONG and LONG RAW columns cannot be indexed.

The size of a single index entry cannot exceed roughly one-half (minus some overhead) of the available space in the data block. Consult with the database administrator for assistance in determining the space required by an index.

Limit the Number of Indexes for Each Table

The more indexes, the more overhead is incurred as the table is altered. When rows are inserted or deleted, all indexes on the table must be updated. When a column is updated, all indexes on the column must be updated.

You must weigh the performance benefit of indexes for queries against the performance overhead of updates. For example, if a table is primarily read-only, you might use more indexes; but, if a table is heavily updated, you might use fewer indexes.

Choose Column Order in Composite Indexes

Although you can specify columns in any order in the CREATE INDEX statement, the order of columns in the CREATE INDEX statement can affect query performance. In general, put the column expected to be used most often first in the index. You can create a composite index (using several columns), and use the same index for queries that reference all or some of these columns.

Example 4-1 VENDOR_PARTS Table

DROP TABLE vendor_parts;
CREATE TABLE vendor_parts (
  vendor_id  VARCHAR2(9),
  part_no    VARCHAR2(7),
  unit_cost  REAL
);
 
INSERT INTO vendor_parts (vendor_id, part_no, unit_cost)
VALUES (1012, 10440, .25);
 
INSERT INTO vendor_parts (vendor_id, part_no, unit_cost)
VALUES (1012, 10441, .39);
 
INSERT INTO vendor_parts (vendor_id, part_no, unit_cost)
VALUES (1012, 457, 4.95);
 
INSERT INTO vendor_parts (vendor_id, part_no, unit_cost)
VALUES (1010, 10440, .27);
 
INSERT INTO vendor_parts (vendor_id, part_no, unit_cost)
VALUES (1010, 457, 5.12);
 
INSERT INTO vendor_parts  (vendor_id, part_no, unit_cost)
VALUES (1220, 8300, 1.33);
 
INSERT INTO vendor_parts (vendor_id, part_no, unit_cost)
VALUES (1012, 8300, 1.19);
 
INSERT INTO vendor_parts (vendor_id, part_no, unit_cost) 
VALUES (1292, 457, 5.28);

Query:

SELECT * FROM vendor_parts
ORDER BY vendor_id;

Result:

VENDOR_ID PART_NO  UNIT_COST
--------- ------- ----------
1010      10440          .27
1010      457           5.12
1012      457           4.95
1012      8300          1.19
1012      10441          .39
1012      10440          .25
1220      8300          1.33
1292      457           5.28
 
8 rows selected.

Assume that there are five vendors, and each vendor has about 1000 parts.

Suppose that the VENDOR_PARTS table is commonly queried by SQL statements such as:

SELECT * FROM vendor_parts
WHERE part_no = 457 AND vendor_id = 1012
ORDER BY vendor_id;

Result:

VENDOR_ID PART_NO  UNIT_COST
--------- ------- ----------
1012      457           4.95
 
1 row selected.

To increase the performance of such queries, you might create a composite index putting the most selective column first; that is, the column with the most values:

CREATE INDEX ind_vendor_id
ON vendor_parts (part_no, vendor_id);

Composite indexes speed up queries that use the leading portion of the index. So in this example, the performance of queries with WHERE clauses using only the PART_NO column improve also. Because there are only five distinct values, placing a separate index on VENDOR_ID serves no purpose.

Gather Index Statistics

The database can use indexes more effectively when it has statistical information about the tables involved in the queries. You or the DBA can periodically gather statistics by invoking procedures such as DBMS_STATS.GATHER_TABLE_STATISTICS and DBMS_STATS.GATHER_SCHEMA_STATISTICS. For information about these procedures, see Oracle Database PL/SQL Packages and Types Reference.

Drop Unused Indexes

You might drop an index if:

It does not speed up queries. The table might be very small, or there might be many rows in the table but very few index entries.
The queries in your applications do not use the index.

To find out if an index is being used, you can monitor it. If you see that the index is never used, rarely used, or used in a way that seems to provide no benefit, you can either drop it immediately or you can make it invisible until you are sure that you do not need it, and then drop it. If you discover that you do need the invisible index, you can make it visible again.

When you drop an index, all extents of the index's segment are returned to the containing tablespace and become available for other objects in the tablespace.

To drop an index, use the SQL statement DROP INDEX. For example, this statement drops the index named Emp_name:

DROP INDEX Emp_ename;

If you drop a table, then all associated indexes are dropped.

To drop an index, the index must be contained in your schema or you must have the DROP ANY INDEX system privilege.

See Also:

Oracle Database Administrator's Guide for information about monitoring index usage
Oracle Database Administrator's Guide for information about making indexes invisible
Oracle Database SQL Language Reference for information about the DROP INDEX statement

Examples of Creating Basic Indexes

You can create an index for a table to improve the performance of queries issued against the corresponding table. You can also create an index for a cluster. You can create a composite index on multiple columns up to a maximum of 32 columns. A composite index key cannot exceed roughly one-half (minus some overhead) of the available space in the data block.

Oracle Database automatically creates an index to enforce a UNIQUE or PRIMARY KEY constraint. In general, it is better to create such constraints to enforce uniqueness, instead of using the obsolete CREATE UNIQUE INDEX syntax.

Use the SQL statement CREATE INDEX to create an index.

Example 4-2 Creating Indexes

Create index for single column, to speed up queries that test that column:

CREATE INDEX emp_phone ON EMPLOYEES(PHONE_NUMBER);

Create index for single column, specifying some physical attributes for index:

CREATE INDEX emp_lastname ON EMPLOYEES (LAST_NAME)
  STORAGE (
    INITIAL     20K
    NEXT        20k
    PCTINCREASE 75
  )
  PCTFREE  0;

Create index for two columns, to speed up queries that test either first column or both columns:

CREATE INDEX emp_id_email ON EMPLOYEES(EMPLOYEE_ID, EMAIL);

For query that sorts on UPPER(LASTNAME), index on LAST_NAME column does not speed up operation, and might be slow to invoke function for each result row. Create function-based index that precomputes the result of the function for each column value,speeding up queries that use the function for searching or sorting:

CREATE INDEX emp_upper_lastname ON EMPLOYEES(UPPER(LAST_NAME));

When to Use Domain Indexes

Domain indexes are appropriate for special-purpose applications implemented using data cartridges. The domain index helps to manipulate complex data, such as spatial, audio, or video data. If you must develop such an application, see Oracle Database Data Cartridge Developer's Guide.

Oracle Database supplies specialized data cartridges to help manage these kinds of complex data. So, if you must create a search engine, or a geographic information system, you can do much of the work simply by creating the right kind of index.

When to Use Function-Based Indexes

A function-based index is an index built on an expression. It extends your indexing capabilities beyond indexing on a column. A function-based index increases the variety of ways in which you can access data.

Note:

The index is more effective if you gather statistics for the table or schema, using the procedures in the DBMS_STATS package.
The index cannot contain any null values. Either ensure that the appropriate columns contain no null values, or use the NVL function in the index expression to substitute some other value for nulls.

The expression indexed by a function-based index can be an arithmetic expression or an expression that contains a PL/SQL function, package function, C callout, or SQL function. Function-based indexes also support linguistic sorts based on collation keys, efficient linguistic collation of SQL statements, and case-insensitive sorts.

Like other indexes, function-based indexes improve query performance. For example, if you must access a computationally complex expression often, then you can store it in an index. Then when you must access the expression, it is available. You can find a detailed description of the advantages of function-based indexes in "Advantages of Function-Based Indexes".

Function-based indexes have all of the same properties as indexes on columns. Unlike indexes on columns that can be used by both cost-based and rule-based optimization, however, function-based indexes can be used by only by cost-based optimization. Other restrictions on function-based indexes are described in "Restrictions on Function-Based Indexes".

See Also:

Oracle Database Concepts for general information about function-based indexes
Oracle Database Administrator's Guide for information about creating function-based indexes

Topics:

Advantages of Function-Based Indexes
Restrictions on Function-Based Indexes
Examples of Function-Based Indexes

Advantages of Function-Based Indexes

Function-based indexes:

Increase the number of situations where the optimizer can perform a range scan instead of a full table scan (as in Example 4-3).

Range scans typically produce fast response times if the predicate selects less than 15% of the rows of a large table. The optimizer can estimate how many rows are selected by expressions more accurately if the expressions are materialized in a function-based index. (Expressions of function-based indexes are represented as virtual columns and ANALYZE can build histograms on such columns.)
Precompute the value of a computationally intensive function and store it in the index.

An index can store computationally intensive expression that you access often. When you must access a value, it is available, greatly improving query execution performance.
Create indexes on object columns and REF columns.

Methods that describe objects can be used as functions on which to build indexes. For example, you can use the MAP method to build indexes on an ADT column.
Create more powerful sorts.

You can perform case-insensitive sorts with the UPPER and LOWER functions, descending order sorts with the DESC keyword, and linguistic-based sorts with the NLSSORT function.

Note:

Oracle Database sorts columns with the DESC keyword in descending order. Such indexes are treated as function-based indexes. Descending indexes cannot be bitmapped or reverse, and cannot be used in bitmapped optimizations. To get the DESC functionality before Oracle Database version 8, remove the DESC keyword from the CREATE INDEX statement.

In Example 4-3, an index is built on (Column_a + Column_b); therefore, the expression in the WHERE clause of the SELECT statement allows the optimizer to perform a range scan instead of a full table scan.

Example 4-3 Function-Based Index Allows Optimizer to Perform Range Scan

DROP TABLE Example_tab;
CREATE TABLE Example_tab (
  Column_a INTEGER,
  Column_b INTEGER
);

INSERT INTO Example_tab (Column_a, Column_b)
VALUES (1, 2);
 
INSERT INTO Example_tab (Column_a, Column_b) 
VALUES (2, 4);
 
INSERT INTO Example_tab (Column_a, Column_b) 
VALUES (3, 6);
 
INSERT INTO Example_tab (Column_a, Column_b) 
VALUES (4, 8);
 
INSERT INTO Example_tab (Column_a, Column_b) 
VALUES (5, 10);

Query:

SELECT * FROM Example_tab ORDER BY Column_a;

Result:

COLUMN_A   COLUMN_B
---------- ----------
         1          2
         2          4
         3          6
         4          8
         5         10
 
5 rows selected.

Create index:

CREATE INDEX Idx ON Example_tab(Column_a + Column_b);

Query:

SELECT * FROM Example_tab
WHERE Column_a + Column_b < 10
ORDER BY Column_a;

Result:

COLUMN_A   COLUMN_B
---------- ----------
         1          2
         2          4
         3          6
 
3 rows selected.

In Example 4-4:

The function-based index Distance_index calls the object method Distance_from_equator for each city in a table. The method is applied to the object column Reg_Obj. A query uses Distance_index to quickly find cities that are more than 1000 miles from the equator. (The table is not populated for the example, so the query returns no rows.)
The function-based index Compare_index stores the temperature delta and the maximum temperature. The result of the delta is sorted in descending order. A query uses Compare_index to quickly find table rows where the temperature delta is less than 20 and the maximum temperature is greater than 75. (The table is not populated for the example, so the query returns no rows.)

Example 4-4 Function-Based Indexes

DROP TABLE Weatherdata_tab;
CREATE TABLE Weatherdata_tab (
  Reg_obj INTEGER,
  Maxtemp INTEGER,
  Mintemp INTEGER
);
 
CREATE OR REPLACE FUNCTION Distance_from_equator (
  Reg_obj IN INTEGER
) RETURN INTEGER
  DETERMINISTIC
IS
BEGIN
  RETURN(3000);
END;
/

Create index:

CREATE INDEX Distance_index
ON Weatherdata_tab (Distance_from_equator (Reg_obj));

Query:

SELECT * FROM Weatherdata_tab
WHERE (Distance_from_equator (Reg_Obj)) > '1000';

Result:

no rows selected

Create index:

CREATE INDEX Compare_index
  2    ON Weatherdata_tab ((Maxtemp - Mintemp) DESC, Maxtemp);

Query:

SELECT * FROM Weatherdata_tab
WHERE ((Maxtemp - Mintemp) < '20' AND Maxtemp > '75');

Result:

no rows selected

Restrictions on Function-Based Indexes

Function-based indexes have these restrictions:

Only cost-based optimization can use function-based indexes. Remember to invoke DBMS_STATS.GATHER_TABLE_STATISTICS or DBMS_STATS.GATHER_SCHEMA_STATISTICS, for the function-based index to be effective.
Any top-level or package-level PL/SQL functions that are used in the index expression must be declared as DETERMINISTIC. That is, they always return the same result given the same input, for example, the UPPER function. You must ensure that the subprogram really is deterministic, because Oracle Database does not check that the assertion is true.

These semantic rules demonstrate how to use the keyword DETERMINISTIC:
- You can declare a top level subprogram as DETERMINISTIC.
- You can declare a PACKAGE level subprogram as DETERMINISTIC in the PACKAGE specification but not in the PACKAGE BODY. An exception is raised if DETERMINISTIC is used inside a PACKAGE BODY.
- You can declare a private subprogram (declared inside another subprogram or a PACKAGE BODY) as DETERMINISTIC.
- A DETERMINISTIC subprogram can invoke another subprogram whether the invoked subprogram is declared as DETERMINISTIC or not.
If you change the semantics of a DETERMINISTIC function and recompile it, then existing function-based indexes and materialized views report results for the prior version of the function. Thus, if you change the semantics of a function, you must manually rebuild any dependent function-based indexes and materialized views.
Expressions in a function-based index cannot contain any aggregate functions. The expressions must reference only columns in a row in the table.
You must analyze the table or index before the index is used.
Bitmap optimizations cannot use descending indexes.
Function-based indexes are not used when OR-expansion is done.
The index function cannot be marked NOT NULL. To avoid a full table scan, you must ensure that the query cannot fetch null values.

Function-based indexes cannot use expressions that return VARCHAR2 or RAW data types of unknown length from PL/SQL functions. A workaround is to limit the size of the function's output by indexing a substring of known length. For example:

CREATE OR REPLACE FUNCTION initials (
  name  IN VARCHAR2
) RETURN VARCHAR2
  DETERMINISTIC
IS
BEGIN
  RETURN('A. J.');
END;
/

/* Invoke SUBSTR both when creating index and when referencing
   function in queries. */
 
CREATE INDEX func_substr_index ON
EMPLOYEES(SUBSTR(initials(FIRST_NAME),1,10));
 
SELECT SUBSTR(initials(FIRST_NAME),1,10) FROM EMPLOYEES;

See Also:

Oracle Database PL/SQL Language Reference for CREATE FUNCTION restrictions

Examples of Function-Based Indexes

Function-Based Index for Case-Insensitive Searches
Precomputing Arithmetic Expressions with a Function-Based Index
Function-Based Index for Language-Dependent Sorting

Function-Based Index for Case-Insensitive Searches

This statement allows faster case-insensitive searches in table EMP_TAB.

CREATE INDEX emp_lastname ON EMPLOYEES (UPPER(LAST_NAME));

The SELECT statement uses the function-based index on UPPER(LAST_NAME) to return all of the employees with name like :KEYCOL.

SELECT * FROM EMPLOYEES WHERE UPPER(LAST_NAME) LIKE 'J%S_N';

Precomputing Arithmetic Expressions with a Function-Based Index

This statement computes a value for each row using columns A, B, and C, and stores the results in the index.

DROP TABLE Fbi_tab;
CREATE TABLE Fbi_tab (
  a INTEGER, 
  b INTEGER, 
  c INTEGER
);

CREATE INDEX Idx ON Fbi_tab (a + b * (c - 1), a, b);

The SELECT statement can either use index range scan (because the expression is a prefix of index Idx) or index fast full scan (which might be preferable if the index has specified a high parallel degree).

SELECT a FROM Fbi_tab WHERE a + b * (c - 1) < 100;

Function-Based Index for Language-Dependent Sorting

This example demonstrates how a function-based index can be used to sort based on the collation order for a national language. The NLSSORT function returns a sort key for each name, using the collation sequence GERMAN.

DROP TABLE nls_tab;
CREATE TABLE nls_tab (NAME VARCHAR2(80));

CREATE INDEX nls_index
  ON nls_tab (NLSSORT(NAME, 'NLS_SORT = GERMAN'));

The SELECT statement selects all of the contents of the table and orders it by NAME. The rows are ordered using the German collation sequence. The Globalization Support parameters are not needed in the SELECT statement, because in a German session, NLS_SORT is set to German and NLS_COMP is set to ANSI.

SELECT * FROM nls_tab
WHERE NAME IS NOT NULL
ORDER BY NAME;