I’ll preface this question by stating that I’m using Oracle 10g Enterprise edition and

Question

0

Editorial Team

Asked: May 12, 20262026-05-12T08:00:15+00:00 2026-05-12T08:00:15+00:00

I’ll preface this question by stating that I’m using Oracle 10g Enterprise edition and

0

I’ll preface this question by stating that I’m using Oracle 10g Enterprise edition and I’m relatively new to Oracle.

I’ve got a table with the following schema:

ID           integer (pk)  -- unique index
PERSON_ID    integer (fk)  -- b-tree index
NAME_PART    nvarchar      -- b-tree index
NAME_PART_ID integer (fk)  -- bitmap index

The PERSON_ID is the foreign key for the unique id of a person record. The NAME_PART_ID is the foreign key of a lookup table with static values like “First Name”, “Middle Name”, “Last Name”, etc. The point of the table is to store individual parts of people’s names separately. Every person record has at least a first name. When trying to pull the data out, I first considered using joins, like so:

select
    first_name.person_id,
    first_name.name_part,
    middle_name.name_part,
    last_name.name_part
from
    NAME_PARTS first_name
left join
    NAME_PARTS middle_name
    on first_name.person_id = middle_name.person_id
left join
    NAME_PARTS last_name
    on first_name.person_id = last_name.person_id
where
    first_name.name_part_id = 1
    and middle_name.name_part_id = 2
    and last_name.name_part_id = 3;

But the table has tens of millions of records, and the bitmap index for the NAME_PART_ID column isn’t being used. The explain plan indicates that the optimizer is using full table scans and hash joins to retrieve the data.

Any suggestions?

EDIT: The reason the table was designed this way was because the database is used across several different cultures, each of which has different conventions for how individuals are named (e.g. in some middle-eastern cultures, individuals usually have a first name, then their father’s name, then his father’s name, etc). It is difficult to create one table with multiple columns that account for all of the cultural differences.

Report

Leave an answer
Cancel reply

You must login to add an answer.

Need An Account,

1 Answer

Editorial Team · Answer 1 · 2026-05-12T08:00:16+00:00

Given that you’re essentially doing a full table scan anyway (as your query is extracting all data from this table, excluding the few rows that wouldn’t have name parts that were first, middle or last), you may want to consider writing the query so that it just returns the data in a slightly different format, such as:

  SELECT person_id
       , name_part_id
       , name_part
    FROM NAME_PART
   WHERE name_part_id IN (1, 2, 3)
ORDER BY person_id
       , name_part_id;

Of course, you’ll end up with 3 rows instead of one for each name, but it may be trivial for your client code to roll these together. You can also roll the 3 rows up into one by using decode, group by and max:

  SELECT person_id
       , max(decode(name_part_id, 1, name_part, null)) first
       , max(decode(name_part_id, 2, name_part, null)) middle
       , max(decode(name_part_id, 3, name_part, null)) last
    FROM NAME_PART
   WHERE name_part_id IN (1, 2, 3)
GROUP BY person_id
ORDER BY person_id;

This will produce results identical to your original query. Both versions will only scan the table once (with a sort), instead of dealing with the 3-way join. If you made the table an index-organized table on the person_id index, you’d save the sort step.

I ran a test with a table with 56,150 persons, and here’s a rundown of the results:

Original query:

Execution Plan
----------------------------------------------------------

------------------------------------------------------------------------------
| Id  | Operation           | Name      | Rows  | Bytes |TempSpc| Cost (%CPU)|
------------------------------------------------------------------------------
|   0 | SELECT STATEMENT    |           |   113K|    11M|       |  1364   (2)|
|*  1 |  HASH JOIN          |           |   113K|    11M|  2528K|  1364   (2)|
|*  2 |   TABLE ACCESS FULL | NAME_PART | 56150 |  1864K|       |   229   (3)|
|*  3 |   HASH JOIN         |           | 79792 |  5298K|  2528K|   706   (2)|
|*  4 |    TABLE ACCESS FULL| NAME_PART | 56150 |  1864K|       |   229   (3)|
|*  5 |    TABLE ACCESS FULL| NAME_PART | 56150 |  1864K|       |   229   (3)|
------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   1 - access("FIRST_NAME"."PERSON_ID"="LAST_NAME"."PERSON_ID")
   2 - filter("LAST_NAME"."NAME_PART_ID"=3)
   3 - access("FIRST_NAME"."PERSON_ID"="MIDDLE_NAME"."PERSON_ID")
   4 - filter("FIRST_NAME"."NAME_PART_ID"=1)
   5 - filter("MIDDLE_NAME"."NAME_PART_ID"=2)

Statistics
----------------------------------------------------------
          0  recursive calls
          0  db block gets
       6740  consistent gets
          0  physical reads
          0  redo size
    5298174  bytes sent via SQL*Net to client
      26435  bytes received via SQL*Net from client
       3745  SQL*Net roundtrips to/from client
          0  sorts (memory)
          0  sorts (disk)
      56150  rows processed

My query #1 (3 rows/person):

Execution Plan
----------------------------------------------------------

-----------------------------------------------------------------------------
| Id  | Operation          | Name      | Rows  | Bytes |TempSpc| Cost (%CPU)|
-----------------------------------------------------------------------------
|   0 | SELECT STATEMENT   |           |   168K|  5593K|       |  1776   (2)|
|   1 |  SORT ORDER BY     |           |   168K|  5593K|    14M|  1776   (2)|
|*  2 |   TABLE ACCESS FULL| NAME_PART |   168K|  5593K|       |   230   (3)|
-----------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   2 - filter("NAME_PART_ID"=1 OR "NAME_PART_ID"=2 OR "NAME_PART_ID"=3)

Statistics
----------------------------------------------------------
          1  recursive calls
          0  db block gets
       1005  consistent gets
          0  physical reads
          0  redo size
    3799794  bytes sent via SQL*Net to client
      78837  bytes received via SQL*Net from client
      11231  SQL*Net roundtrips to/from client
          1  sorts (memory)
          0  sorts (disk)
     168450  rows processed

My query #2 (1 row/person):

Execution Plan
----------------------------------------------------------

-----------------------------------------------------------------------------
| Id  | Operation          | Name      | Rows  | Bytes |TempSpc| Cost (%CPU)|
-----------------------------------------------------------------------------
|   0 | SELECT STATEMENT   |           | 56150 |  1864K|       |  1115   (3)|
|   1 |  SORT GROUP BY     |           | 56150 |  1864K|  9728K|  1115   (3)|
|*  2 |   TABLE ACCESS FULL| NAME_PART |   168K|  5593K|       |   230   (3)|
-----------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   2 - filter("NAME_PART_ID"=1 OR "NAME_PART_ID"=2 OR "NAME_PART_ID"=3)

Statistics
----------------------------------------------------------
          1  recursive calls
          0  db block gets
       1005  consistent gets
          0  physical reads
          0  redo size
    5298159  bytes sent via SQL*Net to client
      26435  bytes received via SQL*Net from client
       3745  SQL*Net roundtrips to/from client
          1  sorts (memory)
          0  sorts (disk)
      56150  rows processed

Turns out, you can squeeze it a bit faster still; I tried to avoid the sort by adding an index hint to force the use of the person_id index. I managed to knock off another 10%, but it still looks like it’s sorting:

  SELECT /*+ index(name_part,NAME_PART_person_id) */ person_id
       , max(decode(name_part_id, 1, name_part)) first
       , max(decode(name_part_id, 2, name_part)) middle
       , max(decode(name_part_id, 3, name_part)) last
    FROM name_part
   WHERE name_part_id IN (1, 2, 3)
GROUP BY person_id
ORDER BY person_id;

Execution Plan
----------------------------------------------------------

-----------------------------------------------------------------------------------------------------
| Id  | Operation                      | Name                  | Rows  | Bytes |TempSpc| Cost (%CPU)|
-----------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT               |                       | 56150 |  1864K|       |  3385   (1)|
|   1 |  SORT GROUP BY                 |                       | 56150 |  1864K|  9728K|  3385   (1)|
|   2 |   INLIST ITERATOR              |                       |       |       |       |            |
|   3 |    TABLE ACCESS BY INDEX ROWID | NAME_PART             |   168K|  5593K|       |  2500   (1)|
|   4 |     BITMAP CONVERSION TO ROWIDS|                       |       |       |       |            |
|*  5 |      BITMAP INDEX SINGLE VALUE | NAME_PART_NAME_PART_ID|       |       |       |            |
-----------------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   5 - access("NAME_PART_ID"=1 OR "NAME_PART_ID"=2 OR "NAME_PART_ID"=3)

Statistics
----------------------------------------------------------
          1  recursive calls
          0  db block gets
        971  consistent gets
          0  physical reads
          0  redo size
    5298159  bytes sent via SQL*Net to client
      26435  bytes received via SQL*Net from client
       3745  SQL*Net roundtrips to/from client
          1  sorts (memory)
          0  sorts (disk)
      56150  rows processed

However, the plans above are all based on the assumption that you’re selecting from the entire table. If you constrain the results based on person_id (e.g., person_id between 55968 and 56000), it turns out that your original query with the hash joins is the fastest (27 vs. 106 consistent gets for the constraint I specified).

On the THIRD hand, if the queries above are being used to populate a GUI that uses a cursor to scroll over the result set (such that you would only see the first N rows of the result set initially – reproduced here by adding a “and rowcount < 50” predicate), my versions of the query once again become fast – very fast (4 consistent gets vs. 417).

The moral of the story is that it really depends exactly how you’re accessing the data. Queries that work well on the entire result set may be worse when applied against different subsets.

Sign Up

Sign In

Forgot Password

The Archive Base Latest Questions

I’ll preface this question by stating that I’m using Oracle 10g Enterprise edition and

Leave an answerCancel reply

1 Answer

Leave an answer
Cancel reply