I am learning about all things SQL, and upon reading about indexes I naturally wanted to create a simple ad hoc test to see how indexes REALLY affect performance (rather than just taking the reading at face value). However, I am having some big problems with the tests I was trying to set up. In short, no matter how big I make my tables, with or without indexes, the queries finish very quickly, in about 0.5 seconds every time. I was hoping that I could create tables so large that certain queries would take a long time to execute, and then by creating indexes I could reduce those execution times and compare the reductions across different types of indexes. However, I am having no luck at all! Here is how I set up my sample table:
CREATE TABLE EMPLOYEE
(EmpID NUMBER(6),
Lname VARCHAR2(20),
Fname VARCHAR2(20),
Gender CHAR(1),
HomeState CHAR(2),
BirthDate DATE,
HiredDate DATE,
Occupation VARCHAR2(20),
Salary NUMBER(6),
NumDep NUMBER(1));
The EMPLOYEE table has 10 attributes, and I created a separate Java program to make SQL commands to populate the table with random info. For example, here is the output from the Java program with an input size of 3:
INSERT INTO EMPLOYEE VALUES (1, 'HENRY', 'VIRGINIA', 'F', 'MA', '14-NOV-1966', '27-APR-1987', 'MANAGER', '80514', '5');
INSERT INTO EMPLOYEE VALUES (2, 'PARSONS', 'KEVIN', 'M', 'KS', '11-DEC-1961', '14-JAN-2004', 'NURSE', '74416', '3');
INSERT INTO EMPLOYEE VALUES (3, 'GARRETT', 'JIMMY', 'M', 'NE', '24-MAR-1963', '20-MAY-1992', 'SERVICE', '87116', '2');
The last name is chosen from 500 common last names randomly.
The first name is chosen from 100 gender-specific first names randomly.
The state is chosen randomly (from a list of 50, obviously).
The dates are chosen randomly within certain reasonable limits.
The occupation is chosen randomly from a list of 20 common jobs.
The salary is chosen randomly between 50000 and 99999 (which leads to some unrealistic salary/job matches, but that is not really the point here).
The number of dependents is chosen randomly between 0 and 5.
I figured that one good thing about the above setup is the range of domain sizes, i.e. 500 possible last names, 100 possible first names, 50 states, 20 jobs, etc. I know from reading that indexes work better when the number of possible values is smaller compared to the overall number of records, and I was hoping to be able to prove this with my testing. However, as I said before, my testing is failing horribly.
I started off populating the EMPLOYEE table with only 500 records, and found that every query I did (example below) took only 0.5 seconds in SQL Developer (the software I am using to connect to Oracle 11g).
SELECT * FROM EMPLOYEE WHERE salary BETWEEN 60000 and 62000;
So then I tried with 1,000 records, then 10,000 records, then 100,000 records, then 500,000 records (which I had to insert in batches of 100,000 since my clipboard and/or SQL Developer can’t seem to handle such a large script)… and the results are still the same. It solves every query in 0.5 seconds, which blows my mind since it takes over 60 seconds to insert 100,000 records. Also, SQL Developer can’t seem to return more than 5,000 results in a single query script execution, and if I come close to this max it sometimes takes 1.0 seconds instead of 0.5 seconds, but I suspect this is mainly because of the time needed to print all 5,000 results in the script output window.
I have tried my queries on various attributes, and I’ve tried with and without indexes on those attributes, and the indexes seem to make no difference. I am very disappointed here. I was expecting my testing to yield much more tangible results. I was also hoping to use this topic, along with the results of this testing for a class paper that I need to write soon, but with my testing yielding no results that is going to be kind of hard. Any suggestions?
P.S. Thanks for reading this far if you did! Really sorry for the length…
If you’re doing SELECT *, Oracle has to visit the table blocks for every matching row anyway, because the index alone can’t supply all the columns, so it’s very possible that it’s not using your index. To determine if it’s using your index, you need to look at the explain plan or the execution plan. That will tell you what route Oracle chose to get your data: a full table scan, an index scan, etc. Here’s some documentation on using EXPLAIN PLAN in 10g.
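As a rough sketch of what checking the plan could look like for the salary query from the question: the index name emp_salary_idx is made up for illustration (the question doesn't say what its indexes are called), and the statistics-gathering step is an assumption on my part that the table's statistics may be stale after the bulk inserts.

```sql
-- Hypothetical index name; the question does not name its indexes.
CREATE INDEX emp_salary_idx ON EMPLOYEE (Salary);

-- Assumption: statistics may be stale after the bulk inserts, so refresh
-- them (table plus its indexes) before asking the optimizer for a plan.
BEGIN
  DBMS_STATS.GATHER_TABLE_STATS(
    ownname => USER,
    tabname => 'EMPLOYEE',
    cascade => TRUE);
END;
/

-- Store the optimizer's chosen plan without actually running the query.
EXPLAIN PLAN FOR
SELECT * FROM EMPLOYEE WHERE Salary BETWEEN 60000 AND 62000;

-- Show the plan that was just stored in the plan table.
SELECT * FROM TABLE(DBMS_XPLAN.DISPLAY);
```

In the Operation column of the output, TABLE ACCESS FULL means the index was ignored, while INDEX RANGE SCAN followed by TABLE ACCESS BY INDEX ROWID means it was used. Which one you get depends on your data and statistics, so I won't guess at the result.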
As for the times in SQL Dev, once Oracle has read all the data blocks for the table, it holds them in cache for a while. Reading blocks from disk is the part of the IO that often takes the longest, so repeated queries against an already-cached table come back quickly. The 5000 limit in SQL Dev can be set somewhere in the options, but I forget where.
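If caching is what's flattening your timings, one option on a private test instance is to empty the buffer cache between runs. This is only a sketch: it assumes your account has the ALTER SYSTEM privilege, and the operating system or storage layer may still cache the data files, so the first run afterwards isn't guaranteed to hit the disk.

```sql
-- Test systems only: this empties the buffer cache for every session
-- on the instance, not just yours.
ALTER SYSTEM FLUSH BUFFER_CACHE;

-- Re-run the query from the question against a cold cache.
SELECT * FROM EMPLOYEE WHERE Salary BETWEEN 60000 AND 62000;
```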
You might be better off doing this in SQL*Plus, and setting AUTOTRACE to on. You can get a record of the actual IO calls that way. Look for “db block gets” and “consistent gets”, I think.
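Something along these lines in SQL*Plus should do it. Treat it as a sketch: the statistics part of AUTOTRACE needs the PLUSTRACE role (or equivalent access to the V$ views), which I'm assuming your account has or can be granted.

```sql
-- Print elapsed time after each statement.
SET TIMING ON

-- Show the execution plan and the IO statistics, but suppress the
-- result rows so printing them does not dominate the timing.
SET AUTOTRACE TRACEONLY

SELECT * FROM EMPLOYEE WHERE Salary BETWEEN 60000 AND 62000;

-- Switch tracing back off when you are done.
SET AUTOTRACE OFF
```

Run it once with the index and once without, and compare the "consistent gets" and "physical reads" figures rather than the wall-clock time.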
Really, without an execution plan or an explain plan to tell us what Oracle is really doing to get your data, it’s hard to tell if it’s even using your indexes.