I’m working on a web service that fetches data from an Oracle data source in chunks and passes it back to an indexing/search tool in XML format. I’m the C#/.NET guy, and I’m fairly fuzzy on parts of Oracle.
Our Oracle team gave us the following script to run, and it works well:
SELECT ROWID, [columns]
FROM [table]
WHERE ROWID IN (
    SELECT ROWID
    FROM (
        SELECT ROWID
        FROM [table]
        WHERE ROWID > '[previous_batch_last_rowid]'
        ORDER BY ROWID
    )
    WHERE ROWNUM <= 10000
)
ORDER BY ROWID
10,000 rows is an arbitrary but reasonable chunk size and ROWID is sufficiently unique for our purposes to use as a UID since each indexing run hits only one table at a time. Bracketed values are filled in programmatically by the web service.
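To make the chunking scheme concrete, here is a minimal sketch of the same "everything after the last ID, in order, capped at N" idea. It is written in Python against an in-memory SQLite database purely so it runs anywhere; it is not the web service's actual code. The `docs` table, its `id` and `payload` columns, and the `fetch_chunk` helper are all hypothetical, and SQLite's `LIMIT` stands in for Oracle's `ROWNUM` filter.

```python
import sqlite3


def fetch_chunk(conn, last_id, chunk_size):
    """Return up to chunk_size rows whose id sorts after last_id."""
    cur = conn.execute(
        "SELECT id, payload FROM docs WHERE id > ? ORDER BY id LIMIT ?",
        (last_id, chunk_size),
    )
    return cur.fetchall()


# Hypothetical stand-in data: 25 rows with sortable text IDs.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE docs (id TEXT PRIMARY KEY, payload TEXT)")
conn.executemany(
    "INSERT INTO docs VALUES (?, ?)",
    [(f"row{i:03d}", f"payload {i}") for i in range(25)],
)

# The consumer only has to remember the last ID it received,
# so the service itself keeps no state between requests.
last_id = ""
chunk_sizes = []
while True:
    rows = fetch_chunk(conn, last_id, 10)
    if not rows:
        break
    chunk_sizes.append(len(rows))
    last_id = rows[-1][0]
```

The last ID is passed as a bind parameter rather than spliced into the SQL text, which is a choice made for this sketch and not something the original script specifies.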
Now we’re going to start adding views to the indexing, each of which will union a few separate tables. Since ROWID would no longer function as a unique identifier, they added a column to the views (VIEW_UNIQUE_ID) that concatenates the ROWIDs from the component tables to construct a UID for each union.
But this script does not work, even though it follows the same form as the previous one:
SELECT VIEW_UNIQUE_ID, [columns]
FROM [view]
WHERE VIEW_UNIQUE_ID IN (
    SELECT VIEW_UNIQUE_ID
    FROM (
        SELECT VIEW_UNIQUE_ID
        FROM [view]
        WHERE VIEW_UNIQUE_ID > '[previous_batch_last_view_unique_id]'
        ORDER BY VIEW_UNIQUE_ID
    )
    WHERE ROWNUM <= 10000
)
ORDER BY VIEW_UNIQUE_ID
It hangs indefinitely with no response from the Oracle server. I’ve waited 20+ minutes and the SQLTools dialog box indicating a running query remains the same, with no progress or updates.
I’ve tested each subquery independently and each works fine and takes a very short amount of time (<= 1 second), so the view itself is sound. But as soon as the inner two SELECT queries are added with “WHERE VIEW_UNIQUE_ID IN…”, it hangs.
Why doesn’t this query work for views? In what important way are they not interchangeable here?
Updated: the architecture of the solution stipulates that it is to be stateless, so I shouldn’t try to make the web service preserve any index state information between requests from consumers.
God, that is the most obscene idea I’ve seen in a long time.
Let’s say the view is a simple one, like a union of a couple of tables.
Every time you want to do the filtered, ordered select against that view, Oracle has to build the entire result set, apply the filter, and order it. For anything other than trivially sized tables, that will be a nightmare.
Stop using the database to paginate/chunk the data here and do that in the client. Open the database connection, execute the query, fetch the first ten thousand rows from the query, index them, then fetch the next ten thousand. Don’t close and reopen the query for each chunk; close it only after you’ve processed every row. You’ll be able to forget about ordering.
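As a rough illustration of that client-side approach, here is a minimal sketch in Python. It uses an in-memory SQLite database as a stand-in for Oracle so that it is runnable as-is; the tables `t1` and `t2`, the view `v`, and the `index_in_batches` helper are all hypothetical. The only driver feature it relies on is `fetchmany`, which is part of the Python DB-API cursor interface.

```python
import sqlite3


def index_in_batches(conn, query, batch_size, index_batch):
    """Run the query once and hand its rows to index_batch in chunks."""
    cur = conn.cursor()
    cur.execute(query)  # executed a single time, no ORDER BY needed
    total = 0
    while True:
        batch = cur.fetchmany(batch_size)  # next chunk from the open cursor
        if not batch:
            break
        index_batch(batch)
        total += len(batch)
    cur.close()  # closed only after every row has been processed
    return total


# Hypothetical stand-in schema: a view that unions two small tables.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t1 (id INTEGER, payload TEXT)")
conn.execute("CREATE TABLE t2 (id INTEGER, payload TEXT)")
conn.executemany("INSERT INTO t1 VALUES (?, ?)", [(i, "a") for i in range(12)])
conn.executemany("INSERT INTO t2 VALUES (?, ?)", [(i, "b") for i in range(13)])
conn.execute(
    "CREATE VIEW v AS "
    "SELECT id, payload FROM t1 UNION ALL SELECT id, payload FROM t2"
)

batch_sizes = []
total = index_in_batches(
    conn,
    "SELECT id, payload FROM v",
    10,
    lambda batch: batch_sizes.append(len(batch)),
)
```

Because the cursor stays open across chunks, the view is evaluated once per indexing run instead of once per chunk, and the client never has to tell the database where the previous chunk ended.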