We use a web-based service that runs on Oracle. They are strict about only allowing SELECT-only ODBC access. Some of the reporting that we do isn’t well accounted for by the views that the company provisions, so we set up a db_link using the express edition of Oracle 11g and rewrote some of the more important queries as materialized views, using the refresh: specify settings to re-run the queries hourly, which is current enough for our purposes. All good so far.
I noticed that some of the MVs would stop refreshing, with no real pattern behind it. Investigating further it looks from time to time the external database (the one that we’re connected to via db_link) doesn’t complete the query from time to time, and the refresh process sits patiently waiting on the event ‘SQL*Net more data from dblink’ indefinitely.
Here’s the query I ran to get data about the stuck refresh sessions and the three sessions that seem to be stuck refresh statements:
select a.username, a.osuser, a.sid, a.serial#, b.spid, a.seconds_in_wait,
a.event, a.state, a.wait_class
from v$session a, v$process b
where a.paddr = b.addr
and a.seconds_in_wait > 500 and a.username is not null;
USERNAME OSUSER SID SERIAL# SPID SECONDS_IN_WAIT EVENT STATE WAIT_CLASS
KIPP_NWK SYSTEM 27 7904 2704 161991 SQL*Net more data from dblink WAITING Network
KIPP_NWK SYSTEM 35 2469 3880 139489 SQL*Net more data from dblink WAITING Network
KIPP_NWK SYSTEM 37 6051 1408 40860 SQL*Net more data from dblink WAITING Network
I think my question is thus ‘any suggestions about a script that will periodically (say, hourly) scan for stuck sessions of event type ‘more data from dblink’ and terminate them?’ The desired behavior for me is that this refresh is more resilient — if it times out, I’d like it to start over and try, try again…
I don’t think that figuring out why the external db stops executing these queries is going to be all that fruitful — the provider provides the select privileges but is pretty clear about not wanting to support/troubleshoot any issues that arise.
I read about setting idle_time and changing the set SQLNET.EXPIRE_TIME in sqlnet.ora file – but I don’t get the sense that’s the right approach as the connections aren’t idle, they’re active but waiting indefinitely, as well as the complicating factor that these sessions are initiated by the database itself.
How do I make this refresh more resilient/automatically kill these long-waiting refreshes?
@tbone, thanks for the dbms_scheduler resources. What I ended up doing was building a scheduled job that looks for the stuck dblink refreshes and terminates those. It seems to be working nicely.
I created this stored procedure that kills jobs that have been waiting for 10 minutes on SQL*Net more data from dblink:
and then this dbms job that regularly runs the stored procedure:
this how-to was helpful, even though it doesn’t use the new syntax for scheduled jobs: http://baurdotnet.wordpress.com/2010/11/11/oracle-job-session-killer/