|
|
 |
 |
 |
 |
Intermittent Failure When Accessing DB Constantly?
Hi All, We have a script in a cron job which will login our oracle 9.2.0.4 DB constantly (meaning every minutes, 7 days/week, & 365 days/yr), and the script is to check an availability of a record in a table which is passed through a variable in a script. The script is running perfectly fine except that it has an intermittent failure :( The weird thing is that the failure happens at exactly the same time but at random day. My 1st logical thought is that we may have some other job in a system which interferes this job; However, when I look through the system crontab file and other places & can't find any cron job which runs at that time. Please keep in mind that this is an intermittent failure. I already look through the log file and the failure indicates that "the record is not available in a table xyz", but I know for sure that record is there. I know that accessing DB like this is that we are looking for troubles, but there is a strong argument that we should be able to access oracle DB at any time (which is true), but I don't have info to say "Not to that extents". PROD Cancelled at Sun Apr 8 00:34:00 PDT 2007 PROD Cancelled at Tue Apr 10 00:34:00 PDT 2007 PROD Cancelled at Sun Apr 15 00:34:00 PDT 2007 PROD Cancelled at Wed Apr 18 00:34:00 PDT 2007 PROD Cancelled at Mon Apr 30 00:34:00 PDT 2007 Question: Is there a limitation in accessing oracle DB constantly & intensively like the way we have? If there is, could you please provide some info? However, if there is not, please let me know if there are any other ideas/ways to troubleshoot this problem? Any info/suggestions/inputs are greatly appreciated. Btw, sorry for a long email but i want you to have a detail info as much as possible. TIA, -Chris
On May 8, 1:40 pm, "lazyboy @yahoo.com" <lazyboy @yahoo.com> wrote:
> Hi All, > We have a script in a cron job which will login our oracle 9.2.0.4 DB > constantly (meaning every minutes, 7 days/week, & 365 days/yr), and > the script is to check an availability of a record in a table which is > passed through a variable in a script. The script is running > perfectly fine except that it has an intermittent failure :( The > weird thing is that the failure happens at exactly the same time but > at random day. My 1st logical thought is that we may have some other > job in a system which interferes this job; However, when I look > through the system crontab file and other places & can't find any cron > job which runs at that time. Please keep in mind that this is an > intermittent failure. I already look through the log file and the > failure indicates that "the record is not available in a table xyz", > but I know for sure that record is there. I know that accessing DB > like this is that we are looking for troubles, but there is a strong > argument that we should be able to access oracle DB at any time (which > is true), but I don't have info to say "Not to that extents". > PROD Cancelled at Sun Apr 8 00:34:00 PDT 2007 > PROD Cancelled at Tue Apr 10 00:34:00 PDT 2007 > PROD Cancelled at Sun Apr 15 00:34:00 PDT 2007 > PROD Cancelled at Wed Apr 18 00:34:00 PDT 2007 > PROD Cancelled at Mon Apr 30 00:34:00 PDT 2007 > Question: Is there a limitation in accessing oracle DB constantly & > intensively like the way we have? If there is, could you please > provide some info? However, if there is not, please let me know if > there are any other ideas/ways to troubleshoot this problem? > Any info/suggestions/inputs are greatly appreciated. > Btw, sorry for a long email but i want you to have a detail info as > much as possible. > TIA, > -Chris
What is the Oracle error raised at the failure times? -----------------------------------------------Reply-----------------------------------------------
On 8 May 2007 10:40:14 -0700, "lazyboy @yahoo.com" <lazyboy @yahoo.com> wrote: >Question: Is there a limitation in accessing oracle DB constantly & >intensively like the way we have? If there is, could you please >provide some info? Obviously, with products like Oracle Enterprise Manager available (comes free with Oracle), it is a very inefficient way to check database availability. Even if you don't do anything, many Oracle tools, including sql*plus will generate statements to set several NLS settings. OEM doesn't. --
-----------------------------------------------Reply-----------------------------------------------
lazyboy @yahoo.com wrote: > Hi All, > We have a script in a cron job which will login our oracle 9.2.0.4 DB > constantly (meaning every minutes, 7 days/week, & 365 days/yr), and > the script is to check an availability of a record in a table which is > passed through a variable in a script. The script is running > perfectly fine except that it has an intermittent failure :( The > weird thing is that the failure happens at exactly the same time but > at random day. My 1st logical thought is that we may have some other > job in a system which interferes this job; However, when I look > through the system crontab file and other places & can't find any cron > job which runs at that time. Please keep in mind that this is an > intermittent failure. I already look through the log file and the > failure indicates that "the record is not available in a table xyz",
Is it possible that the routine which is supposed to write this record to the table has not yet committed? If so, then your other session will not be able to read this record until COMMIT. > but I know for sure that record is there. I know that accessing DB > like this is that we are looking for troubles, but there is a strong > argument that we should be able to access oracle DB at any time (which > is true), but I don't have info to say "Not to that extents".
It might be better if you had a persistent connection. But that is the only place that comes to mind right now. Making a connection over and over again in such a short time can cause problems. But issuing the query over and over again should not (provided the query is optimized). > PROD Cancelled at Sun Apr 8 00:34:00 PDT 2007 > PROD Cancelled at Tue Apr 10 00:34:00 PDT 2007 > PROD Cancelled at Sun Apr 15 00:34:00 PDT 2007 > PROD Cancelled at Wed Apr 18 00:34:00 PDT 2007 > PROD Cancelled at Mon Apr 30 00:34:00 PDT 2007 > Question: Is there a limitation in accessing oracle DB constantly & > intensively like the way we have? If there is, could you please > provide some info? However, if there is not, please let me know if > there are any other ideas/ways to troubleshoot this problem?
Like I said, the connection request can take a bit of time. Otherwise, constant queries on the db aren't a problem so long as the queries are optimized. I have one database that sees the same query, over and over again, approximately 300-400 times per second. (The vendor's idea...not mine. And I've been trying to get this changed). The query is optimized so it runs in sub-second time. The only impact is that this does tax my CPU's a bit (to perform logical reads, etc). So long as I have the server resources, all is fine. > Any info/suggestions/inputs are greatly appreciated. > Btw, sorry for a long email but i want you to have a detail info as > much as possible. > TIA, > -Chris
Do you have Oracle errors? Or is it just that the record is missing and you expect it to be there? If the former, then post the errors. If the latter, then it looks like an application issue, not an Oracle issue. HTH, Brian -- =================================================================== Brian Peasland d@nospam.peasland.net http://www.peasland.net Remove the "nospam." from the email address to email me. "I can give it to you cheap, quick, and good. Now pick two out of the three" - Unknown -- Posted via a free Usenet account from http://www.teranews.com
-----------------------------------------------Reply-----------------------------------------------
On May 8, 1:40 pm, "lazyboy @yahoo.com" <lazyboy @yahoo.com> wrote:
> Hi All, > We have a script in a cron job which will login our oracle 9.2.0.4 DB > constantly (meaning every minutes, 7 days/week, & 365 days/yr), and > the script is to check an availability of a record in a table which is > passed through a variable in a script. The script is running > perfectly fine except that it has an intermittent failure :( The > weird thing is that the failure happens at exactly the same time but > at random day. My 1st logical thought is that we may have some other > job in a system which interferes this job; However, when I look > through the system crontab file and other places & can't find any cron > job which runs at that time. Please keep in mind that this is an > intermittent failure. I already look through the log file and the > failure indicates that "the record is not available in a table xyz", > but I know for sure that record is there. I know that accessing DB > like this is that we are looking for troubles, but there is a strong > argument that we should be able to access oracle DB at any time (which > is true), but I don't have info to say "Not to that extents". > PROD Cancelled at Sun Apr 8 00:34:00 PDT 2007 > PROD Cancelled at Tue Apr 10 00:34:00 PDT 2007 > PROD Cancelled at Sun Apr 15 00:34:00 PDT 2007 > PROD Cancelled at Wed Apr 18 00:34:00 PDT 2007 > PROD Cancelled at Mon Apr 30 00:34:00 PDT 2007 > Question: Is there a limitation in accessing oracle DB constantly & > intensively like the way we have? If there is, could you please > provide some info? However, if there is not, please let me know if > there are any other ideas/ways to troubleshoot this problem? > Any info/suggestions/inputs are greatly appreciated. > Btw, sorry for a long email but i want you to have a detail info as > much as possible. > TIA, > -Chris
If your "failure" at exactly the same time on different days shows that the connection attempt to oracle "did succeed" but that the record is "not there" I would guess that your have proven perhaps that the "record is not there". Could this be caused by other SQL that deletes the record then subsequently inserts it back out there?
-----------------------------------------------Reply-----------------------------------------------
On May 8, 10:40 am, "lazyboy @yahoo.com" <lazyboy @yahoo.com> wrote:
> Hi All, > We have a script in a cron job which will login our oracle 9.2.0.4 DB > constantly (meaning every minutes, 7 days/week, & 365 days/yr), and > the script is to check an availability of a record in a table which is > passed through a variable in a script. The script is running > perfectly fine except that it has an intermittent failure :( The > weird thing is that the failure happens at exactly the same time but > at random day. My 1st logical thought is that we may have some other > job in a system which interferes this job; However, when I look > through the system crontab file and other places & can't find any cron > job which runs at that time. Please keep in mind that this is an > intermittent failure. I already look through the log file and the > failure indicates that "the record is not available in a table xyz", > but I know for sure that record is there. I know that accessing DB > like this is that we are looking for troubles, but there is a strong > argument that we should be able to access oracle DB at any time (which > is true), but I don't have info to say "Not to that extents". > PROD Cancelled at Sun Apr 8 00:34:00 PDT 2007 > PROD Cancelled at Tue Apr 10 00:34:00 PDT 2007 > PROD Cancelled at Sun Apr 15 00:34:00 PDT 2007 > PROD Cancelled at Wed Apr 18 00:34:00 PDT 2007 > PROD Cancelled at Mon Apr 30 00:34:00 PDT 2007 > Question: Is there a limitation in accessing oracle DB constantly & > intensively like the way we have? If there is, could you please > provide some info? However, if there is not, please let me know if > there are any other ideas/ways to troubleshoot this problem? > Any info/suggestions/inputs are greatly appreciated. > Btw, sorry for a long email but i want you to have a detail info as > much as possible. > TIA, > -Chris
Wild speculation: You are hitting the db just as a log switch is happening. Do you have log switching set up to happen at specific intervals? That combined with some number of hot block updates might simply latch you out at particular times. Anything happening in the alert*log at those times? More wild speculation, a variant on Brian's commit comment: you have some severe I/O or log buffer problem, so the other process that needs to commit can't know that it is committed yet. I've seen some unix filesystems do some sort of housekeeping at times that just stops everything else. Yet more wilder speculation: 9204 shared pool bugs are messing with you. And as Valentin asked, what exactly is the error? You might want to post your error handling code, too. Back in the early O7 days I used to see things like this happen because the database writer would just be overwhelmed (even while running catalog!), but that was long ago. If it is somewhat predictable, maybe you want to record latch waits at that particular time. jg -- @home.com is bogus. http://www.signonsandiego.com/uniontrib/20070508/news_1b8micro.html http://www.nbcsandiego.com/news/3486705/detail.html
-----------------------------------------------Reply-----------------------------------------------
On May 8, 5:25 pm, joel garry <joel-ga @home.com> wrote:
> On May 8, 10:40 am, "lazyboy @yahoo.com" <lazyboy @yahoo.com> > wrote: > > Hi All, > > We have a script in a cron job which will login our oracle 9.2.0.4 DB > > constantly (meaning every minutes, 7 days/week, & 365 days/yr), and > > the script is to check an availability of a record in a table which is > > passed through a variable in a script. The script is running > > perfectly fine except that it has an intermittent failure :( The > > weird thing is that the failure happens at exactly the same time but > > at random day. My 1st logical thought is that we may have some other > > job in a system which interferes this job; However, when I look > > through the system crontab file and other places & can't find any cron > > job which runs at that time. Please keep in mind that this is an > > intermittent failure. I already look through the log file and the > > failure indicates that "the record is not available in a table xyz", > > but I know for sure that record is there. I know that accessing DB > > like this is that we are looking for troubles, but there is a strong > > argument that we should be able to access oracle DB at any time (which > > is true), but I don't have info to say "Not to that extents". > > PROD Cancelled at Sun Apr 8 00:34:00 PDT 2007 > > PROD Cancelled at Tue Apr 10 00:34:00 PDT 2007 > > PROD Cancelled at Sun Apr 15 00:34:00 PDT 2007 > > PROD Cancelled at Wed Apr 18 00:34:00 PDT 2007 > > PROD Cancelled at Mon Apr 30 00:34:00 PDT 2007 > > Question: Is there a limitation in accessing oracle DB constantly & > > intensively like the way we have? If there is, could you please > > provide some info? However, if there is not, please let me know if > > there are any other ideas/ways to troubleshoot this problem? > > Any info/suggestions/inputs are greatly appreciated. > > Btw, sorry for a long email but i want you to have a detail info as > > much as possible. > > TIA, > > -Chris > Wild speculation: You are hitting the db just as a log switch is > happening. Do you have log switching set up to happen at specific > intervals? That combined with some number of hot block updates might > simply latch you out at particular times. Anything happening in the > alert*log at those times? > More wild speculation, a variant on Brian's commit comment: you have > some severe I/O or log buffer problem, so the other process that needs > to commit can't know that it is committed yet. I've seen some unix > filesystems do some sort of housekeeping at times that just stops > everything else. > Yet more wilder speculation: 9204 shared pool bugs are messing with > you. > And as Valentin asked, what exactly is the error? You might want to > post your error handling code, too. > Back in the early O7 days I used to see things like this happen > because the database writer would just be overwhelmed (even while > running catalog!), but that was long ago. > If it is somewhat predictable, maybe you want to record latch waits at > that particular time. > jg > -- > @home.com is bogus.http://www.signonsandiego.com/uniontrib/20070508/news_1b8micro.htmlht... Hide quoted text -
Those are some wild guesses given the OP hasn't given us an oracle error message yet. -----------------------------------------------Reply-----------------------------------------------
Thank you all for a quick response, and I like the idea of oracle error messages. The script will exist out when it can't retrieve the record in a table, and the record is always there. In other words, we don't have to run an insert query at all. So, does anyone know where the oracle error msg log file located? For instance, you run a select statement to display a unavailable field in a table, it'll error out on a terminal, but does it record it somewhere in a file as well? If it does, please let me know where it is & file name. I thought it's in one of those dump directories, but so far can't find it. -----------------------------------------------Reply-----------------------------------------------
On May 8, 5:52 pm, "lazyboy @yahoo.com" <lazyboy @yahoo.com> wrote: > Thank you all for a quick response, and I like the idea of oracle > error messages. The script will exist out when it can't retrieve the > record in a table, and the record is always there. In other words, we > don't have to run an insert query at all. So, does anyone know where > the oracle error msg log file located? For instance, you run a select > statement to display a unavailable field in a table, it'll error out > on a terminal, but does it record it somewhere in a file as well? If > it does, please let me know where it is & file name. I thought it's in > one of those dump directories, but so far can't find it.
I think maybe you should talk to your oracle dba for this system. Oracle has an alert log but it "usually" contains fairly severe type errors in it or system shutdown restarts ... not often ( mileage will vary ) application related errors. Do you use sqlplus to invoke the script? How do you run it exactly? What is the script coded in? Usually if you are driving the script via cron or some scheduling type of software any oracle error messages ( if any ) that you encounter should be recorded and available somewhere. Yes sometimes you can get trace files *.trc in the dump directories ( usually udump ) ... but that's not exactly what we are looking for. When you execute your script and it has a "problem" what does all the output look like? When you execute your script and it runs "normally" what does the output look like? What exactly does your script contain? That's the type of info that people will need to help you further.
-----------------------------------------------Reply-----------------------------------------------
On May 8, 5:52 pm, "lazyboy @yahoo.com" <lazyboy @yahoo.com> wrote: > Thank you all for a quick response, and I like the idea of oracle > error messages. The script will exist out when it can't retrieve the > record in a table, and the record is always there. In other words, we > don't have to run an insert query at all. So, does anyone know where > the oracle error msg log file located? For instance, you run a select > statement to display a unavailable field in a table, it'll error out > on a terminal, but does it record it somewhere in a file as well? If > it does, please let me know where it is & file name. I thought it's in > one of those dump directories, but so far can't find it.
The error is most likely not going to show up anywhere if the application does not capture it. The select will not error out if there is no record in the result set, so you will have to code specifically to capture the information. Can you post the content of your script? -----------------------------------------------Reply-----------------------------------------------
On May 8, 2:31 pm, hpuxrac <johnbhur @sbcglobal.net> wrote:
> On May 8, 5:25 pm, joel garry <joel-ga @home.com> wrote: > > On May 8, 10:40 am, "lazyboy@yahoo.com" <lazyboy@yahoo.com> > > wrote: > > > Hi All, > > > We have a script in a cron job which will login our oracle 9.2.0.4 DB > > > constantly (meaning every minutes, 7 days/week, & 365 days/yr), and > > > the script is to check an availability of a record in a table which is > > > passed through a variable in a script. The script is running > > > perfectly fine except that it has an intermittent failure :( The > > > weird thing is that the failure happens at exactly the same time but > > > at random day. My 1st logical thought is that we may have some other > > > job in a system which interferes this job; However, when I look > > > through the system crontab file and other places & can't find any cron > > > job which runs at that time. Please keep in mind that this is an > > > intermittent failure. I already look through the log file and the > > > failure indicates that "the record is not available in a table xyz", > > > but I know for sure that record is there. I know that accessing DB > > > like this is that we are looking for troubles, but there is a strong > > > argument that we should be able to access oracle DB at any time (which > > > is true), but I don't have info to say "Not to that extents". > > > PROD Cancelled at Sun Apr 8 00:34:00 PDT 2007 > > > PROD Cancelled at Tue Apr 10 00:34:00 PDT 2007 > > > PROD Cancelled at Sun Apr 15 00:34:00 PDT 2007 > > > PROD Cancelled at Wed Apr 18 00:34:00 PDT 2007 > > > PROD Cancelled at Mon Apr 30 00:34:00 PDT 2007 > > > Question: Is there a limitation in accessing oracle DB constantly & > > > intensively like the way we have? If there is, could you please > > > provide some info? However, if there is not, please let me know if > > > there are any other ideas/ways to troubleshoot this problem? > > > Any info/suggestions/inputs are greatly appreciated. > > > Btw, sorry for a long email but i want you to have a detail info as > > > much as possible. > > > TIA, > > > -Chris > > Wild speculation: You are hitting the db just as a log switch is > > happening. Do you have log switching set up to happen at specific > > intervals? That combined with some number of hot block updates might > > simply latch you out at particular times. Anything happening in the > > alert*log at those times? > > More wild speculation, a variant on Brian's commit comment: you have > > some severe I/O or log buffer problem, so the other process that needs > > to commit can't know that it is committed yet. I've seen some unix > > filesystems do some sort of housekeeping at times that just stops > > everything else. > > Yet more wilder speculation: 9204 shared pool bugs are messing with > > you. > > And as Valentin asked, what exactly is the error? You might want to > > post your error handling code, too. > > Back in the early O7 days I used to see things like this happen > > because the database writer would just be overwhelmed (even while > > running catalog!), but that was long ago. > > If it is somewhat predictable, maybe you want to record latch waits at > > that particular time. > > jg > > -- > > @home.com is bogus.http://www.signonsandiego.com/uniontrib/20070508/news_1b8micro.htmlht... quoted text - > Those are some wild guesses given the OP hasn't given us an oracle > error message yet
Of course. We don't even know if the "error" comes from not being able to log in. It seems from the more recent post the OP doesn't have any error handling at all and expects the errors to be automagically kept. If he's using cron with no redirection maybe he needs to check his mail... Hey lazyboy, see http://www.dbaoracle.net/readme-cdos.htm#subj12 jg -- @home.com is bogus. Why pull teeth when we can instigate a wild goose chase?
-----------------------------------------------Reply-----------------------------------------------
I'm using ksh to invoke sqlplus statement & I have 1 main script which calls several other scripts, and each of sub script calls other scripts as well which depends on the required procedure. Hence, it's not doable to post a script. Anyway, thx for your all inputs & I'll work w/ a dba team.
|
 |
 |
 |
 |
|