Monthly Archives: July 2006

Oracle ORA-04030 Application Errors

Recently I posted about an ORA-04030 Oracle error we received at work. At the end of many hours of pain, the solution was to swap the apparently damaged physical memory for fresh sticks of memory. This post is a follow-up to highlight that an Oracle ORA-04030 can be generated by an application error as well.

As soon as the application started generating the ORA-04030 error, we started working through the first set of points from our last excursion with this error. Thankfully, the Oracle log files showed the error being thrown from a particular PL/SQL package. Inspecting that package revealed a clear opportunity for one of its procedures to spiral out of control and consume all the memory on the server.

In this instance, the PL/SQL procedure was building up a tree of information using the Oracle CONNECT BY PRIOR clause. As you can imagine, if for some reason a node had itself as its parent or child, or the data contained any other recursive relationship, the SQL statement would loop infinitely. Soon enough, the statement had retrieved a million or more rows, the server had no memory left and it threw an ORA-04030 error.
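To make the failure mode concrete, here is a minimal Python sketch of a tree walk with no cycle check. It is not the PL/SQL from the package and it does not model how Oracle evaluates CONNECT BY; the row layout, the function name walk_tree and the max_rows cap are all assumptions made up for the illustration. The cap exists only so the demonstration terminates.

```python
from collections import defaultdict

def walk_tree(rows, root, max_rows=1000):
    """Naive top-down walk over (node_id, parent_id) rows with no cycle check.

    max_rows is an artificial cap so that this demonstration terminates.
    """
    children = defaultdict(list)
    for node_id, parent_id in rows:
        children[parent_id].append(node_id)

    result = []
    frontier = [root]
    while frontier and len(result) < max_rows:
        node = frontier.pop()
        result.append(node)
        # Nothing here remembers which nodes were already visited.
        frontier.extend(children[node])
    return result

# Hypothetical data: node 1 is the root, 2 and 3 are its children, 4 is under 2.
clean = [(2, 1), (3, 1), (4, 2)]
# One bad row makes node 1 a child of its own descendant, node 4.
looped = clean + [(1, 4)]

print(len(walk_tree(clean, 1)))   # every node visited once
print(len(walk_tree(looped, 1)))  # keeps going until it hits the cap
```

With the clean data the walk stops on its own. With the single bad row added, the only thing that stops it is the cap, which is roughly the position the server was in with its memory.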

The solution in this case was just as simple as the cause: remove the recursive data. For good measure, checks will be added using BEFORE INSERT and BEFORE UPDATE row triggers to make sure that the same problem doesn’t resurface in the future.
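The logic of such a check can be sketched in a few lines. This is Python rather than trigger code, and the parent_of mapping and the name creates_cycle are hypothetical; it only illustrates the idea of walking up from the proposed parent and rejecting the change if the walk arrives back at the node being saved.

```python
def creates_cycle(parent_of, node, new_parent):
    """Return True if making new_parent the parent of node would create a loop.

    parent_of maps each existing node id to its parent id (None for a root).
    """
    seen = set()
    current = new_parent
    while current is not None:
        if current == node:
            return True   # the node would become its own ancestor
        if current in seen:
            return True   # the existing data already contains a loop
        seen.add(current)
        current = parent_of.get(current)
    return False

# Hypothetical data: 1 is the root, 2 and 3 are its children, 4 is under 2.
parent_of = {1: None, 2: 1, 3: 1, 4: 2}

print(creates_cycle(parent_of, 5, 4))  # a new leaf under 4
print(creates_cycle(parent_of, 1, 4))  # moving the root under its descendant
print(creates_cycle(parent_of, 3, 3))  # a node as its own parent
```

The same walk covers both inserts and updates, since in either case the question is whether the proposed parent already sits somewhere below the node.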

Time Management

I’m currently reading Management, Theory & Practice by Kris Cole. I flicked through the chapters and decided to jump ahead to a chapter about managing personal work priorities. The chapter goes through the common scenarios or problems which waste time, along with different techniques to combat each one.

It seemed like a good idea that I put some of the time management strategies through their paces and see what works for me and what doesn’t. While I’m learning what works for me, I thought I’d drop notes here for you all to read and comment on.

What are your favourite time management strategies for getting things done?

ORA-04030: out of process memory when trying to allocate <x> bytes

Quite some time ago, we encountered a very strange Oracle problem at work:

  ORA-04030: out of process memory when trying to allocate <number> bytes

Initially the problem was intermittent, then its frequency increased. Soon enough, you could nearly generate the ORA-04030 error on command by clicking through our primary site half a dozen times. The standard troubleshooting steps took place:

  • Confirm no database changes since the last known good state
  • Check the load and performance statistics on the Oracle RAC nodes
  • Check the Oracle log files to see if there was anything obvious going wrong
  • Check the resources on the Oracle RAC nodes

After all of those options were worked through, we moved on to the application:

  • Confirm no application changes since the last known good state
  • Check resources on the web servers
  • Check web server log files for anything obvious

There were in fact some application changes, so they were rolled back immediately. Unfortunately, that didn’t restore the site to an error-free state. We kept hunting for anything else that might have changed and continued to draw blanks. At this stage, a support request was logged through Oracle Metalink to try and resolve the error.

Since the service we were providing was so fundamentally broken, the next thing on the list was to cycle the servers:

  • The IIS services were restarted
  • The web servers themselves were rebooted
  • The Oracle RAC nodes were rebooted

The important thing we didn’t notice or think of immediately is that it could have been just one of the Oracle RAC nodes causing the ORA-04030 problem. When the nodes were rebooted, they were cycled too close together for us to notice if anything had changed. Shortly thereafter, the servers were shut down one node at a time, and continued testing revealed that an individual node was causing the problem.
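The one-node-at-a-time approach boils down to a simple loop, sketched below in Python. The node names and the errors_with function are stand-ins invented for the example; in our case the "test" was a person clicking through the site while a node was out of the cluster.

```python
def find_suspect_nodes(nodes, errors_with):
    """Take one node out at a time; a node is suspect if the errors stop without it."""
    suspects = []
    for node in nodes:
        remaining = [n for n in nodes if n != node]
        if not errors_with(remaining):
            suspects.append(node)
    return suspects

def errors_with(active_nodes):
    # Stand-in for manual testing: pretend the hypothetical node "rac2" is the bad one.
    return "rac2" in active_nodes

print(find_suspect_nodes(["rac1", "rac2", "rac3"], errors_with))
```

The point of the loop is that only one thing changes between tests, which is exactly what rebooting all the nodes close together failed to give us.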

Now that services were restored (though slightly degraded), time was with us and not against us. It seemed quite reasonable that the problem was related to the physical memory in the server. Since the server uses ECC memory, any defects in the RAM modules should have been highlighted by the POST tests when the boxes were rebooted. Unfortunately, after rebooting the server again, there were no POST error messages to confirm that suspicion.

While waiting for Oracle technical support to come back to us with a possible solution or cause, the physical memory was swapped out for an identical set from another server. To test the server, it was joined back into the cluster to see if the error could be regenerated. Of course, even though the server hadn’t reported any errors with the memory, replacing it seemed to solve the error. In this instance, nothing Oracle technical support mentioned gave us any real help, and with the problem seemingly nipped in the bud, the ticket was closed.

Since we were convinced that it had to have been a physical memory error (given the apparent solution), the removed memory was put through its paces with a series of grueling memory test utilities for days on end. After days of testing, not a single error was reported. Go figure.

The moral of this story is simple:

When troubleshooting a technical problem, don’t rule out a possible cause just because something else suggests that it isn’t the problem; confirm or double check it directly.

State Of Origin 2006

The Rugby League State Of Origin has been part of Australian culture since 1980 and this season’s series didn’t disappoint. The State Of Origin is about pitting Australia’s biggest and best Rugby League players against one another, where the teams are decided based on where you started playing football. Amazingly, after 26 years of State Of Origin clashes, the difference between Queensland and New South Wales is very slim – it’s always a close game.

This evening saw the decider of the 2006 State Of Origin and it was held in Melbourne. After a fantastic start to the match, with lots of hard running and big hits, the two teams were locked at four apiece. As half time drew near, it was clear that Queensland were feeling the effects of a very physical and fast-paced first half.

Half time came and went, and both teams resumed with plenty of vigour and spring in their steps. Shortly afterwards, every Queensland supporter had to swallow their stomachs when the referee and the video referee made two huge mistakes within about 10 minutes of each other! The commentators were absolutely disgusted at the decisions; even the New South Wales commentators were appalled. The side effect of these mistakes was potentially devastating, with the score now 14-4 in favour of New South Wales. Thankfully, the Queensland team rallied together and started to find some space through Thurston, which was finally rewarded with a try. The clinching moment, though, happened in the dying minutes of the game when Hodgson threw a pass from dummy half that didn’t find any teammates and Lockyer was fast to snap up the ball and score under the posts!

At the end of a hard 80 minutes of top notch Rugby League, Queensland managed to bring it home with a final score of 16-14, winning their first series in three years.