Draft Abend Hunting Wiki page
This page quotes instructions for using the NetWare support forums to analyze an abend. Post to novell.support.netware.5x.abend-hangs or novell.support.netware.6x.abend-hangs, as appropriate.
Describe the Problem
A quick summary of what happened to your system helps set the stage. "My system abended! Help!" doesn't give the volunteers as much to go on as does "My system was running backups using Brand X last night, and we found it halted by abends when we came in the next morning".
To diagnose an abend, the volunteers usually need the complete text of abend 0 or abend 1, including the modules list and any stack dumps or hex dumps. If there is no abend 0 or 1 in your log, you have a better chance of recording the abend if you set your recovery options with "set auto restart after abend = 0".
This causes the server to halt when it abends and ask you how to continue. In this case, reply "X", which writes the abend.log file and exits.
Multiple abends mean that your server had an abend and, while trying to handle it, ran into another abend condition. No abend log will be available, but the top entries of the modules list shown on the screen may be useful. Sometimes multiple abends can be reduced to the original abend by setting your recovery options to "set auto restart after abend = 0".
Note that if your server has the HP ASR feature, or something similar that automatically reboots the server after a hang, you have to disable that feature. Otherwise you will not get the chance to write your abend.log file in the way described above.
(Tip of the hat to Marcel)
Sometimes the module listing is not available or additional information is needed. When asked, please download fconfig15.exe from the file finder at support.novell.com. Extract Config.nlm from it and copy that to SYS:SYSTEM.
On the console, enter LOAD CONFIG /jumba1se and wait until the output file CONFIG.TXT has been created (on NW 6.x the message reporting this only appears on the Logger screen). Please post that file here, with any sensitive information (e.g. serial number, public IP addresses, SNMP community strings, remote access passwords) edited out. Thank you.
(Tip of the hat to Andrew)
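Editing CONFIG.TXT by hand is error-prone, so a small script can do a first pass before you review the file yourself. The following is only a minimal sketch: the label keywords (serial, community, password) and the "label: value" or "label = value" line layout are assumptions for illustration, not the documented CONFIG.TXT format, and the function names are hypothetical. It masks every dotted-quad address, private ones included, and may also catch four-part version numbers, so always read the result before posting.

```python
import re

# Any dotted quad, e.g. 203.0.113.7. This is deliberately broad and may
# also match four-part version strings.
IPV4 = re.compile(r"\b(?:\d{1,3}\.){3}\d{1,3}\b")

# ASSUMPTION: sensitive values sit on lines shaped like "label: value" or
# "label = value" whose label mentions one of these keywords. Adjust the
# keywords to match what your own CONFIG.TXT actually contains.
SENSITIVE = re.compile(
    r"^(\s*[^:=\n]*(?:serial|community|password)[^:=\n]*[:=]\s*)(.+)$",
    re.IGNORECASE,
)


def redact_line(line: str) -> str:
    """Mask the value on a sensitive-looking line, then mask any IPv4 address."""
    line = SENSITIVE.sub(lambda m: m.group(1) + "[REDACTED]", line)
    return IPV4.sub("x.x.x.x", line)


def redact_text(text: str) -> str:
    """Apply redact_line to every line of the file contents."""
    return "\n".join(redact_line(line) for line in text.splitlines())
```

To use it, read CONFIG.TXT into a string, pass it to redact_text, and write the result to a new file, keeping the original untouched so you can compare the two.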
Tip for Hangs
The following advice is offered when the server has not abended, but appears to be hung:
Next time it happens, try to get into the debugger with shift-shift-alt-esc (both Shift keys, Alt and Esc together). This often works even if no other keys do. If nothing happens, you almost certainly have a hardware fault.
If you get to a "#" prompt then you are in the debugger, and the cause of the hang could be software or hardware. Type ? <enter> to see which module the hang happened in. If it always hangs in the same module, the problem is likely to be with that module; if it is always a different module, it is likely to be a hardware fault. Enter Q <enter> to exit the debugger to DOS.
(Tip of the hat to Andrew)
Note: This technique may find the system in certain house-keeping activities, which may give an appearance of "always different modules".
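If you keep a note of the module reported at each hang, the rule of thumb above can be written down as a small tally. This is only a sketch of that heuristic, not a diagnostic tool: the function name and the module names in the usage example are hypothetical, and, as the note above says, housekeeping activity can make a software problem look like "always different modules".

```python
from collections import Counter


def summarize_hangs(modules):
    """Classify a list of module names, one noted per hang.

    Returns "suspect module: NAME" if every hang was in the same module,
    "suspect hardware" if every hang was in a different module, and
    "inconclusive" otherwise (including fewer than two observations).
    """
    # Normalize case and whitespace so "abc.nlm " and "ABC.NLM" count together.
    counts = Counter(name.strip().upper() for name in modules if name.strip())
    total = sum(counts.values())
    if total < 2:
        return "inconclusive"
    if len(counts) == 1:
        return "suspect module: " + next(iter(counts))
    if len(counts) == total:
        return "suspect hardware"
    return "inconclusive"
```

For example, summarize_hangs(["EXAMPLE1.NLM", "example1.nlm", "EXAMPLE1.NLM"]) points at that one module, while three different names point toward hardware. Treat either answer as a starting point for the forum discussion rather than a verdict.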
High utilization is not an abend, but it can be just as distressing. Basically, the approach is to use Monitor.nlm or NRM to identify the busiest threads, and then work out how to deal with them from there. See also High Utilization.
Room for Corrections
Looking forward to the sysops' notes on this page.