SUSE Manager/Osad and jabberd troubleshooting
Jabber and OSAD client connection issues
After a disk-full error or a disk crash, the jabberd database may be corrupted, in which case jabberd fails to start during spacewalk-service start:
Starting spacewalk services...
Initializing jabberd processes...
Starting router done
Starting sm startproc: exit status of parent of /usr/bin/sm: 2 failed
Terminating jabberd processes...
/var/log/messages shows more details:
jabberd/sm: starting up
jabberd/sm: process id is 31445, written to /var/lib/jabberd/pid/sm.pid
jabberd/sm: loading 'db' storage module
jabberd/sm: db: corruption detected! close all jabberd processes and run db_recover
jabberd/router: shutting down
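To confirm that a failed start is really caused by this corruption, the log can be searched for the message quoted above. The following is a minimal sketch; the function name jabberd_db_corrupt is hypothetical and not part of SUSE Manager, and the log path defaults to /var/log/messages as mentioned above.

```shell
# Hypothetical helper (not shipped with SUSE Manager).
# Returns 0 if the given log file contains the jabberd corruption message,
# non-zero otherwise. The log file defaults to /var/log/messages.
jabberd_db_corrupt() {
    logfile="${1:-/var/log/messages}"
    grep -q 'db: corruption detected' "$logfile"
}
```

A possible use is `jabberd_db_corrupt && echo "jabberd database is corrupted"`.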
Remove the jabberd database and restart. Jabberd will automatically re-create the database.
spacewalk-service stop
rm -Rf /var/lib/jabberd/db/*
spacewalk-service start
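Because the removal is irreversible, it may be worth copying the database directory aside after stopping the services and before deleting it, for example for later analysis. This is only a sketch under assumptions: the function name backup_jabberd_db and the backup location are hypothetical, and the copy still contains the corrupted files.

```shell
# Hypothetical helper (not shipped with SUSE Manager).
# Copies the jabberd database directory to a backup directory.
# $1: database directory (defaults to /var/lib/jabberd/db)
# $2: backup directory (defaults to a timestamped sibling; this location
#     is an assumption, not a SUSE Manager convention)
backup_jabberd_db() {
    db_dir="${1:-/var/lib/jabberd/db}"
    backup_dir="${2:-/var/lib/jabberd/db.bak.$(date +%Y%m%d%H%M%S)}"
    mkdir -p "$backup_dir" && cp -a "$db_dir"/. "$backup_dir"/
}
```

Run it between spacewalk-service stop and the rm command, so that no jabberd process writes to the files while they are copied.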
An alternative is to try another database (SQLite in this example), but SUSE Manager does not deliver drivers for it:
rcosa-dispatcher stop
rcjabberd stop
cd /var/lib/jabberd/db
rm *
cp /usr/share/doc/packages/jabberd/db-setup.sqlite .
sqlite3 sqlite.db < db-setup.sqlite
chown jabber:jabber *
rcjabberd start
rcosa-dispatcher start
OSAD requires every client to have different credentials in /etc/sysconfig/rhn/osad-auth.conf. As soon as two clients have the same file, they conflict, and unfortunately our logs report this very poorly.
In case of a conflict, clients get disconnected repeatedly in a hard-to-predict pattern.
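Since the logs do not point at the conflict, one way to look for duplicates is to compare checksums of the file across clients. The sketch below assumes that a list of "hostname checksum" lines has already been collected, for example by running sha256sum on /etc/sysconfig/rhn/osad-auth.conf on each client. The function name find_duplicate_auth and the input format are hypothetical.

```shell
# Hypothetical helper (not shipped with SUSE Manager).
# Reads lines of the form "<hostname> <checksum>" on stdin and prints one
# line per checksum that is shared by more than one host:
#   <checksum>: <host> <host> ...
# The order of the output lines is unspecified.
find_duplicate_auth() {
    awk '
        { count[$2]++; hosts[$2] = hosts[$2] " " $1 }
        END {
            for (sum in count)
                if (count[sum] > 1)
                    print sum ":" hosts[sum]
        }
    '
}
```

Any host listed in the output shares its osad-auth.conf with at least one other client.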
If duplicates are found, have the customer stop the OSAD process, delete the file, and start OSAD again. The daemon recreates the file with new, random contents that should be unique:
rcosad stop
rm /etc/sysconfig/rhn/osad-auth.conf
rcosad start