Difference between revisions of "SUSE Manager/Osad and jabberd troubleshooting"

From MicroFocusInternationalWiki
(Cloned hosts section removed - it's obsoleted since many months now)

Revision as of 05:05, 31 May 2016

SUSE Manager Main Page

Typical issues

Open file count exceeded


Symptoms

OSAD clients cannot contact the SUSE Manager Server, and jabberd takes a long time to respond on port 5222.


Cause

The maximum number of files that the jabber user can open is lower than the number of connected clients. Every client needs one always-open TCP connection, and each connection consumes one file handle, so jabberd starts queuing and refusing connections.
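To check whether a process is close to its limit, the count of open descriptors can be compared with the soft limit reported by the kernel. The following is a minimal sketch for Linux, not part of SUSE Manager: the function name `fd_usage` is hypothetical, and it assumes that `/proc` is mounted and that the caller may read the target process's entries (normally root for processes of the jabber user).

```shell
# Hypothetical helper: print the number of open file descriptors and the
# soft "Max open files" limit of the process with the given PID.
fd_usage() {
    pid=$1
    # Each entry in /proc/<pid>/fd is one open descriptor.
    open=$(($(ls "/proc/$pid/fd" | wc -l)))
    # Columns in /proc/<pid>/limits: name (3 words), soft limit, hard limit, unit.
    soft=$(awk '/^Max open files/ {print $4}' "/proc/$pid/limits")
    printf 'open=%s soft_limit=%s\n' "$open" "$soft"
}
```

As root, something like `fd_usage "$(pgrep -u jabber -x c2s)"` could then be used, assuming a single process named c2s is running as the jabber user.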


Cure

Add lines like the following to /etc/security/limits.conf:

   jabber soft nofile <#clients + 100>
   jabber hard nofile <#clients + 1000>

Substitute <#clients + 100> and <#clients + 1000> according to your setup. For example, for 5000 clients:

   jabber soft nofile 5100
   jabber hard nofile 6000

Explanation: the soft file limit is the maximum number of open files for a single process. In the SUSE Manager case, the process that consumes the most files is c2s, which opens one connection per client. The 100 additional files accommodate any non-connection files that c2s needs to work correctly. The hard limit applies to all processes belonging to the jabber user, and accounts for open files from the router, s2s and sm processes as well.
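The arithmetic above can be sketched as a small shell function that prints the two limits.conf lines for a given client count. The function name `jabber_limits` is hypothetical; the offsets of 100 and 1000 are the ones given in the text.

```shell
# Hypothetical helper: print limits.conf entries for the jabber user.
# $1 = number of OSAD clients expected to connect.
jabber_limits() {
    clients=$1
    soft=$((clients + 100))     # one connection per client plus other c2s files
    hard=$((clients + 1000))    # headroom for router, s2s and sm as well
    printf 'jabber soft nofile %d\n' "$soft"
    printf 'jabber hard nofile %d\n' "$hard"
}

# Example for 5000 clients; prints the same values as the example above.
jabber_limits 5000
```

The output is meant to be reviewed and then added to /etc/security/limits.conf by hand.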

jabberd database corruption


Symptoms

After a disk-full error or a disk crash, the jabberd database might be corrupted, in which case jabberd fails to start during spacewalk-service start:

   Starting spacewalk services...
   Initializing jabberd processes...
       Starting router                                                                   done
       Starting sm startproc:  exit status of parent of /usr/bin/sm: 2                   failed
   Terminating jabberd processes...

/var/log/messages shows more details:

   jabberd/sm[31445]: starting up
   jabberd/sm[31445]: process id is 31445, written to /var/lib/jabberd/pid/sm.pid
   jabberd/sm[31445]: loading 'db' storage module
   jabberd/sm[31445]: db: corruption detected! close all jabberd processes and run db_recover
   jabberd/router[31437]: shutting down
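A quick way to look for this condition is to search the log for the corruption message shown above. This is a minimal sketch: the function name `jabberd_db_corrupt` is hypothetical, and the default log path assumes that syslog writes to /var/log/messages as in the excerpt.

```shell
# Hypothetical helper: succeed (exit 0) if the sm corruption message is
# present in the given log file, fail otherwise.
jabberd_db_corrupt() {
    logfile=${1:-/var/log/messages}
    # Exit status 2 if the log cannot be read (missing file or no permission).
    [ -r "$logfile" ] || return 2
    grep -q 'jabberd/sm.*db: corruption detected' "$logfile"
}

if jabberd_db_corrupt; then
    echo "jabberd database corruption reported in the log"
fi
```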


Cure

Remove the jabberd database and restart the services. Jabberd will automatically re-create the database.

   spacewalk-service stop
   rm -Rf /var/lib/jabberd/db/*
   spacewalk-service start
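If you would rather keep a copy of the damaged files before deleting them, the database directory can be copied aside first. This is an optional sketch and not part of the documented procedure: the function name `backup_jabberd_db` and the default backup location are hypothetical, and `cp -a` assumes GNU coreutils.

```shell
# Hypothetical helper: copy the contents of the jabberd database directory
# into a backup directory, preserving ownership and timestamps.
# $1 = database directory, $2 = backup directory (both optional).
backup_jabberd_db() {
    db_dir=${1:-/var/lib/jabberd/db}
    backup_dir=${2:-/var/lib/jabberd/db-backup-$(date +%Y%m%d%H%M%S)}
    mkdir -p "$backup_dir" && cp -a "$db_dir"/. "$backup_dir"/
}
```

It would be called between spacewalk-service stop and the rm step, so that the files are not in use while they are copied.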

An alternative is to try another database (SQLite in the steps below), but SUSE Manager does not deliver drivers for it:

   rcosa-dispatcher stop
   rcjabberd stop
   cd /var/lib/jabberd/db
   rm *
   cp /usr/share/doc/packages/jabberd/db-setup.sqlite .
   sqlite3 sqlite.db < db-setup.sqlite
   chown jabber:jabber *
   rcjabberd start
   rcosa-dispatcher start

Upstream guides

Configuring Osad


Jabber and OSAD client connection issues