[Spacewalk-list] Ongoing jabberd/osad issues.

Daryl Rose darylrose at outlook.com
Wed Aug 17 18:43:33 UTC 2016

I've posted here before about issues with jabberd and osad, as have others.  Nothing has resolved them so far, so I am posting additional information.

I put Spacewalk into production about a year ago.  After a while, I noticed problems with the web UI, servers not reporting correctly, and other issues.  Google searches suggested that I needed to shut down Spacewalk and remove all the contents of /var/lib/jabberd/db.  This seemed to work, but after a few months I realized that osad was no longer communicating with osa-dispatcher.

I did some additional research and learned that this was not a good way to resolve the issue.  According to the official Spacewalk documentation, I should create a checkpoint and then clean up the log files, keeping the database and auth database files.


JabberDatabase – spacewalk - Fedora Hosted <https://fedorahosted.org/spacewalk/wiki/JabberDatabase>
Jabber Database. Spacewalk utilizes Jabber to facilitate communications between the server and the clients for osa-dispatcher/osad. The Jabber program uses the ...

These are the steps that I followed:

/usr/bin/db_checkpoint -1 -h /var/lib/jabberd/db/ ## mark logs for deletion
/usr/bin/db_archive -d -h /var/lib/jabberd/db/  ## delete logs
service jabberd restart
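
Because `db_archive -d` deletes files, a more cautious variant of the steps above would snapshot the directory first and preview what would be removed. The following is a minimal sketch, not a tested fix: `backup_jabberd_db` and the backup location are hypothetical names, and it assumes jabberd is stopped while the copy is taken so that the files are not changing.

```shell
#!/bin/sh
# Sketch only: archive the jabberd Berkeley DB directory before pruning logs.
# backup_jabberd_db is a hypothetical helper, not part of Spacewalk or jabberd.

backup_jabberd_db() {
    db_dir=$1        # directory to snapshot, e.g. /var/lib/jabberd/db
    backup_root=$2   # where to keep archives (assumed location, pick your own)
    stamp=$(date +%Y%m%d-%H%M%S)
    archive="$backup_root/jabberd-db-$stamp.tar.gz"

    mkdir -p "$backup_root" || return 1
    # -C makes the paths inside the archive relative to the db directory.
    tar -czf "$archive" -C "$db_dir" . || return 1
    # Print the archive path so the caller can record or verify it.
    printf '%s\n' "$archive"
}

# Example (as root, with jabberd stopped):
#   backup_jabberd_db /var/lib/jabberd/db /root/jabberd-backups
#   /usr/bin/db_archive -h /var/lib/jabberd/db/    # no -d: only lists removable logs
```

Running `db_archive` without `-d` only prints the log files that are no longer needed, so it can serve as a dry run before the deleting form is used.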

However, this also causes problems with jabberd and osad.  When I run the commands as the documentation instructs, osa-dispatcher starts but then dies, and the log reports an invalid password.

To help explain the issue, I ran a test, captured everything that I could, and am posting it here.

1. Listing of /var/lib/jabberd/db

[root@<spwalk-server> db]# ls
__db.001  __db.006        log.0000000004  log.0000000009  log.0000000014  log.0000000019  log.0000000024  sm.db
__db.002  authreg.db      log.0000000005  log.0000000010  log.0000000015  log.0000000020  log.0000000025
__db.003  log.0000000001  log.0000000006  log.0000000011  log.0000000016  log.0000000021  log.0000000026
__db.004  log.0000000002  log.0000000007  log.0000000012  log.0000000017  log.0000000022  log.0000000027
__db.005  log.0000000003  log.0000000008  log.0000000013  log.0000000018  log.0000000023  log.0000000028

2. Spacewalk Server Status

[root@<spwalk-server> db]# spacewalk-service status
postmaster (pid  1175) is running...
router (pid 21431) is running...
sm (pid 21441) is running...
c2s (pid 21451) is running...
s2s (pid 21461) is running...
tomcat6 (pid 1304) is running...                           [  OK  ]
httpd (pid  1385) is running...
osa-dispatcher (pid  21479) is running...
rhn-search is running (1441).
cobblerd (pid 1491) is running...
RHN Taskomatic is running (1515).

3. Most recent log file entry:

2016/08/17 07:44:13 -05:00 21476 osad/jabber_lib.__init__
2016/08/17 07:44:13 -05:00 21476 osad/jabber_lib.setup_connection('Connected to jabber server', '<spwalk-server>.com')
2016/08/17 07:44:13 -05:00 21476 osad/osa_dispatcher.fix_connection('Upstream notification server started on port', 1290)
2016/08/17 07:44:14 -05:00 21476 osad/jabber_lib.process_forever

4. Ran the commands as instructed in the jabberd documentation.

/usr/bin/db_checkpoint -1 -h /var/lib/jabberd/db/ ## mark logs for deletion
/usr/bin/db_archive -d -h /var/lib/jabberd/db/  ## delete logs
service jabberd restart

5. Log file entry:

2016/08/17 13:28:19 -05:00 21476 osad/jabber_lib.main('ERROR', 'Traceback (most recent call last):\n  File "/usr/share/rhn/osad/jabber_lib.py", line 121, in main\n    self.process_forever(c)\n  File "/usr/share/rhn/osad/jabber_lib.py", line 179, in process_forever\n    self.process_once(client)\n  File "/usr/share/rhn/osad/osa_dispatcher.py", line 187, in process_once\n    client.retrieve_roster()\n  File "/usr/share/rhn/osad/jabber_lib.py", line 729, in retrieve_roster\n    stanza = self.get_one_stanza()\n  File "/usr/share/rhn/osad/jabber_lib.py", line 801, in get_one_stanza\n    self.process(timeout=tm)\n  File "/usr/share/rhn/osad/jabber_lib.py", line 1055, in process\n    data = self._read(self.BLOCK_SIZE)\nSSLError: (\'OpenSSL error; will retry\', "(-1, \'Unexpected EOF\')")\n')
2016/08/17 13:28:29 -05:00 21476 osad/jabber_lib.__init__
2016/08/17 13:28:29 -05:00 21476 osad/jabber_lib.setup_connection('Connected to jabber server', '<spwalk-server>.com')
2016/08/17 13:28:29 -05:00 21476 osad/jabber_lib.register('ERROR', 'Invalid password')
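
For anyone trying to reproduce this, the error entries can be pulled out of the log with a small filter. This is only a sketch: `osad_errors` is a hypothetical helper, and the default log path is an assumption that may differ on other installs. It matches the `('ERROR', ...)` form shown in the entries above and truncates each line so the long traceback stays readable.

```shell
#!/bin/sh
# Sketch only: list ERROR entries from an osa-dispatcher log.
# osad_errors is a hypothetical helper; the default path is an assumption.

osad_errors() {
    log_file=${1:-/var/log/rhn/osa-dispatcher.log}
    # Entries look like: ... osad/jabber_lib.register('ERROR', 'Invalid password')
    # -F treats the pattern as a fixed string, so the parenthesis is literal.
    grep -F "('ERROR'," "$log_file" | cut -c1-200
}

# Example:
#   osad_errors /var/log/rhn/osa-dispatcher.log | tail -n 5
```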

6. Spacewalk server status

[root@<spwalk-server> db]# spacewalk-service status
postmaster (pid  1175) is running...
router (pid 27119) is running...
sm (pid 27129) is running...
c2s (pid 27139) is running...
s2s (pid 27149) is running...
tomcat6 (pid 1304) is running...                           [  OK  ]
httpd (pid  1385) is running...
osa-dispatcher dead but pid file exists
rhn-search is running (1441).
cobblerd (pid 1491) is running...
RHN Taskomatic is running (1515).
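
A check like the one above can be scripted so that a dead service is noticed soon after the restart. This is a rough sketch: `dead_services` is a hypothetical helper that reads status text on stdin, and it assumes the init-script wording seen in this output ("dead but pid file exists") plus the usual "is stopped" form.

```shell
#!/bin/sh
# Sketch only: print the names of services that a status listing reports as down.
# dead_services is a hypothetical helper; the matched phrases are assumptions
# based on the status output shown in this post.

dead_services() {
    # Keep lines such as "osa-dispatcher dead but pid file exists",
    # then print the first field, which is the service name.
    grep -E ' (dead|is stopped)' | awk '{print $1}'
}

# Example:
#   spacewalk-service status | dead_services
```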

7. Long listing of /var/lib/jabberd/db

[root@<spwalk-server> db]# ls -l
total 7536
-rw-r-----. 1 jabber jabber    24576 Aug 17 13:28 __db.001
-rw-r-----. 1 jabber jabber   204800 Aug 17 13:29 __db.002
-rw-r-----. 1 jabber jabber   270336 Aug 17 13:29 __db.003
-rw-r-----. 1 jabber jabber    98304 Aug 17 13:29 __db.004
-rw-r-----. 1 jabber jabber   753664 Aug 17 13:29 __db.005
-rw-r-----. 1 jabber jabber    57344 Aug 17 13:29 __db.006
-rw-r-----. 1 jabber jabber   368640 Aug 17 07:46 authreg.db
-rw-r-----. 1 jabber jabber 10485760 Aug 17 13:29 log.0000000031
-rw-r-----. 1 jabber jabber   487424 Aug 17 13:29 sm.db

So completely cleaning out the jabberd database and log files doesn't work, and neither does creating a checkpoint and removing only the log files that are no longer needed.  What can I do to get jabberd and osad working, so that I can push out updates when I need to?

Thank you.

