[Spacewalk-list] Ongoing jabberd/osad issues.
Daryl Rose
darylrose at outlook.com
Wed Aug 17 18:43:33 UTC 2016
I've posted here before about issues that I've had with jabberd and osad, as have others. Those issues are still unresolved, so I am posting additional information.
I put Spacewalk (SW) into production about a year ago. After a period of time, I noticed problems with the WUI, servers not reporting correctly, and other issues. Google searches suggested that I needed to shut down Spacewalk and remove all the contents of /var/lib/jabberd/db. This seemed to work, but after a few months I realized that osad was no longer communicating with osa-dispatcher.
I started doing some additional research and learned that this was not a good way to resolve the issue. According to the official Spacewalk documentation, I should create a checkpoint and then clean up the log files while keeping the database and auth database files.
https://fedorahosted.org/spacewalk/wiki/JabberDatabase
These are the steps that I followed:
/usr/bin/db_checkpoint -1 -h /var/lib/jabberd/db/ ## mark logs for deletion
/usr/bin/db_archive -d -h /var/lib/jabberd/db/ ## delete logs
service jabberd restart
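For anyone reproducing this, the three steps above could be wrapped in a small script that prints each command before running it. This is only a sketch: the `run` helper, the `cleanup_jabberd_logs` function, and the `DRY_RUN` and `DB_HOME` variables are names made up for illustration and are not part of Spacewalk or the wiki page. With the default `DRY_RUN=1`, nothing is executed.

```shell
#!/bin/sh
# Sketch only: preview or run the three documented cleanup steps.
# DB_HOME matches the path used above; DRY_RUN is a hypothetical switch.
DB_HOME="${DB_HOME:-/var/lib/jabberd/db}"
DRY_RUN="${DRY_RUN:-1}"

run() {
    # Print the command; execute it only when DRY_RUN=0.
    echo "+ $*"
    if [ "$DRY_RUN" = "0" ]; then
        "$@"
    fi
}

cleanup_jabberd_logs() {
    run /usr/bin/db_checkpoint -1 -h "$DB_HOME"   # force one checkpoint, then exit
    run /usr/bin/db_archive -d -h "$DB_HOME"      # remove log files no longer needed
    run service jabberd restart
}
```

Calling `cleanup_jabberd_logs` after sourcing the file prints the three commands; setting `DRY_RUN=0` beforehand actually runs them, which needs root.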
However, this also causes problems with jabberd and osad. If I use the commands as the documentation instructs, osa-dispatcher starts but then dies, and the log shows an invalid password error.
So to help explain my issue, I ran a test and tried to capture everything that I could and I'll post it here.
1. Listing of /var/lib/jabberd/db
[root@<spwalk-server> db]# ls
__db.001 __db.006 log.0000000004 log.0000000009 log.0000000014 log.0000000019 log.0000000024 sm.db
__db.002 authreg.db log.0000000005 log.0000000010 log.0000000015 log.0000000020 log.0000000025
__db.003 log.0000000001 log.0000000006 log.0000000011 log.0000000016 log.0000000021 log.0000000026
__db.004 log.0000000002 log.0000000007 log.0000000012 log.0000000017 log.0000000022 log.0000000027
__db.005 log.0000000003 log.0000000008 log.0000000013 log.0000000018 log.0000000023 log.0000000028
2. Spacewalk Server Status
[root@<spwalk-server> db]# spacewalk-service status
postmaster (pid 1175) is running...
router (pid 21431) is running...
sm (pid 21441) is running...
c2s (pid 21451) is running...
s2s (pid 21461) is running...
tomcat6 (pid 1304) is running... [ OK ]
httpd (pid 1385) is running...
osa-dispatcher (pid 21479) is running...
rhn-search is running (1441).
cobblerd (pid 1491) is running...
RHN Taskomatic is running (1515).
3. Most recent log file entry:
2016/08/17 07:44:13 -05:00 21476 0.0.0.0: osad/jabber_lib.__init__
2016/08/17 07:44:13 -05:00 21476 0.0.0.0: osad/jabber_lib.setup_connection('Connected to jabber server', '<spwalk-server>.com')
2016/08/17 07:44:13 -05:00 21476 0.0.0.0: osad/osa_dispatcher.fix_connection('Upstream notification server started on port', 1290)
2016/08/17 07:44:14 -05:00 21476 0.0.0.0: osad/jabber_lib.process_forever
4. Ran the commands as instructed in the jabberd documentation.
/usr/bin/db_checkpoint -1 -h /var/lib/jabberd/db/ ## mark logs for deletion
/usr/bin/db_archive -d -h /var/lib/jabberd/db/ ## delete logs
service jabberd restart
5. Log file entry:
2016/08/17 13:28:19 -05:00 21476 0.0.0.0: osad/jabber_lib.main('ERROR', 'Traceback (most recent call last):\n File "/usr/share/rhn/osad/jabber_lib.py", line 121, in main\n self.process_forever(c)\n File "/usr/share/rhn/osad/jabber_lib.py", line 179, in process_forever\n self.process_once(client)\n File "/usr/share/rhn/osad/osa_dispatcher.py", line 187, in process_once\n client.retrieve_roster()\n File "/usr/share/rhn/osad/jabber_lib.py", line 729, in retrieve_roster\n stanza = self.get_one_stanza()\n File "/usr/share/rhn/osad/jabber_lib.py", line 801, in get_one_stanza\n self.process(timeout=tm)\n File "/usr/share/rhn/osad/jabber_lib.py", line 1055, in process\n data = self._read(self.BLOCK_SIZE)\nSSLError: (\'OpenSSL error; will retry\', "(-1, \'Unexpected EOF\')")\n')
2016/08/17 13:28:29 -05:00 21476 0.0.0.0: osad/jabber_lib.__init__
2016/08/17 13:28:29 -05:00 21476 0.0.0.0: osad/jabber_lib.setup_connection('Connected to jabber server', '<spwalk-server>.com')
2016/08/17 13:28:29 -05:00 21476 0.0.0.0: osad/jabber_lib.register('ERROR', 'Invalid password')
6. Spacewalk server status
[root@<spwalk-server> db]# spacewalk-service status
postmaster (pid 1175) is running...
router (pid 27119) is running...
sm (pid 27129) is running...
c2s (pid 27139) is running...
s2s (pid 27149) is running...
tomcat6 (pid 1304) is running... [ OK ]
httpd (pid 1385) is running...
osa-dispatcher dead but pid file exists
rhn-search is running (1441).
cobblerd (pid 1491) is running...
RHN Taskomatic is running (1515).
7. Long listing of /var/lib/jabberd/db
[root@<spwalk-server> db]# ls -l
total 7536
-rw-r-----. 1 jabber jabber 24576 Aug 17 13:28 __db.001
-rw-r-----. 1 jabber jabber 204800 Aug 17 13:29 __db.002
-rw-r-----. 1 jabber jabber 270336 Aug 17 13:29 __db.003
-rw-r-----. 1 jabber jabber 98304 Aug 17 13:29 __db.004
-rw-r-----. 1 jabber jabber 753664 Aug 17 13:29 __db.005
-rw-r-----. 1 jabber jabber 57344 Aug 17 13:29 __db.006
-rw-r-----. 1 jabber jabber 368640 Aug 17 07:46 authreg.db
-rw-r-----. 1 jabber jabber 10485760 Aug 17 13:29 log.0000000031
-rw-r-----. 1 jabber jabber 487424 Aug 17 13:29 sm.db
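The 'Invalid password' error from step 5 can be picked out of the log without reading the whole file. The following is a minimal sketch, assuming the osa-dispatcher log lives at /var/log/rhn/osa-dispatcher.log (that path is not stated above, so pass the real one as an argument if it differs). The `last_register_error` function name is hypothetical.

```shell
# Hypothetical helper: print the most recent register error
# from an osa-dispatcher log file.
last_register_error() {
    log_file="${1:-/var/log/rhn/osa-dispatcher.log}"  # assumed default path
    # -F treats the pattern as a fixed string, so the dot and
    # parenthesis are matched literally.
    grep -F "jabber_lib.register('ERROR'" "$log_file" | tail -n 1
}
```

Running `last_register_error` right after the jabberd restart shows whether the dispatcher was rejected at registration, as in the entry from 13:28:29 above.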
So neither approach works: completely cleaning out the jabberd database and log files fails, and so does creating a checkpoint and removing the log files that are no longer needed. What can I do to get jabberd and osad working, so that I can push out updates when I need to push them out?
Thank you.
Daryl