I have a Pulp 2.8.0 install on a fully patched CentOS 6.7 machine, with everything installed from the pulp-stable distribution. It uses Qpid as the message broker, and the Mongo server is mongodb-server-2.4.14-1.el6.x86_64.
I can’t seem to keep it running for more than a week before it falls over: tasks stop running, and the following is repeated in the syslog:
Apr 5 16:09:22 oa-ftc-repo0001 pulp: pulp.server.async.scheduler:ERROR: There are 0 pulp_resource_manager processes running. Pulp will not operate correctly without at least one pulp_resource_mananger process running.
Apr 5 16:09:22 oa-ftc-repo0001 pulp: pulp.server.async.scheduler:ERROR: There are 0 pulp_celerybeat processes running. Pulp will not operate correctly without at least one pulp_celerybeat process running.
I do see this running in the process list:
10606 ? Sl 7:57 /usr/bin/python /usr/bin/celery beat --app=pulp.server.async.celery_instance.celery --scheduler=pulp.server.async.scheduler.Scheduler --workdir=/var/run/pulp/ -f /var/log/pulp/celerybeat.log -l INFO --detach --pidfile=/var/run/pulp/celerybeat.pid
If I attempt to stop celerybeat:
# service pulp_celerybeat stop
celery init v10.0.
Using configuration: /etc/default/pulp_workers, /etc/default/pulp_celerybeat
Stopping pulp_celerybeat... ERROR
Timed out while stopping (30s)
I’m not sure how to determine what is dying. If I hard stop and start everything, or reboot, the problem goes away for a few days before it recurs.
Does anyone have advice on what to look for? The Pulp logs basically say that everything is logged to syslog, but I have not found a smoking gun there to indicate what fell over.
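One thing I could do to narrow this down is find the first time each process is reported missing, then read what syslog recorded just before that. Below is a rough sketch in Python. The function name first_missing and the regular expression are my own, written against the two error lines quoted above, so they are assumptions and not anything Pulp ships.

```python
import re

# Pattern written against the scheduler errors quoted above (an assumption,
# not an official Pulp log format): timestamp, host, "pulp:", then the message.
MISSING_RE = re.compile(
    r"^(?P<stamp>\w{3}\s+\d+\s+[\d:]{8})\s+\S+\s+pulp:.*"
    r"There are 0 (?P<name>\w+) processes running"
)


def first_missing(lines):
    """Map each process name to the timestamp of its first 'There are 0' error."""
    first = {}
    for line in lines:
        match = MISSING_RE.search(line)
        # Keep only the earliest report for each process name.
        if match and match.group("name") not in first:
            first[match.group("name")] = match.group("stamp")
    return first
```

It accepts any iterable of lines, so an open file handle on the syslog file works (I am assuming /var/log/messages on this box). The timestamps it returns mark where to start reading backwards for whatever happened right before the processes went away.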