[Pulp-list] Missing celery workers

Brian Bouterse bbouters at redhat.com
Wed Nov 18 16:05:12 UTC 2015


Jason and Jeffrey,

Thanks for reporting this. I've written up a bug [0] and I am
investigating the root cause.

Could you answer the following questions on the bug?

- Can you confirm that it affects both RabbitMQ and Qpid usage?
- Can you confirm that the workers "go missing", return, and then "go
missing" again in a continuous cycle? I expect it to happen every 90
seconds.

- Jeffrey specifically, what OS are you using?
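
One way to check the cycle length is to pair each "New worker ... discovered" entry with the next "has gone missing" entry for the same worker and measure the gap. The following is only a rough sketch: the message patterns are copied from the log excerpt quoted below, the function names are made up for illustration, and it assumes each entry sits on a single line in /var/log/messages (not wrapped as in the email). Syslog timestamps carry no year, so the year is passed in as an assumption.

```python
import re
from datetime import datetime

# Patterns based on the pulp log messages quoted in this thread.
STAMP = r"^(\w{3}\s+\d+ \d\d:\d\d:\d\d) "
DISCOVERED = re.compile(STAMP + r".*New worker '([^']+)' discovered")
MISSING = re.compile(STAMP + r".*Worker '([^']+)' has gone missing")


def parse_stamp(stamp, year):
    """Turn a syslog timestamp such as 'Nov 18 09:05:46' into a datetime."""
    return datetime.strptime(f"{year} {stamp}", "%Y %b %d %H:%M:%S")


def worker_lifetimes(lines, year=2015):
    """Return (worker, seconds) for each discovered -> gone-missing pair."""
    discovered_at = {}
    lifetimes = []
    for line in lines:
        found = DISCOVERED.match(line)
        if found:
            # Remember the most recent discovery time for this worker.
            discovered_at[found.group(2)] = parse_stamp(found.group(1), year)
            continue
        gone = MISSING.match(line)
        if gone and gone.group(2) in discovered_at:
            start = discovered_at.pop(gone.group(2))
            delta = parse_stamp(gone.group(1), year) - start
            lifetimes.append((gone.group(2), delta.total_seconds()))
    return lifetimes
```

Calling `worker_lifetimes(open("/var/log/messages"))` on the Pulp server should give one entry per cycle for each worker; a steady value near 60 or 90 seconds would help narrow down which timeout is involved.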

[0]: https://pulp.plan.io/issues/1380

Thanks,
Brian

On 11/18/2015 09:33 AM, Miller, Jeffrey L wrote:
> I am seeing this behavior as well after upgrading from 2.6 to 2.7.
> However, I am using qpid, not rabbitmq.
> 
> -Jeffrey
> 
> *From:* pulp-list-bounces at redhat.com
> [mailto:pulp-list-bounces at redhat.com] *On Behalf Of *Ashby, Jason (IMS)
> *Sent:* Wednesday, November 18, 2015 8:29 AM
> *To:* pulp-list at redhat.com
> *Subject:* [Pulp-list] Missing celery workers
> 
> Hi all,
> 
> I’m hitting another issue with the upgrade to Pulp 2.7.0 + changing from
> qpid to rabbitmq for messaging.  The workers are continuously going
> missing, every minute or so.  The effect is that the tasks in the task
> list stay in a Waiting state and are never completed.
> 
> Rabbitmq looks healthy; I see successful accepted connections per the
> logs and can see a bunch of connections in the rabbitmq management GUI. 
> I’m kind of stuck as far as troubleshooting goes.  Any tips on what else
> to investigate?
> 
> Pulp and rabbitmq servers are both CentOS 6.
> 
> # /var/log/messages
> 
> Nov 18 08:53:56 pulp01 pulp: celery.worker.consumer:INFO: missed heartbeat from resource_manager@pulp01
> Nov 18 09:05:46 pulp01 pulp: pulp.server.async.worker_watcher:INFO: New worker 'reserved_resource_worker-3@pulp01' discovered
> Nov 18 09:05:46 pulp01 pulp: pulp.server.async.worker_watcher:INFO: New worker 'reserved_resource_worker-1@pulp01' discovered
> Nov 18 09:05:46 pulp01 pulp: pulp.server.async.worker_watcher:INFO: New worker 'reserved_resource_worker-2@pulp01' discovered
> Nov 18 09:05:46 pulp01 pulp: pulp.server.async.worker_watcher:INFO: New worker 'reserved_resource_worker-0@pulp01' discovered
> Nov 18 09:05:56 pulp01 pulp: pulp.server.async.worker_watcher:INFO: New worker 'resource_manager@pulp01' discovered
> Nov 18 09:06:46 pulp01 pulp: pulp.server.async.scheduler:ERROR: Worker 'reserved_resource_worker-3@pulp01' has gone missing, removing from list of workers
> Nov 18 09:06:46 pulp01 pulp: pulp.server.async.tasks:ERROR: The worker named reserved_resource_worker-3@pulp01 is missing. Canceling the tasks in its queue.
> Nov 18 09:06:46 pulp01 pulp: pulp.server.async.scheduler:ERROR: Worker 'reserved_resource_worker-1@pulp01' has gone missing, removing from list of workers
> Nov 18 09:06:46 pulp01 pulp: pulp.server.async.tasks:ERROR: The worker named reserved_resource_worker-1@pulp01 is missing. Canceling the tasks in its queue.
> Nov 18 09:06:46 pulp01 pulp: pulp.server.async.scheduler:ERROR: Worker 'reserved_resource_worker-2@pulp01' has gone missing, removing from list of workers
> Nov 18 09:06:46 pulp01 pulp: pulp.server.async.tasks:ERROR: The worker named reserved_resource_worker-2@pulp01 is missing. Canceling the tasks in its queue.
> Nov 18 09:06:46 pulp01 pulp: pulp.server.async.scheduler:ERROR: Worker 'reserved_resource_worker-0@pulp01' has gone missing, removing from list of workers
> Nov 18 09:06:46 pulp01 pulp: pulp.server.async.tasks:ERROR: The worker named reserved_resource_worker-0@pulp01 is missing. Canceling the tasks in its queue.
> Nov 18 09:06:46 pulp01 pulp: pulp.server.async.scheduler:ERROR: Worker 'resource_manager@pulp01' has gone missing, removing from list of workers
> Nov 18 09:06:46 pulp01 pulp: pulp.server.async.tasks:ERROR: The worker named resource_manager@pulp01 is missing. Canceling the tasks in its queue.
> Nov 18 09:06:46 pulp01 pulp: pulp.server.async.scheduler:ERROR: There are 0 pulp_resource_manager processes running. Pulp will not operate correctly without at least one pulp_resource_mananger process running.