[Date Prev][Date Next]   [Thread Prev][Thread Next]   [Thread Index] [Date Index] [Author Index]

Re: [Pulp-list] 2.8 Tasks Stuck at Waiting



Slight correction:  (in the middle of my email):

I did grep /var/log/httpd/ssl_access_log for my task-id (f3a4a828-1a98-4d30-989f-a8f62acc29f9) in /var/log/* and /var/log/*/*, but it only found it in /var/log/httpd/ssl_access_log and /var/log/httpd/ssl_request_log (lines like:
/var/log/httpd/ssl_access_log:135.1.215.52 - - [12/Apr/2016:15:04:17 -0500] "GET /pulp/api/v2/tasks/f3a4a828-1a98-4d30-989f-a8f62acc29f9/ HTTP/1.1" 200 579

regards,
Paul
On 04/12/2016 03:09 PM, EXT Paul Jochum wrote:
Hi Brian:

    I am running into the same problem of "waiting", so here are my outputs (let me know if you need anything else):

-sh-4.1# pulp-admin -vv tasks details --task-id f3a4a828-1a98-4d30-989f-a8f62acc29f9
+----------------------------------------------------------------------+
                              Task Details
+----------------------------------------------------------------------+

2016-04-12 14:59:46,539 - DEBUG - sending GET request to /pulp/api/v2/tasks/f3a4a828-1a98-4d30-989f-a8f62acc29f9/
2016-04-12 14:59:46,602 - INFO - GET request to /pulp/api/v2/tasks/f3a4a828-1a98-4d30-989f-a8f62acc29f9/ with parameters None
2016-04-12 14:59:46,602 - INFO - Response status : 200

2016-04-12 14:59:46,602 - INFO - Response body :
 {
  "exception": null,
  "task_type": "pulp.server.managers.repo.sync.sync",
  "_href": "/pulp/api/v2/tasks/f3a4a828-1a98-4d30-989f-a8f62acc29f9/",
  "task_id": "f3a4a828-1a98-4d30-989f-a8f62acc29f9",
  "tags": [
    "pulp:repository:rhel-i386-server-optional-6.5.z-debuginfo",
    "pulp:action:sync"
  ],
  "finish_time": null,
  "_ns": "task_status",
  "start_time": null,
  "traceback": null,
  "spawned_tasks": [],
  "progress_report": {},
  "queue": "None.dq",
  "state": "waiting",
  "worker_name": null,
  "result": null,
  "error": null,
  "_id": {
    "$oid": "570d1727e75b4adec6bd0d58"
  },
  "id": "570d1727e75b4adec6bd0d58"
}

Operations:       sync
Resources:        rhel-i386-server-optional-6.5.z-debuginfo (repository)
State:            Waiting
Start Time:       Unstarted
Finish Time:      Incomplete
Result:           Incomplete
Task Id:          f3a4a828-1a98-4d30-989f-a8f62acc29f9
Progress Report:


-sh-4.1# ps -awfux | grep celery
Warning: bad syntax, perhaps a bogus '-'? See /usr/share/doc/procps-3.2.8/FAQ
root     16147  0.0  0.0   6444   704 pts/2    S+   15:01 0:00 \_ grep celery
apache     977  0.1  0.1 561608 25908 ?        Sl   Apr09   6:15 /usr/bin/python /usr/bin/celery beat --app=pulp.server.async.celery_instance.celery --scheduler=pulp.server.async.scheduler.Scheduler --workdir=/var/run/pulp/ -f /var/log/pulp/celerybeat.log -l INFO --detach --pidfile=/var/run/pulp/celerybeat.pid
apache    9558  0.0  0.4 553760 58972 ?        Sl   10:40   0:05 /usr/bin/python -m celery.__main__ worker --heartbeat-interval=30 -c 1 -n resource_manager lss-pulp01 ih lucent com --events --app=pulp.server.async.app --loglevel=INFO -Q resource_manager --logfile=/var/log/pulp/resource_manager.log --pidfile=/var/run/pulp/resource_manager.pid
apache    9602  0.4  0.3 472928 50512 ?        Sl   10:40   1:16  \_ /usr/bin/python -m celery.__main__ worker --heartbeat-interval=30 -c 1 -n resource_manager lss-pulp01 ih lucent com --events --app=pulp.server.async.app --loglevel=INFO -Q resource_manager --logfile=/var/log/pulp/resource_manager.log --pidfile=/var/run/pulp/resource_manager.pid
apache    9792  0.0  0.4 553976 58924 ?        Sl   10:41   0:06 /usr/bin/python -m celery.__main__ worker --heartbeat-interval=30 -c 1 -n reserved_resource_worker-0 lss-pulp01 ih lucent com --events --app=pulp.server.async.app --loglevel=INFO --logfile=/var/log/pulp/reserved_resource_worker-0.log --pidfile=/var/run/pulp/reserved_resource_worker-0.pid
apache    9865  0.0  0.3 472916 49692 ?        S    10:41   0:00  \_ /usr/bin/python -m celery.__main__ worker --heartbeat-interval=30 -c 1 -n reserved_resource_worker-0 lss-pulp01 ih lucent com --events --app=pulp.server.async.app --loglevel=INFO --logfile=/var/log/pulp/reserved_resource_worker-0.log --pidfile=/var/run/pulp/reserved_resource_worker-0.pid
apache    9807  0.0  0.4 553976 58920 ?        Sl   10:41   0:06 /usr/bin/python -m celery.__main__ worker --heartbeat-interval=30 -c 1 -n reserved_resource_worker-1 lss-pulp01 ih lucent com --events --app=pulp.server.async.app --loglevel=INFO --logfile=/var/log/pulp/reserved_resource_worker-1.log --pidfile=/var/run/pulp/reserved_resource_worker-1.pid
apache    9887  0.0  0.3 472916 49688 ?        S    10:41   0:00  \_ /usr/bin/python -m celery.__main__ worker --heartbeat-interval=30 -c 1 -n reserved_resource_worker-1 lss-pulp01 ih lucent com --events --app=pulp.server.async.app --loglevel=INFO --logfile=/var/log/pulp/reserved_resource_worker-1.log --pidfile=/var/run/pulp/reserved_resource_worker-1.pid
apache    9826  0.0  0.4 553564 58896 ?        Sl   10:41   0:06 /usr/bin/python -m celery.__main__ worker --heartbeat-interval=30 -c 1 -n reserved_resource_worker-2 lss-pulp01 ih lucent com --events --app=pulp.server.async.app --loglevel=INFO --logfile=/var/log/pulp/reserved_resource_worker-2.log --pidfile=/var/run/pulp/reserved_resource_worker-2.pid
apache    9893  0.0  0.3 472924 49684 ?        S    10:41   0:00  \_ /usr/bin/python -m celery.__main__ worker --heartbeat-interval=30 -c 1 -n reserved_resource_worker-2 lss-pulp01 ih lucent com --events --app=pulp.server.async.app --loglevel=INFO --logfile=/var/log/pulp/reserved_resource_worker-2.log --pidfile=/var/run/pulp/reserved_resource_worker-2.pid
apache    9847  0.0  0.4 553976 58912 ?        Sl   10:41   0:06 /usr/bin/python -m celery.__main__ worker --heartbeat-interval=30 -c 1 -n reserved_resource_worker-3 lss-pulp01 ih lucent com --events --app=pulp.server.async.app --loglevel=INFO --logfile=/var/log/pulp/reserved_resource_worker-3.log --pidfile=/var/run/pulp/reserved_resource_worker-3.pid
apache    9917  0.0  0.3 472916 49692 ?        S    10:41   0:00  \_ /usr/bin/python -m celery.__main__ worker --heartbeat-interval=30 -c 1 -n reserved_resource_worker-3 lss-pulp01 ih lucent com --events --app=pulp.server.async.app --loglevel=INFO --logfile=/var/log/pulp/reserved_resource_worker-3.log --pidfile=/var/run/pulp/reserved_resource_worker-3.pid
apache    9871  0.0  0.4 553752 58896 ?        Sl   10:41   0:06 /usr/bin/python -m celery.__main__ worker --heartbeat-interval=30 -c 1 -n reserved_resource_worker-4 lss-pulp01 ih lucent com --events --app=pulp.server.async.app --loglevel=INFO --logfile=/var/log/pulp/reserved_resource_worker-4.log --pidfile=/var/run/pulp/reserved_resource_worker-4.pid
apache    9945  0.0  0.3 472920 49684 ?        S    10:41   0:00  \_ /usr/bin/python -m celery.__main__ worker --heartbeat-interval=30 -c 1 -n reserved_resource_worker-4 lss-pulp01 ih lucent com --events --app=pulp.server.async.app --loglevel=INFO --logfile=/var/log/pulp/reserved_resource_worker-4.log --pidfile=/var/run/pulp/reserved_resource_worker-4.pid
apache    9896  0.0  0.4 553988 58932 ?        Sl   10:41   0:06 /usr/bin/python -m celery.__main__ worker --heartbeat-interval=30 -c 1 -n reserved_resource_worker-5 lss-pulp01 ih lucent com --events --app=pulp.server.async.app --loglevel=INFO --logfile=/var/log/pulp/reserved_resource_worker-5.log --pidfile=/var/run/pulp/reserved_resource_worker-5.pid
apache    9982  0.0  0.3 472924 49700 ?        S    10:41   0:00  \_ /usr/bin/python -m celery.__main__ worker --heartbeat-interval=30 -c 1 -n reserved_resource_worker-5 lss-pulp01 ih lucent com --events --app=pulp.server.async.app --loglevel=INFO --logfile=/var/log/pulp/reserved_resource_worker-5.log --pidfile=/var/run/pulp/reserved_resource_worker-5.pid
apache    9918  0.0  0.4 553972 58916 ?        Sl   10:41   0:06 /usr/bin/python -m celery.__main__ worker --heartbeat-interval=30 -c 1 -n reserved_resource_worker-6 lss-pulp01 ih lucent com --events --app=pulp.server.async.app --loglevel=INFO --logfile=/var/log/pulp/reserved_resource_worker-6.log --pidfile=/var/run/pulp/reserved_resource_worker-6.pid
apache    9990  0.0  0.3 472912 49680 ?        S    10:41   0:00  \_ /usr/bin/python -m celery.__main__ worker --heartbeat-interval=30 -c 1 -n reserved_resource_worker-6 lss-pulp01 ih lucent com --events --app=pulp.server.async.app --loglevel=INFO --logfile=/var/log/pulp/reserved_resource_worker-6.log --pidfile=/var/run/pulp/reserved_resource_worker-6.pid
apache    9940  0.0  0.4 553756 58892 ?        Sl   10:41   0:06 /usr/bin/python -m celery.__main__ worker --heartbeat-interval=30 -c 1 -n reserved_resource_worker-7 lss-pulp01 ih lucent com --events --app=pulp.server.async.app --loglevel=INFO --logfile=/var/log/pulp/reserved_resource_worker-7.log --pidfile=/var/run/pulp/reserved_resource_worker-7.pid
apache   10013  0.0  0.3 472920 49684 ?        S    10:41   0:00  \_ /usr/bin/python -m celery.__main__ worker --heartbeat-interval=30 -c 1 -n reserved_resource_worker-7 lss-pulp01 ih lucent com --events --app=pulp.server.async.app --loglevel=INFO --logfile=/var/log/pulp/reserved_resource_worker-7.log --pidfile=/var/run/pulp/reserved_resource_worker-7.pid
-sh-4.1#


I did grep /var/log/httpd/ssl_access_log in /var/log/* and /var/log/*/*, but it only found it in /var/log/httpd/ssl_access_log and /var/log/httpd/ssl_request_log (lines like:
/var/log/httpd/ssl_access_log:135.1.215.52 - - [12/Apr/2016:15:04:17 -0500] "GET /pulp/api/v2/tasks/f3a4a828-1a98-4d30-989f-a8f62acc29f9/ HTTP/1.1" 200 579


-sh-4.1# rpm -q -a | grep -i pulp | sort
mod_wsgi-3.4-2.pulp.el6.x86_64
pulp-admin-client-2.8.1-1.el6.noarch
pulp-puppet-admin-extensions-2.8.1-1.el6.noarch
pulp-puppet-plugins-2.8.1-1.el6.noarch
pulp-rpm-admin-extensions-2.8.1-1.el6.noarch
pulp-rpm-plugins-2.8.1-1.el6.noarch
pulp-selinux-2.8.1-1.el6.noarch
pulp-server-2.8.1-1.el6.noarch
python-isodate-0.5.0-4.pulp.el6.noarch
python-kombu-3.0.33-4.pulp.el6.noarch
python-pulp-bindings-2.8.1-1.el6.noarch
python-pulp-client-lib-2.8.1-1.el6.noarch
python-pulp-common-2.8.1-1.el6.noarch
python-pulp-oid_validation-2.8.1-1.el6.noarch
python-pulp-puppet-common-2.8.1-1.el6.noarch
python-pulp-repoauth-2.8.1-1.el6.noarch
python-pulp-rpm-common-2.8.1-1.el6.noarch


-sh-4.1# uname -a
Linux lss-pulp01.ih.lucent.com 2.6.32-042stab113.21 #1 SMP Wed Mar 23 11:05:25 MSK 2016 x86_64 x86_64 x86_64 GNU/Linux


-sh-4.1# more /etc/*release*
::::::::::::::
/etc/redhat-release
::::::::::::::
Scientific Linux release 6.7 (Carbon)
::::::::::::::
/etc/system-release
::::::::::::::
Scientific Linux release 6.7 (Carbon)
::::::::::::::
/etc/system-release-cpe
::::::::::::::
cpe:/o:redhat:enterprise_linux:6.7:ga
-sh-4.1#

On 04/11/2016 03:32 PM, Brian Bouterse wrote:
To anyone who has a stuck task running Pulp 2.6 or newer, can you open a
bug or e-mail the list with the following. The full output from:

pulp-admin -vv tasks details --task-id <the_stuck_task_UUID>

Post the full output, but observe the value of "worker_name" similar to
what was suggested here[0].

Can you also include the `sudo ps -awfux | grep celery` output to show
your running process?

Also can you grep the logs for that UUID and include any lines that you
see as output too.

[0]: https://www.redhat.com/archives/pulp-list/2016-April/msg00026.html

Thanks,
Brian


On 04/11/2016 01:32 PM, Mihai Ibanescu wrote:
I have a similar problem on 2.7.1 with rabbitmq.

Occasionally a task will get "lost":

pulp-admin tasks details --task-id 622041ac-e9e4-4a15-bd7c-7c98a17782e0
+----------------------------------------------------------------------+
                               Task Details
+----------------------------------------------------------------------+

Operations:       publish
Resources:        thirdparty-snapshot-rpm-latest (repository)
State:            Waiting
Start Time:       Unstarted
Finish Time:      Incomplete
Result:           Incomplete
Task Id:          622041ac-e9e4-4a15-bd7c-7c98a17782e0
Progress Report:


I have seen more than one getting in this state.

I can cancel the task, but I'd like to find out if I can tickle
something to re-process the queue. Other tasks do get correctly processed.

If there's something you would like me to do in order to debug this
further, please let me know.

Thank you!
Mihai

On Fri, Apr 8, 2016 at 9:27 AM, Brian Bouterse <bbouters redhat com
<mailto:bbouters redhat com>> wrote:

     There is a known issue [0] introduced in 2.8.0 which will cause your
     tasks to become stuck at waiting. This occurs after Pulp runs normally
     for hours or days.

     A fix is completed but won't be released until Pulp 2.8.3 is released.
     The good news is, you can resolve that issue now by upgrading
     python-kombu on your existing 2.8.0, 2.8.1, or 2.8.2 system. Manually
     install python-kombu-3.0.33-5.pulp for your distro on your system,
     restart Pulp services and you'll be all set. All of the links below come
     out of the Pulp build system[1]

     EL6: http://tinyurl.com/j5sy5zh
     EL7: http://tinyurl.com/jppwcdv
     F22: http://tinyurl.com/hp7z4sq
     F23: http://tinyurl.com/zplzua3
     F24: update is in -testing repo [2]
     rawhide: already available in stable

     These are the same versions of python-kombu that you'll receive when
     2.8.3 comes out so upgrading this package will keep your packages on the
     normal upgrade path. These newer python-kombu versions are compatible
     with 2.8.0, 2.8.1, and 2.8.2 versions of Pulp.

     [0]: https://pulp.plan.io/issues/1801
     [1]: http://koji.katello.org/koji/packageinfo?packageID=665
     [2]: https://bodhi.fedoraproject.org/updates/FEDORA-2016-ec038bbf19

     -Brian

     _______________________________________________
     Pulp-list mailing list
     Pulp-list redhat com <mailto:Pulp-list redhat com>
     https://www.redhat.com/mailman/listinfo/pulp-list


_______________________________________________
Pulp-list mailing list
Pulp-list redhat com
https://www.redhat.com/mailman/listinfo/pulp-list

_______________________________________________
Pulp-list mailing list
Pulp-list redhat com
https://www.redhat.com/mailman/listinfo/pulp-list


[Date Prev][Date Next]   [Thread Prev][Thread Next]   [Thread Index] [Date Index] [Author Index]