
Re: [Pulp-list] Messaging Questions

On 07/07/2010 06:27 PM, Bryan Kearney wrote:

Two questions:

1) at what point does this become QMF?

This is still a long way from QMF, but it is a good question to ask periodically as we go along.

2) Does AMQP support the notion of temporary queues? That could/should
solve the dead queue issue.

Yes, it does. If we stick with only synchronous requests to the agent and leave asynchronous work to the pulp Task engine, temporary (non-durable) queues will be a good approach to pruning dead queues.

-- bk

On 07/07/2010 05:37 PM, Jeff Ortel wrote:

On 07/07/2010 08:57 AM, Jason Dobies wrote:

Sorry I'm so late getting back to this.

Synchronous messages will fail immediately if the agent is unavailable
so let's assume for the moment, we're only talking about asynchronous
messages (RMI).

Makes sense.

I believe that all asynchronous messages to the agent
should be dispatched through the Tasking framework. That way *all*
policy around asynchronous operations will be in one place.

So in this rationale, all message bus invocations are synchronous; they
just get their asynchronous-ness from our task framework?

I like that, it limits the number of places we need to address these
ugly cases.


The lifecycle of the asynchronous message should be tied to (and
by) the Task. So long as the task lives, the message should also live.
If the task times out, the message should be dequeued.

Does our tasking framework support time outs yet?

Not yet.

So, the
messaging framework needs to support message dequeuing. It can do this
by sending a cancellation message with higher priority if dequeuing is
not directly supported by qpid.

This leaves orphaned queues for consumer un-registration. Seems like
the ConsumerApi could be responsible for this by doing something like:

from pulp.agent import Agent

which would remove the associated queue.

That covers the case where an agent knowingly is going away, but what
about when the consumer just full on disappears? For instance, the box
is reprovisioned, goes up in flames, or whatever other reason and the
admin doesn't think to unregister it?

Unfortunately, AWOL consumers leave a lot of resources that need to be
cleaned up. I'll ping the qpid guys and see how earnest we need to be
about cleaning up dead queues.

I think we still need some sort of reaper/ping on agents to make sure
they are still alive. That'll get us into the questions on what happens
if an agent is temporarily down, but I think those are better than the
alternative of dead queues floating around.


Thinking that agents would publish heartbeat events on the bus ...
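
One way to picture the heartbeat idea is a small monitor on the server side that records the last heartbeat per agent and reaps queues for agents that have gone silent. This is only a sketch: HeartbeatMonitor, the timeout value, and the delete_queue callback are hypothetical names, not part of pulp or qpid.

```python
import time


class HeartbeatMonitor:
    """Tracks the last heartbeat seen per agent (hypothetical helper)."""

    def __init__(self, timeout, clock=time.monotonic):
        self.timeout = timeout   # seconds of silence before an agent counts as dead
        self.clock = clock       # injectable so the logic can be tested without sleeping
        self.last_seen = {}      # agent id -> time of the last heartbeat

    def heartbeat(self, agent_id):
        """Record a heartbeat event received from the bus."""
        self.last_seen[agent_id] = self.clock()

    def dead_agents(self):
        """Return ids of agents whose last heartbeat is older than the timeout."""
        now = self.clock()
        return sorted(a for a, t in self.last_seen.items() if now - t > self.timeout)

    def reap(self, delete_queue):
        """Call delete_queue(agent_id) for each dead agent, then forget it."""
        reaped = self.dead_agents()
        for agent_id in reaped:
            delete_queue(agent_id)
            del self.last_seen[agent_id]
        return reaped
```

How long the timeout should be decides how an agent that is only temporarily down gets treated, which is the open question raised above.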

The messaging framework (pmf) ensures that messages are processed
(dispatched) before they are acknowledged (taken from the queue). This
protects against cases where the agent consumes a message, then dies
and never processes it. Due to guaranteed message delivery, the agent
will always reply unless it's dead. In which case, see above.
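
The dispatch-then-acknowledge ordering can be illustrated with an in-memory stand-in for a broker queue. This is not the pmf or qpid API; Queue, consume_one, and recover are hypothetical names used only to show why a crash during dispatch does not lose the message.

```python
from collections import deque


class Queue:
    """In-memory stand-in for a broker queue with explicit acknowledgement."""

    def __init__(self):
        self.pending = deque()   # messages not yet delivered
        self.unacked = {}        # delivery tag -> delivered but unacknowledged message
        self.next_tag = 0

    def put(self, message):
        self.pending.append(message)

    def get(self):
        """Deliver the next message; it stays tracked until ack() is called."""
        message = self.pending.popleft()
        self.next_tag += 1
        self.unacked[self.next_tag] = message
        return self.next_tag, message

    def ack(self, tag):
        """Acknowledge a delivery: the message is now gone for good."""
        del self.unacked[tag]

    def recover(self):
        """Simulate the consumer dying: unacknowledged messages are requeued."""
        for tag in sorted(self.unacked, reverse=True):
            self.pending.appendleft(self.unacked.pop(tag))


def consume_one(queue, dispatch):
    """Dispatch first, acknowledge second."""
    tag, message = queue.get()
    dispatch(message)   # if this raises, ack() is never reached
    queue.ack(tag)
```

If dispatch() fails, the message remains unacknowledged and becomes available again after recover(), so it is redelivered rather than dropped.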

I see what you're saying, but I'm thinking of a different case. Maybe
I'm viewing this wrong. I thought the flow looked like:

- Server sends message to agent
- Agent acknowledges and says it'll start processing the request
- Server makes note somewhere that the requested action is "in
- Later, when it's finished, the agent sends a message to the server
  that the operation has completed and its status. Looking at the wiki,
  this looks like it's sent to the server queue.

The approach I'm thinking of is that async activities will be dispatched
using async tasks. When a task runs, it does synchronous RMI on the
agent. If the agent is unavailable, the messaging framework reports this
immediately. The task catches the 'Unavailable' exception and goes to the
RETRY (later) state. This way, all this logic is in the Tasks framework.

The task states would go something like this:

<agent is down>

<try again later>
<agent is alive now>
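
The retry flow described above can be sketched as a task that wraps a synchronous call to the agent. The state names (WAITING, RUNNING, FINISHED) and the Task class here are assumptions made for illustration; only the 'Unavailable' exception and the RETRY state are named in this thread, and none of this is the actual pulp tasking code.

```python
class Unavailable(Exception):
    """Raised by the (hypothetical) messaging layer when the agent is unreachable."""


# State names are assumptions, apart from RETRY which is mentioned in the thread.
WAITING, RUNNING, RETRY, FINISHED = "waiting", "running", "retry", "finished"


class Task:
    """Wraps one synchronous RMI call on the agent."""

    def __init__(self, call):
        self.call = call       # callable performing the synchronous request
        self.state = WAITING
        self.result = None
        self.attempts = 0

    def run(self):
        """Run once; go to RETRY if the agent is down, FINISHED otherwise."""
        self.state = RUNNING
        self.attempts += 1
        try:
            self.result = self.call()
        except Unavailable:
            self.state = RETRY     # the scheduler would run the task again later
        else:
            self.state = FINISHED
        return self.state
```

A scheduler would simply call run() again on any task left in the RETRY state, so the policy for how often and how long to retry stays inside the task framework.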

If that's the case, then my question is about what happens when that
last bullet point doesn't happen (for instance, zombie attack caused the
power to go out and the machine died). Won't there still be something in
the server that says "I sent a message to the agent that was accepted,
but he never sent me a message back. I'm sad."

This is a good reason to keep all the asynchronous behaviour in the Task

If that's not the case, can you clear up how that flow looks for me?

Yeah. It would be a mess.

Yes. All requests (messages) have unique serial numbers which are
placed in the reply and matched by the messaging framework. Agent B
never sees request 1234. This behaviour is standardized and enforced by
the messaging framework.
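
A minimal sketch of serial-number matching, assuming requests and replies are plain dictionaries with an "sn" field. ReplyMatcher and the field names are hypothetical and do not reflect the real pmf envelope format.

```python
import itertools


class ReplyMatcher:
    """Matches replies to outstanding requests by serial number."""

    def __init__(self):
        self._serial = itertools.count(1)
        self._pending = {}   # serial number -> id of the agent the request went to

    def send(self, agent_id, payload):
        """Build a request envelope carrying a fresh serial number."""
        sn = next(self._serial)
        self._pending[sn] = agent_id
        return {"sn": sn, "to": agent_id, "payload": payload}

    def receive(self, reply):
        """Return (agent_id, result) for a matching reply, or None if unknown."""
        agent_id = self._pending.pop(reply["sn"], None)
        if agent_id is None:
            return None      # stale, duplicate, or never requested
        return agent_id, reply["result"]
```

Because each serial number is removed once its reply arrives, a late or duplicated reply is ignored instead of being attributed to another request.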

I'm gonna punt on my follow up question until I'm clear on the above
flow so I don't make us discuss something that's potentially not

Assuming that it cannot re-register with the same ID, it would be
considered a new consumer. The previous registration will orphan many
resources in pulp - including the queue. Orphans need to be addressed
across the board. See comment above for queue clean up.

If not, when does that queue get deleted? What happens if that
re-registration happens while the agent is doing a task before it
replies, will it confuse the server that the reply came from a
"different" consumer?

- Are replies back to the server guaranteed delivery as well?


thinking of the situation where the server is offline when the agent
finishes doing its business.

Pulp-list mailing list
Pulp-list redhat com


--
Jason Dobies
RHCE# 805008743336126
Freenode: jdob

