[Date Prev][Date Next]   [Thread Prev][Thread Next]   [Thread Index] [Date Index] [Author Index]

[Pulp-list] Code coupling, concurreny, and task scheduling

All Developers,

One problem that keeps recurring in our code base is tight coupling
between objects. We have had this problem before in enough places to
know that it almost always causes problems. Coupling make coding faster
at the time, but it makes testing and future modifications both

The latest problem is in the pulp server's async tasking subsystem: I've
been adding scheduling to tasks and the task queue. However, folks have
implemented derived classes of Task and have coupled them to the global
task queue instance. This means that there is little or no opportunity
to add scheduling to these tasks. I'll show you what I mean:

The AgentTask class:

class AgentTask(AsyncTask):
    def enqueue(self, unique=False):
        if _queue.enqueue(self, unique):
            return self

A task should not know how to enqueue itself in a task queue instance. A
task queue should know how to enqueue a task. This isn't disastrous in
itself, but when combined with the InstallPackages class:

class InstallPackages(AgentTask):
    def __init__(self, consumerid, packages, errata=(), reboot_suggested=False, assumeyes=False):

We end up with a constructor that has side-effects based on the tight
coupling with the task queue. THERE IS NO OPPORTUNITY TO SCHEDULE THIS
TASK! The task will get enqueued and possibly even run before we can add
a scheduler to it.

Even the existing scheduling is not thread safe. The task is enqueued
long before its scheduled_time field gets set. Meaning the task queue is
sorted with a scheduled_time value of None and the task will potentially
get run immediately if the dispatcher wakes up and runs between the
enqueueing and setting of the scheduled_time field. This is simply

Two things central to OO mantra are being disregarded here:
     1. No tight coupling. The AgentTask class knows about a specific
        instance of the TaskQueue class. If it's not a field, it
        shouldn't know about it.
     2. No side-effects in constructors. Constructors are for memory
        allocation and initialization, not executing code.

I will have to refactor all of this in order to (1) fix the current
scheduling and (2) port to the new scheduling.

Lessons to take away:
      * Coupling is bad: This OO 101, and we should all know better.
        It's bitten us several times now.
      * Concurrency is hard: Any time you're touching code that does
        concurrency, make sure to get it reviewed by another pair of

Jason L Connor
linear on freenode #pulp
RHCE: 805010912355231
GPG Fingerprint: 2048R/CC4ED7C1

Attachment: signature.asc
Description: This is a digitally signed message part

[Date Prev][Date Next]   [Thread Prev][Thread Next]   [Thread Index] [Date Index] [Author Index]