[Pulp-list] working directories proposal

Mon Dec 15 20:04:25 UTC 2014

----- Original Message -----
> ----- Original Message -----
> > From: "Dennis Kliban" <dkliban at redhat.com>
> > To: "pulp-list" <pulp-list at redhat.com>
> > Sent: Monday, December 15, 2014 2:37:35 PM
> > Subject: [Pulp-list] working directories proposal
> > 
> > What we have now:
> > 
> >   - Storage directory defined here [0]
> > 
> >   - The above value is used to create a path for working directories here
> >   [1]
> > 
> > Proposed changes:
> > 
> >   - Add a new config value in the 'server' section called
> >   'working_directory'.  Its default value would be /var/lib/storage
> 
> We should have "pulp" in the path. /var/cache/pulp would be a good choice and
> works well with the FHS.
> 
> > 
> >   - Change common utils [1] to use ['server']['working_directory'] config as
> >   base path.
> > 
> >   - Create 'working_directories' collection in database.
> > 
> >   - Add a check to repository_working_dir [2] and the rest of *_working_dir
> >   methods to determine if this method is called from a task.  If it is, add
> >   the task id, worker id, and path to 'working_directories' collection.
> > 
> >   - Create a periodic task that will check for any tasks in a final state
> >   with existing working directories and delete those directories.  This will
> >   need to take into account that the directories exist on a specific worker.
> 
> This may be complicated. The directory exists on a specific machine, which
> may have several workers. Dispatching the cleanup task to a specific worker
> or specific machine may be difficult.
> 
> Here's another option that wouldn't require a DB collection. Assume that
> /var/cache/pulp/ will actually be the value of the "working_directory"
> setting.
> 
> When a worker starts, it creates the directory /var/cache/pulp/<worker_id>/.
> If it already exists, any content gets blown away. On exit, the whole
> directory named <worker_id> gets deleted.
> 
> If a task requests temporary storage, the worker creates
> /var/cache/pulp/<worker_id>/<task_id>/ and makes a new temp directory within
> that. Multiple directories can be requested per task. When the task
> completes, the worker blows away the entire directory named <task_id>.
> 
> I think that covers all of our bases. Thoughts?

This is a much more straightforward solution.  Can anyone think of situations where we wouldn't want to clean up the working directories when a worker is (re)started?
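
To make the lifecycle concrete, here is a rough sketch of the per-worker / per-task layout described above.  It is only an illustration and not existing Pulp code: the function names (start_worker, stop_worker, request_task_dir, finish_task) and the WORKING_DIRECTORY constant are hypothetical, and /var/cache/pulp is assumed as the value of the proposed 'working_directory' setting.

```python
import os
import shutil
import tempfile

# Assumption: stands in for the proposed ['server']['working_directory'] setting.
WORKING_DIRECTORY = "/var/cache/pulp"


def worker_dir(worker_id, base=WORKING_DIRECTORY):
    """Return <base>/<worker_id> without touching the filesystem."""
    return os.path.join(base, worker_id)


def start_worker(worker_id, base=WORKING_DIRECTORY):
    """On worker start: recreate <base>/<worker_id>/, discarding any leftovers."""
    path = worker_dir(worker_id, base)
    shutil.rmtree(path, ignore_errors=True)
    os.makedirs(path)
    return path


def stop_worker(worker_id, base=WORKING_DIRECTORY):
    """On worker exit: delete the whole <worker_id> directory."""
    shutil.rmtree(worker_dir(worker_id, base), ignore_errors=True)


def request_task_dir(worker_id, task_id, base=WORKING_DIRECTORY):
    """Create a new temp directory under <base>/<worker_id>/<task_id>/.

    May be called several times per task; each call returns a distinct path.
    """
    task_root = os.path.join(worker_dir(worker_id, base), task_id)
    os.makedirs(task_root, exist_ok=True)
    return tempfile.mkdtemp(dir=task_root)


def finish_task(worker_id, task_id, base=WORKING_DIRECTORY):
    """On task completion: delete the entire <task_id> directory."""
    task_root = os.path.join(worker_dir(worker_id, base), task_id)
    shutil.rmtree(task_root, ignore_errors=True)
```

Nothing here needs a database collection or a cleanup task dispatched to a particular machine, since each worker only ever removes directories under its own <worker_id>.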
