[rhn-users] Software Content Management (Introducing Pulp)

Máirín Duffy duffy at redhat.com
Mon Feb 25 21:56:20 UTC 2008


Hi folks,

As some of you who have participated in them already know, over the past 
couple of years or so Red Hat has been conducting some studies on how 
folks manage their systems using the Red Hat Network and Satellite 
products. We've learned a lot about the processes many of you have 
established for managing your systems and the strengths and weaknesses 
of the RHN products in supporting those processes. In addition to this, 
it is also clear that the free and open source management tools 
available for Fedora, RHEL, and CentOS (as well as other *nixes) don't 
sufficiently cover some of the areas of need that the current Satellite 
product addresses.

Over the past few weeks some Red Hat folks and Fedora community members 
have been working on a free and open source project that will attempt 
not only to fill one of the gaps in free and open source systems 
management tools, but also to apply some of the things we've learned 
from talking with Satellite and RHN customers to improve how we address 
one area of systems management. This project is called 'Pulp', and its 
scope is centered around the management of software content. From the 
Pulp Fedora project website [1]:

"Pulp is an application for managing the software installed on your 
systems. Suppose you want to control what machines on your network get 
what software updates, to establish testing/stage repositories, to 
mirror 3rd party content, to create your own repositories, or to add new 
content to existing repositories. Pulp will provide an easy web, 
web-services, and command line interface for managing all of this."

REPOSITORY CREATION AND MIRRORING MANAGEMENT

To start, we were thinking that Pulp could be a way of improving upon 
the custom channel management capabilities of Satellite, using yum 
repositories instead of RHN channels. Last month Michael DeHaan hosted a 
discussion introducing Pulp at FUDCon in Raleigh [2]. What we have taken 
away from the participants of that session is that they would like to 
see less emphasis on system <=> content mapping, and would like tools 
that focus on mirroring contents from many different 'upstream' sources 
and organizing them neatly in one place. It could be kind of like mirror 
manager, except instead of managing the mirroring of a particular set of 
content across many sites, it would manage the mirroring of MANY 
different sets of content at ONE particular site. On top of this it 
could greatly simplify the creation and management of yum repositories 
from this mirrored content as well as from other local content sources. 
(Today this involves some annoying manual steps.)
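
To make the "many upstreams, one site" idea a bit more concrete, here is a rough sketch in Python. It is not Pulp code: the Upstream class, the render_repo_file function, and all of the example.com URLs and paths are made up for illustration. It only shows how a list of mirrored sources could be turned into yum .repo stanzas that point clients at the local mirror.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Upstream:
    """One upstream content source mirrored at this site (hypothetical model)."""
    repo_id: str      # short identifier, used as the yum repo section name
    name: str         # human-readable description
    source_url: str   # where the content is synced from (assumed URL)
    local_path: str   # where the mirrored copy lives under the mirror root


def render_repo_file(upstreams, mirror_base_url):
    """Render yum .repo stanzas that point clients at the local mirror."""
    stanzas = []
    for up in upstreams:
        baseurl = mirror_base_url.rstrip("/") + "/" + up.local_path.strip("/")
        stanzas.append(
            f"[{up.repo_id}]\n"
            f"name={up.name}\n"
            f"baseurl={baseurl}\n"
            "enabled=1\n"
        )
    return "\n".join(stanzas)


# Example data: every name and URL below is invented for this sketch.
upstreams = [
    Upstream("vendor-drivers", "Hardware vendor drivers",
             "http://drivers.example.com/el5/", "mirrors/vendor-drivers"),
    Upstream("inhouse-apps", "In-house applications",
             "http://build.example.com/repo/", "mirrors/inhouse-apps"),
]
repo_text = render_repo_file(upstreams, "http://mirror.example.com/")
print(repo_text)
```

The sync step itself (pulling content down from source_url) is deliberately left out; the sketch only covers the bookkeeping side of organizing many sources in one place.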

CONTENT INVENTORY, ACCESS CONTROL, AND DELIVERY

We have also thought about Pulp as a way of managing which content gets 
to which systems and maintaining an inventory of which content is on which 
systems. For example, maybe using Pulp to get a list of which systems 
are allowed to connect to which repositories, and maybe on a more 
granular level, using Pulp to store black or whitelists of packages that 
the system is allowed to access. Or maybe using it to create a system 
whereby using some logical/policy statements you can create virtual yum 
repositories that compose content from many sources in a particular way 
and then control access to those.
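
As a rough illustration of the blacklist/whitelist and "virtual repository" ideas, here is a small Python sketch. Again, this is not how Pulp works or will work: the allowed and compose_virtual_repo functions, the shell-style glob patterns, and the package names are all assumptions made up for the example.

```python
from fnmatch import fnmatch


def allowed(package, whitelist=None, blacklist=()):
    """Return True if a package name passes the lists.

    A package is allowed when it matches no blacklist pattern and, if a
    whitelist is given, matches at least one whitelist pattern.
    """
    if any(fnmatch(package, pat) for pat in blacklist):
        return False
    if whitelist is None:
        return True
    return any(fnmatch(package, pat) for pat in whitelist)


def compose_virtual_repo(sources, whitelist=None, blacklist=()):
    """Merge package names from several source repos, applying the policy."""
    selected = set()
    for packages in sources.values():
        selected.update(p for p in packages if allowed(p, whitelist, blacklist))
    return sorted(selected)


# Example data: repo and package names are invented for this sketch.
sources = {
    "os-updates": ["kernel", "httpd", "httpd-devel"],
    "inhouse-apps": ["billing-app", "billing-app-debug"],
}
production = compose_virtual_repo(sources, blacklist=["*-devel", "*-debug"])
print(production)
```

Here the "policy" is just a pair of pattern lists; a real policy language could be much richer (per-system rules, versions, entitlements), which is part of what we'd like feedback on.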

The group at FUDCon seemed to care less about content access control and 
delivery, seeming to prefer letting their configuration management 
systems (eg cfengine) handle content access and delivery to systems and 
having Pulp stop at providing yum repos for these configuration 
management tools to access. I do think, from talking with several 
different types of Satellite and RHN users, that some folks may still be 
interested in content access control, but at this point it seems that 
repository creation and mirroring management is one area that both 
groups of people would find great value in.

DISCUSSION

Many of the folks subscribed to these lists are seasoned Linux system 
engineers, system administrators, and/or release engineers for software 
content, so we would love to hear some of your thoughts on what problem 
areas you'd like to see addressed by free and open source management 
tools like Pulp. If you have any thoughts on the following topics, or on 
related ones not mentioned here, let's discuss them here and see if we 
can figure out the best way to make Pulp useful for you:

- Do you host internal mirrors of external content? What kind of 
content? How many mirrors? Do you have mirrors available for multiple 
geographic locations within your organization?

- How many different 'upstream' sources of content need to be made 
available for systems at your organization? Hardware drivers from 
hardware vendors? Operating systems from OS vendors or from FOSS repos? 
Non-FOSS proprietary applications from application vendors? In-house 
application/software development teams?

- How often do you pull down content ('sync' maybe could be a term) from 
these different upstream content sources?

- How do you organize all of the software content that is delivered to 
your systems right now? What are the strengths you've found to your 
approach today? What are the weaknesses you'd like to address?

- How much customization/general 'mucking' do you do with the content 
you pull down from various sources? Are you more interested in simply 
making all the content available or do you have requirements for 
modifying/customizing it as well?

- If you do customize the content, to what extent do you need to do 
this? Branding? Localization? Etc.?

- How strict are your policies for which systems have access to which 
kind of content? Is access completely open, or is it constrained by 
which system owners have purchased licenses/entitlements to which 
content? Is access constrained by security concerns? Is access 
constrained by stability concerns (e.g., production systems must never 
be able to have development-level content deployed to them)?

- What kind of requirements do you have for producing data about which 
systems had which content installed when, if any?

- How many different environments do you manage content for? Do you 
manage content for development / qa / production environments?

- How do you prefer to deploy content to systems? Do you prefer to have 
a software management tool to do that or do you prefer to tie this into 
a configuration management tool?

- At what level of granularity do you perform software-management 
related tasks on your systems? For example, do you find yourself most often:
   - automatically selecting and deploying content to many systems at 
once in a uniform fashion
   - automatically selecting and deploying content to smaller groupings 
of systems with carefully defined templates
   - manually selecting and deploying content to many systems at once
   - manually selecting and deploying content to individual systems 
one-by-one
   What level of importance does each of these abilities have to you?

SHAMELESS PLUG

Pulp is an open project, so stop by the mailing list (cc'ed :) ) to say hi! 
Feedback, bug reports, ideas, and patches are always welcome. :)

Thanks,
~m and the Pulp Team :)

[1] https://fedorahosted.org/pulp

[2] Notes available here: 
https://fedorahosted.org/pulp/wiki/FudConOhEightNotes



