
Re: [Pulp-list] Software Content Management (Introducing Pulp)

Hey all,

This is Chris Murphy from Castle Branch. Glad to see some traffic on this after FUDCon. As I said there, I have very little experience with the Python side, but I do manage a couple of sets of repos for our different distributions. I thought I would fill this out as a way of getting back into the project, and I look forward to seeing it progress.

Just a quick overview of the repo management we currently have in place. This example is for our desktops, which use Fedora and can use all public mirrors. When I started here they were using yam from Dag's repos, which I moved to a newer server and upgraded to his new version, renamed mrepo. The biggest problem I had with it was the whole symlinking setup. I didn't see the need for soft links from /var/mrepo to /var/www/html/mrepo, so I had our intern strip out all of the symlink code so that it would just sync directly to a directory defined in the configuration file. We then added a few defaults for mirroring, like bandwidth usage and exclude dirs for rsync and lftp. Then we set up a few extra options:

   def version(self):
       print MYNAME+' '+MYVERSION
       print 'Written by Dag Wieers <dag wieers com>'
       print 'Rewritten by Tyler Gates <tjgates castlebranch com>'
       print 'platform '+os.name+'/'+sys.platform
       print 'python '+sys.version

   def usage(self):
       print 'usage: '+MYNAME+' [options] [--repo=dist1,[dist2-arch ..]]'

   def help(self):
       print '''Set up a distribution server

 -h, --help                      show help message and exit
 -c, --config=file               specify alternative configfile
 -s, --sync                      sync against mirrors
 -g, --generate                  generate metadata from synced mirrors
 -e, --regenerate                regenerate metadata from synced mirrors
 -u, --update                    update metadata from synced mirrors
 -r, --repo=repo1,repo2          target repos
 -i, --include=mirror1,mirror2   include only mirror(s)
 -e, --exclude=mirror1,mirror2   exclude mirror(s)
 -d, --dry                       do a dry run
 -v, --version                   show version and exit
'''

The reason was so that we could specify the createrepo options more easily using just one flag, i.e.:
-g -> default_generate_metadatacmd = 'createrepo -d -v -p REPO_DIR'
-e -> default_update_metadatacmd   = 'createrepo -d -v --update -p REPO_DIR'
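
For anyone who wants to see the idea in isolation, here is a minimal sketch of how a single flag could select a createrepo command template and fill in the REPO_DIR placeholder. It is not the code from our modified mrepo; the dictionary name and the function name are made up for illustration, and the update template is written with createrepo's long option --update.

```python
import shlex

# Command templates modelled on the two defaults quoted above.
# REPO_DIR is the placeholder that gets swapped for the real directory.
# METADATA_COMMANDS and build_metadata_command are hypothetical names.
METADATA_COMMANDS = {
    "generate": "createrepo -d -v -p REPO_DIR",
    "update": "createrepo -d -v --update -p REPO_DIR",
}


def build_metadata_command(action, repo_dir):
    """Return the argv list for the chosen action with REPO_DIR filled in."""
    argv = shlex.split(METADATA_COMMANDS[action])
    # Replace only whole arguments equal to the placeholder, so a path that
    # happens to contain the text REPO_DIR is left alone.
    return [repo_dir if arg == "REPO_DIR" else arg for arg in argv]


if __name__ == "__main__":
    print(build_metadata_command("update", "/var/www/html/mrepo/f7-i386"))
```

Building an argument list rather than one shell string means the result can be handed straight to subprocess.run without worrying about quoting directory names.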

I think mrepo had this originally, but I know that the createrepo options in /etc/mrepo.conf were getting overwritten somewhere, because I couldn't get it to create useful metadata after Fedora 4 (we jumped from FC4 to F7, F8...). Lastly, we added a place in the configuration file (taken from mrepo.conf) for the path to the comps file that we wanted to associate with that repo; it gets appended to the createrepo call:

comps_file = /svn/workstations/fedora/7/i386/comps/comps-f7-desktops.chris.02-25.xml
everything = rsync://mirror.anl.gov/fedora/linux/releases/7/Fedora/i386/os/Fedora/
updates = rsync://fedora.mirror.iweb.ca/fedora/updates/7/i386/
livna = rsync://livna.cat.pdx.edu/rpm.livna.org-fedora/7/i386/
custom = file:///repodata/fedora/7/i386/stable/custom/
custom-testing = file:///repodata/fedora/7/i386/stable/custom-testing/
remi = http://iut-info.ens.univ-reims.fr/remirpms/fc7.i386
dries = ftp://ftp.pbone.net/mirror/dries.studentenweb.org/apt/fedora/fc7/i386/RPMS.dries/
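
As a rough illustration of how a section like the one above could be consumed, the sketch below reads the key = value lines, separates comps_file from the mirror URLs, picks a sync tool from the URL scheme, and builds an rsync command with a bandwidth limit and exclude patterns. All function names are hypothetical, the scheme-to-tool mapping (rsync for rsync:// and file://, lftp for everything else) is my assumption rather than what our script does, and -g is createrepo's group file option. The rsync command deliberately has no --delete.

```python
from urllib.parse import urlparse


def parse_repo_section(text):
    """Split 'key = value' lines into (comps_file, {mirror_name: url})."""
    comps_file = None
    mirrors = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        key, _, value = line.partition("=")  # split on the first '=' only
        key, value = key.strip(), value.strip()
        if key == "comps_file":
            comps_file = value
        else:
            mirrors[key] = value
    return comps_file, mirrors


def sync_tool(url):
    """Assumed mapping: rsync for rsync:// and file://, lftp otherwise."""
    scheme = urlparse(url).scheme
    return "rsync" if scheme in ("rsync", "file") else "lftp"


def rsync_command(source, dest, bwlimit_kbps=None, excludes=()):
    """Build an rsync argv with an optional bandwidth cap and excludes."""
    argv = ["rsync", "-a"]
    if bwlimit_kbps:
        argv.append("--bwlimit=%d" % bwlimit_kbps)
    argv.extend("--exclude=" + pattern for pattern in excludes)
    argv.extend([source, dest])
    return argv


def add_groupfile(createrepo_argv, comps_file):
    """Insert '-g comps_file' before the final argument (the repo dir)."""
    return createrepo_argv[:-1] + ["-g", comps_file, createrepo_argv[-1]]
```

The destination directory, the bandwidth figure, and the exclude patterns would come from the defaults in the configuration file; they are left as parameters here.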

So that's a basic outline of our very KISS style setup.

Here are the answers to the questions:

Hi folks,

As some of you who have participated in them already know, over the past couple of years or so Red Hat has been conducting some studies on how folks manage their systems using the Red Hat Network and Satellite products. We've learned a lot about the processes many of you have established for managing your systems and the strengths and weaknesses of the RHN products in supporting those processes. In addition to this, it is also clear that the free and open source management tools available for Fedora, RHEL, and CentOS (as well as other *nixes) don't sufficiently cover some of the areas of need that the current Satellite product addresses.

Over the past few weeks some Red Hat folks and Fedora community members have been working on a free and open source project that will not only attempt to fill one of the gaps in free & open source systems management tools, but also to take some of the things we've learned from talking with Satellite and RHN customers and improve upon how we could address one area of systems management. This project is called 'Pulp', and its scope is centered around the management of software content. From the Pulp Fedora project website [1]:

"Pulp is an application for managing the software installed on your systems. Suppose you want to control what machines on your network get what software updates, to establish testing/stage repositories, to mirror 3rd party content, to create your own repositories, or to add new content to existing repositories. Pulp will provide an easy web, web-services, and command line interface for managing all of this."


To start, we were thinking that Pulp could be a way of improving upon the custom channel management capabilities of Satellite, using yum repositories instead of RHN channels. Last month Michael DeHaan hosted a discussion introducing Pulp at FUDCon in Raleigh [2]. What we have taken away from the participants of that session is that they would like to see less emphasis on system <=> content mapping, and would like tools that focus on mirroring contents from many different 'upstream' sources and organizing them neatly in one place. It could be kind of like mirror manager, except instead of managing the mirroring of a particular set of content across many sites, it would manage the mirroring of MANY different sets of content at ONE particular site. On top of this it could greatly simplify the creation and management of yum repositories from this mirrored content as well as from other local content sources. (This today has some annoying manual process involved.)


We have also thought about Pulp as a way of managing which content gets to which systems and maintaining an inventory of which content is on which systems. For example, maybe using Pulp to get a list of which systems are allowed to connect to which repositories, and maybe on a more granular level, using Pulp to store black or whitelists of packages that the system is allowed to access. Or maybe using it to create a system whereby using some logical/policy statements you can create virtual yum repositories that compose content from many sources in a particular way and then control access to those.

The group at FUDCon seemed to care less about content access control and delivery, seeming to prefer letting their configuration management systems (eg cfengine) handle content access and delivery to systems and having Pulp stop at providing yum repos for these configuration management tools to access. I do think, from talking with several different types of Satellite and RHN users, that some folks may still be interested in content access control, but at this point it seems that repository creation and mirroring management is one area that both groups of people would find great value in.


Many of the folks subscribed to these lists are seasoned Linux system engineers, system administrators, and/or release engineers for software content, so we would love to hear some of your thoughts on what problem areas you'd like to see addressed by free and open source management tools like Pulp. If you have any thoughts on the following topics or others that are related but maybe not mentioned here, please let's discuss them here and see if we can figure out the best way to make Pulp useful for you!:

- Do you host internal mirrors of external content? What kind of content? How many mirrors? Do you have mirrors available for multiple geographic locations within your organization?

Yes, all synced from public mirrors, and the local mirrors are set up for two separate offices.

- How many different 'upstream' sources of content need to be made available for systems at your organization? Hardware drivers from hardware vendors? Operating systems from OS vendors or from FOSS repos? Non-FOSS proprietary applications from application vendors? In-house application/software development teams?

- How often do you pull down content ('sync' maybe could be a term) from these different upstream content sources?

Monthly, into the unstable yum repos. Once unstable is ready to be used, it's hard linked to stable. I forgot one part in the intro: mrepo had a tendency to randomly delete content not on the source during rsync runs, which would cause havoc with the hard-linked RPMs. (Just for clarification, and so that I don't cause any static: I didn't spend a lot of time looking at the mrepo code, so the "problems" I've cited are most likely RTFM errors on my part.)

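To make the unstable-to-stable step concrete, here is a small sketch of promoting a tree by hard linking. It is only an illustration under my own assumptions (the function name and the rule of linking only .rpm files are made up), not the script we actually run. Both directories have to live on the same filesystem for hard links to work.

```python
import os


def promote(unstable_dir, stable_dir):
    """Hard-link every .rpm under unstable_dir into stable_dir.

    The directory layout is preserved, existing files in stable_dir are
    left untouched, and the list of newly created links is returned.
    """
    linked = []
    for root, _dirs, files in os.walk(unstable_dir):
        rel = os.path.relpath(root, unstable_dir)
        target_root = os.path.normpath(os.path.join(stable_dir, rel))
        os.makedirs(target_root, exist_ok=True)
        for name in sorted(files):
            if not name.endswith(".rpm"):
                continue
            dst = os.path.join(target_root, name)
            if not os.path.exists(dst):
                # A hard link is a second name for the same file,
                # so the package is not copied.
                os.link(os.path.join(root, name), dst)
                linked.append(dst)
    return linked
```

Metadata would still need to be regenerated for the stable tree afterwards, since only the packages themselves are linked.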
- How do you organize all of the software content that is delivered to your systems right now? What are the strengths you've found to your approach today? What are the weaknesses you'd like to address?

Mostly through the comps.xml, kickstart and yum.conf.

- How much customization/general 'mucking' do you do with the content you pull down from various sources? Are you more interested in simply making all the content available or do you have requirements for modifying/customizing it as well?

Almost none at all. If we do, it goes into the "custom" repo and we maintain and pull down updates manually.

- If you do customize the content, to what extent do you need to do this? Branding? Localization? Etc.?

Mostly these are cpan2rpm packages that we create, plus a few purchased RPMs.

- How strict are your policies for which systems have access to which kind of content? Is access completely open, is access constrained by which system owners have purchased licenses/entitlements to which content? Is access constrained by security concerns? Is access constrained by stability concerns (e.g., production systems must never be able to have development level content deployed to them?)

- What kind of requirements do you have for producing data about which systems had which content installed when, if any?

This is a definite area of weakness. I'm using OCS Inventory right now because it's the easiest and it can also be used to update the few Windows boxes we have. It's not perfect, but since we have the comps groups clearly defined and OCS has good filtering for deployment, it's as easy as, say, selecting SalesDept and deploying the command yum -y groupupdate <group>. That is why I wanted to be able to easily toggle the --update flag to regenerate groups and add packages often, which is especially critical in the development repos. OCS generates a graph of which hosts have completed the updates, which is consistently around 98-99% of them; the rest I do by hand.

- How many different environments do you manage content for? Do you manage content for development / qa / production environments?

All three.

- How do you prefer to deploy content to systems? Do you prefer to have a software management tool to do that or do you prefer to tie this into a configuration management tool?

Tied in with the answer above.

- At what level of granularity do you perform software-management related tasks on your systems? For example, do you find yourself most often:
  - automatically selecting and deploying content to many systems at once in a uniform fashion
  - automatically selecting and deploying content to smaller groupings of systems with carefully defined templates
  - manually selecting and deploying content to many systems at once

Yes, exclusively

  - manually selecting and deploying content to individual systems one-by-one
  What level of importance does each of these abilities have to you?


Pulp is an open project, stop by the mailing list (cc'ed :) ) to say hi! Feedback, bug reports, ideas, and patches are always welcome. :)

~m and the Pulp Team :)

[1] https://fedorahosted.org/pulp

[2] Notes available here: https://fedorahosted.org/pulp/wiki/FudConOhEightNotes

