Doug Ledford wrote:
I think you're missing some context. I've just read the message where you say you haven't been on this list for long. There have been three or four discussions on this list, fedora-infrastructure-list (and possibly fedora-advisory-board) where we have talked about doing exploded trees. The basic idea I laid out for that was importing tarballs into a distributed SCM and making our changes against those imports (rather than checking patch files into the SCM). In cases where our SCM matched up against upstream's SCM we could start with upstream's SCM as the basis, import the tarballs into a branch and have better merging in those cases.

On Fri, 2008-07-11 at 17:37 -0700, Toshio Kuratomi wrote:
Firstly: what is your overall idea? Is it exploded trees as jcollie and I were arguing for at one time or is it mirroring of upstream repos onto Fedora servers?

It's both. You first have to support exploded source repos to make the rest of this worth anything. However, part of *truly* supporting an exploded source repo is making that repo available INSTEAD OF srpms. In other words, fulfill our legal obligation to provide source for a package via the source repo instead of via an srpm.
I think you're taking a different approach:
1) Support multiple different upstream SCMs (I've been thinking about taking this up since we are doing this for Fedora Hosted. However, doing this for packaging SCMs is a bigger job and is more in the critical path.)
2) Clone upstream's repo to our infrastructure.
3) Then do all of our work within a branch (or branches) of that repository.
A) If upstream doesn't use a supported SCM, then continue to do what we do now with tarballs or import snapshots of upstream work (you don't specify where these snapshots come from... tarballs, their SCM, etc... I think your position would be "any of the above")
(Make corrections if I'm wrong in what you're thinking)

[snip stuff I think I summarized above]
Go into more depth about the specifics of what you've thought out; otherwise people don't know what the issues and solutions are going to be.

The first issue is simply supporting an exploded source repo. An exploded source repo really only requires a few things. First, you no longer need a %setup or %patch portion in the spec file.
I'd think we'd want to extend these rather than get rid of them.
1) We'll continue to have some packages that don't use supported revision control so we'll have to release for them from tarball.
2) We have cases where we release a package with multiple source tarballs; we'll also want the ability to release a package with multiple source SCM repos.
3) We want some way to produce different sets of changes for distinct features. With the current system we use multiple patch files for this. Under an SCM system I'd use distinct feature branches which we'd need to represent. (There could be other ways to represent this, though -- ideas?)
Second, you treat things differently in that sourcedir, specdir, and builddir are all one and the same.
This is probably true but should mention that we're adding repo locations into the mix. The repo locations take the place of sourcedir.
Finally, since you built the binary packages from this exploded source repo, then in order to give people the exact sources you built from, you need to make the repo available for clone/checkout by people.
We'll have to be careful here. SCMs that we support will need to have the ability to remove changesets from their history in case we have to remove something that's illegal to distribute from the history. We'll also need to be able to process any merges from upstream before they hit our repository.
I'm not sure how this fits into the feature branches talked about earlier either. We'd have to be able to tag a set of branches as belonging to a certain build or have a release branch that gets built up from the feature branches. Maybe this build-up would be done prior to getting to the rpm stage so that the rpm only has to deal with a single repository... but we don't want to encourage our developers to make single monolithic patches (one of the main flaws with Debian's dpkg format that they've tried to address with various add-ons to the base tools.)
You need never once build an srpm or tarball from this repo if you don't want to (and in fact, an srpm wouldn't build from the same spec file as an exploded source repo spec file unless you conditionalized the spec to know if it was in an srpm or in its native exploded source repo format).
I'm not certain we'd want to go this route. It seems like it would be easy to generate tarballs when building the package so it would be easy to generate SRPMs as well (although rpmbuild would have to be taught that SCM-referencing spec files could use a local cache tarball instead of checking out source). If this is easy, then the question becomes whether SRPMs are useful. SRPMs are mirrorable. They are buildable on networks that can't contact the internet. They can be signed and re-signed by local admins. They can be put into repositories that yum understands and you can find all their build dependencies by using that repository information.
Also, since we're going to have some things that aren't backed by a distributed SCM but by a tarball, we're still going to need a way to distribute those sources. It'll just spawn confusion to have SRPMs for part of the distro and a separate method for the rest.
[snip cool stuff that I've mentioned before as well]
You're straying into git specifics here. Different upstream SCMs will give us different abilities. We'll need to figure out what things, like this, are important to us, and then make abstracted commands that implement the backend magic to talk to them all in native syntax. For other things, we'll have to decide on lowest-common-denominator.

This is where I point out that Jesse's email I responded to about the upstream RPM devel cluttering up fedora's devel branch, the one where I said he wasn't imaginative at all in terms of branching, is a perfect example. Panu mentioned he was pulling the new rpm from the upstream git repo. We would simply clone that. In the process, our official repo would have a list of references to the remote, upstream repo's branches. These branches are inviolate by us. We can never change them; they simply are a copy of upstream's metadata. We can, however, create our own branches.

In fact, the standard modus operandi in a case like this would be to clone upstream, then create tracking branches in our repo that show us upstream's branches (because we don't see anything but master from upstream by default), then create our own branches (so upstream has its own devel branch, usually just named master, and we could create our own branch named fedora-devel that would be our primary devel branch, then as we approach a release we can branch from fedora-devel to f-8, f-9, etc), and then we simply merge or don't merge from upstream to our devel branch as we see fit. For things where we want to follow upstream, we can actually configure fedora-devel to automatically merge any new changes from upstream's master branch any time we do a pull (in fact, you can do this on a per-branch basis: any given branch can be told to automatically merge changes from another branch into it, or it can be a more static branch that doesn't auto-merge anything).
Had this been the case, then merely setting the fedora-devel branch to not automerge from the remote (upstream) devel branch would have resulted in all of the auto-rebuilds and things like that working just fine on the fedora-devel branch as Jesse mentioned needed to happen, but it would have let us see the changes going on in the remote tracking branches and everyone who bothered to update their rpm repo would see those changes on those remote branches and know something was up.
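The branch layout described above can be sketched as a toy model. This is only an illustration in Python, not git and not anything that exists in Fedora's infrastructure: the PackageRepo class, its method names, and the "upstream/" prefix are all hypothetical, and commits are just strings. It shows read-only tracking branches, our own branches created from them, and a per-branch automerge setting that is applied on pull.

```python
class PackageRepo:
    """Toy model (hypothetical): upstream tracking branches plus our own branches."""

    def __init__(self, upstream):
        # Tracking branches are copies of upstream's branch histories.
        self.tracking = {"upstream/" + n: list(c) for n, c in upstream.items()}
        self.branches = {}   # our own branches: name -> list of commits
        self.automerge = {}  # our branch -> ref it merges from on every pull

    def _history(self, ref):
        if ref in self.tracking:
            return self.tracking[ref]
        return self.branches[ref]

    def create_branch(self, name, start, automerge_from=None):
        if name.startswith("upstream/"):
            raise ValueError("the upstream/ namespace is reserved for tracking branches")
        self.branches[name] = list(self._history(start))
        if automerge_from is not None:
            self.automerge[name] = automerge_from

    def commit(self, branch, change):
        # Tracking branches are inviolate: only pull() may update them.
        if branch in self.tracking:
            raise PermissionError("tracking branches are read-only")
        self.branches[branch].append(change)

    def merge(self, target, source):
        # Append every commit from source that target does not have yet.
        have = set(self.branches[target])
        new = [c for c in self._history(source) if c not in have]
        self.branches[target].extend(new)

    def pull(self, upstream):
        # Refresh the tracking branches, then merge only where automerge is set.
        for name, commits in upstream.items():
            self.tracking["upstream/" + name] = list(commits)
        for branch, source in self.automerge.items():
            self.merge(branch, source)


upstream = {"master": ["u1", "u2"]}
repo = PackageRepo(upstream)
repo.create_branch("fedora-devel", "upstream/master")  # automerge left off
repo.create_branch("f-9", "fedora-devel")
repo.commit("fedora-devel", "fedora-fix")

upstream["master"].append("u3")  # upstream moves on
repo.pull(upstream)
before_merge = list(repo.branches["fedora-devel"])  # upstream's change is visible but not merged

repo.merge("fedora-devel", "upstream/master")  # merge when we see fit
```

With automerge left off, the pull updates only the tracking branch, so fedora-devel keeps building as before while the new upstream commit is visible on upstream/master. Passing automerge_from="upstream/master" when creating fedora-devel would make the same pull merge it in.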
Also, this is unfortunately upstream policy dependent as well. I've worked with some upstreams recently who use DCVS to make project management spaghetti :-(. Nothing we can do about that when upstream doesn't see that they have a problem.
[Cut more cool stuff I've mentioned before]
One note: None of this is tied to SRPMs. SRPMs are a product of the build just as RPMs are. The system we're criticizing and trying to extend is really how we store and manage the inputs to the build system. (We've been calling this dist-cvs. The work that Jesse did earlier on git and hg was dist-git and dist-hg, as they were just cloning the cvs process into the distributed SCMs. Exploded trees are different from that work.)

Really, there are all sorts of reasons to use exploded source repos, to join our own development efforts in with upstream and to hook our source systems together. In the end though, it all boils down to this. Some people are comfortable with and want to keep using srpms and our current disconnected SCM methodology, and some people want another choice. I'm perfectly fine with other people not wanting to change. They don't have to. But I would prefer to be granted the ability to modernize my own way of working should I choose to do so. And this is a big part of that.
This has been more of a sales pitch than anything to be honest. If you want to know more about what I had in mind for nuts and bolts changes to rpm, then I'm attaching a tar.gz of my ~/.tomboy directory. As I was working on things, I just made notes (I really like Tomboy now). Move your own .tomboy out of the way if you have anything you'd like to save, then unpack mine in place, restart tomboy, and start reading from the note titled "Enabling optimal SCM usage in Fedora". Everything is linked from that one note. Of course, I was really only a little ways in. I was still concentrating on the rpm changes and hadn't touched on build system changes, or repo server changes, or access controls with different SCMs, or any of that stuff. And what I *had* accomplished in terms of rpm knowledge is now at least somewhat wrong given the rpm update.
I'll try to get a chance to look at this but.... /me wishes tomboy had an export to treeview function

-Toshio