What is an open source upstream?

October 16, 2020 (4-minute read)

Former editorial director

At Red Hat we talk a lot about the "upstreams" of our products, but that term may not be well known outside of open source and developer communities. In this post, we'll take a quick look at what an upstream is, how it relates to enterprise open source products, and why upstreams matter to your organization.

What is an upstream?

Within information technology, the term upstream (and the related term "downstream") refers to the direction in which data flows. An upstream in open source is the source repository and project where contributions happen and releases are made. The contributions flow from upstream to downstream.

An upstream is usually the precursor to other projects and products. One of the best-known examples is the Linux kernel, which is an upstream project for many Linux distributions. Distributors like Red Hat take the unmodified (often referred to as "vanilla") kernel source, add patches and an opinionated configuration, and then build the kernel with the options they want to offer their users.
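As a rough illustration of what "an opinionated configuration" means, the sketch below layers a distributor's choices on top of upstream defaults. The option names and values are hypothetical placeholders, not real kernel options or Red Hat's actual settings.

```python
# Hypothetical build options, for illustration only.
upstream_defaults = {"FEATURE_A": "y", "FEATURE_B": "n", "DEBUG_INFO": "n"}

# The distributor's opinionated choices (also hypothetical).
distro_overrides = {"FEATURE_B": "m", "DEBUG_INFO": "y"}


def effective_config(defaults, overrides):
    """Return the configuration used for the downstream build.

    Options set by the distributor win; anything the distributor
    does not mention keeps the upstream default.
    """
    return {**defaults, **overrides}


config = effective_config(upstream_defaults, distro_overrides)
for option, value in sorted(config.items()):
    print(f"{option}={value}")
```

The upstream defaults are never edited in place. The downstream build only records where it differs, which keeps the differences from vanilla easy to see.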

In some cases, users get releases or code directly from the upstream. Windows and macOS users who run Firefox, as one example, generally get their software releases directly from Mozilla rather than through a third party. Linux users, on the other hand, often get Firefox packaged for their distribution, usually with a few changes to the release's configuration to better integrate Firefox with their desktop environment or otherwise make it more suitable for the distribution.

In some cases, a project or product might have more than one upstream. Red Hat Enterprise Linux (RHEL) releases are based on Fedora Linux releases. The Fedora Project, in turn, pulls from many upstream projects to create Fedora Linux, like the Linux kernel, GNOME, systemd, Podman, various GNU utilities and projects, the Wayland and X.org display servers, and many more.

The Fedora Project releases a new version of Fedora roughly every six months. Periodically, Red Hat will take a Fedora Linux release and base a RHEL release on that. Rather than starting from scratch with the vanilla sources for the Linux kernel, GNOME, systemd, and the rest, Red Hat starts with the Fedora sources for these projects and utilities. That makes Fedora an upstream of RHEL, with the originating projects as a further upstream. Fedora is downstream of these projects and RHEL is downstream of Fedora.
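The chain described above can be pictured as a small graph. The following is a minimal sketch, assuming a partial list of Fedora's upstreams taken from the examples in this post; the names `UPSTREAMS`, `all_upstreams`, and `all_downstreams` are made up for illustration.

```python
# Each project maps to the projects it pulls from directly.
# The Fedora entry is a partial list based on the examples in this post.
UPSTREAMS = {
    "RHEL": ["Fedora Linux"],
    "Fedora Linux": ["Linux kernel", "GNOME", "systemd", "Podman"],
}


def all_upstreams(project, graph=UPSTREAMS):
    """Return every direct and indirect upstream of a project."""
    found = []
    queue = list(graph.get(project, []))
    while queue:
        current = queue.pop(0)
        if current in found:
            continue  # already visited through another path
        found.append(current)
        queue.extend(graph.get(current, []))
    return found


def all_downstreams(project, graph=UPSTREAMS):
    """Return every project that has `project` somewhere upstream of it."""
    return [name for name in graph if project in all_upstreams(name, graph)]


print(all_upstreams("RHEL"))
print(all_downstreams("Linux kernel"))
```

Walking the graph in one direction lists everything a product builds on. Walking it the other way lists everything that stands to benefit when work lands in a given upstream.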

Why are upstreams important?

Upstreams are important because that's where the source contributions come from, obviously, but there's much more to it than that. Each upstream is unique, but generally the upstream is where decisions are made, where contributions happen, and where the community for a project comes together to collaborate for the benefit of all parties. Work done at the upstream might flow out to many other open source projects.

The upstream is the focal point where collaborators do the work. It's far better if all the contributors work together rather than, say, contributors from different companies working on features behind closed doors and then trying to integrate them later.

The upstream is also a fixed place where (if we’re talking about creating code) developers can report bugs and security vulnerabilities. If a bug or security flaw is fixed upstream, then every project or product based on the upstream can benefit from that work. (Typically users will report problems to the project or vendor they received the code from, so it's up to developers to check those out and carry things back upstream if the bug or flaw originated there.)

Upstream first

Because the upstreams are so important, Red Hat has a longstanding practice of doing work upstream first and trying to get features and patches accepted upstream rather than just building them directly in our own products.

The reasons for this are many. First, it's just good open source citizenship to do the work side-by-side with the rest of the community and share our work with the communities from which we're benefitting.

By working upstream first, you have the opportunity to vet ideas with the larger community and work together to build new features, releases, content, etc. The features or changes you want to make may have an impact on other parts of the project. It's good to find these things out early and give the rest of the community an opportunity to weigh in.

Second, doing the work upstream first is the more pragmatic choice. It can sometimes be faster to implement a feature in a downstream project or product, especially if there are competing ideas about the direction of a project, but it's usually more work in the long run to carry those patches back to the project. By the time a feature has shipped in a downstream, there's a good chance that the upstream code has changed, making it harder to integrate patches developed against an older version of the project.
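To make the drift problem concrete, here is a much-simplified stand-in for what patch tools do: a change is described by the lines it expects to find and the lines that replace them. The function `apply_patch` and the sample code are hypothetical, and real tools are more forgiving about surrounding context than this sketch.

```python
def apply_patch(source_lines, old_lines, new_lines):
    """Replace the first occurrence of old_lines with new_lines.

    Raises ValueError when the expected lines are not present,
    which is what happens once the upstream code has moved on.
    """
    size = len(old_lines)
    for start in range(len(source_lines) - size + 1):
        if source_lines[start:start + size] == old_lines:
            return source_lines[:start] + new_lines + source_lines[start + size:]
    raise ValueError("expected lines not found; the code has changed")


# The older release that the downstream patch was written against.
old_release = ["def greet(name):", "    return 'Hello ' + name"]

# Upstream has since reworked the same function.
current_upstream = [
    "def greet(name, greeting='Hello'):",
    "    return f'{greeting} {name}'",
]

# The downstream change: add punctuation to the greeting.
expected = ["    return 'Hello ' + name"]
replacement = ["    return 'Hello, ' + name + '!'"]

# Applies cleanly to the version it was developed against.
patched_release = apply_patch(old_release, expected, replacement)

# Fails against current upstream and has to be reworked by hand.
try:
    apply_patch(current_upstream, expected, replacement)
    applies_to_current = True
except ValueError:
    applies_to_current = False

print(patched_release)
print(applies_to_current)
```

The longer a change lives only downstream, the more likely it is that the lines it expects no longer exist upstream.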

If the features or patches aren't accepted upstream for some reason, then a vendor can carry those separately. That's one of the benefits of open source: you can modify and distribute your own version (within the terms of the license, of course) that meets your needs. Carrying those changes may be more work in the long run, but sometimes there's a good reason to diverge from upstream. If there isn't, there's no point in incurring more work than needed.

From upstream projects to downstream products

Much of the time, users don't want to get code directly from the upstream. In the olden days of Linux (think late 90s and early 2000s), it was fairly common to compile code from source if you wanted to use something like Apache httpd or even the Linux kernel. Just bought a new laptop with a modem and sound card? Time to configure and compile the kernel!

Those days are pretty much behind us. Sure, you can compile code and tweak software configurations if you want to, but most of the time, users don't want to. Organizations generally don't want to either; they want to rely on certified products that they can vet for their environment and get support for. This is why enterprise open source exists. Users and organizations count on vendors to turn upstreams into coherent downstream products that meet their needs.

In turn, vendors like Red Hat learn from customer requests and feedback which features customers need and want. That knowledge benefits the upstream project in the form of new features and bug fixes, which ultimately find their way into products, and the cycle continues.

About the author

Joe Brockmeier

Former editorial director

Joe Brockmeier is the editorial director of the Red Hat Blog. He also acts as Vice President of Marketing & Publicity for the Apache Software Foundation.

Brockmeier joined Red Hat in 2013 as part of the Open Source and Standards (OSAS) group, now the Open Source Program Office (OSPO). Prior to Red Hat, Brockmeier worked for Citrix on the Apache CloudStack project, and was the first openSUSE community manager for Novell from 2008 to 2010.

He also has an extensive history in the tech press and publishing, having been editor-in-chief of Linux Magazine, editorial director of Linux.com, and a contributor to LWN.net, ZDNet, UnixReview.com, and many others.