Default Groups, Default Packages, CD Sets and DVD Size

Jeroen van Meeuwen kanarip at kanarip.com
Mon Aug 11 15:40:26 UTC 2008


Hi there,

there are a number of issues which you may or may not agree are
problems, but which are very real concerns to me. I'll first try to
explain what the issue is:

The CD sets in Fedora 10 Alpha, but also the Fedora 8 and 9 Re-Spins,
require all discs in order to be installed. I say installed, but what I
really mean is "next, next, next, finish". This includes, e.g., the
"Office and Productivity" shortcut to the office, games and
sound-and-video comps groups of packages.

This problem may need a little more detailed description of 1) how CD
sets are composed, and 2) how the installation procedure selects
groups/packages to install in a default install.

FWIW, when I use the term "default install" in this email, it means
"next, next, finish" without checking or unchecking any package or group
related settings.

Also, FWIW, I'm leaving out many details of the compose and
installation logic, so that only the bare flow of logic related to the
problem set forth remains (or at least I'm attempting to do so).

== The compose ==

Compose tools get their input via a kickstart package manifest, which
basically describes what groups and packages should be available during
the installation. They select all the relevant groups and packages,
resolve the dependencies (inclusive or exclusive, see PS.), put these
in a tree and may or may not create a DVD from that tree. In the case of
CD sets however, the packages need to be split over multiple discs, and
here's how that works (the process is called package ordering, for those
of you who didn't know that already):

From a list of groups, the compose tools first select @core and @base
since these are most commonly (i.e. "always") installed. They resolve
dependencies for the packages selected (i.e. the mandatory and default
packages in the groups @core and @base). The transaction is ordered and
the order in which yum will want to install the packages is spit out.

Then, another set of groups is selected, depsolved, ordered as if it
were for a real transaction, and spit out in the correct order.

This continues until all groups the compose tool knows about are 
selected and spit out, and the package ordering process will then spit 
out "the remainder of packages".
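
To make the flow above a bit more concrete, here is a minimal sketch of
such a pass-by-pass ordering. It is a toy model, not the compose tools'
code: the package names, the REQUIRES table and the GROUP_PASSES list
are made up for illustration, and the simple depth-first walk only
stands in for yum's real transaction ordering.

```python
# Toy model of package ordering; NOT the actual compose tool code.
# REQUIRES and GROUP_PASSES are hypothetical data for illustration.
REQUIRES = {
    "bash": ["glibc"],
    "glibc": [],
    "yum": ["python", "rpm"],
    "python": ["glibc"],
    "rpm": ["glibc"],
    "gnome-session": ["gtk2"],
    "gtk2": ["glibc"],
    "openoffice-writer": ["gtk2", "java"],
    "java": ["glibc"],
    "nethack": ["glibc"],
}

GROUP_PASSES = [
    ["bash", "yum"],          # stands in for @core and @base
    ["gnome-session"],        # stands in for the next set of groups
    ["openoffice-writer"],    # ... and the set after that
]


def install_order(selected, already_emitted):
    """Depth-first walk that emits dependencies before their dependents."""
    order = []
    seen = set(already_emitted)

    def visit(pkg):
        if pkg in seen:
            return
        seen.add(pkg)
        for dep in REQUIRES[pkg]:
            visit(dep)
        order.append(pkg)

    for pkg in selected:
        visit(pkg)
    return order


def order_packages(passes, all_packages):
    """Run one ordering pass per group set, then append the leftovers."""
    emitted = []
    for selected in passes:
        emitted.extend(install_order(selected, emitted))
    # Whatever no pass pulled in goes last: "the remainder of packages".
    remainder = sorted(p for p in all_packages if p not in emitted)
    return emitted + remainder


ORDER = order_packages(GROUP_PASSES, REQUIRES.keys())
print(ORDER)
```

Packages early in the resulting list would land on the first disc(s),
and packages no pass asked for (nethack in this toy data) end up last.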

Note that the next generation of compose tools is going to change this 
package ordering process to match the behavior of the installation 
procedure, but I'll continue with more about that later.

== The installation procedure ==

By the time the installation procedure has gained the necessary input 
wrt. which packages to install, it has also selected a number of default 
groups (as well as "Office and Productivity" when performing a default 
installation). For this purpose, it uses the "default" (True/False) 
attribute in the available -possibly aggregated from multiple 
repositories, but not in this case- comps.xml. Given no additional input 
during the default installation, these are the groups selected to be 
installed[1] (from today's x86_64 rawhide):

  -> office
  -> admin-tools
  -> editors
  -> input-methods
  -> fonts
  -> text-internet
  -> gnome-desktop
  -> core
  -> base
  -> hardware-support
  -> games
  -> java
  -> base-x
  -> graphics
  -> dial-up
  -> printing
  -> sound-and-video
  -> graphical-internet

Resulting in a required RPM payload (on the media) of 2.88 GB, using 
exclusive dependency resolving.
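
For those unfamiliar with the "default" attribute, the following is a
rough sketch of how default groups can be read from a comps.xml file.
The XML fragment is hand-written and heavily trimmed, and the
default_groups() helper is a hypothetical name; the real
find-default-groups.py talks to yum rather than parsing the XML
directly, so treat this only as an illustration of the data involved.

```python
import xml.etree.ElementTree as ET

# Hand-written, heavily trimmed comps.xml fragment (illustration only).
SAMPLE_COMPS = """\
<comps>
  <group>
    <id>core</id>
    <default>true</default>
    <packagelist>
      <packagereq type="mandatory">bash</packagereq>
      <packagereq type="default">yum</packagereq>
      <packagereq type="optional">zsh</packagereq>
    </packagelist>
  </group>
  <group>
    <id>kde-desktop</id>
    <default>false</default>
    <packagelist>
      <packagereq type="mandatory">kdebase</packagereq>
    </packagelist>
  </group>
</comps>
"""


def default_groups(comps_xml):
    """Map each default group id to its mandatory and default packages."""
    root = ET.fromstring(comps_xml)
    result = {}
    for group in root.findall("group"):
        # Only groups flagged <default>true</default> are selected.
        if (group.findtext("default") or "").strip().lower() != "true":
            continue
        # Optional packages are not part of a default install.
        result[group.findtext("id")] = [
            req.text
            for req in group.findall("packagelist/packagereq")
            if req.get("type") in ("mandatory", "default")
        ]
    return result


DEFAULTS = default_groups(SAMPLE_COMPS)
for group_id, packages in DEFAULTS.items():
    print("  -> %s: %s" % (group_id, ", ".join(packages)))
```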

2.88 GB in RPMs spans 5 CDs. This means that a default installation of
Fedora Rawhide today, when using CDs, would require 5 discs minimum.

== However ==

However, the default installation requires all 7 CDs, because of how the
compose tools resolve dependencies (inclusive) during the compose of the
media, and pull in more than is minimally required to complete the
actual transaction of a default installation. The compose tools do so
for good reason:

- one cannot know which package is the user-preferred package for any
given required capability (e.g. for a fictitious capability
'web-client', there's firefox, iceweasel, elinks, wget, curl, emacs,
emacs, emacs, foo, bar and baz)

- one cannot predict what kind of installed system an upgrade is
performed on, and to be able to close the transaction on any of them
the media needs to carry more than one candidate, which justifies
inclusive dependency resolving when composing the media (set) or
installation tree.

- it makes the released media apply to N+X use-cases where the package
set or transaction payload during the installation is controlled in a
more granular fashion than the selection dialogs allow (by means of a
kickstart package manifest maybe?), which I guess applies more to
businesses or advanced users using Fedora than it does to Joe Average
users.

Basically what I'm saying is that the 2.88 GB is spread over more than
the minimal number of discs it would fit on, since the complete package
payload when using inclusive dependency resolving grows to 3.66 GB, or 7
discs. So, the compose process spits out 7 discs, each of which contains
a part of the 2.88 GB sized RPM payload needed for a default installation.
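
The disc counts can be sanity-checked with some back-of-the-envelope
arithmetic. Note the big assumption in this sketch: the usable RPM
payload per CD (USABLE_PER_DISC, set to 600 MiB here) is my own guess
and is not a figure from the compose tools; the real number depends on
the disc size and on how much room installer images and metadata take.

```python
import math

MIB = 1024 ** 2
GIB = 1024 ** 3

# ASSUMPTION: roughly 600 MiB of each CD is available for RPMs. This
# value is a guess for illustration, not taken from the compose tools.
USABLE_PER_DISC = 600 * MIB


def discs_needed(payload_bytes, per_disc=USABLE_PER_DISC):
    """Smallest number of discs that can hold the payload."""
    return math.ceil(payload_bytes / per_disc)


# "3.66 GB" in the script output corresponds to 3934874324 / 1024**3,
# so binary units are assumed for the 2.88 GB figure as well.
DEFAULT_INSTALL = int(2.88 * GIB)   # exclusive resolving, default install
FULL_COMPOSE = 3934874324           # archive_size, inclusive resolving

print(discs_needed(DEFAULT_INSTALL))  # 5 with the assumed capacity
print(discs_needed(FULL_COMPOSE))     # 7 with the assumed capacity
```

With a different assumed capacity the counts shift, so this only shows
that 5 and 7 discs are plausible for the two payload sizes.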

== Next Generation of Package Ordering ==

So, what package ordering is going to do (instead of having a static
list of groups to add to a transaction, resolve, and spit out the
packages) is use the "default" parameter to groups in comps.xml as well
(and then, instead of the exclusive dependency resolving it does now,
move to inclusive dependency resolving as well). This makes "which
packages and groups are in a default installation" a little easier to
maintain, and the package ordering will almost automagically match up
with comps.xml (the installation procedure also uses the default
parameter to groups!).

== DVD and Sizes ==

This leads me to another concern which may or may not be an immediate
issue but requires attention from those in the decision making chain as
well as generally interested people: the size of the DVD ISO is getting
close to its maximum allowed size (just under 4GB for those who use FAT
file systems for the partition they download to), while providing just
the default package set (i.e. not including anything in addition to the
default).

== Advice needed ==

There are several ways this can be solved, but I'm not sure which is the
most advisable (some are not feasible I'm sure, I'm just brainstorming
here):

1) reduce the number of mandatory and default packages per group in 
comps.xml

2) reduce the number of groups in comps.xml that have "default" set to True

3) Revisit how comps is formatted; Example: Keep the "default" for 
compose decisions, but add an "install" attribute for installation 
decision making. Install set to True may require default set to True as 
well for the group to even be included on the media.

4) Split the packages that are mandatory or default in comps groups
into smaller packages providing what the group needs and another set of
smaller packages belonging to the group, so as to reduce the number of
dependencies that need to be met when the compose or installation
procedure selects a group. See also PS2.

5) Or, as an alternative to 4, revisit the Requires in mandatory and
default packages and the Provides in the packages that provide the
required capabilities, so that a requirement no longer matches too many
other packages (with inclusive dependency resolving, fewer matching
providers make for a thinner RPM payload on the composed media)

6) Have the compose tools as well as the installation procedures not 
depend on the default attribute to groups anymore, at all.

Thank you very much for reading this message so far, and please don't 
hesitate to ask any questions if I wasn't clear enough in how this works 
(though I also hope I didn't overdo it for those who already knew) or 
make a remark or two when you think I'm wrong about something ;-)

Ideas and insight on the topic very much appreciated,

Kind regards,

Jeroen van Meeuwen
-kanarip

PS. Inclusive dependency resolving is grabbing _all_ packages that 
provide a required capability. Exclusive dependency resolving is 
grabbing the one best fit (this is YUM dependency resolving). I'm taking 
a few shortcuts here but I hope that's OK.
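
As a small illustration of the difference, here is a toy resolver with
a switch between the two behaviours. Everything in it is made up: the
PROVIDES and REQUIRES tables are hypothetical, and best_fit() simply
takes the first listed provider, which is only a stand-in for YUM's
real best-provider selection.

```python
# Toy data for illustration; not real package metadata.
PROVIDES = {
    "web-client": ["firefox", "elinks", "wget"],
    "libc": ["glibc"],
}
REQUIRES = {
    "yelp": ["web-client", "libc"],
    "firefox": ["libc"],
    "elinks": ["libc"],
    "wget": ["libc"],
    "glibc": [],
}


def best_fit(providers):
    # ASSUMPTION: "first listed wins" stands in for YUM's real logic.
    return providers[0]


def resolve(selected, inclusive):
    """Return the sorted dependency closure of the selected packages."""
    closure = set()
    todo = list(selected)
    while todo:
        pkg = todo.pop()
        if pkg in closure:
            continue
        closure.add(pkg)
        for capability in REQUIRES[pkg]:
            providers = PROVIDES[capability]
            # Inclusive: every provider; exclusive: the one best fit.
            todo.extend(providers if inclusive else [best_fit(providers)])
    return sorted(closure)


EXCLUSIVE = resolve(["yelp"], inclusive=False)
INCLUSIVE = resolve(["yelp"], inclusive=True)
print(EXCLUSIVE)  # ['firefox', 'glibc', 'yelp']
print(INCLUSIVE)  # ['elinks', 'firefox', 'glibc', 'wget', 'yelp']
```

The inclusive result is always a superset of the exclusive one, which
is why media composed inclusively carry more than a default install
strictly needs.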

PS2. Selecting all 18 default groups in rawhide results in 79 mandatory
packages, 302 default packages, and (after depsolving) 1386 packages in
total, for a 3.66 GB payload.

[1] the find-default-groups.py run against a rawhide yum configuration
[jmeeuwen@ghandalf scripts]$ ./find-default-groups.py -c \
    ../unity/conf/conf.d/revisor-rawhide-x86_64-respin.conf -r
Default groups:
  -> office
  -> admin-tools
  -> editors
  -> input-methods
  -> fonts
  -> text-internet
  -> gnome-desktop
  -> core
  -> base
  -> hardware-support
  -> games
  -> java
  -> base-x
  -> graphics
  -> dial-up
  -> printing
  -> sound-and-video
  -> graphical-internet
3.66 GB (3934874324 archive_size)
79 mandatory packages, 302 default packages, 1386 packages in total 
(after depsolving)

The actual script can be found at 
https://fedorahosted.org/revisor/browser/scripts/find-default-groups.py
