starting Fedora Server SIG

Les Mikesell lesmikesell at gmail.com
Wed Nov 19 19:24:06 UTC 2008


Doug Ledford wrote:
>
>>> That's not true.  The relating of numbers/names to wires is an arbitrary
>>> abstraction, and one that need not be the *only* possible abstraction.
>>> I admit it's handy and common, especially with servers.  And certainly
>>> you want a server to get the same address from boot to boot, especially
>>> if it's a live server on the internet with people coming to it (or live
>>> server on an intranet).  But, the eth0 naming is really
>>> irrelevant...
>> It's not irrelevant in the context of a unix-like system where devices 
>> are identified by name and you want to clone a bazillion copies of a 
>> machine.
> 
> This is a circular argument.  I argued that the eth0 naming is arbitrary
> and could be done differently, you say it can't be done differently
> because you do it that way now.

No, what I'm saying is that to manage servers you need a way to identify 
NICs that is not arbitrary.  And that changing the convention, 
particularly in arbitrary ways, is expensive.  All previously working 
procedures have to be re-invented and re-tested.  And this is 
particularly a problem when the differences in new behavior are subtle 
and only appear randomly after the copied disk is moved to its remote 
location.

>>> it's the hardware mac address that matters.
>> It is now.  In the 2.4 kernel days I could copy a system disk from an 
>> identical machine, change the hostname and IP address for eth0, eth1, 
>> etc.
> 
> At the point you clone a disk and then manually edit the eth0, eth1,
> etc. addresses, you are no longer strictly cloning.  You are
> customizing.  The choice of customization is arbitrary.

Agreed, but when the kernel hardware detection order was predictable, 
this was simple.  Now it isn't.

> Whether you
> customize the disk by cloning and editing, or by using something like
> cobbler to clone the install via a profile and then have cobbler
> customize the addresses based upon its database is merely
> implementation.  And that's my point, there are better implementations
> to be had.

Errr, doesn't having to build a server to run cobbler before you can 
install your real server make this a circular argument too?  Assuming I 
wanted a cobbler server at every remote location instead of shipping a 
pre-configured disk, how would I build it when it needs a cobbler server 
first?

> 
>>  and ship a box of them to remote locations where anyone could 
>> insert them into a chassis and have them come up working.  Now it 
>> doesn't work and I either have to know every target mac address and 
>> match up the disks or talk some 'hands-on' operator through the setup 
>> with a console on a crash cart to get the address assignment done.  Is 
>> that an improvement?
> 
> Failure to use tools that automate this sort of thing is not a valid
> indictment of the infrastructure that's been put in place.

What tools don't involve the bootstrap problem - or are suitable for 
isolated remote servers?  Or maintaining a diverse set of OS's?

>>>  Being able to
>>> go from wire to mac address matters, and going from mac address to
>>> configuration matters.  Whether that configuration is called eth0 or
>>> pr0n_serv0 doesn't really matter.  To use a similar situation that we've
>>> already changed, back in the day, the only way to mount your root
>>> filesystem was by device major:minor.  Eventually, we figured out that
>>> since file systems have unique identifiers, we could screw the
>>> major:minor pair and mount by label instead.
>> That sucked too, if you recall what happened when you tried to re-use a 
>> disk, putting one you had used before into the same chassis as a current 
>> one.  For the first year or so after this change was made, this scenario 
>> would result in a machine that _would not even boot_.
> 
> Anaconda uses default labels for devices. 

Which is obviously an unrealistic design, knowing that disks can be 
re-used and moved around (and if they weren't there was nothing wrong 
with the old way of identifying them by partition name).  The change 
doesn't solve the real problem and breaks anything using prior 
documented behavior.

> You could have done a small
> post script during a kickstart install to rectify this.  One loop to
> modify all the device labels of filesystems to a unique label based
> upon, say, hostname + mount point, eg. firewall-10.0.1-root as a
> filesystem label combined with modifying the entries in fstab, then a
> final line to rebuild the initrd image.  This sort of thing can be
> automated easily in cobbler such that the default kickstart template
> need not know about each machine's name/purpose, you can use variable
> substitution to do what you want.

Lovely.  I just sit around waiting for extra work like that - especially 
version-specific stuff.  And it misses the point that I want to be able 
to shove a disk into chassis slots in a certain order and know what to 
call a partition on a particular physical disk regardless of where it 
was used before.  Plus, the concepts are wildly different when you use 
md devices (and probably LVM's too but I've avoided those completely).
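For illustration only, the relabeling step Doug describes (unique labels built from hostname plus mount point, with matching fstab edits) might look roughly like the sketch below. The function names and the sample hostname are made up for this example, and it only rewrites fstab text; actually changing the labels on disk and rebuilding the initrd would still be separate steps.

```python
def unique_label(hostname, mount_point):
    """Build a label such as 'web01-root' from a hostname and a mount point."""
    if mount_point == "/":
        suffix = "root"
    else:
        suffix = mount_point.strip("/").replace("/", "-")
    # Note: ext2/ext3 volume labels hold at most 16 bytes, so a long
    # hostname would need shortening before the label is applied.
    return f"{hostname}-{suffix}"


def relabel_fstab(fstab_text, hostname):
    """Rewrite LABEL= entries in fstab text.

    Returns (new_text, renames), where renames maps each old label to the
    new one so the filesystems themselves can be relabeled to match.
    """
    renames = {}
    out = []
    for line in fstab_text.splitlines():
        fields = line.split()
        is_comment = line.lstrip().startswith("#")
        if len(fields) >= 2 and not is_comment and fields[0].startswith("LABEL="):
            old = fields[0][len("LABEL="):]
            new = unique_label(hostname, fields[1])
            renames[old] = new
            line = line.replace(fields[0], "LABEL=" + new, 1)
        out.append(line)
    return "\n".join(out) + "\n", renames
```

Entries that already name a device such as /dev/sda3 are left alone, which is the case being argued for here: those lines never needed rewriting in the first place.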

>> And I think you are entirely off-base in terms of making it harder for 
>> someone who knows which device is which to actually identify it to the 
>> OS. I won't say it would be impossible to make things better but so far 
>> it has just created more work.
> 
> Actually, it has created less work for those people that utilized the
> tools that have been created to automate these things.  In your case,
> you already mention having to go in and hand edit network settings any
> time you clone a disk for a new machine.  That's not 0 work.

It is the minimal amount of work to get a correct setup.  The 
information needed to set the hostname and IP addresses has to be known 
and entered somewhere.

> Yet, with
> things like cobbler, there is a certain amount of work to get things set
> up initially, but once that's done, the amount of work goes down.

Please start from scratch here.  How does that cobbler server get 
installed?  How does it make it less work to get from the person who 
knows the hostname and IP addresses to the machine than entering it 
directly?  What if you want to replace the OS with a platform cobbler 
won't install?

> Your
> real complaint is that your work has gone up *because* you choose not to
> make use of these tools or better methods of doing things.

Yes, I choose not to use them because they aren't appropriate for my 
usage.  They require a large amount of infrastructure work and only 
serve a specific OS - and probably only one or a few versions before you 
have to rebuild your infrastructure again.

> I don't know
> what to say to that.  If you're going to do things the 1980s way and no
> other, then I'm not sure there's anything that anyone can do to make
> your life easier.

What can possibly be easier than typing 'dd'?  I like unix-like systems 
because everything is a file or a stream of bytes and those don't take 
specialized tools to manage.  If you want a copy of something, you copy 
it, including the raw disk containing the whole system.  If it becomes 
something that takes specialized tools to touch every specialized device 
in its own special way, I won't be interested any more.

Actually I use drbl and clonezilla to make most copies because it really 
is easier than typing 'dd', but that's a practical, not a philosophical 
choice - the effect is the same.  The tool simply has to be able to 
handle multiple OS versions and the bulk of our systems are still 
running windows.

>>>> For people who have already automated these processes, try not to screw 
>>>> it up too badly.  If the way it is done now didn't work, we wouldn't 
>>>> have an internet.
>>> I'm sure it won't screw existing setups unless there is an overwhelming
>>> compelling reason.
>> But the changes you mentioned already have.
> 
> Not for those of us doing things in any way other than the old way.

How do you deal with overlap?  How much human time did it take to 
maintain working services on a large set of machines across the changes 
from, say, RH7.3 (probably the first really reliable version) up through 
current?  How much of that 'other way' is useful in a mixed OS environment?

>> I don't want dynamic devices on my servers.  I want to know exactly what 
>> they are and how they are named by the OS.  And I want a hundred of them 
>> with image-copied disks to all work the same way.
> 
> But that's the fallacy of your argument, things *didn't* work that way,
> ever.  At least not under linux.  A device failure could cause sdb to
> become sda,

Ummm, OK - so are you implying that having a label on a partition on sda 
is useful in that circumstance?  Things that break just have to be 
replaced before they work again.  The way md devices work is sort-of ok, 
if you've handled the special case for booting, but they worked that way 
all along.  I'll agree that linux got most of the things it didn't copy 
from sysvr4 wrong in the first place including scsi drive naming, but 
changing 'detection order' naming to 'labels likely to be duplicates' 
isn't a good fix.

> or a BIOS or kernel update could cause eth0 and eth1 to flip
> flop. 

Kernel updates didn't do that until the 2.6 series.  And bios updates 
usually don't take me by surprise.

> The changes that were made were to deal with real world
> situations that you get to ignore because you tightly control your
> setups. 

Yes, I'm running servers.  You know - the big use for Linux...  I have 
old systems and new systems running simultaneously.  I want procedures 
that don't require changing everything at once or training operators to 
know the difference between versions for concepts that have not really 
changed - like mounting a disk partition or assigning an IP address to 
an interface.

> If you embraced some of these changes and worked *with* them
> instead of disabling them, then you might be able to loosen up some of
> that control and find that things still work like they are supposed to.

I have very little interest in converting to procedures that only work 
with one or a few versions of one distribution of one OS.  I'd be 
_slightly_ more interested if there were a clear development path toward 
those procedures as there once was before the RHEL/fedora split.  For 
example, back in the old days I could work on procedures and local 
applications on RHX.0 and not have too many surprises by the time it was 
production-ready around X.2.  With current fedora, there's no way to 
know what to expect to flow into an EL version or prepare for it.

>>   Some tools to deal 
>> with the changes being made could help with this but so far I haven't 
>> seen any.
> 
> I'm sorry, but you must not have looked very hard.  The tools are there,
> and they do a damn fine job.

Which tool besides clonezilla is good for cross platform work?  Are 
there even tools for a specific purpose like replacing a set of RHEL3 
servers with RHEL5 equivalents, maintaining the existing IP addresses on 
several interfaces each?  I eventually came up with something to scarf 
all the old ones from the running systems along with the corresponding 
mac addresses and included them in the clonezilla image with a script to 
patch things up but it wasn't pretty.
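A rough sketch of that kind of patch-up script, not the one actually used: it assumes the addresses collected from the old systems are stored as a table of records keyed by MAC address, and that the MACs present on the booted machine have already been read by some other means. The record layout and function names are invented for the example; the keys written out are the usual ifcfg-style ones.

```python
def normalize_mac(mac):
    """Upper-case a MAC address and use ':' separators for comparison."""
    return mac.strip().upper().replace("-", ":")


def render_ifcfg(device, mac, ipaddr, netmask):
    """Return the text of an ifcfg-style file pinning a device to a MAC."""
    return (
        f"DEVICE={device}\n"
        f"HWADDR={mac}\n"
        "BOOTPROTO=none\n"
        f"IPADDR={ipaddr}\n"
        f"NETMASK={netmask}\n"
        "ONBOOT=yes\n"
    )


def plan_ifcfg(saved, local_macs):
    """Pick the saved records whose MAC exists on this machine.

    saved      -- list of dicts with 'mac', 'device', 'ipaddr', 'netmask'
                  (hypothetical layout for the table shipped in the image)
    local_macs -- MAC addresses found on the machine being patched
    Returns {filename: file contents}; nothing is written to disk here.
    """
    local = {normalize_mac(m) for m in local_macs}
    files = {}
    for rec in saved:
        mac = normalize_mac(rec["mac"])
        if mac in local:
            name = f"ifcfg-{rec['device']}"
            files[name] = render_ifcfg(rec["device"], mac, rec["ipaddr"], rec["netmask"])
    return files
```

The table still has to be gathered and shipped with every image, which is the extra work being complained about.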

> I really think this all boils down to one simple fact: Fedora is
> supposed to be bleeding edge, and that includes improving upon old, tired
> ways of doing things for better ways, and that seems to be anathema to
> you.  I hate to say it, but it sounds like what you really want is not
> Fedora, but OpenSolaris.

Agreed - I'd switch in a second if someone packaged it with drivers for 
all my hardware and the same userland we've been using for years.  You 
can see a history of understanding servers there that is missing in Linux.

> I'm afraid that as long as you want to
> maintain your setup the same as it has been for decades, that Fedora and
> the direction Fedora is heading in is going to continue to frustrate
> you.

When something has been working for decades why would anyone want to 
change it?  And if it hasn't been working, why even look at a unix-like 
system in the first place?

> And I really hate to say that, because I *want* Fedora to be all
> things to all people, but realistically it can't.  And in this
> particular conflict, it's a case of "we have real world problems from
> some users so we fix those problems with a better way of doing things"
> versus your case of "in my particular world, these problems don't exist,
> so don't change things around" and I just don't know how to rectify
> those two positions.

The way to do it is to have the kernel and the hardware use predictable 
but unfriendly conventions for the 'real' names that connect drivers to 
hardware and some optional intermediate user level daemon that maps them 
to a friendly name in case there is a human involved instead of a script 
- like the old /dev/cdrom symlink.  In any case you need to look at some 
worst-case scenarios before applying any change to decide if it really 
helps anything or not.
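As a minimal sketch of that split, assuming the stable names already exist in some device directory: the optional user-level piece only has to maintain symlinks from friendly names to the stable ones, the way /dev/cdrom pointed at the real device. The function name and the example device names are hypothetical.

```python
import os


def link_friendly_names(dev_dir, aliases):
    """Create or refresh friendly-name symlinks inside dev_dir.

    aliases maps a friendly name (e.g. 'cdrom') to the stable name it
    should point at (e.g. 'hdc').  Only existing symlinks are replaced;
    a real file or device node with the same name is left alone and
    os.symlink will raise FileExistsError for it.
    """
    for alias, target in aliases.items():
        path = os.path.join(dev_dir, alias)
        if os.path.islink(path):
            os.remove(path)
        # A relative target keeps the link valid if dev_dir is moved.
        os.symlink(target, path)
```

Scripts keep using the stable names and never depend on the links; removing the mapping layer breaks nothing underneath it.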

Will any of the changes involving friendly identifiers for partitions 
help me when I connect a new unformatted drive?  Will any help with 
mounts that are done over nfs or cifs?  What about iscsi?  If I have to 
identify a new raw disk myself to make the partitions and filesystems 
when adding it, why do you think I need different terminology to 
identify the partitions after that step is done?

Likewise with network interfaces: when what I want is some particular 
vlan from a trunk, will the changes help with that?

> In the end, Fedora is going to be true to its
> goals, including being bleeding edge and fixing things that are broken,
> and I don't think it's even possible to stop the march of that progress
> for the sake of your particular setup and working habits.  We can
> attempt to, but sometimes things simply must be changed in order to deal
> with other people's reality.

OK, but if you make server management harder or more specialized as a 
result of changes that only matter to desktop clients, don't expect 
anyone to run them.  I think you were on to something with OpenSolaris.

-- 
   Les Mikesell
     lesmikesell at gmail.com






More information about the fedora-devel-list mailing list