Today, in a world that seems to consist only of containers, cloud computing, and serverless computing, you might ask, "why should I care about hardware?"
Well, as a sysadmin, you should care, because a cloud is just another computer belonging to someone else, and serverless computing is just a marketing term. Beneath the API, deep down under the virtualization layer, you'll find a hardware server processing your workloads.
I'll tell you a story about what it takes in my organization to get a hardware server and show you the steps required to put it into production.
Configuring and purchasing
Imagine Bob needs a server to host his service. Bob goes to Alice, who is in the infrastructure team that manages the purchasing and onboarding of new server hardware, to request it. (Onboarding describes the process involved in getting the new box up and running, which we'll get to in a moment.)
Alice surprises Bob with a series of questions: "Why do you need only one server? What will happen to your service when the server breaks down, or we lose the whole data center location? You know we have two data center locations, right? Don't you want to stretch your service across both locations? Because nobody likes to work at night, on weekends, or on holidays when something breaks."
Bob sees no reason to argue with Alice on that, so after just a few seconds, he ends up with two servers. Next, Alice and Bob discuss the lifecycle of the servers, and Alice asks Bob about hardware requirements like CPU, RAM, NICs, disk space, and so on. Bob explains the purpose of the service that will run on the new machines. In this example, Bob needs two new machines for an LDAP service that consists of six machines in total—some virtual and some physical, so the service keeps running even if the virtual environment fails.
Most hardware vendors have an application that helps you configure your new boxes. In most cases, you can choose between a tower chassis and a rack chassis. A rack chassis fits into the standard server racks found in most data centers. The height of racks and rack-mountable chassis is measured in rack units (U). For example, a 42U rack fits 42 servers of one rack unit each, or 22 one-unit servers plus 10 two-unit servers. To use the space most efficiently, a chassis should be as small as possible but still big enough to fit all the necessary components.
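The rack-unit arithmetic above is easy to sanity-check in a few lines of shell. This is a minimal sketch with invented numbers matching the example; swap in your own inventory:

```shell
#!/bin/sh
# Hypothetical rack plan: does a mix of 1U and 2U servers fit a 42U rack?
# All counts are invented for illustration.
rack_units=42
one_u_servers=22
two_u_servers=10

used=$(( one_u_servers * 1 + two_u_servers * 2 ))
echo "Used: ${used}U of ${rack_units}U"
```

Running this for the example confirms that 22 one-unit and 10 two-unit servers exactly fill the 42U rack.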
CPU (socket) and RAM
Choosing the right mainboard and CPU could be another entire article, so I'll only discuss the most important aspects here.
What kind of CPU you should choose depends heavily on your workload and the licensing model of your application. If your application doesn't benefit from multi-threading but needs high single-core performance, you may want to choose a CPU model with fewer cores but the highest possible clock speed per core.
Since many vendors of database management systems charge by the number of CPU sockets in your box, you could get a single-socket board with a CPU that has as many cores as possible. This way, you save money on licenses.
For a common virtualization cluster, where you want to provide a lot of cores to run as many virtual machines as possible, you might instead choose a lower clock frequency, more cores per CPU, and a mainboard with a second CPU socket.
As the number of cores per CPU increases with each new model, software vendors have started to take core count into consideration for licensing as well. For example, one well-known vendor of virtualization products requires an additional license for CPUs with more than 32 cores.
Last but not least, you should take the thermal design power (TDP) into consideration when choosing your CPU. Think about your data center. There are a lot of racks with servers, and they all need cooling. As the TDP specifies the maximum amount of heat generated by a CPU, it determines how much power you need for your cooling system. You have to get the hot air away from your servers to prevent hot spots, which would decrease the lifespan of your hardware.
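A rough way to reason about the cooling implications of TDP is to add up the worst-case CPU heat load across a room. The numbers below are invented for illustration, and CPUs are not the only heat source in a server, so treat the result as a lower bound when planning cooling capacity:

```shell
#!/bin/sh
# Back-of-the-envelope worst-case CPU heat load for a server room.
# Server count, sockets, and TDP are hypothetical examples.
servers=40
cpus_per_server=2
tdp_watts=205

heat_watts=$(( servers * cpus_per_server * tdp_watts ))
echo "Worst-case CPU heat load: ${heat_watts} W"
```

Every watt of heat the CPUs can produce is a watt your cooling system must be able to remove, which is why a lower-TDP CPU can pay off across a whole data center.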
Regarding RAM—as a rule of thumb, you should just assume the more, the better. Only the mainboard and your budget are your limits here.
Bob is glad that Alice helps him figure out which configuration best fits his needs, and he gets the most hardware for his budget.
Disk space, NICs, HBAs, etc.
But Bob is not done yet. Now he has to decide how much local disk space the new servers will need and whether they need a hardware RAID controller.
Again, the pros and cons of hardware RAID vs. software RAID would fill another article, so I won't dig into that in this one.
For Bob's servers, a small RAID mirror is sufficient: only the operating system and a replica of the LDAP database will run on the new boxes. Because they have no need for SAN storage, the servers don't need Fibre Channel host bus adapters, either.
Bob would like the new servers to have two network interface controllers (NICs) each for load balancing and failover.
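On a Linux system managed by NetworkManager, pairing two NICs for failover is typically done with a bond. Here is a minimal sketch of an active-backup bond using `nmcli`; the interface names (`eno1`, `eno2`) and the connection name are assumptions you would adjust to your hardware:

```shell
#!/bin/sh
# Create an active-backup bond from two physical NICs (requires root and
# NetworkManager). Interface and connection names are examples only.
nmcli connection add type bond con-name bond0 ifname bond0 \
    bond.options "mode=active-backup,miimon=100"
nmcli connection add type ethernet con-name bond0-port1 ifname eno1 master bond0
nmcli connection add type ethernet con-name bond0-port2 ifname eno2 master bond0
nmcli connection up bond0
```

Active-backup mode keeps one NIC as a standby; other bonding modes can aggregate bandwidth but may require matching switch configuration.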
Alice adds a second power supply to the configuration so that the server keeps running even if one power supply fails. Because the power supplies are hot-pluggable, they can be replaced without putting the server into maintenance mode.
Alice also adds rack-mount rails and cable management to the configuration. These parts let you pull the server out of the rack like a drawer without having to unplug all of the cables first.
All Bob has to do now is wait for the new servers to be delivered. Alice, on the other hand, still has some tasks to complete.
Things to do before your new hardware arrives
Alice knows that a server needs network connectivity to provide its services on the network. Therefore, it needs IP addresses, DNS records, and cables for the physical connections. So here are some tasks Alice completes so that Bob can start without delay:
- Ask the IP and DNS teams to register four IP addresses and DNS records: one for each box and one for each baseboard management controller (BMC).
- Check the rack documentation to find space to place the new servers into the racks.
- Look for network ports to connect to and ask the network team to activate the appropriate VLANs on those ports.
When the IP addresses and DNS names are registered, the team keeps Bob in the loop so that he can update the documentation with the information for the new hosts, and so that the firewall and load balancer configurations can be updated accordingly once the servers arrive.
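The registration task in the list above might produce zone-file entries like the following. The hostnames and addresses here are invented for illustration (192.0.2.0/24 is a range reserved for documentation); each server gets one record, and its BMC gets another:

```
; Hypothetical DNS A records for the two new servers and their BMCs.
ldap3.example.com.       IN  A  192.0.2.21
ldap4.example.com.       IN  A  192.0.2.22
ldap3-bmc.example.com.   IN  A  192.0.2.121
ldap4-bmc.example.com.   IN  A  192.0.2.122
```

Registering the BMC addresses up front means the boxes can be managed out-of-band from day one, even before an operating system is installed.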
Things to do when the servers arrive
After the new hardware arrives in our data centers, Alice's first job is a series of electrical safety checks following the procedures of DGUV V3 (a German accident-prevention regulation for electrical equipment). Only hardware that has passed all of these checks may be put into operation, to avoid the risk of electrical shock from faulty hardware.
The next step is to mount the new boxes into the rack. Usually, this is a job for two, so I give Alice a hand here. Once a server is in place, it's powered up, and Alice checks that the server's BIOS/UEFI recognizes all of the hardware. She verifies that the server can operate with only one power supply present and runs burn-in tests on the memory and CPU. These tests can take a day or two, and they help ensure that the hardware won't fail under high load.
After Alice gives her approval, Bob can install the operating system of his choice. He checks that it recognizes all of the hardware and installs any missing drivers for unknown devices.
And there we are: two new servers in two different locations, up and running, and they will most likely stay that way for the next five or six years. And because Alice and Bob avoided single points of failure, they hopefully won't have to work at night, on weekends, or on holidays.