Issue #1 November 2004

What is Security-Enhanced Linux?

Introduction

In today's world of high speed Internet connections, coffee shops with free wireless access, and way too many root kits floating around on the Web, thinking about computer security has become commonplace. To address these threats, the National Security Agency (NSA), with the help of the Linux community, has developed an access control architecture to confine processes to only the files they need to complete their actions. This architecture is called Security-Enhanced Linux, or SELinux for short.

Overview of SELinux

SELinux is a Mandatory Access Control (MAC) security system for Linux based on the domain-type model. It was written by the NSA (http://www.nsa.gov/research/selinux/) and consists of a kernel module (included in all 2.6 kernels), patches to certain security-related applications, and a security policy.

The standard Unix security model is known as Discretionary Access Control (DAC). This means that every program has full control over all access to its resources. If it chooses to put files in /tmp/ that contain potentially secret data and makes them world readable then there is nothing to stop it! MAC is fundamentally different in that the system security policy has full control over the access that is granted to each resource. The MAC implementation in SELinux allows a program to create a file in /tmp/ with world read access according to Unix permissions, but after the Unix permission checks are applied the SELinux permission checks determine whether access to a file is granted. Among other security measures, this means that if a file has Unix permission mode 0777 you may still be refused access to read, write, or execute it. With only DAC, any program run by the user can leak or modify the contents of any file that the user has access to. SELinux permits restricting which files each process can access and what level of access is to be granted. This means that if a program is using secret data then it can be prevented from writing to any files that less privileged programs can read.
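The order of the two checks can be illustrated with a toy model in Python. This is only a sketch of the idea, not how the kernel implements it: the function names and the contents of the MAC_RULES table are hypothetical, and the DAC check is simplified to the "other" permission bits.

```python
# Toy model: a request must pass the Unix (DAC) check and then the SELinux (MAC) check.
# All names and rules below are illustrative assumptions, not real policy.

def dac_allows(mode: int, want: str) -> bool:
    """Simplified DAC check: test the 'other' bits of a Unix permission mode."""
    bit = {"read": 0o4, "write": 0o2, "execute": 0o1}[want]
    return bool(mode & bit)

# Hypothetical MAC rules: (process domain, file type) -> permitted actions.
MAC_RULES = {
    ("user_t", "tmp_t"): {"read", "write"},
}

def mac_allows(domain: str, file_type: str, want: str) -> bool:
    """MAC check: anything not explicitly listed is denied."""
    return want in MAC_RULES.get((domain, file_type), set())

def access_granted(mode: int, domain: str, file_type: str, want: str) -> bool:
    """Access requires both checks to succeed."""
    return dac_allows(mode, want) and mac_allows(domain, file_type, want)
```

In this model a file with mode 0777 is still unreadable to a domain that has no rule for its type, which is the situation described above.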

SELinux allows more fine-grained access controls than traditional Unix permissions offer. For example, the administrator can permit an application to append data to a log file but not rewrite or truncate it. Ext2 and ext3 file systems have an append-only flag that can be set by chattr(1), but this applies to all access to the file (it cannot be append-only for one process and fully writable for another). Also, an application can be permitted to create files and write data to them but not delete existing files with the same permission in the same directory: this variation on append-only files cannot be done in a stock Linux kernel without SELinux. Network programs can be permitted to bind to the port or ports that they need (for example, port 53 for BIND) but denied bind access to other ports.

The domain-type model means that every process runs in a security domain, and every resource (file, directory, socket, etc.) has a type associated with it. There is a set of rules listing the actions that each domain may perform on every type. One advantage of the domain-type model is that the policy can be analyzed to determine which information flows are possible. In standard Unix, users can usually view each other's processes with the ps command, which can give valuable information to an attacker. Even if wide ps access is blocked, there are still many possible ways for data to be accidentally or intentionally leaked, and there is no way of determining which information flows might be permitted on a given Unix configuration. There are tools to analyze SELinux policy and determine which data flows are possible. For example, if two applications are granted append access to a file type that is used for log files, they have no direct method of communication. If one of the applications is granted read access as well then there is a possibility of one-way communication.
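The log file example can be sketched as a small information-flow check. This is a simplified illustration rather than one of the real policy analysis tools; the domain names app_a_t and app_b_t, the type app_log_t, and the function can_flow are all hypothetical.

```python
# Hypothetical allow rules: (domain, object type) -> set of permissions.
ALLOW = {
    ("app_a_t", "app_log_t"): {"append"},
    ("app_b_t", "app_log_t"): {"append", "read"},
}

# Permissions that let a domain put data into an object.
WRITE_LIKE = {"write", "append"}

def can_flow(src: str, dst: str, allow=ALLOW) -> bool:
    """Return True if src can place data in some type that dst is allowed to read."""
    for (domain, obj_type), perms in allow.items():
        if domain == src and perms & WRITE_LIKE:
            if "read" in allow.get((dst, obj_type), set()):
                return True
    return False
```

With these rules can_flow("app_a_t", "app_b_t") is true while can_flow("app_b_t", "app_a_t") is false, which matches the one-way communication described above.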

One example of the benefits of analyzing the policy is the restriction on access to /etc/shadow. In a strict configuration of the Fedora SELinux policy in Fedora Core 3, there are 17 domains which are permitted access to the type shadow_t (used for /etc/shadow), and only 9 of them have write access. Of these 17 domains only two can be entered from a user domain: one is for /usr/bin/passwd, and the other is for /sbin/unix_chkpwd (a helper program for screen blankers and other unprivileged programs to perform a password check). Some of those 17 domains are for programs that are not used on most systems (for example, the domain radius_t for RADIUS servers), so any machine which does not have a RADIUS server installed has at most 16 accessible domains which can access /etc/shadow! Note that the Fedora policy is getting more tunable options; future versions of the policy may allow more than 17 domains to access /etc/shadow depending on the tunable options selected. Also note that this number is based on the strict policy with the most strict options. A default install allows more access.

On a non-SE machine every process that runs as root can access /etc/shadow. This means that any security problem in any binary that is SETUID root or in any network daemon which runs as root is catastrophic. SELinux allows us to restrict daemons to the access that they need. BIND can only provide a service on port 53, DHCP servers have raw network access, and DHCP clients have raw network access and can change local interfaces. But none of those programs can access /etc/shadow, /home, /root, or other significant resources. So on SELinux if BIND is compromised, the worst it can do is send out bogus DNS packets. If DHCPD is compromised, the worst it can do is mess up your IP address assignment. This is much better than a remote root exploit!

Now, obviously programs can invoke other programs. In this case, DHCPD might try to invoke /sbin/unix_chkpwd manually to perform a brute-force password attack, for example. But even that potential vulnerability is closed: SELinux has the ability to specify “transition” rules, which determine which domains can enter other domains. In this case, a screen saver invoked on a local user’s behalf might be allowed to transition into the privileged domain for /sbin/unix_chkpwd, but that transition is denied to DHCPD.

These limitations give you assurances about the state of the system. If your DHCP server is discovered to be a buggy version, then all you have to do is install the fixed one and make sure that all clients have the right IP addresses. On a non-SE system you would be left wondering whether an attacker had taken over your system and hidden their tracks.

SELinux provides strong separation between domains. I ran a Debian machine with an open root password for two years, and I have now run a Fedora Core machine with an open root password for several months. They have withstood many attempts to crack them as root. Refer to http://www.coker.com.au/selinux/play.html for details if you would like to try logging in to them. There are also links to sites containing information on SELinux play machines running Debian and Gentoo. Be aware that when logging in to such a machine the same root account is shared by other people who may be hostile to you. It would be most unwise to log in to such a machine with X11 or SSH agent forwarding turned on or to send any data to such a machine that you regard as secret. Do not log in to such a machine and then ssh to another machine! If you do not feel confident in your ability to deal with these issues, then it may be best to avoid using the play machines. Note that I run my Fedora Core SELinux play machine in my spare time on hardware that I own. It is not an official Red Hat service.

Note that play machines aren't always running. It takes time to monitor such a machine, and they get turned off when the owner is busy.

Fedora Core 2 was released with integrated SELinux support (although disabled by default). This is used as a test of the code that will be released in Red Hat Enterprise Linux (RHEL) 4. However there will be some minor differences. The plan is that RHEL will have a default policy that is slightly more restrictive than Fedora Core because we believe that is what users will want. The plan is that future versions of Fedora Core will also have more restrictive policy than Fedora Core 2, but still less restrictive than RHEL.

Details of SELinux Operation

The SELinux policy database controls all aspects of SELinux. It determines which domain each program may be run in and specifies the types that each domain may access. A typical policy has more than 100,000 rules, but that is nothing to be concerned about. When writing policies, high-level macros are used, where a single line can result in as many as 10,000 policy rules. There are tools to analyze policy to confirm that it correctly implements your security goals, and a set of 100,000 rules is considerably smaller than the set of Unix permissions on all the files in the system.

The goal for Fedora Core 2 is to have the default policy work for most users without any modification. In the policy, there will be a set of tunable options designed to make the most common changes to policy available as one-line options. This includes whether users should be able to read the kernel message log with dmesg, whether the administrator (sysadm_r) can log in directly via SSH or launch an X session, and whether users should be able to bind to TCP sockets.

A policy database of 100,000 rules is a file of approximately 2.6 MB on disk and takes the same amount of kernel memory when loaded. The default strict policy that ships with Fedora Core 3 has more than 290,000 rules and takes more than 7 MB of kernel memory. The default targeted policy for Fedora Core 3 has about 5,000 rules and takes about 150 KB of kernel memory.
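As a rough cross-check of these figures, the per-rule cost implied by the first number can be applied to the other two policies. This is back-of-the-envelope arithmetic that assumes memory use scales linearly with the rule count, which is only an approximation; the function name is made up for the illustration.

```python
BYTES_PER_MB = 1024 * 1024

# From the text: about 100,000 rules occupy about 2.6 MB.
bytes_per_rule = 2.6 * BYTES_PER_MB / 100_000   # roughly 27 bytes per rule

def estimated_policy_mb(rule_count: int) -> float:
    """Estimate policy size in MB, assuming a constant cost per rule."""
    return rule_count * bytes_per_rule / BYTES_PER_MB

strict_mb = estimated_policy_mb(290_000)           # about 7.5 MB
targeted_kb = estimated_policy_mb(5_000) * 1024    # about 133 KB
```

Both estimates land close to the quoted values of more than 7 MB for the strict policy and about 150 KB for the targeted policy.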

Currently SELinux has not been optimized for memory use; however, there are plans for future work in this regard, and ways of reducing memory use have been identified. It is not difficult to remove the policy files which refer to daemons that you do not plan to use and build a much smaller policy. At the Ottawa Linux Symposium 2003 I presented a paper on my work in porting SELinux to the HP iPAQ PDA (available at http://archive.linuxsymposium.org/ols2003/Proceedings/). I had SELinux with the strict policy running well on an iPAQ with 64 MB of RAM and 32 MB of storage and believe that I can get it to run on even smaller machines. However, for Fedora Core 2 the default configuration is aimed at the hardware that is in most common use now. Users of older machines may wish to configure a minimal policy to reduce memory use and improve performance.

On a SELinux system every process has a context which consists of three parts: an identity, a role, and a domain. The identity is the name of the Unix account for a user login if the user name is compiled into the SELinux policy; otherwise it is system_u for system processes or user_u for processes of users whose names are not compiled into the policy. The role determines which domains are permitted and is used to restrict the transitions to other domains. For example, the user_r role is not permitted to have the domain sysadm_t (the main domain for system administration). Because the user_u identity is only permitted to have the user_r role, and the user_r role is not permitted to have the sysadm_t domain, someone who logs in with the identity user_u can never get to the sysadm_t domain. Not all of these features are enabled in the default policy for Fedora Core 2. Currently we are focusing our efforts on policy for daemons and will have a less restrictive policy for user domains (the targeted policy has no restrictions on user logins).

A security context is represented as a plain text string of the form identity:role:domain. So, the typical context used for system administration is root:sysadm_r:sysadm_t. Every object that can be accessed will have a security context of this form. Note that a domain is merely a type that is assigned to a process. So for permissions to send signals to processes or to inspect /proc entries for ps, the domain of the process is used as the type for the domain-type rule checks. For files on disk the role is currently not used, so every file will have a role of object_r (this role is just used as a placeholder and has no significance to the policy). The identity of a file is always the identity of the process which creates it. This is used in the constraints policy source file to determine access to change the type of a file. The policy controls the access to changing the context of files, but where it is permitted, unprivileged user domains may not change the type of files unless the identity of the file (both before and after the relabel) matches the identity of the process. For example, a process with the context rjc:user_r:user_t could relabel rjc:object_r:user_games_rw_t to rjc:object_r:user_games_ro_t but could not relabel to or from the context john:object_r:user_games_rw_t.
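The identity:role:domain string and the identity constraint on relabeling can be sketched in a few lines of Python. The class and function names here are hypothetical, and the check covers only the identity constraint described above; in a real system the policy must also permit the type change itself.

```python
from typing import NamedTuple

class Context(NamedTuple):
    identity: str
    role: str
    type: str  # called the domain when the context belongs to a process

def parse_context(text: str) -> Context:
    """Split a context string of the form identity:role:type."""
    identity, role, type_ = text.split(":")
    return Context(identity, role, type_)

def identity_permits_relabel(process: Context, old: Context, new: Context) -> bool:
    """The file identity, before and after the relabel, must match the process identity."""
    return process.identity == old.identity == new.identity

proc = parse_context("rjc:user_r:user_t")
own_rw = parse_context("rjc:object_r:user_games_rw_t")
own_ro = parse_context("rjc:object_r:user_games_ro_t")
other_rw = parse_context("john:object_r:user_games_rw_t")
```

Here identity_permits_relabel(proc, own_rw, own_ro) is true, while any relabel involving other_rw is refused because the identity john does not match rjc.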

To view the security context of running processes, use the -Z option to ps as shown in Example 1, “Example Output of ps ax -Z”.


  PID CONTEXT                                  COMMAND
 1634 root:user_r:user_t                       -bash
 1662 root:user_r:user_t                       ps ax -Z

Example 1. Example Output of ps ax -Z
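If you want to process this output in a script, a minimal sketch such as the following may help. It assumes the three-column layout shown in Example 1 (PID, CONTEXT, COMMAND) with a single header line; the function name parse_ps_z is made up for the illustration.

```python
def parse_ps_z(output: str):
    """Parse 'ps ax -Z' style output into (pid, context, command) tuples."""
    rows = []
    for line in output.splitlines()[1:]:  # skip the header line
        if not line.strip():
            continue
        # The command may contain spaces, so split into at most three fields.
        pid, context, command = line.split(None, 2)
        rows.append((int(pid), context, command.rstrip()))
    return rows

sample = (
    "  PID CONTEXT                                  COMMAND\n"
    " 1634 root:user_r:user_t                       -bash\n"
    " 1662 root:user_r:user_t                       ps ax -Z\n"
)
rows = parse_ps_z(sample)
```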

To view the security context of files and directories, use the -Z option to ls as shown in Example 2, “Example Output of ls -Z”.


drwxr-xr-x  root     root     system_u:object_r:bin_t          bin
drwxr-xr-x  root     root     system_u:object_r:boot_t         boot
drwxr-xr-x  root     root     system_u:object_r:device_t       dev
drwxr-xr-x  root     root     system_u:object_r:etc_t          etc
drwxr-xr-x  root     root     system_u:object_r:home_root_t    home
drwxr-xr-x  root     root     system_u:object_r:root_t         initrd
drwxr-xr-x  root     root     system_u:object_r:lib_t          lib
drwx------  root     root     system_u:object_r:lost_found_t   lost+found
drwxr-xr-x  root     root     system_u:object_r:default_t      misc
drwxr-xr-x  root     root     system_u:object_r:mnt_t          mnt
drwxr-xr-x  root     root     system_u:object_r:usr_t          opt
?---------  ?        ?                                         oracle
dr-xr-xr-x  root     root                                      proc
drwxr-x---  root     root     system_u:object_r:user_home_dir_t root
drwxr-xr-x  root     root     system_u:object_r:sbin_t         sbin
drwxr-xr-x  root     root                                      selinux
drwxr-xr-x  root     root     system_u:object_r:default_t      srv
drwxr-xr-x  root     root                                      sys
drwxrwxrwt  root     root     system_u:object_r:tmp_t          tmp
drwxr-xr-x  root     root     system_u:object_r:usr_t          usr
drwxr-xr-x  root     root     system_u:object_r:var_t          var

Example 2. Example Output of ls -Z

Note that ls leaves the context field blank for files that have no assigned context (which usually is only for file systems that do not support persistent labels such as /proc, /selinux, and /sys). For files where ls has no permission to stat(2) them, the Unix permission is listed as “?---------”, the user and group are also listed as “?”, and the context field is left blank.

At a shell prompt, the id command also shows you the context, as shown in Example 3, “Example Output of the id Command”.


uid=0(root) gid=0(root) groups=0(root),1(bin),2(daemon),3(sys),4(adm),6(disk),10(wheel) context=root:user_r:user_t

Example 3. Example Output of the id Command

When running a SELinux system with a strict policy it is not uncommon to notice programs doing unusual things. Often you will discover bugs in applications when you write SELinux policy to allow them to do what they need to do but discover that they try to do other things.

When using SELinux the context of a process can only be changed at execution time. The domain and role of a process can change automatically based on the context used for the exec system call and the file type. Also, it is possible for a process to specify a context before calling exec. Naturally this is subject to SELinux policy since the identity, role, and domain are all controlled by the policy.

Implementing Policies in Fedora

Starting in Fedora Core 3, the source to the SELinux policy is stored under /etc/selinux/X/src/policy/ (where X is “strict” or “targeted” depending on which you are using). In this directory, compile and load a policy with the command “make load”. That command compiles the policy into a binary form and then loads it into the kernel (it takes effect immediately). It also places the policy binary into the file /etc/selinux/X/policy/policy.YY, where X is “strict” or “targeted” and YY is the policy version number. From this location, /sbin/init can load it in the early stages of booting. The configuration file /etc/selinux/config tells init which policy file to load. Fedora Core 3 has support for policy version 18.

When you boot a SELinux system the first actions that init performs are mounting /proc and determining whether SELinux is enabled. It checks for the presence of SELinux in the kernel through the existence of a selinuxfs file system type. If SELinux is not in the kernel or if it is disabled by the selinux=0 kernel parameter, then the boot proceeds as on a non-SE system. If SELinux is detected then the virtual file system /selinux will be mounted, and /selinux/policyvers will be checked for the policy version that the kernel supports. The policy database /etc/selinux/X/policy/policy.YY will then be loaded into the kernel.
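The detection step can be approximated from user space. The sketch below assumes that the list of file system types is read from /proc/filesystems, where each line holds an optional "nodev" marker followed by the type name; the article does not say how init performs the check, so treat that as an assumption. The function names are hypothetical.

```python
def kernel_has_selinuxfs(proc_filesystems_text: str) -> bool:
    """Return True if 'selinuxfs' appears as a file system type in the given text."""
    for line in proc_filesystems_text.splitlines():
        fields = line.split()
        if fields and fields[-1] == "selinuxfs":
            return True
    return False

def policy_path(policy_name: str, version: int) -> str:
    """Build the path /etc/selinux/X/policy/policy.YY described above."""
    return f"/etc/selinux/{policy_name}/policy/policy.{version}"
```

On a running system the first function could be fed the contents of /proc/filesystems, and the version number would come from /selinux/policyvers.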

After the policy is loaded every running process (only init and kernel threads as the policy is loaded early in the boot) will be assigned the security context system_u:system_r:kernel_t (NB all kernel threads started at any time will get that context). Once init has loaded the policy it will re-exec itself. The policy contains the rule domain_auto_trans(kernel_t, init_exec_t, init_t). This means that when the kernel_t domain executes a file of type init_exec_t (for example, /sbin/init) then the domain will automatically transition to init_t (the correct domain for /sbin/init). After that init does its usual job and the system boots. The kernel threads continue running as kernel_t.
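The effect of a rule such as domain_auto_trans(kernel_t, init_exec_t, init_t) can be modeled as a table lookup. This is a deliberately simplified sketch: the table and function names are hypothetical, and the real policy also has to permit the execution and the transition separately.

```python
# (current domain, type of the executed file) -> new domain
AUTO_TRANSITIONS = {
    ("kernel_t", "init_exec_t"): "init_t",
}

def domain_after_exec(domain: str, exec_type: str, transitions=AUTO_TRANSITIONS) -> str:
    """Return the new domain, or the unchanged domain if no transition rule matches."""
    return transitions.get((domain, exec_type), domain)
```

With this table, a kernel_t process that executes a file of type init_exec_t ends up in init_t, while executing a file of any other type leaves it in kernel_t.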

The contexts of files and directories are stored in extended attributes (known as XATTRs). Refer to the man pages attr(5), getfattr(1), and setfattr(1) for more information. Basically an XATTR is a property of a file on disk, consisting of a name and some associated data. The SELinux context of a file or directory is stored in the security.selinux attribute for any persistent file system. The proc file system does not support this: SELinux labels the files and directories internally for its own use but does not export this information via the getxattr(2) API. The devpts file system (used for /dev/pts Unix98 pseudo-ttys) exports the context of each pseudo-tty to the application via getxattr(2) and supports setxattr(2) to change the context of a device (so that sshd and other similar programs can change the context of a tty device). For file systems that have permanent storage (ext2, ext3, ReiserFS, XFS, vfat, etc.) there are two options: the file system type can have all files on it labeled with a particular context, or XATTRs can be used to store the context for each file system object. For example, every file in an iso9660 (CD-ROM) file system will have the context system_u:object_r:iso9660_t. This is known as genfs labeling. Ext2, ext3, and XFS support XATTRs and for Fedora have been compiled with support for the security attribute name space, so in the default policy they will be labeled with XATTRs, but this is optional. The ReiserFS code in Fedora does not properly support labeled operation with SELinux, as Hans Reiser has no interest in XATTR support before Reiser4. Therefore it will be impossible to use ReiserFS as the root file system when running SELinux.
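Reading the security.selinux attribute from a script might look like the following sketch. It relies on Python's os.getxattr, which is available only on Linux, and assumes that the stored value may end with a NUL byte; the helper names are hypothetical.

```python
import os

def decode_selinux_xattr(raw: bytes) -> str:
    """Convert the raw attribute value to text, dropping any trailing NUL byte."""
    return raw.rstrip(b"\x00").decode("ascii")

def file_context(path: str):
    """Return the SELinux context of path, or None if no label can be read."""
    try:
        return decode_selinux_xattr(os.getxattr(path, "security.selinux"))
    except OSError:
        # No label, no XATTR support on this file system, or no permission.
        return None
```

For a file system such as /proc, which does not export labels through getxattr(2), file_context would be expected to return None.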

XFS has one significant issue related to XATTRs: if they cannot fit into the inode then they are stored in a separate data block, with each inode needing its own data block. The default options for mkfs.xfs are a block size of 4096 bytes and an inode size of 256 bytes (about 30 bytes too small to allow the SELinux XATTR to be stored in the inode). This means that on a default XFS file system the XATTRs for SELinux will take up 4096 bytes per inode, which is often a significant portion of the disk space! If you use the option “-i size=512” when creating an XFS file system, the inodes will be 512 bytes in size and the SELinux XATTR will fit into the inode (with significant space and performance benefits). 512-byte inodes will also apparently give some performance benefits for other operations. If you use XFS and plan to use SELinux in the future then it would be a good idea to create all your file systems with 512-byte inodes.
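To put the space cost in perspective, here is a small worked calculation for a hypothetical file system holding one million files. The file count is an assumption chosen for illustration; the block and inode sizes are the ones given above.

```python
FILE_COUNT = 1_000_000  # hypothetical number of files

def xattr_block_overhead_bytes(file_count: int, block_size: int = 4096) -> int:
    """Space used when every inode needs a separate data block for its XATTRs."""
    return file_count * block_size

def larger_inode_cost_bytes(file_count: int, old_size: int = 256, new_size: int = 512) -> int:
    """Extra space used by growing every inode from old_size to new_size bytes."""
    return file_count * (new_size - old_size)

block_overhead_gib = xattr_block_overhead_bytes(FILE_COUNT) / 2**30  # about 3.8 GiB
inode_cost_mib = larger_inode_cost_bytes(FILE_COUNT) / 2**20         # about 244 MiB
```

Under these assumptions the larger inodes cost a small fraction of the space that separate XATTR blocks would.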

When mounting a file system on a recent Linux kernel (for example, the latest Fedora kernel or the standard 2.6.8.1 kernel) and using the latest version of mount, there is an option to set the context for the file system. You can mount a file system with the option “-o context=system_u:object_r:mail_spool_t” to set the context for all files in your mail spool file system. If you have a large mail server with an XFS file system for the mail spool this could avoid the issues with inode size. Even for file systems with little overhead for XATTRs, such as ext3, using the context= mount option will save some overhead and also avoid having to relabel the existing file system (which could take a long time if the file system has millions of files).

Default SELinux Policies in Fedora

In Fedora Core 3, the default policy is the targeted policy. There was a lot of discussion leading up to this decision, as the main issue is that we want to get as many people using SELinux as possible. If SELinux is considered to be too invasive and prevents people from doing what they want to do then they will turn it off. We would rather offer a moderate amount of protection to many people than a large amount of protection to a small number of people at this time. A default install of Fedora Core 3 has SELinux enabled with the targeted policy. To change it to the strict policy, you can run the program system-config-securitylevel.

If you install the policy source package then the source to the SELinux policy will be installed in /etc/selinux/X/src/policy/ (where X is “strict” or “targeted” depending on which policy you are using). Under the policy source directory there is a subdirectory domains/program/ which contains a .te file for each daemon. You can reduce kernel memory use (and improve performance) by removing .te files which match daemons that you do not use and do not plan to use. For example, if you have a machine which does not run BIND then you can remove named.te. If you then reload the policy with the “make load” command, the kernel memory use will decrease. However, be aware that removing the wrong files can result in your system being unable to boot in enforcing mode. At the moment such tuning is only recommended for experts.

When starting with SELinux, note that there is a kernel parameter of enforcing which determines whether the kernel is in enforcing or permissive mode. In permissive mode SELinux logs what it would do but does not actually prevent any operation. In enforcing mode SELinux prevents operations. If you have a mistake in your policy, it could prevent you from logging in! So for normal operation of SELinux, use the boot parameter enforcing=1. If you have a problem with it, boot with enforcing=0 to debug the problem. Also the file /etc/selinux/config has a setting to allow booting in permissive mode.

If you decide to cease using SELinux, then you can use the kernel parameter selinux=0. This causes all SELinux functionality to be disabled and gives the same result as if the kernel had been compiled without SELinux support enabled. Some people have used this for recovery when there were problems such as a corrupted file system, but I do not recommend this. When you are running with selinux=0, any files or directories you create will not have SELinux contexts. This means that if you replace critical files such as /etc/passwd or /etc/shadow, then the next time you boot SELinux the system will not work correctly. This is not a serious problem, and the same problem will occur if you boot from a CD, boot for system recovery, or restore from a backup using backup software which does not support XATTRs. This can be solved by relabeling the affected file systems, but it is easier to use enforcing=0 instead of selinux=0 when repairing a damaged system to save the effort. You can permanently disable SELinux by editing the file /etc/selinux/config.
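The settings in /etc/selinux/config are simple KEY=value lines, so they are easy to inspect from a script. The sketch below assumes the key names SELINUX and SELINUXTYPE, which the article does not spell out, and the function name is hypothetical.

```python
def parse_selinux_config(text: str) -> dict:
    """Parse KEY=value lines, ignoring blank lines and '#' comments."""
    settings = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        key, _, value = line.partition("=")
        settings[key.strip()] = value.strip()
    return settings

# Assumed example contents; the key names are not taken from the article.
sample_config = """\
# This file controls the state of SELinux on the system.
SELINUX=permissive
SELINUXTYPE=targeted
"""
settings = parse_selinux_config(sample_config)
```

A script could use the parsed values to warn when a machine has been left in permissive mode after debugging.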

The release of Fedora Core 2 was a significant development for SELinux. It was the first release of a major distribution with full SELinux support. Fedora Core 3 is another significant milestone as the first release of a major distribution with SELinux as the default install option. Red Hat Enterprise Linux 4 will follow on from the development of Fedora Core and the SELinux support will be further developed. When RHEL 4 is released, it will benefit greatly from the work in Fedora Core and from the users who have learned SELinux on Fedora.

SELinux in Fedora is supported in the #fedora-selinux channel on the irc.freenode.net IRC server and on the Fedora SELinux mailing list http://www.redhat.com/mailman/listinfo/fedora-selinux-list. I am usually on #fedora-selinux, and I am subscribed to the mailing list. I look forward to answering any questions you may have in those forums.

About the Author

Russell Coker has been working for Red Hat since August 2003 on the SELinux project. Before joining Red Hat he worked as an independent consultant and worked on SELinux in his spare time. He first learned about SELinux at the 2001 Ottawa Linux Symposium conference when Pete Loscocco of the NSA gave a talk on SELinux. At the 2002 OLS he presented a paper on his work in putting SELinux support in the Debian distribution. As part of his Debian work, he wrote policy to support all the programs that he uses, and in the process became the main contributor to the SELinux sample policy. He has also presented SELinux papers at OLS2003 and Linux Kongress 2002.