[PATCH] capabilities: audit capability use

Topi Miettinen toiwoton at gmail.com
Mon Jul 11 16:05:08 UTC 2016


On 07/11/16 15:25, Serge E. Hallyn wrote:
> Quoting Topi Miettinen (toiwoton at gmail.com):
>> There are many basic ways to control processes, including capabilities,
>> cgroups and resource limits. However, there are far fewer ways to find
>> out useful values for the limits, except blind trial and error.
>>
>> Currently, there is no way to know which capabilities are actually used.
>> Even the source code shows this only implicitly: in-depth knowledge of
>> each capability is needed when analyzing a program to judge which
>> capabilities the program will exercise.
>>
>> Generate an audit message at system call exit, when capabilities are used.
>> This can then be used to configure capability sets for services by a
>> software developer, maintainer or system administrator.
>>
>> Test case demonstrating basic capability monitoring with the new
>> message types 1330 and 1331 and how the cgroups are displayed (boot to
>> rdshell):
> 
> Thanks, Topi, I'll find time this week to look this over in detail.
> 
> How much chattier does this make the syslog/journald during a regular
> boot?  I was thinking "this is audit, we can choose what messages
> will show up", but I guess that's only what auditd actually listens to,
> not what the kernel emits?  (Sorry, I've not looked at audit in a long
> time.)  Drat, that makes it seem like tracepoints would be better
> after all.  But let's see how much it adds to the noise.

For example, "loadkeys" causes thousands of entries. :-( I'm checking how
to avoid audit message rate limiting; at the moment some messages are lost.

It's still too easy to drown the logs with noise. That could be limited
a lot by emitting a message only when the capability is used for the
first time. But the question is how to define where to start counting
(fork, exec, and/or setpcap?). I'm also not sure if that is the right
way to log, since the first use of a capability could be expected and an
innocent one, but then the 100th one could be malicious.
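
To make the first-use idea concrete, here is a minimal userspace sketch in
C. It is not part of the patch: struct cap_first_use and both functions are
hypothetical names used only for illustration. Where the reset would be
called (fork, exec, or a capability set change) is the open question above,
so the sketch leaves that to the caller.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical per-task state: one bit per capability already reported. */
struct cap_first_use {
	uint64_t reported;
};

/* Start counting again. Which event should trigger this is undecided. */
void cap_first_use_reset(struct cap_first_use *s)
{
	s->reported = 0;
}

/*
 * Return true only the first time `cap` is seen since the last reset,
 * i.e. when a message would be emitted. Later uses return false, which
 * is also why a malicious 100th use would go unlogged.
 */
bool cap_first_use_should_log(struct cap_first_use *s, int cap)
{
	uint64_t bit;

	if (cap < 0 || cap >= 64)
		return false;
	bit = UINT64_C(1) << cap;
	if (s->reported & bit)
		return false;
	s->reported |= bit;
	return true;
}
```

With this scheme each capability costs at most one message per reset
interval, at the price of losing every repeat use.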

It's also very complex and error-prone to collect a capability mask from
audit logs, which was my original goal.
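
A rough sketch of what that collection involves, again in userspace C and
with a hypothetical function name (cap_mask_accumulate): OR together the
cap_used= fields of the type=1330 records. One pitfall is visible in the
patch itself: the normal record prints a hexadecimal mask, while the
fallback record for tasks without an audit context prints a single decimal
capability number followed by "no audit_context", so the two need different
handling.

```c
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/*
 * OR the cap_used= value found in one audit log line into *mask.
 * Returns 1 if a value was parsed, 0 if the line has no usable field.
 */
int cap_mask_accumulate(const char *line, uint64_t *mask)
{
	static const char key[] = "cap_used=";
	const char *p = strstr(line, key);
	char *end;
	unsigned long long v;

	if (!p)
		return 0;
	p += sizeof(key) - 1;

	/* Fallback record: decimal capability number, not a mask. */
	if (strstr(line, "no audit_context")) {
		long cap = strtol(p, &end, 10);

		if (end == p || cap < 0 || cap > 63)
			return 0;
		*mask |= UINT64_C(1) << cap;
		return 1;
	}

	/* Normal record: hexadecimal mask as printed by audit_log_cap(). */
	v = strtoull(p, &end, 16);
	if (end == p)
		return 0;
	*mask |= (uint64_t)v;
	return 1;
}
```

Fed the four type=1330 records from the test case in the patch description,
this yields 0x08200003, that is bits 0, 1, 21 and 27 (CAP_CHOWN,
CAP_DAC_OVERRIDE, CAP_SYS_ADMIN and CAP_MKNOD). It does nothing about
records dropped by rate limiting, so the result is only a lower bound.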

-Topi

> 
>> BusyBox v1.22.1 (Debian 1:1.22.0-19) built-in shell (ash)
>> Enter 'help' for a list of built-in commands.
>>
>> (initramfs) cd /sys/fs
>> (initramfs) mount -t cgroup2 cgroup cgroup
>> [   12.343152] audit_printk_skb: 5886 callbacks suppressed
>> [   12.355214] audit: type=1300 audit(1468234317.100:518): arch=c000003e syscall=165 success=yes exit=0 a0=7fffe1e9ae2d a1=7fffe1e9ae34 a2=7fffe1e9ae25 a3=8000 items=0 ppid=469 pid=470 auid=4294967295 uid=0 gid=0 euid=0 suid=0 fsuid=0 egid=0 sgid=0 fsgid=0 tty=ttyS0 ses=4294967295 comm="mount" exe="/bin/mount" key=(null)
>> [   12.414853] audit: type=1327 audit(1468234317.100:518): proctitle=6D6F756E74002D74006367726F757032006367726F7570006367726F7570
>> [   12.438338] audit: type=1330 audit(1468234317.100:518): cap_used=0000000000200000
>> [   12.453893] audit: type=1331 audit(1468234317.100:518): cgroups=:/;
>> (initramfs) cd cgroup
>> (initramfs) mkdir test; cd test
>> [   17.335625] audit: type=1300 audit(1468234322.092:519): arch=c000003e syscall=83 success=yes exit=0 a0=7ffddfd75e29 a1=1ff a2=0 a3=1e2 items=0 ppid=469 pid=471 auid=4294967295 uid=0 gid=0 euid=0 suid=0 fsuid=0 egid=0 sgid=0 fsgid=0 tty=ttyS0 ses=4294967295 comm="mkdir" exe="/bin/mkdir" key=(null)
>> [   17.392686] audit: type=1327 audit(1468234322.092:519): proctitle=6D6B6469720074657374
>> [   17.409404] audit: type=1330 audit(1468234322.092:519): cap_used=0000000000000002
>> [   17.425404] audit: type=1331 audit(1468234322.092:519): cgroups=:/;
>> (initramfs) echo $$ >cgroup.procs
>> (initramfs) mknod /dev/z_$$ c 1 2
>> [   28.385681] audit: type=1300 audit(1468234333.144:520): arch=c000003e syscall=133 success=yes exit=0 a0=7ffe16324e11 a1=21b6 a2=102 a3=5c9 items=0 ppid=469 pid=472 auid=4294967295 uid=0 gid=0 euid=0 suid=0 fsuid=0 egid=0 sgid=0 fsgid=0 tty=ttyS0 ses=4294967295 comm="mknod" exe="/bin/mknod" key=(null)
>> [   28.443674] audit: type=1327 audit(1468234333.144:520): proctitle=6D6B6E6F64002F6465762F7A5F343639006300310032
>> [   28.465888] audit: type=1330 audit(1468234333.144:520): cap_used=0000000008000000
>> [   28.482080] audit: type=1331 audit(1468234333.144:520): cgroups=:/test;
>> (initramfs) chown 1234 /dev/z_*
>> [   34.772992] audit: type=1300 audit(1468234339.532:521): arch=c000003e syscall=92 success=yes exit=0 a0=7ffd0b563e17 a1=4d2 a2=0 a3=60a items=0 ppid=469 pid=473 auid=4294967295 uid=0 gid=0 euid=0 suid=0 fsuid=0 egid=0 sgid=0 fsgid=0 tty=ttyS0 ses=4294967295 comm="chown" exe="/bin/chown" key=(null)
>> [   34.828569] audit: type=1327 audit(1468234339.532:521): proctitle=63686F776E0031323334002F6465762F7A5F343639
>> [   34.848747] audit: type=1330 audit(1468234339.532:521): cap_used=0000000000000001
>> [   34.864404] audit: type=1331 audit(1468234339.532:521): cgroups=:/test;
>>
>> Signed-off-by: Topi Miettinen <toiwoton at gmail.com>
>> ---
>>  include/linux/audit.h      |  4 +++
>>  include/linux/cgroup.h     |  2 ++
>>  include/uapi/linux/audit.h |  2 ++
>>  kernel/audit.c             |  7 +++---
>>  kernel/audit.h             |  1 +
>>  kernel/auditsc.c           | 28 ++++++++++++++++++++-
>>  kernel/capability.c        |  5 ++--
>>  kernel/cgroup.c            | 62 ++++++++++++++++++++++++++++++++++++++++++++++
>>  8 files changed, 105 insertions(+), 6 deletions(-)
>>
>> diff --git a/include/linux/audit.h b/include/linux/audit.h
>> index e38e3fc..971cb2e 100644
>> --- a/include/linux/audit.h
>> +++ b/include/linux/audit.h
>> @@ -438,6 +438,8 @@ static inline void audit_mmap_fd(int fd, int flags)
>>  		__audit_mmap_fd(fd, flags);
>>  }
>>  
>> +extern void audit_log_cap_use(int cap);
>> +
>>  extern int audit_n_rules;
>>  extern int audit_signals;
>>  #else /* CONFIG_AUDITSYSCALL */
>> @@ -545,6 +547,8 @@ static inline void audit_mmap_fd(int fd, int flags)
>>  { }
>>  static inline void audit_ptrace(struct task_struct *t)
>>  { }
>> +static inline void audit_log_cap_use(int cap)
>> +{ }
>>  #define audit_n_rules 0
>>  #define audit_signals 0
>>  #endif /* CONFIG_AUDITSYSCALL */
>> diff --git a/include/linux/cgroup.h b/include/linux/cgroup.h
>> index a20320c..b5dc8aa 100644
>> --- a/include/linux/cgroup.h
>> +++ b/include/linux/cgroup.h
>> @@ -100,6 +100,8 @@ char *task_cgroup_path(struct task_struct *task, char *buf, size_t buflen);
>>  int cgroupstats_build(struct cgroupstats *stats, struct dentry *dentry);
>>  int proc_cgroup_show(struct seq_file *m, struct pid_namespace *ns,
>>  		     struct pid *pid, struct task_struct *tsk);
>> +struct audit_buffer;
>> +void audit_cgroup_list(struct audit_buffer *ab);
>>  
>>  void cgroup_fork(struct task_struct *p);
>>  extern int cgroup_can_fork(struct task_struct *p);
>> diff --git a/include/uapi/linux/audit.h b/include/uapi/linux/audit.h
>> index d820aa9..c1ae016 100644
>> --- a/include/uapi/linux/audit.h
>> +++ b/include/uapi/linux/audit.h
>> @@ -111,6 +111,8 @@
>>  #define AUDIT_PROCTITLE		1327	/* Proctitle emit event */
>>  #define AUDIT_FEATURE_CHANGE	1328	/* audit log listing feature changes */
>>  #define AUDIT_REPLACE		1329	/* Replace auditd if this packet unanswerd */
>> +#define AUDIT_CAPABILITY	1330	/* Record showing capability use */
>> +#define AUDIT_CGROUP		1331	/* Record showing cgroups */
>>  
>>  #define AUDIT_AVC		1400	/* SE Linux avc denial or grant */
>>  #define AUDIT_SELINUX_ERR	1401	/* Internal SE Linux Errors */
>> diff --git a/kernel/audit.c b/kernel/audit.c
>> index 8d528f9..98dd920 100644
>> --- a/kernel/audit.c
>> +++ b/kernel/audit.c
>> @@ -54,6 +54,7 @@
>>  #include <linux/kthread.h>
>>  #include <linux/kernel.h>
>>  #include <linux/syscalls.h>
>> +#include <linux/cgroup.h>
>>  
>>  #include <linux/audit.h>
>>  
>> @@ -1682,7 +1683,7 @@ void audit_log_cap(struct audit_buffer *ab, char *prefix, kernel_cap_t *cap)
>>  {
>>  	int i;
>>  
>> -	audit_log_format(ab, " %s=", prefix);
>> +	audit_log_format(ab, "%s=", prefix);
>>  	CAP_FOR_EACH_U32(i) {
>>  		audit_log_format(ab, "%08x",
>>  				 cap->cap[CAP_LAST_U32 - i]);
>> @@ -1696,11 +1697,11 @@ static void audit_log_fcaps(struct audit_buffer *ab, struct audit_names *name)
>>  	int log = 0;
>>  
>>  	if (!cap_isclear(*perm)) {
>> -		audit_log_cap(ab, "cap_fp", perm);
>> +		audit_log_cap(ab, " cap_fp", perm);
>>  		log = 1;
>>  	}
>>  	if (!cap_isclear(*inh)) {
>> -		audit_log_cap(ab, "cap_fi", inh);
>> +		audit_log_cap(ab, " cap_fi", inh);
>>  		log = 1;
>>  	}
>>  
>> diff --git a/kernel/audit.h b/kernel/audit.h
>> index a492f4c..680e8b5 100644
>> --- a/kernel/audit.h
>> +++ b/kernel/audit.h
>> @@ -202,6 +202,7 @@ struct audit_context {
>>  	};
>>  	int fds[2];
>>  	struct audit_proctitle proctitle;
>> +	kernel_cap_t cap_used;
>>  };
>>  
>>  extern u32 audit_ever_enabled;
>> diff --git a/kernel/auditsc.c b/kernel/auditsc.c
>> index 2672d10..32c3813 100644
>> --- a/kernel/auditsc.c
>> +++ b/kernel/auditsc.c
>> @@ -197,7 +197,6 @@ static int audit_match_filetype(struct audit_context *ctx, int val)
>>   * References in it _are_ dropped - at the same time we free/drop aux stuff.
>>   */
>>  
>> -#ifdef CONFIG_AUDIT_TREE
>>  static void audit_set_auditable(struct audit_context *ctx)
>>  {
>>  	if (!ctx->prio) {
>> @@ -206,6 +205,7 @@ static void audit_set_auditable(struct audit_context *ctx)
>>  	}
>>  }
>>  
>> +#ifdef CONFIG_AUDIT_TREE
>>  static int put_tree_ref(struct audit_context *ctx, struct audit_chunk *chunk)
>>  {
>>  	struct audit_tree_refs *p = ctx->trees;
>> @@ -1439,6 +1439,18 @@ static void audit_log_exit(struct audit_context *context, struct task_struct *ts
>>  
>>  	audit_log_proctitle(tsk, context);
>>  
>> +	ab = audit_log_start(context, GFP_KERNEL, AUDIT_CAPABILITY);
>> +	if (ab) {
>> +		audit_log_cap(ab, "cap_used", &context->cap_used);
>> +		audit_log_end(ab);
>> +	}
>> +	ab = audit_log_start(context, GFP_KERNEL, AUDIT_CGROUP);
>> +	if (ab) {
>> +		audit_log_format(ab, "cgroups=");
>> +		audit_cgroup_list(ab);
>> +		audit_log_end(ab);
>> +	}
>> +
>>  	/* Send end of event record to help user space know we are finished */
>>  	ab = audit_log_start(context, GFP_KERNEL, AUDIT_EOE);
>>  	if (ab)
>> @@ -2428,3 +2440,17 @@ struct list_head *audit_killed_trees(void)
>>  		return NULL;
>>  	return &ctx->killed_trees;
>>  }
>> +
>> +void audit_log_cap_use(int cap)
>> +{
>> +	struct audit_context *context = current->audit_context;
>> +
>> +	if (context) {
>> +		cap_raise(context->cap_used, cap);
>> +		audit_set_auditable(context);
>> +	} else {
>> +		audit_log(NULL, GFP_NOFS, AUDIT_CAPABILITY,
>> +			  "cap_used=%d pid=%d no audit_context",
>> +			  cap, task_pid_nr(current));
>> +	}
>> +}
>> diff --git a/kernel/capability.c b/kernel/capability.c
>> index 45432b5..d45d5b1 100644
>> --- a/kernel/capability.c
>> +++ b/kernel/capability.c
>> @@ -366,8 +366,8 @@ bool has_capability_noaudit(struct task_struct *t, int cap)
>>   * @ns:  The usernamespace we want the capability in
>>   * @cap: The capability to be tested for
>>   *
>> - * Return true if the current task has the given superior capability currently
>> - * available for use, false if not.
>> + * Return true if the current task has the given superior capability
>> + * currently available for use, false if not. Write an audit message.
>>   *
>>   * This sets PF_SUPERPRIV on the task if the capability is available on the
>>   * assumption that it's about to be used.
>> @@ -380,6 +380,7 @@ bool ns_capable(struct user_namespace *ns, int cap)
>>  	}
>>  
>>  	if (security_capable(current_cred(), ns, cap) == 0) {
>> +		audit_log_cap_use(cap);
>>  		current->flags |= PF_SUPERPRIV;
>>  		return true;
>>  	}
>> diff --git a/kernel/cgroup.c b/kernel/cgroup.c
>> index 75c0ff0..1931679 100644
>> --- a/kernel/cgroup.c
>> +++ b/kernel/cgroup.c
>> @@ -63,6 +63,7 @@
>>  #include <linux/nsproxy.h>
>>  #include <linux/proc_ns.h>
>>  #include <net/sock.h>
>> +#include <linux/audit.h>
>>  
>>  /*
>>   * pidlists linger the following amount before being destroyed.  The goal
>> @@ -5789,6 +5790,67 @@ out:
>>  	return retval;
>>  }
>>  
>> +/*
>> + * audit_cgroup_list()
>> + *  - Print task's cgroup paths with audit_log_format()
>> + *  - Used for capability audit logging
>> + *  - Otherwise very similar to proc_cgroup_show().
>> + */
>> +void audit_cgroup_list(struct audit_buffer *ab)
>> +{
>> +	char *buf, *path;
>> +	struct cgroup_root *root;
>> +
>> +	buf = kmalloc(PATH_MAX, GFP_NOFS);
>> +	if (!buf)
>> +		return;
>> +
>> +	mutex_lock(&cgroup_mutex);
>> +	spin_lock_irq(&css_set_lock);
>> +
>> +	for_each_root(root) {
>> +		struct cgroup_subsys *ss;
>> +		struct cgroup *cgrp;
>> +		int ssid, count = 0;
>> +
>> +		if (root == &cgrp_dfl_root && !cgrp_dfl_visible)
>> +			continue;
>> +
>> +		if (root != &cgrp_dfl_root)
>> +			for_each_subsys(ss, ssid)
>> +				if (root->subsys_mask & (1 << ssid))
>> +					audit_log_format(ab, "%s%s",
>> +							 count++ ? "," : "",
>> +							 ss->legacy_name);
>> +		if (strlen(root->name))
>> +			audit_log_format(ab, "%sname=%s", count ? "," : "",
>> +					 root->name);
>> +		audit_log_format(ab, ":");
>> +
>> +		cgrp = task_cgroup_from_root(current, root);
>> +
>> +		if (cgroup_on_dfl(cgrp) || !(current->flags & PF_EXITING)) {
>> +			path = cgroup_path_ns_locked(cgrp, buf, PATH_MAX,
>> +						current->nsproxy->cgroup_ns);
>> +			if (!path)
>> +				goto out_unlock;
>> +		} else
>> +			path = "/";
>> +
>> +		audit_log_format(ab, "%s", path);
>> +
>> +		if (cgroup_on_dfl(cgrp) && cgroup_is_dead(cgrp))
>> +			audit_log_format(ab, " (deleted);");
>> +		else
>> +			audit_log_format(ab, ";");
>> +	}
>> +
>> +out_unlock:
>> +	spin_unlock_irq(&css_set_lock);
>> +	mutex_unlock(&cgroup_mutex);
>> +	kfree(buf);
>> +}
>> +
>>  /* Display information about each subsystem and each hierarchy */
>>  static int proc_cgroupstats_show(struct seq_file *m, void *v)
>>  {
>> -- 
>> 2.8.1



