Executable memory: some apps that work on RH9 don't on FC1

Mon Nov 17 20:32:22 UTC 2003

> I just built a machine with the latest Fedora and tried to build Axiom.
> Axiom depends on GCL, Gnu Common Lisp. It appears that all of the 
> lisps are broken under the new memory model.
> 
> Can you give me some pointers to any documentation or source code in
> Fedora that might give me a clue about how to fix this? I understand
> there might be a compiler switch but haven't found any documentation.

Unfortunately I don't think we've succeeded in keeping proper documentation
in step with the implementation of the features.  I have written
this explanation up a couple of times, but AFAIK there is no good place to
look for it.  Documentation volunteers take note! :-)

The important changes have to do with what memory will be executable
(i.e. have the PROT_EXEC bit set on those pages).  If you have an issue
with memory layout that changed, other than the question of executability,
then you almost surely have a bug in your application or have uncovered one
in the system.  I'd be glad to help you understand which of those it is and
how to fix it.  There is at least one known issue of this nature (brk
address).  Please try to determine if nonexecutability alone is what's
breaking you, and if not, please post the details of your problem so we can
determine what different problem there might be.

The status quo ante was that the stack was executable, the brk area
(used by malloc for small allocations) was executable, and on x86, pages
with PROT_READ set but not PROT_EXEC had no enforcement of
nonexecutability anyway.  Each of these things is now either just as it was
before or different, on a per-process granularity (changed by exec calls).

System-wide, you can disable the exec-shield functionality with:

	echo 0 > /proc/sys/kernel/exec-shield

If that doesn't make your binaries work, then you probably have a different
problem.  If it does, then the system-wide switch is a stop-gap you can use
while getting your binaries fixed.  We have also overloaded the inherited
"personality" setting so you can disable it per-process:

	setarch i386 foobar

That runs "foobar" with the "personality" bits set such that exec-shield is
disabled for that process and its children (unless one of them uses setarch
or is setuid or somesuch).  Again, if that doesn't make your binaries work,
then you probably have a different problem.

If disabling exec-shield momentarily does work around your problem, then
you want to figure out why you had to do that.  The most common situation
is that you were using executable stack in some way that you don't really
need to, e.g. GCC nested function trampolines.  You can avoid that by
rewriting the code not to use trampolines (i.e. not to take the address of
a nested function that uses its parent's local variables).  Things like Lisp
systems that produce executable code at run time should generally avoid
using stack space for that.  You also should not be using malloc or direct
brk/sbrk calls to get memory that you need to be executable--you have never
had a specified guarantee that malloc returns executable memory.  For
dynamic allocation of memory where you need to put executable code, use
mmap with PROT_READ|PROT_WRITE|PROT_EXEC.  It is also fine to mmap with
different protections and then use mprotect with e.g. PROT_READ|PROT_EXEC
later.  It is not proper to call mprotect on memory returned by malloc,
because when you free that memory later it may be reused in ways that don't
require the executability.  The same goes for the brk area.  (It's also the
case that no specification guarantees that mprotect is meaningful on
malloc-returned space, though in fact it will also work as you expect on
malloc and brk/sbrk space in Linux and probably all Unixoid systems.)
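
As a rough sketch of the mmap approach (in Python rather than C, purely for
illustration), the following maps one anonymous page with
PROT_READ|PROT_WRITE|PROT_EXEC and then looks it up in /proc/self/maps to
confirm the permissions.  The helper names alloc_executable and
mapping_perms are made up for this example, and it assumes a Linux system
with /proc mounted.  The equivalent C call is
mmap(NULL, size, PROT_READ|PROT_WRITE|PROT_EXEC, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0).

```python
import ctypes
import mmap


def alloc_executable(size=mmap.PAGESIZE):
    """Map anonymous memory that is readable, writable, and executable.

    Hypothetical helper: this is the mmap route, as opposed to taking
    memory from malloc or brk/sbrk and hoping it is executable.
    """
    prot = mmap.PROT_READ | mmap.PROT_WRITE | mmap.PROT_EXEC
    flags = mmap.MAP_PRIVATE | mmap.MAP_ANONYMOUS
    return mmap.mmap(-1, size, flags=flags, prot=prot)


def mapping_perms(buf):
    """Return the permission string (e.g. 'rwxp') of the mapping holding buf.

    Assumes Linux with /proc mounted; returns None if no entry matches.
    """
    # Address of the first byte of the mapping.
    addr = ctypes.addressof(ctypes.c_char.from_buffer(buf))
    with open("/proc/self/maps") as maps:
        for line in maps:
            span, perms = line.split()[:2]
            start, end = (int(x, 16) for x in span.split("-"))
            if start <= addr < end:
                return perms
    return None
```

Calling mapping_perms(alloc_executable()) should report a permission string
containing 'x' where the system allows such mappings; under a stricter
security policy the mmap call itself may fail instead.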

If you have a genuine need for executable stack, you can put a marker in
your binary to tell the system that's what you want.  This marker goes in
ELF executables (and DSOs) as the PT_GNU_STACK phdr entry, with p_flags
containing PF_X to indicate need for executable stack and not containing
PF_X to indicate no need for executable stack.  I'll describe how to
compile those markers in a little later.  When a binary does not have any
PT_GNU_STACK marker at all, as is the case with binaries produced by all
older tool versions, it's treated, to be safe, as needing executable stack.
That should retain compatibility with older systems.

The story is the same for DSOs as for executable files.  The difference is
that while the kernel looks for the marker in executable files at exec
time, the dynamic linker looks at the marker in DSOs when it's loading
them.  This is because an executable file that itself does not require an
executable stack might load a DSO at runtime (either as a needed library or
by using dlopen, e.g. for plug-in libraries) that does require executable
stack.  In this instance, the dynamic linker stops and makes all the stacks
executable before completing the load of the DSO in question.  Note that
this support applies only to the stack--if a DSO dynamically allocates
memory that it needs to be executable and does so the wrong way, no marker
will work around it; the code just has to be fixed.

If you have an old DSO binary that it's not feasible for you to rebuild for
some reason (e.g. 3rd-party plug-ins for your applications), you can try
marking it using the `execstack' utility (part of the `prelink' rpm).
execstack edits an existing ELF binary for you, either to add a
PT_GNU_STACK phdr if it's missing or to set or clear the PF_X flag.
`execstack -q FILE' will tell you the current status of that file: X for
executable, - for not, and ? for an old binary with no marker at all.  (You
can also use readelf -l or objdump -p to see the phdrs.)  Note that there
should never really be a need to add a marker to an old executable file
because of the compatibility default--a good thing, since execstack cannot
move things around to make room for the phdr in an executable as it can in
a DSO.  Remember, the default when there is no marking is to assume
executable stack is required for compatibility with older systems.  Ergo,
you don't need to add a marker if it would have PF_X set.  The reason to
add a marker is to avoid enabling executable stack at runtime when it's not
really needed.  
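
To make the marker format concrete, here is a minimal sketch in Python that
reports the same three states as `execstack -q'.  The function name
stack_marker is hypothetical; the header offsets and the PT_GNU_STACK value
(0x6474e551) follow the ELF definitions in <elf.h>.  It only walks the
program headers and does no other validation of the file.

```python
import struct

PT_GNU_STACK = 0x6474E551  # program header type used for the stack marker
PF_X = 0x1                 # "executable" bit in p_flags


def stack_marker(elf_bytes):
    """Return 'X', '-', or '?' for an ELF image (hypothetical helper)."""
    if elf_bytes[:4] != b"\x7fELF":
        raise ValueError("not an ELF file")
    is_64 = elf_bytes[4] == 2                   # EI_CLASS: 1 = 32-bit, 2 = 64-bit
    endian = "<" if elf_bytes[5] == 1 else ">"  # EI_DATA: 1 = little-endian
    if is_64:
        # Elf64_Ehdr: e_phoff at offset 32; e_phentsize, e_phnum at offset 54
        (e_phoff,) = struct.unpack_from(endian + "Q", elf_bytes, 32)
        e_phentsize, e_phnum = struct.unpack_from(endian + "HH", elf_bytes, 54)
    else:
        # Elf32_Ehdr: e_phoff at offset 28; e_phentsize, e_phnum at offset 42
        (e_phoff,) = struct.unpack_from(endian + "I", elf_bytes, 28)
        e_phentsize, e_phnum = struct.unpack_from(endian + "HH", elf_bytes, 42)
    for i in range(e_phnum):
        off = e_phoff + i * e_phentsize
        (p_type,) = struct.unpack_from(endian + "I", elf_bytes, off)
        # p_flags sits at offset 4 in Elf64_Phdr but at offset 24 in Elf32_Phdr
        flags_off = off + (4 if is_64 else 24)
        (p_flags,) = struct.unpack_from(endian + "I", elf_bytes, flags_off)
        if p_type == PT_GNU_STACK:
            return "X" if p_flags & PF_X else "-"
    return "?"  # no marker at all: an old binary
```

For example, stack_marker(open(path, "rb").read()) classifies the file at
path, where path is whatever binary or DSO you want to inspect.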

When compiling from source with current tools (including those in FC1), you
don't usually need to do anything special to get the right markers into
your binaries.  The way it works is that the linker produces the
PT_GNU_STACK marker when there are special marker sections in the input
object files, called ".note.GNU-stack".  The flags of these sections
determine the flags of the PT_GNU_STACK entry.  Your object files (.o) will
normally have these sections because GCC emits them in its assembler
output.  When GCC compiles nested function trampoline code, it emits a
.note.GNU-stack section with the SHF_EXECINSTR flag set:

	.section .note.GNU-stack, "x", @progbits
	.previous

When GCC compiles a module that does not contain any code requiring
executable stack, it emits the complementary marker section with no
SHF_EXECINSTR flag bit:

	.section .note.GNU-stack, "", @progbits
	.previous

If you have assembly code of your own, then you need to add these markers.
The best way is to amend the source code with one of the assembly
directives above.  If that is problematic for some reason, another thing
you can do is tell the assembler directly what to emit on the command line
using -Wa,--execstack or -Wa,--noexecstack.  Finally, if you want to punt
altogether on marking your .o files properly, you can tell the linker to
ignore the marker sections and override its output setting directly on the
command line using -Wl,-z,execstack or -Wl,-z,noexecstack.
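
To check which marker section an object file actually carries, a sketch
along these lines can read the section headers directly.  The function name
note_gnu_stack is invented for this example, and SHF_EXECINSTR (0x4) is the
flag value from <elf.h>.  It assumes an ordinary .o file without extended
section numbering and does no other validation.

```python
import struct

SHF_EXECINSTR = 0x4  # section flag that "x" sets on .note.GNU-stack


def note_gnu_stack(obj_bytes):
    """Return 'X', '-', or '?' for an object's .note.GNU-stack section."""
    if obj_bytes[:4] != b"\x7fELF":
        raise ValueError("not an ELF file")
    is_64 = obj_bytes[4] == 2                   # EI_CLASS: 2 = 64-bit
    endian = "<" if obj_bytes[5] == 1 else ">"  # EI_DATA: 1 = little-endian
    if is_64:
        (e_shoff,) = struct.unpack_from(endian + "Q", obj_bytes, 40)
        fields = struct.unpack_from(endian + "HHH", obj_bytes, 58)
    else:
        (e_shoff,) = struct.unpack_from(endian + "I", obj_bytes, 32)
        fields = struct.unpack_from(endian + "HHH", obj_bytes, 46)
    e_shentsize, e_shnum, e_shstrndx = fields
    if e_shnum == 0:
        return "?"  # no section headers to inspect

    def section(i):
        # Leading fields: sh_name, sh_type, sh_flags, sh_addr, sh_offset, sh_size
        fmt = endian + ("IIQQQQ" if is_64 else "IIIIII")
        off = e_shoff + i * e_shentsize
        name, _type, flags, _addr, offset, size = struct.unpack_from(fmt, obj_bytes, off)
        return name, flags, offset, size

    # Section names live in the string table section named by e_shstrndx.
    _, _, str_off, str_size = section(e_shstrndx)
    strtab = obj_bytes[str_off:str_off + str_size]
    for i in range(e_shnum):
        sh_name, sh_flags, _, _ = section(i)
        name = strtab[sh_name:strtab.index(b"\0", sh_name)]
        if name == b".note.GNU-stack":
            return "X" if sh_flags & SHF_EXECINSTR else "-"
    return "?"  # no marker section in this object
```

A '?' result here means the object has no marker section, which is what you
would expect from hand-written assembly that has not had one of the
directives added.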

Thanks,
Roland