Paul Bone

Code Loading and Security

Today I was thinking about code loading and security. To write more robust code it is a very good idea to separate executable code from writable data. Many processors and OSs have supported some kind of execute protection for years now. This and other features make our systems more secure: a bug in a program cannot be exploited as easily. Memory-safe languages improve on this by making whole classes of these bugs impossible; nevertheless, we don’t always have that luxury.

Where this becomes a problem is when we need to modify code at runtime, or load code and then execute it. The simplest example is a breakpoint in a debugger. On x86/x86_64 a breakpoint is implemented by writing a single byte into the instruction stream: the int3 instruction (0xCC), which raises a trap that the OS uses for debugging.
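
To make the byte-level idea concrete, here is a minimal sketch of setting and clearing a breakpoint. It patches an ordinary buffer in the same process. A real debugger patches the memory of another process through an OS interface, and must also get past any write protection on the code pages, which this sketch does not attempt. The struct and function names are hypothetical and chosen only for illustration.

```c
#include <stddef.h>

/* The x86 / x86_64 one-byte breakpoint instruction (int3). */
#define BREAKPOINT_OPCODE 0xCC

/* Hypothetical record of one inserted breakpoint. */
struct breakpoint {
    unsigned char *address;  /* where the byte was patched */
    unsigned char  saved;    /* the original byte at that address */
};

/*
 * Overwrite the first byte of the instruction at addr with int3,
 * remembering the original byte so it can be restored later.
 */
struct breakpoint breakpoint_set(unsigned char *addr) {
    struct breakpoint bp;

    bp.address = addr;
    bp.saved = *addr;
    *addr = BREAKPOINT_OPCODE;
    return bp;
}

/* Put the original byte back, removing the breakpoint. */
void breakpoint_clear(const struct breakpoint *bp) {
    *bp->address = bp->saved;
}
```

Only one byte changes regardless of the length of the instruction being replaced, which is why the saved state is a single unsigned char.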

Just in time (JIT) compilation and code loading are also situations where you may want to write and then execute some memory. For example, in Plasma I intend to use subroutine threaded bytecode execution on x86 and x86_64. (The current interpreter is generic and uses token threading. In the distant future it might use native code generation.) Subroutine threading involves reading the bytecode, writing machine code into memory (essentially a sequence of call instructions, one per bytecode instruction), and then running that machine code.
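
As a rough illustration of the code generation step, the sketch below turns a bytecode sequence into x86 call instructions, one per token, followed by a ret. It is not Plasma’s implementation. It assumes a made-up bytecode in which every instruction is a single one-byte token with no operands, and the function name, the handler table and the load_addr parameter are all hypothetical. It only fills a buffer; making that buffer executable is the subject of the rest of this article.

```c
#include <stddef.h>
#include <stdint.h>

#define OP_CALL_REL32 0xE8  /* x86 call with a 32-bit relative target */
#define OP_RET        0xC3  /* x86 near return */
#define CALL_LEN      5     /* one opcode byte plus four offset bytes */

/*
 * Emit one "call handler" per bytecode token, then a "ret".
 *
 * handlers[t] is the address of the native routine for token t, and
 * load_addr is the address the generated code will run at.  Returns
 * the number of bytes written, or 0 if the buffer is too small or a
 * handler is out of range of a 32-bit relative call.
 */
size_t emit_threaded(const unsigned char *bytecode, size_t num_tokens,
                     const uint64_t *handlers, uint64_t load_addr,
                     unsigned char *out, size_t out_cap)
{
    size_t pos = 0;

    if (out_cap < 1 || num_tokens > (out_cap - 1) / CALL_LEN) {
        return 0;
    }

    for (size_t i = 0; i < num_tokens; i++) {
        /* The offset is relative to the instruction after the call. */
        uint64_t next_instr = load_addr + pos + CALL_LEN;
        int64_t diff = (int64_t)(handlers[bytecode[i]] - next_instr);
        uint32_t rel;

        if (diff > INT32_MAX || diff < INT32_MIN) {
            return 0;
        }
        rel = (uint32_t)diff;

        out[pos++] = OP_CALL_REL32;
        /* x86 stores the offset in little-endian byte order. */
        out[pos++] = (unsigned char)(rel & 0xff);
        out[pos++] = (unsigned char)((rel >> 8) & 0xff);
        out[pos++] = (unsigned char)((rel >> 16) & 0xff);
        out[pos++] = (unsigned char)((rel >> 24) & 0xff);
    }

    out[pos++] = OP_RET;
    return pos;
}
```

Because the call offsets depend on where the code will live, the output is only valid at load_addr. A loader would normally allocate the memory first and pass its address in.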

To enable code loading while maintaining good security practices, we can start with memory that is readable and writable, copy the code in, then change the protection to readable and executable. I’ve demonstrated this with the following C program.

#include <stdio.h>
#include <stdlib.h>
#include <string.h>

#include <sys/mman.h>

/*
 * x86 / x86_64 machine code for "return 23"
 */
const unsigned char code_bytes[] =
    {
	/* mov $23, %eax */
	0xb8, 0x17, 0x00, 0x00, 0x00,
	/* ret */
	0xc3
    };
#define CODE_LEN (sizeof(code_bytes))

int main(void) {
    void *memory;
    int (*func)(void);
    int result;

    /* Map one page readable and writable, but not yet executable. */
    memory = mmap(NULL, 4096,
	PROT_READ | PROT_WRITE,
	MAP_PRIVATE | MAP_ANONYMOUS,
	-1, 0);
    if (MAP_FAILED == memory) {
	perror("mmap");
	return EXIT_FAILURE;
    }

    memcpy(memory, code_bytes, CODE_LEN);

    /* Drop write permission and add execute permission. */
    if (-1 == mprotect(memory, 4096, PROT_READ | PROT_EXEC)) {
	perror("mprotect");
	return EXIT_FAILURE;
    }

    printf("Calling generated code\n");
    func = (int (*)(void))memory;
    result = func();
    printf("Code returned %d\n", result);

    return EXIT_SUCCESS;
}

This works on Linux and FreeBSD on x86_64, and Linux on x86. It should work on other x86 / x86_64 systems but I haven’t tested it. It won’t work on anything other than x86 or x86_64 as the machine code it attempts to load is x86/x86_64 code.

What I would like to know, and I’ll update this blog entry once I know, is whether there are any systems, perhaps some security-hardened ones, that do not allow changing memory permissions, and on which the above program therefore won’t work. It looks like PaX Linux is one such system. If this isn’t supported, what do things like the Java Runtime do to enable code loading and JIT? Is double-mapping the only option?
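
For readers unfamiliar with the term, double-mapping means mapping the same physical memory at two addresses, one writable and one executable, so that no single mapping is ever both. Below is a minimal, Linux-only sketch. It assumes memfd_create() is available (a reasonably recent Linux kernel and C library), and the function name map_double is hypothetical. Whether a given hardened kernel permits this is a separate question that the sketch does not settle.

```c
#define _GNU_SOURCE  /* for memfd_create(); must precede all includes */
#include <stddef.h>
#include <sys/mman.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Map len bytes of anonymous shared memory twice.  Code is written
 * through *write_view (read + write) and executed through *exec_view
 * (read + execute).  Returns 0 on success and -1 on failure.
 */
int map_double(size_t len, void **write_view, void **exec_view)
{
    void *w;
    void *x;
    /* An in-memory file gives both mappings something to share. */
    int fd = memfd_create("jit-code", 0);

    if (fd == -1) {
        return -1;
    }
    if (ftruncate(fd, (off_t)len) == -1) {
        close(fd);
        return -1;
    }

    w = mmap(NULL, len, PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0);
    if (w == MAP_FAILED) {
        close(fd);
        return -1;
    }

    x = mmap(NULL, len, PROT_READ | PROT_EXEC, MAP_SHARED, fd, 0);
    if (x == MAP_FAILED) {
        munmap(w, len);
        close(fd);
        return -1;
    }

    /* The mappings keep the memory alive after the fd is closed. */
    close(fd);
    *write_view = w;
    *exec_view = x;
    return 0;
}
```

Anything stored through the writable view is visible through the executable view at the same offset, so the copy step from the program above would target write_view and the call would go through exec_view.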

Update 2016-09-15

Thanks to David Coles, who via G+ pointed me to some more information about Hardened Gentoo, which uses PaX Linux. Indeed, PaX Linux forbids changing memory protection from writable to executable, and this does affect runtimes like Java. However, this feature (called MPROTECT) can be disabled per-binary or globally. More details about MPROTECT are available, including a paragraph about how the interface to mmap() and mprotect() could be extended to allow code loading and JIT compilation.

While preparing this article I found some other interesting things. My friend Paweł Lasek showed me a case where changing (or flipping) memory protection like this is very inconvenient. Inline caching is used to speed up virtual method lookup in dynamic languages, and is easiest if memory can be writable and executable at the same time. This is not supported on some OSs such as OpenBSD and PaX Linux.

I also ran two other tests. First, I tested creating memory that is readable, writable and executable all at once on x86_64 Linux, and found that this also worked. To reproduce this, take the above program, add | PROT_EXEC to the protection flags in the mmap() call, and remove the mprotect() call.

Second, I tested executing memory that is only readable and writable (by deleting the mprotect() call). This crashed as expected on x86_64, but worked on x86. Most 32-bit-only x86 systems do not support page-level no-execute protection (the NX bit), so the hardware was unable to detect the violation and trap it. 64-bit x86_64 systems running in 32-bit mode sometimes do support the NX bit, but that depends heavily on the OS used.