Apple’s Move to x86 vs. Security

I was fairly surprised at Apple’s announcement of their transition from PowerPC processors to Intel chips, and I am still fairly sceptical about the whole affair. Having watched the keynote, I am seeing things in a slightly more positive light (mostly due to Steve Jobs being an excellent salesman).
Having written assembly for x86 and having read a lot of documentation about PPCs, I very much feel that this is a move to an inferior architecture with a superior implementation. Suddenly people have to pass arguments on the stack again, and (according to the Universal Binary guide) they need to check for MMX, SSE, SSE2 and SSE3 and optimise accordingly, instead of simply choosing between a nice orthogonal AltiVec unit and no vectorisation. Welcome back to a stack-based FPU with 8 registers (the same number as the integer core), and goodbye to Open Firmware. I at least hope that Apple is going to guarantee a certain baseline in the Intel CPUs they sell (e.g. x86-64 + SSE3), to avoid even more ugly #ifdefs and code paths than are already necessary to make a single code base build on big- and little-endian machines with different ABIs and capabilities.
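To make the endianness part of that complaint concrete, here is a minimal sketch in C of reading and writing a 32-bit big-endian value byte by byte, so that the same source gives the same result on PowerPC and x86 without an #ifdef. The function names are made up for this illustration and are not taken from the Universal Binary guide.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Returns 1 on a little-endian host (e.g. x86), 0 on a big-endian one
 * (e.g. PowerPC). Only needed where the host layout really matters. */
static int host_is_little_endian(void)
{
    const uint16_t probe = 1;
    return *(const unsigned char *)&probe == 1;
}

/* Reads a 32-bit big-endian value one byte at a time. Because it never
 * reinterprets memory as an integer, the result is host-independent. */
static uint32_t load_be32(const unsigned char *p)
{
    assert(p != NULL);
    return ((uint32_t)p[0] << 24) | ((uint32_t)p[1] << 16) |
           ((uint32_t)p[2] << 8)  |  (uint32_t)p[3];
}

/* Writes a 32-bit value in big-endian byte order, again host-independent. */
static void store_be32(unsigned char *p, uint32_t v)
{
    assert(p != NULL);
    p[0] = (unsigned char)(v >> 24);
    p[1] = (unsigned char)(v >> 16);
    p[2] = (unsigned char)(v >> 8);
    p[3] = (unsigned char)v;
}
```

This only covers data layout. Different ABIs and vector units still need separate code paths.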
As far as I am aware, the PowerPC stores the return address in a register, and is thus harder to exploit via buffer overflows. x86 CPUs store their return addresses on the stack, which makes them more vulnerable to this type of attack. Recently, Microsoft has made that a bit harder by storing sentry cookies on the stack and checking them, as of SP2 for Windows XP and SP1 for Windows Server 2003, but that is something of a workaround that costs you performance as well as stack space.
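The cookie idea can be illustrated with a small model in C. This is only a sketch under my own assumptions: the "frame" is an ordinary 16-byte array laid out as buffer, cookie, return address, and all names and offsets are hypothetical. It does not touch the real stack and is not how any particular compiler lays out its frames.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical frame layout: 8-byte buffer, 4-byte cookie, 4-byte return address. */
enum { BUF_SIZE = 8, COOKIE_OFF = 8, RET_OFF = 12, FRAME_SIZE = 16 };

struct frame { unsigned char bytes[FRAME_SIZE]; };

/* Function entry: place the cookie between the buffer and the return address. */
static void frame_init(struct frame *f, uint32_t cookie, uint32_t ret)
{
    assert(f != NULL);
    memset(f->bytes, 0, FRAME_SIZE);
    memcpy(f->bytes + COOKIE_OFF, &cookie, sizeof cookie);
    memcpy(f->bytes + RET_OFF, &ret, sizeof ret);
}

/* Models an unchecked copy into the buffer: it ignores BUF_SIZE, so a long
 * input runs over the cookie and the return address. The length is capped
 * at FRAME_SIZE only to keep this model itself within bounds. */
static void unchecked_copy(struct frame *f, const unsigned char *src, size_t len)
{
    if (len > FRAME_SIZE)
        len = FRAME_SIZE;
    memcpy(f->bytes, src, len);
}

/* Function exit: the check that costs the extra instructions. */
static int cookie_intact(const struct frame *f, uint32_t expected)
{
    uint32_t c;
    memcpy(&c, f->bytes + COOKIE_OFF, sizeof c);
    return c == expected;
}

static uint32_t frame_return_address(const struct frame *f)
{
    uint32_t r;
    memcpy(&r, f->bytes + RET_OFF, sizeof r);
    return r;
}
```

In this model an overflow cannot reach the return address without also changing the cookie, which is why the check happens before the return address is used.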
Apple seem to offer the tools to make this transition less grating, but it is work with no immediately obvious pay-off in sight. Certainly they are going into this with much more information than any of us have, so we’ll have to wait and see how things play out. I am well aware that the CPU does not make a Mac, and I will hardly leave Mac OS X behind for any of the alternatives just because Intel now gets a share of my money instead of Freescale / IBM.

One thought on “Apple’s Move to x86 vs. Security”

  1. Phil

    I agree with you for the most part (especially the bits about MMX, SSE, etc. and the utterly miserable FPU). I’ve never worked with PPC, so I’m not familiar with how it handles stack operations/function call returns. But I do have a few points to make:

    1. PowerPC does indeed use a register (LR, the Link Register) for storing function return addresses. But the called function will most likely immediately store the LR to the stack, so that it is able to make function calls of its own. There’s only one LR; without using the stack, nested function calls wouldn’t work. Say you have three functions; for the sake of originality, we’ll call them A, B and C. Function A calls function B; the return address to return to A is stored in the Link Register. Function B then calls function C. Barring some major advances in the application of quantum mechanics, the new return address cannot be stored in the same register without overwriting the existing one. Function B must save A’s return address before calling C. It could save it at an absolute location in a data segment, but that makes the code non-reentrant (function B might be interrupted and called again from another thread, after all) and makes recursion impossible. Re-entrant and/or recursive code (most code, these days) therefore needs to store LR to the stack before calling other routines and restore it from the stack before finally returning to its caller.

    2. Stack protection on x86 processors has two major components: the NX bit and guard pages. Memory allocated for data storage (including the stack) has the NX bit set by default on CPUs that support it. With the NX bit set, even if someone manages to inject code into the stack and overwrite the return address to call it, the CPU will refuse to execute it. Guard pages are implemented by ensuring that one page of virtual address space on either side of the space allocated for the stack is not mapped to physical storage. Thus, any attempt to overrun an allocated buffer will cause a protection fault. (Unlike the NX bit, which is only supported by relatively recent CPUs, this mechanism works on all VM-capable CPUs from the 386 and up.) Neither of these mechanisms requires additional instructions or checks in the code (aside from the initial allocation of memory in the OS kernel) because they are implemented in the CPU/MMU hardware, and they have little or no performance impact.

    3. The remaining change made to stack handling in XP SP2 (inserting guard flags between function calls within the stack and verifying them before returning from the function) does require extra code and does impose a performance penalty. However, these modifications aren’t a result of changes to the OS code, but rather to the compiler used to compile the OS. VC has a new option that enables insertion of these checks during compilation. Microsoft enabled this protection in XP SP2 by simply turning on the compiler option and recompiling everything. Third-party code isn’t affected unless it too is recompiled with the new compiler option enabled. Recent versions of GCC and other compilers also have this option, and it would not surprise me in the least if newer versions of OS X contain code using such protection. (In fact, I’d be rather disturbed if they did not.)
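
Phil’s first point (A calls B calls C with only one Link Register) can be modelled in a few lines of C. This is a toy sketch, not PowerPC code: the struct, the function names and the two "addresses" are all hypothetical, and a real prologue would use instructions such as mflr and a store to the stack frame.

```c
#include <assert.h>
#include <stddef.h>

/* Made-up return addresses for the two call sites. */
enum { RET_TO_A = 0x100, RET_TO_B = 0x200, STACK_SLOTS = 16 };

/* Toy machine state: one link register and a small stack. */
struct cpu {
    int lr;                  /* the single Link Register */
    int stack[STACK_SLOTS];
    size_t sp;               /* number of slots in use */
};

/* "Branch and link": records the return address, overwriting the old LR. */
static void branch_and_link(struct cpu *c, int return_address)
{
    c->lr = return_address;
}

/* Prologue of a non-leaf function: spill LR to the stack. */
static void push_lr(struct cpu *c)
{
    assert(c->sp < STACK_SLOTS);
    c->stack[c->sp++] = c->lr;
}

/* Epilogue of a non-leaf function: reload LR from the stack. */
static void pop_lr(struct cpu *c)
{
    assert(c->sp > 0);
    c->lr = c->stack[--c->sp];
}

/* Runs A -> B -> C and reports the address B finally returns to.
 * save_lr selects whether B spills LR before calling C. */
static int run_a_calls_b_calls_c(int save_lr)
{
    struct cpu c = { 0, {0}, 0 };

    branch_and_link(&c, RET_TO_A);   /* A calls B */
    if (save_lr)
        push_lr(&c);                 /* B saves A's return address */
    branch_and_link(&c, RET_TO_B);   /* B calls C, clobbering LR */
    /* C is a leaf function: it returns via LR and never touches the stack. */
    if (save_lr)
        pop_lr(&c);                  /* B restores A's return address */
    return c.lr;                     /* B returns via LR */
}
```

With the save, B returns to A; without it, B "returns" to its own call site. The saved copy lives on the stack, which is exactly where a buffer overflow can reach it.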
