How to Swizzle All of UIKit

If you wanted to replace the implementation of every single method in a given framework — maybe to add logging, or to record call counts, or to check that every method is being run on the main thread — how would you go about doing that?

Note: You can check out a demo project for this post here.



Swizzling 1 Method

Normal Swizzling Setup

For a quick recap, let's start with the basics. We'll take a single method (-[UIView setTag:]) and add a log statement for when that method is called:

@import ObjectiveC;
@import UIKit;

@implementation UIView (Swizzle)

+ (void)load {
    method_exchangeImplementations(
        class_getInstanceMethod(self, @selector(setTag:)),
        class_getInstanceMethod(self, @selector(swizzled_setTag:))
    );
}

- (void)swizzled_setTag:(NSInteger)tag {
    NSLog(@"-[UIView setTag:] Called!");
    [self swizzled_setTag:tag];
}

@end

// ...

// Prints "-[UIView setTag:] Called!"
[[UIView new] setTag:1];

This is just our standard swizzling set up; we declare a new method, swizzled_setTag:, in a category, and then swap the implementations — such that calling setTag: will actually invoke swizzled_setTag:.

Finally, to invoke the original method from within our swizzled implementation, we call -[self swizzled_setTag:] (as after exchanging methods, this now points to the original). This is second-nature to anyone who is at all familiar with swizzling, but is still worth calling out explicitly, as things will start to differ from the norm shortly.
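If the "call the swizzled name to reach the original" step still feels backwards, it can help to model the method table in plain C. The sketch below is only an analogy and not how the Objective-C runtime stores methods: `MethodSlot`, `send_message`, and `exchange` are hypothetical names, and each slot simply pairs a selector string with a function pointer.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for a method entry: a selector name
   plus the function currently installed for it. */
typedef int (*Impl)(int);
typedef struct {
    const char *selector;
    Impl imp;
} MethodSlot;

static int log_count = 0;
static int send_message(const char *selector, int arg);

/* Plays the role of the framework's own setTag: body */
static int original_set_tag(int tag) { return tag * 10; }

/* Plays the role of swizzled_setTag: from the category */
static int swizzled_set_tag(int tag) {
    log_count++;
    /* After the exchange, this selector maps to original_set_tag */
    return send_message("swizzled_setTag:", tag);
}

static MethodSlot slots[] = {
    { "setTag:",          original_set_tag },
    { "swizzled_setTag:", swizzled_set_tag },
};

static MethodSlot *find_slot(const char *selector) {
    for (size_t i = 0; i < sizeof slots / sizeof slots[0]; i++) {
        if (strcmp(slots[i].selector, selector) == 0) return &slots[i];
    }
    return NULL;
}

/* Dispatch by name, loosely like a message send */
static int send_message(const char *selector, int arg) {
    MethodSlot *slot = find_slot(selector);
    assert(slot != NULL);
    return slot->imp(arg);
}

/* Swap the two installed functions, leaving the names in place */
static void exchange(const char *a, const char *b) {
    MethodSlot *first = find_slot(a);
    MethodSlot *second = find_slot(b);
    assert(first != NULL && second != NULL);
    Impl tmp = first->imp;
    first->imp = second->imp;
    second->imp = tmp;
}
```

After `exchange("setTag:", "swizzled_setTag:")`, sending `setTag:` runs the logging function first, and its inner send lands on the original body rather than recursing.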


Swizzling 2 Methods

Missing Information

With the basics covered, we can start to look at scaling up to replacing more method implementations, starting with a single addition: -[UIBarItem setTag:].

Of course, we could copy & paste the above setup and modify as needed, or even codegen the equivalent — but it's easy to see the fragility there, especially once we get to dealing with different parameters and return types, and we already know that we wouldn't be able to handle all of UIKit in that manner. Now would be a good time to start down a different path.

What we really want is a single common implementation for all our replacements, like a common_swizzle_handler method. But what would that actually look like?

- (void)common_swizzle_handler {
    // TODO: How do we get
    // the original method name?

    NSLog( /* original method name */ );

    // TODO: How do we call
    // the original method?

    return /* original method call */
}

It's clear now that our original replacement only works because it's able to hard-code information about the original method; it knows both the method's name and how to invoke it, neither of which we can rely on here.

What we need is a way to pass additional information into common_swizzle_handler — but since we don't control the call sites, we can't really pass any additional information in as parameters.

Note: You can actually get part of the way to solving both of these problems using the _cmd argument that is passed to every Objective-C method, but there are some cases that this cannot cover (and any solution using it would be less performant than we need for an entire framework).

Pseudo-Trampolines

There is actually a way that we can pass some implicit information through to our common swizzle handler: by going through an intermediate function.

Let's start with the following setup (assuming the functions are previously declared):

void uiview_trampoline() {
    // Print the address of this function
    NSLog(@"UIView Trampoline:    %p", uiview_trampoline);
    common_swizzle_handler();
}

void uibaritem_trampoline() {
    // Print the address of this function
    NSLog(@"UIBarItem Trampoline: %p", uibaritem_trampoline);
    common_swizzle_handler();
}

void common_swizzle_handler() {
    // Print the address that we will return
    // to when this function is complete
    NSLog(@"Return address:       %p", __builtin_return_address(0));

    // TODO: Log method name and
    // call original method?
    return;
}

Then, we'll tell both UIView and UIBarItem to use their respective trampoline functions as replacement implementations for setTag:

// The trampolines don't have the IMP signature,
// so we cast them explicitly
method_setImplementation(
    class_getInstanceMethod([UIView class], @selector(setTag:)),
    (IMP)uiview_trampoline
);

method_setImplementation(
    class_getInstanceMethod([UIBarItem class], @selector(setTag:)),
    (IMP)uibaritem_trampoline
);

And finally, a quick test:

[[UIView new] setTag:1];
// Prints:
//   UIView Trampoline:    0x100000100
//   Return address:       0x100000121

[[UIBarItem new] setTag:1];
// Prints:
//   UIBarItem Trampoline: 0x100000200
//   Return address:       0x100000221

Looks like we are successfully calling our swizzle handler, but we now have to acknowledge that we've lost our already-minimal functionality: we're no longer able to log the method being called (or even call the original method!). So what exactly have we gained?

Notice that when we are coming from -[UIView setTag:], the address that we will return to is just a small number of bytes past the start of uiview_trampoline; and likewise for UIBarItem and its trampoline. This makes sense — execution needs to continue in those methods after the function they call has completed — but it also means we do have a way to tell the callers apart.

In fact, although it's certainly not scalable in this format, we actually have enough information now to technically build a single swizzle implementation that works for both our cases:

void common_swizzle_handler() {
    void * returnAddress = __builtin_return_address(0);

    if (returnAddress == uiview_trampoline + 0x21) {
        /* log & call original -[UIView setTag:] */
    } else if (returnAddress == uibaritem_trampoline + 0x21) {
        /* log & call original -[UIBarItem setTag:] */
    }
}

This is the key piece of functionality that we need to build a common swizzle handler; by using a unique trampoline for each method, and mapping from the trampoline's address to some information about the method it represents, we can get all the information we need.
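The observation this all rests on is easy to reproduce outside of UIKit. Here is a minimal sketch, assuming GCC or Clang (for `__builtin_return_address` and the `noinline` attribute); the function names are made up for the illustration and are not the ones used elsewhere in this post.

```c
#include <assert.h>
#include <stdint.h>

static uintptr_t last_return_address;
static volatile int sink;

/* Records where execution will resume once this function returns */
__attribute__((noinline))
static void common_handler(void) {
    last_return_address = (uintptr_t)__builtin_return_address(0);
}

__attribute__((noinline))
static void trampoline_a(void) {
    common_handler();
    sink = 1; /* work after the call keeps it from becoming a tail call */
}

__attribute__((noinline))
static void trampoline_b(void) {
    common_handler();
    sink = 2;
}

/* Runs the given trampoline and reports the return
   address that the shared handler observed */
static uintptr_t return_address_via(void (*trampoline)(void)) {
    assert(trampoline != 0);
    trampoline();
    return last_return_address;
}
```

Going through `trampoline_a` twice yields the same return address both times, while going through `trampoline_b` yields a different one, so the address alone is enough to identify the caller.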

Actual Trampolines

The above functions work fine as trampolines, but we don't want to have to actually implement a new function for every method we want to swizzle.

We can simplify things by dropping down to assembly (a sentence which often sounds untrue out of context, but bear with me!). We'll use x86_64 here, which can be run on the iOS Simulator.

; Define a new function containing all our trampolines
_trampoline_start:
call    _common_swizzle_handler
call    _common_swizzle_handler
call    _common_swizzle_handler
; ...

; Allow our function to be accessed from elsewhere
.globl  _trampoline_start

This new function is, as a whole, not that useful — if you were to call it directly, you'd simply call the swizzle handler several times and then exit.

Instead, we'll set things up to jump somewhere in the middle of the function. For example, if we were to jump straight to the address given by _trampoline_start + 5 (5 being the size, in bytes, of each call instruction here) we would start with the second call instruction, and we'd see a particular return address from within our swizzle handler.

But if we were to jump to _trampoline_start + 10 instead, we would see a different return address from within the handler. With this setup, we have as many trampolines as we want, each reachable at _trampoline_start + (i * 5).
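The 5-byte stride comes from how x86_64 encodes a direct near call: one opcode byte (0xE8) followed by a 32-bit little-endian displacement, measured from the address of the next instruction. The sketch below encodes and decodes that form by hand purely to illustrate the layout; `encode_call` and `decode_call_target` are hypothetical helpers, and in the real project the assembler does this work for us.

```c
#include <assert.h>
#include <stdint.h>

enum { CALL_SIZE = 5 }; /* 1 opcode byte + 4 displacement bytes */

/* Writes the bytes of `call target` as they would appear
   if the instruction lived at instruction_address */
static void encode_call(uint8_t out[CALL_SIZE],
                        uint64_t instruction_address,
                        uint64_t target_address) {
    /* The displacement is relative to the NEXT instruction */
    int64_t displacement =
        (int64_t)(target_address - (instruction_address + CALL_SIZE));
    assert(displacement >= INT32_MIN && displacement <= INT32_MAX);

    uint32_t bits = (uint32_t)(int32_t)displacement;
    out[0] = 0xE8;
    out[1] = (uint8_t)(bits & 0xFF);
    out[2] = (uint8_t)((bits >> 8) & 0xFF);
    out[3] = (uint8_t)((bits >> 16) & 0xFF);
    out[4] = (uint8_t)((bits >> 24) & 0xFF);
}

/* Recovers the call target from the encoded bytes */
static uint64_t decode_call_target(const uint8_t in[CALL_SIZE],
                                   uint64_t instruction_address) {
    assert(in[0] == 0xE8);
    uint32_t bits = (uint32_t)in[1]
        | ((uint32_t)in[2] << 8)
        | ((uint32_t)in[3] << 16)
        | ((uint32_t)in[4] << 24);
    int64_t displacement = (int32_t)bits;
    return instruction_address + CALL_SIZE + (uint64_t)displacement;
}
```

Because the displacement is relative, every call in the block gets different bytes even though they all share one target; only the size stays constant, which is all the indexing scheme needs.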

There are a couple of issues we still need to fix though:

  1. After returning from the swizzle handler, we're going to immediately hit another call instruction
  2. The original caller will pass at least some of its arguments by copying them into registers. While our barebones trampolines aren't modifying any of those registers, our swizzle handler (if it has any functionality at all) likely will.

We can solve both of these problems by having an intermediate assembly function that our trampolines call instead:

_trampoline_target:
; Save (some of) the argument registers to the stack
; (actual implementation saves quite a few more)
pushq   %rdi
pushq   %rax

; The address that this function will return
; to once finished is placed at the top of
; the stack before the function is called.
; Copy this address (which is now 16 bytes from
; the top of the stack, after our pushes above)
; into `rdi` to act as our first argument
movq    16(%rsp), %rdi

; Call swizzle handler
call    _common_swizzle_handler

; Replace our return address with the
; one returned by our swizzle handler.
movq    %rax, 16(%rsp)

; Restore the argument registers
popq    %rax
popq    %rdi

; Return!
ret

Instead of returning to the giant trampoline function, this function overwrites its own return address with whatever address is returned by common_swizzle_handler; this means the swizzle handler can return the address of the original method implementation, and we can jump there after restoring our registers to their original state.

This way, we can run whatever code we want in the handler, and right after returning from the above function, we'll be in the exact same state as if the swizzle had never been added in the first place!
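To convince ourselves that the stack ends up where we want it, we can replay the same steps on a toy machine in C. This is just a model, assuming a stack of 8-byte slots and only the two registers shown in the listing; `Machine`, `run_trampoline_target`, and `sample_handler` are hypothetical names, and the addresses are arbitrary.

```c
#include <assert.h>
#include <stdint.h>

/* Toy machine: a small stack of 8-byte slots plus two registers.
   slots[depth - 1] is the top of the stack, i.e. 0(%rsp). */
typedef struct {
    uint64_t slots[8];
    int depth;
    uint64_t rdi;
    uint64_t rax;
} Machine;

static void push(Machine *m, uint64_t value) {
    assert(m->depth < 8);
    m->slots[m->depth++] = value;
}

static uint64_t pop(Machine *m) {
    assert(m->depth > 0);
    return m->slots[--m->depth];
}

/* Pretend handler: ignores its input and returns the
   (made-up) address of the original implementation */
static uint64_t sample_handler(uint64_t return_address) {
    (void)return_address;
    return 0xABCD0000;
}

/* Mirrors _trampoline_target step by step and returns
   the address that the final `ret` would jump to */
static uint64_t run_trampoline_target(Machine *m,
                                      uint64_t (*handler)(uint64_t)) {
    assert(m->depth >= 1);            /* return address already pushed */
    push(m, m->rdi);                  /* pushq %rdi */
    push(m, m->rax);                  /* pushq %rax */
    m->rdi = m->slots[m->depth - 3];  /* movq 16(%rsp), %rdi */
    m->rax = handler(m->rdi);         /* call handler, result in rax */
    m->slots[m->depth - 3] = m->rax;  /* movq %rax, 16(%rsp) */
    m->rax = pop(m);                  /* popq %rax */
    m->rdi = pop(m);                  /* popq %rdi */
    return pop(m);                    /* ret */
}
```

Starting with the caller's return address and the trampoline's return address on the stack, the function "returns" to the handler's address, both registers hold their original values, and only the caller's return address is left behind, exactly as if the original method had been called directly.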

Note: If the exact path here is still unclear, don't worry — we'll recap what's actually happening once we have the last few pieces in place. In the meantime, just remember that we can jump to individual trampolines, and that after going through a few steps, things should just work.

Using the Trampolines

Now that we have a way to address trampolines more easily, let's switch back to actually trying to use them.

Let's first define a struct to store information about each method, which we'll then be able to look up later. We'll also define an array to hold our methods, along with a counter to track how many methods we've added.

// Stores information about a given method
typedef struct SwizzledMethod {
    Class class;
    Method method; 
    IMP originalImplementation;
} SwizzledMethod;

// Storage for our swizzled methods
static SwizzledMethod swizzledMethods[2];
static int usedTrampolinesCount = 0;

Now that we have our storage set up, we can define a function to perform the swizzling. Each time we call this function, it will add the given method to our array, and set the next available trampoline as the method's new implementation:

// Reference to the first trampoline
extern void trampoline_start(void);

void swizzle_method(Class class, Method method) {
    // Store information about the given method,
    // including its original implementation
    IMP originalImp = method_getImplementation(method);
    swizzledMethods[usedTrampolinesCount] = (SwizzledMethod){
        class,
        method,
        originalImp
    };

    // Get a reference to the next unused trampoline
    int trampolineSize = 5;
    void *nextTrampoline = trampoline_start
        + (usedTrampolinesCount * trampolineSize);
    usedTrampolinesCount++;

    // Set the method's implementation
    // to that next trampoline
    method_setImplementation(
        method,
        nextTrampoline
    );
}
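One thing the function above does not do is check whether any trampolines are left; with a fixed-size array, one registration too many would write past the end. A possible guard is sketched below as plain C, with strings standing in for Class and Method so that it compiles anywhere. `Record`, `claim_trampoline`, and `MAX_TRAMPOLINES` are hypothetical names for this illustration.

```c
#include <assert.h>
#include <stddef.h>

enum { MAX_TRAMPOLINES = 2, TRAMPOLINE_SIZE = 5 };

/* Simplified stand-in for SwizzledMethod */
typedef struct {
    const char *class_name;
    const char *selector_name;
    void (*original)(void);
} Record;

static Record records[MAX_TRAMPOLINES];
static int used_count = 0;

/* Stores the record and returns the byte offset of the trampoline
   assigned to it, or -1 when every trampoline is already taken */
static long claim_trampoline(const char *class_name,
                             const char *selector_name,
                             void (*original)(void)) {
    assert(class_name != NULL && selector_name != NULL);
    if (used_count >= MAX_TRAMPOLINES) {
        return -1;
    }
    records[used_count] = (Record){ class_name, selector_name, original };
    long offset = (long)used_count * TRAMPOLINE_SIZE;
    used_count++;
    return offset;
}
```

A caller would leave the method untouched when it gets -1 back, rather than pointing it at a trampoline that does not exist.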

Finally, we must also update our common swizzle handler to account for the new trampoline format. It finds the index of the trampoline we're coming from and looks up the corresponding method; i.e., the nth trampoline corresponds to the nth method that we swizzled (and as a result, the nth method in the array).

void *common_swizzle_handler(long trampolineAddress) {
    // Find trampoline's address relative
    // to the first trampoline
    long offsetInTrampolines = trampolineAddress
        - (long)trampoline_start;

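    // Divide by trampoline size to get the index of
    // our trampoline. The address we receive is the
    // return address, which points just past the call
    // instruction (i.e., at the start of the next
    // trampoline), so we subtract 1 to compensate
    int trampolineSize = 5;
    long trampolineIndex = (offsetInTrampolines / trampolineSize) - 1;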

    // Look up original method info
    SwizzledMethod sm = swizzledMethods[trampolineIndex];

    // Log the method name
    printf("-[%s %s]\n",
        class_getName(sm.class),
        sel_getName(method_getName(sm.method)));

    // Return the address of the original implementation
    return sm.originalImplementation;
}
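The `- 1` in the handler is easy to get wrong, so here is the index arithmetic on its own as a small C sketch. The value the handler receives is the return address pushed by the trampoline's call instruction, which sits one trampoline past the one we jumped to. The helper names are hypothetical, and the addresses used with them are arbitrary.

```c
#include <assert.h>
#include <stdint.h>

enum { TRAMPOLINE_SIZE = 5 };

/* Address of the call instruction for the trampoline at `index` */
static uintptr_t trampoline_address(uintptr_t start, long index) {
    assert(index >= 0);
    return start + (uintptr_t)index * TRAMPOLINE_SIZE;
}

/* The return address pushed when that call executes:
   the address of the instruction right after it */
static uintptr_t pushed_return_address(uintptr_t start, long index) {
    return trampoline_address(start, index) + TRAMPOLINE_SIZE;
}

/* Inverse mapping, using the same arithmetic as the handler */
static long index_from_return_address(uintptr_t start,
                                      uintptr_t return_address) {
    assert(return_address > start);
    long offset = (long)(return_address - start);
    assert(offset % TRAMPOLINE_SIZE == 0);
    return offset / TRAMPOLINE_SIZE - 1;
}
```

For a block starting at 0x1000, trampoline 0 pushes 0x1005 and trampoline 1 pushes 0x100A, and the inverse maps those back to 0 and 1.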

And we're all ready! We simply have to call swizzle_method with the classes & methods we care about — with our two existing methods, that gives us:

swizzle_method([UIView class],
    class_getInstanceMethod(
        [UIView class],
        @selector(setTag:)
    )
);

swizzle_method([UIBarItem class],
    class_getInstanceMethod(
        [UIBarItem class],
        @selector(setTag:)
    )
);

// Prints "-[UIView setTag:]"
[[UIView new] setTag:1];

// Prints "-[UIBarItem setTag:]"
[[UIBarItem new] setTag:1];

And it works! We have our logging for individual methods, and crucially — the only code specific to these two methods is localized to the code block above. Everything else is completely generic.

Trampoline Recap

We covered a lot while building this trampoline setup, so now that we have everything finished, I think it's worth stepping through exactly what happens with the trampolines.

Let's start with the -[UIView setTag:] call above and walk through what happens at each step, along with what the top of the call stack would look like. We'll assume we're making our initial call from a method called -[ViewController viewDidLoad].

  1. We start with the state just before the initial call to -[UIView setTag:].
Callstack:
    -[ViewController viewDidLoad]
  2. We make the call, but because we've replaced the method's implementation, we enter one of our trampolines instead.
Callstack:
    trampoline_start + (SOME_OFFSET)
    -[ViewController viewDidLoad]
  3. Our trampoline just calls our intermediate trampoline target.
Callstack:
    trampoline_target
    trampoline_start + (SOME_OFFSET)
    -[ViewController viewDidLoad]
  4. The trampoline target saves all argument registers (in this case, the arguments to -[UIView setTag:]). It then calls common_swizzle_handler, passing the address of our original trampoline as an argument.
Callstack:
    common_swizzle_handler
    trampoline_target
    trampoline_start + (SOME_OFFSET)
    -[ViewController viewDidLoad]
  5. The common swizzle handler looks up the trampoline, along with the method associated with it, based on the trampoline's address. It then logs the method information and returns the address of the original implementation.
Callstack:
    trampoline_target
    trampoline_start + (SOME_OFFSET)
    -[ViewController viewDidLoad]
  6. The trampoline target restores all argument registers and, instead of returning to trampoline_start, “returns” to the address given by the common swizzle handler, which points to the original method implementation.
Callstack:
    -[UIView setTag:]
    -[ViewController viewDidLoad]

And with that, we have reached -[UIView setTag:], with the correct parameters and expected call stack — but with functionality injected in between. And more importantly, we've made things generic enough that we can add additional swizzled methods dynamically & with ease.


Swizzling 75,000 Methods

Well that escalated quickly.

We're actually in a pretty good state for swizzling as many methods as we want right now. With the above setup, we only have to do three things:

  1. Make sure we have at least as many trampolines as methods we want to swizzle
    • In our setup, that simply means adding more call instructions (or generating them in a precompile stage)
  2. Make sure we have enough storage for our method metadata
    • Again for our setup, this is simply updating the swizzledMethods array length
  3. Find all the methods we want to swizzle
    • This part is more interesting!
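For the first item, writing 75,000 call instructions by hand is nobody's idea of fun. One way to handle the precompile stage is a tiny generator that emits the assembly source; the sketch below is one possible shape for it, writing into a buffer so it is easy to check. `generate_trampolines`, `HEADER`, and `LINE` are hypothetical names, and a real build step would write the result to a .s file instead.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Text emitted once at the top, then once per trampoline */
static const char HEADER[] =
    ".globl  _trampoline_start\n"
    "_trampoline_start:\n";
static const char LINE[] = "call    _trampoline_target\n";

/* Writes assembly source for `count` trampolines into `out`.
   Returns the number of characters written (excluding the
   terminating NUL), or -1 if the buffer is too small. */
static int generate_trampolines(char *out, size_t capacity, int count) {
    assert(out != NULL && count >= 0);

    int n = snprintf(out, capacity, "%s", HEADER);
    if (n < 0 || (size_t)n >= capacity) return -1;
    size_t used = (size_t)n;

    for (int i = 0; i < count; i++) {
        n = snprintf(out + used, capacity - used, "%s", LINE);
        if (n < 0 || (size_t)n >= capacity - used) return -1;
        used += (size_t)n;
    }
    return (int)used;
}
```

The count passed to the generator and the length of the swizzledMethods array need to stay in sync, so it makes sense to derive both from the same constant.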

Finding All UIKit Methods

If our goal is to swizzle all of UIKit, the easiest option is probably to find all Objective-C methods in the app, filter down to ones in UIKit, and then swizzle them all.

We can define a function to help with this goal, which will take a Class parameter and return some unique information about the framework/image that it is defined in. I've opted for the image's base address specifically:

@import Darwin;
/*
 Returns the base address of the framework/image
 associated with the given class.

 This is just the same starting address you might see
 in the Binary Images section of a crash report,
 or in the `image list` output of `lldb`.
 */
void *framework_address_for_class(Class class) {
    struct dl_info info;
    dladdr((__bridge const void *)class, &info);
    return info.dli_fbase;
}

Note: You could also have this function return the image's name and do a string comparison instead. But that seems unnecessary, and it also would force us to face the fact that all classes we care about actually live within the private UIKitCore framework rather than UIKit itself, a fact that the title of this article is happily ignoring.

Now, we can use a known class to get a base address to compare against, use it to find all classes that live in UIKit, and swizzle every method on each:

// Forward declaration (defined below)
void swizzle_class(Class class);

// Swizzles all methods in UIKit
void swizzle_uikit_classes() {
    // Get UIKit[Core]'s base address for comparison
    void *uikitBaseAddress = framework_address_for_class(
        [UIView class]
    );

    // For every class...
    unsigned int classCount = 0;
    Class *classes = objc_copyClassList(&classCount);
    for (unsigned int i = 0; i < classCount; i++) {
        Class class = classes[i];

        // Check if it's in the same image as UIView...
        void *classAddress = framework_address_for_class(class);
        if (classAddress != uikitBaseAddress) {
            continue;
        }

        // And swizzle all its methods if so
        swizzle_class(class);
    }

    free(classes);
}

// Swizzles all methods on a given class
void swizzle_class(Class class) {
    // For every method...
    unsigned int methodCount = 0;
    Method *methods = class_copyMethodList(class, &methodCount);
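    for (unsigned int i = 0; i < methodCount; i++) {
        // Call our previously-defined
        // `swizzle_method` function
        Method method = methods[i];
        swizzle_method(class, method);
    }

    // class_copyMethodList returns a malloc'd
    // array that the caller must free
    free(methods);
}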

And with that, one call to swizzle_uikit_classes() and we're now logging every call to UIKit!

@implementation UIView (Swizzle)

+ (void)load {
    swizzle_uikit_classes();
}

@end

/* Eventually Prints:
    -[_UIApplicationConfigurationLoader _init]
    -[_UIApplicationConfigurationLoader startPreloadInitializationContext]
    -[_UIApplicationConfigurationLoader usesLocalInitializationContext]
    ...
    -[UIEventFetcher init]
    -[UIEventFetcher setupThreadAndRun]
    -[UIEventFetcher threadMain]
    ... */

Performance

The above setup takes about… 40 seconds on my machine. Hey, not bad for nearly 75,000 methods!

Luckily, the slowness is fairly easy to fix. Almost all of the delay comes from the calls to method_setImplementation, which acquires the Objective-C runtime lock, flushes caches, and more.

This logic is especially important when the app is already up-and-running — it allows us to safely change method implementations with new calls still behaving as expected — but it doesn't buy us much benefit in the highly-constrained, single-threaded circumstances of +[UIView load]. We can skip this logic by updating the method ourselves:

// Before
method_setImplementation(
    method,
    nextTrampoline
);

// After (see objc/runtime.h for `method` structure)
long *imp = (void *)method + 0x10;
*imp = (long)nextTrampoline;

Note: With a bit of refactoring, you should be able to use class_replaceMethodsBulk instead to get similar levels of performance without having to hardcode this offset, as long as you're targeting iOS 12.0 or newer.

This brings our startup swizzling time down to under a second. Performance can likely be further improved, with dladdr being a good target to start with; but this is enough of an improvement for me.
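As for where the 0x10 comes from: it is the offset of the third pointer-sized field in a struct of three, on a 64-bit target. The sketch below models that with a stand-in struct, assuming the classic name / types / implementation layout; `method_model` and `set_imp_by_offset` are hypothetical, and this is not the runtime's own definition. Some runtime versions may encode method lists differently, so treat the hardcoded offset as an assumption to verify against the runtime you are targeting.

```c
#include <assert.h>
#include <stddef.h>

typedef void (*Implementation)(void);

/* Stand-in for the assumed three-field method layout:
   selector, type encoding, implementation pointer */
struct method_model {
    const void *name;
    const char *types;
    Implementation imp;
};

static void original_imp(void) {}
static void trampoline_imp(void) {}

/* Overwrites the implementation pointer by computing its address
   from the struct's base, with the offset taken from offsetof
   rather than a hardcoded constant */
static void set_imp_by_offset(void *method, Implementation replacement) {
    assert(method != NULL);
    Implementation *slot = (Implementation *)
        ((char *)method + offsetof(struct method_model, imp));
    *slot = replacement;
}
```

On a 64-bit target, `offsetof(struct method_model, imp)` is two pointers in, i.e. 16 bytes or 0x10, which matches the constant used above.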


Swizzling n Methods

More?

So far, we have been able to take advantage of knowing exactly how many methods we'd need to swizzle, and have used that knowledge to adjust our number of trampolines and our method storage.

This is all you need to swizzle an entire framework, and is all I've spent time implementing locally — but if you have a use case in mind where you don't know that information upfront, or creating an appropriate number of trampolines at compile time is impractical (like if you are trying to swizzle many frameworks) I'd recommend reading this article on imp_implementationWithBlock by Landon Fuller.

imp_implementationWithBlock uses trampolines as well (though for a different purpose) but does not know the required number upfront — so it essentially adds more trampolines at runtime by memory-mapping existing ones to a new location. The setup is a bit more difficult; trampolines can no longer be trivially looked up by their location in memory, and mapping our existing trampoline implementations to a new location is not enough for them to function properly; but if implemented correctly, it does allow you to create as many or as few trampolines as you need.

Finally, if you're trying to swizzle literally everything — consider looking at ways to simply hook objc_msgSend instead!

