Removing Randomness with LLDB


Bryce Bostwick

 •  •  • 

Let’s say you’re debugging a third-party app that has some code like this:

let data1 = randomData(length: 8)
let data2 = randomData(length: 8)
let data3 = randomData(length: 16)
doSomething(data1, data2, data3)

Sure, the code doesn’t really look like that — those calls are spread miles apart in the binary, with at least eight layers of abstraction in between them — but the end result is exactly the same. The app generates random data in a few different places, and that random data is all fed into some larger system.

You could be trying to figure out how an animation works, or examining an edge case in a video game, or trying to break some client-side cryptography (🙋).

But you’re debugging, and the randomness is getting in the way. Debugging benefits from consistency, and randomness is your enemy.

Let’s get rid of the randomness accordingly — all from lldb.


Our Test Program

First, let’s write a quick program that we can use for testing. It creates some random data values and prints them out.

#import <Foundation/Foundation.h>

// Creates an NSData instance of a given
// length filled with random data
NSData* randomData(NSUInteger length) {
    // Create a mutable data instance
    NSMutableData* data
        = [NSMutableData dataWithLength:length];

    // Fill the data with random `int` values
    uint32_t* bytes = data.mutableBytes;
    for (NSUInteger i = 0; i < length / sizeof(int); i++) {
        bytes[i] = arc4random();
    }

    return data;
}

// Print out an array of NSData values
void printDatas(NSArray<NSData *>* datas) {
    for (NSData* data in datas) {
        NSLog(@"%@", [data base64EncodedStringWithOptions:0]);
    }
}

// Our main function:
// Create some values and print them out!
int main(int argc, const char * argv[]) {
    @autoreleasepool {
        NSData* data1 = randomData(8);
        NSData* data2 = randomData(8);
        NSData* data3 = randomData(16);

        printDatas(@[data1, data2, data3]);
    }
    return 0;
}

Running this locally, I get a result like:

B9aD3b0xxoM=
V+DfMmixVEw=
zq8LseN/VR4GBAjSMlEtIQ==

Though of course, the output will change every time!

Now that we have some example randomness, let’s work on getting rid of it through the debugger.

Fixed Size, Fixed Data

I’ve talked about lldb’s thread return command a bit in a previous post — it lets you exit a method early and return any value you’d like. Combined with auto-continuing breakpoints, it’s a super quick & easy way to provide a mock return value to a given method.

We can use thread return to effectively make it such that every time someone calls our randomData function, they automatically get the same value back. We’ll do this by setting a breakpoint on randomData, instructing the debugger to immediately return a fixed value when that breakpoint is triggered, and then auto-continue execution.

// Set a breakpoint...
breakpoint set

  // On the function named `randomData`
  -n "randomData"

  // Command to run when the breakpoint is hit:
  // Create an `NSMutableData` instance
  // of length `8` filled with zeroed bytes.
  --command "thread return
    [NSMutableData dataWithLength:8]"

  // Automatically continue once the
  // breakpoint has been hit, rather
  // than waiting for us to continue
  // execution manually
  --auto-continue true

lldb doesn’t love multiline commands (let alone comments), so here’s a pastable version of the above:

breakpoint set -n "randomData" --command "thread return [NSMutableData dataWithLength:8]" --auto-continue true

To use this command, we need to launch our program, pause execution in the debugger (if you’re running from Xcode, just setting a breakpoint on main is easiest), and then simply enter the command into the debugger.

Setting the breakpoint while stopped in main

Note: If you’re running this in Xcode, you may need to disable backtrace recording in order to get debugger commands working — though this is not an issue you would run into when debugging a released app.

Now we can resume execution, and see our new output!

AAAAAAAAAAA=
AAAAAAAAAAA=
AAAAAAAAAAA=

Perfect! No randomness there at all!
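If you’re wondering why zeroed bytes show up as a run of A’s: base64 maps each 6-bit group to one character, and the all-zero group maps to A. Here’s a minimal C sketch of a padded base64 encoder to check that. Note that base64_encode is a hypothetical helper written for this illustration, not something Foundation provides (our test program uses base64EncodedStringWithOptions: for the real thing).

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

// Standard base64 alphabet: index 0 is 'A', so zero bits encode as 'A'
static const char kAlphabet[] =
    "ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789+/";

// Hypothetical helper (illustration only): encodes `length` bytes
// from `input` into `output` as padded base64.
// `output` needs room for 4 * ((length + 2) / 3) + 1 chars.
// Returns the number of characters written, excluding the terminator.
size_t base64_encode(const uint8_t *input, size_t length, char *output) {
    assert(input != NULL && output != NULL);

    size_t out = 0;
    for (size_t i = 0; i < length; i += 3) {
        size_t remaining = length - i;

        // Pack up to three bytes into one 24-bit chunk
        uint32_t chunk = (uint32_t)input[i] << 16;
        if (remaining > 1) chunk |= (uint32_t)input[i + 1] << 8;
        if (remaining > 2) chunk |= (uint32_t)input[i + 2];

        // Emit four 6-bit groups, padding with '=' past the end of input
        output[out++] = kAlphabet[(chunk >> 18) & 0x3F];
        output[out++] = kAlphabet[(chunk >> 12) & 0x3F];
        output[out++] = remaining > 1 ? kAlphabet[(chunk >> 6) & 0x3F] : '=';
        output[out++] = remaining > 2 ? kAlphabet[chunk & 0x3F] : '=';
    }
    output[out] = '\0';
    return out;
}
```

Feeding this 8 zeroed bytes should give eleven A’s followed by a single =, which lines up with the output above.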

Depending on your use case, something like this might actually be fine; but it does lead to two potential issues:

  1. We’re now disregarding the requested length of data — remember that our test code is requesting data of length 8, then 8, then 16. Returning a fixed length might be OK if that length is long enough, but this seems likely to cause issues.
  2. Our data doesn’t look random — we don’t want all zeroes (or A’s, in the base64 case) — what we want is random-looking but consistent data.

Let’s fix both of those!

Dynamic Size, Fixed Data

We hardcoded the above example to return data of length 8 — let’s instead derive that length from the data passed into the method.

If we set a breakpoint on our randomData function — a normal breakpoint, not our fancy command one from above — and re-run the program, we can access the given length value by inspecting $arg1, which represents the first (and in our case, only!) argument passed into the method:

NSData* data1 = randomData(8);

// breakpoint on `randomData`
(lldb) p $arg1
(unsigned long) $0 = 8

Note: Finding Argument Values

In our test case, we access the length value with $arg1, but how you’d find an equivalent value in other cases will depend on how the function was defined (and in some cases, how it was implemented!).

If this were a method on an Objective-C object (like a method called randomDataWithLength: in some RandomHelper class) we would need to access $arg3 instead — since for Obj-C methods, $arg1 will be self, $arg2 will be _cmd, and then $arg3 will be our first “real” argument.
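To see why the first “real” argument lands in $arg3, it can help to picture the method implementation as a plain C function with two extra leading parameters. This is only a sketch: SketchObject, SketchSelector, and randomDataWithLength_impl are made-up stand-ins (the real runtime types are id and SEL), and it hands back the length instead of building any data.

```c
#include <assert.h>
#include <stddef.h>

// Stand-in types for this sketch only; the Objective-C
// runtime uses `id` and `SEL` here.
typedef struct { const char *class_name; } SketchObject;
typedef const char *SketchSelector;

// Roughly the shape of an implementation for a method like
// -[RandomHelper randomDataWithLength:] at the C level:
//   1st parameter ($arg1): self
//   2nd parameter ($arg2): _cmd
//   3rd parameter ($arg3): the first "real" argument
size_t randomDataWithLength_impl(SketchObject *self,
                                 SketchSelector _cmd,
                                 size_t length) {
    // A method implementation is always invoked with a receiver
    assert(self != NULL);
    (void)_cmd;

    // Return the length so the argument position is easy to check
    return length;
}
```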

Swift’s calling conventions get much more complicated, and lldb will likely prevent you from using the $arg variables altogether.

For a basic Swift function, you may just be able to inspect a register directly (like $X0 for the first argument on an ARM-based machine), but that can change quickly based on a number of factors — including whether the method implementation references self! Finding this value will be a larger battle in Swift, but should always be possible with enough digging.

Updating Our Breakpoint Command

Currently in our breakpoint command, we’re hard-coding our data length to be 8:

thread return [NSMutableData dataWithLength:8]

Since we know the actual length we want is accessible via $arg1, in a perfect world, we could simply swap out that length argument and be done:

thread return [NSMutableData dataWithLength:$arg1]

This command is evaluated in the context of our randomData function, so $arg1 should have the correct value.

However, if you try this, you’ll likely run into an error:

error: Aborting reading of commands after command #0:
'thread return [NSMutableData dataWithLength:$arg1]'
failed with error: Error evaluating result expression:
error: Couldn't apply expression side effects :
couldn't dematerialize register x0 without a stack frame

First, let’s take a moment to acknowledge that the string couldn't dematerialize register doesn’t return any search engine results outside of one link to lldb’s source code itself — which means we must be doing something fun here! Hooray for us.

Without knowing the full context here, some quick speculation on my part: $arg1 is (in my case, on an ARM-based machine) effectively a helper to access register X0, where the first argument is stored while calling (most) functions. You’ll see the same error if you try to access the value through $X0 directly.

lldb knows that we want to read the value of this register, but in order to make the method call we’re requesting (+[NSMutableData dataWithLength:]), lldb also has to change the value of that register, since that method has its own set of arguments, the first of which has to be stored in X0.

These two goals are slightly at odds; ideally, lldb could store the existing value of X0 on the stack, but as the error message notes, it does not have a stack frame available with which to do so. It ends up erroring out as a result.

If that is indeed the issue, we can work around it by breaking our command into two, one to read the value of $arg1 and another to use it:

// Create a `$length` variable
// that we will set later
e int $length;

// Set a breakpoint...
breakpoint set

  // On the function named `randomData`
  -n "randomData"

  // First command — update `$length`
  // to the value in $arg1
  --command "e $length = $arg1"

  // Second command — create an
  // `NSMutableData` instance with
  // `$length` bytes
  --command "thread return
    [NSMutableData dataWithLength:$length]"

  // Automatically continue once the
  // breakpoint has been hit, rather
  // than waiting for us to continue
  // execution manually
  --auto-continue true

We now have two top-level debugger commands: one to create our length variable, and one to set our breakpoint. And the breakpoint itself has two commands attached to it: one to set our length variable, and then one to perform the actual data creation.

It’s a slightly more annoying setup due to a couple of lldb design choices (breakpoint commands not sharing a variable scope, and new persistent variable declarations not overwriting old ones; both are reasonable choices, just not helpful for us here).

As before, here’s a pastable version:

e int $length;
breakpoint set -n "randomData" --command "e $length=$arg1" --command "thread return [NSMutableData dataWithLength:$length]" --auto-continue true

By pasting in the above commands at the start of the program, then continuing execution so that our random data is created & printed out…

AAAAAAAAAAA=
AAAAAAAAAAA=
AAAAAAAAAAAAAAAAAAAAAA==

We are now properly recognizing the length values of our different calls — 8, 8, and 16. Perfect!
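A quick sanity check on those lengths: padded base64 produces 4 characters for every 3 bytes, minus one byte for each trailing =. Here’s a small C sketch of that arithmetic. The base64_decoded_length helper is hypothetical, and it assumes well-formed, padded input.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

// Hypothetical helper (illustration only): returns how many bytes
// a padded base64 string decodes to, or 0 if the string's length
// is not a positive multiple of 4.
size_t base64_decoded_length(const char *encoded) {
    assert(encoded != NULL);

    size_t count = strlen(encoded);
    if (count == 0 || count % 4 != 0) {
        return 0;
    }

    // Each trailing '=' stands in for one missing byte
    size_t padding = 0;
    if (encoded[count - 1] == '=') padding++;
    if (encoded[count - 2] == '=') padding++;

    // 4 characters carry 3 bytes
    return (count / 4) * 3 - padding;
}
```

The 12-character lines above (one = each) work out to 8 bytes, and the 24-character line (ending in ==) works out to 16.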

Dynamic Size, Dynamic Data

Now that we have access to the correct length value for each call, it’s not a huge jump to return whatever data we want.

We’ll just need to update our current data creation code:

[NSMutableData dataWithLength:$length]

to something more advanced.

One of the easiest ways to do this is to simply append a bunch of randomly-generated ints to our data until we have reached the desired number of bytes, similar to what our test implementation already does.

Better yet — if we use a random number generator that we control the seed of (like rand via srand), we can ensure that we get the same exact result each time; giving us the random-looking but consistent data that we hoped for.

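// Seed `rand` with any
// fixed value we want
srand(0);

// Create an empty data instance
// that we can append to
NSMutableData* data
    = [NSMutableData dataWithCapacity:$length];

// For each `int`-sized chunk...
for(int i = 0; i < $length / sizeof(int); i++) {
    // Get a random value
    int randomInt = (int)rand();

    // Add it to the data
    [data appendBytes:(void*)&randomInt
               length:sizeof(int)];
}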

This is also quite nice in that we can change the seed value if we want to investigate a different result — you can find a seed value that works for what you need to debug, and then keep using it across launches.
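The same idea can be tried outside the debugger. Below is a rough C sketch, assuming a “launch” that seeds once and then requests 8, 8, and 16 bytes like our test program does; fill_from_rand and simulate_launch are names invented for this illustration. The exact bytes depend on the platform’s rand implementation, so repeatability is the only point here.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

// Hypothetical helper: fills `buffer` with values from `rand`,
// one `int`-sized chunk at a time. Any trailing bytes that do
// not make up a whole `int` are left untouched.
void fill_from_rand(unsigned char *buffer, size_t length) {
    assert(buffer != NULL);

    for (size_t i = 0; i < length / sizeof(int); i++) {
        int value = rand();
        memcpy(buffer + i * sizeof(int), &value, sizeof(int));
    }
}

// Hypothetical helper: simulates one launch by seeding once,
// then producing the 8-, 8-, and 16-byte values back to back
// into a 32-byte output buffer.
void simulate_launch(unsigned int seed, unsigned char output[32]) {
    srand(seed);
    fill_from_rand(output, 8);
    fill_from_rand(output + 8, 8);
    fill_from_rand(output + 16, 16);
}
```

Calling simulate_launch twice with the same seed should fill both buffers with identical bytes, since rand repeats its sequence whenever srand is given the same seed.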

Let’s update our lldb commands to include the above setup. Some of our breakpoint commands are getting quite long, but they still work. I’ve added some additional spacing here to keep things somewhat readable.

// Create a `$length` variable
// that we will set later
e int $length;

// Create a `$mockData` variable
// that we will set later
e NSMutableData *$mockData; 

// Seed the `rand` function with
// a set seed. We can change this
// to get different results on
// different runs
e srand(0);

// Set a breakpoint...
breakpoint set

  // On the function named `randomData`
  -n "randomData"

  // First command — update `$length`
  // to the value in $arg1
  --command "e $length=$arg1"

  // Second command - update `$mockData`
  // to a new instance of random data
  --command "e
    // Create a new NSMutableData instance
    $mockData = (NSMutableData *)
      [NSMutableData dataWithCapacity:$length];

    // For each `int`-sized chunk...
    for(int _i = 0; _i < $length / sizeof(int); _i++) {
      // Get a random value
      int _randomInt = (int)rand();

      // Add it to the data
      (void)[$mockData
        appendBytes:(void*)&_randomInt
        length:sizeof(int)
      ];
    }"

  // Third command — return our
  // new data instance
  --command "thread return $mockData"

  // Automatically continue once the
  // breakpoint has been hit, rather
  // than waiting for us to continue
  // execution manually
  --auto-continue true

And again, a pastable version:

e int $length;
e NSMutableData *$mockData;
e srand(0);
breakpoint set -n "randomData" --command "e $length=$arg1" --command "e $mockData = (NSMutableData *)[NSMutableData dataWithCapacity:$length]; for(int _i = 0; _i < $length / sizeof(int); _i++) { int _randomInt = (int)rand(); (void)[$mockData appendBytes:(void*)&_randomInt length:sizeof(int)];}" --command "thread return $mockData" --auto-continue true

If I run these lldb commands at the start of our test app, and then continue execution, I now get the same output every time:

Qs4MH/teuQE=
n7EKMeBWEzU=
WaesCIHIHn9azv09gZb2YQ==

and, if you’re following along on your machine (and your rand implementation is equivalent to mine) — you probably do too!

Whatever values you get, you’ll see them printed every time you run the app with these debugger commands; and if you need different random outcomes, you can simply change the seed passed to srand.

Even more fun, you can create a breakpoint on main that executes these four debugger actions, then automatically continues, and now every execution of the program will have the same un-randomized behavior without you having to lift a finger.

Wrapping Up

This technique takes a bit of work to set up, but it can be super useful when debugging anything that has an inherently random aspect to it — and the example above usually only requires minor tweaks to be adapted to all sorts of situations.

Anyways — back to trying to break some encryption!