**>> Pokémon GO - REVISITING THE "HACKING" SCENE (PART 6)**
*c* ← *head*(*P*) **while** *c* ≠ Λ **do** **if** *ok*(*P*,*c*) **then** *dump*(*P*, *c*) *c* ← *next*(*P*,*c*) **end while**

I made an ambitious promise to extend the efforts from my
previous entries
to demonstrate how to capitalize on the fact the hashing algorithm within
Pokémon GO remains effectively the same and the only difference
is a series of magic numbers it utilizes with every API update. Once we
have a known set of 32-bit register constants determines; we can use a
brute-force style approach (exhaustive search) to find out which ones are part
of the magic numbers.

So; what exactly does "brute-force" mean in the context of computing?

While a brute-force search is simple to implement, and will always find a
solution if it exists, its cost is proportional to the number of candidate
solutions – which in many practical problems tends to grow very quickly as
the size of the problem increases.
source: wikipedia

Well; let's put that into context of what we are trying to do here.

Let's say we need to find two values from 50 potential candidates
- if we were to test every single value in every single position; we would
to check a total of 2,500 (50*50) possible combinations. Let's extend it
to four values, then there would be 6,250,000 (50*50*50*50) possible
combinations. It is quite obvious how this can be quite a CPU intensive
process even with simple numbers.

Pokémon GO has sixteen magic numbers and two finalization numbers
(`uint64_t`) and a nice big round magic (`uint128_t`).
This is effectively 40 32-bit numbers that can have a value between 0 and
4,294,967,296. If you were to brute-force this without any way of filtering
our some values - you will need more than a few generations of family to do so
- in otherwise, forget it.

You would probably have more chance cracking an encryption algorithm like RSA
- also difficult.

But; wait - didn't we implement some tracing algorithms in the Unicorn
CPU emulation to identify potential candidates for these values? Absolutely;
but, even if we reduce the number of options down to a few hundred;
a true brute-force algorithm would still take months, possibly years to test
every combination; but then, the game will definitely have a new hash function
or magic numbers.

So; let's investigate - is it even worth attempting this feat?

After closer investigation of the algorithm; it is clear that there is a
way to execute the hash function without it even loading the sixteen
magic numbers - by simply passing an empty buffer with a length of zero
bytes. In fact, the algorithm has early-exit logic we can exploit:

__uint128_t hash_chunk(const uint8_t *chunk, int64_t size)
{
int i;
__uint128_t hash = 0;
for (i = 0; i < 8; i++) {
int offset = i * 16;
** if (offset >= size) {
break;
}
// CODE NEVER GETS HERE **
uint64_t a = read_int64(chunk + offset);
uint64_t b = read_int64(chunk + offset + 8);
hash += (__uint128_t) (a + magic_table[i * 2]) *
(__uint128_t) (b + magic_table[i * 2 + 1]);
}
return hash << 2 >> 2;
}

At least; that is the theory - after checking against known numbers in
the magic table; we could confirm that none of them were ever loaded
into memory, this effectively makes our job a little easier. Instead
of needing to search for forty 32-bit numbers, we only need to find
eight; which will represent the `ROUND_MAGIC` and
the `FINAL_MAGIC0` and `FINAL_MAGIC1` constants.

While this may seem like a breakthrough, picking eight numbers from
a few hundred is still a very daunting task that is going to take a
lot of processing power to pull off, so; we need to find other ways to
help filter out combinations that do not make sense. This is where it
was time to look at the constants being detected; and see if we can
restrict the ones we test against.

Firstly; I need a list of candidates - which we can extract from our log file.

$ ./pogohash > x.log
$ grep LOAD x.log | cut -f4 -d\ | wc -l > x.candidates
$ grep LOAD x.log | cut -f4 -d\ >> x.candidates

This effectively give us a file where the first line contains the number
of candidates we have, and each subsequent line in the file has a number
we need to test against. With an empty test vector of length zero bytes;
we detect `131` 32-bit register loads. Doing a brute-force attack
on all of these would still be `2.01E+118` (yes, 2 with 118 zeros)
combinations. Still out of the question.

It was time to go back to the algorithm, and study the known patterns.

After extensive analysis; I was able to determine that there was in
fact a pattern with how the magic numbers were being loaded. In essence;
there were a few non-related constant loads (obfuscation related), then
`ROUND_MAGIC`, a few more non-related, `FINAL_MAGIC0`,
a few more non-related and `FINAL_MAGIC1`. A nice little
sentinel value of `0xfffffefe` is used, which we know happens
after the magics numbers are used.

So; what does this mean?

Well; lets say we implemented the brute-force as a series of nested
for-loop's, with global constants named `i0, i1, i2, i3, i4, i5,
i6,` and `i7` - we could define the following:

ROUND_MAGIC = 128-bit based on (val[i0], val[i1], val[i2], val[i3])
FINAL_MAGIC0 = 64-bit based on (val[i4], val[i5])
FINAL_MAGIC1 = 64-bit based on (val[i6], val[i7])

Since we know the order in which these constants are loaded in the file;
we can also apply simple filtering rules to the brute-force algorithm
that imply that `i6,i7` will be greater than `i4,i5` which
will be greater than `i0,i1,i2,i3` - of course; this only holds
true if we do not change the order of the candidate values while
processing. Since we know a sentinal value; we can also restrict the
`max` value. But what about the `min` value? For a while;
I was scratching my head over this.

Then voila; brain fart!

Since the `pogohash` logger was an instruction tracer - we could
use it to our advantage. Long story short; but it was possible to compare
the output when using a zero length buffer and compare it to a buffer
with at least one byte to isolate loading of the magic number table.
From this; I was able to pin-point an address and isolate the first
32-bit load - which just happened to be `FINAL_MAGIC1`.

To give my brute-force algorithm a test; I decided to remove logging and
run against known `min` and `max` values - boy; was I
shocked at how quick it was able to isolate the right values.

$ time ./pogohash-brute
PogoHash-brute #1
-----------------
x.candidates
total: 131
min: 57 <-- known value
max: 106 <-- known value
:: Hash(buffer, sizeof(buffer) [iOS code]
:: 58 57 62 61 99 100 105 106
len= 0
0xeebb9c527a4d5f53
ROUND_MAGIC_HI = 0xe3f0d44988bcdfab
ROUND_MAGIC_LO = 0x081570afdd535ec3
FINAL_MAGIC0 = 0xce7c4801d683e824
FINAL_MAGIC1 = 0x6823775b1daad522
done
**real 0m2.078s**
user 0m2.076s
sys 0m0.000s

Yes, you read that correctly - `2.078` seconds! But that was based
on knowing `min` and `max`.
we kn

$ time ./pogohash-brute
PogoHash-brute #1
-----------------
x.candidates
total: 131
min: 57 <-- known value (based on diff)
max: 119 <-- number before 0xfffffefe sentinal
:: Hash(buffer, sizeof(buffer) [iOS code]
:: 58 57 62 61 99 100 105 106
len= 0
0xeebb9c527a4d5f53
ROUND_MAGIC_HI = 0xe3f0d44988bcdfab
ROUND_MAGIC_LO = 0x081570afdd535ec3
FINAL_MAGIC0 = 0xce7c4801d683e824
FINAL_MAGIC1 = 0x6823775b1daad522
done
**real 57m0.064s**
user 57m2.452s
sys 0m0.000s

Running it based on the assumption we knew nothing about the
`min` and `max` positions of the candidate values -
and we are able to brute-force the calculation of the non magic
table numbers in less than one hour. Not bad for what was originally
considered an impossible feat. This whole investigation did take
a number of hours to figure out; and I have plenty of ideas on
how to obtain the remainding values - best part is; we can confirm
these values and focus on the rest.

**UPDATE: 2016-11-21**

After further investigation; I was able to isolate a number of 32-bit
integer loads as part of the code obfuscation process - these could then
be excluded from the list of candidates. These instructions are typically
followed by an `add rX, pc` instruction - effectively building
a series of jump tables to be used later. The original 131 entries can
then be filtered down to 61 after applying this logic.

$ time ./pogohash-brute
PogoHash-brute #1
-----------------
x.candidates
total: 61
min: 13 <-- known value (based on diff)
max: 58 <-- number before 0xfffffefe sentinal
:: Hash(buffer, sizeof(buffer) [iOS code]
:: 13 12 17 16 41 42 47 48
len= 0
0xeebb9c527a4d5f53
ROUND_MAGIC_HI = 0xe3f0d44988bcdfab
ROUND_MAGIC_LO = 0x081570afdd535ec3
FINAL_MAGIC0 = 0xce7c4801d683e824
FINAL_MAGIC1 = 0x6823775b1daad522
done
**real 4m55.084s**
user 4m55.324s
sys 0m0.000s

This is a reduction in brute-force execution from almost an hour down to
five minutes - *boom*.