Hideous, obfuscated IDs

Sun, 03.04.2011 obfuscation

When you are developing a public online application where your users have to type in URLs and the like to visit it, you most probably don’t want them know how many users, posts or whatever items there are in your app. But our Rails app is giving all this information away without any discretion. This means that the default behavior regarding to object ids is not very stakeholder friendly and we will have to take care of that ourselves.

In the app that we are currently developing in our startup, we share this perspective: we want to obscure the ids of our objects in front of the outside world, but we still have to allow our users to fetch our resources. None of the solutions that we found (we might have not searched that well, condoned) seemed good enough: either too complicated, too slow, or too insecure, so we had to roll our own.

The requirements are:
- Simple to implement: one or two lines of code in each affected model and controller should be sufficient. We mind more about ease and speed, although beauty would be a plus.
- Recoverable: Given the obfuscated id, we know how to get our original one back!
- Not breakable: if an attacker creates several resources consecutively in time and analyzes the resulting IDs, he should not be able to get the original IDs back. The obfuscated IDs must not be consecutive or follow any recognizable pattern.
- Unobtrusive: the application should not know much about it (not that much in this post on this subject).

This is what I have come up with until now, which seems to be new, for what I know, although built on previous ideas.

First, we need some kind of library to perform the transformation. Actually a very simple one can do. Put this under /extras/hideous.rb:

1
2
3
4
5
6
7
8
9
10
11
12
13
14
class Hideous
  MAXID = 2147483647  # (2**31 - 1)
  PRIME = #a very big prime number of your choice (like 9 digits big)
  PRIME_INVERSE = #another big integer number so that (PRIME * PRIME_INVERSE) & MAXID == 1
  RNDXOR = #some random big integer number, just not bigger than MAXID

  def self.hide(bare_integer)
    (((bare_integer * PRIME) & MAXID) ^ RNDXOR).to_s(16)
  end

  def self.bare(hide_integer)
    ((hide_integer.to_i(16) ^ RNDXOR) * PRIME_INVERSE) & MAXID
  end
end

How to calculate the proper PRIME and its PRIME_INVERSE? On that there are a couple of answers out there. You can use Scott’s PRIME and INVERSE, but that’s not very secret anymore, so you should go with your own prime number. Where do I get a big one?, you might be asking yourself. Well, you can calculate it yourself, but I suggest you to go faster by grabbing one from the list of the first fifty million primes. Afterwards you need to calculate its inverse (always relative to MAXID, but indeed not a prime number itself). This is a one timer, but none of us wants to do the math (that’s what computers were invented for!). Happily, I already took care of it and offer you a fast solution here for free. Seed your chosen prime into the last line of the following code (I adapted both the modinv function and the egcd function from their respective original Python sources). Simply open an irb session and paste the following in (remember to define your PRIME and MAXID):

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
# Modular inverse
# Returns x so that ax ≡ 1 (mod m)
def modinv(a, m)
    x, y, g = egcd(a, m)
    if g==1
      return x%m
    else
      return nil
    end
end

# Extended great common divisor
# Returns x, y and gcd(a,b) so that ax + by = gcd(a,b)
def egcd(a, b)
  x, x1, y, y1 = 1, 0, 0, 1
  while b!=0
    q = a/b
    x, x1 = x1, x-q*x1
    y, y1 = y1, y-q*y1
    a, b  =  b, a-q*b
  end
  return x, y, a
end

prime_inverse = modinv(PRIME,MAXID+1)

you surely get what you are looking for. Proof that it went well doing:

1
2
(PRIME * PRIME_INVERSE) & MAXID == 1
=> true

And what’s the point of all this, actually? What are we getting here? Well, let’s go back to our hide function. We could decompose it into three steps:

1
2
3
4
5
def self.hide(bare_integer)
  knuth_hash = bare_integer * PRIME & MAXID
  unleashable_hash = knuth_hash ^ RNDXOR
  shorter_hash = unleashable_hash.to_s(16)
end

The first one, bare_integer * PRIME & MAXID, is called the Knuth’s integer hash. It has the very important advantage that it shouldn’t produce collisions (i.e. two different inputs would never give the same output). But it also has a downside: if you compare several consecutively generated obfuscated numbers, you can extract a pattern and therefore break the illusion created by the hash. To avoid that, I have devised a second step in the algorithm (this is my real contribution here, all the rest being blows and whistles), knuth_hash ^ RNDXOR, which should make its result virtually impossible to unleash. Hidden, obfuscated, befuddled. It is actually a very easy concept, just make a bit XOR comparison of the number with another invented one. The beauty of it is that you just have to repeat it to get the original number back:

1
2
hashed_integer = original_integer ^ RNDXOR
original_integer  =  hashed_integer ^ RNDXOR

Besides, it is a very fast calculation which introduces almost no delay in the chain. The third step is an optional one. If you employ the suggested algorithm and appropriate big numbers, you should be getting 10 digit long ids back, which might be quite a length. In case you want to shorten them, you can make them be displayed in base 16, which will make them 8 digit long (or even base 32, making them 6 digit long, it is up to you).

So, all requirements are met: recoverability, unobtrusiveness and unbreakability (well, some very very brute force attack could unveil our original ids, but none of us mortals has to worry about that, since being a victim of such attacks would mean that we have become big numbers).

Modify your /config/application.rb. so that the hideous.rb file gets loaded by Rails at server startup:

1
config.autoload_paths += %W(#{config.root}/extras)

Now we are ready to start using our Hideous class to obscure our ids, like so (that goes for the last requirement, easiness):

1
2
hidden_id = Hideous.hide(plain_id)
plain_id = Hideous.bare(hidden_id)

Yes, we could have utilized it in some other way. But that’s another discussion, and there are different alternatives about how to integrate such an obfuscating library into our Rails app.

Update
You might be wondering about what a possibly good looking RNDXOR is. As the name itself expresses, any random integer not bigger than MAXID is alright, since we will be using it for a simple XOR bit operation. But, if you want to be really really random by not choosing the number by yourself, I have a small suggestion for you. In your irb:

1
2
require 'active_support/secure_random'
ActiveSupport::SecureRandom.hex(4).to_i(16) & MAXID

That way you avoid introducing any bias in the randomness of your RNDXOR.

1 comment