# Lattices and the Hidden Number Problem

The Hidden Number Problem was introduced by Boneh and Venkatesan to show that computing part of a Diffie Hellman secret is as hard as computing the entire secret. There are many variants of the HNP, generally defined with an oracle that yields partial information about some secret value and an experiment in which an adversary succeeds by efficiently discovering this hidden value. Hidden number problems are interesting because such oracles are practical and often realizable as side channel attacks that leak partial information about a secret.

## Textbook RSA

Textbook RSA should be pretty familiar. To encrypt a message $m$ we have

```
m = 0xdeadbeef
p = random_prime(2^512)
q = random_prime(2^512)
n = p*q
e = 65537
d = inverse_mod(e, (p-1)*(q-1))
ciphertext = pow(m, e, n)
plaintext = pow(ciphertext, d, n)
assert(plaintext == m)
```

This of course is deterministic and thus vulnerable to chosen ciphertext attacks. To fix this, padding schemes with some nondeterminsim are often used before encrypting.

## PKCS #1 v1.5

One such padding scheme used by legacy SSL v3 servers was PKCS #1 v1.5 The details are not particularly important, the scheme pads messages into the format

```
+--------+---------------------------+------+---+
| 0x0002 | >= 8 nonzero random bytes | 0x00 | m |
+--------+---------------------------+------+---+
```

Dan Bleichenbacher famously noticed
that SSL servers using this scheme behave like a **0002**-oracle. That is, given an encrypted message
$c = m^{e} \pmod n$, the server reveals whether m was padded correctly, either
explicitly via padding error messages or implicitly through timing analysis (i.e. terminating early on invalid padding).
Bleichenbacher describes an adaptive chosen ciphertext attack that uses this oracle to decrypt messages and forge signatures.

## The Hidden Number Problem

We have an oracle that leaks information about the first two bytes of $m$. We know that RSA is malleable so that scaling $c$ by $r^{e}$ scales $m$ by $r$. Concretely, for $r \overset{$}{\leftarrow} \mathbb{Z}_{n}$ the oracle tells us if the decryption of

\[r^{e}c = (rm)^{e} \pmod n\]begins with 0x$0002$. We know this should happen with probability $2^{-16}$. If we feed many such values into the oracle, keeping track of when it accepts, we end up with $r_i \in \mathbb{Z}_{n}$ for which we know the first two bytes of $m_i = r_{i} m$ are 0x0002.

Knowing the first two bytes of $m_i$ gives us the following bound on the decryption

\[2M \leq m_i < 3M\]where

\[M = \text{ 0x0001 } \ll (\|n\| - 16)\]Taking an average and approximating $m_i \approx \frac{5}{2}M$ leads to a hidden number problem for $m$.

## Constructing a Basis

The hidden number problem can be solved by reduction to a closest vector problem over a lattice $\mathbf{L}$ spanned by an appropriate basis $\mathbf{B}$. The idea is to construct $\mathbf{B}$ carefully so that

- The hidden value is on the lattice
- The closest vector to our approximation is indeed the hidden value
- The closest vector problem can be solved efficiently over $\mathbf{L}$

If we set

\[\mathbf{b}_1 = (2^{-16}, r_1, r_2, \dots, r_N)\]then the hidden value

\[\mathbf{h} = m \mathbf{b}_1 = (m2^{-16}, mr_1, mr_2, \dots, mr_N)\]is on $\mathbf{L}$ and will be relatively close to our approximation

\[\mathbf{u}= (\frac{1}{2}M, \frac{5}{2}M, \frac{5}{2}M, \dots, \frac{5}{2}M)\]Indeed, we can check that the entries of $\mathbf{h} - \mathbf{u} < n2^{-16}$

To account for the remaining dimensions, we can ensure that no basis vector $\mathbf{b}_i$ is at a distance greater than $n2^{-16}$ from $\mathbf{u}$ by setting the $i^{\text{th}}$ entry $\mathbf{b}_i = (0, \dots, n, \dots, 0)$ for $i > 1$.

This gives us the basis

\[\mathbf{B} = \begin{bmatrix} 2^{-16} & r_{1} & r_{2} & \dots & r_{N} \\ 0 & n & 0 & \dots & 0\\ \vdots & 0 & n & \ddots & \vdots \\ \vdots & \vdots & \ddots & \ddots & 0 \\ 0 & 0 & \dots & 0 & n\\ \end{bmatrix} \qquad \in \mathbb{Q}^{(N+1) \times (N+1)}\\\]## Solving the Closest Vector Problem

Our constraints on $\mathbf{B}$ ensure that running an approximation algorithm like Babai’s with target vector $\mathbf{u}$ should reveal the hidden value $\mathbf{h}$. Decrypting the original message is then as simple as picking off the first entry and scaling appropriately.

```
#!/usr/bin/env sage
from os import urandom
from Crypto.Util.number import long_to_bytes, bytes_to_long
class RSA:
def __init__(self, k=512, e=65537):
p = random_prime(2^k)
q = random_prime(2^k)
self.e = e
self.n = p * q
self.k = (self.n.nbits() - 1) // 8 + 1
self.d = inverse_mod(self.e, (p-1)*(q-1))
def pkcs_pad(self, msg):
"""
https://www.rfc-editor.org/rfc/rfc2313
"""
message_len = len(msg)
if message_len >= self.k - 11:
return None
padded = []
while len(padded) != self.k - message_len - 3:
x = urandom(1)
if(x != b'\x00'):
padded.append(x)
padded = b"".join(padded)
assert(len(padded) == self.k - message_len - 3)
return bytes_to_long(b'\x00\x02' + padded + b'\x00' + msg)
def pkcs_unpad(self, msg):
msg = b'\x00' + long_to_bytes(Integer(msg))
if len(msg) > self.k or msg[:2] != b'\x00\x02':
return None
msg = msg[1:]
index = msg.find(b'\x00')
return msg[index + 1:] if index >= 8 else None
def oracle(self, msg):
return self.pkcs_unpad(pow(msg, self.d, self.n)) != None
def cvp(B, t):
"""
Babai's nearest plane algorithm
"""
B = B.LLL()
d = B.dimensions()[1]
G = B.gram_schmidt()[0]
b = t
for i in range(d)[::-1]:
c = QQ(b * G[i]) / QQ(G[i] * G[i])
b -= B[i]*c.round()
return t - b
if __name__ == "__main__":
N = 100
U = []
R = RSA()
z = Zmod(R.n)
msg = R.pkcs_pad(b"Secret")
m = pow(msg, R.e, R.n)
M = bytes_to_long(b'\x00\x01' + b'\x00'*(R.k-2))
while len(U) < N:
r = z.random_element()
y = m*pow(r, R.e, R.n)
if R.oracle(y):
m_i = r * msg % R.n
assert(2*M < m_i)
assert(m_i < 3*M)
U.append(r)
B = matrix(QQ, matrix.identity(N+1) * R.n)
B[0] = vector(QQ, [2^(-16)] + U)
u = vector(QQ, [M*.5] + [2.5*M for i in range(N)])
h = cvp(B, u)
m = Integer(h[0] * (2^16)) % R.n
decrypted = R.pkcs_unpad(m)
print(f"Decrypted Message: {decrypted}")
assert(m == msg)
```