Mae Garden

I originally wrote this post on github gists, for the UIUCTF 2023 writeup competition(I received a 50 eur prize). I really enjoyed playing with gzip internals, and I think it's a good read, so I decided to republish it here

Also thanks to everyone who played with me on the Spotless CTF team, especially to people who solved futuredisk 2 with me

Overview

Futuredisk 1 - 22 solves

I'm from the year 2123. Here's what I did:

Mounted my 10 exabyte flash drive

fallocate -l 8E haystack.bin

dd if=flag.txt bs=1 seek=[REDACTED] conv=notrunc of=haystack.bin

gzip haystack.bin

Put haystack.bin.gz on my web server for you to download

HTTP over Time Travel is a bit slow, so I hope gzipping it made it a little faster to download :)

https://futuredisk-web.chal.uiuc.tf/haystack.bin.gz

Author: kuilin
Category: Web

We are given a server serving a massive gzip compressed file, which (when decompressed) contains null bytes and the flag stored at some unknown offset.

Firefox download manager showing that the 8 etabyte file is gonna take over 25 billion days

Exploration

For starters it's worth looking at the headers returned by the server, especially since the challenge is tagged as "web".
Let's try curling the file:

~ curl -v 'https://futuredisk-web.chal.uiuc.tf/haystack.bin.gz' -o /dev/null
[...]
< HTTP/2 200
< accept-ranges: bytes
< content-type: application/octet-stream
< date: Sat, 08 Jul 2023 18:02:49 GMT
< etag: "1209f04b4-7fffffffffffffff"
< last-modified: Sat, 12 Jun 2123 16:07:16 GMT
< server: nginx/1.23.1
< content-length: 9223372036854775807
<

there is some headers here I find interesting, namely:

last-modified: Sat, 12 Jun 2123 16:07:16 GMT

not really helpful, but I appreciate the attention to detail
accept-ranges: bytes

this header is used by the web-server to indicate to the client that it supports "partial" requests, meaning we can use the range header to specify a part of the file that we want to download.

For example:
```
curl -r "9223372036854775707-9223372036854775807" 'https://futuredisk-web.chal.uiuc.tf/haystack.bin.gz'
```
will download the last 100 bytes of the file.
etag: "1209f04b4-7fffffffffffffff"

The etag header is used to identify if a file has been changed, and the e-tag generation algorithm is left up to the server. Maybe the way nginx generates it is in some way reveals useful information to us?

Unfortunately the nginx etag is made up of the last-modified time and the content-length of the file, both of which are already known, and aren't really helpful.

Digging in

Gzip

After trying a few dumb approaches it became obvious to me that we can't solve this without at least taking a look at the file itself. Quickly skimming the wikipedia page for "gzip" we find that it's mainly a thin wrapper over DEFLATE, with a 10 byte header, containing a magic number (0x1f 0x8b), the compression method (always 0x08 for DEFLATE), 1-byte of header flags (null in our case, indicating that the gzip wrapper doesn't contain extra information), a 4-byte timestamp, compression flags (indicating the level of compression applied, either 2 (slowest) or 4 (fastest)), the operating system ID, and an 8-byte trailer, containing a CRC-32 checksum and the length of the original uncompressed data (modulo 2³²).

For a second I've considered if we could use the CRC-32 checksum to find the flag or maybe deduce it's position, but after further investigation I've rejected that idea as implausible.

DEFLATE

The body of a gzip file contains the real star of the show: DEFLATE. Deflate is a lossless compression algorithm using a bunch of smart math that I don't fully understand.

What's important for the purpouses of this challenge is that deflate compresses the data in two steps:

First it replaces duplicate strings with references to data at most 32KiB behind. In our case the previous data will just be null bytes (or the flag, but we can ignore that)
It then divides the data into blocks, either storing the uncompressed data, or compressing it using huffman coding.
- Huffman coding uses something called huffman trees, which can be either static (defined by the deflate standard), or dynamic (defined as a part of the header). We don't need to understand this for the purpouses of this challenge

Interestingly since the decompressed contents of previous blocks are (for our purpouses) static, the value of any one block doesn't depend on the value of any previous block. That means that if we can figure out where a block starts we can decompress it

Code!

Now that we more or less understand what's going on in the file we can try to play around with it a bit more.

First let's search for a "simple python deflate implementation", and download the first impl we can find on github:

https://github.com/nayuki/Simple-DEFLATE-decompressor/blob/master/python/deflatedecompress.py

🛈

This section features a lot of code. It is included for completeness, should you want to follow along and you don't need to understand it to understand the solution. Feel free to skip over them.

Now we can download a sample from the file and see what happens when we try to decompress it.

curl -r 0-131072 'https://futuredisk-web.chal.uiuc.tf/haystack.bin.gz' -o haystack.part.bin

128 KiB ought to be enough for everyone

import deflatedecompress

with open("./haystack.part.bin", "rb") as f:
	# Skip the gzip header
	f.seek(10)

	inp = deflatedecompress.BitInputStream(f)
	decompressed = deflatedecompress.Decompressor.decompress_to_bytes(inp)
print(decompressed)

Traceback (most recent call last):
[...]
EOFError: Unexpected end of stream

...right, we don't have the complete file, and the decompressor tries to decompress the whole thing at once.
Let's modify the decompressor to allow us to process the stream one block at a time:

diff --git a/deflatedecompress.py b/deflatedecompress.py
index feda42e..10fb7e0 100644
--- a/deflatedecompress.py
+++ b/deflatedecompress.py
@@ -293,26 +293,25 @@ class Decompressor:
                self._output = out
                self._dictionary = ByteHistory(32 * 1024)

-               # Process the stream of blocks
-               while True:
-                       # Read the block header
-                       isfinal: bool = self._input.read_uint(1) != 0  # bfinal
-                       type: int = self._input.read_uint(2)  # btype
-
-                       # Decompress rest of block based on the type
-                       if type == 0:
-                               self._decompress_uncompressed_block()
-                       elif type == 1:
-                               self._decompress_huffman_block(Decompressor._FIXED_LITERAL_LENGTH_CODE, Decompressor._FIXED_DISTANCE_CODE)
-                       elif type == 2:
-                               litlencode, distcode = self._decode_huffman_codes()
-                               self._decompress_huffman_block(litlencode, distcode)
-                       elif type == 3:
-                               raise ValueError("Reserved block type")
-                       else:
-                               assert False, "Unreachable value"
-                       if isfinal:
-                               break
+       def process_block(self):
+               # Read the block header
+               isfinal: bool = self._input.read_uint(1) != 0  # bfinal
+               type: int = self._input.read_uint(2)  # btype
+
+               # Decompress rest of block based on the type
+               if type == 0:
+                       self._decompress_uncompressed_block()
+               elif type == 1:
+                       self._decompress_huffman_block(Decompressor._FIXED_LITERAL_LENGTH_CODE, Decompressor._FIXED_DISTANCE_CODE)
+               elif type == 2:
+                       litlencode, distcode = self._decode_huffman_codes()
+                       self._decompress_huffman_block(litlencode, distcode)
+               elif type == 3:
+                       raise ValueError("Reserved block type")
+               else:
+                       assert False, "Unreachable value"
+               if isfinal:
+                       raise EOFError()

Let's see now:

import deflatedecompress
import io

out = io.BytesIO()
with open("./haystack.part.bin", "rb") as f:
    # Skip the gzip header
    f.read(10)

	inp = deflatedecompress.BitInputStream(f)
	dec = deflatedecompress.Decompressor(inp, out)
	dec.process_block()

output = out.getvalue()
assert b"\x00"*len(output) == output

$ python  solve.py
[no output]

yay!

Let's try to see if we can notice something interesting about the blocks:

import deflatedecompress
import io

out = io.BytesIO()
f = open("./haystack.part.bin", "rb")

# Skip the gzip header
f.read(10)
inp = deflatedecompress.BitInputStream(f)
dec = deflatedecompress.Decompressor(inp, out)


while True:
	pos_pre = f.tell()
	out_pre = out.tell()

	dec.process_block()

	pos_after = f.tell()
	out_post = out.tell()

	print(f"block: {hex(pos_pre)}, length: {hex(pos_after-pos_pre)}, decompressed_len: {hex(out_post-out_pre)}")
	output = out.getvalue()
	assert b"\x00"*len(output) == output

$ python solve.py
block: 0xa, length: 0x200e, decompressed_len: 0x80fcfc
block: 0x2018, length: 0x200c, decompressed_len: 0x80fefe
block: 0x4024, length: 0x200c, decompressed_len: 0x80fefe
block: 0x6030, length: 0x200c, decompressed_len: 0x80fefe
block: 0x803c, length: 0x200d, decompressed_len: 0x80fefe
block: 0xa049, length: 0x200c, decompressed_len: 0x80fefe
block: 0xc055, length: 0x200c, decompressed_len: 0x80fefe
block: 0xe061, length: 0x200c, decompressed_len: 0x80fefe
block: 0x1006d, length: 0x200d, decompressed_len: 0x80fefe
block: 0x1207a, length: 0x200c, decompressed_len: 0x80fefe
block: 0x14086, length: 0x200c, decompressed_len: 0x80fefe
block: 0x16092, length: 0x200c, decompressed_len: 0x80fefe
block: 0x1809e, length: 0x200d, decompressed_len: 0x80fefe
block: 0x1a0ab, length: 0x200c, decompressed_len: 0x80fefe
block: 0x1c0b7, length: 0x200c, decompressed_len: 0x80fefe
Traceback (most recent call last):
[...]
EOFError: Unexpected end of stream

...interesting... very very interesting

So the first block has a length of 0x200e, presumably because of the afformentioned DEFLATE string deduplication not being applied, but after that the blocks seems to follow a pattern of: 0x200c,0x200c,0x200c,0x200d
...
where does the 0x200d come from? I guess it could just be a quirk of the deflate implementation? Something feels off though

After reading some more I noticed that deflate headers don't have to be byte aligned!

Let's modify the BitInputStream class to add a function for checking our bit position.

diff --git a/deflatedecompress.py b/deflatedecompress.py
index 10fb7e0..c4a1e73 100644
--- a/deflatedecompress.py
+++ b/deflatedecompress.py
@@ -44,6 +44,9 @@ class BitInputStream:
                """Returns the current bit position, which ascends from 0 to 7 as bits are read."""
                assert 0 <= self._num_bits_remaining <= 7, "Unreachable state"
                return -self._num_bits_remaining % 8
+
+       def tell_bits(self):
+               return self._input.tell()*8+self.get_bit_position()


        def read_bit_maybe(self) -> int:

import deflatedecompress
import io

out = io.BytesIO()
f = open("./haystack.part.bin", "rb")

# Skip the gzip header
f.read(10)
inp = deflatedecompress.BitInputStream(f)
dec = deflatedecompress.Decompressor(inp, out)


while True:
	pos_pre = f.tell_bits() # tell -> tell_bits()
	out_pre = out.tell()

	dec.process_block()

	pos_after = f.tell_bits() # tell -> tell_bits()
	out_post = out.tell()

	print(f"block: {hex(pos_pre)}, length: {hex(pos_after-pos_pre)}, decompressed_len: {hex(out_post-out_pre)}")
	output = out.getvalue()
	assert b"\x00"*len(output) == output

$ python solve.py
block: 0x50, length: 0x10071, decompressed_len: 0x80fcfc
block: 0x100c1, length: 0x10062, decompressed_len: 0x80fefe
block: 0x20123, length: 0x10062, decompressed_len: 0x80fefe
block: 0x30185, length: 0x10062, decompressed_len: 0x80fefe
block: 0x401e7, length: 0x10062, decompressed_len: 0x80fefe
block: 0x50249, length: 0x10062, decompressed_len: 0x80fefe
block: 0x602ab, length: 0x10062, decompressed_len: 0x80fefe
block: 0x7030d, length: 0x10062, decompressed_len: 0x80fefe
block: 0x8036f, length: 0x10062, decompressed_len: 0x80fefe
block: 0x903d1, length: 0x10062, decompressed_len: 0x80fefe
block: 0xa0433, length: 0x10062, decompressed_len: 0x80fefe
block: 0xb0495, length: 0x10062, decompressed_len: 0x80fefe
block: 0xc04f7, length: 0x10062, decompressed_len: 0x80fefe
block: 0xd0559, length: 0x10062, decompressed_len: 0x80fefe
block: 0xe05bb, length: 0x10062, decompressed_len: 0x80fefe

now this makes more sense.

Finding the flag

At this point an idea started to emerge. Since all of the non-flag blocks are going to be the same when uncompressed, it would make the sense for them to be the same when compressed, since one compressed block doesn't depend on any other blocks.

A bunch of neatly arranged blocks, one after another

So neatly arranged! That's all cool and good but what happens when the flag shows up? Presumably it can't be compressed as well as the null bytes have been and the block is gonna take up more space

After the block containing the flag appears all the following blocks will be slightly offset, from where we expect them to be.

This means we can do a binary search to find out the position of the flag, by simply checking whether a block is at at the offset we expect it to be.

Solve

Let's write a simple function that tells us what block an address belongs to.

def addr_to_block(bit_addr):
	LENGTH = 0x10062
	# First block of length 0x10062
	first_block=0x100c1

	bit_addr = bit_addr-first_block

	block=bit_addr - (bit_addr%LENGTH)
	return block + first_block

>>> hex(addr_to_block(0x7030d))
'0x7030d'
>>> hex(addr_to_block(0x7030d+5))
'0x7030d'
>>> hex(addr_to_block(0x7090d+5))
'0x7030d'
>>> hex(addr_to_block(0x9090d+5))
'0x903d1'

That checks out!

Now we need some way for the python code to easily access an arbitrary offset into the haystack file.
I tried using httpdirfs, but after that didn't work I wrote my own implementation of binaryIO.

import pickle
from urllib.parse import quote_plus
import requests
from typing import BinaryIO, Optional
class HTTPBinaryIO(BinaryIO):
	url: str
	sess: requests.Session
	pos: int
	_block_size: int
	_cache: dict[int, bytes]

	def __init__(self, url):
		self.url =  url
		self.sess = requests.Session()
		self.pos = 0
		self._block_size = 1024*10
		self._cache = {}
		self._load_cache_from_file()

	def _save_cache_to_file(self):
		with open(f"./_cache.{quote_plus(self.url)}.pick", "wb") as f:
			pickle.dump(self._cache, f)

	def _load_cache_from_file(self):
		try:
			with open(f"./_cache.{quote_plus(self.url)}.pick", "rb") as f:
				self._cache = pickle.load(f)
		except:
			pass

	def _fetch_range(self, start: int, end: int):
		range_header = f"{start}-{end-1}"
		headers={"range": f"bytes={range_header}"}
		res = self.sess.get(self.url, headers=headers)
		assert res.status_code == 206, f"status code is {res.status_code}, not 206"

		return res.content

	def _fetch_block(self, block_start: int):
		if data := self._cache.get(block_start):
			return data

		block_end = block_start+self._block_size

		res = self._fetch_range(block_start,block_end)
		assert len(res) == self._block_size, f"{len(res)} != {self._block_size}"

		self._cache[block_start] = res
		self._save_cache_to_file()
		return res
	
	def _get_range(self, start: int, end: int) -> Optional[bytes]:
		block_start = start - (start % self._block_size)
		block_end = (end - (end % self._block_size)) + self._block_size

		res = b""
		for cur_block_start in range(block_start, block_end+1, self._block_size):
			res += self._fetch_block(cur_block_start)

		return res[start-block_start:end-block_start]

	
	def read(self, size):
		start = self.pos
		end = self.pos + size

		res = self._get_range(start, end)

		self.pos = end

		return res

	def seek(self, pos):
		self.pos = pos

	def tell(self):
		return self.pos

We also need a way to seek into a bit offset. We can once again modify the BitInputStream to include a seek_bits function.

diff --git a/deflatedecompress.py b/deflatedecompress.py
index 09c7f6a..2e6e7c5 100644
--- a/deflatedecompress.py
+++ b/deflatedecompress.py
@@ -47,6 +47,14 @@ class BitInputStream:

        def tell_bits(self):
                return self._input.tell()*8+self.get_bit_position()
+
+       def seek_bits(self, loc):
+               byte = loc >> 3
+               bit = loc & 0b111
+
+               self._input.seek(byte-1)
+               self._current_byte = int.from_bytes(self._input.read(1), byteorder='big')
+               self._num_bits_remaining = 8-bit


        def read_bit_maybe(self) -> int:

Let's try it!

import deflatedecompress
import io
import http_binary_io

def addr_to_block(bit_addr):
	LENGTH = 0x10062
	# First block of length 0x10062
	first_block=0x100c1

	bit_addr = bit_addr-first_block

	block=bit_addr - (bit_addr%LENGTH)
	return block + first_block

out = io.BytesIO()
f = http_binary_io.HTTPBinaryIO("https://futuredisk-web.chal.uiuc.tf/haystack.bin.gz")

# Skip the gzip header
f.seek(10)

inp = deflatedecompress.BitInputStream(f)
dec = deflatedecompress.Decompressor(inp, out)

# Process the first block to fill out the backtrack dictionary
dec.process_block()

inp.seek_bits(addr_to_block(0x133713371337))
dec.process_block()

output = out.getvalue()
assert b"\x00"*len(output) == output

$ python solve.py
[no output]

it works 🎉

Time to put it all together with a binary search

import deflatedecompress
import io
import http_binary_io

def addr_to_block(bit_addr):
	LENGTH = 0x10062
	# First block of length 0x10062
	first_block=0x100c1

	bit_addr = bit_addr-first_block

	block=bit_addr - (bit_addr%LENGTH)
	return block + first_block

def decompress_block_at(pos):
	f = http_binary_io.HTTPBinaryIO("https://futuredisk-web.chal.uiuc.tf/haystack.bin.gz")
	# Skip the gzip header
	f.seek(10)

	inp = deflatedecompress.BitInputStream(f)
	out = io.BytesIO()

	dec = deflatedecompress.Decompressor(inp, out)
	
	# Process the first block to fill out the backtrack dictionary
	dec.process_block()
	
	inp.seek_bits(pos)
	dec.process_block()
	
	output = out.getvalue()
	return output
	
def check_block(pos):
	try:
		output = decompress_block_at(pos)
		assert output == len(output)*b"\x00"
		return True
	except Exception:
		return False

def find_flag_block():
	# Perform a binary search
	low = 0
	high = 9223372036854775807*8
	while True:
		checked_addr = addr_to_block((high - low) // 2 + low)
		if checked_addr == low or checked_addr == high:
			return high
		print(f"Checking: {checked_addr}, range: {low} - {high} ( {high-low} bits )")
		if check_block(checked_addr):
			low = checked_addr
		else:
			high = checked_addr

flag_block = find_flag_block()
print(f"Flag is at {flag_block}")
print(decompress_block_at(flag_block).strip(b"\x00").decode("utf8"))

$ python solve.py
Checking: 18446744073709465031, range: 0 - 36893488147419061235 ( 36893488147419061235 bits )                                                                                                                                                           [1/57]
Checking: 27670116110564263133, range: 18446744073709465031 - 36893488147419061235 ( 18446744073709596204 bits )
Checking: 23058430092136831265, range: 18446744073709465031 - 27670116110564263133 ( 9223372036854798102 bits )
Checking: 25364273101350547199, range: 23058430092136831265 - 27670116110564263133 ( 4611686018427431868 bits )
[snip]
Checking: 23819729546392263779, range: 23819729546392132511 - 23819729546392395047 ( 262536 bits )
Checking: 23819729546392329413, range: 23819729546392263779 - 23819729546392395047 ( 131268 bits )
Flag is at 23819729546392329413
uiuctf{binary search means searching a binary stream, right :D}