Winesap's Blog

TMCTF 2017 Rev400


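Seems I'm starting to pick up a few Windows skills... still learning.

A reversing challenge: x64 PE+, VMProtect, and some anti-debug. Every few seconds it displays one hex character, 41216 in total, so in theory, if it ran fast enough, you could just leave it running until it finishes. Judging from the first few characters as a header, the result should be a 7z file.

So first I wrote a tool to capture the characters. The program shows each character by moving the cursor directly onto a button, and the colour presumably changes because of the hover state. A simple approach is the code below, which captures the character currently pointed at; calling GetParent() twice and reading the main window's title then tells you which index it is: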

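#include "stdafx.h"   // Visual Studio precompiled header; drop it if unused
#include <stdio.h>    // printf
#include <Windows.h>

int main() {
    while (1) {
        POINT pt;
        GetCursorPos(&pt);                 // cursor position in screen coordinates
        HWND hwnd = WindowFromPoint(pt);   // window (here: the button) under the cursor
        CHAR str[100] = "";                // stays empty if no text is returned
        GetWindowTextA(hwnd, str, 100);    // the button caption is the displayed character
        printf("%s\n", str);
    }
    return 0;
}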

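Next, just speed it up. Since the window freezes while waiting (so it is not a Timer) and the CPU is not pegged (so it is not a spin-loop), it is probably using a Sleep-type WinAPI. Some anti-debug trick seems to make breakpoints ineffective, but I couldn't work out how it does that, so wherever I wanted to test something I just patched in an infinite loop to make it stop. I used the IDA remote x64 debugger; after attaching, I ran a Python script: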

import ida_bytes

def patch(addr, buf):
    ida_bytes.patch_many_bytes(addr, buf.decode('hex'))
    
# ntdll_NtDelayExecution -> movabs rax,0x7ffd4d19e900; jmp rax

patch(0x7FFD4F245A20, '48b800e9194dfd7f0000ffe0')

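Here I patch ntdll_NtDelayExecution directly so that it jumps to another piece of code that was also patched in. The built-in ida_idp.AssembleLine doesn't seem to handle x64, so I ended up converting the code to hex and pasting it in; maybe I could try installing pwntools into it.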


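The second argument of NtDelayExecution (rdx) points to a signed 64-bit value (a LARGE_INTEGER); a negative value means a relative delay, measured in units of 100 ns. Once it is patched, continue, and you will see it run blazingly fast.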
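As a rough illustration of the interval encoding, here is a minimal sketch. NtDelayExecution takes its interval in 100 ns units, with negative values meaning "relative to now". The helper names and the speed-up factor are hypothetical, and this is not the code that was actually patched in.

```python
# Hypothetical helpers; they only model the arithmetic a patched stub would do.
HUNDRED_NS_PER_MS = 10_000  # 1 ms = 10,000 units of 100 ns


def relative_interval(ms):
    """Encode a relative delay of `ms` milliseconds as a negative 100 ns count."""
    return -ms * HUNDRED_NS_PER_MS


def speed_up(interval, factor):
    """Shrink a relative (negative) interval by `factor`; leave other values alone."""
    if interval >= 0:
        return interval
    return -((-interval) // factor)


three_seconds = relative_interval(3000)
print(three_seconds)                  # -30000000
print(speed_up(three_seconds, 1000))  # -30000
```

A replacement stub only needs to rewrite the value rdx points to in this way (or return immediately) before handing control back.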

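But after running for a while it pops up "You are too fast". My guess was that it uses some time-related function to check that not too many characters are produced per unit of time. After trying GetSystemTime/GetLocalTime and WM_TIMER, I found that it periodically calls GetTickCount to read the time, so patching kernel32_GetTickCount to make its return value N times larger does the trick, and it then runs all the way to the end.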

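However, actual testing showed that running too fast drops characters; slowing it down a bit should fix that. But there was not much contest time left and it had to finish quickly, so in the end I patched user32_SetCursorPos to store the coordinates (rcx, rdx) into a buffer allocated with kernel32_VirtualAlloc. Since the horizontal ordering of the buttons is simply rotated step by step, grouping the X coordinates is enough to compute the corresponding character directly.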
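The decoding step can be sketched as follows. Everything specific here is an assumption made for illustration: 16 buttons labelled 0 to F in one row, a rotation of one position to the left per step, and a log that contains every column at least once. The real rotation direction and layout have to be read off the program.

```python
HEX = "0123456789ABCDEF"


def decode(xs, rotate_per_step=1):
    """Map logged X coordinates back to hex characters.

    Assumes (hypothetically) that at step i the row of 16 buttons is rotated
    left by i * rotate_per_step positions.
    """
    columns = sorted(set(xs))                  # group X coordinates into columns
    index = {x: col for col, x in enumerate(columns)}
    chars = []
    for step, x in enumerate(xs):
        chars.append(HEX[(index[x] + step * rotate_per_step) % 16])
    return "".join(chars)


# Synthetic log: the cursor visits column i % 16 at step i.
log = [100 + 30 * (i % 16) for i in range(32)]
print(decode(log))
```

If the rotation runs the other way, passing rotate_per_step=-1 flips the direction without other changes.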

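The 7z extracts to stub.exe, which is a very simple reverse. Each of the 16 password characters is checked independently, so they can simply be brute-forced one at a time.

If I had started earlier and had the time, letting it run slowly to the end would have been more fun XD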
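A minimal sketch of the per-position brute force, assuming the check can be modelled as a predicate on (position, character). The `demo_check` function and its table are made up for the example and are not the real stub.exe logic.

```python
import string

# Digits, letters, punctuation and the space character.
ALPHABET = string.printable[:95]


def recover(check_char, length=16, alphabet=ALPHABET):
    """Recover a password whose characters are validated independently."""
    password = []
    for pos in range(length):
        for ch in alphabet:
            if check_char(pos, ch):
                password.append(ch)
                break
        else:
            raise ValueError(f"no candidate for position {pos}")
    return "".join(password)


# Stand-in for the real check: a made-up transform compared against a made-up table.
SECRET = "0123456789abcdef"
TABLE = [ord(c) ^ (i * 7) for i, c in enumerate(SECRET)]


def demo_check(pos, ch):
    return ord(ch) ^ (pos * 7) == TABLE[pos]


print(recover(demo_check))
```

Because the positions are independent, this takes at most 16 * 95 checks instead of 95 ** 16.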

Plaid CTF 2015 PlaidDB Writeup


My exploit source code: https://gist.github.com/33d6e03d0c9a8c13f3ba

This challenge is an ELF64 pwnable binary, with the libc.so library also provided. The service offers key-value operations (GET, PUT, DUMP and DEL) backed by a red-black tree (we spent a lot of time checking the tree implementation but found nothing there). First we found an input sequence that causes heap corruption:

PUT
a
0
DEL
a
PUT
NNNNNN
0
GET
AAAAAAAAAAAAAAAAAAAAAAAA

=> *** Error in `./ds': free(): invalid next size (fast): 0x00007ff60a6680d0 ***

The bug is in the function at 0x1040. The buffer for the row key has no room for the trailing '\0'. The last row key in the input above, which is exactly 24 bytes long, causes an off-by-one null byte to be written into the size field of the next chunk in heap memory. This shrinks the next chunk and clears its inuse bit, so the current chunk appears to be already freed. Since the row key buffer is not freed when the row is not found in a DEL operation, we use DEL to trigger the bug; this avoids unwanted memory corruption and also lets us overwrite the prev_size field of the next chunk. After that, we free the next chunk, whose prev_size has been overwritten, and its forward chunk is unlinked as well. The result is that a large chunk overlapping other chunks is put into the unsorted bin, and we can get it back later by allocating a data buffer of matching size.
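To make the effect of the single null byte concrete, here is a small sketch of what happens to a little-endian size field when its lowest byte is zeroed. The starting value 0x111 is a hypothetical example, not a value taken from the exploit.

```python
PREV_INUSE = 0x1   # lowest bit of the size field
FLAG_MASK = 0x7    # the three low bits are flags, not part of the size


def off_by_one_null(size_field):
    """Zero the least significant byte, as the stray '\\0' does."""
    return size_field & ~0xFF


def describe(size_field):
    return {
        "chunk_size": size_field & ~FLAG_MASK,
        "prev_inuse": bool(size_field & PREV_INUSE),
    }


before = 0x111                      # hypothetical: 0x110-byte chunk, previous chunk in use
after = off_by_one_null(before)
print(hex(before), describe(before))
print(hex(after), describe(after))  # the chunk shrinks to 0x100 and prev_inuse is cleared
```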

Since the data buffer overlaps other node structures, we can use GET to leak information from these structures or PUT to overwrite them. Note that the PUT operation creates a new data buffer, so we have to call PUT twice to swap the current buffer back and fill it. By overwriting the data pointer in the node structure, we can dump memory at an arbitrary address by applying GET to the node (see the leak() function in my exploit). First we leak the heap base address by reading the pointers in the node structures.

There is a trick to defeat ASLR for the libc base address. In the libc implementation, malloc() calls mmap instead for large sizes (0x21000 is enough). In general, these pages are placed just before the .tls section. There is some useful information in .tls, such as the address of main_arena, the stack guard canary value, and a pointer to somewhere on the stack at a fixed offset.

7f55cd026000-7f55cd049000 r-xp 00000000 08:01 6030320       /lib/x86_64-linux-gnu/ld-2.19.so
7f55cd227000-7f55cd22a000 rw-p 00000000 00:00 0    <-- .tls is in here
7f55cd246000-7f55cd248000 rw-p 00000000 00:00 0
7f55cd248000-7f55cd249000 r--p 00022000 08:01 6030320       /lib/x86_64-linux-gnu/ld-2.19.so
7f55cd249000-7f55cd24a000 rw-p 00023000 08:01 6030320       /lib/x86_64-linux-gnu/ld-2.19.so

However, the data buffer pointer cannot be used for arbitrary memory writes, since the PUT operation always creates a new buffer. Thus we have to make malloc() return the address we want to overwrite and fill it with node data. There are two approaches: free a fake chunk at the target address, or corrupt the fastbin. Since free() checks both the head and the foot of the chunk, corrupting the fastbin is easier: the only sanity check is the chunk size check in malloc(). We can free a chunk in the fastbin range and then overwrite its fd pointer with the target address. After malloc() is called twice, the fake chunk at the target address is returned, provided its chunk size has been set correctly. (See the fastbin path in the libc source for more details, from line 3331 to 3358.)
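The size check on the fastbin path can be modelled roughly as below. This is a sketch of the 64-bit glibc macros (`request2size`, `chunksize`, `fastbin_index`) transcribed into Python, not the libc code itself, and the request size of 24 is only an example.

```python
SIZE_SZ = 8                          # 64-bit
MALLOC_ALIGN_MASK = 2 * SIZE_SZ - 1
MINSIZE = 32


def request2size(req):
    """Chunk size that malloc(req) actually looks for."""
    size = (req + SIZE_SZ + MALLOC_ALIGN_MASK) & ~MALLOC_ALIGN_MASK
    return max(size, MINSIZE)


def chunksize(size_field):
    """Strip the three flag bits from a size field."""
    return size_field & ~0x7


def fastbin_index(size):
    return (size >> 4) - 2


def fake_chunk_accepted(request, fake_size_field):
    """True if a fake chunk with this size field passes the fastbin check."""
    return fastbin_index(chunksize(fake_size_field)) == fastbin_index(request2size(request))


print(fake_chunk_accepted(24, 0x21))  # True: same bin as a 0x20 chunk
print(fake_chunk_accepted(24, 0x31))  # False: belongs to the next bin
```

Only the bin index is compared, so any size field from 0x20 to 0x2f is accepted for this request; this is why a loosely controlled value on the stack can serve as the fake size.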

We decided to overwrite the stack, starting from a return address entry, with a ROP chain. Thus we have to set up a valid chunk size somewhere on the stack, before but close enough to a return address that a tiny fastbin chunk can cover it. We found a buffer on the stack in the PUT function at 0x1240. The buffer stores the data size, and the remaining space (16 bytes in total) is large enough for an 8-byte size value. Its address (starting from [rbp-0x38]) is also close to the return address entry. The usable length of the ROP chain is only 16 bytes, which is not enough for a system('/bin/sh') call, but it is enough to jump to a leave; ret gadget and migrate the stack to a second ROP chain placed on the heap in advance. Note that the stack guard canary must be overwritten with its correct value.
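The stack migration can be illustrated with a tiny emulation of `leave; ret`. It assumes the saved rbp is overwritten along the way, which seems plausible since the canary in front of it has to be rewritten anyway. All addresses are hypothetical, and this shows one plausible layout rather than the exact one used in the exploit.

```python
def leave_ret(mem, rbp):
    """Emulate `leave; ret`: mov rsp, rbp; pop rbp; pop rip."""
    rsp = rbp
    rbp = mem[rsp]   # pop rbp
    rsp += 8
    rip = mem[rsp]   # ret
    rsp += 8
    return rip, rsp, rbp


HEAP_CHAIN = 0x555555758000     # hypothetical address of the second ROP chain
FIRST_GADGET = 0x7FFFF7A3B102   # hypothetical first gadget of that chain

# The qword just below the chain is consumed by `pop rbp`, then the chain begins.
mem = {HEAP_CHAIN - 8: 0x4141414141414141, HEAP_CHAIN: FIRST_GADGET}

saved_rbp = HEAP_CHAIN - 8      # value written over the saved rbp on the stack
rip, rsp, rbp = leave_ret(mem, saved_rbp)
print(hex(rip), hex(rsp))
```

After the gadget runs, rip holds the first entry of the heap chain and rsp points just past it, so execution continues on the heap.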

Cat the flag by shell: flag{one_null_byte_t0_rul3_them_all_4ecd68f0}

0ctf 2015 Freenote Write Up


Double free exploitation; I also took the chance to try out a handy unlink() technique I had thought of earlier. Some prior familiarity with heap exploitation is recommended.

unlink()

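Setup: there is a pointer long *p, where &p is known and p points somewhere on the heap, the contents of p[2] and p[3] are controllable, and unlink(p, BK, FD) can be triggered in some way. Then the pointer can be rewritten so that p = &p - 3.

Usually p is some global variable, and since the program itself has no ASLR, the place where p is stored is known to begin with, i.e. &p is known. In this challenge the pointer is stored on the heap, so its address is not fixed. But there is also an information leak that lets the heap base address be computed, so the address of any pointer living on the heap is known as well.

The method is then simple. Suppose p points to a buffer holding a string, so controlling the contents of p[] is no problem. First set p[2] = &p - 3; p[3] = &p - 2. Then, if a heap overflow or double free lets us forge a chunk and trigger unlink(p, BK, FD), the following happens: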

  1. FD = p->fd = &p - 3, BK = p->bk = &p - 2
  2. The check FD->bk != p || BK->fd != p is performed, but since *(FD + 3) == p && *(BK + 2) == p it passes
  3. FD->bk = BK, i.e. p = &p - 2
  4. BK->fd = FD, i.e. p = &p - 3

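In the end p = &p - 3: the pointer now points slightly before the location where it is itself stored. If we can then write through this pointer, we can overwrite the pointer itself, and one more write gives an arbitrary address write.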
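The four steps above can be replayed on a toy memory model. This is a sketch of the checked unlink on 64-bit (fd at offset 16, bk at offset 24); the addresses are hypothetical, and memory is simply a dict from address to qword.

```python
WORD = 8


def unlink(mem, p):
    """Checked unlink of the chunk at p, with fd = p[2] and bk = p[3]."""
    fd = mem[p + 2 * WORD]
    bk = mem[p + 3 * WORD]
    if mem[fd + 3 * WORD] != p or mem[bk + 2 * WORD] != p:
        raise RuntimeError("corrupted double-linked list")
    mem[fd + 3 * WORD] = bk   # FD->bk = BK
    mem[bk + 2 * WORD] = fd   # BK->fd = FD


P_ADDR = 0x6020C0    # hypothetical &p
CHUNK = 0x1A2B010    # hypothetical heap address stored in p

mem = {
    P_ADDR: CHUNK,                    # p
    CHUNK + 2 * WORD: P_ADDR - 3 * WORD,  # p[2] = &p - 3
    CHUNK + 3 * WORD: P_ADDR - 2 * WORD,  # p[3] = &p - 2
}

unlink(mem, mem[P_ADDR])
print(hex(mem[P_ADDR]), hex(P_ADDR - 3 * WORD))  # both values are equal
```

Both FD->bk and BK->fd resolve to the slot holding p itself, which is why the check passes and why the last write wins.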

                                                                          heap
                                                                    |                |
                                                                    +----------------+
                                                          +-----> p |                | <- unlink()
                                                          |         +----------------+    ^
                                                          |         |                |    |
                                    .bss or .data         |         +----------------+    |
                                                          |   p->fd |  p[2] = &p - 3 |    |
  +----------------+               +----------------+     |         +----------------+    |
  |   prev_size    |    FD: &p - 3 |                |     |   p->bk |  p[3] = &p - 2 |    |
  +----------------+               +----------------+     |         +----------------+    |
  |     size       |    BK: &p - 2 |                |     |         |      .         |    |
  +----------------+               +----------------+     |                .              |
  |      fd        |               |                |     |         |      .         |   /
  +----------------+               +----------------+     |         +----------------+ /
  |      bk        |            &p |       p        | ----+         |    prev_size   | <- _int_free
  +----------------+               +----------------+               +----------------+
                                                                    |                |

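Leaking the heap base address

When note[i]->str, which holds a note's content, is initialized, a buffer is malloc()'d with its size rounded up to a multiple of 128 bytes. When this content string is read in, no trailing '\0' is appended and no extra space is allocated for one. So on output, whatever follows the string in memory is printed along with it. If we cleverly arrange for the fd field of some free chunk to point to another chunk, and make note->str end exactly where that field begins, the chunk's address is leaked.

Here I use note[8]->fd (offset = (128+16)*8 = 1152), which lines up exactly with an input string of size 1152 (128-byte aligned).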
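The offset arithmetic can be checked quickly. The rounding helper below is an assumption about how the 128-byte alignment is computed, and the 16-byte chunk header is the usual 64-bit value.

```python
NOTE_SIZE = 128      # smallest note buffer
CHUNK_HEADER = 16    # prev_size + size on 64-bit


def round_up(length, align=128):
    """Round a note length up to a multiple of `align` (assumed behaviour)."""
    return (length + align - 1) // align * align


# Distance from the first note's data to the fd field of the chunk eight notes later.
fd_offset = (NOTE_SIZE + CHUNK_HEADER) * 8
print(fd_offset, round_up(fd_offset))  # 1152 1152
```

Since 1152 is already a multiple of 128, a string of exactly that length ends flush against the fd field with no padding in between.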

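Arbitrary address read/write

Next, use the unlink() technique above to modify note[0]->str. First create 2 notes X and Y (size=128), delete X and Y, then create a note Z (size>128); Z's content can overwrite the prev_size and size of Y's chunk. Since the delete operation performs no checks at all, Y can be deleted once more, causing a double free that, based on the forged contents of the Y chunk, triggers unlink(note[0]->str).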

 
----------------------------------------------------------------------------------------------------
                | X                       | Y                      |            fake next_chunk of Y
<< note[0]->str |-----------------------------------------------------------------------------------
                | Z                         prev_size  size                     prev_size  size
----------------------------------------------------------------------------------------------------
      

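After note[0]->str has been rewritten, edit note 0 once, keeping the length unchanged. This overwrites the note count and the contents of note[0], so note[0]->str can be pointed at any address. Reading note 0 then leaks the contents of that address. Editing note 0 once more modifies the contents at that address (alternatively, if the first edit of note 0 is long enough, it can also overwrite note[1]->str, after which editing note 1 lets this arbitrary read/write be reused repeatedly).

After leaking free@got.plt, compute the base address of libc.so.6. Once free() is replaced with system(), simply create a note whose content is "sh" and delete it to execute system("sh").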
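The base address computation is plain subtraction. In this sketch the two libc offsets and the leaked bytes are made-up placeholders; the real values come from the libc.so.6 in use.

```python
import struct

# Hypothetical symbol offsets inside libc.so.6.
OFF_FREE = 0x82DF0
OFF_SYSTEM = 0x46640


def libc_base(leaked_free):
    """Base address of libc, given the runtime address of free()."""
    return leaked_free - OFF_FREE


# Eight bytes as they might be read back from free@got.plt (little-endian).
leak = struct.unpack("<Q", b"\xf0\x2d\x4a\x12\x34\x7f\x00\x00")[0]

base = libc_base(leak)
system = base + OFF_SYSTEM
print(hex(base), hex(system))

# Value to write back over the GOT entry so that free(ptr) becomes system(ptr).
payload = struct.pack("<Q", system)
```

A quick sanity check on a leak is that the computed base is page-aligned, i.e. its low 12 bits are zero.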

Exploit: freenote.py

Codegate CTF Preliminary 2015


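The challenges I touched this time were systemshock, sokoban, icbm and bookstore, all pwnables.
Really, the main point is to rant about sokoban... its level of polish reminds me of the problem style of a certain contest (?) back when I was doing ACM.
Something like an input of n = 100000 where the official solution turns out to be O(n!).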

systemshock

A classic stack overflow: after argv is clobbered, strlen(argv[1]) = 0, so the loop that was meant to validate the argument no longer covers it.
After that, a ; can be injected into system().

strcat(dest, argv[1]); // overflow & overwrite argv @ [rbp-140h]
for ( i = 0; i < strlen(argv[1]) + 3; ++i )
{
  v3 = (*__ctype_b_loc())[dest[i]];
  if ( !(v3 & 8) && dest[i] != ' ' )
    goto FAIL;
}
system(dest);
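A small Python model of the check shows why a zero length defeats it. The "id " prefix in `dest` is an assumption (it would explain the + 3 in the loop bound), and `v3 & 8` is read as the alphanumeric class bit.

```python
def passes_check(dest, argv1_len):
    """Mirror the loop: only the first argv1_len + 3 bytes of dest are validated."""
    for i in range(argv1_len + 3):
        c = dest[i]
        if not (c.isalnum() or c == " "):
            return False   # corresponds to `goto FAIL`
    return True


payload = "x;sh"
dest = "id " + payload               # assumed command prefix

print(passes_check(dest, len(payload)))  # False: the ';' is caught
print(passes_check(dest, 0))             # True: only the prefix is inspected
```

With the length forced to zero, everything after the prefix reaches system() unchecked.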

sokoban

I think this challenge is completely absurd = =
Although the initial vulnerability is hidden fairly well.

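Regular levels all have walls around the edge, so although moves are not bounds-checked, you never leave the board.
But the total number of levels is computed incorrectly at the start and is one too many.
So when a level is picked at random, there is a chance of drawing this out-of-range level, which is an empty board.
That lets you walk off the board, and the box-pushing feature can then push a nonzero byte into an adjacent position (+32, -32, +1, -1).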

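The way to exploit it from here is what's absurd...
The intended goal is to jump to 0x401A9A with eax set to 1 beforehand, so that sub_401678() is executed next and prints the flag.
But a bug that can only modify a single 0 byte is extremely hard to use... In the end the only thing exploitable is modifying the .got.plt entry of rand():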

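  1. rand()'s offset in libc is 0x3d060; under ASLR there is a 1/16 chance that its address ends in ?0060, and this 0 byte sits at 0x60C121
  2. 0x3e260 in libc is a magical gadget: add rsp, 0x28; (pop XX)*6; ret
  3. rand() is called when the board is re-randomized, and after rsp is shifted by 0x58 bytes... the value on top of the stack happens to be exactly the address 0x401A9A that execution was going to return to
  4. Even more absurd, there is a hidden feature: pressing the v key adds 0x12 to the value at 0x60C121. It looks useless, but e260 and d060 happen to differ by 0x12 in that byte (0x1200 in total)... so once the address ends in 1260, it lands exactly on this gadget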
.got.plt:000000000060C110 off_60C110      dq offset rand

.bss:000000000060C8E0 ; char board[640]
.bss:000000000060C8E0 board           db 280h dup(?)    

    3e260:       48 83 c4 28             add    rsp,0x28
    3e264:       5b                      pop    rbx
    3e265:       5d                      pop    rbp
    3e266:       41 5c                   pop    r12
    3e268:       41 5d                   pop    r13
    3e26a:       41 5e                   pop    r14
    3e26c:       41 5f                   pop    r15
    3e26e:       c3                      ret
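The coincidences listed above can be verified with a few lines of arithmetic. This sketch only uses the offsets quoted in the text; the set of candidate libc bases is synthetic and merely enumerates the 16 possible values of the relevant address nibble.

```python
RAND_OFF = 0x3D060     # rand() offset in libc
GADGET_OFF = 0x3E260   # add rsp, 0x28; pop * 6; ret

# The gadget sits exactly 0x1200 past rand(): 0x12 in the second-lowest byte.
delta = GADGET_OFF - RAND_OFF

# Stack shift performed by the gadget before its ret: 0x28 plus six 8-byte pops.
stack_shift = 0x28 + 6 * 8


def second_byte(addr):
    return (addr >> 8) & 0xFF


# Page-aligned bases that differ only in bits 12..15.
bases = [0x7F0000000000 + (n << 12) for n in range(16)]
good = [b for b in bases if second_byte(b + RAND_OFF) == 0]

print(hex(delta), hex(stack_shift), len(good), "of", len(bases))
print(hex(second_byte(good[0] + GADGET_OFF)))  # the byte needed at that position
```

Exactly one of the 16 candidates leaves a zero byte in rand()'s address, which matches the 1/16 chance, and for that candidate the gadget needs 0x12 in the same position.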

Also, sending '\x00' at 'press any key' sets eax to 1; _wgetch() is the last place that modifies eax before rand(). All in all, everything is a perfect coincidence.

icbm

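The first half is a simple format string + buffer overflow.
The second half is crypto: you control the IV of AES in CBC mode, and if you manage to produce '\xeb\xfe' the program hangs, which lets the IV be brute-forced.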

bookstore

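皮皮 found the bug, but by the principle of comparative advantage I wrote the exploit.
Roughly, book[20] (0-based) overlaps with a temporary book; with some tweaking you can leak a function pointer and derive the code base address. With some more tweaking you can go the other way and overwrite the function pointer; in any case, first turn it into printf(name) to create a format string vulnerability.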

Test


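Junk post test...

I may write CTF writeups and the like here in the future; longer posts that are hard to read on facebook will go here too...
I'll tweak the theme and such when I'm in the mood.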