Matt Molinyawe
Security Researcher
HP Security Research – Zero Day Initiative
Many of us here at the ZDI are fortunate to see some of the world’s best vulnerability research, submitted by researchers around the world. For those of us who work at the ZDI, it’s literally nothing but zero-day, every day. And we’re not just saying that: last year the program published a record number of vulnerabilities, the most for a single year in the history of the Zero Day Initiative.
An interesting case came in through the program in late October from a researcher named n3phos. The report contained vulnerability information affecting the win32k.sys kernel component on Windows 8.1 x64, and the examples included in the case were very well-documented and well-written. We recently released an advisory for the case, which is ZDI-15-030 in our system. It is also known as CVE-2015-0058 to MITRE, and was addressed by Microsoft as part of MS15-010. Here is a write-up of the submission, which we felt was exceptional and wanted to share with the research community.
Let’s start things off with a demo of the Windows Kernel privilege escalation for Windows 8.1 x64:
Similar to the old phrase “cleanliness is next to godliness”, this privilege escalation cleaned up after itself to prevent crashing the operating system and attained SYSTEM privileges. The submission came with source code that included bypasses for ASLR and SMEP as well as full continuation of execution. I compiled the source to verify this case. As you can see in the video, this was a pretty straightforward case to look at.
Vulnerability Analysis
The report had noted that a crash would occur with the following actions taken:
hCursorA = CreateCursor( NULL, 1, 1, 4, 4, AndMask, XORMask);
hCursorB = CreateCursor( NULL, 1, 1, 4, 4, AndMask, XORMask);
linked = CallService( __NR_NtUserLinkDpiCursor, SYSCALL_ARG(hCursorA), SYSCALL_ARG(hCursorB), SYSCALL_ARG(0x30), );
CallService(__NR_NtUserDestroyCursor, SYSCALL_ARG(hCursorB), SYSCALL_ARG(0x0), );
CallService(__NR_NtUserDestroyCursor, SYSCALL_ARG(hCursorA), SYSCALL_ARG(0x0), );
I compiled an executable for this code and ran it in release mode, and a screen appeared called the “Sad Face of Sorrow” (formerly colloquially known as the “Blue Screen of Death”).
Figure 1: Sad Face of Sorrow
The following crash stack signature appeared in the kernel debugger:
Figure 2: The crash stack signature
Looking at the access violation, it appeared that the memory was freed and accessed again by the call to DestroyCursor.
Figure 3: Debug of free
The debug session of the crash verified the researcher’s findings in the report, in which n3phos had noted:
There was an attempt made to free a memory location which has already been freed before (double free). This happens during the second call to NtUserDestroyCursor where CursorA gets destroyed and is caused by the reuse of a dangling pointer to the already freed CursorB. By linking CursorA and CursorB together with a call to NtUserLinkDpiCursor, all we have to do in order to hit the double free, is to destroy CursorB before CursorA. And since we have control between the two calls, we can easily replace the freed CursorB.
How the cursors are linked
The report noted the following about cursors inside of NtUserLinkDpiCursor:
Figure 4: A closer look at NtUserLinkDpiCursor
LinkDpiCursor takes three arguments -- two valid cursor handles and one dword as a new dpi value. It first checks that the dpi is a multiple of 0x10 and in the range of 0x10 – 0x40. Then GetCursorForDim checks whether CursorA’s current dpi is equal to the newly provided dpi. If it is, the function fails. The default dpi value for a cursor created with CreateCursor is 0x20. By supplying 0x30 as the argument, we can pass GetCursorForDim and reach the linking code which, when simplified, looks like this:
CursorB->prevPointer = CursorA
CursorB->nextPointer = CursorA->nextPointer
CursorA->nextPointer = CursorB
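The checks and the linking described above can be sketched as a small, portable C model. This is only an illustration under stated assumptions: the `Cursor` struct, `dpi_is_valid`, and `link_dpi_cursor` are hypothetical names that mirror the simplified pseudocode, not the real win32k structures or functions, and the comparison against the current dpi merely stands in for the GetCursorForDim check.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical model of a cursor object; the field names follow the
   simplified pseudocode, not the real win32k layout. */
typedef struct Cursor {
    uint32_t dpi;                /* 0x20 by default according to the report */
    struct Cursor *prevPointer;
    struct Cursor *nextPointer;
} Cursor;

/* The dpi must be a multiple of 0x10 in the range 0x10..0x40. */
static bool dpi_is_valid(uint32_t dpi) {
    return dpi >= 0x10 && dpi <= 0x40 && (dpi % 0x10) == 0;
}

/* Returns true if the cursors were linked, false if a check failed. */
static bool link_dpi_cursor(Cursor *a, Cursor *b, uint32_t dpi) {
    if (!dpi_is_valid(dpi))
        return false;
    if (a->dpi == dpi)           /* stand-in for the GetCursorForDim check */
        return false;
    b->prevPointer = a;
    b->nextPointer = a->nextPointer;
    a->nextPointer = b;
    return true;
}
```

In this model, two default cursors (dpi 0x20) are rejected for a dpi of 0x20 and linked for a dpi of 0x30, which matches the value used in the PoC.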
Here’s more information regarding the cursor object:
Figure 5: Empty cursor object on the way
When calling CreateCursor, a new empty cursor object gets allocated through HMAllocObject, which then calls Win32AllocPool. What’s important to note here is the allocation size of 0x98 bytes and the POOL_TYPE enumeration value 0x21, which stands for “PagedPoolSession.” This information will be useful later on when utilizing this bug.
Figure 6: Inside DestroyCursor
The code checks whether a specific cursor flag is set. If it is not set, the function proceeds to check if the cursor has its nextPointer initialized and if so, takes the branch to the recursive DestroyCursor call.
However, if the cursor flag is set, the code path on the left of the figure gets taken and some unlinking is performed. When a cursor is created with CreateCursor, this flag is never set. What happens in the PoC is the following:
- CursorA and CursorB get linked together.
- CursorB gets normally destroyed and freed, no unlinking is performed.
- CursorA gets destroyed, with the branch taken to the recursive DestroyCursor call because its nextPointer points to CursorB.
- Previously destroyed CursorB gets destroyed again.
It is now clear that one can easily take advantage of this bug between the second and third steps by replacing the freed cursor object.
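The four steps can be replayed in a harmless user-mode model. In the sketch below nothing is actually freed, so the second visit to CursorB can be counted safely instead of corrupting memory. The struct layout, the `unlinkFlag` name, and both function names are assumptions made for illustration only.

```c
#include <stdbool.h>
#include <stddef.h>

typedef struct Cursor {
    bool unlinkFlag;             /* the flag that CreateCursor never sets */
    bool freed;                  /* model-only marker instead of a real free */
    int destroyCount;            /* how many times this object was destroyed */
    struct Cursor *nextPointer;
} Cursor;

/* Model of the DestroyCursor path described above. Without the flag,
   no unlinking happens and a non-NULL nextPointer is followed blindly. */
static void destroy_cursor(Cursor *c) {
    if (!c->unlinkFlag && c->nextPointer != NULL)
        destroy_cursor(c->nextPointer);   /* may follow a dangling pointer */
    c->destroyCount++;
    c->freed = true;
}

/* Replays the PoC steps and returns how often CursorB was destroyed. */
static int run_poc_model(void) {
    Cursor a = {false, false, 0, NULL};
    Cursor b = {false, false, 0, NULL};
    a.nextPointer = &b;          /* CursorA and CursorB get linked */
    destroy_cursor(&b);          /* CursorB destroyed; A still points to it */
    destroy_cursor(&a);          /* recursion reaches the stale CursorB */
    return b.destroyCount;
}
```

The model returns 2 for CursorB, which is the double destroy from the report; in the real kernel the second visit operates on whatever now occupies the freed memory.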
Exploitation
n3phos then looked more closely into the DestroyCursor function. During this function there is a call made to CleanupCursorObject:
Figure 7: Calling CleanupCursorObject
If an attacker controls the values at offset 0x38 and offset 0x40, they can free an arbitrary object of their choice. Doing so requires some kind of memory leak, because the attacker needs to know the address of the object to free.
Replacing the cursor with something useful
As mentioned earlier, the cursor object gets allocated on the PagedPoolSession. This means that we have to exclude pretty much all the allocations that are used in the ntoskrnl module as a possible replacement for the cursor since they get allocated on the NonPagedPoolNx (PoolType 0x200). The small allocation size of 0x98 bytes is also a problem because most of the GDI objects are bigger than that.
A possible object that would fit would be, for example, a solid brush (0x98 bytes in size). But because it gets allocated with Win32AllocateFromPagedLookasideList, its address will never be the same as that of the freed cursor. One further restriction is that the replacement object needs a reference count of zero.
The researcher decided to use a gesture info structure.
Figure 8: AllocGestureInfo
Like the cursor, this gesture info structure gets allocated by HMAllocObject. What really matters is that we have enough control of its members to trigger the arbitrary free in CleanupCursorObject. ulArguments is @ offset 0x38 in the cursor and needs to be nonzero; arbitraryFree @ offset 0x40 is where the leaked object address gets written. The size of this gesture info object is calculated as follows:
0x30(cbSize) + 0x40(cbExtraArgs) + 0x30 (internally) = 0xa0 bytes. (The cursor is actually 0xa0 bytes big)
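The size arithmetic above is easy to check in a few lines of C. The rounding helper rests on an assumption that is not stated in the report: that pool allocations on x64 are handed out in 0x10-byte units, which would explain why a 0x98-byte cursor request ends up occupying 0xa0 bytes. All identifiers below are illustrative.

```c
#include <stddef.h>

/* Sizes quoted in the text. */
enum {
    GESTURE_CB_SIZE       = 0x30,
    GESTURE_CB_EXTRA_ARGS = 0x40,
    GESTURE_INTERNAL      = 0x30,
    CURSOR_REQUESTED_SIZE = 0x98
};

/* Assumption: pool blocks are sized in 0x10-byte units on x64. */
static size_t round_up_to_granularity(size_t size) {
    const size_t granularity = 0x10;
    return (size + granularity - 1) & ~(granularity - 1);
}

/* Total size of the gesture info object as broken down in the text. */
static size_t gesture_info_size(void) {
    return GESTURE_CB_SIZE + GESTURE_CB_EXTRA_ARGS + GESTURE_INTERNAL;
}
```

Under that assumption both objects land in the same 0xa0-byte size class, which is what makes the gesture info object a candidate replacement for the freed cursor.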
Leaking an object
The object used to leak was a Palette object.
This object can be created with the CreatePalette GDI function. It takes one logical palette as an argument:
palNumEntries
The number of entries in the logical palette.
palPalEntry
Specifies an array of PALETTEENTRY structures that define the color and usage of each entry in the logical palette.
A PALETTEENTRY is basically a DWORD that defines the RGB values the palette uses and is laid out like this: 0x00bbggrr. The top byte (zero here) is a flag. In memory, the palette looks something like this:
Figure 9: The palette object
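To make the 0x00bbggrr layout of a palette entry concrete, the following sketch packs and unpacks the four bytes. The helper names are hypothetical; the byte order follows the description in the text, with red in the lowest byte and the flag in the highest.

```c
#include <stdint.h>

/* Packs a palette entry as the DWORD 0xffbbggrr, where ff is the flag byte. */
static uint32_t pack_palette_entry(uint8_t r, uint8_t g, uint8_t b, uint8_t flags) {
    return (uint32_t)r
         | ((uint32_t)g << 8)
         | ((uint32_t)b << 16)
         | ((uint32_t)flags << 24);
}

/* Accessors for the individual channels of a packed entry. */
static uint8_t entry_red(uint32_t e)   { return (uint8_t)(e & 0xff); }
static uint8_t entry_green(uint32_t e) { return (uint8_t)((e >> 8) & 0xff); }
static uint8_t entry_blue(uint32_t e)  { return (uint8_t)((e >> 16) & 0xff); }
static uint8_t entry_flags(uint32_t e) { return (uint8_t)((e >> 24) & 0xff); }
```

Because every entry is exactly four bytes, reading or writing a run of entries amounts to reading or writing a run of DWORDs, which is why the entries array is so useful later on.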
When the palette gets allocated, its size is calculated like this:
0x98 (which is the basic object size) + 4 * numEntries
One can control the size of the palette to an extent, which will be important later on when we leak it. (Besides that, this object has some very interesting members, so if you ever happen to have a bug in GDI you might want to have one of these.)
For example, by overwriting the numEntries member we can read and write out of bounds (on the PagedPool). By overwriting the palEntries pointer at offset 0x80, we can read and write anywhere. Also, the “this” pointer will be quite useful in the information leak. To read and write we just call the following from Gdi32 in userland:
GetPaletteEntries (reading)
SetPaletteEntries (writing)
xxxBMPtoDIB
To understand how the “information leak” works, we first need to know a bit more about DIBs and the Clipboard.
From the MSDN description:
A DIB (device-independent bitmap) is a format used to define device-independent bitmaps in various color resolutions…
… A DIB is normally transported in metafiles (usually using the StretchDIBits function), BMP files, and the Clipboard (CF_DIB data format)…
…The header actually consists of two adjoining parts: the header proper and the color table. Both are combined in the BITMAPINFO structure, which is what all DIB APIs expect
-------------------
BITMAPINFO structure:
biBitCount
The number of bits-per-pixel. The biBitCount member of the BITMAPINFOHEADER defines the maximum number of colors in the bitmap.
4 The bitmap has a maximum of 16 colors, and the bmiColors member of BITMAPINFO contains up to 16 entries.
8 The bitmap has a maximum of 256 colors, and the bmiColors member of BITMAPINFO contains up to 256 entries.
16 The bitmap has a maximum of 2^16 colors.
bmiColors
An array of RGBQUAD elements (similar to PALETTEENTRY). The elements of the array make up the color table.
-------------------
These are the important fields. As mentioned in the MSDN description, the BITMAPINFO structure consists of a BITMAPINFOHEADER followed by a color table (bmiColors). The color table is just an array of integers and its maximum size is specified by the biBitCount member. Now if we create (for example) a DIB with a bit count of 4, we would need to allocate 0x68 bytes of memory, because 0x28 bytes are used for the header (biSize) and 0x40 bytes would be used for the color table (maximum number of entries * 4 = 0x10 (16 entries) * 4 = 0x40 bytes).
This is all we need to know about DIBs, so the next thing to look at is the clipboard. The clipboard is used by applications to transfer data between them, for example when you copy and paste text, pictures, and other formats. There are so-called standard clipboard formats that are defined by the system.
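The header-plus-color-table arithmetic can be written as a small helper. This is a sketch that only models bit counts up to 8, where the color table holds at most 2^biBitCount four-byte entries as in the MSDN excerpt; the function names are made up for illustration.

```c
#include <stddef.h>

enum { BITMAPINFOHEADER_SIZE = 0x28, RGBQUAD_SIZE = 4 };

/* Maximum number of color table entries for bit counts 1..8.
   Returns 0 for anything else, which this sketch does not model. */
static size_t max_color_entries(unsigned bitCount) {
    if (bitCount < 1 || bitCount > 8)
        return 0;
    return (size_t)1 << bitCount;
}

/* Bytes needed for the header followed by a full color table. */
static size_t dib_header_and_table_size(unsigned bitCount) {
    return BITMAPINFOHEADER_SIZE + max_color_entries(bitCount) * RGBQUAD_SIZE;
}
```

For a bit count of 4 this gives 0x28 + 0x10 * 4 = 0x68 bytes, the figure used in the text; for a bit count of 8 the same formula gives 0x428 bytes.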
To place something on the clipboard, one has to call OpenClipboard first and then make a call to SetClipboardData. This takes the format (a constant value) as a first argument and a HANDLE to the data in the specified format as a second argument. To get something from the clipboard we call GetClipboardData and pass the format we want.
Another thing we need to know is that the clipboard can convert data between certain clipboard formats. If we request data in a format that is not on the clipboard, the system converts an available format to the requested format. For example if we put normal text on the clipboard and we request data in CF_UNICODETEXT format, the text gets converted to Unicode. Converting a special bitmap to a DIB, however, leads to uninitialized data being leaked. In order to reach the vulnerable function xxxBMPtoDIB in win32k there needs to be a “dummy Dib” on the clipboard. This can be achieved by:
- Opening the clipboard.
- Emptying the clipboard.
- Placing a bitmap handle on the clipboard.
- Closing the clipboard (munging the clipboard data).
We then proceed with these steps to leak uninitialized data:
- Reopen the clipboard.
- Place the special bitmap on the clipboard via SetClipboardData.
- Place some other required formats.
- Request data in the format of CF_DIB via GetClipboardData to convert the bitmap to DIB.
We can repeat this procedure as many times as we wish. This allows us to reach a deterministic state in which the leaked data is the same over and over again, giving us certainty that a valid object will indeed be allocated at the leaked address. While this works, relying on the clipboard also comes with some caveats.
Calling CreateBitmap with these arguments is all it takes:
hbm = CreateBitmap(
    1,        // width
    1,        // height
    1,        // planes
    5,        // bitsPerPel
    ppvBits
);
Each bitmap that gets created usually has a BITMAP structure (userland) and a palette (in the kernel object) associated with it. Not in this case though: this bitmap will not have a palette associated with it, and the fourth parameter, bitsPerPel, gets rounded up to 8 for some reason and is saved in the BITMAP structure. When converting the bitmap to a DIB, this is what happens in xxxBMPtoDIB:
Figure 10: Inside xxxBMPtoDIB
This function takes the bitmap we put on the clipboard earlier and uses the bitsPerPel BITMAP structure member from userland to calculate the size of the DIB color table. Remembering that the maximum number of entries of a DIB with biBitCount = 8 is 256, we can calculate the size as follows:
0x100 * 4 (color table) + 0x28 (header size) + 0x4 ( imageSize )= 0x42c bytes
Figure 11: More xxxBMPtoDIB action
Later in xxxBMPtoDIB, the buffer allocated above gets passed to GetDIBitsInternal. GreGetDIBitsInternalWorker would be responsible for initializing the color table @ offset 0x28, but it never reaches that code: the function fails in bIsCompatible at the beginning because the bitmap has no palette associated with it. Since only the first 0x28 bytes are initialized, it is possible to leak up to 0x404 bytes (0x42c - 0x28) of uninitialized memory. This gives us enough power to read the internal object pointers of a palette and predict (or know) where the next palette gets allocated. By allocating palettes with 0xe5 entries and then deleting them again, we can force xxxBMPtoDIB to reuse the freed memory of the palette and leak the “this” pointer @ offset 0x88.
0x98 + 4 * 0xe5 = 0x42c bytes
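The reason 0xe5 entries is the right count follows directly from the two size formulas in the text. The helper names below are hypothetical; the constants are the ones quoted above.

```c
#include <stddef.h>

enum {
    PALETTE_BASE_SIZE  = 0x98,   /* basic palette object size */
    PALETTE_ENTRY_SIZE = 4,      /* one palette entry is a DWORD */
    DIB_HEADER_SIZE    = 0x28
};

/* Buffer size computed in xxxBMPtoDIB for a bit count of 8. */
static size_t bmp_to_dib_buffer_size(void) {
    return 0x100 * 4 + DIB_HEADER_SIZE + 0x4;
}

/* Allocation size of a palette with the given number of entries. */
static size_t palette_alloc_size(size_t numEntries) {
    return PALETTE_BASE_SIZE + PALETTE_ENTRY_SIZE * numEntries;
}

/* Entry count whose palette is exactly `target` bytes, or 0 if no
   positive entry count produces that size. */
static size_t entries_for_size(size_t target) {
    if (target <= PALETTE_BASE_SIZE)
        return 0;
    if ((target - PALETTE_BASE_SIZE) % PALETTE_ENTRY_SIZE != 0)
        return 0;
    return (target - PALETTE_BASE_SIZE) / PALETTE_ENTRY_SIZE;
}
```

Solving for the DIB buffer size of 0x42c yields 0xe5 entries, so a freed palette of that size is a natural candidate to be reused for the conversion buffer.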
Once we have leaked the address of the target palette, we can just write it to the arbitraryFree member from the gestureInfo structure and call DestroyCursor to free the palette through CleanupCursorObject.
One problem that all of these objects face is the issue that they do not get immediately freed, but instead get placed on the DeferredFreePool. This problem can be solved by allocating 32 objects of the desired size and then deleting them right after to trigger a call to nt!ExDeferredFreePool, which finally releases the object we want to replace.
Figure 12: Clearing out the DeferredFreePool
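The effect of the deferred free list can be pictured with a toy model. The sketch below assumes that frees are queued and only released in a batch once 32 are pending, which is how the text's "32 objects" is read here; the `DeferredList` type and both functions are invented for this illustration and do not correspond to kernel code.

```c
#include <stdbool.h>
#include <stddef.h>

#define DEFERRED_THRESHOLD 32    /* assumption based on the "32 objects" above */

typedef struct {
    int pending[DEFERRED_THRESHOLD];  /* ids of objects waiting to be released */
    size_t count;
    size_t flushes;                   /* number of batch releases so far */
} DeferredList;

/* Queues a free. When the list is full, everything pending is released
   at once. Returns true if this call triggered the batch release. */
static bool deferred_free(DeferredList *list, int objectId) {
    list->pending[list->count++] = objectId;
    if (list->count < DEFERRED_THRESHOLD)
        return false;
    list->count = 0;
    list->flushes++;
    return true;
}

/* True while the object is queued but not yet really released. */
static bool is_pending(const DeferredList *list, int objectId) {
    for (size_t i = 0; i < list->count; i++) {
        if (list->pending[i] == objectId)
            return true;
    }
    return false;
}
```

In this model the target object stays pending after its own free and is only released once enough additional frees push the list to its threshold, which is the behavior the filler allocations are meant to provoke.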
Replacing the palette with our fakepalette
Luckily, there is a very convenient way to replace the freed palette: NtUserConvertMemHandle. This function copies the contents of a memory buffer from userland to kernelland on the PagedPool. The only thing we need to take into account is that the kernel buffer is not QWORD aligned, so the structure for the fakepalette has to be adjusted a little.
The obvious approach would be to store the shellcode in the palette entries array @ offset 0x90, overwrite the function pointer @ offset 0x60 to point to the array, and then execute it through NtGdiGetNearestPaletteIndex. This doesn’t work, however, because the PagedPool is not executable on Windows 8. The shellcode therefore has to be executed from userland, which means that we have to disable SMEP first.
To achieve this, the report references Sebastian Apelt’s published Pwn2Own afd.sys privilege escalation write-up. We have to write the address of the HalDispatchTable into our fakepalette @ offset 0x80, where the palEntries pointer resides. We can then read the function pointer at HalDispatchTable+0x18 (via GetPaletteEntries), namely nt!ArbAddReserved, to calculate the address of nt!KiConfigureDynamicProcessor and use the instructions at its end as our ROP gadget. Finally, we overwrite the QueryIntervalProfile pointer with the gadget (via SetPaletteEntries) and execute the shellcode.
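The address calculation rests on a simple observation: two symbols in the same module keep a constant distance from each other regardless of where ASLR loads the module. The sketch below shows only that arithmetic. It assumes a 64-bit pointer is assembled from two consecutive four-byte palette entries, and every address and offset used with it is a made-up placeholder, not a value from the report.

```c
#include <stdint.h>

/* Joins two consecutive 4-byte entries into one 64-bit value
   (little-endian: the first entry is the low half). */
static uint64_t join_entries(uint32_t low, uint32_t high) {
    return ((uint64_t)high << 32) | (uint64_t)low;
}

/* Derives the runtime address of a target from a leaked address in the
   same module, given both offsets from the module base. */
static uint64_t rebase(uint64_t leakedAddress,
                       uint64_t leakedOffset,
                       uint64_t targetOffset) {
    uint64_t moduleBase = leakedAddress - leakedOffset;
    return moduleBase + targetOffset;
}
```

With placeholder values, a leaked address of 0xfffff80000123000 at offset 0x123000 and a target at offset 0x456000 would resolve to 0xfffff80000456000; the real offsets depend on the exact kernel build.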
To recap, the provided example performed the following:
- Leak the address of a palette object via Clipboard format conversion.
- Create two Cursors, CursorA and CursorB.
- Call NtUserLinkDpiCursor to link the cursors together.
- Destroy and free CursorB via NtUserDestroyCursor.
- Create a gestureInfo object on the PagedPoolSession of size 0xa0 to replace the freed CursorB.
- Destroy and free CursorA via NtUserDestroyCursor and free the target palette through CleanupCursorObject.
- Call NtUserConvertMemHandle to replace the freed palette of size 0x42c.
- Leak nt!ArbAddReserved from HalDispatchTable to compute the ROP gadget address and evade ASLR.
- Perform a write to nt!HalDispatchTable to overwrite the QueryIntervalProfile pointer with the gadget address from nt!KiConfigureDynamicProcessor as ROP entry point.
- Execute Single-Gadget-ROP to disable SMEP.
- Directly return from gadget to userland code and execute the shellcode.
- Shellcode: Replace current process token with token of the SYSTEM process.