#16
Found this on my feed and wanted to share:
├─Jack-in-the-Cache: A New Code Injection Technique through Modifying X86-to-ARM Translation Cache (this one is from the same author I mentioned before in #3).
│ It was presented at Black Hat, and this time the video is in English, while the CODE BLUE one was in Japanese.
│─────https://i.blackhat.com/eu-20/Wednesday/eu-20-Nakagawa-Jack-In-The-Cache-A-New-Code-Injection-Technique-Through-Modifying-X86-To-Arm-Translation-Cache.pdf
└─────https://www.youtube.com/watch?v=8wg7X5IaEto
├─Discovering a new relocation entry of ARM64X in recent Windows 10 on Arm
│─────https://ffri.github.io/ProjectChameleon/new_reloc_chpev2/
└─────https://github.com/FFRI/ProjectChameleon/
Last edited by sh3dow; 02-23-2022 at 09:04.
#17
Hmpf... so I have some nice function-hijacking code that works perfectly from ARM64 to ARM64.
Now, we also know that all processes on ARM64 start execution as ARM64 (or at least I think so), so at the very start of every program we should enter ntdll's LdrInitializeThunk, and for ARM64 on ARM64 that is of course the case. For other architectures on ARM64 we should at some point divert into emulated code.

However, when I inject a detour into LdrInitializeThunk of an x64 process created suspended on ARM64, that code never seems to be executed: I can inject garbage and the process still starts up just fine.

My assumption of how x64 on ARM64 works is that as soon as execution enters a system DLL, i.e. anything compiled as ARM64X, we exit emulated x64 mode and execute the native ARM code in the DLL. So it stands to reason that bootstrapping a process behaves analogously: everything is executed as native ARM code until it is time to call the x64 process's entry point.

Well, it seems something isn't quite right here. One possibility is that the ARM64X DLLs truly have all the code doubled, including large portions of the ARM code, so when I manipulate LdrInitializeThunk I do it to a copy that will never be used. I find that strange; I would have assumed that the code wouldn't be doubled and that MSFT would have some smart redirection in place, allowing the ARM64X DLLs to reuse most of the ARM code for both native and emulated mode.

@RamMerLabs, since you apparently already have a lot of experience with the layout of the new PE files, would you maybe have a few tips? Is the code really fully separated?
#18
After some more research I can confirm that this is what happens: when I take the address of #LdrInitializeThunk from the symbol file for ntdll and use that, my injected code works.
So... the next question is how to get the "export" addresses without needing a PDB file. It was mentioned earlier that these ARM64X files have a second export directory, so I guess parsing that "by hand" would be the straightforward approach. Unless there is a flag that can be passed to LdrLoadDll that would do this for me?
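For readers following along, parsing an export directory "by hand" could look roughly like the sketch below. It is not the FindDllExport posted later in this thread. It assumes the module is available as a buffer laid out by RVA (as in a mapped image), a little-endian host, and it ignores forwarders and ordinal-only exports. The struct mirrors the field layout of IMAGE_EXPORT_DIRECTORY from winnt.h; the names export_dir_t, read_at and find_export_rva are made up for illustration.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Same field layout as IMAGE_EXPORT_DIRECTORY in winnt.h, redeclared here
   so the sketch builds without the Windows SDK. */
typedef struct {
    uint32_t Characteristics;
    uint32_t TimeDateStamp;
    uint16_t MajorVersion;
    uint16_t MinorVersion;
    uint32_t Name;
    uint32_t Base;
    uint32_t NumberOfFunctions;
    uint32_t NumberOfNames;
    uint32_t AddressOfFunctions;    /* RVA of uint32_t[NumberOfFunctions] */
    uint32_t AddressOfNames;        /* RVA of uint32_t[NumberOfNames]     */
    uint32_t AddressOfNameOrdinals; /* RVA of uint16_t[NumberOfNames]     */
} export_dir_t;

/* Hypothetical helper: bounds-checked copy of len bytes at the given RVA. */
static int read_at(const uint8_t *img, size_t size, uint64_t rva,
                   void *out, size_t len)
{
    if (rva > size || len > size - rva)
        return 0;
    memcpy(out, img + rva, len);
    return 1;
}

/* Hypothetical helper: returns the RVA exported under `name` by the export
   directory located at dir_rva, or 0 if not found or malformed. */
uint32_t find_export_rva(const uint8_t *img, size_t size,
                         uint32_t dir_rva, const char *name)
{
    export_dir_t dir;
    size_t want = strlen(name) + 1; /* compare including the terminating NUL */

    if (!read_at(img, size, dir_rva, &dir, sizeof dir))
        return 0;

    for (uint32_t i = 0; i < dir.NumberOfNames; i++) {
        uint32_t name_rva, func_rva;
        uint16_t index;

        if (!read_at(img, size, (uint64_t)dir.AddressOfNames + 4ull * i,
                     &name_rva, sizeof name_rva))
            return 0;
        if (name_rva > size || want > size - name_rva)
            continue;
        if (memcmp(img + name_rva, name, want) != 0)
            continue;

        /* The name ordinal table maps name index -> function table index. */
        if (!read_at(img, size, (uint64_t)dir.AddressOfNameOrdinals + 2ull * i,
                     &index, sizeof index))
            return 0;
        if (index >= dir.NumberOfFunctions)
            return 0;
        if (!read_at(img, size, (uint64_t)dir.AddressOfFunctions + 4ull * index,
                     &func_rva, sizeof func_rva))
            return 0;
        return func_rva;
    }
    return 0;
}
```

Pointing dir_rva at the second export directory instead of the one named in the data directory should then yield the addresses that an emulated process sees, if the assumption about the second directory holds.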
#19
Quote:
As I said, if a particular process needs an x64 image instead of arm64, then after DVRT processing, the image will be perceived by the loader as ARM64EC. Look at the WOW-thunks table. It combines data from the "Code Ranges to EPs Table" and the "Redirection Metadata Table" in the HybridPE header.
Last edited by RamMerLabs; 02-24-2022 at 16:46.
#20
Splendid! Where can I find some documentation or reverse engineering notes on the new PE features? For my use case it's not enough to see them in a tool; I need to get to the relevant addresses for code injection programmatically.
I have already found this: https://ffri.github.io/ProjectChameleon/new_reloc_chpev2/ but it's from 2019. I imagine that by now there should be more refined information available?
#21
All you need is the WDK, starting from build 21277, and a little research perseverance.
BTW, the features discussed here were first introduced at the very end of 2020, not 2019. And nothing new has been added since then.
#22
I was somehow under the impression the article was older, but you seem to be right. It was maybe a bit too late yesterday, LOL.
Downloading the newest WDK right now...
#23
Quote:
Also, the author stated that although the title says "Windows 10 on Arm", the results are also valid for Windows 11 on Arm. That is probably evidence of how slowly MS adds new features. There is no new, more refined public information so far, and he is the only publicly known researcher working on this subject. I actually recommend that you contact him. I'm sure he will be delighted that another reverse engineer is interested in his research, and that it will spark an interesting chat on the subject. I'm sure you will get a lot of info and your questions answered.
#24
Indeed I should contact him, good idea
But now let's talk about what we can get from the relevant header files:
...\10.0.22000.0\um\winnt.h and ...\10.0.22000.0\km\ntimage.h

With the attached FindDllExport we can extract exports from a loaded image. In the example below we run an ARM64 process and target an x64 process. For ourselves we get the normal ntdll.dll!LdrLoadDll, while for the x64 process we get ntdll.dll!EXP+#LdrLoadDll.
Code:
//ntdllBase 0x00007ffa5a7d0000
//0x00007ffa5a811050 {ntdll.dll!LdrLoadDll(void)}
//0x00007ffa5a7d1890 {ntdll.dll!EXP+#LdrLoadDll}
//0x00007ffa5a969920 {ntdll.dll!#LdrLoadDll}

// Our own (native ARM64) view of ntdll
HMODULE hNtdll = GetModuleHandleW(L"ntdll.dll");
//DWORD64 LLW1 = (DWORD64)GetProcAddress(hNtdll, "LdrLoadDll");
DWORD64 LLW1 = FindDllExport(GetCurrentProcess(), (DWORD64)hNtdll, "LdrLoadDll");

// The target (x64) process's view of ntdll
DWORD64 ntdllBase = FindDllBase(hProcess, L"\\system32\\ntdll.dll");
DWORD64 LLW2 = FindDllExport(hProcess, ntdllBase, "LdrLoadDll");

For the x64 process the value at 0x178 is overwritten with 0x308810, which is the alternative export directory. This operation is shown by PE Anatomist under "Loader Config -> Dyn. Value Relocs", 2nd entry. So if we don't have a loaded process, we could extract the value from there and read the second export directory from disk directly.

In "Debug -> POGO" we see that the export directory starts at 0x308810 and is 0x2b224 in size. First comes the alternative directory, 0x308810 to 0x31E1A0, and given the entry size it seems the primary one, starting at 0x31E1A0, goes to 0x333A34. Both are similar in size, so there does not seem to be a third one for the ntdll.dll!#... exports. I assume that's because there is no typical use case where a process would want those directly: an ARM64 process loads the primary table, an x64 process the alternative table, and 32-bit ones have their own ntdlls in the WoW folders.

So where do we go from here? We notice that "Loader Config -> hybrid PE -> WoW Thunks Metadata" seems to hold all the RVAs we get from the export directory, and the destinations are the addresses of the #... functions. So we can get the !EXP+#... function addresses from the export directory and look up the #... addresses in this RedirectionMetadata table. FindDllExport now checks whether the first character is a '#' and, if so, triggers the additional lookup.
Code:
DWORD64 LLW3 = FindDllExport(hProcess, ntdllBase, "#LdrLoadDll");

One thing I haven't figured out yet is how to get the alternative export directory if we don't have a process with emulation at our disposal. I'm not sure where PE Anatomist gets the "Dyn. Value Relocs" from; for me, in a live process DynamicValueRelocTable is NULL, and neither base + DynamicValueRelocTableOffset nor loaderConfig + DynamicValueRelocTableOffset seems to yield valid data. While it's not required for the use case at hand, I would like to know how to get to this list as well. Any tips would be greatly appreciated.
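The extra lookup triggered by a leading '#' can be pictured with the minimal sketch below. It is not the attached FindDllExport. It assumes the redirection metadata is an array of source/destination RVA pairs, modeled on what I believe is IMAGE_ARM64EC_REDIRECTION_ENTRY in winnt.h, and that the EXP+# RVA has already been read from the export directory. The names redirection_entry_t, lookup_redirection and resolve_export are hypothetical.

```c
#include <stddef.h>
#include <stdint.h>

/* Assumed layout of one redirection metadata entry (a pair of RVAs),
   redeclared here so the sketch builds without the Windows SDK. */
typedef struct {
    uint32_t Source;      /* RVA of the EXP+#Name thunk (what the export dir gives) */
    uint32_t Destination; /* RVA of the #Name function body                         */
} redirection_entry_t;

/* Hypothetical helper: map an EXP+# RVA to its # RVA; 0 if there is no entry. */
uint32_t lookup_redirection(const redirection_entry_t *table, size_t count,
                            uint32_t exp_rva)
{
    for (size_t i = 0; i < count; i++) {
        if (table[i].Source == exp_rva)
            return table[i].Destination;
    }
    return 0;
}

/* Hypothetical helper mirroring the '#' convention described above:
   exp_rva is the RVA found in the export directory for the name without
   the '#'. Names starting with '#' are passed through the table, all
   other names are returned unchanged. */
uint32_t resolve_export(const redirection_entry_t *table, size_t count,
                        const char *name, uint32_t exp_rva)
{
    if (name[0] == '#')
        return lookup_redirection(table, count, exp_rva);
    return exp_rva;
}
```

With the addresses from the listing above, ntdllBase 0x00007ffa5a7d0000 gives RVA 0x1890 for EXP+#LdrLoadDll and RVA 0x199920 for #LdrLoadDll, so the table would be expected to contain the pair (0x1890, 0x199920). A linear scan is used here for simplicity; whether the real table is sorted is not something this thread establishes.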
#25
So, a small progress report on my ARM64/x64 code injection experiments:
Injecting an ARM64EC library into an x64 or ARM64EC process on ARM64 works just fine, including calling an exported function from the injected shellcode. One just needs to take care to call the # function address and not the !EXP+# one returned by LdrGetProcedureAddress.

Injecting an x64 library into an x64 or ARM64EC process on ARM64 also works just fine. What does not work yet is calling an exported x64 function. I assume some additional code for the ARM64-to-x64 transition needs to be added to the call.

The next BIG problem: there is no arm32ec toolchain available. I presume ARM shellcode that just loads an x86 DLL will work, but if one wants to call an exported function (instead of just relying on the DllMain entry point), the ARM-to-x86 transition will need to be researched.

What's probably also a bit of an issue: PE Anatomist does not show any "WoW Thunks Metadata" for the hybrid 32-bit ntdll in SyChpe32, while a stack trace into a statically loaded DLL's DllMain shows the presence of # functions. So some investigation will be needed into whether this data can still be obtained from the image. Alternatively, parsing the !EXP+# thunk should allow one to find the right # address.
#26
I have figured out how to get the Dyn. Relocs Table with which we can get the alternate export directory from an image on disk:
Code:
IMAGE_LOAD_CONFIG_DIRECTORY64 LoadConfig = { 0 };
IMAGE_DATA_DIRECTORY* dir10 = &opt_hdr_64->DataDirectory[IMAGE_DIRECTORY_ENTRY_LOAD_CONFIG];
if (resolve_ec && dir10->VirtualAddress && dir10->Size >= FIELD_OFFSET(IMAGE_LOAD_CONFIG_DIRECTORY64, CHPEMetadataPointer) + sizeof(ULONGLONG)) {
    status = ReadDll(hProcess, FindImagePosition(dir10->VirtualAddress, nt_hdrs_64, DllBase), &LoadConfig, min(sizeof(LoadConfig), dir10->Size), NULL);
}

typedef struct _DYN_RELOC_TABLE {
    ULONG Unknown1;
    ULONG Unknown2;
    ULONG Unknown3;
    ULONG Unknown4;
    ULONG TableSize;
    UCHAR Entries[];
} DYN_RELOC_TABLE;

DYN_RELOC_TABLE* DynamicValueRelocTable = NULL;
if (DllBase == 0 && (resolve_ec || resolve_exp)) { // only for images on disk; on live images we take the export directory actually in use
    PIMAGE_SECTION_HEADER section = IMAGE_FIRST_SECTION(nt_hdrs);
    // DynamicValueRelocTableSection is a 1-based section index, expected to be <= nt_hdrs->FileHeader.NumberOfSections
    section += (LoadConfig.DynamicValueRelocTableSection - 1);
    ULONG pos = FindImagePosition(section->VirtualAddress, nt_hdrs_64, DllBase);
    status = ReadDll(hProcess, pos, Buffer2, min(sizeof(Buffer2), section->Misc.VirtualSize), NULL);
    DynamicValueRelocTable = (DYN_RELOC_TABLE*)(Buffer2 + LoadConfig.DynamicValueRelocTableOffset);
    //dir0->VirtualAddress = 0x308810;
}

// Walk the table only if it was actually read (it stays NULL for live images)
for (UCHAR* TablePtr = DynamicValueRelocTable ? DynamicValueRelocTable->Entries : NULL;
     TablePtr && TablePtr < DynamicValueRelocTable->Entries + DynamicValueRelocTable->TableSize; ) {

    // Each block starts with an 8-byte header; Size includes the header itself
    struct { ULONG Offset; ULONG Size; } *Section = (void*)TablePtr;
    TablePtr += 8;
    Section->Size -= 8;

    for (UCHAR* EntryPtr = TablePtr; TablePtr < EntryPtr + Section->Size; ) {
        // 2-byte entry header followed by Size bytes holding the new value
        struct { USHORT RVA : 12, Unknown : 1, Size : 3; } *Entry = (void*)TablePtr;
        TablePtr += 2;
        ULONGLONG Value = 0;
        memcpy(&Value, TablePtr, Entry->Size);
        TablePtr += Entry->Size;
        DbgPrintf("%08x -> %08x\n", Section->Offset + Entry->RVA, (ULONG)Value);
    }
}
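To make the entry layout easier to experiment with outside of Windows, here is a minimal portable sketch of the same walk over a raw byte buffer. It assumes exactly the layout used in the snippet above: 8-byte block headers whose size includes the header, then 2-byte entry headers with the page offset in bits 0-11 and the value size in bits 13-15, which is how MSVC would pack that bitfield. It also assumes a little-endian host. The names value_reloc_t and parse_value_relocs are made up for illustration.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* One decoded entry: the RVA being patched and the value written there. */
typedef struct {
    uint32_t rva;
    uint64_t value;
} value_reloc_t;

/* Hypothetical helper: decode blocks from data[0..size). Stores at most
   max_out entries in out and returns the number of entries found, or -1 if
   a block or entry runs past the end of the buffer. */
long parse_value_relocs(const uint8_t *data, size_t size,
                        value_reloc_t *out, size_t max_out)
{
    size_t pos = 0, found = 0;

    while (size - pos >= 8) {
        uint32_t block_rva, block_size;
        memcpy(&block_rva, data + pos, 4);
        memcpy(&block_size, data + pos + 4, 4);
        if (block_size < 8 || block_size > size - pos)
            return -1;

        size_t end = pos + block_size; /* block_size includes the header */
        pos += 8;

        while (end - pos >= 2) {
            uint16_t hdr;
            memcpy(&hdr, data + pos, 2);
            pos += 2;

            uint32_t offset = hdr & 0x0FFFu;     /* bits 0-11: offset in page */
            size_t len = (hdr >> 13) & 0x7u;     /* bits 13-15: value size    */
            if (len > end - pos)
                return -1;

            uint64_t value = 0;
            memcpy(&value, data + pos, len);     /* len is at most 7 bytes    */
            pos += len;

            if (found < max_out) {
                out[found].rva = block_rva + offset;
                out[found].value = value;
            }
            found++;
        }
        pos = end;
    }
    return (long)found;
}
```

Fed a block that patches RVA 0x178 with the 4-byte value 0x308810, this yields one entry matching the export directory redirection described in post #24.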
#27
And here is another useful nugget of information: https://docs.microsoft.com/en-us/windows/uwp/porting/arm64ec-abi