Exetools  

Go Back   Exetools > General > General Discussion

Notices

Reply
 
Thread Tools Display Modes
  #16  
Old 02-23-2022, 03:56
sh3dow sh3dow is offline
Family
 
Join Date: Oct 2014
Posts: 158
Rept. Given: 113
Rept. Rcvd 79 Times in 24 Posts
Thanks Given: 458
Thanks Rcvd at 202 Times in 75 Posts
sh3dow Reputation: 79
Found this on my feed and wanted to share

├─Jack-in-the-Cache: A New Code injection Technique through Modifying X86-to-ARM Translation Cache (this one is from the same author I mentioned before in #3).
│ it's was presented for BlackHat con also the video this time is in English while the code blue one was in Japanese.

│─────https://i.blackhat.com/eu-20/Wednesday/eu-20-Nakagawa-Jack-In-The-Cache-A-New-Code-Injection-Technique-Through-Modifying-X86-To-Arm-Translation-Cache.pdf

└─────ttps://www.youtube.com/watch?v=8wg7X5IaEto

Quote:
Recently, the adoption of ARM processors for laptop computers is becoming popular due to its high energy efficiency. Windows 10 on ARM is a new OS for such ARM-based computers. Several laptop computers with this OS have already been shipped; notably, the recent launch of Microsoft Surface Pro X will be a driving force to facilitate the widespread use of Windows 10 on ARM.

You might think that there are new threats to such a new OS. Yes! We found such a threat.

In this talk, we present a new code injection technique to abuse a novel feature of Windows 10 on ARM: X86 emulation.
Remarkably, Windows 10 on ARM can run X86 apps via the X86 emulation feature that translates binary from X86-to-ARM just in time. To reduce the performance overhead of JIT binary translation, the OS has the mechanism to cache already-translated results as X86-to-ARM (XTA) cache files.

Our new code injection technique is performed by modifying this XTA cache file. Since this technique is difficult to detect and trace, appropriate countermeasures are necessary. Moreover, this technique can be used as an API hooking invisible to an X86 process. Therefore, this technique has already been a threat to Windows 10 on ARM.

We believe that future OSs have a JIT translation mechanism at the processor transition. For example, Apple has recently announced Rosetta 2, which is a similar mechanism for introducing their own ARM-based chip. For these OSs, the caching of already-translated results as files is a reasonable way to decrease performance overhead.

Our new code injection technique might also apply to such OSs.This presentation becomes a beneficial advisory for the developers of such future OSs, not limited to Windows 10 on ARM. PoC code of our new code injection technique and analysis results of the X86 emulation will be public on GitHub after this talk.
Excellent blog from the same author as well
├─Discovering a new relocation entry of ARM64X in recent Windows 10 on Arm

│─────https://ffri.github.io/ProjectChameleon/new_reloc_chpev2/

└─────https://github.com/FFRI/ProjectChameleon/

Last edited by sh3dow; 02-23-2022 at 09:04.
Reply With Quote
The Following 2 Users Say Thank You to sh3dow For This Useful Post:
niculaita (02-24-2022), tonyweb (02-25-2022)
  #17  
Old 02-24-2022, 05:04
DavidXanatos DavidXanatos is offline
Family
 
Join Date: Jun 2018
Posts: 179
Rept. Given: 2
Rept. Rcvd 46 Times in 32 Posts
Thanks Given: 58
Thanks Rcvd at 350 Times in 116 Posts
DavidXanatos Reputation: 46
hmpf... sooo I have a nice function hijacking code that from arm64 to arm64 works perfectly.

Now we also know that all processes in arm64 start execution as arm64 (or at least I think that) so at the very start of every program we should enter ntdll's LdrInitializeThunk and for arm64 on arm64 its of cause so.
For other architectures on arm64 we should at some point divert into emulated code.

When trying to inject a detour in LdrInitializeThunk of a created suspended x64 process on arm64 however that code does not seam to ever be executed. Meaning I can inject garbage and it will still startup just fine.

Now my assumption of how x64 on arm64 works is that as soon as execution goes into a system dll i.e. anything compiled as ARM64X we exit emulated x64 mode and execute the native arm code in the dll. So it stands to argue that wen bootstrapping a process, it behaves analogously everything is executed in native arm until it comes the time to call the x64 processes entry point.

Well it seams something isn't quite right here, one possibility is that the ARM64X dll's truly have all the code doubled including large portions of the arm code, so when I manipulate the LdrInitializeThunk I get i do it to a copy that will never be used.

Now I find that strange I would have assumed that the code wouldn't be doubled that MSFT would have some smart redirection in place allowing the ARM64X dll's to re use most of the arm code for the native and the emulated mode.

@RamMerLabs since you apparently have already a lot of experience with the layout of the new PE files, would you may be have a few tips is that really so that the code is fully separated?
Reply With Quote
  #18  
Old 02-24-2022, 16:10
DavidXanatos DavidXanatos is offline
Family
 
Join Date: Jun 2018
Posts: 179
Rept. Given: 2
Rept. Rcvd 46 Times in 32 Posts
Thanks Given: 58
Thanks Rcvd at 350 Times in 116 Posts
DavidXanatos Reputation: 46
After some more research I can confirm that this is what happens, when i get the address of #LdrInitializeThunk from the symbol file for ntdll and use these my injected code works.
Sooo... the next question is how to get the "export" addresses without the need of a pdb file.
It was already earlier written that this ARM64X files have a 2nd export directory, so I guess parsing that "by hand" would be the strait forward approach.
Unless there is a flag that can be passed to LdrLoadDll that would do this for me ?
Reply With Quote
  #19  
Old 02-24-2022, 16:30
RamMerLabs RamMerLabs is offline
Family
 
Join Date: Feb 2020
Posts: 54
Rept. Given: 0
Rept. Rcvd 52 Times in 27 Posts
Thanks Given: 9
Thanks Rcvd at 268 Times in 48 Posts
RamMerLabs Reputation: 52
Quote:
Originally Posted by DavidXanatos View Post
@RamMerLabs since you apparently have already a lot of experience with the layout of the new PE files, would you may be have a few tips is that really so that the code is fully separated?
Briefly about ARM64X: just look at the table LoadConfig->HybridPE->CodeMap and the sizes of code blocks for different architectures. Сontents of CodeMap table does not change either for native ARM64 or for the emulated architecture after the "transformation" of the image.
As I said, if a particular process needs an x64 image instead of arm64, then after DVRT processing, the image will be perceived by the loader as ARM64EC.

Quote:
Originally Posted by DavidXanatos View Post
Sooo... the next question is how to get the "export" addresses without the need of a pdb file.
Look at WOW-thunks table. It combines data from "Code Ranges to EPs Table" and "Redirection Metadata Table" in HybridPE header

Last edited by RamMerLabs; 02-24-2022 at 16:46.
Reply With Quote
The Following 5 Users Say Thank You to RamMerLabs For This Useful Post:
Abaddon (02-25-2022), DavidXanatos (02-25-2022), niculaita (02-25-2022), sh3dow (02-26-2022), tonyweb (02-25-2022)
  #20  
Old 02-25-2022, 05:21
DavidXanatos DavidXanatos is offline
Family
 
Join Date: Jun 2018
Posts: 179
Rept. Given: 2
Rept. Rcvd 46 Times in 32 Posts
Thanks Given: 58
Thanks Rcvd at 350 Times in 116 Posts
DavidXanatos Reputation: 46
Splendid! Where can i find some documentation/reverse engineering notes on the new PE features? As for my use case it's not enough to see them in a tool i need to programmatically get to the relevant addresses for code injection.
I already have found this: https://ffri.github.io/ProjectChameleon/new_reloc_chpev2/ but its from 2019 i imagine by now there should be more refined information available?
Reply With Quote
  #21  
Old 02-25-2022, 17:15
RamMerLabs RamMerLabs is offline
Family
 
Join Date: Feb 2020
Posts: 54
Rept. Given: 0
Rept. Rcvd 52 Times in 27 Posts
Thanks Given: 9
Thanks Rcvd at 268 Times in 48 Posts
RamMerLabs Reputation: 52
All you need is WDK starting 21277 build and a little research perseverance
BTW, the features discussed here were first introduced at the very end of 2020, not 2019. And nothing new has been added since then.
Reply With Quote
  #22  
Old 02-25-2022, 21:27
DavidXanatos DavidXanatos is offline
Family
 
Join Date: Jun 2018
Posts: 179
Rept. Given: 2
Rept. Rcvd 46 Times in 32 Posts
Thanks Given: 58
Thanks Rcvd at 350 Times in 116 Posts
DavidXanatos Reputation: 46
i was somehow under the impression the article was older, but you seam to be right, it was may be a bot to late yesterday, LOL
downloading the newest WDk right now....
Reply With Quote
  #23  
Old 02-26-2022, 06:32
sh3dow sh3dow is offline
Family
 
Join Date: Oct 2014
Posts: 158
Rept. Given: 113
Rept. Rcvd 79 Times in 24 Posts
Thanks Given: 458
Thanks Rcvd at 202 Times in 75 Posts
sh3dow Reputation: 79
Quote:
Originally Posted by DavidXanatos View Post
I already have found this: https://ffri.github.io/ProjectChameleon/new_reloc_chpev2/ but its from 2019 i imagine by now there should be more refined information available?
Actually it's not that old and very recent. it was published 2021/07/13. so I don't think a lot of things has changed if any at all in this short time window knowing Microsoft.

Also the author claimed while the title says "Windows 10 on Arm" but the results are also valid in Windows 11 on Arm. probably that an evidence at the slowness of MS side of adding new features.

No new public refined information so far and he is the only public researcher known in doing research in this subject. Actually I recommend you to contact him. I'm sure he will be delighted that another reverse engineer is interested in his research and that will spark an interesting chat in this subject, I'm sure you will get a lot of info and questions answered.
Reply With Quote
  #24  
Old 02-26-2022, 06:53
DavidXanatos DavidXanatos is offline
Family
 
Join Date: Jun 2018
Posts: 179
Rept. Given: 2
Rept. Rcvd 46 Times in 32 Posts
Thanks Given: 58
Thanks Rcvd at 350 Times in 116 Posts
DavidXanatos Reputation: 46
Indeed I should contact him, good idea

But now lets talk about what we can get from the relevant header files: ...\10.0.22000.0\um\winnt.h and ...\10.0.22000.0\km\ntimage.h

With the attached FindDllExport we can extract exports from a loaded image with the below example we run a arm64 process and target a x64 process. For ourselves we get the normal ntdll.dll!LdrLoadDll while for the x64 process we get the ntdll.dll!EXP+#LdrLoadDll

Code:
	//ntdllBase 0x00007ffa5a7d0000
	//0x00007ffa5a811050 {ntdll.dll!LdrLoadDll(void)}
	//0x00007ffa5a7d1890 {ntdll.dll!EXP+#LdrLoadDll}
	//0x00007ffa5a969920 {ntdll.dll!#LdrLoadDll}

	HMODULE hNtdll = GetModuleHandle(L"ntdll.dll");
	//DWORD64 LLW1 = GetProcAddress(hNtdll, "LdrLoadDll");
	DWORD64 LLW1 = FindDllExport(GetCurrentProcess(), (DWORD64)hNtdll, "LdrLoadDll");

	DWORD64 ntdllBase = FindDllBase(hProcess, L"\\system32\\ntdll.dll");
	DWORD64 LLW2 = FindDllExport(hProcess, ntdllBase, "LdrLoadDll");
The export directory on disk (or in the arm64 case) is at RVA 0x31E1A0
for the x64 process the value at 0x178 is overwritten with 0x308810 what is the alternative export directory.
This operation is indicated by PE Anatomist in "Loader Config -> Dyn. Value Relocs", 2nd entry. So if we don't have a loaded process we could extract the value from there and read the second export directory from disk directly.

In "Debug->POGO" we see that the export directory starts at 0x308810 and is 0x2b224 in size
First the alternative directory 0x308810 to 0x31E1A0
and given the entry size it seams the primary starting at 0x31E1A0 goes to 0x333A34 booth are similar in size, so there does not seam to be a 3rd one for the ntdll.dll!#.... exports

I assume that's because there is no typical use case where a process would want those directly, a arm64 will load the primary table a x64 process the alternative table, and 32 bit once have their own ntdll's in the wow folders.

So where do we go from here, we notice that "Loader Config -> hybrid PE -> WoW Thunks Metadata" seams to hold all the RVA's we get from the export directory, and the destinations are the addresses of the #... functions.

So we can get the !EXP+#... function addresses form the export dir and look up the #... addresses in this RedirectionMetadata table, the FindDllExport now checks if the first char is a '#' and if so triggers the additional lookup.

Code:
	DWORD64 LLW3 = FindDllExport(hProcess, ntdllBase, "#LdrLoadDll");
Gives us now the right #... function address which we can use as target for arm64 code injection as well as for function calls from the injected shell code.


One thing I haven't figured yet out is how we get the alternative export directory if we don't have a process with emulation at at our disposal.
I'm not sure how PE Anatomist gets the "Dyn. Value Relocs" from, for me in a live process DynamicValueRelocTable is NULL and base + DynamicValueRelocTableOffset or loaderConfig + DynamicValueRelocTableOffset do not seam to result in valid data.
While for the use case at hand its not required I would like to know how to get to this list as well, any tips would be greatly appreciated.
Attached Files
File Type: zip dllimport.zip (2.7 KB, 3 views)
Reply With Quote
The Following User Says Thank You to DavidXanatos For This Useful Post:
sh3dow (02-26-2022)
  #25  
Old 02-28-2022, 16:57
DavidXanatos DavidXanatos is offline
Family
 
Join Date: Jun 2018
Posts: 179
Rept. Given: 2
Rept. Rcvd 46 Times in 32 Posts
Thanks Given: 58
Thanks Rcvd at 350 Times in 116 Posts
DavidXanatos Reputation: 46
So a small progress report on my arm64/x64 code injection experiments:
Injecting a arm64ec library into a x64 or arm64ec process on arm64 works just fine, including calling an exported function from the injected shell code. One just need to take care of calling the # function address and not the !EXP+# as given by LdrGetProcedureAddress.

Injecting a x64 library into a x64 or arm64ec process on arm64 also works just fine.
What does not yet work is calling a exported x64 function. I assume some additional code for arm64 to x64 transition needs to be added to the call.

Also the next BIG problem there is no arm32ec tool chain available.
I presume arm shell code to just load a x86 dll will work, but if one wants to call some exported function (instead of just relaying on the DllMain entry point) the arm to x86 transition will need to be researched.

What's also probably a bit of an issue, PEAnatomist does not show any "WoW Thunks Metadata" for the hybrid 32 bit ntdll in SyChpe32, while a stack trace to a statically loaded dll's DllMain shows the presence of # functions.
So there will be some investigation needed if these data can still be obtained form the image. Alternatively parsing the !EXP+# thunk should allow one to find the right # address.
Reply With Quote
  #26  
Old 04-09-2022, 15:16
DavidXanatos DavidXanatos is offline
Family
 
Join Date: Jun 2018
Posts: 179
Rept. Given: 2
Rept. Rcvd 46 Times in 32 Posts
Thanks Given: 58
Thanks Rcvd at 350 Times in 116 Posts
DavidXanatos Reputation: 46
I have figured out how to get the Dyn. Relocs Table with which we can get the alternate export directory from an image on disk:

Code:
			IMAGE_LOAD_CONFIG_DIRECTORY64 LoadConfig;

			IMAGE_DATA_DIRECTORY* dir10 = &opt_hdr_64->DataDirectory[IMAGE_DIRECTORY_ENTRY_LOAD_CONFIG];
			if (resolve_ec && dir10->VirtualAddress && dir10->Size >= FIELD_OFFSET(IMAGE_LOAD_CONFIG_DIRECTORY64, CHPEMetadataPointer) + sizeof(ULONGLONG)) {

				status = ReadDll(hProcess, FindImagePosition(dir10->VirtualAddress, nt_hdrs_64, DllBase), &LoadConfig, min(sizeof(LoadConfig), dir10->Size), NULL);
			}

			typedef struct _DYN_RELOC_TABLE {
				ULONG Unknown1;
				ULONG Unknown2;
				ULONG Unknown3;
				ULONG Unknown4;
				ULONG TableSize;
				UCHAR Entries[];
			} DYN_RELOC_TABLE;
			
			DYN_RELOC_TABLE* DynamicValueRelocTable = NULL;

			if (DllBase == 0 && (resolve_ec || resolve_exp)) { // only for images on disk, on linve images we take the actuallly used export directory

				PIMAGE_SECTION_HEADER section = IMAGE_FIRST_SECTION(nt_hdrs);
				nt_hdrs->FileHeader.NumberOfSections;

				section += (LoadConfig.DynamicValueRelocTableSection - 1);

				ULONG pos = FindImagePosition(section->VirtualAddress, nt_hdrs_64, DllBase);
				status = ReadDll(hProcess, pos, Buffer2, min(sizeof(Buffer2), section->Misc.VirtualSize), NULL);

				DynamicValueRelocTable = (DYN_RELOC_TABLE*)(Buffer2 + LoadConfig.DynamicValueRelocTableOffset);

				//dir0->VirtualAddress = 0x308810;
			}

			for (UCHAR* TablePtr = DynamicValueRelocTable->Entries; TablePtr < DynamicValueRelocTable->Entries + DynamicValueRelocTable->TableSize; ) {

				struct {
					ULONG Offset;
					ULONG Size;
				} *Section = TablePtr;
				TablePtr += 8;
				Section->Size -= 8;

				for (UCHAR* EntryPtr = TablePtr; TablePtr < EntryPtr + Section->Size; ) {
					struct {
						USHORT  
							RVA : 12,
							Unknown: 1,
							Size : 3;
					} *Entry = TablePtr;
					TablePtr += 2;

					ULONGLONG Value = 0;
					memcpy(&Value, TablePtr, Entry->Size);
					TablePtr += Entry->Size;

					DbgPrintf("%08x -> %08x\n", Section->Offset + Entry->RVA, (ULONG)Value);

				}
			}
there are a couple unknown values so if anyone has an idea what they are please share.
Reply With Quote
The Following User Says Thank You to DavidXanatos For This Useful Post:
sh3dow (04-10-2022)
  #27  
Old 04-09-2022, 20:02
DavidXanatos DavidXanatos is offline
Family
 
Join Date: Jun 2018
Posts: 179
Rept. Given: 2
Rept. Rcvd 46 Times in 32 Posts
Thanks Given: 58
Thanks Rcvd at 350 Times in 116 Posts
DavidXanatos Reputation: 46
And here an other useful nugget of information: https://docs.microsoft.com/en-us/windows/uwp/porting/arm64ec-abi
Reply With Quote
The Following User Says Thank You to DavidXanatos For This Useful Post:
niculaita (04-10-2022)
Reply

Thread Tools
Display Modes

Posting Rules
You may not post new threads
You may not post replies
You may not post attachments
You may not edit your posts

BB code is On
Smilies are On
[IMG] code is Off
HTML code is Off



All times are GMT +8. The time now is 13:49.


Always Your Best Friend: Aaron, JMI, ahmadmansoor, ZeNiX, chessgod101
( 1998 - 2024 )