Exetools  

#1
11-24-2016, 16:52
jonwil
unlinker - a program for extracting functions from a PE file for later reuse

I was watching this video from the DefCon conference:
https://www.youtube.com/watch?v=1cgtr7VW7gY
and it got me wondering whether it was possible to extract a function from a PE file for reuse in your own code (something I had a use for in a game I am reverse engineering). It turns out it is possible, and the result is unlinker. As the name implies, unlinker is a tool that can extract functions and data members from a Win32 PE file and produce a Visual Studio compatible COFF obj file you can link into your own code.

This represents a lot of work (and a lot of reverse engineering of the compiler and linker and other things to figure out the right things to emit into the obj file for undocumented things like the S_COMPILE3 data, the @feat.00 symbol and the @comp.id symbol) and hopefully it is useful to someone else who has a need to pull a function (or functions) out of a binary to reuse elsewhere.

You can obtain the code from here:
https://github.com/jonwil/unlinker
It's licensed under the GPL version 3, with the PDB/CodeView headers from Microsoft licensed under the MIT license and the udis86 disassembler licensed under the BSD license. If you make any changes or improvements to this code, please do share them with me. Suggestions, requests and questions are welcome as well: feel free to post in this forum thread or file pull requests or issue reports on GitHub.

The code has been developed with the latest Visual C++ 2015 compiler (cl.exe reports version 19.00.24215.1) and outputs .obj files confirmed to be compatible with that version. I make no guarantees that it will work with any other version (including the new Visual C++ 2017), although if you do happen to get it working, please do let me know (and share your work). The input code can be from any compiler (the stuff I am working with was compiled with Visual C++ 6).

To use unlinker, you just pass it an ini filename on the command line. See the test.ini file in the repo for an example of how this works.

The following keywords are valid in the ini file (if you don't understand what a given keyword is for, just leave it out and a suitable default will be used; in fact, I suggest the defaults for most things):
[General] section:
Binary (the filename of the binary to extract from)
Output (the filename of the obj file to create)
Language (the language value put into the output obj file, needs to be one of the CV_CFL_LANG constants in cvconst.h)
Machine (the machine type put into the output obj file, needs to be one of the right CV_CPU_TYPE_e constants in cvconst.h)
FrontEndMajorVersion (the major part of the compiler front-end version number put into the output obj file)
FrontEndMinorVersion (the minor part of the compiler front-end version number put into the output obj file)
FrontEndBuildVersion (the build part of the compiler front-end version number put into the output obj file)
FrontEndQFEVersion (the QFE part of the compiler front-end version number put into the output obj file)
BackEndMajorVersion (the major part of the compiler back-end version number put into the output obj file)
BackEndMinorVersion (the minor part of the compiler back-end version number put into the output obj file)
BackEndBuildVersion (the build part of the compiler back-end version number put into the output obj file)
BackEndQFEVersion (the QFE part of the compiler back-end version number put into the output obj file)
CompilerString (the compiler string to put into the output obj file)
FeatureEnum (the values to use for the @feat.00 symbol in the output obj file, needs to be a combination of the FeatureEnum constants in unlinker.cpp)
CompileType (the compile enum to use for the @comp.id symbol in the output obj file, needs to be one of the CompileType constants in unlinker.cpp)
CompilerIDVersion (the compiler version to use for the @comp.id symbol in the output obj file)
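
To give a feel for what CompileType and CompilerIDVersion end up as, here is a small Python sketch (not part of unlinker). It assumes the commonly described layout of the @comp.id value: a product ID in the high 16 bits and the compiler build number in the low 16 bits. Both that layout and the way the two ini keywords map onto it are assumptions on my part, and the product ID below is a hypothetical placeholder; check the CompileType constants in unlinker.cpp for the real values.

```python
def make_comp_id(product_id, build):
    """Pack a product ID and a build number into one 32-bit value.

    Assumed layout: product ID in the high 16 bits, build in the low 16 bits.
    """
    if not 0 <= product_id <= 0xFFFF:
        raise ValueError("product_id must fit in 16 bits")
    if not 0 <= build <= 0xFFFF:
        raise ValueError("build must fit in 16 bits")
    return (product_id << 16) | build


def split_comp_id(value):
    """Undo make_comp_id: return (product_id, build)."""
    return (value >> 16) & 0xFFFF, value & 0xFFFF


# Hypothetical placeholder, NOT taken from unlinker.cpp.
HYPOTHETICAL_PRODUCT_ID = 0x0105
# 24215 is the build part of the cl.exe version quoted above (19.00.24215.1).
comp_id = make_comp_id(HYPOTHETICAL_PRODUCT_ID, 24215)
print(hex(comp_id))
```

Splitting the value again with split_comp_id is a quick way to compare against the @comp.id of an obj file produced by your own compiler.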

The actual symbols to extract go in one of the following sections, depending on which segment of the output obj file they belong in:
[CodeSymbols] (symbols that go in the code segment)
[RDataSymbols] (symbols that go in the read-only data segment)
[DataSymbols] (symbols that go in the read-write data segment)
[BSSSymbols] (symbols that go in the uninitialized data segment)

The actual symbol entries in the ini file can have the following keywords:
Address (the RVA of the symbol and the one that anything referencing it would be pointing at)
FileAddress (the address within the file of the symbol)
Size (the size in bytes of the symbol)
CodeSize (the size in bytes of the code portion of the symbol; this will matter later when dealing with jump tables and the like, for now set it to the same as Size)
Selection (what selection type to emit for the COMDAT record, needs to be one of the IMAGE_COMDAT_SELECT constants in winnt.h; for most things you want IMAGE_COMDAT_SELECT_NODUPLICATES, but if you are emitting a symbol that is likely to be identical to a symbol in another obj file in your project, you want to use IMAGE_COMDAT_SELECT_ANY, see __real@00000000 in the example for an instance of this)
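
As a quick reference for the Selection keyword, this Python sketch lists the IMAGE_COMDAT_SELECT values from winnt.h and applies the rule of thumb above. The helper name is made up for illustration, and whether the ini file expects the number or the constant name is something to check against test.ini rather than take from here.

```python
# IMAGE_COMDAT_SELECT_* values as defined in winnt.h.
IMAGE_COMDAT_SELECT = {
    "NODUPLICATES": 1,
    "ANY": 2,
    "SAME_SIZE": 3,
    "EXACT_MATCH": 4,
    "ASSOCIATIVE": 5,
    "LARGEST": 6,
}


def pick_selection(may_exist_in_other_obj):
    """Hypothetical helper: choose a selection type for a symbol.

    Symbols that may also be emitted by another obj file in the project
    (for example floating point constants) get ANY, so the linker keeps
    one copy instead of reporting a duplicate definition.
    """
    if may_exist_in_other_obj:
        return IMAGE_COMDAT_SELECT["ANY"]
    return IMAGE_COMDAT_SELECT["NODUPLICATES"]


print(pick_selection(False), pick_selection(True))
```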


If you set FileAddress and Size to 0, the code will treat the symbol as an "external" symbol and emit an external reference to it in the obj file.
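
If you only have the RVA of a symbol, the usual way to get the matching position in the file is to go through the section table of the binary. Here is a minimal Python sketch of that conversion, assuming FileAddress means a raw file offset. The section values are made up for illustration; take the real ones from the section headers of your binary (any PE viewer will show them).

```python
from collections import namedtuple

# Minimal stand-in for the PE section header fields that matter here.
Section = namedtuple(
    "Section", "name virtual_address virtual_size raw_pointer raw_size"
)


def rva_to_file_offset(rva, sections):
    """Map an RVA to a raw file offset using the section table.

    Returns None when the RVA lies in a part of a section that has no
    bytes in the file (uninitialized data). Raises ValueError when the
    RVA is not inside any section.
    """
    for s in sections:
        span = max(s.virtual_size, s.raw_size)
        if s.virtual_address <= rva < s.virtual_address + span:
            delta = rva - s.virtual_address
            if delta >= s.raw_size:
                return None  # no file backing for this address
            return s.raw_pointer + delta
    raise ValueError("RVA 0x%X is not inside any section" % rva)


# Hypothetical section table, for illustration only.
SECTIONS = [
    Section(".text", 0x1000, 0x5000, 0x0400, 0x5000),
    Section(".data", 0x6000, 0x3000, 0x5400, 0x1000),
]

print(hex(rva_to_file_offset(0x1234, SECTIONS)))
```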

Things unlinker can't handle as of yet: (there are probably others but this is what I know for sure)
PE files with relocations that need to be applied
Functions that have jump tables (for switch statements)
Anything that stores data (as opposed to code) in the code segment
Anything to do with exception handlers
Any items in the data segments that contain pointers to other items (e.g. virtual function tables)
Any items in the data segments that have global constructors.
Anything involving 16-bit or 64-bit code (it's 32-bit only at this point)
Kernel drivers
Any functions that reference functions in another module (i.e. things in the import table of the binary)
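
One of these limits is easy to check up front. The Python sketch below (not part of unlinker) reads the Machine field from the PE header to confirm a binary is 32-bit x86. It relies only on the standard PE layout: the offset of the PE header is stored at 0x3C in the MZ header, and Machine is the 16-bit field right after the "PE\0\0" signature. The header built at the bottom is a fake one, there only so the example runs without a real binary.

```python
import struct

IMAGE_FILE_MACHINE_I386 = 0x014C  # from winnt.h


def pe_machine(data):
    """Return the Machine field of a PE image given its raw bytes."""
    if data[:2] != b"MZ":
        raise ValueError("not an MZ executable")
    # e_lfanew: file offset of the PE header, stored at 0x3C.
    (e_lfanew,) = struct.unpack_from("<I", data, 0x3C)
    if data[e_lfanew:e_lfanew + 4] != b"PE\0\0":
        raise ValueError("PE signature not found")
    (machine,) = struct.unpack_from("<H", data, e_lfanew + 4)
    return machine


def is_32bit_x86(data):
    return pe_machine(data) == IMAGE_FILE_MACHINE_I386


def fake_pe_header(machine):
    """Build just enough of a header for the functions above (demo only)."""
    data = bytearray(0x100)
    data[0:2] = b"MZ"
    struct.pack_into("<I", data, 0x3C, 0x80)
    data[0x80:0x84] = b"PE\0\0"
    struct.pack_into("<H", data, 0x84, machine)
    return bytes(data)


print(is_32bit_x86(fake_pe_header(0x014C)))
```

For a real file you would pass in the result of open(path, "rb").read() instead of the fake header.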

If you need global constructors or vtables, you can reverse engineer those into definitions in a C++ file, compile that, and link it with the .obj file from unlinker (in the example file, _box_normal is an external symbol because it has a global constructor associated with it). If need be, you can even extract the global constructor as a function, put an actual global constructor in another C++ file (so it gets added to the global constructors list), and have that call the one in your unlinker obj file.
#2
11-24-2016, 16:55
jonwil
Oh, and feel free to share all this (including the link to the code repo) with anyone you like. If you post it to another forum, I would appreciate a notification so I can keep track of who is playing with it and make sure everyone gets any improvements and bug fixes I make.
#3
11-24-2016, 18:17
mr.exodia
Here is my Tweet https://twitter.com/mrexodia/status/801730091021729792
#4
11-24-2016, 18:40
Shub-Nigurrath
Very good idea. Re-tweeted too: https://twitter.com/arteam_rce/status/801736606298284032
#5
11-24-2016, 21:43
jonwil
I updated it so you don't need to set CodeSize= in the ini file if it's the same as Size= (less typing and less chance to accidentally screw up).
I also added support for jump tables (where you have a table at the end of the function with a list of addresses pointing to somewhere in the function). It will support the case of functions with multiple jump tables.

To use the new JumpTable stuff, you need to define a [JumpTableSymbols] section that defines all the jump tables you want to copy.
Each symbol under there needs an Address= line (for the start of the jump table) and a Size= line (for the size in bytes of the jump table).

You also need to make sure the Size= line for the function that contains the jump table runs to the end of the jump table, and the CodeSize= line for that function runs to the end of the code.
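
The arithmetic for those two values can be sketched like this in Python. The addresses are made up, and the helper is only an illustration of the rule above, assuming the jump tables sit directly after the code of the function.

```python
def function_sizes(func_start, code_end, jump_tables):
    """Return (size, code_size) for a function with trailing jump tables.

    func_start  - address of the first byte of the function
    code_end    - address just past the last code byte
    jump_tables - list of (table_start, table_size_in_bytes) tuples
    """
    code_size = code_end - func_start
    # The symbol ends where the last jump table ends (or at the end of
    # the code if there are no tables at all).
    ends = [code_end] + [start + size for start, size in jump_tables]
    size = max(ends) - func_start
    return size, code_size


# Hypothetical function: code from 0x1000 to 0x1080, followed by one
# jump table with four 4-byte entries.
size, code_size = function_sizes(0x1000, 0x1080, [(0x1080, 16)])
print(hex(size), hex(code_size))
```

With no jump tables the two values come out equal, which matches the case where CodeSize= can be left out.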

You also need a [JumpTargetSymbols] section.
Each symbol under that just gets an address representing a label that is the target of an entry in the jump table.

See test.ini for examples of this.
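
To collect the addresses for [JumpTargetSymbols], you can read them straight out of the jump table bytes. The Python sketch below assumes a direct jump table made of 32-bit little-endian absolute addresses, and assumes the ini file wants RVAs (so the image base is subtracted); the image base and table contents are made up for illustration.

```python
import struct


def jump_table_targets(table_bytes, image_base):
    """Return the distinct target RVAs of a direct jump table, in order."""
    if len(table_bytes) % 4 != 0:
        raise ValueError("jump table size must be a multiple of 4")
    count = len(table_bytes) // 4
    addresses = struct.unpack_from("<%dI" % count, table_bytes)
    targets = []
    for va in addresses:
        rva = va - image_base
        if rva not in targets:  # several cases may share one label
            targets.append(rva)
    return targets


# Hypothetical table: four entries, two of them pointing at the same label.
table = struct.pack("<4I", 0x401020, 0x401030, 0x401020, 0x401044)
print([hex(t) for t in jump_table_targets(table, 0x400000)])
```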


This doesn't yet support indirect jump tables (where you have a table of addresses plus a second table of indexes into the first table).
#6
11-25-2016, 08:24
jonwil
I made a fix to the way the operands for certain instructions are parsed so now it should correctly pick up every case where an instruction contains an absolute address to a data item.
All times are GMT +8.

