Posted by jonwil, 11-24-2016, 16:52
unlinker - a program for extracting functions from a PE file for later reuse

I was watching this video from the DefCon conference:
https://www.youtube.com/watch?v=1cgtr7VW7gY
and it got me wondering whether it was possible to extract a function from a PE file for reuse in your own code (something I had a use for in my reverse engineering work on a game). It turns out it is possible, and the result is unlinker. As the name implies, unlinker is a tool that extracts functions and data members from a Win32 PE file and produces a Visual Studio compatible COFF obj file that you can link into your own code.

This represents a lot of work, including a lot of reverse engineering of the compiler, the linker and other things to figure out what to emit into the obj file for undocumented things like the S_COMPILE3 data, the @feat.00 symbol and the @comp.id symbol. Hopefully it is useful to someone else who needs to pull a function (or functions) out of a binary to reuse elsewhere.

You can obtain the code from here:
https://github.com/jonwil/unlinker
It's licensed under the GPL version 3, with the PDB/CodeView headers from Microsoft licensed under the MIT license and the udis86 disassembler licensed under the BSD license. If you make any changes or improvements to this code, please do share them with me. Suggestions, requests and questions are welcome as well. Feel free to post in this forum thread or file pull requests or issue reports on GitHub.

The code has been developed with the latest Visual C++ 2015 compiler (cl.exe reports version 19.00.24215.1) and outputs .obj files confirmed to be compatible with that version. I make no guarantees that it will work with any other version (including the new Visual C++ 2017), although if you do happen to get it working, please do let me know (and share your work). The input binary can come from any compiler (the stuff I am working with was compiled with Visual C++ 6).

To use unlinker, you just pass it an ini filename on the command line. See the test.ini file in the repo for an example of how this works.

The following keywords are valid in the ini file. If you don't understand what a given ini keyword is for, just leave it out and a suitable default will be used (in fact I suggest sticking with the defaults for most things).
[General] section:
Binary (the filename of the binary to extract from)
Output (the filename of the obj file to create)
Language (the language value put into the output obj file, needs to be one of the CV_CFL_LANG constants in cvconst.h)
Machine (the machine type put into the output obj file, needs to be one of the CV_CPU_TYPE_e constants in cvconst.h)
FrontEndMajorVersion (the major part of the compiler front end version number put into the output obj file)
FrontEndMinorVersion (the minor part of the compiler front end version number put into the output obj file)
FrontEndBuildVersion (the build part of the compiler front end version number put into the output obj file)
FrontEndQFEVersion (the QFE part of the compiler front end version number put into the output obj file)
BackEndMajorVersion (the major part of the compiler back end version number put into the output obj file)
BackEndMinorVersion (the minor part of the compiler back end version number put into the output obj file)
BackEndBuildVersion (the build part of the compiler back end version number put into the output obj file)
BackEndQFEVersion (the QFE part of the compiler back end version number put into the output obj file)
CompilerString (the compiler string to put into the output obj file)
FeatureEnum (the values to use for the @feat.00 symbol in the output obj file, needs to be a combination of the FeatureEnum constants in unlinker.cpp)
CompileType (the compile enum to use for the @comp.id symbol in the output obj file, needs to be one of the CompileType constants in unlinker.cpp)
CompilerIDVersion (the compiler version to use for the @comp.id symbol in the output obj file)

The actual symbols to extract go in one of the following sections, depending on which segment of the output obj file they belong in:
[CodeSymbols] section (symbols that go in the code segment)
[RDataSymbols] section (symbols that go in the read-only data segment)
[DataSymbols] section (symbols that go in the read-write data segment)
[BSSSymbols] section (symbols that go in the uninitialized data segment)

The actual symbol entries in the ini file can have the following keywords:
Address (the RVA of the symbol, i.e. the address that anything referencing it would be pointing at)
FileAddress (the address within the file of the symbol)
Size (the size in bytes of the symbol)
CodeSize (the size in bytes of the code portion of the symbol; this will matter later when dealing with jump tables and the like, but for now set it to the same value as Size)
Selection (which selection type to emit for the COMDAT record; needs to be one of the IMAGE_COMDAT_SELECT constants in winnt.h. For most things you want IMAGE_COMDAT_SELECT_NODUPLICATES, but if you are emitting a symbol that is likely to be identical to a symbol in another obj file in your project, you want to use IMAGE_COMDAT_SELECT_ANY; see __real@00000000 in the example for an instance of this)
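
For reference, here is a small Python sketch of the IMAGE_COMDAT_SELECT values as they are defined in winnt.h, plus a helper that applies the rule of thumb above. The ComdatSelect and pick_selection names are made up for this illustration and are not part of unlinker. Whether the ini file wants the constant name or the number is not covered here, so check test.ini in the repo before relying on either.

```python
from enum import IntEnum


class ComdatSelect(IntEnum):
    """IMAGE_COMDAT_SELECT_* values from winnt.h (prefix dropped)."""
    NODUPLICATES = 1
    ANY = 2
    SAME_SIZE = 3
    EXACT_MATCH = 4
    ASSOCIATIVE = 5
    LARGEST = 6


def pick_selection(may_be_duplicated: bool) -> ComdatSelect:
    """Hypothetical helper: ANY for symbols that are likely to also exist in
    another obj file in the project, NODUPLICATES for everything else."""
    if may_be_duplicated:
        return ComdatSelect.ANY
    return ComdatSelect.NODUPLICATES
```

Something like a floating point constant that the compiler also emits in your own obj files would be the may_be_duplicated case.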


If you set FileAddress and Size to 0, the code will treat the symbol as an "external" symbol and emit an external reference to it in the obj file.
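
If your disassembler only gives you the RVA, the matching file position can be worked out from the section table of the binary. The sketch below shows the usual PE arithmetic (RVA minus the section's VirtualAddress plus its PointerToRawData). It assumes FileAddress means a plain offset into the file on disk; the Section type, the function name and the section values are invented for the illustration and do not come from unlinker.

```python
from typing import NamedTuple, Sequence


class Section(NamedTuple):
    name: str
    virtual_address: int  # RVA of the section start (VirtualAddress)
    raw_pointer: int      # file offset of the section data (PointerToRawData)
    raw_size: int         # bytes of section data in the file (SizeOfRawData)


def rva_to_file_offset(rva: int, sections: Sequence[Section]) -> int:
    """Map an RVA to a file offset using the section that contains it."""
    for s in sections:
        # Only RVAs backed by data in the file can be mapped.
        if s.virtual_address <= rva < s.virtual_address + s.raw_size:
            return rva - s.virtual_address + s.raw_pointer
    raise ValueError(f"RVA {rva:#x} is not backed by file data")


# Made-up section table, for illustration only.
EXAMPLE_SECTIONS = [
    Section(".text", 0x1000, 0x400, 0x2000),
    Section(".rdata", 0x3000, 0x2400, 0x800),
]
```

With the made-up table above, rva_to_file_offset(0x1234, EXAMPLE_SECTIONS) gives 0x634. If your tool reports full virtual addresses rather than RVAs, subtract the image base first.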

Things unlinker can't handle as of yet (there are probably others, but this is what I know for sure):
PE files with relocations that need to be applied
Functions that have jump tables (for switch statements)
Anything that stores data (as opposed to code) in the code segment
Anything to do with exception handlers
Any items in the data segments that contain pointers to other items (e.g. virtual function tables)
Any items in the data segments that have global constructors.
Anything involving 16 bit or 64 bit code (it's 32 bit only at this point)
Kernel drivers
Any functions that reference functions in another module (i.e. things in the import table of the binary)
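
Since only 32 bit code is supported, it can save time to check the input binary before writing an ini file for it. The Python sketch below reads the Machine field from the COFF header of a PE image and compares it against IMAGE_FILE_MACHINE_I386. The function names are my own for this example, and it only covers the 32 bit limitation, none of the others in the list above.

```python
import struct

IMAGE_FILE_MACHINE_I386 = 0x014C  # 32 bit x86, as defined in winnt.h


def read_pe_machine(data: bytes) -> int:
    """Return the Machine field from the COFF header of a PE image."""
    if data[:2] != b"MZ":
        raise ValueError("not an MZ executable")
    # e_lfanew at offset 0x3C of the DOS header points at the PE signature.
    (pe_offset,) = struct.unpack_from("<I", data, 0x3C)
    if data[pe_offset:pe_offset + 4] != b"PE\0\0":
        raise ValueError("PE signature not found")
    # The COFF header follows the 4 byte signature; Machine is its first field.
    (machine,) = struct.unpack_from("<H", data, pe_offset + 4)
    return machine


def is_32bit_x86(data: bytes) -> bool:
    """True if the image targets 32 bit x86."""
    return read_pe_machine(data) == IMAGE_FILE_MACHINE_I386
```

To use it, read the whole binary with open(path, "rb").read() and pass the bytes to is_32bit_x86.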

If you need global constructors or vtables you can reverse engineer those into definitions in a C++ file and compile that and then link it with the .obj file from unlinker (in the example file _box_normal is an external symbol because it has a global constructor associated with it). If need be, you can even extract the global constructor as a function and then put an actual global constructor in another C++ file (so it gets added to the global constructors list) and call the one in your unlinker obj file.