07-31-2019, 15:13
chants
Advanced IDAPython to change details the UI cannot change

Advanced IDAPython coding tutorial

For example, 16-bit code can use far code with near data, or any of the other three combinations of near and far code and data. Sometimes IDA gets it wrong with FLIRT and picks the wrong memory model, which messes up stack parameter alignment and so forth. But you cannot fix this by changing the function type, as the prototype already has pointers in it, e.g.: void *__cdecl memset(void *, int, size_t);

Options -> Compilers is the only place where this model can be specified, but it is a global setting, even though every function prototype carries its own setting for it.

So you want 2-byte pointers, not 4-byte pointers, for this particular function, since it uses the near-data far-code model, while the default for the file is far-data far-code. A function with an unknown calling convention also uses this default.

So how can you fix it without ruining the pointer types of the return value and the first argument?

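import idaapi
import ida_typeinf

ea = idaapi.get_name_ea(idaapi.BADADDR, "_memset")
tinfo = idaapi.tinfo_t()
idaapi.get_tinfo2(ea, tinfo)  # 6.x argument order; 7.x without the compatibility layer uses idaapi.get_tinfo(tinfo, ea)
fi = idaapi.func_type_data_t()
tinfo.get_func_details(fi)  # fill fi from the function type, otherwise fi.cc is meaningless
Now fi.cc returns a value such as 50 (48 + 2), and the SDK header documents the calling convention masks and values: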
const cm_t  CM_N16_F32   = 0x02;  ///< 2: near 2bytes, far 4bytes

const cm_t CM_M_MASK = 0x0C;
const cm_t  CM_M_NN      = 0x00;  ///< small:   code=near, data=near (or unknown if CM_UNKNOWN)
const cm_t  CM_M_FF      = 0x04;  ///< large:   code=far, data=far
const cm_t  CM_M_NF      = 0x08;  ///< compact: code=near, data=far
const cm_t  CM_M_FN      = 0x0C;  ///< medium:  code=far, data=near

const cm_t  CM_CC_UNKNOWN  = 0x10;  ///< unknown calling convention
const cm_t  CM_CC_CDECL    = 0x30;  ///< stack
So 50 (0x32) marks the function as using near 16-bit/far 32-bit pointers and CDECL, both of which are correct. The memory model bits under CM_M_MASK are 0.
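
To make the bit layout concrete, here is a minimal sketch that splits a cm_t value into its three fields. It runs outside IDA. CM_MASK and CM_CC_MASK are not quoted in the post; the values used here are assumptions taken from the same SDK header, and decode_cm is a made-up helper name.

```python
# Masks and values as quoted from the SDK header above.
# CM_MASK and CM_CC_MASK are NOT quoted in the post; they are assumed here.
CM_MASK = 0x03      # pointer-size bits (assumed)
CM_M_MASK = 0x0C    # memory-model bits
CM_CC_MASK = 0xF0   # calling-convention bits (assumed)

POINTER_SIZES = {0x02: "CM_N16_F32"}
MODELS = {0x00: "CM_M_NN", 0x04: "CM_M_FF", 0x08: "CM_M_NF", 0x0C: "CM_M_FN"}
CONVENTIONS = {0x10: "CM_CC_UNKNOWN", 0x30: "CM_CC_CDECL"}

def decode_cm(cc):
    """Split a cm_t byte into (pointer size, memory model, calling convention)."""
    def name(table, bits):
        # Fall back to the raw hex value for constants not listed in the post.
        return table.get(bits, hex(bits))
    return (name(POINTER_SIZES, cc & CM_MASK),
            name(MODELS, cc & CM_M_MASK),
            name(CONVENTIONS, cc & CM_CC_MASK))

print(decode_cm(50))  # ('CM_N16_F32', 'CM_M_NN', 'CM_CC_CDECL')
```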

# fi must already hold the function details (tinfo.get_func_details(fi))
fi.cc = (fi.cc & ~idaapi.CM_M_MASK) | idaapi.CM_M_FN
tinfo.create_func(fi)  # bind the modified fi back into the existing tinfo
ida_typeinf.apply_tinfo(ea, tinfo, idaapi.TINFO_DEFINITE)
This works in IDA as long as the existing tinfo is reused; creating a brand-new tinfo would crash it because data would be missing. Calling apply_tinfo without rebinding fi to the tinfo does nothing, since fi is only a copy of the function details. The same round trip also works when fi.cc is left unchanged. So the code is correct: query the type again and it has changed.
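
The mask arithmetic itself can be checked outside IDA. A minimal sketch, using only the constants quoted from the header; with_memory_model is a hypothetical helper name:

```python
CM_M_MASK = 0x0C  # memory-model bits
CM_M_FN = 0x0C    # medium: code=far, data=near

def with_memory_model(cc, model):
    """Return cc with its memory-model bits replaced by model."""
    if model & ~CM_M_MASK:
        raise ValueError("model has bits outside CM_M_MASK")
    # Clear the two model bits, then set the requested ones. The pointer-size
    # and calling-convention bits are left untouched.
    return (cc & ~CM_M_MASK) | model

print(hex(with_memory_model(50, CM_M_FN)))  # 0x3e
```

Only bits 2 and 3 change, so the CDECL and near 16/far 32 parts of the value survive the update.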

Unfortunately, that fixed neither the stack frame of the function nor its callers.

This gets a bit trickier. Intuitively, one would think that just using `ida_funcs.reanalyze_function(idaapi.get_func(ea))` would fix the function up immediately. It does not, and the answer lies deeper in how IDA works. The code that creates frames lives in the processor module, which is not properly exposed to IDAPython. So some manual C tricks are needed: first get the DLL name, then get the 'ph' processor_t structure (unfortunately the undocumented ida_idp.ph is not helpful here). Of course, inspection is always a good first step, such as:
import inspect
Anyway, doing this manually requires careful padding to skip the unneeded fields that precede the notify callback. IDA 7.x changes things quite a bit: not just the 64-bit calling convention, but also additional members and the pointer size. Some commented-out code is left in for those brave enough to try a solution for Linux, where the variadic argument list is, in theory, a data structure that supports register arguments and is quite a bit more complicated. For whatever reason, Windows does not follow the recommended ABI for va_list but keeps the older pre-amd64 model, where it is a mere pointer to the arguments, which conveniently makes things easier.
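
As a sanity check on the padding, the leading members can be declared explicitly so that ctypes computes the offset of notify. This is a sketch under assumptions: the field list is the one given in this post's code comments (not taken from the SDK header), pointers are assumed to be 4 bytes before 7.0 and 8 bytes from 7.0 on, and notify_offset is a hypothetical helper.

```python
import ctypes

def notify_offset(sdk_version, pointer_type):
    """Byte offset of 'notify' for the leading processor_t members listed in the post."""
    fields = [("version", ctypes.c_int32), ("id", ctypes.c_int32),
              ("flag", ctypes.c_uint32)]
    if sdk_version >= 700:
        fields.append(("flag2", ctypes.c_uint32))  # extra member from 7.0 on
    fields += [("cnbits", ctypes.c_int32), ("dnbits", ctypes.c_int32),
               ("psnames", pointer_type), ("plnames", pointer_type),
               ("assemblers", pointer_type), ("notify", pointer_type)]

    class Head(ctypes.Structure):
        _fields_ = fields

    return Head.notify.offset

# Offsets expressed in 4-byte ints, the unit used for the padding array.
print(notify_offset(695, ctypes.c_uint32) // 4)  # 8
print(notify_offset(700, ctypes.c_uint64) // 4)  # 12
```

So the padding should come to 8 ints before 7.0 and 12 ints from 7.0 on.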

import ctypes
import sys

import idaapi
import idc

def get_dll():
    # Locate the IDA kernel library, see http://www.hexblog.com/?p=695
    idaname = "ida64" if idc.__EA64__ else "ida"
    if sys.platform == "win32":
        return ctypes.windll[idaname + (".dll" if idaapi.IDA_SDK_VERSION >= 700 else ".wll")]
    elif sys.platform.startswith("linux"):  # "linux2" on Python 2, "linux" on Python 3
        return ctypes.cdll["lib" + idaname + ".so"]
    elif sys.platform == "darwin":
        return ctypes.cdll["lib" + idaname + ".dylib"]

def get_ph(notif):
    # Leading members of processor_t that have to be skipped:
    #   int32 version, id; uint32 flag;
    #   uint32 flag2;           # >= 7.0 only
    #   int32 cnbits, dnbits;
    #   const char *const *psnames, *plnames; const asm_t *const *assemblers;  # pointers are 64-bit in >= 7.0
    #   hook_cb_t *_notify;     # >= 7.0, otherwise it is just notify(int, ...) instead of (void*, int, va_list/...)
    # va_list is just a pointer to the variadic argument list. Pushing the next
    # stack address onto the stack at invocation time is too low level for
    # ctypes, so the arguments are packed into a structure instead.
    class processor_t(ctypes.Structure):
        # 8 ints before 7.0, 12 ints from 7.0 on. The inner parentheses matter:
        # without them the expression evaluates to 12 or 0.
        _fields_ = [('padding', ctypes.c_int * (8 + (4 if idaapi.IDA_SDK_VERSION >= 700 else 0))),
                    ('notify', notif)]
    # The exported 'ph' global is the processor_t of the current proc module
    return processor_t.in_dll(get_dll(), 'ph')

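def ph_notify(msgid, *args):
    # Every argument must be a ctypes instance, e.g. ctypes.c_size_t(ea)
    if idaapi.IDA_SDK_VERSION >= 700:
        # 7.x: notify(void *user_data, int code, va_list va)
        class VA_LIST(ctypes.Structure):
            _fields_ = [("arg_" + str(i), args[i].__class__) for i in range(len(args))]
        va_list = VA_LIST(*args)
        proto = ctypes.WINFUNCTYPE(ctypes.c_size_t, ctypes.c_void_p, ctypes.c_int, ctypes.c_void_p)
        return get_ph(proto).notify(None, msgid, ctypes.pointer(va_list))
    # Before 7.0: notify(int code, ...). The argument types are unpacked, not passed as a list.
    proto = ctypes.WINFUNCTYPE(ctypes.c_size_t, ctypes.c_int, *[i.__class__ for i in args])
    return get_ph(proto).notify(msgid, *args)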

#idaman ssize_t ida_export invoke_callbacks(hook_type_t hook_type, int notification_code, va_list va);
def invoke_callbacks(msgid, *args):
    HT_IDP = 0  # Hook to the processor module.
    #AMD64 ABI va_list: unsigned int gp_offset (max 48=6*8), fp_offset (max 304=6*8+16*16), void* overflow_arg_area, reg_save_area
    #class VA_LIST(ctypes.Structure):
    #    _fields_ = [("gp_offset", ctypes.c_uint), ("fp_offset", ctypes.c_uint),
    #      ("overflow_arg_area", ctypes.c_void_p), ("reg_save_area", ctypes.c_void_p)]
    class VA_OF_LIST(ctypes.Structure):
        _fields_ = [("arg_" + str(i), args[i].__class__) for i in range(len(args))]
    va_of_list = VA_OF_LIST(*args)
    #va_list = VA_LIST(0, 6*8, ctypes.cast(ctypes.pointer(va_of_list), ctypes.c_void_p), ctypes.cast(ctypes.pointer(va_of_list), ctypes.c_void_p))
    return get_dll().invoke_callbacks(HT_IDP, msgid, ctypes.pointer(va_of_list))
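
One caveat with packing the arguments into a plain Structure: fields smaller than a pointer are laid out at their natural size, whereas a Windows x64 va_list advances 8 bytes per argument. With pointer-sized arguments, as used in this post, the two layouts coincide. The sketch below is not part of the original code; it assumes a little-endian host and 8-byte slots, and pack_va_list is a hypothetical helper that gives every argument its own slot.

```python
import ctypes

SLOT = 8  # assumption: one 8-byte slot per variadic argument (Windows x64)

class Naive(ctypes.Structure):
    # Two 4-byte fields share what would be a single va_list slot.
    _fields_ = [("arg_0", ctypes.c_int), ("arg_1", ctypes.c_int)]

def pack_va_list(*args):
    """Copy ctypes values into consecutive 8-byte slots.

    Returns (buffer, pointer). Keep the buffer alive while the pointer is in use.
    """
    slots = (ctypes.c_uint64 * max(len(args), 1))()  # zero-initialised
    for i, arg in enumerate(args):
        if ctypes.sizeof(arg) > SLOT:
            raise ValueError("argument %d does not fit into one slot" % i)
        # Little-endian host assumed: the value goes at the start of its slot.
        ctypes.memmove(ctypes.byref(slots, i * SLOT), ctypes.byref(arg),
                       ctypes.sizeof(arg))
    return slots, ctypes.cast(slots, ctypes.c_void_p)

buf, va = pack_va_list(ctypes.c_int(7), ctypes.c_int(9))
print(Naive.arg_1.offset)  # 4: second int sits inside the first slot
print(list(buf))           # [7, 9]: one slot per argument
```
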
Finally the reward for that mess is that any processor calls are now possible if done carefully:
#event_t enum
ev_verify_sp = 58
ev_create_func_frame = 60
ev_analyze_prolog = 81
ev_max_ptr_size = 2002
ev_calc_cdecl_purged_bytes = 2008

#ssize_t notify(event_t event_code, ...)
f = idaapi.get_func(ea)
invoke_callbacks(ev_verify_sp, ctypes.c_void_p(int(f.this)))
ph_notify(ev_analyze_prolog, ctypes.c_size_t(ea))
ph_notify(ev_calc_cdecl_purged_bytes, ctypes.c_size_t(ea))
Note that on IDA 7.0 invoke_callbacks is the preferred route, since all the wrapper functions go through it rather than calling notify directly. Nothing prevents using ph_notify, though: it is still exposed, and it is especially useful in processor module development, where a custom user-defined value is needed. So either one will work.

Finally the reward of all the hard work:
ph_notify(ev_create_func_frame, ctypes.c_void_p(int(f.this)))
The 'int(f.this)' is a tricky point too, as it yields the true address of the func_t structure, which is the only value that will work. id(f.this) and id(f) are both wrong, since they yield the address of the Python object. The string representation even misleadingly shows the wrong address. IDA uses SWIG to wrap its objects, while the C calls here are made through the ctypes library. As a reverse engineer, you should be able to understand or work out the exact mechanisms by compiling small samples and studying the disassembly, or even by live debugging IDA64.dll.
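
The same distinction between the address of the C data and the identity of the Python wrapper can be seen with plain ctypes, outside IDA. This is only an analogy: FakeFunc is a made-up stand-in for func_t, and ctypes.addressof plays the role that int(f.this) plays for the SWIG object.

```python
import ctypes

class FakeFunc(ctypes.Structure):
    # Hypothetical stand-in for func_t; the real layout is different.
    _fields_ = [("start_ea", ctypes.c_uint64), ("end_ea", ctypes.c_uint64)]

f = FakeFunc(0x1000, 0x1040)
c_address = ctypes.addressof(f)  # where the C structure lives
py_identity = id(f)              # identity of the Python wrapper object

# Reading through the C address recovers the structure contents.
view = FakeFunc.from_address(c_address)
print(hex(view.start_ea), hex(view.end_ea))  # 0x1000 0x1040
print(c_address == py_identity)
```

Only the first value is meaningful to C code. Passing the second one to a native callback would make it read the Python object header as if it were the structure.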

Anyway, the wrong far data pointer identification was fixed, and I have not found any simpler or more straightforward way to do so. Regardless, the techniques and ideas here could be useful beyond such legacy cases. Happy reversing.
07-31-2019, 15:46
chants
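The callers of the function are still not updated, because the type keeps stale stack offsets (the type is stored separately from the frame). To refresh it, guess the type again and reapply it with the corrected memory model:
ida_typeinf.guess_tinfo(tinfo, ea)
tinfo.get_func_details(fi)  # reload fi from the freshly guessed type
fi.cc = (fi.cc & ~idaapi.CM_M_MASK) | idaapi.CM_M_FN
tinfo.create_func(fi)  # bind the modified fi back into tinfo
ida_typeinf.apply_tinfo(ea, tinfo, idaapi.TINFO_DEFINITE)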
Alternatively, one could manually enumerate the code x-refs to the function and call ida_typeinf.apply_callee_tinfo(call_ea, tinfo) for each one, where the first argument is the address of the calling instruction.
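
A minimal sketch of that enumeration, written so that the loop can be tried outside IDA. apply_to_call_sites is a hypothetical helper; the IDA wiring in the trailing comment assumes idautils.CodeRefsTo together with the ea and tinfo objects from above, and has not been tested against a particular IDA version.

```python
def apply_to_call_sites(call_sites, apply_type):
    """Call apply_type(site) once per distinct call site, in address order."""
    visited = []
    for site in sorted(set(call_sites)):
        apply_type(site)
        visited.append(site)
    return visited

# Stand-alone demonstration with made-up addresses.
applied = []
visited = apply_to_call_sites([0x20, 0x10, 0x20], applied.append)
print([hex(site) for site in visited])  # ['0x10', '0x20']

# Inside IDA this would be wired up roughly as (assumption, see lead-in):
#   apply_to_call_sites(idautils.CodeRefsTo(ea, 0),
#                       lambda site: ida_typeinf.apply_callee_tinfo(site, tinfo))
```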

Another alternative to going through the processor module may be ida_typeinf.delete_wrong_frame_info(f, reanalyze_callback), but it requires a callback that takes an instruction as its argument. It is not really intended for a complete refresh.