Posted 07-31-2019, 15:13 by chants
Advanced IDAPython to change details the UI cannot change

----------------------------------------
Advanced IDAPython coding tutorial
----------------------------------------

For example, 16-bit code can use far code with near data, or any of the other three combinations of far and near. Sometimes IDA gets it wrong with FLIRT and chooses the wrong model, which messes up stack parameter alignment and so forth. But you cannot fix this by editing the function type, because the prototype already has pointers in it, e.g.: void *__cdecl memset(void *, int, size_t);

Options -> Compilers is the only place where this model can be specified, but it is a global setting, even though every function prototype carries its own setting for it.

So you want 2-byte pointers, not 4-byte pointers, for this particular function, since it uses the near-data far-code model, while the default for the file is far-data far-code. A function with an unknown calling convention will also use this default.

So how can you fix it without ruining the pointer types of the return value and the first argument?

Code:
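import idaapi

ea = idaapi.get_name_ea(idaapi.BADADDR, "_memset")
tinfo = idaapi.tinfo_t()
# get_tinfo2(ea, tinfo) is the 6.x-era name; newer IDAPython is expected to use idaapi.get_tinfo(tinfo, ea)
idaapi.get_tinfo2(ea, tinfo)
fi = idaapi.func_type_data_t()
tinfo.get_func_details(fi)
fi.cc  # evaluated in the console, this displays the calling convention/model byte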
fi.cc will return a value such as 50 (48+2, i.e. 0x30 + 0x02). The SDK docs explain the calling convention masks and values:
Code:
const cm_t  CM_N16_F32   = 0x02;  ///< 2: near 2bytes, far 4bytes

const cm_t CM_M_MASK = 0x0C;
const cm_t  CM_M_NN      = 0x00;  ///< small:   code=near, data=near (or unknown if CM_UNKNOWN)
const cm_t  CM_M_FF      = 0x04;  ///< large:   code=far, data=far
const cm_t  CM_M_NF      = 0x08;  ///< compact: code=near, data=far
const cm_t  CM_M_FN      = 0x0C;  ///< medium:  code=far, data=near

const cm_t  CM_CC_UNKNOWN  = 0x10;  ///< unknown calling convention
const cm_t  CM_CC_CDECL    = 0x30;  ///< stack
So it is marked as near 16/far 32 pointers and CDECL, both of which are correct.
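
To make the bit layout concrete, here is a minimal sketch that splits such a byte into its three fields. It runs outside IDA and hard-codes the constants; CM_MASK (0x03) and CM_CC_MASK (0xF0) are not shown in the listing above and are assumed here from the SDK's typeinf.hpp, and decode_cm is a hypothetical helper name.

```python
# Constants copied from the listing above, plus two masks assumed from typeinf.hpp.
CM_MASK = 0x03      # assumption: mask for the pointer-size bits
CM_N16_F32 = 0x02
CM_M_MASK = 0x0C
CM_CC_MASK = 0xF0   # assumption: mask for the calling-convention bits
CM_CC_CDECL = 0x30


def decode_cm(cc):
    """Split a cm_t byte into (pointer size bits, memory model bits, calling convention bits)."""
    return cc & CM_MASK, cc & CM_M_MASK, cc & CM_CC_MASK


ptr_bits, model_bits, cc_bits = decode_cm(50)
print(hex(ptr_bits), hex(model_bits), hex(cc_bits))
```

For the value 50 from the post, the pointer bits come out as CM_N16_F32 and the calling convention bits as CM_CC_CDECL, which matches the reading given above.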

Code:
fi.cc = (fi.cc & ~idaapi.CM_M_MASK) | idaapi.CM_M_FN
tinfo.create_func(fi)
ida_typeinf.apply_tinfo(ea, tinfo, idaapi.TINFO_DEFINITE)
This works in IDA. Note that the existing tinfo is reused: building the type in a brand-new tinfo would crash IDA because data would be missing. If you call apply_tinfo without first calling create_func, it does nothing, as fi needs to be rebound to a tinfo to take effect. The same round trip also works if you leave fi.cc unchanged. So the code is correct: query the type again and it has changed.
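
The mask arithmetic in that first line can be checked on its own, outside IDA. This is only a sketch with the constants hard-coded from the SDK listing earlier in the post; with_memory_model is a hypothetical helper name.

```python
CM_M_MASK = 0x0C
CM_M_FF = 0x04   # large:  code=far, data=far
CM_M_FN = 0x0C   # medium: code=far, data=near


def with_memory_model(cc, model):
    """Clear the two memory-model bits of a cm_t byte and set the requested model."""
    return (cc & ~CM_M_MASK & 0xFF) | model


# 0x32 (cdecl, near 16/far 32) with the medium model becomes 0x3E.
print(hex(with_memory_model(0x32, CM_M_FN)))
# A byte that already says "large" (0x36) ends up at the same value.
print(hex(with_memory_model(0x36, CM_M_FN)))
```

Only bits 2 and 3 change, so the pointer-size and calling-convention fields are left intact.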

Unfortunately, that fixed neither the stack of the function nor its callers.

This gets a bit trickier. Intuitively one would think that just using `ida_funcs.reanalyze_function(idaapi.get_func(ea))` would fix the function up immediately. But it does not work, and the answer lies deeper in how IDA functions. The tool that creates frames lives in the processor module, which is not properly exposed to IDAPython. So some manual C tricks are needed: first get the DLL name, then get the 'ph' processor_t structure (unfortunately the undocumented ida_idp.ph is not helpful here). Of course, inspection is always a good first step, such as:
Code:
import inspect
import ida_idp

dir(ida_idp)
inspect.getmembers(ida_idp)
ida_idp.ph
Anyway, doing this manually requires careful adjustment to pad out the unneeded fields that precede the notify callback. IDA 7.x changes things quite a bit, not just with the 64-bit calling convention but also with additional members and the change in pointer size. Some commented-out code is left in for those brave enough to try a solution for Linux, where in theory the variadic argument list is a data structure that supports register arguments and is quite a bit more complicated. For whatever strange reason, Windows does not follow the recommended ABI for va_list but keeps the old pre-amd64 model, where it is a mere pointer to the arguments, which conveniently makes things easier.
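
The padding arithmetic can be sanity-checked in plain ctypes before touching IDA. The sketch below assumes the field list given in the comments of the post's get_ph: five 32-bit fields plus three 32-bit pointers before 7.0, and six 32-bit fields plus three 64-bit pointers from 7.0 on. make_processor_t is a hypothetical helper, and the version number is passed in rather than read from idaapi.

```python
import ctypes


def make_processor_t(sdk_version, notify_type=ctypes.c_void_p):
    """Build a processor_t stand-in whose only real field is the notify pointer."""
    # The parentheses are needed: a conditional expression binds looser than '+'.
    pad_ints = 8 + (4 if sdk_version >= 700 else 0)

    class processor_t(ctypes.Structure):
        _fields_ = [("padding", ctypes.c_int * pad_ints),
                    ("notify", notify_type)]
    return processor_t


offset_7x = make_processor_t(700).notify.offset  # 6*4 + 3*8 bytes of skipped fields
offset_6x = make_processor_t(695).notify.offset  # 5*4 + 3*4 bytes of skipped fields
print(offset_7x, offset_6x)
```

If the printed offsets do not match the header of the SDK version in use, the padding count is the first thing to revisit.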

Code:
import ctypes
import sys

def get_dll():
    # http://www.hexblog.com/?p=695
    idaname = "ida64" if idc.__EA64__ else "ida"
    if sys.platform == "win32":
        return ctypes.windll[idaname + (".dll" if idaapi.IDA_SDK_VERSION >= 700 else ".wll")]
    elif sys.platform.startswith("linux"):  # "linux2" on Python 2, "linux" on Python 3
        return ctypes.cdll["lib" + idaname + ".so"]
    elif sys.platform == "darwin":
        return ctypes.cdll["lib" + idaname + ".dylib"]
def get_ph(notif):
    # Fields of processor_t that precede the notify callback:
    #   int32 version, id; uint32 flag;
    #   uint32 flag2;  # >= 7.0
    #   int32 cnbits, dnbits;
    #   const char *const *psnames, *plnames; const asm_t *const *assemblers;  # pointers are 64-bit in >= 7.0
    #   hook_cb_t *_notify;  # >= 7.0, otherwise it is just notify(int, ...) instead of (void*, int, va_list/...)
    # va_list is just a pointer to the variadic argument list. Building it at invocation time by pushing
    # the next stack address onto the stack is a low-level operation that ctypes probably does not offer.
    class processor_t(ctypes.Structure):
        # 8 ints of padding before 7.0, 12 from 7.0 on. Without the inner parentheses the
        # conditional would apply to the whole sum and give 0 before 7.0.
        _fields_ = [ ('padding', ctypes.c_int * (8 + (4 if idaapi.IDA_SDK_VERSION >= 700 else 0))),
                     ('notify', notif), ]
    # The exported 'ph' global is the processor_t of the current proc module
    return processor_t.in_dll(get_dll(), 'ph')

def ph_notify(msgid, *args):
  if idaapi.IDA_SDK_VERSION >= 700:
    class VA_LIST(ctypes.Structure):
      _fields_ = [("arg_" + str(i), args[i].__class__) for i in range(len(args))]
    va_list = VA_LIST(*args)
    return get_ph(ctypes.WINFUNCTYPE(ctypes.c_size_t, ctypes.c_void_p, ctypes.c_int, ctypes.c_void_p)).notify(None, msgid, ctypes.pointer(va_list))
  else:
    return get_ph(ctypes.WINFUNCTYPE(ctypes.c_size_t, ctypes.c_int, *[i.__class__ for i in args])).notify(msgid, *args)

#idaman ssize_t ida_export invoke_callbacks(hook_type_t hook_type, int notification_code, va_list va);
def invoke_callbacks(msgid, *args):
  HT_IDP = 0 #Hook to the processor module.
#AMD64 ABI va_list: unsigned int gp_offset (max 48=6*8), fp_offset (max 304=6*8+16*16), void* overflow_arg_area, reg_save_area
  #class VA_LIST(ctypes.Structure):
  #    _fields_ = [("gp_offset", ctypes.c_uint), ("fp_offset", ctypes.c_uint),
  #      ("overflow_arg_area", ctypes.c_void_p), ("reg_save_area", ctypes.c_void_p)]
  class VA_OF_LIST(ctypes.Structure):
      _fields_ = [("arg_" + str(i), args[i].__class__) for i in range(len(args))]
  va_of_list = VA_OF_LIST(*args)
  #va_list = VA_LIST(0, 6*8, ctypes.cast(ctypes.pointer(va_of_list), ctypes.c_void_p), ctypes.cast(ctypes.pointer(va_of_list), ctypes.c_void_p))
  return get_dll().invoke_callbacks(HT_IDP, msgid, ctypes.pointer(va_of_list))
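
The idea of a pointer-style va_list can be tried without IDA. The sketch below is only an illustration under stated assumptions: every argument occupies one pointer-sized slot, and a Python callback stands in for the processor module's notify function. pack_args and fake_notify are hypothetical names, and CFUNCTYPE is used instead of WINFUNCTYPE so that it runs on any platform.

```python
import ctypes


def pack_args(*values):
    """Lay integer arguments out back to back, one pointer-sized slot each."""
    return (ctypes.c_size_t * len(values))(*values)


# Same shape as the 7.x callback: (user_data, notification_code, va_list).
HOOK_CB = ctypes.CFUNCTYPE(ctypes.c_size_t, ctypes.c_void_p, ctypes.c_int, ctypes.c_void_p)


def _fake_notify(user_data, code, va):
    # The callee walks the argument block through the bare pointer it was given.
    slots = ctypes.cast(va, ctypes.POINTER(ctypes.c_size_t))
    return code + slots[0] + slots[1]


fake_notify = HOOK_CB(_fake_notify)

block = pack_args(0x1000, 0x20)  # keep a reference so the memory stays alive
result = fake_notify(None, 60, ctypes.cast(block, ctypes.c_void_p))
print(result)
```

Arguments narrower than a pointer would need widening to a full slot in a layout like this, which is one reason the calls in this post pass c_void_p and c_size_t values.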
The reward for that mess is that any processor module call is now possible, if done carefully:
Code:
#event_t enum
ev_verify_sp = 58
ev_create_func_frame = 60
ev_analyze_prolog = 81
ev_max_ptr_size = 2002
ev_calc_cdecl_purged_bytes = 2008

#ssize_t notify(event_t event_code, ...)
f = idaapi.get_func(ea)
invoke_callbacks(ev_verify_sp, ctypes.c_void_p(int(f.this)))
ph_notify(ev_analyze_prolog, ctypes.c_size_t(ea))
ph_notify(ev_calc_cdecl_purged_bytes, ctypes.c_size_t(ea))
invoke_callbacks(ev_max_ptr_size)
Note that IDA 7.0 provides invoke_callbacks, and it is the one that should be used, since all the wrapper functions go through it rather than calling ph_notify directly. There is no reason why the direct call would not be possible, though: it is still exposed, especially for processor module development where a custom user-defined value is needed. So either one will work.

Finally, the payoff of all the hard work, deleting the frame and having the processor module recreate it:
Code:
ida_frame.del_frame(f)
ph_notify(ev_create_func_frame, ctypes.c_void_p(int(f.this)))
The 'int(f.this)' is a tricky point too, as it yields the true address of the func_t structure, which is the only value that will work. id(f.this) and id(f) are both wrong, as they yield the address of the Python object. The string representation even misleadingly shows the wrong address. IDA uses SWIG to wrap its objects, while the C calls here are made through the ctypes library. But as an RE you should be able to understand or work out the exact mechanisms by compiling small samples and studying the disassembly, or even by live debugging IDA64.dll.
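
The difference between a wrapper's identity and the address of the C data it wraps can be seen with plain ctypes. This is an analogy only: ctypes.addressof plays the role that int(f.this) plays for a SWIG object, and nothing here touches IDA.

```python
import ctypes

value = ctypes.c_int(1234)

c_address = ctypes.addressof(value)  # where the C integer itself lives
py_identity = id(value)              # identity of the Python wrapper object

# Reading through the C address gives the data back; the wrapper's identity is a different number.
read_back = ctypes.c_int.from_address(c_address).value
print(read_back, c_address != py_identity)
```

Passing the wrapper's identity to native code would hand it the Python object header rather than the structure, which is consistent with id(f) and id(f.this) failing above.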

Anyway, the wrong far data pointer identification was fixed, and I have not found any simpler or more straightforward way to do so. Regardless, the techniques and ideas here could be useful beyond such legacy cases. Happy reversing.