Difference between revisions of "User:Nosoop/Guide/Advanced"

From AlliedModders Wiki
(The Easy Way: Add note on differing signatures between disassembler scripts)
(SDKCall Order: Put return buffer items into a list; add notes on string return)

Revision as of 04:39, 16 April 2023

This section is provided for users that want or need to work with game-specific functionality that SourceMod doesn't provide access to out of the box.

It's assumed that you're comfortable with programming and its terminology. By the end of this page you'll have some knowledge of calling / hooking arbitrary functions in the game.

Finding Functions

  • TODO refer to public SDK if you don't know what you're looking for
  • TODO explain what to do in a game with symbols
  • TODO suggest opening IDA's options and enabling opcode bytes
  • TODO inlined functions
  • TODO debugging

Finding VTable Offsets

In C++, a virtual method table (shorthand "vtable") is effectively an array of function pointers. It exists to support inheritance: a virtual ::DoThing() method can be implemented differently by different classes, so the code looks up the correct function for a specific instance in the table for that instance's class. Every instance of a class that uses a vtable holds a pointer to the table, typically at the start of the instance.
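As a minimal sketch of that lookup from the source-code side, the following uses two made-up classes (CBase and CDerived are hypothetical names, not game classes). The caller only knows about the base type, yet the function that runs depends on the instance it is given.

```cpp
// Hypothetical classes for illustration only; they do not exist in any game SDK.
struct CBase {
    virtual ~CBase() = default;
    virtual int DoThing() const { return 1; }
};

struct CDerived : CBase {
    // Replaces the DoThing entry in CDerived's table.
    int DoThing() const override { return 2; }
};

// The call is resolved at runtime through the table that belongs to the
// instance's actual class, not through the static type of the reference.
int CallDoThing(const CBase& instance) {
    return instance.DoThing();
}
```

Passing a CBase to CallDoThing yields 1 and passing a CDerived yields 2, even though the call site is identical in both cases.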

The Hard Way

Once you have the virtual function, jump to its reference in .rodata and make a note of that address. Scroll up until you see an offset reference (off_* in IDA, PTR_* in Ghidra); that is likely the first entry in the vtable (index 0). Disassemblers create this reference because it is the address that is stored in class instances.

Get the difference between the address of your function's vtable entry and that of the first entry, then divide by the pointer size (4 on 32-bit platforms, 8 on 64-bit).

For example, given a 32-bit function pointer located at 011AE84Ch and the start at 011AE3C8h, you do (0x011AE84C-0x011AE3C8) / 4, resulting in the index 289.
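The arithmetic above can be sketched as a small helper. The function name and its -1 error convention are assumptions made for this illustration; nothing like it ships with SourceMod.

```cpp
#include <cstdint>

// Hypothetical helper: converts the address of a vtable entry into its index.
// Returns -1 if the inputs cannot describe an entry of the same table.
std::int64_t VTableIndexFromAddresses(std::uint64_t entryAddress,
                                      std::uint64_t firstEntryAddress,
                                      std::uint64_t pointerSize) {
    if (pointerSize == 0 || entryAddress < firstEntryAddress) {
        return -1;
    }
    const std::uint64_t byteDistance = entryAddress - firstEntryAddress;
    if (byteDistance % pointerSize != 0) {
        return -1;  // not aligned to a whole number of entries
    }
    return static_cast<std::int64_t>(byteDistance / pointerSize);
}
```

With the values from the example, VTableIndexFromAddresses(0x011AE84C, 0x011AE3C8, 4) gives 289.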

Alternatively, if you're familiar with the code or have sources to cross-reference against, you can search for the virtual call itself. In a Linux disassembly, it will look something like this:

; get the first vtable by dereferencing the pointer at the start of the class instance
8B 03       mov  eax, [ebx]

; push the class instance as a parameter
89 1C 24    mov  [esp], ebx

; call the fourth entry (at index 3) in the vtable: 0xC / sizeof(void*) = 0x3
FF 50 0C    call dword ptr [eax+0Ch]
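If you are reading the opcode bytes directly, the index can be recovered from the call instruction itself. The sketch below is a rough illustration and not a full x86 decoder: it assumes the instruction is an indirect call (opcode FF with /2 in the ModRM byte) through a single base register, and it rejects SIB-encoded, absolute, and register-direct forms. The function name is hypothetical.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical helper: given the bytes of "call [reg + disp]", returns the
// vtable index (displacement / pointer size), or -1 for unsupported input.
int VTableIndexFromCall(const std::vector<std::uint8_t>& bytes,
                        std::uint32_t pointerSize = 4) {
    if (bytes.size() < 2 || bytes[0] != 0xFF || pointerSize == 0) {
        return -1;
    }
    const std::uint8_t modrm = bytes[1];
    const std::uint8_t mod = static_cast<std::uint8_t>(modrm >> 6);
    const std::uint8_t reg = static_cast<std::uint8_t>((modrm >> 3) & 7);
    const std::uint8_t rm = static_cast<std::uint8_t>(modrm & 7);

    if (reg != 2) return -1;              // FF /2 is CALL r/m
    if (mod == 3 || rm == 4) return -1;   // register-direct or SIB form

    std::uint32_t displacement = 0;
    if (mod == 0) {
        if (rm == 5) return -1;           // absolute disp32, no base register
    } else if (mod == 1) {
        // One displacement byte; negative values make no sense for a vtable.
        if (bytes.size() < 3 || bytes[2] > 0x7F) return -1;
        displacement = bytes[2];
    } else {
        // Four displacement bytes, little-endian.
        if (bytes.size() < 6) return -1;
        for (std::size_t i = 0; i < 4; ++i) {
            displacement |= static_cast<std::uint32_t>(bytes[2 + i]) << (8 * i);
        }
        if (displacement > 0x7FFFFFFF) return -1;
    }
    if (displacement % pointerSize != 0) return -1;
    return static_cast<int>(displacement / pointerSize);
}
```

For the listing above, VTableIndexFromCall({0xFF, 0x50, 0x0C}) gives 3.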

The Easy Way

If the game isn't stripped of debugging symbols, use asherkin's VTable Dumper. It provides correct offsets for Linux binaries (which are what it reads), and estimates offsets for Windows that are usually correct.

There are instances where the dumper isn't correct, so you may need to be careful in those cases. Known cases include:

Aside - the layout of vtables is not the same across platforms. Notable differences are:

  1. Linux may have multiple virtual destructors; Windows appears to only have up to one.
  2. On Linux, overloads are emitted in the order they are declared in the original code. On Windows the order is otherwise the same, except that overloaded functions (those with the same name that accept different parameters) are grouped together and emitted in reverse order.

Creating Signatures

The Hard Way

After you've found a function, you need to tell SourceMod the sequence of bytes unique to it. Those bytes make up a signature.

Note: If you're using IDA and only see the mnemonics in the "IDA View" tab, make sure to set the number of opcode bytes in IDA options to a non-zero number. 8 is sufficient in most cases.

You could treat the raw sequence of bytes as the signature directly, but this would break very easily whenever the game is updated. At the machine-code level, the instructions for "move X to Y" might stay the same while the data changes: X and Y might be in a different location in the binary altogether. For an example within a longer signature:

; stores the address of aString at the top of the stack (the location esp points to)
; the bytes 3B B3 25 01 are the absolute address of aString in this binary in little-endian format (0x0125B33B)
C7 04 24 3B B3 25 01    mov     dword ptr [esp], offset aString

; call function, the four bytes after E8 are the function's location, encoded relative to the end of this instruction
E8 78 F0 48 00          call    _Z12UTIL_VarArgsPKcz

; sets eax to arg 0
8B 45 08                mov     eax, [ebp+arg_0]

The naive signature for that would be \xC7\x04\x24\x3B\xB3\x25\x01\xE8\x78\xF0\x48\x00\x8B\x45\x08. However, you can't rely on all of those bytes staying constant:

  • aString and UTIL_VarArgs might be located somewhere else after a game update
  • Relocations may be performed such that the data bytes are different in memory from their on-disk representation

To solve this, use wildcards to mask off the bytes you don't care about. In SourceMod game config files, the sequence \x2A indicates that the byte at that position shouldn't be checked and that the scan should continue with the next one.

Here is what the previous signature looks like with the masked bytes displayed as ??:

C7 04 24 ?? ?? ?? ??    mov     dword ptr [esp], offset aString
E8 ?? ?? ?? ??          call    _Z12UTIL_VarArgsPKcz
8B 45 08                mov     eax, [ebp+arg_0]

A masked signature would then be \xC7\x04\x24\x2A\x2A\x2A\x2A\xE8\x2A\x2A\x2A\x2A\x8B\x45\x08.
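Converting the ?? notation into the escaped form is mechanical, and can be sketched as follows. The function name is an assumption for this example, and it expects space-separated two-digit hex tokens with ?? (or ?) for masked bytes; it returns an empty string on malformed input.

```cpp
#include <cctype>
#include <sstream>
#include <string>

// Hypothetical helper: turns "C7 04 24 ?? ..." into "\xC7\x04\x24\x2A...".
std::string ToGameConfSignature(const std::string& pattern) {
    std::istringstream tokens(pattern);
    std::string token;
    std::string out;
    while (tokens >> token) {
        if (token == "??" || token == "?") {
            out += "\\x2A";  // wildcard: this byte is not checked
            continue;
        }
        const bool isHexByte =
            token.size() == 2 &&
            std::isxdigit(static_cast<unsigned char>(token[0])) &&
            std::isxdigit(static_cast<unsigned char>(token[1]));
        if (!isHexByte) {
            return "";  // reject anything that is not a two-digit hex byte
        }
        out += "\\x";
        out += static_cast<char>(std::toupper(static_cast<unsigned char>(token[0])));
        out += static_cast<char>(std::toupper(static_cast<unsigned char>(token[1])));
    }
    return out;
}
```

Note that a literal 2A byte in the input is emitted as \x2A as well, which by the rule above is read as a wildcard rather than as that exact byte.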

Masking is used mainly for offsets, such as for functions and variables. Instructions generally don't change unless the function code itself is modified, at which point you'll want to revisit your binary and update accordingly.

If you're using DHooks with byte signatures (covered later), you may also want to mask out the first six bytes, as a detour will patch in an unconditional JMP at the start of the function to trampoline into a user-defined function, and subsequent scans for the unmasked byte signature will fail.

Note: This is no longer the case as of SourceMod 1.11, which stores a copy of the original data for scanning purposes. However, it's noted here for historical / implementation detail reasons.

For an extended lesson, you can look at the following material:

The Easy Way

If you're using IDA (including Free), use the makesig7.idc script. If you're using Ghidra, use makesig.py.

They generally do pretty well at finding and masking byte signatures, but when they fail or you want a more robust signature, you should understand how to create signatures manually.

Both scripts may produce different byte signatures for the same function due to using different methods to determine if a given byte should be masked.

It's exceedingly rare, but possible, that the binary has two copies of the exact same short function (for example, when they are typechecked and statically cast to different subclasses). Both scripts will fail in that case. SourceMod's signature scanner will use the first match it finds, so if any match is acceptable, you can still use an appropriately masked signature.

If two copies of a function seem to exist, be sure to look at the disassembly to make sure that the functions are indeed the same.
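To check whether a signature is ambiguous, you can count how many places it matches. The following is a simplified sketch of a wildcard scan over a byte buffer, not SourceMod's actual scanner; the function name is hypothetical, and it treats every 0x2A byte in the signature as "don't check", as described for game config files.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical helper: returns the offset of every match of `signature`
// inside `haystack`. A 0x2A byte in the signature matches any byte.
std::vector<std::size_t> FindSignatureMatches(
        const std::vector<std::uint8_t>& haystack,
        const std::vector<std::uint8_t>& signature) {
    std::vector<std::size_t> matches;
    if (signature.empty() || signature.size() > haystack.size()) {
        return matches;
    }
    for (std::size_t start = 0; start + signature.size() <= haystack.size(); ++start) {
        bool matched = true;
        for (std::size_t i = 0; i < signature.size(); ++i) {
            if (signature[i] != 0x2A && signature[i] != haystack[start + i]) {
                matched = false;
                break;
            }
        }
        if (matched) {
            matches.push_back(start);
        }
    }
    return matches;
}
```

The first element corresponds to the match a first-match scanner would use; more than one element means the signature is not unique within the buffer.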

Finding Addresses

Sometimes you have a symbol, but you need an address to work with. That is what the "Addresses" section of a game configuration file is used for.

To find an address, you start from a known location reference (signature). You may then have to jump to references (that is, dereference locations), then get an offset from the previous reference.

A read key indicates an offset to load / dereference relative to the previous address, and an offset key means to shift the previous address without any dereference. These key / value pairs are processed in the order you specify them in the file; offset is only valid as the last "operation".

For a C++-like example:

// start from an address
// "FindLocation" would return the location of either a named symbol reference or the start of a byte signature
uintptr_t addr = FindLocation("some_signature");
addr = *reinterpret_cast<uintptr_t*>(addr + 40); // gameconf: "read" "40"
addr = *reinterpret_cast<uintptr_t*>(addr); // gameconf: "read" "0"
addr += 13; // gameconf: "offset" "13"
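The same walk can be written as a loop over a list of operations, which is closer to how the key / value pairs are processed in order. This is a sketch under assumptions: OpKind, AddressOp and ResolveAddress are names invented for this example and are not SourceMod types.

```cpp
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <vector>

// Hypothetical model of one gameconf key: "read" or "offset" plus its value.
enum class OpKind { Read, Offset };

struct AddressOp {
    OpKind kind;
    std::ptrdiff_t value;
};

// Applies the operations in order, starting from `start`.
// The caller must ensure every address that gets read is valid memory.
std::uintptr_t ResolveAddress(std::uintptr_t start, const std::vector<AddressOp>& ops) {
    std::uintptr_t addr = start;
    for (const AddressOp& op : ops) {
        const std::uintptr_t shifted = addr + static_cast<std::uintptr_t>(op.value);
        if (op.kind == OpKind::Read) {
            // "read": load the pointer stored at (addr + value).
            std::uintptr_t loaded = 0;
            std::memcpy(&loaded, reinterpret_cast<const void*>(shifted), sizeof(loaded));
            addr = loaded;
        } else {
            // "offset": shift the address without dereferencing.
            addr = shifted;
        }
    }
    return addr;
}
```

The example above corresponds to the operation list {Read, 40}, {Read, 0}, {Offset, 13}.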

Note: This section is a work-in-progress.

Calling Game Functions

Note: This section is a work-in-progress.

SDKCall Order

When performing an SDKCall, the parameters need to be passed in the following order:

  1. The SDKCall handle received from EndPrepSDKCall.
  2. The this instance. this may be omitted in the following cases:
    • The function was declared as static, where there is no this to pass in.
    • The function was declared with the SDKCall_GameRules or SDKCall_EntityList call types; SDKTools itself will provide the appropriate global instance.
  3. The return buffer, if applicable.
    • If the function returns a Vector or QAngle, the parameter is a float[3].
    • If the function returns a char*, the parameters should be a char[] buffer and an int specifying the size of the buffer. The return value of the SDKCall will be the number of characters written, or -1 if the function returned a null pointer (to differentiate that case from an empty string).
    • If the function returns a primitive type / entity / edict, it will be the return value of the SDKCall, so no such return buffer is necessary.
  4. Any remaining parameters for the function.

Examples:

// Vector CBaseCombatCharacter::Weapon_ShootPosition() -- has 'this' and 'Vector' return
float vecShootPosition[3];
SDKCall(g_hSDKCall, client, vecShootPosition);

// const char *CBaseAnimating::GetSequenceName(int iSequence) -- has 'this', 'char*' return, and parameter
char sequenceName[64];
SDKCall(g_hSDKCall, entity, sequenceName, sizeof(sequenceName), iSequence);

// bool CGlobalEntityList::IsEntityPtr(void* pTest) -- SDKCall_EntityList is used, so no 'this' explicitly needed
// SDKCall passes the return value from the called function as its return value, so use an assignment operator
bool result = SDKCall(g_hSDKCall, pTest);

Hooking Game Functions (with DHooks)

DHooks is an extension bundled with SourceMod that enables plugins to hook functions of their choosing (currently restricted to those accessible via server / engine binaries). You may use its functionality by including <dhooks>.

As with SDKCalls, you must declare your hook setup with the same parameter and return types as the target function so that the server continues to operate as you'd expect.

Note: This section is a work-in-progress.

Virtual Hook or Detour?

A virtual hook is mainly used for hooking virtual methods of a class; a detour is used for hooking any function.

While detours can be used to hook the function a virtual table calls into, virtual hooks still have the merit of hooking specific classes / instances. More specifically:

  • DHooks provides the bookkeeping on which instances are and aren't hooked, so for virtual hooks the callback will only be invoked on those you specifically hook. On detours, you have to filter on instances yourself.
  • On chained inheritance, a virtual hook will only act on the exact class and not on any parent or subclass, even if they all point to the same virtual function. Detours will, again, be called on any invocation of the function, including calls to it made by its subclass.
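The instance filtering that a detour callback has to do by itself can be modelled roughly as below. This is a conceptual sketch in plain C++, not DHooks code: Entity, InstanceFilter and OnDetouredCall are hypothetical names standing in for a game object, your own bookkeeping, and a detour callback that receives every invocation.

```cpp
#include <unordered_set>

// Hypothetical stand-in for a game object.
struct Entity {
    int id;
};

// Bookkeeping that a virtual hook gets for free but a detour does not:
// remember which instances the plugin actually cares about.
class InstanceFilter {
public:
    void Track(const Entity* entity) { tracked_.insert(entity); }
    void Untrack(const Entity* entity) { tracked_.erase(entity); }
    bool IsTracked(const Entity* entity) const { return tracked_.count(entity) != 0; }

private:
    std::unordered_set<const Entity*> tracked_;
};

// Stand-in for a detour callback, which runs for every call of the function.
// Returns true when the call was for a tracked instance and was handled.
bool OnDetouredCall(const InstanceFilter& filter, const Entity* self) {
    if (!filter.IsTracked(self)) {
        return false;  // not ours; let the original behaviour continue untouched
    }
    return true;
}
```

In this model, a tracked instance must also be untracked when it is destroyed, otherwise a later object at the same address would be treated as hooked.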