Optimizing Plugins (SourceMod Scripting)

From AlliedModders Wiki
Revision as of 11:42, 10 September 2007 by BAILOPAN

Introduction

SourceMod is pretty fast, but there are some common assumptions about scripts:

  • You can't possibly make it any faster.
  • It's pre-compiled, so it's already quite fast.
  • Details don't matter, as it's only "scripting" anyway.

None of these is true. The compiler, in fact, is very poor at optimizing, and you can greatly increase the speed and efficiency of your plugins by keeping a few rules in mind. Remember: it is more important to minimize instructions than it is to minimize lines of code.

Note that many of these optimizations should be taken in context. An admin command probably doesn't need fine-tuned optimizations. But a timer that executes every 0.1 seconds, or a GameFrame hook, should definitely be as fast as possible.

Always Save Results

Observe the example code snippet below:

if (GetClientTeam(player) == 3)
{
    //...code
} else if (GetClientTeam(player) == 2) {
    //...code
} else if (GetClientTeam(player) == 1) {
    //...code
}

This is a mild example of why you should cache your results. When the compiler generates assembly for this code, it will (in pseudocode) generate:

  CALL GetClientTeam
  COMPARE+JUMP
  CALL GetClientTeam
  COMPARE+JUMP
  CALL GetClientTeam
  COMPARE+JUMP

Notice the problem? We have called GetClientTeam two more times than necessary. The result doesn't change between calls, so we can save it. Observe:

new team = GetClientTeam(player);
if (team == 3)
{
    //...code
} else if (team == 2) {
    //...code
} else if (team == 1) {
    //...code
}

Now, the compiler will only generate this:

  CALL GetClientTeam
  COMPARE+JUMP
  COMPARE+JUMP
  COMPARE+JUMP

If GetClientTeam were a more expensive operation (it is relatively cheap), the original code would have recalculated the entire result in each branch of the if chain, wasting CPU cycles.

Similarly, this type of code is usually not a good idea:

for (new i = 0; i < strlen(string); i++) ....

Better code is:

new len = strlen(string);
for (new i = 0; i < len; i++)

Similarly, you may not want to put GetMaxClients() in the condition of a loop over players. While it is a very cheap function, each iteration still incurs the cost of a native call, which might be significant in highly performance-sensitive code.
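As a minimal sketch of the advice above, the loop below saves the result of GetMaxClients() once before iterating. The function name CountTeamPlayers and its team parameter are hypothetical and exist only for illustration; the natives it calls are the ones already named in this article.

```sourcepawn
#include <sourcemod>

// Hypothetical helper: counts the in-game players on a given team.
stock CountTeamPlayers(team)
{
    // One native call, made before the loop; the result is reused below.
    new maxClients = GetMaxClients();
    new count = 0;

    // Client indexes start at 1 and run up to and including maxClients.
    for (new i = 1; i <= maxClients; i++)
    {
        if (IsClientInGame(i) && GetClientTeam(i) == team)
        {
            count++;
        }
    }

    return count;
}
```

Whether this matters depends on how often the function runs; in a per-frame hook the saved native calls add up, while in an admin command they are unlikely to be noticeable.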


Switch instead of If

If you can, you should use switch instead of a chain of if statements. With an if chain, the compiler must emit a separate compare and branch for each consecutive test, and the code works through them one at a time. Using the example from above, observe the switch version:

new team = GetClientTeam(player);
switch (team)
{
  case 3:
     //code...
  case 2:
     //code...
  case 1:
     //code...
}

This will generate what's called a "case table". Rather than working through a series of separate if tests, the compiler generates a table of possible values. The JIT is smart enough to optimize this even further, and the best case is:

  JUMP Table[CALL GetClientTeam]

If your switch cases are listed in perfectly sequential order (that is, skipping no numbers, either ascending or descending), the JIT can make the best optimizations. For example, "7,8,9" and "2,1,0" are examples of perfect switch cases. "1,3,4" is not.
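The sketch below shows a switch whose cases are sequential with no gaps, the layout described above as the best case for the case table. The helper name GetTeamLabel and the label strings are hypothetical placeholders; what each team index means depends on the game.

```sourcepawn
#include <sourcemod>

// Hypothetical helper: writes a short label for a team index into buffer.
stock GetTeamLabel(team, String:buffer[], maxlength)
{
    // Cases 1, 2, 3 skip no numbers, so they form a "perfect" sequence.
    switch (team)
    {
        case 1:
        {
            strcopy(buffer, maxlength, "team 1");
        }
        case 2:
        {
            strcopy(buffer, maxlength, "team 2");
        }
        case 3:
        {
            strcopy(buffer, maxlength, "team 3");
        }
        default:
        {
            // Any other value ends up here.
            strcopy(buffer, maxlength, "unknown");
        }
    }
}
```

Unlike C, Pawn cases do not fall through to the next case, so no break statement is needed after each block.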


Don't Re-index Arrays

A common practice in Pawn is to "save space" by re-indexing an array every time an element is needed, rather than storing the element in a local variable. There are a few myths behind this: that it saves memory, that the compiler caches the lookup for you, or that it is more readable. None of these is true. Observe the code below.

SomeFunction(const clients[], num_clients)
{
   for (new i = 0; i < num_clients; i++)
   {
      if (!IsClientInGame(clients[i]))
      {
         continue;
      }
      SetSomething(clients[i], GetSomething(clients[i]) + 1);
   }
}

For this, the compiler generates code similar to:

:LOOP_BEGIN
   LOAD i
   LOAD clients
   CALC
   LOAD clients[i]
   CALL IsClientInGame
   LOAD i
   LOAD clients
   CALC
   LOAD clients[i]
   CALL GetSomething
   LOAD i
   LOAD clients
   CALC
   LOAD clients[i]
   CALL SetSomething

See what happened? The compiler does not cache array indexing. Because we've used clients[i] each time, every instance generates 4-6 (or more) instructions that load i and the address of clients, compute the final location, and then fetch the data from memory. It is much faster to do:

SomeFunction(const clients[], num_clients)
{
   new client;
   for (new i = 0; i < num_clients; i++)
   {
      client = clients[i];
      if (!IsClientInGame(client))
      {
         continue;
      }
      SetSomething(client, GetSomething(client) + 1);
   }
}

Not only is this more readable, but look at how much cruft we've shaved off the compiler's generated code:

:LOOP_BEGIN
   LOAD i
   LOAD clients
   CALC
   LOAD clients[i]
   STORE client
   CALL IsClientInGame
   LOAD client
   CALL GetSomething
   LOAD client
   CALL SetSomething

In a large loop you can drastically reduce code size in this manner.

Decl on Local Arrays

There is a small caveat to the 'new' statement in Pawn; it automatically writes a zero for every byte in the data structure. For example:

new String:elephant[512]

If placed inside local scope, all 512 bytes of the variable will be cleared every time the variable is created. In a performance-sensitive callback, this could be disastrous. Even something as innocuous as:

new temp_players[MAX_PLAYERS]

could be a significant slowdown if executed too often. To solve this, SourceMod has a special replacement for new called decl. Unlike new, decl does not bother setting the entire array or structure to zero. The data will instead contain whatever garbage was previously left in that memory.

While this is much faster, you must be careful to initialize the data before you use it. Observe the following code:

decl String:my_string[256];
 
if (Something())
{
   Format(my_string, sizeof(my_string), "clam");
}
 
PrintToChat(client, "%s", my_string);

Notice the bug? If Something() returned false, my_string would still have garbage in it. You must always make sure that you never read from uninitialized data; doing so leads to undefined behavior and possibly even crashes.

A common practice is to short-initialize strings. For example:

decl String:my_string[256];
my_string[0] = '\0';

This sets the first byte of the string to zero, which makes sure the string will be read as a valid, but empty, string.
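Applied to the buggy example above, a corrected version might look like the sketch below. The wrapper function TellClient is hypothetical, and Something() is given a stand-in body here only so the snippet is self-contained.

```sourcepawn
#include <sourcemod>

// Stand-in for the Something() condition used in the example above.
bool:Something()
{
    return false;
}

// Hypothetical wrapper; assumes client is a valid, in-game client index.
stock TellClient(client)
{
    decl String:my_string[256];
    my_string[0] = '\0';    // short-initialize: a valid, empty string

    if (Something())
    {
        Format(my_string, sizeof(my_string), "clam");
    }

    // Safe on both paths: either "clam" or an empty string is printed.
    PrintToChat(client, "%s", my_string);
}
```

Only one byte is written instead of all 256, so most of the savings of decl are kept while the uninitialized read is avoided.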

Note that decl will work on any local variable type (array, string, float, integer, et cetera). The one thing it cannot be used for is initialization. This is invalid:

decl var = 5

Since the purpose of decl is to avoid initialization, such a line would have no meaning, and thus it is invalid syntax.
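A valid alternative is to declare with decl and then assign on a separate line, or to let a native fill the buffer before anything reads it. The sketch below assumes a hypothetical function PrintClientName and a client index that is already connected.

```sourcepawn
#include <sourcemod>

// Hypothetical helper; assumes client is a connected client index.
stock PrintClientName(client)
{
    decl count;     // declared without an initializer
    count = 5;      // assigned on its own line before it is read

    decl String:name[MAX_NAME_LENGTH];
    // The native writes the name into the buffer before it is read.
    GetClientName(client, name, sizeof(name));

    PrintToServer("%s: %d", name, count);
}
```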