Optimizing Plugins (SourceMod Scripting)

From AlliedModders Wiki
Revision as of 14:27, 10 September 2007 by BAILOPAN (talk | contribs)

Introduction

SourceMod is pretty fast, but there are some common assumptions about scripts:

  • You can't possibly make it any faster.
  • It's pre-compiled, so it's already quite fast.
  • Details don't matter, as it's only "scripting" anyway.

None of these are true. The compiler, in fact, is very poor at optimizing, and the JIT is left to do most of the work. You can greatly increase the speed and efficiency of your plugins by keeping a few rules in mind. Remember - it's more important to minimize instructions than it is to minimize lines of code.

Note that many of these optimizations should be taken in context. An admin command probably doesn't need fine-tuned optimizations. But a timer that executes every 0.1 seconds, or a GameFrame hook, should definitely be as fast as possible.

Always Save Results

Observe the example code snippet below:

if (GetClientTeam(player) == 3)
{
    //...code
}
else if (GetClientTeam(player) == 2)
{
    //...code
}
else if (GetClientTeam(player) == 1)
{
    //...code
}

This is a mild example of "cache your results". When the compiler generates assembly for this code, it will (in pseudo code) generate:

  CALL GetClientTeam
  COMPARE+JUMP
  CALL GetClientTeam
  COMPARE+JUMP
  CALL GetClientTeam
  COMPARE+JUMP

Notice the problem? We have called GetClientTeam two more times than necessary. The result doesn't change, so we can save it. Observe:

new team = GetClientTeam(player);
if (team == 3)
{
    //...code
}
else if (team == 2)
{
    //...code
}
else if (team == 1)
{
    //...code
}

Now, the compiler will only generate this:

  CALL GetClientTeam
  COMPARE+JUMP
  COMPARE+JUMP
  COMPARE+JUMP

If GetClientTeam were a more expensive operation (it's relatively cheap), the original code would have recalculated the entire result on each branch of the if chain, wasting CPU cycles.

Similarly, this type of code is usually not a good idea:

for (new i = 0; i < strlen(string); i++) ....

Better code is:

new len = strlen(string);
for (new i = 0; i < len; i++)

Similarly, you may not want to put GetMaxClients() in the condition of a loop over players. While it is a very cheap function, each call still incurs the cost of a native call, which might be significant in highly performance-sensitive code.
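
As a minimal sketch of the same idea applied to a player loop, the value can be read once before the loop starts. The helper name CountClientsInGame and the variable max_clients are hypothetical, chosen only for illustration:

```sourcepawn
#include <sourcemod>

// Hypothetical helper: counts the clients that are currently in game.
stock CountClientsInGame()
{
    new count = 0;

    // One native call, made once before the loop instead of on every iteration.
    new max_clients = GetMaxClients();

    for (new i = 1; i <= max_clients; i++)
    {
        if (IsClientInGame(i))
        {
            count++;
        }
    }

    return count;
}
```

The loop condition now compares against a local variable, so the only native call left inside the loop is IsClientInGame.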


Switch instead of If

If you can, you should use switch cases instead of if. This is because for an if statement, the compiler must branch to each consecutive if case. Using the example from above, observe the switch version:

new team = GetClientTeam(player);
switch (team)
{
  case 3:
     //code...
  case 2:
     //code...
  case 1:
     //code...
}

This will generate what's called a "case table". Rather than worm through displaced if tests, the compiler generates a table of possible values. The JIT is smart enough to optimize this even further, and the best case is:

  JUMP Table[CALL GetClientTeam]

If your switch cases are listed in perfectly sequential order (that is, skipping no numbers, either ascending or descending), the JIT can make the best optimizations. For example, "7,8,9" and "2,1,0" are examples of perfect switch cases. "1,3,4" is not.
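
A fuller version of the switch might look like the sketch below, with the cases written in sequential order and no gaps. The helper name PrintTeamLabel and the printed messages are hypothetical, and the meaning of team indexes 1 through 3 depends on the game:

```sourcepawn
#include <sourcemod>

// Hypothetical helper: prints a line describing the client's team.
stock PrintTeamLabel(client)
{
    // Cases 1, 2, 3 are listed in ascending order and skip no numbers.
    switch (GetClientTeam(client))
    {
        case 1:
        {
            PrintToServer("Client %d is on team 1", client);
        }
        case 2:
        {
            PrintToServer("Client %d is on team 2", client);
        }
        case 3:
        {
            PrintToServer("Client %d is on team 3", client);
        }
        default:
        {
            PrintToServer("Client %d has no team", client);
        }
    }
}
```

Unlike C, Pawn switch cases do not fall through, so no break statement is needed after each case.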


Don't Re-index Arrays

A common practice in Pawn is to "save space" by re-indexing arrays. There are a few myths behind this, such as saving memory, assuming the compiler does it for you, or readability. Fact: none of these are true. Observe the code below.

SomeFunction(const clients[], num_clients)
{
   for (new i = 0; i < num_clients; i++)
   {
      if (!IsClientInGame(clients[i]))
      {
         continue;
      }
      SetSomething(clients[i], GetSomething(clients[i]) + 1);
   }
}

For this, the compiler generates code similar to:

:LOOP_BEGIN
   LOAD i
   LOAD clients
   CALC
   LOAD clients[i]
   CALL IsClientInGame
   LOAD i
   LOAD clients
   CALC
   LOAD clients[i]
   CALL GetSomething
   LOAD i
   LOAD clients
   CALC
   LOAD clients[i]
   CALL SetSomething

See what happened? The compiler does not cache array indexing. Because we've used clients[i] each time, every instance generates 4-6 (or more) instructions which load i and the address of clients, compute the final location, and then grab the data out of memory. It is much faster to do:

SomeFunction(const clients[], num_clients)
{
   new client;
   for (new i = 0; i < num_clients; i++)
   {
      client = clients[i];
      if (!IsClientInGame(client))
      {
         continue;
      }
      SetSomething(client, GetSomething(client) + 1);
   }
}

Not only is this more readable, but look at how much cruft we've shaved off the compiler's generated code:

:LOOP_BEGIN
   LOAD i
   LOAD clients
   CALC
   LOAD clients[i]
   STORE client
   CALL IsClientInGame
   LOAD client
   CALL GetSomething
   LOAD client
   CALL SetSomething

In a large loop you can drastically reduce code size in this manner.

Decl on Local Arrays

There is a small caveat to the 'new' statement in Pawn; it automatically writes a zero for every byte in the data structure. For example:

new String:elephant[512];

If placed inside local scope, all 512 bytes of the variable will be cleared every time the variable is created. In a performance-sensitive callback, this could be disastrous. Even something as innocuous as:

new temp_players[MAX_PLAYERS]

Could be a significant slow-down if called too often. To solve this, SourceMod has a special replacement for new called decl. Unlike new, decl does not bother setting the entire array or structure to zero. Thus, the data will be filled with random garbage.

While this is much faster, you must be careful to initialize the data before you use it. Observe the following code:

decl String:my_string[256];
 
if (Something())
{
   Format(my_string, sizeof(my_string), "clam");
}
 
PrintToChat(client, "%s", my_string);

Notice the bug? If Something() returned false, my_string would still have garbage in it. Thus, you must always take care that you are not accidentally reading from uninitialized data. Once that happens, you will get undefined behavior and possibly even crashes.

A common practice is to short-initialize strings. For example:

decl String:my_string[256];
my_string[0] = '\0';

This sets the first byte of the string to zero, which makes sure the string will be read as a valid, but empty, string.
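
Applied to the buggy snippet above, short-initialization could look like the following sketch. Something() is the placeholder from the earlier example, and its body here is a hypothetical stub so that the code compiles. Placing the code in the OnClientPutInServer forward is likewise only an assumption for illustration:

```sourcepawn
#include <sourcemod>

// Hypothetical stand-in for the Something() check used in the example above.
bool:Something()
{
    return GetRandomInt(0, 1) == 1;
}

public OnClientPutInServer(client)
{
    decl String:my_string[256];

    // Terminate the string right away, so it reads as empty
    // even if the branch below is skipped.
    my_string[0] = '\0';

    if (Something())
    {
        Format(my_string, sizeof(my_string), "clam");
    }

    // Prints either "clam" or an empty string, never garbage.
    PrintToChat(client, "%s", my_string);
}
```

Only one byte is written up front, so most of the savings over new are kept while the string stays safe to read.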

Note that decl will work on any local variable type (array, string, float, integer, et cetera). The one thing it cannot be used for is initialization. This is invalid:

decl var = 5

Since the purpose of decl is to avoid initialization, such a line would have no meaning, and thus it is invalid syntax.


Avoid Large KeyValues

KeyValues is an n-ary structure using linked lists. This type of structure is extremely expensive to allocate and traverse. While it might be suitable for tiny pieces of information (that is, under 10KB of data or so), its complexity growth is very poor.

If you load KeyValues data, you should make an effort to, at the very least, cache its Handle so you don't need to reparse the file every time. Caching its contents on an as-needed basis would be a bonus as well.
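
One way to cache the Handle is to parse the file once when the plugin starts and keep the result in a global. This is a sketch under stated assumptions: the file name configs/myplugin.cfg, the section name "Settings", the global g_hSettings, and the helper GetCachedNum are all hypothetical:

```sourcepawn
#include <sourcemod>

// Hypothetical global holding the parsed file for the plugin's lifetime.
new Handle:g_hSettings = INVALID_HANDLE;

public OnPluginStart()
{
    decl String:path[PLATFORM_MAX_PATH];
    BuildPath(Path_SM, path, sizeof(path), "configs/myplugin.cfg");

    // Parse the file once, here, instead of in every callback.
    g_hSettings = CreateKeyValues("Settings");
    if (!FileToKeyValues(g_hSettings, path))
    {
        CloseHandle(g_hSettings);
        g_hSettings = INVALID_HANDLE;
    }
}

// Hypothetical helper: reads a top-level integer from the cached structure.
stock GetCachedNum(const String:key[], defvalue = 0)
{
    if (g_hSettings == INVALID_HANDLE)
    {
        return defvalue;
    }

    // Return to the root section before looking the key up.
    KvRewind(g_hSettings);
    return KvGetNum(g_hSettings, key, defvalue);
}
```

Callbacks can then call GetCachedNum without touching the disk. Values that are read very often could additionally be copied into plain variables.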

If you're trying to use a KeyValues file with thousands of entries and updating/loading it on events such as player connections or disconnections, you will find that the structure becomes unmanageably slow as it grows. If that's the case, you should consider moving to something like SQLite or MySQL.


Use Stock/Public Correctly

Using stock and public correctly won't necessarily optimize your plugin, but doing so will help your code be more maintainable and will eliminate extra exports/data being written to your plugin file.

Stock

A stock function is compiled, but only written to the plugin's binary if it's used. Generally, you should use the stock keyword if:

  • You may use the function in the future, and don't want the compiler telling you it's never used;
  • You are writing an include file and don't want the function compiled if it's never used.

Public

A public function is exported externally. Every plugin binary has a list of all its public functions, and Core uses this table whenever it needs to find a matching forward.

Generally, there are only two reasons you should ever use public:

  • You are implementing a forward.
  • You are implementing a callback to another function, and it requires you to use public.

It seems common for users to randomly add public to completely private functions -- not only is that unnecessary, but it also adds to the amount of work Core has to do to find functions in your plugin.
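
The three kinds of function can be contrasted in a short sketch. OnClientPutInServer is a real forward. GreetClient and IsPlayingTeam are hypothetical helpers invented for illustration, as is the assumption that teams 2 and 3 are the playing teams:

```sourcepawn
#include <sourcemod>

// Implements a forward, so it must be public for Core to find it.
public OnClientPutInServer(client)
{
    GreetClient(client);
}

// Private helper, called only from this plugin: no public keyword needed.
GreetClient(client)
{
    PrintToChat(client, "Welcome to the server!");
}

// Hypothetical utility that is not called anywhere yet. Because it is
// marked stock, it is left out of the binary and raises no "never used" warning.
stock bool:IsPlayingTeam(team)
{
    return (team == 2 || team == 3);
}
```

Only OnClientPutInServer ends up in the plugin's public table here.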


Conclusion

Although optimization is important, you should always keep context in mind. Don't replace every single new in your plugin with decl just because it might be faster. Identify the areas of your plugin where optimization is significant, and tweak from there.