Optimizing Plugins (AMX Mod X Scripting)

From AlliedModders Wiki
Revision as of 20:00, 30 January 2006 by BAILOPAN (talk | contribs) (initial import)
(diff) ← Older revision | Latest revision (diff) | Newer revision → (diff)
Jump to: navigation, search

Introduction

Admin-Mod and AMX Mod X became very popular because of their easy to use scripting language. However, the words "scripting language" come with a lot of loaded preconceptions. Most people assume that because it's scripted:

  • You can't possible make it any faster
  • It's compiled, so it's already quite fast
  • Details don't matter, as it's only "scripting" anyway

Especially, with Pawn (formerly Small), none of these are true. The compiler, in fact, is very poor at optimizing, and you can greatly increase the speed and efficiency of your plugins by keeping a few rules in mind. Remember - it's more important to minimize instructions than it is to minimize lines of code.

Terms

To read this document, you will need to understand a few concepts beforehand:

  • BRANCHING - When the processor takes a different path of code. For example, to call a function or to use an if statement, the processor will "branch". Modern processors attempt to predict pathways with "branch prediction", but it's best to avoid branching a lot if possible.
  • STACK ALLOCATION - In Pawn, all local data is stored on the stack, a big chunk of continuous memory. Whenever you create a variable on the stack, it is automatically written with zeroes.
  • HEAP ALLOCATION - In Pawn, temporary data that needs to be referenced by a native is stored on the heap, another area of contiguous, but less restrictive memory.
  • DATA SECTION - This is an area of memory built into your .amxx file. In fact, it "becomes" the heap at load time. All your strings and arrays are hardcoded into this area.
  • EXPENSIVENESS - To be "expensive" in computer science means an operation requires a lot of CPU processing. It usually does not refer to memory size, only to processing cycles and time. Addition is inexpensive, floating power operations are expensive. However, both are inexpensive in comparison to writing a file. An inexpensive operation can also be called "cheap".
  • BIG-OH NOTATION - O(*) notation refers to the expensiveness of an algorithm. If something is O(n), it occurs in linear time -- meaning that for N items, it will complete relative to N. O(N^2) means with N items, it will complete relative to N^2. O(1) means "constant time" - no matter what N is, it will run in the same amount of time.


Compiler Optimizations

These optimizations have to do with changing how your code is compiled. While the syntax may remain the same, you are not only increasing compile time, but increasing your plugin's efficiency and speed at run time.

Always Save Results

Observe the example code snippet below:

if (get_user_team(player) == TEAM_T)
{
    //...code
} else if (get_user_team(player) == TEAM_CT) {
    //...code
} else if (get_user_team(player) == TEAM_SPECTATOR) {
    //...code
}

This is a mild example of "cache your results". When the compiler generates assembly for this code, it will (in pseudo code) generate:

  CALL get_user_team
  COMPARE+BRANCH
  CALL get_user_team
  COMPARE+BRANCH
  CALL get_user_team
  COMPARE+BRANCH

Notice the problem? We have called get_user_team an extra two times than necessary. The result doesn't change, so we can save it. Observe:

new team = get_user_team(player)
if (team == TEAM_T)
{
    //...code
} else if (team == TEAM_CT) {
    //...code
} //...etc

Now, the compiler will only generate this:

  CALL get_user_team
  COMPARE+BRANCH
  COMPARE+BRANCH
  COMPARE+BRANCH

If get_user_team were an expensive operation (it's relatively cheap), we would have recalculated the entire result each branch of the if case.


Switch instead of If

If you can, you should use switch cases instead of if. This is because for an if statement, the compiler must branch to each consecutive if case. Using the example from above, observe the switch version:

new team = get_user_team(player)
switch (team)
{
  case TEAM_T:
     //code...
  case TEAM_CT:
     //code...
  case TEAM_SPECTATOR:
     //code...
}

This will generate what's called a "case table". Rather than worm through displaced if tests, the compiler generates a table of possible values, which the virtual machine knows to browse through:

  CALL get_user_team
  COMPARE
  COMPARE
  COMPARE


Don't Re-index Arrays

A common practive in Small is to "save space" by re-indexing arrays. There are a few myths behind this, such as saving memory, assuming the compiler does it for you, or readability. Fact: none of these are true. Observe the code below:

new players[32], num, team
get_players(players, num)
for (new i=0; i<num; i++)
{
   if (!is_user_connected(players[i]))
      continue;
   team = get_user_team(players[i])
   set_user_frags(players[i], 0)
}

For this, the compiler generates code similar to:

:LOOP_BEGIN
   LOAD i
   LOAD players
   CALC
   LOAD players[i]
   CALL is_user_connected
   LOAD i
   LOAD players
   CALC
   LOAD players[i]
   CALL get_user_team
   LOAD i
   LOAD players
   CALC
   LOAD players[i]
   CALL set_user_frags

See what happened? The compiler does not cache array indexing. Because we've used players[i] each time, every instance generates 4-6 (or more) instructions which load i, the address of players, computes the final location, and then grabs the data out of memory. It is much faster to do:

new player
for (new i=0; i<num; i++)
{
   player = players[i]
   if (!is_user_connected(player))
      continue;
   team = get_user_team(player)
}

Not only is this more readable, but look at how much cruft we've shaved off the compiler's generated code:

:LOOP_BEGIN
   LOAD i
   LOAD players
   CALC
   LOAD players[i]
   STORE player
   CALL is_user_connected
   LOAD player
   CALL get_user_team
   LOAD player
   CALL set_user_frags

In a large loop you can drastically reduce codesize in this manner.