Difference between revisions of "Optimizing Plugins (AMX Mod X Scripting)"
(→Compiler Optimizations: added about arrays and crap) |
(→Compiler Optimizations) |
||
Line 166: | Line 166: | ||
}</pawn> | }</pawn> | ||
This has the effect of safely making the string empty beforehand, as well as largely reducing a lot of run-time overhead. | This has the effect of safely making the string empty beforehand, as well as largely reducing a lot of run-time overhead. | ||
+ | |||
+ | |||
+ | ===For Loop Comparisons=== | ||
+ | A common mistake is to write code like this: | ||
+ | <pawn>new string[256] = "something long" | ||
+ | for (new i=0; i<strlen(string); i++) | ||
+ | //...code</pawn> | ||
+ | This plays off a similar principle from before: cache results. The compiler will actually recompute your string length on each iteration of the loop. This will have even worse effects if your string changes mid-loop. A more sensible method is: | ||
+ | <pawn>new string[256] = "something long" | ||
+ | new len = strlen(string) | ||
+ | for (new i=0; i<len; i++) | ||
+ | //...code</pawn> | ||
+ | |||
[[Category:Scripting (AMX Mod X)]] | [[Category:Scripting (AMX Mod X)]] |
Revision as of 19:21, 30 January 2006
Contents
Introduction
Admin-Mod and AMX Mod X became very popular because of their easy to use scripting language. However, the words "scripting language" come with a lot of loaded preconceptions. Most people assume that because it's scripted:
- You can't possible make it any faster
- It's compiled, so it's already quite fast
- Details don't matter, as it's only "scripting" anyway
Especially, with Pawn (formerly Small), none of these are true. The compiler, in fact, is very poor at optimizing, and you can greatly increase the speed and efficiency of your plugins by keeping a few rules in mind. Remember - it's more important to minimize instructions than it is to minimize lines of code.
Terms
To read this document, you will need to understand a few concepts beforehand:
- BRANCHING - When the processor takes a different path of code. For example, to call a function or to use an if statement, the processor will "branch". Modern processors attempt to predict pathways with "branch prediction", but it's best to avoid branching a lot if possible.
- STACK ALLOCATION - In Pawn, all local data is stored on the stack, a big chunk of continuous memory. Whenever you create a variable on the stack, it is automatically written with zeroes.
- HEAP ALLOCATION - In Pawn, temporary data that needs to be referenced by a native is stored on the heap, another area of contiguous, but less restrictive memory.
- DATA SECTION - This is an area of memory built into your .amxx file. In fact, it "becomes" the heap at load time. All your strings and arrays are hardcoded into this area.
- EXPENSIVENESS - To be "expensive" in computer science means an operation requires a lot of CPU processing. It usually does not refer to memory size, only to processing cycles and time. Addition is inexpensive, floating power operations are expensive. However, both are inexpensive in comparison to writing a file. An inexpensive operation can also be called "cheap".
- BIG-OH NOTATION - O(*) notation refers to the expensiveness of an algorithm. If something is O(n), it occurs in linear time -- meaning that for N items, it will complete relative to N. O(N^2) means with N items, it will complete relative to N^2. O(1) means "constant time" - no matter what N is, it will run in the same amount of time.
Compiler Optimizations
These optimizations have to do with changing how your code is compiled. While the syntax may remain the same, you are not only increasing compile time, but increasing your plugin's efficiency and speed at run time.
Always Save Results
Observe the example code snippet below:
if (get_user_team(player) == TEAM_T) { //...code } else if (get_user_team(player) == TEAM_CT) { //...code } else if (get_user_team(player) == TEAM_SPECTATOR) { //...code }
This is a mild example of "cache your results". When the compiler generates assembly for this code, it will (in pseudo code) generate:
CALL get_user_team COMPARE+BRANCH CALL get_user_team COMPARE+BRANCH CALL get_user_team COMPARE+BRANCH
Notice the problem? We have called get_user_team an extra two times than necessary. The result doesn't change, so we can save it. Observe:
new team = get_user_team(player) if (team == TEAM_T) { //...code } else if (team == TEAM_CT) { //...code } //...etc
Now, the compiler will only generate this:
CALL get_user_team COMPARE+BRANCH COMPARE+BRANCH COMPARE+BRANCH
If get_user_team were an expensive operation (it's relatively cheap), we would have recalculated the entire result each branch of the if case.
Switch instead of If
If you can, you should use switch cases instead of if. This is because for an if statement, the compiler must branch to each consecutive if case. Using the example from above, observe the switch version:
new team = get_user_team(player) switch (team) { case TEAM_T: //code... case TEAM_CT: //code... case TEAM_SPECTATOR: //code... }
This will generate what's called a "case table". Rather than worm through displaced if tests, the compiler generates a table of possible values, which the virtual machine knows to browse through:
CALL get_user_team COMPARE COMPARE COMPARE
Don't Re-index Arrays
A common practive in Small is to "save space" by re-indexing arrays. There are a few myths behind this, such as saving memory, assuming the compiler does it for you, or readability. Fact: none of these are true. Observe the code below:
new players[32], num, team get_players(players, num) for (new i=0; i<num; i++) { if (!is_user_connected(players[i])) continue; team = get_user_team(players[i]) set_user_frags(players[i], 0) }
For this, the compiler generates code similar to:
:LOOP_BEGIN LOAD i LOAD players CALC LOAD players[i] CALL is_user_connected LOAD i LOAD players CALC LOAD players[i] CALL get_user_team LOAD i LOAD players CALC LOAD players[i] CALL set_user_frags
See what happened? The compiler does not cache array indexing. Because we've used players[i] each time, every instance generates 4-6 (or more) instructions which load i, the address of players, computes the final location, and then grabs the data out of memory. It is much faster to do:
new player for (new i=0; i<num; i++) { player = players[i] if (!is_user_connected(player)) continue; team = get_user_team(player) }
Not only is this more readable, but look at how much cruft we've shaved off the compiler's generated code:
:LOOP_BEGIN LOAD i LOAD players CALC LOAD players[i] STORE player CALL is_user_connected LOAD player CALL get_user_team LOAD player CALL set_user_frags
In a large loop you can drastically reduce codesize in this manner.
Global vs Local and Variables in Loops
It is important to realize that every variable in Pawn is automatically zeroed. For global variables, they are static and permanent, thus they are zeroed when your plugin is loaded. Variables in functions, however, must be zeroed dynamically. This is a slow and tedious operation, and you should not only avoid relying on it when necessary, but you should keep that fact in mind when using arrays.
Arrays in Pawn are "cells" of data. Each cell is four or eight bytes, depending on whether your processor is 32bit or 64bit. To create an array of 4096 cells, for example:
new array[4096]
The compiler generates code to manually zero every single one of the 16,384 bytes in that location. Normally, this isn't too bad -- but it can be absolutely deadly in a function which is called quite often. For example, server_frame is called on every server tick in AMX Mod X. To declare an array of that size in server_frame is highly unnecessary. Instead, you can take advantage of the fact that not only are globals free of charge, but server_frame does not need to be re-entrant. That is, you will never call server_frame inside of server_frame, so making the variable global will not bring up the problem of one instance of the function overwriting data from another instance of the same function. Observe:
new g_serverframe_array[4096] public server_frame() { //...code that uses g_serverframe_array }
This will execute conseridably faster. You can do this with smaller arrays too.
Likewise, it is equally important to avoid declaring arrays inside of loops. Consider the following block of code:
for (new i=0; i<num; i++) { new message[255], name[32], player player = players[i] get_user_name(player, name, 31) format(message, 254, "Hello, %s", name) }
If there are 32 players, this loop will actually resize and zero out over 1K of memory thirty two times in a row. Not good! However, on the other hand, it's nice that the message is zeroed out for us each time. Tip: you often only need to zero out the first character of a string. This will make the entire string empty. The code below is much more efficient:
new message[255], name[32], player for (new i=0; i<num; i++) { player = players[i] name[0] = '\0' message[0] = '\0' get_user_name(player, name, 31) format(message, 254, "Hello, %s", name) }
This has the effect of safely making the string empty beforehand, as well as largely reducing a lot of run-time overhead.
For Loop Comparisons
A common mistake is to write code like this:
new string[256] = "something long" for (new i=0; i<strlen(string); i++) //...code
This plays off a similar principle from before: cache results. The compiler will actually recompute your string length on each iteration of the loop. This will have even worse effects if your string changes mid-loop. A more sensible method is:
new string[256] = "something long" new len = strlen(string) for (new i=0; i<len; i++) //...code