SourcePawn Transitional Syntax

From AlliedModders Wiki
Revision as of 17:21, 8 November 2014 by BAILOPAN

We would like to give our users a more modern language. Pawn is showing its age: manual memory management, buffers, tags, and the lack of an object-oriented API are very frustrating. We can't solve everything all at once, but we can begin to take steps in the right direction.

SourceMod 1.7 introduces a Transitional API. It is built on a new Transitional Syntax in SourcePawn, which is a set of language tools to make Pawn feel more modern. In particular, it allows developers to use older APIs in an object-oriented manner, without breaking compatibility. Someday, if and when SourcePawn can become a full-fledged modern language, the transitional API will make porting efforts minimal.

The transitional API has the following major features and changes:

  • New Declarators - A cleaner way of declaring variables, similar to Java and C#.
  • Methodmaps - Object-oriented wrappers around older APIs.
  • Real Types - SourcePawn now has "int", "float", "bool", "void", and "char" as real types.
  • null - A new, general keyword to replace INVALID_HANDLE.
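
As a rough illustration of the last item, a handle check written with the new keyword might look like the sketch below. The names g_Cache and ResetCache are hypothetical and exist only for this example; CloseHandle is the legacy native discussed later on this page.

Handle g_Cache = null;        // old style: new Handle:g_Cache = INVALID_HANDLE;

void ResetCache()
{
  // null takes the place of INVALID_HANDLE in comparisons and assignments.
  if (g_Cache != null) {
    CloseHandle(g_Cache);
    g_Cache = null;
  }
}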

New Declarators

Developers familiar with Pawn will recognize its original declaration style:

new Float:x = 5.0;
new y = 7;

In the transitional syntax, this can be rewritten as:

float x = 5.0;
int y = 7;

The following builtin tags now have types:

  • Float: is float.
  • bool: is bool.
  • _: (or no tag) is int.
  • String: is char.
  • void can now be used as a return type for functions.

Rationale

In the old style, tagged variables are not real types. Float:x does not indicate a float-typed variable; it indicates a 32-bit "cell" tagged as a float. It is possible to remove or change the tag, which, while flexible, is dangerous and confusing. The syntax itself is also problematic. The parser does not have the ability to identify characters in between the tag name and the colon. Internally, the compiler cannot represent values that are not exactly 32-bit.

The takeaway message is: there is no sensible way to represent a concept like "int64" or "X is a type that represents an array of floats". The tagging grammar makes it too awkward, and the compiler itself is incapable of attaching such information to a tag. We can't fix that right away, but we can begin to deprecate the tag system via a more normal declaration syntax.

An easy way to see why this is necessary is to try expressing something like the following with tags:

native float[3] GetEntOrigin();

Note on the String tag: The introduction of char might initially seem confusing. The reason for this is that in the future, we would like to introduce a true "string" type that acts as an object, like strings in most other languages. The existing behavior in Pawn is an array of characters, which is much lower-level. The renaming clarifies what the type really is, and leaves the door open for better types in the future.

Arrays

The new style of declaration disambiguates between two kinds of arrays. Pawn has indeterminate arrays, where the size is not known, and determinate arrays, where the size is known. We refer to these as "dynamic" and "fixed-length" arrays, respectively.

A fixed-length array is declared by placing brackets after a variable name. For example:

int CachedStuff[1000];
int PlayerData[MAXPLAYERS + 1] = { 0, ... };
int Weapons[] = { WEAPON_AK47, WEAPON_GLOCK, WEAPON_KNIFE };

In these examples, the array size is fixed. The size is known ahead of time and cannot change. When using brackets in this position, the array size must be specified, either via an explicit size or inferred from an initial value.

A dynamic-length array has the brackets before the variable name, that is, after the type. The most common case is when specifying functions that take an array as input:

native void SetPlayerName(int player, const char[] name);

Here, we are specifying that the length of name is not always known - it could be anything.

Dynamic arrays can also be created in local scopes. For example,

void FindPlayers()
{
  int[] players = new int[MaxClients + 1];
}

This allocates a new array of the given size and places a reference in players. The memory is automatically freed when no longer in use.

It is illegal to initialize a fixed-length array with an indeterminate array, and it is illegal to initialize a dynamic array with a fixed-length array. It is also illegal to specify a fixed size on a dynamic-length array.

For the most part, this does not change existing Pawn semantics. It is simply new syntax intended to clarify the way arrays work.
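
The rules above can be sketched in one place as follows. This is an illustration only, not code from SourceMod: SLOT_COUNT, g_Slots, SumCells and Demo are hypothetical names, and the commented-out lines show declarations that the rules above describe as illegal.

#define SLOT_COUNT 8                  // hypothetical compile-time constant

int g_Slots[SLOT_COUNT];              // fixed-length: brackets after the name

// Dynamic-length parameter: brackets after the type, length passed separately.
int SumCells(const int[] cells, int count)
{
  int total = 0;
  for (int i = 0; i < count; i++) {
    total += cells[i];
  }
  return total;
}

void Demo()
{
  int[] perClient = new int[MaxClients + 1];   // size only known at run time

  int a = SumCells(g_Slots, SLOT_COUNT);       // fixed-length array passed to a dynamic parameter
  int b = SumCells(perClient, MaxClients + 1); // dynamic array passed to the same parameter
  PrintToServer("%d %d", a, b);

  // Per the rules above, these would be rejected:
  //   int bad1[SLOT_COUNT] = perClient;   // fixed-length array initialized from a dynamic array
  //   int[] bad2 = g_Slots;               // dynamic array initialized from a fixed-length array
  //   int[SLOT_COUNT] bad3;               // fixed size on a dynamic-length declaration
}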

Rationale

In the original syntax, there was a subtle difference in array declaration:

new array1[MAXPLAYERS + 1];
new array2[MaxClients + 1];

Here, array1 and array2 are very different types: the first is a fixed-length array whose size, MAXPLAYERS + 1, is a compile-time constant, and the latter is an int[]. However, there is no syntactic difference: the compiler has to deduce whether the size expression is constant to determine the type. The new syntax clearly and explicitly disambiguates these cases, so in the future when we introduce fully dynamic and flexible arrays, we're less likely to break existing code.

This may result in some confusion. Hopefully it won't matter in the long term: once we have true dynamic arrays, fixed arrays will become much less useful and more obscure.

The rationale for restricting initializers is similar. Once we have true dynamic arrays, the restrictions will be lifted. In the meantime, we need to make sure we're limited to semantics that won't have subtle differences in the future.

Examples

float x = 5.0;    // Replacement for Float
int y = 4;        // Replacement for new
char name[32];    // Replacement for String
 
void DoStuff(float x, int y, char[] name, int length) {
  Format(name, length, "%f %d", x, y);
}

Grammar

The new and old declaration grammar is below.

return-type ::= return-old | return-new
return-new ::= type-expr new-dims?        // Note, dims not yet supported.
return-old ::= old-dims? label?

argdecl ::= arg-old | arg-new
arg-new ::= "const"? type-expr '&'? symbol old-dims? ('=' arg-init)?
arg-old ::= "const"? tags? '&'? symbol old-dims? ('=' arg-init)?

vardecl ::= var-old | var-new
var-new ::= var-new-prefix type-expr symbol old-dims?
var-new-prefix ::= "static" | "const"
var-old ::= var-old-prefix tag? symbol old-dims?
var-old-prefix ::= "new" | "decl" | "static" | "const"

global ::= global-old | global-new
global-new ::= storage-class* type-expr symbol old-dims?
global-old ::= storage-class* tag? symbol old-dims?

storage-class ::= "public" | "static" | "const" | "stock"

type-expr ::= (builtin-type | symbol) new-dims?
builtin-type ::= "void"
               | "int"
               | "float"
               | "char"
               | "bool"

tags ::= tag-vector | tag
tag-vector ::= '{' symbol (',' symbol)* '}' ':'
tag ::= label

new-dims ::= ('[' ']')*
old-dims ::= ('[' expr? ']')+

label ::= symbol ':'
symbol ::= [A-Za-z_]([A-Za-z0-9_]*)

Note that arrays can either be declared next to the type (like "int[]"), or after the variable name (like "int x[]"). The latter style is deeply embedded in the Pawn community, so it will continue to work. However, associating the dimensions with the type is another important step forward. First, it regularizes the positioning when a name isn't needed. Second, another long-term goal is eliminating the need for fixed-length, pre-allocated arrays in the API, and those are one of the only compelling reasons to use the existing syntax.

However, it is illegal to have dimensions in both positions.

Also note, there is no equivalent of decl in the new declarator syntax. decl is considered to be dangerous and unnecessary.
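
To make the last point concrete, here is a hedged before-and-after sketch. The two function names are hypothetical; GetClientName and PrintToServer are existing SourceMod natives.

// Old style: decl skips initialization of the buffer.
PrintClientNameOld(client)
{
  decl String:name[64];
  GetClientName(client, name, sizeof(name));
  PrintToServer("%s", name);
}

// New style: there is no decl equivalent, so a plain declaration is used.
void PrintClientNameNew(int client)
{
  char name[64];
  GetClientName(client, name, sizeof(name));
  PrintToServer("%s", name);
}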

Methodmaps

Introduction

Methodmaps are simple: they attach methods onto a tag. For example, here is our legacy API for Handles:

native Handle CloneHandle(Handle handle);
native void CloseHandle(Handle handle);

This is a good example of our legacy API. Using it generally looks something like:

Handle array = CreateAdtArray();
PushArrayCell(array, 4);
CloseHandle(array);

Gross! A methodmap can clean it up by attaching functions to the Handle tag, like this:

methodmap Handle {
    public Clone() = CloneHandle;
    public Close() = CloseHandle;
};

Now, our earlier array code can start to look object-oriented:

Handle array = CreateAdtArray();
PushArrayCell(array, 4);
array.Close();

Inheritance

The Handle system has a "weak" hierarchy. All handles can be passed to CloseHandle, but only AdtArray handles can be passed to functions like PushArrayCell. This hierarchy is not enforced via tags (unfortunately), but instead by run-time checks. Methodmaps allow us to make individual handle types object-oriented, while also moving type-checks into the compiler.

For example, here is a transitional API for arrays:

native AdtArray CreateAdtArray();
 
methodmap AdtArray < Handle {
    public PushCell() = PushArrayCell;
};

Note that CreateAdtArray now returns AdtArray instead of Handle. Normally that would break older code, but since AdtArray inherits from Handle, there is a special rule in the type system that allows coercing an AdtArray to a Handle (but not vice versa).

Now, the API looks much more object-oriented:

AdtArray array = CreateAdtArray();
array.PushCell(4);
array.Close();
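
The coercion rule described above could be exercised like this. It is only a sketch that reuses the declarations from this section; the variable names are arbitrary, and the commented-out lines show uses that the one-way rule does not permit.

AdtArray array = CreateAdtArray();
Handle generic = array;        // allowed: AdtArray coerces to its parent, Handle
generic.Close();               // methods attached to Handle are still available

// AdtArray other = generic;   // not allowed: a Handle does not coerce to AdtArray
// generic.PushCell(4);        // not allowed: PushCell is attached to AdtArray only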

Inline Methods

Methodmaps can declare inline methods and accessors. Inline methods can be either natives or Pawn functions. For example:

methodmap AdtArray {
    public native void PushCell(any value);
};

This example requires that an "AdtArray.PushCell" native exists somewhere in SourceMod. It has a magic initial parameter called "this", so the signature will look something like:

native void AdtArray.PushCell(AdtArray this, any value);

(Of course, this exact signature will not appear in an include file - it's the signature that the C++ implementation should expect, however.)

It's also possible to define new functions without a native:

methodmap AdtArray {
    public native void PushCell(any value);
 
    // Pushes the first `count` entries of `list`, one cell at a time.
    public void PushCells(any[] list, int count) {
        for (int i = 0; i < count; i++) {
            this.PushCell(list[i]);
        }
    }
};

Lastly, we can also define accessors. For example:

methodmap AdtArray {
    property int Size {
        public get() = GetArraySize;
    }
    property bool Empty {
        public get() {
            return this.Size == 0;
        }
    }
    property int Capacity {
        public native get();
    }
};

The first accessor simply assigns an existing function as an accessor for "Size". The second accessor is an inline method with an implicit "this" parameter. The third accessor will bind to a native with the following name and signature:

native int AdtArray.Capacity.get(AdtArray this);

Setters are also supported. For example:

methodmap Player {
    property int Health {
        public native get();
        public native set(int health);
    }
};
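
Reading or assigning a property then looks like ordinary field access. The sketch below assumes the Player methodmap above; HealPlayer is a hypothetical helper, and the setter's native signature in the comment is inferred by analogy with the getter signature shown earlier rather than taken from SourceMod itself.

// Presumed binding for the setter, by analogy with AdtArray.Capacity.get:
//   native void Player.Health.set(Player this, int health);

void HealPlayer(Player player, int amount)
{
  int current = player.Health;        // invokes the getter
  player.Health = current + amount;   // invokes the setter
}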

Custom Tags

Methodmaps don't have to be used with Handles. It is possible to define custom methodmaps on new or existing tags. For example:

methodmap AdminId {
    public int Rights() {
        return GetAdminFlags(this, Access_Effective);
    }
};

Now, for example, it is possible to do:

GetUserAdmin(client).Rights()

Constructors and Destructors

Methodmaps can also define constructors and destructors, which is useful if they are intended to behave like actual objects. For example,

methodmap AdtArray {
    public native AdtArray(int blocksize = 1);
    public native ~AdtArray();
    public native void PushCell(any value);
};

Now AdtArrays can be used in a fully object-oriented style:

AdtArray array = AdtArray();
array.PushCell(10);
array.PushCell(20);
array.PushCell(30);
delete array;

Caveats

There are a few caveats to methodmaps:

  1. CloseHandle() is not yet gone. It is required to call delete on any object that previously would have required CloseHandle().
  2. There can be only one methodmap for a tag.
  3. When using existing natives, the first parameter of the native must coerce to the tag of the methodmap. Tag mismatches of the "this" parameter will result in an error. Not a warning!
  4. Methodmaps can only be defined on tags. Pawn has some ways of creating actual types (like via struct or class). Methodmaps cannot be created on those types.
  5. Methodmaps do not have strong typing. For example, it is still possible to perform "illegal" casts like Float:CreateAdtArray(). This is necessary for backwards compatibility, so methodmap values can flow into natives like PrintToServer or CreateTimer.
  6. It is not possible to inherit from anything other than another previously declared methodmap.
  7. Methodmaps can only be defined over scalars - that is, the "this" parameter can never be an array. This means they cannot be used for enum-structs.
  8. Destructors can only be native. When we are able to achieve garbage collection, destructors will be removed.
  9. The signatures of methodmaps must use the new declaration syntax.
  10. Methodmaps must be declared before they are used.

Grammar

The grammar for methodmaps is:

visibility ::= "public"
method-args ::= arg-new* "..."?

methodmap ::= "methodmap" symbol? { methodmap-item* } term
methodmap-item ::=
           visibility "~"? symbol "(" ")" "=" symbol term
         | visibility "native" type-expr "~"? symbol methodmap-symbol "(" method-args ")" term
         | visibility type-expr symbol "(" method-args ")" func-body term
         | "property" type-expr symbol { property-decl } term
property-func ::= "get" | "set"
property-decl ::= visibility property-impl
property-impl ::=
           "native" property-func "(" ")" term
         | property-func "(" ")" func-body term
         | property-func "(" ")" "=" symbol

Classes

Classes are almost identical to methodmaps, with one very important distinction: they are actual types, rather than an attachment to a tag. In fact, classes form an entirely new type system in Pawn. For example, the following is possible with methodmaps:

native Handle CreateTimer(float interval, Timer func, any data = 0);
 
methodmap AdtArray {
    public AdtArray(blocksize = 1);
    public ~AdtArray();
};
 
public Action CheckPlayers(Handle timer, any data)
{
    ....
}
 
AdtArray array = AdtArray();
CreateTimer(2.0, CheckPlayers, array);

If AdtArray were created via class instead of methodmap, this code would not compile. The reason is that methodmap creates a tag that can coerce to any, whereas class creates a type that coerces only to related types. Those types are AdtArray itself and a special builtin type called Object. The type of a class does not coerce via any: or _:.

Explanation

Why does it work this way? The reason is a technical deficiency in the existing Pawn compiler. When you use "any", Pawn does not tell SourceMod what the exact types of its variables are. SourceMod just sees a bunch of bits. It doesn't know whether it's a float, an integer, an array pointer, a handle, or an object. This prevents SourceMod from implementing garbage collection* - SourceMod can't automatically free memory if it can't tell where all its objects are.

Ultimately, we want garbage collection, but it's a very big and complex task. In the meantime, it's important that the transitional API make garbage collection a future possibility. To this end, "class"-based objects do not have observable values. They cannot be passed into variadic functions or functions that would coerce them into an unrelated or non-object type - including the any type. Eventually, we will introduce a new variant of any that is safe to use with objects.

*Well, decent garbage collection. We could implement "conservative" GC, but it's very unappealing.

Typedefs

Function tags and function enums have been deprecated in favor of a more modern syntax. Currently, they can still only create tag names for functions. Future versions will support arbitrary types. The syntax is:

typedef ::= "typedef" symbol "=" full-type-expr term
full-type-expr ::= "(" type-expr ")"
                 | type-expr
type-expr ::= "function" type-name "(" typedef-args? ")"
typedef-args ::= "..."
               | typedef-arg (", " "...")?

Note that typedefs only support new-style types. For example,

functag public Action:SrvCmd(args);

Becomes...

typedef SrvCmd = function Action (int args);
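
A callback matching this typedef might be written and registered as follows. This is a sketch: Command_Test and the command name sm_test are made up for illustration, while RegServerCmd, PrintToServer and Plugin_Handled are existing SourceMod names.

// Matches "function Action (int args)": returns Action, takes one int.
public Action Command_Test(int args)
{
  PrintToServer("sm_test called with %d argument(s)", args);
  return Plugin_Handled;
}

public void OnPluginStart()
{
  RegServerCmd("sm_test", Command_Test);
}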