# 1.1 Program structure

A Felix program consists of a nominated root parse unit and the transitive closure of units with respect to inclusion.

The behaviour of this system consists of the action of the initialisation code in each unit, performed in sequence within a given unit, with the order of action between units unspecified.

## 1.1.1 Parse Unit

A parse unit is a file augmented by prefixing specified import files to the front. These consist of a suite of grammar files defining the syntax and other files defining macros.

By convention import files have the extension .flxh.

With this augmentation all parse units in a program are independently parsed to produce an list of statements represented as abstract syntax, denoted an AST (even though it is a list of trees, not a single tree).

## 1.1.2 AST

The program consists of the concatenation of the ASTs of each parse unit, resulting in a single AST, which is then translated to a C++ translation unit by the compiler.

# 1.2 Grammar

The Felix grammar is part of the library. It is notionally prefixed to each file to be processed prior to any import files to specify the syntax with which the file is to be parsed and translated to an AST.

The grammar uses an augmented BNF like syntax with parse actions specified in R5RS Scheme.

The resulting S-expressions are translated to an intermediate form and then into an internal AST structure.

The parser is a scannerless GLR parser with significant extensions.

## 1.2.1 Grammar syntax

Not written yet. Browse the grammar directory for examples.

# 1.3 Modules

Every Felix program is encapsulated in a module with the name being a mangle of the basename of the root unit. The mangling replaces characters in the filename with other characters so that the module name is a valid ISO C identifier.

## 1.3.1 Special procedure flx_main

A program module may contain at most one top level procedure named flx_main. After initialisation code suspends or terminates, this procedure is invoked if it exists. It is the analogue of main in C++ however it is rarely used: side-effects of the root units initialisation code are typically used instead.

## 1.3.2 Libraries

In Felix a library is a root unit together with its transitive closure with respect to inclusion, which does not contain a top level flx_main.

A program unit can be augmented by a set of libraries which are then considered as if included, but without an include directive being present.

# 1.4 Lexicology

All Felix files are considered to be UTF-8 encoded Unicode.

Felix uses a scannerless parser, there are no keywords.

## 1.4.1 Identifiers

A plain identifier starts with a letter or underscore, then consists of a sequence of letters, digits, apostrophy, has no more than one apostrophy or dash in a row, except at the end no dash is allowed, and any number of apostrophies.

  Ab_cd1  a' b-x


Identifies starting with underscore are reserved for the implementation.

A letter may be any Unicode character designated for use in an identifier by the ISO C++ standard. In practice, all high bit set octets are allowed.

A TeX identifier starts with a slosh and is followed by a sequence of letters.

## 1.4.2 Integer Literals

An plain integer literal consists of a sequence of digits, optionally separated by underscores. Each separating underscore must be between digits.

A prefixed integer literal is a plain integer literal or a plain integer literal prefixed by a radix specifier. The radix specifier is a zero followed by one of the letters bBoOdDxX for binary, octal, decimal or hex.

An underscore is permitted after the prefix.

The radix is the one specified by the prefix or decimal by default.

The digits of an integer consist of those permitted by the radix: 01 for binary, 01234567 for octal, 0123456789 for decimal, 0123456789abcdefABCDEF for hex.

Note there are no negative integer literals.

A type suffix may be added to the end of a prefixed integer to designate a literal of a particular integer type, it has the form of an upper or lower case letter or pair of letters usually combined with a prefix or suffix u or U to designate an unsigned variant of the type.

Signed integers are expected to be two's complement with one more negative value that positive value. Bitwise and, or, exclusive or, and complement operations do not apply with signed types.

The effect of overflow on signed types is unspecified.

Unsigned types use the standard representation. Bitwise operations may be applied to unsigned types. Basic arithmetic operations on unsigned types are all well defined as the result of the operation mathematically modulo the maximum value of the type plus one.

The maximum value of an unsigned type is one less than two raised to the power of the number of bits in the type. The number of bits is 8, 16, 32, or 64 or 128 for all unsigned types.

Note that integers starting with 0 are decimal not octal as in C.

A table of suffices and the types they signify follows in lower case.

Suffix  Type      C type              Description
-------------------------------------------------------------------
i8      int8      int8_t              8 bit signed integer
i16     int16     int16_t             16 bit signed integer
i32     int32     int32_t             32 bit signed integer
i64     int64     int64_t             64 bit signed integer

u8      uint8     uint8_t             8 bit unsigned integer
u16     uint16    uint16_t            16 bit unsigned integer
u32     uint32    uint32_t            32 bit unsigned integer
u64     uint64    uint64_t            64 bit unsigned integer

t       tiny      signed char         C++ signed char used an integer
s       short     short               C short
i       int       int                 C int
l       long      long                C long
ll v    vlong     long long           very long: C long long

ut tu   utiny     unsigned char       unsigned tiny: C++ unsigned char used as an integer
us su   ushort    unsigned short      C unsigned short
u       uint      unsigned            C unsigned int
ul lu   ulong     unsigned long       C unsigned long
ull uv  uvlong    unsigned long long  C unsigned longlong

uz zu   size      size_t              array size
j       intmax    intmax_t            largest integer type
uj ju   uintmax   uintmax_t           largest unsigned integer type
p       intptr    intptr_t            pointer considered as an integer
up pu   uintptr   uintptr_t           pointer considered as an unsigned integer
d       ptrdiff   ptrdiff_t           signed distance between pointers
ud      uptrdiff  uptrdiff_t          unsigned distance between pointers


Note that all these types are distinct unlike C and C++. The types designated are not the complete set of available integer like types since not all have literal representations.

Note the suffices do not entirely agree with C.

## 1.4.3 Floating point Literals

Floating point literals follow ISO C89, except that underscores are allowed between digits, and a a digit is required both before and after the decimal point if it is present.

The mantissa may be decimal, or hex, a hex mantissa uses a leading 0x or 0X prefix optionally followed by an underscore.

The exponent may designate a power of 10 using E or e, or a power of 2, using P or p.

A suffix may be F,f,D,d, L or l, designating floating type, double precision floating type, or long double precision floating type.

  123.4
123_456.78
12.6E-5L
0xAf.bE6f
12.7p35


## 1.4.4 String literals

Generaly we follow Python here. Felix allows strings to be delimited by; single quotes ', double quotes ", triped single quotes ''' or tripled double quotes """.

The single quote forms must be on a single line.

The triple quoted forms may span lines, and include embedded newline characters.

These forms all allows embedded escape codes. These are:

\a  -  7 : bell
\b  -  8 : backspace
\t  -  9 : horizontal tab
\n  - 10 : linefeed, newline
\r  - 13 : carriage return
\v  - 11 : vertical tab
\f  - 12 :form feed
\e  - 27 : escape
\\  - \  : slosh
\"  - "  : double quote
\'  - '  : single quote
\   - 32 : space
\o7 \o77 \o777 -- octal character code (stops on count of 3 or non-octal character)
\d9 \d99 \d999 -- decimal character code (stops on count of 3 or non-decimal character)
\uFFFF - utf8 encoding of specified hex value
\UFFFFFFFF - utf8 encoding of specified hex value


### 1.4.4.1 Raw strings

A prefix "r" or "R" on a double quoted string or triple double quoted string suppresses escape processing,

this is called a raw string literal. NOTE: single quoted string cannot be used!

### 1.4.4.2 Null terminated strings

A prefix of "c" or "C" specifies a C NTBS (Nul terminated byte string) be generated instead of a C++ string. Such a string has type +char rather than string.

done


## 1.5.3 Constant folding and conditional compilation

Felix provides two core kinds of constant folding: folding of arithmetic, boolean, and string values, and deletion of code, either statements or expressions, which would become unreachable due to certain value of conditionals.

Basic operations on integer literals, namely addition, subtraction, negation, multiplication, division, and remainder are folded.

Strings are concatenated.

Boolean and, or, exclusive or, and negation, are evaluated.

False branches of if/then/else/endif expression and match expressions are eliminated.

False branches of if/do/elif/else/done are also eliminated.

By this mechanism of constant folding and elimination, Felix provides conditional compilation without the need for special constructions.

# 1.6 General lookup

By default Felix looks up symbols in nested scopes, starting with all symbols in the current scope and proceeding through its containing scope outwards until the outermost scope is reached.

Symbols are visible in the whole of a scope, both before and after their introduction.

A symbol lookup may properly find either a single non-function symbol, which is final, or a set of function symbols.

If the kind of symbol being sought is a function symbol, overload resolution is performed on the set of function signatures found in a scope. If a best match is found, that is final. If no match is found the search continues in the next outermost scope.

All other cases are in error.

# 1.7 Classes

The top level Felix module can contain submodules which are specified by a non-polymorphic class statement:

  class classname { ... }


The effect is to produce a qualified name to be used outside the class:

  class classname { proc f () {} }
classname::f ();


Classes may be nested.

A class may contain private definitions:

  class X {
private var a = 1;
}
// X::a will fail, since a is private to the class X


A private definition is visible within the scope of the class but not outside it.

A class must be specified within a single file.

Classes are not extensible, a definition of a class with the same name in the same scope is not permitted.

The body of a class forms a nested scope. Within a class all symbols defined in the class are visible, along with all those visible in the enclosing context.

The reserved name root may be used as a prefix for the top level module:

  var x = 1;
class A { var x = root::x; }


# 1.8 Lookup control directives

## 1.8.1 Open directive

The simple open directive may be used to make the symbols defined in a class visible in the scope containing the open directive.

  class X { var x = 1; }
open X;
println$x;  Names made visible by an open directive live in a weak scope under the current scope. Names in the weak scope may be hidden by definitions in the current scope without error.  class X { var x = 1; } open X; var x = 2; println$ x; // prints 2


The open directive is not transitive. The names it makes visible are only visible in the scope in which the open directive is written.

## 1.8.2 Inherit directive

The inherit directive allows all of the public symbols of a class to be included in another scope as if they were defined in that scope. This means such names inherited into a class can be accessed by qualification with the inheriting class name, and will be visible if that class is opened.

Inheriting is transtitive.

If a name is inherited it will clash with a local definition.

  class A { var a = 1; }
class B { inherit A; }
println$B::a;  ## 1.8.3 Rename directive This directive is can be used to inherit a single symbol into a scope, possibly with a new name, and also to add an alias for a name in the current scope. When applied to a function name all functions with that name are renamed.  class A { var a = 1; proc f() {} proc f(x:int) {} } class B { rename a = A::a; rename fun f = A::f; }  The new name injected by a rename may be polymorphic:  class A { proc f[T] () {} } class B { rename g[T] = A::f[T]; }  ## 1.8.4 Use directive This is a short form of the rename directive:  class A { var a = 1; } class B { use A::a; use b = A::a; }  It cannot be applied to functions. The first form is equivalent to  use a = A::a;  Unlike the rename directive the new name cannot be polymorphic and is limited to a simple identifier. ## 1.8.5 Export directives The export directives make the exported symbol a root of the symbol graph. The functional export and forces it to be place in the generated code as an extern "C" symbol with the given name:  export fun f of (int) as "myf"; export cfun f of (int) as "myf"; export proc f of (int) as "myf"; export cproc f of (int) as "myf";  Functions are exported by generating a wrapper around the Felix function. If the function is exported as fun or proc the C function generated requires a pointer to the thread frame as the first argument, if the cfun or cproc forms are used, the wrapper will not require the thread frame. In the latter case, the Felix function must not require the thread frame. A type may also be exported:  export type ( mystruct ) as "MyStruct";  This causes a C typedef to be emitted making the name MyStruct an alias to the Felix type. This is useful because Felix types can have unpredictable mangled names. The word export optionally followed by a string may also be used as a prefix for any Felix function, generator, or procedure definition. If the string is omitted is taken as the symbol name. The effect is the same as if an export statement has been written. # 1.9 Variable Definitions A definition is a statement which defines a name, but does no cause any observable behavior, or, a class statement, or, a var or val statement. The latter two exceptions define a name but may also have associated behaviour. ## 1.9.1 The var statement The var statement is used to introduce a variable name and potential executable behaviour. It has one of three basic forms:  var x : int = 1; var y : int; var z = 1;  The first form specifies the type and an initialising expression which must be of the specified type. The second form specifies a variable of the given type without an explicit initialiser, however the variable will be initialised anyhow with the default contructor for the underlying C++ type, although that constructor may be trivial. The third form does not specify the type, it will be deduced from the initialiser. If the initialiser has observable behaviour it will be observed if at all, when control passes through the variable statement. If the variable introduced by the var statement is not used, the variable and its initaliser will be elided and any observable behaviour will be lost. To be used means to have its address taken in a used expression, to occur in a used expression. A used expression is one which initialises a used variable, or, is an argument to function or generator in a used expression, or an argument to a procedure through which control passes. In other words, the variable is used if the behaviour of the program appears to depend on its value or its address. The library procedure C_hack::ignore ensures the compiler believes a variable is used:  var x = expr; C_hack::ignore x;  so that any side effects of expr will be seen. In general the argument to any primitive function, generator or procedure will be considered used if its containing entity is also considered used. In general this means there is a possible execution path from a root procedure of the program. A variable may have its address taken:  var x = 1; var px = &x;  it may be assigned a new value directly or indirectly:  x = 2; px <- 3; *px = 4;  A variable is said to name an object, not a value. This basically means it is associated with the address of a typed storage location. ### 1.9.1.1 Multiple variables Multipls variables can be defined at once:  var m = 1,2; var a,b = 1,2; var c,d = m;  With this syntax, no type annotation may be given. ## 1.9.2 The val statement. A val statement defines a name for an expression.  val x : int = 1; val z = 1;  The value associated with a val symbol may be computed at any time between its definition and its use, and may differ between uses, if the initialising expression depends on variable state, such as a variable or call to a generator. It is not an error to create such a dependence since either the value may, in fact, not change, or the change may not be significant. Nevertheless the user must be warned to take care with the indeterminate evaluation time and use a var when there is any doubt. Since a val simply names an expression, it is associated with a value not an object and cannot be addressed or assigned to. However this does NOT mean its value cannot change:  for var i in 0 upto 9 do val x = i; println$ x;
done


In this example, x isn't mutable but it does take on all the values 0 to 9 in succession. This is just a most obvious case: a less obvious one:

  var i = 0;
val x = i;
println$x; ++i; println$ x;


which is clearly just an expansion of the the first two iteration of the previously given for loop. However in this case there is no assurance x will change after i is incremented because the compiler is free to replace any val definition with a var definition.

### 1.9.2.1 Multiple values

Multipls values can be defined at once:

  val m = 1,2;
val a,b = 1,2;
val c,d = m;


With this syntax, no type annotation may be given.

# 1.10 Functions

## 1.10.1 Functions

A felix function definition takes one of three basic forms:

  fun f (x:int) = { var y = x + x; return y + 1; }
fun g (x:int) => x + x + 1;
fun h : int -> int = | x => x + x + 1;


The first form is the most general, the body of the function contains executable statements and the result is returned by a return statement.

The second form is equivalent to a function in the first form whose body returns the RHS expression.

The third form specifies the function type then the body of a pattern match. It is equivalent to

  fun h (a:int) = { return match a with | x => x + x + 1 endmatch; }


The first two forms also allow the return type to be specified:

  fun f (x:int) : int = { var y = x + x; return y + 1; }
fun g (x:int) :int => x + x + 1;


Functions may not have side effects.

All these function have a type:

  D -> C


where D is the domain and C is the codomain: both would be int in the examples.

A function can be applied by the normal forward notation using juxtaposition or what is whimsically known as operator whitespace, or in reverse notation using operator dot:

  f x
x.f


Such applications are equivalent. Both operators are left associative. Operator dot binds more tightly than whitespace so that

  f x.g    // means
f (g x)


A special notation is used for application to the unit tuple:

  #zero // means
zero ()


The intention is intended to suggest a constant since a pure function with unit argument must always return the same value.

This hash operator binds more tightly than operator dot so

  #a.b // means
(#a).b


## 1.10.2 Pre- and post-conditions

A function using one of the first two forms may have pre-conditions, post-conditions, or both:

  fun f1 (x:int when x > 0) => x + x + 1;
fun f2 (x:int) expect result > 1 => x + x + 1;
fun f3 (x:int when x > 0) expect result > 1 => x + x + 1;
fun f4 (x:int when x > 0) : int expect result > 1 => x + x + 1;


Pre- and pos-conditions are usually treated as boolean assertions which are checked at run time. The compiler may occasionally be able to prove a pre- or post-condition must hold and elide it.

The special identifier result is used to indicate the return value of the function.

## 1.10.3 Higher order functions

A function may be written like

  fun hof (x:int) (y:int) : int = { return x + y; }
fun hof (x:int) (y:int) => x + y;


These are called higher order functions of arity 2. They have the type

  int -> int -> int   // or equivalently
int -> (int -> int) //since -> is right associative.


They are equivalent to

  fun hof (x:int) : int -> int =
{
fun inner (y:int) : int => x + y;
return inner;
}


that is, a function which returns another function.

Such a function can be applied like

  hof 1 2 // or equivalently
(hof 1) 2


since whitespace application is left associative.

## 1.10.4 Procedures

A function which returns control but no value is called a procedure. Procedures may have side effects.

  fun show (x:int) : 0 = { println x; }
proc show (x:int) { println x; }
proc show (x:int) => println x;


The second form is a more convenient notation. The type 0 is also called void and denotes a type with no values.

A procedure may return with a simple return statement:

  proc show (x:int) { println x; return; }


however one is assumed at the end of the procedure body .

Procedures can also have pre- and post-conditions.

A procedure may be called like an application, however it must be a whole statement since expressions of type void may not occur interior to an expression.

  show 1;
1.show;


If a procedure accepts the unit argument, it may be elided:

  proc f () =>  show 1;
f; // equivalent to
f ();


## 1.10.5 Generators

A generator is a special kind of function which is allowed to have side effects. It is defined similarly to a function, but using the binder gen instead of fun:

  var seqno = 1;
gen seq () { var result = seqno; ++seqno; return result; }


When a generator is directly applied in an expression, the application is replaced by a fresh variable and the generator application is lifted out and assigned to the variable. For example:

  fun twice (x:int) => x + x;
println$twice #seq;  will always print an even number because it is equivalent to  var tmp = #seq; println$ twice tmp;


Therefore even if twice is inlined we end up with the argument to println being tmp+tmp and not #seq+#seq which would print an odd number.

### 1.10.5.1 Yielding Generators

A generator may contain a yield statement:

  gen fresh() {
var x = 1;
while x < 10 do
yield x;
++x;
done
return x;
}


In order to use such a yielding generator, a closure of the generator must be stored in a variable. Then the generator may be called repeatedly.

  var g = fresh;
for i in 1 upto 20 do
println$i,#g; done  This will print pairs (1,1), (2,2) up to (10,10) then print (11,10), (12,10) up to (20,10). The yield statement returns a value such that a subsequent call to a closure of the generator will resume execution after the yield statement. Therefore yield is a kind of cross between a return and a subroutine call. If a generator executes a return statement, that is equivalent to yielding a value and setting the resume point back to the return statement, in other words return expr; is equivalent to  while true do yield expr; done  Yielding generators should not be called directly because they will always start at the beginning with a fresh copy of any local variables used to maintain state. Function closures differ from generator closures in that the closures is cloned before every application to ensure that the initial state is fresh. Yielding generators are primarily intended to implement iterators, that is, to provide lazy lists or streams. ## 1.10.6 Constructors Felix provides a special notation which allows an identifier naming a type to return a value of that type:  typedef cart = dcomplex; typedef polar = dcomplex; ctor cart (x:double, y:double) => dcomplex (x,y); ctor polar (r: double, theta: double) => dcomplex (r * sin theta, r * cos theta) ; var z = cart (20.0,15.0) + polar (25.8, 0.7 * pi);  The constructions are equivalent to  fun _ctor_cart (x:double, y:double) : cart => dcomplex (x,y); fun _ctor_polar (r: double, theta: double): polar => dcomplex (r * sin theta, r * cos theta) ;  When a type with a simple name is applied to a value, Felix tries to find a function with that name prefixed by _ctor_ instead. Note that Felix generates a constructor for struct and cstruct types automatically with argument type the product of the types of the structure fields. ## 1.10.7 Special function apply When Felix finds and application  f a  where f is a value of type F which is not a function (or C function) type, Felix looks instead for a function named apply with argument of type:  F * A  where A is the type of a. For example  fun apply (x:string, y:string) => x + y; // concat var x = "hello " "world"; // apply a string to a string  ## 1.10.8 Objects Felix provides an object system with syntax based on Java, and technology based on CLOS. An object is a record of function closures, closed over the local scope of a constructor function that returns the record.  interface person_t { get_name: 1 -> string; set_age: int -> 0; set_job : string -> 0; get_job : 1 -> string; } object person (name:string, var age:int) implements person_t = { var job = "unknown"; method fun get_name () => name; method proc set_age (x:int) { age = x; } method fun get_job () => job; method proc set_job (x:string) { job = x; } } var john = person ("John", 42); println$ #(john.name) + " is " + #(john.age).str;


The entity person is a function which when called with its argument of name and age returns a record of type person_t consisting of closures of the functions and procedures marked as method in its definition.

Since functions hide their local variables the object state is hidden and can only be accessed using the methods.

The implements clause is optional.

Objects provide an excellent way for a dynamically loaded shared library to export a set of functions, only the object function needs to be exported so it has a C name which can be linked to with dlopen.

# 1.11 Type constructors

## 1.11.1 typedef

The typedef statement is used to define an alias for a type. It does not create a new type.

  typedef Int = int;


## 1.11.2 Tuples

Tuple types are well known: a tuple is just a Cartesian Product with components identified by position, starting at 0. The n-ary type combinator is infix * and the n-ary value constructor is infix ,:

  val tup : int * string * double = 1, "Hello", 4.2;


The 0-ary tuple type is denoted 1 or unit with sole value ():

  val u : unit = ();


The 1-ary tuple of type T component value v is identified with the type T and has value v.

The individual components of a tuple may be accessed by a projection function. Felix uses an integer literal to denote this function.

  var x = 1,"Hello";
assert 0 x == 1; assert x.0 == 1;
assert 1 x == "Hello"; assert x.1 == "Hello";


[There should be a way to name this function without application to a tuple!]

A pointer to a tuple is also in itself a tuple, namely the tuple of pointers to the individual components. This means if a tuple is addressable, so are the components.

  var x = 1, "Hello";
val px = &x;
val pi = px.0; pi <-42;
val ps = px.1; ps <-"World";
assert x.0 == 42;
assert x.1 == "World";


In particular note:

  var x = 1, "Hello";
&x.0 <- 42;


because the precedences make the grouping (&x).0.

You cannot take the address of a tuple component because a projection of a value is a value.

Assignment to components of tuples stored in variables is supported but only to one level, for general access you must take a pointer and use the store-at-address operator <-.

The projections of a tuple can also be written in an expanded form so that they may stand alone as functions:

  var first = proj 0 of (int * string);
var a = 1, "Hello";
var one = a . first;
var two = a . proj 1 of (int * string);


## 1.11.3 Records

A record is similar to a tuple except the components are named and considered unordered.

  typedef xy = (x:int, y:int);
var r : xy = (x=1,y=2);
println$r.x, r.y; var px = &r.x; // means (&r).x px <- 42; println$ r.x, r.y; // r.x == 42 now


The field names are projection functions and can be applied to a record value to extract the nominated component, or applied to a pointer to a record to find a pointer to the nominated component.

The notation (may be changed soon)

  var prjx = x of (xy);
var prjpx = x of (&xy);


can be used to refer to a projection in isolation, and a pointer projection in isolation, that is, as unapplied first class functions.

### 1.11.3.1 Interfaces

An interface is a special notation for a record type all of whose fields are functions or procedures.

  interface fred {
f: int -> int;
g: int -> 0; // procedure
}

// equivalent to
typedef fred = (f: int -> int, g: int -> 0);


The primary use is for specifying the type of a Java like object.

## 1.11.4 Structs

A struct is a a nominally typed record, that is, it must be defined, and each definition specifies a distinct type.

  struct S { x:int; y:int; };
var s : S = S (1,2);
println$s.x,s.y;  A struct value can be constructued using the structure name as a function and passing a tuple of values corresponding by position to the fields of the struct. A struct constructor can be used a first class function. The field names are projection functions and can be applied to a struct value to extract the nominated component, or applied to a pointer to a struct to find a pointer to the nominated component. The notation (may be changed soon)  var prjx = x of (S); var prjpx = x of (&S);  can be used to refer to a projection in isolation, and a pointer projection in isolation, that is, as unapplied first class functions. A struct may also contain function and procedure definitions:  struct A { x:int; y:int; fun get2x => 2 * self.x; fun get2y () => 2 * self.y; proc diag (d:int) { self.x <- d; self.y <-d; } };  These functions are precisely equivalent to:  fun get2x (self:A) => 2 * self.x; proc diag (self: &A) (d:int) { self.x <-d; self.y <-d; }  Note that for a function self is a value, for a procedure self is a pointer. Because of these definitions, we can form object closures over a struct:  var a = A(1,2); var g2y = a. get2y; var di = a . diag;  Note we can't form a closure for get2x without an explicit wrapper, i.e. eta-expansion. ## 1.11.5 Sums Sum types are the dual of tuples. They represent a sequence of possible cases, potentially with arguments. Case indicies are 0 origin. Sum variables are decode with a match which may also extract an argument value:  typedef num = int + long + double; var x = (case 1 of num) 53L; println$
match x with
| case 0 (i) => "int " + i.str
| case 1 (l) => "long" + l.str
| case 2 (d) => "double " + d.str
endmatch
;


### 1.11.5.1 Unit sum

There is a family of special sum types equivalent to:

    2 = 1 + 1
3 = 1 + 1 + 1
4 = 1 + 1 + 1


Recall type 1, or unit, is the type of the empty tuple. The type 2 is also known as bool, and represents two cases where false is an alias for case 0 of 2 and true is an alias for case 1 of 2.

The type 0 or void, is the type of no values.

These types are called unit sums because they're a sum of a certain number of units.

Note carefully that:

    x + (y + z), (x + y) + z, x + y + z


are three distinct types because operator + is not associative.

## 1.11.6 union

A union is the dual of a struct. It is a nominally typed version of a sum. Here for example is a list of integers:

  union intlist {
iEmpty ;
iCons of int * intlist;
};


This alternative syntax is more commonly used and comes from ML family:

  union intlist =
| iEmpty
| iCons of int * intlist
;


The fields of a union type are injections or type constructors. In effect they cast their argument to the type of the union, thus unifying heterogenous types into a single type.

Pattern matches are used to decode unions.

  var x = iEmpty;
x = iCons (1, x);
x = iCons (2, x); // list of two integers

fun istr (x:intlist) =>
match x with
| #iEmpty => "end"
| iCons (i, tail) => i.str + "," + istr tail
endmatch
;


The first variant represents an empty list. The second variant says that a pair consisting of an int and a list can be considered as a list by applying the type constructor iCons to it.

Unlike product types, a sum may directly contain itself. This is because sum types are represented by pointers.

### 1.11.6.1 enum

A restricted kind of union, being a nominally typed version of a unit sum.

  enum colour { red, green, blue }; // same as
enum colour = red, green, blue; // same as
union colour = red | green | blue;


The tag value of an enum can be set:

  enum wsize = w8=8, w16=16, w32=32, w64=64;


### 1.11.6.2 caseno operator

The caseno operator can find the tag value of any sum type, the anonymous sum, union, enum or variant as an integer.

  assert caseno w16 == 16;
assert caseno (case 1 of 2) == 1;


## 1.11.7 variant

Wariants the sum type which are the dual of records. They used named injections like unions but are structurally typed.

  typedef vars = union { Int of int ; Float of float; };


## 1.11.8 Array

Felix has various kinds of array. The term is abused and sometimes refers to the abstract concept, and sometimes the statically typed fixed length array described here.

An array is nothing but a tuple all of whose elements have the same type. It is convenient to use an exponential operator with a unit sum index to provide a compact notation:

  int ^ 3 // array of 3 integers equivalent to
int * int * int


Therefore the value:

  var a3 = 1,2,3;


is, in fact, an array. As for tuples an integer literal applied to an array value returns a component, however for arrays, an expression may be used as well:

  var i = 1;
var y = a3 . i;
var z = a3 . proj i of (int^3);


The last form is equivalent to

  var z = a .
match i with
| 0 => proj 0 of (int^3)
| 1 => proj 1 of (int^3)
| 2 => proj 2 of (int^3)
| _ => throw error
endmatch
;


in other words there is a run time array bounds check equivalent to a match failure. Note that of course the actual generated code is optimised!

A run time check can be avoided by using the correct type of index:

  var i = case 1 of 3;
var z = a . i; // no run time check


### 1.11.8.1 Multi-arrays

Whilst we introduced the exponential notation

    B ^ J


as a mere shorthand, where J is a unit sum, in fact Felix allows the index to be any compact linear type.

A compact linear type is any combination of sums, products, and exponentials of unit sums. The type

   3 * 4 * 5


for example is compact linear, and therefore Felix allows the array type

  typedef matrix = double ^ (3 * 4 * 5)


Although this looks like the type

  typedef array3 = double  ^ 3 ^ 4 ^ 5


as suggested by the usual index laws, the latter is an array size 5 of arrays size 4 of arrays size 3 which can be used like:

  var a : array3;
var z = a . case 1 of 5 . case 1 of 4 . case 1 of 3;


where you will note that the projections are applied in the reverse order to the indices. On the other hand to use the first form we have instead:

  var m : matrix;
var z = a . (case 1 of 3, case 1 of 4, case 1 of 5);


The exponent here is a value of a compact linear type. It is a single tuple value! It is called a multi-index when applied to an array.

The advantage of this type is that there is an obvious encoding of the values shown in this psuedo code:

    i * 3 * 4 + j * 3 + k


which is nothing more than a positional number notation where the base varies with position. That encoding clearly associates with the compact linear value as integer in the range 0 to 59, or, alternatively, an value of type 60. In other words the type is linear and compact.

Since clearly, given an integer in range 0 through 59 and this type we can decode the integer into a tuple, being the positional representation of the integer in this weird coding scheme, the type is clearly isomorphic to the subrange of integer.

Therefore Felix allows you to coerce an integer to a compact linear type with a run time check, and convert a compact linear type to an integer or a unitsum:

  var i : int = ((case 1 of 3, case 1 of 4, case 1 of 5) :>> int);
var j : 60 = ((case 1 of 3, case 1 of 4, case 1 of 5) :>> 60);
var clt : 3 * 4 * 5 = (16 :>> 3 * 4 * 5);


Because we can do this we can now write a loop over a matrix with a single iterator:

  for i in 60 do



### 1.15.1.3 conditional chain

A variant on the terminated if/then/elif/else allowing chaining.

  var x =
if c1 then r1
elif c2 then r2 else
if c3 then r3 else
r4
;


## 1.15.2 Alternate conditional chain

  var x = n / d unless d == 0 then 0;


## 1.15.3 Dollar application

A right associative low precedence forward apply operator taken from Haskell.

  var x =  str$rev$ list$1,2,3;  ## 1.15.4 Pipe application A left associative low precedence reverse apply operator taken from C#.  var x = 1,2,3 |> list |> rev |> str;  ## 1.15.5 Tuple cons constructor A right associative cons operator for tuples. Allows concatenating an element to the head/front/left end of a tuple. Can also be used in a pattern match to recursively decode a tuple like a list.  var x = 3,4; var y = 1 ,, 2 ,, x; // 1,2,3,4  ## 1.15.6 N-ary tuple constructor The is a non-associative n-ary tuple constructor consists of a sequence of expressions separated by commas.  var x = 1,2,3,4;  ## 1.15.7 Logical implication An operator for function implies.  var x = false implies true;  ## 1.15.8 Logical disjunction A chaining operator for function lor.  var x = true or false;  ## 1.15.9 Logical conjunction A chaining operator for function land.  var x = true and false;  ## 1.15.10 Logical negation A bool operator for function lnot. ## 1.15.11 Comparisons Non-associative.  var x = a < b or a > b or a <= b or a >= b or a == b or a != b or 1 in list(1,2,3) 1 $$\in$$ list (1,2,3) ;  ## 1.15.12 Name temporary Allows a subexpression to be named as a val by default or a var.  var x = a + (f y as z) + z; // equivalent to val z = f y; var x = a + z + z; var x = a + (f y as var k) + k; // equivalent to var k = f y; var x = a + k + k;  Note that the var for ensures the subexpression is eagerly evaluated, before the containing expression. ## 1.15.13 Schannel pipe operators Used to flow data through schannels from the source on the left to the sink on the right via processing units in between.  spawn_fthread$ source |-> filter |-> enhancer |-> sink;


This variant uses an iterator to stream data out of a data structure:

  spawn_fthread$list (1,2,3) >-> sink;  ## 1.15.14 Right Arrows Right associative arrow operators. List cons operator.  var x = 1 ! 2 ! list (3,4);  Function types (type language only):  D -> C // Felix function D --> C // C function pointer  ## 1.15.15 Case literals The case tag is only used in pattern matches. The sum or union type isn't required because it can be deduced from the match argument.  var a = match a with Some v => v | #None => 0;  The case constructor with integer caseno has two uses. It creates a value of a sum type with no arguments:  var x = case 1 of 2; // aka true  or it is a function for a sum type variant with an argument:  var x = (case 1 of int + double) 4.2;  A case literal with a name instead of an integer constructs a variant instead:  typedef maybe = union { No; Yes of int; }; var x = (case Yes of maybe) 42;  The tuple projection function names a tuple projection:  typedef triple = int * long * string; var snd = proj 1 of triple; var y: int = snd (1, 2L, "3");  ## 1.15.16 Bitwise or Left associative.  var x = q \| b;  ## 1.15.17 Bitwise exclusive or Left associative.  var x = q \^ b;  ## 1.15.18 Bitwise and Left associative.  var x = q \& b;  ## 1.15.19 Bitwise shifts Left associative.  var x = a << b; // left shift var x = a >> b; // right shift  ## 1.15.20 Addition Chain operator. Non-associative for types. Left associative for expressions.  var x = a + b + c;  ## 1.15.21 Subtraction Left associative.  var x = a - b;  ## 1.15.22 Multiplication Chain operator. Non-associative for types. Left associative for expressions.  var x = a * b;  ## 1.15.23 Division operators Left associative. Not carefully: higher precedence that multiplication, unlike C!!  var x = a * b / c * d; // means var x = a * (b / c) * d; var x = a * b % c * d; // means var x = a * (b % c) * d;  ## 1.15.24 Prefix operators  var x = !a; var x = -a; // negation var x = ~a; // bitwise complement  ## 1.15.25 Fortran exponentiation Infix ** is special syntax for function pow. The left operand binds more tightly than ** but the right operand binds as for prefixed operators or more tightly. Observe than that:  var x = -a**-b; // means var x = -(a**(-b));  preserving the usual mathematical syntax. ## 1.15.26 Felix exponentiation Left associative. The right operand binds as deref operator or more tightly. Used for array notation in the type language.  var x = a ^ ix;  ## 1.15.27 Function composition Standard math notation. Left associative. Same precedence as exponentiation. Spelled \circ.  var x = f $$\circ$$ g;  ## 1.15.28 Dereference For function deref.  var x = *p;  For builtin dereference operator:  var x = _deref p;  Note these usually have the same meaning however the function deref can be overloaded. If the overloaded definition is not to be circular it may use _deref when dereferencing pointers. ### 1.15.28.1 Operator new Copies a value onto the heap and returns a pointer.  var px = new 42;  ## 1.15.29 Whitespace application Operator whitespace is used for applications. ### 1.15.29.1 General  var x = sin y;  ### 1.15.29.2 Caseno operator Returns the integer tag value of the value of an anonymous sum, union, or variant type.  var x = caseno true; // 1 var x = caseno (Some 43); // 1  ### 1.15.29.3 Likelyhood Indicaties if a bool valued expression is likely or unlikely to be true. Used to generate the corresponding gcc optimisation hints, if available.  if likely (c) goto restart; if unlikely (d) goto loopend;  ## 1.15.30 Coercion operator left associative. The right operand is a type.  var x = 1L :>> int; // cast  ## 1.15.31 Suffixed name The most general form of a name:  var x = qualified::name of int;  Used to name functions, with the right operand specifying the function argument type. ## 1.15.32 Factors ### 1.15.32.1 Subscript Used for array and string subscripting. Calls function subscript. For strings, returns a character. If the subscript is out of range after adjustment of negative index, returns char 0 and thus cannot fail.  var x = a . [ i ]; // i'th element  ### 1.15.32.2 Subsstring Calls function substring. Negative indices may be used to offset from end, i.e. -1 is the index of the last element. Out of range indices (past the end or before the start, after adjustment of negative indices) are clipped back to the end or start respectively.  var x = a . [ first to past]; // past is one past the last element refered to  ### 1.15.32.3 Copyfrom Calls function copyfrom. Copies from designated index to end. Supports negative indices and range clipping for strings.  var x = a . [to past]; // from the first  ### 1.15.32.4 Copyto Calls function copyto. Supports negative indices and range clipping for strings.  var x = a . [to past]; // from the first  ### 1.15.32.5 Reverse application Left associative.  var x = y .f; // means var x = f y;  ### 1.15.32.6 Reverse application with deref Left associative  var x = p *. k; // means var x = (*p) . k;  ### 1.15.32.7 Reverse application with addressing Left associative  var x = v &. k; // means var x = (&v) . k;  ### 1.15.32.8 Unit application Prefix operator applies argument to empty tuple.  var x = #f; // means var x = f ();  ### 1.15.32.9 Addressing Finds the pointer address of a variable. Means pointer to in type language.  var x : int = 1; var px : &int = &x; // address of x // type pointer to int  ### 1.15.32.10 C pointer Used in type language for pointer to type or NULL.  var px : @char = malloc (42);  Note that this symbol is also used in fdoc as a markup indicator. Please keep out of column 1, do not follow with a left brace. ### 1.15.32.11 Label address Used to find the machine address in the code text of a label. Used with computed goto instruction.  proc f (a: int) { var target: LABEL = if a < 0 then label_address neg elif a > 0 then label_address pos else label_address zer ; goto-indirect target; pos:> println$ "pos"; return;
neg:> println$"neg"; return; zer:> println$ "zer"; return;
}


### 1.15.32.12 Macro freezer

Used to disable macro expansion of a symbol.

  macro val fred = joe;
var x = fred + noexpand fred; // means
var x = joe + fred;


### 1.15.32.13 Pattern variable

Notation v Used in patterns to designate a val variable to be bound in the pattern matching.

  var x =
match y with
| Some v => "Some " + v.str
| #None => "None";
;


### 1.15.32.14 Parser argument

Notation n for some integer n. In user defined syntax designates the n'th term of a syntax production.

## 1.15.33 Qualified name

A name in Felix has the form:

    class1 :: nested1 :: ... :: identifier [ type1, type2, ... ]


where the qualifiers and/or type list may be elided. This is the same as C++ except we use [] instead of <> for template argument types.

# 1.16 Atoms

## 1.16.1 Record expression

  var x = (name="Hello", age=42);


## 1.16.2 Alternate record expression

  var x =
struct {
var name = "Hello";
var age = 42;
}
;


## 1.16.3 Variant type

Denotes a variant type.

  var x :
union {
Cart of double * double;
Polar of double * double;
}
;


## 1.16.4 Wildcard pattern

Used in a pattern match, matches anything.

    var x = match a with _ => "anything";


## 1.16.5 Ellipsis

Used only in C bindings to denote varags.

    fun f: int * ... -> int;


## 1.16.6 Truth constants

    false // alias for case 0 of 2
true  // alias for case 1 of 2


## 1.16.7 callback expression

??

  callback [ x ]


## 1.16.8 Lazy expression

Function of unit.

  var f = { expr };
var x = f ();


## 1.16.9 Sequencing

Function dependent on final expression.

  var x = ( var y = 1; var z = y + y; z + 1 ); // equivalent to
var x = #{ var y = 1; var z = y + y; return z + 1; };


  var p = { println$"Hello"; } // procedure p (); var f = { var y = 1; return y + y; }; // function var x = f ();  ## 1.16.11 Grouping Parentheses are used for grouping.  var x = (1 + 2) * 3;  ## 1.16.12 Object extension  var x = extend a,b with c end;  ## 1.16.13 Conditional expression  var x = if c1 then a elif c2 then b else c endif ;  # 1.17 Executable statements ## 1.17.1 Assignment ## 1.17.2 The goto statement and label prefix Felix statements may be prefixed by a label to which control may be transfered by a goto statement:  alabel:> dosomething; goto alabel;  The label must be visible from the goto statement. There are two kinds of gotos. A local goto is a jump to a label in the same scope as the goto statement. A non-local goto is a jump to any other visible label. Non-local transfers of control may cross procedure boundaries. They may not cross function or generator boundaries. The procedure or function containing the label must be active at the time of the control transfer. A non-local goto may be wrapped in a procedure closure and passed to a procedure from which the goto target is not visible.  proc doit (err: 1 -> 0) { e; } proc outer () { proc handler () { goto error; } doit (handler); return; error:> println$ error;
}


This is a valid way to handle errors. the code is correct because outer is active at the time that handler performs the control transfer.

### 1.17.2.1 halt

Stops the program with a diagnostic.

  halt "Program complete";


### 1.17.2.2 try/catch/entry

The try/catch construction may only be user to wrap calls to C++ primitives, so as to catch exceptions.

  proc mythrow 1 = "throw 0;";
try
mythrow;
catch (x:int) =>
lab2:>
println$"Lab2"; return; }  ### 1.17.2.4 Exchange of control Built on top of label adressing and indirect gotos, the branch-and-link instruction is conceptually the most fundamental control instruction. The library implementation is in  inline proc branch-and-link (target:&LABEL, save:&LABEL) { save <- label_address next; goto-indirect *target; next:> }  A good example is here, which shows an example of coroutines. ## 1.17.3 match/endmatch The form:  match expr with | pattern1 => stmts1 | pattern2 => stmts2 ... endmatch  is an extension of the C switch statement. The patterns are composed of these forms:  (v1, v2, ... ) // tuple match h!t // list match h,,t // tuple cons Ctor // const union or variant match Ctor v // nonconst union or variant match (fld1=f1, fld2=f2, ...) // record match pat as v // assign variable to matched subexpression pat when expr // guarded match pat1 | pat2 // match either pattern 999 // integer match "str" // string match lit1 .. lit2 // range match _ // wildcard match  The guarded match only matches the pattern if the guard expression is true.  match x with | (x,y) when y != 0 => ... endmatch  The tuple as list cons match is a form of row polymorphism where the first element of a tuple and the remaining elements considered as a tuple are matched. A good example of this is found in the library here which allows printing a tuple of arbitrary number of components, indeed, this facility was implemented precisely to allow this definition in the library. Record matches succeed with any record containing a superset of the specified fields. As well as integer and string matches, a literal of any type with an equality and inequality operator can be matched. In addition, if there is a less than or equal operator <= an inclusive range match can be specified. ## 1.17.4 if/goto The conditional goto is an abbreviation for the more verbose conditional:  if c goto lab; // equivalent to if c do goto lab; done  ### 1.17.4.1 if/return The conditional return is an abbreviation for the more verbose conditional:  if c return; // equivalent to if c do return; done  ### 1.17.4.2 if/call The conditional call is an abbreviation for the more verbose conditional:  if c call f x; // equivalent to if c do call f x; done  ## 1.17.5 if/do/elif/else/done The procedural conditional branch is used to select a control path based on a boolean expression. The else and elif clauses are optional.  if c1 do stmt1; stmt2; elif c2 do stmt3; stmt4; else stmt5; stmt6; done  The elif clause saves writing a nested conditional. The above is equivalent to:  if c1 do stmt1; stmt2; else if c2 do stmt3; stmt4; else stmt5; stmt6; done done  One or more statements may be givn in the selected control path. A simple conditional is an abbreviation for a statement match:  if c do stmt1; stmt2; else stmt3; stmt4; done // is equivalent to match c with | true => stmt1; stmt2; | false => stmt3; stmt4; endmatch;  ## 1.17.6 call The call statement is used to invoke a procedure.  proc p(x:int) { println$ x; }
call p 1;


The word call may be elided in a simple call:

  p 1;


If the argument is of unit type; that is, it is the empty tuple, then the tuple may also be elided in a simple call:

  proc f() { println$"Hi"; } call f (); // is equivalent to f(); // is equivalent to f;  ## 1.17.7 procedure return The procedural return is used to return control from a procedure to its caller. A return is not required at the end of a procedure where control would otherwise appear to drop through, a return is assumed:  proc f() { println$ 1; }
// equivalent to
proc f() { println$1; return; }  ### 1.17.7.1 return from The return from statement allows control to be returned from an enclosing procedure, provided that procedure is active.  proc outer () { proc inner () { println$ "Inner";
return from outer;
}
inner;
println$"Never executed"; }  ### 1.17.7.2 jump The procedural jump is an abbreviation for the more verbose sequence:  jump procedure arg; // is equivalent to call procedure arg; return;  ## 1.17.8 function return The functional return statement returns a value from a function.  fun f () : int = { return 1; }  Control may not fall through the end of a function. ### 1.17.8.1 yield The yield statement returns a value from a generator whilst retaining the current location so that execution may be resumed at the point after the yield. For this to work a closure of the generator must be stored in a variable which is subsequently applied.  gen counter () = { var x = 0; next_integer:> yield x; ++x; goto next_integer; } var counter1 = counter; var zero = counter1 (); var one = counter1 (); println$ zero, one;


The spawn_fthread library function invokes the corresponding service call to schedule the initial continuation of a procedure taking a unit argument as an fthread (fibre).

The spawned fthread begins executing immediately. If coutrol returns before yielding by a synchronous channel operation, the action is equivalent to calling the procedure.

Otherwise the spawned fthread is suspended when the first write, or the first unmatched read operation occurs.

## 1.17.11 loops

Felix has some low level and high level loop constructions.

The low level for, while, and repeat loops are equivalent to loops implemented with gotos.

The bodies of do loops do not constitute a scope, therefore any symbol defined in such a body is also visible in the surrounding code.

Low level loops may be labelled with a loop label which is used to allow break, continue, and redo statements to exit from any containing loop.

  outer:for var i in 0 upto 9 do
inner: for var j in 0 upto 9 do
println$i,j; if i == j do break inner; done if i * j > 60 do break outer; done done done  ### 1.17.11.1 redo The redo statement causes control to jump to the start of the specified loop without incrementing the control variable. ### 1.17.11.2 break The break statement causes control to jump past the end of the specified loop, terminating iteration. ### 1.17.11.3 continue The continue statement causes the control variable to be incremented and tests and the next iteration commenced or the loop terminated. ### 1.17.11.4 for/in/upto/downto/do/done A basic loop with an inclusive range.  // up for var ti:int in 0 upto 9 do println$ ti; done
for var i in 0 upto 9 do println$i; done for i in 0 upto 9 do println$ i; done

// down
for var tj:int in 9 downto 0 do println$j; done for var j in 9 downto 0 do println$ j; done


### 1.17.11.6 until loop

The until loop executes the loop body repeatedly until the control condition is false at the start of the loop, it is equivalent o a while loop with a negated condition.

  var i = 0;
assert x == 1;


## 1.17.17 do/done

The do/done statement has no semantics and merely acts as a way to make a sequence of statements appear as a single statement to the parser.

Jumps into do/done groups are therefore allowed, and any labels defined in a do/done group are visible in the enclosing context.

Any variables, functions, or other symbols defined in a do/done group are visible in the enclosing context.

  do something; done


## 1.17.18 begin/end

The begin/end statement creates an anonymous procedure and then calls it. It therefore appears as a single statement to the parser, but it simulates a block as would be used in C. It is exactly equivalent to a brace enclosed procedure called by a terminating semi-colon.

  begin
var x = 1;
end
// equivalent to
{
var x = 1;
};


# 1.18 C bindings

Felix is specifically designed to provide almost seamless integration with C and C++.

In particular, Felix and C++ can share types and functions, typically without executable glue.

However Felix has a stronger and stricter type system than C++ and a much better syntax, so binding specifications which lift C++ entities into Felix typically require some static glue.

## 1.18.1 Type bindings

In general, Felix requires all primitive types to be first class, that is, they must be default initialisable, copy constructible, assignable, and destructible. Assignment to a default initialised variable must have the same semantics as copy construction.

It is recommended C++ objects provide move constructors as Felix generated code uses pass by value extensively.

The Felix type system does not support C++ references in general, you should use pointers instead.

However, there is a special lvalue annotation for C++ functions returning lvalues that allows them to appear on the LHS of an assignment. Only primitives can be marked lvalue.

The Felix type system does not support either const or volatile. This has no impact when passing arguments to C++ functions. However it may be necessary to cast a pointer returned from a primitive function in order for the generated code to type check.

TBD

TBD

TBD

TBD

TBD

TBD