Writing ASL
[Note: This document should be read in conjunction with the document Reading ASL specifications. We expect that the vast majority of ASL users only read ASL and do not need to read this document.]
Architecture Specification Language (ASL) is a specification language that is designed to make it easy to write clear, easily understood ISA specifications. To achieve this goal, ASL is different from many programming languages in the following ways
-
Support for bitvectors of arbitrary size (e.g., 32, 64, 5, 12, etc.).
-
Support for bitslice operations that manipulate part of a bitvector.
-
Support for other common bitvector operations: concatenating bitvectors to form a longer bitvector, replicating a bitvector to form a longer bitvector, zero-extension, sign-extension.
-
A single integer type, integer, that does not overflow or wrap around. Variables of this type are “as big as they need to be” to represent the value that they contain.
-
Encouraging a “single assignment” style of programming where most variables have just one assignment to that variable.
-
Not supporting silent default behavior on case statements. Unlike C, it is an error if no branch of a case-statement matches.
-
Static detection of confusing, subtle, dangerous, or incorrect programs including
-
Rejecting expressions where the evaluation order can affect behavior.
-
Requiring that two bitvectors have the same length if they are added, subtracted, ANDed, ORed, etc.
-
Making the silent branches associated with exception propagation explicit by requiring that calls to functions that can throw exceptions look different from calls to functions that cannot throw exceptions.
Marking calls to functions that could throw exceptions is important in code that changes state (e.g., changes a register or pushes a value onto the stack) and also calls a function that could throw an exception because the order in which those happen matters.
The goal of these restrictions is to make the language easier for readers because they will never have to deal with programs where evaluation order matters, they will never have to understand type promotion rules that determine what happens when a ‘char’ is added to a ‘long int’, etc.
This is important because specifications differ from programs in that they have orders of magnitude more readers than writers and the readers tend to be focused on understanding some small corner of the architecture and cannot be expected to understand all of the intricacies of the architecture (such as which functions throw exceptions).
-
These rules and restrictions are intended to make it easier to read the code and understand it correctly. If you want to use the specification for formal verification or you need the additional performance of compiling the ASL specification to C/C++ or Verilog, then there are some additional restrictions because the ASL compiler does not support all the features that are supported by the ASL interpreter. This guide explains both types of restriction and how to work with them.
Type checking
Type conversions
ASL is strongly typed (that is, every expression has a type and mismatched types are not allowed).
Most significantly, ASL does not have anything like the C/C++ rules for type promotion and therefore does not allow things like
- “truthiness”: treating zero values as FALSE and non-zero values as TRUE
- mismatched sizes: adding, comparing, etc. bitvectors of different lengths
- mismatched types: mixing bitvectors and integers
(There are a few exceptions for the common case of expressions like RSP + 8.)
When writing ASL code, it is necessary to explicitly convert values from one type to another. The common conversions are as follows
- integer to bitvector: the bitslice operation written i[0 +: N] converts an integer i to bits(N)
- bitvector to integer (unsigned): UInt(b) converts a bitvector to an integer interpreting the bits as unsigned.
- bitvector to integer (signed): SInt(b) converts a bitvector to an integer interpreting the bits as signed.
- from bit x to boolean (positive logic): x == '1'
- from bit x to boolean (negative logic): x == '0'
- from boolean b to bit (positive logic): if b then '1' else '0'
- from boolean b to bit (negative logic): if b then '0' else '1'
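The conversions above can be modelled outside ASL to see what each one computes. The following Python sketch is only an illustration: it represents a bitvector as a non-negative Python int together with an explicit width, and the helper names to_bits, uint and sint are hypothetical stand-ins for i[0 +: N], UInt and SInt rather than part of ASL or ASLi.

```python
def to_bits(i: int, n: int) -> int:
    """Model of i[0 +: n]: keep the low n bits of an unbounded integer."""
    return i & ((1 << n) - 1)


def uint(b: int, n: int) -> int:
    """Model of UInt(b) for an n-bit value: read the bits as unsigned."""
    return b & ((1 << n) - 1)


def sint(b: int, n: int) -> int:
    """Model of SInt(b) for an n-bit value: read the bits as two's complement."""
    u = uint(b, n)
    sign_set = n > 0 and ((u >> (n - 1)) & 1) == 1
    return u - (1 << n) if sign_set else u


byte = to_bits(-1, 8)   # all eight bits set, i.e. 0xFF
print(uint(byte, 8))    # unsigned reading: 255
print(sint(byte, 8))    # signed reading: -1
```

The same eight bits give two different integers, which is why ASL asks the author to say which reading is intended.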
Sized bitvectors
The bitvector type includes an explicit size. For example, bits(8) is the type of 8-bit values.
The ASL compiler checks the length of bitvectors at compile time and rejects
code with mismatched lengths.
For example, the following assignment of a 4-bit value to an 8-bit variable
is incorrect: the architect needs to explicitly indicate what should happen
to the other 4 bits.
var x : bits(8); // declaration of a variable
var y : bits(4); // declaration of a different size of variable
...
x = y; // INCORRECT: assigning a 4-bit value to x
Some of the common ways of correcting bitsize mismatches like this are
- zero-extend the value on the right hand side:
x = ZeroExtend(y, 8);
- sign-extend the value on the right hand side:
x = SignExtend(y, 8);
- assign to a part of the left-hand side using a bitslice assignment:
x[3:0] = y;
These three options behave very differently so it is important to be explicit about which one is intended.
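To make the difference concrete, here is a small Python model of the three fixes applied to the same 4-bit value. It is a sketch under stated assumptions: bitvectors are modelled as Python ints with explicit widths, the helper names are hypothetical, and the starting value chosen for x is arbitrary.

```python
def mask(n: int) -> int:
    """An n-bit mask of ones."""
    return (1 << n) - 1


def zero_extend(y: int, m: int, n: int) -> int:
    """Model of ZeroExtend(y, n) for an m-bit y: new high bits are zero."""
    assert n >= m
    return y & mask(m)


def sign_extend(y: int, m: int, n: int) -> int:
    """Model of SignExtend(y, n) for an m-bit y: new high bits copy the sign bit."""
    assert n >= m > 0
    y &= mask(m)
    if (y >> (m - 1)) & 1:
        y |= mask(n) & ~mask(m)
    return y


def assign_low_slice(x: int, y: int, m: int) -> int:
    """Model of x[m-1:0] = y: the high bits of x keep their old value."""
    return (x & ~mask(m)) | (y & mask(m))


y = 0b1010        # 4-bit value with its top bit set
x = 0b0101_0101   # arbitrary previous contents of the 8-bit variable

print(f"{zero_extend(y, 4, 8):08b}")     # 00001010
print(f"{sign_extend(y, 4, 8):08b}")     # 11111010
print(f"{assign_low_slice(x, y, 4):08b}")  # 01011010
```

All three results agree in the low 4 bits and differ only in the upper 4 bits, which is exactly the part the original assignment left unspecified.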
Variable sized bitvectors
The size of a bitvector does not need to be a literal constant.
It can be useful for the size to be a variable or expression
containing a variable.
For example, in the following function, the variables src1, src2 and result are all of width N and N can be any non-negative value.
func ADD(src1 : bits(N), src2 : bits(N)) => bits(N)
begin
let result = src1 + src2;
return result;
end
The ASL compiler reasons about bitwidth expressions to check that bitvectors have matching sizes.
These checks happen at compile time, not when the code is executed and it is possible that the ASL compiler will reject some code that would not have hit a problem at runtime. Any such issues have to be fixed before the compiler will accept the code.
The three main places where you might encounter problems are
-
A function is only called with ‘safe’ values but there would be problems if it was called with other inputs. For example, if we change the function ADD above to take inputs of different sizes, then the ASL compiler will report an error because, even though the programmer knows the sizes will always match, the compiler does not know that.
func ADD(src1 : bits(M), src2 : bits(N)) => bits(N)
begin
    let result = src1 + src2; // error: mismatched sizes
    return result;
end
[Actual examples tend to be more subtle than this example. For example, they might depend on an assumption that M >= N.]
-
An expression is not always safe but it is impossible to reach that expression with unsafe values because it is inside an if-statement or case-statement or it occurs after an assert-statement. For example, the ADD function could try to cope with mismatched sizes as follows
func ADD(src1 : bits(M), src2 : bits(N)) => bits(N)
begin
    if M == N then
        let result = src1 + src2; // error: mismatched sizes
        return result;
    else
        return Zeros(N);
    end
end
Even though this code would not have mismatched sizes at runtime, the compiler is not able to check that and the code is rejected.
A possible fix would be to apply a size-conversion (zero-extend, sign-extend or bitslice) even though it will have no effect. For example,
let result = ZeroExtend(src1, N) + src2;
Doing this is not ideal because, even though the code means the same, it adds clutter into the specification. Reviewers should try to suggest ways to avoid such artifacts.
-
The expression depends on the relationship between some variables.
For example, in floating point code, we might calculate the size of the exponent (E) and mantissa (M) like this
let E = if N == 32 then 8 else 11;
let M = N - (E + 1);
And then, further down the function, we might concatenate the sign bit, the exponent (of type bits(E)), and the mantissa (of type bits(M)) to produce the final floating point result like this
let result : bits(N) = [sign, exponent, mantissa];
To confirm that this is correct, the ASL compiler has to check that N == 1 + E + M. To do this, it needs to use the fact that M == N - (E + 1), which comes from the second assignment statement above.
To help the ASL compiler recognize that it should use this fact, we change the second assignment statement as follows
constant M = N - (E + 1);
(It is very, very rare that this needs to be done. We have only ever seen it in floating point specifications.)
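The width equation from the floating point example can be checked by hand for concrete sizes. The Python sketch below does only that: it evaluates the two definitions for N = 32 and N = 64 (sizes assumed here for illustration) and confirms that sign, exponent and mantissa add up to N. The ASL compiler reasons about the equation symbolically, which this sketch does not attempt, and fp_field_widths is a hypothetical name.

```python
def fp_field_widths(n: int) -> tuple:
    """Mirror of the two definitions: E depends on N, M is whatever is left."""
    e = 8 if n == 32 else 11
    m = n - (e + 1)
    return e, m


for n in (32, 64):
    e, m = fp_field_widths(n)
    # One sign bit, E exponent bits and M mantissa bits must fill N bits.
    assert 1 + e + m == n
    print(n, e, m)
```

For these two sizes the fields come out as 8 and 23 bits, and 11 and 52 bits.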
Constraint checks
Integer variables can be “constrained” by specifying a set of legal values or a range of legal values. For example, integer {8, 16, 32, 64} or integer {0..255}. (It is also possible to combine the two in a type like integer {0, 3, 5..9} but we have not found a place where that is useful.)
These constraints are checked by ASLi, they can improve readability of the code, and they can help the ASLi compiler generate better C/C++/Verilog code.
Most constraints are used on function arguments, function results and on mutable variables.
Adding constraints on function results
It is often useful to add constraints on the results of functions that return integers because many of these integers will be used as bitwidths, as array indexes, as bit indexes, etc. and it is useful to the reader and to the compiler to know that it is in range.
For example, if there is a function that calculates the size of floating point values, we can add the set of possible results as a constraint on the result type.
func DecodeFPSize(instruction : bits(32)) => integer {32, 64}
Adding constraints like this is almost always a good idea because it provides information that is needed for checking other constraints.
Adding constraints on function arguments
Many constraints are applied to size parameters. For example, we might want to constrain a floating point addition operation FPAdd to require a bitwidth of 32 or 64.
If the bitwidth is an explicit argument of FPAdd, this is done by adding the constraint as part of the function type like this
func FPAdd(x : bits(N), y : bits(N), N : integer {32, 64}) => bits(N)
If, instead, the bitwidth is an implicit parameter of the function (i.e., it is not listed as an argument), then it is necessary to add it as a parameter like this
func FPAdd{N : {32, 64}}(x : bits(N), y : bits(N)) => bits(N)
When first adding constraints to a specification, you might generate a lot of error messages because adding a constraint to a function argument will require that the ASL compiler can show that every call to that function satisfies that constraint. It is common that resolving one constraint error requires the addition of a constraint to some of the callers of the constrained function. This has a reverse cascade effect: you need to keep adding constraints all the way up the call tree until you reach the point where the constrained value originates.
It is often easier to start at the top of the call tree and work your way down the tree to the leaves (where the constraints are needed).
Adding constraints on immutable variables
If you have added constraints on function arguments and parameters, then the ASL compiler will infer most of the constraints on immutable variables (i.e., variables declared with the let keyword) and there is usually no need to explicitly add a constraint. For example, given that x : integer {0..15}, for the let-assignment
let y = x * 2 + 3;
the ASL compiler infers that y : integer {3..33}.
This inference usually works but will fail if you assign the result of a user-defined function that has an unconstrained result type or if the expression is especially complicated.
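One way to picture this kind of inference is as interval arithmetic over the constraint ranges. The Python sketch below is an assumption about the general idea, not a description of how ASLi implements it (ASLi also handles sets of values, which this model ignores); the Range class and lift helper are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Range:
    """A closed integer range lo..hi, standing in for integer {lo..hi}."""
    lo: int
    hi: int

    def __add__(self, other):
        other = lift(other)
        return Range(self.lo + other.lo, self.hi + other.hi)

    def __mul__(self, other):
        other = lift(other)
        # With negative bounds the extremes can come from any corner.
        corners = [a * b for a in (self.lo, self.hi) for b in (other.lo, other.hi)]
        return Range(min(corners), max(corners))


def lift(value):
    """Treat a plain integer constant as a one-element range."""
    return value if isinstance(value, Range) else Range(value, value)


x = Range(0, 15)
y = x * 2 + 3
print(y)  # Range(lo=3, hi=33)
```

A call to a function with an unconstrained result has no range to propagate, which matches the failure case described above.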
Note that if you add an explicit type to a variable declaration, then the variable is unconstrained even if ASL was able to infer a constraint. For example, if you write
let y : integer = x * 2 + 3;
then the addition of “: integer” disables inference of a constraint on “y”.
Adding constraints on mutable variables
It is usually possible to infer the constraint on an immutable integer variable because it has a single assignment. It is usually not possible to infer a useful constraint on a mutable integer variable because that would require examination of all assignments to that variable.
In particular, given this statement
var count = 0;
we would not want ASL to infer that count has the constraint {0} since that would prevent us from setting count to any other value, which is probably not what is intended.
Instead, we should write something like this
var count : integer {0..N};
A further complication with using mutable variables is that they are commonly used inside a loop. For example
var count : integer {0..N} = 0;
for i = 0 to N-1 do
if x[i] == '1' then
count = count + 1; // error reported on this line
end
end
In code like this, ASL will report an error on the line that increments count because it cannot see that count will always be in the range 0..N.
In this case, you need to add a “checked type conversion”.
count = (count + 1) as {0..N};
The notation ... as {0..N} acts like a “type cast” in C/C++ except that it inserts a runtime check that the expression satisfies the constraint.
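The runtime behavior of a checked type conversion can be modelled as a function that either returns its argument unchanged or fails. The Python sketch below assumes that a failed check behaves like a failed assertion; checked_as and count_ones are hypothetical names used only for this illustration.

```python
def checked_as(value: int, lo: int, hi: int) -> int:
    """Model of `value as {lo..hi}`: pass the value through or fail at runtime."""
    if not lo <= value <= hi:
        raise AssertionError(f"{value} is not in {{{lo}..{hi}}}")
    return value


def count_ones(x: int, n: int) -> int:
    """Model of the loop above: count the set bits among the low n bits of x."""
    count = 0
    for i in range(n):  # i = 0 to N-1
        if (x >> i) & 1:
            count = checked_as(count + 1, 0, n)
    return count


print(count_ones(0b1011, 4))  # 3
```

In this loop the check can never fail, because count is incremented at most n times; the conversion exists only because the type checker cannot establish that on its own.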
As you have probably realized, adding constraints on mutable variables can take a bit of effort. (This is one reason that we recommend using immutable variables as much as possible.)
Problems generating C code
The ASLi tool includes an interpreter that can execute any correct ASL program. However, most specifications written in ASL are compiled to C/C++ so that they can be linked to other tools and to improve performance. The ASL compiler is not able to compile all ASL code to C/C++ and it is important to understand what can cause these problems and how to fix them.
Unsupported features
The ASL compiler does not support any language feature that would require dynamic memory allocation. This affects support for integer, real and string.
-
Although the ASL language says that integer values are unbounded (i.e., they can be as large as they need to be), the ASL compiler imposes a maximum size on unconstrained integers. This size can be controlled at compile time. At present, overflowing an integer does not generate an error message.
We are in the process of improving this for “constrained integer” types such as
var x : integer {0..2^1000-1};
Once finished and deployed, a constraint like this will cause the compiler to represent the variable x using the smallest size that can hold the variable (in this case 1000 bits) even if that size is larger than the default integer size.
-
The ASL language has a type real that represents unbounded rational numbers. These are not supported because we do not use them in our specifications and they would require dynamic memory allocation.
-
The only strings supported are string literals like "Hello, World!" and the only operation supported on strings is to print them.
In particular, string concatenation and formatting numbers as strings are not supported.
Bitwidth polymorphism
We refer to a function like func F(x : bits(8)) => bits(16) (where all bitvectors have concrete size) as “monomorphic” and a function like func G(x : bits(N)) => bits(2*N) (where some bitvector has a variable size) as “polymorphic”. We also refer to a function call like “G(y)” as “monomorphic” if ‘y’ has a concrete size and “polymorphic” if ‘y’ has a variable size.
As the ASL compiler generates C code, it attempts to transform polymorphic functions into monomorphic functions. This process (called “monomorphization”) works by finding monomorphic calls to polymorphic functions and creating “monomorphic instances” of those functions. For example, a polymorphic function like this
func G(x : bits(N)) => bits(2 * N)
begin
return ZeroExtend(x, 2*N);
end
can be transformed into a monomorphic function like this
func G8(x : bits(8)) => bits(16)
begin
return ZeroExtend(x, 16);
end
that can be used in place of any monomorphic call to G with bitwidth 8.
Note that, in function “G”, the call to “ZeroExtend” was polymorphic but, in the function “G8”, the call to “ZeroExtend” is monomorphic. As a result, it would now be possible to create a monomorphic instance of “ZeroExtend”. It is common for monomorphization to have a “cascade effect” where monomorphizing one function “F” can enable the monomorphization of several other functions “G1”, “G2”, “G3” (say) and monomorphization of each “Gi” can enable monomorphization of even more functions “H1”, “H2”, …
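The cascade can be pictured as a worklist that keeps creating instances until no new (function, bitwidth) pair appears. The Python sketch below is a simplified model and not ASLi's algorithm: it assumes each function has a single bitwidth parameter, and the call graph in CALLS (which functions call which, and at what width) is invented for illustration.

```python
from collections import deque

# Hypothetical call graph: for each polymorphic function, the functions it
# calls and the callee's bitwidth as a function of the caller's bitwidth.
CALLS = {
    "F": [("G1", lambda n: n), ("G2", lambda n: 2 * n)],
    "G1": [("H1", lambda n: 2 * n)],
    "G2": [],
    "H1": [],
}


def monomorphize(roots):
    """Return every (function, width) instance reachable from monomorphic calls."""
    instances = set()
    work = deque(roots)
    while work:
        name, width = work.popleft()
        if (name, width) in instances:
            continue  # this instance already exists
        instances.add((name, width))
        # Inside the new instance every call now has a concrete width,
        # so each callee can be instantiated in turn.
        for callee, width_of in CALLS.get(name, []):
            work.append((callee, width_of(width)))
    return instances


print(sorted(monomorphize([("F", 8)])))
```

If the root call to “F” never gets a concrete width, the worklist starts empty and none of the other instances are created, which is the failure pattern described in the diagnosis section.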
Most of the time, the monomorphization process works smoothly and you need not think about it. However, sometimes monomorphization fails and you need to help the ASL compiler.
(Note: similar issues apply when using C++ templates and you may be able to adapt techniques that you would use with C++ code.)
There are two steps to solving monomorphization issues: diagnosis and cure. There are two main techniques to cure monomorphization issues: helping the compiler recognize bitwidth parameters, and case splitting.
Diagnosing monomorphization issues
Diagnosis can be challenging because of the cascade effect. For example, if the ASL compiler is unable to monomorphize functions “F”, “G2” and “H3”, the root cause of this problem may be that problems monomorphizing “F” prevented the usual cascade that would have enabled it to monomorphize “G2” and “H3”. In this example, the wrong thing to do would be to try to fix “H3” or “G2”: we need to identify the topmost function that was not monomorphized.
To aid in diagnosis, the ASL compiler groups monomorphization failures according to the function call tree. You should always focus your attention on functions nearer the top of the call tree since that will almost always resolve problems with functions further down the call tree.
Curing monomorphization: Help the compiler recognize bitwidth parameters.
In a function with a type like “func G(x : bits(M)) => bits(N)”, it is clear to the compiler that both “M” and “N” are bitwidths and that finding concrete values of “M” and “N” such as “8” and “32” would enable “G” to be monomorphized.
But, sometimes a function has an argument that will be used as a bitwidth parameter but that is not used as a bitwidth in the function’s arguments or results. For example, consider a function like this that is monomorphic but contains a polymorphic call to “Write_Memory”.
func Push(address_size : integer, x : bits(32))
begin
let address = RSP[0 +: address_size] - 4;
Write_Memory(address, x);
RSP[0 +: address_size] = address;
end
If the ASL compiler knew that address_size was a bitwidth parameter, then a call to Push(16, y) would be monomorphized and this would enable a cascade effect that would monomorphize the call to Write_Memory. We can hint that address_size is a bitwidth parameter by listing all the bitwidth parameters to the function like this
func Push{address_size}(address_size : integer, x : bits(32))
begin
// remainder of function is unchanged from above
end
Curing monomorphization: Use ‘case splitting’.
Some bitwidth parameters are calculated dynamically. For example, a bitwidth may be determined by the value in a register, or by a field or prefix of an instruction.
In this case, it is necessary to split the code into several different cases each of which involves constant bitwidths.
For example, some floating point instruction sets use 1 bit in the instruction encoding to choose between single and double precision floating point. The code might look like this (where “D” contains the value of this instruction field).
...
let N = if D == '1' then 64 else 32;
let src1 = FP[x][0 +: N]; // read first operand
let src2 = FP[y][0 +: N]; // read second operand
let result = FPAdd(src1, src2);
FP[z] = ZeroExtend(result, 64); // write result
To enable this code to be monomorphized, we must duplicate the last 4 lines and move them inside the if-then-else like this.
...
if D == '1' then
let src1 = FP[x][0 +: 64]; // read first operand
let src2 = FP[y][0 +: 64]; // read second operand
let result = FPAdd(src1, src2);
FP[z] = ZeroExtend(result, 64); // write result
else
let src1 = FP[x][0 +: 32]; // read first operand
let src2 = FP[y][0 +: 32]; // read second operand
let result = FPAdd(src1, src2);
FP[z] = ZeroExtend(result, 64); // write result
end
This case splitting is unfortunate because it hides the similarities between the two cases and makes the specification longer. With a little forethought, we can mitigate this by
-
Moving the code that needs to be duplicated into a function so that there is less duplication
-
The primary cause of this problem in most ISAs is instruction decode where we read an instruction from memory and the instruction encoding determines data sizes. Fortunately, we usually generate the instruction decode tree automatically from a list of instruction encodings so it is often possible to modify the decode tree generator to also insert case-splitting code. This avoids the costs of hand-writing the decoder and most people will not be reading machine-generated code like a decode tree since the real specification of the encodings is the list of encodings.
Runtime errors in ASL specifications
In ASL, the following conditions are errors
-
Bitslice operations that go beyond the start or end of a bitvector. For example, x[-1] or x[64] if x is a 64-bit bitvector.
-
Array operations that go beyond the start or end of an array. For example, a[-1] or a[32] if a is a 32-entry array.
-
Assigning a constrained integer a value that is not in the constraint set. For example, x = 16; if x has type integer {32, 64}.
-
Division by zero.
-
Calling Log2 on a number that is not a power of two.
-
Reading a variable or part of a variable that has not been initialized or assigned to.
-
Infinite loops.
-
Case statements where no case-alternative matches the value of the discriminating expression.
-
Failing the condition on an assert statement.
-
Executing the function Unreachable.
-
Throwing an ASL exception that is not caught.
If any of these can occur in the specification, it indicates an error in the specification and it is not possible to reason about what the processor will do.
At the moment, all of these (except infinite loops and reading uninitialized variables) are detected by the interpreter. However, the only errors detected by the C/C++ compiler are
- case statements where no case-alternative matches
- failing an assertion
- executing the function Unreachable
- throwing an ASL exception that is not caught
Until this is fixed, we recommend testing the specification with the interpreter and exercising additional care during code review.
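When reviewing code with these conditions in mind, it can help to think of each operation as carrying an explicit check. The Python sketch below models two of them, an out-of-range bit index and Log2 of a non-power-of-two; it is an illustration only, the helper names are hypothetical, and the ASL library defines the real behavior.

```python
def get_bit(x: int, width: int, i: int) -> int:
    """Model of x[i] on a width-bit bitvector, rejecting out-of-range indexes."""
    if not 0 <= i < width:
        raise IndexError(f"bit index {i} is outside 0..{width - 1}")
    return (x >> i) & 1


def log2_exact(n: int) -> int:
    """Model of Log2(n), rejecting any n that is not a power of two."""
    if n <= 0 or (n & (n - 1)) != 0:
        raise ValueError(f"{n} is not a power of two")
    return n.bit_length() - 1


print(get_bit(0b100, 64, 2))  # 1
print(log2_exact(64))         # 6
```

Calls such as get_bit(x, 64, 64) or log2_exact(12) fail in this model, mirroring the errors that the interpreter reports but the compiled C/C++ currently does not.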
Assertions
Failing an assertion indicates a bug in the specification.
Because failing an assertion indicates a bug in the specification, assertions should only be used to check for things that should not be able to happen even in the presence of buggy or malicious code. Assertions should not be used to check for things that are technically possible but indicate a software issue.
Exception markers
Exception markers indicate whether a function call always throws an exception (marker ‘!’), can throw an exception (marker ‘?’) or, when there is no marker, cannot throw an exception.
Function definitions must be marked with an accurate marker and function calls must have the same marker as the function definition.
If a function is changed in a way that changes the exception marker, all call sites need to be updated with the new exception marker. This is one of the primary design goals of exception markers: helping ASL authors to find all the parts of the specification where a function now behaves differently. In particular, you should be alert for changes that create the following code patterns.
-
Exception thrown after modifying processor state (or memory)
RAX = RAX + 4;
F?(x);
or
RAX = RAX + 4;
F!(x);
This pattern is often problematic because it may not be possible to restart the current instruction after dealing with the cause of the exception.
-
Exception generating dead code
if condition then
    F!(x);
    RAX = Zeros(64); // this code cannot execute
end
// this code cannot execute if `condition` is TRUE
RBX = Zeros(64);
return result;
You might also have naming conventions associated with certain kinds of exception marker (e.g., a function whose primary job is to check for a problem and throw an exception if it exists might have a name that starts with Check). In this case, you might want to rename a function if the marker changes.
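The first pattern, an exception thrown after state has been modified, can be demonstrated with a small model of a stack push. The Python sketch below is an illustration under stated assumptions: registers and memory are plain dictionaries, MemoryFault stands in for an ASL exception, and a write to an unmapped address is assumed to be what throws.

```python
class MemoryFault(Exception):
    """Stand-in for an ASL exception raised by a faulting memory write."""


def write_memory(memory: dict, address: int, value: int) -> None:
    """Write to a mapped address, or throw if the address is not mapped."""
    if address not in memory:
        raise MemoryFault(address)
    memory[address] = value


def push_state_first(regs: dict, memory: dict, value: int) -> None:
    regs["RSP"] -= 4                           # state changes first ...
    write_memory(memory, regs["RSP"], value)   # ... then the call that can throw


def push_state_last(regs: dict, memory: dict, value: int) -> None:
    address = regs["RSP"] - 4
    write_memory(memory, address, value)       # can throw, RSP still unchanged
    regs["RSP"] = address


def rsp_after_fault(push) -> int:
    """Run a push against unmapped memory and report RSP afterwards."""
    regs = {"RSP": 0x1000}
    memory = {}  # nothing is mapped, so every write faults
    try:
        push(regs, memory, 0xDEADBEEF)
    except MemoryFault:
        pass
    return regs["RSP"]


print(hex(rsp_after_fault(push_state_first)))  # 0xffc: RSP was already changed
print(hex(rsp_after_fault(push_state_last)))   # 0x1000: safe to restart
```

Only the second ordering leaves the register file as it was when the instruction started, so the instruction can be restarted once the fault has been handled.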
Exception markers and incomplete specifications
When adding a new feature to the specification, it is common to work top down: adding an instruction that calls a number of helper functions; then adding the helper functions; etc. until the feature has been fully specified.
If one of the helper functions is expected to throw exceptions, you should add the expected exception marker to the helper function even before you have defined the helper function. But doing so may result in code that does not compile because the function definition does not match the marker.
There are two alternatives for dealing with this. The first is somewhat ad hoc and can be used if you rarely hit this problem; the second is recommended if you often hit this issue.
-
Add a conditional or unconditional ‘throw’ statement at the start of the function
Add this for functions with exception marker ‘?’
if FALSE then throw ExceptionTaken; end // dummy throw until this function is completed
or add this for functions with exception marker ‘!’
throw ExceptionTaken; // dummy throw until this function is completed
-
Define three helper functions for use when adding new features
Call this function at the start of stub functions that never throw exceptions
func UnimplementedFeature(message : string)
begin
    print("Unimplemented feature : ");
    print(message);
    println();
    assert FALSE;
end
Call this function at the start of stub functions that sometimes throw exceptions
func UnimplementedFeatureWithCheck?(message : string)
begin
    print("Unimplemented feature : ");
    print(message);
    println();
    if FALSE then throw ExceptionTaken; end
    assert FALSE;
end
Call this function at the start of stub functions that always throw exceptions
func UnimplementedException!(message : string)
begin
    print("Unimplemented feature : ");
    print(message);
    println();
    assert FALSE;
    throw ExceptionTaken;
end