Top Procedures, Subroutines and Functions Database Files
French French German German ItalianItalian SpanishSpanish PortuguesePortuguese JapaneseJapanese KoreanKorean ChineseChinese

Building Types



Type Declarations

FormulaOne has a facility which allows you to give names to types you define, so that they can be used more readably and clearly. These types must be declared before any predicate declarations which use them. Some simple examples of type declarations:
Posint_t =[1..]
Digit_t = [0..9]
Stringlist_t = list S
Weight_t = I
Notice that the names of the types are capitalized, like the names of predicates. As with predicates, this is because they each refer to a definite, static object, the kind of thing we would give a capitalized name to in English.

In this manual, we use the convention that type names end with the characters _t, to distinguish them from other names; but this is not required.

The first three declarations name the subranges of integers which correspond to the positive integers, the ten digits, and the lists of strings. After these declarations have been made, we can use the new names in place of the old. For example, declaring a variable to be of type Posint_t would have exactly the same effect as declaring it to be of type [1..].

The last declaration may seem a little strange; why should we want to declare the type name Weight_t to be something for which we already have a shorter name, I? We may be doing this because we are not sure what kind of variables we want to be weights (integers, long integers, reals, etc.), and we want to get on with writing the rest of the program without deciding it yet. This way, if we change our mind about what a weight is, we'll only have to change one declaration in the whole program.

Tuples

Consider the following type declaration.
Date_t = (year:I,month:[1..12],day:[1..31])
The expression on the right hand side of the equality formula, looks like a pair term with three parts, but it has things that look like variable declarations in the place of the three terms. This expression is called a tuple, and will act as a template for terms of type Date_t.

If a variable is declared to be of type Date_t, the only terms that can be assigned to it of the form (a, b, c), where a is an integer, b is from 1 to 12, and c is from 1 to 31. The parentheses around the term can be omitted when there is no possibility of confusion.

pred ...  iff
     birthday :> Date_t &
     yr :> I &
     ... &
     yr = 1961 &
     birthday = yr, 7, (41-39) &
     ...
The general form of a named tuple type expression is

(var1:type1, var2:type2,...., varn:typen)

The variable names declared within the parentheses are called Fields, and can be of predefined types themselves:
Date_t = (year:I,month:[1..12],day:[1..31]) 
Person_t = (name:S,age:I,birthdate:Date_t)
Each type expression represents some subset of the universal type, U. The declared type Date_t represents all the terms which have the form of values for variables of type Date_t, just as the type list I represents all the pair terms which are Nil-terminated lists of integers.

However, the representation of tuple type values within the program's memory is a different story. While list terms are represented as separate elements connected by pointers into a linked list, the values representing tuple fields are stored in contiguous memory locations.

So in many respects, such terms are implemented like the records of Pascal and the structures of C. They also have the efficiency that goes along with them.

Field Selection Terms and Deconstructors

The fields of the Date_t type, year, month, and day, can be referred to individually. With the variable declaration in the last section,
birthday:>Date_t
the year field of the birthday variable can be referred to by the term birthday.year, and similarly for birthday.month and birthday.day. Such terms are called field selection terms. We can continue the example of the last section as follows:
... &
birthday2:>Date_t &
birthday2 = (birthday.year - 1, birthday.month, 1) &
... &
Note that an output variable of a tuple type must be initialized all at once, not field-by-field. Before the whole variable is initialized, its individual fields cannot be referred to.

Another way you can refer to fields of a tuple variable is to use a deconstructor term. This is a term of the same "shape" as the tuple variable, but with variables in places corresponding to fields you want to match.

...   &
birthday2 = (yr2, mo2, dy2) &
Print(mo2, dy2) &
...   &
After the equality formula, the variable yr2 will act exactly as the term birthdayl.year, and similarly for mo2 and dy2. This means not only that these variables have the same value, but that they refer to the same location in computer memory. Deconstructors are handled by the FormulaOne compiler in such a way that no extra memory is allocated for them.

Here are two versions of a predicate which succeeds if and only if its first parameter is a date which is before its second parameter. The first uses field selection terms and or ( | ), while the second uses deconstructors and if; you can decide for yourself which you prefer.

pred Before(datel :< Date_t, date2 :< Date_t) iff
    {'datel' is before 'date2'}
    datel.year < date2.year |
    (datel.year = date2.year &
    (datel.month < date2.month |
    (datel.month = date2.month & datel.day < date2.day)))

proc Before1(datel :< Date_t, date2 :< Date_t) iff
    {'datel' is before 'date2'}
    datel = (yl, ml, dl) &
    date2 = (y2, m2, d2) &
    if yl > y2 then
       false
    elsif yl = y2 then
       if ml > m2 then
          false
       elsif ml = m2 then
          dl < d2
       end
    end
The following queries should succeed with either version of this predicate:
all Before((1987, 8, 8), (2001, 1, 1))
Before1((1961, 7, 2), (1961, 10, 29))
Field-selection operators can be chained together to go as deep as you want into a tuple's structure. The following example makes use of the Person_t type, which we used to illustrate nested tuples above.
proc Same_birthday(p1 :< Person_t, p2 :< Person_t) iff
    p1.birthdate.month = p2.birthdate.month &
    p1.birthdate.day = p2.birthdate.day

Enumerated Types

Enumerated types are a way for programmers to introduce their own names for terms 3 a program.
Gender_t = Male | Female
Colour_t = Red | Yellow | Green | Blue
These declarations mean that variables of type Gender_t can have only Male and male as values, and that those of type Colour_t can have only Red, Yellow, Green, and Blue. Note that this is not the same as saying that Gender_t variables can have the strings 'Male' and 'Female' as values. Male and Female are actually declared as new constant terms (with name identifiers because they are static objects), just as Gender_t itself is declared as a new type name.

Enumerated types are used extensively to tailor data types to be more readable. Here's example of a structure for a genealogical database.

Gender_t = Male | Female
P_data_t = (name:S,          {person's name}
            gender:Gender_t,
            b_date:Date_t,   {date of birth}
            d_date:Date_t,   {date of death}
            comment:S)
Enumerated type values are really just new names for the integers starting at 0; for instance, Red is just 0, Yellow is 1, Green is 2, and Blue is 3. This is also how they are represented internally. However, FormulaOne enforces syntactic restrictions on programs which ensure that variables of an enumerated type get assigned only terms of that type. So enumerated types give no extra programming power, just increased readability.

Unions

Enumerated types are actually just a special case of type expressions called unions. Union type expressions are much like C unions, or Pascal case variant types.
Vehicle_t = Tricycle | Car(doors:I, engine_size:I) | Bicycle(gears:I) | Motorcycle
The above declaration defines Tricycle and Motorcycle as new constant terms of type Vehicle_t (represented by 0 and 3), as with enumerated types. But it also defines all expressions of the form Car (a, b) and Bicycle(a), where a and b are integer terms, to be terms of type Vehicle_t.
pred ...  iff
     v1 :> Vehicle & v2 :> Vehicle &
     ... &
     (v1 = Motorcycle | v1 = Car(4, 4000)) &
     v2 = Bicycle(v1.doors) &
     ...
Note the use of the field-selection feature in the last formula above. When we refer to a field of a union type, we do it without referring to the variant name (like Motorcycle or Car). Whenever we select fields of a union variable, FormulaOne inserts an implicit test just before the formula containing the field term, to make sure that the variant name is the one containing the field name. So the formula
v2 = Bicycle(v1.doors)
acts just like the formula
v1 = Car(d,s) & v2 = Bicycle(d)
The field names appearing in different variants of a union type expression must therefore be different, so that FormulaOne can tell which variant a union variable field is in. Since union type definitions can be recursive, we can create important data structures in FormulaOne, such as trees.
Tree_t = Nulltree |
Node(node_val:I, left:Tree_t, right:Tree_t)
Recursively-defined types lend themselves naturally to recursive procedures. Consider this one for inserting a leaf into a structure of type Tree:
proc Insert_tree(x :< I, tree :. Tree_t) iff
    case tree of
        Nulltree       => tree := Node(x, Nulltree, Nulltree);
        Node(nv, l, r) => if x = nv then {x is already there}
                                true
                             elsif x < nv then
                                Insert_tree(x, tree.left)
                             else
                                Insert_tree(x, tree.right)
                             end;
       end
The following queries insert different values into the tree x:
x := Node(5,Node(2,Nulltree,Nulltree),Nulltree) & Insert_tree(7,x)
x := Node(5,Node(2,Nulltree,Nulltree),Nulltree) & Insert_tree(3,x)
x := Node(5,Node(2,Nulltree,Nulltree),Nulltree) & Insert_tree(4,x)
The results in the same order are:
x = Node(5,Node(2,Nulltree,Nulltree), Node(7,Nulltree,Nulltree))
x = Node(5,Node(2,Nulltree,Node(3,Nulltree,Nulltree)), Nulltree)
x = Node(5,Node(2,Nulltree,Node(4,Nulltree,Nulltree)), Nulltree)
Note the use of tree.left and tree.right in the recursive calls to Insert_tree. When a formula like
tree = Node(nv, l, r)
is processed, if the variable on the left is of any mode except input/output, the compiler would consider the parts of the union term on the right to be just synonyms for the fields of the variable. That is, the compiler would have associated the variable nv with the exact storage location assigned to tree.nodeval, I with tree.left, and so on.

But when the variable is of input/output mode, a formula like this causes FormulaOne to make a separate copy of the fields of the tree, and assign the new variables to those copies. So if we had used l and r instead of tree.left and tree.right, we would have been referring to copies of the subtrees of tree, instead of the actual parts that we need to replace in the recursive calls.

In the case of input/output tuple or union types, only field selection terms refer to parts the of the original structures.

Arrays

It's often useful for us to store a number of variables of the same type together in a single variable, and to use another index variable to pick out which one we are referring to. This data structure is called an array, and appears in most programming languages (though not in most Prologs). An array can be seen as a data structure which associates one value of a given element type with each value of an index type. So in FormulaOne, array types are declared using the standard mathematical "mapping" notation:
P_data_at = [0..3] -> P_data_t
This declaration says that a variable of type P_data_at is an array with elements of type P_data_t, indexed by elements of type [0..3]; that is, a set of 4 elements of the P_data_t type. The following type is an array of three arrays of four integers each:
Matrix_t = [0..2] -> [0..3] -> I
The arrows are right-associative, so that the indices of an array of type Matrix_t are of type [0..2] and the elements are of type [0..3] -> I.

The syntax for selecting elements of the array is the one that most programming languages have. The expression v(i), where v is a term of an array type, and i is a term whose value is of the index type of v, stands for the i-th element of the array v. (For nested arrays, such as Matrix_t above, the index expressions can be grouped within the parentheses; for example, m(l,2) means the same thing as m(l)(2).)

proc Youngest(pers_arr :< P_data_at, y_name :> S) iff
    {the name of the youngest person in 'pers_arr' is 'y_name'}
    Youngest1(pers_arr, 1, pers_arr(0), y_name)

proc Youngest1(pers_arr :< P_data_at,curr :< I ,y_sofar :< P_data_t, y_name :> S) iff
    { The youngest in 'pers_arr' before index 'curr' is 'y_sofar' and the name of the 
      youngest overall is 'y_name' }
    if curr >= Len(pers_arr) then
       {we have reached the end of the array}
       y_name = y_sofar.name
    elsif
       Before1(pers_arr(curr).b_date, y_sofar.b_date)
    then
       {current is older than youngest so far}
       Youngest1(pers_arr, curr+1, y_sofar, y_name)
    else
       {current is youngest}
       Youngest1(pers_arr, curr+1, pers_arr(curr), y_name)
    end
See next section for a query using the predicate Youngest1.

The expression pers_arr(curr) above refers to the curr-th element of the pers_arr array, which is a term of type P_data_t. Since it is a term of type P_data_t, fields can be selected from it like any other such term; so pers_arr(curr).name is its name, etc.

The elements of the array are represented side-by-side in computer memory, and there can obviously be only a finite number of them for any one array. So the type expression that appears to the left of the -> in an array type expression cannot be just any type; it normally must be a finite type, except in the cases noted in flexible arrays below. A finite type is defined as being an enumerated type, a subrange of the integers ([m..n]), or any defined type whose definition is a finite type. For now, an array index type which is a subrange must start with 0; but this restriction may be lifted in later versions of FormulaOne.

Array Constants and Initializing

An array variable can be assigned a value, like any other variable. But because of the nature of arrays, we have to use a special syntax to denote a value assigned to an entire array. An array constant is just a list of constants of the element type, separated by commas and enclosed by square brackets.
pred Wackafoo(ca :< I, cb :< Date_t) iff
    parents:>[0..l]->Person &
    matrix:>[0..2]->[0..2]->I &
    ... &
    parents = [('Charles', ca, cb),('Diana', 26, (1961, 7, 1))] &
    matrix = [[8, 1, 6], [3, 5, 7], [4, 9, 2]] &
    ...
Note that these array constants can be nested, as in the matrix example above. We can construct an array of type P_data_at by the following predicate
proc Construct_arr(pers_arr :> P_data_at) iff
    pers_arr = [('Johann',Male,(1800,5,15),(1887,2,10),'friend of Wagner and Liszt'),
                ('Theresa',Female,(1897,2,1),(1979,8,14),'moved to USA in 1920'),
                ('Dagmar',Female,(1834,4,29),(1915,9,23),'had twins twice in a row'),
                ('Thomas',Male,(1889,7,11),(1971,1,28),'was FDRs personal physician')]
Now we can determine the youngest person by running the query
Construct_arr(pers) & Youngest(pers,name)
which should find the solution
name = Theresa
An array output or input/output variable must be initialized on its first use, like any other variable of those modes. This may pose some problems. Some arrays have so many elements, or elements which are so complex, that the array constants we can use as values for them can become excessively large. So FormulaOne has a predefined procedure, Dupl, that can make such initializations easier. It takes three arguments: the number of elements in the array; the value to duplicate; and the variable to receive the value. Dupl is a function, so functional notation can be used with it too.
pred ... iff
parents:.[0..39]->Person &
row:>[0..9]->I & matrix:.[0..9]->[0..9]->I &
... &
Dupl(40, ("Null", 0, (1900, 0, 0)), parents) &
row = Dupl(10, 0) & Dupl(10, row, matrix) &
...
Dupl is a polymorphic procedure because it accepts different types of parameters. For all types T, Dupl acts like a procedure with the following declaration:
proc Dupl(len :< I, val :< T, arr :> [0..]->T) iff
     ...

Flexible Arrays

The arrays we have been considering up to now are fixed arrays: we know how many elements are in them from their declarations. Arrays can also be flexible, however, in that the number of elements in them is set only at the time a query is run.

A flexible array is any array whose index type is [0..]. This is the only exception to the rule that an array index type must be finite. The variable declaration

x :> [0..] -> I
declares an output flexible array with element type I. The length of the array will be known only at the time x is initialized; the initializing formulas
x = [9, 99, 999]
Dupl(10, 0, x)
would initialize x to have 3 elements and 10 elements, respectively. Flexible arrays are most often used as types of formal arguments for predicates which can process actual argument arrays of arbitrary length. The following procedure Asum adds together elements of a flexible array:
Flex = [0..]->I

proc Asum(a :< Flex, sum :> I) iff
    Asum1(a,Len(a)-1,0,sum)

proc Asum1(a :< Flex, index :< I, accum :< I, sum :> I) iff
    if index >= 0 then
        Asum1(a,index-1,accum + a(index),sum)
    else
        sum = accum
    end
Note the use of the length predicate Len to determine the upper bound of the flexible array a, which is then passed to the auxiliary procedure Asum1. Incidentally, the procedure Asum1 is written in the so-called accumulator style where the sum is being accumulated via the input argument accum and is returned through the argument sum at the end of the recursion. This, rather than the conceptually simpler and more straightforward recursion, has the advantage of being tail-recursive. The compiler of FormulaOne compiles the procedure Asuml as a loop. See the section Tail Recursion Elimination for more information.
A typical use of the procedure Asum would be with arrays which have fixed upper bounds:
a:>[0..99]->I & Initialize(a) & sum = Asum(a)
The array a is declared to contain 100 elements, it is assigned a value by the procedure Initialize and it is then passed to the procedure Asum.
There are three basic methods of assigning values to flexible arrays. The first one uses an explicit array constructor:
a :> Flex & a = [2,3,4,5]
The variable a is given a four element array as a value. This method works when the size of the array is small and the length of the array is known during the compile time. Flexible arrays are used most often in dynamic situations where the size of the array is given by a variable rather than by a constant.

The second method of creating flexible arrays uses input/output array variables:

...
a:.Flex & Dupl(n,0,a) & Set(a,0)
...

proc Set(a :. Flex, index :< I) iff
    if index < Len(a) then
        a(index) := index & Set(a,index + 1)
    end
The input/output array a is first allocated to contain n elements (all initialized to zeros). The elements are then set by the procedure Set. This method of incremental assignment of element values works with input/output arrays (where it is possible to change a single element) or with symbolic arrays (where constraints can be set to the arbitrary elements). Flexible output arrays must be given full values at one time. This is possible by the third method of using a cast to convert a list to an array:
a = lst:Flex
Here lst is an already constructed list of integers.

The standard procedure Dupl which allocates flexible arrays is used mostly to initialize input/output arrays. FormulaOne allows the use of the standard predicate Len as a constraint to set the length of a flexible symbolic array:

a::Flex & Len(n,a)
Here the input integer variable n gives the size to the flexible symbolic array a.

The Universal Type

We have indicated before that a universal type U exists, a type of which every member of the FormulaOne universe is a member. Up to now, we have not studied how to use this universal type in programs, because it's implemented as a union type. Now we can study it.

The name of the universal type is U, and it looks to programs as if it has been defined with the following recursive definition:

U = R(r:R) | P(h:U, t:U)

Clearly, this definition is not a valid FormulaOne type definition, since R is used as a variant name as well as a type. But U is treated in a special way, so an exception is made for it. U is special because any object, of no matter what type, is also an object of type U. The structure of U is exactly the structure of the FormulaOne universe: all real numbers, and all pairs that can be built up from them. Thus, if a predicate parameter is of type U, any term can be given as an argument to that parameter in a predicate call.

Consider the following definition, of a procedure which "reverses" any list.

proc Reverse(x :< U, y :> U) iff
    Revl(x, Nil, y)

proc Revl(x :< list U, sofar :< list U, y :> list U) iff
    case x of
        Nil    => y = sofar;
        (h, t) => Revl(t, (h, sofar), y);
    end
We have given the Reverse procedure parameters of type U so that any term can be given as a parameter. However, a procedure call like Reverse(a,b) will fail if a is not of some list type, because when it is passed along as an argument to Revl, it will fail to be converted into a list. So the query
x :> I & x = 10 & Reverse(x, y)
should fail, while the query
x :> list I & y :> list I & x = (3,2,1,Nil) & Reverse(x,y)
should return y = (1, 2, 3, Nil). Note the automatic conversion from the type list I to the universal type U and back inserted by the FormulaOne compiler.

Type Conversions

But what exactly is the mechanism by which terms are converted from one type (say list I) to another (say U)? There are actually two different ways this kind of type conversion is done in FormulaOne programs: type coercion, which happen automatically, and type casts, which are explicitly specified by the programmer. A term of the form

term : type

is a type cast term. Its value, if it exists, is the given term converted to the given type. Logically, the effect of a type cast is to insert a test before whatever formula it appears in, checking that the term is a member of the given type. If the test fails, the formula containing the cast fails.

Computationally, however, the type cast term has the effect of taking the value of the term and deriving a term in the representation of the type; this new term is the term used.
For instance, if v is of the type Vehicle declared earlier, then the formula

Fib5(v:I, y)
will pass a version of v to Fib5, in the representation of integers - if v can be so represented. As it turns out, the only terms of type Vehicle which can be represented as integers are Tricycle and Motorcycle, which have the values 0 and 3. If v has any other value, the type cast will fail and so the call to Fib5 will fail.

Similarly, if v is of type I, then the term v:L will stand for the value of v, represented in the unlimited precision-bit representation of L variables rather than the 32-bit representation of I variables. But in fact it's unnecessary to give a type cast term in this case, because this cast is one of the many that are done automatically by type coercion. Type coercion are done automatically between U and any type, and between L, I, R, and subranges. That is, if a variable of type L is assigned a value of type I, or a parameter of type L gets an argument of type I, no explicit type cast is necessary.

More complex data types can be converted as well. For example, it is possible to convert a list into an array or vice-versa. This can be useful in many situations, as it is generally easier to create lists but accessing array elements is easier and more efficient.

x :> list I & x = 1,2,3,4,5,Nil & y = x:[0..]->I 
x = 1,2,3,4,5,Nil
y = [1,2,3,4,5]
Success  
If the conversion is not possible, the conversion will fail. The following query will fail because the list x contains more elements than the array y:
x :> list I & x = 1,2,3,4,5,Nil & y = x:[0..3]->I 
Failure   
The list elements themselves can be of arbitrary type. For example, it is possible to convert a list of lists into an arrray of arrays or into an array of lists or into a list of arrays.
This example converts a list of lists into an array of arrays:
x:>list(list I) & x = (1,2,3,4,Nil),(5,6,Nil),(7,8,9,Nil),Nil & y = x:[0..]->[0..]->I
The above query will generate the following result:
x = (1,2,3,4,Nil),(5,6,Nil),(7,8,9,Nil),Nil
y = [[1,2,3,4],[5,6],[7,8,9]]
Success  
This example converts a list of lists into an array of of lists:
x:>list(list I) & x = (1,2,3,4,Nil),(5,6,Nil),(7,8,9,Nil),Nil & y = x:[0..]->list I
The above query will generate the following result:
x = (1,2,3,4,Nil),(5,6,Nil),(7,8,9,Nil),Nil
y = [(1,2,3,4,Nil),(5,6,Nil),(7,8,9,Nil)]
Success   
This example converts a list of lists into an list of arrays:
x:>list(list I) & x = (1,2,3,4,5,Nil),(5,6,Nil),(7,8,9,Nil),Nil & y = x:list([0..]->I) 
The above code will generate this result:
x = (1,2,3,4,5,Nil),(5,6,Nil),(7,8,9,Nil),Nil
y = [1,2,3,4,5],[5,6],[7,8,9],Nil
Success   

Strings can be converted into arrays of ASCII characters or lists of ASCII characters. The following code demonstrates how to convert a string into an array of ASCII characters:
x :> S & x = 'Beatles' & y = x:[0..]->I
resulting in:
x = Beatles
y = [66,101,97,116,108,101,115]
Success
Reverse conversion works as well, the following code converts an array of ASCII characters into a string:
y :> [0..]->I & y = [66,101,97,116,108,101,115] & z = y:S
with the expected result:
y = [66,101,97,116,108,101,115]
z = Beatles
Success                       
Most conversions are performed at program run-time (as opposed to compile time), so a certain perfomance penalty may be expected. Incidentaly, allowing conversions between lists and arrays necessitated allowing arrays of less than two elements.

The automatic coercion between U and other types is particularly important, because it allows programmers to write predicates (like Reverse above) which take any term as a parameter. Languages with strict type structures, like Pascal or Prolog, would disallow such predicates, because there is no notion of universal type in those languages.




Top Procedures, Subroutines and Functions Database Files