Chapter 2: Type-Level Computation
The last chapter reviewed some Ur features imported from ML and Haskell. This chapter explores uncharted territory, introducing the features that make Ur unique.
Names and Records
Last chapter, we met Ur's basic record features, including record construction and field projection.
val r = { A = 0, B = 1.2, C = "hi" }
r.B
== 1.2
Our first taste of Ur's novel expressive power is with the following function, which implements record field projection in a completely generic way.
fun project [nm :: Name] [t ::: Type] [ts ::: {Type}] [[nm] ~ ts] (r : $([nm = t] ++ ts)) : t = r.nm
project [#B] r
== 1.2
This function introduces a slew of essential features. First, we see type parameters with explicit kind annotations. Formal parameter syntax like [a :: K] declares an explicit parameter a of kind K. Explicit parameters must be passed explicitly at call sites. In contrast, implicit parameters, declared like [a ::: K], are inferred in the usual way.
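For instance, here is a small sketch of the difference, using two identity functions of our own invention (explicitId and implicitId are not library names). The first declares its constructor parameter explicitly, so callers must supply the type argument in square brackets; the second declares it implicitly, so the argument is inferred.
(* illustrative definitions, not from the standard library *)
fun explicitId [t :: Type] (x : t) : t = x
fun implicitId [t ::: Type] (x : t) : t = x
explicitId [int] 0
== 0
implicitId "hi"
== "hi"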
Two new kinds appear in the type of project. We met the basic kind Type in a previous example. Here we meet Name, the kind of record field names, and {Type}, the kind of finite maps from field names to types. We'll generally refer to this notion of "finite map" as a record, since it will be clear from context whether we mean type-level or value-level records. Note that here we are dealing with names and records at the level of types, which exist only at compile time! By the way, the kind {Type} is one example of the general {K} kind form, which refers to records whose fields have kind K.
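As a quick illustration (these bindings are our own, not library definitions), we can write down inhabitants of both new kinds directly with con declarations, using #A for the first-class field name A, a syntax we return to shortly.
(* illustrative bindings *)
con a :: Name = #A
con fields :: {Type} = [A = int, B = float, C = string]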
The English description of project is that it projects a field with name nm and type t out of a record r whose other fields are described by type-level record ts. We make all this formal by assigning r a type that first builds the singleton record [nm = t] that maps nm to t, and then concatenates this record with the remaining field information in ts. The $ operator translates a type-level record (of kind {Type}) into a record type (of kind Type).
The type annotation on r uses the record concatenation operator ++. Ur enforces that any concatenation happens between records that share no field names. Otherwise, we'd need to resolve field name ambiguity in some predictable way, which would force us to treat ++ as non-commutative, if we are to maintain the nice modularity properties of polymorphism. However, treating ++ as commutative, and treating records as equal up to field permutation in general, are very convenient for type inference. Thus, we enforce disjointness to keep things simple.
For a polymorphic function like project, the compiler doesn't know which fields a type-level record variable like ts contains. To enable self-contained type-checking, we need to declare some constraints about field disjointness. That's exactly the meaning of syntax like [r1 ~ r2], which asserts disjointness of two type-level records. The disjointness clause for project asserts that the name nm is not used by ts. The syntax [nm] is shorthand for [nm = ()], which defines a singleton record of kind {Unit}, where Unit is the degenerate kind inhabited only by the constructor ().
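To see a disjointness constraint doing its job outside of project, here is a sketch of a generic concatenation wrapper (the name glue is ours, and value-level ++ is the record concatenation operator we revisit later). The [r1 ~ r2] clause is exactly what licenses the use of ++ in both the result type and the body.
(* illustrative definition *)
fun glue [r1 ::: {Type}] [r2 ::: {Type}] [r1 ~ r2] (x : $r1) (y : $r2) : $(r1 ++ r2) = x ++ y
glue {A = 0} {C = "hi"}
== {A = 0, C = "hi"}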
The last piece of this puzzle is the easiest. In the example call to project, we see that the only parameters passed are the one explicit constructor parameter nm and the value-level parameter r. The rest are inferred, and the disjointness proof obligation is discharged automatically. The syntax #A denotes the constructor standing for first-class field name A, and we pass all constructor parameters to value-level functions within square brackets (which bear no formal relation to the syntax for type-level record literals [A = c, ..., A = c]).
Basic Type-Level Programming
To help us express more interesting operations over records, we will need to do some type-level programming. Ur makes that fairly congenial, since Ur's constructor level includes an embedded copy of the simply-typed lambda calculus. Here are a few examples.
con id = fn t :: Type => t
val x : id int = 0
val x : id float = 1.2
con pair = fn t :: Type => t * t
val x : pair int = (0, 1)
val x : pair float = (1.2, 2.3)
con compose = fn (f :: Type -> Type) (g :: Type -> Type) (t :: Type) => f (g t)
val x : compose pair pair int = ((0, 1), (2, 3))
con fst = fn t :: (Type * Type) => t.1
con snd = fn t :: (Type * Type) => t.2
con p = (int, float)
val x : fst p = 0
val x : snd p = 1.2
con mp = fn (f :: Type -> Type) (t :: (Type * Type)) => (f t.1, f t.2)
val x : fst (mp pair p) = (1, 2)
Actually, Ur's constructor level goes further than merely including a copy of the simply-typed lambda calculus with tuples. We also effectively import classic let-polymorphism, via kind polymorphism, which we can use to make some of the definitions above more generic.
con fst = K1 ==> K2 ==> fn t :: (K1 * K2) => t.1
con snd = K1 ==> K2 ==> fn t :: (K1 * K2) => t.2
con twoFuncs :: ((Type -> Type) * (Type -> Type)) = (id, compose pair pair)
val x : fst twoFuncs int = 0
val x : snd twoFuncs int = ((1, 2), (3, 4))
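Since fst and snd are now kind-polymorphic, nothing restricts them to pairs of types. As a further sketch of our own, we can pair up two type-level records and project from the pair, instantiating the same definitions at kind {Type}.
(* illustrative bindings *)
con twoRecs :: ({Type} * {Type}) = ([A = int], [B = float])
val x : $(fst twoRecs) = {A = 0}
val x : $(snd twoRecs) = {B = 1.2}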
Type-Level Map
The examples from the last section may seem cute but not especially useful. In this section, we meet map, the real workhorse of Ur's type-level computation. We'll use it to type some useful operations over value-level records. A few more pieces will be necessary before getting there, so we'll start just by showing how interesting type-level operations on records may be built from map.
con r = [A = int, B = float, C = string]
con optionify = map option
val x : $(optionify r) = {A = Some 1, B = None, C = Some "hi"}
con pairify = map pair
val x : $(pairify r) = {A = (1, 2), B = (3.0, 4.0), C = ("5", "6")}
con stringify = map (fn _ => string)
val x : $(stringify r) = {A = "1", B = "2", C = "3"}
We'll also give our first hint at the cleverness within Ur's type inference engine. The following definition type-checks, despite the fact that doing so requires applying several algebraic identities about map and ++. This is the first point where we see a clear advantage of Ur over the type-level computation facilities that have become popular in GHC Haskell.
fun concat [f :: Type -> Type] [r1 :: {Type}] [r2 :: {Type}] [r1 ~ r2]
           (r1 : $(map f r1)) (r2 : $(map f r2)) : $(map f (r1 ++ r2)) = r1 ++ r2
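As a sanity check on that claim, here is a sketch of an invocation. Since f, r1, and r2 are declared as explicit parameters, we supply them at the call site, while the disjointness obligation is discharged automatically as usual.
concat [option] [[A = int]] [[B = float]] {A = Some 0} {B = Some 1.2}
== {A = Some 0, B = Some 1.2}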
First-Class Polymorphism
The idea of first-class polymorphism or impredicative polymorphism has also become popular in GHC Haskell. This feature, which has a long history in type theory, is also central to Ur's metaprogramming facilities. First-class polymorphism goes beyond Hindley-Milner's let-polymorphism to allow arguments to functions to themselves be polymorphic. Among other things, this enables the classic example of Church encodings, as for the natural numbers in this example.
type nat = t :: Type -> t -> (t -> t) -> t
val zero : nat = fn [t :: Type] (z : t) (s : t -> t) => z
fun succ (n : nat) : nat = fn [t :: Type] (z : t) (s : t -> t) => s (n [t] z s)
val one = succ zero
val two = succ one
val three = succ two
three [int] 0 (plus 1)
== 3
three [string] "" (strcat "!")
== "!!!"
Folders
We're almost ready to implement some more polymorphic operations on records. The key missing piece is folders; more specifically, the type family folder that allows iteration over the fields of type-level records. The Ur standard library exposes folder abstractly, along with the following key operation over it. Don't mind the clutter at the end of this definition, where we rebind the function fold from the default-open module Top, as we must include an explicit kind-polymorphic binder to appease the associated let-polymorphism. (A normal program would omit this definition, anyway; we include it here only to show the type of fold.)
val fold : K --> tf :: ({K} -> Type)
           -> (nm :: Name -> v :: K -> r :: {K} -> [[nm] ~ r] =>
               tf r -> tf ([nm = v] ++ r))
           -> tf []
           -> r ::: {K} -> folder r -> tf r
    = K ==> fold
The type is a bit of a mouthful. We can describe the function arguments in order. First, K is the kind of data associated with fields in the record we will be iterating over. Next, tf describes the type of an accumulator, much as for standard "fold" operations over lists. The difference here is that the accumulator description is not a mere type, but rather a type-level function that returns a type given a properly kinded record. When we begin iterating over a record r, the accumulator has type tf [] (where [] is the empty record), and when we finish iterating, the accumulator has type tf r. As we step through the fields of the record, we add each one to the argument we keep passing to tf.
The next arguments of fold are much like those of normal list fold functions: a step function and an initial value. The latter has type tf [], just as we expect from the explanation in the last paragraph. The final arguments are r, the record we fold over, and a folder for it. The function's return type follows the last paragraph's explanation of accumulator typing.
We've left a big unexplained piece: the type of the step function. In order, its arguments are nm, the current field being processed; v, the data associated with that field; r, the portion of the input record that we had already stepped through before this point; a proof that the name nm didn't already occur in r; and the accumulator, typed to show that the set of fields we've already visited is exactly r. The return type of the step function is another accumulator type, extended to show that now we've visited nm, too.
Here's a simple example usage, where we write a function to count the number of fields in a type-level record of types.
fun countFields [ts :: {Type}] (fl : folder ts) : int =
    @fold [fn _ => int] (fn [nm ::_] [v ::_] [r ::_] [[nm] ~ r] n => n + 1) 0 fl
We preface fold with @, to disable inference of folders, since we have one we'd like to pass explicitly. The accumulator type family we use is a simple one that ignores its argument and always returns int; at every stage of our traversal of input record ts, we keep an integer as our sole state, and the type of this state doesn't depend on which record fields we've visited. The step function binds each type parameter with the notation ::_, for an explicit parameter whose kind should be inferred.
The function countFields is a lot easier to use than it is to define! Here's an example invocation, where we see that the appropriate folder is inferred.
countFields [[A = int, B = float, C = string]]
== 3
If folders are generally inferred, why bother requiring that they be passed around? The answer has to do with Ur's rule that type-level records are considered equivalent modulo permutation. As a result, there is no unique traversal order for a record, in general. The programmer has freedom in constructing folders that embody different permutations, using the functions exposed from the module Folder (see the top of lib/ur/top.urs in the Ur/Web distribution). Still, in most cases, the order in which fields are written in the source code provides an unambiguous clue about desired ordering. Thus, by default, folder parameters are implicit, and they are inferred to follow the order of fields in program text.
Let's implement a more ambitious traversal. We will take in a record whose fields all contain option types, and we will determine if every field contains a Some. If so, we return Some of a "de-optioned" version; if not, we return None.
fun join [ts ::: {Type}] (fl : folder ts) (r : $(map option ts)) : option $ts =
    @fold [fn ts => $(map option ts) -> option $ts]
     (fn [nm ::_] [v ::_] [r ::_] [[nm] ~ r] (f : $(map option r) -> option $r) =>
         fn r : $(map option ([nm = v] ++ r)) =>
            case r.nm of
                None => None
              | Some v =>
                case f (r -- nm) of
                    None => None
                  | Some vs => Some ({nm = v} ++ vs))
     (fn _ : $(map option []) => Some {}) fl r
Rather than take in an arbitrary record type and add some sort of constraint requiring that it contain only option types, the Ur way is to construct a record type with computation over some more primitive inputs, such that the process (A) is guaranteed to construct only records satisfying the constraint and (B) is capable, given the proper inputs, of constructing any record satisfying the original constraint.
Our use of folding here involves an accumulator type that is record-dependent. In particular, as we traverse the record, we are building up a "de-optioning" function. To implement the step function, we rely on the record projection form r.nm and the record field removal form r -- nm, both of which work fine with variables standing for unknown field names. To extend the output record with a new mapping for field nm, we use concatenation ++ with a singleton record literal.
As with the last example, join is much easier to use than to implement! The simple invocations below use Ur's reverse-engineering unification to deduce the value of parameter ts from the type of parameter r. Also, as before, the folder argument is inferred.
join {A = Some 1, B = Some "X"}
== Some({A = 1, B = "X"})
join {A = Some 1, B = None : option string}
== None
The Ur/Web standard library includes many variations on fold that encapsulate common traversal patterns. For instance, foldR captures the idea of folding over a value-level record, and we can use it to simplify the definition of join:
fun join [ts ::: {Type}] (fl : folder ts) (r : $(map option ts)) : option $ts =
    @foldR [option] [fn ts => option $ts]
     (fn [nm ::_] [v ::_] [r ::_] [[nm] ~ r] v vs =>
         case (v, vs) of
             (Some v, Some vs) => Some ({nm = v} ++ vs)
           | _ => None)
     (Some {}) fl r
See lib/ur/top.urs for the types of foldR and some other handy folding functions.
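As one more sketch in the same style (sumInts is our own function, not from the library), here is a fold that sums the fields of a record of integers. We describe the field set with a record of kind {Unit}, so that the type of the concrete argument below determines it completely, and we again compute the value-level record type with map.
(* illustrative function *)
fun sumInts [ts ::: {Unit}] (fl : folder ts) (r : $(map (fn _ => int) ts)) : int =
    @foldR [fn _ => int] [fn _ => int]
     (fn [nm ::_] [u ::_] [rest ::_] [[nm] ~ rest] n acc => n + acc)
     0 fl r
sumInts {A = 1, B = 2, C = 3}
== 6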
Working with First-Class Disjointness Obligations
The syntax [r1 ~ r2] in a function definition introduces a constraint that the type-level records r1 and r2 share no field names. The same syntax may also be used with anonymous function definitions, which can be useful in certain kinds of record traversals, as we've seen in the types of step functions above. Sometimes we must mark explicitly the places where these disjointness obligations are discharged. To pick a simple example, let's pretend that the general value-level ++ operator is missing from the language, so that we must implement it ourselves on top of a version of ++ that can only add one field at a time. The following code demonstrates the use of the syntax ! to discharge a disjointness obligation explicitly. We generally only need to do that when working with the @ version of an identifier, which requires not only that folders be passed explicitly, but also disjointness proofs (which are always written as just !) and type class witnesses.
fun concat [ts1 ::: {Type}] [ts2 ::: {Type}] [ts1 ~ ts2]
           (fl : folder ts1) (r1 : $ts1) (r2 : $ts2) : $(ts1 ++ ts2) =
    @foldR [id] [fn ts1 => ts2 ::: {Type} -> [ts1 ~ ts2] => $ts2 -> $(ts1 ++ ts2)]
     (fn [nm ::_] [v ::_] [r ::_] [[nm] ~ r] v
                  (acc : ts2 ::: {Type} -> [r ~ ts2] => $ts2 -> $(r ++ ts2))
                  [ts2] [[nm = v] ++ r ~ ts2] r =>
         {nm = v} ++ acc r)
     (fn [ts2] [[] ~ ts2] (r : $ts2) => r) fl r1 ! r2
concat {A = 1, B = "x"} {C = 2.3, D = True}
== {A = 1, B = "x", C = 2.3, D = True}
Type-Level Computation Meets Type Classes
Ur's treatment of type classes makes some special allowances for records. In particular, a record of type class witnesses may be inferred automatically. Our next example shows how to put that functionality to good use, in writing a function for pretty-printing records as strings. The type class show is part of the Ur standard library, and its instances are valid arguments to the string-producing function show.
fun showRecord [ts ::: {Type}] (fl : folder ts) (shows : $(map show ts))
               (names : $(map (fn _ => string) ts)) (r : $ts) : string =
    "{" ^ @foldR3 [fn _ => string] [show] [id] [fn _ => string]
           (fn [nm ::_] [t ::_] [r ::_] [[nm] ~ r] name shower value acc =>
               name ^ " = " ^ @show shower value ^ ", " ^ acc)
           "...}" fl names shows r
showRecord {A = "A", B = "B"} {A = 1, B = 2.3}
== "{A = 1, B = 2.3, ...}"
One natural complaint about this code is that field names are repeated unnecessarily as strings. Following Ur's design rationale, this repetition is the consequence of a "feature," not a "bug": allowing syntactic analysis of type-level data would break the celebrated property of parametricity.
Type-Level Computation Meets Modules
To illustrate how the features from this chapter integrate with Ur's module system, let's reimplement the last example as a functor.
functor ShowRecord(M : sig
                       con ts :: {Type}
                       val fl : folder ts
                       val shows : $(map show ts)
                       val names : $(map (fn _ => string) ts)
                   end) : sig
    val show_ts : show $M.ts
end = struct
    open M
    val show_ts = mkShow (fn r : $ts =>
        "{" ^ @foldR3 [fn _ => string] [show] [id] [fn _ => string]
               (fn [nm ::_] [t ::_] [r ::_] [[nm] ~ r] name shower value acc =>
                   name ^ " = " ^ @show shower value ^ ", " ^ acc)
               "...}" fl names shows r)
end
open ShowRecord(struct
                    con ts = [A = int, B = float, C = bool]
                    val names = {A = "A", B = "B", C = "C"}
                end)
show {A = 1, B = 2.3, C = True}
== "{A = 1, B = 2.3, C = True, ...}"
A few important points show up in this example. First, Ur extends the feature set of the standard ML module system by allowing a functor to be applied to a structure with some members omitted, when those members can be inferred from the ones that are given. Thus, we call ShowRecord omitting the fields fl and shows. Second, merely applying a functor whose output includes a type class instance can bring that instance into scope, so that it is used automatically, as in our evaluation example above.
To illustrate the mixing of constraints and functors, we translate another of our earlier examples in a somewhat silly way:
functor Concat(M : sig
                   con f :: Type -> Type
                   con r1 :: {Type}
                   con r2 :: {Type}
                   constraint r1 ~ r2
               end) : sig
    val concat : $(map M.f M.r1) -> $(map M.f M.r2) -> $(map M.f (M.r1 ++ M.r2))
end = struct
    fun concat r1 r2 = r1 ++ r2
end
structure C = Concat(struct
                         con f = id
                         con r1 = [A = int]
                         con r2 = [B = float, C = bool]
                     end)
show (C.concat {A = 6} {B = 6.0, C = False})
== "{A = 6, B = 6, C = False, ...}"