| # Type Functions |
| |
| This is a proposal for adding generics to Go, written by Ian Lance |
| Taylor in June, 2010. |
| This proposal will not be adopted. |
| It is being presented as an example for what a complete generics |
| proposal must cover. |
| |
| ## Defining a type function |
| |
| We introduce _type functions_. |
| A type function is a named type with zero or more parameters. |
| The syntax for declaring a type function is: |
| |
| ``` |
| type name(param1, param2, ...) definition |
| ``` |
| |
| Each parameter to a type function is simply an identifier. |
| The _definition_ of the type function is any type, just as with an |
| ordinary type declaration. |
| The definition of a type function may use the type parameters any |
| place a type name may appear. |
| |
| ``` |
| type Vector(t) []t |
| ``` |
| |
| ## Using a type function |
| |
| Any use of a type function must provide a value for each type |
| parameter. |
| The value must itself be a type, though in some cases the exact type |
| need not be known at compile time. |
| |
| We say that a specific use of a type function is concrete if all the |
| values passed to the type function are concrete. |
| All predefined types are concrete and all type literals composed using |
| only concrete types are concrete. |
| A concrete type is known statically at compile time. |
| |
| ``` |
| type VectorInt Vector(int) // Concrete. |
| ``` |
| |
| When a type function is used outside of a func or the definition of a |
| type function, it must be concrete. |
| That is, global variables and constants are required to have concrete |
| types. |
| |
| In this example: |
| |
| ``` |
| var v Vector(int) |
| ``` |
| |
| the name of the type of the variable `v` will be `Vector(int)`, and |
| the value of v will have the representation of `[]int`. |
| If `Vector(t)` has any methods, they will be attached to the type of |
| `v` \(methods of type functions are discussed further below\). |
| |
| ## Generic types |
| |
| A type function need not be concrete when used as the type of a |
| function receiver, parameter, or result, or anywhere within a |
| function. |
| A specific use of a type function that is not concrete is known as |
| generic. |
| A generic type is not known at compile time. |
| |
| A generic type is named by using a type function with one or more |
| unbound type parameters, or by writing a type function with parameters |
| attached to generic types. |
| When writing an unbound type parameter, it can be ambiguous whether |
| the intent is to use a concrete type or whether the intent is to use |
| an unbound parameter. |
| This ambiguity is resolved by using the `type` keyword after each |
| unbound type parameter. |
| |
| \(Another way would be to use a different syntax for type variables. |
| For example, `$t`. This has the benefit of keeping the grammar |
| simpler and not needing to worry about where types are introduced |
| vs. used.\) |
| |
| ``` |
| func Bound(v Vector(int)) |
| func Unbound(v Vector(t type)) |
| ``` |
| |
| The `type` keyword may also be used without invoking a type function: |
| |
| ``` |
| func Generic(v t type) t |
| ``` |
| |
| In this example the return type is the same as the parameter type. |
| Examples below show cases where this can be useful. |
| |
| A value (parameter, variable, etc.) with a generic type is represented |
| at runtime as a generic value. |
| A generic value is a dynamic type plus a value of that type. |
| The representation of a generic value is thus the same as the |
| representation of an empty interface value. |
| However, it is important to realize that the dynamic type of a generic |
| value may be an interface type. |
| This of course can never happen with an ordinary interface value. |
| |
| Note that if `x` is a value of type `Vector(t)`, then although `x` is |
| a generic value, the elements of the slice are not. |
| The elements have type `t`, even though that type is not known. |
| That is, boxing a generic value only occurs at the top level. |
| |
| ## Generic type identity |
| |
| Two generic types are identical if they have the same name and the |
| parameters are identical. |
| This can only be determined statically at compile time if the names |
| are the same. |
| For example, within a function, `Vector(t)` is identical to |
| `Vector(t)` if both `t` identifiers denote the same type. |
| `Vector(int)` is never identical to `Vector(float)`. |
| `Vector(t)` may or may not be identical to `Vector(u)`; |
| identity can only be determined at runtime. |
| |
| Checking type identity at runtime is implemented by walking through |
| the definition of each type and comparing each component. |
| At runtime all type parameters are known, so no ambiguity is possible. |
| If any literal or concrete type is different, the types are different. |
| |
| ## Converting a value of concrete type to a generic type |
| |
| Sometimes it is necessary to convert a value of concrete type to a |
| generic type. |
| This is an operation that may fail at run time. |
| This is written as a type assertion: `v.(t)`. |
| This will verify at runtime that the concrete type of `v` is identical |
| to the generic type `t`. |
| |
| \(Open issue: Should we use a different syntax for this?\) |
| |
| \(Open issue: In some cases we will want the ability to convert an |
| untyped constant to a generic type. |
| This would be a runtime operation that would have to implement the |
| rules for conversions between numeric types. |
| How should this conversion be written? |
| Should we simply use `N`, as in `v / 2`? |
| The problem with that syntax is that the runtime conversion can fail |
| in some cases, at least when `N` is not in the range 0 to 127 inclusive. |
| The same objection applies to T(n). |
| That suggests N.(t), but that looks weird.\) |
| |
| \(Open issue: It is possible that we will want the ability to convert |
| a value from a known concrete type to a generic type. |
| This would also require a runtime conversion. |
| I'm not sure whether this is necessary or not. |
| What would be the syntax for this?\) |
| |
| ## Generic value operations |
| |
| A function that uses generic values is only compiled once. |
| This is different from C++ templates. |
| |
| The only operations permitted on a generic value are those implied by |
| its type function. |
| Some operations will require extra work at runtime. |
| |
| ### Declaring a local variable with generic type. |
| |
| This allocates a new generic value with the appropriate dynamic type. |
| Note that in the general case the dynamic type may need to be |
| constructed at runtime, as it may itself be a type function with |
| generic arguments. |
| |
| ### Assigning a generic value |
| |
| As with any assignment, this is only permitted if the types are |
| identical. |
| The value is copied as appropriate. |
| This is much like assignment of values of empty interface type. |
| |
| ### Using a type assertion with a generic value |
| |
| Programs may use type assertions with generic values just as with |
| interface values. |
| The type assertion succeeds if the target type is identical to the |
| value's dynamic type. |
| In general this will require a runtime check of type identity as |
| described above. |
| |
| \(Open issue: Should it be possible to use a type assertion to convert |
| a generic value to a generic type, or only to a concrete type? |
| Converting to a generic type is a somewhat different operation from |
| a standard type assertion. |
| Should it use a different syntax?\) |
| |
| ### Using a type switch with a generic value |
| |
| Programs may use type switches with generic values just as with |
| interface values. |
| |
| ### Using a conversion with a generic value |
| |
| Generic values may only be converted to types with identical |
| underlying types. |
| This is only permitted when the compiler can verify at compile time |
| that the conversion is valid. |
| That is, a conversion to a generic type T is only permitted if the |
| definition of T is identical to the definition of the generic type of |
| the value. |
| |
| ``` |
| var v Vector(t) |
| a := Vector(t)(v) // OK. |
| type MyVector(t) []t |
| b := MyVector(t)(v) // OK. |
| c := MyVector(u)(v) // OK iff u and t are known identical types. |
| d := []int(v) // Not OK. |
| ``` |
| |
| ### Taking the address of a generic value |
| |
| Programs may always take the address of a generic value. |
| If the generic value has type `T(x)` this produces a generic value of |
| type `*T(x)`. |
| The new type may be constructed at runtime. |
| |
| ### Indirecting through a generic value |
| |
| If a generic value has a generic type that is a pointer `*T`, then a |
| program may indirect through the generic value. |
| This will be similar to a call to `reflect.PtrValue.Elem`. |
| The result will be a new generic value of type `T`. |
| |
| ### Indexing or slicing a generic value |
| |
| If a generic value has a generic type that is a slice or array, then a |
| program may index or slice the generic value. |
| This will be similar to a call to `reflect.ArrayValue.Elem` or |
| `reflect.SliceValue.Elem` or `reflect.SliceValue.Slice` or |
| `reflect.MakeSlice`. |
| In particular, the operation will require a multiplication by the size |
| of the element of the slice, where the size will be fetched from the |
| dynamic type. |
| The result will be a generic value of the element type of the slice or |
| array. |
| |
| ### Range over a generic value |
| |
| If a generic value has a generic type that is a slice or array, then a |
| program may use range to loop through the generic value. |
| |
| ### Maps |
| |
| Similarly, if a generic value has a generic type that is a map, |
| programs may index into the map, check whether an index is present, |
| assign a value to the map, range over a map. |
| |
| ### Channels |
| |
| Similarly, if a generic value has a generic type that is a channel, |
| programs may send and receive values of the appropriate generic type, |
| and range over the channel. |
| |
| ### Functions |
| |
| If a generic value has a generic type that is a function, programs may |
| call the function. |
| Where the function has parameters which are generic types, the |
| arguments must have identical generic type or a type assertion much be |
| used. |
| This is similar to `reflect.Call`. |
| |
| ### Interfaces |
| |
| If a generic value has a generic type that is an interface, programs |
| may call methods on the interface. |
| This is much like calling a function. |
| Note that a type assertion on a generic value may return an interface |
| type, unlike a type assertion on an interface value. |
| This in turn means that a double type assertion is meaningful. |
| |
| ``` |
| a.(InterfaceType).(int) |
| ``` |
| |
| ### Structs |
| |
| If a generic value has a generic type that is a struct, programs may |
| get and set struct fields. |
| In general this requires finding the description of the field in the |
| dynamic type to discover the appropriate concrete type and field |
| offsets. |
| |
| ### That is all |
| |
| Operations that are not explicitly permitted for a generic value are |
| forbidden. |
| |
| ### Scope of generic type parameters |
| |
| When a generic type is used, the names of the type parameters have |
| scope. |
| The generic type normally provides the type of some name; the scope of |
| the unbound type parameters is the same as the scope of that name. |
| In cases where the generic type does not provide the type of some |
| name, then the unbound type parameters have no scope. |
| Within the scope of an unbound type parameter, it may be used as a |
| generic type. |
| |
| ``` |
| func Head(v Vector(t type)) { |
| var first t |
| first = v[0] |
| } |
| ``` |
| |
| ### Scopes of function parameters |
| |
| In order to make this work nicely, we change the scope of function |
| receivers, parameters, and results. |
| We now define their scope to start immediately after they are defined, |
| rather than starting in the body of the function. |
| This means that a function parameter may refer to the unbound type |
| parameters of an earlier function parameter. |
| |
| ``` |
| func SetHead(v Vector(t type), e t) t { |
| v[0] = e |
| return e |
| } |
| ``` |
| |
| The main effect of this change in scope will be to change the |
| behaviour of cases where a function parameter has the same name as a |
| global type, and that global type was used to define a subsequent |
| function parameter. |
| |
| \(The alternate notation approach would instead define that the |
| variables only persist for the top-level statement in which they |
| appear, so that |
| |
| ``` |
| func SetHead(v Vector($t), e $t) $t { ... } |
| ``` |
| |
| doesn't have to worry about which t is the declaration (the ML |
| approach). |
| Another alternative is to do what C++ does and explicitly introduce |
| them. |
| |
| ``` |
| generic(t) func SetHead(v Vector(t), e t) t { ... } ] |
| ``` |
| |
| \) |
| |
| ### Function call argument type checking |
| |
| As can be seen by the previous example, it is possible to use generic |
| types to write functions in which two parameters have related types. |
| These types are checked at the point of the function call. |
| If the types at the call site are concrete, the type checking is |
| always done by the compiler. |
| If the types are generic, then the function call is only permitted if |
| the argument types are identical to the parameter types. |
| The arguments are matched against the required types from left to |
| right, determining bindings for the unbound type parameters. |
| Any failure of binding causes the compiler to reject the call with a |
| type error. |
| Any case where one unbound type parameter is matched to a different |
| unbound type parameter causes the compiler to reject the call with a |
| type error. |
| In those cases, the call site must use an explicit type assertion, |
| checked at run time, so that the call can be type checked at compile |
| time. |
| |
| ``` |
| var vi Vector(int) |
| var i int |
| SetHead(vi, 1) // OK |
| SetHead(vi, i) // OK |
| var vt Vector(t) |
| var i1 t |
| SetHead(vt, 1) // OK? Unclear. See above. |
| SetHead(vt, i) // Not OK; needs syntax |
| SetHead(vt, i1) // OK |
| var i2 q // q is another generic type |
| SetHead(vt, q) // Not OK |
| SetHead(vt, q.(t)) // OK (may fail at run time) |
| ``` |
| |
| ### Function result types |
| |
| The result type of a function can be a generic type. |
| The result will be returned to the caller as a generic value. |
| If the call site uses concrete types, then the result type can often |
| be determined at compile time. |
| The compiler will implicitly insert a type assertion to the expected |
| concrete type. |
| This type assertion can not fail, because the function will have |
| ensured that the result has the matching type. |
| In other cases, the result type may be a generic type, in which case |
| the returned generic value will be handled like any other generic |
| value. |
| |
| ### Making one function parameter the same type as another |
| |
| Sometime we want to say that two function parameters have the same |
| type, or that a result parameter has the same type as a function |
| parameter, without specifying the type of that parameter. |
| This can be done like this: |
| |
| ``` |
| func Choose(which bool, a t type, b t) t |
| ``` |
| |
| \(Or func `Choose(which bool, a $t, b $t) $t`\) |
| |
| The argument `a` is passed as generic value and binds the type |
| parameter `t` to `a`'s type. |
| The caller must ensure that `b` has the same type as `a`. |
| `b` will then also be passed as a generic value. |
| The result will be returned as a generic value; it must again have the |
| same type. |
| |
| Another example: |
| |
| ``` |
| type Slice(t) []t |
| func Repeat(x t type, n int) Slice(t) { |
| a := make(Slice(t), n) |
| for i := range a { |
| a[i] = x |
| } |
| return a |
| } |
| ``` |
| |
| \(Or `func Repeat(x $t, n int) []$t { ... }`\) |
| |
| ### Nested generic types |
| |
| It is of course possible for the argument to a generic type to itself |
| be a generic type. |
| The above rules are intended to permit this case. |
| |
| ``` |
| type Pair(a, b) struct { |
| first a |
| second b |
| } |
| func Sum(a Pair(Vector(t type), Vector(t))) Vector(t) |
| ``` |
| |
| Note that the first occurrence of `t` uses the `type` keyword to |
| declare it as an unbound type parameter. |
| The second and third occurrences do not, which means that they are the |
| type whose name is `t`. |
| The scoping rules mean that that `t` is the same as the one bound by the |
| first use of Vector. |
| When this function is called, the type checking will match `t` to the |
| argument to the first `Vector`, and then require that the same `t` |
| appear as the argument to the second `Vector`. |
| |
| ### Unknown generic types |
| |
| Note that it is possible to have a generic value whose type can not be |
| named. |
| This can happen when a result variable has a generic type. |
| |
| ``` |
| func Unknown() t type |
| ``` |
| |
| Now if one writes |
| |
| ``` |
| x := Unknown() |
| ``` |
| |
| then x is a generic value of unknown and unnamed type. |
| About all you can do with such a value is assign it using `:=` and use |
| it in a type assertion or type switch. |
| The only way that `Unknown` could return a value would be to use some |
| sort of conversion. |
| |
| ### Methods on generic types |
| |
| A generic type may have methods. |
| When a generic type is used as a receiver, the arguments must all be |
| simple unbound names. |
| Any time a value of this generic type is created, whether the value is |
| generic or concrete, it will acquire all the methods defined on the |
| generic type. |
| When calling these methods, the receiver will of course be passed as a |
| generic value. |
| |
| ``` |
| func (v Vector(t type)) At(int i) t { |
| return v[i] |
| } |
| |
| func (v Vector(t type)) Set(i int, x t) { |
| v[i] = x |
| } |
| ``` |
| |
| A longer example: |
| |
| ``` |
| package hashmap |
| |
| type bucket(keytype, valtype) struct { |
| next *bucket(keytype, valtype) |
| key keytype |
| val valtype |
| } |
| |
| type Hashfn(keytype) func(keytype) uint |
| |
| type Eqfn(keytype) func(keytype, keytype) bool |
| |
| type Hashmap(keytype, valtype) struct { |
| hashfn Hashfn(keytype) |
| eqfn Eqtype(keytype) |
| buckets []bucket(keytype, valtype) |
| entries int |
| } |
| |
| func New(hashfn Hashfn(keytype type), eqfn Eqfn(keytype), |
| _ valtype type) *Hashmap(keytype, valtype) { |
| return &Hashmap(k, v){hashfn, eqfn, |
| make([]bucket(keytype, valtype), 16), |
| 0} |
| } |
| |
| // Note that the dummy valtype parameter in the New function |
| // exists only to get valtype into the function signature. |
| // This feels wrong. |
| |
| func (p *Hashmap(keytype type, vvaltype type)) |
| Lookup(key keytype) (found bool, val valtype) { |
| h := p.hashfn(key) % len(p.buckets) |
| for b := buckets[h]; b != nil; b = b.next { |
| if p.eqfn(key, b.key) { |
| return true, b.val |
| } |
| } |
| return |
| } |
| ``` |
| |
| In the alternate syntax: |
| |
| ``` |
| package hash |
| |
| type bucket($key, $val) struct { |
| next *bucket($key, val) |
| key $key |
| val $val |
| } |
| |
| type Map($key, $val) struct { |
| hash func($key) uint |
| eq func($key, $key) bool |
| buckets []bucket($key, $val) |
| entries int |
| } |
| |
| func New(hash func($key) uint, eq func($key, $key) bool, _ $val) |
| *Map($key, $val) { |
| return &Map($key, $val){ |
| hash, |
| eq, |
| make([]bucket($key, $val), 16), |
| 0, |
| } |
| } |
| |
| // Again note dummy $val in the arguments to New. |
| ``` |
| |
| ## Concepts |
| |
| In order to make type functions more precise, we can additionally |
| permit the definition of the type function to specify an interface. |
| This means that whenever the type function is used, the argument is |
| required to satisfy the interface. |
| In homage to the proposed but not accepted C++0x notion, we call this |
| a concept. |
| |
| ``` |
| type PrintableVector(t Stringer) []t |
| ``` |
| |
| Now `PrintableVector` may only be used with a type that implements the |
| interface `Stringer`. |
| This in turn means that given a value whose type is the parameter to |
| `PrintableVector`, a program may call the `String` method on that |
| value. |
| |
| ``` |
| func Concat(p PrintableVector(t type)) string { |
| s := "" |
| for _, v := range p { |
| s += v.String() |
| } |
| return s |
| } |
| ``` |
| |
| Attempting to pass `[]int` to `Concat` will cause the compiler to |
| issue a type checking error. |
| But if `MyInt` has a `String` method, then calling `Concat` with |
| `[]MyInt` will succeed. |
| |
| The interface restriction may also be used with a parameter whose type |
| is a generic type: |
| |
| ``` |
| func Print(a t type Stringer) |
| ``` |
| |
| This example is not useful, as it is pretty much equivalent to passing |
| a value of type Stringer, but there is a useful example below. |
| |
| Concepts specified in type functions are type checked as usual. |
| If the compiler does not know statically that the type implements the |
| interface, then the type check fails. |
| In such cases an explicit type assertion is required. |
| |
| ``` |
| func MyConcat(v Vector(t type)) string { |
| if pv, ok := v.(PrintableVector(t)); ok { |
| return Concat(pv) |
| } |
| return "unprintable" |
| } |
| ``` |
| |
| \(Note that this does a type assertion to a generic type. Should it |
| use a different syntax?\) |
| |
| The concept must be an interface type, but it may of course be a |
| generic interface type. |
| When using a generic interface type as a concept, the generic |
| interface type may itself use as an argument the type parameter which |
| it is restricting. |
| |
| ``` |
| type Lesser(t) interface { |
| Less(t) bool |
| } |
| func Min(a, b t type Lesser(t)) t { |
| if a.Less(b) { |
| return a |
| } |
| return b |
| } |
| ``` |
| |
| \(`type Mintype($t Lesser($t)) $t`\) |
| |
| This is complex but useful. OK, the function `Min` is not all that |
| useful, but this looks better when we write |
| |
| ``` |
| func Sort(v Vector(t type Lesser(t))) |
| ``` |
| |
| which can sort any Vector whose element type implements the Lesser |
| interface. |
| |
| ## A note on operator methods |
| |
| You will have noticed that there is no way to use an operator with a |
| generic value. |
| For example, you can not add two generic values together. |
| If we implement operator methods, then it will be possible to use this |
| in conjunction with the interface restrictions to write simple generic |
| code which uses operators. |
| While operator methods are of course a separate extension, I think |
| it's important to ensure that they can work well with generic values. |
| |
| ``` |
| type Addable(t) interface { |
| Binary+(t) t |
| } |
| type AddableSlice(t Addable(t)) []t |
| func Sum(v AddableSlice) t { |
| var sum t |
| for _, v := range v { |
| sum = sum + v |
| } |
| return sum |
| } |
| ``` |
| |
| ## Some comparisons to C++ templates |
| |
| Obviously the big difference between this proposal and C++ templates |
| is that C++ templates are compiled separately. |
| This has various consequences. |
| Some C++ template features that can not be implemented using type |
| functions: |
| |
| * C++ templates permit data structures to be instantiated differently for different component types. |
| * C++ templates may be instantiated for constants, not just for types. |
| * C++ permits specific instantiations for specific types or constants. |
| |
| The advantages of type functions are: |
| |
| * Faster compile time. |
| * No need for two-phase name lookup. Only the scope of the definition is relevant, not the scope of use. |
| * Clear syntax for separating compile-time errors from run-time errors. Avoids complex compile-time error messages at the cost of only detecting some problems at runtime. |
| * Concepts also permit clear compile time errors. |
| |
| In general, C++ templates have the advantages and disadvantages of |
| preprocessor macros. |
| |
| ## Summary |
| |
| This proposal will not be adopted. |
| It's basically terrible. |
| |
| The syntax is confusing: ```MyVector(t)(v)``` looks like two function |
| calls, but it's actually a type conversion to a type function. |
| |
| The notion of an unbound type parameter is confusing, and the |
| syntax (a trailing `type` keyword) only increases that confusion. |
| |
| Types in Go refer to themselves. |
| The discussion of type identity does not discuss this. |
| It means that comparing type identity at run time, such as in a type |
| assertion, requires avoiding loops. |
| Generic type assertions look like ordinary type assertions, but are |
| not constant time. |
| |
| The need to pass an instance of the value type to `hashmap.New` is a |
| symptom of a deeper problem. |
| This proposal is trying to treat generic types like interface types, |
| but interface types have a simple common representation and generic |
| types do not. |
| Value representations should probably be expressed in the type system, |
| not inferred at run time. |
| |
| The proposal suggests that generic functions can be compiled once. |
| It also claims that generic types can have methods. |
| If I write |
| |
| ``` |
| type Vector(t) []t |
| |
| func (v Vector(t)) Read(b []t) (int, error) { |
| return copy(b, v), nil |
| } |
| ``` |
| |
| then `Vector(byte)` should implement `io.Reader`. |
| But `Vector(t).Read` is going to be implemented using a generic value, |
| while `io.Reader` expects a concrete value. |
| Where is the code that translates from the generic value to the |
| concrete value? |