Using -XGeneralizedNewtypeDeriving
(Section 7.5.5.1, “ Generalising the deriving clause ”), a programmer can take existing
instances of classes and "lift" these into instances of that class for a
newtype. However, this is not always safe. For example, consider the following:
newtype Age = MkAge { unAge :: Int } type family Inspect x type instance Inspect Age = Int type instance Inspect Int = Bool class BadIdea a where bad :: a -> Inspect a instance BadIdea Int where bad = (> 0) deriving instance BadIdea Age -- not allowed!
If the derived instance were allowed, what would the type of its method
bad
be? It would seem to be Age -> Inspect
Age
, which is equivalent to Age -> Int
, according
to the type family Inspect
. Yet, if we simply adapt the
implementation from the instance for Int
, the implementation
for bad
produces a Bool
, and we have trouble.
The way to identify such situations is to have roles assigned to type variables of datatypes, classes, and type synonyms.
Roles as implemented in GHC are a from a simplified version of the work described in Generative type abstraction and type-level computation, published at POPL 2011.
The goal of the roles system is to track when two types have the same
underlying representation. In the example above, Age
and
Int
have the same representation. But, the corresponding
instances of BadIdea
would not have
the same representation, because the types of the implementations of
bad
would be different.
Suppose we have two uses of a type constructor, each applied to the same
parameters except for one difference. (For example, T Age Bool
c
and T Int Bool c
for some type
T
.) The role of a type parameter says what we need to
know about the two differing type arguments in order to know that the two
outer types have the same representation (in the example, what must be true
about Age
and Int
in order to show that
T Age Bool c
has the same representation as
T Int Bool c
).
GHC supports three different roles for type parameters: nominal,
representational, and phantom. If a type parameter has a nominal (N) role,
then the two types that differ must not actually differ at all: they must be
identical (after type family reduction). If a type parameter has a
representational (R) role, then the two types must have the same
representation. (If T
's first parameter's role is R, then
T Age Bool c
and T Int Bool c
would have
the same representation, because Age
and Int
have the same representation.) If a type parameter has a phantom (P) role,
then we need no further information.
Here are some examples:
data Simple a = MkSimple a -- a has role R type family F type instance F Int = Bool type instance F Age = Char data Complex a = MkComplex (F a) -- a has role N data Phant a = MkPhant Bool -- a has role P
The type Simple
has its parameter at role R, which is
generally the most common case. Simple Age
would have the same
representation as Simple Int
. The type Complex
,
on the other hand, has its parameter at role N, because Simple Age
and Simple Int
are not the same. Lastly,
Phant Age
and Phant Bool
have the same
representation, even though Age
and Bool
are unrelated.
What role should a given type parameter should have? GHC performs role
inference to determine the correct role for every parameter. It starts with a
few base facts: (->)
has two R parameters;
(~)
has two N parameters; all type families' parameters are
N; and all GADT-like parameters are N. Then, these facts are propagated to all
places where these types are used. By defaulting parameters to role P, any
parameters unused in the right-hand side (or used only in other types in P
positions) will be P. Whenever a parameter is used in an R position (that is,
used as a type argument to a constructor whose corresponding variable is at
role R), we raise its role from P to R. Similarly, when a parameter is used in
an N position, its role is upgraded to N. We never downgrade a role from N to
P or R, or from R to P. In this way, we infer the most-general role for each
parameter.
There is one particularly tricky case that should be explained:
data Tricky a b = MkTricky (a b)
What should Tricky
's roles be? At first blush, it would
seem that both a
and b
should be at role R,
since both are used in the right-hand side and neither is involved in a type family.
However, this would be wrong, as the following example shows:
data Nom a = MkNom (F a) -- type family F from example above
Is Tricky Nom Age
representationally equal to
Tricky Nom Int
? No! The former stores a Char
and the latter stores a Bool
. The solution to this is
to require all parameters to type variables to have role N. Thus, GHC would
infer role R for a
but role N for b
.
Sometimes the programmer wants to constrain the inference process. For example, the base library contains the following definition:
data Ptr a = Ptr Addr#
The idea is that a
should really be an R parameter, but
role inference assigns it to P. This makes some level of sense: a pointer to
an Int
really is representationally the same as a pointer
to a Bool
. But, that's not at all how we want to use
Ptr
s! So, we want to be able to say
data Ptr a@R = Ptr Addr#
The @R
(enabled with -XRoleAnnotations
) annotation forces the
parameter a to be at role R, not role P. GHC then checks
the user-supplied roles to make sure they don't break any promises. It would
be bad, for example, if the user could make BadIdea
's role be R.
As another example, we can consider a type Set a
that
represents a set of data, ordered according to a
's
Ord
instance. While it would generally be type-safe to
consider a
to be at role R, it is possible that a
newtype
and its base type have
different orderings encoded in their respective
Ord
instances. This would lead to misbehavior at runtime.
So, the author of the Set
datatype would like its parameter
to be at role N. This would be done with a declaration
data Set a@N = ...
The other place where role annotations may be necessary are in
hs-boot
files (Section 4.7.9, “How to compile mutually recursive modules”), where
the right-hand sides of definitions can be omitted. As usual, the
types/classes declared in an hs-boot
file must match up
with the definitions in the hs
file, including down to the
roles. The default role is R in hs-boot
files,
corresponding to the common use case.
Role annotations are allowed on type variables in data, newtype, class,
and type declarations. They are not allowed on type/data family
declarations or in explicit foralls in function type signatures.
The syntax for a role annotation is an @
sign followed
by one of N
, R
, or P
,
directly following a type variable. If the type variable has an explicit
kind annotation, the role annotation goes after the kind annotation, outside
the parentheses. Here are some examples:
data T1 a b@P = MkT1 a -- b is not used; annotation is fine but unnecessary data T2 a b@P = MkT2 b -- ERROR: b is used and cannot be P data T3 a b@N = MkT3 a -- OK: N is higher than necessary, but safe data T4 (a :: * -> *)@N = MkT4 (a Int) -- OK, but N is higher than necessary class C a@R b where ... -- OK type X a@N = ... -- OK type family F a@R -- ERROR: annotations not allowed on family declarations