proposal: spec: add sum types / discriminated unions #19412
Comments
This has been discussed several times in the past, starting from before the open source release. The past consensus has been that sum types do not add very much to interface types. Once you sort it all out, what you get in the end is an interface type where the compiler checks that you've filled in all the cases of a type switch. That's a fairly small benefit for a new language change. If you want to push this proposal along further, you will need to write a more complete proposal doc, including: What is the syntax? Precisely how do they work? (You say they are "value types", but interface types are also value types). What are the trade-offs? |
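To make the "fairly small benefit" concrete, here is a minimal sketch in today's Go of a type switch over an interface that silently misses a case. The names Shape, Circle, Square and Describe are invented for illustration and are not part of the proposal.

```go
package main

import "fmt"

// Shape is an ordinary interface: any type with an Area method satisfies it.
type Shape interface{ Area() float64 }

type Circle struct{ R float64 }
type Square struct{ Side float64 }

func (c Circle) Area() float64 { return 3.14159 * c.R * c.R }
func (s Square) Area() float64 { return s.Side * s.Side }

// Describe handles Circle but forgets Square. The compiler accepts this,
// because nothing tells it that Circle and Square are the only variants.
func Describe(s Shape) string {
	switch s.(type) {
	case Circle:
		return "circle"
	default:
		return "unknown"
	}
}

func main() {
	fmt.Println(Describe(Circle{R: 1}))    // circle
	fmt.Println(Describe(Square{Side: 2})) // unknown
}
```

With a checked sum type, the missing Square case is the kind of omission the compiler would be expected to report.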
See https://www.reddit.com/r/golang/comments/46bd5h/ama_we_are_the_go_contributors_ask_us_anything/d03t6ji/?st=ixp2gf04&sh=7d6920db for some past discussion to be aware of. |
I think this is too significant a change of the type system for Go1 and there's no pressing need. |
Thanks for creating this proposal. I've been toying with this idea for a year or so now.
Sum types in Go
A sum type is represented by two or more types combined with the "|"
Values of the resulting type can only hold one of the specified types. The As a special case, "nil" can be used to indicate whether the value can For example:
The method set of the sum type holds the intersection of the method set Like any other interface type, sum type may be the subject of a dynamic The zero value of a sum type is the zero value of the first type in When assigning a value to a sum type, if the value can fit into more For example:
would result in a value with dynamic type int, but
would result in a value with dynamic type float64.
Implementation
A naive implementation could implement sum types exactly as interface For example a sum type consisting only of concrete types without pointers For sum-of-struct-types, it might even be possible to use spare padding |
@rogpeppe How would that interact with type assertions and type switches? Presumably it would be a compile-time error to have a |
For type switches, if you have
and you do:
and t contains an interface{} containing an int, does it match the first case? What if the first case is Or can sum types contain only concrete types? What about
what is t's type? Or is that construction forbidden? A similar question arises for |
Yes, I think it would be reasonable to have a compile-time error That means that you can get the compiler to error if you have:
and you change the sum type to add an extra case. |
t can't contain an interface{} containing an int. t is an interface Sum types can match interface types, but they still just get a concrete
According to the proposal above, you get the first item In fact interface{} | nil is technically redundant, because any interface{} For []int | nil, a nil []int is not the same as a nil interface, so the |
The type T []int | nil
var x T = nil would imply that That value would be distinct from the nil var y T = []int(nil) // y != x |
Wouldn't nil always be required even if the sum is all value types? Otherwise what would It seems overly subtle. Exhaustive type switches would be nice. You could always add an empty |
The proposal says "When assigning a value to a sum type, if the value can fit into more So, with:
x would have concrete type []int because nil is assignable to []int and []int is the first element of the type. It would be equal to any other []int (nil) value.
The proposal says "The zero value of a sum type is the zero value of the first type in
No, it would just be the usual interface nil value in that case. That type (interface{} | nil) is redundant. Perhaps it might be a good idea to make it a compiler error to specify sum types where one element is a superset of another, as I can't currently see any point in defining such a type. |
That is an interesting suggestion, but since the sum type must record somewhere the type of the value that it currently holds, I believe it means that the zero value of the sum type is not all-bytes-zero, which would make it different from every other type in Go. Or perhaps we could add an exception saying that if the type information is not present, then the value is the zero value of the first type listed, but then I'm not sure how to represent |
So @ianlancetaylor I believe many functional languages implement (closed) sum types essentially like how you would in C
if That would make union's value types instead of special interfaces, which is also interesting. |
Is there a way to make the all zero value work if the field which records the type has a zero value representing the first type? I'm assuming that one possible way for this to be represented would be:
type A = B|C
struct A {
	choice byte // value 0 or 1
	value ? // (thing big enough to store B | C)
}
[edit] Sorry @jimmyfrasche beat me to the punch. |
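The tagged layout suggested in this comment can be approximated in current Go. This is only a sketch: B, C, FromC and Describe are hypothetical, and the struct stores both variants side by side rather than overlapping them in shared storage, which a real implementation would presumably do.

```go
package main

import "fmt"

type B struct{ N int }
type C struct{ S string }

// A is a hand-written stand-in for the hypothetical "type A = B | C".
// choice == 0 selects B, so the all-zero A is the zero value of B.
type A struct {
	choice byte // 0 selects b, 1 selects c
	b      B
	c      C
}

// FromC wraps a C value and records the tag for the second variant.
func FromC(c C) A { return A{choice: 1, c: c} }

// Describe reports which variant is held by inspecting the tag.
func (a A) Describe() string {
	switch a.choice {
	case 0:
		return fmt.Sprintf("B(%d)", a.b.N)
	default:
		return fmt.Sprintf("C(%q)", a.c.S)
	}
}

func main() {
	var zero A
	fmt.Println(zero.Describe())                // B(0)
	fmt.Println(FromC(C{S: "hi"}).Describe())   // C("hi")
}
```

Because the tag's zero value names the first variant, an uninitialized A needs no extra type information to be meaningful.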
Is there anything added by nil that couldn't be done with
? That seems like it avoids a lot of the confusion (that I have, at least) |
Or better
that way you could type switch on |
@jimmyfrasche |
@bcmills It wasn't my intent to claim otherwise—I meant that it could be used for the same purpose as differentiating a lack of value without overlapping with the meaning of nil in any of the types in the sum. |
@rogpeppe what does this print?
I would assume "Reader" |
@jimmyfrasche I would assume (And I would also expect sums which include only interface types to use no more space than a regular interface, although I suppose that an explicit tag could save a bit of lookup overhead in the type-switch.) |
@bcmills it's the assignment that's interesting, consider: https://play.golang.org/p/PzmWCYex6R |
@ianlancetaylor That's an excellent point to raise, thanks. I don't think it's hard to get around though, although it does imply that my "naive implementation" suggestion is itself too naive. A sum type, although treated as an interface type, does not have to actually contain a direct pointer to the type and its method set - instead it could, when appropriate, contain an integer tag that implies the type. That tag could be non-zero even when the type itself is nil. Given:
the runtime value of x need not be all zeros. When switching on the type of x or converting Another possibility would be to allow a nil type only if it's the first element, but
|
Yes.
I don't get this. Why would "this [...] have to be valid for the type switch to print ReadCloser" When there are several interface types in a sum, the runtime representation is just an interface value - it's just that we know that the underlying value must implement one or more of the declared possibilities. That is, when you assign something to a type (I1 | I2) where both I1 and I2 are interface types, it's not possible to tell later whether the value you put into was known to implement I1 or I2 at the time. |
If you have a type that's io.ReadCloser | io.Reader you can't be sure when you type switch or assert on io.Reader that it's not an io.ReadCloser unless assignment to a sum type unboxes and reboxes the interface. |
Going the other way, if you had io.Reader | io.ReadCloser it would either never accept an io.ReadCloser because it goes strictly right-to-left or the implementation would have to search for the "best matching" interface from all interfaces in the sum but that cannot be well defined. |
var x []T
func F(y []T) {
	x = append(x, zeroT) // ?
	y = append(y, zeroT) // ?
}
Again, this has been suggested and rejected a couple of times. Frequently enough that it should be clear that the issues are deeper than we're going to overcome right now.
Exactly. There is a reason this issue has not been resolved yet. |
From reading through the prior discussion, all attempts to address the issue with zero values try to prescribe some sort of zero value to a value of a sum type. There is one discussion from 2019 where the idea of not having a zero value at all comes up, but it was not explored. "Zero values are needed" has been an implicit assumption throughout this issue. How one would adjust the rest of the language around a datatype without a zero-value seems to just be off the table. You're right that the issues are deeper than I originally thought. Slices, map values, builtin functions, return values, etc... all rely on zero values in their semantics. A proposal to create a family of types without a zero value would have to describe how to adjust the semantics of all of these. |
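As a rough illustration of how widely zero values are baked in, the sketch below gathers several places where Go produces a value of a type without the programmer writing one. T and ImplicitZeros are hypothetical names; the list is not exhaustive.

```go
package main

import "fmt"

// T stands in for any type. A type without a zero value would need a
// replacement rule for every construct used below.
type T struct{ Name string }

// ImplicitZeros collects values of T that the language creates on its own.
func ImplicitZeros() []T {
	var declared T                      // plain variable declaration
	made := make([]T, 1)                // make fills new slice elements
	var arr [1]T                        // array elements
	missing := map[string]T{}["absent"] // lookup of a missing map key
	ch := make(chan T)
	close(ch)
	received := <-ch        // receive from a closed channel
	var s struct{ Field T } // struct fields
	return []T{declared, made[0], arr[0], missing, received, s.Field}
}

func main() {
	for _, v := range ImplicitZeros() {
		fmt.Println(v == T{}) // true for every entry
	}
}
```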
There is perhaps a possibility to change the default constraint for type constructors that are composite of such union types so that unless {sumtype | nil} is used, an uninitialized value is not accepted as argument. In that case, it could eschew most of the annotations. It's just that something such as
type V struct{
	Num int
	S interface{ int | bool | string} // union
}
we wouldn't be able to declare Basically variables could be uninitialized, but in most cases, parameters would require initialized arguments. (exception being the case of return values possibly? only when a non-nil error is returned?) The issue is how to make sure that variables always hold initialized values when passed as arguments. Need to think a bit more about it. |
As I understand it, we would not even be able to use |
[edit] I misunderstood. You're absolutely right. Hmmh. Yes. So unless The typestate of the struct fields propagate to the struct value ("uninitialized" would be some sort of predicate attached). It's again somewhat similar to comparability where a struct is comparable iff every field is. |
Is there an active proposal for tagged sums (disjoint union type) as opposed to this untagged sum (union type) proposal? |
There's also a couple posted in this thread but they're all in the hidden 350+ posts in the middle |
@atdiar
type V struct{
	Num int
	S interface{ int | bool | string} // union
}
can be replaced with
type myUnion interface {
	~int | ~bool | ~string
}
type V[T myUnion] struct {
	Num int
	S T
} |
@bdandy The interesting aspect of unions comes down to the fact¹ that you can declare [1] To be clear, it's not this specifically what makes them interesting, but this possibility demonstrates the difference between "type parameter unions" and actual union types quite well. |
@Merovius while I understand that it might be a good idea to have it for some cases, I have questions. For example: type x int | float64 = 1 How much memory should be allocated for Slice of
func main() {
	var s = []any{V[int]{1, 1}, V[bool]{2, true}}
	for _, v := range s {
		switch v.(type) {
		case V[int]:
			fmt.Println("Int")
		case V[bool]: ///
		}
	}
	fmt.Println(s)
}
For example, here there is still a type switch:
// r is an io.Reader interface value holding a type that also implements io.Closer
var v io.ReadCloser | io.Reader = r
switch v.(type) {
case io.ReadCloser: fmt.Println("ReadCloser")
case io.Reader: fmt.Println("Reader")
}
so no any difference from
// r is an io.Reader interface value holding a type that also implements io.Closer
var v any = r
switch v.(type) {
case io.ReadCloser: fmt.Println("ReadCloser")
case io.Reader: fmt.Println("Reader")
}
In general I don't see benefits, only complexity. |
@bdandy Yes, the fact that there are still open questions is why this issue isn't closed yet and no version of sum/union types has yet made it into the language. |
The representation of sum types generally includes a discriminator, i.e. an additional hidden field that describes which variant is the dynamic type of the sum value. Exactly what that discriminator looks like is generally an implementation detail. #57644 would most likely have it be a pointer to a type descriptor, as interface values have. #54685 proposes to expose it and allow it to be one of a fixed set of constants. There are other options, like using an index into an array of possible type descriptors. Regardless, this is a well-understood problem.
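One of the options mentioned here, an index into an array of possible type descriptors, might look roughly like the following. This is a sketch under assumptions: IntOrString, variants, New and Variant are invented names, reflect.Type stands in for the runtime's type descriptors, and nothing here reflects how #57644 or #54685 would actually be implemented.

```go
package main

import (
	"fmt"
	"reflect"
)

// variants plays the role of the per-type table of allowed type descriptors.
var variants = []reflect.Type{reflect.TypeOf(0), reflect.TypeOf("")}

// IntOrString is a hand-rolled stand-in for a hypothetical "int | string".
// tag is the hidden discriminator: an index into variants.
type IntOrString struct {
	tag uint8
	val any
}

// New stores v only if its dynamic type is one of the listed variants.
func New(v any) (IntOrString, bool) {
	t := reflect.TypeOf(v)
	for i, vt := range variants {
		if t == vt {
			return IntOrString{tag: uint8(i), val: v}, true
		}
	}
	return IntOrString{}, false
}

// Variant returns the type descriptor selected by the discriminator.
func (u IntOrString) Variant() reflect.Type { return variants[u.tag] }

func main() {
	u, ok := New("hi")
	fmt.Println(u.Variant(), ok) // string true

	_, ok = New(1.5)
	fmt.Println(ok) // false: float64 is not a variant

	var zero IntOrString
	fmt.Println(zero.Variant()) // int: tag 0 names the first variant
}
```

Note that the zero IntOrString has tag 0 but a nil val, so a real design would still have to say what value the first variant holds in that state.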
The benefit is that if |
@bdandy re: "so no any difference from" in your second code example you are doing an open pattern match. If the sum type were expanded your code would look just fine, but in the first case you'd get a compiler warning that you missed a pattern match on one of the possible types. An advantage of sum types is compile time safety. |
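For comparison, the closest approximation in current Go is a "sealed" interface with an unexported marker method. The sketch below uses invented names (Expr, Num, Add, Eval); it closes the set of implementers to one package, but a forgotten case is only caught at run time by the default branch, not by the compiler.

```go
package main

import "fmt"

// Expr is sealed: only types in this package can implement isExpr.
type Expr interface{ isExpr() }

type Num struct{ V int }
type Add struct{ L, R Expr }

func (Num) isExpr() {}
func (Add) isExpr() {}

// Eval matches every known variant. If a new variant is added and this
// switch is not updated, the program still compiles and panics here.
func Eval(e Expr) int {
	switch e := e.(type) {
	case Num:
		return e.V
	case Add:
		return Eval(e.L) + Eval(e.R)
	default:
		panic(fmt.Sprintf("unhandled Expr variant %T", e))
	}
}

func main() {
	fmt.Println(Eval(Add{L: Num{V: 1}, R: Add{L: Num{V: 2}, R: Num{V: 3}}})) // 6
}
```

A nil Expr also reaches the default branch, which is the same zero-value question raised elsewhere in this thread.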
Given that Go has nominal type definitions (outside of aliases to anonymous structs), I always felt that sum types would be added to the language by virtue of adding untagged unions. Then, the type name would act as a discriminator. Essentially, extending interfaces from just encapsulating methods to any type name. This would basically add subtyping to the language, since |
This won't work well because each of those types will get its own method set. Ignoring aliases, every
type A struct{}
func (A) M() {}
type B A
func main() {
	var b B
	b.M() // No such method.
}
If you forced the creation of a new type just to be able to give it a specific name in a union, it would force the user to recreate all relevant methods on that new type. |
The point is that interfaces would no longer be just about method sets. An interface would be an untagged union.
|
@smasher164 Unless I misunderstand, that is more or less #57644. There is a fair amount of discussion here and in #41716 about the details and downsides of such an approach. The zero value of such sum types is contentious. Either it is all bits zero like every other type, which implies that nil is always a variant, or there is some mechanism to determine a default variant, which requires a nonzero type descriptor for the zero value. There are also some concerns about the algorithmic complexity of checking the subtype relation for such types at compile time if we allow such unions to include interfaces with methods. @Merovius has a proof that it reduces to SAT in the unrestricted case. So it seems we have to keep that restriction, which means you can't form a sum of string and fmt.Stringer that way. |
I agree on the compile-time safety point. We could also then implement a better error type as a union of declared error types, which would document all the possible errors (no unexpected error types, better error handling in place). That is something I miss in Go.
type myError PackageError | SomeError | io.Error
func test() error {
	var err myError = call()
	return err
}
Will explain all possible errors, which is much clearer than generic |
Actually errors are an argument against unions. Right now we have one established pattern to return errors:
With union types I'm sure some people would start writing and arguing for
or
because they have seen this in other languages. And then we would have an unnecessarily inconsistent Go ecosystem. |
Well, the fact that a feature exists in another language doesn't make it bad practice for Go just because Go is unique. We just need to think about the problem it solves and how it can fit into the Go language in Go style. In case of So many proposals are somehow related to better errors Everyone waits for Go v2.0 like magic. |
I don't think the claim is ever that Go is unique. But you also can't ignore the existing corpus of Go code and how a new feature interacts with the existing language. Well, I guess in a way, the claim is that Go is unique, just as every other language is unique. |
In any case that's not related to the unions directly. I've just had a thought about how a union error type might be made compatible with
type myError PackageError | SomeError | io.Error // our union error type
// old style supported
func test() error
// error interface with concrete union type for compile time safety (myError implements error type)
func testStatic() error[myError]
func main() {
	err := testStatic()
	// like usual type switch
	switch err.(type) {
	case PackageError:
	} // go vet may warn here that not all types are checked
	// errors.Is support
	if errors.Is(err, PackageError) {
	}
}
But maybe this one is for another proposal ticket |
FWIW the main reason today, to return So I disagree that this has anything to do with unions. The crucial missing component is covariance. Unions are helpful to then express the possibility of "one of these kinds of errors could be returned", but I don't see how that is special - it's no different from expressing " |
@Merovius good example, in theory it should be compatible to |
No you're exactly right. What I'm describing is exactly that proposal. I'm coming around to thinking that |
IMO it would be right to demand mandatory case nil:
... in a switch statement. |
This is a proposal for sum types, also known as discriminated unions. Sum types in Go should essentially act like interfaces, except that:
Sum types can be matched with a switch statement. The compiler checks that all variants are matched. Inside the arms of the switch statement, the value can be used as if it is of the variant that was matched.