
proposal: spec: generics: type switch on parametric types #45380

rogpeppe opened this issue Apr 4, 2021 · 174 comments


rogpeppe commented Apr 4, 2021

In the discussion for #45346, this comment mentioned the possibility of adding a type switch on parametric types. Given that it's threatening to derail the discussion there, I'm creating this proposal to talk about it.

Proposal: type switching on type parameters

I propose that a new statement be added to the language which allows a function with type parameters to further constrain its type parameters:

switch type T {
case A1:
case A2, A3:
   ...
}

The switched type (T above) must be the name of a type parameter. The type in a case arm can be any type, including constraint types.

The first case whose constraint matches the type parameter is chosen.

Within a chosen arm of the case statement, the type parameter that's being switched on becomes further restricted by the type selected, just as if it had originally been constrained by that type too. Precisely, given a switched type T constrained by C, and a type in the switch arm A, within the code of the switch arm, T takes on the type:

    interface{
        C
        A
    }

If there are multiple cases mentioned in the arm (A1, A2, ... An), then T is constrained by using a union element:

    interface{
        C
        A1 | A2 | ... | An
    }

Example

type Stringish interface {
	string | fmt.Stringer
}

func Concat[S Stringish](x []S) string {
    switch type S {
    case string:
        // S is constrained by interface {Stringish; string} which is the same as
        // interface{string} which is the same as string, so we can use x
        // as a normal []string slice.
        return strings.Join(x, "")
    case fmt.Stringer:
        // S is constrained by interface {Stringish; Stringer}
        // which is the same as `Stringer`, so we can call 
        // its `String` method but we can't use x directly as
        // []fmt.Stringer because it might have any layout
        // or size.
        var buf strings.Builder
        for _, s := range x {
             buf.WriteString(s.String())
        }
        return buf.String()
    }
}

The above assumes that the constraint:

interface {
     X | Y | Z
     X
}

would allow all operations allowed on X, but I believe that's true under the #45346 proposal.

Note: I don't think there's any need to allow ~ to be used directly in these type switch cases - case interface {~T}: is sufficient if necessary.


rogpeppe commented Apr 4, 2021

Replying to @candlerb #45346 (comment)

func Sort[T Lessable[T]](list []T) {
	var less func(a, b T) bool
	switch type T {
	case Ordered:
		less = func(a, b T) bool { return a < b }
	case Lesser:
		less = func(a, b T) bool { return a.Less(b) }
	}
    
	// sort using less
}

I can't see how the two branches of the "switch" can be compiled in the same function instance. For a given T, only one branch or the other is semantically valid.

Isn't this very similar to what happens with a switch x := x.(type) statement currently?

Let's assume for the sake of argument that two versions of Sort with different type parameters can be compiled into the same code (it might do this when both types have the same GC layout).

Then the type switch statement could look at the metadata for the type and "unlock" other operations on the type, just as a type switch does for interface values in current Go - there's no particular reason that Sort would need to be split into two versions just because of the type switch. Splitting is of course still a valid implementation strategy and would result in more efficient code, at the usual cost of potential code bloat.


Merovius commented Apr 4, 2021

Note: I don't think there's any need to allow ~ to be used directly in these type switch cases - case interface {~T}: is sufficient if necessary.

I assume you would apply the same to union-elements?

FWIW, while it doesn't add expressive power, I'd prefer if we'd allow both approximation- and union-elements directly, if we can make it work without syntactical ambiguities. It's more convenient and IMO still clear. And I would prefer if the cases can more directly reflect what's in the constraint, that is, I think

type Constraint interface {
    ~int | ~int8 | ~string
}

func ThisSyntax[T Constraint]() {
    switch type T {
    case ~int | ~int8:
        // …
    case ~string:
        // …
    }
}

func IsClearerThanThisSyntax[T Constraint]() {
    switch type T {
    case interface{~int | ~int8 }:
        // …
    case interface{ ~string }:
        // …
    }
}

But it's a weakly held opinion.

I think this would be good, but one concern I have is that I think it would have unfortunate overlap with simply allowing approximation-elements in type switches and assertions (let's call it "approximate type switch"). I think the form of type switch you propose here (let's call it "parameter type switch") is clearly better for type parameters. But the approximate type switch is clearly better for "sum-types", as you'd need to unpack the value at some point - so you'd want to type-switch on a value, not a type.

So, IMO

  1. It is slightly unfortunate to have both
  2. If we never add "sum-types", the parameter type switch would be better
  3. If we do add "sum-types" and only want one, the approximate type switch would be better

Of course, we can always eat the cost of "slightly unfortunate" and add both, when the time comes.


Merovius commented Apr 4, 2021

Isn't this very similar to what happens with a switch x := x.(type) statement currently?

One significant difference is that switch x := x.(type) visibly declares a new variable (shadowing the old one), scoped to the switch case block.

Arguably, the parameter type switch could "declare a new type" (shadowing the old one), scoped to the switch case block. But it's still arguably more implicit and "magic".


rogpeppe commented Apr 4, 2021

I think this would be good, but one concern I have is that I think it would have unfortunate overlap with simply allowing approximation-elements in type switches and assertions (let's call it "approximate type switch"). I think the form of type switch you propose here (let's call it "parameter type switch") is clearly better for type parameters. But the approximate type switch is clearly better for "sum-types", as you'd need to unpack the value at some point - so you'd want to type-switch on a value, not a type.

I think that adding approximation-elements to the regular type switch is orthogonal to this proposal. In the case of regular interface values, knowing the type of the value doesn't tell you anything about the type of any other value, but that's not the case here, where once we know about the type, we know the type of all values that use that type.

That's why I chose to restrict this proposal to allow only exactly type parameter names to be specified in the switch, because then there's no ambiguity - we're further constraining the already-known type and thus we know that all arguments, variables and return parameters that use that type are constrained likewise.


rogpeppe commented Apr 4, 2021

To try to be clearer, this proposed statement is a "switch on a type" rather than a "switch on the type of a value". That's why it's very different from the current type switch statement, and also why it's specifically useful in the context of parametric types.


Merovius commented Apr 4, 2021

I think that adding approximation-elements to the regular type switch is orthogonal to this proposal.

I think they are different, but I don't think they are orthogonal/independent. In particular, the "approximate type switch" would in terms of what you can express subsume the "parameter type switch" (or at least most of it). That is, you can write code doing the same, with the same static guarantees, using either - even if less convenient.

In the case of regular interface values, knowing the type of the value doesn't tell you anything about the type of any other value

That is true. But this proposal doesn't talk about regular interface values, it talks about type parameters. So, if you are in a situation where you can use the "parameter type switch", you do know that two values have the same type. That is, you could write

func Max[T constraints.Ordered](a, b T) T {
    switch a := a.(type) {
    case ~float64:
        return math.Max(a, b.(~float64))
    default:
        if a > b {
            return a
        }
        return b
    }
}

and this is statically known to be correct - even if inconvenient, by requiring an extra type-assertion, you know this type-assertion can't fail.

That's why I chose to restrict this proposal to allow only exactly type parameter names to be specified in the switch, because then there's no ambiguity - we're further constraining the already-known type and thus we know that all arguments, variables and return parameters that use that type are constrained likewise.

I don't see how case ~int supposedly adds ambiguity, where interface{ ~int } doesn't. One is simply syntactic sugar for the other. Sorry, I think I misunderstood what you were saying, disregard this.

To try to be clearer, this proposed statement is a "switch on a type" rather than a "switch on the type of a value". That's why it's very different from the current type switch statement, and also why it's specifically useful in the context of parametric types.

Yupp :) That's what I meant when I said this is clearly better for type parameters :)


rogpeppe commented Apr 4, 2021

I think that adding approximation-elements to the regular type switch is orthogonal to this proposal.

I think they are different, but I don't think they are orthogonal/independent. In particular, the "approximate type switch" would in terms of what you can express subsume the "parameter type switch" (or at least most of it). That is, you can write code doing the same, with the same static guarantees, using either - even if less convenient.

In the case of regular interface values, knowing the type of the value doesn't tell you anything about the type of any other value

That is true. But this proposal doesn't talk about regular interface values, it talks about type parameters. So, if you are in a situation where you can use the "parameter type switch", you do know that two values have the same type. That is, you could write

func Max[T constraints.Ordered](a, b T) T {
    switch a := a.(type) {
    case ~float64:
        return math.Max(a, b.(~float64))
    default:
        if a > b {
            return a
        }
        return b
    }
}

I don't think that's quite right - ~float64 can't be assigned to float64 without an explicit type conversion, and neither could the float64 returned by math.Max be assigned to the return parameter.

You'd need something like this instead:

func Max[T constraints.Ordered](a, b T) T {
    switch a := a.(type) {
    case ~float64:
        return interface{}(math.Max(float64(a), float64(interface{}(b).(~float64)))).(T)
    default:
        if a > b {
            return a
        }
        return b
    }
}

That would become less verbose if value type assertions could be done directly on non-interface values, but still not great.

and this is statically known to be correct - even if inconvenient, by requiring an extra type-assertion, you know this type-assertion can't fail.

Although as a programmer, you might know that those dynamic type assertions are OK, they seem problematic to me - they're verbose and easy to get wrong (with a nasty panic if you do).


Merovius commented Apr 4, 2021

I don't think that's quite right - ~float64 can't be assigned to float64 without an explicit type conversion

Who knows, we are talking about a speculative design with no proposal filed :) IMO, x.(~float64) should evaluate to float64 exactly and if ~float64 is used as a case, a should have type float64. But either way, that doesn't matter a lot.

neither could the float64 returned by math.Max be assigned to the return parameter.

Probably true. This is harder to handwave away :)

Although as a programmer, you might know that those dynamic type assertions are OK, they seem problematic to me - they're verbose and easy to get wrong (with a nasty panic if you do).

That's why I said this proposal is better, as long a we're only concerned with type parameters.


zephyrtronium commented Apr 4, 2021

As I see it, this proposal is to add a way to say, "given a thing that might be one of possibly infinitely many types, add special handling for cases where it is one of a chosen set of types." This is already exactly what the current syntax for type switches does, where "thing" means "value" rather than "type". Considering how similar they are, why is the proposed syntax so different?

In particular, with the current generics proposal, it would be possible to implement parsing with the only new AST nodes being those needed to describe constraints (type lists for the proposal as accepted, or type set operations for #45346, both only on interface declarations). Depending on how the implementation of type parameters is handled, static analysis of type-checked programs could be done with only the same new node; tools that only analyze function bodies might not need to be updated at all. How is the cost of an entirely new syntactic construct to every Go code parser and to every static analysis tool using them justified?

I would prefer if the spelling of these type switches were also switch T.(type).


thepudds commented Apr 4, 2021

I would prefer if the spelling of these type switches were also switch T.(type).

One other option — in an earlier conversation (still in the context of type lists), Ian had floated this syntax:

func F[T constraints.Integer]() { 
    switch T { 
    case int: 
    case int8: 
    } 
} 

The proposed semantics at that time though were not as nice as the semantics proposed here under type sets, I think.


zephyrtronium commented Apr 4, 2021

I don't think that's quite right - ~float64 can't be assigned to float64 without an explicit type conversion, and neither could the float64 returned by math.Max be assigned to the return parameter.

You'd need something like this instead:

func Max[T constraints.Ordered](a, b T) T {
    switch a := a.(type) {
    case ~float64:
        return interface{}(math.Max(float64(a), float64(interface{}(b).(~float64)))).(T)
    default:
        if a > b {
            return a
        }
        return b
    }
}

That would become less verbose if value type assertions could be done directly on non-interface values, but still not great.

I agree with @Merovius that I would expect case ~float64: to cause T to become float64 within the case, but my reason is specifically that, as proposed in #45346, ~float64 is not a type. There are no values of type ~float64, only of types whose underlying type is float64. Aside from your comment, none of the suggestions in this thread seem to suggest that behavior should change.

(I also note that if the constraint of the function were ~float64, and hence the constraint applied in the type switch case were the same, then T and float64 should be convertible to each other, so it should be fine under even the most conservative type checking of this proposal to return T(math.Max(float64(a), float64(b))).)


urandom commented Apr 4, 2021

I would think that if a case were to include an approximation type (e.g. ~float64), then the compiler should be able to deduce that within the branch T would be a type whose underlying type is float64, and should thus be able to convert a float64 back to T. If that is really the case, then restricting such a switch to only non-approximation types sounds like an unnecessary restriction.


Merovius commented Apr 4, 2021

I also note that if the constraint of the function were ~float64, and hence the constraint applied in the type switch case were the same, then T and float64 should be convertible to each other

Yes, but to allow it, the compiler would have to know about it. It is easy to use v.(~float64) as a float64, because neither v itself, nor its type, actually change - you look at a new variable with a new type. Meanwhile, if you'd want to convert T(v) after that, the compiler would have to take into account that the type-assertion before succeeded. That's not how Go works so far - for example, you can't do

var r io.Reader = new(bytes.Buffer)
_ = r.(*bytes.Buffer) // succeeds
var br *bytes.Buffer = (*bytes.Buffer)(r) // type-assertion succeeded, so we know that we can convert r to *bytes.Buffer

It's a little different, because we are talking type parameters, but it still destroys the elegant simplicity.

So, I do agree with @rogpeppe that type switching on the type parameter is useful and enables you to do new things, because there is inherent logic in saying "inside the switch case, T is interpreted as being constrained to the type set in the case".

I still don't think they are orthogonal though. They still both have significant overlap.


Merovius commented Apr 4, 2021

@zephyrtronium

This is already exactly what the current syntax for type switches does, where "thing" means "value" rather than "type". Considering how similar they are, why is the proposed syntax so different?

I believe it is because of what you mention - we aren't switching on a value, but on a type. But FTR, I don't think it really matters for the substance of the proposal whether it's switch T, switch type T or switch T.(type) - so maybe we shouldn't worry about the color of our bikeshed just yet :)


Merovius commented Apr 4, 2021

FWIW, after the discussion so far, I've come around to this. I think it's a good addition :)

I still would like to allow union- and approximation elements directly (without wrapping them in interface) though and I also think that this shouldn't prevent us from adding type switches/assertions using approximation elements either :)


rogpeppe commented Apr 4, 2021

I don't think that's quite right - ~float64 can't be assigned to float64 without an explicit type conversion

Who knows, we are talking about a speculative design with no proposal filed :) IMO, x.(~float64) should evaluate to float64 exactly and if ~float64 is used as a case, a should have type float64.

FWIW the semantics I'd expect from an approximate type switch on a value would be that the value inside the case would allow just the same operations as if the type had been constrained by the switched type in a generic function parameter.

That is, in this example, case ~float64 would allow only operations allowed by every type with underlying type float64, and since you can't assign a value with a defined float64 type to a float64 without an explicit type conversion you wouldn't be able to call Max, hence the need for an explicit type conversion in my code snippet.

Note that being able to assert on ~float64 doesn't help when you're wanting to operate on a slice or other higher level type though, because the constraint syntax doesn't allow interface{[]~float64} AFAICS.

However, switching on the type parameter itself does potentially help that case, because then the slice can be used as an argument to any function with suitable constraints.

For example:

func MaxVec[F comparable](xs []F) F {
    switch F {
    case interface{~float64 | ~float32}:
        // This is OK because F is here constrained by interface{comparable; ~float64 | ~float32}.
        return MaxFloatVec(xs)
    default:
        ...
    }
}

func MaxFloatVec[F interface{ ~float64 | ~float32 }](xs []F) F {
    ...
}

This makes me realise an interesting point here: even if one has an approximate type switch, unless you significantly add to the power of the approximation and union element features (a big ask, given their current elegance), you will still not be able to do something like assert on the underlying type of a type component such a slice element.

That is, something like this would probably not be allowed, which arguably makes approximate type switches on values considerably less interesting as a candidate feature for the language.

func X(x interface{}) {
    switch x := x.(type) {
    case []~float64:
        ...
    }
}


rogpeppe commented Apr 4, 2021

@zephyrtronium The concrete syntax I've proposed here is only one of many possible. A syntax like switch T { would work just as well; in fact I think I might prefer that, although it arguably gives fewer clues to the reader of the code about what's going on.


rogpeppe commented Apr 4, 2021

I still would like to allow union- and approximation elements directly

Note that you could add union elements for completeness, but they're technically redundant with respect to this proposal because a case with multiple comma-separated elements is the same as a union element.

Come to think of it: you can use multiple elements in a type switch case currently, but the result is the original interface type. That behaviour probably can't change for backward compatibility reasons, but if one allowed a union element directly, one could constrain the available operations to the intersection of operations available on all the selected types, just as with generics.

That doesn't affect this proposal though.

My main concern about this proposal is that it might make the compiler's job considerably harder. I'm trying to think through the potential implications there.


Merovius commented Apr 4, 2021

FWIW the semantics I'd expect from an approximate type switch on a value would be that the value inside the case would allow just the same operations as if the type had been constrained by the switched type in a generic function parameter.

That would require ~float64 to be a type though. That is, with v := v.(~float64), v needs to have some type. IMO float64 is the most natural and most useful type here. Why would I not want it to be a float64? Except avoiding having to learn "v.(~T) evaluates to a T"?

Note that this is completely different when we are talking about this proposal. When you change the constraints put on T inside the case block, you can obviously do more things with it, because you can have many values of that type and be sure they are the same type.

Agreed, to the rest of the comment.

That is, something like this would probably not be allowed, which arguably makes approximate type switches on values considerably less interesting as a candidate feature for the language.

FWIW, I think approximate type switches will be a must if we ever allow using all interfaces as types. Up until then, they are a nice-to-have, at best (if something like this proposal gets accepted - I do strongly feel that we need some way to specialize on the type argument eventually).

Note that you could add union elements for completeness, but they're technically redundant with respect to this proposal because a case with multiple comma-separated elements is the same as a union element.

This is a drawback to me. Personally, I think I would consider disallowing multiple cases in this type switch construct. It seems we need to choose whether we'd rather be consistent with other switch constructs or be consistent with the constraint syntax. Unfortunate.

My main concern about this proposal is that it might make the compiler's job considerably harder. I'm trying to think through the potential implications there.

Can you give an example? I'm not sure what would make it harder. Conceptually, each case block is just an anonymous generic function with a more constrained type parameter. ISTM that if we can type-check a generic function, we should already be able to type-check this construct as well?


zephyrtronium commented Apr 4, 2021

@Merovius

I also note that if the constraint of the function were ~float64, and hence the constraint applied in the type switch case were the same, then T and float64 should be convertible to each other

Yes, but to allow it, the compiler would have to know about it. It is easy to use v.(~float64) as a float64, because neither v itself, nor its type, actually change - you look at a new variable with a new type. Meanwhile, if you'd want to convert T(v) after that, the compiler would have to take into account that the type-assertion before succeeded. That's not how Go works so far - for example, you can't do

var r io.Reader = new(bytes.Buffer)
_ = r.(*bytes.Buffer) // succeeds
var br *bytes.Buffer = (*bytes.Buffer)(r) // type-assertion succeeded, so we know that we can convert r to *bytes.Buffer

Getting a bit off-topic over an example, but I don't follow your argument here. You can do this:

var r io.Reader = new(bytes.Buffer)
switch r := r.(type) {
case *bytes.Buffer:
	var br *bytes.Buffer = r
}

which much closer resembles the example under discussion. The question is the operations (specifically conversions) legal on a parameterized type within a case of a type switch that effectively redefines the type parameter's constraint.

Now, there is a bit of a difference here in that I use r := r.(type) whereas the switch on a parameterized type does not. You can't assign var br *bytes.Buffer = r without shadowing, of course, because the r remains type io.Reader. However, the difference is that r is a value with an interface type, whereas T in the original example is a constraint – not even a type. Per the proposal, within the switch case, the latter is defined to operate as if the function's constraint type set were originally the intersection of the actual constraint and the constraint used in the case. The only types in that intersection are convertible to float64 and float64 is convertible to any type in it, so by the behavior specified in the accepted proposal, conversions between those types are allowed.

Perhaps this does relate to @rogpeppe's comment while I was typing this:

My main concern about this proposal is that it might make the compiler's job considerably harder. I'm trying to think through the potential implications there.

And, perhaps this line leads me to conclude that switch T.(type) is not quite the correct syntax, either, because this sort of type constraint switch effectively includes a redeclaration, which arguably should not be implicit.

@zephyrtronium

@rogpeppe

@zephyrtronium The concrete syntax I've proposed here is only one of many possible. A syntax like switch T { would work just as well; in fact I think I might prefer that, although it arguably gives less clues to the reader of the code about what's going on.

Noted. To me, there is a major difference between proposing an extension of semantics for existing syntax to analogous behavior, and proposing new syntax when an analogous one already exists. But perhaps I am a bit too early to this complaint.

@randall77

I can definitely see some trickiness with this idea.

func f[T any](a T) {
    switch T {
        case float64:
            var x T // x is a float64
            x = a // both are type "T", why can't I assign them?
    }
}

I think it is cleaner if you introduce a new type-parameter-like-identifier, like:

func f[T any](a T) {
    switch U specializing T { // or some other syntax
        case float64:
            var x U // x is a float64
            x = a // now it is clear this is not allowed, as it is assigning a T to a U.
    }
}


rogpeppe commented Apr 4, 2021

That would require ~float64 to be a type though. That is, with v := v.(~float64), v needs to have some type. IMO float64 is the most natural and most useful type here. Why would I not want it to be a float64?

Because the original type still carries semantic weight and could be useful. A type switch tells you what the dynamic type of the value is; it doesn't convert it to some other type.

For example, I'd expect the following code to print "MyFloat", not "float64":

type MyFloat float64

func P[F any](f F) {
    switch type F {
    case ~float64:
        fmt.Printf("%T\n", f)
    default:
        fmt.Printf("%T\n", f)
    }
}

func main() {
    P(MyFloat(64))
}


rogpeppe commented Apr 4, 2021

I can definitely see some trickiness with this idea.

func f[T any](a T) {
    switch T {
        case float64:
            var x T // x is a float64
            x = a // both are type "T", why can't I assign them?

In this proposal, you can. Within that case, T is known to be exactly float64, as if declared by a type alias.
So both x and a have type float64 as well as type T.

The important point is that the specialisation affects all variables in scope that use type T.
It's OK to do that because, unlike regular interface values, there's no special GC shape associated with T, so we can specialise without needing to create a new value.

That is, we're not creating a new T scoped to the case - we are genuinely specialising the original T for the extent of that case.

@randall77

Ah, ok, so we could think of these switches as happening at the top level. We put them syntactically lower down in the AST just to share all the code outside the switch.


rogpeppe commented Apr 4, 2021

Ah, ok, so we could think of these switches as happening at the top level. We put them syntactically lower down in the AST just to share all the code outside the switch.

Yes, you could look at it that way, although I'm not sure how helpful that is.

The way I look at it is that within the case statement, you get a more precise view of what T happens to be, so you can do more specific operations on values defined in terms of T. If the generated code isn't fully expanded out for every possible type, the switch operation may well involve some runtime cost (not at the top level) to look up the relevant dictionary of available operations, much as the usual type switch does with the method dictionary.


Merovius commented Apr 4, 2021

@zephyrtronium

ISTM that the confusion is that you are talking about this proposal (reasonably so) while I was discussing a different idea - namely allowing type switches/assertions on values to assert on the underlying type. And I was discussing that idea to compare its expressive power to the one proposed by @rogpeppe. Everything you say is true, under this proposal.


atdiar commented Jan 4, 2023

There isn't/shouldn't.
However, the type parameter switch will treat the type parameter differently depending on whether int was satisfied first or ~int was.

That's more obvious an issue with interface{ ~string | fmt.Stringer } perhaps.

Edit: Oh wait... Yes there is a restriction on basic interfaces in unions... At least for now.
I guess another future-proofing concern. (exhaustive pattern matching and decomposition)

@Merovius
Contributor

Merovius commented Jan 4, 2023

However, the type parameter switch will treat the type parameter differently depending on whether int was satisfied first or ~int was.

I don't understand what you mean. This seems simply untrue. Can you provide an example of code where they would behave differently under this proposal?

Edit: Oh wait... Yes there is a restriction on basic interfaces in unions... At least for now.
I guess another future-proofing concern. (exhaustive pattern matching and decomposition)

Note that the proposal as written deals with this perfectly fine. It just makes exhaustiveness checks impossible and implies that it's an even worse idea not to match cases in source order, as there are more overlaps.

@Merovius
Contributor

Merovius commented Jan 4, 2023

To be clear: under this proposal,

switch T {
case int: // …
case ~int: // …
}
// and
switch T {
case ~int: // …
case int: // …
}

Behave differently. But they behave the same regardless of whether T is constrained on int|~int or on ~int|int. The two interfaces are identical, as they should be.

@atdiar

atdiar commented Jan 5, 2023

Well, that was the point (admittedly not well expressed). Cases are likely to be written in the source order of the terms of the union, so the order in the switch is quite likely to match the order in the constraint.

Done naively, overlaps would remain unhandled a bit too easily.

If decomposition were enforced, there would be much less chance of a switch (or constraint) modification changing the whole program.

To be clear, all branches will still be visited in source code order but semantics won't rely on the source code order if cases are dealt with exhaustively.

Much safer then to add a union term to a constraint for instance.

@Merovius
Contributor

Merovius commented Jan 5, 2023

It is frustrating, after three rounds of back-and-forth, to find out that it was all a dud: you did not actually mean that the two interfaces are treated differently (which would be confusing and bad), but that two switch statements are treated differently (which is expected and good).

In any case: IMO the principle of least surprise is to behave as other switch statements already do. I disagree that there is any reasonable chance of confusion here. People do not get confused that any of these differ either:

switch x.(type) {
case io.Reader:
case *bufio.Reader:
}
// vs.
switch x.(type) {
case *bufio.Reader:
case io.Reader:
}
// or
switch {
case x < 10:
case x < 5:
}
// vs.
switch {
case x < 5:
case x < 10:
}

And so on. The "the first matched case is executed" semantic is easy to explain, understand and intuit. No matter in what order the original interface mentioned the cases.

You'd be treating a non-existent problem with a caustic solution, by forcing people to write unreadable code:

switch T {
case fmt.Stringer:
    return v.String()
case ~string:
    return string(v)
default:
    return fmt.Sprint(v)
}
// vs.
switch T {
case ~string:
    switch T {
    case fmt.Stringer:
        return v.String()
    default:
        return string(v)
    }
default:
    switch T {
    case fmt.Stringer:
        return v.String()
    default:
        return fmt.Sprint(v)
    }
}

So, no. I don't believe that's a good change to the proposal.

@atdiar

atdiar commented Jan 5, 2023

Well, sorry to hear that you're frustrated. The proposal does mention that adding a case is equivalent to constraining the type parameter. So case order seems to be reflected at the constraint level, at least for the writer of such code.

My point is that the order in which to write the cases is not necessarily obvious when cases cannot be ordered in terms of decreasing specificity.

The problem is not exactly with source order execution.
I think that's fine.

My issue is case satisfaction ambiguity.
For instance, all your examples depend on the coder writing the more specific case first.
This is more an issue if someone wants to add a case to the switch. Now in which position within the switch is it the least ambiguous?

We could eschew the problem with enforced term decomposition.
It has some merits, one being that the order in which the cases are written wouldn't matter (no forward-compatibility issue; it could even allow program transformations to increase the performance of these type switches, etc.).

Much less error prone.

Wrt readability, that's a syntactic issue most likely. Presumably, something akin to

switch i:=v.(type) {
case ~string:
    case fmt.Stringer:
        return i.String() // we could also decide to do something else here
    default:
        return string(i)
case fmt.Stringer:
    // bonus: we know that it is not a ~string in that branch
     return i.String()
}

// vs what you propose which does not handle every case
switch T {
case fmt.Stringer:
    return v.String()
case ~string:
    return string(v)
default:
    return fmt.Sprint(v)
}

Admittedly, that's quite more of a concern if we ever want to lift the restriction on basic interfaces as union terms.

That's just a question that I think could be considered though.
Your input is acknowledged, however. :)

@jimbobmcgee

jimbobmcgee commented Jan 5, 2023

@atdiar It is not at all clear what problem you are seeing. Can you express it as a short code example, with your expectation vs. reality? [EDIT: above post now has code example]

do we want compiler-checked exhaustiveness for unions? (possibly with a way to opt-out?)

Would that not lend itself to backward compatibility problems, i.e. adding another "supported" type to an existing union would immediately break existing code. That might block useful future stdlib changes.

A vet warning might be nice, though, but I fear it might not be trivial to implement. The trouble is, if it is too noisy, everyone is going to end up covering their switch type with a default just to make it go away, and then you haven't really achieved anything.

Of course, the fact that you could cover with a default may mean that compile-time failure is worth having -- at least there would be a way to "opt-out".

My vote (for what it is worth) is no, purely because of the irritation of having to write out every possible case vs. the benefit of doing so. For me:

type T interface{ int | string }

switch T {
case int:
   DoSomething()
}

...should not have to become...

switch T {
case int:
   DoSomething()
default:
}

The problem is not exactly with source order execution.

This surely always has to eventually boil down to an if T is some-type then ... else if T is other-type ... else ... decision, and something has to be checked first. Evaluating cases in lexical order makes logical sense, and is in keeping with existing switch logic.

(I suppose select is a similar structure which, by rights, could appear to evaluate its cases out of order, but we are not discussing select T...)
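That desugaring into an if/else chain can be spelled out with today's type assertions; a sketch under the assumption that lexical order is preserved (viaSwitch and viaIfElse are illustrative names):

```go
package main

import "fmt"

// viaSwitch uses a value type switch; cases are tested in lexical order.
func viaSwitch(x interface{}) string {
	switch x.(type) {
	case int:
		return "int"
	case string:
		return "string"
	default:
		return "other"
	}
}

// viaIfElse is the equivalent hand-written chain of type assertions.
func viaIfElse(x interface{}) string {
	if _, ok := x.(int); ok {
		return "int"
	}
	if _, ok := x.(string); ok {
		return "string"
	}
	return "other"
}

func main() {
	for _, v := range []interface{}{1, "a", 1.5} {
		// the two formulations agree for every input
		fmt.Println(viaSwitch(v), viaSwitch(v) == viaIfElse(v))
	}
}
```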

We could eschew the problem with enforced term decomposition.

You are almost certainly going to have to clarify "term decomposition", because it reads like you want switch to support...

switch T.(doWhatIMeantWhenISaidIt) {
case ~string:
case io.Stringer:
}

...and, if you mean that, then I'd like to propose we broaden this proposal to replacing func main() with func doWhatIMeantWhenISaidIt(), and never writing anything else ever again. :-)

@jimbobmcgee

jimbobmcgee commented Jan 5, 2023

@atdiar Regarding...

switch i:=v.(type) {
case ~string:
    case fmt.Stringer:
        return i.String() // we could also decide to do something else here
    default:
        return string(i)
case fmt.Stringer:
    // bonus: we know that it is not a ~string in that branch
     return i.String()
}

...I would be very concerned if indentation was a deciding factor in the nesting of switch-case.

If you want to be able to nest decisions on types, you would surely have to do:

switch i:=v.(type) {
case ~string:
    switch i := i.(type) {
    case fmt.Stringer:
        return i.String() // we could also decide to do something else here
    default:
        return string(i)
    }
case fmt.Stringer:
    // bonus: we know that it is not a ~string in that branch
    return i.String()
}

Of course, it may then hold that switch i := i.(type) should be allowed to be evaluated for any type, not just for interface{} and that the result of case ~string be of an anonymous type that also implements all of the not-eliminated types.

Instead, I think the straightforward way to get this over the line would be for...

switch i:=v.(type) {
case ~string:
    // i is a string
case fmt.Stringer:
    // i is a fmt.Stringer
}

...and that any extra interfaces are sadly lost (to i), once we reach the case that applies.

You could (should) still be able to do...

switch a := v.(type) {
case ~string:
    // a is a string
    switch b := v.(type) {
    case fmt.Stringer:
        // b is a fmt.Stringer
    }
case fmt.Stringer:
    // a is a fmt.Stringer
}

...if you had to -- you would just have to juggle your use of a and b as necessary.

A terser syntax, such as case ~string && !fmt.Stringer: might have benefits, although I'm not sure what the re-typed variable becomes in all cases (possibly string for this case; but what for just case !fmt.Stringer:?). Almost certainly for another proposal, though...

@atdiar

atdiar commented Jan 5, 2023

@jimbobmcgee
If you add a term to a union, it broadens the set of acceptable type arguments, so I don't think it breaks backward compatibility?

switch T {
case fmt.Stringer:
    return v.String()
case ~string:
    return string(v)
}

The case of the type parameter satisfying both fmt.Stringer and ~string is underspecified.
Ideally, there should be a subcase for fmt.Stringer (that can be a default/catchall subcase for that branch).

On the other hand, I'm now a little more circumspect about adding new cases to a switch statement anywhere other than in the last position. (backward compatibility issues)

Ack about indentation. This is a mistake on my part as I was merely thinking about readability improvements.
Instead of case, one could use subcase as a keyword perhaps if ultimately required.

I'm less convinced that it is necessary now. Unless we want to make reeaally safe the addition of new cases to a type parameter switch, regardless of the case position and/or force the coder to deal with all cases of overlapping type sets.

I'm also thinking about this issue with union/sum types in mind where I don't think we can have a default clause. I need to think a bit more about it.

@Merovius
Contributor

Merovius commented Jan 5, 2023

@atdiar

Wrt readability, that's a syntactic issue most likely. Presumably, something akin to

That code is incorrect, under your proposal. ~string and fmt.Stringer are not disjoint cases - there are types implementing fmt.Stringer, which have underlying type string. Furthermore, you do not cover the case of a type being neither (which is necessary, as ~string|fmt.Stringer is not a valid constraint). In particular:

// vs what you propose which does not handle every case

It does. That's exactly the point. It handles every case, exactly as is the natural way to do and as it is intended - it always calls the String method if it exists, it uses a simple conversion if possible and it falls back to a reflect based alternative if not. That was the point of the example - to show that the natural code is easy to write and read and understand.

So, you are re-interpreting the requirements, you then write code which side-steps the problem with your suggestion my example was pointing out (by being incorrect under your proposed rules) and the result still uses a nested switch and is thus significantly harder to read and write.

So, no, I remain unconvinced.

@atdiar

atdiar commented Jan 5, 2023

@Merovius when I said it didn't handle all cases, I meant that default was the escape hatch that allows such code to be valid without all the combinations being explicitly considered.

My point was that we would have exhaustiveness checking so there wouldn't be a need for the default clause. That depends on the constraint of the type parameter.

Anyway, it's something that needs to be pondered.

@Merovius
Contributor

Merovius commented Jan 5, 2023

@atdiar I understand what you meant, but what you meant is wrong.

@atdiar

atdiar commented Jan 5, 2023

@atdiar I understand what you meant, but what you meant is wrong.

I'm not sure you do.
A type may satisfy fmt.Stringer... But what if it also satisfies ~string? Is this subcase handled in your code or do we just stop at fmt.Stringer?

The point is that if we want to be exhaustive, nested switches are needed.

@Merovius
Contributor

Merovius commented Jan 5, 2023

That is not the code I did want to write. I agree that if you want to do a different thing, you need to write different code, perhaps using more nesting.

But the point of the example was that under your proposed rules, you have to write the nested switches even if you want to do the simple thing. Because you always have to explicitly list (and nest) all possible combinations of overlap, which quickly leads into combinatoric explosions.

To illustrate, here is how listing all combinations looks for your proposal:

switch T { // side-note: the suggested syntax is to switch on the type parameter
case ~string:
    switch T {
    case fmt.Stringer:
        // both fmt.Stringer and ~string
    default:
        // ~string, but not fmt.Stringer
    }
default:
    switch T {
    case fmt.Stringer:
        // fmt.Stringer, but not ~string
    default:
        // neither fmt.Stringer, nor ~string
    }
}

Here's how it looks under the original proposal:

switch T {
case fmt.Stringer:
    switch T {
    case ~string:
        // both fmt.Stringer and ~string
    default:
        // fmt.Stringer, but not ~string
    }
case ~string:
    // ~string, but not fmt.Stringer
default:
    // neither fmt.Stringer, nor ~string
}

(edit: had this first, then edited to use interface{ ~string; fmt.Stringer }, before realizing that's disallowed and changing it back)

Not nice, but strictly nicer than your proposal. That's not a coincidence. It's a consequence of a problem with your proposal. Feel free to try and come up with an example where your proposal requires less nesting, but I believe you'll have trouble.

Again, I chose the examples deliberately, to demonstrate a point. Please try to understand and address that point.

@jimbobmcgee

jimbobmcgee commented Jan 5, 2023

@atdiar ...

If you add a term to a union, it broadens the set of acceptable type arguments so I don't think it changes backward-compatibility?

If the switch is required to be compile-time exhaustive (i.e. by which I take to mean "must minimally have a case that is true for every member of the union"), then adding a new type to the union requires adding an extra case to all switches that reference it.

If you don't own that union (i.e. it is in another package or in the stdlib, such as constraints.Ordered), then someone else making that change in that package creates the requirement for you to update your switch (unless you peg/vendor your dependencies).

A default clause may allow you to react to uncovered cases (and therefore opts out of the compile-time check) but at best, you're going to log, panic or return error; at worst you're going to silently fail.

If you take "exhaustive" as "any possible combination", does that not become prohibitively difficult to compute? It feels like one of those P=NP problems, but I'm not a CS guy, so maybe I'm worrying about nothing.

Instead of case, one could use subcase as a keyword perhaps if ultimately required.

Still not sold on needing a nested subcase, over merely nesting a second switch. What if you needed a third level? Would you subsubcase?

That said, on the subject of nesting, I do see the potential need to open up type-switching to more than interface{} types. If each branch of the switch eliminates the unmatched types, writing a nested switch becomes more difficult to do (or at least a little more smelly) if you have to keep casting back to interface{} at each level.

I'm also thinking about this issue with union/sum types in mind where I don't think we can have a default clause. I need to think a bit more about it.

I think you can always have the default clause; I just don't think you can necessarily type-eliminate in its code path. In the same way you can always...

var x interface{}
switch x := x.(type) {
case int:
   // shadowed x is int
default:
   // shadowed x is interface{}
}

...you'd just end up with the un-eliminated type T...

switch T {
case int:
    var x T // is now an int
default:
    var x T // is still T
}
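The value-switch half of this (the default arm keeping the interface type) is today's behaviour and can be checked directly; a small sketch with an illustrative describe helper (the switch T half remains hypothetical):

```go
package main

import "fmt"

// describe mirrors the snippet above: in the default arm the shadowed
// variable keeps the static type interface{}, observed here via %T.
func describe(x interface{}) string {
	switch v := x.(type) {
	case int:
		return fmt.Sprintf("int %d", v) // v has static type int here
	default:
		return fmt.Sprintf("%T", v) // v is still interface{} statically
	}
}

func main() {
	fmt.Println(describe(7))    // int 7
	fmt.Println(describe(3.14)) // float64
}
```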

I'd need to see an example of a union/sum type that could not default before I understood your concern. I'll leave you to consider.

Some of this may depend on whether we are talking about switch T or switch aValueOfT.(type). I know I have a preference for the latter, so my views might be coloured by that. Luckily for everyone else, I'm just a voice on the internet!!

@Merovius
Contributor

Merovius commented Jan 5, 2023

FTR, I don't think requiring exhaustiveness checks works anyways (edit: just saw that @jimbobmcgee mentioned that possibility, and yes, indeed). Checking that a type parameter switch covers all cases is a similar problem to calculating whether a type set is empty. So at the very least, we'd have to introduce restrictions to make this problem tractable, e.g. disallowing using comparable or interfaces with methods as switch cases. Which is a pretty huge restriction.

It just so happens that I also think it would be a bad idea, even if it was computationally feasible.

@atdiar

atdiar commented Jan 5, 2023

@Merovius
Well, you do embed, only not in the type switch cases but by defining a constraint that uses embedding.
You are still depending on being able to find the right ordering of cases, which is obvious here but unlikely to be obvious in general. You would have to define many more new constraints and think about ordering, if that is even possible.

Note
I know it would be tedious to write out all cases; hence why I mentioned that it could deter writing unnecessarily complex interfaces, which could actually be a benefit. That's just an idea.

I am pondering this exhaustiveness issue only in the case of type parameters constrained by a union (with the behavior possibly extended to sum/union types). I do tend to think that it might be too problematic wrt backward compatibility.

The one issue I see, however, is backward compatibility, should someone attempt to add a term to a union that is being type-switched over somewhere arbitrary.

So I think that exhaustiveness is not necessarily a good idea to enforce at the compiler level after all.

@atdiar

atdiar commented Jan 5, 2023

@jimbobmcgee
You're right. I was thinking about that too. As long as there is a type parameter switch somewhere, it would freeze the definition of the constraint.
Even if we could add a new case to every type switch, that would still not be backward-compatible.

@Merovius
Contributor

Merovius commented Jan 5, 2023

@atdiar I had to revert an edit of that example back, because I forgot that ~string|fmt.Stringer is disallowed, so the first paragraph is no longer accurate. On the other hand, the argument remains the same and effective in the original form.

And I want to re-invite you to come up with an example where your proposed restriction (where it feasible) would require less nesting than the proposal as written. Don't just make statements like "You would have to define many more new constraint and think about ordering if that is even possible", come up with an actual example.

@atdiar

atdiar commented Jan 5, 2023

@Merovius why would it require less nesting...? That's not really what I claimed. I said that nesting would be required.
Your example is nice. But with more overlapping constraints, you would have more constraint intersections to define and handle in the type switch, and I don't know how I would order them. Not sure it is tractable without nesting. (Just consider adding a ReaderStringer interface as a case of the type parameter switch, for example.)

But regardless, I don't think that exhaustive checking mixes too well with backward-compatibility. Unless a constraint cannot be changed once defined.

@jimbobmcgee well yes, you can default, and that's the escape hatch that I was hinting at, out of exhaustiveness.
But all things considered, I'm not sure exhaustiveness is that flexible anyway.

@Merovius
Contributor

Merovius commented Jan 6, 2023

@atdiar To be clear: My original point was, that your proposed rule would make code harder to read and write, because it requires to split up cases into more and more deeply nested switch cases. Do you agree? Because while you say you agree that nesting is needed, you make it sound like "the original proposal also requires nesting for complex cases" makes up for that deficiency, which I find deeply disagreeable.

FWIW the backwards compatibility problem does not exist, I believe. If the constraint is not exported, all code switching over a type parameter constrained on it will be in the same package, so can be changed in the same step. If, OTOH, it is exported, adding a new case to it is already a backwards incompatible change, similar to removing a method from a basic interface:

package a

type C interface { ~int }

// --

package b

func F[T ~int]() {}

func G[T a.C]() {
    F[T]()
}

Adding a new case to a.C would break b.G. So, the backwards compatibility concern is a red herring.

@atdiar

atdiar commented Jan 6, 2023

I'm not interested in bickering back and forth at this point, sorry.
The point is that exhaustive checking is not very useful here.
And that although adding a term to a union interface is not backward-incompatible by itself (which I said before), everyone else would have understood that it mixes badly with backward-compatibility.
You just showed one other example of such a case which is great.
What else do you still want to argue about? It's pointless. :)

[edit] And to be fair, if I'm not mistaken, I think you had a good hunch but unfortunately didn't develop it.
Seems to me that it is possible to be exhaustive without subcases for a union.
It simply requires building the intersections between the terms of the union in decreasing order of specificity:

for a | b | c
we would have:

case a x b x c
case a x b
case a x c
case b x c
case a
case b
case c

where a, b, c are constraints. (Cases with the same number of embeddings, symbolized here by x, should be interchangeable in position, if I'm not wrong.)

No need for a default case here. In a real union, a lot of cases would even disappear.
But it needs explaining since it is not necessarily obvious to everyone. It should influence how people write their switch.

Otherwise, I am confused by the claim that exhaustiveness checking is a similar problem to checking that a type set is empty, which I believe amounts to SAT solving (unless I am misinterpreting).
It seems to me that the difficulty of SAT solving lies in exponential time complexity, but here the main issue is exponential space complexity, which is much more mitigable (unless someone could and would create a union with thousands of basic interface types, but good luck to that person ;)
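For scale: the decomposition above enumerates every non-empty subset of the union's terms, so the case count grows as 2^n − 1 in the number of terms n. A quick illustrative sketch (numCases is a made-up helper, not from the thread):

```go
package main

import "fmt"

// numCases counts the non-empty subsets of n union terms: under the
// proposed exhaustive decomposition each subset is a distinct
// intersection case, so a switch needs up to 2^n - 1 arms.
func numCases(n int) int {
	return 1<<n - 1
}

func main() {
	for _, n := range []int{3, 10, 20} {
		fmt.Printf("%d terms -> %d cases\n", n, numCases(n))
	}
}
```

Even a modest union of 20 terms already admits over a million intersection cases, which is the exponential-space concern being debated here.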

@rogpeppe
Contributor Author

rogpeppe commented Jan 6, 2023

And that although adding a term to a union interface is not backward-incompatible by itself (which I said before),

In passing, for the record: adding a term to a union interface can be backward-incompatible by itself. Consider, for example, adding an additional term (say int32) to I1 in the code below, which will cause the code to fail to compile:

type I1 interface {
	int8 | int16
}

type I2 interface {
	int8 | int16
}

func f[T I1]() {
	g[T]()
}

func g[T I2]() {
}

@Merovius
Contributor

Merovius commented Jan 6, 2023

@atdiar

It simply requires to build the intersections between terms of the union in decreasing specificity order:

My understanding of your suggestion was that you specifically wanted to forbid having cases with non-empty intersections. I thought that was the whole point of your suggestion. Either way, you are not refuting what I said, because that's exactly the "combinatorial explosion" I was talking about. Note how that requires an exponential number of switch cases in the number of union terms.

This seems to me that the difficulty of SAT Solving is about Exponential time complexity but here the main issue is Exponential Space complexity which is much more mitigable

Exponential space is strictly worse than exponential time, because the compiler has to parse the source code, which takes at least linear time in its length. So compiling a program that takes exponential space necessarily takes exponential time as well (but also, well, exponential space).

@rsc
Contributor

rsc commented Jan 18, 2023

It sounds like maybe we want this, but we don't have the bandwidth to work out the right answer right now. Putting on hold until we have a clearer answer and/or more time.

@rsc
Contributor

rsc commented Jan 18, 2023

Placed on hold.
— rsc for the proposal review group
