Join GitHub today
GitHub is home to over 50 million developers working together to host and review code, manage projects, and build software together.
Sign upbpo-42128: Structural Pattern Matching (PEP 634) #22917
Conversation
These are like keywords but they only work in context; they are not reserved except when there is an exact match. This would enable things like match statements.
This currently segfaults but I still like it
static int compiler_enter_scope(struct compiler *, identifier, int, void *, int); | ||
static void compiler_free(struct compiler *); | ||
static basicblock *compiler_new_block(struct compiler *); | ||
static int compiler_next_instr(basicblock *); | ||
static int compiler_addop(struct compiler *, int); | ||
static int compiler_addop_i(struct compiler *, int, Py_ssize_t); | ||
static int compiler_addop_j(struct compiler *, int, basicblock *); | ||
static int compiler_error(struct compiler *, const char *); | ||
static int compiler_error(struct compiler *, const char *, ...); |
brandtbucher
Oct 23, 2020
Author
Member
One other change I made here: compiler_error
takes varargs now, like compiler_warn
. I got tired of building strings for helpful error messages.
#define CHECK(X) \ | ||
if (!(X)) { \ | ||
return 0; \ | ||
} |
brandtbucher
Oct 23, 2020
Author
Member
This comes in handy to keep visual clutter down in the nastier compound patterns.
// To keep things simple, all compiler_pattern_* routines follow the convention | ||
// of preserving TOS (the subject for the given pattern) and pushing either True | ||
// (match) or False (no match) on top of it. We do this even for irrefutable | ||
// patterns; the idea is that it's much easier to smooth out any redundant | ||
// pushing, popping, and jumping in the peephole optimizer than to detect or | ||
// predict it here. |
brandtbucher
Oct 23, 2020
Author
Member
NB!
Also, note that the current pattern compiler is designed to be as simple and easy-to-review as possible. The PEP gives us broad freedom to do trickier stuff, but I'm not going to begin work on a rewrite until we have a couple of releases using this simpler approach.
If you want to schedule another build, you need to add the " |
So I see three categories of buildbot failures:
I’ve verified that all of these same failures occur on the 3.x buildbots, so I think they can be safely ignored here. |
Thanks @brandtbucher! I am going to try to fix the segfaults first, as we have a release soon of 3.10 alpha 2 and I don't want those segfaults to be there. |
{ | ||
// Don't blindly optimize the pattern as an expr; it plays by its own rules! | ||
// Currently, this is only used to form complex/negative numeric constants. | ||
switch (node_->kind) { |
ncoghlan
Oct 29, 2020
Contributor
Something I noticed while working on PEP 642's reference implementation is that this ast_opt code is currently relying on the parser to keep the operators that it doesn't handle away from it, so it's easy to segfault by feeding it your own AST tree, without invoking the parser.
While filling in validate_pattern
in ast.c will be the main answer to that, I think there are some genuinely missing cases here as well (e.g. +1
will fail the UnaryOp check).
isidentical
Oct 29, 2020
Member
While filling in validate_pattern in ast.c will be the main answer to that, I think there are some genuinely missing cases here as well (e.g. +1 will fail the UnaryOp check).
The drafted AST validator should prevent +1
, but I can't say for sure is whether a full coverage exists (I tried to implement it by looking at the grammar but it would be nice if @brandtbucher can do a second pass on it):
>>> t = ast.parse("match test:\n\tcase -1: pass")
>>> t.body[0].cases[0].pattern.op = ast.UAdd()
>>> print(ast.unparse(t))
match test:
case +1:
pass
>>> compile(t, "<test>", "exec")
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
ValueError: only USub operator is allowed for UnaryOp nodes inside of a pattern
brandtbucher
Oct 29, 2020
•
Author
Member
Something I noticed while working on PEP 642's reference implementation is that this ast_opt code is currently relying on the parser to keep the operators that it doesn't handle away from it, so it's easy to segfault by feeding it your own AST tree, without invoking the parser.
I'm not sure it's easy to actually segfault the compiler with malformed patterns (at least --with-pydebug
); most of the assumptions it makes here and in compile.c
about what the parser can emit are guarded either by assert(...)
or Py_UNREACHABLE()
calls. (I'm probably a heavier user of assert
than the historical authors of these files). As I see it, the goal of PyAST_Validate
is to make sure these checks cannot be hit by people playing with ast
and compile
in userland.
So checking those same assumptions in PyAST_Validate
should be sufficient (and more should be added if missing).
While filling in
validate_pattern
in ast.c will be the main answer to that, I think there are some genuinely missing cases here as well (e.g.+1
will fail the UnaryOp check).
For what it's worth, +1
isn't valid. Full grammar here.
I can't say for sure is whether a full coverage exists (I tried to implement it by looking at the grammar but it would be nice if @brandtbucher can do a second pass on it)
Ha! I knew I was forgetting something. I promised you a review, then promptly got distracted with ceval.c
and forgot. Thanks for reminding me.
brandtbucher
Oct 29, 2020
Author
Member
@isidentical can you open up a PR in your own repo (merging isidentical/cpython:patma-ast-validators
into isidentical/cpython:master
)? That way I can leave review comments.
Also, if any core devs are eavesdropping and haven't voted for Batuhan's promotion yet, please do so! He's absolutely earned it. |
def f(t):
a, b, p, q, x, y = [None]*6
match t:
case []:
pass
case [x]:
y = 2
case [a, b] if a == b:
pass
case [p, q]:
pass
print(a,b,p,q,x,y)
f((1,2)) This prints |
Hi Mark. PEP 634 says (in regard to name bindings in patterns):
In general, there are two points to keep in mind for cases like yours:
|
So the bug is in the PEP, not the implementation? |
It's not a bug, it's part of the spec. We even mention it twice:
|
Why inflict that on users? And please don't use "optimization" as an excuse, that's just lazy. |
This isn't the place to discuss PEP 634. Perhaps take it up in gvanrossum/patma (relevant discussion in gvanrossum/patma#110). It's been pretty quiet lately since the spec has matured. |
PEP 634 has not yet been accepted, but we'd like to hit the ground running and get this into alphas as soon as it (hopefully) is.
Several people have volunteered to review the implementation, since it's so huge. Other reviews are very welcome, if anybody has a bit of time to pitch in. This touches tons of stuff: the parser, the compiler, the VM, the builtins, the stdlib, the tests... I'd like as many eyeballs as possible!
https://bugs.python.org/issue42128