Split compiler into code-gen, optimizer and assembler. #87092

markshannon · 2021-01-13T15:03:24Z

BPO	42926
Nosy	@gvanrossum, @markshannon, @corona10, @brandtbucher, @iritkatriel

^{Note: these values reflect the state of the issue at the time it was migrated and might not reflect the current state.}

GitHub fields:

assignee = None
closed_at = None
created_at = <Date 2021-01-13.15:03:23.987>
labels = []
title = 'Split compiler into code-gen, optimizer and assembler.'
updated_at = <Date 2022-02-03.22:11:26.525>
user = 'https://github.com/markshannon'

bugs.python.org fields:

activity = <Date 2022-02-03.22:11:26.525>
actor = 'iritkatriel'
assignee = 'none'
closed = False
closed_date = None
closer = None
components = []
creation = <Date 2021-01-13.15:03:23.987>
creator = 'Mark.Shannon'
dependencies = []
files = []
hgrepos = []
issue_num = 42926
keywords = ['patch']
message_count = 2.0
messages = ['385033', '385043']
nosy_count = 5.0
nosy_names = ['gvanrossum', 'Mark.Shannon', 'corona10', 'brandtbucher', 'iritkatriel']
pr_nums = ['31116']
priority = 'normal'
resolution = None
stage = 'patch review'
status = 'open'
superseder = None
type = None
url = 'https://bugs.python.org/issue42926'
versions = []

PR: gh-87092: do not allocate PyFutureFeatures dynamically #98913

PR: gh-87092: remove unused SET_LOC/UNSET_LOC macros #98914

PR: gh-87092: expose the compiler's codegen to python for unit tests #99111

PR: gh-87092: move all localsplus preparation into separate function called from assembler stage #99869

markshannon · 2021-01-13T15:03:24Z

Currently the compiler operates in three main passes:

Code-gen
Optimize
Assemble

The problem is that all three passes share the same basic-block-based CFG, which leads to unnecessary coupling and inefficiencies.
A basic-block CFG is awkward and error-prone for the code-gen, and not very efficient for the optimizer and assembler.

A better design would be for the code-gen to create a single linear sequence of instructions. The optimizer would take this and produce a list of extended-blocks for the assembler to consume.

code-gen -> (list of instructions) -> optimizer
optimizer -> (list of extended blocks) -> assembler

(Extended blocks have a single entry and multiple exits, unlike basic blocks, which have a single entry and a single exit.)
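To make the proposed data flow concrete, here is a minimal sketch of how a linear instruction list could be cut into extended blocks. It is not CPython's implementation: the `Instr` class, the opcode names, and the `UNCONDITIONAL` set are hypothetical and exist only for illustration. A new block starts at every label (a possible entry point) and after every unconditional transfer of control; conditional jumps stay inside the block, which is what gives an extended block its multiple exits.

```python
from dataclasses import dataclass
from typing import List, Optional

# Hypothetical opcodes that never fall through to the next instruction.
UNCONDITIONAL = {"JUMP", "RETURN_VALUE", "RAISE"}


@dataclass
class Instr:
    opname: str
    target: Optional[str] = None  # label this instruction may jump to
    label: Optional[str] = None   # label attached to this instruction


def split_extended_blocks(instrs: List[Instr]) -> List[List[Instr]]:
    """Cut a linear instruction list into single-entry, multi-exit blocks."""
    blocks: List[List[Instr]] = []
    current: List[Instr] = []
    for ins in instrs:
        # A label is a possible entry point, so it must begin a block.
        if ins.label is not None and current:
            blocks.append(current)
            current = []
        current.append(ins)
        # Nothing falls through an unconditional transfer: close the block.
        if ins.opname in UNCONDITIONAL:
            blocks.append(current)
            current = []
    if current:
        blocks.append(current)
    return blocks


code = [
    Instr("LOAD_X"),
    Instr("POP_JUMP_IF_FALSE", target="else"),  # conditional: stays in block
    Instr("LOAD_A"),
    Instr("JUMP", target="end"),
    Instr("LOAD_B", label="else"),
    Instr("RETURN_VALUE", label="end"),
]
print([len(b) for b in split_extended_blocks(code)])  # [4, 1, 1]
```

With basic blocks the same code would need a fourth block, because the conditional jump would also end a block.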

This would:

Reduce memory use considerably (the size of instruction and block data structures would be about 60% of current).
Be faster (less CFG management).
Produce better code (extended blocks are a better unit for optimization than basic blocks).
Be easier to maintain:
a) Code-gen wouldn't have to worry about creating a correct CFG.
b) The optimizer wouldn't need to handle empty blocks and track which basic blocks form an extended block.

Apart from the changes to the compiler, it would help if we made all branch instructions absolute (or have a backward dual) to accommodate free reordering of blocks in the optimizer.

gvanrossum · 2021-01-13T15:52:32Z

SGTM. But I’m not the one who has to work with it.

iritkatriel · 2022-09-04T22:24:58Z

I've been struggling to determine where to draw the line between the optimization stage and the assembly stage.

I think the solution is to add a fourth stage, between optimization and assembly, which prepares the CFG for assembly. It does all the normalisation of pseudo-stuff (instructions, targets, etc) into actual stuff (real opcodes, offsets, etc).

I don't know if there is a name for this stage in compiler parlance. We could call it "resolve instructions" or something like that.

It will include everything to do with:

(1) calculating stackdepth and except targets, replacing exception related opcodes by NOP
(2) moving cold blocks to end of block-list
(3) everything related to making sure line numbers are correct
(4) replacing pseudo-op jumps by actual jumps, including figuring out the direction and translating conditional backwards jumps into real jumps
(5) Finally, calculating the jump offsets
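As an illustration of items (4) and (5), the following sketch resolves a pseudo `JUMP` into a directed jump with a relative offset. It is a simplification under stated assumptions, not the compiler's actual code: every instruction is assumed to occupy exactly one code unit (no EXTENDED_ARG prefixes, no inline caches), offsets are measured from the instruction following the jump, and the tuple layout and the name `resolve_jumps` are hypothetical.

```python
def resolve_jumps(instrs):
    """Resolve pseudo JUMPs into directed jumps with relative offsets.

    instrs is a list of (label, opname, target) tuples, where label and
    target are label names or None. Returns a list of (opname, oparg).
    """
    # Position of each labelled instruction, in code units.
    positions = {
        label: i for i, (label, _, _) in enumerate(instrs) if label is not None
    }
    resolved = []
    for i, (_, opname, target) in enumerate(instrs):
        if opname != "JUMP":
            resolved.append((opname, None))
            continue
        dest = positions[target]
        after = i + 1  # offsets are relative to the next instruction
        if dest >= after:
            resolved.append(("JUMP_FORWARD", dest - after))
        else:
            resolved.append(("JUMP_BACKWARD", after - dest))
    return resolved


program = [
    ("top", "LOAD", None),
    (None, "JUMP", "end"),
    (None, "NOP", None),
    (None, "JUMP", "top"),
    ("end", "RETURN", None),
]
print(resolve_jumps(program))
```

In the real compiler the offsets depend on instruction sizes, which is why this calculation has to account for EXTENDED_ARG prefixes.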

gvanrossum · 2022-09-05T01:20:10Z

I don't know what that should be called either, but after this is done, are EXTENDED_ARG prefixes for jumps all set? I'm guessing, maybe make a similar list of responsibilities for the assembler? Because I'm not sure what those are.

iritkatriel · 2022-09-05T08:59:03Z

The responsibility of the assembler is to turn a list of instructions to a code object:

(1) write the bytecode for the instructions
(2) create the lineno table
(3) create the exception table
(4) create a code object from those + metadata about the compilation unit.

Bytecode generation in stage (1) adds EXTENDED_ARG bytecodes when an instruction has a large oparg that requires it. The part that calculates jump offsets ((5) in the "resolve" list) takes into account the EXTENDED_ARGs when calculating block sizes.
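A small sketch of the EXTENDED_ARG mechanism mentioned above: each code unit carries one byte of oparg, so a larger oparg is split across up to three EXTENDED_ARG prefixes that hold the high-order bytes. The helper name `emit` is hypothetical and the sketch ignores inline cache entries; only `dis.opmap` comes from the standard library.

```python
import dis

EXTENDED_ARG = dis.opmap["EXTENDED_ARG"]
NOP = dis.opmap["NOP"]  # used only as a stand-in opcode for the demo


def emit(opcode: int, oparg: int) -> bytes:
    """Encode one instruction, adding EXTENDED_ARG prefixes when needed."""
    if not 0 <= oparg <= 0xFFFFFFFF:
        raise ValueError("oparg must fit in 32 bits")
    out = bytearray()
    # High-order bytes go first, one prefix per byte. Once a higher byte
    # is present, every lower prefix is emitted too, even if it is zero.
    for shift in (24, 16, 8):
        if oparg >> shift:
            out += bytes([EXTENDED_ARG, (oparg >> shift) & 0xFF])
    out += bytes([opcode, oparg & 0xFF])
    return bytes(out)


# Instruction size in code units (two bytes each) grows with the oparg,
# which is what the jump-offset calculation has to take into account.
for arg in (5, 300, 0x10000):
    print(arg, len(emit(NOP, arg)) // 2)
```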

The idea is that all the complex calculations would be in the new stage, and we can write tests for them. Then the assembler's job is to transform the instruction representation into what we need in the code object (all the exception/jump targets, lineno, block order, etc. are already resolved by then). Then we can write tests just for this translation stage.

gvanrossum · 2022-09-05T14:31:31Z

SG. Maybe the name can be inspired by “settle” or “tidy” or “clean”? Or maybe “resolve” is fine.

markshannon · 2022-09-05T15:18:24Z

From the list above:
(1) calculating stackdepth and except targets, replacing exception related opcodes by NOP
(2) moving cold blocks to end of block-list
(3) everything related to making sure line numbers are correct
(4) replacing pseudo-op jumps by actual jumps, including figuring out the direction and translating conditional backwards jumps into real jumps
(5) Finally, calculating the jump offsets

And splitting (1) into 1x (exception targets) and 1s (stack depth calculation) and adding O (other cfg optimizations)

In a normal compiler, there are de-sugaring and semantic analysis passes after parsing, but before optimization.
(3) and (1x) belong in there, as does computing except targets.

(2) belongs in the optimizer

(1s), (4) and (5) belong in the assembler.

Is this a good order of passes? 3, 1x, 2, O, 4, 5, 1s

iritkatriel · 2022-09-05T15:50:05Z

I like the idea of moving some of the resolution to before optimisations. I think the order you suggest is fine, by and large. I’ll need to check whether all the line number business is safe to move up front. Some of it involves duplicating blocks, etc.

iritkatriel · 2022-09-05T20:02:56Z

The lineno calculation needs to duplicate blocks that have no lineno but more than one predecessor. So it needs to come after mark_reachable, which is currently somewhere in the middle of the optimization stage. So that part of Mark's suggested reordering might be a problem. At least it needs to be done carefully.

ezio-melotti transferred this issue from another repository Apr 10, 2022

iritkatriel added a commit to iritkatriel/cpython that referenced this issue Jul 28, 2022

pythongh-87092: create a 'jump target label' abstraction so that the …

717f1b4

…compiler's codegen stage does not work directly with basic blocks

bedevere-bot mentioned this issue Jul 28, 2022

gh-87092: create a 'jump target label' abstraction so that the compiler's codegen stage does not work directly with basic blocks #95398

Merged

iritkatriel self-assigned this Jul 28, 2022

markshannon pushed a commit that referenced this issue Aug 4, 2022

gh-87092: create a 'jump target label' abstraction so that the compil…

000c387

…er's codegen stage does not work directly with basic blocks (GH-95398)

bedevere-bot mentioned this issue Aug 4, 2022

gh-87092: compiler's codegen stage uses int jump target labels, and the target pointer is only calculated just before optimization stage #95655

Merged

iritkatriel added a commit that referenced this issue Aug 11, 2022

gh-87092: compiler's codegen stage uses int jump target labels, and t…

9533b40

…he target pointer is only calculated just before optimization stage (GH-95655)

iritkatriel added a commit to iritkatriel/cpython that referenced this issue Aug 24, 2022

pythongh-87092: use basicblock_last_instr consistently

5d4b484

bedevere-bot mentioned this issue Aug 24, 2022

gh-87092: use basicblock_last_instr consistently #96243

Merged

iritkatriel added a commit that referenced this issue Aug 24, 2022

gh-87092: use basicblock_last_instr consistently in the compiler (GH-…

fba3b67

…96243)

mdboom pushed a commit to mdboom/cpython that referenced this issue Aug 24, 2022

pythongh-87092: use basicblock_last_instr consistently in the compiler (

33c27a2

pythonGH-96243)

bedevere-bot mentioned this issue Sep 9, 2022

gh-87092: reduce redundancy and repetition in compiler's optimization stage #96713

Merged

iritkatriel added a commit that referenced this issue Sep 13, 2022

gh-87092: reduce redundancy and repetition in compiler's optimization…

6d7a0e0

… stage (GH-96713)

bedevere-bot mentioned this issue Sep 19, 2022

gh-87092: in compiler, move the detection of exception handlers before the CFG optimization stage #96935

Merged

iritkatriel added a commit that referenced this issue Sep 20, 2022

gh-87092: in compiler, move the detection of exception handlers befor…

98e785d

…e the CFG optimization stage (GH-96935)

iritkatriel mentioned this issue Sep 21, 2022

Fix line numbers generated by the compiler faster-cpython/ideas#469

Closed

bedevere-bot mentioned this issue Sep 29, 2022

gh-87092: bring compiler code closer to a preprocessing-opt-assembler organisation #97644

Merged

iritkatriel added a commit that referenced this issue Oct 5, 2022

gh-87092: bring compiler code closer to a preprocessing-opt-assembler…

c529b45

… organisation (GH-97644)

mpage pushed a commit to mpage/cpython that referenced this issue Oct 11, 2022

pythongh-87092: bring compiler code closer to a preprocessing-opt-ass…

6d4d702

…embler organisation (pythonGH-97644)

iritkatriel added a commit to iritkatriel/cpython that referenced this issue Oct 31, 2022

pythongh-87092: PyFutureFeatures no longer allocated dynamically

4717323

bedevere-bot mentioned this issue Oct 31, 2022

gh-87092: do not allocate PyFutureFeatures dynamically #98913

Merged

iritkatriel added a commit to iritkatriel/cpython that referenced this issue Oct 31, 2022

pythongh-87092: remove unused SET_LOC/UNSET_LOC macros

cfe9385

bedevere-bot mentioned this issue Oct 31, 2022

gh-87092: remove unused SET_LOC/UNSET_LOC macros #98914

Merged

iritkatriel added a commit that referenced this issue Nov 2, 2022

gh-87092: do not allocate PyFutureFeatures dynamically (GH-98913)

6d683d8

iritkatriel added a commit that referenced this issue Nov 2, 2022

gh-87092: remove unused SET_LOC/UNSET_LOC macros (GH-98914)

df84b7b

bedevere-bot mentioned this issue Nov 4, 2022

gh-87092: expose the compiler's codegen to python for unit tests #99111

Merged

iritkatriel added a commit that referenced this issue Nov 14, 2022

gh-87092: expose the compiler's codegen to python for unit tests (GH-…

a3ac923

…99111)

bedevere-bot mentioned this issue Nov 29, 2022

gh-87092: move all localsplus preparation into separate function called from assembler stage #99869

Merged

iritkatriel added a commit that referenced this issue Nov 30, 2022

gh-87092: move all localsplus preparation into separate function call…

ac12e39

…ed from assembler stage (GH-99869)
