Skip to content

WIP: redesign interp #2655

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Draft
wants to merge 35 commits into
base: dev
Choose a base branch
from
Draft

WIP: redesign interp #2655

wants to merge 35 commits into from

Conversation

niaow
Copy link
Member

@niaow niaow commented Feb 27, 2022

This is an attempt to redesign interp. It is still relatively early on, but the main goals are to track side-effects more accurately and keep time complexity down.

@niaow
Copy link
Member Author

niaow commented Feb 27, 2022

This is still a bit more complex than I want (especially with errRuntime vs reverts)

@niaow
Copy link
Member Author

niaow commented Mar 1, 2022

There is one known problem right now: the memory management assumes that a store of a non-byte-multiple integer (mainly i1) is zero extended when storing and truncated when loading. This is not really correct, but should not actually be a problem. I don't really have a better idea for this.

@aykevl
Copy link
Member

aykevl commented Mar 1, 2022

There is one known problem right now: the memory management assumes that a store of a non-byte-multiple integer (mainly i1) is zero extended when storing and truncated when loading. This is not really correct, but should not actually be a problem. I don't really have a better idea for this.

Why would that not be correct? All modern general-purpose ISAs are byte addressable, not bit addressable. LLVM backends lower i1 to i8. I think this assumption is correct.

@niaow
Copy link
Member Author

niaow commented Mar 1, 2022

From the LLVM docs for store:

When writing a value of a type like i20 with a size that is not an integral number of bytes, it is unspecified what happens to the extra bits that do not belong to the type, but they will typically be overwritten.

If I understand what this is saying, this is unspecified behavior. However, yes I expect that this is at least generally correct.

@niaow
Copy link
Member Author

niaow commented Mar 1, 2022

Now most of the base tests pass. The main exceptions are:

  • The reflect.go and json.go tests fail (this failure can be replicated by completely turning interp off, so it does not seem to be a bug in interp)
  • The WASM target doesn't really work right now because of bugs in ExternalInt64AsPtr (wasi is fine)
  • env.go, stdlib.go, and testing.go fail on emulated Cortex-M3 and RISC-V

@niaow
Copy link
Member Author

niaow commented Mar 3, 2022

The env.go issue is caused by internal/testlog.logger is getting corrupted. So far it is difficult to track down why.

@niaow
Copy link
Member Author

niaow commented Mar 3, 2022

Well I found the problem: it was a stack overflow in unicode.init, since it was not interpreted due to the allocas.

@aykevl
Copy link
Member

aykevl commented Mar 13, 2022

The reflect.go and json.go tests fail (this failure can be replicated by completely turning interp off, so it does not seem to be a bug in interp)

This is probably because those rely on OptimizeReflectImplements, which in turn relies on interp.
...yes, that's a bit fragile.

@niaow
Copy link
Member Author

niaow commented Mar 13, 2022

The remaining issue with env.go now is that maps and other special data structures do not have attached type information (which causes the allocation to be left on the heap). I think that we do need to pass type information to these through the constructors, which implies getting the frontend to emit the compound type info.

@niaow niaow force-pushed the newer-interp branch 2 times, most recently from a0df4b9 to ac5f18f Compare March 18, 2022 21:22
@niaow
Copy link
Member Author

niaow commented Mar 19, 2022

With the latest commit, the reflect and json tests pass.

Use the existing compilerContext.createObjectLayout on the channel's
underlying element to create a layout value that can be passed to the
channel buffer's allocation.
Split compilerContext.createObjectLayout into two parts in preparation
for calling the second part by the map creation layout function.
A Bits type for manipulating a size and a bitmap as one. Probably
overkill for the needs coming up of building the layout for a hashmap
but it can be shrunk down or replaced. For now, it makes the building of
a bucket layout straightfoward in the next commit.

Also the unit tests are a bit much given the limited use of the type.
Modify build.createMapUpdate to take the element type.

Define the compilerContext.createMapLayout for creating a hashmap bucket
layout.

Pass the hashmap layout to the hashmapMake and hashmapSet functions.

Define a hashmapBucketLayout function that takes sizes and bitmaps for
the key and element types of a map and returns the size and bitmap for
the bucket type. This is the function that needs to be kept in sync with
the bucket layout implemented by src/runtime/hashmap.go.

In src/runtime/hashmap.go, modifies arguments to hashmapMake and the
hashmapSet family of functions to take the layout parameter in order to
get the layout value passed to the alloc call.
Copy link

@FrankReh FrankReh left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Given my confusion regarding pointerAlign vs pointerSize, the first parameter name to createObjectLayout1 is then a misnomer. Maybe both should be passed to createObjectLayout1 or maybe createObjectLayout1 should should compute those two values again on its own.

compiler/llvm.go Outdated
@@ -102,6 +101,70 @@ func (c *compilerContext) createObjectLayout(t llvm.Type, pos token.Pos) llvm.Va
layout := (uint64(1) << 1) | 1
return llvm.ConstIntToPtr(llvm.ConstInt(c.uintptrType, layout, false), c.i8ptrType)
}
return c.createObjectLayout1(pointerSize, objectSizeBytes, bitmap, pos)

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

My version passed a pointerSize. But with the change to line 163, this should also be the alignment. Maybe better to pass both or pass neither and just recompute them in createObjectLayout1.

compiler/llvm.go Outdated
// We know there are pointers, we added one ourself for the bucket overhead
// so no check for that.

return c.createObjectLayout1(pointerAlign, mSizeBytes, mbitmap, pos)

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This shouldn't pass the alignment unless the earlier call does too.

compiler/llvm.go Outdated
}

func (c *compilerContext) createObjectLayout1(pointerSize, objectSizeBytes uint64, bitmap *big.Int, pos token.Pos) llvm.Value {
pointerAlignment := c.targetData.PrefTypeAlignment(c.i8ptrType)

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is one of the causes of confusion. I got the alignment here, without passing it from the caller.

@niaow
Copy link
Member Author

niaow commented Mar 25, 2022

I am not really sure how I feel about this. I have gotten a lot working so far, but this is just way too big and complicated right now.

@niaow
Copy link
Member Author

niaow commented Jun 11, 2022

Some things that went right with this attempt:

  1. The fixed-shape tree used for memory works quite well and has predictable overhead
  2. The bit cat and slice mechanisms simplify memory handling a lot
  3. Side-effect calculation is fairly accurate
  4. The structure of each instruction using offsets into a list to get inputs and then appending its outputs to that list works well
  5. The structure of the constructed representation works well, and the depth-first traversal of the dominator tree simplifies the construction of the internal interp representation
  6. It is relatively easy to debug behavior
  7. The "resolve" process on arguments is effective and flexible

Some things that went wrong with this attempt:

  1. The whole distinction between errRuntime and errRevert is very confusing
  2. There are still a few cases where repeated reverts can result in O(bad) overhead with repeated restoration of memory state
  3. Memory lifetimes get stretched when they are used at runtime
  4. Structs in interfaces are quite heavy, so functions use more memory than they really need to
  5. The lack of access to the bodies of functions from imported packages (e.g. errors.New) forces extra work to be done after linking
  6. The process of implementing an instruction is excessively complicated
  7. The bit-concatenation type is not properly comparable
  8. Its just too big, although this might be necessary
  9. The "resolve" process on arguments is unnecessarily slow

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants