Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Prototype v2 for Module Splitting #4111

Draft
wants to merge 16 commits into
base: master
from
Draft

Prototype v2 for Module Splitting #4111

wants to merge 16 commits into from

Conversation

@gzm0
Copy link
Contributor

gzm0 commented Jul 1, 2020

Since the previous PR (#4010), the following has improved:

  • The overall architecture is ready for review.
  • The test suite passes in CommonJS.
@gzm0 gzm0 force-pushed the gzm0:split-module branch from b38c63f to c5ac3ee Jul 1, 2020
@gzm0 gzm0 requested a review from sjrd Jul 1, 2020
@gzm0 gzm0 marked this pull request as draft Jul 1, 2020
Copy link
Member

sjrd left a comment

Some comments on the first commit. I don't have more time to continue this now, unfortunately, and I'll be on the road for the next 4 days. I'll come back to it next week.

@@ -41,8 +42,6 @@ object Infos {
val methods: List[MethodInfo],
val jsNativeMembers: List[MethodName],
val exportedMembers: List[ReachabilityInfo],

This comment has been minimized.

@sjrd

sjrd Jul 1, 2020

Member
Suggested change
val exportedMembers: List[ReachabilityInfo],
val exportedMembers: List[ReachabilityInfo]
}

builder.result()
(classDef, classInfo, topLevelExportInfos)

This comment has been minimized.

@sjrd

sjrd Jul 1, 2020

Member

What's the point of returning classDef? Shouldn't this function only return (classInfo, topLevelExportInfos)?

This comment has been minimized.

@sjrd

sjrd Jul 1, 2020

Member

Going further, I think generateClassInfo should stick with returning classInfo only, and generateTopLevelExportInfos should take a ClassDef (and actually be intended to be called publicly from elsewhere).


/* Module initializers, which by spec run at the end. */
unit.moduleInitializers.iterator.map(classEmitter.genModuleInitializer(_))
)(Position.NoPosition)

val trackedGlobalRefs = classIter
.map(_.trackedGlobalRefs)
.foldLeft(coreJSLibTrackedGlobalRefs)(unionPreserveEmpty(_, _))
.foldLeft(
coreJSLibTrackedGlobalRefs ++ topLevelExportsTrackedGlobalRefs)(

This comment has been minimized.

@sjrd

sjrd Jul 1, 2020

Member
Suggested change
coreJSLibTrackedGlobalRefs ++ topLevelExportsTrackedGlobalRefs)(
unionPreserveEmpty(coreJSLibTrackedGlobalRefs, topLevelExportsTrackedGlobalRefs))(

Unless we're using GCC, there is a high chance that both of these sets are empty.

for (topLevelExport <- unit.topLevelExports) {
implicit val ctx = ErrorContext(topLevelExport.tree)

val clazz = lookupClass(topLevelExport.owningClass)

This comment has been minimized.

@sjrd

sjrd Jul 1, 2020

Member

I don't think we can do that. If a class has only TopLevelMethodExportDefs, and is not otherwise needed, it will not appear at all in the unit.classDefs, and therefore lookupClass will not find it.

import org.scalajs.linker.interface.unstable.{OutputFileImpl, OutputDirectoryImpl}

private[linker] object LinkerCompat {
def linkerOutputToOutputSpec(

This comment has been minimized.

@gzm0

gzm0 Jul 3, 2020

Author Contributor

Is this adaptation sufficiently backwards compatible? (it is not 100%).

import org.scalajs.linker.interface.unstable.ModuleSpecImpl.SelectorImpl

/** Module output specification. */
final class ModuleSpec (

This comment has been minimized.

@gzm0

gzm0 Jul 3, 2020

Author Contributor

I have also considered to make a ModuleSpec a list inherently (like in an earlier proposal). However, it seems that with the fully automatic module tracking, the input ModuleSpec is unlikely to turn into something more automatic.

final class OutputSpec private (
val directory: OutputDirectory,
val jsName: ModuleSpec.ModuleID => String,
val sourceMapName: ModuleSpec.ModuleID => String,

This comment has been minimized.

@gzm0

gzm0 Jul 3, 2020

Author Contributor

I'm unsure where to put jsName / sourceMapName. If I put it here, it is nice for the sbt plugin, since it keeps full control. OTOH, we need more machinery in the sbt plugin to support the "mjs" suffix for node.

Maybe putting these into the config (and then maybe just make them suffix strings due to serializability) and instead return a LinkingReport containing the files that were written might be a better idea.

This comment has been minimized.

@sjrd

sjrd Jul 14, 2020

Member

AFAICT, it makes more sense for those things to be here. At least that way StandardConfig doesn't have to know about files.

This comment has been minimized.

@gzm0

gzm0 Jul 17, 2020

Author Contributor

I'm less and less convinced this is the right spot TBH. I think we should have patterns for the following:

  • JS file name
  • Source map name
  • Source map URI
  • JS file URI
  • module name (for include)

At that point, it really doesn't feel like this belongs to OutputSpec, but rather in a ModulePatterns option class in StandardConfig. WDYT?

This comment has been minimized.

@gzm0

gzm0 Jul 20, 2020

Author Contributor

Hmmm... Turns out like this we cannot easily provide a backwards compatible method in Linker (because config cannot be change at that point).

I'm unsure how to proceed. IMHO as soon as things modify the contents of the files, it definitely should go into config :-/

This comment has been minimized.

@sjrd

sjrd Jul 20, 2020

Member

Perhaps one way to answer those questions: StandardConfig is just one way of configuring a linker, i.e., a standard linker. Does it make sense for another linker to be configured in a different way for file names? Would the sbt plugin be able to survive a different scalaJSLinker implementation that used a completely different scheme for file name configuration?

Today, the sbt plugin considers scalaJSLinkerConfig almost entirely as a blackbox, only used to create the default scalaJSLinkerConfig. The "almost" is because we actually dig into its moduleKind field to build jsEnv (and having some weird warning that if moduleKind differs in fast/fullOptJS than just in the configuration, things will break). To me this suggests that moduleKind already has a very specific status, and maybe shouldn't have been part of the scalaJSLinkerConfig to begin with. If these module patterns would equally affect the rest of the sbt plugin, perhaps they should stay out of StandardConfig?

This comment has been minimized.

@gzm0

gzm0 Jul 21, 2020

Author Contributor

Hah, I think this is an excellent way of looking at it.

So what we observe is that the sbt plugin is peeking at the scalaJSLinkerConfig to find out which moduleKind the output is. It does this without taking the scalaJSStage into account, requiring the weird warning.

IMHO a better way to deal with this is a LinkingReport providing this information. At this point, the sbt plugin does not need to know/care anymore why a certain module type was returned (we could imagine a custom linker that only supports a specific type). It simply knows which module type was returned.

A similar strategy can then be applied to output files: The LinkingReport will simply list all the file names (and source maps) that have been produced.

This partially also addresses the backwards compatibility issue: We need not care about the config for file names but can

  • write to a in memory directory
  • retrieve the single output from the LinkingReport and write to the actual output

It does not address how to deal with sourceMapURI / jsFileURI.

This comment has been minimized.

@gzm0

gzm0 Jul 27, 2020

Author Contributor

How about something like this?

final class OutputPatterns private (
    val jsFile: String,
    val sourceMapFile: String,
    val moduleName: String,
    val jsFileURI: String,
    val sourceMapURI: String
) {
  def this() = {
    this(
        jsFile = "[module-id].js",
        smFile = "[js-file].map",
        moduleName = "./[js-file]",
        jsFileURI = "[js-file]",
        smFileURI = "[sm-file]"
    )
  }

  // snip
}

This is inspired by webpack's placeholders. We could make this fully backwards compatible except:

  • There would always be a link between source-map / js file.
  • jsFileURI and sourceMapURI of a given LinkerOutput are set to the same (non-empty) value.

We'd have to combine this with a LinkingReport such that the sbt-plugin can recover root files and types.

@sjrd WDYT?


//TODO: This dependency chain fails with this:
// java.io.Flushable: java.lang.Object: org.scalajs.linker.runtime.UndefinedBehaviorError: java.lang.Throwable: java.lang.StackTraceElement: key not found: ClassName<java.lang.StackTraceElement>
//deps ++= classInfos.referencedFieldClasses

This comment has been minimized.

@gzm0

gzm0 Jul 3, 2020

Author Contributor

Any idea what is happening here?

val externalDeps =
topLevelExports.flatMap(externalDependencies(_)).distinct.toList

// TODO: Filter empty modules.

This comment has been minimized.

@gzm0

gzm0 Jul 3, 2020

Author Contributor

Unsure what to do here. Should an empty module cause a failure? Should it cause an empty file? Should it cause no file?

This comment has been minimized.

@sjrd

sjrd Jul 15, 2020

Member

It seems to me that the most reasonable outcome would be an empty file. If a module has been requested by the user, and it ends up being empty, why should we get rid of it or consider that an error?

This comment has been minimized.

@gzm0

gzm0 Jul 15, 2020

Author Contributor

why should we get rid of it or consider that an error?

It might hint at a configuration error. But yes, probably an empty module is OK.

import org.scalajs.linker.interface.ModuleInitializer
import org.scalajs.linker.interface.ModuleSpec.ModuleID

final class ModuleSet private[linker] (

This comment has been minimized.

@gzm0

gzm0 Jul 3, 2020

Author Contributor

An alternative is to evolve the LinkingUnit into this class instead. What is unclear is how the Refiner would deal with a non-single module LinkingUnit.

This comment has been minimized.

@sjrd

sjrd Jul 15, 2020

Member

No, I think what you have now is better. It's better if front-end phases (the optimizer, but also future custom front-end phases) don't have to worry about modules.

ModuleSpec(modName)
.withInitializers(moduleInitializers)
.withSelector(ModuleSpec.Selector.default)
)

This comment has been minimized.

@gzm0

gzm0 Jul 3, 2020

Author Contributor

sbt plugin needs a lot more work. It's not 100% clear to me ATM how we can reasonably support existing builds that assume single-file outputs without introducing a lot of double logic.

@gzm0 gzm0 force-pushed the gzm0:split-module branch 2 times, most recently from 02417f4 to 606dd62 Jul 9, 2020
@gzm0
Copy link
Contributor Author

gzm0 commented Jul 11, 2020

Rebased on top of #4120 and addressed the comments in the first commit (@sjrd WDYT about merging that one on its own?).

@gzm0 gzm0 force-pushed the gzm0:split-module branch 8 times, most recently from 26170eb to fc662ab Jul 11, 2020
Copy link
Member

sjrd left a comment

Some comments as I finally get to review this, piece by piece.

Some things I've seen so far and that I like:

  • The ModuleSpec concept.
  • ModuleSplitStyle.
  • The ModuleSet and ModuleSplitter concepts.

There's still a lot that I need to go through, so the above list is not yet exhaustive ;)

@@ -118,6 +118,9 @@ private final class Analyzer(config: CommonPhaseConfig,
// Entry points (top-level exports and static initializers)

This comment has been minimized.

@sjrd

sjrd Jul 14, 2020

Member
Suggested change
// Entry points (top-level exports and static initializers)
// Static initializers
val JSMethodDef(flags, pName, params, body) = topLevelMethodExportDef.methodDef
implicit val ctx = ErrorContext(topLevelMethodExportDef.methodDef)

val static = flags.namespace.isStatic

This comment has been minimized.

@sjrd

sjrd Jul 14, 2020

Member

This val is unused.

cacheUpdate = irFile.tree.map(tree =>
(tree, Infos.generateClassInfo(tree), Infos.generateTopLevelExportInfos(tree)))
Comment on lines 136 to 137

This comment has been minimized.

@sjrd

sjrd Jul 14, 2020

Member

Coding style nit:

Suggested change
cacheUpdate = irFile.tree.map(tree =>
(tree, Infos.generateClassInfo(tree), Infos.generateTopLevelExportInfos(tree)))
cacheUpdate = irFile.tree.map { tree =>
(tree, Infos.generateClassInfo(tree), Infos.generateTopLevelExportInfos(tree))
}

final class LinkedTopLevelExport(
val owningClass: ClassName,
val tree: TopLevelExportDef,

This comment has been minimized.

@sjrd

sjrd Jul 14, 2020

Member
Suggested change
val tree: TopLevelExportDef,
val tree: TopLevelExportDef
final class OutputSpec private (
val directory: OutputDirectory,
val jsName: ModuleSpec.ModuleID => String,
val sourceMapName: ModuleSpec.ModuleID => String,

This comment has been minimized.

@sjrd

sjrd Jul 14, 2020

Member

AFAICT, it makes more sense for those things to be here. At least that way StandardConfig doesn't have to know about files.

final class OutputSpec private (
val directory: OutputDirectory,
val jsName: ModuleSpec.ModuleID => String,
val sourceMapName: ModuleSpec.ModuleID => String,

This comment has been minimized.

@sjrd

sjrd Jul 14, 2020

Member

Have you considered merging both of these in a single function?

val jsNameWithSourceMap: ModuleSpec.ModuleID => (String, String)

The name is uglier, but those things should always stick together, shouldn't they?

def withDirectory(directory: OutputDirectory): OutputSpec =
copy(directory = directory)
Comment on lines 23 to 24

This comment has been minimized.

@sjrd

sjrd Jul 14, 2020

Member

Since directory is a mandatory item provided in apply, it probably does not make sense to add a withDirectory. I think so far we've not provided withX methods for x'es that are mandatory arguments of the apply method of config classes.


finallyWith(writeLoop(), chan.close())
}
/** Convencience method to open a new channel this file's output directory. */

This comment has been minimized.

@sjrd

sjrd Jul 14, 2020

Member
Suggested change
/** Convencience method to open a new channel this file's output directory. */
/** Convenience method to open a new channel in this file's output directory. */

?

}

private def finallyWith(v: Future[Unit], f: => Future[Unit])(
/** Convencience method to write this file in its output directory. */

This comment has been minimized.

@sjrd

sjrd Jul 14, 2020

Member
Suggested change
/** Convencience method to write this file in its output directory. */
/** Convenience method to write this file in its output directory. */
@@ -47,6 +49,7 @@ object Analysis {
*/
trait ClassInfo {
def className: ClassName
def data: Infos.ClassInfo

This comment has been minimized.

@sjrd

sjrd Jul 14, 2020

Member

The raw info shouldn't be exposed in the Analysis. One use case of an Analysis is to build a separate tool that would render graphically/interactively the reachability graph of a Scala.js codebase, so it is a bit more public in spirit than Infos, which is really internal to the Analyzer.

This comment has been minimized.

@gzm0

gzm0 Jul 15, 2020

Author Contributor

Hmmm... But what is the alternative? We need an info-like thing to determine module dependencies. Are you suggesting we re-calculate this if necessary with another tree traverse?

This comment has been minimized.

@sjrd

sjrd Jul 15, 2020

Member

We can potentially extend the Analysis, but now I see that there is a bigger problem: LinkedClass, which is in standard, contains references to Infos as well. That goes beyond intention, and directly into stabler-references-less-stable territory, which is not good at all.

From what I see, those Infos are needed in ModuleAnalyzer.staticDependencies. IIUC, this computes a dependencies: Set[ClassName]. Could we compute these dependencies eagerly in the Infos/Analyzer? Then we could expose that dependencies in Analysis (where it would make sense) and further in LinkedClass (which would therefore not depend on a type less stable than itself).

This comment has been minimized.

@gzm0

gzm0 Jul 16, 2020

Author Contributor

Could we compute these dependencies eagerly in the Infos/Analyzer?

Yes. The first prototype did exactly that. Downsides:

  • It makes the Analyzer even more complicated.
  • The Analyzer now contains specifics to modules.
  • We calculate it twice but only need it once (if we do not add yet another flag to the Analyzer).

I feel like the dependencies we calculate in ModuleAnalyzer are much more private than the Infos. Notably, they are not independent of other classes and depend on symbolRequirements. The Infos on the other hand are a mere summary of the trees of a class.

An alternative to what we have right now, might be to make Info caching more central. As such, both the Refiner and the ModuleAnalyzer could access Infos (without exposing them in the LinkedClass). It is unclear how to properly identify the LinkedClass in this case though.

This comment has been minimized.

@sjrd

sjrd Jul 17, 2020

Member

If we break down the things that staticDependencies computes, there are:

  • Things read from ReachabilityInfo, among which
    • instantiatedClasses and accessedModules, which depend on isNonNative. Why do we need to know whether it is non-native? Could we get away without this knowledge? If not, we could store them in a separate set that the ModuleAnalyzer will filter for non-native classes.
    • Everything else, which do not depend on other classes.
  • Things read directly from the classInfos, which can also be read from existing fields of LinkedClass, AFAICT
  • Things depending on the symbolRequirements, which are only added to j.l.Object, and that the ModuleInitializer could patch up when reading the dependencies of j.l.Object (after all, that is clearly a private implementation detail of the ModuleInitializer).
  • The artificial dependency from everything on j.l.Object, which the ModuleInitializer can also patch up when reading the dependencies from the LinkedClass.

That would remove all the dependencies on other classes and on symbolRequirements.

The Infos might be less private in terms of exposed information, but as an API they are very unstable, and clearly an implementation detail of the Analyzer.

This comment has been minimized.

@gzm0

gzm0 Jul 20, 2020

Author Contributor

Why do we need to know whether it is non-native?

Because instantiating a native class does not actually need the SJS module that class is in. But the native module that class is in.

This is why we need the distinction between, for example, instantiatedClasses and methodsCalledStatically.keys. A static method of a native class is still an SJS thing. However, a new instance is not.

Could we get away without this knowledge?

No.

So in terms of ReachabilityInfo this establishes that we need finer granularity than simply a Set[ClassName]. The alternative I see here is that we make a difference between a static and an external dependency in the Analyzer. This is not really a breach of abstraction, since it follows from the IR spec (the distinction, what exactly constitutes a static dependency is, AFAICT an implementation detail of the Emitter, so we probably have a breach of abstraction either way).

The upside of this would be:

  • No exposed infos.
  • No exposed isParentDataAccessed

The downside of this would be:

  • More complexity in the Analyzer.
  • static and external dependencies are calculated pre optimizer for no reason (if it is enabled).
  • static and external dependencies are calculated even without module splitting.

This comment has been minimized.

@gzm0

gzm0 Jul 22, 2020

Author Contributor

@sjrd would you like me to try this? How I expect this to look is that LinkedClass would get a staticDependencies: Set[ClassName] and a externalDependencies: Set[String] field. (but we should probably introduce a more specific type for externalDependencies).

This comment has been minimized.

@sjrd

sjrd Jul 22, 2020

Member

Yes, I believe that plan sounds good :)

This comment has been minimized.

@gzm0

gzm0 Jul 23, 2020

Author Contributor

I have added a new commit with this. I think it is better.

The only thing I do not like is how we handle module initializers. Maybe we should introduce a LinkedModuleInitializer? It is still not super nice, that the Refiner is re-classifying entry points into modules.

@gzm0 gzm0 force-pushed the gzm0:split-module branch 2 times, most recently from f1b4cfa to c0d54ce Jul 15, 2020
require(moduleSet.modules.size == 1,
"Cannot user multiple modules with the Closure Compiler")
Comment on lines 80 to 81

This comment has been minimized.

@sjrd

sjrd Jul 15, 2020

Member

Couldn't we send each module to Closure in isolation?

This comment has been minimized.

@gzm0

gzm0 Jul 15, 2020

Author Contributor

No. The renaming across the modules needs to be consistent. So it might be possible to make this work, but it seems the main use case we want to address here is to feed these things downstream to webpack or other. So IMO we should not try to support this in a first iteration (probably provide better error messages though :)).

@@ -47,6 +49,7 @@ object Analysis {
*/
trait ClassInfo {
def className: ClassName
def data: Infos.ClassInfo

This comment has been minimized.

@sjrd

sjrd Jul 15, 2020

Member

We can potentially extend the Analysis, but now I see that there is a bigger problem: LinkedClass, which is in standard, contains references to Infos as well. That goes beyond intention, and directly into stabler-references-less-stable territory, which is not good at all.

From what I see, those Infos are needed in ModuleAnalyzer.staticDependencies. IIUC, this computes a dependencies: Set[ClassName]. Could we compute these dependencies eagerly in the Infos/Analyzer? Then we could expose that dependencies in Analysis (where it would make sense) and further in LinkedClass (which would therefore not depend on a type less stable than itself).

Copy link
Member

sjrd left a comment

Some more comments.

Nil
)

Block(tree, Apply(defProp, List(exportsVarRef, name, descriptor)))

This comment has been minimized.

@sjrd

sjrd Jul 15, 2020

Member

Consider refactoring as:

Suggested change
Block(tree, Apply(defProp, List(exportsVarRef, name, descriptor)))
val export = Apply(defProp, List(exportsVarRef, name, descriptor))
Block(tree, export)

and then you can factor out val export = and Block(tree, export) outside of the if (mutable).

val externalDeps =
topLevelExports.flatMap(externalDependencies(_)).distinct.toList

// TODO: Filter empty modules.

This comment has been minimized.

@sjrd

sjrd Jul 15, 2020

Member

It seems to me that the most reasonable outcome would be an empty file. If a module has been requested by the user, and it ends up being empty, why should we get rid of it or consider that an error?

import org.scalajs.linker.interface.ModuleInitializer
import org.scalajs.linker.interface.ModuleSpec.ModuleID

final class ModuleSet private[linker] (

This comment has been minimized.

@sjrd

sjrd Jul 15, 2020

Member

No, I think what you have now is better. It's better if front-end phases (the optimizer, but also future custom front-end phases) don't have to worry about modules.

jsEnv in Test := {
val out = (artifactPath in Test in fastOptJS).value
val config = NodeJSEnv.Config()
.withEnv(Map("NODE_PATH" -> out.toString))

This comment has been minimized.

@sjrd

sjrd Jul 15, 2020

Member

That seems fishy. Should we emit references to other modules as ./foo.js instead of foo.js to avoid the need for NODE_PATH?

This comment has been minimized.

@gzm0

gzm0 Jul 15, 2020

Author Contributor

Maybe. I didn't understand that standard well enough. Is that required to work in every engine? (if yes, we should probably indeed do that).

This comment has been minimized.

@sjrd

sjrd Jul 15, 2020

Member

Nothing is required to work in every engine, if you only read the standard. But if an engine doesn't support explicit relative paths like that, it's unlikely that it supports any kind of reasonable module resolution anyway.

This comment has been minimized.

@gzm0

gzm0 Jul 16, 2020

Author Contributor

Hmmm... But then haven't we taken the stance that we do not want to rely on specifics of engines? Maybe we need to have an additional moduleName function (just like jsNameWithSourceMap, or have that one return all of the things we need).

This comment has been minimized.

@gzm0

gzm0 Jul 16, 2020

Author Contributor

I think I like this idea much more. We should maybe even do this for the references between source maps and JS file. That way:

  • We can provide full backwards compatibility.
  • We have a single, simple mechanism to deal with all the oddities of how things reference each other in a given environment.
@gzm0 gzm0 force-pushed the gzm0:split-module branch 4 times, most recently from 773d498 to 0e730c8 Jul 16, 2020
@gzm0 gzm0 force-pushed the gzm0:split-module branch from 2280620 to 4e1ccd4 Jul 23, 2020
case ModuleKind.ESModule =>
// TODO: Mutable export!
val export = Export((ident -> ExportName(ident.name)) :: Nil)
Block(tree, export)

This comment has been minimized.

@gzm0

gzm0 Jul 23, 2020

Author Contributor

@sjrd this is another thing we should discuss: Currently, we theoretically allow assigning to any static field from anywhere. Nothing I know of uses this. It is more difficult to do across ES6 modules because we need a dedicated setter function (and hence the trees look entirely different).

We can of course build it, but maybe only allowing to write to a static field from the context of the class it is defined in would be an acceptable squint-with-the-eyes IR change?

Otherwise, how do we test this :)

This comment has been minimized.

@sjrd

sjrd Jul 23, 2020

Member

Under optimizations, this is very difficult to ensure. You could have a method that modifies the static field in its class, but if that method is inlined in another class, the assignment will be moved to a different class, and there's not much you can do about it.

Otherwise, how do we test this :)

With an @inline def :p

This comment has been minimized.

@gzm0

gzm0 Jul 23, 2020

Author Contributor

Ah, right. Fair enough :)

@gzm0
Copy link
Contributor Author

gzm0 commented Jul 23, 2020

Open questions at this point:

  • Stick with dependency analysis in Analyzer? (I vote yes)
  • Introduce LinkedModuleInitializer?
  • What to do with filename patterns / OutputSpec?
  • What to do with writing of static fields? (Answer: Allow, under inlining it's necessary).
@sjrd
Copy link
Member

sjrd commented Aug 25, 2020

No, it's OK. Feel free to squash them.

@sjrd
Copy link
Member

sjrd commented Aug 25, 2020

For the default module, with your latest comments, I guess main is good. I think it would be fine to have it as constant String in the ir project. It might have to be wrapped in ModuleID type in the linker if that simplifies things.

I don't see what is wrong with simply comparing the file we're about to write. The reason this is currently in the file abstraction itself is because that one also takes care of atomic moves. That is not set in stone (but it would increase the API surface and potentially make the JS implementation trickier).

Thinking more about this, I believe you're right. It's fine to have this in the file abstraction.

Good point. I didn't think about the scenario of where the linker is restarted entirely, but nothing has changed.

The major challenge here IMO is that we need to serialize the report. This is not easy once we consider compatibility between different linker versions.

The issue between different linker versions can be solved easily, I think: serialize the version number first, and when we deserialize, if the version number is not exactly the same as the current version, return None, saying that the report cannot be deserialized. When we get None, we force relinking. This makes sense because, if I change the version of the linker, I can end up with a different .js file anyway (more optimizations, different encoding, etc.).

We need the report file anyways. Otherwise the key's result type will be Optional[Linker.Report] which is neither nice, nor feasible for downstream stuff (e.g. jsEnvInput, that needs to know the module).

Indeed.

@gzm0
Copy link
Contributor Author

gzm0 commented Aug 26, 2020

Status:

Requesting input from @sjrd:

  • OutputPatterns: Should we make accessing the fields part of the unstable API (or even hide them)? This would allow for more complex implementations later if necessary.

TODO @gzm0:

  • Make ModuleID AnyVal, internal to linker, use strings elsewhere.
  • Incremental writing: Bring back FileFunction, serialize report.
  • Implement writing of static fields across modules.
  • Allow TopLevelExports with same names in different modules.
  • Make Jenkins pass (including new tests).
  • Ensure stability/uniqueness of module IDs (in MinModuleAnalyzer).
  • Complete and cleanup module analyzers.
  • Complete and cleanup compiler (TopLevel export machinery).
  • Complete and test sbt integration.
  • Full-self review.
@gzm0 gzm0 force-pushed the gzm0:split-module branch 2 times, most recently from db53227 to a9b9977 Aug 26, 2020
@sjrd
Copy link
Member

sjrd commented Aug 26, 2020

OutputPatterns: Should we make accessing the fields part of the unstable API (or even hide them)? This would allow for more complex implementations later if necessary.

If we can, I guess the more hidden, the better. It's easy to expose later if we need it. It's harder to hide/remove stuff that we exposed previously.

@gzm0
Copy link
Contributor Author

gzm0 commented Aug 26, 2020

TODO @gzm0:

  • OutputPatterns: Make accessing the fields part of the unstable API (or even hide them)
  • Restrict what a valid module ID is (" " probably shouldn't be one)
  • Incremental writing: Bring back FileFunction, serialize report.
  • Implement writing of static fields across modules.
  • Allow TopLevelExports with same names in different modules
  • Ensure stability/uniqueness of module IDs (in MinModuleAnalyzer).
  • Complete and cleanup module analyzers.
  • Complete and cleanup compiler.
  • Complete sbt integration.
  • Build legacy linker bridge
  • Make Jenkins pass (including new tests).
  • Full-self review.

Stuff to Test:

  • TopLevelExports in multiple modules (with shared code)
  • Everything (that is allowed) in both splitting styles
  • Unchanged files are not re-written
  • sbt legacy integration
@gzm0 gzm0 force-pushed the gzm0:split-module branch 3 times, most recently from d7a1ade to f17b6d6 Aug 27, 2020
@gzm0 gzm0 force-pushed the gzm0:split-module branch from e46ffb9 to bc38095 Sep 6, 2020
@gzm0 gzm0 force-pushed the gzm0:split-module branch from bc38095 to dcbe76e Sep 7, 2020
@gzm0
Copy link
Contributor Author

gzm0 commented Sep 7, 2020

@sjrd I'm unsure about the names for the new sbt tasks. Ideally we don't change the stage names (just to minimize changes). But that makes it hard to find new names :-/ Ideas? (see last commit of this PR for a proposal I don't really like).

@sjrd
Copy link
Member

sjrd commented Sep 7, 2020

Perhaps fastLinkJS and fullLinkJS? I'll keep thinking.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Linked issues

Successfully merging this pull request may close these issues.

None yet

2 participants
You can’t perform that action at this time.