User:Roan Kattouw (WMF)/ResourceLoader submodules
This is an idea for how we could support submodules for large modules that expose many small things with dependencies between them. This is commonly the case for libraries (like OOUI). Right now, our only option is to subdivide these modules into smaller modules, but that also increases the size of the startup module by increasing the number of modules. We'd like to be able to do fine-grained tree-shaking of large libraries, but in the current system that would require creating too many modules.
The basic idea of this proposal is as follows:
- Let modules depend on individual files from another module (if the depended-on module is a package module)
- Allow files in package modules to express dependencies on each other, and on other modules
- Simplify/consolidate this information in the manifest, so that only module-level information is exposed to the client
- The client doesn't know exactly what parts of modules it's asking for, but it passes enough information so that the server can figure it out
Per-file dependencies for package modules
Let each file in a package module define its own dependencies. These dependencies can be other files in the same module (internal) or other modules (external). This information would not be exposed directly in the module manifest in the startup module: internal dependencies are omitted completely, and external dependencies are consolidated at the module level.
{
    "foo": {
        "packageFiles": [
            {
                "file": "one.js",
                "dependencies": [
                    "foo/two.js",
                    "bar"
                ]
            },
            {
                "file": "two.js",
                "dependencies": [
                    "bar"
                ]
            },
            {
                "file": "three.js",
                "dependencies": [
                    "baz"
                ]
            }
        ]
    }
}
The module definition above expresses an internal dependency (one.js depends on two.js) and several external dependencies. In the startup manifest, this will be simplified to say that foo depends on bar and baz.
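The consolidation step described above can be sketched as follows. This is only an illustration under stated assumptions: the function name consolidateDependencies and the in-memory definitions object are hypothetical, not part of ResourceLoader, and the sketch does not handle a file depending on a file in a different module.

```javascript
// Hypothetical module definitions, mirroring the example above.
const definitions = {
	foo: {
		packageFiles: [
			{ file: 'one.js', dependencies: [ 'foo/two.js', 'bar' ] },
			{ file: 'two.js', dependencies: [ 'bar' ] },
			{ file: 'three.js', dependencies: [ 'baz' ] }
		]
	}
};

// Collapse per-file dependencies into one module-level list.
// Internal dependencies ("foo/...") are dropped; external ones are de-duplicated.
function consolidateDependencies( moduleName, definition ) {
	const external = new Set();
	for ( const packageFile of definition.packageFiles ) {
		for ( const dep of packageFile.dependencies || [] ) {
			if ( !dep.startsWith( moduleName + '/' ) ) {
				external.add( dep );
			}
		}
	}
	return [ ...external ];
}

console.log( consolidateDependencies( 'foo', definitions.foo ) );
// [ 'bar', 'baz' ]
```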
Allow modules to depend on files from other modules
Using the moduleName/filename.js syntax, also used above for internal dependencies, modules can depend on files from other modules, and then load these using require( 'moduleName/filename.js' ):
{
    "quux": {
        "dependencies": [
            "foo/one.js",
            "blah"
        ]
    }
}
In the startup manifest, this is simplified to say that quux depends on foo and blah. It will also say that the dependency on blah is a full dependency and the dependency on foo is a partial dependency, without saying exactly which file(s) it depends on[1].
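As a rough sketch of this simplification, the function below reduces a dependency list to module names flagged as full or partial. The name summarizeDependencies is hypothetical, and two assumptions are made that the proposal does not state: module names never contain "/", and a full dependency takes precedence when a module is depended on both fully and partially.

```javascript
// Reduce a dependency list to { moduleName: 'full' | 'partial' }.
// Assumption: anything of the form "module/file" is a partial dependency.
function summarizeDependencies( dependencies ) {
	const summary = new Map();
	for ( const dep of dependencies ) {
		const slash = dep.indexOf( '/' );
		const moduleName = slash === -1 ? dep : dep.slice( 0, slash );
		const kind = slash === -1 ? 'full' : 'partial';
		// Assumption: a full dependency overrides a partial one on the same module.
		if ( kind === 'full' || !summary.has( moduleName ) ) {
			summary.set( moduleName, kind );
		}
	}
	return Object.fromEntries( summary );
}

console.log( summarizeDependencies( [ 'foo/one.js', 'blah' ] ) );
// { foo: 'partial', blah: 'full' }
```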
How the client deals with partial dependencies
When the client is asked to load quux, it sees that it has a full dependency on blah and a partial dependency on foo, and that foo in turn depends on bar and baz. Assuming none of these modules have been loaded yet, the client sends a request to the server indicating it wants all of quux, blah, bar and baz, and part of foo. It doesn't know what part it needs, but it indicates that foo needs to be loaded partially[2]. The server can figure out which parts are needed by looking at which files within foo are depended on by the modules that are being requested.
Note that there is an inefficiency here: baz is requested even though we won't need it. The only part of foo that depends on baz is three.js, which we aren't going to load, but the client doesn't know that.
How the server resolves partial dependencies
The server gets a request asking for all of quux, blah, bar and baz, and part of foo. The server determines which parts of foo are needed by looking at which files in foo the fully-loaded modules depend on, then resolving internal dependencies. It finds that quux depends on foo/one.js, which in turn depends on foo/two.js. It responds with the full contents of the fully-requested modules, and the partial contents of foo (only one.js and two.js, but not three.js).
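A minimal sketch of the server-side resolution, written in JavaScript for consistency with the other examples even though the real server side of ResourceLoader is PHP. The function resolvePartialFiles and the definitions object are hypothetical; unknown file names are silently skipped here, where a real implementation would probably report an error.

```javascript
// Hypothetical server-side module definitions for the running example.
const definitions = {
	foo: { packageFiles: [
		{ file: 'one.js', dependencies: [ 'foo/two.js', 'bar' ] },
		{ file: 'two.js', dependencies: [ 'bar' ] },
		{ file: 'three.js', dependencies: [ 'baz' ] }
	] },
	quux: { dependencies: [ 'foo/one.js', 'blah' ] },
	blah: {}, bar: {}, baz: {}
};

// Work out which files of partialModule are needed by the fully-requested modules.
function resolvePartialFiles( partialModule, fullModules ) {
	const prefix = partialModule + '/';
	const filesByName = new Map(
		definitions[ partialModule ].packageFiles.map( ( f ) => [ f.file, f ] )
	);
	// Seed the queue with files that fully-requested modules depend on directly.
	const queue = [];
	for ( const name of fullModules ) {
		const def = definitions[ name ];
		const deps = [
			...( def.dependencies || [] ),
			...( def.packageFiles || [] ).flatMap( ( f ) => f.dependencies || [] )
		];
		deps.filter( ( dep ) => dep.startsWith( prefix ) )
			.forEach( ( dep ) => queue.push( dep.slice( prefix.length ) ) );
	}
	// Follow internal dependencies until no new files turn up.
	const needed = new Set();
	while ( queue.length > 0 ) {
		const file = queue.pop();
		if ( needed.has( file ) || !filesByName.has( file ) ) {
			continue;
		}
		needed.add( file );
		for ( const dep of filesByName.get( file ).dependencies || [] ) {
			if ( dep.startsWith( prefix ) ) {
				queue.push( dep.slice( prefix.length ) );
			}
		}
	}
	// Return the files in their definition order.
	return definitions[ partialModule ].packageFiles
		.map( ( f ) => f.file )
		.filter( ( file ) => needed.has( file ) );
}

console.log( resolvePartialFiles( 'foo', [ 'quux', 'blah', 'bar', 'baz' ] ) );
// [ 'one.js', 'two.js' ]
```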
How the client manages state for partially-loaded modules
The client receives a partial response for foo, which is flagged as such[3]. It makes these files available for loading with require(), but it doesn't mark the module as fully loaded. If a module that also has a partial dependency on foo is loaded later, the client will follow the same protocol and let the server figure out which files to send, which might duplicate files it already has. If this happens, the client will simply ignore the files in the response that it has already loaded. If the full foo module is requested later (or a module is loaded that has a full dependency on foo), the client will ask the server for the entire module, and again ignore the files it already has.
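The bookkeeping described above could be sketched like this. createFileRegistry and its methods are hypothetical names and do not correspond to the existing mw.loader internals; the sketch only shows the "keep what you have, ignore duplicates, mark complete on a full response" logic.

```javascript
// Hypothetical client-side store for package files received so far.
function createFileRegistry() {
	const modules = new Map(); // module name -> { files: Map, complete: boolean }

	return {
		// Record a response; returns the names of files that were actually new.
		receive( moduleName, files, isPartial ) {
			if ( !modules.has( moduleName ) ) {
				modules.set( moduleName, { files: new Map(), complete: false } );
			}
			const entry = modules.get( moduleName );
			const added = [];
			for ( const [ fileName, callback ] of Object.entries( files ) ) {
				// Files received in an earlier response are ignored.
				if ( !entry.files.has( fileName ) ) {
					entry.files.set( fileName, callback );
					added.push( fileName );
				}
			}
			// Only a full response marks the module as fully loaded.
			if ( !isPartial ) {
				entry.complete = true;
			}
			return added;
		},
		isComplete( moduleName ) {
			return modules.has( moduleName ) && modules.get( moduleName ).complete;
		}
	};
}

const registry = createFileRegistry();
const noop = function () {};
registry.receive( 'foo', { 'one.js': noop, 'two.js': noop }, true );
console.log( registry.receive( 'foo', { 'two.js': noop, 'three.js': noop }, true ) );
// [ 'three.js' ]
console.log( registry.isComplete( 'foo' ) );
// false
```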
Inefficiencies
This proposal has two main inefficiencies. First, and more important, when a module is loaded partially, all of its external dependencies are loaded too, even the ones that aren't needed for the files that are being loaded. This is difficult to avoid with this architecture, and it might be an issue if many unnecessary dependencies are loaded this way or if the unnecessary dependencies are large. I don't think there's a good way of dealing with this other than splitting the module.
Second, if the same module is partially loaded twice, to satisfy different dependencies in different requests, some of its files could be downloaded twice. I don't think this will be much of an issue, because it is likely to be infrequent (the same module being partially loaded twice, on separate occasions, on the same page won't happen often) and the impact is likely to be low (few files double-loaded each time). If there is a large "core" part of the module that almost all files depend on, breaking that out into a separate module could address that.
Examples for where this could be used
- OOUI icon packs: these consist of fully independent parts (individual icons) with no internal or external dependencies. All OOUI icons could be put in one big module, with each module that uses them specifying exactly which icons it needs
- OOUI itself: each widget could, in principle, be exposed separately
- mediawiki.widgets.*: There is a mediawiki.widgets module with relatively unrelated widgets, and there are 16 more mediawiki.widgets.SomethingWidget modules (and 5 mediawiki.widgets.SomethingWidget.styles modules) that contain individual widgets. These could potentially be consolidated into one omnibus mediawiki.widgets module.
Open questions
- Should modules have to opt into letting other modules depend on their files, or should it be allowed for all package modules? If a module allows other modules to load its files, should all its files be exposed, or only a limited list of files that it specifies?
- How do we support CSS files? We'd need this for OOUI (widgets come with styles) and for icons (which are only CSS). In the code for .vue file support, we do have an internal content type script+style that allows CSS to be bundled with a (JS) package file, but we don't allow this to be used in the module definition (yet). This may be as simple as allowing .css (and .less) as a package file extension that maps to a script+style with an empty script part, then having JS files express internal dependencies on the CSS files they need.
- Should we (or would we need to) allow direct loading of individual files? Icon packs are often loaded directly through addModuleStyles(), the mediawiki.widgets.*.styles modules are too, and in some cases we may want to consolidate many modules into a single module but still be able to load parts of it directly through addModules().
Footnotes
- ↑ This means having to send another boolean flag along with every dependency. Note that, in the startup manifest, a module's dependencies are expressed as an array of numbers, where each number is an index into the modules array. One hacky way of encoding this boolean for each dependency could be to use the sign of this number: encode full dependencies as positive numbers (as they currently are) and partial dependencies as negative numbers. (This may require using 1-indexing, because -0 is difficult to work with in JS.)
- ↑ This could be done, for example, by listing the partially-requested modules in a separate parameter: &modules=quux|blah|bar|baz&partialModules=foo
- ↑ We'd need some way in the mw.implement() call to distinguish. Our current format for full package modules is mw.implement('foo', {main: "mainFile.js", "files": {"one.js":function...}}), so perhaps the format for a partial response could drop the "main" field, or replace it with partial: true.
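The sign-based encoding suggested in the first footnote could be sketched as below. This is only one possible reading of that idea: encodeDependencies and decodeDependencies are hypothetical names, and the 1-based indexing is the workaround the footnote mentions for -0.

```javascript
// Encode dependencies as signed, 1-based indexes into the module list:
// positive for a full dependency, negative for a partial one.
function encodeDependencies( moduleNames, dependencies ) {
	return dependencies.map( ( dep ) => {
		const index = moduleNames.indexOf( dep.name );
		if ( index === -1 ) {
			throw new Error( 'Unknown module: ' + dep.name );
		}
		// 1-based, so the first module can be negated without becoming -0.
		return dep.partial ? -( index + 1 ) : index + 1;
	} );
}

// Reverse the encoding on the client.
function decodeDependencies( moduleNames, encoded ) {
	return encoded.map( ( n ) => ( {
		name: moduleNames[ Math.abs( n ) - 1 ],
		partial: n < 0
	} ) );
}

const moduleNames = [ 'quux', 'blah', 'foo', 'bar', 'baz' ];
const quuxDependencies = [
	{ name: 'blah', partial: false },
	{ name: 'foo', partial: true }
];

console.log( encodeDependencies( moduleNames, quuxDependencies ) );
// [ 2, -3 ]
```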