Topic on Extension talk:Pickle

Pickle vs Module:UnitTests

Summary by Jeblad

A discussion about the pros and cons of the different frameworks.

197.218.82.210 (talkcontribs)

Nice work!

This looks pretty interesting. If this were deployed on WMF wikis, it would very likely be wrapped in some other module to make it easier for casual editors to check their code.

Looking at https://en.wikipedia.org/wiki/Module:Duration/testcases (Module:UnitTests) vs Help:Pickle/Quick tour, it generally seems that the former is more intuitive, and could even be written in such a way that a non-developer could add new test cases easily.
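
For readers who haven't seen it, a Module:UnitTests test page on enwiki roughly follows the pattern below (a sketch; the module name and the expected output are made up for illustration):

-- Sketch of a Module:UnitTests-style test page; the invoked module and the
-- expected output are illustrative.
local p = require( 'Module:UnitTests' )

function p:test_hello()
	-- Compare the rendered output of an #invoke with the expected wikitext.
	self:preprocess_equals( '{{#invoke:Example|hello}}', 'Hello, world!' )
end

return p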

One other pain point with Lua-based templates, likely out of scope for this particular extension (but still a bit related), is documentation. Currently users need to write documentation separately from the Lua module, and this becomes a big pain to maintain due to the problem of displaying related modules.

Anyway, there is a risk of turning MediaWiki into a software development platform, so perhaps one shouldn't make these things overly complicated. Although the lack of automatic documentation for plain templates is also a problem (e.g. for detecting template parameters).

Jeblad (talkcontribs)

It is slightly complicated to translate the current test frameworks on enwiki into spec-style testing, so I believe this must be done manually, but it is possible to write methods that allow w:en:Module:Duration and w:en:Module:UnitTests to interface with the rest of the extension. There is more about this extension at Help:Pickle.
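
For contrast, spec-style tests are organized as nested describe/it blocks. The sketch below only illustrates the general shape of that style; it is not guaranteed to match Pickle's actual function names, so see Help:Pickle for the real syntax. The two helpers are trivial stand-ins so the sketch is self-contained.

-- Generic spec-style shape (busted-like); Pickle's actual entry points may differ.
-- The helpers below are stand-ins so the sketch runs on its own.
local function describe( name, body ) body() end
local function it( name, body ) body() end

describe( 'a duration formatter', function ()
	it( 'formats one hour', function ()
		-- formatDuration is an illustrative stand-in for the function under test.
		local function formatDuration( seconds ) return ( seconds / 3600 ) .. ' hour' end
		assert( formatDuration( 3600 ) == '1 hour' )
	end )
end )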

197.218.83.76 (talkcontribs)

Yes, I looked into the tests. They seem to account for most things. For the tracking categories it might be useful to have a special page to report the module status, e.g. "Special:PickleReport". Initially it might simply list the tracking categories and their counts, just like Special:LintErrors; later on it could report on the status of each module, with appropriate filters.

Jeblad (talkcontribs)

The initial version of this test lib will halt on errors, and there will not be a meaningful number of failures for a specific pickle page. I.e., it runs up to the first error and then returns. After an error, the environment would have to be recreated, and the evaluation of the failing frame (aka the example) and all previous failing frames must be rejected. That implies saving and loading state for the lib. It is possible, at least during debugging in the console, but not for now.

Statistics on types of errors will not give any meaningful result either, as that would imply fixed categories of errors. The failing lib will, however, be put in categories according to the states configured in the setup. That is, the categories will hold the number of failing libs.

From Special:TrackingCategories on my currently running Vagrant instance:

Category title | Message key | Description
Modules with good tests | pickle-tracking-category-good | The result indicates a "good" state after evaluating the tests.
Modules with pending tests | pickle-tracking-category-pending | The result indicates a "pending" state after evaluating the tests.
Modules with incomplete tests | pickle-tracking-category-todo | The result indicates that one or more "todo" marker(s) is found after evaluating the tests.
Modules with skipped tests | pickle-tracking-category-skip | The result indicates that one or more "skip" marker(s) is found after evaluating the tests.
Modules with failing tests | pickle-tracking-category-fail | The result indicates a "failing" state after evaluating the tests.
Modules without test page* | pickle-tracking-category-missing | The result indicates a "missing" test page after evaluating the invocation.
Modules with unknown tests | pickle-tracking-category-unknown | The result indicates a "unknown" state after evaluating the tests.

*"Modules without test page" should probably be "Modules without known tests"

The categories above will be populated and thus describe the current state of the libs.

It is possible to say something about the importance of the various libs by counting the articles that use them, thereby saying something about the impact of a failing lib. That is, however, outside the scope of this project, at least for the moment.

Jeblad (talkcontribs)

Note that if step-style testing is implemented, then skipping of examples must be implemented.

197.218.88.122 (talkcontribs)

Having it on Special:TrackingCategories is a good thing, but that prevents you from adding meaningful data about the errors: for example, a link to a page offering more explanation of these categories of failures, links to help pages about writing good, complete tests, or some possible reasons for the "unknown" state. A dedicated page would also make it possible to separate severe errors from random warnings. Currently, unlike wikitext-based templates, Lua modules are all or nothing: if there is a severe fatal error (or syntax error) somewhere in the code, nothing will render, while templates will just ignore that section and render the rest.

So I still think that having a Special:PickleReport, or Special:PickleErrors, is a good idea. Presumably, even if it is not added to this extension, users will build something like it on some random page anyway, using parser functions to count the errors in the category.

You could get away with adding some functionality for marking critical modules that deserve immediate attention. Users have already created bots and manually protected lots of these, so they can easily identify them anyway; fixing a critical module is more important than fixing some module that outputs errors only in the user namespace.

Jeblad (talkcontribs)

The tracking categories will be created as normal categories; the "tracking" part is just the mechanism for assigning the categories. A page like "Category:Modules with good tests" will thus be created and have libs as members.

I have a sort of plan to add the member counts for the categories to Special:Statistics, but nothing will stop users from creating a project page with {{PAGESINCATEGORY:''categoryname''|pages}}, so I'm not quite sure it is necessary to do that. It is more important to notify the devs of the lib that something is breaking.
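
As an illustration of how simple such a report could be, the sketch below counts members of a few of the tracking categories from a module rather than with parser functions; mw.site.stats.pagesInCategory is the Scribunto counterpart of PAGESINCATEGORY (the category names are taken from the table above, everything else is made up):

-- Hypothetical report module counting members of the Pickle tracking categories.
local p = {}

function p.report()
	local categories = {
		'Modules with good tests',
		'Modules with pending tests',
		'Modules with failing tests',
	}
	local lines = {}
	for _, cat in ipairs( categories ) do
		-- Member count for the category, like {{PAGESINCATEGORY:...|pages}}.
		local count = mw.site.stats.pagesInCategory( cat, 'pages' )
		table.insert( lines, string.format( '%s: %d', cat, count ) )
	end
	return table.concat( lines, '<br />' )
end

return p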

There are separate categories for pages with syntax errors, but I guess they are less known. That could be a problem. Perhaps modules with severe errors should be marked separately. That is, however, a problem for Scribunto itself, as the module will not return a valid lib; it should be added to Category:Scribunto modules with errors. The test module could still detect this, perhaps… (I think this will generate an error and show up as an unknown test.)
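
As a rough illustration of that kind of detection (not Pickle's actual mechanism), a test harness could load the module under pcall and classify a load failure separately from ordinary test failures:

-- Hypothetical sketch: a module that raises an error while loading is
-- reported as a severe/unknown state instead of a logical test failure.
local ok, libOrError = pcall( require, 'Module:SomeModule' )  -- module name is illustrative
if not ok then
	return 'severe: ' .. tostring( libOrError )
end
-- Otherwise run the ordinary tests against libOrError.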

Note that this extension is for detecting logical errors, not severe errors that block Scribunto from returning a functional lib.

Impact analysis is interesting, but outside the scope of the project. It is also somewhat difficult, as using a module does not imply using the failing functionality. I've been toying with an idea whereby inspectors are injected into the tested lib, and during testing it is possible to profile the calls. That could be used to figure out which functions will break, but then a template might not use a profiled interface. That would imply creating a call graph.
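
A minimal sketch of such an inspector, assuming the lib exports plain functions (illustrative only, not part of the extension): every exported function is wrapped so a test run can record which entry points were actually exercised.

-- Hypothetical sketch: wrap the exported functions of a lib so test runs can
-- count which entry points the examples actually touch.
local function instrument( lib )
	local callCounts = {}
	local wrapped = {}
	for name, value in pairs( lib ) do
		if type( value ) == 'function' then
			callCounts[name] = 0
			wrapped[name] = function ( ... )
				callCounts[name] = callCounts[name] + 1
				return value( ... )
			end
		else
			wrapped[name] = value
		end
	end
	return wrapped, callCounts
end

-- Usage: run the tests against `wrapped`, then inspect `counts` afterwards.
local wrapped, counts = instrument( require( 'Module:SomeModule' ) )  -- module name is illustrative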

197.218.88.122 (talkcontribs)

Oh yes, I didn't know about that category. I think it is actually never populated nowadays, because modules with syntax errors cannot really be saved anymore. That was probably a tracking category (https://phabricator.wikimedia.org/T41605) for cleaning up the old ones before that change was made. It should probably be removed from Scribunto now that it is pointless.

The one I knew about was Category:Pages_with_script_errors. There is an interesting bug/feature with it: stuff shows up in that category when someone uses a hacky frame:preprocess to invoke a module incorrectly. That is useful for testing whether a module will really run correctly, although in many cases not really necessary if you use frame:newChild.
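
A rough sketch of the frame:newChild approach (the module name, function name and arguments below are illustrative); the child frame carries the arguments a real invocation would have passed, and pcall keeps a script error from aborting the whole test page:

-- Hypothetical sketch: call a module function directly with a child frame
-- instead of preprocessing an {{#invoke:...}} string.
local target = require( 'Module:Example' )
local frame = mw.getCurrentFrame()
local child = frame:newChild{
	title = 'Module:Example',
	args = { greeting = 'Hello' },
}
local ok, result = pcall( target.main, child )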

Note that this extension is for detecting logical errors, not severe errors that block Scribunto from returning a functional lib.

Some logical errors eventually become script errors: e.g. if code outputs data from a Lua table that assumes February always has 28 days, once in a blue moon it will fail. Depending on the code, the error may even disappear with a refresh of the page, only to reappear sometime later. The unfortunate part is that the Scribunto developers deliberately deactivated the Lua debugging functions that could actually output variables in the stack trace, due to security concerns.
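
As a made-up illustration of that class of bug (not taken from any real module), the lookup below only raises a script error when the page happens to be parsed on 29 February:

-- Hypothetical sketch: the table assumes February has 28 days, so the lookup
-- is nil on 29 February and the concatenation errors only on that day.
local ordinals = {}
for day = 1, 28 do
	ordinals[day] = 'day ' .. day .. ' of February'
end

local month = tonumber( os.date( '%m' ) )
local day = tonumber( os.date( '%d' ) )
if month == 2 then
	return 'Today is ' .. ordinals[day]  -- fails once in a blue moon
end
return 'It is not February'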

The only other feature I could suggest for this extension (although it might not be feasible) is to show all module dependencies related to the failure of the tests. Currently it is quite cumbersome to go through all the module dependencies one by one to track down the offending code. I once tried to port Module:documentation to a non-Wikimedia wiki, and it was hell to figure out all the errors.

Jeblad (talkcontribs)

Time and date handling is wacky. I hate bugs that come and go.

I would really like full debugging, but that is difficult as it can create security holes.

I'm playing with some ideas for debugging, but they too might create security holes.

Jeblad (talkcontribs)

Thank you for good questions! =)