Extension talk:Scribunto/Lua reference manual
capital in Then
Re [1]: "If named arguments to #invoke are specified, for example {{#invoke: test | func | foo = bar }}" is not a full sentence, so the "Then" that follows should not be capitalized.--Patrick1 (talk) 11:44, 30 August 2012 (UTC)
pattern () in which function ?
"the empty capture () captures the current string position (a number)."
- string.find (s, pattern [, init [, plain]])
- string.gmatch (s, pattern)
- string.gsub (s, pattern, repl [, n])
- string.match (s, pattern [, init])
Perhaps a note about "pattern ()" could be useful in these functions? --Rical (talk) 11:51, 23 December 2012 (UTC)
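A minimal sketch of how the empty capture behaves in each of the four functions listed above. It uses only the standard Lua string library; the sample string and variable names are illustrative, not taken from the manual.

```lua
-- Position captures: "()" captures the current string position as a number.
local s = "hello world"

-- string.find returns the bounds of the match first, then the captures.
local first, last, before, after = string.find(s, "()wor()")

-- string.match returns only the captures.
local pos = string.match(s, "()world")

-- string.gmatch yields the captures of each successive match.
local starts = {}
for p in string.gmatch(s, "()o") do
    starts[#starts + 1] = p
end

-- string.gsub passes the captures to the replacement function.
local marked = string.gsub(s, "()o", function(p)
    return "[" .. p .. "]"
end)
```

Here `before` is the position where the match starts and `after` is the position just past its end, so the two position captures bracket the matched text.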
What's available on WMF projects?
It seems like a lot of the libraries mentioned in this manual are not available on test2.WP, which I assume means that they won't be available on the WMF projects that Scribunto is about to be deployed on? (For example, mw.language, mw.site, mw.uri, and mw.ustring all appear to be missing.) Could this manual be edited to make clear which modules are available in what Scribunto versions? —RuakhTALK 05:02, 18 February 2013 (UTC)
- mw.ustring is available now. Here's a little test: User:Amire80/Scribunto. --Amir E. Aharoni (talk) 04:20, 19 February 2013 (UTC)
- The hard part about that is identifying the versions. There isn't a 1:1 relationship between the version of MediaWiki and the version of Scribunto; for example, until the deploy 9 hours ago wmf9 had a relatively old version, which was upgraded to the newest. This could potentially be done again if warranted, rather than waiting for wmf11 to get the latest updates. Anomie (talk) 13:21, 19 February 2013 (UTC)
- This is a point of confusion, though. As of right now there is no way to know for sure which functions are available on any given wiki, except by writing test code to see if it works or triggers a script error. At the very least, the documentation should specify the first Scribunto version for which each function became/will become available. CodeCat (talk) 22:19, 5 March 2013 (UTC)
- Define "Scribunto version"; saying bit32 is first available in 5e548e769a464e3223cd52ffa0f819f6bf1c9924 doesn't help a whole lot. Anomie (talk) 02:34, 7 March 2013 (UTC)
- It is possible to test for the existence of things by performing ~=nil tests. It is not very nice but it works (I use it in my test module to check which functions have appeared). Maybe we could write a "What's available" module that shows what is present or not? Hexasoft (talk) 08:59, 7 March 2013 (UTC)
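The "~= nil" approach mentioned in this thread could be sketched roughly as follows. The function names `isAvailable` and `report` are hypothetical, and the optional `root` parameter is an assumption added so the check can be tried against any table rather than only the global environment.

```lua
-- Walk a dotted path such as "mw.ustring.len" from a root table and
-- report whether every step along the way exists (is not nil).
local function isAvailable(path, root)
    local node = root or _G
    for name in string.gmatch(path, "[^%.]+") do
        if type(node) ~= "table" then
            return false
        end
        node = node[name]
        if node == nil then
            return false
        end
    end
    return true
end

-- Hypothetical helper: one "name: yes/no" line per requested path.
local function report(paths, root)
    local lines = {}
    for _, path in ipairs(paths) do
        local status = isAvailable(path, root) and "yes" or "no"
        lines[#lines + 1] = path .. ": " .. status
    end
    return table.concat(lines, "\n")
end
```

On a wiki, something like `report({ "mw.language", "mw.site", "mw.uri", "mw.ustring" })` would list which of those libraries the installed Scribunto exposes, without triggering a script error for the missing ones.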
(u)string patterns and PCRE
I believe a dedicated explanation of the differences should be written for people who know PCRE, going beyond the fact that one must use % instead of \. I was confused not to find the (abc|def) construction in Lua. Ignatus (talk) 15:16, 1 March 2013 (UTC)
- Done. Anomie (talk) 17:10, 1 March 2013 (UTC)
- Great! Ignatus (talk) 18:26, 2 March 2013 (UTC)
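For readers coming from PCRE, here is a rough sketch of one way to emulate alternation such as (abc|def), which Lua patterns do not support. The helper name `matchAny` is hypothetical and this is not a general regex replacement: it simply tries each alternative and keeps the leftmost match, preferring the earlier alternative on ties.

```lua
-- Try each pattern in turn and return the leftmost match.
-- Returns the matched text plus its start and end positions, or nil.
local function matchAny(s, patterns, init)
    local bestStart, bestEnd
    for _, pattern in ipairs(patterns) do
        local a, b = string.find(s, pattern, init)
        if a and (not bestStart or a < bestStart) then
            bestStart, bestEnd = a, b
        end
    end
    if bestStart then
        return string.sub(s, bestStart, bestEnd), bestStart, bestEnd
    end
    return nil
end

-- Character classes use % where PCRE uses \ : "%d+" corresponds to "\d+".
local digits = string.match("a1b22", "%d+")
```

For a fixed set of literal alternatives this is usually enough; anything that needs alternation nested inside a larger pattern has to be split into several complete patterns.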
Split the page?
This page is very long right now. Should it be split into several subpages? I think splitting the general Lua documentation (which mostly just reiterates the official documentation anyway) from the Scribunto-specific stuff would be a good idea. So it would become Extension:Scribunto/Lua reference manual and Extension:Scribunto/Scribunto libraries or something similar. Actually... why is a separate documentation for general Lua even needed on this page if it's already available elsewhere? It might be more useful to note only the points where Scribunto differs. CodeCat (talk) 22:17, 5 March 2013 (UTC)
- Documentation for general Lua is useful here to avoid people getting lost in BNF, to provide wikilinks to relevant articles, and to avoid people getting lost in the documentation for things that aren't available in Scribunto. Anomie (talk) 02:35, 7 March 2013 (UTC)
- I agree with this point of view: easier for wikilinks, all is here, what's not in scribunto is not here. In addition some things are pure "Lua" and others pure "Scribunto", but some are Lua-but-changed-in-Scribunto. Hexasoft (talk) 08:48, 7 March 2013 (UTC)
Documentation question
Hello,
Anomie: in the change you made today you added mw.language:formatDuration() and lang:getDurationIntervals() in two different places, but they both have the same parameters and the same description. Is it a copy/paste error or really two functions performing the same action? (in which case you may collapse the two sections in my opinion).
Regards, Hexasoft (talk) 08:48, 7 March 2013 (UTC)
- Eek. Don't bother. It's too early in the morning; I didn't read it correctly. Hexasoft (talk) 08:52, 7 March 2013 (UTC)
request for libraryUtils: new()
i would like to see in libraryUtil a little syntactic sugar called "new", emulating a constructor, to help make lua a tiny bit more OO like.
new = function(t) return setmetatable({}, {__index = t}) end
this will allow me, e.g., to do things like
x = libraryUtil.new(table)
x:insert(21)
or to create a table with some members, some of which may be functions, and construct new "instances" of this table by calling "new" and the table:
myClass = {
    member1 = "here i was",
    member2 = "there i go",
    a = function(t, x) doThis(t.member1, x) end,
    b = function(t, x) doThat(t.member2, x) end
}
-- .....
instance = libraryUtil.new(myClass)
-- ...
instance:a(11)
instance.member1 = "hoola boola"
instance:b("whaddayasay")
peace - קיפודנחש (talk) 18:07, 11 March 2013 (UTC)
- The usual idiom seems to be to use myClass.new() rather than new(myClass). Also, due to Scribunto's sandboxing, it is not possible for a required library such as libraryUtils to add global functions as proposed here; it would wind up as libraryUtils.new, at which point you may as well use the normal idiom. BJorsch (WMF) (talk) 13:19, 12 March 2013 (UTC)
- regarding your 2nd comment: sure, i meant libraryUtils.new() (also changed the snippets above to reflect this more correct usage). not a problem.
- however, my Lua-fu is not that strong, and i'm not sure i understand your "standard idiom" comment. calling x = table.new() results in an error, as does local x = {a=1,b=2,c=3}; local y = x.new(). if you mean i should write a new "new" function in any table i want to construct, then yes, i understand this is possible, but i think what i ask is both more elegant and simpler, while supplying the required functionality. of course, it won't stop anyone from creating tailored constructors, and calling them "new" or "construct" or duplicating the object name, or any other name they choose. peace - קיפודנחש (talk) 17:55, 12 March 2013 (UTC)
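To make the two approaches in this thread concrete, here is a small sketch that puts the proposed generic helper next to the "usual idiom" of a class table with its own constructor. `Counter` and its methods are hypothetical names made up for illustration; neither form is part of libraryUtil.

```lua
-- The helper proposed in this thread: a fresh table that falls back to t.
local function new(t)
    return setmetatable({}, { __index = t })
end

-- The usual idiom: the class table carries its own constructor.
local Counter = {}
Counter.__index = Counter

function Counter.new(start)
    return setmetatable({ value = start or 0 }, Counter)
end

function Counter:increment(n)
    self.value = self.value + (n or 1)
    return self.value
end

-- Generic helper: methods of the table library are found through __index.
local list = new(table)
list:insert(21)

-- Class-specific constructor: can take arguments and set up initial state.
local counter = Counter.new(10)
counter:increment(5)
```

The main practical difference is that a per-class `new` can accept arguments and initialise fields, while the generic helper only wires up the method lookup.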
mw.text.unstrip()
Hello,
as far as I understand the doc, the mw.text.unstrip() function will make it possible to "unstrip" incoming strings that include tags such as nowiki.
That is fine (I filed the bug report) because it will allow reading the full content of any kind of parameter. My question: will using frame:preprocess() on the unstripped string produce the same original string? Also, to be sure: unstrip will return a copy, not modify the original string?
Regards, Hexasoft (talk) 19:31, 11 March 2013 (UTC)
- No, frame:preprocess() on the unstripped string will expand any wikitext in that string. Even if you were to add back the correct extension tag (e.g. wrap the string in <nowiki>...</nowiki>, or <ref>...</ref>, or whatever), the result still might not be the same. For example, <ref>A book!</ref> might return ␡UNIQ7664d5acee6378e2-ref-00000002-QINU␡, but then unstripping might result in <sup id="cite_ref-1" class="reference"><a href="#cite_note-1"><span>[</span>1<span>]</span></a></sup>: the HTML for the superscripted footnote, which isn't even valid for return from the Lua module since <a> is not allowed in wikitext. It's even possible that the text hidden behind ␡UNIQ7664d5acee6378e2-ref-00000002-QINU␡ is something completely unexpected such as serialized PHP data, as it may be that the extension is intending to do postprocessing in a ParserAfterParse hook or the like.
- Yes, unstrip does not modify the original string. Strings in Lua are immutable and primitive values are passed by value, so it is not possible to modify an input string in a normal function call (you can modify the keys/values in an input table, of course). BJorsch (WMF) (talk) 13:38, 12 March 2013 (UTC)
- Thanks for the clarification. I will play with it when available. Regards, Hexasoft (talk) 17:58, 12 March 2013 (UTC)
In section "mw.language:formatDate"
Hello,
in section Extension:Scribunto/Lua reference manual#mw.language:formatDate it is said:
« Formats a date according to the given format string. If timestamp is omitted, the default is the current time. The value for local must be a boolean or nil; if true, the time is formatted in the server's local time rather than in UTC. »
Rather than « server's local time » shouldn't it be « wiki local time »?
Regards, Hexasoft (talk) 16:39, 17 March 2013 (UTC)
- Yes, it should be. Thanks for pointing out the error. BJorsch (WMF) (talk) 23:10, 17 March 2013 (UTC)
- I just noticed some similar language in the os.date section. I assume this should also be wiki time, rather than server time? — Mr. Stradivarius ♪ talk ♪ 10:52, 3 February 2014 (UTC)
- os.date is a Lua builtin and really does use the server's local time. You can test this easily enough by going to dewiki or another wiki with a non-UTC local time set, editing any module page to get to the debug console, and comparing the output of mw.language.getContentLanguage():formatDate('O',nil,true) versus os.date('%z'). Anomie (talk) 14:12, 3 February 2014 (UTC)
Suggestion for mw.text: word distance
Hello,
I needed a distance function between words, and coded en:Levenshtein distance in my module.
I don't know whether it would be generally useful (I see few cases where that feature is needed), but if you think it is, it could have its place in the mw.text module. Moreover, I think it is the kind of algorithm that can be far more efficient in something other than Lua (I guess that a double loop over the characters of two strings is, for example, far more efficient in C or any language that gives direct access to string elements).
Regards, Hexasoft (talk) 17:21, 21 March 2013 (UTC)
- There are a lot of things that would be easier done in PHP, but on the other hand we don't want to bloat the Scribunto libraries with too many features that won't be of general use. On this one, I'd personally lean towards "no". But if someone else wants to code it up and put a patch in Gerrit, they can and we'll see what anyone else thinks. BJorsch (WMF) (talk) 13:28, 22 March 2013 (UTC)
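For reference, the word-distance function discussed here can be written in plain Lua along the following lines. This is a sketch of the standard two-row dynamic-programming form of Levenshtein distance, not the poster's actual module code, and it compares bytes, so multi-byte UTF-8 characters count as several units.

```lua
-- Levenshtein distance between two byte strings, keeping only two rows
-- of the dynamic-programming table at a time.
local function levenshtein(a, b)
    local la, lb = #a, #b
    if la == 0 then return lb end
    if lb == 0 then return la end

    local prev, curr = {}, {}
    for j = 0, lb do
        prev[j] = j                -- distance from the empty prefix of a
    end

    for i = 1, la do
        curr[0] = i                -- distance to the empty prefix of b
        local ai = string.byte(a, i)
        for j = 1, lb do
            local cost = (ai == string.byte(b, j)) and 0 or 1
            curr[j] = math.min(
                prev[j] + 1,         -- deletion
                curr[j - 1] + 1,     -- insertion
                prev[j - 1] + cost   -- substitution (or match)
            )
        end
        prev, curr = curr, prev    -- reuse the two row tables
    end

    return prev[lb]
end
```

The double loop makes the cost proportional to the product of the two lengths, which is the efficiency concern raised above for long inputs.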
environment for getfenv(function)
- We can read : "Passing a function returns the environment that will be used when that function is called."
- Can we say : "Passing a function returns the environment where this calling function is, if any, else nil. In case of recursive function, returns the nearest call of this function." ? --Rical (talk) 18:40, 21 March 2013 (UTC)
- That would not be accurate. All functions have an environment, it's just that Scribunto may restrict access to some of them. And the environment returned depends on the function, not on the call stack; the "environment" returned is usually the same table that function sees as _G. BJorsch (WMF) (talk) 13:38, 22 March 2013 (UTC)
Conditionals
The reference doesn’t formally cover if-then-else conditionals. —Michael Z. 2013-03-21 21:55 z
- Extension:Scribunto/Lua reference manual#if seems to cover it. BJorsch (WMF) (talk) 13:39, 22 March 2013 (UTC)
mw.text.tag() request
So there is one special property for tags, also known as "style". jQuery acknowledges this, so setting style elements is not done through calling elem.attr('style', something) (although this will also work), but rather, they provide a special api called elem.css().
so here is the request: to provide some mw.text.XXX() support for style. this can be done either by augmenting mw.text.tag() somehow, or by creating a whole new mw.text.style(...). if the latter approach is taken, i'd like to make a suggestion: please allow for multiple parameters, and concatenate them using semicolon. so the call should look something like so:
function mw.text.style(...)
    local style = {}
    for _, v in pairs( { ... } ) do
        if type( v ) == "string" then
            table.insert( style, v )
        elseif type( v ) == "table" then
            for sk, sv in pairs( v ) do table.insert( style, sk .. ':' .. sv ) end
        else
            error( "Parameters to mw.text.style should be strings and tables only" )
        end
    end
    return table.concat( style, ';' )
end
this is not meant to be the code itself, necessarily - clearly i did not think of all the possible implications. it's more of a sketch to illustrate the required functionality of helping the lua programmers compose a "style" attribute from different bits and pieces of flotsam. either way, some functionality that would best mimic jQuery's ".css()" would be greatly appreciated. peace - קיפודנחש (talk) 15:40, 30 March 2013 (UTC)
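A standalone variant of the sketch above, for experimenting outside a wiki. The name `buildStyle` is hypothetical (no such function exists in mw.text). Two deliberate changes, both assumptions about what a caller would want: arguments are walked with select() so their order is preserved, and table keys are sorted so the output is deterministic, since pairs() gives no ordering guarantee.

```lua
-- Compose a CSS "style" attribute value from strings and key/value tables.
-- Assumes table keys are strings (property names).
local function buildStyle(...)
    local parts = {}
    for i = 1, select("#", ...) do
        local v = select(i, ...)
        if type(v) == "string" then
            parts[#parts + 1] = v
        elseif type(v) == "table" then
            local keys = {}
            for k in pairs(v) do
                keys[#keys + 1] = k
            end
            table.sort(keys)       -- deterministic property order
            for _, k in ipairs(keys) do
                parts[#parts + 1] = k .. ":" .. v[k]
            end
        else
            error("arguments to buildStyle must be strings or tables", 2)
        end
    end
    return table.concat(parts, ";")
end

local style = buildStyle("color:red", { width = "10px", height = "5px" })
```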
Fragment
The arguments for mw.title.makeTitle are ( namespace, title, fragment, interwiki ). Namespace, title, and interwiki are familiar enough to me, but what does fragment mean? Does this refer to the project link? I assume, for example, that the "fragment" in [[:wikt:es:Foo]] would be "wikt". Is this correct? — Mr. Stradivarius ♪ talk ♪ 11:19, 3 April 2013 (UTC)
- Ah, I see I was wrong after playing around with this for a bit. It seems the fragment is the part that appears after "#" in the URL, and that the "interwiki" part includes both the project name and the language name. If no-one has any objections, I think I'll update the manual to say this. — Mr. Stradivarius ♪ talk ♪ 12:32, 3 April 2013 (UTC)
- Feel free. Note that "fragment" (or sometimes "fragment identifier") is the actual name for that part of a URL, see RFC 3986 § 3.5. Anomie (talk) 13:37, 3 April 2013 (UTC)
Access to MediaWiki API; access to page text
Hi. I have two somewhat related questions.
- Is there any ability currently to access the MediaWiki API from Scribunto? For example, I'd like to get a list of all the subpages of m:Global message delivery/Targets. This is available in the MediaWiki API, but I'm not sure if Scribunto can retrieve this information right now. It'd be super-helpful to have a Special:PrefixIndex equivalent available (or more generally, access to the MediaWiki API).
- Is there any ability currently to access page text? For example, I want to count instances of a string within the wikitext of m:Global message delivery/Targets/Wikidata.
Thanks in advance for any help or pointers here. --MZMcBride (talk) 18:03, 4 April 2013 (UTC)
- Hello,
- I can answer question #2: title objects have a getContent() method, which returns the raw content of the page corresponding to the title object. See Extension:Scribunto/Lua_reference_manual#Title_objects (the last entry). I think it is what you are looking for. Regards, Hexasoft (talk) 19:18, 4 April 2013 (UTC)
- PS: note that I read somewhere that a Scribunto library exists (or is planned) to access wikidata stuff. Not sure about how advanced it is.
- note that title object in general, and getContent() in particular, do not work for titles on another wiki/site, so if the requirement is, for instance, to get the content of m:Global message delivery/Targets/Wikidata from any other wiki than meta, it won't help you, and i do not think this is possible. peace - קיפודנחש (talk) 22:37, 4 April 2013 (UTC)
- The Wikibase Lua API will be deployed at the same time as the #property parser function. So, next Monday for en Wikipedia and, if all is good, Wednesday for the others. Here is the doc. I've already written a module to test it. Tpt (talk) 20:05, 5 April 2013 (UTC)
- Making arbitrary API queries from Scribunto is probably not going to happen; they would be slow and prone to issues with proper data sanitization and accounting of the CPU time. This was discussed in this wikitech-l thread. BJorsch (WMF) (talk) 13:30, 5 April 2013 (UTC)
- All right. So for the Special:PrefixIndex example mentioned above, how would I achieve this functionality? File a bug in Bugzilla about adding this functionality to Lua/Scribunto? Will I need to do this for each MediaWiki API feature I want ported over to Scribunto/Lua? (Generating input lists seems like it'll be a pretty commonly needed functionality inside Scribunto modules, to avoid duplicating/hardcoding lists of pages, templates, transclusions, categories, images, external links, etc.) --MZMcBride (talk) 18:33, 7 April 2013 (UTC)
Links related to this discussion:
Perhaps these will be helpful to someone. --MZMcBride (talk) 20:48, 11 April 2013 (UTC)
- Actually, you can transclude and unstrip Special:PrefixIndex - see w:Module:Module overview for an example. A problem though is that I found out this causes the cache to be disabled, so the page is regenerated on every view. On the other hand, you can use the mw.message library to link to such an uncacheable page, and then the page will never be updated except when it is edited. Alas, I still know of no way to say refresh the cache once a day or once a week. (Speaking of which I never did remember to update Module Overview to avoid reloads, hmmm...) Wnt (talk) 05:54, 24 January 2014 (UTC)
- And only limited number of subpages is processed. See also category handling by Lua: ru:module:category. Ignatus (talk) 16:00, 24 January 2014 (UTC)
System messages with "page" or "subpage" ?
Hi.
Do you know what the difference is between system messages like Mediawiki:Scribunto-doc-subpage-xxx and Mediawiki:Scribunto-doc-page-xxx? It seems that both exist.
Thanks in advance for your answer, Automatik (talk) 14:53, 5 April 2013 (UTC)
- "Mediawiki:Scribunto-doc-subpage-xxx" was renamed to "Mediawiki:Scribunto-doc-page-xxx". The former is no longer used. BJorsch (WMF) (talk) 21:16, 6 April 2013 (UTC)
- Thanks! Automatik (talk) 00:55, 7 April 2013 (UTC)
Extension for mw.text.listToText
Hello,
it may be useful to be able to add something before/after the items given to listToText, i.e. something like <span class="nowrap">...</span> or why not [[...]], so that there would be no need to preprocess incoming parameters just to add constant strings before/after them before passing them to listToText.
Maybe something like an optional format parameter (i.e. "[[%s]]"), or optional pre and post parameters.
Regards, Hexasoft (talk) 21:50, 6 April 2013 (UTC)
- Hmmm… Well, in fact it can be simulated by: pre .. mw.text.listToText( list_of_elements, post .. '; ' .. pre, post .. ' or ' .. pre ) .. post. The only problem is that it prevents using the default separators. Regards, Hexasoft (talk) 08:55, 9 April 2013 (UTC)
- Yes, I also would like to use %1..9 in the string formats, example: mw.text.listToText( { John, Mickael, Robert }, '<br/>Dr. %1', '<br/>and Dr. %1\n' ) like in mw.string.gsub, with a last string format for the last element, and another string format for all the elements before it. --Rical (talk) 11:09, 9 April 2013 (UTC)
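The requested behaviour could be sketched in plain Lua as below. `formatList` is a hypothetical stand-in, not an existing mw.text function, and its default separator and conjunction are hard-coded English placeholders; the real listToText takes its defaults from the wiki's localised messages, which this sketch does not reproduce.

```lua
-- Wrap every item with a string.format template, then join the items
-- with a separator and a final conjunction.
local function formatList(items, itemFormat, separator, conjunction)
    itemFormat = itemFormat or "%s"
    separator = separator or ", "         -- placeholder default
    conjunction = conjunction or " and "  -- placeholder default

    local out = {}
    for i, item in ipairs(items) do
        out[i] = string.format(itemFormat, item)
    end

    local n = #out
    if n == 0 then
        return ""
    elseif n == 1 then
        return out[1]
    end
    return table.concat(out, separator, 1, n - 1) .. conjunction .. out[n]
end

local linked = formatList({ "a", "b", "c" }, "[[%s]]")
```

Formatting each item before joining keeps the separators independent of the wrapper, which avoids the limitation noted above for the pre/post workaround.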
mw.language:parseFormattedNumber
Documentation says:
lang:parseFormattedNumber( s )
This takes a number as formatted by lang:formatNum() and returns the actual number. In other words, this is basically a language-aware version of tonumber().
However, if you call lang:parseFormattedNumber( 'bla bla bla' ), you will not get nil as expected (after all, this is what tonumber() returns), but rather 'bla bla bla'. i'd rather you fix the code to do what the documentation implies rather than fix the documentation to say what the code currently does, so i did not touch the documentation. however, i recommend that until this is fixed, we should use tonumber( lang:parseFormattedNumber( s ) ) if we really want "tonumber" behavior. peace - קיפודנחש (talk) 17:25, 12 April 2013 (UTC)
File information
Is it possible to add file dimensions to the title objects? Is this even the right place to ask? — 69.162.8.80 17:47, 12 April 2013 (UTC)
- Better would be bugzilla. Anomie (talk) 00:43, 16 April 2013 (UTC)
Debug console
Hi,
When we edit a module, we can see the debug console below the editing interface, where it's written:
«* Precede a line with "=" to evaluate it as an expression, or use print().»
But the function print() is not available. How can this text be changed (in all languages)?
Thanks in advance, Automatik (talk) 21:01, 14 April 2013 (UTC)
- It works fine for me. Something like print(p.functionname()) prints whatever the functionname function returns, same as =p.functionname() does. print() doesn't do anything in the module itself, but you can use it in the debug console, at least on en.wikipedia. — 69.162.8.80 20:32, 15 April 2013 (UTC)
- Yes, print() is available as an alias for mw.log in the console only. See Extension:Scribunto/Lua reference manual#print. Anomie (talk) 00:41, 16 April 2013 (UTC)
- it would be nice if we had the debug console in Special:TemplateSandbox, to get the output of all the modules. is it possible to expand the synergy between Extension:Scribunto and Extension:TemplateSandbox as to include scribunto debug console in the sandbox? it would be nice, of course, to do it in the most generic way possible, so if sandbox will learn to do its thing with other extensions, they will be able to hook *their* debug console to the sandbox too. peace - קיפודנחש (talk) 22:02, 19 April 2013 (UTC)
- I've been working on including the mw.log output in all page previews. But it's currently hung up on a tangential UI change that I haven't had time to get to. BJorsch (WMF) (talk) 12:58, 22 April 2013 (UTC)
- Thanks. i think it will be very useful. peace - קיפודנחש (talk) 21:52, 22 April 2013 (UTC)
- I don't understand how to use the debug console. There's only an input field and one button, "clear". How can I see the value of variables/functions from the text of the script? If I type "=variable" in the field, nothing changes. If I insert mw.log () or mw.log (variable) in the script, also nothing. --Vladis13 (talk) 23:44, 18 November 2015 (UTC)
title argument to frame:newChild()
What does it do? Seems like nothing: in the console, mw.getCurrentFrame():newChild{title="user:Ignatus"}:preprocess("{{PAGENAME}}") gives the opened module name, not "Ignatus". Well, it would be great if someday we could expand templates as if on a page with a specific title. Ignatus (talk) 08:17, 16 April 2013 (UTC)
- It sets the title of the new frame. But at a glance, I don't see anything that actually uses the frame's title ({{PAGENAME}} gets the title of the page being parsed from the parser). Anomie (talk) 12:48, 16 April 2013 (UTC)
mw.text.unstrip: be able to detect wiki tag?
Hello,
mw.text.unstrip() will be useful in the cases where we receive parameters wrapped in a nowiki tag.
But is there a way to know that a string is "tagged" and/or to know with which tag(s)? Of course comparing str with unstrip(str) can answer the first point, but it is not very nice, and it doesn't tell which kind of data is inside (I mean, nowiki or pre should be fine because the data inside is « real »; ref is probably not fine: <a href="#_note-toto-1">[1]</a> is not treatable).
Regards, Hexasoft (talk) 13:34, 30 April 2013 (UTC)
- You can extract the tags that mw.text.unstrip() replaces with a pattern something like (\127UNIQ[^\127]+QINU\127). Often a pattern like (\127UNIQ%x+-([^-\127]+)-[^\127]+QINU\127) should give you the name of the tag too, but that's not guaranteed to work. Anomie (talk) 13:00, 1 May 2013 (UTC)
- Yes. When I played with stripped strings I found this point. As the strip structure is internal, maybe a mw.text.isStripped(text) that would return nil or a string with the tag found? (or make unstrip() return a second value with the same convention?)
- It is clearly not difficult to code this by myself. It would be useful if the internal structure may change in the future. If not it is not an important point.
- Thanks, Hexasoft (talk) 08:33, 22 May 2013 (UTC)
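The pattern suggested in this thread can be tried out in plain Lua as follows. The helper name `findStripTags` is hypothetical, and the sketch assumes the strip-marker layout quoted earlier on this page (a DEL byte, "UNIQ", hex digits, the tag name, a counter, "QINU", a DEL byte). As noted above, that layout is internal and not guaranteed, so treat this as illustrative only.

```lua
-- Strip-marker pattern from the thread, with the hyphens escaped for
-- clarity. The single capture is the tag name.
local MARKER = "\127UNIQ%x+%-([^%-\127]+)%-[^\127]+QINU\127"

-- Return the tag names of all strip markers found in s, in order.
local function findStripTags(s)
    local tags = {}
    for tag in string.gmatch(s, MARKER) do
        tags[#tags + 1] = tag
    end
    return tags
end

-- Sample input built from the marker format shown on this page.
local sample = "See\127UNIQ7664d5acee6378e2-ref-00000002-QINU\127 and "
    .. "\127UNIQ7664d5acee6378e2-nowiki-00000003-QINU\127."
local tags = findStripTags(sample)
```

An empty result means no marker in the assumed format was found, which is not the same as the string being free of stripped content.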
mw.ustring.isutf8
Hi,
Is it normal that I get true when I call a function that returns mw.ustring.isutf8 ("00000")? How should I understand what a UTF-8 character is?
Thanks in advance, Automatik (talk) 01:56, 2 May 2013 (UTC)
- Yes, because that string is valid UTF-8. You would get false for something like mw.ustring.isutf8 ("00\128000") because a byte 128 may not follow a byte 48 in a UTF-8-encoded string. See w:UTF-8 (or, as I see you edit most on French-language projects, w:fr:UTF-8) for details on UTF-8 encoding. Anomie (talk) 13:15, 2 May 2013 (UTC)
__len metamethod
Is it planned to use it for overriding # operator? I see it is highlighted in edit window like others but doesn't yet work. Ignatus (talk) 11:41, 16 May 2013 (UTC)
- The __len metamethod does not apply to tables in Lua 5.1, which Scribunto is based on. If there is a way to simulate it in pure Lua 5.1 (i.e. without recompiling or loading C libraries), we'd be very interested. Anomie (talk) 13:27, 16 May 2013 (UTC)
- It is possible to get that behaviour using proxies. It is possible to create a userdata with newproxy(), whose __len metamethod is called when the # operator is used. Example:
local mt, cache = {}, setmetatable({}, {__mode = "k"})
function newobject()
    local pr = newproxy()
    debug.setmetatable(pr, mt)
    cache[pr] = {}
    return pr
end
mt.__newindex = function(self, k, v) cache[self][k] = v end
mt.__index = function(self, k) return cache[self][k] end
mt.__len = function() return 5 end
foo = newobject()
foo.a = 2
print (foo.a, #foo) --> 2, 5
- A safe version of debug.setmetatable() could be provided: check that you're not setting the metatable of a core object, like strings, numbers, booleans or nil.
- ... but Scribunto backports __pairs and __ipairs from Lua 5.2. Why isn't __len for tables supported as well? I don't think it would break any existing code, since #table is currently useless if you don't want the default behavior. -- Pygy 81.243.36.191 11:55, 3 June 2013 (UTC)
- Hmm, undocumented newproxy(), which seems to have been removed entirely in 5.2. I'll have to have a look at that at some point, to see if it seems close enough to a real table to be usable internally (e.g. for mw.loadData). type( t ) ~= 'table' may turn out to be troublesome. But at a glance I doubt we'll want to expose newproxy and even a wrapped debug.setmetatable to module code.
- We can backport __pairs and __ipairs because doing it in pure Lua just requires redefining pairs and ipairs using documented and supported methods; see Gerrit change 40998. Anomie (talk) 14:38, 3 June 2013 (UTC)
- newproxy() was experimental and it was removed because its primary use case (the custom __len metamethod) was no longer needed in 5.2. type(newproxy()) returns "userdata". The patch that backports len is rather short (https://github.com/dubiousjim/luafiveq/blob/master/patches/table-len.patch)... Why are you reluctant to use a patched version? BTW, if you are using LuaJIT, enabling __len is one compile-time switch away. Edit: my bad regarding LuaJIT: the flag I mentioned also enables a slew of other Lua 5.2 features, and removes two methods from the standard lib... That being said, the functionality is already there, it must just be enabled. -- Pygy 81.243.36.191 22:46, 3 June 2013 (UTC)
- Because we want other wikis to be able to use Scribunto without installing custom binaries, which their hosting provider may not even allow. At one time even requiring the stock Lua interpreter was worrisome. Anomie (talk) 13:51, 5 June 2013 (UTC)
- That makes sense, I'm not familiar with the distribution process of MediaWiki. If I understand you properly, providing patched Lua source and binaries along with MediaWiki is not an option, then... -- Pygy 89.90.144.90 08:50, 10 June 2013 (UTC)
- It turns out that debug.setmetatable is not needed:
baseproxy = newproxy(true)
getmetatable(baseproxy).__len = function() return 5 end
proxy2 = newproxy(baseproxy)
print(#proxy2) --> 5
- -- Pygy ~~
Please tell something more about frame:getParent
frame:getParent() Called on the frame created by {{#invoke:}}, returns the frame for the page that called {{#invoke:}}. Called on that frame, returns nil.
Very esoteric doc.... :-D I couldn't understand what it means; but, while browsing some scripts on fr.source, I found (if I'm not wrong) that this is the key to passing arguments from a template to a Lua script, and this is very important (see s:fr:Module:Table). Can someone expand the doc as it deserves? Thanks! --Alex brollo (talk) 07:59, 23 May 2013 (UTC)
- Hello,
- the frame is the "environment" of the caller. This environment (mainly) includes the arguments given to the call (the #invoke). As modules are often called from a template, which is used in articles, the arguments given to the template are not in the frame but in the getParent() of the frame. Let me give an example:
- In article foo we call {{mytemplate|templatearg1|templatearg2}}. This template does: {{#invoke:mymodule|myfunction|modulearg1|modulearg2}}.
- In the function myfunction from module Module:Mymodule, if you read the arguments in frame.args you will find modulearg1 and modulearg2 as unnamed arguments. If you access the args table of the parent frame (using frame:getParent().args) you will find templatearg1 and templatearg2 as unnamed arguments.
- Of course if you directly use a module with #invoke rather than with an intermediate template only the main frame exists.
- Hope it will help you.
- Regards, Hexasoft (talk) 08:20, 23 May 2013 (UTC)
- Thanks! I'll test in my sandbox. I guess, from the fr.source script, that I can retrieve templatearg1 and templatearg2 by frame:getParent().args even if the template doesn't pass parameters explicitly when calling the module (a statement {{#invoke:mymodule|myfunction}} is sufficient). Well, I have what I need to learn by "try and learn". --Alex brollo (talk) 10:03, 23 May 2013 (UTC)
- It runs :-) --Alex brollo (talk) 13:18, 23 May 2013 (UTC)
Also beware that neither frame:getParent() nor frame.args ever return the intermediate args.
- In article foo we call {{mytemplate|templatearg1|templatearg2}}.
- The template mytemplate has the code {{myothertemplate|templatearg3|templatearg4}}
- The template myothertemplate has the code {{#invoke:mymodule|myfunction|modulearg1|modulearg2}}
Module:mymodule will never get templatearg3, templatearg4 unless you specifically pass them from the last (myothertemplate) to the module. You should use something like {{#invoke:mymodule|myfunction|modulearg1|modulearg2|{{{templatearg3}}}|{{{templatearg4}}}}} (or some other code that checks the existence of these parameters before passing them to the module, not very practical though) --Xoristzatziki (talk) 10:29, 9 August 2018 (UTC)
- You seem to have that backwards. frame:getParent().args is specifically intended to return your templatearg3 and templatearg4. Anomie (talk) 12:08, 9 August 2018 (UTC)
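The difference between the two argument tables described in this thread can be sketched as follows. This is a minimal illustration, not code from any existing module: the module table `p` and the function name `myfunction` simply follow the hypothetical names used above, and the nil guard on the parent frame is a defensive assumption.

```lua
-- Hypothetical Module:Mymodule, following the names used in the thread.
local p = {}

function p.myfunction(frame)
    -- Arguments written directly in {{#invoke:mymodule|myfunction|...}}
    local invokeArgs = frame.args
    -- Arguments of the template (or page) that contains the #invoke.
    -- The nil check is only a defensive guard.
    local parent = frame:getParent()
    local templateArgs = parent and parent.args or {}

    return string.format(
        "invoke: %s, %s; template: %s, %s",
        tostring(invokeArgs[1]), tostring(invokeArgs[2]),
        tostring(templateArgs[1]), tostring(templateArgs[2])
    )
end

-- A real Scribunto module would end with: return p
```

With the nested templates of the last example, the parent frame is the one for myothertemplate, so templateArgs would hold templatearg3 and templatearg4, in line with Anomie's reply.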
mw.language:formatNum in the Lithuanian
In Lithuanian the number 123456.78 must be given as "123 456,78", but the result is "123&#nbsp;456,78". --Vpovilaitis (talk) 05:09, 24 May 2013 (UTC)
- It works fine for me: mw.language.new('lt'):formatNum(123456.78) returns "123 456,78", with a raw non-breaking space character. Whatever you're using to encode html entities appears to be mis-encoding the non-breaking space as "&#nbsp;" rather than " ". Anomie (talk) 07:17, 24 May 2013 (UTC)
- Thanks. I'm experimenting with en:Module:Chart. But this text was in a tooltip (title tag). --Vpovilaitis (talk) 08:03, 24 May 2013 (UTC)
I'm lazy, so I'm going to suggest...
Could there be another set of classes added:
- %m: represents all magic characters ^$()%.[]*+-?
- %M: all characters not in %m.
Just an idea to enable my laziness... :p Technical 13 (talk) 17:08, 7 June 2013 (UTC)
- Not likely. We don't want to mess around with extending the standard Lua methods like string.gsub, and we don't want to take the mw.ustring methods too far from the corresponding string methods.
- Also, what would be the point of this? The only thing I can see that it might be useful for is if you were trying to escape a user-supplied value-that's-not-supposed-to-be-a-pattern, but for that you can just use %p because all magic characters are in %p and anything in %p that isn't magic will still work properly when escaped. Anomie (talk) 15:08, 8 June 2013 (UTC)
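The %p suggestion can be sketched like this. The helper name `escapePattern` is hypothetical; only string.gsub and string.find from the standard library are used.

```lua
-- Hypothetical helper: escape every punctuation character with "%".
local function escapePattern(text)
    -- "%%%0" is a literal "%" followed by the whole match (%0).
    -- The parentheses drop gsub's second return value (the match count).
    return (text:gsub("%p", "%%%0"))
end

local userValue = "a.b*c"
local escaped = escapePattern(userValue)          -- "a%.b%*c"

-- Unescaped, "." and "*" are magic, so the value matches other text too.
print(string.find("xx aXbbc yy", userValue))      -- finds "aXbbc"
-- Escaped, the value only matches itself.
print(string.find("xx aXbbc yy", escaped))        -- nil
print(string.find("xx a.b*c yy", escaped))        -- start 4, end 8
```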
String methods
('foobar'):match('foo')
The manual says that when we call a method on a string, we are using the string library. So in the above code, we would be calling the function string.match('foobar', 'foo'). However, it also says that the string library cannot operate on Unicode characters, and that we should use the mw.ustring library instead. For this reason, is using string methods a bad idea? I have noticed a few places where they are used in code already, and I am curious. — Mr. Stradivarius ♪ talk ♪ 08:51, 12 June 2013 (UTC)
- Personally, I try to avoid it in Scribunto code for that reason. Although if you are doing bytestring manipulation, the ability to chain calls is useful. But IMO that's a matter for a style guide rather than the reference manual. Anomie (talk) 12:59, 12 June 2013 (UTC)
- That's what I thought - thanks for the clarification. I guess I'm new to issues of coding style - such things didn't really exist (?) with template code. — Mr. Stradivarius ♪ talk ♪ 12:18, 20 June 2013 (UTC)
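A small sketch of the point discussed here, runnable in plain Lua. The sample strings are made up for illustration, and the mw.ustring calls are mentioned only in a comment because they exist only inside Scribunto.

```lua
local s = "foobar"
-- Method-call syntax goes through the string metatable,
-- so both forms call the same string.match function.
assert(s:match("foo") == string.match(s, "foo"))
-- A string literal needs parentheses before a method call.
assert(("foobar"):match("foo") == "foo")

-- The string library works on bytes. "é" is two bytes in UTF-8; it is written
-- with decimal escapes so the example does not depend on the file encoding.
local word = "caf\195\169"        -- "café"
print(#word)                      -- 5 (bytes), although there are 4 characters
local cut = word:sub(1, 4)        -- keeps only the first byte of "é"
print(#cut)                       -- 4

-- In Scribunto, mw.ustring.len(word) and mw.ustring.sub(word, 1, 4)
-- are the character-aware counterparts (not run here).
```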
mw.ustring library missing reverse() method
- The following discussion is closed. Please do not modify it. No further edits should be made to this discussion.
this may seem like a useless method, but it's not: specifically, Extension:EasyTimeline, which is installed on wikimedia wikis, treats all strings as if they were written from left to right. this means that for RTL strings, we need to reverse them manually, which was a major point of frustration for Timeline users on RTL wikis. i can't think of a good reason why mw.ustring should omit the standard string function reverse(). peace - קיפודנחש (talk) 13:57, 15 June 2013 (UTC)
- Because correctly reversing a Unicode string is non-trivial. You can't just reverse the codepoints, you have to divide the string into "abstract characters" (base characters plus any combining characters) or "grapheme clusters" and reverse those. And then you probably have to handle ties, bidi characters, and other such specially. Rather than trying to hack around it with Lua, it would probably be better to fix RTL support in Extension:EasyTimeline. Anomie (talk) 13:13, 17 June 2013 (UTC)
- in principle i think you are correct, but in reality, it would be useful to have a reverse() function with a footnote, explaining that this function actually reverses codepoints and not "abstract characters". true, there would be cases where this limitation would make reverse() useless, or at least "less useful", but i can't see how the current state of affairs, with no "reverse" at all, is any better. for hebrew, at least (and i believe the situation is very similar for arabic), these "abstract characters" appear when the string uses something called Niqqud. this is optional, and telling the users "do not use niqqud in such and such situation" is far superior to telling them "if you want to use timeline, you have to supply the strings backwards", which is what happens now (and, btw, i do not think niqqud works with timeline anyway).
- as to the "Rather than trying to hack around it with Lua, it would probably be better to fix RTL support in Extension:EasyTimeline": this problem plagued the use of "timeline" in RTL wikis for years and years (ever since timeline was introduced - i think around 2005 or maybe even earlier). it was not fixed in the years since then, and is not likely to be fixed ever.
- Lua gave us a nice opportunity to work around the issue (i actually already implemented string reversing on hewiki, by utilizing mw.text.split(), manually reversing the table, and then use table.concat(). i just thought that since lua supports string.reverse(), we should also have mw.ustring.reverse() ).
- peace - קיפודנחש (talk) 16:00, 17 June 2013 (UTC)
- Both of you are wrong. Unicode "code points" and "abstract characters" are exactly the same (except for special code points not assigned to abstract characters: the 4096 surrogates and the ~50 non-characters like U+FFFF).
- The correct term is "combining sequences" (that are easy to split, and locale-independent) or the longer "grapheme clusters" (Unicode provides some data for them, but they are locale-dependent).
- "Reversing a string" anyway is not an operation that should depend on locales as it will always create something that has no meaning in all locales (so "grapheme clusters" are not relevant at all).
- What this means is that the reverse() method can safely be implemented by splitting on boundaries of combining sequences (Bidi properties also do not matter at all!). It also does not matter if that string is not displaying the same clusters (when rendering this text that has no meaning).
- All that is needed is to parse code points (their limits are extremely easy to detect, and the Ustring module already does that) and know if a codepoint is combining or not (i.e. its combining class is non-zero) to detect the boundaries of combining sequences (and to ensure that the result will respect canonical equivalences). You actually don't need a large map of exact combining classes but only if it is zero or not and the UCD provides a convenient "derived" data file for that (which is used by NFC/NFD normalizers).
- In summary, the "Ustring" library should be able to return not just one code point, but should be able to easily parse and return combining sequences (even, if it does not support for now their normalization to NFC or NFD or NFKC or NFKD, something that requires more data: normalization to NFC would however be very useful).
- I could create a demonstration of that, in pure Lua, and it will be fast, but this reverse() function will be a "conforming Unicode process" and it will also be autoreversible (if it does not normalize its input or output, something that is not necessary).
- It is even possible to implement it without using the "Ustring" code point parser, by working directly at the byte level with Lua "string". For example it is possible to create a pattern for gsub() that will match a valid UTF-8 encoded combining character, and a second pattern that will match any other valid code point which is not a combining character or any other byte that cannot be part of a valid UTF-8 encoded character. Then use these two patterns (generated automatically from the small list containing all ranges of codepoints that are assigned to characters with a non-zero combining class) in a simple loop, to split an input string into a sequential table, that will then be reversed in situ, and then concatenated.
- If you are still not convinced that the list of code point ranges is small, look at "http://unicode.org/Public/UNIDATA/extracted/DerivedCombiningClass.txt" (drop the part related to characters with combining class zero, merge the other lists per combining class into a single one) or at "http://unicode.org/Public/UNIDATA/DerivedNormalizationProps.txt" (look for the NFC_QC or NFD_QC properties): in fact all we need is even smaller than these lists of ranges! Verdy p (talk) 20:59, 23 April 2015 (UTC)
- You are wrong, an abstract character may be represented by multiple codepoints and thus they are in no way the same. And the reversing is more complicated than just looking for combining characters, see [2] for the actual specification. It's possible to do, but it would take a large additional data table (from [3]). Anomie (talk) 22:20, 23 April 2015 (UTC)
- No you are wrong. Per Unicode definition, an "abstract character" is definitely NOT a "grapheme cluster". Reread the standard itself (notably the "Unicode character model", published many years ago).
- Code points are being assigned ONLY to abstract characters, or to non-characters (including surrogates), and an abstract character can be given ONE and ONLY ONE code point.
- You are confused by the fact that "character sequences" may be standardized with a standard name by listing the code points to which it is mapped. But character sequences are also not necessarily combining sequences (it may contain several combining sequences) and they are also not necessarily grapheme clusters (because they are extensible by appending more combining characters and/or "grapheme extenders"). Unicode in fact does NOT standardize grapheme clusters (because they are locale-dependent), but ONLY "default grapheme clusters" (which are locale neutral but have little use, except in neutral locales such as the CLDR root).
- In other words, you have absolutely no knowledge of the Unicode terminology (as a member and participant of Unicode since many years, I know that you're wrong, and that you have never read correctly this essential part of the standard). Verdy p (talk) 23:36, 26 April 2015 (UTC)
- Quoting [4] §3.4 D7, "Abstract characters not directly encoded by the Unicode Standard can often be represented by the use of combining character sequences." I can't see any way to claim that abstract characters and code points are the same thing, as you did above. I never said that an abstract character is the same thing as a grapheme cluster, but for properly reversing a string you'd likely need to consider grapheme clusters rather than abstract characters. Anomie (talk) 13:41, 27 April 2015 (UTC)
- By definition, in Unicode itself, ALL abstract characters encoded in Unicode have a single (and guaranteed) assigned code point; code points may also be assigned to "non-characters" (e.g. surrogates that don't even have a scalar value, or U+FFFE and U+FFFF which ARE valid code points with a valid scalar value, but are not characters at all).
- Once again reread the Unicode standard. The "combining character sequences" are still not defined by Unicode as "abstract characters" (the standard says that this interpretation **may** be done, but this is not what Unicode defines): the "combining character sequences" are NOT abstract characters encoded by Unicode. And they are also not code points, have no scalar values themselves. They are also NOT "grapheme clusters". The set of "grapheme clusters" (not defined by Unicode) includes the set of "combining character sequences" (defined by Unicode) which themselves includes the set of "abstract characters" (encoded by Unicode).
- Other non-Unicode standards (which are also NOT standards supported by ISO 10646 in its version since 2003) are used to create more "abstract characters" (e.g. the Apple logo in MacRoman, or characters assigned in planes higher than plane 16, defined on the old UCS-4 encoding of the former ISO 10646:2000 standard), but there's no way to encode them uniquely with any standard Unicode encoding form (because Unicode does not assign them any scalar value and does not allow to encode them either with the "combining sequences" defined by Unicode).
- Unicode expressly says that ANY sequence of abstract characters encoded by Unicode is a VALID Unicode text (even if they don't make sense with some "grapheme clusters" that Unicode does not define as they make sense only on specific locales and Unicode does not standardize languages, or orthographic conventions; note that Unicode defines only a subset known as "default grapheme clusters", but still they are not associated to any specific language or orthography, and in fact these "default grapheme clusters" are now mostly abandoned and were not made any entry in the standard or in standard annexes, but only in some informative technical annexes, and based on informative, non-normative character properties or other mutable rules of these technical annexes that are just documenting some of the known best practices). So there's NO "grapheme cluster" defined in the standard.
- So I absolutely don't see why "ustring.reverse()" cannot do what it is intended for: reverse the order of characters. It does not matter at all if this breaks some "grapheme clusters" defined for some languages, because the result will still be perfectly VALID Unicode text! Of course you won't see the usual creation of clusters made in fonts (e.g. the joining of Arabic characters will have curious artefacts, or multiline text will have the lines reversed), but renderers will still be able to render the generated pseudo-text correctly (and all these effects also exist with the "string.reverse" function (e.g. it will also reverse the order of lines, or will reverse CR+LF sequences to LF+CR)).
- So actually "string.reverse" is not made to create "meaningful" text, and the same can be said about "ustring.reverse" for exactly the same use.
- There's no point in those two functions to speak about "grapheme clusters" when "string.reverse" already splits and reverses the CR+LF "grapheme cluster" and reverse('text') returns 'txet' which has no meaning (in English), or reverse('chat') returns 'tahc' which also breaks the English cluster 'ch'.
- Lua, or MediaWiki, do not even know what encoding or orthography or language is used in text, so they cannot infer any "meaning" of their sequences as their locale-sensitive "grapheme clusters", because their meaning or usage is completely opaque. As well, string.reverse('sample\000') returns '\000elpmas' which would break an interpretation as a null-terminated string like in C (it would be interpreted as an empty string), but this does not matter: we are not concerned by string usages or interpretations in specific locales or environment that have their own requirement about their supported "grapheme clusters" (the valid Lua '\000' string is not a grapheme cluster in C, as a grapheme cluster can never be empty; but it is still a valid single "character" in C).
- Lua or MediaWiki does not need then to know the "meaning" as "grapheme clusters" in "mw.ustring" or if it will render as expected.
- But the generated strings won't create any conformance bug, they are valid, encodable, and fully supported as well by HTML if the original string is supported (original strings are supported only if they use a known subset of valid characters assigned to a well-known set of code points, which includes MOST code points that were either assigned to characters by Unicode, excluding MOST C0 and C1 controls, but also include other valid codepoints still not assigned, codepoints assigned to private use characters in the 3 PUA blocks, but does not include any code point assigned to "non-characters" like surrogates and U+FFFF; this set of valid codepoints is known, static, will never change, and any text using these codepoints is VALID in Unicode, and as well in HTML).
- As well the combining sequences may be "altered" by normalization, but: normalization1(text)==normalization1(reverse(normalization3(reverse(normalization2(text))))) is still true even if the three normalizations here are different and order/combine/uncombine characters differently or one of them does nothing (provided that each normalization used is a "conforming process", which is the case for the four standard normalizations NFC, NFD, NFKC, NFKD, but which is not if it transcodes via another encoding, including with ISO 2022 variants, GBK, HKCS, SJIS, and even GB 18030)!
- So reversing a text and reversing it again will preserve the canonical equivalence, and the "reverse" operation is then a "conforming" process according to the Unicode standard. Verdy p (talk) 19:30, 25 October 2018 (UTC)
- The truth is that we need a mw.ustring.reverse function NOT to create meaningful text, but precisely to correctly and easily parse some Unicode texts (given the various limitations of Lua patterns, notably for unsupported alternation with '|' or unsupported bounded repetitions). Reversing UTF-8 strings has exactly the same uses as with ASCII-only strings. And it is always reversible again to recreate meaningful text.
- But using "string.reverse" on UTF-8 text is NOT conforming as it breaks in the middle of encoded characters.
- Permitting "string.reverse" in Scribunto for MediaWiki makes no sense at all, it should not even be tolerated and it should return an error instead, as it creates text that cannot be parsed as valid HTML! There's no such risk with "mw.ustring.reverse" (which is extremely simple to implement reliably and efficiently, without needing to use costly splits/joins via large arrays, and without excessive use of the memory allocator and garbage collector if this is done only in pure Lua, where strings are immutable, but it is possible in pure PHP within Scribunto itself, which can also use the exposed PHP API of Mediawiki or one of its extensions, where some functions are implemented in native C libraries interfaced with PHP, including the Scribunto extension for Mediawiki (written in PHP but using a native C library to run Lua: PHP makes the link between native C, the Mediawiki API in PHP, and Lua where Scribunto exposes some "mw" packages)).
- Note: "mw.ustring.reverse" can be implemented efficiently (in pure Lua running within Scribunto) by:
- applying "string.reverse" to the whole input string;
- applying "string.gsub" with a pattern matching '[\128-\191]+[\194-\244]' (which detects ALL valid original UTF-8 sequences which were reversed by the 1st operation and became invalid) and substituting each individual match with the result of the "string.reverse" function. (This pattern matches a bit more than valid UTF-8 sequences only, and matches some invalid UTF-8 sequences, but only those found in input texts that were already not valid UTF-8)
- These two successive operations may be made in any order (this does not change the result).
- For example, with invalid UTF-8 input text:
- mw.ustring.reverse('\129\128') will first reverse it completely to '\128\129' in the first operation, and the second operation will do nothing else, and you get '\128\129' which is also invalid text (reapplying mw.ustring.reverse will restore the original)
- mw.ustring.reverse('\128\129\194') will first reverse it completely to '\194\129\128' in the first operation, and the second operation will match nothing, and you get '\194\129\128' (reapplying mw.ustring.reverse will restore the original, as in the following example)
- mw.ustring.reverse('\194\129\128') will first reverse it completely to '\128\129\194' in the first operation, and the second operation will match everything to reverse it again, and you get '\194\129\128' (like the original)
- mw.ustring.reverse('\194\128\194\129') (valid input) will first reverse it in the first operation completely to the (now invalid!) '\129\194\128\194', and the second operation will match '\129\194' and '\128\194', will reverse them separately, and you get valid output '\194\129\194\128' (reapplying mw.ustring.reverse will restore the original as in the following example)
- mw.ustring.reverse('\129\194\128\194') (invalid input) will first reverse it completely to '\194\128\194\129' (now valid!) in the first operation, and the second operation will match nothing to reverse, and you get the valid output '\194\128\194\129' (reapplying mw.ustring.reverse will restore the invalid original)
- In summary:
- with any valid UTF-8 input, the result is also valid UTF-8, and reversible again by the same function.
- with any invalid UTF-8 input, the result is also invalid UTF-8, and reversible again by the same function.
- Applying once again the same two operations (also in any order) is then guaranteed to return the initial input text, so "mw.ustring.reverse" is self-reversible.
- with any valid Unicode text (which is valid UTF-8 and does not contain any non-character), the result is also valid Unicode text (and valid UTF-8), and reversible again by the same function.
- with any invalid Unicode text (which is valid UTF-8 but contains one or more non-characters), the result is also invalid Unicode text (and valid UTF-8), and reversible again by the same function.
- So "mw.ustring.reverse" is also a "Unicode-conforming process", which does NOT require combining sequences to be complete after a leading base character, and does NOT require any normalization order, and does NOT require that grapheme clusters (valid only for specific locales) to be left untouched (combining sequences boundaries, or grapheme cluster boundaries for specific locales NEVER matter at all for strict Unicode process conformance, and not even for strict HTML conformance, or strict XML conformance).
- Strict conformance of "mw.ustring.reverse" for its use with identifiers (or other technical syntaxes or linguistic orthographies) is not guaranteed, but this is true as well if you use "string.reverse", for exactly the same reasons.
- "mw.ustring.reverse" will not preserve the locale-specific grapheme cluster boundaries, needed for correct collation or sorting, but this is true as well if you use "string.reverse", for exactly the same reasons (linguistic and their orthographic considerations do not matter here).
- Alternatively in the 2nd operation, you may prefer using the pattern matching:
- '[\194-\244][\128-\191]+' (which detects only invalid original UTF-8 sequences which were reversed by the 1st operation and became possibly VALID) to restore their initial order. Here also the two successive operations may be made in any order (this does not change the result). This alternate variant for implementing "mw.ustring.reverse" in fact produces exactly the same result as the first variant.
- you may want to replace the subpattern '[\128-\191]+' (in either of the two previous patterns, where it is used just beside the subpattern matching a valid leading byte), which greedily matches an unlimited number of continuation bytes (beside the leading byte), by '[\128-\191][\128-\191]?[\128-\191]?', which greedily matches at most 3 continuation bytes (beside the leading byte), because valid UTF-8 sequences cannot be longer than 4 bytes, in order to get some minor speedup (visible only with arbitrarily long invalid UTF-8 input that contains some very large chunks of invalid UTF-8 sequences made of very long sequences of continuation bytes, causing the "gsub" operation to find longer matches and to allocate longer substrings to be replaced by new reversed strings also very long). This changes the result of "mw.ustring.reverse" ONLY in invalid text, but the result is identical for valid UTF-8 input text, and the "mw.ustring.reverse" function still remains self-reversible. As well this variant will also remain a "Unicode-conforming process".
- So I see absolutely no justified reason to forbid "mw.ustring.reverse" in Scribunto for MediaWiki, or Lua in general (outside MediaWiki), where it helps solving more complex text-handling problems, just like what "string.reverse" does for legacy texts encoded with 7-bit or 8-bit charsets (most of them technical, or restricted to basic English, and encoded with pure ASCII or ISO 8859 or similar very limited repertoires).
- Asian users may want a function similar to "mw.ustring.reverse", but working this time on their legacy multibyte charsets (SJIS, GBK, GB18030...) to preserve the boundaries of valid multibyte sequences, like those used by UTF-8 (this is probably not needed for most installations of Mediawiki that just need UTF-8).
- You may want to define a similar function working now with UTF-7, or BOCU-8 (most probably not needed in Mediawiki to render HTML pages on the web).
- This won't work however with multibyte charsets whose encoding depends on the effective encoding state produced when encoding previous characters, notably ISO 2022 or encoding for old terminal or printer protocols like VT100/ANSI/etc. (with escape sequences to switch to another encoding, or with shift-ins/shift-outs or data link escapes to switch some parts of their encoding space across several codepages) or encodings using some stateful compression schemes: these charsets are not safely reversible (meaning that you must preserve texts at least from their beginning, and cannot extract subtexts safely at most other start positions, and that if you reverse them, you need to preserve them at least from the end). These legacy charsets are now rapidly becoming obsolete; even compression is much better performed, in a simpler way without having to handle them in Mediawiki, in the HTTP(S) transport layer (implemented by the webserver) and MediaWiki just needs to generate plain UTF-8 text.
- In fact the whole standard "string" package should be disabled completely in Scribunto (including basic functions, notably length, substrings, and all search/match/substitution functions), and replaced by "mw.ustring" for everything: we must ensure that valid UTF-8 text will NEVER be broken and transformed into INVALID text, generating invalid HTML, or invalid XML, or invalid JSON, and so on. Verdy p (talk) 19:53, 25 October 2018 (UTC)
This discussion was answered five years ago, and again three years ago. Continuing to write multi-kilobyte error-ridden screeds years after the fact is just disruptive. Anomie (talk) 13:28, 26 October 2018 (UTC)
- The discussion above is closed. Please do not modify it. No further edits should be made to this discussion.
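For readers following the closed discussion, the two-step byte-level approach proposed in it (reverse all bytes, then restore each reversed multi-byte sequence) can be sketched in plain Lua as below. The name `utf8Reverse` is hypothetical, since mw.ustring provides no reverse function. The sketch reverses code points only, so combining sequences are not kept together, and it is checked here only on valid UTF-8 input.

```lua
-- Hypothetical helper; not part of mw.ustring.
local function utf8Reverse(s)
    -- Step 1: reverse all bytes. Inside each multi-byte sequence the
    -- continuation bytes now come before the lead byte.
    local reversed = s:reverse()
    -- Step 2: put each "continuation bytes + lead byte" run back in order.
    -- gsub calls string.reverse with the whole match; the parentheses
    -- drop gsub's second return value (the match count).
    return (reversed:gsub("[\128-\191]+[\194-\244]", string.reverse))
end

-- Decimal escapes keep the example independent of the file encoding.
local input = "h\195\169llo \226\130\172"    -- "héllo €"
local output = utf8Reverse(input)             -- "€ olléh"
print(output)
print(utf8Reverse(output) == input)           -- true for this valid input
```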
Example for string.format?
I read the description for string.format() about 10 times, but I still don't really understand what it does. Would some kind person be willing to point me to an example or two? I have a feeling that it would be very useful in my coding, but I can't really grasp what it's used for at the moment. — Mr. Stradivarius ♪ talk ♪ 12:05, 20 June 2013 (UTC)
- Here is an extract from my module that uses the format function to display coordinates and convert an angle value to a text representation with a specific accuracy, where markers is a table with the °, ', and " characters:
assert(value >= 0, "Unexpected negative value")
local angle = math.floor(value)
local minutes = math.floor((value - angle) * 60)
local seconds = (value - angle) * 3600 - minutes * 60
if precision == "st" then
    return string.format("%s%0.0f%s%s", prefix, angle, markers[1], suffix)
elseif precision == "min" then
    return string.format("%s%0.0f%s%02.0f%s%s", prefix, angle, markers[1], minutes, markers[2], suffix)
elseif precision == "sek" then
    return string.format("%s%0.0f%s%02.0f%s%02.0f%s%s", prefix, angle, markers[1], minutes, markers[2], seconds, markers[3], suffix)
else -- "sek+"
    return string.format("%s%0.0f%s%02.0f%s%04.1f%s%s", prefix, angle, markers[1], minutes, markers[2], seconds, markers[3], suffix)
end
- The roots of that interface come from the old printf of the C language. In the example, "%04.1f" prints the float value passed with one digit after the decimal point, using at least 4 characters and padding with leading zeros if necessary. Paweł Ziemian (talk) 13:41, 20 June 2013 (UTC)
- (edit conflict) All this is heavily based on the C function printf. It's basically a way to concatenate strings and variables, applying formatting to the variables' values, in a way that can be more convenient than using the concatenation operator and individual functions to do the formatting operations.
- The basic idea is that in a call like string.format( "foo %s bar", var ), the "%s" will be replaced by the value of the string variable 'var'. "%d" (or "%i") works the same for integers. "%o" will format an integer in octal and "%x" in hexadecimal, e.g. string.format( "%o %x", 42, 42 ) results in "52 2a". "%e" will output a number in E notation, "%f" will format a number as a decimal with a fixed number of decimal places, and "%g" tries to be smart about choosing E notation versus decimal notation and trims trailing zeros in the decimal representation. For example, string.format( "%e %f %g", 42.5, 42.5, 42.5 ) results in "4.250000e+01 42.500000 42.5". "%c" takes an integer and outputs the corresponding character (as with string.char()). And "%%" is the escape in case you need a literal "%" character in your output.
- Then there are the flags, width, and precision specifiers, which go between the "%" and the conversion specifier character. For example, using the width specifier as in "%10s" is like "%s", but it will pad with spaces on the left if the string is less than 10 bytes. If you add the '-' flag, as in "%-10s", it will pad on the right instead. "%10d" and "%-10d" work similarly; for the numeric conversions, you can also use the 0 flag (e.g. "%010d") to pad with zeros on the left instead of spaces. Precision truncates strings and specifies the number of digits after the decimal for "%f" and the like: "%.10s" will cut off the input string at 10 bytes, and "%.3f" will round to thousandths.
- Lua leaves out a lot of the more complicated features in C's printf, though; the statements in the Scribunto Reference manual about things unsupported are for the benefit of people familiar with that so they can know what isn't available. And yes, someday I should probably make mw.ustring.format() handle %s and %c using Unicode characters rather than bytes. Hope this helps. Anomie (talk) 13:59, 20 June 2013 (UTC)
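The specifiers described above can be tried in any plain Lua interpreter, since string.format needs no Scribunto library. The following is only a sketch: the variable names and sample values are made up for illustration, and the results assume the byte-based behaviour of standard string.format.

```lua
-- Conversion specifiers
local octalHex    = string.format( "%o %x", 42, 42 )               -- octal, hexadecimal
local numberForms = string.format( "%e %f %g", 42.5, 42.5, 42.5 )  -- E notation, fixed, "smart"
local charA       = string.format( "%c", 65 )                      -- same as string.char( 65 )
local percent     = string.format( "100%%" )                       -- literal percent sign

-- Flags, width and precision
local paddedLeft  = string.format( "%10s", "abc" )                 -- spaces added on the left
local paddedRight = string.format( "%-10s", "abc" )                -- spaces added on the right
local zeroPadded  = string.format( "%010d", 42 )                   -- zeros added on the left
local truncated   = string.format( "%.10s", "abcdefghijklmnop" )   -- cut off at 10 bytes
local rounded     = string.format( "%.3f", 3.14159 )               -- rounded to thousandths
local combined    = string.format( "%04.1f", 3.14159 )             -- width 4, zero flag, 1 decimal

print( octalHex, numberForms, charA, percent )
print( "[" .. paddedLeft .. "]", "[" .. paddedRight .. "]", zeroPadded )
print( truncated, rounded, combined )
```

Wrapping the padded results in brackets when printing makes the added spaces visible.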
- Thank you Anomie and Paweł! Those explanations are both really helpful. As I suspected, this function looks really useful, and I will try it out right now. :) — Mr. Stradivarius ♪ talk ♪ 01:44, 22 June 2013 (UTC)
Getting an Error with Italics
I am running a FreeBSD box and installed the latest Lua port. I have been trying to get some of the wiki templates to work (specifically the SCOTUS one), but I keep getting "Script Error" all over the page. When I click on the link I get the following:
Lua error in Module:Citation/CS1 at line 186: attempt to index field 'text' (a nil value).
Backtrace:
(tail call): ?
Module:Citation/CS1:186: in function "internallinkid"
Module:Citation/CS1:591: in function "buildidlist"
Module:Citation/CS1:1318: in function "buildidlist"
(tail call): ?
mw.lua:463: ?
(tail call): ?
[C]: in function "xpcall"
MWServer.lua:73: in function "handleCall"
MWServer.lua:266: in function "dispatch"
MWServer.lua:33: in function "execute"
mw_main.lua:7: in main chunk
[C]: ?
Can anyone give me a clue as to what the problem might be? I tried removing and installing the latest Parser and that didn't work either. It seems like it keeps throwing nil or null values and I'm not sure why. Help!
- Hi there. It sounds like your version of Scribunto doesn't have the mw.text library enabled. Try downloading the latest development version rather than the latest stable version, and see if that fixes your problem. Best — Mr. Stradivarius ♪ talk ♪ 21:51, 2 July 2013 (UTC)
- Thanks, I'll do that this afternoon when I get a chance. I'm currently running version 1.21, but I noticed when looking at the snapshots that there is a master development version as well. I'll use that one and see what happens.
- OK, that seems to have solved the script error issue, but now the formatting is all messed up. Do I need to download a new CSS file for this? [UPDATE] I deleted the CSS file and reloaded the copy from MediaWiki, and that made no difference. The formatting is still all messed up.
- When you say that the formatting is all messed up, what do you mean? You'll need to give a bit more detailed description than that. Do you have an error message or a screenshot that you can give us? — Mr. Stradivarius ♪ talk ♪ 09:04, 4 July 2013 (UTC)
- I'm having a problem pasting a screen shot, here is a link to the page:
http://w[removeME]ww.i[RemoveMe]cce-t.n[RemoveMe]et/index.php/Roe_v._Wade I tried reloading the CSS file and that didn't make any difference, I'm just not sure what is screwed up now.
- I guess nobody knows...
- Still looking for some help with this one.
- This doesn't look like a Scribunto problem. Instead, it is most likely that you are missing some necessary templates and/or modules. I think I managed to track the problem down to your wiki not having Template:Color - try uploading it and see if that makes things better. At any rate, things look much saner if you remove the "SCOTUS" parameter from the infobox in the page you linked. These kinds of problems are bound to crop up if you are relying on Wikipedia's templates, as they have been developed incrementally over 12 years and are, frankly, a big ol' mess. It's very easy to mess up a page because of missing dependencies, but then if you try and use all of Wikipedia's templates then things might start getting pretty slow. You'll have to experiment to find the best balance for you. — Mr. Stradivarius ♪ talk ♪ 06:51, 15 July 2013 (UTC)
mw.title.compare
I have a quick question about mw.title.compare - what does it mean for a title to be less than, equal to, or greater than another title? Is it purely to do with the title length, or does it query the title text in some way? — Mr. Stradivarius ♪ talk ♪ 09:07, 4 July 2013 (UTC)
- When I invoke the Lua code mw.title.compare( 'Automatik', 'Botomatika' ), it returns 0. It also returns 0 when I invoke mw.title.compare( 'Automatik', 'Botomatik' ), so I don't understand it either. Automatik (talk) 10:39, 4 July 2013 (UTC)
- mw.title.compare compares two title objects; passing strings isn't going to work. It just compares the titles by .interwiki, .namespace, and .text. You can see the Lua code behind it at [5]. Anomie (talk) 13:00, 5 July 2013 (UTC)
- Thank you so much, I understand now. Automatik (talk) 01:19, 8 July 2013 (UTC)
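To make the ordering described above concrete, here is a rough plain-Lua sketch of a field-by-field comparison. It is not the library's actual code (that is what the link at [5] points to); compareTitles is a hypothetical helper, and plain tables stand in for real title objects so that it runs outside Scribunto.

```lua
-- Hypothetical stand-in for mw.title.compare: compares by interwiki,
-- then namespace, then text, returning -1, 0 or 1.
local function compareTitles( a, b )
	if a.interwiki ~= b.interwiki then
		return a.interwiki < b.interwiki and -1 or 1
	end
	if a.namespace ~= b.namespace then
		return a.namespace < b.namespace and -1 or 1
	end
	if a.text ~= b.text then
		return a.text < b.text and -1 or 1
	end
	return 0
end

-- Plain tables carrying only the three fields that matter here
local automatik = { interwiki = "", namespace = 0, text = "Automatik" }
local botomatik = { interwiki = "", namespace = 0, text = "Botomatik" }
local talkPage  = { interwiki = "", namespace = 1, text = "Automatik" }

print( compareTitles( automatik, botomatik ) )  -- text decides
print( compareTitles( talkPage, botomatik ) )   -- namespace decides before text
print( compareTitles( automatik, automatik ) )  -- identical fields
```

The length of the title plays no role in such a comparison; only the ordering of the three fields does.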
- The "mw.title" library has another bug: it is not idempotent when it returns page instances that are exactly the same page.
- For example, when starting from a valid title object, reading the talkpage property, and from it getting the subjectpage property, we get a new title object that is NOT the same object, even if it compares equal.
- This has a very bad effect: it generates infinite loops when traversing the content of objects, because we cannot know that we have already visited an object (it is not possible to check that by comparing them for equality, because not all objects are comparable this way, notably when traversing polymorphic data structures).
- Clearly, "mw.title" should maintain a cache of title instances that have already been returned instead of creating new ones: "mw.title.new()" needs to implement a true caching factory instead of always creating a new distinct array! The index of this cache is the full title string (including interwikis, namespace, base page, subpages, query parameters, and fragment). The cache will then be used to store other expensive properties: page existence, ID, content length, file content properties or metadata... Non-costly properties (such as URL parsing) don't absolutely need to be cached in title instances, when the "mw.uri" module can process full title strings.
- Note that a new array for the same title is returned EVEN if its (expensive) ID property was already known, such as the current page ID:
local seen = {}; local t = mw.title.getCurrentTitle(); seen[t] = true; mw.log( tostring( t ) ); local u = t.talkPageTitle.subjectPageTitle; mw.log( tostring( u ), seen[u] == true )
unexpectedly displays "false" for the same title (even if it says that "t==u" is true and if t.id and u.id are also equal). Verdy p (talk) 20:15, 23 April 2015 (UTC)
- This rant is entirely unrelated to the existing discussion in this section. Bug reports belong in Phabricator, but if you do please include a reason why anyone would want to be recursively traversing title objects in this way. Also note that a cache isn't quite so simple when it needs to avoid T67258. And it might well be considered a bug that two calls to mw.title.new return the same object rather than two objects referring to the same title, as much as you're considering it a bug that they don't. Anomie (talk) 22:30, 23 April 2015 (UTC)
- Recursive traversal of objects is performed by various modules; some of them are for debugging purposes only (dumping the content of a variable), others are used in data transforms to generate structured data (e.g. in JSON or XML or HTML tables). So yes, the non-idempotence is a problem (a bug in the mw.title module IMHO, and this is not "rambling" but something that causes these modules to enter infinite recursion loops, using lots of memory and finally failing without any useful output except a server-side error). Verdy p (talk) 23:40, 26 April 2015 (UTC)
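The behaviour discussed in this thread follows from how Lua tables work in general: table keys are looked up by raw identity, while == may go through an __eq metamethod. The sketch below shows this in plain Lua. newTitle and its fullText field are hypothetical stand-ins, not the mw.title implementation, and keying the "seen" set by a string is just one possible workaround for a traversal.

```lua
-- Hypothetical stand-in for a title object: separate tables describing the
-- same page, made to compare equal through a shared __eq metamethod.
local titleMeta = {
	__eq = function ( a, b ) return a.fullText == b.fullText end,
	__tostring = function ( t ) return t.fullText end,
}

local function newTitle( fullText )
	return setmetatable( { fullText = fullText }, titleMeta )
end

local t = newTitle( "Talk:Example" )
local u = newTitle( "Talk:Example" )

-- Keyed by object: the lookup uses raw identity and ignores __eq
local seenByObject = {}
seenByObject[t] = true
local equalObjects  = ( t == u )         -- true, through __eq
local foundByObject = seenByObject[u]    -- nil, u is a different table

-- Keyed by a string derived from the object: both objects map to one entry
local seenByName = {}
seenByName[tostring( t )] = true
local foundByName = seenByName[tostring( u )]  -- true

print( equalObjects, foundByObject, foundByName )
```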
Text library
Is the mw.text library Unicode-safe? The manual says that it adds functions missing from mw.string, which is not Unicode-safe.--Moroboshi (talk) 09:10, 6 July 2013 (UTC)
- Fixed. Anomie (talk) 13:45, 8 July 2013 (UTC)
- Thanks! --Moroboshi (talk) 06:00, 9 July 2013 (UTC)
mw.site.namespaces
Hi,
Does anyone know what the defaultContentModel in the mw.site.namespaces table is? Thanks in advance, Automatik (talk) 01:36, 8 July 2013 (UTC)
- With ContentHandler, each namespace has a default content model; often this is "wikitext", but this can be changed. For example, on wikidata.org the default content model for the main namespace is "wikibase-item". Anomie (talk) 13:47, 8 July 2013 (UTC)
- Thank you. Automatik (talk) 18:13, 8 July 2013 (UTC)
check if a page exist on another project
I'm trying to use mw.title.new to check whether a page exists on a project other than the one where I run the Lua code, but I get id=0 for every page I try, even if the page exists. Is it possible to check the existence of a page on another project?--Moroboshi (talk) 21:13, 18 July 2013 (UTC)
- No, this is not possible. Anomie (talk) 13:23, 19 July 2013 (UTC)
Format text/plain is not supported
Hi,
I've tried to import an XML file from wikipedia.org. At the end of the import, I get this error related to Scribunto:
- Import failed: Format text/plain is not supported for content model Scribunto
Do you have any idea what causes it? --Fractaliste (talk) 13:21, 23 August 2013 (UTC)
- Chances are you are using an old version of Scribunto on your local wiki, see bug 51504 for discussion of a similar issue, and bug 45750 for the more general issue. Anomie (talk) 13:16, 26 August 2013 (UTC)
- I am using the latest version of Scribunto, but I made it work by manually replacing text/plain with "CONTENT_FORMAT_TEXT".
--Fractaliste (talk) 12:24, 2 September 2013 (UTC)
Could you explain to me how you manually edited text/plain?
frame:expandTemplate
frame:expandTemplate seems not to work properly. While frame:expandTemplate{ title = someTemplate, args = { someTable[1], someTable[2] } } works as expected, frame:expandTemplate{ title = someTemplate, args = someTable } doesn't, calling someTemplate without passing it any parameters. —The preceding unsigned comment was added by 178.72.109.160 (talk • contribs) 00:50, 22 September 2013 (UTC)
- I just tried it at the English Wikipedia and it worked fine for me. Can you give us any details about your specific situation? — Mr. Stradivarius ♪ talk ♪ 11:05, 23 September 2013 (UTC)
- Note that if someTable was loaded by mw.loadData, this will be fixed by gerrit:84985. Anomie (talk) 13:31, 23 September 2013 (UTC)
Title object question
If we get a title object using local title = mw.title.getCurrentTitle(), and then get the talk page title object using local talkTitle = title:talkPageTitle(), is the expensive function count incremented? It seems like it would be, but this is not obvious from the manual. And is this also true for basePageTitle, rootPageTitle, etc.? It would be good to know this to know which methods I should be calling with pcall. Thanks. :) — Mr. Stradivarius ♪ talk ♪ 10:59, 23 September 2013 (UTC)
- All of these create a new mw.title object, so yes. As noted in the manual, these are equivalent to calling mw.title.makeTitle with the appropriate parameters, and makeTitle increments the expensive function count. Anomie (talk) 13:34, 23 September 2013 (UTC)
- It's a bit late, but I've added "this is expensive" to the relevant title methods and properties. I know this is redundant to the notice in the makeTitle documentation, but I thought that it was probably better to be clear than to avoid redundancy. — Mr. Stradivarius ♪ talk ♪ 04:31, 22 October 2013 (UTC)
mw.text.trim error
Hello. I'm trying to install some modules from WP on my personal wiki using the Scribunto extension. Most of them fail, mainly because mw.text.trim returns an error "attempt to index field 'text' (a nil value).". I get this, for example, in the debug console if I type =mw.text.trim(' some string '). Is there any explanation why it fails? Phcalle (talk) 12:15, 27 September 2013 (UTC)
- You are using too old of a version of the Scribunto extension. In particular, the version marked as being for MediaWiki 1.21 is too old. See Extension talk:Scribunto for a history of other people discussing this same question. Anomie (talk) 13:56, 27 September 2013 (UTC)
Control structures "for"
step has a default value, so the line
if not ( var and limit and step ) then error() end
probably should become:
step = step or 1
if not ( var and limit ) then error() end
Rical (talk) 15:40, 16 October 2013 (UTC)
- The problem is that for i = 1, 5, nil do ... end and for i = 1, 5, false do ... end both raise an error. The step value must be omitted entirely or given a numeric value, so it's not a real "default" value. Anomie (talk) 13:14, 17 October 2013 (UTC)
- Enforcing the value 1 in these cases could be misunderstood and confusing for users. Keeping the errors seems a good way to help the user write better code. Then we could write "A step value of false or nil raises an error; otherwise the default value is 1.". --Rical (talk) 18:05, 17 October 2013 (UTC)
«But if exp3 is nil, the for doesn't work.»: In this case, can we "break" the for and continue after end, to handle this strange case without an error? "A nil value does nothing" seems normal. --Rical (talk) 02:05, 26 January 2014 (UTC)
- We are not going to rewrite Lua's handling of for loops, if that's what you're asking. Anomie (talk) 15:36, 27 January 2014 (UTC)
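The difference between an omitted step and an explicit nil or false step can be checked in plain Lua with pcall. This is a minimal sketch; sumWithStep and sumDefault are names made up for the illustration.

```lua
-- Sums 1..5 using whatever step is passed in, including nil or false
local function sumWithStep( step )
	local total = 0
	for i = 1, 5, step do
		total = total + i
	end
	return total
end

-- Same loop with the step omitted entirely, so Lua uses 1
local function sumDefault()
	local total = 0
	for i = 1, 5 do
		total = total + i
	end
	return total
end

local okNil, errNil = pcall( sumWithStep, nil )    -- raises an error
local okFalse       = pcall( sumWithStep, false )  -- raises an error
local okTwo, sumTwo = pcall( sumWithStep, 2 )      -- 1 + 3 + 5
local defaultSum    = sumDefault()                 -- 1 + 2 + 3 + 4 + 5

print( okNil, errNil )
print( okFalse, okTwo, sumTwo, defaultSum )
```

So the value 1 is only used when the third expression is left out of the source code, never as a substitute for a non-numeric value.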
Scope of variable example
The code example for "Function declarations" is not clear. Somehow it leaves out simple function naming, and introduces mw.log out of the blue. If it wants to illustrate a scope, there only should be some nesting right? I know about scope, but this example does not explain to me what it is in Lua. -DePiep (talk) 10:23, 18 November 2013 (UTC)
- Do you have a suggestion on what to replace it with? Anomie (talk) 14:18, 18 November 2013 (UTC)
Insertion of excessive detail in Language module documentation
@Verdy p: Seriously, I doubt most Scribunto users care at all about the minutiae of IETF language tags, BCP47, and how MediaWiki's language codes aren't exactly the same thing. And we certainly don't need big warnings all over the place to "warn" people that they aren't the same thing. Please don't revert again. If you have suggestions for actual improvement to the existing documentation beyond what I already incorporated from your original edit, please propose them here so we can discuss it. Thanks. Anomie (talk) 14:14, 18 November 2013 (UTC)
- This is a reference. One would assume that it contains information that all users care about.
- It certainly should be mentioned that “language code” here is not what one would assume. A standard language code is usable in HTML lang attributes, while the non-standard strings returned by the Language Library can invalidate HTML.[6]
- A reasonable user who knows what “language code” means in the rest of the world would not bother following a link to “Language codes are described at Language code.” There should definitely be a warning evident in this reference that reasonable assumptions are incorrect. —Michael Z. 2013-11-18 16:35 z
- I don't understand the revert (my edits had been accepted previously by others than you, Anomie); these were absolutely not complaints or excessive details about what these functions do or why these codes are not valid BCP47 codes, which are required by the HTML standard. We still need a way to provide correct BCP47 codes even if MediaWiki (in fact Wikimedia sites) uses specific codes (in fact they are subdomain names for projects, not really language codes, but these codes have slipped into other projects that now use them incorrectly; for example you can see "simple" being used in OpenStreetMap tags such as "name:simple=*" only because they have been borrowed from Wikipedia or now Wikidata using these bogus codes which are strictly internal to Wikimedia projects).
- These codes should never have been in MediaWiki itself, and everything should be done to deprecate them fast (this should also concern the translations of MediaWiki on translate.net). This legacy inheritance from Wikimedia projects also includes correctly explaining why the Language library does not validate these codes and why there are 4 functions but still none of them is able to support standard language codes; these functions also have differences that are NOT explained in the doc: my addition added these for reference. We still have a missing function for validating BCP47 codes (but only in PHP, not in the Scribunto Lua module).
- These are not details. Conformance to standards is a legitimate goal that users assume, even on Wikimedia sites, but more importantly when other sites use MediaWiki.
- Maybe you don't care about this, Anomie, but many users, even beginners, need to understand what these functions do, and more importantly what they don't do and what their differences are.
- Verdy p (talk) 23:30, 18 November 2013 (UTC)
- You appear to still be under the misapprehension that these functions are supposed to be dealing with IETF language codes. They're not. This is made clear in the documentation. There is no need to further belabor the point, just like we don't need big warnings on every screwdriver to say "This is not a hammer!". Anomie (talk) 14:07, 19 November 2013 (UTC)
- Except that in this case this is neither a screwdriver or a hammer. This is something else (undocumented).
- Lack of documentation is a problem. But ONLY you consider these to be details (other people were grateful and thanked me for exhibiting these differences). You continue to use and propagate a bad culture of unexplained assumptions, of keeping things secret or undocumented, or of assuming that people have the same interests as you. In fact most of your work in this documentation is just crap, and adding or perpetuating confusion. All is done here to ensure that you'll play the role of a God for this module, when Wikimedia wants more collaboration (everyone is replaceable, including you).
- Consider getting yourself outside this documentation that you don't want to maintain (or are too lazy to update) when there are other people who want to develop it in a better way. You've made many people lose a lot of time because of your bad work or blind reverts, rejecting all updates to this documentation even if they were justified.
- If you don't retire here, Wikimedians will create another, better documentation elsewhere (in Meta, or Commons, or English Wikipedia, or Wikibooks) and will stop referencing this crap page that you have severely impacted with lies and bad assumptions everywhere, and that you refuse to correct and for which you don't want anyone else to contribute. Verdy p (talk) 19:44, 23 April 2015 (UTC)
- What, specifically, do you think is actually missing from the documentation? Everything here that's actually relevant rather than a personal attack is already in the documentation, just not in the extremely verbose style you personally think is needed.
- If you continue the personal attacks and other trolling, you may be blocked from editing this wiki. Anomie (talk) 22:34, 23 April 2015 (UTC)
- You are still refusing to recognize bugs where they are, and missing documentation where it is important, and ignoring the time lost by module developers who experience unexpected errors. This is not a personal attack but a general remark about this page, which is not intended to cover only your own view but is intended to be used by all Wikimedians.
- All current limitations and bugs have to be listed, even if they may be solved later, and there should be a way to know if these will be solved or left as is (in which case we'll use something else because we know that there will be no maintenance). These considerations are for general audience, but you only see your own immediate interest. Verdy p (talk) 23:45, 26 April 2015 (UTC)
frame.args
How can I check if frame.args is empty, since next (frame.args) is always nil? —The preceding unsigned comment was added by Alex Mashin (talk • contribs) 15:59, 2 January 2014 (UTC)
- Chances are you don't actually need to do this, just check for the args you care about. But if you really do need it, try something like this:
local nextfunc, static, cur = pairs( frame.args )
if nextfunc( static, cur ) == nil then
-- frame.args is empty
end
- Doing this will cause all the args (if any were passed) to be parsed, even if you're not going to use them otherwise. Anomie (talk) 14:37, 3 January 2014 (UTC)
- Yes, I need to do this: I often create Lua functions with dynamic (i.e. unpredictable) parameters; and in this particular case, I wanted my Lua function to take all parameters from the encapsulating template call when the function is invoked without arguments.
Thank you, your code worked.
Alex Mashin (talk) 14:48, 2 February 2014 (UTC)
getContent returns the unparsed content, but how to return the interpreted one?
I've just written b:fr:Module:Version imprimable to create the printable versions of all Wikibooks with a single line. However, the book pages are full of templates and these aren't displayed, e.g. b:fr:Programmation_XML/Version_imprimable.
b:pt:Módulo:Book seems to do it, but I don't understand the trick. Could someone explain it, please? JackPotte (talk) 12:19, 3 February 2014 (UTC)
- Hi JackPotte! On b:pt:Módulo:Book the trick is in the line frame:expandTemplate{ title = ':' .. chapter }. I'm treating the page as if it was a template, and "transcluding" it in the printable version. Helder 12:31, 3 February 2014 (UTC)
- Thank you, I'm going to try to call the pages as templates (even if they aren't in their namespace). JackPotte (talk) 12:37, 3 February 2014 (UTC)
frame:preprocess
On it.wiki there is a Lua function that used the mw.message:text method; I have now switched it to use frame:preprocess. I would like to know whether frame:preprocess has the same potential problem as mw.message:text and could be removed in the future.--Moroboshi (talk) 11:39, 19 February 2014 (UTC)
- frame:preprocess is good. The problem with the mw.message methods was that MediaWiki's MessageCache class makes its own instance of the parser and processes 'text', 'parse', and so on using that separate instance. So any categories, links, and such coming from the message were recorded on that separate parser instance and not on the parser instance that is being used to parse the actual page. frame:preprocess, on the other hand, uses the same parser instance that is being used to parse the page and therefore all the categories, links, and so on get recorded in the right place. Anomie (talk) 13:55, 19 February 2014 (UTC)
- thanks for the explanation.--Moroboshi (talk) 12:34, 20 February 2014 (UTC)
getContentLanguage()
On Wikimedia Commons the whole interface is translated if the user sets a language code in his/her preferences. However, wgPageContentLanguage will always be 'en', so within the scope of Commons it would be very useful to determine the language a user has set in their preferences, to translate Lua module output accordingly. getContentLanguage does not seem useful for this, as it just returns the default content language of the wiki, not the user's. How do we read the language actually used by the user? This would be the one to care about if a module should translate its messages, or am I fundamentally mistaken here? mw.message seems to fit only system messages and cannot be extended to include module-owned translation tables, as far as I understand it. Thanks for comments and recommendations. --Cmuelle8 (talk) 03:47, 10 March 2014 (UTC)
- I've found some code in Module:Languages on commons, but I doubt this should do in the long run, should it? I have not tried it, but it looks ugly, since a normal template is preprocessed - just to get the users language..
local userlang = frame:preprocess( '{{int:lang}}' )
local fallback, contentlang = mw.text.split( userlang, '-', true )[1], mw.language.getContentLanguage():getCode()
- The problem here is that Lua == content language, because it is server side and generates content. The trick above also really shouldn't be used. Content gets cached, and the above trick would mean a German user could get Chinese content. For now, Commons will have to keep using JavaScript. TheDJ (talk) 11:37, 10 March 2014 (UTC)
- Yes, the "int:lang" thing Commons uses all over the place is a bit of a hack, and it's not going to be supported in Scribunto other than by using the hack as above. I haven't actually tested it, but using int: via frame:preprocess shouldn't be any worse than using it in wikitext (i.e. it should fragment the cache based on user language, but shouldn't cause issues like bug 14404). Anomie (talk) 14:31, 10 March 2014 (UTC)
Table description
About the table definition (mw:Extension:Scribunto/Lua_reference_manual#table). The example code now has
-- Create table
t = {}
t["foo"] = "foo"
t.bar = "bar"
t[12] = "the number twelve"
t["12"] = "the string twelve"
I understand that it is not essential that the key and value are alike. Also accepted are:
t["foo"] = "bar1"
t.bar = "bar2"
t[12] = "the number twelve"
t["12"] = 12
This is to improve (my understanding of) the documentation. -DePiep (talk) 18:54, 6 April 2014 (UTC)
- Just now I had to try 2n tests to relearn how numbers can and cannot be defined in a table. Great. -DePiep (talk) 20:26, 12 June 2014 (UTC)
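The variant assignments proposed above can be checked in plain Lua. The sketch below only illustrates general Lua table behaviour (a number key and a string key are distinct, and t.name is shorthand for t["name"]); the counting loop is added purely for the demonstration.

```lua
local t = {}
t["foo"] = "bar1"            -- key and value need not be alike
t.bar = "bar2"               -- t.bar is shorthand for t["bar"]
t[12] = "the number twelve"  -- number key
t["12"] = 12                 -- string key, a separate entry from t[12]

-- Count the entries: the number 12 and the string "12" do not collide
local count = 0
for _ in pairs( t ) do
	count = count + 1
end

print( t.foo, t["bar"], t[12], t["12"], count )
print( t[tonumber( "12" )] )  -- converting the string reaches the number key
```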
Examples
Sigh. I count 239 functions and not a single example. What am I supposed to know beforehand? Is there a secret class to follow? -DePiep (talk) 10:01, 7 April 2014 (UTC)
- This is a reference manual, not a textbook. While a textbook would be an excellent idea, this shouldn't be it. Anomie (talk) 13:26, 7 April 2014 (UTC)
How to use "next" ?
It is not so easy to understand how to use next. I used it on a local table with keys [1] and ["1"]. Could you add the right use: "for k, v in next, table"? --Rical (talk) 09:10, 14 April 2014 (UTC)
- In a for loop, you'd probably want to use "for k, v in pairs( table )" instead of trying to use 'next' directly.
- To use next, you call it the first time as "k = next( table, nil )" (or just "k = next( table )") and then subsequent times as "k = next( table, k )". When the returned key is nil, you've reached the end of the table. Anomie (talk) 13:26, 14 April 2014 (UTC)
- If we use an ipairs loop, then a pairs loop, we must duplicate all the processing inside, with only a very small difference that is difficult to locate and to maintain. If we use next, we write the processing only once and the small difference becomes explicit and easy to maintain.
- The function next exists and is useful in some cases, anywhere, not necessarily for reading arguments. Like all others, it deserves a full description. --Rical (talk) 00:13, 15 April 2014 (UTC)
- Functionality-wise and absent a __pairs metamethod, there is no difference between for k, v in pairs( table ) and for k, v in next, table. So I'm not following what you're trying to get at here. Anomie (talk) 13:22, 15 April 2014 (UTC)
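The three ways of walking a table mentioned in this thread can be compared side by side in plain Lua. This is a small sketch using a made-up table with the keys [1] and ["1"] from the question; it assumes an ordinary table with no __pairs metamethod.

```lua
local data = { [1] = "number one", ["1"] = "string one", x = "ex" }

-- 1. Calling next by hand: start with no key, stop when the key is nil
local viaNext = {}
local k, v = next( data )  -- same as next( data, nil )
while k ~= nil do
	viaNext[k] = v
	k, v = next( data, k )
end

-- 2. Handing next to the generic for
local viaGenericFor = {}
for key, value in next, data do
	viaGenericFor[key] = value
end

-- 3. The usual pairs loop
local viaPairs = {}
for key, value in pairs( data ) do
	viaPairs[key] = value
end

-- next on an empty table returns nil straight away
local emptyIsEmpty = ( next( {} ) == nil )

print( viaNext[1], viaGenericFor["1"], viaPairs.x, emptyIsEmpty )
```

All three copies end up with the same three entries; the order in which the entries are visited is not specified.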
Using a pattern capture
See Extension:Scribunto/Lua_reference_manual#Captures. Captures does not describe how to reuse a numbered capture (try $1 or %1?). -DePiep (talk) 17:01, 13 April 2014 (UTC)
- What, like "(['\"]).-%1"? This is described slightly earlier in the document under "Pattern items": "%n, for n between 1 and 9; such item matches a substring equal to the n-th captured string (see below)". Anomie (talk) 13:23, 14 April 2014 (UTC)
- So my statement is confirmed, thank you. Now how can we improve the documentation? -DePiep (talk) 19:15, 17 April 2014 (UTC)
- Great. So you say: to understand the documentation, read its talkpage. That is where the links are! -DePiep (talk) 20:35, 12 June 2014 (UTC)
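For readers landing here, a short plain-Lua sketch of reusing a numbered capture follows, both inside a pattern (the "(['\"]).-%1" pattern quoted above) and in a gsub replacement string. The sample strings and variable names are made up for illustration.

```lua
local text = [[He said "it's fine" today]]

-- %1 inside the pattern must match the same quote character that capture 1
-- matched, so the apostrophe in the middle does not end the match.
local first, last, quote = string.find( text, "(['\"]).-%1" )
local quoted = string.sub( text, first, last )

-- With an outer capture around the whole thing, the quote becomes capture 2
local whole, innerQuote = string.match( text, "((['\"]).-%2)" )

-- %1 and %2 in a gsub replacement string insert the captured substrings
local swapped = string.gsub( "hello world", "(%w+) (%w+)", "%2 %1" )

print( quoted, quote, whole, innerQuote, swapped )
```

Lua uses %1 rather than the $1 of some other regular expression dialects, both in patterns and in replacement strings.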
Lua error: too many language codes requested.
According to the documentation Lua calls are preferable to callParserFunction calls. So I was trying to replace
mw.getCurrentFrame():callParserFunction( "#time", { dFormat, timeStamp, lang } )
with
mw.language.new(lang):formatDate( dFormat, timeStamp)
Both work just fine for individual examples, but I run into "Lua error: too many language codes requested." on my c:Module talk:Date/sandbox/testcases test page. Any way to avoid that error, other than not using formatDate? --Jarekt (talk) 15:53, 16 June 2014 (UTC)
- Split the page up so it doesn't try to use more than 20 language codes per page. Anomie (talk) 13:04, 17 June 2014 (UTC)
work with media files
I am wondering if it is possible to work with media files using Lua.
My situation:
I have the URL to a video, that I can get to look nice using wikitext via Lua:
[[File:myUrl|thumb]]
But I do not want to have an embedded player, I want to show the respective thumbnail and define my own link target. Unfortunately I am not aware of how I can access this thumbnail.
For a quick-n-dirty solution I tried to use frame:preprocess in order to extract the image path with string magic, but preprocess does not process such a file expression.
How can I get the path of the thumbnail of a video? Is there a Lua library I am not aware of that handles media files? Any other ideas how to do that?
Greetings --Sebschlicht (talk) 16:53, 16 July 2014 (UTC)
- It's currently not possible to do that. Jackmcbarn (talk) 03:22, 17 July 2014 (UTC)
Frame object in called modules
I am trying to make a function in Wikidata that would retrieve the source of a claim and show it in the references. But the function is not called directly from the frame, and the caller function does not send the frame as an argument either. Am I right that there is no way I can use "ref" tags then? --Zolo (talk) 12:33, 28 August 2014 (UTC)
- If you don't want to pass the frame through as an argument to the function, you can just get the current frame by using mw.getCurrentFrame. Mr. Stradivarius on tour (talk) 14:18, 28 August 2014 (UTC)
- Thanks ! I had somehow missed that :)/ -Zolo (talk) 20:51, 28 August 2014 (UTC)
Fullwidth hex digits: defined by Unicode
About Ustring_patterns. For the set %x it now says: "%x: adds fullwidth character versions of the hex digits."
This is correct, but it can be described more normatively and based on Unicode (as the other bullets in that section are). See en:Unicode_character_property#Hexadecimal_digits. Clearly, there are two character properties to be used:
- ASCII_Hex_Digit=Yes: all ASCII hex digits (A-F, a-f, 0-9; 22 total)
- Hex_Digit=Yes: all ASCII hex digits plus all fullwidth hex digits (22 + 22)
Describing the %x set using these properties bases it on a sound Unicode definition. Outside of this documentation change, there are no material effects as far as I can see. -DePiep (talk) 19:17, 17 September 2014 (UTC)
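To make the two properties concrete, here is a small plain-Lua sketch that lists the code point ranges behind each of them. It only illustrates the Unicode definitions; it is not meant to show how mw.ustring implements %x, and all function names are made up for the example.

```lua
-- Inclusive code point ranges for each property.
local ASCII_HEX_RANGES = {
    { 0x0030, 0x0039 }, -- 0-9
    { 0x0041, 0x0046 }, -- A-F
    { 0x0061, 0x0066 }, -- a-f
}
local FULLWIDTH_HEX_RANGES = {
    { 0xFF10, 0xFF19 }, -- fullwidth 0-9
    { 0xFF21, 0xFF26 }, -- fullwidth A-F
    { 0xFF41, 0xFF46 }, -- fullwidth a-f
}

local function inRanges(codepoint, ranges)
    for _, range in ipairs(ranges) do
        if codepoint >= range[1] and codepoint <= range[2] then
            return true
        end
    end
    return false
end

-- Total number of code points covered by a list of ranges.
local function countRanges(ranges)
    local total = 0
    for _, range in ipairs(ranges) do
        total = total + (range[2] - range[1] + 1)
    end
    return total
end

-- ASCII_Hex_Digit=Yes
local function isAsciiHexDigit(codepoint)
    return inRanges(codepoint, ASCII_HEX_RANGES)
end

-- Hex_Digit=Yes: the ASCII hex digits plus their fullwidth versions
local function isHexDigit(codepoint)
    return isAsciiHexDigit(codepoint) or inRanges(codepoint, FULLWIDTH_HEX_RANGES)
end
```

With these tables, countRanges gives 22 for each list, which matches the "22 + 22" in the post above.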
Something is broken in mw.string module
There is a problem with Lua pattern matching on pl wiki:
- "{{#invoke:string|match|432-440|[^%s%d%-–]}}" → "String Module Error: Match not found"
- "{{#invoke:string|match|432-440|[^%s0-9%-–]}}" → "String Module Error: Match not found"
Both calls are expected to return a "match not found" error, which is what happens here, but not on pl wiki (LuaSandbox 2.0-7; Lua 5.1.5). The problem can be observed here. Unfortunately, on pl wiki the first match returns the incorrect value "4". Paweł Ziemian (talk) 17:40, 9 October 2014 (UTC)
- I checked a few other wikis (en, de, ru, test, test2) and they have an older version (LuaSandbox 1.9-1; Lua 5.1.4). Is this version upgrade on pl wiki intentional? Paweł Ziemian (talk) 18:02, 9 October 2014 (UTC)
- It appears that there's a bug in HHVM and PCRE; see some discussion at around 01:30 in this IRC log. I don't know whether the bug has been filed anywhere yet.
- As for the differing versions of LuaSandbox, apparently the 2.0 version was only compiled and installed for HHVM and not Zend for some reason. If you have the HHVM beta feature enabled on a wiki you'll see 2.0, otherwise 1.9. Anomie (talk) 13:15, 10 October 2014 (UTC)
- Thanks for the explanation. Today I see that both results are correct. Paweł Ziemian (talk) 17:57, 10 October 2014 (UTC)
mw.text.split
It would be nice if this also returned the count of splits, e.g.
- value, count = mw.text.split(str, " ")
Otherwise a second line is needed to get count eg.
- novalue, count = mw.ustring.gsub(str, "%S+", "")
Typically functions return the number of times they performed an action. -- 71.114.106.88 23:57, 12 October 2014 (UTC)
- Did you try getting the number of elements in the returned table? I think there is a relation between them, that is, (number of splits) + 1 = (number of items). Paweł Ziemian (talk) 15:08, 13 October 2014 (UTC)
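A minimal sketch of that suggestion, assuming the code runs inside Scribunto: the wrapper name splitWithCount is hypothetical, while mw.text.split and the length operator # are the only real pieces used.

```lua
-- Hypothetical wrapper: returns the pieces and how many there are.
local function splitWithCount(str, separator)
    -- The third argument (plain = true) treats the separator as literal text.
    local parts = mw.text.split(str, separator, true)
    -- #parts is the number of items; the number of splits is #parts - 1.
    return parts, #parts
end
```

For example, splitting "a b c" on a single space gives three items, so two splits were performed.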
Additional libraries
How can I get (install) additional libraries in Scribunto, like this one: isbn? Jaider msg 22:37, 1 December 2014 (UTC)
- @Jaideraf: If they are pure Lua libraries, then you may be able to add them as modules in your wiki's module namespace, with some caveats: they must work under Lua 5.1, and if they use any of the standard Lua functions that have been changed or removed in Scribunto, they may need to be altered. Some pure-Lua libraries may not be able to work at all due to the differences from standard Lua. If that doesn't work, then you need to add them to Scribunto as Scribunto libraries. This isn't intended as a process for end users to follow - you would need to do the coding yourself, and either submit a patch here or maintain a private Scribunto repository. — Mr. Stradivarius ♪ talk ♪ 07:16, 12 December 2014 (UTC)
- Thank you @Mr. Stradivarius. I was able to add the ISBN library by putting the lua files as modules in the wiki's module namespace. It works great! Again, Thank you. Jaider msg 21:41, 17 December 2014 (UTC)
Expensive properties and methods in title objects
@Jackmcbarn: At the moment the properties/methods in #Title objects that return a new title object are marked as "this is expensive". This will need to be changed now that gerrit:178698 was merged, but I'm not sure what exactly it needs to be replaced with. I can see that the properties "id", "exists", "isRedirect" and "contentModel" need to be marked as expensive, but how about the methods, e.g. "getContent"? Do any of them increment the expensive function count as well? — Mr. Stradivarius ♪ talk ♪ 06:51, 12 December 2014 (UTC)
- @Mr. Stradivarius: The methods that return a new title are fixed now. I mentioned those 4 properties in the prose, since they work a little differently than the other expensive properties. No additional methods count as expensive. Jackmcbarn (talk) 16:18, 12 December 2014 (UTC)
- @Jackmcbarn: Thanks for that. About the other expensive properties - do they depend on the expensive data being fetched, or are they separate? In other words, say someone writes the code mw.title.new(' Foo ').protectionLevels and the "Foo" page hasn't been loaded previously. When protectionLevels is accessed, is the expensive function count incremented by one, for just the protectionLevels access, or is it incremented by two, for the normal expensive data and for the protectionLevels access? — Mr. Stradivarius ♪ talk ♪ 08:57, 13 December 2014 (UTC)
- Actually, ignore that - after I typed out that question I realised that I could test it myself. I see that for my example the expensive function count is only incremented by one, which is rather nice. I think we need a clearer way of indicating to people that exists etc. is expensive while still maintaining the distinction with protectionLevels. I'll have a go at doing that now. — Mr. Stradivarius ♪ talk ♪ 09:33, 13 December 2014 (UTC)
- Ok, docs are now updated. — Mr. Stradivarius ♪ talk ♪ 15:35, 14 December 2014 (UTC)
Splitting into subpages
The manual is now 172k in size, and it's slowly but surely getting bigger as more features are added to Scribunto. There may also be a couple of large jumps in size at some point relatively soon, as there are two new libraries in the works: getArgs and mw.math. I think it would make sense to split the manual up into subpages before it gets any bigger. For one thing, this would make the page a lot easier to read on mobile - splitting the page up would mean we could use more level two headings on the subpages, and level two headings get collapsed on mobile. At the moment, mobile users have to do a lot of scrolling to find the library that they want. I'm open to suggestions as to how to do this, but I'm thinking we should at least have a subpage for the Lua language, one for standard libraries, and one for mw libraries. — Mr. Stradivarius ♪ talk ♪ 14:12, 14 December 2014 (UTC)
- @Mr. Stradivarius: Good idea. I'd like to see all the MediaWiki-specific stuff kept here, and all the stuff copied from the official Lua reference manual moved to a subpage. Jackmcbarn (talk) 02:48, 16 December 2014 (UTC)
- I like how it is currently organized. Often use Ctrl+F to find something I am sure Lua or one of its MediaWiki extensions has. -- Rillke (talk) 13:41, 16 December 2014 (UTC)
- @Rillke: How about including a summary of what properties/methods are available on the main page, and having the main documentation on a subpage? — Mr. Stradivarius ♪ talk ♪ 06:11, 18 December 2014 (UTC)
- @Rillke: Or you might appreciate the recent German solution and its page source? w:de:Hilfe:Lua/* Developed the other way around. – Greetings --PerfektesChaos (talk) 22:56, 18 December 2014 (UTC)
mw.language.fetchLanguageName()
I recently did an experiment with mw.language.fetchLanguageName() where I gave it the code 'ara'. If one is to believe the documentation, the returned string should have been 'Arabic'. Instead, the returned string was 'ara'. When I gave mw.language.fetchLanguageName() the code 'lad', the returned string was 'Ladino', which is the correct language name for that ISO 639-3 code.
This behavior is inconsistent and contrary to the documentation. Similar results are obtained with {{#language:<code>|en}}, which see.
—Trappist the monk (talk) 14:54, 14 January 2015 (UTC)
- When I try mw.language.fetchLanguageName('ara'), I get the empty string rather than "ara". This is not contrary to the documentation: Language codes are described at Language code. Many of MediaWiki's language codes are similar to IETF language tags, but not all MediaWiki language codes are valid IETF tags or vice versa.
- IETF language tags do not include all ISO 639-3 codes (it uses ISO 639-1 codes where one exists, as 'ar' does for Arabic), and MediaWiki language codes do not include all IETF language tags. Anomie (talk) 14:42, 15 January 2015 (UTC)
- Right, I misspoke; I did the test in en:Module:Citation/CS1/sandbox, which returns the code if mw.language.fetchLanguageName() doesn't return a language name.
- Is there documentation someplace where the 'similarity' to IETF language tags is defined (and also dissimilarity)? If there is a standard (IETF), why not adopt the standard? If there are reasons that the standard has not / will not / should not be adopted, shouldn't those reasons be clearly and unambiguously documented?
- —Trappist the monk (talk) 17:02, 15 January 2015 (UTC)
- Both of you are wrong. "IETF tags" are in fact BCP47 tags (those described in a list of RFC's and maintained in the IANA database). BCP47 are the actual standard used in W3C standards (including HTML, XML, SVG), also used in other languages (including SGML), and international databases (such as CLDR). BCP47 tags are not just language codes, they are locale codes. But not all language codes are valid locale codes in BCP47. For example ISO 639 contains many language codes that are NOT valid locale codes, and also forgets many codes that have been dropped even if they are still valid and used in BCP47.
- In summary, don't look at ISO 639, it's not relevant at all (even ISO 639-1 only is irrelevant, it also contains codes that are invalid locale codes!). ISO 639 codes are NOT stable across time, they have lot of unsolved compatibility problems and have in fact never been developed to be used as locale codes: they are just codes developed for bibliographic purposes. Their "technical" use (introduced in ISO 639-2) has failed completely. The world uses BCP47 instead for localisation (and in fact many librarians, and even linguists, now use BCP47 locale tags instead of the old ISO 639, which is a dying standard with too many problems).
- But also note that Wikimedia defines additional locale codes that are not standard in BCP47 or conflicting with it (e.g. "simple" or "als" or "nrm", or "map-bms", or "fiu-vro", or "sr-ec", or "sr-el") and that we should deprecate and even drop later, we don't need them (even if we keep their existing local mapping to wiki database names, and their mapping to public domain names for interwiki prefixes, because this is a completely different thing! Interwiki prefixes are NOT locale codes, and NOT language codes).
- Conversely, the locale codes supported by MediaWiki are misleadingly named "languages" in mw.language. This library is just full of bugs and in fact a real piece of untested junk that confuses everything (and it does not even work the way it is currently documented, and has severe limitations!). In Lua, stay away from the "mw.language" library; it has to be completely rewritten (it is even less functional than it used to be in the past, because of added limitations and bugs)! But in fact all these bugs and limitations come from the equivalent module in PHP (which has exactly the same bugs and should be rewritten as well!) Verdy p (talk) 19:09, 23 April 2015 (UTC)
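The fallback behaviour described earlier in this thread (return the code itself when no language name is found) could be sketched as below. The helper name languageNameOrCode is hypothetical; the sketch assumes Scribunto's mw.language.fetchLanguageName, which, as noted above, yields the empty string for codes MediaWiki does not know.

```lua
-- Hypothetical helper: falls back to the raw code when MediaWiki has no name for it.
local function languageNameOrCode(code, inLanguage)
    -- inLanguage is optional; when given, the name is returned in that language.
    local name = mw.language.fetchLanguageName(code, inLanguage)
    if name == '' then
        return code
    end
    return name
end
```

This makes it easy to tell the two outcomes apart in a test: an unknown code comes back unchanged instead of as an empty string.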
This documentation page is crap
Why is there a Lua reference in there, like Lua's tokens and stuff? I want to know what that frame variable does. How do I pass arguments (other than just what function to call)? It seems to almost all be Lua's standard library! — Preceding unsigned comment added by 137.205.238.124 (talk) 16:18, 12 March 2015 (UTC)
- Because this is a reference to Lua as it exists in Scribunto, not just to Scribunto's extensions to Lua. The frame object is documented at Extension:Scribunto/Lua reference manual#Frame object. Accessing arguments passed to the #invoke parser function is described at Extension:Scribunto/Lua reference manual#frame.args. Anomie (talk) 14:07, 12 March 2015 (UTC)
- Thanks for the response (not sure how to reply to) hope you get this - this certainly needs to be made more clear, and is far more important than "math.abs is included!" I think these ought to be two pages. One about Scribunto and wiki and another about the Scribunto environment. — Preceding unsigned comment added by 137.205.238.147 (talk) 20:14, 12 March 2015 (UTC)
- You replied correctly, although do remember to end your post with "~~~~" to display a signature. While this page is large, I'm not sure what the best division might be, particularly with the number of cross-references between what would be different pages. Anomie (talk) 13:14, 13 March 2015 (UTC)
- I'm not alone in thinking that this Scribunto doc page is crap. Not only does it mix everything, it does not clearly differentiate what is standard Lua and what is not, and it does not even correctly document what is really working or not.
- Clearly the MediaWiki-specific libraries should be in separate pages or subpages (and it needs lots of updates because most of the functions are simply incorrectly documented, and you, Anomie, do not even want to have this documented).
- In addition, some MediaWiki libraries also have specific behaviors in some wikis, and some wikis also have their own Scribunto libraries not listed here!
- In summary, this page should just link to the relevant Lua.org documentation (we don't need to include it, but need to list the Lua versions we support in Scribunto). Then it should concentrate on describing only the "frame" objects (which are the only thing we need to interact with MediaWiki in Scribunto). All other libraries should be in separate pages (including "Ustring", "mw.language", "mw.title", libraries for wikibase clients, libraries to parse file contents, libraries to interact with user preferences, or user notifications, or liquidthreads, or for interacting with specific special pages and other mediawiki extensions, or just generic libraries in pure Lua...)
- Verdy p (talk) 19:31, 23 April 2015 (UTC)
Rendering a gallery
I cannot get Scribunto to render a gallery from ca:Module:FotoNumero. I get "<gallery>...</gallery>" in the wiki page instead. See ca:User:QuimGil/proves#Foto_del_dia. I have tried mw.text.decode, mw.text.encode and mw.text.tag, but I still get the tags instead of the gallery. Help, please.--QuimGil (talk) 13:27, 15 March 2015 (UTC)
- See Extension:Scribunto/Lua_reference_manual#frame:extensionTag. --Vriullop (talk) 15:41, 15 March 2015 (UTC)
- Thank you! I had tried that as well, but a) didn't understand the documentation, and b) clearly my attempts were wrong because I kept getting error messages.--QuimGil (talk) 16:37, 15 March 2015 (UTC)
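Since the documentation was found hard to follow, here is a rough sketch of how frame:extensionTag might be used for a gallery. The function name renderGallery and the file names in the usage note are made up for illustration; only frame:extensionTag is an actual Scribunto method.

```lua
-- Hypothetical helper: turns a list of file names into a rendered gallery.
local function renderGallery(frame, fileNames)
    local lines = {}
    for _, fileName in ipairs(fileNames) do
        -- One "File:..." entry per line, as inside <gallery>...</gallery>.
        lines[#lines + 1] = 'File:' .. fileName
    end
    -- Pass the tag name and its content; the tag is processed instead of printed literally.
    return frame:extensionTag('gallery', table.concat(lines, '\n'))
end
```

A module function called through #invoke could then return renderGallery(frame, { 'Example.jpg', 'Example.png' }) instead of a hand-built "<gallery>" string.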
mw.ext.?
Where can I find more information (documentation) about the extensions available in the mw.ext table? There is a list:
["ext"] = {
  ["TitleBlacklist"] = { ["test"] = function }
  ["FlaggedRevs"] = { ["getStabilitySettings"] = function }
  ["ParserFunctions"] = { ["expr"] = function }
}
I am interested in ParserFunctions, which is used in en:Module:Location map. Paweł Ziemian (talk) 20:50, 6 April 2015 (UTC)
- The developers of those extensions are supposed to have added links to their documentation from Extension:Scribunto/Lua reference manual#Extension libraries (mw.ext), but it doesn't seem that any of those have. Anomie (talk) 13:18, 7 April 2015 (UTC)
- I tried to find the roots of the code and found that the only documentation for those is the commit messages from when the modules were introduced: ParserFunctions, FlaggedRevs, TitleBlacklist. All of them were made by Jackmcbarn. It would be good to know whether, e.g., the ParserFunctions library is stable and ready for everyone to use, or whether it is still in an experimental state and using it is not recommended yet. Paweł Ziemian (talk) 20:37, 8 April 2015 (UTC)
- @Paweł Ziemian: They're all stable and ready to use for everyone. Jackmcbarn (talk) 16:02, 12 April 2015 (UTC)
- But they are not all deployed on the same wikis. For example, on Commons we get:
=mw.dumpObject(mw.ext)
{
  ["ParserFunctions"] = { ["expr"] = function#1, },
  ["TitleBlacklist"] = { ["test"] = function#2, },
}
- In some editions of Wikinews and Wiktionary, there are a few other libraries. My opinion is that the Special:Version page should list these libraries and their descriptions, in the section related to Scribunto (where Scribunto is installed, of course).
- Some extensions are also not preloaded in "mw.ext" but may be loaded on demand. We lack a way to see the list of all external libraries (loaded or not), and where they are supposed to be bound within the Lua globals:
- For example in Commons, "_G.package.loaded" currently has these keys:
- '_G', 'debug', 'libraryUtil', 'math', 'os', 'package', 'string', 'table' — a subset of Lua 5.1 core libraries (some of them modified for Scribunto)
- 'mw', 'mw.html', 'mw.language', 'mw.message', 'mw.site', 'mw.text', 'mw.title', 'mw.uri', 'mw.ustring' — MediaWiki core libraries (more restricted than its PHP library)
- 'mw.ext.ParserFunctions', 'mw.ext.TitleBlacklist' — Mediawiki extension libraries (not documented on Scribunto)
- 'mw.wikibase', 'mw.wikibase.entity' — additional Mediawiki extension libraries implementing the Wikibase client (for querying Wikidata).
- The Scribunto doc page is becoming indigestible and should now be split into separate pages for each library.
- One of these libraries is of special interest: 'mw.ext.ParserFunctions'. It is useful because the frame methods do not permit expanding parser functions (they are only converted to strip markers, and their evaluation is delayed).
- But only one parser function is enabled, the "#expr:" parser function. It allows computing values passed in parameters to Lua, or computing expressions built dynamically using the syntax supported by #expr, without having to parse them in Lua (Lua's standard "load(string)" is not supported, and the existing implementation in PHP is faster, even if it has the same quirks). Pass it a string and it returns a string, as in MediaWiki: =mw.ext.ParserFunctions.expr('sin pi') returns the string '1.2246467991474E-16'.
- I'd like to have a few other existing MediaWiki parser functions or extension tags mapped as well, notably the "Special:Prefixindex" parser function (it would generate a list of subpages for a given base page, to avoid having to generate many "#ifexist:" calls from a list of language codes and exploding the limits) and the extension tag "#tag:categorytree", to get in a single query a list of 200 member pages in a category by default (or up to 500, as when viewing a category, with the ad hoc option of this extension tag).
- That categorytree should also have an option to return the list in a simpler format: just page names separated by newlines, or an array of page names, possibly with the common prefix hidden with the option "hiddenprefix=1" and redirects hidden with the option "hideredirects=1", exactly as when viewing the special page. The special page should also support the query option "format=text", to avoid generating HTML-formatted lists of anchor elements or wiki-formatted lists of wikilinks, or "format=json", to get this list in the standard JSON format of the API.
- Instead of having a cost of 1 per #ifexist:, we would have a cost of 1 for the whole list of existing translations, generated in a single query to the database, and we would be able to avoid testing the 400+ languages supported (and exploding the limits of costly individual "#ifexist:" parser functions or costly individual "mw.title()" function calls), with much better performance. An alternative would be to add "(mw.title()):subpages()" to get this list of subpages, or mw.prefixindex('Template:Basename/', { hideredirect=1 }) with the same options as in Special:Prefixindex, or mw.categorylist('CategoryName', { pagefrom='A', subcatfrom='A', hideredirect=1, hidecat=1, hidefiles=1... }) also with the same options as when viewing category pages... Verdy p (talk) 04:22, 20 April 2015 (UTC)
- That's extremely long. Your understanding of strip markers is incorrect: they are not "delayed parsing" but rather hide already-generated raw HTML from further parsing (although post-parsing hooks could be used to postprocess). Feature requests belong in Phabricator. Anomie (talk) 13:49, 20 April 2015 (UTC)
- You're completely wrong. These stripped markers are definitely NOT preparsed by extensions, and I can prove it, because I tested it (not like you): this is the essential thing that changed in last december and that has caused many templates or pages stopping to work.
- I can affirm that these stripped templates are not preparsed, but only storing the name of the extension to use and the value of its parameters, and that everything is delayed; these extension will be called later, (and at least always after the execution of Lua). Visibly you have not tested these or do not know that this is something that changed in last December. Stripped markers (as they are seen in Lua) are just containing the parsed text, not its expansion (and not just because this expansion may contain pure HTML incompatible with the wiki syntax such as javascript or anchor elements with arbitrary URLs, but also because this execution of extension hooks takes time on the server even when such execution is not needed, because the Lua code will choose to not return them even if they were present in Lua call parameters).
- As stripped markers are unique identifiers for each instance, their result can also be cached so that the same instance will be executed only once, even if the result is duplicated on the page.
- Anyway I don't like your way of making blind reverses of everything in the documentation, because you (only you!) just want to keep things undocumented, even when they have changed or are severely limited, or when functions may return errors in certain conditions (just like those that I listed for the "mw:language" module, which were accurate).
- May be you could think about better wordings for English (it's not my native language, but I make efforts, not like you...), but dropping everything is clearly unacceptable and a very selfish behavior. You have abused your privileged position when also adding personal threats in your reverts. Your staff membership does not authorize you to block someone that is honestly contributing, when you, instead, are just abusively deleting useful content (for clearly unjustified reasons). The acceptable behavior would have been to correct things or enhance them, instead of using your personal judgement about what you think is "wrong" but was in fact an effective reality.
- This page is not supposed to document how Scribunto should ideally work, but how it currently works. Its limitations (or even bugs) have to be documented somewhere, because all Wikipedians want to understand why their code is no longer working or has never worked: they spend hours looking for unexplained problems, and you don't give any consideration to their problems and the time they invest in looking for solutions.
- So replying "TLDR" is definitely not a good reply (it's just lazy from you), when such details are in fact essential. Verdy p (talk) 18:45, 23 April 2015 (UTC)
- When I look at the actual source code, I see that it's obviously storing replacement text rather than "the name of the extension to use and the value of its parameters" as you claim. Anomie (talk) 22:43, 23 April 2015 (UTC)
- If this was the case, that source code would call for the expansion of its parameters before doing the stripping. This is not the case, all parameters in the specified text are left "as is". They will be processed only when they will be unstripped (by the hook mechanism present in this source). So now extensions are just stripping parameters and delay their effective expansion (and notably those that generate plain HTML, including plain HTML anchors, or javascript, or some unsupported HTML elements such as "video", or unsupported HTML attributes such as "onload=", or that generate embedded CSS stylesheets in a HTML "style" element, or that would generate "frame", "iframe", "xml", "object", "input", "textarea", and similar elements forbidden in Wikitext... or even just "thead", "tbody", "tfoot", "colgroup", "col", "caption", or other semantic document structure elements of HTML5, even if they have no good reason for being restricted in Wikitext).
- But thanks for the link I see now that there's a "unstripGeneral()" function (not documented, again!), but I don't see it mapped in frame objects, it remains only in the internal PHP code and not accessible to Lua (or I have still not found how to access it).
- Look for "extensionSubstitution()" method used in the MediaWiki parser, that now supports so called "half-parsed" text, which is stripped for later expansion, by the hooks registered by extensions (not all extensions are still using this delayed expansion mechanism, but their number is increasing; for now only basic parser functions such as "lc:" and standard templates are being expanded immediately and don't need stripping, but other extensions — such as dated magic keywords, or version magic keywords, or magic keywords counting pages in categories, or enumerating their protection level, or even the standard wikitext generating wikilinks — are not expanded immediately, they are in "half-parsed" state for later processing, and they use their own caches if needed ; there's some complex "ghost" processing in the parser to allow pages to be processed faster, when their expensive immediate expansion is strictly not needed and may be avoided by conditional parser functions such as "#if:" or "#switch:"). The main advantage of this delayed expansion is that it allows a much faster construction of the DOM tree, with a smaller resource footprint in the server.
- Now what I'm looking for is: how to use "unstripGeneral()" in Lua, we only have "unstripNowiki()", and "unstrip()" that kill all other stripping markers with the "general" type (which is in fact all other extension tags not bound to a simple core parser function). Verdy p (talk) 23:53, 26 April 2015 (UTC)
- You seem to be confusing strip markers with the optimization of not processing an argument to a template or parser function (including Scribunto's #invoke) until it's actually used. If something like Gerrit change 181046 were to be merged, then you'd be potentially correct. Or, as I said, something could be using various parser hooks to try to post-process things at some level, but I'm not aware of anything that actually does this in the manner you describe.
- There is intentionally no way to access unstripGeneral() from Scribunto; it was removed in Gerrit change 171290 due to T63268 and T73167. Anomie (talk) 13:55, 27 April 2015 (UTC)
- The only good reason given is in the second issue T73167 which says: "To prevent access to tokens, do not allow Scribunto modules to unstrip special page HTML." But the solution taken is extreme: only the tokens need to be protected. I don't see why all "general" stripped tokens have to be cleared. If there are security tokens, they should not be using the "general" type for their stripped text, but a "secure" type. For the first issue T63268, I have absolutely not seen why there was an issue (most probably the issue was just the second one creating the first one indirectly).
- In fact I am convinced that most extension tags should not be processed immediately but should just be half-processed by just creating stripped markers only containing the extension tag and its parameters, to be processed later. It would make the parser much more efficient. But now as they will be stripped in a "general" stripped tag for most of them (except "secure" ones), using unstrip() would not kill them and would allow their expansion.
- If there are things hidden in the unstripped text that should not be revealed, they will be exposed later in the final web page, so all the issues will exist as well when using a second scripting server querying the same page: the issue remains exploitable from an external client (hiding tags only locally from Lua will not secure them). Verdy p (talk) 20:42, 27 April 2015 (UTC)
get template parameter's by order
I'm trying to extract the names of a template's parameters by reading them from the "templateData" using "mw.text.jsonDecode( s )". I am able to do it, but when I use for...pairs, the parameter names are returned in a different order from the one in "templateData". I tried to use the flag "mw.text.jsonDecode( s, mw.text.JSON_PRESERVE_KEYS )", but it still does not help. What is wrong?! Badidipedia (talk) 13:10, 17 May 2015 (UTC)
- If the incoming JSON data is encoded as a JSON object, note that JSON objects are unordered so there isn't technically an order to be extracted in the first place. If the incoming JSON data is an array, use ipairs instead of pairs. Anomie (talk) 13:53, 18 May 2015 (UTC)
- Thank you, Anomie. As I was saying, I took the data from "Template:" pages, and the text is passed to the function mw.text.jsonDecode( s, mw.text.JSON_PRESERVE_KEYS ) as a string, so I believed it is considered serialized and therefore ordered.
- I'm adding the code here. Maybe it will help to understand the problem:
local function getParametersPage()
    -- Read the wikitext of the current page (the template page).
    local templateTitle = mw.title.getCurrentTitle()
    local templateText = templateTitle:getContent()
    -- Locate the JSON between <templatedata> and </templatedata>.
    local templateDataStart = mw.ustring.find( templateText, "<templatedata>" ) + mw.ustring.len( "<templatedata>" )
    local templateDataEnd = mw.ustring.find( templateText, "</templatedata>", templateDataStart ) - 1
    local templateDataText = mw.ustring.sub( templateText, templateDataStart, templateDataEnd )
    local templateDataTable = mw.text.jsonDecode( templateDataText )
    local answer = " {{" .. templateTitle.text
    local params = templateDataTable.params
    -- pairs() visits the keys in an unspecified order.
    for k, v in pairs( params ) do
        answer = answer .. "<br />|" .. k .. "="
    end
    answer = answer .. "}}"
    return answer
end
- Badidipedia (talk) 18:21, 18 May 2015 (UTC)
- As I said, there you're accessing a JSON object which by definition has no order. Anomie (talk) 13:20, 19 May 2015 (UTC)
- Thanks again, Anomie. If I understand you correctly, by "JSON object" you mean the function's return value (a table, I guess, which explains why it is unordered). That means that if I want to do this, I must learn JSON from the ground up and write a lot of code. Wish me luck... (Sorry about my terrible English.) Badidipedia (talk) 18:13, 19 May 2015 (UTC)
- See ru:Модуль:TemplateDataDoc. Jack who built the house (talk) 20:34, 24 June 2018 (UTC)
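One possible way around the ordering problem, sketched here under assumptions: if the template's TemplateData includes the optional paramOrder array, that array can be walked with ipairs; otherwise the keys can at least be sorted so the output is stable. The decoded table below is a hand-written stand-in for the result of mw.text.jsonDecode, and the function name orderedParamNames is hypothetical.

```lua
-- Hypothetical decoded TemplateData, standing in for what
-- mw.text.jsonDecode might return for a template page.
local decoded = {
    params = { last = {}, first = {}, middle = {} },
    paramOrder = { "first", "middle", "last" },
}

-- Returns parameter names in a deterministic order:
-- paramOrder when present, alphabetical order otherwise.
local function orderedParamNames( data )
    local names = {}
    if type( data.paramOrder ) == "table" then
        -- Arrays keep their order, so ipairs is safe here
        for _, name in ipairs( data.paramOrder ) do
            names[#names + 1] = name
        end
    else
        -- Fallback: collect the keys, then sort them
        for name in pairs( data.params or {} ) do
            names[#names + 1] = tostring( name )
        end
        table.sort( names )
    end
    return names
end

local names = orderedParamNames( decoded )
```

The alphabetical fallback does not recover the order written in the JSON source; it only makes the result repeatable.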
Function declaration and invocation parameters
Thanks for writing this useful manual; I stayed a Lua/Scribunto newbie for a while so now I can offer myself as guinea pig user of the documentation.
I was utterly confused by the lack of information on how to create a simple module which takes some (named) parameters and outputs a string: this seems to be a very common use case, yet it's not covered.
In detail:
- there is one such minimal example at Lua/Tutorial#Accessing template parameters, which I initially skipped assuming the reference manual would be more complete;
- then most information is hidden in the "Frame object" section, which I'm never going to check unless I already know what it says (i.e. that I must use it):
as a result, information in the sections about function invocation and declaration (which I did check, as they sounded like what I needed) was misleading, especially given the lack of a realistic example module. --Nemo 09:34, 11 July 2015 (UTC)
- I'm not very fond of sticking that unexplained Scribunto-specific see-also in the middle of the documentation on how to declare a Lua function. But you're right in that the whole "frame object" deal isn't explained too well, so I added a section titled "Accessing parameters from wikitext" right at the top of the document. Feel free to improve it if necessary, although we should avoid turning it into a tutorial rather than a reference manual (and note that #frame.args already includes most of what Lua/Tutorial#Accessing template parameters does). Anomie (talk) 13:24, 13 July 2015 (UTC)
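For readers arriving at this thread, a minimal sketch of the use case described above: a module function that reads named parameters from frame.args and returns a string. The function name greet, the parameter names, and the invocation {{#invoke:Example|greet|name=Nemo}} are all hypothetical; the frame here is a plain-table stand-in so the snippet can be tried outside MediaWiki.

```lua
local p = {}

-- Would be called from wikitext as, for example:
-- {{#invoke:Example|greet|name=Nemo}}   (hypothetical module name)
function p.greet( frame )
    -- Named arguments to #invoke arrive as strings in frame.args
    local name = frame.args.name or "world"
    local greeting = frame.args.greeting or "Hello"
    return greeting .. ", " .. name .. "!"
end

-- Stand-in frame for trying the function outside MediaWiki
local fakeFrame = { args = { name = "Nemo" } }
local result = p.greet( fakeFrame )

-- In a real module the file would end with: return p
```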
Accessing protection levels
I'm using DPL to generate a list of pages I need to protect and came across something a little odd. If I use mw.title.new(pagename) to generate a title object, I add to the expensive function count (perfectly understandable). However, I then need to check the protection level of said page, which is another expensive function. Are the protection levels not loaded as part of the initial Title instance? mdowdell (talk) 19:47, 28 July 2015 (UTC)
- No, because they take an extra database query to load. Anomie (talk) 13:09, 29 July 2015 (UTC)
Documentation for newline in long brackets
Extension:Scribunto/Lua reference manual#string says:
- if an opening long bracket is immediately followed by a newline then the newline is not included in the string, but a newline just before the closing long bracket is kept.
and then gives as an example:
-- This long string
foo = [[
bar\tbaz
]]
-- is equivalent to this quote-delimited string
bar = 'bar\\tbaz'
Why is the newline in foo between "baz" and "]]" omitted? DMacks (talk) 05:37, 9 October 2015 (UTC)
- Because it was incorrect. Fixed. Anomie (talk) 13:14, 9 October 2015 (UTC)
- Ah, that does explain the situation. Thanks:) DMacks (talk) 10:30, 12 October 2015 (UTC)
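For reference, a small sketch of the equivalence once the trailing newline is accounted for. The variable names follow the quoted example; the local declarations are added here for illustration.

```lua
-- The newline right after the opening [[ is dropped;
-- the newline just before the closing ]] is kept.
local foo = [[
bar\tbaz
]]

-- Long strings do not interpret escapes, so \t stays as
-- a backslash followed by the letter t.
local bar = 'bar\\tbaz\n'
```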
Find all references
Recently I faced the problem of finding all the places where a specific function from a module is called via #invoke from a template. That was not trivial. The only thing I was able to get is the WhatLinksHere list for the module where the function is defined. But the list has many items that do not call the function of interest but rather some other function available in the module. To resolve this I had to create a separate module (BTW named Module:Name/deprecated) and copy all its code there. Then in the main module I created a new implementation, which redirects the call to the separate module:
myFunction = function(frame)
    mw.log(frame:getParent():getTitle(), "parent:title")
    return require("Module:Name/deprecated").myFunction(frame)
end
The new implementation adds a log entry with the parent frame's title, which helps me get the name of the template that calls the function when I look at the log generated in the article preview. Once a template with such a reference was found, I was able to remove the call or change it to the new one. This removed the relevant articles from the WhatLinksHere list. Finally I was able to fix all references. However, the whole process is not trivial and is very time consuming. Could it be simplified somehow? It could be useful if #invoke could create "shadow" categories, which are emitted after the whole page is created, and are declared via some specific mw.??? api function. Generating categories independently of the returned value could be a very powerful method for various diagnostic purposes. Paweł Ziemian (talk) 19:01, 22 October 2015 (UTC)
- You could have skipped the "copy all its code there" bit and just added a require("Module:Name/deprecated") (or better, mw.title.new("Module:Name/deprecated").id since that doesn't even require that "Module:Name/deprecated" exist) to the original function. To request new features for Scribunto, such as a frame:addCategory() function, please file a bug in the "MediaWiki-extensions-Scribunto" project requesting it. Anomie (talk) 14:15, 23 October 2015 (UTC)
- Thanks. Paweł Ziemian (talk) 15:33, 23 October 2015 (UTC)
ustring pattern dont recognize char
Ustring patterns don't recognize the Old Slavic character 'І'. Visually it looks like the Latin 'I' but has a different code point. All other symbols work fine.
local str='1. Іqwerty'
local _, _, b = mw.ustring.find(str, '([а-яёѣѵіѳА-ЯЁѢѴІѲa-zA-Z])')
return b
This returns 'q' but should return 'І'. --Vladis13 (talk) 03:20, 10 November 2015 (UTC)
- This sounds a lot like phab:T73922, so I added details distilled from this case there. Anomie (talk) 14:53, 10 November 2015 (UTC)
- Unfortunately, that issue has been there for a year now with no solution in sight. Is there a way to work around the bug? --Vladis13 (talk) 00:45, 11 November 2015 (UTC)
frame object in debug console?
Is there a way to create a frame object in the debug console? Or at least something that works like one.
Suppose I want to test a function that uses frame:preprocess(); even mw.getCurrentFrame() returns nil, so I get an error.
Currently, I'm using something like this:
local frame = mw.getCurrentFrame()
if frame == nil then
    frame = { preprocess = function(self, text) return text .. '' end }
end
- mw.getCurrentFrame() does not return nil in the debug console. Anomie (talk) 14:35, 13 November 2015 (UTC)
Stats of users groups
In site.stats and usersInGroup I found by some tests:
- administrators = usersInGroup( "sysop" )
- bots = usersInGroup( "bot" )
- patrollers = usersInGroup( "patroller" )
- bureaucrats = usersInGroup( "autoconfirmed" )
- accountcreator = usersInGroup( "accountcreator" )
- Could you document all the use cases more completely? --Rical (talk) 21:03, 14 November 2015 (UTC)
- Bureaucrats should be usersInGroup( "bureaucrat" ). usersInGroup( "autoconfirmed" ) should give 0 since users aren't directly added to that group; it's dynamically added by MediaWiki when conditions are met.
- As for finding group names, use API action=query&meta=siteinfo&siprop=usergroups (e.g. [7]) to get the group names, then see the messages "MediaWiki:group-{name}" (e.g. MediaWiki:group-sysop) to find the human-readable names. Or go to Special:ListUsers and use the "Group" dropdown, then look in the URL for the value of the "group=" parameter. Or go to Special:ListGroupRights and look at the value of the "group=" parameter in the "(list of members)" links. Anomie (talk) 14:15, 16 November 2015 (UTC)
Serial comma needed in listToText()
en:Serial comma is proper basically everywhere.
en:Template:Further, en:Template:Further2, and en:Template:See also use mw.text.listToText() (via en:Module:Further and en:Module:See also).
They produce results like:
Further information: a
Further information: a and b
Further information: a, b and c
See also: a
See also: a and b
See also: a, b and c
The templates produce incorrect results because listToText() produces incorrect results.
The results should be:
Further information: a
Further information: a and b
Further information: a, b, and c
See also: a
See also: a and b
See also: a, b, and c
The additional comma applies only for 3 or more items. The results for lists of 2 items must NOT be:
Further information: a, and b
See also: a, and b
-A876 (talk) 08:56, 15 November 2015 (UTC)
- Bug reports and feature requests should be filed in Phabricator. Anomie (talk) 14:19, 16 November 2015 (UTC)
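Until such a change exists, the requested output could be produced in a module with a small wrapper. The following is only a sketch in plain Lua: the function name serialList is hypothetical, and the separator and conjunction are hard-coded English strings rather than the localized messages that mw.text.listToText() uses.

```lua
-- Joins a list with a serial comma:
-- "a", "a and b", "a, b, and c".
local function serialList( items )
    local n = #items
    if n == 0 then
        return ""
    elseif n == 1 then
        return items[1]
    elseif n == 2 then
        -- Two items take no comma at all
        return items[1] .. " and " .. items[2]
    end
    -- Three or more: comma-separate all but the last,
    -- then add ", and " before the final item
    return table.concat( items, ", ", 1, n - 1 ) .. ", and " .. items[n]
end
```

Simply passing ", and " as the conjunction argument of mw.text.listToText() would not be enough on its own, since the conjunction is also what joins a two-item list.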
Interpreter error signal 11
I tried to follow the example of the Bananas module on the project page, but when I try to create the Module:Bananas I get the message Script error: Lua error: Internal error: The interpreter has terminated with signal "11". But nowhere can I find an explanation of what signal 11 means. Why is the interpreter being invoked anyway, when all I'm trying to do is to create a new page? Eric Corbett (talk) 14:01, 17 November 2015 (UTC)
Is the problem simply that the binaries distributed with the 1.25 version of Scribunto don't work? In particular, I'm using the 64-bit Linux binary. Eric Corbett (talk) 19:07, 17 November 2015 (UTC)
- This can be. The binaries are of course system dependent, and a signal 11 (SIGSEGV, a segmentation fault) usually means execution is failing hard. They 'should' work in most cases, but there is no guarantee. There is some more info on binaries here. You can easily test whether they work by using them directly for a hello world. And pages are validated before they are saved, which is why you see this during page creation. —TheDJ (Not WMF) (talk • contribs) 22:34, 17 November 2015 (UTC)
Analog of {{#switch:{{{1}}}|foo1|foo2=bar}}
How can I make an analog of {{#switch:{{{1}}}|foo1|foo2|foo3|foo4|foo5|foo6=bar}}, assigning "bar" only once for all identifiers, without creating a duplicate table field for each parameter, like:
local p = { foo1 = bar,
    foo2 = bar,
    foo3 = bar,
    foo4 = bar,
    foo5 = bar,
    foo6 = bar, }
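One way this might be done, as a sketch: list the aliases once and fill the lookup table in a loop, so the value is written only once. The helper name mapAll and the string value "bar" are hypothetical choices for illustration.

```lua
-- Builds a table in which every key in `keys` maps to `value`.
local function mapAll( keys, value )
    local t = {}
    for _, key in ipairs( keys ) do
        t[key] = value
    end
    return t
end

-- "bar" is written once and shared by all six identifiers
local lookup = mapAll( { "foo1", "foo2", "foo3", "foo4", "foo5", "foo6" }, "bar" )

-- Lookup with a fallback, similar to a #default case in #switch
local function switch( key )
    return lookup[key] or ""
end
```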
mw.message library
I find the documentation for the mw.message library confusing. Is the API just an interface to manage messages stored in the MediaWiki namespace, with fallbacks and extra stuff, or can it be used to provide general localization for a Lua module?
If it is neither, then I would suggest adding an example to its use, even in a subpage if the reference manual is not the appropriate place for it.
Speaking of which, is there any API that facilitates localization of lua modules such as this: https://github.com/kikito/i18n.lua? If not, then that's my suggestion. — Preceding unsigned comment added by 197.218.81.244 (talk • contribs) 19:56, 13 December 2015 (UTC)
- It is, as is stated in the documentation, an interface to MediaWiki's i18n system that involves the MediaWiki namespace. Somewhat more specifically, it wraps the PHP Message class. It could be used for general localization for a module, if the module is to be localized by creating MediaWiki-namespace messages. Anomie (talk) 14:34, 14 December 2015 (UTC)
UNIQ ... QINU Strip Marker
I'm trying to write a module that uses a DynamicPageList to select a random article from the list. I've found many different ways to call DynamicPageList, but no matter what I do, the module only sees a UNIQ ... QINU strip marker. If I unstrip it, I get nothing. What's the secret to reading strip content inside a Lua module? -- Dave Braunschweig (talk) 15:01, 29 December 2015 (UTC)
- using frame:preprocess() on the direct results worked for me (though I see one can't apparently do further processing on it without getting the marker again--I read that a parser function is supposed to avoid this problem that tags have, so maybe could rewrite the DynamicPageList function as one, but I got an error trying to change to setFunctionHook). Brettz9 (talk) 04:30, 12 March 2016 (UTC)
ustring.match problem
The function ustring.match (unlike mw.ustring.match) sometimes fails:
ustring.lua:728: attempt to concatenate field '?' (a nil value)
This occurs, for example, with code such as:
ustring.match( 'ст.38В с учета', '^ст%.(%d%d[АБВГД]) с учета$' )
This error also occurs in standalone Lua (without MediaWiki). I found a way to work around it: just add a second capture:
ustring.match( 'ст.38В с учета', '^ст%.(%d%d[АБВГД]) (с) учета$' )
But it would be desirable for the error to be corrected.
Full example: Module:Ustring_test --StasR (talk) 16:20, 1 January 2016 (UTC)
- Should be fixed once gerrit:261947 is merged and deployed. Anomie (talk) 15:50, 2 January 2016 (UTC)
- Thanks! --StasR (talk) 21:05, 2 January 2016 (UTC)
Is result of frame:expandTemplate() cached?
I call frame:expandTemplate() twice with the same parameters. Apparently, the template is executed only once. --StasR (talk) 17:09, 13 January 2016 (UTC)
- Yes, calls to frame:expandTemplate() for the same template with the same parameters are cached, unless the expansion sets the "isVolatile" flag. Anomie (talk) 15:28, 14 January 2016 (UTC)
- Developer of the extension says he doesn't know how to do it. :-( --StasR (talk) 17:21, 14 January 2016 (UTC)
- He might look at how the Cite extension does it, to learn by example. He could also ask for assistance, I'd recommend wikitech-l as a good place to start. Anomie (talk) 15:11, 15 January 2016 (UTC)
Is there a way to keep the original parameter order?
When iterating over frame.args with pairs( frame.args ), the original order of parameters is lost: if I use {{#invoke:Module|main|named_arg=value|unnamed_arg}}, the parameters will swap positions (unnamed_arg will be the first, named_arg the second, as is always the case with unnamed & named parameters). But it is critical for me to keep the original order.
Documentation says: "For performance reasons, frame.args uses a metatable, rather than directly containing the arguments. Argument values are requested from MediaWiki on demand. This means that most other table methods will not work correctly, including #frame.args, next( frame.args ), and the functions in the Table library".
Maybe I should overwrite frame.__pairs somehow, so that the original order will be kept? But I don't even know how to get the contents of that function (tostring(getmetatable(frame.args).__pairs) gives "function"), let alone how it "requests argument values from MediaWiki on demand".
Thanks in advance. Jack who built the house (talk) 18:42, 13 February 2016 (UTC)
- Jack who built the house, have you found a solution for this problem? Reptilien.19831209BE1 (talk) 12:53, 30 October 2017 (UTC)
- @Reptilien.19831209BE1: Hm, for some reason the ping didn't work. Nope, I haven't. Probably a Phabricator task has to be created. Jack who built the house (talk) 04:33, 9 November 2017 (UTC)
mw.ustring.match and ordering of combinations in character class
When studying Lua modules and regexes at the same time, I found this:
* string.match('314 159 265', '^[%s]*[%-]?[%s]*[%d]+[%d%s]*$') == '314 159 265'
* string.match('314 159 265', '^[%s]*[%-]?[%s]*[%d]+[%s%d]*$') == '314 159 265'
* mw.ustring.match('314 159 265', '^[%s]*[%-]?[%s]*[%d]+[%d%s]*$') == '314 159 265'
* mw.ustring.match('314 159 265', '^[%s]*[%-]?[%s]*[%d]+[%s%d]*$') == nil
In words: the two patterns differ in the part [%d%s] versus [%s%d], and each of the two is tested with two functions (string.match and mw.ustring.match). The combination mw.ustring.match + [%s%d] does not detect occurrences which are detected by the other three combinations and which, as far as I currently understand it, must be detected. Is it a problem with my understanding or with the current implementation of mw.ustring.match? Stannic (talk) 15:55, 16 March 2016 (UTC)
- This sounds like phab:T73922. Anomie (talk) 13:28, 17 March 2016 (UTC)
fetchLanguageNames
Hi,
How should we call fetchLanguageNames to get all languages (include='all') with each language's name in its own language (inLanguage ~ autonym) (719 languages)?
Sadly, fetchLanguageNames(nil,'all') returns the same thing as fetchLanguageNames() (417 languages, in fact), whereas fetchLanguageNames('fr','all') returns 719 languages.
Perhaps it is a bug, or perhaps there is no way to get the autonym of the additional languages?
Cheers Liné1 (talk) 13:43, 13 May 2016 (UTC)
- This is due to MediaWiki's underlying Language class. It doesn't call the 'LanguageGetTranslatedLanguageNames' hook when fetching autonyms, so it doesn't get the additional languages from Extension:cldr in that case. A comment in the code says "TODO: also include when $inLanguage is null, when this code is more efficient". Anomie (talk) 16:03, 16 May 2016 (UTC)
- Thanks for the answer. Best regards Liné1 (talk) 17:35, 16 May 2016 (UTC)
mw.loadData() and next()
as far as i know, the standard way to test if a lua table is empty is to test for next(t) == nil. i will be happy to learn a better or safer way to test, but this is what many sources seem to indicate.
i have a module, let's call it "data module", that looks something like so:
local t = {
    ['a'] = 'something',
    ['b'] = 'something else',
    -- etc., etc., etc.
}
return {
    [0] = t,
    [733] = t,
}
i then use it from another module like so:
local t = mw.loadData('Module:data module')
local inner = t[0]
-- at this point, inner is indeed { ['a'] = 'something', ['b'] = 'something else' }
-- however, when i test for next(inner), i get nil!
i guess it has something to do with the sanitation imposed by loadData(). is this a bug? feature? intentional? unintentional?
presuming this is known and intentional, i think that at the least, the documentation should warn about it, and maybe provide some sanctioned alternative to the next(t) == nil test.
peace - קיפודנחש (talk) 23:51, 13 June 2016 (UTC)
- Extension:Scribunto/Lua_reference manual#mw.loadData says: "The table actually returned by mw.loadData() has metamethods that provide read-only access to the table returned by the module. Since it does not contain the data directly, pairs() and ipairs() will work but other methods, including #value, next(), and the functions in the Table library, will not work correctly." Matěj Suchánek (talk) 06:02, 14 June 2016 (UTC)
- thanks. i guess i missed this piece in the documentation. it would be good, i think, if the metatable will add a method to test for emptiness, i think (t:empty() returning bool or somesuch). also, maybe it would be better to throw exceptions when calling a "broken" method, rather than returning incorrect result. peace - קיפודנחש (talk) 15:45, 14 June 2016 (UTC)
- The problem is that Lua 5.1 just doesn't have the ability to do what's being asked here. Even 5.3 doesn't seem to have a way to override next, although you might be able to cheat on it by using pairs to get a next-like function. There's nothing we can do for the length operator, that would need 5.2. Anomie (talk) 13:14, 15 June 2016 (UTC)
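Since the quoted documentation says pairs() does work on tables returned by mw.loadData(), an emptiness test can be built on pairs instead of next. A minimal sketch follows; the function name isEmpty is hypothetical, and it is exercised on ordinary tables here because mw.loadData() is not available outside Scribunto.

```lua
-- Returns true when the table has no keys at all.
-- Uses pairs() rather than next(), on the assumption (taken from
-- the manual text quoted above) that pairs() is supported on
-- mw.loadData() tables while next() is not.
local function isEmpty( t )
    for _ in pairs( t ) do
        -- Reaching the loop body means at least one key exists
        return false
    end
    return true
end
```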
Title objects: getContent() versus exists
The page states that exists is expensive and that getContent() is not. Yet, aren't .exists and :getContent() ~= nil equivalent, such that the latter is an inexpensive workaround? Are there situations where exists would be required in its current form instead of the inexpensive workaround? Njardarlogar (talk) 08:56, 25 July 2016 (UTC)
- "getContent()" is expensive too. Taylor 49 (talk) 19:40, 23 March 2021 (UTC)
MediaWiki and looking into the entire module, Chapter edit using a browser while editing in the variable 'text' , not only to this chapter
We have this code in Lua:
local myTitle = mw.title
local page = myTitle.makeTitle( "", args[1] )
local text = page:getContent()
Reading the entire module this way is easy when the page is simply being viewed, but when a single section of the module page is being edited in the browser, the variable 'text' holds only that section instead of the whole module. I want to analyze the entire module, not just the section being edited. How can this be fixed? Persino (talk) 15:00, 18 August 2016 (UTC)
How to download precisely defined section of the module using the equipment in Lua on mediawiki
How do I retrieve a strictly defined section of a module using the Lua facilities in MediaWiki? Persino (talk) 16:01, 24 August 2016 (UTC)
Call mw.loadData() with parameter
About mw.loadData() it is written that the data module, as with require(), can have functions. But how do I call it with a parameter? E.g.:
local t = mw.loadData('Module:data module')
local result = t.func(parameter)
It would be desirable for the module to return different data depending on the parameter. (The alternatives of making separate modules or expanding the table do not fit.) --Vladis13 (talk) 20:46, 16 October 2016 (UTC)
- The returned table (and all subtables) may contain only booleans, numbers, strings, and other tables. Other data types, particularly functions, are not allowed. --StasR (talk) 07:26, 17 October 2016 (UTC)
- The returned table - yes, but the data module itself can have functions. E.g. the following 'Module:data module' works:
local t = {}
local currtitle = mw.title.getCurrentTitle()
local currname = currtitle["text"]
local function p(str)
    return str .. '/postfix'
end
t['pagename'] = p(currname)
return t
I want it to be able to process not only the current title, but any title given via a parameter, as in a usual module. --Vladis13 (talk) 17:35, 17 October 2016 (UTC)
- This means that the module will return a function. --StasR (talk) 08:49, 18 October 2016 (UTC)
- It sounds like you want to have a normal module, loaded with require, rather than a data module. Anomie (talk) 13:21, 18 October 2016 (UTC)
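A sketch of what such a normal module might look like: instead of computing one fixed table at load time, it exposes a function that builds the data for whatever title is passed in. The names data and forTitle are hypothetical, and the require call is shown only in a comment because it needs a wiki to run.

```lua
-- Contents of a hypothetical "Module:data module", written as a
-- normal module so that it can expose a function.
local data = {}

-- Builds the table for any title text passed in as a parameter
function data.forTitle( titleText )
    return { pagename = titleText .. '/postfix' }
end

-- On a wiki the file would end with `return data`, and a caller
-- would use: require( 'Module:data module' ).forTitle( name )
local result = data.forTitle( 'Example' )
```

The trade-off is that require does not give the load-once-per-page behaviour that mw.loadData() is meant for.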
To see the link in < code > tag
- In the text of this help, when a link to another paragraph is inside a <code> tag, we do not see the link.
- We can show it using rather pattern than pattern --Rical (talk) 06:50, 24 November 2016 (UTC)
Translation of the reference manual
Hello,
I would like to help with a translation of the manual to Esperanto. But I see no "translate this page" link, as stated in Project:Language policy. Does the page already use the translate extension? If not, could an admin make the conversion? --Psychoslave (talk) 09:30, 25 November 2016 (UTC)
- It was attempted in June by User:Amire80, but reverted a few days later by User:Shirayuki. Apparently it's harder than just letting an extension throw in 16K of weird comments. Anomie (talk) 14:18, 28 November 2016 (UTC)
- It is very hard to prepare a huge page for translation. --Shirayuki (talk) 15:09, 28 November 2016 (UTC)
- The page should be split into multiple pages at first. --Shirayuki (talk) 22:04, 29 November 2016 (UTC)
- Well I can see the link, so I guess that someone made something but I just can see that in the history. So thank to whoever made something. :) --Psychoslave (talk) 07:06, 30 November 2016 (UTC)
- Hmm, in fact it seems that only a very small part was actually translatable with the current translation markup… --Psychoslave (talk) 08:09, 30 November 2016 (UTC)
- So having read some docs, I think the problem is that most of the text is not within a <translate> anchor. I can add them, then some translation admin might validate the changes. I'll wrap the code too, as I see no reason not to translate it. That is, the part that can be translated at least. The documentation should be adapted to what might concern the readership: having "你好,世界!" instead of "Hello, world!" is not a problem. However, identifiers, for example, currently can't use Unicode. So translating the source code only where possible helps to emphasize what is not translatable. Feedback is welcome on this point of view. --Psychoslave (talk) 10:47, 30 November 2016 (UTC)
- Some administrator was too quick to validate the changes when most of the markup is still absent. As a result, most of the already manually translated Russian version disappeared. Do you know of some bot to add the markup automatically and move information from previous versions of the page to Translate chunks? Ignatus (talk) 11:13, 30 November 2016 (UTC)
- After reading all the documentation I found on the matter, I would say "no" to both your requests: there is no automatic way to produce sliced <translate> chunks. One might wrap the whole page in a single anchor, but this would probably provide awful chunks to translators. Regarding existing translations, they have to be manually migrated into the new system; see the migration process documentation for more information about that. But as far as I understand, to re-import the whole translation, all the original source needs to be marked up and validated by a translation admin. Regarding the tag part, this is a work in progress. --Psychoslave (talk) 13:15, 30 November 2016 (UTC)
- Done. Maybe Shirayuki might have a look and mark the document for translation if everything seems fine. After that, it should be possible to migrate previous translations and make new ones with the help of the Translate extension. --Psychoslave (talk) 13:59, 30 November 2016 (UTC)
- @Psychoslave: Why does something like math.atan2( y, x ) need to be wrapped in translate tags? The "math" is a Lua table name that's not going to work if you turn it into some other language, and the "atan2" is a function name that's similarly not going to work if altered. Do the "y" and "x" need to be translated? The wrapping of mw.allToString( ... ) in translate tags is even worse, are you going to somehow translate the ellipsis? Anomie (talk) 14:01, 30 November 2016 (UTC)
- Whether the x and y should be translated is up to translators. There are other code samples where allowing translation might seem more interesting, like next( table, key ): table and key should be translatable in my humble opinion (but translators should be aware that identifiers should stay pure ASCII). So it is more consistent to make it translatable everywhere, the final decision being left to translators. From such a perspective, it's fine to remove from the translation list code chunks that only contain reserved keywords, like the example you give with the ellipsis. Most of the translate tags around code tags were added with a regexp, so this kind of case wasn't caught and removed in the process. --Psychoslave (talk) 14:45, 30 November 2016 (UTC)
- The page should be split into multiple pages before translation. I taught by example, but you cannot prepare for translation. --Shirayuki (talk) 14:28, 30 November 2016 (UTC)
- Why would you like to see the page split? Please provide a more explicit explanation of the expected benefits. There were already several translations based on the single page, made without the Translate extension, and now there is a tagged version for it. If there are specific problems with the proposed version, please detail what those problems are so they can be fixed, or fix them directly. --Psychoslave (talk) 15:00, 30 November 2016 (UTC)
- The page is too huge to be prepared for translation at once. Untranslatable parts should be excluded finely for translators and translation admins. --Shirayuki (talk) 22:26, 30 November 2016 (UTC)
- Ok, well, I haven't made any split so far, but I did make progress on adding tvar, and some other tricks. The current state of my progress is here, so my changes don't conflict with the main page. Please let me know if you can spot more problems to resolve in this version. I already know that I still have to convert some links to tvar, but I'm not sure which ones should be tvar-ed as "url" and which ones should give the opportunity to link to translated pages. Namely, these links are:
- subst
- MediaWiki-namespace messages
- comments
- UTF-8
- double-precision floating-point value
- E notation
- NaN
- closures
- order of operations
- syntactic sugar
- lexical closures
- syntactic sugar
- tail calls
- pseudocode
- E notation
- pseudocode
- seed
- regular expressions
- PCRE
- palindrome
- $wgExpensiveParserFunctionLimit
- substed
- strip marker
- Language code
- IETF language tags
- Extension:CLDR
- wiki's local time
- Extension:ParserFunctions
- Help:Magic words#Localization
- translatewiki:FAQ#PLURAL
- Help:Magic words#Localisation
- Help:Magic words#Language-dependent word conversions
- translatewiki:Grammar
- $wgScriptPath
- $wgServer
- $wgSitename
- $wgStylePath
- $wgDisableCounters
- interwiki
- protocol-relative
- transcludable
- scary transclusion
- $wgExtraInterlanguageLinkPrefixes
- HTML entities
- strip marker
- MediaWiki:comma-separator
- MediaWiki:and
- MediaWiki:word-separator
- HTML entities
- MediaWiki:ellipsis
- strip marker
- MIME type
- expensive function count
- Percent-encodes
- Percent-decodes
- Normalization Form C
- Normalization Form D
- Unicode character properties
- bitwise operations
- bitwise AND
- bitwise complement
- bitwise OR
- bitwise XOR
- shifted
- logical shift
- shifted
- logical shift
- arithmetic shift
- rotated
- Wikibase Client
- Wikidata
- Extension:Wikibase Client/Lua
- ScribuntoExternalLibraries
- ScribuntoExternalLibraryPaths
- UnitTestsList hook
- UnitTestsList hook
- Jenkins
- w:Lua (programming language)
- I don't need guidance about each one, but knowing what I should do for each class of link would help me. --Psychoslave (talk) 00:28, 3 December 2016 (UTC)
- Hi Shirayuki, would you please take a look at this stub, and tell me whether, apart from links of the form [[:en:…]] and [[w:…]], the links seem fine to you. Also, how should these two kinds of links be prepared for translation? I think it would be better if the link led to the localized Wikipedia article if it exists, but I don't know how to do that. --Psychoslave (talk) 11:06, 6 December 2016 (UTC)
- Ok, well, I didn't made any split so far, but I did progress on the adding tvar, and some other tricks. The current state of my progress is here, so my changes don't conflict with the main page. Please let me know if you can grab more problems to resolve on this version, I already know that I yet have to treat some links to tvar, but I'm not sure which one should be tvar-ed as "url" and which one should give opportunity to link to some translated pages. Namely this links are :
- The page is too huge to be prepared for translation at once. Untranslatable parts should be excluded finely for translators and translation admins. --Shirayuki (talk) 22:26, 30 November 2016 (UTC)
- Why would you like to see the page split? Please provide more explicit explanations on expected benefits. There was already several translations based on the single page realized without the Translate extension, and now there is a tagged version for it. If there are specific problems with the proposed version, please detail what those problem are so it can be fixed, or fix them directly. --Psychoslave (talk) 15:00, 30 November 2016 (UTC)
- @Psychoslave: Why does something like
- Done Maybe Shirayuki might have a look and mark the document for translation if everything seems fine. After that, it should be possible to migrate previous translations and make new ones with the help of the Translation extension. --Psychoslave (talk) 13:59, 30 November 2016 (UTC)
- After reading all the documentation I found on the matter, I would say "no" to both of your requests: there is no automatic way to have sliced
- About [[#getmetatable|getmetatable()]], why did you make getmetatable() translatable? --Shirayuki (talk) 11:55, 6 December 2016 (UTC)
- Most tvars were added with a regexp. I didn't specifically want getmetatable to be translatable; it's just that, in the general case, the label of the link should be translated. --Psychoslave (talk) 15:33, 6 December 2016 (UTC)
- Shirayuki, I replaced that and other similar cases with a tvar embedding the whole link. I also did some cleanup for other cases where, to my mind, there is really nothing translatable. Please let me know if there are further improvements I could make in this regard. Also, could you suggest something for links to the English Wikipedia like [[:en:Arithmetic shift|arithmetic shift]] and [[w:Lua (programming language)]]? --Psychoslave (talk) 08:36, 7 December 2016 (UTC)
- devunt, -revi, 80686, Addshore, Amire80, Az1568, BPositive, Base, Bawolff, Bene*, Brackenheim, BrionVIBBER, Chrumps, DannyB., Dereckson, DiegoGrez-Cañete, ElandZhou, Elitre, Eloquence, FabriceFlorin, Fabsouza1, FeelUs, Florianschmidtwelzow, GeorgeBarnick, Glaisher, Grind24, Guillaume, He7d3r, Hooman, IAlex, IoannisKydonis, JackPhoenix, Jackmcbarn, Jdforrester, Jean-Frédéric, Jianhui67, Johan, JohnBroughton, JohnVandenberg, Jsoby, Kaganer, Keegan, Krinkle, LeaLacroix, Legoktm, M4tx, MZMcBride, Macofe, MaxSem, Mdennis, MegaAlex, Melamrawy, Mglaser, Nemobis, Nikerabbit, Peachey88, Petrb, PiRSquared17, Pyfisch, Quiddity, Quiddity, Reedy, RunabWMF, Rundl132, SPage, SVG, Seb35, Shirayuki, Siebrand, Skizzerz, Steinsplitter, Stemoc, Tbayer, Tgr, Thehelpfulone, Trizek, Tropicalkitty, Valhallasw, Varnent, Vogone, Wargo, Whatamidoing, ^demon, Ата are welcome to review and give feedback on this preparation for translation and, if it is fine as is, to move its content into the current help article and mark it for translation. Also, if there is an easier way to contact members of a group on the wiki, please let me know. :) --Psychoslave (talk) 14:32, 8 December 2016 (UTC)
- If someone insists on splitting this page, please transclude the subpages back into this main page to preserve the ability to load one page and search it with Ctrl+F. Anomie (talk) 14:39, 1 December 2016 (UTC)
- This looks like a good compromise. I was about to propose that. --Ciencia Al Poder (talk) 10:25, 5 January 2017 (UTC)