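This manual documents Lua as it is used in MediaWiki with the Scribunto extension. Some parts are derived from the Lua 5.1 reference manual, which is available under the MIT license.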

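Introduction

Getting started

On a MediaWiki wiki with Scribunto enabled, create a page with a title starting with "Module:", for example "Module:Bananas". Into this new page, copy the following text:

local p = {} -- p stands for package

function p.hello( frame )
    return "Hello, world!"
end

return p

Save that, then on another non-module page, write: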

{{#invoke:Bananas|hello}}

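In this call, replace "Bananas" with whatever you named your module. This will call the "hello" function exported from that module. The {{#invoke:Bananas|hello}} will be replaced with the output of the function, in this case "Hello, world!"

It is generally a good idea to invoke Lua code from the context of a template. This means that, from the perspective of a calling page, the syntax is independent of whether the template logic is implemented in Lua or in wikitext. It also avoids introducing additional complex syntax into the content pages of a wiki.

Module structure

The module itself must return a Lua table containing the functions that may be called by {{#invoke:}}. Generally, as shown above, a local variable is declared holding a table, functions are added to this table, and the table is returned at the end of the module code.

Any functions that are not added to this table, whether local or global, will not be accessible by {{#invoke:}}, but globals might be accessible from other modules loaded using require(). It is generally good style for the module to declare all functions and variables local.

Accessing parameters from wikitext

Functions called by {{#invoke:}} are passed a single parameter, a frame object. To access the parameters passed to the {{#invoke:}}, code will typically use the args table of that frame object. It is also possible to access the parameters passed to the template containing the {{#invoke:}} by using frame:getParent() and accessing that frame's args.

The frame object is also used to access context-specific features of the wikitext parser, such as calling parser functions, expanding templates, and expanding arbitrary wikitext strings.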
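The two ways of reading parameters described above can be sketched as follows. This is a minimal illustration only: the function name greet and the stand-in frame tables are hypothetical, and the stand-ins exist solely so the code can be tried outside MediaWiki (Scribunto supplies the real frame object).

```lua
local p = {}

-- Hypothetical entry point, e.g. {{#invoke:Bananas|greet|World}}
function p.greet( frame )
    -- Parameters written in the {{#invoke:}} call itself
    local name = frame.args[1]
    if name == nil or name == '' then
        -- Fall back to the parameters of the template containing the {{#invoke:}}
        local parent = frame:getParent()
        name = parent and parent.args[1] or 'world'
    end
    return 'Hello, ' .. name .. '!'
end

-- Stand-in frame objects (assumptions for illustration, not Scribunto's real frames)
local parentFrame = { args = { 'template caller' } }
local directFrame = { args = { 'World' }, getParent = function () return parentFrame end }
local emptyFrame = { args = {}, getParent = function () return parentFrame end }

local direct = p.greet( directFrame )     -- uses frame.args
local inherited = p.greet( emptyFrame )   -- uses the parent frame's args

-- A real module would end with: return p
```

Checking the direct arguments first and the parent frame second lets the same function work whether it is invoked directly or through a wrapper template.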

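Returning text

The module function should usually return a single string; whatever values are returned will be passed through tostring() and then concatenated with no separator. This string is incorporated into the wikitext as the result of the {{#invoke:}}.

At this point in the page parse, templates have already been expanded, parser functions and extension tags have already been processed, and pre-save transforms (e.g. signature expansion and the pipe trick) have already happened. Therefore the module cannot use these features in its output text. For example, if a module returns "Hello, [[world]]! {{welcome}}", the page will read "Hello, world! {{welcome}}".

On the other hand, subst is handled at an earlier stage of processing, so with {{subst:#invoke:}} only other attempted substitutions in the output will be processed. Since a failed substitution remains in the wikitext, it will then be processed on the next edit. This should generally be avoided.

Module documentation

Scribunto allows modules to be documented by automatically associating the module with a wikitext documentation page; by default, the "/doc" subpage of the module is used for this purpose and is transcluded above the module source code on the module page. For example, the documentation for "Module:Bananas" would be at "Module:Bananas/doc".

This can be configured using the following MediaWiki-namespace messages:

  • scribunto-doc-page-name: Sets the name of the page used for documentation. The name of the module (without the Module: prefix) is passed as $1. If in the module namespace, the pages specified here will be interpreted as wikitext rather than Lua source and may not be used with {{#invoke:}}. The default is "Module:$1/doc", i.e. the /doc subpage of the module. Note that parser functions and other brace expansion may not be used in this message.
  • scribunto-doc-page-does-not-exist: Message displayed when the documentation page does not exist. The name of the page is passed as $1. The default is empty.
  • scribunto-doc-page-show: Message displayed when the documentation page exists. The name of the page is passed as $1. The default is to transclude the documentation page.
  • scribunto-doc-page-header: Header displayed when viewing the documentation page itself. The name of the module (with the Module: prefix) being documented is passed as $1. The default simply displays a short explanation in italics.

Note that modules cannot be directly categorized and cannot directly have interwiki links added. These may be placed on the documentation page inside <includeonly>...</includeonly> tags, where they will be applied to the module when the documentation page is transcluded onto the module page.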

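Lua language

Tokens

Names (also called identifiers) in Lua can be any string of letters, digits, and underscores, not beginning with a digit. Names are case-sensitive; "foo", "Foo", and "FOO" are all different names.

The following keywords are reserved and may not be used as names: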

  • and
  • break
  • do
  • else
  • elseif
  • end
  • false
  • for
  • function
  • if
  • in
  • local
  • nil
  • not
  • or
  • repeat
  • return
  • then
  • true
  • until
  • while

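Names starting with an underscore followed by uppercase letters are reserved for internal Lua global variables.

The following are other tokens, which likewise cannot be used as names: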

  • #
  • %
  • (
  • )
  • *
  • +
  • ,
  • -
  • .
  • ..
  • ...
  • /
  • :
  • ;
  • <
  • <=
  • =
  • ==
  • >
  • >=
  • [
  • ]
  • ^
  • {
  • }
  • ~=

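Comments

A comment starts with a -- anywhere outside of a string. If the -- is immediately followed by an opening long bracket, the comment continues to the corresponding closing long bracket; otherwise the comment runs to the end of the current line.

-- A comment in Lua starts with a double hyphen and runs to the end of the line.
--[[ Multi-line strings and comments
     are delimited with double square brackets. ]]
--[=[ Comments like this can have other --[[comments]] nested. ]=]
--[==[ Comments like this can have other
      --[===[ long --[=[comments]=] --nested
        ]===] multiple times, even if all of them
      --[[ are not delimited with matching long brackets! ]===]
  ]==]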

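Data types

Lua is a dynamically typed language, which means that variables and function arguments have no type, only the values assigned to them. All values carry a type.

Lua has eight basic types, but only six are relevant to the Scribunto extension. The type() function returns the type of a value.

The tostring() function converts a value to a string. The tonumber() function converts a value to a number if possible, and otherwise returns nil. There are no explicit functions for converting a value to other data types.

Numbers are automatically converted to strings when used where a string is expected, e.g. with the concatenation operator. Strings recognized by tonumber() are automatically converted to numbers when used with arithmetic operators. When a value is used where a boolean is expected, all values other than nil and false are considered to be true.

nil

"nil" is the data type of nil, which exists to represent the absence of a value.

Nil may not be used as a key in a table, and there is no difference between a table field being unspecified and having nil as its value.

When converted to a string, the result is "nil". When converted to boolean, nil is considered false.

boolean

Boolean values are true and false.

When converted to a string, the result is "true" or "false".

Unlike many other languages, boolean values may not be directly converted to numbers. And unlike many other languages, only false and nil are considered false for boolean conversion; the number 0 and the empty string are both considered true.

string

Lua strings are considered a series of 8-bit bytes; it is up to the application to interpret them in any particular encoding.

String literals may be delimited by either single or double quotes (' or "); like JavaScript and unlike PHP, there is no difference between the two. The following escape sequences are recognized: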

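  • \a (bell, byte 7)
  • \b (backspace, byte 8)
  • \t (horizontal tab, byte 9)
  • \n (newline, byte 10)
  • \v (vertical tab, byte 11)
  • \f (form feed, byte 12)
  • \r (carriage return, byte 13)
  • \" (double quote, byte 34)
  • \' (single quote, byte 39)
  • \\ (backslash, byte 92)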

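A literal newline may also be included in a string by preceding it with a backslash. Bytes may also be specified using an escape sequence '\ddd', where ddd is the decimal value of the byte in the range 0–255. To include Unicode characters using escape sequences, the individual bytes of the UTF-8 encoding must be specified; in general, it is more straightforward to enter the Unicode characters directly.

Literal strings can also be defined using long brackets. An opening long bracket consists of an opening square bracket, followed by zero or more equals signs, followed by another opening square bracket, e.g. [[, [=[, or [=====[. The opening long bracket must be matched by the corresponding closing long bracket with the same number of equals signs, e.g. ]], ]=], or ]=====]. As a special case, if an opening long bracket is immediately followed by a newline, that newline is not included in the string, but a newline just before the closing long bracket is kept. Strings delimited by long brackets do not interpret escape sequences.

-- This long string
foo = [[
bar\tbaz
]]

-- is equivalent to this quote-delimited string
foo = 'bar\\tbaz\n'

Note that all strings are considered true when converted to boolean. This is unlike most other languages, where the empty string is usually considered false.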

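number

Lua has only one numeric type, which is typically a double-precision floating-point value. In this format, integers between -9007199254740992 and 9007199254740992 can be represented exactly, while larger numbers and numbers with a fractional part may suffer from round-off error.

Number constants are written using a period (.) as the decimal separator, e.g. 123456.78. Numbers may also be written in E notation without spaces, e.g. 1.23e-10, 123.45e20, or 1.23E5. Integers may also be written in hexadecimal notation using a 0x prefix, e.g. 0x3A.

Although NaN and positive and negative infinity are correctly stored and handled, Lua does not provide corresponding literals. The constant math.huge is positive infinity, as is a division such as 1/0, and a division such as 0/0 may be used to generate a NaN.

Note that all numbers are considered true when converted to boolean. This is unlike most other languages, where the number 0 is usually considered false. When converted to a string, finite numbers are represented in decimal, possibly in E notation; NaN is "nan" or "-nan"; and infinities are "inf" or "-inf".

table

Lua tables are associative arrays, much like PHP arrays and JavaScript objects.

Tables are created using curly braces. The empty table is {}. To populate fields on creation, a comma- and/or semicolon-separated list of field specifiers may be included in the braces. These take any of several forms:

  • [expression1] = expression2 uses the (first) value of expression1 as the key and the (first) value of expression2 as the value.
  • name = expression is equivalent to ["name"] = expression
  • expression is roughly equivalent to [i] = expression, where i is a positive integer starting at 1 and incrementing with each field specifier of this form. If this is the last field and the expression has multiple values, all values are used; otherwise only the first is kept.

The fields in a table are accessed using bracket notation, e.g. table[key]. String keys that are also valid names may also be accessed using dot notation, e.g. table.key is equivalent to table['key']. Calling a function that is a value in the table may use colon notation, e.g. table:func( ... ), which is equivalent to table['func']( table, ... ) or table.func( table, ... ).

A sequence is a table with non-nil values for all positive integers from 1 to N and no value (nil) for all positive integers greater than N. Many Lua functions operate only on sequences, and ignore keys that are not positive integers.

Unlike many other languages such as PHP or JavaScript, any value except nil and NaN may be used as a key, and no type conversion is performed. These are all valid and distinct: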

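-- Create table
t = {}
t["foo"] = "foo"
t.bar = "bar"
t[1] = "one"
t[2] = "two"
t[3] = "three"
t[12] = "the number twelve"
t["12"] = "the string twelve"
t[true] = "true"
t[tonumber] = "yes, even functions may be table keys"
t[t] = "yes, a table may be a table key too. Even in itself."

-- This creates a table roughly equivalent to the above
t2 = {
    foo = "foo",
    bar = "bar",
    "one",
    "two",
    [12] = "the number twelve",
    ["12"] = "the string twelve",
    "three",
    [true] = "true",
    [tonumber] = "yes, even functions may be table keys",
}
t2[t2] = "yes, a table may be a table key too. Even in itself."

Similarly, any value except nil may be stored as a value in a table. Storing nil is equivalent to deleting the key from the table, and accessing any key that has not been set results in a nil value.

Note that tables are never implicitly copied in Lua; if a table is passed as an argument to a function and the function changes the keys or values in the table, those changes will be visible in the caller.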
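A short sketch of this reference behavior, using only standard Lua. The helper names append and shallowCopy are made up for illustration; when an independent table is wanted, a copy has to be made explicitly.

```lua
-- Appends a value to the table it is given; no copy is made
local function append( list, value )
    list[#list + 1] = value
end

local fruits = { 'apple', 'banana' }
local alias = fruits            -- a second reference to the same table
append( alias, 'cherry' )       -- the change is visible through fruits as well

-- An explicit shallow copy is needed to get an independent table
local function shallowCopy( t )
    local copy = {}
    for k, v in pairs( t ) do
        copy[k] = v
    end
    return copy
end

local copy = shallowCopy( fruits )
append( copy, 'date' )          -- fruits is not affected by this
```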

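When converted to a string, the usual result is "table", but this may be overridden using the __tostring metamethod. Even the empty table is considered true as a boolean.

function

Functions in Lua are first-class values: they may be created anonymously, passed as arguments, assigned to variables, and so on.

Functions are created using the function keyword, and called using parentheses. Syntactic sugar is available for named functions, local functions, and functions that act like member functions of a table. See Function declarations and Function calls below for details.

Lua functions are closures, meaning that they maintain a reference to the scope in which they are declared and can access and manipulate variables in that scope.

Like tables, if a function is assigned to a different variable or passed as an argument to another function, it is still the same underlying "function object" that will be called.

When converted to a string, the result is "function".

Unsupported types

The userdata type is used to hold opaque values for extensions to Lua written in other languages; for example, a userdata might be used to hold a C pointer or struct. No use of userdata is allowed in the Scribunto environment.

The thread data type represents the handles for coroutines, which are not available in Scribunto's sandbox.

Metatables

Every table may have an associated table known as a metatable. The fields of the metatable are used by some operators and functions to specify different or fallback behavior for the table. The metatable of a table may be accessed using the getmetatable() function, and set with the setmetatable() function.

When accessed for their metamethods, metatable fields are accessed as if with rawget().

The following metatable fields affect the table itself:

__index
This is used when a table access t[key] would return nil. If the value of this field is a table, the access is repeated in that table, i.e. __index[key] (which may invoke that table's metatable's __index). If the value of this field is a function, the function is called as __index( t, key ). The rawget() function bypasses this metamethod.
__newindex
This is used when assigning a key to a table, t[key] = value, where rawget( t, key ) would return nil. If the value of this field is a table, the assignment is repeated in that table instead, i.e. __newindex[key] = value (which may invoke that table's metatable's __newindex). If the value of this field is a function, the function is called as __newindex( t, key, value ). The rawset() function bypasses this metamethod.
__call
This is used when function call syntax is used on a table, t( ··· ). The value must be a function, which is called as something like __call( t, ··· ).
__mode
This is used to make tables holding weak references. The value must be a string. By default, any value that is used as a key or as a value in a table will not be garbage collected. But if this field contains the letter 'k', keys may be garbage collected if there are no non-weak references to them, and if it contains 'v', values may be; in either case, both the corresponding key and value are removed from the table. Note that behavior is undefined if this field is altered after the table is used as a metatable.

Other metatable fields are covered elsewhere in this manual: the arithmetic operators (__add, __sub, __mul, __div, __mod, __pow, __unm), the relational operators (__eq, __lt, __le), __concat, __tostring, __pairs, __ipairs, and __metatable*.

For binary operators, Lua first checks the left operand's metatable (if any) and then the right operand's when looking for a metamethod to use.
For relational operators, the metamethod is only used if the same function is specified in both operands' metatables. Different anonymous functions, even with identical body and closure, are never considered the same.
* __metatable affects both getmetatable() and setmetatable()

Note: In Lua, all strings also share a single metatable, in which __index refers to the string table. This metatable is not accessible in Scribunto, nor is the string table it references; the string table available to modules is a copy.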
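As an illustration of the __index and __call fields described above, here is a minimal sketch in standard Lua. The table names (defaults, banana, squares, counter) are made up for the example.

```lua
local defaults = { color = 'yellow', count = 1 }

-- __index as a table: missing keys are looked up in defaults
local banana = setmetatable( { count = 3 }, { __index = defaults } )

-- __index as a function: called as __index( t, key ) for missing keys
local squares = setmetatable( {}, {
    __index = function ( t, key )
        return key * key
    end
} )

-- __call: lets the table be used with function call syntax
local counter = setmetatable( { n = 0 }, {
    __call = function ( self, step )
        self.n = self.n + ( step or 1 )
        return self.n
    end
} )
counter()       -- n is now 1
counter( 4 )    -- n is now 5
```

Note that rawget( banana, 'color' ) still returns nil, since the value lives in defaults and rawget() bypasses __index.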

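Variables

Variables are places that store values. There are three kinds of variables in Lua: global variables, local variables, and table fields.

A name represents a global or local variable (or a function argument, which is just a kind of local variable). Variables are assumed to be global unless explicitly declared as local using the keyword local. Any variable that has not been assigned a value is considered to have a nil value.

Global variables are stored in a standard Lua table called an environment; this table is often available as the global variable _G. It is possible to set a metatable for this global variable table; the __index and __newindex metamethods will be called for accesses of and assignments to global variables just as they would for fields in any other table.

The environment for a function may be accessed using the getfenv() function and changed using the setfenv() function; in Scribunto, these functions are severely restricted if they are available at all.

Local variables are lexically scoped; see Local variable declarations for details.

Expressions

An expression is something that has values: literals (numbers, strings, true, false, nil), anonymous function declarations, table constructors, variable references, function calls, the vararg expression, expressions wrapped in parentheses, unary operators applied to expressions, and expressions combined with binary operators.

Most expressions have one value; function calls and the vararg expression can have any number of values. Note that wrapping a function call or vararg expression in parentheses discards all except the first value.

Expression lists are comma-separated lists of expressions. All except the last expression are forced to one value (dropping additional values, or using nil if the expression has no values); all values from the last expression are included in the values of the expression list.

Arithmetic operators

Lua supports the following arithmetic operators: addition, subtraction, multiplication, division, modulo, exponentiation, and negation.

When all operands are numbers or strings for which tonumber() returns non-nil, the operations have their usual meaning.

If either operand is a table with an appropriate metamethod, the metamethod will be called.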

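Operator  Function        Example  Metamethod  Notes
+         Addition        a + b    __add
-         Subtraction     a - b    __sub
*         Multiplication  a * b    __mul
/         Division        a / b    __div       Division by zero is not an error; NaN or infinity is returned
%         Modulo          a % b    __mod       Defined as a % b == a - math.floor( a / b ) * b
^         Exponentiation  a ^ b    __pow       Non-integer exponents are allowed
-         Negation        -a       __unm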
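The definition of the modulo operator and the behavior of division by zero can be checked with a few lines of standard Lua. The variable names below are illustrative only; math.fmod is included for contrast because it rounds the quotient towards zero instead of using math.floor.

```lua
local a, b = -7, 3

local viaOperator = a % b                         -- result takes the sign of b
local viaFormula = a - math.floor( a / b ) * b    -- the definition of %
local viaFmod = math.fmod( a, b )                 -- quotient rounded towards zero

-- Division by zero does not raise an error
local inf = 1 / 0
local nan = 0 / 0
local nanIsNotItself = ( nan ~= nan )             -- NaN is the only value not equal to itself
```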

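Relational operators

The relational operators in Lua are ==, ~=, <, >, <=, and >=. The result of a relational operator is always a boolean.

Equality (==) first compares the types of its operands; if they are different, the result is false. Then it compares the values: nil, boolean, number, and string are compared in the expected manner. Functions are equal if they refer to the exact same function object; function() end == function() end returns false, as it compares two different anonymous functions. Tables are by default compared in the same manner as functions, but this may be changed using the __eq metamethod.

Inequality (~=) is the exact negation of equality.

For the ordering operators, if both operands are numbers or both are strings, they are compared directly. Otherwise, metamethods are checked:

  • a < b uses __lt
  • a <= b uses __le if available, or if __lt is available then it is considered equivalent to not ( b < a )
  • a > b is considered equivalent to b < a
  • a >= b is considered equivalent to b <= a

If the necessary metamethods are not available, an error is raised.

Logical operators

The logical operators are and, or, and not. All use the standard interpretation where nil and false are considered false and anything else is considered true.

For and, if the left operand is considered false then it is returned and the right operand is not evaluated; otherwise the right operand is returned.

For or, if the left operand is considered true then it is returned and the right operand is not evaluated; otherwise the right operand is returned.

For not, the result is always true or false.

Note that and and or short circuit. For example, foo() or bar() will only call bar() if foo() returns false or nil as its first value.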
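Because and and or return one of their operands rather than a plain boolean, they are commonly used for default values and guarded access. The following is a small sketch of those idioms in standard Lua; the function names are made up for illustration.

```lua
-- Default value: the right operand is used only when the left is nil or false
local function greet( name )
    name = name or 'world'
    return 'Hello, ' .. name
end

-- Guarded access: t.field is not evaluated when t is nil
local function getField( t )
    return t and t.field
end

-- The "a and b or c" idiom selects b or c; it only works when b is not nil or false
local function sign( x )
    return x < 0 and 'negative' or 'non-negative'
end

-- 0 and '' are considered true, so they are NOT replaced by the default
local zero = 0 or 42
local empty = '' or 'default'
```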

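Concatenation operator

The concatenation operator is two dots, used as a .. b. If both operands are numbers or strings, they are converted to strings and concatenated. Otherwise, if a __concat metamethod is available, it is used. If no usable metamethod is found, an error is raised.

Note that Lua strings are immutable and Lua does not provide any sort of "string builder", so a loop that repeatedly does a = a .. b has to create a new string on each iteration and eventually garbage-collect the old strings. If many strings need concatenating, it may be faster to use string.format(), or to insert all the strings into a sequence and use table.concat() at the end.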
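The table.concat() approach mentioned above can be sketched like this. The function name numberList is hypothetical; the pieces are collected in a sequence and joined once at the end instead of growing a string on every iteration.

```lua
-- Builds "1, 2, ..., n" by collecting pieces and joining them once
local function numberList( n )
    local parts = {}
    for i = 1, n do
        parts[#parts + 1] = tostring( i )
    end
    return table.concat( parts, ', ' )
end

local result = numberList( 5 )
```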

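Length operator

The length operator is #, used as #a. If a is a string, it returns the length in bytes. If a is a sequence table, it returns the length of the sequence.

If a is a table that is not a sequence, #a may return any value N such that a[N] is not nil and a[N+1] is nil, even if there are non-nil values at higher indexes. For example,

-- This is not a sequence, because a[3] is nil and a[4] is not
a = { 1, 2, nil, 4 }

-- This may output either 2 or 4.
-- And this may change even if the table is not modified.
mw.log( #a )

Operator precedence

Lua's operator precedence, from highest to lowest, is:

  1. ^
  2. not # - (negation)
  3. * / %
  4. + - (subtraction)
  5. ..
  6. < > <= >= ~= ==
  7. and
  8. or

Within a precedence level, most binary operators are left-associative, i.e. a / b / c is interpreted as (a / b) / c. Exponentiation and concatenation are right-associative, i.e. a ^ b ^ c is interpreted as a ^ (b ^ c).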

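Function calls

Lua function calls look like those in most other languages: a name followed by a list of arguments in parentheses:

func( expression-list )

As is usual with expression lists in Lua, the last expression in the list may supply multiple argument values.

If the function is called with fewer values in the expression list than there are arguments in the function definition, the extra arguments will have a nil value. If the expression list contains more values than there are arguments, the excess values are discarded. It is also possible for a function to take a variable number of arguments; see Function declarations for details.

Lua also allows calling a function return value directly, i.e. func()(). If an expression more complex than a variable access is needed to determine the function to be called, a parenthesized expression may be used in place of the variable access.

Lua has syntactic sugar for two common cases. The first is when a table is being used as an object, and the function is to be called as a method on the object. The syntax

table:name( expression-list )

is exactly equivalent to

table.name( table, expression-list )

The second common case is Lua's method of implementing named arguments by passing a table containing the name-to-value mappings as the only positional argument to the function. In this case, the parentheses around the call may be omitted. This also works if the function is called with a single literal string (delimited by quotes or long brackets). For example, the calls

func{ arg1 = exp, arg2 = exp }
func"string"

are equivalent to

func( { arg1 = exp, arg2 = exp } )
func( "string" )

These may be combined; the following calls are equivalent:

table:name{ arg1 = exp, arg2 = exp }
table.name( table, { arg1 = exp, arg2 = exp } )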

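Function declarations

The syntax for function declaration looks like this:

function ( var-list )
    block
end

All variables in var-list are local to the function, with values assigned from the expression list in the function call. Additional local variables may be declared inside the block.

When the function is called, the statements in block are executed after local variables corresponding to var-list are created and assigned values. If a return statement is reached, the block is exited and the values of the function call expression are those given by the return statement. If execution reaches the end of the function's block without encountering a return statement, the result of the function call expression has zero values.

Lua functions are lexical closures. A common idiom is to declare "private static" variables as locals in the scope where the function is declared. For example,

-- This returns a function that adds a number to its argument
function makeAdder( n )
    return function( x )
        -- The variable n from the outer scope is available here to be added to x
        return x + n
    end
end

local add5 = makeAdder( 5 )
mw.log( add5( 6 ) )
-- prints 11

A function may be declared to accept a variable number of arguments, by specifying ... as the final item in the var-list:

function ( var-list, ... )
    block
end

Within the block, the varargs expression ... may be used, with the result being all the extra values in the function call. For example,

local join = function ( separator, ... )
    -- get the extra arguments as a table
    local args = { ... }
    -- get the count of extra arguments, correctly
    local n = select( '#', ... )
    return table.concat( args, separator, 1, n )
end

join( ', ', 'foo', 'bar', 'baz' )
-- returns the string "foo, bar, baz"

The select() function is designed to work with the varargs expression; in particular, select( '#', ... ) should be used instead of #{ ... } to count the number of values in the varargs expression, because { ... } may not be a sequence.

Lua provides syntactic sugar to combine function declaration and assignment to a variable; see Function declaration statements for details.

Note that this will not work:

local factorial = function ( n )
    if n <= 2 then
        return n
    else
        return n * factorial( n - 1 )
    end
end

Since the function declaration is processed before the local variable assignment statement is complete, "factorial" inside the function body refers to the (probably undefined) variable of that name outside the function. This problem may be avoided by declaring the local variable first and then assigning it in a subsequent statement, or by using the function declaration statement syntax.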

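Statements

A statement is the basic unit of execution: one assignment, control structure, function call, variable declaration, etc.

A chunk is a sequence of statements, optionally separated by semicolons. A chunk is basically considered the body of an anonymous function, so it can declare local variables, receive arguments, and return values.

A block is also a sequence of statements, just like a chunk. A block can be delimited to create a single statement: do block end. These are typically used to limit the scope of local variables, or to add a return or break in the middle of another block.

Assignments

variable-list = expression-list

The variable-list is a comma-separated list of variables; the expression-list is a comma-separated list of one or more expressions. All expressions are evaluated before any assignments are performed, so a, b = b, a will swap the values of a and b.

Local variable declarations

local variable-list

local variable-list = expression-list

Local variables may be declared anywhere within a block or chunk. The first form, without an expression list, declares the variables but does not assign a value, so all the variables have nil as their value. The second form assigns values to the local variables, as described in Assignments above.

Note that visibility of the local variable begins with the statement after the local variable declaration. So a declaration like local x = x declares a local variable x and assigns it the value of x from the outer scope. The local variable remains in scope until the end of the innermost block containing the local variable declaration.

Control structures

while exp do block end

The while statement repeats a block as long as an expression evaluates to a true value.

repeat block until exp

The repeat statement repeats a block until an expression evaluates to a true value. Local variables declared inside the block may be accessed in the expression.

for name = exp1, exp2, exp3 do block end
for name = exp1, exp2 do block end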

This first form of the for loop will declare a local variable, and repeat the block for values from exp1 to exp2 adding exp3 on each iteration. Note that exp3 may be omitted entirely, in which case 1 is used, but non-numeric values such as nil and false are an error. All expressions are evaluated once before the loop is started.

This form of the for loop is roughly equivalent to

 do
     local var, limit, step = tonumber( exp1 ), tonumber( exp2 ), tonumber( exp3 )
     if not ( var and limit and step ) then
         error()
     end
     while ( step > 0 and var <= limit ) or ( step <= 0 and var >= limit ) do
         local name = var
         block
         var = var + step
     end
 end

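except that the variables var, limit, and step are not accessible anywhere else. Note that the variable name is local to the block; to use its value after the loop, it must be copied to a variable declared outside the loop.

for var-list in expression-list do block end

The second form of the for loop works with iterator functions. As in the first form, the expression-list is evaluated only once before beginning the loop.

This form of the for loop is roughly equivalent to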

 do
     local func, static, var = expression-list
     while true do
         local var-list = func( static, var )
         var = var1  -- var1 is the first variable in var-list
         if var == nil then
             break
         end
         block
     end
 end

except that again the variables func, static, and var are not accessible anywhere else. Note that the variables in var-list are local to the block; to use them after the loop, they must be copied to variables declared outside the loop.

Often the expression-list is a single function call that returns the three values. If the iterator function can be written so it only depends on the parameters passed into it, that would be the most efficient. If not, Programming in Lua suggests that a closure be preferred to returning a table as the static variable and updating its members on each iteration.
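The two iterator styles mentioned above can be sketched as follows in standard Lua. The function names evens and reversed are made up for the example: the first iterator depends only on the values passed to it on each call, while the second keeps its state in a closure.

```lua
-- Stateless iterator: depends only on the static value (limit)
-- and the previous control value passed in by the for loop
local function evensIter( limit, previous )
    local nextValue = previous + 2
    if nextValue <= limit then
        return nextValue
    end
end

local function evens( limit )
    return evensIter, limit, 0
end

-- Closure-based iterator: the position is kept in the upvalue i
local function reversed( t )
    local i = #t + 1
    return function ()
        i = i - 1
        if i >= 1 then
            return i, t[i]
        end
    end
end

local collected = {}
for n in evens( 6 ) do
    collected[#collected + 1] = n
end

local order = {}
for i, v in reversed( { 'a', 'b', 'c' } ) do
    order[#order + 1] = v
end
```

In both cases the loop ends when the iterator returns nil (or nothing) as its first value.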

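if exp1 then block1 elseif exp2 then block2 else block3 end

Executes block1 if exp1 returns true, otherwise executes block2 if exp2 returns true, and block3 otherwise. The else block3 portion may be omitted, and the elseif exp2 then block2 portion may be repeated or omitted as necessary.

return expression-list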

The return statement is used to return values from a function or a chunk (which is just a function). The expression-list is a comma-separated list of zero or more expressions.

Lua implements tail calls: if expression-list consists of exactly one expression which is a function call, the current stack frame will be reused for the call to that function. This has implication for functions that deal with the call stack, such as getfenv() and debug.traceback().

The return statement must be the last statement in its block. If for some reason a return is needed in the middle of a block, an explicit block do return end may be used.

break

The break statement is used to terminate the execution of a while, repeat, or for loop, skipping to the next statement after the loop.

The break statement must be the last statement in its block. If for some reason a break is needed in the middle of a block, an explicit block do break end may be used.

Function calls as statements

A function call may be used as a statement; in this case, the function is being called only for any side effects it may have (e.g. mw.log() logs values) and any return values are discarded.

Function declaration statements

Lua provides syntactic sugar to make declaring a function and assigning it to a variable more natural. The following pairs of declarations are equivalent

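-- Basic declaration
function func( var-list ) block end
func = function ( var-list ) block end
-- Local function
local function func( var-list ) block end
local func; func = function ( var-list ) block end
-- Function as a field in a table
function table.func( var-list ) block end
table.func = function ( var-list ) block end
-- Function as a method in a table
function table:func( var-list ) block end
table.func = function ( self, var-list ) block end

Note that the colon notation here parallels the colon notation for function calls, adding an implicit argument named "self" at the beginning of the argument list.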
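A minimal sketch of the colon notation in use, written in standard Lua. The Account table and its functions are hypothetical names chosen for illustration; the metatable with __index is one common way to share methods between objects, not the only one.

```lua
local Account = {}
Account.__index = Account

-- Dot syntax: an ordinary function stored in the table
function Account.new( balance )
    return setmetatable( { balance = balance }, Account )
end

-- Colon syntax: "self" is added as an implicit first argument
function Account:deposit( amount )
    self.balance = self.balance + amount
    return self.balance
end

local acct = Account.new( 100 )
acct:deposit( 50 )              -- colon call passes acct as self
Account.deposit( acct, 25 )     -- equivalent explicit form
```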

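Error handling

Errors may be "thrown" using the error() and assert() functions. To "catch" errors, use pcall() or xpcall(). Note that certain internal Scribunto errors cannot be caught in Lua code.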
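Throwing and catching can be sketched like this, using only standard Lua functions. The helper parsePositive is a made-up example; passing level 0 to error() is used here so that the message is returned without a location prefix.

```lua
-- Hypothetical helper that raises an error for bad input
local function parsePositive( text )
    local n = tonumber( text )
    if not n or n <= 0 then
        -- Level 0 omits the location information from the message
        error( 'expected a positive number, got "' .. tostring( text ) .. '"', 0 )
    end
    return n
end

-- pcall returns true plus the results, or false plus the error message
local ok1, value = pcall( parsePositive, '42' )
local ok2, err = pcall( parsePositive, 'abc' )

-- assert raises an error when its first argument is nil or false
local ok3, err3 = pcall( function ()
    return assert( tonumber( 'xyz' ), 'not a number' )
end )
```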

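Garbage collection

Lua performs automatic memory management. This means that you have to worry neither about allocating memory for new objects nor about freeing it when the objects are no longer needed. Lua manages memory by running a garbage collector from time to time to collect all dead objects (that is, objects that are no longer accessible from Lua) and objects that are only reachable via weak references. All memory used by Lua is subject to automatic management: tables, functions, strings, etc.

Garbage collection happens automatically, and cannot be configured from within Scribunto.

Standard libraries

The standard Lua libraries provide essential services and performance-critical functions to Lua. Only those portions of the standard libraries that are available in Scribunto are documented here.

Basic functions

_G

This variable holds a reference to the current global variable table; the global variable foo may also be accessed as _G.foo. Note, however, that there is nothing special about _G itself; it may be reassigned in the same manner as any other variable: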

foo = 1
mw.log( foo ) -- logs "1"
_G.foo = 2
mw.log( foo ) -- logs "2"
_G = {}       -- _G no longer points to the global variable table
_G.foo = 3
mw.log( foo ) -- still logs "2"

The global variable table may be used just like any other table. For example,

-- Call a function whose name is stored in a variable
_G[var]()

-- Log the names and stringified values of all global variables
for k, v in pairs( _G ) do
   mw.log( k, v )
end

-- Log the creation of new global variables
setmetatable( _G, {
    __newindex = function ( t, k, v )
         mw.log( "Creation of new global variable '" .. k .. "'" )
         rawset( t, k, v )
    end
} )

_VERSION

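A string containing the running version of the Lua language, e.g. "Lua 5.1".

assert

assert( v, message, ... )

If v is nil or false, issues an error. In this case, message is used as the text of the error: if it is nil (or unspecified), the text is "assertion failed!"; if it is a string or number, the text is that value; otherwise assert itself will raise an error.

If v is any other value, assert returns all arguments, including v and message.

A somewhat common idiom in Lua is for a function to return a "true" value in normal operation, and on failure to return nil or false as the first value and an error message as the second value. Easy error checking can then be implemented by wrapping the call in a call to assert: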

-- This doesn't check for errors
local result1, result2, etc = func( ... )

-- This works the same, but does check for errors
local result1, result2, etc = assert( func( ... ) )

error

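error( message, level )

Issues an error, with text message.

error normally adds some information about the location of the error. If level is 1 or omitted, that information is the location of the call to error itself; 2 uses the location of the call of the function that called error; and so on. Passing 0 omits inclusion of the location information.

getfenv

getfenv( f )

Note that this function may not be available, depending on allowEnvFuncs in the engine configuration.

Returns an execution environment (global variable table), as specified by f:

  • If 1, nil, or unspecified, returns the environment of the function calling getfenv. Often this will be the same as _G.
  • Integers 2–10 return the environment of functions higher in the call stack. For example, 2 returns the environment of the function that called the current function, 3 returns the environment of the function calling that function, and so on. An error is raised if the value is higher than the number of function calls in the stack, or if the targeted stack level returned with a tail call.
  • Passing a function returns the environment that will be used when that function is called.

The environments used by all standard library functions and Scribunto library functions are protected. Attempting to access these environments using getfenv will return nil instead.

getmetatable

getmetatable( table )

Returns the metatable of a table. Any other type will return nil.

If the metatable has a __metatable field, that value is returned instead of the actual metatable.

ipairs

ipairs( t )

Returns three values: an iterator function, the table t, and 0. This is intended for use in the iterator form of for:

for i, v in ipairs( t ) do
    block
end

This will iterate over the pairs ( 1, t[1] ), ( 2, t[2] ), and so on, stopping when t[i] would be nil.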

The standard behavior may be overridden by providing an __ipairs metamethod. If that metamethod exists, the call to ipairs will return the three values returned by __ipairs( t ) instead.

next

next( table, key )

This allows for iterating over the keys in a table. If key is nil or unspecified, returns the "first" key in the table and its value; otherwise, it returns the "next" key and its value. When no more keys are available, returns nil. It is possible to check whether a table is empty using the expression next( t ) == nil.

Note that the order in which the keys are returned is not specified, even for tables with numeric indexes. To traverse a table in numerical order, use a numerical for or ipairs.

Behavior is undefined if, when using next for traversal, any non-existing key is assigned a value. Assigning a new value (including nil) to an existing field is allowed.

pairs

pairs( t )

Returns three values: an iterator function (next or a work-alike), the table t, and nil. This is intended for use in the iterator form of for:

for k, v in pairs( t ) do
    -- process each key-value pair
end

This will iterate over the key-value pairs in t just as next would; see the documentation for next for restrictions on modifying the table during traversal.

The standard behavior may be overridden by providing a __pairs metamethod. If that metamethod exists, the call to pairs will return the three values returned by __pairs( t ) instead.

pcall

pcall( f, ... )

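This calls the function f with the given arguments in protected mode. This means that if an error is raised during the call to f, pcall will return false and the error message raised. If no error occurs, pcall will return true and all values returned by the call.

In pseudocode, pcall might be defined something like this:

function pcall( f, ... )
    try
        return true, f( ... )
    catch ( message )
        return false, message
    end
end

rawequal

rawequal( a, b )

This is equivalent to a == b except that it ignores any __eq metamethod.

rawget

rawget( table, k )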

This is equivalent to table[k] except that it ignores any __index metamethod.

rawset

rawset( table, k, v )

This is equivalent to table[k] = v except that it ignores any __newindex metamethod.

select

select( index, ... )

If index is a number, returns the arguments in ... from that position onward (for example, select( 2, 'a', 'b', 'c' ) returns 'b' and 'c'). If index is the string '#', returns the count of arguments in ....

In other words, select is something roughly like the following except that it will work correctly even when ... contains nil values (see documentation for # and unpack for the problem with nils).

function select( index, ... )
    local t = { ... }
    if index == '#' then
        return #t
    else
        return unpack( t, index )
    end
end
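To show why select matters when nil values are involved, here is a small sketch in standard Lua. The helper name pack is made up for this example; it records the argument count separately so that trailing nils are not lost.

```lua
-- Collects the arguments in a table and returns the true argument count
local function pack( ... )
    local n = select( '#', ... )
    local t = {}
    for i = 1, n do
        -- select( i, ... ) returns argument i and everything after it;
        -- the extra parentheses keep only the first of those values
        t[i] = ( select( i, ... ) )
    end
    return t, n
end

local values, count = pack( 'a', nil, 'c', nil )
local tail1, tail2 = select( 2, 'x', 'y', 'z' )
```

Here count is 4 even though two of the arguments are nil, which is exactly the information that #{ ... } cannot reliably provide.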

setmetatable

setmetatable( table, metatable )

Sets the metatable of a table. metatable may be nil, but must be explicitly provided.

If the current metatable has a __metatable field, setmetatable will throw an error.

tonumber

tonumber( value, base )

Tries to convert value to a number. If it is already a number or a string convertible to a number, then tonumber returns this number; otherwise, it returns nil.

The optional base (default 10) specifies the base to interpret the numeral. The base may be any integer between 2 and 36, inclusive. In bases above 10, the letter 'A' (in either upper or lower case) represents 10, 'B' represents 11, and so forth, with 'Z' representing 35.

In base 10, the value may have a decimal part, be expressed in E notation, and may have a leading "0x" to indicate base 16. In other bases, only unsigned integers are accepted.
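A few conversions illustrating the rules above, as a short sketch in standard Lua (the variable names are illustrative only):

```lua
local fromHex = tonumber( 'ff', 16 )        -- letters stand for digits above 9
local fromBinary = tonumber( '1010', 2 )
local fromBase36 = tonumber( 'Z', 36 )      -- 'Z' represents 35
local decimal = tonumber( '1.5e2' )         -- E notation is accepted in base 10
local prefixed = tonumber( '0x10' )         -- leading "0x" indicates base 16
local invalid = tonumber( 'banana' )        -- not convertible: nil
local badDigit = tonumber( '8', 8 )         -- 8 is not a valid octal digit: nil
```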

tostring

tostring( value )

Converts value to a string. See Data types above for details on how each type is converted.

The standard behavior for tables may be overridden by providing a __tostring metamethod. If that metamethod exists, the call to tostring will return the single value returned by __tostring( value ) instead.

type

type( value )

Returns the type of value as a string: "nil", "number", "string", "boolean", "table", or "function".

unpack

unpack( table, i, j )

Returns values from the given table, something like table[i], table[i+1], ···, table[j] would do if written out manually. If nil or not given, i defaults to 1 and j defaults to #table.

Note that results are not deterministic if table is not a sequence and j is nil or unspecified; see Length operator for details.

xpcall

xpcall( f, errhandler )

This is much like pcall, except that the error message is passed to the function errhandler before being returned.

In pseudocode, xpcall might be defined something like this:

function xpcall( f, errhandler )
    try
        return true, f()
    catch ( message )
        message = errhandler( message )
        return false, message
    end
end

Debug library

debug.traceback

debug.traceback( message, level )

Returns a string with a traceback of the call stack. An optional message string is appended at the beginning of the traceback. An optional level number tells at which stack level to start the traceback.

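Math library

math.abs

math.abs( x )

Returns the absolute value of x.

math.acos

math.acos( x )

Returns the arc cosine of x (given in radians).

math.asin

math.asin( x )

Returns the arc sine of x (given in radians).

math.atan

math.atan( x )

Returns the arc tangent of x (given in radians).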

math.atan2

math.atan2( y, x )

Returns the arc tangent of y/x (given in radians), using the signs of both parameters to find the quadrant of the result.

math.ceil

math.ceil( x )

Returns the smallest integer larger than or equal to x.

math.cos

math.cos( x )

Returns the cosine of x (given in radians).

math.cosh

math.cosh( x )

Returns the hyperbolic cosine of x.

math.deg

math.deg( x )

Returns the angle x (given in radians) in degrees.

math.exp

math.exp( x )

Returns the value $e^x$.

math.floor

math.floor( x )

Returns the largest integer smaller than or equal to x.

math.fmod

math.fmod( x, y )

Returns the remainder of the division of x by y that rounds the quotient towards zero. For example, math.fmod( 10, 3 ) yields 1.

math.frexp

math.frexp( x )

Returns two values m and e such that:

  • If x is finite and non-zero: $x = m \times 2^e$, e is an integer, and the absolute value of m is in the range $[0.5, 1)$
  • If x is zero: m and e are 0
  • If x is NaN or infinite: m is x and e is not specified

math.huge

The value representing positive infinity; larger than or equal to any other numerical value.

math.ldexp

math.ldexp( m, e )

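Returns $m \times 2^e$ (e should be an integer).

math.log

math.log( x )

Returns the natural logarithm of x.

math.log10

math.log10( x )

Returns the base-10 logarithm of x.

math.max

math.max( x, ... )

Returns the maximum value among its arguments.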

Behavior with NaNs is not specified. With the current implementation, NaN will be returned if x is NaN, but any other NaNs will be ignored.

math.min

math.min( x, ... )

Returns the minimum value among its arguments.

Behavior with NaNs is not specified. With the current implementation, NaN will be returned if x is NaN, but any other NaNs will be ignored.

math.modf

math.modf( x )

Returns two numbers, the integral part of x and the fractional part of x. For example, math.modf( 1.25 ) yields 1, 0.25.

math.pi

The value of $\pi$.

math.pow

math.pow( x, y )

Equivalent to x^y.

math.rad

math.rad( x )

Returns the angle x (given in degrees) in radians.

math.random

math.random( m, n )

Returns a pseudo-random number.

The arguments m and n may be omitted, but if specified must be convertible to integers.

  • With no arguments, returns a real number in the range $[0, 1)$
  • With one argument, returns an integer in the range $[1, m]$
  • With two arguments, returns an integer in the range $[m, n]$

math.randomseed

math.randomseed( x )

Sets x as the seed for the pseudo-random generator.

Note that using the same seed will cause math.random to output the same sequence of numbers.

math.sin

math.sin( x )

Returns the sine of x (given in radians).

math.sinh

math.sinh( x )

Returns the hyperbolic sine of x.

math.sqrt

math.sqrt( x )

Returns the square root of x. Equivalent to x^0.5.

math.tan

math.tan( x )

Returns the tangent of x (given in radians).

math.tanh

math.tanh( x )

Returns the hyperbolic tangent of x.

Operating system library

os.clock

os.clock()

Returns an approximation of the amount of CPU time used by the program, in seconds.

os.date

os.date( format, time )

Language library's formatDate may be used for more comprehensive date formatting

Returns a string or a table containing date and time, formatted according to format. If the format is omitted or nil, "%c" is used.

If time is given, it is the time to be formatted (see os.time()). Otherwise the current time is used.

If format starts with '!', then the date is formatted in UTC rather than the server's local time. After this optional character, if format is the string "*t", then date returns a table with the following fields:

  • year (full)
  • month (1–12)
  • day (1–31)
  • hour (0–23)
  • min (0–59)
  • sec (0–60)
  • wday (weekday, Sunday is 1)
  • yday (day of the year)
  • isdst (daylight saving flag, a boolean; may be absent if the information is not available)

If format is not "*t", then date returns the date as a string, formatted according to the same rules as the C function strftime.

os.difftime

os.difftime( t2, t1 )

Returns the number of seconds from t1 to t2.

os.time

os.time( table )

Returns a number representing the current time.

When called without arguments, returns the current time. If passed a table, the time encoded in the table will be parsed. The table must have the fields "year", "month", and "day", and may also include "hour" (default 12), "min" (default 0), "sec" (default 0), and "isdst".

Package library

require

require( modulename )

Loads the specified module.

First, it looks in package.loaded[modulename] to see if the module is already loaded. If so, returns package.loaded[modulename].

Otherwise, it calls each loader in the package.loaders sequence to attempt to find a loader for the module. If a loader is found, that loader is called. The value returned by the loader is stored into package.loaded[modulename] and is returned.

See the documentation for package.loaders for information on the loaders available.

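For example, if you have a module "Module:Giving" containing the following code:

local p = {}

p.someDataValue = 'Hello!'

return p

You can load this module in another module with the following code:

local giving = require( "Module:Giving" )

local value = giving.someDataValue -- value is now equal to 'Hello!'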

package.loaded

This table holds the loaded modules. The keys are the module names, and the values are the values returned when the module was loaded.

package.loaders

This table holds the sequence of searcher functions to use when loading modules. Each searcher function is called with a single argument, the name of the module to load. If the module is found, the searcher must return a function that will actually load the module and return the value to be returned by require. Otherwise, it must return nil.

Scribunto provides two searchers:

  1. Look in package.preload[modulename] for the loader function
  2. Look in the modules provided with Scribunto for the module name, and if that fails look in the Module: namespace. The "Module:" prefix must be provided.

Note that the standard Lua loaders are not included.

package.preload

This table holds loader functions, used by the first searcher Scribunto includes in package.loaders.

package.seeall

package.seeall( table )

Sets the __index metamethod for table to _G.

String library

In all string functions, the first character is at position 1, not position 0 as in C, PHP, and JavaScript. Indexes may be negative, in which case they count from the end of the string: position -1 is the last character in the string, -2 is the second-last, and so on.

Warning: The string library assumes one-byte character encodings. It cannot handle Unicode characters. To operate on Unicode strings, use the corresponding methods in the Scribunto Ustring library.

string.byte

string.byte( s, i, j )

If the string is considered as an array of bytes, returns the byte values for s[i], s[i+1], ···, s[j]. The default value for i is 1; the default value for j is i. Identical to mw.ustring.byte().

string.char

string.char( ... )

Receives zero or more integers. Returns a string with length equal to the number of arguments, in which each character has the byte value equal to its corresponding argument.

local value = string.char( 0x48, 0x65, 0x6c, 0x6c, 0x6f, 0x21 ) -- value is now 'Hello!'

See mw.ustring.char() for a similar function that uses Unicode codepoints rather than byte values.

string.find

string.find( s, pattern, init, plain )

Looks for the first match of pattern in the string s. If it finds a match, then find returns the offsets in s where this occurrence starts and ends; otherwise, it returns nil. If the pattern has captures, then in a successful match the captured values are also returned after the two indices.

A third, optional numerical argument init specifies where to start the search; its default value is 1 and can be negative. A value of true as a fourth, optional argument plain turns off the pattern matching facilities, so the function does a plain "find substring" operation, with no characters in pattern being considered "magic".

Note that if plain is given, then init must be given as well.

See mw.ustring.find() for a similar function extended as described in Ustring patterns and where the init offset is in characters rather than bytes.
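The behavior of the init and plain arguments can be illustrated with a short sketch in standard Lua. The sample string is made up for the example, and all offsets are byte positions.

```lua
local s = 'The price is $5.00 (approx.)'

-- Pattern search: returns the start and end offsets of the first match
local first, last = string.find( s, 'price' )

-- Captures are returned after the two offsets
local b, e, dollars, cents = string.find( s, '(%d+)%.(%d+)' )

-- A negative init counts from the end of the string
local fromEnd = string.find( s, 'p', -9 )

-- plain = true: '$' and '.' are matched literally (init must be given too)
local plainStart, plainEnd = string.find( s, '$5.', 1, true )

-- No match returns nil
local missing = string.find( s, 'euro' )
```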

string.format

string.format( formatstring, ... )

Returns a formatted version of its variable number of arguments following the description given in its first argument (which must be a string).

The format string uses a limited subset of the printf format specifiers:

  • Recognized flags are '-', '+', ' ', '#', and '0'.
  • Integer field widths up to 99 are supported. '*' is not supported.
  • Integer precisions up to 99 are supported. '*' is not supported.
  • Length modifiers are not supported.
  • Recognized conversion specifiers are 'c', 'd', 'i', 'o', 'u', 'x', 'X', 'e', 'E', 'f', 'g', 'G', 's', '%', and the non-standard 'q'.
  • Positional specifiers (e.g. "%2$s") are not supported.

The conversion specifier 'q' is like 's', but formats the string in a form suitable to be safely read back by the Lua interpreter: the string is written between double quotes, and all double quotes, newlines, embedded zeros, and backslashes in the string are correctly escaped when written.

Conversion between strings and numbers is performed as specified in Data types; other types are not automatically converted to strings. Strings containing NUL characters (byte value 0) are not properly handled.

Identical to mw.ustring.format().
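
The following sketch exercises a few of the flags, widths, precisions, and conversion specifiers listed above; the values are arbitrary examples:

```lua
local padded  = string.format( "%03d", 7 )         -- zero-padded to width 3
local fixed   = string.format( "%5.2f", 3.14159 )  -- width 5, two decimals
local hex     = string.format( "%x", 255 )         -- lowercase hexadecimal
local left    = string.format( "%-5s|", "ab" )     -- left-justified in width 5
local quoted  = string.format( "%q", 'say "hi"' )  -- quoted so Lua can read it back
local percent = string.format( "%d%%", 50 )        -- "%%" gives a literal percent sign
```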

string.gmatch

string.gmatch( s, pattern )

Returns an iterator function that, each time it is called, returns the next captures from pattern over string s. If pattern specifies no captures, then the whole match is produced in each call.

For this function, a '^' at the start of a pattern is not magic, as this would prevent the iteration. It is treated as a literal character.

See mw.ustring.gmatch() for a similar function for which the pattern is extended as described in Ustring patterns.
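
A short sketch of the iterator in a generic for loop, first without captures (the whole match is produced) and then with two captures; the input strings are made up for illustration:

```lua
-- No captures: each iteration yields the whole match.
local words = {}
for word in string.gmatch( "one two  three", "%a+" ) do
    words[#words + 1] = word
end

-- Two captures: each iteration yields both captured values.
local settings = {}
for key, value in string.gmatch( "width=10, height=20", "(%w+)=(%w+)" ) do
    settings[key] = value
end
```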

string.gsub

string.gsub( s, pattern, repl, n )

Returns a copy of s in which all (or the first n, if given) occurrences of the pattern have been replaced by a replacement string specified by repl, which can be a string, a table, or a function. gsub also returns, as its second value, the total number of matches that occurred.

If repl is a string, then its value is used for replacement. The character % works as an escape character: any sequence in repl of the form %n, with n between 1 and 9, stands for the value of the n-th captured substring. The sequence %0 stands for the whole match, and the sequence %% stands for a single %.

If repl is a table, then the table is queried for every match, using the first capture as the key; if the pattern specifies no captures, then the whole match is used as the key.

If repl is a function, then this function is called every time a match occurs, with all captured substrings passed as arguments, in order; if the pattern specifies no captures, then the whole match is passed as a sole argument.

If the value returned by the table query or by the function call is a string or a number, then it is used as the replacement string; otherwise, if it is false or nil, then there is no replacement (that is, the original match is kept in the string).

See mw.ustring.gsub() for a similar function in which the pattern is extended as described in Ustring patterns.
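
The three kinds of repl, and the optional limit n, can be sketched as follows (sample data is hypothetical). Note that the returned count includes matches for which no replacement was made:

```lua
-- String replacement: %1 and %2 refer to the captures.
local swapped, swapCount = string.gsub( "hello world", "(%w+) (%w+)", "%2 %1" )

-- Table replacement: the first capture is the key; a missing key keeps the match.
local filled, fillCount = string.gsub(
    "$name is $age; $unknown stays",
    "%$(%w+)",
    { name = "Ann", age = 30 }
)

-- Function replacement: the returned number is used as the replacement.
local doubled = string.gsub( "1 2 3", "%d", function ( d )
    return tonumber( d ) * 2
end )

-- Limit the number of replacements with n.
local limited, limitedCount = string.gsub( "aaa", "a", "b", 2 )
```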

string.len

string.len( s )

Returns the length of the string, in bytes. Embedded ASCII NUL characters are counted like any other byte. Equivalent to #s.

See mw.ustring.len() for a similar function using Unicode codepoints rather than bytes.
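
To illustrate that the length is counted in bytes, the sketch below writes the UTF-8 encoding of "é" with decimal escapes so that it does not depend on the encoding of the source file:

```lua
local ascii = string.len( "Hello" )        -- five ASCII letters, five bytes
local withNul = string.len( "a\0b" )       -- the embedded zero byte is counted
local eAcute = "\195\169"                  -- the two UTF-8 bytes of "é"
local byteLength = string.len( eAcute )    -- bytes, not characters
local sameAsOperator = ( #eAcute == byteLength )
```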

string.lower

string.lower( s )

Returns a copy of this string with all ASCII uppercase letters changed to lowercase. All other characters are left unchanged.

See mw.ustring.lower() for a similar function in which all characters with uppercase to lowercase definitions in Unicode are converted.

string.match

string.match( s, pattern, init )

Looks for the first match of pattern in the string. If it finds one, then match returns the captures from the pattern; otherwise it returns nil. If pattern specifies no captures, then the whole match is returned.

A third, optional numerical argument init specifies where to start the search; its default value is 1 and can be negative.

See mw.ustring.match() for a similar function in which the pattern is extended as described in Ustring patterns and the init offset is in characters rather than bytes.
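
A brief sketch of string.match with captures, without captures, and with a positive or negative init; the strings are arbitrary examples:

```lua
-- With captures, the captured values are returned ("%-" is a literal hyphen).
local year, month, day = string.match( "2024-05-17", "(%d+)%-(%d+)%-(%d+)" )

-- Without captures, the whole match is returned.
local whole = string.match( "abc123", "%d+" )

-- init = 5 starts the search at the fifth byte.
local fromFive = string.match( "abc123", "%d+", 5 )

-- A negative init counts from the end of the string.
local fromEnd = string.match( "abc123", "%d+", -1 )

-- No match yields nil.
local none = string.match( "abc", "%d+" )
```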

string.rep

string.rep( s, n )

Returns a string that is the concatenation of n copies of the string s. Identical to mw.ustring.rep().

string.reverse

string.reverse( s )

Returns a string that is the string s reversed (bytewise).

string.sub

string.sub( s, i, j )

Returns the substring of s that starts at i and continues until j; i and j can be negative. If j is nil or omitted, -1 is used.

In particular, the call string.sub(s,1,j) returns a prefix of s with length j, and string.sub(s, -i) returns a suffix of s with length i.

See mw.ustring.sub() for a similar function in which the offsets are characters rather than bytes.
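
The prefix and suffix idioms mentioned above, plus two calls with negative indices, as a small sketch:

```lua
local text = "Hello, world"

local prefix = string.sub( text, 1, 5 )   -- first five bytes
local suffix = string.sub( text, -5 )     -- last five bytes (j defaults to -1)
local middle = string.sub( text, 8, -1 )  -- from byte 8 to the end
local inner  = string.sub( text, 2, -2 )  -- drop the first and last byte
```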

string.upper

string.upper( s )

Returns a copy of this string with all ASCII lowercase letters changed to uppercase. All other characters are left unchanged.

See mw.ustring.upper() for a similar function in which all characters with lowercase to uppercase definitions in Unicode are converted.

Patterns

Note that Lua's patterns are similar to regular expressions, but are not identical. In particular, note the following differences from regular expressions and PCRE:

  • The quoting character is percent (%), not backslash (\).
  • Dot (.) always matches all characters, including newlines.
  • No case-insensitive mode.
  • No alternation (the | operator).
  • Quantifiers (*, +, ?, and -) may only be applied to individual characters or character classes, not to capture groups.
  • The only non-greedy quantifier is -, which is equivalent to PCRE's *? quantifier.
  • No generalized finite quantifier (e.g. the {n,m} quantifier in PCRE).
  • The only zero-width assertions are ^, $, and the %f[set] "frontier" pattern; assertions such as PCRE's \b or (?=···) are not present.
  • Patterns themselves do not recognize character escapes such as '\ddd'. However, since patterns are strings, these sorts of escapes may be used in the string literals used to create the pattern string.

Also note that a pattern cannot contain embedded zero bytes (ASCII NUL, "\0"). Use %z instead.

Also see Ustring patterns for a similar pattern-matching scheme using Unicode characters.
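
Some of the differences listed above can be seen in a short sketch (the sample strings are invented): the greedy "*" versus the non-greedy "-", the dot matching a newline, and "%" as the quoting character.

```lua
local html = "<b>x</b><b>y</b>"

-- "*" is greedy: the capture runs up to the last "</b>".
local greedy = string.match( html, "<b>(.*)</b>" )

-- "-" is non-greedy: the capture stops at the first "</b>".
local lazy = string.match( html, "<b>(.-)</b>" )

-- The dot matches every character, including a newline.
local dotNewline = string.match( "a\nb", "a.b" )

-- "%" quotes magic characters: "%." is a literal dot, "%%" a literal percent.
local escaped = string.match( "rate 3.5% today", "%d%.%d%%" )
```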

Character class

A character class is used to represent a set of characters. The following combinations are allowed in describing a character class:

  • x: (where x is not one of the magic characters ^$()%.[]*+-?) represents the character x itself.
  • .: (a dot) represents all characters.
  • %a: represents all ASCII letters.
  • %c: represents all ASCII control characters.
  • %d: represents all digits.
  • %l: represents all ASCII lowercase letters.
  • %p: represents all punctuation characters.
  • %s: represents all ASCII space characters.
  • %u: represents all ASCII uppercase letters.
  • %w: represents all ASCII alphanumeric characters.
  • %x: represents all hexadecimal digits.
  • %z: represents ASCII NUL, the zero byte.
  • %A: All characters not in %a.
  • %C: All characters not in %c.
  • %D: All characters not in %d.
  • %L: All characters not in %l.
  • %P: All characters not in %p.
  • %S: All characters not in %s.
  • %U: All characters not in %u.
  • %W: All characters not in %w.
  • %X: All characters not in %x.
  • %Z: All characters not in %z.
  • %x: (where x is any non-alphanumeric character) represents the character x. This is the standard way to escape the magic characters. Any punctuation character (even the non magic) can be preceded by a '%' when used to represent itself in a pattern.
  • [set]: represents the class which is the union of all characters in set. A range of characters can be specified by separating the end characters of the range with a '-'. All classes %x described above can also be used as components in set. All other characters in set represent themselves. For example, [%w_] (or [_%w]) represents all alphanumeric characters plus the underscore, [0-7] represents the octal digits, and [0-7%l%-] represents the octal digits plus the lowercase letters plus the '-' character.

    The interaction between ranges and classes is not defined. Therefore, patterns like [%a-z] or [a-%%] have no meaning.

  • [^set]: represents the complement of set, where set is interpreted as above.
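
A few of the classes and sets above, applied to made-up strings as a sketch:

```lua
-- [%w_] : alphanumeric characters plus the underscore.
local identifier = string.match( "foo_bar-42", "[%w_]+" )

-- [0-7] : a range inside a set.
local octal = string.match( "mode 755x", "[0-7]+" )

-- [^:] : the complement of a set.
local beforeColon = string.match( "key: value", "[^:]+" )

-- %S : everything that is not an ASCII space character.
local nonSpace = string.match( "  trim me", "%S+" )

-- %. : an escaped magic character representing itself.
local literalDot = string.gsub( "1.5", "%.", "," )
```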

Pattern items

A pattern item can be

  • a single character class, which matches any single character in the class;
  • a single character class followed by '*', which matches 0 or more repetitions of characters in the class. These repetition items will always match the longest possible sequence;
  • a single character class followed by '+', which matches 1 or more repetitions of characters in the class. These repetition items will always match the longest possible sequence;
  • a single character class followed by '-', which also matches 0 or more repetitions of characters in the class. Unlike '*', these repetition items will always match the shortest possible sequence;
  • a single character class followed by '?', which matches 0 or 1 occurrence of a character in the class;
  • %n, for n between 1 and 9; such item matches a substring equal to the n-th captured string (see below);
  • %bxy, where x and y are two distinct characters; such item matches strings that start with x, end with y, and where the x and y are balanced. This means that, if one reads the string from left to right, counting +1 for an x and -1 for a y, the ending y is the first y where the count reaches 0. For instance, the item %b() matches expressions with balanced parentheses.
  • %f[set], a frontier pattern; such item matches an empty string at any position such that the next character belongs to set and the previous character does not belong to set. The set set is interpreted as previously described. The beginning and the end of the subject are handled as if they were the character '\0'.
    Note that frontier patterns were present but undocumented in Lua 5.1, and officially added to Lua in 5.2. The implementation in Lua 5.2.1 is unchanged from that in 5.1.0.
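
The less familiar items (%bxy, %f[set], and %n back-references) are easier to follow with a sketch; the inputs below are illustrative only:

```lua
-- %b() matches a balanced pair of parentheses, including nested ones.
local balanced = string.match( "f(a(b)c) d", "%b()" )

-- Frontier patterns act like word boundaries: only the standalone "cat" is replaced.
local replaced, count = string.gsub( "the cat concatenates", "%f[%a]cat%f[%A]", "dog" )

-- %1 matches a repeat of the first capture: here, a doubled character.
local pairStart, pairEnd, letter = string.find( "hello", "(.)%1" )

-- "?" makes the preceding single character optional.
local british = string.match( "colour", "colou?r" )
local american = string.match( "color", "colou?r" )
```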

Pattern

A pattern is a sequence of pattern items.

A '^' at the beginning of a pattern anchors the match at the beginning of the subject string. A '$' at the end of a pattern anchors the match at the end of the subject string. At other positions, '^' and '$' have no special meaning and represent themselves.

Captures

A pattern can contain sub-patterns enclosed in parentheses; they describe captures. When a match succeeds, the substrings of the subject string that match captures are stored ("captured") for future use. Captures are numbered according to their left parentheses. For instance, in the pattern (a*(.)%w(%s*)), the part of the string matching a*(.)%w(%s*) is stored as the first capture (and therefore has number 1); the character matching . is captured with number 2, and the part matching %s* has number 3.

Capture references can appear in the pattern string itself, and refer back to text that was captured earlier in the match. For example, ([a-z])%1 will match any pair of identical lowercase letters, while ([a-z])([a-z])([a-z])[a-z]%3%2%1 will match any 7-letter palindrome.

As a special case, the empty capture () captures the current string position (a number). For instance, if we apply the pattern "()aa()" on the string "flaaap", there will be two captures: 3 and 5.
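
The numbering of nested captures, a back-reference inside the pattern, and the position capture from the "flaaap" example can be sketched as:

```lua
-- Captures are numbered by their left parentheses: the outer group is number 1.
local whole, key, value = string.match( "key = value", "((%w+)%s*=%s*(%w+))" )

-- Empty captures return string positions as numbers.
local startPos, endPos = string.match( "flaaap", "()aa()" )

-- %1 inside the pattern refers back to the first capture: a repeated word.
local repeatedWord = string.match( "it is is fine", "(%a+) %1" )
```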

Table library

Most functions in the table library assume that the table represents a sequence.

The functions table.foreach(), table.foreachi(), and table.getn() may be available but are deprecated. Use a for loop with pairs(), a for loop with ipairs(), and the length operator instead.

table.concat

table.concat( table, sep, i, j )

Given an array where all elements are strings or numbers, returns table[i] .. sep .. table[i+1] ··· sep .. table[j].

The default value for sep is the empty string, the default for i is 1, and the default for j is the length of the table. If i is greater than j, returns the empty string.
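
A sketch of the defaults and of the optional range arguments, using an arbitrary list:

```lua
local list = { "a", "b", "c", "d" }

local joined = table.concat( list, ", " )      -- whole sequence with a separator
local noSep  = table.concat( list )            -- sep defaults to the empty string
local slice  = table.concat( list, "-", 2, 3 ) -- only elements 2 through 3
local empty  = table.concat( list, "-", 3, 2 ) -- i greater than j
local mixed  = table.concat( { 1, "x", 2 } )   -- numbers are allowed as elements
```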

table.insert

table.insert( table, value )
table.insert( table, pos, value )

Inserts element value at position pos in table, shifting up other elements to open space, if necessary. The default value for pos is the length of the table plus 1, so that a call table.insert(t, x) inserts x at the end of table t.

Elements up to #table are shifted; see Length operator for caveats if the table is not a sequence.
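
Both call forms, as a sketch on a small sequence:

```lua
local letters = { "a", "c" }

table.insert( letters, "d" )          -- append: { "a", "c", "d" }
table.insert( letters, 2, "b" )       -- insert at 2: { "a", "b", "c", "d" }
table.insert( letters, 1, "start" )   -- insert at the front, shifting everything up

local result = table.concat( letters, "," )
```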

table.maxn

table.maxn( table )

Returns the largest positive numerical index of the given table, or zero if the table has no positive numerical indices.

To do this, it iterates over the whole table. This is roughly equivalent to

function table.maxn( table )
    local maxn, k = 0, nil
    repeat
        k = next( table, k )
        if type( k ) == 'number' and k > maxn then
            maxn = k
        end
    until not k
    return maxn
end

table.remove

table.remove( table, pos )

Removes from table the element at position pos, shifting down other elements to close the space, if necessary. Returns the value of the removed element. The default value for pos is the length of the table, so that a call table.remove( t ) removes the last element of table t.

Elements up to #table are shifted; see Length operator for caveats if the table is not a sequence.
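
A sketch of removing from the end (the default) and from the front, showing the returned values:

```lua
local items = { "a", "b", "c" }

local last = table.remove( items )       -- removes and returns the last element
local first = table.remove( items, 1 )   -- removes the first element, shifting the rest down

local remaining = table.concat( items, "," )
```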

table.sort

table.sort( table, comp )

Sorts table elements in a given order, in-place, from table[1] to table[#table]. If comp is given, then it must be a function that receives two table elements, and returns true when the first is less than the second (so that not comp(a[i+1],a[i]) will be true after the sort). If comp is not given, then the standard Lua operator < is used instead.

The sort algorithm is not stable; that is, elements considered equal by the given order may have their relative positions changed by the sort.
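
Because the sort is not stable, a comparison function that breaks ties explicitly is one way to get a reproducible order. The sketch below uses hypothetical records with name and age fields:

```lua
local numbers = { 3, 1, 2 }
table.sort( numbers )                        -- default order uses <

local descending = { 3, 1, 2 }
table.sort( descending, function ( a, b )
    return a > b                             -- true when a should come first
end )

local people = {
    { name = "Bo", age = 30 },
    { name = "Al", age = 30 },
    { name = "Cy", age = 25 },
}
table.sort( people, function ( a, b )
    if a.age ~= b.age then
        return a.age < b.age                 -- primary key: age
    end
    return a.name < b.name                   -- tie-breaker: name
end )
```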

Scribunto libraries

All Scribunto libraries are located in the table mw.

Base functions

mw.addWarning

mw.addWarning( text )

Adds a warning which is displayed above the preview when previewing an edit. text is parsed as wikitext.

mw.allToString

mw.allToString( ... )

Calls tostring() on all arguments, then concatenates them with tabs as separators.

mw.clone

mw.clone( value )

Creates a deep copy of a value. All tables (and their metatables) are reconstructed from scratch. Functions are still shared, however.

mw.getCurrentFrame

mw.getCurrentFrame()

Returns the current frame object, typically the frame object from the most recent #invoke.

mw.incrementExpensiveFunctionCount

mw.incrementExpensiveFunctionCount()

Adds one to the "expensive parser function" count, and throws an exception if it exceeds the limit (see $wgExpensiveParserFunctionLimit).

mw.isSubsting

mw.isSubsting()

Returns true if the current #invoke is being substed, false otherwise. See Returning text above for discussion on differences when substing versus not substing.

mw.loadData

mw.loadData( module )

Sometimes a module needs large tables of data; for example, a general-purpose module to convert units of measure might need a large table of recognized units and their conversion factors. And sometimes these modules will be used many times in one page. Parsing the large data table for every {{#invoke:}} can use a significant amount of time. To avoid this issue, mw.loadData() is provided.

mw.loadData works like require(), with the following differences:

  • The loaded module is evaluated only once per page, rather than once per {{#invoke:}} call.
  • The loaded module is not recorded in package.loaded.
  • The value returned from the loaded module must be a table. Other data types are not supported.
  • The returned table (and all subtables) may contain only booleans, numbers, strings, and other tables. Other data types, particularly functions, are not allowed.
  • The returned table (and all subtables) may not have a metatable.
  • All table keys must be booleans, numbers, or strings.
  • The table actually returned by mw.loadData() has metamethods that provide read-only access to the table returned by the module. Since it does not contain the data directly, pairs() and ipairs() will work but other methods, including #value, next(), and the functions in the Table library, will not work correctly.

The hypothetical unit-conversion module mentioned above might store its code in "Module:Convert" and its data in "Module:Convert/data", and "Module:Convert" would use local data = mw.loadData( 'Module:Convert/data' ) to efficiently load the data.

mw.dumpObject

mw.dumpObject( object )

Serializes object to a human-readable representation, then returns the resulting string.

mw.log

mw.log( ... )

Passes the arguments to mw.allToString(), then appends the resulting string to the log buffer.

In the debug console, the function print() is an alias for this function.

mw.logObject

mw.logObject( object )
mw.logObject( object, prefix )

Calls mw.dumpObject() and appends the resulting string to the log buffer. If prefix is given, it will be added to the log buffer followed by an equals sign before the serialized string is appended (i.e. the logged text will be "prefix = object-string").

Frame object

The frame object is the interface to the parameters passed to {{#invoke:}}, and to the parser.

frame.args

A table for accessing the arguments passed to the frame. For example, if a module is called from wikitext with

{{#invoke:module|function|arg1|arg2|name=arg3}}

then frame.args[1] will return "arg1", frame.args[2] will return "arg2", and frame.args['name'] (or frame.args.name) will return "arg3". It is also possible to iterate over arguments using pairs( frame.args ) or ipairs( frame.args ). However, due to how Lua implements table iterators, iterating with pairs() will return the arguments in an unspecified order, and there is no way to know the original order in which they appear in wikitext.

Note that values in this table are always strings; tonumber() may be used to convert them to numbers, if necessary. Keys, however, are numbers even if explicitly supplied in the invocation: {{#invoke:module|function|1|2=2}} gives string values "1" and "2" indexed by numeric keys 1 and 2.

As in MediaWiki template invocations, named arguments will have leading and trailing whitespace removed from both the name and the value before they are passed to Lua, whereas unnamed arguments will not have whitespace stripped.

For performance reasons, frame.args uses a metatable, rather than directly containing the arguments. Argument values are requested from MediaWiki on demand. This means that most other table methods will not work correctly, including #frame.args, next( frame.args ), and the functions in the Table library.

If preprocessor syntax such as template invocations and triple-brace arguments is included within an argument to #invoke, it is not expanded when the argument is passed to Lua; expansion is deferred until the value is requested in Lua. If certain special tags written in XML notation, such as <pre>, <nowiki>, <gallery> and <ref>, are included as arguments to #invoke, then these tags will be converted to "strip markers" (special strings which begin with a delete character, ASCII 127), to be replaced with HTML after they are returned from #invoke.
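
A minimal sketch of reading arguments with these rules in mind. The function name describe and the argument name count are hypothetical; in a real module the function would be a field of the table the module returns. Only indexing of frame.args is used, since most other table operations do not work on it:

```lua
-- Hypothetical helper, written as a local function for illustration.
local function describe( frame )
    local args = frame.args

    -- Unnamed arguments keep their surrounding whitespace, so trim explicitly.
    local first = string.gsub( args[1] or "", "^%s*(.-)%s*$", "%1" )

    -- Values are always strings; convert before doing arithmetic.
    local count = tonumber( args.count ) or 0

    return first .. " x " .. ( count + 1 )
end
```

Invoked as {{#invoke:module|describe| apple |count=2}}, this sketch would be expected to return "apple x 3".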

frame:callParserFunction

frame:callParserFunction( name, args )
frame:callParserFunction( name, ... )
frame:callParserFunction{ name = string, args = table }

Note the use of named arguments.

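Calls a parser function, returning an appropriate string. Whenever possible, native Lua functions or Scribunto library functions should be preferred.

The following calls are approximately equivalent to the indicated wikitext: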

-- {{ns:0}}
frame:callParserFunction{ name = 'ns', args = 0 }

-- {{#tag:nowiki|some text}}
frame:callParserFunction{ name = '#tag', args = { 'nowiki', 'some text' } }
frame:callParserFunction( '#tag', { 'nowiki', 'some text' } )
frame:callParserFunction( '#tag', 'nowiki', 'some text' )
frame:callParserFunction( '#tag:nowiki', 'some text' )

-- {{#tag:ref|some text|name=foo|group=bar}}
frame:callParserFunction{ name = '#tag:ref', args = {
    'some text', name = 'foo', group = 'bar'
} }

Note that, as with frame:expandTemplate(), the function name and arguments are not preprocessed before being passed to the parser function.

frame:expandTemplate

frame:expandTemplate{ title = title, args = table }

Note the use of named arguments.

This is transclusion. The call

frame:expandTemplate{ title = 'template', args = { 'arg1', 'arg2', name = 'arg3' } }

does roughly the same thing from Lua that {{template|arg1|arg2|name=arg3}} does in wikitext. As in transclusion, if the passed title does not contain a namespace prefix it will be assumed to be in the Template: namespace.

Note that the title and arguments are not preprocessed before being passed into the template:

-- This is roughly equivalent to wikitext like {{template|{{!}}}}.
frame:expandTemplate{ title = 'template', args = { '|' } }

-- This is roughly equivalent to wikitext like {{template|{{((}}!{{))}}}}.
frame:expandTemplate{ title = 'template', args = { '{{!}}' } }

frame:extensionTag

frame:extensionTag( name, content, args )
frame:extensionTag{ name = string, content = string, args = table_or_string }

This is equivalent to a call to frame:callParserFunction() with function name '#tag:' .. name and with content prepended to args.

-- These are equivalent
frame:extensionTag{ name = 'ref', content = 'some text', args = { name = 'foo', group = 'bar' } }
frame:extensionTag( 'ref', 'some text', { name = 'foo', group = 'bar' } )

frame:callParserFunction{ name = '#tag:ref', args = {
    'some text', name = 'foo', group = 'bar'
} }

-- These are equivalent
frame:extensionTag{ name = 'ref', content = 'some text', args = 'some other text' }
frame:callParserFunction{ name = '#tag:ref', args = {
    'some text', 'some other text'
} }

frame:getParent

frame:getParent()

Called on the frame created by {{#invoke:}}, returns the frame for the page that called {{#invoke:}}. Called on that frame, returns nil.

For instance, if the template {{Example}} contains the code {{#invoke:ModuleName}}, and a page transcludes that template and supplies arguments to it ({{Example|arg1|arg2}}), calling mw.getCurrentFrame():getParent().args[1], mw.getCurrentFrame():getParent().args[2] in Module:ModuleName will return "arg1", "arg2".
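
One possible way to use this, sketched under the assumption that a module wants to accept an argument either directly from {{#invoke:}} or from the template that wraps it. The helper name getArg is hypothetical:

```lua
-- Hypothetical helper: look in the #invoke frame first, then in the parent frame.
local function getArg( frame, key )
    local value = frame.args[key]
    if value == nil then
        local parent = frame:getParent()
        -- getParent() returns nil when called on the parent frame itself.
        if parent then
            value = parent.args[key]
        end
    end
    return value
end
```

With the {{Example|arg1|arg2}} case above, getArg( frame, 1 ) would fall through to the parent frame and return "arg1".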

frame:getTitle

frame:getTitle()

Returns the title associated with the frame as a string. For the frame created by {{#invoke:}}, this is the title of the module invoked.

frame:newChild

frame:newChild{ title = title, args = table }

Note the use of named arguments.

Create a new Frame object that is a child of the current frame, with optional arguments and title.

This is mainly intended for use in the debug console for testing functions that would normally be called by {{#invoke:}}. The number of frames that may be created at any one time is limited.

frame:preprocess

frame:preprocess( string )
frame:preprocess{ text = string }

This expands wikitext in the context of the frame, i.e. templates, parser functions, and parameters such as {{{1}}} are expanded. Certain special tags written in XML-style notation, such as <pre>, <nowiki>, <gallery> and <ref>, will be replaced with "strip markers" — special strings which begin with a delete character (ASCII 127), to be replaced with HTML after they are returned from #invoke.

If you are expanding a single template, use frame:expandTemplate instead of trying to construct a wikitext string to pass to this method. It's faster and less prone to error if the arguments contain pipe characters or other wikimarkup.

If you are expanding a single parser function, use frame:callParserFunction for the same reasons.

frame:getArgument

frame:getArgument( arg )
frame:getArgument{ name = arg }

Gets an object for the specified argument, or nil if the argument is not provided.

The returned object has one method, object:expand(), that returns the expanded wikitext for the argument.

frame:newParserValue

frame:newParserValue( text )
frame:newParserValue{ text = text }

Returns an object with one method, object:expand(), that returns the result of frame:preprocess( text ).

frame:newTemplateParserValue

frame:newTemplateParserValue{ title = title, args = table }

Note the use of named arguments.

Returns an object with one method, object:expand(), that returns the result of frame:expandTemplate called with the given arguments.

frame:argumentPairs

frame:argumentPairs()

Same as pairs( frame.args ). Included for backwards compatibility.

Hash library

mw.hash.hashValue

mw.hash.hashValue( algo, value )

Hashes a string value with the specified algorithm. Valid algorithms may be fetched using mw.hash.listAlgorithms().

mw.hash.listAlgorithms

mw.hash.listAlgorithms()

Returns a list of supported hashing algorithms, for use in mw.hash.hashValue().

HTML library

mw.html is a fluent interface for building complex HTML from Lua. A mw.html object can be created using mw.html.create.

Functions documented as mw.html.name are available on the global mw.html table; functions documented as mw.html:name and html:name are methods of an mw.html object (see mw.html.create).

A basic example could look like this:

local div = mw.html.create( 'div' )
div
     :attr( 'id', 'testdiv' )
     :css( 'width', '100%' )
     :wikitext( 'Some text' )
     :tag( 'hr' )
return tostring( div )
-- Output: <div id="testdiv" style="width:100%;">Some text<hr /></div>

mw.html.create

mw.html.create( tagName, args )

Creates a new mw.html object containing a tagName HTML element. You can also pass an empty string or nil as tagName in order to create an empty mw.html object.

args can be a table with the following keys:

  • args.selfClosing: Force the current tag to be self-closing, even if mw.html doesn't recognize it as self-closing
  • args.parent: Parent of the current mw.html instance (intended for internal usage)

mw.html:node

html:node( builder )

Appends a child mw.html (builder) node to the current mw.html instance. If a nil parameter is passed, this is a no-op. A (builder) node is a string representation of an html element.

mw.html:wikitext

html:wikitext( ... )

Appends an undetermined number of wikitext strings to the mw.html object.

Note that this stops at the first nil item.

mw.html:newline

html:newline()

Appends a newline to the mw.html object.

mw.html:tag

html:tag( tagName, args )

Appends a new child node with the given tagName to the builder, and returns a mw.html instance representing that new node. The args parameter is identical to that of mw.html.create.

mw.html:attr

html:attr( name, value )
html:attr( table )

Set an HTML attribute with the given name and value on the node. Alternatively a table holding name->value pairs of attributes to set can be passed. In the first form, a value of nil causes any attribute with the given name to be unset if it was previously set.

mw.html:getAttr

html:getAttr( name )

Gets the value of an HTML attribute previously set using html:attr() with the given name.

mw.html:addClass

html:addClass( class )

Adds a class name to the node's class attribute. If a nil parameter is passed, this is a no-op.

mw.html:css

html:css( name, value )
html:css( table )

Set a CSS property with the given name and value on the node. Alternatively a table holding name->value pairs of properties to set can be passed. In the first form, a value of nil causes any property with the given name to be unset if it was previously set.

mw.html:cssText

html:cssText( css )

Adds some raw CSS to the node's style attribute. If a nil parameter is passed, this is a no-op.

mw.html:done

html:done()

Returns the parent node under which the current node was created. Like jQuery.end, this is a convenience function to allow the construction of several child nodes to be chained together into a single statement.

mw.html:allDone

html:allDone()

Like html:done(), but traverses all the way to the root node of the tree and returns it.

Language library

Language codes are described at language code. Many of MediaWiki's language codes are similar to IETF language tags, but not all MediaWiki language codes are valid IETF tags or vice versa.

Functions documented as mw.language.name are available on the global mw.language table; functions documented as mw.language:name and lang:name are methods of a language object (see mw.language.new or mw.language.getContentLanguage).

mw.language.fetchLanguageName

mw.language.fetchLanguageName( code, inLanguage )

The full name of the language for the given language code: native name (language autonym) by default, name translated in target language if a value is given for inLanguage.

mw.language.fetchLanguageNames

mw.language.fetchLanguageNames()
mw.language.fetchLanguageNames( inLanguage )
mw.language.fetchLanguageNames( inLanguage, include )

Fetch the list of languages known to MediaWiki, returning a table mapping language code to language name.

By default the name returned is the language autonym; passing a language code for inLanguage returns all names in that language.

By default, only language names known to MediaWiki are returned; passing 'all' for include will return all available languages (e.g. from Extension:CLDR), while passing 'mwfile' will include only languages having customized messages included with MediaWiki core or enabled extensions. To explicitly select the default, 'mw' may be passed.

mw.language.getContentLanguage

mw.language.getContentLanguage()
mw.getContentLanguage()

Returns a new language object for the wiki's default content language.

mw.language.getFallbacksFor

mw.language.getFallbacksFor( code )

Returns a list of MediaWiki's fallback language codes for the specified code.

mw.language.isKnownLanguageTag

mw.language.isKnownLanguageTag( code )

Returns true if a language code is known to MediaWiki.

A language code is "known" if it is a "valid built-in code" (i.e. it returns true for mw.language.isValidBuiltInCode) and returns a non-empty string for mw.language.fetchLanguageName.

mw.language.isSupportedLanguage

mw.language.isSupportedLanguage( code )

Checks whether any localisation is available for that language code in MediaWiki.

A language code is "supported" if it is a "valid" code (returns true for mw.language.isValidCode), contains no uppercase letters, and has a message file in the currently-running version of MediaWiki.

It is possible for a language code to be "supported" but not "known" (i.e. returning true for mw.language.isKnownLanguageTag). Also note that certain codes are "supported" despite mw.language.isValidBuiltInCode returning false.

mw.language.isValidBuiltInCode

mw.language.isValidBuiltInCode( code )

Returns true if a language code is of a valid form for the purposes of internal customisation of MediaWiki.

The code may not actually correspond to any known language.

A language code is a "valid built-in code" if it is a "valid" code (i.e. it returns true for mw.language.isValidCode); consists of only ASCII letters, numbers, and hyphens; and is at least two characters long.

Note that some codes are "supported" (i.e. returning true from mw.language.isSupportedLanguage) even though this function returns false.

mw.language.isValidCode

mw.language.isValidCode( code )

Returns true if a language code string is of a valid form, whether or not it exists. This includes codes which are used solely for customisation via the MediaWiki namespace.

The code may not actually correspond to any known language.

A language code is valid if it does not contain certain unsafe characters (colons, single or double quotes, slashes, backslashes, angle brackets, ampersands, or ASCII NULs) and is otherwise allowed in a page title.

mw.language.new

mw.language.new( code )
mw.getLanguage( code )

Creates a new language object. Language objects do not have any publicly accessible properties, but they do have several methods, which are documented below.

There is a limit on the number of distinct language codes that may be used on a page. Exceeding this limit will result in errors.

mw.language:getCode

lang:getCode()

Returns the language code for this language object.

mw.language:getFallbackLanguages

lang:getFallbackLanguages()

Returns a list of MediaWiki's fallback language codes for this language object. Equivalent to mw.language.getFallbacksFor( lang:getCode() ).

mw.language:isRTL

lang:isRTL()

Returns true if the language is written right-to-left, false if it is written left-to-right.

mw.language:lc

lang:lc( s )

Converts the string to lowercase, honoring any special rules for the given language.

When the Ustring library is loaded, the mw.ustring.lower() function is implemented as a call to mw.language.getContentLanguage():lc( s ).

mw.language:lcfirst

lang:lcfirst( s )

Converts the first character of the string to lowercase, as with lang:lc().

mw.language:uc

lang:uc( s )

Converts the string to uppercase, honoring any special rules for the given language.

When the Ustring library is loaded, the mw.ustring.upper() function is implemented as a call to mw.language.getContentLanguage():uc( s ).

mw.language:ucfirst

lang:ucfirst( s )

Converts the first character of the string to uppercase, as with lang:uc().

mw.language:caseFold

lang:caseFold( s )

Converts the string to a representation appropriate for case-insensitive comparison. Note that the result may not make any sense when displayed.

mw.language:formatNum

lang:formatNum( n )
lang:formatNum( n, options )

Formats a number with grouping and decimal separators appropriate for the given language. Given 123456.78, this may produce "123,456.78", "123.456,78", or even something like "١٢٣٬٤٥٦٫٧٨" depending on the language and wiki configuration.

options, if given, is a table of options, which can be:

  • noCommafy: Set true to omit grouping separators and use a dot (.) as the decimal separator. Digit transformation may still occur, which may include transforming the decimal separator.

mw.language:formatDate

lang:formatDate( format, timestamp, local )

Formats a date according to the given format string. If timestamp is omitted, the default is the current time. The value for local must be a boolean or nil; if true, the time is formatted in the wiki's local time rather than in UTC.

The format string and supported values for timestamp are identical to those for the #time parser function from Extension:ParserFunctions. Note however that backslashes may need to be doubled in a Lua string literal, since Lua also uses backslash as an escape character while wikitext does not:

-- This string literal contains a newline, not the two characters "\n", so it is not equivalent to {{#time:\n}}.
lang:formatDate( '\n' )

-- This is equivalent to {{#time:\n}}, not {{#time:\\n}}.
lang:formatDate( '\\n' )

-- This is equivalent to {{#time:\\n}}, not {{#time:\\\\n}}.
lang:formatDate( '\\\\n' )

mw.language:formatDuration

lang:formatDuration( seconds )
lang:formatDuration( seconds, allowedIntervals )

Breaks a duration in seconds into more human-readable units, e.g. 12345 to 3 hours, 25 minutes and 45 seconds, returning the result as a string.

allowedIntervals, if given, is a table with values naming the interval units to use in the response. These include 'millennia', 'centuries', 'decades', 'years', 'weeks', 'days', 'hours', 'minutes', and 'seconds'.

mw.language:parseFormattedNumber

lang:parseFormattedNumber( s )

This takes a number as formatted by lang:formatNum() and returns the actual number. In other words, this is basically a language-aware version of tonumber().
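
As a rough sketch of how formatNum() and parseFormattedNumber() fit together, the helper below formats a number and then parses the result back. The name roundTripNumber is hypothetical (it is not part of Scribunto), and the exact strings produced depend on the language and wiki configuration.

```lua
-- Sketch only; `roundTripNumber` is a hypothetical helper name.
local function roundTripNumber( n, code )
    local lang = mw.language.new( code )
    -- Grouping and decimal separators follow the language's conventions.
    local formatted = lang:formatNum( n )
    -- noCommafy omits the grouping separators.
    local ungrouped = lang:formatNum( n, { noCommafy = true } )
    -- parseFormattedNumber() is the language-aware counterpart of tonumber().
    local parsed = lang:parseFormattedNumber( formatted )
    return formatted, ungrouped, parsed
end
```

Parsing with the same language object that did the formatting keeps the separators consistent between the two calls.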

mw.language:convertPlural

lang:convertPlural( n, ... )
lang:convertPlural( n, forms )
lang:plural( n, ... )
lang:plural( n, forms )

This chooses the appropriate grammatical form from forms (which must be a sequence table) or ... based on the number n. For example, in English you might use n .. ' ' .. lang:plural( n, 'sock', 'socks' ) or n .. ' ' .. lang:plural( n, { 'sock', 'socks' } ) to generate grammatically-correct text whether there is only 1 sock or 200 socks.

The necessary values for the sequence are language-dependent; see localization of magic words and translatewiki's FAQ on PLURAL for some details.

mw.language:convertGrammar

lang:convertGrammar( word, case )
lang:grammar( case, word )

Note the different parameter order between the two aliases. convertGrammar matches the order of the method of the same name on MediaWiki's Language object, while grammar matches the order of the parser function of the same name, documented at Special:MyLanguage/Help:Magic words#Localisation.

This chooses the appropriate inflected form of word for the given inflection code case.

The possible values for word and case are language-dependent; see Special:MyLanguage/Help:Magic words#Localisation and translatewiki:Grammar for some details.

mw.language:gender

lang:gender( what, masculine, feminine, neutral )
lang:gender( what, { masculine, feminine, neutral } )

Chooses the string corresponding to the gender of what, which may be "male", "female", or a registered user name.

mw.language:getArrow

lang:getArrow( direction )

Returns a Unicode arrow character corresponding to direction:

  • forwards: Either "→" or "←" depending on the directionality of the language.
  • backwards: Either "←" or "→" depending on the directionality of the language.
  • left: "←"
  • right: "→"
  • up: "↑"
  • down: "↓"

mw.language:getDir

lang:getDir()

Returns "ltr" or "rtl", depending on the directionality of the language.

mw.language:getDirMark

lang:getDirMark( opposite )

Returns a string containing either U+200E (the left-to-right mark) or U+200F (the right-to-left mark), depending on the directionality of the language and whether opposite is a true or false value.

mw.language:getDirMarkEntity

lang:getDirMarkEntity( opposite )

Returns "&lrm;" or "&rlm;", depending on the directionality of the language and whether opposite is a true or false value.

mw.language:getDurationIntervals

lang:getDurationIntervals( seconds )
lang:getDurationIntervals( seconds, allowedIntervals )

Breaks a duration in seconds into more human-readable units, e.g. 12345 to 3 hours, 25 minutes and 45 seconds, returning the result as a table mapping unit names to numbers.

allowedIntervals, if given, is a table with values naming the interval units to use in the response. These include 'millennia', 'centuries', 'decades', 'years', 'weeks', 'days', 'hours', 'minutes', and 'seconds'.
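
The following sketch contrasts formatDuration() with getDurationIntervals() for the same input. The helper name describeDuration and the choice of units are assumptions made for illustration.

```lua
-- Sketch only; `describeDuration` is a hypothetical helper name.
local function describeDuration( seconds )
    local lang = mw.language.getContentLanguage()
    -- Restrict the breakdown to these units (an arbitrary choice for the example).
    local units = { 'hours', 'minutes', 'seconds' }
    -- A human-readable string in the content language.
    local asText = lang:formatDuration( seconds, units )
    -- A table mapping unit names to numbers.
    local asTable = lang:getDurationIntervals( seconds, units )
    return asText, asTable
end
```

Following the descriptions above, describeDuration( 12345 ) on an English-language wiki would be expected to give a string along the lines of "3 hours, 25 minutes and 45 seconds" and a table with hours = 3, minutes = 25, and seconds = 45.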

Message library

This library is an interface to the localisation messages and the MediaWiki: namespace.

Functions documented as mw.message.name are available on the global mw.message table; functions documented as mw.message:name and msg:name are methods of a message object (see mw.message.new).

mw.message.new

mw.message.new( key, ... )

Creates a new message object for the given message key.

The message object has no properties, but has several methods documented below.

mw.message.newFallbackSequence

mw.message.newFallbackSequence( ... )

Creates a new message object for the given messages (the first one that exists will be used).

The message object has no properties, but has several methods documented below.

mw.message.newRawMessage

mw.message.newRawMessage( msg, ... )

Creates a new message object, using the given text directly rather than looking up an internationalized message. The remaining parameters are passed to the new object's params() method.

The message object has no properties, but has several methods documented below.

mw.message.rawParam

mw.message.rawParam( value )

Wraps the value so that it will not be parsed as wikitext by msg:parse().

mw.message.numParam

mw.message.numParam( value )

Wraps the value so that it will automatically be formatted as by lang:formatNum(). Note this does not depend on the Language library actually being available.

mw.message.getDefaultLanguage

mw.message.getDefaultLanguage()

Returns a Language object for the default language.

mw.message:params

msg:params( ... )
msg:params( params )

Adds parameters to the message, which may be passed as individual arguments or as a sequence table. Parameters must be numbers, strings, or the special values returned by mw.message.numParam() or mw.message.rawParam(). If a sequence table is used, parameters must be directly present in the table; references using the __index metamethod will not work.

Returns the msg object, to allow for call chaining.

mw.message:rawParams

msg:rawParams( ... )
msg:rawParams( params )

Like :params(), but has the effect of passing all the parameters through mw.message.rawParam() first.

Returns the msg object, to allow for call chaining.

mw.message:numParams

msg:numParams( ... )
msg:numParams( params )

Like :params(), but has the effect of passing all the parameters through mw.message.numParam() first.

Returns the msg object, to allow for call chaining.

mw.message:inLanguage

msg:inLanguage( lang )

Specifies the language to use when processing the message. lang may be a string or a table with a getCode() method (i.e. a Language object).

The default language is the one returned by mw.message.getDefaultLanguage().

Returns the msg object, to allow for call chaining.

mw.message:useDatabase

msg:useDatabase( bool )

Specifies whether to look up messages in the MediaWiki: namespace (i.e. look in the database), or just use the default messages distributed with MediaWiki.

The default is true.

Returns the msg object, to allow for call chaining.

mw.message:plain

msg:plain()

Substitutes the parameters and returns the message wikitext as-is. Template calls and parser functions are left intact.
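
Because params(), inLanguage(), and the other setter methods return the message object, calls can be chained. A minimal sketch, assuming a raw message text with a single $1 placeholder (the helper name greet and the message text are illustrative only):

```lua
-- Sketch only; `greet` and the message text are hypothetical.
local function greet( userName, langCode )
    return mw.message.newRawMessage( 'Hello, $1!' )
        :params( userName )      -- supplies the value for $1
        :inLanguage( langCode )  -- language used when processing the message
        :plain()                 -- substitute parameters, return wikitext as-is
end
```

With this message text, greet( 'Alice', 'en' ) should yield "Hello, Alice!".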

mw.message:exists

msg:exists()

Returns a boolean indicating whether the message key exists.

mw.message:isBlank

msg:isBlank()

Returns a boolean indicating whether the message is blank. Returns true if the message key does not exist or the message is the empty string.

mw.message:isDisabled

msg:isDisabled()

Returns a boolean indicating whether the message key is disabled. Returns true if the message key does not exist or if the message is the empty string or the string "-".
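
One possible use of isDisabled() is to fall back to a default string when a message is missing, empty, or set to "-". The helper name messageOrDefault is hypothetical:

```lua
-- Sketch only; `messageOrDefault` is a hypothetical helper name.
local function messageOrDefault( key, default )
    local msg = mw.message.new( key )
    -- isDisabled() is true for nonexistent messages, the empty string, and "-".
    if msg:isDisabled() then
        return default
    end
    return msg:plain()
end
```

isDisabled() is used here rather than exists() or isBlank() because it is the broadest of the three checks described above.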

Site library

mw.site.currentVersion

A string holding the current version of MediaWiki.

mw.site.scriptPath

The value of $wgScriptPath.

mw.site.server

The value of $wgServer.

mw.site.siteName

The value of $wgSitename.

mw.site.stylePath

The value of $wgStylePath.

mw.site.namespaces

Table holding data for all namespaces, indexed by number.

The data available is:

  • id: Namespace number.
  • name: Local namespace name.
  • canonicalName: Canonical namespace name.
  • displayName: Set on namespace 0, the name to be used for display (since the name is often the empty string).
  • hasSubpages: Whether subpages are enabled for the namespace.
  • hasGenderDistinction: Whether the namespace has different aliases for different genders.
  • isCapitalized: Whether the first letter of pages in the namespace is capitalized.
  • isContent: Whether this is a content namespace.
  • isIncludable: Whether pages in the namespace can be transcluded.
  • isMovable: Whether pages in the namespace can be moved.
  • isSubject: Whether this is a subject namespace.
  • isTalk: Whether this is a talk namespace.
  • defaultContentModel: The default content model for the namespace, as a string.
  • aliases: List of aliases for the namespace.
  • subject: Reference to the corresponding subject namespace's data.
  • talk: Reference to the corresponding talk namespace's data.
  • associated: Reference to the associated namespace's data.

A metatable is also set that allows for looking up namespaces by name (localized or canonical). For example, both mw.site.namespaces[4] and mw.site.namespaces.Project will return information about the Project namespace.
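
A short sketch of reading a few of these fields, using either a number or a name as the key. The helper name namespaceInfo and the shape of the returned table are assumptions for illustration:

```lua
-- Sketch only; `namespaceInfo` is a hypothetical helper name.
local function namespaceInfo( key )
    -- `key` may be a namespace number or a (localized or canonical) name.
    local ns = mw.site.namespaces[key]
    if not ns then
        return nil
    end
    return {
        id = ns.id,
        name = ns.name,
        canonicalName = ns.canonicalName,
        -- `talk` may be absent for namespaces without a talk namespace.
        talkId = ns.talk and ns.talk.id,
    }
end
```

Both namespaceInfo( 4 ) and namespaceInfo( 'Project' ) should describe the same namespace.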

mw.site.contentNamespaces

Table holding just the content namespaces, indexed by number. See mw.site.namespaces for details.

mw.site.subjectNamespaces

Table holding just the subject namespaces, indexed by number. See mw.site.namespaces for details.

mw.site.talkNamespaces

Table holding just the talk namespaces, indexed by number. See mw.site.namespaces for details.

mw.site.stats

Table holding site statistics. Available statistics are:

  • pages: Number of pages in the wiki.
  • articles: Number of articles in the wiki.
  • files: Number of files in the wiki.
  • edits: Number of edits in the wiki.
  • users: Number of users in the wiki.
  • activeUsers: Number of active users in the wiki.
  • admins: Number of users in group 'sysop' in the wiki.

mw.site.stats.pagesInCategory

mw.site.stats.pagesInCategory( category, which )

This function is expensive

Gets statistics about the category. If which is unspecified, nil, or "*", returns a table with the following properties:

  • all: Total pages, files, and subcategories.
  • subcats: Number of subcategories.
  • files: Number of files.
  • pages: Number of pages.

If which is one of the above keys, just the corresponding value is returned instead.

Each new category queried will increment the expensive function count.
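
The two calling styles can be sketched as follows. The helper names are hypothetical, and each new category queried counts against the expensive function limit as noted above:

```lua
-- Sketch only; both helper names are hypothetical.
local function categoryCounts( category )
    -- With "*" (or no second argument) a table of all counts is returned.
    return mw.site.stats.pagesInCategory( category, '*' )
end

local function pageCount( category )
    -- With a specific key, just that number is returned.
    return mw.site.stats.pagesInCategory( category, 'pages' )
end
```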

mw.site.stats.pagesInNamespace

mw.site.stats.pagesInNamespace( ns )

Returns the number of pages in the given namespace (specified by number).

mw.site.stats.usersInGroup

mw.site.stats.usersInGroup( group )

Returns the number of users in the given group.

mw.site.interwikiMap

mw.site.interwikiMap( filter )

Returns a table holding data about available interwiki prefixes. If filter is the string "local", then only data for local interwiki prefixes is returned. If filter is the string "!local", then only data for non-local prefixes is returned. If no filter is specified, data for all prefixes is returned. A "local" prefix in this context is one that is for the same project. For example, on the English Wikipedia, other-language Wikipedias are considered local, while Wiktionary and such are not.

Keys in the table returned by this function are interwiki prefixes, and the values are subtables with the following properties:

  • prefix - the interwiki prefix.
  • url - the URL that the interwiki points to. The page name is represented by the parameter $1.
  • isProtocolRelative - a boolean showing whether the URL is protocol-relative.
  • isLocal - whether the URL is for a site in the current project.
  • isCurrentWiki - whether the URL is for the current wiki.
  • isTranscludable - whether pages using this interwiki prefix are transcludable. This requires scary transclusion, which is disabled on Wikimedia wikis.
  • isExtraLanguageLink - whether the interwiki is listed in $wgExtraInterlanguageLinkPrefixes.
  • displayText - for links listed in $wgExtraInterlanguageLinkPrefixes, this is the display text shown for the interlanguage link. Nil if not specified.
  • tooltip - for links listed in $wgExtraInterlanguageLinkPrefixes, this is the tooltip text shown when users hover over the interlanguage link. Nil if not specified.
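
As an illustration of walking the returned table, the sketch below collects the local interwiki prefixes into a sorted list. The helper name localPrefixes is hypothetical:

```lua
-- Sketch only; `localPrefixes` is a hypothetical helper name.
local function localPrefixes()
    local prefixes = {}
    -- Keys are interwiki prefixes; values are subtables of properties.
    for prefix, info in pairs( mw.site.interwikiMap( 'local' ) ) do
        prefixes[#prefixes + 1] = prefix
    end
    -- pairs() gives no ordering guarantee, so sort for stable output.
    table.sort( prefixes )
    return prefixes
end
```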

Text library

The text library provides some common text processing functions missing from the String library and the Ustring library. These functions are safe for use with UTF-8 strings.

mw.text.decode

mw.text.decode( s )
mw.text.decode( s, decodeNamedEntities )

Replaces HTML entities in the string with the corresponding characters.

If boolean decodeNamedEntities is omitted or false, the only named entities recognized are '&lt;', '&gt;', '&amp;', '&quot;', and '&nbsp;'. Otherwise, the list of HTML5 named entities to recognize is loaded from PHP's get_html_translation_table function.

mw.text.encode

mw.text.encode( s )
mw.text.encode( s, charset )

Replaces characters in a string with HTML entities. Characters '<', '>', '&', '"', and the non-breaking space are replaced with the appropriate named entities; all others are replaced with numeric entities.

If charset is supplied, it should be a string as appropriate to go inside brackets in a Ustring pattern, i.e. the "set" in [set]. The default charset is '<>&"\' ' (the space at the end is the non-breaking space, U+00A0).
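
A small sketch of encoding a string for safe output and decoding it again, using the default charset. The helper name escapeAndRestore is hypothetical:

```lua
-- Sketch only; `escapeAndRestore` is a hypothetical helper name.
local function escapeAndRestore( s )
    -- Default charset: <, >, &, ", ' and the non-breaking space.
    local encoded = mw.text.encode( s )
    -- decode() reverses the named and numeric entities it recognizes.
    local decoded = mw.text.decode( encoded )
    return encoded, decoded
end
```

For the input '<b>Tom & Jerry</b>', the encoded form should be '&lt;b&gt;Tom &amp; Jerry&lt;/b&gt;', and decoding it should give back the original string.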

mw.text.jsonDecode

mw.text.jsonDecode( s )
mw.text.jsonDecode( s, flags )

Decodes a JSON string. flags is 0 or a combination (use +) of the flags mw.text.JSON_PRESERVE_KEYS and mw.text.JSON_TRY_FIXING.

Normally JSON's zero-based arrays are renumbered to Lua one-based sequence tables; to prevent this, pass mw.text.JSON_PRESERVE_KEYS.

To relax certain requirements in JSON, such as no terminal comma in arrays or objects, pass mw.text.JSON_TRY_FIXING. This is not recommended.

Limitations:

  • Decoded JSON arrays may not be Lua sequences if the array contains null values.
  • JSON objects will drop keys having null values.
  • It is not possible to directly tell whether the input was a JSON array or a JSON object with sequential integer keys.
  • A JSON object having sequential integer keys beginning with 1 will decode to the same table structure as a JSON array with the same values, despite these not being at all equivalent, unless mw.text.JSON_PRESERVE_KEYS is used.
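
The effect of mw.text.JSON_PRESERVE_KEYS can be sketched by decoding the same input twice. The helper name decodeBothWays is hypothetical:

```lua
-- Sketch only; `decodeBothWays` is a hypothetical helper name.
local function decodeBothWays( json )
    -- Default: JSON arrays are renumbered to one-based Lua sequences.
    local renumbered = mw.text.jsonDecode( json )
    -- With the flag, the zero-based JSON indexes are kept as-is.
    local preserved = mw.text.jsonDecode( json, mw.text.JSON_PRESERVE_KEYS )
    return renumbered, preserved
end
```

Going by the description above, for the input '["a","b"]' the first table should hold "a" at index 1, while the second should hold "a" at index 0.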

mw.text.jsonEncode

mw.text.jsonEncode( value )
mw.text.jsonEncode( value, flags )

Encodes a value as a JSON string. Errors are raised if the passed value cannot be encoded in JSON. flags is 0 or a combination (use +) of the flags mw.text.JSON_PRESERVE_KEYS and mw.text.JSON_PRETTY.

Normally Lua one-based sequence tables are encoded as JSON zero-based arrays; when mw.text.JSON_PRESERVE_KEYS is set in flags, zero-based sequence tables are encoded as JSON arrays.

Limitations:

  • Empty tables are always encoded as empty arrays ([]), not empty objects ({}).
  • Sequence tables cannot be encoded as JSON objects without adding a "dummy" element.
  • To produce objects or arrays with nil values, a tricky implementation of the __pairs metamethod is required.
  • A Lua table having sequential integer keys beginning with 0 will encode as a JSON array, the same as a Lua table having integer keys beginning with 1, unless mw.text.JSON_PRESERVE_KEYS is used.
  • When both a number and the string representation of that number are used as keys in the same table, behavior is unspecified.

mw.text.killMarkers

mw.text.killMarkers( s )

Removes all MediaWiki strip markers from a string.

mw.text.listToText

mw.text.listToText( list )
mw.text.listToText( list, separator, conjunction )

Joins a list, prose-style. In other words, it's like table.concat() but with a different separator before the final item.

The default separator is taken from MediaWiki:comma-separator in the wiki's content language, and the default conjunction is MediaWiki:and concatenated with MediaWiki:word-separator.

Examples, using the default values for the messages:

 -- Returns the empty string
 mw.text.listToText( {} )
 
 -- Returns "1"
 mw.text.listToText( { 1 } )
 
 -- Returns "1 and 2"
 mw.text.listToText( { 1, 2 } )
 
 -- Returns "1, 2, 3, 4 and 5"
 mw.text.listToText( { 1, 2, 3, 4, 5 } )
 
 -- Returns "1; 2; 3; 4 or 5"
 mw.text.listToText( { 1, 2, 3, 4, 5 }, '; ', ' or ' )

mw.text.nowiki

mw.text.nowiki( s )

Replaces various characters in the string with HTML entities to prevent their interpretation as wikitext. This includes:

  • The following characters: '"', '&', "'", '<', '=', '>', '[', ']', '{', '|', '}'
  • The following characters at the start of the string or immediately after a newline: '#', '*', ':', ';', space, tab ('\t')
  • Blank lines will have one of the associated newline or carriage return characters escaped
  • "----" at the start of the string or immediately after a newline will have the first '-' escaped
  • "__" will have one underscore escaped
  • "://" will have the colon escaped
  • A whitespace character following "ISBN", "RFC", or "PMID" will be escaped

mw.text.split

mw.text.split( s, pattern, plain )

Splits the string into substrings at boundaries matching the Ustring pattern pattern. If plain is specified and true, pattern will be interpreted as a literal string rather than as a Lua pattern (just as with the parameter of the same name for mw.ustring.find()). Returns a table containing the substrings.

For example, mw.text.split( 'a b\tc\nd', '%s' ) would return a table { 'a', 'b', 'c', 'd' }.

If pattern matches the empty string, s will be split into individual characters.

mw.text.gsplit

mw.text.gsplit( s, pattern, plain )

Returns an iterator function that will iterate over the substrings that would be returned by the equivalent call to mw.text.split().
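
A common use of the iterator is parsing a comma-separated parameter without building an intermediate table. The sketch below assumes a literal comma as the separator, and the helper name parseList is hypothetical:

```lua
-- Sketch only; `parseList` is a hypothetical helper name.
local function parseList( s )
    local items = {}
    -- plain = true: treat ',' as a literal string, not a pattern.
    for item in mw.text.gsplit( s, ',', true ) do
        item = mw.text.trim( item )
        -- Skip empty entries such as those produced by ",,".
        if item ~= '' then
            items[#items + 1] = item
        end
    end
    return items
end
```

For example, parseList( ' a, b ,,c ' ) should return { 'a', 'b', 'c' }.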

mw.text.tag

mw.text.tag( name, attrs, content )
mw.text.tag{ name = string, attrs = table, content = string|false }

Note that the second form uses named arguments.

Generates an HTML-style tag for name.

If attrs is given, it must be a table with string keys. String and number values are used as the value of the attribute; boolean true results in the key being output as an HTML5 valueless parameter; boolean false skips the key entirely; and anything else is an error.

If content is not given (or is nil), only the opening tag is returned. If content is boolean false, a self-closed tag is returned. Otherwise it must be a string or number, in which case that content is enclosed in the constructed opening and closing tag. Note the content is not automatically HTML-encoded; use mw.text.encode() if needed.

For properly returning extension tags such as <ref>, use frame:extensionTag() instead.
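
Since the content is not HTML-encoded automatically, a wrapper might combine mw.text.tag() with mw.text.encode(). The helper name badge and the class name are made up for this sketch:

```lua
-- Sketch only; `badge` and the "badge" class are hypothetical.
local function badge( label )
    -- Encode the label first; tag() inserts content verbatim.
    local safeLabel = mw.text.encode( label )
    return mw.text.tag( 'span', { class = 'badge' }, safeLabel )
end
```

The same call could be written with named arguments as mw.text.tag{ name = 'span', attrs = { class = 'badge' }, content = safeLabel }.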

mw.text.trim

mw.text.trim( s )
mw.text.trim( s, charset )

Removes whitespace or other characters from the beginning and end of a string.

If charset is supplied, it should be a string as appropriate to go inside brackets in a Ustring pattern, i.e. the "set" in [set]. The default charset is ASCII whitespace, "\t\r\n\f ".

mw.text.truncate

mw.text.truncate( text, length )
mw.text.truncate( text, length, ellipsis )
mw.text.truncate( text, length, ellipsis, adjustLength )

Truncates text to the specified length, adding ellipsis if truncation was performed. If length is positive, the end of the string will be truncated; if negative, the beginning will be removed. If adjustLength is given and true, the resulting string including ellipsis will not be longer than the specified length.

The default value for ellipsis is taken from MediaWiki:ellipsis in the wiki's content language.

Examples, using the default "..." ellipsis:

 -- Returns "foobarbaz"
 mw.text.truncate( "foobarbaz", 9 )
 
 -- Returns "fooba..."
 mw.text.truncate( "foobarbaz", 5 )
 
 -- Returns "...arbaz"
 mw.text.truncate( "foobarbaz", -5 )
 
 -- Returns "foo..."
 mw.text.truncate( "foobarbaz", 6, nil, true )
 
 -- Returns "foobarbaz", because that's shorter than "foobarba..."
 mw.text.truncate( "foobarbaz", 8 )

mw.text.unstripNoWiki

mw.text.unstripNoWiki( s )

Replaces MediaWiki <nowiki> strip markers with the corresponding text. Other types of strip markers are not changed.

mw.text.unstrip

mw.text.unstrip( s )

Equivalent to mw.text.killMarkers( mw.text.unstripNoWiki( s ) ).

This no longer reveals the HTML behind special page transclusion, <ref> tags, and so on as it did in earlier versions of Scribunto.

Title library

mw.title.equals

mw.title.equals( a, b )

Tests whether two titles are equal. Note that fragments are ignored in the comparison.

mw.title.compare

mw.title.compare( a, b )

Returns -1, 0, or 1 to indicate whether the title a is less than, equal to, or greater than title b.

This compares titles by interwiki prefix (if any) as strings, then by namespace number, then by the unprefixed title text as a string. These string comparisons use Lua's standard < operator.
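
Because mw.title.compare() returns -1, 0, or 1, it can be adapted into a comparator for table.sort(). A minimal sketch, with the hypothetical helper name sortTitles:

```lua
-- Sketch only; `sortTitles` is a hypothetical helper name.
local function sortTitles( names )
    local titles = {}
    for _, name in ipairs( names ) do
        local title = mw.title.new( name )
        -- mw.title.new() returns nil for invalid titles; skip those.
        if title then
            titles[#titles + 1] = title
        end
    end
    -- table.sort() expects a "less than" function, hence the `< 0`.
    table.sort( titles, function( a, b )
        return mw.title.compare( a, b ) < 0
    end )
    return titles
end
```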

mw.title.getCurrentTitle

mw.title.getCurrentTitle()

Returns the title object for the current page.

mw.title.new

mw.title.new( text, namespace )
mw.title.new( id )

This function is expensive when called with an ID

Creates a new title object.

If a number id is given, an object is created for the title with that page_id. The title referenced will be counted as linked from the current page. If the page_id does not exist, returns nil. The expensive function count will be incremented if the title object created is not for a title that has already been loaded.

If a string text is given instead, an object is created for that title (even if the page does not exist). If the text string does not specify a namespace, namespace (which may be any key found in mw.site.namespaces) will be used. If the text is not a valid title, nil is returned.

mw.title.makeTitle

mw.title.makeTitle( namespace, title, fragment, interwiki )

Creates a title object with title title in namespace namespace, optionally with the specified fragment and interwiki prefix. namespace may be any key found in mw.site.namespaces. If the resulting title is not valid, returns nil.

Note that, unlike mw.title.new(), this method will always apply the specified namespace. For example, mw.title.makeTitle( 'Template', 'Module:Foo' ) will create an object for the page Template:Module:Foo, while mw.title.new( 'Module:Foo', 'Template' ) will create an object for the page Module:Foo.

Title objects

A title object has a number of properties and methods. Most of the properties are read-only.

Note that fields ending with text return titles as string values whereas the fields ending with title return title objects.

  • id: The page_id. 0 if the page does not exist. This may be expensive.
  • interwiki: The interwiki prefix, or the empty string if none.
  • namespace: The namespace number.
  • fragment: The fragment, or the empty string. May be assigned.
  • nsText: The text of the namespace for the page.
  • subjectNsText: The text of the subject namespace for the page.
  • text: The title of the page, without the namespace or interwiki prefixes.
  • prefixedText: The title of the page, with the namespace and interwiki prefixes.
  • fullText: The title of the page, with the namespace and interwiki prefixes and the fragment. Interwiki is not returned if equal to the current.
  • rootText: If this is a subpage, the title of the root page without prefixes. Otherwise, the same as title.text.
  • baseText: If this is a subpage, the title of the page it is a subpage of without prefixes. Otherwise, the same as title.text.
  • subpageText: If this is a subpage, just the subpage name. Otherwise, the same as title.text.
  • canTalk: Whether the page for this title could have a talk page.
  • exists: Whether the page exists. Alias for file.exists for Media-namespace titles. For File-namespace titles this checks the existence of the file description page, not the file itself. This may be expensive.
  • file, fileExists: See #File metadata below.
  • isContentPage: Whether this title is in a content namespace.
  • isExternal: Whether this title has an interwiki prefix.
  • isLocal: Whether this title is in this project. For example, on the English Wikipedia, any other Wikipedia is considered "local" while Wiktionary and such are not.
  • isRedirect: Whether this is the title for a page that is a redirect. This may be expensive.
  • isSpecialPage: Whether this is the title for a possible special page (i.e. a page in the Special: namespace).
  • isSubpage: Whether this title is a subpage of some other title.
  • isTalkPage: Whether this is a title for a talk page.
  • isSubpageOf( title2 ): Whether this title is a subpage of the given title.
  • inNamespace( ns ): Whether this title is in the given namespace. Namespaces may be specified by anything that is a key found in mw.site.namespaces.
  • inNamespaces( ... ): Whether this title is in any of the given namespaces. Namespaces may be specified by anything that is a key found in mw.site.namespaces.
  • hasSubjectNamespace( ns ): Whether this title's subject namespace is in the given namespace. Namespaces may be specified by anything that is a key found in mw.site.namespaces.
  • contentModel: The content model for this title, as a string. This may be expensive.
  • basePageTitle: The same as mw.title.makeTitle( title.namespace, title.baseText ).
  • rootPageTitle: The same as mw.title.makeTitle( title.namespace, title.rootText ).
  • talkPageTitle: The same as mw.title.makeTitle( mw.site.namespaces[title.namespace].talk.id, title.text ), or nil if this title cannot have a talk page.
  • subjectPageTitle: The same as mw.title.makeTitle( mw.site.namespaces[title.namespace].subject.id, title.text ).
  • redirectTarget: Returns a title object for the target of the redirect if the page exists and is a redirect; returns false otherwise.
  • protectionLevels: The page's protection levels. This is a table with keys corresponding to each action (e.g., "edit" and "move"). The table values are arrays, the first item of which is a string containing the protection level. If the page is unprotected, either the table values or the array items will be nil. This is expensive.
  • subPageTitle( text ): The same as mw.title.makeTitle( title.namespace, title.text .. '/' .. text ).
  • partialUrl(): Returns title.text encoded as it would be in a URL.
  • fullUrl( query, proto ): Returns the full URL (with optional query table/string) for this title. proto may be specified to control the scheme of the resulting url: "http", "https", "relative" (the default), or "canonical".
  • localUrl( query ): Returns the local URL (with optional query table/string) for this title.
  • canonicalUrl( query ): Returns the canonical URL (with optional query table/string) for this title.
  • getContent(): Returns the (unparsed) content of the page, or nil if there is no page. The page will be recorded as a transclusion.

Title objects may be compared using relational operators. tostring( title ) will return title.prefixedText.

Since people find the fact surprising, note that accessing any expensive field on a title object records a "link" to the page (as shown on Special:WhatLinksHere, for example). Using the title object's getContent() method or accessing the redirectTarget field records it as a "transclusion", and accessing the title object's file or fileExists fields records it as a "file link".

File metadata

Title objects representing a page in the File or Media namespace will have a property called file. Accessing it is expensive. It is a table, structured as follows:

  • exists: Whether the file exists. It will be recorded as an image usage. The fileExists property on a Title object exists for backwards compatibility reasons and is an alias for this property. If this is false, all other file properties will be nil.
  • width: The width of the file. If the file contains multiple pages, this is the width of the first page.
  • height: The height of the file. If the file contains multiple pages, this is the height of the first page.
  • pages: If the file format supports multiple pages, this is a table containing tables for each page of the file; otherwise, it is nil. The # operator can be used to get the number of pages in the file. Each individual page table contains a width and height property.
  • size: The size of the file in bytes.
  • mimeType: The MIME type of the file.

Expensive properties

The properties id, isRedirect, exists, and contentModel require fetching data about the title from the database. For this reason, the expensive function count is incremented the first time one of them is accessed for a page other than the current page. Subsequent accesses of any of these properties for that page will not increment the expensive function count again.

Other properties marked as expensive will always increment the expensive function count the first time they are accessed for a page other than the current page.
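
The sketch below reads two properties from the first group on one title object. The helper name describePage and the returned strings are assumptions for illustration:

```lua
-- Sketch only; `describePage` and its return strings are hypothetical.
local function describePage( name )
    local title = mw.title.new( name )
    if not title then
        return 'invalid title'
    end
    -- Reading `exists` is the first expensive access for this page.
    if not title.exists then
        return title.prefixedText .. ' does not exist'
    end
    -- `isRedirect` is in the same group, so per the rule above it
    -- should not increment the expensive function count again.
    if title.isRedirect then
        return title.prefixedText .. ' is a redirect'
    end
    return title.prefixedText .. ' exists'
end
```

Note that, as described earlier, these accesses also record a link from the current page to the page being checked.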

URI library

mw.uri.encode

mw.uri.encode( s, enctype )

Percent-encodes the string. The default type, "QUERY", encodes spaces using '+' for use in query strings; "PATH" encodes spaces as %20; and "WIKI" encodes spaces as '_'.

Note that the "WIKI" format is not entirely reversible, as both spaces and underscores are encoded as '_'.
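
To compare the three encoding types side by side, a helper such as the following (hypothetical name encodeAllWays) can be used:

```lua
-- Sketch only; `encodeAllWays` is a hypothetical helper name.
local function encodeAllWays( s )
    return mw.uri.encode( s, 'QUERY' ), -- spaces become '+'
        mw.uri.encode( s, 'PATH' ),     -- spaces become '%20'
        mw.uri.encode( s, 'WIKI' )      -- spaces become '_'
end
```

Following the description above, encodeAllWays( 'a b' ) should give 'a+b', 'a%20b', and 'a_b'.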

mw.uri.decode

mw.uri.decode( s, enctype )

Percent-decodes the string. The default type, "QUERY", decodes '+' to space; "PATH" does not perform any extra decoding; and "WIKI" decodes '_' to space.

mw.uri.anchorEncode

mw.uri.anchorEncode( s )

Encodes a string for use in a MediaWiki URI fragment.

mw.uri.buildQueryString

mw.uri.buildQueryString( table )

Encodes a table as a URI query string. Keys should be strings; values may be strings or numbers, sequence tables, or boolean false.

mw.uri.parseQueryString

mw.uri.parseQueryString( s, i, j )

Decodes the query string s to a table. Keys in the string without values will have a value of false; keys repeated multiple times will have sequence tables as values; and others will have strings as values.

The optional numerical arguments i and j can be used to specify a substring of s to be parsed, rather than the entire string. i is the position of the first character of the substring, and defaults to 1. j is the position of the last character of the substring, and defaults to the length of the string. Both i and j can be negative, as in string.sub.
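
A brief sketch of parsing a query string and rebuilding it with mw.uri.buildQueryString(). The helper name queryRoundTrip is hypothetical:

```lua
-- Sketch only; `queryRoundTrip` is a hypothetical helper name.
local function queryRoundTrip( qs )
    -- Valueless keys decode to false, repeated keys to sequence tables.
    local args = mw.uri.parseQueryString( qs )
    -- The rebuilt string may order its keys differently from the input.
    local rebuilt = mw.uri.buildQueryString( args )
    return args, rebuilt
end
```

Per the rules above, parsing 'a=1&b&c=x&c=y' would be expected to produce a table where a is '1', b is false, and c is { 'x', 'y' }.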

mw.uri.canonicalUrl

mw.uri.canonicalUrl( page, query )

Returns a URI object for the canonical URL for a page, with optional query string/table.

mw.uri.fullUrl

mw.uri.fullUrl( page, query )

Returns a URI object for the full URL for a page, with optional query string/table.

mw.uri.localUrl

mw.uri.localUrl( page, query )

Returns a URI object for the local URL for a page, with optional query string/table.

mw.uri.new

mw.uri.new( s )

Constructs a new URI object for the passed string or table. See the description of URI objects for the possible fields for the table.

mw.uri.validate

mw.uri.validate( table )

Validates the passed table (or URI object). Returns a boolean indicating whether the table was valid, and on failure a string explaining what problems were found.

URI object

The URI object has the following fields, some or all of which may be nil:

  • protocol: String protocol/scheme
  • user: String user
  • password: String password
  • host: String host name
  • port: Integer port
  • path: String path
  • query: A table, as from mw.uri.parseQueryString
  • fragment: String fragment

The following properties are also available:

  • userInfo: String user and password
  • hostPort: String host and port
  • authority: String user, password, host, and port
  • queryString: String version of the query table
  • relativePath: String path, query string, and fragment

tostring() will give the URI string.

Methods of the URI object are:

mw.uri:parse

uri:parse( s )

Parses a string into the current URI object. Any fields specified in the string will be replaced in the current object; fields not specified will keep their old values.

mw.uri:clone

uri:clone()

Makes a copy of the URI object.

mw.uri:extend

uri:extend( parameters )

Merges the parameters table into the object's query table.
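The fields and methods above might be combined as in the following sketch. The URL is a placeholder, and the extra query parameter is invented for the example.

```lua
-- Parse a (made-up) URL into a URI object.
local u = mw.uri.new( 'https://example.org/wiki/Main_Page?action=history' )
-- u.protocol is 'https', u.host is 'example.org',
-- u.path is '/wiki/Main_Page', u.query.action is 'history'

-- Work on a copy and add one more query parameter to it.
local copy = u:clone()
copy:extend( { limit = '50' } )
-- copy.query now holds both action = 'history' and limit = '50'

-- tostring() turns the object back into a URI string.
local asString = tostring( copy )
```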

Ustring library

The ustring library is intended to be a direct reimplementation of the standard String library, except that the methods operate on characters in UTF-8 encoded strings rather than bytes.

Most functions will raise an error if the string is not valid UTF-8; exceptions are noted.

mw.ustring.maxPatternLength

The maximum allowed length of a pattern, in bytes.

mw.ustring.maxStringLength

The maximum allowed length of a string, in bytes.

mw.ustring.byte

mw.ustring.byte( s, i, j )

Returns individual bytes; identical to string.byte().

mw.ustring.byteoffset

mw.ustring.byteoffset( s, l, i )

Returns the byte offset of a character in the string. The default for both l and i is 1. i may be negative, in which case it counts from the end of the string.

The character at l == 1 is the first character starting at or after byte i; the character at l == 0 is the first character starting at or before byte i. Note this may be the same character. Greater or lesser values of l are calculated relative to these.

mw.ustring.char

mw.ustring.char( ... )

Much like string.char(), except that the integers are Unicode codepoints rather than byte values.

local value = mw.ustring.char( 0x41f, 0x440, 0x438, 0x432, 0x435, 0x442, 0x21 ) -- value is now 'Привет!'

mw.ustring.codepoint

mw.ustring.codepoint( s, i, j )

Much like string.byte(), except that the return values are codepoints and the offsets are characters rather than bytes.
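For comparison, a small sketch that builds a string from codepoints and reads it back, next to the byte-oriented equivalent:

```lua
-- Build 'Привет!' from its codepoints, then read the first two back.
local word = mw.ustring.char( 0x41f, 0x440, 0x438, 0x432, 0x435, 0x442, 0x21 )
local first, second = mw.ustring.codepoint( word, 1, 2 )  -- 0x41f, 0x440

-- string.byte() sees only the UTF-8 encoding: 'П' is stored as the bytes 0xD0 0x9F.
local firstByte = string.byte( word, 1 )  -- 0xD0
```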

mw.ustring.find

mw.ustring.find( s, pattern, init, plain )

Much like string.find(), except that the pattern is extended as described in Ustring patterns and the init offset is in characters rather than bytes.

mw.ustring.format

mw.ustring.format( format, ... )

Identical to string.format(). Widths and precisions for strings are expressed in bytes, not codepoints.

mw.ustring.gcodepoint

mw.ustring.gcodepoint( s, i, j )

Returns three values for iterating over the codepoints in the string. i defaults to 1, and j to -1. This is intended for use in the iterator form of for:

for codepoint in mw.ustring.gcodepoint( s ) do
     -- block
end
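One possible loop body, collecting the codepoints into a table (the sample string is arbitrary):

```lua
local codepoints = {}
for codepoint in mw.ustring.gcodepoint( 'Привет!' ) do
    codepoints[#codepoints + 1] = codepoint
end
-- #codepoints is 7; codepoints[1] is 0x41f ('П') and codepoints[7] is 0x21 ('!')
```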

mw.ustring.gmatch

mw.ustring.gmatch( s, pattern )

Much like string.gmatch(), except that the pattern is extended as described in Ustring patterns.

mw.ustring.gsub

mw.ustring.gsub( s, pattern, repl, n )

Much like string.gsub(), except that the pattern is extended as described in Ustring patterns.

mw.ustring.isutf8

mw.ustring.isutf8( s )

Returns true if the string is valid UTF-8, false if not.

mw.ustring.len

mw.ustring.len( s )

Returns the length of the string in codepoints, or nil if the string is not valid UTF-8.

See string.len() for a similar function that uses byte length rather than codepoints.
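The difference between the two length functions, and the nil result for invalid input, can be seen in a short sketch such as this one (the sample strings are arbitrary):

```lua
local s = 'Привет!'
local chars = mw.ustring.len( s )    -- 7 codepoints
local bytes = string.len( s )        -- 13 bytes: six two-byte Cyrillic letters plus '!'

-- A lone 0xFF byte is never valid UTF-8.
local invalid = '\255'
local valid = mw.ustring.isutf8( invalid )   -- false
local badLen = mw.ustring.len( invalid )     -- nil
```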

mw.ustring.lower

mw.ustring.lower( s )

Much like string.lower(), except that all characters that have an uppercase-to-lowercase mapping in Unicode are converted.

If the Language library is also loaded, this will instead call lc() on the default language object.

mw.ustring.match

mw.ustring.match( s, pattern, init )

Much like string.match(), except that the pattern is extended as described in Ustring patterns and the init offset is in characters rather than bytes.

mw.ustring.rep

mw.ustring.rep( s, n )

Identical to string.rep().

mw.ustring.sub

mw.ustring.sub( s, i, j )

Much like string.sub(), except that the offsets are characters rather than bytes.
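A brief sketch of why character offsets matter for non-ASCII text (the sample string is arbitrary):

```lua
local s = 'Привет!'
local head = mw.ustring.sub( s, 1, 3 )   -- 'При'
local last = mw.ustring.sub( s, -1 )     -- '!'

-- string.sub() counts bytes, so it can cut a character in half:
-- bytes 1 to 3 are all of 'П' plus only the first byte of 'р'.
local raw = string.sub( s, 1, 3 )
local rawIsValid = mw.ustring.isutf8( raw )  -- false
```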

mw.ustring.toNFC

mw.ustring.toNFC( s )

Converts the string to Normalization Form C. Returns nil if the string is not valid UTF-8.

mw.ustring.toNFD

mw.ustring.toNFD( s )

Converts the string to Normalization Form D. Returns nil if the string is not valid UTF-8.

mw.ustring.upper

mw.ustring.upper( s )

Much like string.upper(), except that all characters that have a lowercase-to-uppercase mapping in Unicode are converted.

If the Language library is also loaded, this will instead call uc() on the default language object.

Ustring patterns

Patterns in the ustring functions use the same syntax as the String library patterns. The major difference is that the character classes are redefined in terms of Unicode character properties:

  • %a: represents all characters with General Category "Letter".
  • %c: represents all characters with General Category "Control".
  • %d: represents all characters with General Category "Number, decimal digit".
  • %l: represents all characters with General Category "Lowercase Letter".
  • %p: represents all characters with General Category "Punctuation".
  • %s: represents all characters with General Category "Separator", plus tab, linefeed, carriage return, vertical tab, and form feed.
  • %u: represents all characters with General Category "Uppercase Letter".
  • %w: represents all characters with General Category "Letter" or "Decimal Number".
  • %x: adds fullwidth character versions of the hex digits.

Like in String library patterns, %A, %C, %D, %L, %P, %S, %U, and %W here represent the complementary set ("all characters without given General Category").

In all cases, characters are interpreted as Unicode characters instead of bytes, so ranges such as [0-9], patterns such as %b«», and quantifiers applied to multibyte characters will work correctly. Empty captures will capture the position in code points rather than bytes.
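A few illustrative calls on non-ASCII text; the sample sentence is invented for the example:

```lua
local s = 'Привет, мир!'

-- %a matches Cyrillic letters, since they have General Category "Letter".
local word = mw.ustring.match( s, '%a+' )     -- 'Привет'

-- An empty capture reports a position in codepoints: 'м' is the 9th character.
local pos = mw.ustring.match( s, '()мир' )    -- 9

-- Captures work as in the String library.
local swapped = mw.ustring.gsub( s, '(%a+), (%a+)', '%2, %1' )  -- 'мир, Привет!'
```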

Loadable libraries

These libraries are not included by default, but if needed may be loaded using require().

bit32

This emulation of the Lua 5.2 bit32 library may be loaded using

bit32 = require( 'bit32' )

The bit32 library provides bitwise operations on unsigned 32-bit integers. Input numbers are truncated to integers (in an unspecified manner) and reduced modulo 2^32 so the value is in the range 0 to 2^32−1; return values are also in this range.

When bits are numbered (as in bit32.extract()), 0 is the least-significant bit (the one with value 2^0) and 31 is the most-significant (the one with value 2^31).

bit32.band

bit32.band( ... )

Returns the bitwise AND of its arguments: the result has a bit set only if that bit is set in all of the arguments.

If given zero arguments, the result has all bits set.

bit32.bnot

bit32.bnot( x )

Returns the bitwise complement of x.

bit32.bor

bit32.bor( ... )

Returns the bitwise OR of its arguments: the result has a bit set if that bit is set in any of the arguments.

If given zero arguments, the result has all bits clear.

bit32.btest

bit32.btest( ... )

Equivalent to bit32.band( ... ) ~= 0

bit32.bxor

bit32.bxor( ... )

Returns the bitwise XOR of its arguments: the result has a bit set if that bit is set in an odd number of the arguments.

If given zero arguments, the result has all bits clear.
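The basic operations described so far, applied to two small sample values:

```lua
local bit32 = require( 'bit32' )

local a, b = 5, 3                        -- binary 101 and 011
local andResult = bit32.band( a, b )     -- 1 (binary 001)
local orResult = bit32.bor( a, b )       -- 7 (binary 111)
local xorResult = bit32.bxor( a, b )     -- 6 (binary 110)
local overlap = bit32.btest( a, 2 )      -- false: 101 and 010 share no bits

-- Results always lie in the range 0 to 2^32-1.
local allSet = bit32.band()              -- 4294967295 (no arguments: all bits set)
local inverted = bit32.bnot( 0 )         -- 4294967295
local wrapped = bit32.band( -1 )         -- 4294967295: -1 reduced modulo 2^32
```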

bit32.extract

bit32.extract( n, field, width )

Extracts width bits from n, starting with bit field. Accessing bits outside of the range 0 to 31 is an error.

If not specified, the default for width is 1.

bit32.replace

bit32.replace( n, v, field, width )

Replaces width bits in n, starting with bit field, with the low width bits from v. Accessing bits outside of the range 0 to 31 is an error.

If not specified, the default for width is 1.
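A short sketch of reading and writing bit fields; the sample value 0xAB is arbitrary:

```lua
local bit32 = require( 'bit32' )

local n = 0xAB                                   -- binary 1010 1011
local highNibble = bit32.extract( n, 4, 4 )      -- 10 (0xA): bits 4 to 7
local lowBit = bit32.extract( n, 0 )             -- 1: width defaults to 1
local patched = bit32.replace( n, 0xC, 0, 4 )    -- 172 (0xAC): low nibble replaced

-- A field that would extend past bit 31 raises an error.
local inRange = pcall( bit32.extract, n, 30, 4 ) -- false
```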

bit32.lshift

bit32.lshift( n, disp )

Returns the number n shifted disp bits to the left. This is a logical shift: inserted bits are 0. This is generally equivalent to multiplying by 2^disp.

Note that a displacement over 31 will result in 0.

bit32.rshift

bit32.rshift( n, disp )

Returns the number n shifted disp bits to the right. This is a logical shift: inserted bits are 0. This is generally equivalent to dividing by 2^disp.

Note that a displacement over 31 will result in 0.

bit32.arshift

bit32.arshift( n, disp )

Returns the number n shifted disp bits to the right. This is an arithmetic shift: if disp is positive, the inserted bits will be the same as bit 31 in the original number.

Note that a displacement over 31 will result in 0 or 4294967295.
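The three shift functions can be compared side by side. The values below were chosen to show where logical and arithmetic shifts differ:

```lua
local bit32 = require( 'bit32' )

local left = bit32.lshift( 1, 4 )                 -- 16, the same as 1 * 2^4
local overflow = bit32.lshift( 0x80000000, 1 )    -- 0: the top bit is shifted out
local right = bit32.rshift( 0x80000000, 31 )      -- 1: zeros are inserted on the left
local arith = bit32.arshift( 0x80000000, 31 )     -- 4294967295: copies of bit 31 are inserted

-- Displacements over 31.
local gone = bit32.lshift( 1, 32 )                -- 0
local arithGone = bit32.arshift( 0x80000000, 32 ) -- 4294967295
```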

bit32.lrotate

bit32.lrotate( n, disp )

Returns the number n rotated disp bits to the left.

Note that rotations are equivalent modulo 32: a rotation of 32 is the same as a rotation of 0, 33 is the same as 1, and so on.

bit32.rrotate

bit32.rrotate( n, disp )

Returns the number n rotated disp bits to the right.

Note that rotations are equivalent modulo 32: a rotation of 32 is the same as a rotation of 0, 33 is the same as 1, and so on.
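Unlike shifts, rotations do not lose bits, as this sketch with arbitrary sample values shows:

```lua
local bit32 = require( 'bit32' )

-- Bit 31 wraps around to bit 0, and bit 0 moves to bit 1.
local rotatedLeft = bit32.lrotate( 0x80000001, 1 )   -- 3
-- Bit 0 wraps around to bit 31.
local rotatedRight = bit32.rrotate( 1, 1 )           -- 2147483648 (0x80000000)

-- Rotating by 33 is the same as rotating by 1.
local same = bit32.lrotate( 0x12345678, 33 ) == bit32.lrotate( 0x12345678, 1 )  -- true
```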

libraryUtil

This library contains methods useful when implementing Scribunto libraries. It may be loaded using

libraryUtil = require( 'libraryUtil' )

libraryUtil.checkType

libraryUtil.checkType( name, argIdx, arg, expectType, nilOk )

Raises an error if type( arg ) does not match expectType. However, no error is raised if arg is nil and nilOk is true.

name is the name of the calling function, and argIdx is the position of the argument in the argument list. These are used in formatting the error message.

libraryUtil.checkTypeMulti

libraryUtil.checkTypeMulti( name, argIdx, arg, expectTypes )

Raises an error if type( arg ) does not match any of the strings in the array expectTypes.

This is for arguments that have more than one valid type.
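A sketch of how the two checks might be used at the top of a function. The function greet and its parameters are hypothetical and exist only for this example.

```lua
local libraryUtil = require( 'libraryUtil' )

-- Hypothetical function: name may be a string or a number, times is an optional number.
local function greet( name, times )
    libraryUtil.checkTypeMulti( 'greet', 1, name, { 'string', 'number' } )
    libraryUtil.checkType( 'greet', 2, times, 'number', true )
    return string.rep( 'Hello, ' .. name .. '! ', times or 1 )
end

local ok, text = pcall( greet, 'world', 2 )   -- true, 'Hello, world! Hello, world! '
local okNoTimes = pcall( greet, 42 )          -- true: a number is accepted, times may be nil
local badName = pcall( greet, {} )            -- false: a table is neither string nor number
local badTimes = pcall( greet, 'world', 'x' ) -- false: times must be a number or nil
```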

libraryUtil.checkTypeForIndex

libraryUtil.checkTypeForIndex( index, value, expectType )

Raises an error if type( value ) does not match expectType.

This is intended for use in implementing a __newindex metamethod.

libraryUtil.checkTypeForNamedArg

libraryUtil.checkTypeForNamedArg( name, argName, arg, expectType, nilOk )

Raises an error if type( arg ) does not match expectType. However, no error is raised if arg is nil and nilOk is true.

This is intended to be used as an equivalent to libraryUtil.checkType() in methods called using Lua's "named argument" syntax, func{ name = value }.
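For a function called with the named-argument syntax, the checks might look like this sketch. The function makeBox and its argument names are hypothetical.

```lua
local libraryUtil = require( 'libraryUtil' )

-- Hypothetical function taking one table of named arguments:
-- width is required, label is optional.
local function makeBox( args )
    libraryUtil.checkType( 'makeBox', 1, args, 'table' )
    libraryUtil.checkTypeForNamedArg( 'makeBox', 'width', args.width, 'number' )
    libraryUtil.checkTypeForNamedArg( 'makeBox', 'label', args.label, 'string', true )
    return ( args.label or 'box' ) .. ':' .. args.width
end

local ok, result = pcall( makeBox, { width = 3 } )   -- true, 'box:3'
local failed = pcall( makeBox, { width = 'wide' } )  -- false: width is not a number
```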

libraryUtil.makeCheckSelfFunction

libraryUtil.makeCheckSelfFunction( libraryName, varName, selfObj, selfObjDesc )

This is intended for use in implementing "methods" on object tables that are intended to be called with the obj:method() syntax. It returns a function that should be called at the top of these methods with the self argument and the method name, which will raise an error if that self object is not selfObj.

This function will generally be used in a library's constructor function, something like this:

 function myLibrary.new()
     local obj = {}
     local checkSelf = libraryUtil.makeCheckSelfFunction( 'myLibrary', 'obj', obj, 'myLibrary object' )
 
     function obj:method()
         checkSelf( self, 'method' )
     end
 
     function obj:method2()
         checkSelf( self, 'method2' )
     end
 
     return obj
 end
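To see what the check catches, the constructor can be exercised as in this sketch. It repeats a cut-down version of the hypothetical myLibrary so that it stands alone, and uses pcall to observe the errors.

```lua
local libraryUtil = require( 'libraryUtil' )

-- Cut-down version of the hypothetical library, with one method.
local myLibrary = {}

function myLibrary.new()
    local obj = {}
    local checkSelf = libraryUtil.makeCheckSelfFunction( 'myLibrary', 'obj', obj, 'myLibrary object' )

    function obj:method()
        checkSelf( self, 'method' )
        return 'called'
    end

    return obj
end

local obj = myLibrary.new()
local other = myLibrary.new()

local colonCall = pcall( obj.method, obj )    -- true: equivalent to obj:method()
local dotCall = pcall( obj.method )           -- false: called as obj.method(), so self is nil
local wrongSelf = pcall( obj.method, other )  -- false: self is a different object
```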

luabit

The luabit library modules "bit" and "hex" may be loaded using

bit = require( 'luabit.bit' )
hex = require( 'luabit.hex' )

Note that the bit32 library contains the same operations as "luabit.bit", and the operations in "luabit.hex" may be performed using string.format() and tonumber().
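For instance, hexadecimal conversion in both directions needs only the standard functions:

```lua
local hexString = string.format( '%x', 255 )      -- 'ff'
local padded = string.format( '0x%08X', 255 )     -- '0x000000FF'
local number = tonumber( 'ff', 16 )               -- 255
```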

The luabit module "noki" is not available, as it is entirely useless in Scribunto. The luabit module "utf8" is also not available, as it was considered redundant to the Ustring library.

ustring

The pure-Lua backend to the Ustring library may be loaded using

ustring = require( 'ustring' )

In all cases the Ustring library (mw.ustring) should be used instead, as that replaces many of the slower and more memory-intensive operations with callbacks into PHP code.

Extension libraries (mw.ext)

The following MediaWiki extensions provide additional Scribunto libraries:

See also the lists of extensions using the ScribuntoExternalLibraries and ScribuntoExternalLibraryPaths hooks.

Planned Scribunto libraries

These libraries are planned, or are in Gerrit pending review.

(none at this time)

Differences from standard Lua

Changed functions

The following functions have been modified:

setfenv()
getfenv()
May not be available, depending on the configuration. If available, attempts to access parent environments will fail.
getmetatable()
Works on tables only to prevent unauthorized access to parent environments.
tostring()
Pointer addresses of tables and functions are not provided. This is to make memory corruption vulnerabilities more difficult to exploit.
pairs()
ipairs()
Support for the __pairs and __ipairs metamethods (added in Lua 5.2) has been added.
pcall()
xpcall()
Certain internal errors cannot be intercepted.
require()
Can fetch certain built-in modules distributed with Scribunto, as well as modules present in the Module namespace of the wiki. To fetch wiki modules, use the full page name including the namespace. Cannot otherwise access the local filesystem.
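As an illustration of the __pairs support mentioned above, the following sketch iterates over a proxy table whose contents live in a separate backing table. The table names are invented for the example.

```lua
-- The proxy holds no keys of its own; __pairs redirects iteration to the backing table.
local backing = { a = 1, b = 2 }
local proxy = setmetatable( {}, {
    __index = backing,
    __pairs = function ( t )
        return pairs( backing )
    end,
} )

local seen = {}
for k, v in pairs( proxy ) do
    seen[k] = v
end
-- seen.a is 1 and seen.b is 2
```

In standard Lua 5.1, which ignores __pairs, the same loop would visit nothing.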

Removed functions and packages

The following packages are mostly removed. Only those functions listed are available:

package.*
Filesystem and C library access has been removed. Available functions and tables are:
package.loaded
package.preload
package.loaders
Loaders which access the local filesystem or load C libraries are not present. A loader for Module-namespace pages is added.
package.seeall()
os.*
Some of the functions here, such as os.execute(), are insecure and cannot be allowed. Available functions are:
os.clock()
os.date()
os.difftime()
os.time()
debug.*
Most of the functions are insecure. Available functions are:
debug.traceback()

The following functions and packages are not available:

collectgarbage()
module()
coroutine.*
No application for these is known in Scribunto, so they have not been reviewed for security.
dofile()
loadfile()
io.*, file.*
Allows local filesystem access, which is insecure.
load()
loadstring()
These were omitted to allow for static analysis of the Lua source code. Also, allowing these would allow Lua code to be added directly to article and template pages, which was not desired for usability reasons.
print()
This was discussed on wikitech-l and it was decided that it should be omitted in favour of return values, to improve code quality. If necessary, mw.log() may be used to output information to the debug console.
string.dump()
May expose private data from parent environments.

Additional caveats

Referential data structures
Circular data structures and data structures where the same node may be reached by more than one path cannot be correctly sent to PHP. Attempting to do so will cause undefined behavior. This includes (but is not limited to) returning such data structures from the module called by {{#invoke:}} and passing such data structures as parameters to Scribunto library functions that are implemented as callbacks into PHP.

Such data structures may be used freely within Lua, including as the return values of modules loaded with mw.loadData().
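The two problem shapes look like this in practice. The sketch also shows one possible way to avoid sharing, by giving each branch its own deep copy with mw.clone(); this workaround is a suggestion for the example, not a documented requirement.

```lua
-- The same node reachable by two paths.
local shared = { 1, 2 }
local dag = { left = shared, right = shared }

-- A circular structure.
local cycle = {}
cycle.self = cycle

-- A plain tree: each branch gets an independent copy.
local tree = { left = mw.clone( shared ), right = mw.clone( shared ) }
-- tree.left and tree.right are now distinct tables with equal contents
```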

Writing Scribunto libraries

This information is useful to developers writing additional Scribunto libraries, whether for inclusion in Scribunto itself or for providing an interface for their own extensions.

A Scribunto library will generally consist of five parts:

  • The PHP portion of the library.
  • The Lua portion of the library.
  • The PHP portion of the test cases.
  • The Lua portion of the test cases.
  • The documentation.

Existing libraries serve as a good example.

Library

The PHP portion of the library is a class that must extend Scribunto_LuaLibraryBase. See the documentation for that class for implementation details. In the Scribunto extension, this file should be placed in engines/LuaCommon/NameLibrary.php, and a mapping added to Scribunto_LuaEngine::$libraryClasses. Other extensions should use the ScribuntoExternalLibraries hook. In either case, the key should match the Lua module name ("mw.name" for libraries in Scribunto, or "mw.ext.name" for extension libraries).

The Lua portion of the library sets up the table containing the functions that can be called from Lua modules. In the Scribunto extension, the file should be placed in engines/LuaCommon/lualib/mw.name.lua. This file should generally include boilerplate something like this:

local object = {}
local php

function object.setupInterface( options )
    -- Remove setup function
    object.setupInterface = nil

    -- Copy the PHP callbacks to a local variable, and remove the global
    php = mw_interface
    mw_interface = nil

    -- Do any other setup here

    -- Install into the mw global
    mw = mw or {}
    mw.ext = mw.ext or {}
    mw.ext.NAME = object

    -- Indicate that we're loaded
    package.loaded['mw.ext.NAME'] = object
end

return object

The module in engines/LuaCommon/lualib/libraryUtil.lua (load this with local util = require 'libraryUtil') contains some functions that may be helpful.

Be sure to run the Scribunto test cases with your library loaded, even if your library doesn't itself provide any test cases. The standard test cases include tests for things like libraries adding unexpected global variables. Also, if the library is loaded with PHP, any upvalues that its Lua functions have will not be reset between #invoke's. Care must be taken to ensure that modules can't abuse this to transfer information between #invoke's.

Test cases

The Scribunto extension includes a base class for test cases, Scribunto_LuaEngineTestBase, which will run the tests against both the LuaSandbox and LuaStandalone engines. The library's test case should extend this class, and should not override static function suite(). In the Scribunto extension, the test case should be in tests/engines/LuaCommon/NameLibraryTest.php and added to the array in ScribuntoHooks::unitTestsList() (in common/Hooks.php); extensions should add the test case in their own UnitTestsList hook function, probably conditional on whether $wgAutoloadClasses['Scribunto_LuaEngineTestBase'] is set.

Most of the time, all that is needed to make the test case is this:

class ClassNameTest extends Scribunto_LuaEngineTestBase {
    protected static $moduleName = 'ClassNameTest';

    function getTestModules() {
         return parent::getTestModules() + array(
             'ClassNameTest' => __DIR__ . '/ClassNameTests.lua',
         );
    }
}

This will load the file ClassNameTests.lua as if it were the page "Module:ClassNameTest", expecting it to return an object with the following properties:

  • count: Integer, number of tests
  • provide( n ): Function that returns three values: n, the name of test n, and a string that is the expected output for test n.
  • run( n ): Function that runs test n and returns one string.
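A test module that provides these three properties by hand might look like the following sketch. The test cases are invented, and the table name suite is arbitrary.

```lua
-- Hypothetical contents of ClassNameTests.lua, written without any helper module.
local cases = {
    { name = 'string.upper', expect = 'ABC',
      run = function () return string.upper( 'abc' ) end },
    { name = 'string.rep', expect = 'ababab',
      run = function () return string.rep( 'ab', 3 ) end },
}

local suite = { count = #cases }

-- Returns n, the name of test n, and its expected output.
function suite.provide( n )
    return n, cases[n].name, cases[n].expect
end

-- Runs test n and returns its output as one string.
function suite.run( n )
    return cases[n].run()
end

-- The real file would end with: return suite
```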

If getTestModules() is declared as shown, "Module:TestFramework" is available which provides many useful helper methods. If this is used, ClassNameTests.lua would look something like this:

local testframework = require 'Module:TestFramework'

return testframework.getTestProvider( {
    -- Tests go here
} )

Each test is itself a table, with the following properties:

  • name: The name of the test.
  • func: The function to execute.
  • args: Optional table of arguments to pass to the function.
  • expect: Results to expect.
  • type: Optional "type" of the test, default is "Normal".

The type controls the format of expect and how func is called. Included types are:

  • Normal: expect is a table of return values, or a string if the test should raise an error. func is simply called.
  • Iterator: expect is a table of tables of return values. func is called as with an iterated for loop, and each iteration's return values are accumulated.
  • ToString: Like "Normal", except each return value is passed through tostring().

Test cases in another extension

There are (at least) two ways to run PHPUnit tests:

  1. Run phpunit against core, allowing the tests/phpunit/suites/ExtensionsTestSuite.php to find the extension's tests using the UnitTestsList hook. If your extension's test class names all contain a unique component (e.g. the extension's name), the --filter option may be used to run only your extension's tests.
  2. Run phpunit against the extension directory, where it will pick up any file ending in "Test.php".

Either method works if Scribunto is loaded in LocalSettings.php. Method #1 also works easily when Scribunto is not loaded, as the UnitTestsList hook can be written to avoid returning the Scribunto test when $wgAutoloadClasses['Scribunto_LuaEngineTestBase'] is not set.

Jenkins, however, uses method #2. For Jenkins to run the tests properly, you will need to add Scribunto as a dependency for your extension. See Gerrit change 56570 for an example of how this is done.

If for some reason you need the tests to be able to run using method #2 without Scribunto loaded, one workaround is to add this check to the top of your unit test file:

 if ( !isset( $GLOBALS['wgAutoloadClasses']['Scribunto_LuaEngineTestBase'] ) ) {
     return;
 }

Documentation

Modules included in Scribunto should include documentation in the Scribunto libraries section above. Extension libraries should include documentation in a subpage of their own Extension page, and link to that documentation from the Extension libraries (mw.ext) section above.

See also

License

This manual is derived from the Lua 5.1 reference manual, which is available under the MIT license.

This derivative manual may also be copied under the terms of the same license.