Extension:Regex Fun
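A request to archive this extension has been made on Phabricator. See task T374076 for the archive request and its rationale, and leave any comments on the request there.
This extension is currently not actively maintained! It may still work, but bug reports and feature requests are likely to be ignored. If you are interested in taking over the development or maintenance of this extension, you can request ownership of the repository. As a courtesy, you are encouraged to contact the author. If you take over maintenance, this template should be removed, and you should add your name as maintainer in the {{Extension}} infobox on the extension page.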
Regex Fun | Release status: unmaintained |
---|---|
Implementation | Parser function |
Description | Adds parser functions allowing the use of regular expressions within wiki pages |
Author | Daniel Werner (Danwe, talk) |
Latest version | 1.3.0 (2017-07-27) |
MediaWiki | 1.23+ |
PHP | 5.3+ |
Database changes | No |
License | ISC License |
Download | README CHANGELOG |
If translations are available on translatewiki.net, please help translate the Regex Fun extension.
The Regex Fun extension provides four parser functions for regular expression handling within wiki pages. Besides offering richer functionality, it differs from other regex extensions such as RegexFunctions in three ways: it provides a function #regexquote for escaping user-provided strings for use within a regular expression, it provides a function #regex_var to access subexpression results, and it introduces the MediaWiki-specific regular expression modifier r while changing the meaning of the e modifier in a useful and secure way.
Usage
This section introduces each of the four regular expression functions that come with this extension.
#regex
search mode
Syntax: {{#regex: text| pattern }}
When the third argument is omitted, this function allows simple search via regular expression. It will return the first match within a string. For example:
{{#regex: foo 10$, Baa 21$, baa 3$ | /baa\s+\d+\s*\$/i }}
should return: Baa 21$
replace mode
Syntax: {{#regex: text | pattern | replacement | limit }}
Through a third argument (replacement), the #regex function allows each match of the expression to be replaced. For example:
{{#regex: Foo 10, Baa 21 | /(.+?)\s+(\d+)\s*/ | $1 $2\$ }}
should return: Foo 10$, Baa 21$
The replacement parameter accepts back-references in the format $n, where n is a number. The format \n is also possible, but not recommended.
- limit
An optional fourth argument can be used to limit how many replacements are made. Any further matches are left in the output unchanged. The default limit is -1, which means that every match is replaced.
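The effect of the limit can be tried outside a wiki. The following is a minimal Python sketch, not the extension's PHP code: regex_replace is a hypothetical helper, Python writes back-references as \1 instead of $1, and the treatment of a limit of 0 is an assumption.

```python
import re

def regex_replace(text, pattern, replacement, limit=-1):
    """Replace matches of pattern in text, at most `limit` times.

    Hypothetical helper: a negative limit means "replace every match",
    mirroring the documented default of -1.
    """
    if limit == 0:
        # Assumption: a limit of zero performs no replacements.
        return text
    # Python's re.sub uses count=0 for "no limit".
    count = 0 if limit < 0 else limit
    return re.sub(pattern, replacement, text, count=count)

text = "Foo 10, Baa 21"
pattern = r"(.+?)\s+(\d+)\s*"
# Python writes back-references as \1 rather than $1.
print(regex_replace(text, pattern, r"\1 \2$"))           # Foo 10$, Baa 21$
print(regex_replace(text, pattern, r"\1 \2$", limit=1))  # Foo 10$, Baa 21
```

With limit=1 the second match is still found by the pattern but is passed through unchanged.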
- modifiers (flags)
In replace mode, the regex pattern allows the use of PHP PCRE regex modifiers as flags. #regex can use most of the PCRE modifiers with their exact meaning. The exception is the e modifier, which would be a security risk and is therefore treated differently.
e modifier
Syntax: e.g. /(.+?)\s+(\d+)\s*/e
Before matching strings are replaced, references in the replacement string (such as $0 or \1) are replaced by their matches. If e is set, the replacement string is then parsed as wikitext before being inserted into the original string. In addition, the wikitext within the replacement parameter is not parsed before the regular expression runs. This makes it possible to use parser functions or templates within the replacement string that operate on the matched references.
{{#regex: Foo 10, Baa 21 | /(.+?)\s+(\d+)\s*/e | {{uc:$1}} $2\$ }}
Provided that Template:(( and Template:)) exist within the wiki, this should return: FOO 10$, BAA 21$
Without e flag:
{{#regex: cat, dog, bear |/\w+/ | {{uc:a $0}} }}
should return A cat, A dog, A bear, because {{uc:a $0}} is parsed first and yields just "A $0" before #regex itself is evaluated.
With e flag:
{{#regex: cat, dog, bear |/\w+/e | {{uc:a $0}} }}
should return A CAT, A DOG, A BEAR, because the e flag delays the parsing of the third parameter until it is used as the replacement.
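The difference in evaluation order can be mimicked outside MediaWiki. In this Python sketch, str.upper stands in for the wikitext parser evaluating {{uc:...}}, and expand is a hypothetical helper that fills in $n references; neither is part of the extension.

```python
import re

def expand(template, match):
    """Replace $n references in template with the groups of match."""
    return re.sub(r"\$(\d+)", lambda ref: match.group(int(ref.group(1))), template)

text = "cat, dog, bear"

# Without e: the "parser" (here str.upper) runs on the template first,
# so only the literal "a" is uppercased before $0 is filled in.
early = re.sub(r"\w+", lambda m: expand("a $0".upper(), m), text)

# With e: references are filled in first, then the result is "parsed".
late = re.sub(r"\w+", lambda m: expand("a $0", m).upper(), text)

print(early)  # A cat, A dog, A bear
print(late)   # A CAT, A DOG, A BEAR
```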
r modifier
If set, the function will return an empty string if no replacement could be done (if the expression didn't match anything). Without the flag the function would simply return the input string.
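As a rough illustration of this rule, the Python sketch below uses a hypothetical r_flag argument in place of the r modifier; it models only the documented return value, not the extension's implementation.

```python
import re

def regex_replace_r(text, pattern, replacement, r_flag=False):
    """Replace all matches; with r_flag, return "" when nothing matched."""
    result, replaced = re.subn(pattern, replacement, text)
    if replaced == 0 and r_flag:
        return ""
    return result

print(repr(regex_replace_r("no digits", r"\d+", "#")))               # 'no digits'
print(repr(regex_replace_r("no digits", r"\d+", "#", r_flag=True)))  # ''
print(repr(regex_replace_r("a1", r"\d+", "#", r_flag=True)))         # 'a#'
```

An empty result is convenient when the call is wrapped in a conditional such as #if, since it distinguishes "nothing matched" from "input returned unchanged".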
#regexall
Syntax: {{#regexall: text | pattern | separator | offset | length }}
Searches the whole string for all matches and returns them joined by a separator. The offset and length parameters control at which match the result starts and where it ends. This function can be particularly useful together with the Arrays and HashTables extensions.
- Optional parameters
- separator (default ","): The string placed between the returned matches.
- offset (default 0): If non-negative, the first item comes from that offset. If negative, the first item comes that far from the end of all items.
- length (default ""): If set and non-negative, the result contains that many items. If negative, the last item comes that far from the end of all items.
- Example
{{#regexall: 0+11+2+33 | /\d+/ | , | 1 | 2 }}
should return 11,2
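The offset and length rules can be sketched with Python list slicing. regex_all is a hypothetical helper that only mirrors the parameter semantics described above; it is not the extension's code.

```python
import re

def regex_all(text, pattern, separator=",", offset=0, length=None):
    """Join the matches of pattern in text, selected by offset and length."""
    matches = [m.group(0) for m in re.finditer(pattern, text)]
    # A negative offset counts from the end of all matches.
    selected = matches[offset:]
    if length is not None:
        # Non-negative: keep that many items.
        # Negative: stop that far from the end of all matches.
        selected = selected[:length]
    return separator.join(selected)

text = "0+11+2+33"
print(regex_all(text, r"\d+", ",", 1, 2))     # 11,2
print(regex_all(text, r"\d+", offset=-2))     # 2,33
print(regex_all(text, r"\d+", length=-1))     # 0,11,2
```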
#regex_var
Syntax: {{#regex_var: index, or reference string with index(es) written as $n | default }}
This function gives access to the subexpression references of the most recently used #regex function. The matches available to #regex_var depend on whether #regex was used in search or replace mode:
- In search mode, only one match is available.
- In replace mode, all matches should be available for use in #regex_var.
- parameter 1
Subexpressions are the parts within parentheses "()" which can be referenced as $n (where n is a number) within #regex in replace mode. You can either
- set the first parameter to a specific index, or
- pass a whole string containing references, following the rules of the #regex replacement string.
Accessing named subexpressions (like (?P<name>expr)) has not been implemented.
- parameter 2
The second parameter, default, is optional and can be used to provide a default string in case the given index doesn't exist, the last use of #regex failed, or #regex hasn't been called yet.
- Examples
Using a specific index: {{#regex_var: 0 | nothing }}
Using a reference string with $n: {{#regex_var: $3, $1 and $2 }}
- Note on scope
The last use of #regex on a page may be followed by a call to a template that itself calls #regex. A subsequent #regex_var will then refer to the #regex call made inside the template, not to the one on the page.
This may lead to confusing output if you are not aware of what the template contains. This might be fixed in a later version.
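The "last match wins" behaviour, including the scope caveat, can be modelled with a small Python sketch. All names here are hypothetical, the module-level variable merely stands in for whatever state the extension keeps, and expanding references to missing groups as empty strings is an assumption.

```python
import re

_last_match = None  # hypothetical stand-in for the state #regex leaves behind

def regex_search(text, pattern):
    """Search mode: return the first match and remember it."""
    global _last_match
    _last_match = re.search(pattern, text)
    return _last_match.group(0) if _last_match else ""

def _group_or_none(index):
    """Return group `index` of the last match, or None if unavailable."""
    if _last_match is None or index > _last_match.re.groups:
        return None
    return _last_match.group(index)

def regex_var(ref, default=""):
    """Return a group by index, or expand $n references in a string."""
    if _last_match is None:
        return default
    ref = ref.strip()
    if ref.isdecimal():
        value = _group_or_none(int(ref))
        return default if value is None else value
    # Assumption: references to missing groups expand to nothing.
    return re.sub(r"\$(\d+)", lambda r: _group_or_none(int(r.group(1))) or "", ref)

regex_search("Foo 10", r"(\w+)\s+(\d+)")
print(regex_var("0"))             # Foo 10
print(regex_var("$2 and $1"))     # 10 and Foo
print(regex_var("5", "nothing"))  # nothing

# A later search (for instance one made inside a template) replaces the state.
regex_search("Baa 21", r"(\w+)\s+(\d+)")
print(regex_var("1"))             # Baa
```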
#regexquote
Syntax: {{#regexquote: text | delimiter }}
#regexquote is a function used to escape user-provided data that is used as part of a regular expression. User-provided input, such as data passed in through a template parameter, should always be run through this function first to make sure that special characters like "." or "\" in the input do not interfere with the regular expression. Technically, this function runs the PHP function preg_quote over the string and, if the first character has a special meaning in MediaWiki, replaces it with its hexadecimal notation, e.g. "\x23" instead of "#" (to prevent the line from becoming a MediaWiki list).
- Parameter 2 (delimiter)
The delimiter parameter is optional. It should be the character used as the delimiter of the regular expression in which the text will be used. By default, the delimiter is set to "/", since it is the most common delimiter in examples.
- Example
{{#regex: {{{Items|}}} | %(?:^{{!}}(?<=,))(\s*{{#regexquote: {{{Favorite}}} | % }}\s*)(?:${{!}}(?=\,))% | '''$''' }}
This highlights an item given by the template parameter Favorite within a comma-separated list of items given by the parameter Items. If #regexquote were not used here and Favorite contained a special character, the expression could break and return an error message.
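Why quoting matters can be shown with a Python analogue. Here re.escape plays the role that preg_quote plays in PHP (Python patterns have no delimiter, so there is no delimiter argument), and highlight_favorite is a hypothetical helper that rebuilds the pattern from the example.

```python
import re

def highlight_favorite(items, favorite, quote=True):
    """Wrap the favorite item of a comma-separated list in ''' markers."""
    needle = re.escape(favorite) if quote else favorite
    pattern = r"(?:^|(?<=,))(\s*" + needle + r"\s*)(?:$|(?=,))"
    return re.sub(pattern, r"'''\1'''", items)

items = "a.b, axb, c"
# Quoted: the dot is literal, so only "a.b" is highlighted.
print(highlight_favorite(items, "a.b"))               # '''a.b''', axb, c
# Unquoted: the dot matches any character, so "axb" is highlighted too.
print(highlight_favorite(items, "a.b", quote=False))  # '''a.b''',''' axb''', c
```

In this case the unquoted input silently changes the meaning of the pattern; other characters, such as an unbalanced parenthesis, would make the pattern invalid altogether.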
invalid regular expression handling
Instead of outputting a PHP notice when a regular expression is invalid, the extension outputs an inline wiki error message, which can be caught by the #iferror function of the ParserFunctions extension.
Installation
- Download the files and place them in a directory named RegexFun inside your extensions/ folder.
Developers and code contributors should instead install the extension from Git, using:
cd extensions/
git clone https://gerrit.wikimedia.org/r/mediawiki/extensions/RegexFun
- Add the following code at the bottom of your LocalSettings.php file:
require_once "$IP/extensions/RegexFun/RegexFun.php";
- Configure as required.
- Done: navigate to Special:Version on your wiki to verify that the extension was installed successfully.
Configuration
Regex Fun comes with two global configuration variables. Their default values can be changed by setting them in LocalSettings.php after the inclusion of RegexFun.php.
$egRegexFunDisabledFunctions
- Array defining functions that should not be available for use within the wiki. For example, to prevent users from using #regex_var and #regexall, set this to:
$egRegexFunDisabledFunctions = array( 'regexall', 'regex_var' );
$egRegexFunMaxRegexPerParse
- Defines the maximum number of regular expression executions per ongoing parser process. This counts all major regular expression usages triggered by this extension. The counter is increased by #regex, by #regexall, and by #regex_var when a reference string is given, but not when only an index is requested. #regexquote is not affected. When a call would exceed the limit, an error message that #iferror can catch is output instead of the result of the called function. By default the limit is set to -1, which disables any limitation.
- Note: Instead of a limit per page, this is a limit per Parser::parse() process, bound to each Parser object. This avoids complications on page import or when the job queue is updating pages, because a single ever-increasing global counter would be per session rather than per page.
See also
- Extension:RegexFunctions - another regular expression extension with less functionality but more global configuration variables for imposing further limits.
- Extension:ReplaceSet - Allows several replacement strings within one function call.