Extension talk:SyntaxHighlight

About this board

Previous discussion was archived at Extension talk:SyntaxHighlight/Archive 2017 on 2017-03-29.

Is it always installed now?

Amire80 (talkcontribs)

The page says: "This extension comes with MediaWiki 1.21 and above".

Does this mean that I can use the <syntaxhighlight> tag in something that another extension outputs and be sure that it will work?

BDavis (WMF) (talkcontribs)

</syntaxhighlight> within <syntaxhighlight>

One cookie (talkcontribs)

Re. the edit of 16 January 2024:

Hiya @Jdforrester (WMF): in the edit you made, the unescaped <syntaxhighlight> tags made the code examples render as highlighted output rather than as source. I can't see a way to retain syntax highlighting while having the actual tags appear within the block of highlighted code when it is created with normal tags. (Why would anyone ever need to escape characters between tags which are used to escape whole blocks of code?!) The {{#tag:}} magic word works instead of the {{SyntaxHighlight source}} template: it looks a bit cleaner, it's how the page's other examples have done it, and like the template it will let its contents line-wrap, which the raw tags don't. But I guess it looks similarly confusing in the source. Any suggestions for a better solution?

Jdforrester (WMF) (talkcontribs)
One cookie (talkcontribs)

They only got one - there was a second starting at line 267

Some lexers put pages using <syntaxhighlight> in the tracking category

Summary by Snowyamur9889

Forgot to set this as resolved 8 months ago. The solution was to upgrade the Pygments version used by the extension from 2.11.2 to 2.15.0.

Snowyamur9889 (talkcontribs)

Pygments lexers like "lua", "html", and "css" put pages using <syntaxhighlight> into "Category:Pages with syntax highlighting errors", despite these lexers being correct. Maybe there are other lexers causing this problem - I don't know. But the three I mentioned above are the ones I commonly use for template/module documentation.


I don't know why this happens. The wiki I contribute to runs on MediaWiki 1.39 with Extension:SyntaxHighlight version 2.0. The Pygments documentation states that the short names for the lexers I mentioned above are correct, so why are the pages that use them with <syntaxhighlight> placed in the tracking category?


Does anyone know?

Tacsipacsi (talkcontribs)

Does the code get highlighted? (It’s always good to share a link so that I can answer such simple questions on my own.) The error category gets added whenever the highlighting is unsuccessful – this includes not only incorrect lexer names, but also when the code is too big or calling Pygments fails for any reason.

Snowyamur9889 (talkcontribs)

The code gets highlighted, so I don't know why pages using <syntaxhighlight> end up in the tracking category. There are pages in the tracking category for this wiki that use syntax highlighting correctly but still end up categorized.

Tacsipacsi (talkcontribs)

Could you please provide a link? It’s very hard to debug without being able to see what exactly is happening.

Snowyamur9889 (talkcontribs)

I apologize for the two-month-late response, User:Tacsipacsi. I tried replying sooner, but an error with Extension:StructuredDiscussions on MediaWiki was preventing me from entering a response about a month ago.


I just wanted to let you know that the lexers I reported two months ago as not working now work again. The pages on the wiki I contribute to no longer end up being incorrectly categorized under this extension's tracking category when using <syntaxhighlight> correctly.


This issue I had may have been fixed with a major version update to Pygments. During the time I had the categorization issue, this extension was using Pygments 2.11.2. It now uses Pygments 2.15.0 as of April 10 of this year (commit 9f9c13b).

Syntax highlight on MediaWiki without Python

Aoineko (talkcontribs)

Hello,

I can't install Python on my server (it only runs PHP/MySQL); is there a syntax highlighting solution that doesn't require Python?

I'm using MediaWiki 1.30.2.

Cavila (talkcontribs)

I don't know about MediaWiki support, but there are a couple of extensions listed at the bottom of the page that you could try out.


Any idea how to add a copy-to-clipboard button?

217.133.38.27 (talkcontribs)

?

Ernstkm (talkcontribs)

Extension:PreToClip appears to do that. Adding a widget would be another possibility, if you're not happy with how the PreToClip extension does it. The latter option would require getting your hands dirty, though. --Ernstkm (talk) 11:51, 25 October 2023 (UTC)


Syntax-highlight text from a template?

Jordan Brown (talkcontribs)

I want to have a template that contains precisely the source file to be displayed, and to drop it into another page syntax-highlighted.

Conceptually, what I want is something like

<syntaxhighlight> {{msgnw:MySourceFile}} </syntaxhighlight>

but of course that doesn't expand the template.

{{Pre|{{msgnw:MySourceFile}}}} sort of works, but it's not quite the same formatting as <syntaxhighlight>.

Is there a way?

Dinoguy1000 (talkcontribs)

If you haven't already, try {{ #tag: syntaxhighlight | {{MySourceFile}} |lang="lang" }}.

Jordan Brown (talkcontribs)

Thanks, but alas, that turns a semicolon into &#59;.

Precisely what I tried - itself inside a template - is

{{ #tag: syntaxhighlight | {{msgnw:User:Jordan Brown/sandbox/{{{name}}}}}|lang="C" }}

where the page pointed at contains cube(10);.

But I didn't know anything about #tag, so I'll look into it further.

Dinoguy1000 (talkcontribs)

You might have luck without the msgnw (I haven't tested this myself):

{{ #tag: syntaxhighlight | {{:User:Jordan Brown/sandbox/{{{name}}}}}|lang="C" }}

Jordan Brown (talkcontribs)

No, didn't work. With

{{ #tag: syntaxhighlight | {{{{BOOKNAME}}/examples/{{{name}}}}}|lang="C"|line="line" }}

and the input being (a deliberate torture test)

cube(10);|| foo && bar {{baz}} <pre></pre> &lt;
x < 5 && y > 6

the result was

cube(10);|| foo && bar [[:Template:Baz]] '"`UNIQ--pre-00000000-QINU`"' &lt;
x < 5 && y > 6

Thanks for your help. Other editors on the book have looked at it and decided that my idea of keeping the examples in separate pages would be too awkward, so I'm abandoning the effort for now.

Verdy p (talkcontribs)

If you see `UNIQ--pre-*-QINU` stripped tags, this is because the transcluded content was parsed by MediaWiki. The only safe way to transclude content that is not written in MediaWiki syntax, and that must not be parsed by it, is to use the "msgnw:" prefix before the full name of the transcluded page. You can then put it inside a "#tag:", which allows either the raw transclusion (via "msgnw:") or the parsed transclusion (without "msgnw:") to work.

However, a simpler approach should be possible with <syntaxhighlight page="fullpagename" />, without even having to set its lang="*" attribute, because every page has metadata specifying its content type (the "page content model"). The SyntaxHighlight extension should know every content model supported by MediaWiki, notably MediaWiki, JSON, CSS, JavaScript, XML, Lua, config/INI files, CSV files, and plain text, and these should all be mappable to a language code/name supported by a Pygments lexer. (For CSV, this means the variant defined by the most common RFC: ASCII commas as field delimiters, ASCII double quotes for escapes, numbers in English format using dots and not commas, or otherwise quoted, optional quotation marks surrounding fields, whitespace between fields not significant, and newlines allowed within quoted fields, inside which whitespace is significant.)

RegEx

Novem Linguae (talkcontribs)

Is lang="regex" supported? Is there an alias for it that I'm missing? If not supported, what's the proper repo to request its addition?

Dinoguy1000 (talkcontribs)

Regex might be supported as part of Perl (that's the first thing I'd try after lang="regex", at least). If it's not supported at all (or you can't find any other language that supports it), the place to request support would be the Pygments project, after which you'd need to file a task on Wikimedia Phabricator to have the Pygments version used in this extension updated.

Stang (talkcontribs)
Dinoguy1000 (talkcontribs)

Yes, that's what I said.

PerfektesChaos (talkcontribs)

Well, the problem is that there is no single RegExp syntax but a large variety of dialects. They all need different lexers, since some elements are permitted in one dialect but unknown, and therefore an error, in another.

Verdy p (talkcontribs)

Since the transition to Pygments, there's no longer any support for any regular expression syntax, notably lang="pcre" (which was commonly used in various talk pages, notably on Wikidata, which uses PCRE extensively) or the legacy lang="regexp". All these talk pages are now listed in a tracking category for all kinds of errors produced by "syntaxhighlight" (there is no distinct tracking category for languages indicated in the "lang" attribute that are not, or are no longer, recognized). PCRE regular expressions are not complex to parse, but a lexer should recognize all features of PCRE 5+, including multiline expressions with whitespace compression, comments (important in talk pages), and distinctive coloring for the insides of character classes, open/close brackets, operators, escapes, defined label names, and a few keywords in special constructs. The "regexp" language is just a subset (without multiline/comments and with a much more limited syntax). We should also have a recognizer for Lua patterns (lang="luapattern"?), which are much more restricted (notably no support for repetitions in a limited range, for '|' alternation, for defined subpatterns, or for extended character classes based on POSIX or Unicode character properties).
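
To make the dialect point in this thread concrete, here is a minimal sketch in Python. Python's built-in `re` module is used only as one example dialect and has nothing to do with how the extension highlights code; the constant and helper names are made up for illustration. In Lua, `%d` is the digit class, but Python's `re` reads `%` as a literal character, so the same pattern text means something different in each dialect.

```python
import re

# Lua spelling of "one or more digits" (in Lua patterns, %d is the digit class).
LUA_STYLE = "%d+"
# PCRE/Python spelling of the same idea.
PCRE_STYLE = r"\d+"


def matches(pattern: str, text: str) -> bool:
    """Return True if Python's re engine matches the whole text."""
    return re.fullmatch(pattern, text) is not None


# Python treats "%" as a literal, so the Lua-style pattern does not match digits...
lua_on_digits = matches(LUA_STYLE, "123")
# ...but it does match a literal percent sign followed by one or more "d".
lua_on_literal = matches(LUA_STYLE, "%dd")
# The PCRE-style pattern matches digits as expected.
pcre_on_digits = matches(PCRE_STYLE, "123")

print(lua_on_digits, lua_on_literal, pcre_on_digits)
```

A highlighter faces the same ambiguity: it cannot colour `%d` correctly without knowing which dialect it is looking at, which is why one generic "regex" lexer would not be enough.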

Verdy p (talkcontribs)

Also, we could previously use lang="HTML5", which worked for colorizing most of the MediaWiki wiki syntax (but not lang="html"). Now we have to use lang="html" (but not lang="html5", which is not recognized). There's an evident need to create some common aliases.

Reply to "RegEx"
Olarp (talkcontribs)

Running MediaWiki 1.38.4 on

  • Windows Server 2019
  • IIS 10
  • PHP 8.1.13
  • Python 3.11
  • Pygmentize 2.13.0

For unknown reasons some lexers work while others don't and instead generate the "Pages with syntax highlighting errors" category. The page below, as an example, renders the powershell and bash code correctly, but not the yaml.

If I swap the yaml for bash without changing the block of code below, it does not generate a "Pages with syntax highlighting errors" category, and of course it doesn't highlight it correctly either, as it is trying to highlight a yaml snippet as bash.

<syntaxhighlight lang="powershell">
# This is a PowerShell script
Write-Host "Hello world"
</syntaxhighlight>

<syntaxhighlight lang="bash">
# This is a shell script in bash
ls -l /var/lib
</syntaxhighlight>

<syntaxhighlight lang="yaml">
# This is a yaml block
- martin:
    job: Developer
</syntaxhighlight>

I did not try all lexers supported by pygmentize, but it seems some work and some do not. powershell and bash work every time; yaml and text fail every time. It doesn't matter what order the blocks are in on the page; they can even be alone and it still fails. I checked the file SyntaxHighlight.lexers.php, and all the lexers that pygmentize supports are there, marked as true.

This is what the relevant section of the debug output looks like for the above page. Note that it only runs pygmentize four times: the first time for the version check, the second for the lexer check, and the third and fourth for the syntax highlight blocks that work.

[DBQuery] MediaWiki\Storage\SqlBlobStore::fetchBlobs [0s] 127.0.0.1: SELECT old_id,old_text,old_flags FROM `text` WHERE old_id = 1822
[objectcache] fetchOrRegenerate(global:SqlBlobStore-blob:hkr_mediawiki:tt%3A1822): miss, new value computed
[exec] Creating base path C:\WINDOWS\TEMP\mwtmp-IUSR/shellbox-0b8cf480f6bf3322
[exec] Executing: cmd /s /c ""c:\Progra~1\Python311\Scripts\pygmentize.exe" "-V"" 2>&1
[exec] Removed file "C:\WINDOWS\TEMP\mwtmp-IUSR/shellbox-0b8cf480f6bf3322/sb-stderr"
[exec] Removed file "C:\WINDOWS\TEMP\mwtmp-IUSR/shellbox-0b8cf480f6bf3322/sb-stdout"
[exec] Removed directory "C:\WINDOWS\TEMP\mwtmp-IUSR/shellbox-0b8cf480f6bf3322"
[exec] Creating base path C:\WINDOWS\TEMP\mwtmp-IUSR/shellbox-86fd2e5bba940d09
[exec] Executing: cmd /s /c ""c:\Progra~1\Python311\Scripts\pygmentize.exe" "-L" "lexer"" 2>&1
[exec] Removed file "C:\WINDOWS\TEMP\mwtmp-IUSR/shellbox-86fd2e5bba940d09/sb-stderr"
[exec] Removed file "C:\WINDOWS\TEMP\mwtmp-IUSR/shellbox-86fd2e5bba940d09/sb-stdout"
[exec] Removed directory "C:\WINDOWS\TEMP\mwtmp-IUSR/shellbox-86fd2e5bba940d09"
[exec] Creating base path C:\WINDOWS\TEMP\mwtmp-IUSR/shellbox-237f871e31353545
[exec] Executing: cmd /s /c ""c:\Progra~1\Python311\Scripts\pygmentize.exe" "-l" "powershell" "-f" "html" "-O" "cssclass=mw-highlight,encoding=utf-8" "file""
[exec] Removed file "C:\WINDOWS\TEMP\mwtmp-IUSR/shellbox-237f871e31353545/file"
[exec] Removed file "C:\WINDOWS\TEMP\mwtmp-IUSR/shellbox-237f871e31353545/sb-stderr"
[exec] Removed file "C:\WINDOWS\TEMP\mwtmp-IUSR/shellbox-237f871e31353545/sb-stdout"
[exec] Removed directory "C:\WINDOWS\TEMP\mwtmp-IUSR/shellbox-237f871e31353545"
[objectcache] fetchOrRegenerate(global:highlight:1e04b7e8e2e671872099e468f594fa58): miss, new value computed
[exec] Creating base path C:\WINDOWS\TEMP\mwtmp-IUSR/shellbox-5a0641ea93a9910c
[exec] Executing: cmd /s /c ""c:\Progra~1\Python311\Scripts\pygmentize.exe" "-l" "bash" "-f" "html" "-O" "cssclass=mw-highlight,encoding=utf-8" "file""
[exec] Removed file "C:\WINDOWS\TEMP\mwtmp-IUSR/shellbox-5a0641ea93a9910c/file"
[exec] Removed file "C:\WINDOWS\TEMP\mwtmp-IUSR/shellbox-5a0641ea93a9910c/sb-stderr"
[exec] Removed file "C:\WINDOWS\TEMP\mwtmp-IUSR/shellbox-5a0641ea93a9910c/sb-stdout"
[exec] Removed directory "C:\WINDOWS\TEMP\mwtmp-IUSR/shellbox-5a0641ea93a9910c"
[objectcache] fetchOrRegenerate(global:highlight:79a0bec1f4c29b0a4e1f837a4b70f4a9): miss, new value computed
[DBConnection] Wikimedia\Rdbms\LoadBalancer::getLocalConnection: reused a connection for local/0
[DBQuery] LCStoreDB::get [0s] 127.0.0.1: SELECT lc_value FROM `l10n_cache` WHERE lc_lang = 'sv' AND lc_key = 'messages:syntaxhighlight-error-category' LIMIT 1
[DBConnection] Wikimedia\Rdbms\LoadBalancer::getLocalConnection: reused a connection for local/0
[DBQuery] LCStoreDB::get [0s] 127.0.0.1: SELECT lc_value FROM `l10n_cache` WHERE lc_lang = 'sv' AND lc_key = 'linkPrefixExtension' LIMIT 1
[DBConnection] Wikimedia\Rdbms\LoadBalancer::getLocalConnection: reused a connection for localAutoCommit/0

When swapping the yaml for bash, it processes correctly, doesn't throw errors, and doesn't get the "Pages with syntax highlighting errors" category. It now runs pygmentize five times, the last run being the additional, working bash block.

[DBQuery] MediaWiki\Storage\SqlBlobStore::fetchBlobs [0s] 127.0.0.1: SELECT old_id,old_text,old_flags FROM `text` WHERE old_id = 1823
[objectcache] fetchOrRegenerate(global:SqlBlobStore-blob:hkr_mediawiki:tt%3A1823): miss, new value computed
[exec] Creating base path C:\WINDOWS\TEMP\mwtmp-IUSR/shellbox-c3bb7d17062047d8
[exec] Executing: cmd /s /c ""c:\Progra~1\Python311\Scripts\pygmentize.exe" "-V"" 2>&1
[exec] Removed file "C:\WINDOWS\TEMP\mwtmp-IUSR/shellbox-c3bb7d17062047d8/sb-stderr"
[exec] Removed file "C:\WINDOWS\TEMP\mwtmp-IUSR/shellbox-c3bb7d17062047d8/sb-stdout"
[exec] Removed directory "C:\WINDOWS\TEMP\mwtmp-IUSR/shellbox-c3bb7d17062047d8"
[exec] Creating base path C:\WINDOWS\TEMP\mwtmp-IUSR/shellbox-0484336c3f645690
[exec] Executing: cmd /s /c ""c:\Progra~1\Python311\Scripts\pygmentize.exe" "-L" "lexer"" 2>&1
[exec] Removed file "C:\WINDOWS\TEMP\mwtmp-IUSR/shellbox-0484336c3f645690/sb-stderr"
[exec] Removed file "C:\WINDOWS\TEMP\mwtmp-IUSR/shellbox-0484336c3f645690/sb-stdout"
[exec] Removed directory "C:\WINDOWS\TEMP\mwtmp-IUSR/shellbox-0484336c3f645690"
[exec] Creating base path C:\WINDOWS\TEMP\mwtmp-IUSR/shellbox-0b05e1edd661e368
[exec] Executing: cmd /s /c ""c:\Progra~1\Python311\Scripts\pygmentize.exe" "-l" "powershell" "-f" "html" "-O" "cssclass=mw-highlight,encoding=utf-8" "file""
[exec] Removed file "C:\WINDOWS\TEMP\mwtmp-IUSR/shellbox-0b05e1edd661e368/file"
[exec] Removed file "C:\WINDOWS\TEMP\mwtmp-IUSR/shellbox-0b05e1edd661e368/sb-stderr"
[exec] Removed file "C:\WINDOWS\TEMP\mwtmp-IUSR/shellbox-0b05e1edd661e368/sb-stdout"
[exec] Removed directory "C:\WINDOWS\TEMP\mwtmp-IUSR/shellbox-0b05e1edd661e368"
[objectcache] fetchOrRegenerate(global:highlight:1e04b7e8e2e671872099e468f594fa58): miss, new value computed
[exec] Creating base path C:\WINDOWS\TEMP\mwtmp-IUSR/shellbox-d08c021bfe1b0311
[exec] Executing: cmd /s /c ""c:\Progra~1\Python311\Scripts\pygmentize.exe" "-l" "bash" "-f" "html" "-O" "cssclass=mw-highlight,encoding=utf-8" "file""
[exec] Removed file "C:\WINDOWS\TEMP\mwtmp-IUSR/shellbox-d08c021bfe1b0311/file"
[exec] Removed file "C:\WINDOWS\TEMP\mwtmp-IUSR/shellbox-d08c021bfe1b0311/sb-stderr"
[exec] Removed file "C:\WINDOWS\TEMP\mwtmp-IUSR/shellbox-d08c021bfe1b0311/sb-stdout"
[exec] Removed directory "C:\WINDOWS\TEMP\mwtmp-IUSR/shellbox-d08c021bfe1b0311"
[objectcache] fetchOrRegenerate(global:highlight:79a0bec1f4c29b0a4e1f837a4b70f4a9): miss, new value computed
[exec] Creating base path C:\WINDOWS\TEMP\mwtmp-IUSR/shellbox-90a3aca3d2df8f1f
[exec] Executing: cmd /s /c ""c:\Progra~1\Python311\Scripts\pygmentize.exe" "-l" "bash" "-f" "html" "-O" "cssclass=mw-highlight,encoding=utf-8,linenos=inline,hl_lines=4" "file""
[exec] Removed file "C:\WINDOWS\TEMP\mwtmp-IUSR/shellbox-90a3aca3d2df8f1f/file"
[exec] Removed file "C:\WINDOWS\TEMP\mwtmp-IUSR/shellbox-90a3aca3d2df8f1f/sb-stderr"
[exec] Removed file "C:\WINDOWS\TEMP\mwtmp-IUSR/shellbox-90a3aca3d2df8f1f/sb-stdout"
[exec] Removed directory "C:\WINDOWS\TEMP\mwtmp-IUSR/shellbox-90a3aca3d2df8f1f"
[objectcache] fetchOrRegenerate(global:highlight:3ad02a4ec124ea3009c02acf078da02e): miss, new value computed
[DBConnection] Wikimedia\Rdbms\LoadBalancer::getLocalConnection: reused a connection for local/0
[DBQuery] LCStoreDB::get [0s] 127.0.0.1: SELECT lc_value FROM `l10n_cache` WHERE lc_lang = 'sv' AND lc_key = 'linkPrefixExtension' LIMIT 1
[DBConnection] Wikimedia\Rdbms\LoadBalancer::getLocalConnection: reused a connection for localAutoCommit/0

I'm so out of ideas.

What do you think?
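
One way to make the comparison of the two debug logs above less error-prone is to extract the lexer names from the `[exec] Executing:` lines automatically. The following is a minimal Python sketch under stated assumptions: the regular expression assumes the Windows command quoting seen in the logs above, and the function names and the shortened sample log are made up for illustration. A block with no matching call was not necessarily rejected; it may also have been served from the object cache.

```python
import re
from collections import Counter

# Matches highlight calls such as:
#   [exec] Executing: cmd /s /c ""...\pygmentize.exe" "-l" "bash" "-f" "html" ...
# The -V (version) and -L (lexer list) calls are deliberately not matched.
HIGHLIGHT_CALL = re.compile(
    r'\[exec\] Executing: .*pygmentize(?:\.exe)?" "-l" "([^"]+)"'
)


def lexers_invoked(debug_log: str) -> list[str]:
    """List the lexer name of every pygmentize highlight call, in log order."""
    found = []
    for line in debug_log.splitlines():
        match = HIGHLIGHT_CALL.search(line)
        if match:
            found.append(match.group(1))
    return found


def blocks_without_call(expected: list[str], debug_log: str) -> list[str]:
    """Return the expected lexers for which no highlight call was logged."""
    remaining = Counter(expected) - Counter(lexers_invoked(debug_log))
    return sorted(remaining.elements())


# Shortened sample taken from the first log above.
SAMPLE_LOG = r'''
[exec] Executing: cmd /s /c ""c:\Progra~1\Python311\Scripts\pygmentize.exe" "-V"" 2>&1
[exec] Executing: cmd /s /c ""c:\Progra~1\Python311\Scripts\pygmentize.exe" "-L" "lexer"" 2>&1
[exec] Executing: cmd /s /c ""c:\Progra~1\Python311\Scripts\pygmentize.exe" "-l" "powershell" "-f" "html" "-O" "cssclass=mw-highlight,encoding=utf-8" "file""
[exec] Executing: cmd /s /c ""c:\Progra~1\Python311\Scripts\pygmentize.exe" "-l" "bash" "-f" "html" "-O" "cssclass=mw-highlight,encoding=utf-8" "file""
'''

print(blocks_without_call(["powershell", "bash", "yaml"], SAMPLE_LOG))
```

For the sample, the only block without a logged call is the yaml one, which matches the observation that pygmentize was never executed for it.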

Olarp (talkcontribs)

Upgrading to MediaWiki 1.39.4 appears to have fixed this issue. All lexers work in this version.

The problem still persisted in MediaWiki 1.39.2 but an upgrade to 1.39.4 fixed it.

Issue in parsing -L lexer output on Windows

Check to the King (talkcontribs)

I was experiencing an issue on Windows where the lexer list was being incorrectly parsed, identifying language names with a colon at the end, i.e. as 'sql:', 'c++:', etc. My setup was as follows:

Product     Version
MediaWiki   1.39.3
PHP         8.0.7 (apache2handler)
MySQL       8.0.25
ICU         68.2
Pygments    2.15.1

Using the latest Python 3.11 for Windows.

I managed to fix this on my machine by making one modification to includes\Pygmentize.php: adding '\r' to the trim character list in the fetchLexers function, on line 214.

Can we please incorporate this into the production code?

Keyacom (talkcontribs)

There is no need because this has been implemented for SyntaxHighlight for a future MediaWiki version.

Alternatively, one can use the python -m pygments -L lexer --json command, which outputs the lexer list formatted as JSON. This feature was added in Pygments 2.11.0.
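
The carriage-return problem described in this thread can be reproduced outside MediaWiki. The sketch below is written in Python and is not the code from includes\Pygmentize.php; the function name, the trim character sets, and the abbreviated sample listing are assumptions made for illustration. It shows why a trailing '\r' from Windows line endings keeps the colon from being trimmed.

```python
def parse_lexer_names(listing: str, trim_chars: str) -> list[str]:
    """Collect lexer aliases from lines of the form '* alias1, alias2:'."""
    names = []
    for line in listing.split("\n"):
        if line.startswith("*"):
            # strip() stops at the first character that is not in trim_chars,
            # so a trailing "\r" protects the ":" before it unless "\r" is listed.
            names.extend(line.strip(trim_chars).split(", "))
    return names


# Abbreviated listing in the style of `pygmentize -L lexer`, with CRLF endings.
WINDOWS_LISTING = (
    "Lexers:\r\n"
    "~~~~~~~\r\n"
    "* cpp, c++:\r\n"
    "    C++ (filenames *.cpp)\r\n"
    "* sql:\r\n"
    "    SQL (filenames *.sql)\r\n"
)

without_cr = parse_lexer_names(WINDOWS_LISTING, "* :\n")
with_cr = parse_lexer_names(WINDOWS_LISTING, "* :\n\r")

print([repr(name) for name in without_cr])
print([repr(name) for name in with_cr])
```

With '\r' missing from the trim set, the last alias on each line keeps its colon (and the carriage return), so a lookup for "sql" or "c++" fails. Adding '\r' yields clean names.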


code tag (or bold tag?) adds a whitespace character

Summary by BDavis (WMF)

The <code> tag is an HTML5 construct and not related to the SyntaxHighlight extension.

47.189.155.100 (talkcontribs)

For example:

h t t p s : / / www.youtube.com/watch?v=pRXoS_P0lk

This mixes the code tag with the bold - note that there is a whitespace added after the = character. Why? And how can I fix it?

the extra spaces in the URL are to get around the linkspam filter

47.189.155.100 (talkcontribs)

Dangit. It didn't do it here. I guess I need to fix something on my own wiki. NVM

BDavis (WMF) (talkcontribs)

The <code> tag is an HTML5 construct and not related to the SyntaxHighlight extension.
