User:Danwe/regex

This page contains some of my regular expressions. I just put them here to find them whenever the need arises.

JavaScript

Matches everything up to some word (in this case ABC) and everything after that word, and stores both parts in capture groups. Nothing fancy, I just keep forgetting the syntax of lookaround assertions, and this is probably the one I use most often.

'this is some ABC simple test'.match( /((?!ABC).*?)ABC(.*)/ )
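For reference, a minimal sketch of what that call returns. The variable names are mine. The last pattern is an assumed stricter variant, not something from the original notes: it anchors at the start of the string and re-checks the lookahead before every character, whereas in the pattern above the lookahead is only tested once, at the position where the match starts.

```javascript
var input = 'this is some ABC simple test';

// Original pattern: index 1 is the text before ABC, index 2 the text after it.
var m = input.match( /((?!ABC).*?)ABC(.*)/ );
var before = m[1]; // 'this is some '
var after = m[2];  // ' simple test'

// If the input itself starts with ABC, the match simply begins one character later.
var loose = 'ABC first ABC second'.match( /((?!ABC).*?)ABC(.*)/ );
// loose[1] is 'BC first ', loose[2] is ' second'

// Assumed stricter variant: (?:(?!ABC).)* consumes one character at a time,
// but only while ABC does not start at that position.
var strict = 'ABC first ABC second'.match( /^((?:(?!ABC).)*)ABC(.*)$/ );
// strict[1] is '', strict[2] is ' first ABC second'

console.log( [ before, after, loose[1], strict[2] ] );
```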

So here are the different lookaround assertions anyhow (source: javascriptkit.com):

lookaheads:

  • (?=pattern) matches only if pattern follows at the current position; the lookahead itself consumes no input.
  • (?!pattern) matches only if pattern does not follow at the current position.
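A quick sketch of both lookaheads on a made-up input string (the sample text and variable names are only for illustration):

```javascript
var css = '12px 7em 30px';

// Positive lookahead: digits only when 'px' follows; 'px' itself is not consumed.
var px = css.match( /\d+(?=px)/g );       // ['12', '30']

// Negative lookahead: digits not followed by 'px'.
// The extra \d in the lookahead stops the engine from backtracking
// to a shorter digit run (e.g. '1' out of '12px').
var notPx = css.match( /\d+(?!\d|px)/g ); // ['7']

console.log( px, notPx );
```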

lookbehinds: (?<=pattern) and (?<!pattern). They have been part of JavaScript since ES2018; older engines do not support them, but the web might help with that.
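Engines that implement ES2018 (current browsers and Node.js) accept lookbehind syntax natively. A small sketch, assuming such a runtime; an older engine throws a SyntaxError as soon as it parses the regex literal. The sample text is made up.

```javascript
var text = 'cost: $40 or 35 EUR';

// Positive lookbehind: digits preceded by '$'.
var dollars = text.match( /(?<=\$)\d+/g );   // ['40']

// Negative lookbehind: digits preceded by neither '$' nor another digit.
var others = text.match( /(?<![$\d])\d+/g ); // ['35']

console.log( dollars, others );
```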


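This one is supposed to strip all attributes from all tags in an HTML string:

var crazyHtml = '<div sdfdf="<div>fsdf</div>sdf" sdfsdf="sdgsgf">baa <-- foo!</div>',
    // $1 = "<" (or "</") plus the tag name, $2 = the closing ">" or "/>".
    // Quoted values get one alternative per quote type, because a backreference
    // is not evaluated inside a character class ([^\2] does not mean "anything but the quote").
    regex = /(<\/?[^\s<>\/]+)(?:[^<>"']|"[^"]*"|'[^']*')*?(\/?>)/g;

console.log( crazyHtml.replace( regex, '$1$2' ) ); // <div>baa <-- foo!</div>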
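One detail worth checking when adapting a pattern like this: a backreference written inside a character class is not treated as a backreference. The sketch below shows the difference on a made-up sample string; the lookahead-based alternative is one assumed way around it, not the only one.

```javascript
var sample = '"a" b "c"';

// [^\1] is not "anything except the captured quote": without the u flag the
// class just excludes the control character U+0001, so the greedy * runs on
// to the last quote in the string.
var greedy = /(["'])[^\1]*\1/.exec( sample )[0];         // '"a" b "c"'

// Assumed alternative: test the backreference in a lookahead before each character.
var tempered = /(["'])(?:(?!\1).)*\1/.exec( sample )[0]; // '"a"'

console.log( greedy, tempered );
```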


PHP

From the Semantic Expressiveness extension. Matches a certain DOM structure using a recursive pattern. Used in the ExpressiveStringPieceSQResult class.

// [1] match <span> with 'shortQuery' class if we can make sure...
// [2] ... no further <span>-pairs inside...
//     OR
// [3] ... DOM inside only contains opening+closing <span>-pairs (ensured by recursive regex)
$regex = '/
	(?# COMMENT #1 )
	<span \s+(?:[^>]*\s+|) class\s*=\s*(?P<q>[\'"])(?:[^>\k<q>]*\s+|\s*) shortQuery (?:[^>\k<q>]*\s+|\s*)\k<q>[^>]*>

	(?# COMMENT #2 )
	( (?>(?!<span(?:\s+[^>]*|)>|<\/span>).)*

	(?# COMMENT #3 )
	| (?P<innerDOM> <span(?:\s+[^>]*|)>(?: (?>(?!<span(?:\s+[^>]*|)>|<\/span>).)* | (?&innerDOM) )*?<\/span> )*
	)* <\/span>/sx';
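JavaScript regular expressions have no counterpart to (?&innerDOM), so the balanced-pair check that the recursion performs cannot be ported directly. As a rough analogue only (this is not code from the extension; the function name and the return convention are made up for illustration), the same check can be sketched with a depth counter:

```javascript
// Returns the index just past the </span> that closes the <span ...> whose
// opening tag ends at `from`, or -1 if the span tags are not balanced.
function findClosingSpan( html, from ) {
    var tag = /<span(?:\s[^>]*)?>|<\/span>/g,
        depth = 1,
        m;
    tag.lastIndex = from;
    while ( ( m = tag.exec( html ) ) !== null ) {
        // An opening tag goes one level deeper, a closing tag one level up.
        depth += m[0].charAt( 1 ) === '/' ? -1 : 1;
        if ( depth === 0 ) {
            return tag.lastIndex;
        }
    }
    return -1;
}

var html = '<span class="shortQuery">a <span>b</span> c</span> tail';
var open = html.indexOf( '>' ) + 1;
var end = findClosingSpan( html, open );
var inner = html.slice( open, end - '</span>'.length ); // 'a <span>b</span> c'

console.log( inner );
```

Unlike the PHP pattern, this does not inspect the class attribute; it only shows the nesting bookkeeping that the recursion takes care of.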