User:Slaporte/Article quality visualization
What we Have
- Reference count
- Intro paragraph
- Paragraph count
- Image count
- Category count
- reference section count
- external link section count
- external link count
- article assessment
- google web search results
- google news search results
- page visits per day
- likelihood of vandalism (from Wikitrust)
- incoming links
- outgoing links
- number of editors
- recency from last edit
Areas
- structure
- trustworthy
- complete
- objective
Formula
- reference count / paragraph count ENOUGH REFERENCES
- paragraph count / google news search result SIGNIFICANCE
- paragraph count / google web search result SIGNIFICANCE
- image count / paragraph count ENOUGH IMAGES
- category count ENOUGH CATEGORIES
- unique editor count EDIT HISTORY
- time since last revision EDIT HISTORY
- assessment ASSESSMENT
- feedback FEEDBACK
- incoming links INTERCONNECTION
- outgoing links INTERCONNECTION
- incoming links / outgoing links INTERCONNECTION
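The ratios in the list above could be computed from raw counts roughly as follows. This is only a sketch: the `safeRatio` helper and the field names on `counts` are hypothetical, and the notes do not yet say how the resulting numbers should be weighted or thresholded.

```javascript
// Hypothetical helper: divide, returning null when the denominator is zero
// so that a missing signal is not mistaken for a score of 0.
function safeRatio(numerator, denominator) {
  return denominator > 0 ? numerator / denominator : null;
}

// Field names on `counts` are assumptions made for this sketch.
function computeFormulaInputs(counts) {
  return {
    enoughReferences: safeRatio(counts.references, counts.paragraphs),
    significanceNews: safeRatio(counts.paragraphs, counts.googleNewsResults),
    significanceWeb: safeRatio(counts.paragraphs, counts.googleWebResults),
    enoughImages: safeRatio(counts.images, counts.paragraphs),
    enoughCategories: counts.categories,
    interconnection: safeRatio(counts.incomingLinks, counts.outgoingLinks)
  };
}
```

Returning `null` rather than `0` or `Infinity` for an empty denominator keeps "no data" distinguishable from "bad score" when the values are later combined.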
to do
- quality algorithm
- visualization on page
- batch processing, page history
API dependency?
brainstorming quality metrics
- is the article at least a couple of paragraphs?
- is the article as long as it is important? (e.g. is it proportional to the number of results on google for the article's subject?)
- does the article have at least 1 picture for every n paragraphs?
- is the article in a category?
- does the article have at least 1 source for every n sentences?
- are there a large number of unique editors?
- are a good proportion of the editors users with long histories of editing articles?
- has the article been featured?
- are any of the paragraphs or sentences too long?
- are there any grammar or spelling errors?
- has the article been edited recently?
- how many flags does the article have? (e.g. neutrality, citation needed, weasel words, etc.)
- what are the user-created page ratings of the article?
Minimum requirements -- Y/N
//would also be good to highlight which calls to action you want to encourage
/* would apply to all non-stub article pages? */
One infobox
$('.infobox').length
One intro paragraph
$('.mw-content-ltr p').length
Three incoming links
API: http://en.wikipedia.org/w/api.php?action=query&format=json&list=backlinks&bltitle=Charizard&bllimit=100&blnamespace=0
n images
$('img').length
n categories
$('a[href*="/wiki/Category:"]').length
More than one editor
API: http://en.wikipedia.org/w/api.php?action=query&format=json&prop=revisions&titles=Charizard&rvprop=user&rvlimit=500
not stub: if( !$('#siteSub').length ) return;
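The Y/N checks above could be gathered into one function once the counts are known. A minimal sketch, assuming the counts have already been collected (from the selectors and API calls listed above) into a plain object; the field names, the function name, and the `minImages` / `minCategories` defaults are hypothetical stand-ins for the unspecified "n".

```javascript
// Thresholds mirror the list above. "One infobox" and "One intro paragraph"
// are read here as "at least one", which is an interpretation, not a given.
function checkMinimumRequirements(counts, options) {
  // minImages / minCategories are placeholders for the unspecified "n".
  const opts = Object.assign({ minImages: 1, minCategories: 1 }, options);
  return {
    hasInfobox: counts.infoboxes >= 1,
    hasIntroParagraph: counts.paragraphs >= 1,
    hasThreeIncomingLinks: counts.incomingLinks >= 3,
    hasEnoughImages: counts.images >= opts.minImages,
    hasEnoughCategories: counts.categories >= opts.minCategories,
    hasMoreThanOneEditor: counts.uniqueEditors > 1
  };
}

// In the browser the DOM-based counts could be filled in with, for example:
//   infoboxes: $('.infobox').length
//   paragraphs: $('.mw-content-ltr p').length
```

Each failed check maps naturally onto a call to action (add an infobox, add an image, and so on), which may help with the highlighting idea mentioned above.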
Content
- by density (blah per n paragraphs/words etc)
References $('.reference').length
- by diversity (does it cover all the bases)
- proper structure, e.g. does it follow http://en.wikipedia.org/wiki/Wikipedia:Style
Does it have References and External links sections? $('#References').length $('#External_links').length
Edits
- by frequency/rate of edits (# edits/day, days since last edit)
- by "demographics" of editors (total number of editors; percentage of editors that are registered; uniqueness; editor's; quality of editors)
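The edit-rate and editor-demographics ideas above could be derived from a list of revisions. A rough sketch, assuming each revision object carries `user` and `timestamp` fields and that anonymous edits are marked with an `anon` key (this would need `rvprop=user|timestamp` rather than the `rvprop=user` used in the revisions URL earlier); the function name and the returned field names are hypothetical.

```javascript
const MS_PER_DAY = 24 * 60 * 60 * 1000;

// `revisions` is assumed to look like:
//   [{ user: "Alice", timestamp: "2012-01-20T10:00:00Z" },
//    { user: "192.0.2.1", anon: "", timestamp: "2012-01-21T10:00:00Z" }]
// `now` is a Date used to measure recency.
function summarizeEdits(revisions, now) {
  const editors = new Map(); // user name -> true if registered
  let newest = null;
  let oldest = null;
  for (const rev of revisions) {
    editors.set(rev.user, !('anon' in rev));
    const t = Date.parse(rev.timestamp);
    if (newest === null || t > newest) newest = t;
    if (oldest === null || t < oldest) oldest = t;
  }
  const registered = [...editors.values()].filter(Boolean).length;
  const spanDays = newest === null ? 0 : (newest - oldest) / MS_PER_DAY;
  return {
    totalEdits: revisions.length,
    uniqueEditors: editors.size,
    percentRegistered: editors.size ? (100 * registered) / editors.size : null,
    editsPerDay: spanDays > 0 ? revisions.length / spanDays : null,
    daysSinceLastEdit: newest === null ? null : (now.getTime() - newest) / MS_PER_DAY
  };
}
```

"Quality of editors" is not covered here, since it would need per-user history that this response does not contain.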
Significance/External
- sources/links out
- comparison to google search (position in results, number of results:length of article)
- Google News API (no key required): http://ajax.googleapis.com/ajax/services/search/news?v=1.0&q=SOPA
- number of instances of the wiki page in other languages
- is http://en.wikichecker.com/ useful? too slow?
- http://en.wikipedia.org/w/api.php?action=query&list=articlefeedback&afpageid=9228&afuserrating=1 <-- feedback
- how many hits
http://stats.grok.se/json/en/200804/Main_page
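For the "page visits per day" metric, the monthly stats response could be reduced to a daily average. A small sketch, assuming the JSON contains a `daily_views` object mapping dates to view counts; that shape is an assumption to check against a real response, and `averageDailyViews` is a hypothetical name.

```javascript
// Assumed response shape (not verified here):
//   { "daily_views": { "2008-04-01": 123, "2008-04-02": 456, ... } }
function averageDailyViews(stats) {
  const views = Object.values(stats.daily_views || {});
  if (views.length === 0) return null; // no data for this month
  const total = views.reduce((sum, n) => sum + n, 0);
  return total / views.length;
}
```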
Quality Assessment - Sometimes available on the article's talk page
/*
 * @author Outriggr - created the script and used to maintain it
 * @author Pyrospirit - currently maintains and updates the script
 */
getRating: function getRating (text) {
    this.callHooks('getRating_before');
    var rating = 'none';
    if (text.match(/\|\s*(class|currentstatus)\s*=\s*fa\b/i)) rating = 'fa';
    else if (text.match(/\|\s*(class|currentstatus)\s*=\s*fl\b/i)) rating = 'fl';
    else if (text.match(/\|\s*class\s*=\s*a\b/i)) {
        if (text.match(/\|\s*class\s*=\s*ga\b|\|\s*currentstatus\s*=\s*(ffa\/)?ga\b/i)) rating = 'a/ga'; // A-class articles that are also GA's
        else rating = 'a';
    }
    else if (text.match(/\|\s*class\s*=\s*ga\b|\|\s*currentstatus\s*=\s*(ffa\/)?ga\b|\{\{\s*ga\s*\|/i) && !text.match(/\|\s*currentstatus\s*=\s*dga\b/i)) rating = 'ga';
    else if (text.match(/\|\s*class\s*=\s*b\b/i)) rating = 'b';
    else if (text.match(/\|\s*class\s*=\s*bplus\b/i)) rating = 'bplus'; // used by WP Math
    else if (text.match(/\|\s*class\s*=\s*c\b/i)) rating = 'c';
    else if (text.match(/\|\s*class\s*=\s*start/i)) rating = 'start';
    else if (text.match(/\|\s*class\s*=\s*stub/i)) rating = 'stub';
    else if (text.match(/\|\s*class\s*=\s*list/i)) rating = 'list';
    else if (text.match(/\|\s*class\s*=\s*sl/i)) rating = 'sl'; // used by WP Plants
    else if (text.match(/\|\s*class\s*=\s*(dab|disambig)/i)) rating = 'dab';
    else if (text.match(/\|\s*class\s*=\s*cur(rent)?/i)) rating = 'cur';
    else if (text.match(/\|\s*class\s*=\s*future/i)) rating = 'future';
    this.callHooks('getRating_after');
    return rating;
}
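Since the rating is parsed from talk-page wikitext, something has to fetch that text first. One possible sketch, assuming the standard `prop=revisions` query with `rvprop=content` and the default JSON format, in which revision text sits under the `'*'` key; both function names are hypothetical, and the actual network call is left to the caller.

```javascript
// Builds a query URL for the current wikitext of an article's talk page.
function talkPageWikitextUrl(articleTitle) {
  const params = new URLSearchParams({
    action: 'query',
    format: 'json',
    prop: 'revisions',
    rvprop: 'content',
    titles: 'Talk:' + articleTitle
  });
  return 'http://en.wikipedia.org/w/api.php?' + params.toString();
}

// Pulls the wikitext out of the response, or null if the talk page is missing.
// Assumes the legacy JSON layout: query.pages[<id>].revisions[0]['*'].
function extractWikitext(response) {
  const pages = (response.query && response.query.pages) || {};
  for (const id of Object.keys(pages)) {
    const revs = pages[id].revisions;
    if (revs && revs.length) return revs[0]['*'];
  }
  return null;
}
```

The string returned by `extractWikitext` is what would be passed as `text` to a rating parser like the one above.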
Where the code goes
The eventual goal is to create a MediaWiki 'gadget' which users can enable at https://www.mediawiki.org/wiki/Special:Preferences#mw-prefsection-gadgets
$('.reference').length
JS Fiddle: http://jsfiddle.net/MqfAZ/
JS Fiddle for UI fiddlin': http://jsfiddle.net/eSEFq/
Template for citation:
$(".ambox-Refimprove:contains('citation')").length
$('.ambox-Notability').length
$('.ambox:contains("importance")').length
$('.ambox:contains("advertisement")').length
$('.ambox:contains("cleanup")').length
$('.ambox:contains("confusing")').length
$('.ombox:contains("deletion")').length
$('.ambox:contains("quality standards")').length
$('.haudio').length
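The maintenance-template selectors above could be tallied in one pass to answer "how many flags does the article have?". A minimal sketch: the labels in `FLAG_SELECTORS` and the `countFlags` name are hypothetical, and `$` is passed in as a parameter so the same function works with jQuery on a page or with a stub elsewhere.

```javascript
// Selectors copied from the notes above; the labels are made up for this sketch.
const FLAG_SELECTORS = {
  refimprove: ".ambox-Refimprove:contains('citation')",
  notability: '.ambox-Notability',
  importance: '.ambox:contains("importance")',
  advertisement: '.ambox:contains("advertisement")',
  cleanup: '.ambox:contains("cleanup")',
  confusing: '.ambox:contains("confusing")',
  deletion: '.ombox:contains("deletion")',
  qualityStandards: '.ambox:contains("quality standards")'
};

// `$` is any function that takes a selector and returns an object with `.length`.
function countFlags($) {
  const counts = {};
  let total = 0;
  for (const [label, selector] of Object.entries(FLAG_SELECTORS)) {
    counts[label] = $(selector).length;
    total += counts[label];
  }
  return { counts, total };
}

// In a gadget this would be called as: countFlags(jQuery)
```

Note that `total` can count one banner twice if its text matches more than one selector, so it is better read as a rough signal than an exact number of templates.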