User:Catrope/Extension review/ArticleFeedbackv5/archive

Base revision: r105164

ArticleFeedbackv5.php edit

  • Line 82, 103: please be aware that tracking bucketing generates a lot of noise in the tracking data, to the point where it brought the site down once. This can only be enabled on the cluster if we use UDP logging (which we want to do anyway, I just need to get around to enabling it), and I will probably want to have it turned off initially when deploying. Fixed in r105309
    • I couldn't tell from the wording whether this meant the tracking was dangerous (which seemed more likely), or the bucketing was dangerous, so I averted both. The bucket config is now set to ignore = 100 by default, and ext.articleFeedbackv5.js skips bucketing if either track or ignore is set to 100. -- rsterbin
      • I meant the tracking, i.e. generating an event for 'I just put user X in bucket Y' and logging all of those events. I'm sorry for not being clearer; all I asked for was to remove 'tracked' => true, throughout (which has not happened, so my concerns are not addressed with r105309)
  •   Done Line 119-129: the keys in the bucket configuration (-, A, B, C) don't match the keys in the documentation comment (0, 1, 2, 3) Fixed in r105262 (rsterbin)
  • Line 161: are you guys even using the survey stuff anymore? If not, it should be stripped out of v5

ArticleFeedbackv5.hooks.php edit

  •   Done Line 293-297: debugging code, please comment out or remove. This seems to be unfinished edit tagging functionality? Fixed in r105262 (rsterbin)

ArticleFeedbackv5.i18n.php edit

  •   Done Line 35, 50: inconsistency between $1 / 5 (with spaces) and $1/5 (without spaces) Fixed in r105260 (rsterbin)
  •   Done Line 39: message key is misspelled (invalud). The correct spelling is used in api/ApiViewFeedbackArticleFeedbackv5.php , which means that code will hit a missing message Fixed in r105260 (rsterbin)
  •   Done Line 42: don't capitalize 'Feedback'. Our i18n people don't like Title Case, and 'feedback' isn't capitalized in any of the other messages Fixed in r105260 (rsterbin)
  •   Done Our i18n people also prefer 'page' over 'article' but I think Fabrice raised a similar point already Fixed in r105260 (rsterbin) -- Special page only; form fixes will go in as approved
  •   Done Line 85: Don't Use Title Case In 'Very Good' Fixed in r105260 (rsterbin)

SpecialArticleFeedback5.php edit

Overall it looks like this special page is very much unfinished, and it is undeployable in its current state. This either needs serious work before deployment, or needs to be disabled by commenting it out from $wgSpecialPages.

  •   Done Line 17, 48-50, 53, 55, 56, 61: hardcoded English textFixed in r105627 (gregchiasson)
  •   Done Line 22: The Wikipedia:$title thing will not work on Wikipedias, because on those wikis "Wikipedia:" is a namespace prefix rather than an interwiki prefix. There are also general issues with building wikitext this way. You should build links the proper way:
    $linkHTML = Linker::link( $titleObj, wfMessage( 'articlefeedbackv5-foobar' )->escaped() )
    Here $titleObj is a Title object (see also my Title::newFromText() example below) and the second parameter is HTML (which is why you have to escape the i18n message) Fixed in r105617 (gregchiasson)
  • Line 43-44: " This is a terrible, terrible hack. I'm taking it out as soon as I stop being an idiot and sort this ResourceLoader thing out". Would this week be a good time to stop being an idiot? ;) If you have questions regarding ResourceLoader, I can answer them, I wrote half of it. I'll be on vacation Tue-Fri but should be reasonably responsive to e-mail.
  •   Done Line 75: MediaWiki has built-in functionality for obtaining a page ID from a title, and unlike your pageIdFromTitle() function it also works for namespaced titles like this page :)
    $title = Title::newFromText( 'Foobar' );
    if ( !$title ) { /* newFromText() returns null for invalid input. Handle error here */ }
    $id = $title->getArticleID(); Fixed in r105617 (gregchiasson)

sql/ArticleFeedbackv5.sql edit

It saddens me to see that my feedback on the schema was largely ignored.

  •   Done Line 22: I've told you before that you should really have a user_ip field (IP address or null) field here rather than a user_text field (IP address or user name). The latter is the old style used in most of MW core, the former is the new style used in newer extensions. If you insist on having a user_text field, at least make sure the data type is exactly the same as that of user.user_name (varchar(255) binary)
  •   Done Line 34, 35: I've told you before that you should not use MySQL's TIMESTAMP type. You also acknowledged that the modified field wasn't needed because the rows in this table aren't changed, but it's still there
  •   Done With the exception of the first table, comments are few and far between. Please add a comment above every table to explain what it contains (this is missing even for the first table -- from the field names I can deduce that it contains individual ratings) and above every field name where the contents of the field aren't immediately obvious from their name (things like auto increment IDs don't need comments, but things like foreign keys, enums and most text fields do) Fixed in r105448 (gregchiasson)

api/ApiArticleFeedbackv5Utils.php edit

  •   Done Line 34, 36, 38: please resolve this TODO Fixed in r105449 (gregchiasson)
  •   Done Line 54: $wgArticleFeedbackNamespaces is global'ed but $wgArticleFeedbackv5Namespaces is used in the code (and defined in the setup file). You should use error_level(E_ALL) when debugging, that should have caused an E_NOTICE about an undefined variable to appear Fixed in r105252 (rsterbin)
  • Line 98-111: you don't need to write your own getRevisionId() function, there is a built-in function for obtaining the latest revision ID of a page: obtain a Title object and call $title->getLatestRevID() Fixed in r105252 (rsterbin)
    • Still flawed, see CR comment on r105252. Code as written will cause a null pointer dereference when a nonexistent page ID is passed in
  •   Done Line 116, 138: per the TODO comments, use memcached here Fixed in r105726 (gregchiasson)
  •   Done Line 122-125, 144-147: always pass the __METHOD__ parameter to select() Fixed in r105449 (gregchiasson)
  •   Done Line 118: this function passes the return value of select() through unmodified, which means it does not return an array, but a ResultWrapper. These objects are iterable, so in practice you don't notice the difference (you can use foreach on them and such) until you try passing them into array_*() functions and everything explodes. If you document your return values correctly, this is less likely to happen Fixed in r105252 (rsterbin)
  •   Done Line 150-152: creating an array before you append to it is something that you need to do in JavaScript (which is annoying as hell), but it's not needed in PHP. In fact, you can even do stuff like $variableThatDidntExistBefore['foo']['bar'][2]['baz'] = 'quux'; and that'll just work. Fixed in r105252 (rsterbin)

api/ApiArticleFeedbackv5.php edit

  •   Done Line 38-40: why is this check commented out? Fixed in r105449 (gregchiasson)
  •   Done Line 76: TODO ERROR Fixed in r105449 (gregchiasson) -- Not fixed in r105449, but must have been fixed in some later rev cause it's fixed in trunk
  •   Done Line 90-104: I'm impressed that you guys put in the Squid purge thing. This is probably the only extension that does this, and it's not something I thought would have been easy to figure out on your own. However, you need to get these things exactly right, if there's the slightest discrepancy it won't work at all:
    • Line 94: this should be list=articlefeedbackv5-view-ratings to match the JS file
    • Line 96: the empty string here betrays that the anontoken parameter is unused. It should be removed here, in the JS file, and in the API module
    • Line 97: afuserrating is even more unused (not defined in the API module, not passed by JS) and should also be removedFixed in r105460 (gregchiasson)
  •   Done Line 142: you should never do escaping in an input validation function. Escaping should happen as close to the output as possible. I know this isn't done right now but I recommend removing the comment suggesting this lest anyone read it and implement it Removed in r105246 (rsterbin)
  •   Done Line 201: the containing function is called three times, so getRevisionLimit() is also called three times. You should cache its return value somewhere, either in getRevisionLimit() itself or somewhere in this class Fixed in r105460 (gregchiasson)
  •   Done Line 211-240: these two queries are poorly indexed. You will need the following:
    • an index on (af_page_id, af_revision_id). There is currently an index on af_revision_id alone, but that is insufficient for this query
    • an index on (afi_data_type)
    • an index on (aa_feedback_id, aa_field_id, aa_response_option_id) Fixed in r105460 (gregchiasson)
    • with all those, the query should be fairly reasonable (although I'll believe that when I EXPLAIN it), but I'd really prefer updating the rollup tables with increments/decrements (UPDATE foo SET bar=bar+1 WHERE baz) over recomputing the values every time. The GROUP BY query is a bit scary, and will probably be slow for heavily rated pages In r105601 (gregchiasson) I changed up the logic to avoid expensive GROUP BY type queries where possible, doing a larger number of much simpler queries instead, and gone back to the old system of incrementing counts, along with adding a number of comments about why I think this makes sense. Page rollups still us the group by, but they only pull from the revision rollups, which might be good enough, I'm not sure. Feel free to re-test, I'm open to suggestion if you think this won't work (I can just insert a billion random rows into our dev DB, and test against that, if I get time).
      • That's much better, thanks. I've left some comments on the revision in code review.
  •   Done Line 242: I have no idea what that comment is supposed to tell me Fixed in r105460 (gregchiasson)
  •   Done Line 275: am I misinterpreting the meaning of the per-revision rollup tables here? It seems these tables track the numbers for that revision and the 29 revs before it combined, rather than just tracking the numbers for that revision. I can't tell which interpretation is correct because there is no documentation in the code and no documentation in the SQL file either Fixed in r105460 (gregchiasson)
  •   Done Line 290-300: instead of that DELETE-INSERT-DELETE-INSERT sequence, you may want to use two REPLACE queries. See DatabaseBase::replace() in includes/db/Database.php As of r105601, there's only one delete/insert block. The replace function looks like it might be worth examing, still. (gregchiasson)
    • That looks better, thanks. I'll mark this as done; using REPLACE is a nice-to-have here and I'm not even sure it'll be much faster
  •   Done Line 310: this function is named and documented as a getter, but in reality it inserts a row. Please name and document the function accordingly Fixed in r105246 (rsterbin)
  •   Done Line 389, 394, 400, 406, 411, 412, 420, 421: you don't need to specify ISMULTI => false or REQUIRED => false, these are false by default Fixed in r105246 (rsterbin)

api/ApiFlagFeedbackArticleFeedbackv5.php edit

  •   Done Line 18: the 'af' prefix conflicts with the v4 modules and with the other two modules. Every query module should have a unique parameter prefix. Also, action= modules conventionally do not have a parameter prefix Removed the prefix in r105487 (gregchiasson)
  •   Done Line 31, 43, 49: ideally you would return an error code or message key from the API and have the JS client do the i18n work. This prevents annoying bugs when ?uselang= is used Fixed in r105487 (gregchiasson)

api/ApiViewFeedbackArticleFeedbackv5.php edit

  • A submodule of action=query has to be able to coexist peacefully with its fellow submodules, because any number of submodules can be invoked in the same request. This means you have to
    • choose a unique parameter prefix (line 22, see also comment on ApiFlagFeedback)
    •   Done namespace your ApiResult additions by using $this->getModuleName() as the key rather than 'data' (line 49-51). This is done correctly in the ViewRatings module, on line 36 fixed in r105632 (gregchiasson)
  • Line 89: a COUNT(*) query on an unbounded number of rows is unacceptable in production. Doesn't the rollup table have this information? Sort of, but that would break the counts when a filter is active - hidden, visible, etc.
  •   Done Line 103-110: we have a choice between DESC sorting and DESC sorting? That looks broken Fixed in r105243 (rsterbin) -- not broken, just repetitive
  •   Done Line 119-126: this query has WHERE af_page_id=N ORDER BY af_id, so there should be an index of (af_page_id, af_id) Fixed in r105632 (gregchiasson)
  • Line 123: don't use OFFSET-based paging. Instead, page with WHERE af_page_id >= $params['continue'] and set a query-continue for the next page ID. Grep the core code for query-continue for examples, or ask me
  • Line 188, 200, 203, 206 : as it is, these messages are being treated as raw HTML. This is occasionally OK, but please avoid it wherever possible. The output of these wfMsg() calls should be escaped Fixed in r105717 (gregchiasson)
  • Line 201, 204: use the 'pipe-separator' What does this mean? (gregchiasson)
  • Line 225: $found is the result of a wfMsg() call, should also be escaped Fixed in r105717 (gregchiasson)
  • Line 226: THIS IS A GLARINGLY OBVIOUS STORED XSS VULNERABILITY. You have to escape aa_response_text before putting it in your HTML output Fixed in r105717 (gregchiasson)
  • Line 234, 242, 256: More instances of unescaped database values in the HTML output Fixed in r105717 (gregchiasson)
  • Line 233, 241, 247, 252, 270, 274: More instances of unescaped messages Fixed in r105717 (gregchiasson)

api/ApiViewRatingsArticleFeedbackv5.php edit

  •   Done Line 55: you don't need the setIndexedTagname_internal() call here because there are no numerically indexed arrays around Fixed in r105242 (rsterbin)
  •   Done Line 75-77: the fetchRevisionRollup() function is unused Fixed in r105242 (rsterbin)
  •   Done Line 113: why are you grouping by afi_name? It's unnecessary because you're already grouping by rating_id (and afi_name is derived from that). Also, are afi_id and aa_rating_id joinable against each other? If so, shouldn't the latter be called aa_field_id? I wouldn't have to ask if there were any documentation whatsoever in the SQL file Force of habit from postgres, I guess, where you have to group by every column you're selecting, and I need both ID and name. At any rate, obviated by the below.
    •   Done Also, because the revisions branch is dead/unreachable, you can remove the SUM() and GROUP BY parts, right? Correct, and fixed in r105464 (gregchiasson). I'm not entirely sure what I was getting at there, but that query was definitely more complicated than it had to be.
  •   Done Line 127, 130: the userrating and subaction parameters are undocumented and unused Fixed in r105242 (rsterbin)
  •   Done Line 128, 135-139: the anontoken and revid parameters are unused Fixed in r105242 (rsterbin)

modules/ext.articleFeedbackv5/ext.articleFeedbackv5.startup.js edit

  • Line 23: this is checking the v4 preference name (articlefeedback-disable), should check the v5 preference name (articlefeedbackv5-disable) instead The spec says that v5 should respect users' disable setting for v4 -- rsterbin
  •   Done Line 30-32: why does the bucketing happen here rather than in ext.articleFeedbackv5.js ? There's another bucket() call there, so removing the bucket() call here wouldn't break anything, it would just change when the bucketing occurs (and it would bucket way way fewer people in the beginning, that's important) Fixed in r105307 (rsterbin)

modules/jquery.articleFeedbackv5/jquery.articleFeedbackv5.special.js edit

  •   Done Line 13-14: you can use mw.util.wikiScript( 'api' ) to get the URL for api.php Fixed in r105433 (yonishostak)
  •   Done Line 54: "TODO user ID?" yeah strangely the user ID isn't available in JS. But you shouldn't need to pass it to the API module anyway Fixed in r105433 (yonishostak)
  •   Done Line 95-97: can't you use $foo.append( bar ) rather than $foo.html( $foo.html() + bar ) ? Fixed in r105433 (yonishostak)
  •   Done Line 100, 101: please use .text() rather than .html() for text, it escapes HTML. Using HTML functions for things that aren't really HTML scares me Fixed in r105433 (yonishostak) and r105463 (gregchiasson)

modules/ext.articleFeedbackv5/ext.articleFeedbackv5.js edit

  •   Done Line 78: you can't use a self-closing span tag here. IE will choke if you use self-closing span tags in a jQuery constructor Fixed in r105194 (rsterbin)
  •   Done Line 93: ditto, closing a tag Fixed in r105194 (rsterbin)

modules/jquery.articleFeedbackv5/jquery.articleFeedbackv5.js edit

  •   Done Line 112: you can use mw.util.wikiScript( 'api' ) to get the URL for api.php Fixed in r105193 (rsterbin)
  •   Done Line 243: all your HTML classes and IDs are nicely namespaced, but find-feedback isn't Fixed in r105193 (rsterbin)
  •   Done Line 297: this URL is passed in in wgArticleFeedbackv5TermsPage. Bucket 2 does grab this variable Fixed in r105193 (rsterbin)
  •   Done Line 363-364, 621-623, 656, 941, 942: why is this commented out? Fixed in r105193 (rsterbin) -- half-implemented idea of disabling the form when the comment was empty; removed for now
  •   Done Line 577, 847, 1312: this grabs the proper variable for the terms page URL, but feeding it through wikiGetLink() is wrong, because it's already an absolute URL, not a page name Fixed in r105193 (rsterbin)
  •   Done Line 682, 712: t leaks to the global scope Fixed in r105193 (rsterbin)
  •   Done Line 776: why is there hardcoded English here? ("Remove this rating") Fixed in r105193 (rsterbin) -- no idea, but it's definitely not needed and gone now
  •   Done Line 828: so what is this for? Fixed in r105193 (rsterbin) -- ditto'
  •   Done Line 1055: you cannot have an unclosed html:msg tag here (self-closing is OK, that's what's used elsewhere) because it's invalid HTML and IE will barf Fixed in r105193 (rsterbin)
  •   Done Line 1156: as written this will bucket all users in a with-expertise and a without-expertise group, even if they're never shown form number 5. It's probably better to move this in a bit deeper so the expertise vs. no expertise bucketing only happens if needed, and we end up bucketing fewer people (=fewer cookies sent around, fewer bucketing events if bucket tracking is enabled) Fixed in r105307 (rsterbin)
  •   Done Line 1276-1277: is something missing here? Fixed in r105193 (rsterbin) -- stray outline of old template definition; removed
  •   Done Line 1616: the anontoken parameter is not used by this API module and should be removed Fixed in r105193 (rsterbin)
  •   Done Line 1618: mw.config.get( 'wgArticleFeedbackSMaxage' ) is called but the setting is called wgArticleFeedbackv5SMaxage Fixed in r105188 (rsterbin)
  •   Done Line 1632, 2309: you cannot call console.log() if you're not sure you're in debug mode, it causes a JS error if there is no active debugger Fixed in r105193 (rsterbin) -- I just used a defined check on console; if there's something built in I should be using instead, let me know.
  •   Done Line 1782-1796: why is <html:msg> not used here? Fixed in r105193 (rsterbin)
  •   Done Line 2028, 2029, 2314: use .text() for message contents, not .html() Fixed in r105193 (rsterbin)