ORES


Warning:

The ORES infrastructure is being deprecated by the Machine Learning team. Please check wikitech:ORES for more information.

ORES (/ɔɹz/)^[1] is a service for scoring edits. It consists of a web service and an API that provide machine learning as a service for Wikimedia projects, and it is maintained by the Scoring Platform team. The system is designed to help automate critical wiki work, for example vandalism detection and removal. ORES currently provides two types of scores: "edit quality" and "article quality".

ORES itself is only a back-end service and does not directly provide a way to use these quality scores. If you would like to use ORES scores, see the list of tools that use them. If ORES does not support your wiki yet, see the instructions for requesting support.

Looking for answers to your questions about ORES? Check the ORES FAQ.

Edit quality

Edit flow. The diagram shows edits of unknown quality before ORES, and "good", "needs review", and "damaging" edits after ORES is introduced.

One of the most serious concerns for Wikimedia's open projects is the review of potentially damaging edits. There is also a need to identify good-faith contributors (who may inadvertently make a problematic edit) and offer them support. These models are intended to make filtering the Special:RecentChanges feed easier. We offer two levels of model support for predicting edit quality: basic and advanced.

Basic support

Assuming that most damaging edits will be reverted and that edits which are not damaging will not be reverted, a model can be built from a wiki's edit history (including reverted edits). This model is easy to set up, but it has one problem: many edits are reverted for reasons other than damage and vandalism. To address this, there is a model based on bad words.

reverted – predicts whether an edit will be reverted

Advanced support

Rather than assuming, we can ask editors to teach ORES which edits are in fact damaging and which edits look like they were saved in good faith. This requires additional work from volunteers in the community, but it affords a more accurate and nuanced prediction of the quality of an edit. Many ORES tools will only function when advanced support is available for the target wiki.

damaging – predicts whether an edit causes damage to the article
goodfaith – predicts whether an edit was made in good faith

Article quality

Article quality table on English Wikipedia. Screenshot of the table as of June 2024.

The quality of Wikipedia articles matters to its editors. New pages must be reviewed and curated to ensure that spam, vandalism, and attack articles do not remain in the wiki. For articles that survive the initial curation, some Wikipedians periodically evaluate article quality, but this is highly labor-intensive and the assessments are often out of date.

New article evaluation

The sooner seriously problematic draft articles are removed, the better. Reviewing newly created articles can take a lot of work. As with counter-vandalism for edits, machine predictions can help curators focus on the most problematic new pages first. Based on comments left by admins when they delete pages (see the logging table), we can train a model to predict which pages will need quick deletion. See en:WP:CSD for a list of quick deletion reasons for English Wikipedia. For the English model, we used G3 "vandalism", G10 "attack", and G11 "spam".

draftquality – predicts if the article will need to be speedy deleted (spam, vandalism, attack, or OK)
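
As a rough illustration of how deletion comments could be turned into training labels, the sketch below maps the three criteria named above to the draftquality classes. The function name and the sample comment strings are hypothetical, and this is not the pipeline the ORES team actually used.

```python
import re

# Speedy-deletion criteria named in the text, mapped to draftquality labels.
CSD_LABELS = {"G3": "vandalism", "G10": "attack", "G11": "spam"}
# Word boundaries keep "G1" or "G12" from matching by accident.
CSD_PATTERN = re.compile(r"\b(G3|G10|G11)\b")


def label_from_comment(comment):
    """Return the label for the first matching criterion, else None."""
    match = CSD_PATTERN.search(comment)
    return CSD_LABELS[match.group(1)] if match else None


# Hypothetical deletion-log comments, for illustration only.
print(label_from_comment("[[WP:CSD#G11|G11]]: Unambiguous advertising"))
print(label_from_comment("[[WP:CSD#G12|G12]]: Copyright infringement"))
```

Pages deleted for other reasons return None here; how such pages and surviving pages (the "OK" class) are sampled is a separate design decision that this sketch does not cover.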

Existing article assessment

For articles that survive the initial curation, some of the large Wikipedias periodically evaluate the quality of articles using a scale that roughly corresponds to the English Wikipedia 1.0 assessment rating scale (articlequality). Having these assessments is very useful because it helps us gauge our progress and identify missed opportunities (e.g., popular articles that are low quality). However, keeping these assessments up to date is challenging, so coverage is inconsistent. This is where the articlequality machine learning model comes in handy. By training a model to replicate the article quality assessments that humans perform, we can automatically assess every article and every revision with a computer. This model has been used to help WikiProjects triage re-assessment work and to explore the editing dynamics that lead to article quality improvements.

The articlequality model bases its predictions on structural characteristics of the article. E.g. How many sections are there? Is there an infobox? How many references? And do the references use a w:Template:cite xxx template? The articlequality model doesn't evaluate the quality of the writing or if there's a tone problem (e.g. a point of view being pushed). However, many of the structural characteristics of articles seem to correlate strongly with good writing and tone, so the models work very well in practice.

articlequality – predicts the (Wikipedia 1.0-like) assessment class of an article or draft
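
To make the idea of structural characteristics concrete, here is a minimal sketch that counts a few of the signals mentioned above in raw wikitext. The regular expressions and feature names are assumptions chosen for illustration; the real articlequality feature extraction is more thorough and is not reproduced here.

```python
import re


def structural_features(wikitext):
    """Count a few structural signals in raw wikitext (illustrative only)."""
    return {
        # Headings look like "== Title ==" on a line of their own.
        "sections": len(
            re.findall(r"^==+[^=\n].*?==+[ \t]*$", wikitext, flags=re.M)
        ),
        "has_infobox": bool(re.search(r"\{\{\s*infobox\b", wikitext, flags=re.I)),
        # Opening <ref> tags, including reuses of a named reference.
        "references": len(re.findall(r"<ref[\s>]", wikitext, flags=re.I)),
        # Templates such as {{cite book}} or {{Cite journal}}.
        "cite_templates": len(
            re.findall(r"\{\{\s*cite\s+\w+", wikitext, flags=re.I)
        ),
    }


# Hypothetical article text, for illustration only.
SAMPLE_WIKITEXT = """{{Infobox scientist|name=Example}}
Lead paragraph.<ref>{{cite book|title=X}}</ref>

== Early life ==
Text.<ref name="a">{{Cite journal|title=Y}}</ref>

=== Education ===
More text.<ref name="a" />
"""

print(structural_features(SAMPLE_WIKITEXT))
```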

Topic routing

Topic Cross-walk. A visualization of the cross-wiki labeling process is presented. English Wikipedia's WikiProjects tag articles by topical interest. WikiProjects are organized into a taxonomy of topic labels. The topic labels are applied to articles on other wikis via Wikidata sitelinks.

ORES' article topic model applies an intuitive top-down taxonomy to any article in Wikipedia, including new article drafts. This topic routing is useful for curating new articles, building work lists, forming new WikiProjects, and analyzing coverage gaps.

ORES topic models are trained using word embeddings of the actual content. For each language, a language-specific embedding is learned and applied natively. Since this modeling strategy depends on the topic of the article, topic predictions may differ between languages depending on the topics present in the text of the article.

New article evaluation

New article routing. A diagram maps the flow of new articles in Wikipedia with the 'draftquality' and 'articletopic' ORES models used for routing.

The biggest difficulty with reviewing new articles is finding someone familiar with the subject matter to judge notability, relevance, and accuracy. Our drafttopic model is designed to route newly created articles based on their apparent topical nature to interested reviewers. The model is trained and tested against the first revision of articles and is thus suitable to use on new article drafts.

drafttopic – predicts the topic of a new article draft

Topic interest mapping

Article tagging example (Ann Bishop). Ann Bishop is tagged by WikiProjects East Anglia, Women scientists, Women's history, and Biography. The topic taxonomy translation and predictions are presented. Note that the predictions include more relevant topic information than the taxonomy links.

The topical relatedness of articles is an important concept for the organization of work in Wikipedia. Topical working groups have become a common strategy for managing content production and patrolling in Wikipedia. Yet a high-level hierarchy is not available or query-able for many reasons. The result is that anyone looking to organize around a topic or make a work-list has to do substantial manual work to identify the relevant articles. With our articletopic model, these queries can be done automatically.

articletopic – predicts the topic of an article (more details)

Support table

The ORES support table reports the status of ORES support by wiki and model available. If you don't see your wiki listed, or support for the model you'd like to use, you can request support.

API usage

ORES offers a RESTful API service for dynamically retrieving scoring information about revisions. See https://ores.wikimedia.org for more information on how to use the API.

If you're querying the service about a large number of revisions, it's recommended to batch no more than 50 revisions within a given request, as described below. It's acceptable to use up to 4 parallel requests. Please do not exceed these limits, or ORES can become unstable. For an even larger number of queries, you can run ORES locally.
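
One way to stay within these limits is sketched below using only the Python standard library. The helper names (chunk, build_url, fetch, score_all) are hypothetical, the URL layout follows the example queries on this page, and the network functions are untested against the live service, which is being deprecated.

```python
import json
from concurrent.futures import ThreadPoolExecutor
from urllib.parse import urlencode
from urllib.request import Request, urlopen

BASE_URL = "https://ores.wikimedia.org/v3/scores"
MAX_BATCH = 50     # revisions per request, per the guidance above
MAX_PARALLEL = 4   # simultaneous requests, per the guidance above


def chunk(rev_ids, size=MAX_BATCH):
    """Split rev_ids into consecutive lists of at most `size` items."""
    return [rev_ids[i:i + size] for i in range(0, len(rev_ids), size)]


def build_url(wiki, models, rev_ids):
    """Build a v3 scores URL with pipe-separated models and revids."""
    query = urlencode(
        {"models": "|".join(models), "revids": "|".join(map(str, rev_ids))},
        safe="|",  # keep the pipes readable, as in the example query
    )
    return f"{BASE_URL}/{wiki}/?{query}"


def fetch(url, user_agent):
    """Fetch one batch and decode the JSON body (performs a network call)."""
    request = Request(url, headers={"User-Agent": user_agent})
    with urlopen(request) as response:
        return json.load(response)


def score_all(wiki, models, rev_ids, user_agent):
    """Score many revisions with at most MAX_PARALLEL batches in flight."""
    urls = [build_url(wiki, models, batch) for batch in chunk(rev_ids)]
    with ThreadPoolExecutor(max_workers=MAX_PARALLEL) as pool:
        return list(pool.map(lambda url: fetch(url, user_agent), urls))


print(build_url("enwiki", ["draftquality", "wp10"], [34854345, 485104318]))
```

The thread pool size caps concurrency, so a list of any length is processed four batches at a time without extra bookkeeping.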

Example query: http://ores.wikimedia.org/v3/scores/enwiki/?models=draftquality|wp10&revids=34854345|485104318

{
  "enwiki": {
    "models": {
      "draftquality": {
        "version": "0.0.1"
      },
      "wp10": {
        "version": "0.5.0"
      }
    },
    "scores": {
      "34854345": {
        "draftquality": {
          "score": {
            "prediction": "OK",
            "probability": {
              "OK": 0.7013632376824356,
              "attack": 0.0033607229172158775,
              "spam": 0.2176404529599271,
              "vandalism": 0.07763558644042126
            }
          }
        },
        "wp10": {
          "score": {
            "prediction": "FA",
            "probability": {
              "B": 0.22222314275400137,
              "C": 0.028102719464462304,
              "FA": 0.7214649122864883,
              "GA": 0.008833476344463836,
              "Start": 0.017699431000825352,
              "Stub": 0.0016763181497590444
            }
          }
        }
      },
      "485104318": {
        "draftquality": {
          "score": {
            "prediction": "OK",
            "probability": {
              "OK": 0.9870402772858909,
              "attack": 0.0006854267347843173,
              "spam": 0.010405615745053554,
              "vandalism": 0.0018686802342713132
            }
          }
        },
        "wp10": {
          "score": {
            "prediction": "Stub",
            "probability": {
              "B": 0.02035853144725939,
              "C": 0.021257471714087376,
              "FA": 0.0018133076388221472,
              "GA": 0.003447287158958823,
              "Start": 0.1470443252839051,
              "Stub": 0.8060790767569672
            }
          }
        }
      }
    }
  }
}
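
A response with this shape can be flattened into one prediction per revision and model. The sketch below works on a trimmed copy of the response above; the function name is hypothetical, and the handling of entries without a "score" key is an assumption about how failed scores are reported.

```python
import json

# Trimmed copy of the response above (one revision, one model).
SAMPLE = json.loads("""
{"enwiki": {"scores": {"34854345": {"draftquality": {"score": {
  "prediction": "OK",
  "probability": {"OK": 0.7013632376824356, "attack": 0.0033607229172158775,
                  "spam": 0.2176404529599271, "vandalism": 0.07763558644042126}
}}}}}}
""")


def predictions(response, wiki):
    """Map (rev_id, model) to (prediction, probability of that prediction)."""
    result = {}
    for rev_id, models in response[wiki]["scores"].items():
        for model, payload in models.items():
            score = payload.get("score")
            if score is None:  # assumption: failed scores carry no "score" key
                continue
            label = score["prediction"]
            # Probability keys are strings; JSON booleans arrive as Python bools.
            key = str(label).lower() if isinstance(label, bool) else label
            result[(rev_id, model)] = (label, score["probability"][key])
    return result


print(predictions(SAMPLE, "enwiki"))
```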


Example query: https://ores.wikimedia.org/v3/scores/wikidatawiki/421063984/damaging

{
  "wikidatawiki": {
    "models": {
      "damaging": {
        "version": "0.3.0"
      }
    },
    "scores": {
      "421063984": {
        "damaging": {
          "score": {
            "prediction": false,
            "probability": {
              "false": 0.9947809563336424,
              "true": 0.005219043666357669
            }
          }
        }
      }
    }
  }
}
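
A tool that consumes the damaging score typically turns the probability into a review decision. The sketch below sorts an edit into the "good", "needs review", and "damaging" buckets mentioned under edit quality. Both thresholds are hypothetical values picked for illustration; real tools should choose thresholds based on the model's measured performance on the target wiki.

```python
# The "score" object from the response above.
SCORE = {
    "prediction": False,
    "probability": {"false": 0.9947809563336424, "true": 0.005219043666357669},
}


def triage(score, review_at=0.3, damaging_at=0.8):
    """Bucket an edit by its probability of being damaging.

    review_at and damaging_at are hypothetical thresholds.
    """
    p_damaging = score["probability"]["true"]
    if p_damaging >= damaging_at:
        return "damaging"
    if p_damaging >= review_at:
        return "needs review"
    return "good"


print(triage(SCORE))
```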


EventStream usage

The ORES scores are also provided as an EventStream at https://stream.wikimedia.org/v2/stream/revision-score
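
EventStreams are delivered as server-sent events, where each event carries its payload on lines that begin with "data:" and ends with a blank line. The sketch below parses such lines into JSON objects. It assumes each event's data is a single JSON document and makes no assumptions about the fields inside it; the sample lines and their field names are invented placeholders, not real stream output.

```python
import json


def iter_events(lines):
    """Yield one decoded JSON object per server-sent event.

    `lines` is any iterable of text lines, for example a streamed HTTP
    response body decoded line by line.
    """
    data = []
    for raw in lines:
        line = raw.rstrip("\r\n")
        if line.startswith("data:"):
            # Strip the field name and one optional leading space.
            value = line[len("data:"):]
            data.append(value[1:] if value.startswith(" ") else value)
        elif line == "" and data:
            # A blank line ends the event; multi-line data joins with "\n".
            yield json.loads("\n".join(data))
            data = []
        # Other fields ("event:", "id:") and ":" comments are ignored here.


# Invented sample lines, for illustration only.
SAMPLE_LINES = [
    "event: message\n",
    'data: {"rev_id": 1, "database": "enwiki"}\n',
    "\n",
    ":keepalive\n",
    "\n",
    'data: {"rev_id": 2}\n',
    "\n",
]

for event in iter_events(SAMPLE_LINES):
    print(event)
```

To read the live stream, the same generator could be fed the decoded lines of an open HTTP response, for example from urllib.request.urlopen, with a descriptive User-Agent header.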

Local usage

To run ORES locally, you can install the ORES Python package with:

pip install ores # needs to be python3, incompatible with python2

Then you should be able to run it through:

echo -e '{"rev_id": 456789}\n{"rev_id": 3242342}' | ores score_revisions https://ores.wikimedia.org (your user-agent string goes here) enwiki damaging

You should see output like:

2017-11-22 16:23:53,000 INFO:ores.utilities.score_revisions -- Reading input from <stdin>
2017-11-22 16:23:53,000 INFO:ores.utilities.score_revisions -- Writing output to from <stdout>
{"score": {"damaging": {"score": {"prediction": false, "probability": {"false": 0.9889349126544834, "true": 0.011065087345516589}}}}, "rev_id": 456789}
{"score": {"damaging": {"score": {"prediction": false, "probability": {"false": 0.9830812038318183, "true": 0.016918796168181708}}}}, "rev_id": 3242342}
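
The utility writes one JSON object per scored revision, mixed with log lines. A minimal sketch for collecting the damaging probabilities from captured output follows; the function name is hypothetical, and it assumes that every result line starts with "{" and every log line does not.

```python
import json

# One log line and the two result lines copied from the output above.
OUTPUT = """2017-11-22 16:23:53,000 INFO:ores.utilities.score_revisions -- Writing output to from <stdout>
{"score": {"damaging": {"score": {"prediction": false, "probability": {"false": 0.9889349126544834, "true": 0.011065087345516589}}}}, "rev_id": 456789}
{"score": {"damaging": {"score": {"prediction": false, "probability": {"false": 0.9830812038318183, "true": 0.016918796168181708}}}}, "rev_id": 3242342}
"""


def true_probability_by_rev(text, model="damaging"):
    """Return {rev_id: probability of "true"} from the utility's output."""
    result = {}
    for line in text.splitlines():
        if not line.startswith("{"):
            continue  # skip log lines, which begin with a timestamp
        record = json.loads(line)
        score = record["score"][model]["score"]
        result[record["rev_id"]] = score["probability"]["true"]
    return result


print(true_probability_by_rev(OUTPUT))
```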


Footnotes

[1] Originally named "Objective Revision Evaluation Service", but the full name is now deprecated.