User:OrenBochman
These Are A Few Of My Favourite Things
Quick IRC Channels Links
TODO
- Set up SCP support - https://wikitech.wikimedia.org/wiki/User:Wikinaut/Help:Access_to_instances_with_PuTTY_and_WinSCP
- Set up port forwarding with the Moodle 2.5 instance
- Set up tmux
Translate Wiki
- Solr on Labs
- Solarium integration
Berlin Hackathon
Search:
OAI Extension
This extension needs to be updated to work a little differently. It needs to provide the content info:
- Trigger updates on link change.
- Output to be modified to the following format:
Output Schema |
---|
Content Type |
Data dump (HTML for pages / JSON for Wikidata / URI for files) |
Metadata in JSON - pages listed below |
Page Meta Data |
---|
Page Id |
RevId |
Title |
Internal link List |
External Links List |
InterWiki Links List |
Category List Visible |
Category List Hidden |
InterWiki List |
GeoData List |
Edit Count |
Editor Count |
Cache Hits Weekly |
Cache Hits Weekly Normalised |
Afd Nomination |
Project List |
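A minimal sketch of how one output record could be assembled, assuming the three parts of the Output Schema table (content type, data dump, metadata in JSON). All field names and the `build_record` helper are assumptions made for illustration; the tables above only give the row labels, not a wire format.

```python
import json
from dataclasses import asdict, dataclass, field
from typing import Dict, List


@dataclass
class PageMetadata:
    """One attribute per row of the Page Meta Data table (names are assumed)."""
    page_id: int
    rev_id: int
    title: str
    internal_links: List[str] = field(default_factory=list)
    external_links: List[str] = field(default_factory=list)
    interwiki_links: List[str] = field(default_factory=list)
    visible_categories: List[str] = field(default_factory=list)
    hidden_categories: List[str] = field(default_factory=list)
    interwikis: List[str] = field(default_factory=list)
    geodata: List[Dict[str, float]] = field(default_factory=list)
    edit_count: int = 0
    editor_count: int = 0
    cache_hits_weekly: int = 0
    cache_hits_weekly_normalised: float = 0.0
    afd_nomination: bool = False
    projects: List[str] = field(default_factory=list)


def build_record(content_type: str, data_dump: str, metadata: PageMetadata) -> str:
    """Serialise one record: content type, data dump, and metadata as JSON."""
    record = {
        "content_type": content_type,   # e.g. "page", "wikidata", "file" (assumed values)
        "data_dump": data_dump,         # HTML, JSON, or a URI, depending on content type
        "metadata": asdict(metadata),
    }
    return json.dumps(record, sort_keys=True)


example = PageMetadata(page_id=1, rev_id=10, title="Example",
                       internal_links=["Main Page"],
                       visible_categories=["Examples"])
print(build_record("page", "<p>Example</p>", example))
```

Keeping the metadata as a flat dataclass makes it easy to compare two revisions field by field, which is one possible way to decide when a link change should trigger an update.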
WikiData
Schedule video conference with the Bugmeister
WikiData MetaData |
---|
Title |
RevId |
Internal Links |
External Links |
InterWiki |
Categories |
InterWiki |
RevId |
WikiData Page Data |
---|
Title |
RevId |
Internal Links |
External Links |
InterWiki |
Categories |
InterWiki |
RevId |
Hackathon 2013
- Develop
- Tron bot - Quality analytics + advice for new articles
- Orwell01 - Sandbox edit test
- Needs a ~/.description
- Orwell02 - Group edit export to Gephi
- Orwell03 - plsi + grammar check
- Orwell04 - Configurable checks
- SVG Comics gadget to display animated SVG comics, based on Arun Ganesh's D3-enhanced Map gadget.
- Bootstrap skin for Moodle based on ....
- Contacts:
- User:Ocaasi (roommate), USA - runs many projects, including the Wikipedia Library and The Wikipedia Adventure.
- Martijn Hoekstra - helps with AfC stats.
- User:Felipe Schenone (roommate), Argentina - helped with widget design.
- Evan Rosen (roommate) - Wikimetrics developer from the analytics team.
- Chris Steipp - Senior Security Engineer working on OpenID & OAuth development.
- User:Yug - fr Wiktionary Wikidata migration & maps design. Recommends http://commons.wikimedia.org/wiki/File:Israel_location_map.svg as the base SVG map for Israel.
- wikt:fr:User:Darkdadaah - another fr Wiktionary user.
- User:Kolossos - Czech dev in the Czech Wikipedia; we met in Berlin and Amsterdam (Like puzzles)
- Peter Bena - Czech project lead of Huggle, who is also a volunteer Labs ops and knows the tool migration stuff.
- User:Yurik - works on Wikipedia Zero.
- Magioladitis - lead developer of AutoWikiBrowser.
- User:Erik Zachte - of the analytics team; makes monthly aggregates of Wikipedia dumps.
- User:Planemad Arun Ganesh - map developer
- Susanna anas phd interested in maps and memorabilia ...
- User:Kelson - openZIM and Kiwix!
- Antoine Musso - Jenkins and search
- User:MarkAHershberger - old timer like me.
- User:Henna - report issues with Vagrant (64-bit Python)
- User:TMg
- Merlijn van Deen - Pywikipedia bot assistance
- Maarten Dammers - WLM Solr connection.
- Sebastiaan - interested in video teaching scripts
- The Wiki Loves Art guy
- kimmo.virtanen@gmail.com Kimmo (room mate) from Finland
- mike rubio mikerubio@gmail.com (room mate) from the Philippines.
- user:lyhana8 - French developer of the Wiktionary project migration to Wikidata
Extension Ideas
- LaTeX Diagram Builder (LaTeX to SVG script)
- Takes a LaTeX diagram inside a <latexD></latexD> tag.
- Outputs an SVG of the diagram.
- Easy to do, since LaTeX can run as a command-line application.
- Gambit extension
- Takes an extensive-form game
- Generates a diagram
- Generates solutions
- Easy to do, since Gambit works as a command-line application.
- Cannot make ESS reports
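For the LaTeX Diagram Builder idea, the command-line route could look roughly like the sketch below: pull the diagram source out of the tag, wrap it in a standalone document, run `latex` to get a DVI, then `dvisvgm` to get an SVG. The helper names, the `standalone` document class, and the `diagram.tex` file name are assumptions for illustration; the exact flags should be checked against the installed TeX distribution, and user-supplied LaTeX should only be run with shell escape disabled.

```python
import re
import subprocess
from pathlib import Path
from typing import List

# Matches the body of each <latexD>...</latexD> tag (tag name taken from the notes above).
TAG_PATTERN = re.compile(r"<latexD>(.*?)</latexD>", re.DOTALL)


def extract_diagrams(wikitext: str) -> List[str]:
    """Return the LaTeX source of every diagram tag found in the wikitext."""
    return [body.strip() for body in TAG_PATTERN.findall(wikitext)]


def wrap_document(diagram_source: str) -> str:
    """Wrap a diagram body in a minimal document (standalone class is an assumption)."""
    return ("\\documentclass{standalone}\n"
            "\\begin{document}\n"
            f"{diagram_source}\n"
            "\\end{document}\n")


def build_commands(tex_path: Path) -> List[List[str]]:
    """Build the two command lines: LaTeX to DVI, then DVI to SVG."""
    dvi_path = tex_path.with_suffix(".dvi")
    svg_path = tex_path.with_suffix(".svg")
    return [
        ["latex", "-interaction=nonstopmode", "-no-shell-escape",
         f"-output-directory={tex_path.parent}", str(tex_path)],
        ["dvisvgm", "--no-fonts", f"--output={svg_path}", str(dvi_path)],
    ]


def render_svg(diagram_source: str, work_dir: Path) -> Path:
    """Write the document, run both tools, and return the expected SVG path."""
    tex_path = work_dir / "diagram.tex"
    tex_path.write_text(wrap_document(diagram_source), encoding="utf-8")
    for command in build_commands(tex_path):
        subprocess.run(command, check=True)  # raises if either tool fails
    return tex_path.with_suffix(".svg")


if __name__ == "__main__":
    sample = "Intro <latexD>\\fbox{A} $\\rightarrow$ \\fbox{B}</latexD> outro"
    for source in extract_diagrams(sample):
        print(build_commands(Path("work") / "diagram.tex"))
```

Passing the commands as argument lists rather than a shell string avoids quoting problems with user-supplied file names. The same shape would probably work for the Gambit idea, with the game file in place of the `.tex` file.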
Conferences
SOLR
security: [1]
Stuff
- Cooperate with
- Google on NLP
- Academia
- Apertium
- HFST
Summer Of Code
Lucene Lemma Analyzers based on Morphology Extraction from Wikipedia Text
- Part 1: use & expand induction software to process existing languages.
- Lemmas to word sense:
- existing works
- semantic frames - the verb "think" (about) takes a noun complement XXX. In Hungarian this is more explicit. This can be a powerful format for representing knowledge in sentences, and could be used to convert text to relations (go, go to XXX, go from XXX to YYY). Not many relations are needed: verbs of motion, events, ...
- logic frames - map simple sentences to a Prolog-like logic structure
- Part 2: extract semantic frames from a (part-of-speech tagged) corpus.
- deliverables:
- semantic networks used in wikipedia
- search and retrieve sample sentences for semantic frame patterns
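To make the "go, go to XXX, go from XXX to YYY" frames concrete, here is a toy sketch that turns a sentence into a frame and prints it as a Prolog-like term. It is only an illustration under strong assumptions: the verb table, the `Frame` type, and the token-matching rule are invented here, and a real system would work from a part-of-speech tagged corpus rather than regular expressions.

```python
import re
from typing import NamedTuple, Optional

# Tiny, assumed lookup from surface forms to motion-verb lemmas.
MOTION_VERBS = {"go": "go", "goes": "go", "went": "go",
                "travel": "travel", "travels": "travel", "travelled": "travel"}


class Frame(NamedTuple):
    verb: str
    source: Optional[str]
    goal: Optional[str]

    def as_logic(self) -> str:
        """Render the frame as a Prolog-like term, e.g. go(from(x), to(y))."""
        args = []
        if self.source:
            args.append(f"from({self.source.lower()})")
        if self.goal:
            args.append(f"to({self.goal.lower()})")
        return f"{self.verb}({', '.join(args)})" if args else self.verb


def extract_frame(sentence: str) -> Optional[Frame]:
    """Find the first motion verb and the words following 'from' and 'to'."""
    tokens = re.findall(r"[A-Za-z]+", sentence)
    for index, token in enumerate(tokens):
        lemma = MOTION_VERBS.get(token.lower())
        if lemma is None:
            continue
        source = goal = None
        rest = tokens[index + 1:]
        for position in range(len(rest) - 1):
            word = rest[position].lower()
            if word == "from" and source is None:
                source = rest[position + 1]
            elif word == "to" and goal is None:
                goal = rest[position + 1]
        return Frame(lemma, source, goal)
    return None  # no motion verb, so no frame in this toy model


for text in ["She went from Berlin to Amsterdam", "They go to Haifa"]:
    frame = extract_frame(text)
    print(text, "->", frame.as_logic())
```

The subject is dropped on purpose to keep the sketch short; adding it would mean one more slot in `Frame`.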
Lucene - Automatic Query Expansion System
Use SVD or other methods to build cross-language word nets.
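A rough sketch of the SVD idea on a toy corpus: each "document" merges the terms of an English article and its French counterpart, so terms from both languages land in one latent space and a query term can be expanded with its nearest neighbours. The corpus, the function names, and the choice of power iteration (used here instead of a linear-algebra library so the example stays dependency-free) are all assumptions for illustration.

```python
import math
import random
from typing import Dict, List, Tuple

# Toy merged English/French documents (made up for this example).
DOCUMENTS = [
    ["cat", "chat", "cat", "chat", "pet"],
    ["cat", "chat", "kitten"],
    ["car", "voiture", "road"],
    ["car", "voiture", "car", "voiture", "engine"],
]


def term_gram(documents: List[List[str]]) -> Tuple[List[str], List[List[float]]]:
    """Return the vocabulary and the term-by-term matrix A * A^T of raw counts."""
    vocabulary = sorted({term for doc in documents for term in doc})
    counts = [[float(doc.count(term)) for doc in documents] for term in vocabulary]
    gram = [[sum(a * b for a, b in zip(row_i, row_j)) for row_j in counts]
            for row_i in counts]
    return vocabulary, gram


def leading_eigenpairs(matrix: List[List[float]], rank: int,
                       iterations: int = 200, seed: int = 0):
    """Power iteration with deflation; eigenvalues of A * A^T are squared singular values."""
    size = len(matrix)
    rng = random.Random(seed)
    work = [row[:] for row in matrix]
    pairs = []
    for _ in range(rank):
        vector = [rng.random() + 0.1 for _ in range(size)]
        for _ in range(iterations):
            product = [sum(work[i][j] * vector[j] for j in range(size)) for i in range(size)]
            norm = math.sqrt(sum(x * x for x in product))
            if norm < 1e-12:
                break
            vector = [x / norm for x in product]
        value = sum(vector[i] * work[i][j] * vector[j]
                    for i in range(size) for j in range(size))
        pairs.append((value, vector))
        for i in range(size):          # deflate: remove the component just found
            for j in range(size):
                work[i][j] -= value * vector[i] * vector[j]
    return pairs


def embed_terms(documents: List[List[str]], rank: int = 2) -> Dict[str, List[float]]:
    """Map each term to its coordinates in the rank-k latent space (rows of U * Sigma)."""
    vocabulary, gram = term_gram(documents)
    pairs = leading_eigenpairs(gram, rank)
    return {term: [math.sqrt(max(value, 0.0)) * vector[i] for value, vector in pairs]
            for i, term in enumerate(vocabulary)}


def cosine(left: List[float], right: List[float]) -> float:
    norms = math.sqrt(sum(x * x for x in left)) * math.sqrt(sum(x * x for x in right))
    return sum(a * b for a, b in zip(left, right)) / norms if norms else 0.0


def expand(term: str, embeddings: Dict[str, List[float]], limit: int = 3) -> List[str]:
    """Return the terms closest to the query term in the latent space."""
    others = [t for t in embeddings if t != term]
    others.sort(key=lambda t: cosine(embeddings[term], embeddings[t]), reverse=True)
    return others[:limit]


print(expand("cat", embed_terms(DOCUMENTS)))
```

With such a low rank, all terms of one topic collapse onto almost the same direction, so the expansion groups by topic rather than by exact translation; a larger corpus and rank would be needed to separate "chat" from "kitten".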
User Fingerprinting
- anonymous fingerprinting for:
- free unregistered editor contributions.
- sock puppet detection
- probably not a good GSoC concept
Lucene - NG Wiki Parser Filter
Integrate the cutting-edge parser as a Lucene filter to allow offline indexing of wiki source.
- Deliverable: up-to-date Wikipedia parser.
- Problem: no specs.
- Problem: templates.
This will probably be one of my own projects if I get to work full time.
UIMA Content Extraction From Talk Pages
Use UIMA to automate content extraction from Talk and User Talk pages. This is to facilitate tracking of action on various policies. Produce a Q&A system.
This is on the fringe of content analytics.
Corpus Stuff
- http://nlp.cs.nyu.edu/wikipedia-data/ - A Wikipedia-based Corpus Reference Tool
Footnotes
- ↑ Ant
- ↑ Grammars
- ↑ Benchmark
- ↑ Machine Translation
- ↑ QA
- ↑ Media Wiki's
- ↑ clustering
- ↑ data structure
- ↑ real time collaboration
- ↑ CI
- ↑ search lib
- ↑ language detection
- ↑ checking external links
- ↑ testing search
- ↑ Statistics & data mining
- ↑ source control
- ↑ search engine
- ↑ language detection
- ↑ translation memory
- ↑ tutorials
- ↑ testing
- ↑ content analytics framework
- ↑ SOLR PHP integration
Subpages
- OrenBochman//Search/Resources
- OrenBochman//Search/Test Plan
- OrenBochman/Bugs
- OrenBochman/Contacts
- OrenBochman/Dev Contacts
- OrenBochman/Features
- OrenBochman/Header
- OrenBochman/HunSig
- OrenBochman/HunSig/Development
- OrenBochman/HunSig/Research
- OrenBochman/Ideas
- OrenBochman/Installation
- OrenBochman/Introduction
- OrenBochman/Lucene
- OrenBochman/Main
- OrenBochman/ParserNG
- OrenBochman/ParserNG/Preprocessor
- OrenBochman/ParserNG/Preprocessor Antlr
- OrenBochman/ParserNG/Sanitizer Antlr
- OrenBochman/ParserNG/Tests
- OrenBochman/ParserNG/Tests/Test1
- OrenBochman/ParserNG/Tests/Test2
- OrenBochman/ParserNG/Tests/Test3
- OrenBochman/ParserNG/Tests/Test4
- OrenBochman/ParserNG/Transliterator Antlr
- OrenBochman/ParserNG/WikiTable
- OrenBochman/ParserNG/antlr
- OrenBochman/Scratch
- OrenBochman/Search
- OrenBochman/Search/Analytics
- OrenBochman/Search/BrainStorm
- OrenBochman/Search/Conf
- OrenBochman/Search/Features
- OrenBochman/Search/Labs
- OrenBochman/Search/NGSpec
- OrenBochman/Search/NLP Tools
- OrenBochman/Search/NLP Tools/Morphology
- OrenBochman/Search/Plan
- OrenBochman/Search/Porting
- OrenBochman/Search/Risk Assesssment
- OrenBochman/Search/Spec
- OrenBochman/Search/Tab
- OrenBochman/Search/Test Plan
- OrenBochman/Search/Todo
- OrenBochman/Search/Tools
- OrenBochman/Search/Tools/
- OrenBochman/SearchTools/Awk Antlr
- OrenBochman/Social Wiki
- OrenBochman/Sul
- OrenBochman/WikiJournal
- OrenBochman/bots
- OrenBochman/common.css
- OrenBochman/common.js
- OrenBochman/new ssh key
- OrenBochman/skin