OCG/Open Tasks
A dump of some open tasks for the OCG service:
- Finish the ZIM writer, then reuse it for the ePub writer.
- Tweak the text writer to match Tilman's preferences better:
  - meta:User:Tbayer_(WMF)/Converting_wiki_pages_to_plaintext_emails
  - "have an output where headlines are more clearly marked (e.g. retaining the "== ...==" wikitext for <h2>s, which is what I'm doing currently)."
- Better integrate load-test regression testing into the deploy process
- Automatically deploy to the beta repository on merge, like Parsoid does (see https://gerrit.wikimedia.org/r/170130 )
- Fix crashers: https://logstash.wikimedia.org/#/dashboard/elasticsearch/ocg-crashers
- Keep an eye on https://logstash.wikimedia.org/#/dashboard/elasticsearch/ocg-cpu-leak
- Fix the pdfsplit/djvu split pipeline so that we split only the pages we actually need (see https://gerrit.wikimedia.org/r/165632 )
- Don't default to RTL based only on language ( https://bugzilla.wikimedia.org/show_bug.cgi?id=71869 )
- Refactor mw-ocg-service to use npm modules for mw-ocg-bundler, etc., instead of git submodules.
  - Allows an easier install process
  - Allows us to use the same Jenkins jobs and deploy process as for Parsoid (and hopefully other services)
- Factor out a common mw-ocg-common package with lib/p.js, lib/db.js, lib/domutil.js, lib/status.js, cli stuff, the fork helper from mw-ocg-service, etc.
- Table improvements to mw-ocg-latexer (https://gerrit.wikimedia.org/r/107587)
  - Turn this off for most tables, then whitelist the specific cases where it doesn't break the page.
  - Use some package to provide robust multipage tables.
  - Use CSS to identify 'large' tables.
- Parse some CSS in domino so we can get better size information in mw-ocg-latexer. (Images, tables, etc.) ( https://bugzilla.wikimedia.org/show_bug.cgi?id=71339 )
- Use v2 Parsoid API ( https://bugzilla.wikimedia.org/show_bug.cgi?id=71186 )
- Refactor wiki-specific hacks into a separate class. Make the visitor pluggable. (mw-ocg-latexer)
- Handle font-switching better. Automatically compute font coverage? ( https://bugzilla.wikimedia.org/show_bug.cgi?id=68922 )
- Better i18n of status messages. Use gettext-format messages and add a language specifier to the status check API? Or else pass unformatted messages back to Collection, and let the Collection extension localize and then format them.
- Refactor the widget library of the Collection extension to use OOui. Use a proper progress bar.
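
For the text writer task above (headlines kept as "== ... ==" wikitext in plaintext output), here is a minimal sketch of the idea. It is written in Python with the standard `html.parser` module purely for illustration; it is not the text writer's actual code, which lives in the npm-based OCG modules. The class name `HeadingMarkingTextWriter`, the helper `html_to_text`, and the heading-to-marker mapping are all hypothetical.

```python
from html.parser import HTMLParser


class HeadingMarkingTextWriter(HTMLParser):
    """Convert simple HTML to plaintext, keeping headings as wikitext markers."""

    # Hypothetical mapping (an assumption): <h2> becomes "== ... ==", and so on.
    HEADING_MARKERS = {"h1": "=", "h2": "==", "h3": "===", "h4": "===="}
    # Tags treated as paragraph-like blocks in this sketch.
    BLOCK_TAGS = {"p", "div", "li"}

    def __init__(self):
        super().__init__(convert_charrefs=True)
        self.lines = []      # finished output blocks
        self.buffer = []     # text fragments of the block being read
        self.heading = None  # tag name of the heading we are inside, if any

    def _flush(self):
        # Join buffered fragments and collapse runs of whitespace.
        text = " ".join("".join(self.buffer).split())
        self.buffer = []
        if not text:
            return
        if self.heading:
            marker = self.HEADING_MARKERS[self.heading]
            text = f"{marker} {text} {marker}"
        self.lines.append(text)

    def handle_starttag(self, tag, attrs):
        if tag in self.HEADING_MARKERS or tag in self.BLOCK_TAGS:
            self._flush()
            self.heading = tag if tag in self.HEADING_MARKERS else None

    def handle_endtag(self, tag):
        if tag in self.HEADING_MARKERS or tag in self.BLOCK_TAGS:
            self._flush()
            self.heading = None

    def handle_data(self, data):
        # Inline tags such as <b> are ignored; only their text is kept.
        self.buffer.append(data)


def html_to_text(html):
    """Return plaintext with one blank line between blocks."""
    writer = HeadingMarkingTextWriter()
    writer.feed(html)
    writer.close()
    writer._flush()  # emit any trailing text outside a block tag
    return "\n\n".join(writer.lines)


if __name__ == "__main__":
    print(html_to_text("<h2>History</h2><p>First <b>bold</b> paragraph.</p>"))
```

Keeping the marker table separate from the traversal means a different headline style (for example underlining with dashes) would only require changing how `_flush` decorates heading text.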