User:Kipcool/importing

From OmegaWiki

How does it work

The idea is to use TBX, an XML standard for sharing terminological data. The TBX and OmegaWiki data models are different but close. To share OmegaWiki data with other people, we will need OW <-> TBX conversion anyway, so to import the Wiktionaries we should first:

  1. create a TBX -> OmegaWiki import script.
  2. create a Wikts -> TBX conversion script. This is what the script below does.
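As an illustration of the target of step 2, here is a minimal Python sketch of what a single TBX entry could look like. It is not the Perl script itself. The element names (termEntry, langSet, tig, term, termNote, descrip) follow common TBX usage, while the function name make_term_entry, the id "c1" and the sample data are assumptions made up for this example.

```python
import xml.etree.ElementTree as ET

# Clark notation for the xml:lang attribute.
XML_LANG = "{http://www.w3.org/XML/1998/namespace}lang"


def make_term_entry(entry_id, definition, terms):
    """Build one <termEntry> (hypothetical helper, not part of wiktfr2TBX.pl).

    terms is a list of (language_code, term, part_of_speech) tuples.
    """
    entry = ET.Element("termEntry", {"id": entry_id})
    # The definition plays the role of the OmegaWiki definedMeaning.
    descrip = ET.SubElement(entry, "descrip", {"type": "definition"})
    descrip.text = definition
    for lang, term, pos in terms:
        lang_set = ET.SubElement(entry, "langSet", {XML_LANG: lang})
        tig = ET.SubElement(lang_set, "tig")
        ET.SubElement(tig, "term").text = term
        if pos:
            note = ET.SubElement(tig, "termNote", {"type": "partOfSpeech"})
            note.text = pos
    return entry


# Made-up sample data: one meaning with a French term and its translation.
entry = make_term_entry(
    "c1",
    "Small domesticated feline.",
    [("fr", "chat", "noun"), ("en", "cat", "noun")],
)
print(ET.tostring(entry, encoding="unicode"))
```

Grouping every language under the same termEntry mirrors the way OmegaWiki attaches expressions in several languages to one definedMeaning, which is why the two models are close.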

List

Importing fr:

  • done:
    • definitions = definedMeaning
    • language
    • title of the article
    • part of speech: noun, verb, adjective, ...
    • translations
      • matched with the corresponding meaning when in the form [[word]] (2)
      • the form '''the definition''' followed by a list of translations is still to be done
  • todo:
    • part of speech: phrases (noun phrase, verb phrase, ...)
    • synonyms
    • antonyms
    • examples
    • permalink
    • pronunciation
    • etymology
    • field/domain: {term|...}
    • images
    • see also (wikipedia...)
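The translation matching listed under "done" can be sketched as follows. This is a minimal Python sketch, not the Perl script: it assumes that a translation appears as a [[word]] link optionally followed by a meaning number in parentheses, and that definitions are numbered from 1 in the order they appear. The names LINK_RE, parse_translations and attach_to_meanings are hypothetical.

```python
import re

# A [[word]] link, optionally followed by a meaning number such as (2).
# Link targets with a |label or #section part are cut at that character.
LINK_RE = re.compile(r"\[\[([^\]|#]+)(?:[|#][^\]]*)?\]\]\s*(?:\((\d+)\))?")


def parse_translations(line):
    """Return (word, meaning_number or None) pairs found in one line."""
    pairs = []
    for match in LINK_RE.finditer(line):
        number = int(match.group(2)) if match.group(2) else None
        pairs.append((match.group(1).strip(), number))
    return pairs


def attach_to_meanings(definitions, translations):
    """Group translations by meaning number; keep the rest as unmatched."""
    by_meaning = {i: [] for i in range(1, len(definitions) + 1)}
    unmatched = []
    for word, number in translations:
        if number in by_meaning:
            by_meaning[number].append(word)
        else:
            unmatched.append(word)
    return by_meaning, unmatched


# Made-up sample line and definitions.
pairs = parse_translations("[[cat]] (1), [[tomcat]] (2), [[kitty]]")
by_meaning, unmatched = attach_to_meanings(["first sense", "second sense"], pairs)
print(by_meaning, unmatched)
```

Translations without a usable number are kept apart rather than guessed, so they can still be reviewed or imported later.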

Result

  • usage: "wiktfr2TBX.pl frtest.xml > TBXtest.xml"
  • input: frtest.xml, the first 5000 lines of the French Wiktionary dump
  • processing: the Perl script wiktfr2TBX.pl
  • output: a TBX file
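Because the test input is only the first lines of a dump, the file ends in the middle of a page and is not well-formed XML. The following Python sketch shows one way to read every complete page from such a file and stop quietly at the cut. It is not how wiktfr2TBX.pl works; the page/title/text layout is assumed from the MediaWiki export format, and iter_pages, local_name and the sample data (including the namespace version) are made up for this example.

```python
import io
import xml.etree.ElementTree as ET


def local_name(tag):
    """Drop the {namespace} prefix so any export schema version works."""
    return tag.rsplit("}", 1)[-1]


def iter_pages(source):
    """Yield (title, wikitext) for each complete <page> element.

    A truncated file raises ParseError once the parser reaches the cut;
    the pages completed before that point have already been yielded.
    """
    try:
        for _event, elem in ET.iterparse(source, events=("end",)):
            if local_name(elem.tag) != "page":
                continue
            title, text = None, ""
            for child in elem.iter():
                name = local_name(child.tag)
                if name == "title":
                    title = child.text
                elif name == "text":
                    text = child.text or ""
            yield title, text
            elem.clear()  # free memory used by the finished page
    except ET.ParseError:
        return


# Made-up sample: the second page is cut off, like a truncated dump.
SAMPLE = b"""<mediawiki xmlns="http://www.mediawiki.org/xml/export-0.3/">
  <page>
    <title>chat</title>
    <revision><text>== {{=fr=}} ==</text></revision>
  </page>
  <page>
    <title>chien</title>
    <revision><text>trunc"""

pages = list(iter_pages(io.BytesIO(SAMPLE)))
print(pages)
```

With a real file, the same function can be called on an open binary file object, for example open("frtest.xml", "rb").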