International Beer Parlour/Archive20110831

From OmegaWiki

New Link Attribute

Ortografix created a new link attribute which you can see in action here: DefinedMeaning:an_axe_to_grind_(689829) and DefinedMeaning:chicken_(5778). I don't know if the error message has anything to do with it. Do others see it as well? And what do you think about this new attribute? --Tosca 21:05, 6 November 2010 (UTC)

I see the error messages as well. They appeared when I tried to add a DM attribute on Syntrans level; that annotation has not been displayed and I cannot delete it. After I learned that it does not work I removed that class attribute from lexical item. The link attribute did not cause the error. --Ortografix 11:53, 7 November 2010 (UTC)
The question is, how do we get rid of them? --Tosca 17:22, 10 November 2010 (UTC)
The other question is whether we will get working DM attributes on Syntrans level some time. --Ortografix 15:20, 14 November 2010 (UTC)

I think the above mentioned attribute ("idiom") is necessary, because if someone hears or reads something like "Ich habe ein Hühnchen mit dir zu rupfen", he/she would not find that phrase but would rather look up Huhn, of which "Hühnchen" is a wordform, and find the related idiom. --Ortografix 15:20, 14 November 2010 (UTC)

"Usage" fields

Unfortunately, there are two fields with the same name, "usage": one in "Plain texts", the other in "Option values". The latter is a drop-down list, which is a good idea, except that I would add the options "formal" and "respectful". The plain text "usage" field would be very useful… if it were filled in… I already proposed that the "expression" page, like a one-language dictionary, show the annotations rather than hiding them (in "edit" mode, one has to click twice to access these fields) and hide the translations, while the definedMeaning page would show the translations and hide the annotations, as it does now. This is enhancement proposal 22328. The plain text "usage" field can be used for a wide range of purposes, such as saying that "chocolatine" is a word used in Southern France and not in the North, or explaining the grammatical use of an expression. I would split it into several fields: "geographical usage", "historical usage", "grammatical usage" etc. But things being what they are, I suggest a wider and more rigorous use of that field, in the hope that it will be usable by an open-source automatic translator one day. It would also help human translators. My proposal is that such a field begin with "History:", "Geography:", "Syntax:" etc., or their equivalent in the language concerned. As for syntax, variables would be written inside square brackets, numbered if necessary. Moreover, optional parts would be written inside parentheses, and if a part can usually be inflected when standing alone but cannot be in that particular expression, it would appear within braces. Examples, for the word "behove":

  • Syntax: It behoves [someone] to [do something].

"do something" being a pro-verb representing any infinitive clause. For the causative meaning of "make":

  • Syntax: [Some being] makes [some being 2] [do something].

For the word "suckle":

  • Syntax: [Some non human adult female mammal] suckles ([some baby mammal]).

(I remember Romulus and Remus, so I don't write "non human" for the baby.) Contrast this with breast-feed:

  • Syntax: [Some woman] breast-feeds ([some baby]).

For the expression "rain cats and dogs":

  • Syntax: "{It} rains {cats and dogs}."

to indicate it cannot rain a cat and a dog, nor can "they" rain cats and dogs, for example, but "rain" can be inflected. For the French "aller de l'avant":

  • Syntaxe: "[Quelqu'animal ou véhicule] va {de l'avant}."

In most dictionaries, "sth" is used when it can be anything, when it can be a person, an animal or a thing, and when it has to be a thing. I suggest distinguishing between "some thing" (written separately) for things, "some being" for beings, and "anything" if even facts or other notions are included. For instance, for "speak":

  • Syntax: [Someone 1] speaks (with [someone 2]) (about [anything]).
  • Syntax: [Someone 1] speaks (with [someone 2]) (of [anything]).

But for the literal meaning of "break":

  • Syntax: [Some being] breaks [some thing].

But I notice Cambridge dictionary uses "sb/sth" to make a distinction with "sth". For instance, they write "turn (sb) against sb/sth". In my proposal, expression parts and variables would be inflected as necessary:

  • Syntax: [Someone] regrets [doing something].

Of course, such syntactic patterns aim at being translated:

  • Syntax: [Some animated being] likes [anything].
  • Sintaxis: A [cualquier ser animado] [le] gusta [algo].

This leads to another problem. In Spanish the subject is usually before its verb, so in this last example, it's not obvious for a non-Spanish speaker that the subject of "gusta" is "algo", and that "gustar" should be inflected according to that word. It's not obvious either that "le", with no equivalent in the English expression, refers to "cualquier ser animado" and should be inflected accordingly. So we also need a standard way to note the functions of the variables related to an expression. Let's keep thinking.

I hope we will be able one day to use the wiki conventions (''italic'' for italic, etc.) in the usage field, so that we could replace the square brackets above by italics, as other dictionaries do, which is lighter than brackets. The braces could also be replaced by, say, bold. What do you think of such rules? --Fiable.biz 16:12, 9 November 2010 (UTC)
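
To illustrate how the bracket notation proposed above could be read by a program, here is a minimal sketch. It assumes exactly the three conventions from the proposal (square brackets for variables, parentheses for optional parts, braces for parts that cannot be inflected); the function names and token labels are hypothetical and not part of OmegaWiki.

```python
import re

# Hypothetical tokenizer for the notation proposed above:
#   [variable]   (optional part)   {invariable part}
TOKEN = re.compile(r"\[([^\]]*)\]|\{([^}]*)\}|([()])|([^\[\]{}()]+)")

def tokenize(pattern):
    """Split a usage pattern into (kind, text) pairs."""
    tokens = []
    for var, fixed, paren, text in TOKEN.findall(pattern):
        if var:
            tokens.append(("variable", var))
        elif fixed:
            tokens.append(("invariable", fixed))
        elif paren:
            kind = "optional_start" if paren == "(" else "optional_end"
            tokens.append((kind, paren))
        elif text.strip():
            # Plain words of the expression, which may still be inflected.
            tokens.append(("literal", text.strip()))
    return tokens

def variables(pattern):
    """List the variable names of a pattern, in order of appearance."""
    return [text for kind, text in tokenize(pattern) if kind == "variable"]
```

For example, `variables("[Someone 1] speaks (with [someone 2]) (about [anything]).")` lists the three slots of the "speak" pattern, which is the kind of information a translation tool would need in order to align the English and Spanish patterns.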

Using the usage field to explain how the word is used: Great. But those grammar rules are way too complicated and I don't think that the average editor is interested in that. An automatic translation software that uses OmegaWiki is still too far away for this to be useful. --Tosca 17:26, 10 November 2010 (UTC)
In Mongolia, all junior high school pupils learn a slightly more complicated system where subject, object, adverbials etc. are underlined (if short) by different standardised kinds of lines (simple, double, curly, with crosses…) or, if longer, put inside corresponding types of square brackets (simple, double, with curly vertical bars, with a cross…) to indicate their grammatical function. The main problem is that most of these symbols are not in Unicode. But we could choose a few brackets among the many which already exist in Unicode to do the same thing. I don't see why what is learnt by all Mongolian children would be "way too complicated" for most Omegawiki editors. For instance, wiki markup language and Wikipedia's conventions are far more complicated than what I propose, yet widely used. Moreover, editors not interested in grammar are free to leave this to others, as they already do. Apart from automatic translation, this is very useful when the grammatical construction differs from language to language, especially for verbs with multiple complements. And your comment in the section below just proves that it is nearly impossible to define some words without their syntax. The fact is that Omegawiki provides much less grammatical information than many dictionaries, even paper ones. The tendency is to give more and more. For instance the paper "Clé" French dictionary for foreigners indicates for each adjective whether it is placed before or after the noun. Most dictionaries provide the inflections of irregular words. If we don't improve the grammar aspect, of course Omegawiki will never be usable for automatic translation, nor even for human translation. Personally, I seldom use it when I translate. I'm open to other proposals. --Fiable.biz 09:19, 11 November 2010 (UTC)
I didn't say that Omegawiki-Editors are stupid. ;-) Of course they could learn the system - I just don't see why a significant number of editors would do it. You're right that grammatical explanations are needed. But why not just use the "Usage" annotation to explain them? And the example sentences show the structure as well. --Tosca 15:44, 11 November 2010 (UTC)
For the 2 above mentioned reasons: 1) I do think that once such rules are set, it would be simpler for a non-fluent Spanish speaker to understand "A [cualquier ser animado] [le] gusta [algo].", especially if compared with the syntax of his own language (let's say: "[Any animated being] likes [anything]."), than any explanation in plain Spanish. But you can try here to explain this syntax in plain Spanish words in order to prove I'm wrong. 2) Clearly a translation machine will not understand plain Spanish unless... it is already so good that it doesn't need the piece of information concerned.
By the way, there is another question: is it better to explain the usage in the concerned language, or in English? --Fiable.biz 14:04, 1 December 2010 (UTC)

Compound expressions

For the dictionary entry itself,

  • if the expression is non separable and the main form of the main word is its simple form, there is no problem. For example: "take off" (for planes).
  • If the expression is non separable but the main word is not its simple form, what to do? Should we record "It rains cats and dogs." as such, or as "rain cats and dogs", or as "rain"?
  • If the expression is separated but the main element cannot be used alone, we should probably follow the tradition of recording "behove" rather than "it behoves… to", which is the real lexical item (expression memorised by speakers), but more difficult to find, especially until Omegawiki has an advanced search system, accepting regular expressions (i.e. with wildcard characters *, ? etc.) for instance.
  • In the case the expression is separated and the main element alone has another definedMeaning, I'm hesitating. Should we record "turn (sb) against sb/sth" as "turn… against", as "turn against" or as "turn"? When the expression has more than 2 elements, the notion of "separable"/"non separable" is not enough: the reader needs to know what can be separated from what, whether it can be separated or has to be separated, and what should be in between. I've just entered "charge à… de" ("charge à [agent] de [faire quelque chose]." meaning "being incumbent on [some agent] to [do something]"). But maybe I should have entered it as "charge" or as "charge à", or as "charge à de"?
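
The search problem raised in the list above can be made concrete with a small sketch. It uses shell-style wildcards (* and ?) from Python's standard `fnmatch` module rather than full regular expressions, and the headword list is a hypothetical sample built from the expressions discussed here, not OmegaWiki's actual data or search code.

```python
import fnmatch

# Hypothetical headword list, using expressions mentioned in this discussion.
HEADWORDS = [
    "behove",
    "it behoves… to",
    "turn… against",
    "rain cats and dogs",
    "take off",
    "charge à… de",
]

def wildcard_search(query, headwords=HEADWORDS):
    """Return the headwords matching a shell-style pattern, ignoring case."""
    pattern = query.lower()
    return [h for h in headwords if fnmatch.fnmatchcase(h.lower(), pattern)]
```

With such a search, `wildcard_search("*behove*")` would find the entry whether it was recorded as "behove" or as "it behoves… to", which is the situation where the choice of headword form matters less.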

Hopefully we will one day be able to transform the different elements of a compound expression into hyperlinks to the corresponding definedMeanings they are derived from, so that clicking on the "rains" of "It rains cats and dogs" will lead to the verb "rain", and clicking on "cats" will lead to the noun "cat". More useful will be the fact that in the French "aller de l'avant", "aller" and "avant" will lead to the right meanings among the many meanings of these French words. This would also lead to the inflexion group of the given element once this feature has been implemented.

Note that the online Cambridge Advanced Learner's Dictionary follows the tradition of recording "behove" rather than "It behoves… to", but they listed "turn (sb) against sb/sth" as such, as a separate entry, with the variables in it italicised. In other words, the authors cleverly put in the entry itself the grammatical information that, in order not to perturb you too much, I'm just proposing to enter in the "usage" field.  ;-) But since they have a search engine, this entry is found when one searches "turn". However, we need rules and, in order to have rules applied, we'll need hyperlinks to them from the "edit" page. --Fiable.biz 16:12, 9 November 2010 (UTC)

About DefinedMeaning:charge_(1268323): I agree that we should enter the expression like this, with three little dots. What about the spaces though? xxx...xxx or xxx... xxx ? I think we should leave out the spaces, but that's a minor point. However, the definition: I'm a good French speaker but I didn't know this expression and I have no idea what your definition is supposed to mean. Explaining the grammar in the usage field is a good idea. The explanation should be easily understandable for all readers though, not just linguists. --Tosca 17:19, 10 November 2010 (UTC)
For the ellipsis, I just followed the usual typographic rules: it's followed by a space. For the definition you mention, I'm not a linguist either. None of the expressions I used in the "usage" field are technical, except maybe "Proposition principale", which can be translated into English word by word: "main clause", and is a grammatical expression taught in all French junior high schools. The equivalent is also taught in Mongolian junior high schools. Admittedly, the acception of agent (in English "agent") is also grammatical, but it's also taught in all junior high schools and very understandable from its etymology: "ago, agere, egi, actum" in Latin → "agir" in French and Portuguese, "agire" in Italian, "to act" in English, "actuar" in Spanish, since English and Spanish peoples are not cultivated enough to use properly the Latin past participle and use it as an indicative.  ;-) . I think we can accept in Omegawiki things taught to all high schools pupils. As for the definition itself, "having, as for me/you/him/us…, the task of" is an attempt to say that the meaning is "having, as for me, the task of", OR "having, as for you, the task of" OR "having, as for him, the task of" OR any such meaning with any other personal pronoun in place of "me", "you", "him". Again, if we had rules to express this kind of things unequivocally, it would be better. For instance: "having, as for [some agent], the task of". --Fiable.biz 14:26, 1 December 2010 (UTC)

Special:Random is slow

It takes a couple of minutes for the Special:Random redirect to display a page. Could there be a missing index on some database table? Can someone profile this? --Pmj 03:20, 22 November 2010 (UTC)

It is known that the Special Random page is not working great. That is why I removed it from the left panel... I am not sure why it is so slow though. My idea was to reimplement a new random function using the random_page_hook_something with a home made sql query, but this is not on top of my priority list (also, the hook is not yet available in OW because we use an older version of MediaWiki). --Kip 15:33, 28 November 2010 (UTC)
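
As a rough illustration of what a "home made" random query could look like, here is a sketch of one common technique: store a random key per row, index it, and fetch the first row at or above a freshly drawn number. The table and column names (`entry`, `rand_key`) are hypothetical, the demo uses an in-memory SQLite database rather than the OmegaWiki schema, and nothing here is claimed to be the cause of or the fix for the slowness reported above.

```python
import random
import sqlite3

def make_demo_db(titles, seed=0):
    """Build an in-memory table with an indexed random key per row (hypothetical schema)."""
    rng = random.Random(seed)
    db = sqlite3.connect(":memory:")
    db.execute("CREATE TABLE entry (title TEXT PRIMARY KEY, rand_key REAL NOT NULL)")
    db.execute("CREATE INDEX entry_rand ON entry (rand_key)")
    db.executemany(
        "INSERT INTO entry VALUES (?, ?)", [(t, rng.random()) for t in titles]
    )
    return db

def random_title(db, rng=random):
    """Pick a row through the index instead of sorting the whole table randomly."""
    r = rng.random()
    row = db.execute(
        "SELECT title FROM entry WHERE rand_key >= ? ORDER BY rand_key LIMIT 1", (r,)
    ).fetchone()
    if row is None:
        # r was above every stored key: wrap around to the smallest key.
        row = db.execute("SELECT title FROM entry ORDER BY rand_key LIMIT 1").fetchone()
    return row[0] if row else None
```

The point of the design is that each lookup touches only the index, so its cost should not grow with the number of pages the way an `ORDER BY RANDOM()`-style query does.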

Software change: searching for languages

Today I changed the combobox that searches for languages. When you type something, it no longer looks only for language names starting with this string, but for any language name containing this string. This allows for example:

  • when you type "French", it also finds "Middle French" and "Old French"
  • therefore instead of having language names like "Mandarin (simplified)", we can simply call them "simplified Mandarin". --Kip 15:37, 28 November 2010 (UTC)
No, in the syntrans tables "Mandarin (simplified)" should be found close to "Mandarin (traditional)", independently from the user language. --Ortografix 15:53, 28 November 2010 (UTC)
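
The difference between the old and the new combobox behaviour described above can be sketched as follows. The language list is a small hypothetical sample and the case-insensitive comparison is an assumption; this is not the actual OmegaWiki code.

```python
# Hypothetical sample of language names taken from the discussion above.
LANGUAGES = [
    "French",
    "Middle French",
    "Old French",
    "Mandarin (simplified)",
    "Mandarin (traditional)",
]

def search_prefix(text, names=LANGUAGES):
    """Old behaviour: only names that start with the typed string."""
    t = text.casefold()
    return [n for n in names if n.casefold().startswith(t)]

def search_contains(text, names=LANGUAGES):
    """New behaviour: any name that contains the typed string."""
    t = text.casefold()
    return [n for n in names if t in n.casefold()]
```

Note that this only concerns what the search box finds; it says nothing about how names are sorted in the syntrans tables, which is the concern raised in the reply above.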

Other changes

  • It is no longer possible to delete an option that is still in use (for example to accidentally remove the part of speech "noun" in DefinedMeaning:lexical_item_(402295), etc.). In such a case, a small message appears on top: "Option XXX cannot be removed because it is still in use". If the option is to be removed anyway, a developer can do it with an SQL query, or just identify the DMs using the option.
  • I removed .../util/stats.php because we have had a special page that is better for quite some time now.
  • The icon Delete.png is back. --Kip 16:36, 28 November 2010 (UTC)
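
The "still in use" check from the first item above could look roughly like this. It is a sketch under assumptions: the data layout (a set of options and a mapping from DefinedMeaning ids to the options they use) and all names are hypothetical; only the error message wording comes from the description above.

```python
class OptionInUseError(Exception):
    """Raised when an option is still referenced; `users` lists the DM ids using it."""

    def __init__(self, option, users):
        super().__init__(
            f"Option {option} cannot be removed because it is still in use"
        )
        self.users = users

def remove_option(options, usages, option):
    """Remove `option` from the set `options` unless an entry in `usages` refers to it.

    `usages` maps a DefinedMeaning id to the set of options it uses
    (hypothetical layout, for illustration only).
    """
    users = sorted(dm for dm, used in usages.items() if option in used)
    if users:
        raise OptionInUseError(option, users)
    options.discard(option)
```

Carrying the list of users in the exception corresponds to the last remark of that item: it identifies the DMs that would have to be changed before the option can go.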

Parts of speech

Please see my comment at DefinedMeaning_talk:single_(371826). How are parts of speech handled on OW? ~~helix84 12:42, 29 November 2010 (UTC)

Hi, I answered. Each part of speech deserves its own definition, and the part of speech can be indicated in the annotation field. --Kip 21:57, 2 December 2010 (UTC)

What about a Special:Grammar page?

Hi, the more I think about it, the less I like it that we have to modify DefinedMeaning:lexical_item_(402295) to change basic attributes that affect all expressions, and that anyone can edit it, and that it is so ugly to edit. I am thinking in particular about the part of speech, gender and all these language dependent options that are there.

So, I was thinking of creating a special page that would be specially designed to add grammatical functionalities (that are language dependent). I am considering a system where you first select a language, and then you can add grammatical attributes to that language, and easily create a hierarchy of attributes for example with drag and drop (if I am successful with Javascript). Inflections, that we still don't have... will be defined with the same or a similar page.

But access to this special page would be limited to a trusted group of people (this is debatable independently of the above).

The next step is that I create a prototype on my local installation, and submit it here for further comments. What do you think? --Kip 18:55, 16 December 2010 (UTC)

I'm not sure at all I understand the idea correctly. Possibly you didn't provide us with the right hyperlink. It's very important that the grammatical terms are treated like the other terms, I mean that they have a definedMeaning, translations, annotations etc. For instance "dual" is a grammatical notion applying to classical ancient Greek, but the word exists in English, in French ("duel") etc. There is no difference between the Spanish grammatical notion of "plural", the French one, and the English one, so there is no point in defining it many times. But there is a difference between the classical Greek grammatical notion of "plural" and the English, French and Spanish one because, in classical Greek, "plural" means, in many circumstances, "at least 3". The declension problem is more tricky, because the case usage often differs between 2 different languages. For example, even if it's classical to use the same word "dative" for Latin, Greek, Mongolian and other languages, in Greek and in Mongolian, this case is also used as a locative (to indicate a place), but not in Latin. In Greek and Latin, it's used with some prepositions, but in Mongolian there are no prepositions and the dative is never used with a postposition (playing the same role as Latin and Greek prepositions) since there is a special "postposition" case. Therefore we might need 3 different definedMeanings: "Latin dative", "Greek dative", "Mongolian dative". I think that, for declensions as for conjugations, we need to distinguish between the definedMeaning (DM) of the form and the one of the meaning. For instance, we would have the DMs "dative meaning", "locative meaning" etc., and the DMs "dative form", "locative form" etc. If needed, the information that the locative meaning is expressed in Greek thanks to the dative form or that, in French, the unreal condition meaning is expressed thanks to the imperfect TENSE form ("Si j'étais Dieu...": "If I were God..."), and not by a mood, could be provided somewhere.
This matters not only because of inflexions, but also because of defective words. For instance "can" doesn't even have an infinitive.
The difference between grammatical expressions and non-grammatical expressions is that it is recorded somewhere (I don't know where) what notions apply to what languages. Instead of restricting the write access to that information, I would rather widen it. For instance I would like to write the ancient Greek genders and numbers but I don't know how to do that except by... asking Kipcool. But since I'd like you to implement the inflection group records as a priority, I have restrained myself up to now. The risk with special pages is that it's yet something else to learn to contribute to OmegaWiki. So I think a special page could be a good idea to attribute some attributes to languages, probably not to define them.
A related proposal I have already made is to provide for an easier access to each expression's annotation. --Fiable.biz 04:58, 26 December 2010 (UTC)
Hmm it seems I was completely misunderstood ;-). So, forget it for the moment until I come up with something concrete to show. --Kip 12:15, 27 December 2010 (UTC)

Readings

Regarding "readings", I don't understand your problem, because we already have an IPA field, and the International Phonetic Alphabet has the huge advantage of being a world-famous language-independent standard, including tones. Many people can read it to a certain extent, though writing it exactly is a rarer skill. To my mind, we should add a list of the IPA characters needed for each language (if we put them all together, people will get confused). --Fiable.biz 04:08, 14 February 2011 (UTC)

Readings in Japanese are not just IPA, they are something specific to Japanese ( http://en.wikipedia.org/wiki/Kanji#Readings ). However, I think that we can add them easily with the current system of annotations (though I would need instructions from someone who knows Japanese). Jim: can you point me to an online dictionary that shows readings the way you'd like to have them, or explain to me what it should look like? Thanks. --Kip 09:32, 14 February 2011 (UTC)
There could be a small conversion tool that translates from Kanji, Zhuyin, and Pinyin syllables to IPA if entered in the IPA-field. An IPA-to-Speech feature might also be nice.
See also: Chinese Syllables
MovGP0 20:32, 15 February 2011 (UTC)
It seems reasonable that there might be an unambiguous mapping between Kana <-> IPA, or Pinyin <-> IPA. However, it also seems reasonable that there might not be. Kana readings for Japanese, and Pinyin for Mandarin, and IPA, all have different purposes and different symbol sets. There's no guarantee that mappings between them are unambiguous. I do expect that linguists have looked at this question before, so probably the literature contains an answer. It would be useful for OmegaWiki to search out such mappings and publish them here. If they exist, they would be quite useful. — JimDeLaHunt 21:30, 17 February 2011 (UTC)
Readings are a well-established convention for Japanese and Japanese-foreign language dictionaries. The Wikipedia article Kanji#Readings above explains this. If OmegaWiki wants to be taken seriously as a dictionary which includes Japanese, it must provide them and accept them, in the form which the Japanese user community expects. Similarly for Pinyin and other readings. If a mapping exists between such readings and IPA (and that's a big if), perhaps OmegaWiki could present Japanese or Mandarin readings in the customary form, and map to IPA for storage in the database. — JimDeLaHunt 21:30, 17 February 2011 (UTC)
Kip points out a January 2010 blog post "Language specific annotations". I had not seen that when it came out. I tried entering readings for Japanese Expression:日. I think what we have is a good start, but not ready for Japanese-language usage yet. My observations below. — JimDeLaHunt 21:30, 17 February 2011 (UTC)
  • Annotations not visible in Expression:日, where I would expect a Japanese reader to look first. I had to go to DefinedMeaning:day_(5533) and add the annotation under the Japanese translation.
  • I don't understand the data model: what do these annotations modify, how do they relate to Expressions and DefinedMeanings? Where is this documented?
  • The annotations should be named "On-Reading" and "Kun-Reading" rather than "Katakana" and "Hiragana".
  • Readings sometimes indicate a suffix, such as -ka . OmegaWiki should define a convention for that.
  • Readings fields should only allow a restricted character set, suitable for the intended purpose.
  • A feature like Readings will require more design than this Beer Parlour conversation to become expressive, powerful, and easy to use.
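
The "restricted character set" item in the list above can be sketched with Unicode code point ranges. The sketch assumes that on-readings are written in katakana and kun-readings in hiragana, and that a leading or trailing ASCII hyphen marks an affix such as -ka; these conventions and the function names are assumptions for illustration, not a decided OmegaWiki rule. It does not handle other dictionary conventions such as a dot separating okurigana.

```python
def is_hiragana(ch):
    """True for hiragana letters, U+3041 (ぁ) to U+3096 (ゖ)."""
    return 0x3041 <= ord(ch) <= 0x3096

def is_katakana(ch):
    """True for katakana letters, U+30A1 (ァ) to U+30FA (ヺ), or the long-vowel mark ー."""
    return 0x30A1 <= ord(ch) <= 0x30FA or ch == "ー"

def valid_reading(text, kind):
    """Check a reading field; kind is 'on' (katakana) or 'kun' (hiragana)."""
    check = {"on": is_katakana, "kun": is_hiragana}[kind]
    # Assumed convention: a hyphen at either end marks a prefix or suffix reading.
    core = text.strip("-")
    return bool(core) and all(check(ch) for ch in core)
```

A field validated this way would accept "ニチ" as an on-reading and "-か" as a kun-reading, and reject romanised input such as "nichi".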
Annotations not visible in Expression:日 <= I will do something about this.
I don't understand the data model <= These annotations apply to a Syntrans, i.e. a pair (Expression, DefinedMeaning). This means that you can assign different annotations to different meanings of the same expression. We have Help:Annotation. If you don't find what you are looking for on the help page, I can complete it (if I know what is missing).
I gave some feedback on Help:Annotation in Help talk:Annotation. Also, isn't it interesting that you used the term "Syntrans" to explain Annotations, but there are no articles Syntrans or Help:Syntrans to go with Help:DefinedMeaning and Help:Expression. — JimDeLaHunt 06:24, 21 February 2011 (UTC)
The annotations should be named "On-Reading" and "Kun-Reading" <= if they are really the same, yes (will do some research to check)
suffix convention <= I cannot help here, I need to learn more Japanese first... --Kip 22:50, 17 February 2011 (UTC)
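
The data model explained in this thread, where annotations are attached to a Syntrans (an Expression paired with a DefinedMeaning), can be sketched as follows. The class and method names are hypothetical and this is not the OmegaWiki database schema; it only shows why the same expression can carry different annotations for different meanings.

```python
from dataclasses import dataclass, field

@dataclass(frozen=True)
class Syntrans:
    """A pair (Expression, DefinedMeaning), as described above."""
    expression: str
    defined_meaning_id: int

@dataclass
class AnnotationStore:
    # Annotations hang off the pair, not off the expression alone.
    _data: dict = field(default_factory=dict)

    def annotate(self, syntrans, name, value):
        """Attach or overwrite one named annotation on a Syntrans."""
        self._data.setdefault(syntrans, {})[name] = value

    def annotations(self, syntrans):
        """Return a copy of the annotations of one Syntrans (empty if none)."""
        return dict(self._data.get(syntrans, {}))
```

For instance, a reading stored on `Syntrans("日", 5533)` (the "day" meaning mentioned above) would not appear on the same expression paired with another DefinedMeaning, which is also why an Expression page has to collect annotations from several Syntrans to display them.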

Anyone can edit

Hi! As an experiment, I have changed the edit rules so that anonymous users can also edit OmegaWiki. Since we do not have rollback functionality, anonymous users can add new data, but not modify existing data. For that, they still need to be "approved" by a bureaucrat.

In case of a problem or bug (I had to change several things in the software), please notify me, so that I can correct it or revert to the previous state where no anonymous users were allowed. --Kip 20:39, 15 February 2011 (UTC)

Images to illustrate the definitions

It is now possible to display images from Commons along with the definitions. Check out Expression:mouse. More on that later, I am tired now. --Kip 20:11, 21 February 2011 (UTC)

I have written an explanation page at Help:Image from Commons. Corrections or comments are welcome.
I have also written Help:Commons category --Kip 19:08, 22 February 2011 (UTC)
The images are not always displayed, e.g. on the page Expression:suit no image is displayed, but on Expression:enseigne and other translations it is. If I click on "purge" the images are also gone.
It would be nice if we had to enter only the filename, for example "2-buttons_mouse.jpg" instead of the complete path.
--Ortografix 17:31, 23 February 2011 (UTC)
The problem with "purge" is solved. The other problem... still needs some work. --Kip 18:32, 23 February 2011 (UTC)
I considered also entering only the filename, but the field says "URL", so it would be a bit misleading. --Kip 18:34, 23 February 2011 (UTC)
First bug solved as well. Thank you for finding so many bugs ;-) --Kip 20:34, 23 February 2011 (UTC)
Thank you very much for this very important improvement to OmegaWiki. --Fiable.biz 07:50, 13 March 2011 (UTC)

Extension

I've just installed http://www.mediawiki.org/wiki/Extension:LanguageSelector , that automatically selects an appropriate language for the interface of anonymous users, based on their IP. If you notice strange/unwanted behaviour, please notify me.

There is also the possibility to add a drop-down box to allow for the selection of the interface language, but I don't know if this is an interesting feature (since we can set the language in the preferences already) and I don't know where to put it so that it is not ugly (I tried in the toolbox, but it does not work). --Kip 18:12, 23 February 2011 (UTC)

I also installed http://www.mediawiki.org/wiki/Extension:Polyglot to replace what I had done to automatically view a page in your language if available (for example the main page). It works better in that when you are redirected, it writes a "redirect from" which allows you to see the original English version. Also, if you visit e.g. "Meta:Main Page/" , with a "/" at the end, you'll be taken to the English version instead of the translated version in your language. --Kip 18:51, 23 February 2011 (UTC)

In fact, Polyglot also adds the recognized translations on the left, like interwiki links. So, we can eventually get rid of the ugly list of translated pages at the top of each page (template {Translation} or something). --Kip 21:12, 28 February 2011 (UTC)
Not working for me: I see the pages in English with my Portuguese IP address. Malafaya 11:38, 15 March 2011 (UTC)
Could you check that your browser is not setting its own language preference? If using Firefox, it's in Tools=>Options=>Set your preferred language. It seems that the first language of that list overrides the determination of language according to IP address. --Kip 12:39, 15 March 2011 (UTC)
Indeed it does. Not a very useful extension (based on IP) in that case, as it's normal for the browser to have something there. Malafaya 17:42, 16 March 2011 (UTC)
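
The behaviour observed in this exchange, where the browser's language preference takes precedence over the language guessed from the IP address, can be sketched like this. It is an illustration of that priority order only, not the LanguageSelector extension's code; the function names are hypothetical and the Accept-Language parsing is simplified (for example, it does not treat q=0 as "not acceptable").

```python
def parse_accept_language(header):
    """Return the language tags of an Accept-Language header, best first."""
    prefs = []
    for position, part in enumerate(header.split(",")):
        piece = part.strip()
        if not piece:
            continue
        tag, _, params = piece.partition(";")
        quality = 1.0
        params = params.strip()
        if params.startswith("q="):
            try:
                quality = float(params[2:])
            except ValueError:
                quality = 0.0
        # Sort by descending quality, then by original position.
        prefs.append((-quality, position, tag.strip().lower()))
    return [tag for _, _, tag in sorted(prefs)]

def pick_interface_language(accept_language, ip_language, supported, default="en"):
    """Browser preference first, then the IP-based guess, then the default."""
    for tag in parse_accept_language(accept_language or ""):
        if tag in supported:
            return tag
        base = tag.split("-")[0]
        if base in supported:
            return base
    if ip_language in supported:
        return ip_language
    return default
```

Under this order, a browser sending "en-US,en;q=0.9" from a Portuguese IP address gets English, which matches what is reported above; the IP guess only matters when the browser sends no usable preference.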

Inflexions, a question

I think I have (finally) found an SQL implementation of the inflexions that should work (cf. User:Kipcool/Inflexions#SQL_implementation, but these are personal notes, I don't know if they are easy to understand as they are).

Basically, the idea is based on the assumption that the set of inflexions for a given noun/verb/adjective/... is always either:

  1. one value (French noun => 1 plural)
  2. a vector (e.g. English adjectives => comparatives + superlatives)
  3. a two-dimensional table (e.g. French adjectives, German nouns)
  4. a set of two-dimensional tables (e.g. French conjugations)

And since "a set of 2D tables" contains all other possibilities, it is sufficient to use that in the sql database.

As far as I know, it should work for English, French, German, Chinese+Japanese (no inflexions :) ), Spanish and Latin. The question is: do you know any language that does not fall within these four possibilities, or will that do?

Thanks. --Kip 19:52, 27 February 2011 (UTC)
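
The claim above that "a set of 2D tables" contains the three other cases can be illustrated with a small sketch that normalises each case into the same nested shape. This is plain Python with hypothetical names, not the SQL implementation referred to in the notes, and it assumes non-empty inputs.

```python
def as_table_set(value):
    """Normalise one form, a vector, a 2D table or a set of tables into
    {table_name: {row_label: {column_label: form}}}.
    Missing levels get the empty string as their label."""
    if isinstance(value, str):                  # case 1: one value
        return {"": {"": {"": value}}}
    first = next(iter(value.values()))
    if isinstance(first, str):                  # case 2: a vector
        return {"": {"": dict(value)}}
    second = next(iter(first.values()))
    if isinstance(second, str):                 # case 3: one 2D table
        return {"": {row: dict(cols) for row, cols in value.items()}}
    return {                                    # case 4: a set of 2D tables
        name: {row: dict(cols) for row, cols in table.items()}
        for name, table in value.items()
    }
```

For example, the French plural "chevaux" becomes a set containing one table with one row and one column, while a conjugation keeps one table per tense; code that reads the normalised form never needs to know which of the four cases it started from.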

Unfortunately I don't know German, but such an approach would lead to an unmanageably huge number of tables that nobody will ever fill in by hand in Mongolian and, generally speaking, agglutinative languages. See International_Linguists_Beer_Parlour/Inflexions#Too_many_inflectional_forms. Even in French, there is a problem: French verbs have a conjugation, but also number and gender forms for past participles. French or Spanish verbs inflect according to 8 criteria, not just 2 or 3: voice, mood, aspect (simple, continuous, accomplished, such as past simple, past perfect and imperfect), "form" (just for the 2 conditional past tenses, of identical meaning), tense, number, person and, for the past participle, gender. Even if this can be represented as 2D tables, logically, in the database, the multidimensional aspect should be implemented, for future use. We should notably think of an inter-language correspondence between inflectional forms, even if we don't implement it in the near future. --Fiable.biz 07:59, 13 March 2011 (UTC)
For French, it works to record a table like this. I am not sure if we need to record more forms.
For Mongolian, maybe we need another system. How are the inflexions represented in Grammar books in Mongolian? Are there conjugation tables like in French? --Kip 11:30, 13 March 2011 (UTC)
I can also think of a system to encode the 8 or so dimensions of inflexions, but there is then the issue of how to display them on the screen: the software needs to know how to represent it on a 2D display in a way that looks nice to the user, i.e. what is a column and what is a row. Even though this has nothing to do with the inflexion form itself, it needs to be stored somewhere in the database as well... need to think harder ;-) --Kip 12:24, 13 March 2011 (UTC)
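
One possible way to separate the two concerns mentioned above, storing forms by all their features and separately deciding what is a row and what is a column, is sketched below. All names are hypothetical; the choice of row and column features is passed in as display metadata, which is the part that would have to be stored per language somewhere.

```python
def project(forms, features, row_features, column_features):
    """Lay out feature-tagged forms as a 2D grid.

    forms: {tuple of feature values: form}, values ordered like `features`.
    Returns {row_key: {column_key: form}}, each key being a tuple of values.
    Features named in neither list are ignored, so forms that differ only
    in those features overwrite each other.
    """
    index = {name: i for i, name in enumerate(features)}
    grid = {}
    for values, form in forms.items():
        row = tuple(values[index[f]] for f in row_features)
        col = tuple(values[index[f]] for f in column_features)
        grid.setdefault(row, {})[col] = form
    return grid
```

With features ("tense", "number", "person"), choosing rows = ("number", "person") and columns = ("tense",) gives the familiar conjugation-table layout, and a different choice gives a different layout without touching the stored forms.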
The link you provide is OK, but there are neither the passive voice nor the inflexions of the past participle. Since the French passive form depends on the gender, the verbal inflexions are more than 3 times as numerous as shown. In Mongolian grammar books, there are no tables of combinations of suffixes. Each kind of suffix is explained: voice, aspect, finite mood + tense and non-finite mood + tense, case etc., the list of possible suffixes in the category is given (the longest list is the one of mood-tenses) and it's also explained how to combine the suffixes. This short presentation says there are more than 200 inflectional suffixes… combining together. Now "хамтралжуулагдсанаараа" has 5 inflectional suffixes (2 voices, a mood-tense, a case, the reflexive). "явуулагдсанаас" has 4 (2 voices, a mood-tense, a case). In the Mongolian interface of this very page of OmegaWiki, I see "хадгалагдаагүй" with 3 suffixes (1 voice, 1 mood-tense, the negation), meaning "has not been saved". Note that the French "avaient été mangées" ("had been eaten", in the feminine) also has several grammatical elements, not all independent from each other and which can be grouped into 5 independent grammatical facts: "av-" + "-é" of "été" (accomplished, therefore past), "-ai-" (indicative past, thus past of the past), "-ent" + "-s" (plural 3rd person), "ét-" + "-é" of "mangé" (passive), "-e" (feminine), but it happens that the number of possible combinations is much larger in Mongolian. You might also be interested in this presentation of an automatic analyser. There are many rules in Mongolian, but far fewer exceptions than in French. In order to inflect a noun, the program has to know the radical (i.e. the singular nominative), whether or not there is a so-called "secret n", whether or not there is a so-called "short genitive", whether or not there is a collective genitive, whether the noun means "the inhabitant of" any place, or stands for a person who is part of an organised corporation, stands for any other person or for a thing, and whether or not the noun is part of the small class of nouns with a plural in "-s". For a verb: the radical (which cannot always be deduced from the infinitive), whether or not it's part of a small class of verbs whose causative voice has a long vowel and whether or not the causative form is used with a passive meaning (in which case the passive form is unused). A noun's inflections can be produced from 5 forms (radical, genitive singular, collective genitive singular, short genitive, nominative plural), and a verb's from 3 (radical, causative, passive), if I don't forget anything. I think we can record the 1st and maybe 2nd level of derivations, and produce the other ones on the fly, on demand.
Please note also that, to describe the language properly, it would be good to write somewhere that in "avaient été mangées", the negation must be placed around the first element: "n'avaient pas été mangées" and not "n'avaient été pas mangées" or "n'avaient été mangées pas"; and that adverbs can be placed after any element: "avaient vite été mangées", "avaient été vite mangées", "avaient été mangées vite" are all acceptable, though long adverbs and adverbials tend to be placed after the 2nd or 3rd element ("avaient été mangées en toute illégalité"), while short ones tend to be placed after the 1st or 2nd element. But you may argue that this is grammar, not lexicography. I just mention it because the present "separable/unseparable" field is too simple. In fact, producing inflexions belongs to grammar, but membership of an inflexion group belongs to lexicography. --Fiable.biz 16:16, 16 March 2011 (UTC)

Changes of today

Today I changed:

  • that if you type http://omegawiki.org/ or http://whatever.omegawiki.org/ , you will be redirected to http://www.omegawiki.org/ . I did that because I noticed that Google considered omegawiki.org and www.omegawiki.org as two different sites and was indexing both separately
  • the generation of the meta tag description for each Expression page. It used to be a generic "Omegawiki, the free, multilingual dictionary, ....", and in most cases, this is what is shown by Google in the results (see e.g. this Google query). I changed it to the first definition of each page, based on what the dictionary http://dictionary.reference.com/ is doing.
  • I removed the display of the "dataset panel" on the right. I think that nobody uses it, and it is only confusing for new users. Furthermore, it does not look good with images. I plan to reintroduce it later on as a small, unobtrusive tab. --Kip 21:19, 28 February 2011 (UTC)
Good changes. I think displaying the definition in the search results is very useful because it gives more of an incentive to click the link. I never liked the Dataset panel anyway and as a matter of fact I think that we should get rid of UMLS and SwissProt. Nobody ever edits them. --Tosca 16:04, 1 March 2011 (UTC)
Now it looks like this in Google (antisemitizam is currently the word that receives the most visits). --Kip 09:53, 9 March 2011 (UTC)
Looks good! Even better than the search result for Wiktionary right above which only has a standard text. ;-) --Tosca 10:14, 14 March 2011 (UTC)
For the first point, we can tell Google to consider www.omegawiki.org for indexing via their Webmaster Tools. Help. —Veeven 14:02, 14 March 2011 (UTC)
I've just done that, thanks. --Kip 08:16, 15 March 2011 (UTC)
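The redirect rule from the first point above can be sketched in a few lines of Python. This is only an illustration of the rule as described (any omegawiki.org host is sent to www.omegawiki.org); the function name canonical_url is hypothetical, and the real redirect is presumably done in the web server configuration rather than in code like this.

```python
from urllib.parse import urlsplit, urlunsplit

CANONICAL_HOST = "www.omegawiki.org"

def canonical_url(url: str) -> str:
    """Return the URL a request would be redirected to (hypothetical helper)."""
    parts = urlsplit(url)
    host = parts.hostname or ""
    # Bare domain or any subdomain: rewrite the host, keep path and query.
    if host == "omegawiki.org" or host.endswith(".omegawiki.org"):
        return urlunsplit((parts.scheme, CANONICAL_HOST, parts.path,
                           parts.query, parts.fragment))
    # Unrelated hosts are left untouched.
    return url

print(canonical_url("http://omegawiki.org/"))
print(canonical_url("http://whatever.omegawiki.org/Expression:apple"))
```

With a single canonical host, search engines see one site instead of two copies of the same pages.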

uselang

When you want to share a link to OmegaWiki with someone else (in my case, with French friends), and you want them to see OmegaWiki in a given language (not English), you can add the parameter uselang to the URL.

It works like this: http://www.omegawiki.org/Expression:apple?uselang=fr

It used to work only partially, translating the main interface, but not the language names, etc. Now it shows you the same thing as someone who has set their preferred language to French. Note: the uselang parameter is not sticky; if you navigate to another page, it will be lost. --Kip 22:57, 3 March 2011 (UTC)
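For anyone generating such links in a script, here is a minimal sketch using Python's standard urllib.parse. The helper name with_uselang is hypothetical; only the uselang parameter itself comes from the description above.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit

def with_uselang(url: str, lang: str) -> str:
    """Return url with the uselang query parameter set to lang."""
    parts = urlsplit(url)
    # Keep the other query parameters, dropping any earlier uselang value.
    query = [(key, value)
             for key, value in parse_qsl(parts.query, keep_blank_values=True)
             if key != "uselang"]
    query.append(("uselang", lang))
    return urlunsplit((parts.scheme, parts.netloc, parts.path,
                       urlencode(query), parts.fragment))

print(with_uselang("http://www.omegawiki.org/Expression:apple", "fr"))
```

Because the parameter is not sticky, it has to be added to every link that is shared, not just the first one.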

Requested pages

This is one of the few, perhaps the only wiki where I have administrator rights. Unfortunately I'm rarely here. I'm here now by chance, to change my e-mail address, and am writing this :-). The problem with not being here often enough is that I don't remember how to handle the changed interface(s). Developers work hard to make things easier, but sometimes it's better to go back to the basics.

For example, I honestly have no idea how to make a new page when I see that an entry does not exist. Hence my request for the entry heckle.

Cheers,
Patio4it 10:09, 19 March 2011 (UTC)

Type "heckle" into the search box and click the "Go" button (the left button; the name is language dependent) and then click on Expression:heckle and enter the definition(s).
Cheers, --Ortografix 10:50, 19 March 2011 (UTC)

New languages for edition

I added the following languages to the list of editable languages:

--Kip 19:05, 1 April 2011 (UTC)

Example sentences no longer limited to 256 characters

The title says it all. Now, example sentences will not be cut and can be as long as definitions (some 1 million characters I think). --Kip 20:32, 9 May 2011 (UTC)

382 images and MediaWiki upgrade

Since I was wondering how many images we already had, I quickly created a page that shows the number of link annotations:

http://www.omegawiki.org/Special:Ow_statistics?showstat=annot

I will blog about it when we have 1000 images. I was also astonished to see that we already have 13971 links to Wikipedia articles (counting all languages).

On another matter, I'll be upgrading MediaWiki in the coming days (weeks). Until that is done, we cannot upgrade any extension (including the WikiData extension). I'll give advance warning before the upgrade takes place, since I'll probably have to lock the website for a couple of hours. But first, I have to get it working on my personal computer. --Kip 07:18, 17 May 2011 (UTC)

OmegaWiki and its relation to other Wiktionaries

Will OmegaWiki mirror all the (usable) contents from all other wiktionaries? What if I edit an entry in OmegaWiki -- would that be reflected in the original wiktionary? (E.g. say I correct the gender information for a Swedish word.) Or should I instead edit it in the English wiktionary, which would then (after an update) be reflected in OmegaWiki?

That is, is OmegaWiki a view of data extracted (parsed) from the various wiktionaries? If not, how will it be kept synchronized and up-to-date with the wiktionaries? (Sounds futile...)

Personally I think that the most feasible approach would be for each wiktionary to decide on its own standards, and work on following them strictly. Then, this would be parsed and made available as far as possible. A language with a tidy well-organized (e.g. templatized) wiki would simply contribute more data than others. Thus change could be gradual and evolutionary. --194.237.142.6 06:55, 19 May 2011 (UTC)

  • The two projects are separate and have their own standards and their own data.
  • The fact that OmegaWiki is organized around concepts (each concept has definitions and translations in several languages) makes it impossible to parse the definitions from the various wiktionaries (because we would need to match the definitions from the English wikt, the French wikt, etc. and it is not possible to automatically judge that two definitions in different languages are the same concept). However, we do use the data from the various Wiktionaries, for example when adding translations.
  • The other way around (i.e. the Wiktionaries would be a view of the data from OmegaWiki for a selected language) would be possible, but at the moment, the biggest Wiktionaries have more data, and the various communities do not want to switch to a centralized dictionary, for various reasons (for example, the French Wikt community wants to be able to communicate in French, whereas it is clear that in OW English will prevail).
  • Now, regarding genders, these can be imported in either direction. I am sure that at some point we will automatically import the gender data from the Wiktionaries, as well as other annotations (IPA, inflexions, etc.). I am not aware of projects to import data from OW to the Wiktionaries. Maybe this will come when OW grows bigger.
  • But in any case, when importing data, the usual rule of thumb is to trust the data that is already there, and import only data that is missing. So, if a piece of information is incorrect in a Wiktionary, you'll need to correct it there. If it is incorrect in both Wiktionary and OmegaWiki, it'll need to be corrected twice.
I hope I was clear... it is not very easy to explain. --Kip 08:47, 19 May 2011 (UTC)
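The rule of thumb from the last point (trust what is already there, import only what is missing) can be sketched as follows. This is an illustration only, not actual import code; the function name import_missing and the sample annotation values are hypothetical.

```python
def import_missing(existing: dict, incoming: dict) -> dict:
    """Merge incoming annotations into existing ones without overwriting."""
    merged = dict(existing)
    for key, value in incoming.items():
        # Trust the data already there; only fill in fields that are absent.
        if merged.get(key) is None:
            merged[key] = value
    return merged

# Hypothetical annotations for one word in the two projects.
ow_entry = {"gender": "neuter"}
wikt_entry = {"gender": "common", "ipa": "/hʉːs/"}

print(import_missing(ow_entry, wikt_entry))
```

Note the consequence described above: the conflicting gender value is never overwritten by the import, so an error on either side has to be corrected by hand on that side.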

Min Nan

Can I request that Min Nan (nan) be added as an editable language? Thanks. --Hiong3.eng5 04:58, 23 May 2011 (UTC)

Done! --Kip 09:20, 23 May 2011 (UTC)
In view of the three scripts used in Min Nan, I have created three "artificial" languages:

Min Nan Variants

Using #42 of the Sino-Tibetan Swadesh List as an example, we have 3 ways of saying 阿母: a-bió (Quanzhou), a-bú (Xiamen, Taipei) and a-bó (Zhangzhou, Tainan), and one way of saying 媽媽: ma-ma.

I have two questions at the moment.

  • How can we show that 阿母, a-bió, a-bú and a-bó are the same word, merely variants of one word, while 媽媽 and ma-ma belong to the same DM but form a different group?
  • Do we have to create languages to differentiate the variants? Min Nan (POJ), Min Nan (POJ-Xiamen), Min Nan (POJ-Taipei) etc?
  • We don't have a solution for 1 yet.
  • 2: yes, this is what we did for English (UK, US), Portuguese (Portugal or Brazil), etc. How many are there?
Another possibility (e.g. if there are too many regions) would be to use the annotations, with a list of the different regions, similar to the current parts of speech, where you can assign several regions to one word. The problem then is that the information will be hidden in the annotation. --Kip 09:01, 14 June 2011 (UTC)

Quanzhou, Zhangzhou and Xiamen (Mainland China); Taipei and Tainan (Taiwan); and Hokkien (spoken by Southeast Asian Chinese, with 'Penang Hokkien' of Malaysia(?) being one of the most vocal on the Internet): these six dialects are the most documented, I guess. My family came from Jinjiang (晋江), and our speech is somewhat of a mix of Xiamen and one of the first two dialects. Xiamen seems to be the most flexible because of its location (I think it is in the middle of the other provinces). Some say that all these dialects are a mix of Quanzhou and Zhangzhou.

If I have a say, I think using annotations would not be bad at all.

Ok, I will try to create a prototype with annotations and see how it goes (if it works, if you like it, etc.). --Kip 09:09, 15 June 2011 (UTC)
Hmm, I gave it a try, but it does not work at the moment. I'll have a look at the software this weekend and make it work.
In the meantime, I have an additional question for the annotation: I see two possibilities:
1. An annotation named "Region" and the list of options will be "Quanzhou", "Zhangzhou", etc.
2. An annotation named "Dialect" and the list of options will be "Quanzhou dialect", "Zhangzhou dialect", etc.
Do you have an opinion on which is best? I am hesitating because, as far as I understand, using dialect names such as "Quanzhou dialect" would be more correct, since it is spoken in the city of Quanzhou but also in other places. On the other hand, indicating regions instead of dialects gives more possibilities, for example the possibility to indicate that a word is used in a specific city / country, even if it is not identified as a particular dialect (I don't know about Min Nan, but in French, for example, there are words used mostly in some regions, but these are not dialects, only additional vocabulary). --Kip 20:12, 16 June 2011 (UTC)
There are words I use here in the Philippines that I do not hear in Taiwan. For example, for the English word "and", I use kiao (I am not sure of the diacritic here) and kap interchangeably, but kiao is not in the Taiwanese Dictionary for that particular definition. Whereas a local Chinese(Mandarin/Hokkien)-English-Tagalog handbook uses kap for while kiao for .
I do not know what would be best. I will ask this question at Min Nan Wikipedia's Under the Tree.
It works now, I created an annotation "region" with the options "Quanzhou" and "Zhangzhou". More can be added when we decide that we keep this implementation. --Kip 17:36, 19 June 2011 (UTC)
Kip, how can I use the region annotation? I was not able to access it. --Hiong3.eng5 09:57, 20 June 2011 (UTC)
When you edit a word (e.g. DefinedMeaning:Min_Nan_(POJ)_(1299485) ) you need to look in the table containing all the synonyms and translations, then in the right column, click on "annotation >>" next to the word you want to annotate, then under the last one, "option values", select "area" (or region, not sure how it is shown in English).
I created a screen shot File:Syntrans-option-values.JPG
To add further regions, you need to edit DefinedMeaning:Min_Nan_(POJ)_(1299485), and under the section "Class attributes", click "Options >>" and then add the region to the list. Each region needs to be defined as an Expression before. --Kip 14:46, 20 June 2011 (UTC)

To do list

To help the new contributors to find where to start, I have created Help:To do list for contributers. If you have any ideas of things to add, or word lists that I have missed, please mention them. --Kip 11:44, 9 June 2011 (UTC)

Defined Meaning Text in the SQL Dump

I would just like to ask where the defined meaning texts are in the latest omegawiki SQL Dump.

I downloaded a copy and used Open Office to browse the database, but can not find it. I found the defined meaning ids but not the text. Am I missing something? --Hiong3.eng5 05:40, 10 June 2011 (UTC)

Do you mean the definitions?
Let me take the example of DefinedMeaning:dictionary_(915):
  • select * from uw_defined_meaning where defined_meaning_id = 915 ;
returns one entry with meaning_text_tcid = 150085 ;
  • select * from uw_translated_content where translated_content_id = 150085 and remove_transaction_id is NULL;
This returns a list of several entries with language_id and text_id.
Then if you want for example the definition in English (language_id = 85, text_id = 278237),
  • select text_text from uw_text where text_id = 278237 ;
Does this answer your question? --Kip 08:17, 10 June 2011 (UTC)

Yes. Thanks! --Hiong3.eng5 11:38, 11 June 2011 (UTC)
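The three lookups described above can also be chained into a single query. The sketch below is only an illustration: it uses an in-memory SQLite database with a simplified, assumed schema containing just the columns mentioned in the answer (the real dump may have more columns), and the definition text is a placeholder rather than the real one. The function name definition is hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Simplified stand-in for the dump: only the columns named in the thread.
conn.executescript("""
CREATE TABLE uw_defined_meaning (defined_meaning_id INTEGER, meaning_text_tcid INTEGER);
CREATE TABLE uw_translated_content (translated_content_id INTEGER, language_id INTEGER,
                                    text_id INTEGER, remove_transaction_id INTEGER);
CREATE TABLE uw_text (text_id INTEGER, text_text TEXT);
INSERT INTO uw_defined_meaning VALUES (915, 150085);
INSERT INTO uw_translated_content VALUES (150085, 85, 278237, NULL);
INSERT INTO uw_text VALUES (278237, 'placeholder definition text');
""")

# One query instead of three: defined meaning -> translated content -> text.
QUERY = """
SELECT t.text_text
FROM uw_defined_meaning AS dm
JOIN uw_translated_content AS tc
  ON tc.translated_content_id = dm.meaning_text_tcid
JOIN uw_text AS t
  ON t.text_id = tc.text_id
WHERE dm.defined_meaning_id = ?
  AND tc.language_id = ?
  AND tc.remove_transaction_id IS NULL
"""

def definition(connection, dm_id, language_id):
    """Return the definition text for a defined meaning, or None if absent."""
    row = connection.execute(QUERY, (dm_id, language_id)).fetchone()
    return row[0] if row else None

print(definition(conn, 915, 85))  # 85 is the English language_id from the thread
```

The same SELECT should work against a MySQL import of the dump, with the parameter placeholders adapted to the client library in use.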

other

I would just like to ask if this is an error. I edited defined meaning other, and looked at the recent changes and found out that the url is http://www.omegawiki.org/DefinedMeaning:few_(5501) --Hiong3.eng5 08:21, 13 June 2011 (UTC)

It is normal, but something that should be fixed.
When we create a DM, the name of the DM is the first expression created. It can happen that the DM has then completely been changed to mean something else, and the list of translations changed as well, but the DM name stays the same.
In the future, it would be better to just use the DM number [DefinedMeaning:5501] to get rid of that problem. --Kip 17:36, 19 June 2011 (UTC)

Selective Language output for omegawiki users

Is there a way to limit the language output of what I see and edit as an omegawiki user?

Let's say I am interested only in English, Mandarin, Min Nan and Tagalog. Can omegawiki display expressions, definitions, translations etc only in these Languages? Maybe through my preferences.

I think that if this were possible, the page would be lighter and faster, and adding entries for a limited list of languages would be faster too. Just a thought. --Hiong3.eng5 04:01, 14 June 2011 (UTC)

It is not yet possible, but many users have suggested it and I agree that it is a good idea. --Kip 08:43, 14 June 2011 (UTC)
One thing that would make pages load faster is retrieving translations in bulk or in parallel rather than sequentially, one by one. I believe that currently there is an HTTP request for each translation. Malafaya 13:01, 14 June 2011 (UTC)

Regional words

At the moment, we have a way to indicate that a word is used only in a region, but we've done it by defining artificial languages. We therefore have British English, American English, Canadian French, Swiss French, Belgian French, Portuguese (Brazil / Portugal), German (Germany, Switzerland, Austria). Nothing yet for Spanish, but it would be needed as well.

Instead of artificial language names, we can now have a region annotation. The advantages I see to it are:

  • if a word is used for example in Belgian French and Swiss French but not in France (e.g. septante), we can enter it once, and give 2 region annotations : "Switzerland" and "Belgium"
  • while it is not realistic to have too many artificial languages (one for each region having a particular word), a region annotation can be anything we want, such as "North of France", "Wales", "North Midland", etc. (see http://en.wikipedia.org/wiki/Regional_vocabularies_of_American_English )

Maybe one drawback is that annotations are not as visible as language names at the moment, but that will change in the future I promise...

I have created one example with two regions for English, if you want to give it a try. Of course, if we decide to go for annotations, the change will be done by a bot.

What do you think? --Kip 17:48, 19 June 2011 (UTC)
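To make the comparison concrete, here is a rough sketch of the data model behind the proposal: one expression in one language, carrying a set of region annotations, instead of one copy per artificial language. The class and function names are hypothetical, and the sketch assumes that a word without any region annotation is used everywhere.

```python
from dataclasses import dataclass, field

@dataclass
class Expression:
    spelling: str
    language: str
    # Hypothetical region annotations; empty means "no regional restriction".
    regions: set = field(default_factory=set)

def used_in(expression: Expression, region: str) -> bool:
    """True if the expression is used in the given region."""
    return not expression.regions or region in expression.regions

# "septante" is entered once, with two region annotations.
septante = Expression("septante", "French", {"Switzerland", "Belgium"})

print(used_in(septante, "Belgium"))
print(used_in(septante, "France"))
```

With artificial languages, the same word would have to be entered twice, once under Swiss French and once under Belgian French.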

Search suggestions

I have discovered that I could enable a new feature: search suggestions. It is the same as in Wikipedia: when you type a text in the search box, it suggests existing words.

At the moment I know it looks (a bit) ugly, because the "Expression:" prefix cannot be easily removed. Also, all the (useless) expressions from UMLS and Swissprot are included. This means that I need to rewrite the bit of AJAX that does this so that it suits our needs better, but this takes some time... However, I think it is already useful as it is now.

Furthermore, if you don't like it, you can disable it in your user preferences under "Search Options" => "Disable AJAX suggestions".

Of course, opinions are welcome. --Kip 18:29, 20 June 2011 (UTC)

Upgrade to MediaWiki 1.17

The first official stable version of MediaWiki 1.17 was released today (or yesterday?). This is good timing since I am almost ready to perform the upgrade (only cosmetic bugs still need to be fixed). I'll let you know when I am ready, since I will then need to lock the website for some minutes to some hours (not sure how long it will last). --Kip 18:48, 22 June 2011 (UTC)

I am ready, but don't know when I will have time to do the upgrade (depends on the weather tomorrow ;-) ). --Kip 20:05, 25 June 2011 (UTC)

The changes will be:

  • the Vector interface will be the default (see the current Wikipedias). Monobook will still be available in the user preferences. (I had to change the code a bit to make it work with Vector.)
  • I have rewritten the function that searches for a word (because I had to). It should be a tiny bit faster (though maybe not noticeably).
  • it will be possible to upgrade some extensions which are now 1 or 2 years old. In particular, the language translations for the interface will be updated.
  • the database has been changed compared to MediaWiki 1.16. I don't know the details, but I believe it has been optimized (?).
  • ...? --Kip 19:14, 26 June 2011 (UTC)
I'm excited to see it! --Tosca 09:11, 29 June 2011 (UTC)
It is done. --Kip 19:55, 30 June 2011 (CEST)
Using an RTL language gives me Fatal error: Call to a member function addHTML() on a non-object in /var/www/ow/extensions/Wikidata/Wikidata.hooks.php on line 14 because $wgOut is not loaded as a global. I suggest removing the loading of the RTL gadget and setting $wgBetterDirectionality = true instead :) SPQRobin 23:07, 30 June 2011 (CEST)
Done --Kip 23:23, 30 June 2011 (CEST)
Just a note: that RTL gadget was a hack I made in the old times, when there was no RTL support in the MediaWiki version installed here at OW. I suspect that may be deprecated as of now. Malafaya 09:58, 17 August 2011 (UTC)
Keep up the good work, Kip! We are so proud of having you among us. Patio4it 11:46, 2 July 2011 (CEST)
Great update! Thanks for your effort!
I've got a Class 'SiteConfiguration' not found error on the classes and collections pages. --Ascánder 12:37, 3 July 2011 (CEST)
I've also got a Call to a member function getNamespace() on a non-object on history pages of defined meanings (for instance, here). --Ascánder 12:47, 3 July 2011 (CEST)
Thanks for reporting, I'll investigate. --Kip 10:46, 4 July 2011 (CEST)
History pages are now working again. --Kip 19:05, 4 July 2011 (CEST)
Classes and collections viewer also repaired (though it needs a better interface...) --Kip 19:26, 4 July 2011 (CEST)
Thanks again. --Ascánder 15:16, 5 July 2011 (CEST)

Btw, FYI, you say you upgraded to 1.17, but Special:Version says 1.19 (which it actually is). SPQRobin 20:47, 5 July 2011 (CEST)

Ah... ok thanks for the info :-) I thought "svn update" would restrict itself to the stable version... --Kip 22:48, 5 July 2011 (CEST)

my error

I added the hyphenation Ab·ys·sin·i·a to Castilian Abisinia instead of English Abyssinia. Kindly delete the hyphenation on Abisinia (I do not know how). Thanks. --Hiong3.eng5 05:32, 30 June 2011 (UTC)

Instead of deleting it, I entered the correct hyphenation for Spanish. To delete, you have to click on the checkbox in the left column of the annotation panel. --Kip 08:37, 30 June 2011 (UTC)

Babelbox

The link on the bottom of the box is red. Is it a bug or on purpose? Patio4it 11:52, 2 July 2011 (CEST)

Hmm, I don't know what was there before... There have been many changes in the Babel extension. I have changed the settings so that each language is a link to the portals. The categories are supposed to be created automatically. --Kip 15:39, 2 July 2011 (CEST)

On Dutch wikipedia we see the following category appear after clicking the bottom part of the babelbox. Don't know how they made it, probably with a bot. Patio4it 17:25, 13 July 2011 (CEST)

Random page

It works now! So I put the link back on the left panel.

Because of caching, the link might not appear. In this case, you need to purge the page (...?action=purge). --Kip 14:15, 28 July 2011 (CEST)

Spam fighting...

Due to the many recent spam edits from many different IPs, I have installed a new extension to prevent spam, namely http://www.mediawiki.org/wiki/Extension:Check_Spambots .

This extension is nice since it checks any IP against a list of known spamming IPs, in my case from stopforumspam.com . This list is updated daily with new known spammers.

I have changed the extension a bit so that it does not affect logged-in users (because the time to look up stopforumspam.com would otherwise have to be added to the time to save a page...) but only IPs.

It remains to be seen whether it works. (Editing with a non-spam IP still works...)

PS: I have also deleted the huge spamming list of today-yesterday from the recent changes list. --Kip 14:02, 1 August 2011 (UTC)

Hmm does not work... --Kip 15:11, 1 August 2011 (UTC)
Maybe now? --Kip 17:11, 1 August 2011 (UTC)
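For readers curious about the logic described in this thread, here is a rough Python sketch of the decision: skip the lookup entirely for logged-in users, and block anonymous edits only when the IP is on the list of known spammers. This is not the extension's code (the extension is written in PHP); the function name should_block is hypothetical, and the IP addresses are reserved documentation addresses, not real spammers.

```python
def should_block(ip: str, is_logged_in: bool, known_spam_ips: set) -> bool:
    """Decide whether an edit should be rejected as spam (hypothetical helper)."""
    # Logged-in users are never checked, so saving a page is not slowed
    # down by the lookup.
    if is_logged_in:
        return False
    # Anonymous edits are blocked only if the IP is on the known-spammer list.
    return ip in known_spam_ips

# Stand-in for the daily list of known spamming IPs.
spam_list = {"203.0.113.7", "198.51.100.23"}

print(should_block("203.0.113.7", False, spam_list))
print(should_block("203.0.113.7", True, spam_list))
print(should_block("192.0.2.1", False, spam_list))
```

Checking the login state first is what keeps the cost of the lookup away from regular contributors.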