International Beer Parlour/Archive20130728

From OmegaWiki

Syntrans relations dm restricted

Nice. I have not checked, but can you limit the syntrans-syntrans relation search to the defined_meaning_id, to trim the list? -- Hiong3-eng5 (talk) 04:21, 3 March 2013 (CET)
I don't know what you mean. --Kip (talk) 10:37, 3 March 2013 (CET)
I just understood what you mean... (took me 5 days ;-) ). We cannot restrict it to the present DM, because not all syntrans-syntrans relations stay in the same DM. For example "etymon" can link to a syntrans with another meaning. --Kip (talk) 10:38, 8 March 2013 (CET)
Can we not move etymon to a different attribute_type say STDM? We still lump etymon with the others as syntrans-syntrans relations, while making other syntrans-syntrans class have a DM restricted search item. What do you think? --Hiong3-eng5 (talk) 06:29, 14 March 2013 (CET)
I was thinking about something similar, where by default a syntrans-syntrans relation would allow linking to any syntrans (as it does now), but where we would add a checkbox, or some other boolean, that you set to 1 to indicate that a specific syntrans-syntrans relation is limited to syntrans-es of the same DM.
That way, it would be easy to check or uncheck if we later consider that a relation is not just limited to a given dm (see discussion below about abbreviations).
Another thing would be to limit the relation to a specific language. "abbreviated as" apparently always links to the same language. Some "Min Nan" relations are only for "traditional" -> "POJ".
However, my solution would have to wait until I rewrite the annotation system (where there would be a special page "Special:AnnotationManager" that allows more flexibility in defining annotations, including the possibility to define that relation "hypernym" is the inverse of "hyponym").
On the contrary, your solution would be faster to implement, and the specificity of the relation could always be changed by me in the database.
What do others think? (I hope the conversation can be understood) --Kip (talk) 09:56, 14 March 2013 (CET)
I think it will be much more user-friendly if we just manipulate the class attribute type. If a relation is limited to a certain language, as in the case of abbreviations, we could use STL (Syntrans-Language specific). However, I have no idea what to do with Min Nan -> traditional, except perhaps adding a new column (say ... attribute_type_addtnl) where we can specify that an attribute type will search only a specified language id, and giving it an attribute type STSL (Syntrans To Specific Language). --Hiong3-eng5 (talk) 12:22, 14 March 2013 (CET)
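The restrictions discussed in this thread (same DM, same language, or one specific target language) can be sketched as simple flags on a relation type. This is only an illustration: `Syntrans`, `RelationType`, `same_dm_only`, `same_language_only`, `target_language` and `candidate_targets` are hypothetical names, not OmegaWiki's actual schema or code.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Syntrans:
    syntrans_id: int
    defined_meaning_id: int
    language: str
    spelling: str


@dataclass(frozen=True)
class RelationType:
    name: str
    same_dm_only: bool = False             # hypothetical "checkbox": stay within the DM
    same_language_only: bool = False       # hypothetical: target shares the source language
    target_language: Optional[str] = None  # hypothetical: only this language is allowed


def candidate_targets(source, relation, all_syntranses):
    """Return the syntranses that `source` may link to under `relation`."""
    result = []
    for st in all_syntranses:
        if st.syntrans_id == source.syntrans_id:
            continue  # a syntrans does not link to itself
        if relation.same_dm_only and st.defined_meaning_id != source.defined_meaning_id:
            continue
        if relation.same_language_only and st.language != source.language:
            continue
        if relation.target_language is not None and st.language != relation.target_language:
            continue
        result.append(st)
    return result
```

With all flags left at their defaults, a relation such as "etymon" can still reach any syntrans, which matches the current behaviour described above; setting a flag only narrows the search list.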

Is an abbreviation a synonym?

With the new syntrans-to-syntrans linkage capabilities, it is now possible to define an expression as an abbreviation for another expression (e.g., "U.S.A." is annotated as an abbreviation for "United States of America" as well as being listed as a synonym). When doing this, I am not clear whether the abbreviation should be considered a synonym or a separate DM (i.e., create a new entry for "U.S.A.", defined as "an abbreviation for the United States of America"?). Perhaps for the USA, it is indeed both a synonym and an abbreviation. But for some scientific symbols and measurement abbreviations, I am less sure. For example, F stands for fluorine in chemistry, farad in physics, phenylalanine in biochemistry, folio in book binding, and 15 in hexadecimal notation (computing). What do people think? --InfoCan (talk) 17:31, 13 March 2013 (CET)

You know my opinion already (that they should be considered as synonyms), but I want to know the opinion of others as well. --Kip (talk) 19:50, 13 March 2013 (CET)
I think it is a synonym. When searching for USA you are looking for a definition and "acronym for United States of America" isn't really a definition. It's an etymology. --Tosca (talk) 09:29, 14 March 2013 (CET)
I also think it is a synonym. Besides, if we give it its own DM, it would add more bytes to the database. I like it that the database is lean, without any unnecessary bloat. Just my opinion. --Hiong3-eng5 (talk) 12:28, 14 March 2013 (CET)

Consider this situation. "H" is the international chemical symbol for hydrogen. If we are to use the field called "is an abbreviation for" for the expression "H", it would have to have a syntrans-to-syntrans link to some 400 languages, potentially. This seems impractical to enter and, even if it could be done automatically, an inefficient use of database storage. As an alternative, consider having two DMs: 1) "chemical element with atomic number 1", which would have syntranses of "hydrogen", "hidrógeno", "hydrogène", etc., for different languages, and 2) "symbol for the chemical element hydrogen", which would have one syntrans, "H", in the language we call "International". Wouldn't this be a better solution? --InfoCan (talk) 15:28, 14 March 2013 (CET)

I'm not keen about definitions beginning "symbol for..." or "word meaning..." – H doesn't mean the symbol for hydrogen; it is the symbol for hydrogen and means just plain hydrogen. – Ypnypn (talk) 02:30, 17 March 2013 (CET)
OK, agreed. It would still be good to find a solution to the problem I described above. --InfoCan (talk) 16:55, 18 March 2013 (CET)
We could consider that "H" is the abbreviation only for the English "hydrogen"...?
It is simpler if we consider "N" for Nitrogen, because in French it is called "azote", so "N" is not an abbreviation for "azote" :-) [note: same problem for "g" gram, and "L" liter] --Kip (talk) 16:58, 18 March 2013 (CET)
But sometimes abbreviations are used in other languages. For example, in English we still use lb. for pounds, even though the abbreviation only makes sense in Spanish (and a few other languages). W stands for Tungsten (Wolfram in German).
So I agree that it should be considered a synonym in language "International", and indicate its etymology through an annotation. -- Ypnypn (talk) 17:31, 18 March 2013 (CET)
There is no such language called "International" for various reasons.
(0) There is no ISO 639 code for "International".
(1) Several languages are already international, such as English, Arabic, Spanish, Portuguese, Hindi/Urdu, Kurdish, French, etc.
(2) No one calls notational systems (such as decimal numbers of the Latin alphabet, scientific symbols of chemical elements, mathematical symbols, or ISO 639 language codes, to name only a few) a language. (They may be seen as such under some aspects of language definition, agreed.)
I think we should not start confusing people, and should stick to the technical term "notational system", as opposed to a single abbreviation or a full (natural) language. Any such notational system lends itself pretty easily to being made a Collection in OmegaWiki. So, we can have a collection "chemical elements" and, on the DM level, easily assign the respective symbols as element indices in the collection. For states and territories, we have at least three collections: (1) ISO 3166 and its various subdivisions, (2) the abbreviations used for automobiles as per a series of international treaties, and (3) the ones used by sports organizations. Since "USA" is already in two of those, is there a need, or even a justification, for an additional abbreviation entry? --Purodha Blissenbach (talk) 15:37, 1 May 2013 (CEST)

definition for expressions with variable parts

There are some multi-word phrases whose middle part is variable. One example in English that I just entered is "tell a person his fortune". There might have been a better way to write it, like "tell somebody their fortune" or something else. I am sure other languages have similar expressions. Any thoughts on a policy? --InfoCan (talk) 15:39, 20 March 2013 (CET)

I also think a policy would be nice.
One argument would be to have all possible combinations, because this could be how people search for that expression. But this could make a long list. (I am not saying I am in favor of this, it is just an idea) --Kip (talk) 14:16, 22 March 2013 (CET)

Agree. To be politically correct is a good thing, wasting bandwidth is not.  Klaas|Z4␟V:  00:57, 15 April 2013 (CEST)

Another approach would be to see the variable part(s) as part of the grammatical information to be kept in annotations. One could just leave all the variable pieces out and enter:
tell fortune
Nevertheless, if the variable part(s) needs to be denoted in the expression, could it be done using ellipses?
tell … fortune
In either case, (an) annotation(s) then is/are used to further clarify what "…" cannot be, or needs to be, such as "someone ones".
Similarly, and possibly more obviously, you have to distinguish anyway between
(1) "to tell the truth / lies / a story / etc."
and
(2) "to tell someone something", such as in: "He told my father the way."
by annotating that (1) has one object, and that (2) has two objects, and what kind or class of words or expressions or phrases these can be. I know, we are not at a point where we can comfortably deal with grammatical annotations, but that should not keep us from preparing for it. --Purodha Blissenbach (talk) 14:49, 1 May 2013 (CEST)

It seems like there is a need for a "Redirect" function in Omegawiki. If the way the expression should be defined is not obvious to an inexperienced user, there should be some kind of "forwarding" of various reasonable entries to the correct one. For the above example, let's say we decided that "to tell someone their fortune" should be the main entry, then there should be a redirect from "to tell someone his fortune" and from "tell ... fortune" to this main entry. This could be done like it is done in Wikipedia using the code #REDIRECT to tell someone his fortune but this redirection would be stored in the MediaWiki database, not the Omegawiki one. Kip, is there a better solution that you can think of? --InfoCan (talk) 19:07, 21 May 2013 (CEST)

A redirect is an idea, though I am not sure how best to implement it with the "relational database" format of OmegaWiki. Currently a "#redirect" would not work, because in the Expression: namespace the MediaWiki page is not read at all (it is only checked whether the title exists).
Another idea - that I got from reading your idea - would be an advanced search by similarity, such as what Google does with "Did you mean: ..." : if a word does not exist in the database, it would automatically suggest other words that are lexically close to it. --Kip (talk) 19:36, 4 June 2013 (CEST)
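The "Did you mean: ..." idea can be sketched with approximate string matching. The example below is only a sketch using Python's standard `difflib` module; `suggest` and the list of known expressions are made up for illustration and say nothing about how OmegaWiki would actually query its database.

```python
import difflib


def suggest(query, known_expressions, limit=3, cutoff=0.6):
    """Return up to `limit` known expressions that are lexically close to `query`.

    `cutoff` is the minimum similarity ratio (0..1) for a candidate to be kept;
    the best matches come first.
    """
    return difflib.get_close_matches(query, known_expressions, n=limit, cutoff=cutoff)


# Illustrative data only.
KNOWN = [
    "tell someone their fortune",
    "tell the truth",
    "fortune teller",
]

if __name__ == "__main__":
    for candidate in suggest("tell someone his fortune", KNOWN):
        print("Did you mean:", candidate)
```

A real implementation would probably need an index rather than a scan over every expression, but the principle is the same: when the exact title does not exist, offer the closest existing ones.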

Good news on adoption by WMF

Here are 61 supporters, 10 people who voted 'oppose', 3 undecided. It looks like we have a chance to join our big sibling :-)  Klaas|Z4␟V:  17:43, 18 April 2013 (CEST)

I am only disappointed that most supporters didn't create an account to test/contribute,
Actually, both Znilos and User:Yugioh joined in the past few hours... - Ypnypn (talk) 01:41, 19 April 2013 (CEST)
Both are spammers, not supporters --Kip (talk) 10:21, 19 April 2013 (CEST)
and that most opposers didn't actually read the proposition (particularly that we do not mean to kill the Wiktionaries, and that OW would mostly benefit the small Wiktionaries, i.e. NOT the French and English Wiktionary). --Kip (talk) 18:06, 18 April 2013 (CEST)
So what's next? - Ypnypn (talk) 01:29, 19 April 2013 (CEST)
We can't do much except wait for a decision from a WMF guy. --Kip (talk) 10:19, 19 April 2013 (CEST)
Someone with influence. I noticed too that the Wiktionaries see us as a threat. I see it more as an opportunity to cooperate much more easily. We add tremendous value to all of them and vice versa  Klaas|Z4␟V:  17:18, 21 April 2013 (CEST)
This also sounds like good news: [1]. Not sure what there would be to improve in the proposal. Does anyone have an idea? --Kip (talk) 19:12, 22 April 2013 (CEST)
It's noticed by too few people. Perhaps a lobby will help? Gerard, anyone else who knows people in the board of the Wikimedia Foundation?  Klaas|Z4␟V:  12:31, 28 April 2013 (CEST)

WikiLang

On Meta-Wiki, there's a proposal for a project called metawikipedia:WikiLang, whose goals would be "documenting, recording, sharing and teaching all languages with a strong sub-project on languages revitalization including living dictionaries, sprachausbau and decipherment of dead languages and scripts as options." (They just set up a sort of demo on Meta, just to demonstrate what it would be like.) I left a note there about them considering working together with OmegaWiki, taking advantage of its unique software and existing site. So what do you think? -- Ypnypn (talk) 04:36, 2 May 2013 (CEST)

In my opinion, OmegaWiki can be more than a dictionary, though in its current state it is limited to that. I think that WikiLang's goal can be achieved here, but if we do not get enough editors and developers, not much can be achieved. --Hiong3-eng5 (talk) 06:53, 2 May 2013 (CEST)
Not sure yet... I think it would be better to have it as a separate wiki, but if the proposal is not successful, we could consider hosting it on OmegaWiki. Having it on OmegaWiki would make OW less "dry" (not just a database of translations, but also a few texts about languages, grammar, etc. could be nice). I made this ugly page: Help:Language_resources, a replacement by WikiLang would be much better. I am just a little concerned about sprachausbau, since neologisms are not normally accepted on OmegaWiki but we could probably find a solution to that. --Kip (talk) 10:40, 2 May 2013 (CEST)

Vocabulary Trainer

What was the Vocabulary Trainer? What was the purpose of this functionality? Why has it been stopped? Is it a little similar to WikiLang's goal? --Hiong3-eng5 (talk) 12:55, 2 May 2013 (CEST)

I don't think it has ever really worked. The purpose was to learn new vocabulary, like with flash cards. It gives you a word in a language, and you have to give the correct translation. Instead of our own vocabulary trainer, I think the way forward is to export our data in a format supported by other vocabulary trainers or flash-card software, like kvoctrain for KDE (Linux). --Kip (talk) 13:11, 2 May 2013 (CEST)
I would be interested in helping to reactivate an online version. See Talk:Functionality --MartinMai (talk) 23:40, 11 August 2013 (CEST)
Meanwhile, you can do something like this: you write down yourself the words you want to learn. But it is quite tedious. --Fiable.biz (talk) 02:26, 12 August 2013 (CEST)
It seems that some of us are somewhat back to the roots? Hmmm ... funny coincidence. Let's see - yes, a vocabulary trainer would be good, and I could imagine having it work with Lessons, which could be created like collections. My only doubt is that we could then have too many collections ... hmmmm ... maybe one should be able to create these lessons from within the vocabulary trainer. --Sabine (talk) 18:14, 19 September 2013 (CEST)
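The export idea raised earlier in this thread can be sketched as follows. The tab-separated layout is an assumption chosen for simplicity; nothing here claims to be the format that kvoctrain or any particular flash-card program expects, and `export_flashcards` is a hypothetical helper, not an existing OmegaWiki function.

```python
import csv
import io


def export_flashcards(pairs, out):
    """Write (front, back) word pairs to `out`, one tab-separated card per line."""
    writer = csv.writer(out, delimiter="\t", lineterminator="\n")
    for front, back in pairs:
        writer.writerow([front, back])


# Illustrative English -> French pairs.
buffer = io.StringIO()
export_flashcards([("dog", "chien"), ("hand", "main")], buffer)

if __name__ == "__main__":
    print(buffer.getvalue(), end="")
```

A "lesson" in the sense suggested above would then simply be the list of pairs passed to the export, for example all translations belonging to one collection.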

Update 04-05-2013

Still cleaning the code, I have changed several things with CSS, so if you have a problem, first try to refresh your cache (Ctrl F5 or Shift F5), and then report it.

Among other things, I have shortened the HTML class names, so that the pages are a bit lighter, though I am not sure if the improvement in speed will be visible. --Kip (talk) 14:10, 4 May 2013 (CEST)

Grammatical translations on OmegaWiki?

I had an idea about a new functionality for OmegaWiki, I wanted to get some feedback from the community.

In a manner similar to how, currently, a meaning is associated with words that correspond to this meaning in different languages, couldn't we associate a grammatical function with different ways it can be expressed in different languages?

Consider the following example:

ENG: present tense
1) Used to refer to an action or event that takes place habitually.

example: "I always take a shower"
  • FRE: Present tense (présent de l'indicatif). "Je prends toujours une douche."
  • TUR: Boundless tense (Geniş zaman). "Ben hep duş alırım."

2) Used when quoting someone or something.

example: "Mary says she's ready"
  • FRE: Present tense (présent de l'indicatif) "Marie dit qu'elle est prête."
  • TUR: Present continuous tense (şimdiki zaman). "Meryem hazırım diyor."

3) Used to refer to an arranged future event, usually with a reference to time.

example: "We leave tomorrow at 1 pm."
  • FRE: Present tense (présent de l'indicatif). "Nous partons demain à 13 heures."
  • TUR: Present continuous tense (şimdiki zaman). "Yarın 13'te gidiyoruz."

4) Used in providing a commentary on events as they occur.

example: "He kicks the ball and it's a goal!"
  • FRE: Present tense (présent de l'indicatif). "Il frappe la balle et marque le but!"
  • TUR: Present continuous tense (şimdiki zaman). "Topa vuruyor ve gol!"

So, the present tense of English can mean several things, and each of these is expressed with different grammatical constructs in French and Turkish.

And another example, this time using interlinear glosses:

ENG: present continuous tense (present progressive tense)
Used to denote an action, conceived of as having limited duration, taking place at the present time.

example: "I am eating" (I be.PRS eat.PTCP)
  • FRE: [être en train de].PRS verb.INF. "je suis en train de manger"
  • TUR: Present continuous tense (şimdiki zaman). "Ben yiyorum." (I eat-PRS-1PS)

Should we implement something like this in OmegaWiki? If yes, I propose to organize grammatical structures similarly to lexical expressions. A grammatical function (analogous to a lexical definition) will have zero, one or more "structures" per language (analogous to SynTranses) associated with it.

I am not quite sure of the best way to present "structures". One way is as I have done above, but there may be better ways that are used by linguists. And a better terminology, perhaps...

--InfoCan (talk) 19:58, 12 May 2013 (CEST)

Nice idea, but for most languages this is more a task for the Wiktionaries of the Wikimedia Foundation. They still see us as a threat, and following your suggestion would make their 'fear' come true, so it might be a great idea only for languages that do not have a Wiktionary. To have more chance to join WMF we should be more cooperative and compatible rather than competitive.  Klaas|Z4␟V:  00:50, 16 May 2013 (CEST)
My thought was that this is no different from our current data scheme. In OmegaWiki we are building a dictionary organized around definitions, and each definition is associated with SynTranses from many languages. The advantage of this organization is that when you have N languages, you don't need to create N*(N-1) bilingual dictionaries, you just create one dictionary organized around meanings. You can go from a Finnish word to a Korean word without having a Finnish-Korean dictionary. My proposal applies the same principle. You don't need to explain the grammar rules of N languages in N-1 other languages. All you need is to have a centralized explanation of the grammar rules of all languages in a few languages (English, Russian, whatever), and you can figure out, for example, how to translate into Finnish the Korean equivalent of "I always take a shower" (which describes a habitual activity and is expressed differently in different languages). So, there is no competition with the Wiktionaries in this proposal. Wiktionaries are always from one language to another. OmegaWiki is from meanings to words in all languages. Just like OmegaWiki will help Wiktionaries by enabling the creation of bilingual dictionaries for any pair of languages, it could also help Wiktionaries by enabling the creation of grammar books about any language written in any other language. --InfoCan (talk) 17:05, 16 May 2013 (CEST)
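The "one dictionary organized around meanings" principle can be sketched as a lookup that pivots through a meaning instead of through a bilingual word list. The class and its data are illustrative assumptions (`MeaningDictionary`, the meaning ids and the language codes are made up), not OmegaWiki's actual data model.

```python
from collections import defaultdict


def directed_bilingual_dictionaries(n_languages):
    """Number of one-directional bilingual dictionaries needed for n languages."""
    return n_languages * (n_languages - 1)


class MeaningDictionary:
    """A single dictionary organized around meanings (hypothetical sketch)."""

    def __init__(self):
        # meaning id -> language -> set of words expressing that meaning
        self._by_meaning = defaultdict(lambda: defaultdict(set))
        # (language, word) -> set of meaning ids
        self._by_word = defaultdict(set)

    def add(self, meaning_id, language, word):
        self._by_meaning[meaning_id][language].add(word)
        self._by_word[(language, word)].add(meaning_id)

    def translate(self, word, source, target):
        """Go from a word in `source` to words in `target` via shared meanings."""
        results = set()
        for meaning_id in self._by_word.get((source, word), ()):
            results |= self._by_meaning[meaning_id].get(target, set())
        return sorted(results)


# Illustrative entries for one meaning ("dog", the animal).
ow = MeaningDictionary()
ow.add(1, "eng", "dog")
ow.add(1, "fin", "koira")
ow.add(1, "kor", "개")
```

Here `ow.translate("koira", "fin", "kor")` goes from Finnish to Korean although no Finnish-Korean pair was ever entered; each language only had to be attached to the meaning once.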
While I am not 100% sure what you have in mind exactly, I think it is an interesting proposal :) [or at least what I understand from it is interesting ;) ]
If we do this, we would probably need a new namespace, since the needed functionalities might be a bit different from the Expression: namespace. But I still need to think more about this. --Kip (talk) 23:26, 22 May 2013 (CEST)
Please notice the consequence of the will to become infeudated to the Wikimedia foundation: you already want to brake Omegawiki's development. --Fiable.biz (talk) 11:25, 19 May 2013 (CEST)
See the discussion we already started about Needed notions. I think the priority of OmegaWiki should be to deal with inflections. --Fiable.biz (talk) 11:25, 19 May 2013 (CEST)
I tend to agree with you, inflections should be our first priority. Ascánder (talk) 03:16, 21 May 2013 (CEST)
The best way to implement it that I imagined so far is this: User:Kipcool/Inflexions. If we don't find a better structure, I could give it a try, considering that, since everything is stored nicely in a database, it is always possible to switch to another structure later on. --Kip (talk) 23:26, 22 May 2013 (CEST)
One issue I see there is that some languages decline in three or more dimensions. For example, Hebrew verbs decline based on binyan, tense, number, person, and gender. So a row/column method may not work. -- Ypnypn (talk) 21:22, 23 May 2013 (CEST)
The idea is that in the end, the information has to be printed on a screen, and the screen is 2D.
But in principle yes, you are pointing out exactly the issue that is still causing me trouble, and which is why I am not 100% happy with my solution (but couldn't find a better one). --Kip (talk) 22:51, 23 May 2013 (CEST)
Your solution 3 seems to me the good one.--Fiable.biz (talk) 03:35, 12 August 2013 (CEST)
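The tension between forms that vary along several dimensions and a screen that is 2D can be sketched as follows: store each form under a tuple of feature values, and choose only at display time which features go to rows and which to columns. This is a sketch under assumptions; it is not the structure proposed at User:Kipcool/Inflexions, and `to_grid` and the feature names are hypothetical. The sample data is part of the French verb "parler".

```python
def to_grid(forms, features, row_features, col_features):
    """Project forms keyed by feature tuples onto a 2D grid.

    `forms` maps a tuple of feature values (in the order given by `features`)
    to an inflected form. Returns (row_labels, col_labels, grid).
    """
    position = {name: i for i, name in enumerate(features)}

    def project(key, names):
        return tuple(key[position[name]] for name in names)

    row_labels, col_labels, cells = [], [], {}
    for key, form in forms.items():
        row = project(key, row_features)
        col = project(key, col_features)
        if row not in row_labels:
            row_labels.append(row)
        if col not in col_labels:
            col_labels.append(col)
        cells[(row, col)] = form
    # Missing combinations are shown as empty cells.
    grid = [[cells.get((row, col), "") for col in col_labels] for row in row_labels]
    return row_labels, col_labels, grid


FEATURES = ("tense", "person", "number")
PARLER = {
    ("present", "1", "singular"): "parle",
    ("present", "1", "plural"): "parlons",
    ("present", "2", "singular"): "parles",
    ("present", "2", "plural"): "parlez",
    ("present", "3", "singular"): "parle",
    ("present", "3", "plural"): "parlent",
    ("imperfect", "1", "singular"): "parlais",
    ("imperfect", "1", "plural"): "parlions",
    ("imperfect", "2", "singular"): "parlais",
    ("imperfect", "2", "plural"): "parliez",
    ("imperfect", "3", "singular"): "parlait",
    ("imperfect", "3", "plural"): "parlaient",
}

# Layout A: (tense, person) as rows, number as columns.
rows_a, cols_a, grid_a = to_grid(PARLER, FEATURES, ("tense", "person"), ("number",))
# Layout B: person as rows, (tense, number) as columns.
rows_b, cols_b, grid_b = to_grid(PARLER, FEATURES, ("person",), ("tense", "number"))
```

Because the stored keys carry all features, adding a dimension (gender, binyan, ...) only lengthens the tuples; the row/column split remains a presentation choice that can be changed later.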

Those inflexions raise some questions for me as well. For example, in the Mi'kmaq language, there isn't a word for "hand", but multiple words for "my hand", "his hand", etc. How does the OmegaWiki "software" handle that? Thanks, Amqui (talk) 19:03, 27 May 2013 (CEST)

Don't they have a word for if they find a hand somewhere - or see a painting of a hand - and don't know whom it belongs to?
I see two ways of handling it, but this has never been discussed, since we mostly have had editors of Eurasian languages so far.
  • use as a translation what would be typically used in a Mi'kmaq dictionary. Maybe then add a text in "usage note" indicating that it actually means "my hand" and not just "hand".
  • add a few translations under "hand", such as the words for "my hand", "his hand", etc. and indicate them as inexact. At the same time, define these words as a separate expression, to give their exact definition (English: "my hand", French: "ma main"); they would not necessarily need to have an English translation, since the definition is already a short translation.
I don't know which one makes more sense, since I am not familiar with this language. Also, there might be other possibilities which I have not considered (anyone??). --Kip (talk) 19:31, 27 May 2013 (CEST)
This was only an example for simple nouns, but if we really want to develop complete dictionaries in those languages as we do with English or French we will need a solution. The "problem" arises from the fact that the "concept" of words is different in Algic languages (and other morphological languages), a complete sentence in English can be expressed in a single word sometimes. Like in French we would add the prefix "re-" to change a verb to say "do it again", words in Algic languages are built by the addition of prefixes and suffixes. In most cases, it is possible to add a "root" for the translation of a word, but this "root" doesn't have any actual meaning in the concerned language. Amqui (talk) 19:49, 27 May 2013 (CEST)
You should answer the proposed Policy to create separate definedMeanings or not, and add there your case, as "different word to express possession", etc., with the rule you propose. --Fiable.biz (talk) 03:35, 12 August 2013 (CEST)

No page Tibetan?

I notice there is no page "Tibetan" or "Expression:Tibetan" ~ and when I tried to create the latter I was unable to. Can someone add these?

Also, why is there no language "Tibetan" available when we try to add translations etc.? Instead we have "Central Tibetan", "Amdo Tibetan", "Khams Tibetan", etc. While these are separate spoken languages - the written language is the same. This is a bit like having "Cornish English", "Yorkshire English", "Scouse", "Irish English", and so on - but no "English". Can someone please fix this? Nobody is going to bother filling out separate translations for "Central Tibetan", "Amdo Tibetan", "Khams Tibetan", and so on - when the written translations are identical for each.

There is of course an ISO 639 code for (plain) Tibetan: bo (two-letter ISO 639-1 code), bod (three-letter ISO 639-2/T and ISO 639-3 code).

Chris Fynn (talk) 17:31, 21 May 2013 (CEST)

If the creation of the page Expression:Tibetan does not work, it is usually because one of the three fields is missing
  • on top, combobox with red background, put the language of the word "Tibetan" (= English)
  • second combobox, left column: the language of the definition (usually English as well)
  • text area, right column: write a definition here
Great, I've managed to create the page Expression:Tibetan now. Hope the definitions I added are OK for a start.
"Central Tibetan", "Amdo Tibetan", "Khams Tibetan", etc. are only expressions, but not editable languages (for adding translations, etc.).
As an editable language we have only Expression:Standard Tibetan ( http://en.wikipedia.org/wiki/Standard_Tibetan ).
Hmm, this page has "standard Tibetan" and "central Tibetan" in the same DM, but according to what you are saying, they should be separated?
And maybe "Tibetan" should be added as a synonym of "Standard Tibetan"?
--Kip (talk) 17:47, 21 May 2013 (CEST)
"Literary Tibetan" might be better as "Standard Tibetan" is used by some as a synonym for "Lhasa Tibetan" - or the Tibetan spoken in Lhasa. There are actually a number of varieties of Tibetan spoken in Central Tibet - some of them mutually incomprehensible. Likewise there are several different sorts of "Amdo Tibetan" and "Khams Tibetan". With English there are numerous varieties of spoken English, but the vast majority of written English is essentially comprehensible to all readers no matter which variety or dialect of English they happen to speak. Similarly with Tibetan there are many kinds of spoken Tibetan but the written language is essentially unitary.
Doesn't a dictionary with written definitions deal primarily with written languages? Unfortunately ISO 639-3 and SIL Ethnologue make no distinction between written and spoken languages, and their classification seems to be based on spoken languages - not necessarily the best classification system for applications primarily dealing with text. If there is a division to be made in a dictionary of written terms, it would make most sense to divide the written language into "Tibetan", "Classical Tibetan" and "Old Tibetan" - in the same way as we have "English", "Middle English", and "Old English".
Chris Fynn (talk) 18:51, 21 May 2013 (CEST)
To keep things simple, for editable languages I think plain "Tibetan" would be best. There are many "Tibetan", "English-Tibetan" and "Tibetan-English" dictionaries. I don't see too many dictionaries with "Central Tibetan" or even "Standard Tibetan" in their title. As you suggest, you could make "Tibetan" as a synonym for "Standard Tibetan" - but it would be best if it showed as plain "Tibetan" when adding entries. "Central Tibetan", "Amdo Tibetan" "Khams Tibetan" etc could be used in the few instances where there are differences. Similar to "English", "English (United Kingdom)" and "English (United States)"
-- Chris Fynn (talk) 17:15, 22 May 2013 (CEST)
I have separated Standard Tibetan and Central Tibetan into two separate concepts (probably the definitions can be improved), and used the one you created Tibetan as the name for the language.
In OW, we do have definitions for both written varieties, and spoken varieties. This is because we imported some ISO639-6 language data, which makes this distinction. This does not mean that we will enable each spoken variety as an editable language, since as you say it does not make much sense for a written dictionary.
In my opinion, "English (United Kingdom)" and "English (United States)" do not make much sense. It is better to use "English", and add a "region: UK" or "region: USA" annotation to the word. English is actually one language, and the distinction between UK and USA English is not so clear as the language evolves. Furthermore, there are also words specific to e.g. Ireland, New Zealand, Australia, South Africa, ... and it does not make sense to have as many varieties of "English (...)" languages. The region annotations make such distinctions more easily (one word can have several regions). Same goes for Spanish, which is spoken in many countries, each having some specific words. --Kip (talk) 18:56, 22 May 2013 (CEST)
Great. Nicolas Tournadre estimates that there are 220 'Tibetan dialects' derived from Old Tibetan and nowadays spread across 5 countries. Eventually SIL will probably define an ISO 639 language code for most of these. Fortunately, as far as a dictionary dealing with written words is concerned, the majority of these speakers (or at least the literate amongst them) use the same written language: Tibetan, which also serves a purpose similar to that which Latin once did amongst speakers of diverse Romance languages. Dzongkha is fairly unique in now having a separate written form - but that is very recent. Chris Fynn (talk) 19:46, 26 May 2013 (CEST)

How to make a collection?

I have created the Expression:Neuer Kölnischer Sprachschatz and the DefinedMeaning:Neuer Kölnischer Sprachschatz (1474560) with it so as to have a collection of the same name collecting words, respective spellings, listed in said book. According to Help:Collection, I tried to add expressions to the collection by entering the collection name "Neuer Kölnischer Sprachschatz" in the collection name field under the expression, but when I do that, it disappears and the expression is not added to the collection upon saving the expression page. What am I doing wrong?

I have also looked in detail through the definitions of the expressions of other collections, and found nothing that would distinguish them from other expressions. I am wondering. --Purodha Blissenbach (talk) 00:33, 23 May 2013 (CEST)

Having again looked into the issue, I find that we need a way to add spellings to this collection, nothing else. We don't have a way to add spellings to collections yet, do we? --Purodha Blissenbach (talk) 02:52, 23 May 2013 (CEST)

  • For adding a collection, there is the special page Special:AddCollection which, I think, is only available to bureaucrats.
  • And, indeed, collections are only for DefinedMeanings. Applying the same system for spellings would not be straightforward - and we would also need to find a way to integrate it nicely in the interface. What I did until now is to simply create a standard wiki page with the list of these spellings. For example this: Portal:jpn/jlpt1 and more at Category:Wordlists_by_language. --Kip (talk) 09:20, 23 May 2013 (CEST)

Table sorting: faster?

I have just changed the table sorting - for every table that has a language column.

Instead of a client-side sort (done with JavaScript, which would get my CPU to 100% for a few seconds after loading a page), I am now trying out a server-side sort (done with PHP).

The server-side sort gives more work to the server, which is why I first wanted to avoid it, but the client-side sort - using the jQuery tablesorter - sometimes freezes the browser...

Also, since I was playing with sort, I made the sorting of translation tables a bit better: it will sort first on languages, then (for a given lang) put the exact translations on top, and then sort the translations alphabetically.
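The three-level ordering described above (language first, then exact translations on top, then alphabetical) can be sketched with a single composite sort key. This is an illustration in Python, not the PHP code actually deployed; the dictionary keys `language`, `exact` and `spelling` and the sample rows are assumptions.

```python
def sort_translations(rows):
    """Sort by language, then exact translations first, then alphabetically."""
    return sorted(
        rows,
        key=lambda row: (
            row["language"],
            not row["exact"],            # False sorts before True, so exact comes first
            row["spelling"].casefold(),  # case-insensitive alphabetical order
        ),
    )


# Illustrative rows only.
ROWS = [
    {"language": "French", "spelling": "chien", "exact": True},
    {"language": "English", "spelling": "hound", "exact": False},
    {"language": "English", "spelling": "dog", "exact": True},
    {"language": "English", "spelling": "Canine", "exact": False},
]

if __name__ == "__main__":
    for row in sort_translations(ROWS):
        print(row["language"], row["spelling"], "exact" if row["exact"] else "inexact")
```

Sorting once on a tuple avoids three separate passes; a locale-aware collation would be needed for proper alphabetical order in many languages, which this sketch does not attempt.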

So the question: do you notice a difference in speed? --Kip (talk) 20:17, 23 May 2013 (CEST)

Add an editable language

Good day, please add Mi'kmaq (ISO code mic) as an editable language on OmegaWiki. Thanks, Amqui (talk) 22:12, 24 May 2013 (CEST)

Done! (maybe it will appear under the name "Micmac", we can't control which synonym is chosen at the moment) --Kip (talk) 13:00, 25 May 2013 (CEST)
Thanks. Amqui (talk) 05:30, 27 May 2013 (CEST)

Translations list

Ok noob question here: How do I get the translations list? For example, if I go to Expression:dog, I can see the various definitions of dog, I can also see what "dog" means in other languages, but I can't see what is a dog in the other languages. Amqui (talk) 19:41, 27 May 2013 (CEST)

Ok, sorry, very noob question... just need to click on the square to "unfold" all the tables... Amqui (talk) 20:03, 27 May 2013 (CEST)
The current practice is: click on the triangle or definition to fold/unfold; click on the "headword" to access the DefinedMeaning page. The "unfold" is clicked automatically by JavaScript for pages that have only one definition.
I think you are not the first to have this problem. Maybe we should add a fold/unfold button next to the small [edit] button on the right?
"square": you see a square at the beginning of the line? (should be a triangle but it is a UTF-8 character, and I don't know if it is displayed correctly everywhere), or you mean the "light green square background"? --Kip (talk) 20:46, 27 May 2013 (CEST)
I don't see the triangle when it is folded; where the triangle is supposed to be, I only see a square like the one shown for characters of a font I don't have (but I'm currently using an outdated system). However, I do see a triangle pointing down when it is unfolded. If I had seen the triangle, maybe it would have been more intuitive than the square... A fold/unfold button may be a good idea. Personally, I would replace the "edit" button by an "unfold" button, and have the "edit" button only shown underneath when the "section" is unfolded. Amqui (talk) 20:57, 27 May 2013 (CEST)

Noob questions

Other noob questions:

  1. Why can we only add one definition when we create a new expression? I think it would be practical to have the "+" sign to add another row of definition to add a definition in another language right away. This "+" is there when I edit the page afterwards, but not at the creation of a new expression.
  2. Are different orthographies for the same expression considered "synonyms" on OmegaWiki?
  3. I haven't found any section on conjugations yet, is there any? I don't think we create "expression" for words that are a conjugated form of a verb (like Wiktionaries do), like "trouvait" for the third person singular imperfect of "trouver" in French, is that right? However, conjugation tables would be a good idea.

Thanks, Amqui (talk) 20:57, 27 May 2013 (CEST)

  1. No "+" when creating a new expression: this is because when creating a new expression, the software creates internally the corresponding defined_meaning_id (DM_ID). You need this ID to attach new synonyms to the same DM. But this ID is created and known only when you first create an expression and click save. So, in theory it is possible, but given the way OW is implemented, it is not as easy as it seems to have it as you say ;-) It might be possible however in the future when we do more Ajax (like Wikidata).
  2. yes, different orthographies are considered synonyms. We can then link them together with the relation "alternative form".
  3. We don't have conjugations but we do want them. This is part of the "inflections should be our first priority" in the discussion above. Inflections include conjugations (verbs), gender-plural (French adjectives), declensions (German nouns), etc. There have been some discussions on the subject already... International Linguists Beer Parlour/Inflexions :-) --Kip (talk) 21:23, 27 May 2013 (CEST)
Thanks, it all makes sense, and add animacy to your plural-gender-animacy part of it. Animacy is used in Algic languages like gender is used in French, but instead of male-female it is animate-inanimate (they also use a "dual form" besides singular and plural). Another question for now: I haven't started contributing to any of those yet, but for languages that use more than one script, does OmegaWiki separate them into "two language names", or are words in different scripts just added under the same language name? Thanks, Amqui (talk) 21:41, 27 May 2013 (CEST)
We separate into two language names, with custom-made codes. See Help:Language#Languages_with_several_scripts. --Kip (talk) 17:02, 28 May 2013 (CEST)

Also on the subject of "inflexions", we would need basic guidelines as to where a word is not an "inflexion" anymore, but is a new "word". For example in Mi'kmaq "iku" is "animal", "kisiku" is "old man", "kisikui'skw" is "old woman", "kisiku'sm" is "old animal", and "kisiku'skw" is "old female animal". Are those inflexions of the same word or different words (considering that the Mi'kmaq language doesn't use male-female gender like French does)? Amqui (talk) 21:56, 27 May 2013 (CEST)

I would say this is rather dependent on the language. If a person familiar with the language is able to easily understand how the word is constructed, then we don't really need the word. For me, "kisiku" is not really different than "old man", only that in one case there is a " ". However, we would add "kis-" as a prefix, synonym of the adjective "old". --Kip (talk) 08:47, 28 May 2013 (CEST)

Noob question take three: Where is Spanish? I can't find it when I click in the language box to add a definition, not under "Spanish" not under "español"... Amqui (talk) 22:19, 27 May 2013 (CEST)

Oh I guess it is under Castilian, a bit weird... Amqui (talk) 22:23, 27 May 2013 (CEST)
Yes, as I said previously, it selects one of the synonyms a bit randomly. If you set your interface in French, you'll have it under "espagnol". --Kip (talk) 08:47, 28 May 2013 (CEST)

Languages with more than one script

To come back to my earlier question about languages with more than one script, do you think it would be possible to include an "auto-conversion" tool between the scripts on OmegaWiki? For example, the Inuktitut Wikipedia ([2]) is using one. I think it would be counter-productive to have to add each word twice in each script when it is a direct conversion for all words of a particular language. Amqui (talk) 19:27, 28 May 2013 (CEST)

It is tempting to use such a tool, however if we use auto-conversion like Wikipedia, we won't have the converted words in the database, and thus couldn't search for them. Because we are multilingual, we cannot really make a wild guess and try to convert a word that we are searching for... What we can do is run a conversion tool on the database from time to time to create those words that are missing. Or there might be other solutions. Since many languages are affected, we definitely need something in that direction. --Kip (talk) 19:43, 28 May 2013 (CEST)
I didn't mean to implement a conversion tool to convert the text on the wiki page like on Wikipedia, but a tool that automatically creates the same word in the other script when a given word is created in a given script. For example, if I add the Inuktitut translation nanuq to the polar bear DM of OmegaWiki, there is no reason why the equivalent in syllabics ᓇᓄᖅ could not be added automatically to the database as well. Maybe a bot could do it afterwards from time to time like you said, but there could be an extension or something that does it automatically at the creation of the word, for specific languages in which the conversion between scripts is straightforward. Amqui (talk) 22:30, 28 May 2013 (CEST)
Ok, then the idea is good :) --Kip (talk) 10:14, 29 May 2013 (CEST)
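To make the idea concrete, here is a rough Python sketch of "create the word in the other script at save time". Everything in it is an assumption for illustration: the conversion table only covers the syllables needed for the example word nanuq, the language codes are placeholders, and the `database` dictionary stands in for whatever OmegaWiki would really store.

```python
# Deliberately tiny Latin -> syllabics table (assumption: only the entries
# needed for the example word "nanuq"; a real table would be much larger).
LATIN_TO_SYLLABICS = {
    "na": "ᓇ",
    "nu": "ᓄ",
    "q": "ᖅ",
}


def to_syllabics(word):
    """Greedy longest-match conversion; returns None if any part is unknown."""
    keys = sorted(LATIN_TO_SYLLABICS, key=len, reverse=True)
    out = []
    i = 0
    while i < len(word):
        for key in keys:
            if word.startswith(key, i):
                out.append(LATIN_TO_SYLLABICS[key])
                i += len(key)
                break
        else:
            # No table entry matched: give up and leave it to a human editor.
            return None
    return "".join(out)


def add_translation(database, dm_id, word):
    """Store the Latin-script word and, when conversion succeeds, its twin."""
    entries = database.setdefault(dm_id, [])
    entries.append(("ike-latn", word))  # hypothetical language code
    converted = to_syllabics(word)
    if converted is not None:
        entries.append(("ike-cans", converted))  # hypothetical language code


database = {}
add_translation(database, "polar bear", "nanuq")
print(database)
```

Because both forms end up stored, the search concern raised above goes away: the syllabics form is a normal entry in the database rather than something computed at display time.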

Deleting entries

Does it require a special right to be able to delete entries like wrong translation of a word? If so, where do we make requests? If not, how do I do it? Thanks, Amqui (talk) 22:30, 27 May 2013 (CEST)

Welcome to OmegaWiki Amqui! I think anyone who has an account is able to delete entries, in edit mode. Let us say that I have entered a wrong expression for dog, let's say I put hî for Bân-lâm-gú (POJ) (hî is actually fish). So I need to delete it. This is what I will do: I will select the box on the left side of the syntrans like below ...
☑ Bân-lâm-gú (POJ) hî khui ▿
and when I save it, the word is deleted. The left side box is either for deleting or adding a new entry. Hope this helps. -- Hiong3-eng5 (talk) 07:00, 28 May 2013 (CEST)
Yes, a special right is required to delete wrong translations, but you have this right already. The rest is as Hiong3-eng5 says.
Anyone having spare time is welcome to help and make some screenshots to complete the help pages :). --Kip (talk) 08:47, 28 May 2013 (CEST)
Thanks to both, I guessed the check box needed to be checked to delete something, but it wasn't intuitive that you only need to check it and click Save since nothing above the check boxes column is saying "Delete" or anything like that. Thanks, Amqui (talk) 16:23, 28 May 2013 (CEST)
There is an icon of a trash can on top of the column, and when you put your cursor on it says something like "click to remove the entry" (though the checkbox should be modified to say the same).
By the way, I had created this page some time ago Help:Correcting a mispelling. --Kip (talk) 16:59, 28 May 2013 (CEST)
I don't see the icon of the trashcan from here, only an empty square beside the other titles. When I move my cursor over it, it indeed says "Mark the row to remove", but it is very unintuitive to move your cursor over an empty square. Amqui (talk) 18:43, 28 May 2013 (CEST)

Browser issues IE7

While we are talking about the "interface", there is something very buggy. On most pages there is a very large image of the Wikipedia logo (sometimes the Wikidata logo) as you can see on this printscreen: [3]. Amqui (talk) 18:50, 28 May 2013 (CEST)

ie6countdown?
I am not sure what to do. It is probably that some css or js rules are not supported by your browser, but I don't have Internet Explorer here to test :-( --Kip (talk) 19:17, 28 May 2013 (CEST)
I'm using Internet Explorer 7, and I can't use another browser since I'm accessing Internet from a public place, and I'm sure others do. It is most likely js that causes the issues here. Amqui (talk) 19:23, 28 May 2013 (CEST)
The trash can is pure css, I am using the "@embed" feature of ResourceLoader. The Wikipedia logo is created with javascript. The image has a size of 200px, but the browser is told to display it at 50px. What you get is way more than that... weird... It is said that IE has an icon on the bottom left that indicates a javascript error. If you double click it, you should have some message? --Kip (talk) 19:44, 28 May 2013 (CEST)
I usually get JavaScript loading errors as you said in the status bar at the bottom of the screen, but I get none when opening OmegaWiki pages including the ones with the huge Wikipedia logo. Amqui (talk) 22:36, 28 May 2013 (CEST)
I tried the expression using IE 10, and the interface seems ok. Maybe IE 7 does not have support for newer js scripts used to display the huge Wikipedia logo. Maybe you could request the internet cafe to install Firefox or Chrome? --Hiong3-eng5 (talk) 06:56, 29 May 2013 (CEST)
Or should the js be replaced with a code that is compatible with IE7, for the benefit of those who are using old browsers? -- Hiong3-eng5 (talk) 06:58, 29 May 2013 (CEST)
I implemented all the js functions using jQuery (1.8.3). It is supposed to work on all browsers including IE6,7,8 (which are dropped in jQuery 2.0). --Kip (talk) 10:23, 29 May 2013 (CEST)
I was able to reproduce the issue with an IE7 emulator under Linux. I'll investigate. --Kip (talk) 10:23, 29 May 2013 (CEST)
Turns out some functions were using pure javascript and not jQuery.
Fixed for you now? --Kip (talk) 10:45, 29 May 2013 (CEST)
The issue of the huge Wikipedia logo is fixed, thanks. I still can't see the trashcan or the triangle to unfold. Amqui (talk) 18:48, 29 May 2013 (CEST)
Now you should see the images and the triangles :) --Kip (talk) 12:42, 30 May 2013 (CEST)
Yes I do, thanks, Amqui (talk) 18:39, 31 May 2013 (CEST)

MediaWiki 1.21

We are now using MediaWiki 1.21. Please report any unusual behaviour. Thanks :) --Kip (talk) 21:41, 28 May 2013 (CEST)

So that's why I couldn't access the wiki, no warning nothing? :-P Amqui (talk) 22:32, 28 May 2013 (CEST)
I thought it would be seamless but it was not possible to leave the server running during the upgrade; it gave a PHP error until the update script was complete. So I just shut Apache down and had my dinner while the upgrade was running ;-) --Kip (talk) 10:48, 29 May 2013 (CEST)

Add Inuktitut as an editable language

Only because I just talked about it, could you add Inuktitut as an editable language? Inuktitut is in fact a macro-language (iu and iku codes), so the language to add is Eastern Canadian Inuktitut with the ISO 639-3 code ike. It has two scripts: the Latin alphabet (they call it the Standard Roman Orthography or SRO) and the Unified Canadian Aboriginal Syllabics script. I suggest creating ike-latn (or whatever is OmegaWiki's standard for codes for Latin script) and ike-cans; "cans" is the ISO 15924 code for the Unified Canadian Aboriginal Syllabics. The other language under "iu" is Inuinnaqtun (or Western Canadian Inuktitut) with the code ikt, but I don't intend to contribute to it personally in the near future. Thanks, Amqui (talk) 22:53, 28 May 2013 (CEST)

Added! :) --Kip (talk) 13:21, 29 May 2013 (CEST)
Thanks, Amqui (talk) 22:45, 29 May 2013 (CEST)

Are former names considered synonyms?

The title says it all: Are former names considered synonyms on OmegaWiki? For example, if a city had been renamed, do we add the former name as a synonym? Thanks, Amqui (talk) 19:34, 29 May 2013 (CEST)

I don't think we have discussed that one already. I see several options:
  • we can consider that the city is the same, so the words mean the same, so they are synonymous - but we can add a "usage note" or any other annotation to mention that the name is not used anymore.
  • or we can consider that the former name, when used, only refers to that city up to a certain time. So we would have a separate DM like "Constantinople: Former name of the city of Istanbul, used from 330 to 1930".
At the moment, I have a slight preference for the second option. Any opinion anyone? --Kip (talk) 20:39, 29 May 2013 (CEST)
I agree with you, the second option makes more sense. --InfoCan (talk) 19:02, 4 June 2013 (CEST)

mul

I read in the Help page that there is a language with the code "mul", which I assume means "multilingual" for international conventions, like H2O for water. What is the actual name of the language in the English interface of OmegaWiki, so that I can add synonyms in that language? Also, does this "international convention" language include scientific names of species? Thanks, Amqui (talk) 01:20, 30 May 2013 (CEST)

The language is called "International" (cf. Help:Language#Chemical_formulas.2C_Scientific_Latin.2C_etc.).
It includes Scientific names like Equus caballus. --Kip (talk) 12:02, 30 May 2013 (CEST)
  • mul is to be used for documents written in multiple languages, that is a mix of several languages. It has nothing to do with materials that are not part of any specific language, like classification systems. --Purodha Blissenbach (talk) 00:42, 8 September 2013 (CEST)

Universal Language Selector

I have installed the Universal Language Selector [4] - that you might have seen in some WMF wikis - which allows changing the user interface language without going to the preferences. It appears on top of the page, next to the username.

It replaces the combobox that we had on the left column for the same purpose. It also does a lot of cool things which I haven't completely understood yet. --Kip (talk) 17:28, 1 June 2013 (CEST)

This is good, it makes it easier to find your favorite language. Thanks! --InfoCan (talk) 19:04, 4 June 2013 (CEST)

Part of speech sorting

New feature: definitions are now sorted by part of speech :-)

See for example round. There is also a blogspot about it containing some illustrations.

If someone is interested in running a bot to fill in the missing part of speech information, please let me know. We already have (I think) an API to add annotations, so that a bot can use that API using any programming language. --Kip (talk) 21:42, 4 June 2013 (CEST)

Wow! This is great! Nice feature. --Hiong3-eng5 (talk) 01:00, 5 June 2013 (CEST)

Great! Now Omegawiki looks a bit more like a dictionary. --Tosca (talk) 13:58, 5 June 2013 (CEST)

Inflexions, again

In the discussion above [5] regarding how to deal with Inflexions, Kip had suggested that for some languages this can be done as a vector, for others as a table, and for some languages some more complex solution will need to be determined.

I agree that each language will probably have to be done differently but would like to offer a different perspective. I think that in most cases inflected forms are generated according to simple rules from a few basic forms. It is those basic forms that we need to be recorded in the OmegaWiki database.

For example, in English, plurals of nouns cannot be generated automatically, because there are many exceptions: house-houses, mouse-mice, sheep-sheep, belly-bellies, etc. So, for English nouns, I suggest we need to create a lexical annotation field called "plural".

Similarly, for English adjectives, comparatives and superlatives are not always obvious: bad-worse-worst, but sad-sadder-saddest; sour-sourer-sourest, but brief-more brief-most brief. So, for English adjectives, I suggest we create two lexical annotation fields called "comparative" and "superlative".

For English verbs, there are only five different forms: base form, third person singular present, past tense, past participle, present participle. For the verb "to go", these are "go", "goes", "went", "gone", "going". Of these, the base form is the same as the infinitive (e.g. "to go" -> "go") and the third person singular present can be constructed with consistent rules from the base form. However the past tense, past participle and present participle can have irregularities and for this reason most English dictionaries record them. For example for the verb "go", the words "went", "gone" and "going" are given. So, for English verbs, I suggest we create three lexical annotation fields called "past tense", "past participle", "present participle". German should be similarly simple.

I realize not every language is as simple, French is one example, and we will have to find a different solution for it.

Turkish is an inflectional language but the rules for constructing inflected forms are simple enough that you can write a program to do it, with only a few exceptions. If those exceptions are recorded in the database as lexical annotations, all other inflexions can be generated correctly according to simple rules. So, a few lexical annotations and some scripts are sufficient to generate all inflected forms of Turkish words. (I am currently working on writing scripts to generate inflected Turkish words and to parse inflected Turkish words.) I am guessing that Mongolian is like Turkish, Fiable could comment on that.

Kip, could we start adding language-specific lexical annotation fields for basic inflected forms, like the ones I described above?

--InfoCan (talk) 21:32, 5 June 2013 (CEST)

I'm afraid that English is one of the simplest languages inflexion-wise. Even German has a dozen or so forms for verbs, six for adjectives, etc. (Disclaimer: I don't speak German.) So while we could create hundreds of annotations, there might be a better way. Or not. --- Ypnypn (talk) 19:39, 6 June 2013 (CEST)
In German you would have a table with declension on one row and singular/plural/gender on columns. Adjectives in German are not really interesting, they are all regular.
Hmm actually, the German Wiktionary does have declension tables for adjectives: [6].
I know that the solution proposed by InfoCan would be nice in that we could have it right now, however I have already considered it in the past and saw the following problems in it:
  • basically it can take care only of the simple "vector" inflexions.
  • the inflexions would be mixed with the other annotations and not particularly ordered. I think it is best in the database to store them separately as inflexions, and also to display them separately from the other annotations. In Wiktionary, comparative/superlative are displayed differently from other annotations and I think it is how the reader expects to have it.
  • you would have to enter the inflexions for all DMs of the same word that share the same inflexions (e.g. all meanings of the verb "go"). In the solution that I proposed, the vector (or table) of inflexions has a unique id, and you could then attach this id directly to other meanings. When you modify or correct an error in one inflexion, it would be automatically corrected for all words that are attached to that inflexion.
  • I would also prefer that the inflexions be considered as expressions so that you find them when you use the search feature (where you currently find expressions).
  • And also it would be nice that inflexions have themselves annotations. For example we could give their IPA, and maybe a usage note (like "this inflexion is valid but considered old" or "this inflexion is valid only in South Germany" or whatever). Or: would we also want that any annotation can be annotated?
I am still considering the solution I proposed, but now with a 3-dimensional table (basically a list of tables). For example this conjugation of a French verb is a 3-dimensional table. Of course with a 3-dimensional table, if you set two dimensions to "1", then you have a vector, and for the user it would look like a vector. Probably I should just go ahead, implement it, and see which problems arise (and let you test it on the test server). I don't see why it should not work, and I have not seen someone come up with a good alternative.
However, if you can't wait with having inflexions and want to have what InfoCan listed above, we can start with that now, and convert them later to another system. This is the point of a relational database, conversions from one format to the other are easy :) --Kip (talk) 20:11, 6 June 2013 (CEST)
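The point about giving each inflexion vector its own id can be illustrated with a small Python sketch. The table layout, the ids and the meaning labels are all made up for the example; this is only meant to show why a shared id means a correction has to be made once.

```python
# Hypothetical storage: one inflexion vector per id ...
inflexion_sets = {
    1: {"past tense": "wnet", "past participle": "gone", "present participle": "going"},
}

# ... and several meanings of the same verb pointing at that single id.
meaning_to_inflexion = {
    "go (to move)": 1,
    "go (to function)": 1,
}


def forms_for(meaning):
    """Return the inflexion vector attached to a meaning."""
    return inflexion_sets[meaning_to_inflexion[meaning]]


def correct_form(set_id, label, new_form):
    """Fix one inflected form; every meaning sharing the id sees the fix."""
    inflexion_sets[set_id][label] = new_form


# The typo "wnet" is corrected once and is fixed for both meanings.
correct_form(1, "past tense", "went")
print(forms_for("go (to move)")["past tense"])
print(forms_for("go (to function)")["past tense"])
```

With per-meaning annotations instead, the same correction would have to be repeated for every meaning of "go".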

My suggestion is an alternative to the 2- or 3-dimensional solution. It seems to me that a lot of inflexions can be algorithmically generated.

For example, in French, first and second group verbs are regular; they conjugate according to a simple pattern. To conjugate first group verbs in the present tense, you remove the -er ending of the infinitive, then add -e, -es, -e, -ons, -ez, -ent for 1PS, 2PS, 3PS, 1PP, 2PP, 3PP. This rule is a much more compact way of storing the information than a table. Even third group verbs, which are irregular, can be conjugated if you record in the database the verb's principal parts (the forms that you must memorize in order to be able to conjugate the verb through all its forms). See the section on third group verb conjugation in the Wikipedia article on French conjugation [7]. That leaves only a very small number of words that don't fit the rule and must be recorded as exceptions. So, as far as OmegaWiki is concerned, for French verbs, you only need to annotate 1) what conjugation group a verb belongs to, 2) if it is a 3rd group verb then its principal parts, and 3) if its conjugation is very irregular, then a flag to indicate that rather than use an algorithm for its conjugation, a pre-filled conjugation table should be displayed.
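As a rough illustration of how compact such a rule is, the present tense of first group verbs could be generated like this in Python, using the standard endings -e, -es, -e, -ons, -ez, -ent. The function name is made up, and the sketch ignores spelling adjustments (e.g. "manger" -> "mangeons") and the fact that "aller" ends in -er without belonging to the first group.

```python
# Present-tense endings for 1PS, 2PS, 3PS, 1PP, 2PP, 3PP.
PRESENT_ENDINGS = ("e", "es", "e", "ons", "ez", "ent")


def conjugate_present_first_group(infinitive):
    """Strip the -er ending and append the six present-tense endings."""
    if not infinitive.endswith("er"):
        raise ValueError("not a first group infinitive: " + infinitive)
    stem = infinitive[:-2]
    return [stem + ending for ending in PRESENT_ENDINGS]


print(conjugate_present_first_group("parler"))
```

The only per-verb information needed here is the conjugation group; the six forms themselves never have to be stored.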

Turkish is a highly inflectional language and you cannot display all possible inflections in 3 or even 4 dimensions. For example "Avrupalılaştıramadıklarımızdanmışsınız" means "you are rumored to be one of those whom we have not been able to make European" [8]. However, Turkish is highly regular and all such words can be generated algorithmically. There are a few exceptions for foreign-origin words and for these it is sufficient to record one inflected form of the word (the dative case, for nouns) so that the algorithm can generate all other inflections correctly. So, as far as OmegaWiki is concerned, for Turkish nouns, you only need to indicate whether it is irregular, and if so, record the dative case of that noun. With this minimal amount of information an algorithm can generate all inflected forms of that noun.

For English, as I mentioned above, you only need to store three forms of verbs (e.g., went, gone, going) to generate all conjugations according to algorithmic rules. If the past participle of a verb is recorded in the database (e.g., "gone"), an algorithm can correctly generate conjugations like "I had gone" or "he will have gone" or "we should have been gone"; you don't need separate tables for all of these tenses.

So, I would argue that storing language-specific algorithms, plus, possibly, a minimal number of word-specific parameters, is sufficient to display all inflections. This would be a much more efficient form of storage than tables of inflections. Tables could still be used, but only for a small fraction of words that are irregular. --InfoCan (talk) 15:24, 7 June 2013 (CEST)

I understand what you mean, it is also how it is done at Wiktionary, with templates where you give only a base form and sometimes one or two other forms, and it generates the rest.
Actually, the way of storing the information (rule or explicit storage) depends on what we want to do with inflexions. If we just want to display them, then rules are fine and, as you say, an optimal way of storing inflexions. However if we want to be able to search for an inflexion, and have the system return something like "this is the third person, present of verb ...", then rules are not nice, and you need something called stemming (or lemmatisation) which is not trivial and has kept a few generations of linguists busy already.
So, if we want to be able to search for inflexions, we need to store each of them individually. This is also done at Wiktionary, where they create a page for each inflected form (bots use the rule-based inflexion templates to generate these pages).
In the system that I described, I would of course add the possibility to semi-automatically fill in an inflexion table by using rules, because entering all inflexions manually would be too cumbersome. But I think that all inflexions should be stored in the database, and not just rules.
I can only speak about languages I know. For Turkish, I don't know if it is really interesting to be able to search by inflected forms. If it is so regular that anyone can guess the stem of a form, then probably we don't need to store each form independently. Do we want to display inflexion tables for Turkish? I don't know.
In English, I don't plan to store "he will have gone", because nobody would want to look for that one, but only a vector went, gone, going. For display-only (not search), there is also the possibility to rely on external websites (like verbix).
For French, we would probably want to be able to search for "fussent" (subjonctif imparfait) but not for "eussent été" (subjonctif plus-que-parfait). Maybe we can have a mix of forms that we store explicitly (e.g. all subjonctif imparfait inflected forms in French, even if regular), and forms that we want to display in the inflexion table, but don't want to be able to search for (e.g. all subjonctif plus-que-parfait forms in French, these are always regular). But the implementation of that is a challenging nightmare :) --Kip (talk) 18:32, 7 June 2013 (CEST)
I don't think inflection tables are needed in Turkish. It may be better to have a special page where a user can enter a word and the page generates all inflections on demand. This is a simple thing to program. Parsing inflected forms of Turkish words is doable too, although a bit more complex, because multiple root words with different inflections can produce the same end result. Despite some of the complexities mentioned in the Wikipedia article on stemming, I know for a fact that Turkish word-parsing scripts do exist. I don't have access to them but I have a good idea of how they would work and I am working on writing one myself now. It would be nice to integrate both types of scripts (inflection addition and inflection removal) into OmegaWiki.
I guess we will need different solutions for different languages. For Turkish and other inflected languages, I believe the only workable solution will be to write word-parsing scripts. For languages like French, I am uncertain; if you think that lookup tables will be more efficient than a rule-based parser, I am fine with that. --InfoCan (talk) 03:08, 9 June 2013 (CEST)
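The difference between "display only" and "searchable" can be sketched as follows in Python: forms that are stored explicitly go into a reverse index from inflected form to (base word, description), so a search for "went" or "fussent" finds its verb without any stemming. The data layout and labels are assumptions for illustration only.

```python
from collections import defaultdict

# Hypothetical explicitly stored forms: base word -> {description: inflected form}.
stored_forms = {
    "go": {
        "past tense": "went",
        "past participle": "gone",
        "present participle": "going",
    },
    "être": {
        "subjonctif imparfait, third person plural": "fussent",
    },
}


def build_form_index(entries):
    """Map each stored inflected form back to (base word, description) pairs."""
    index = defaultdict(list)
    for base, forms in entries.items():
        for description, form in forms.items():
            index[form].append((base, description))
    return index


index = build_form_index(stored_forms)
print(index.get("went"))
print(index.get("fussent"))
print(index.get("eussent été"))  # not stored, so not searchable
```

Forms that are only generated by rule at display time, such as "eussent été" here, simply never enter the index.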

Preferred translation

"Preferred translation" would be the possibility to indicate, from a list of synonyms/translations in a language, the one that best fits the definition, or the most frequently used word. For example in German, "verb" can be said "Verb" or "Tätigkeitswort" (a "word for activity" ;) ). "Verb" is more common but currently the German interface of OW randomly picks "Tätigkeitswort" as a translation.

I need your help with how to implement that. As far as the database is concerned, I'll start with the easiest possible solution. It will be a flag "1" or "0". A change in the flag will not be logged, so that we won't know who put which flag, but also it won't slow down the system. (If that does not work out I'll then consider more complex solutions).

Where I need your help is:

How to name it?

  • "preferred translation" , "most frequent translation", "best translation", ... I lack imagination. --Kip (talk) 20:27, 6 June 2013 (CEST)
"preferred translation" seems fine. "Best" sounds subjective. --InfoCan (talk) 18:33, 13 June 2013 (CEST)

How to show it?

Put the preferred translation in bold? underlined? with some text-color? Background-color? With a star in front? --Kip (talk) 20:27, 6 June 2013 (CEST)
(Probably this is not too much of an issue because it will be some css that each user can change to his own taste.)
How about just making it first in the display list, with no additional formatting? --InfoCan (talk) 18:35, 13 June 2013 (CEST)
Yes, that seems right --Hiong3-eng5 (talk) 06:53, 16 July 2013 (CEST)

How to change it?

This is mostly where I need input. I consider the possibility to change it without having to edit the whole page. I have the following ideas so far:

  • right click on the word would show a menu with "Set as preferred translation".
  • an extra column? --Kip (talk) 20:27, 6 June 2013 (CEST)
I think an extra column would be best. Right-clicking rarely works on websites (since it usually opens a browser-dependent menu). Ypnypn (talk) 17:01, 13 June 2013 (CEST)
Agree. Also, the option of right-clicking does not occur to some people (like myself). --InfoCan (talk) 18:32, 13 June 2013 (CEST)
Agree --Hiong3-eng5 (talk) 06:53, 16 July 2013 (CEST)

Suggestion

Kip, I read your post about how to implement this. My suggestion is that at the database level there will be a ranking, but when editing only a preference is given. The ranking is adjusted based on the new preferred translation. For example, if the ranking is free (1), freedom (2) and liberty (3), and I choose freedom as my preferred translation, the ranking is saved as freedom, free, liberty. --Hiong3-eng5 (talk) 06:53, 16 July 2013 (CEST)
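A minimal Python sketch of this "promote to the front" idea, using the example words above (the function name and the list representation are assumptions; the real storage would be a rank column in the database):

```python
def set_preferred(ranking, preferred):
    """Move the preferred translation to rank 1, keeping the others in order."""
    if preferred not in ranking:
        raise ValueError(preferred + " is not in the ranking")
    return [preferred] + [word for word in ranking if word != preferred]


ranking = ["free", "freedom", "liberty"]
print(set_preferred(ranking, "freedom"))
```

The editor only ever picks one preferred word; the order of the remaining words is a by-product of earlier choices.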

List of words

Good day, How can we obtain a list of all words in a given language, a bit like the category pages on Wiktionaries? Thanks, Amqui (talk) 20:01, 10 June 2013 (CEST)

I do not know, the closest I know is through a special page Data search. Check this out. Eastern Canadian Inuktitut (Latin script). --Hiong3-eng5 (talk) 07:01, 11 June 2013 (CEST)

I am in the process of adding prev/next buttons, like for category pages. Should be live tonight. --Kip (talk) 08:59, 11 June 2013 (CEST)
Thanks Hiong3-eng5, that's what I needed. When we are able to navigate through all the words (not only 100), it will be awesome. It would be even better if there were a way to not show the "External identifiers matching" section, which is pretty useless and clutters the page. Thanks, Amqui (talk) 17:14, 11 June 2013 (CEST)
You can hide "External identifiers matching" by unchecking the checkbox. I had already planned to have it hidden by default, and I'll replace the checkboxes with a radio button. --Kip (talk) 17:16, 11 June 2013 (CEST)
Ok, I added previous and next buttons, and removed the useless external ID search (didn't find how to easily have a radiobutton).
I rewrote a lot of the page, so if bug there is, tell me please :) --Kip (talk) 19:57, 11 June 2013 (CEST)
Great work Kip, thanks! I don't know if it's hard to do, but it looks weird to have (Previous 100) and (Next 100) buttons when there aren't 100 items to show in a given search. The message "Showing only a maximum of 100 matches (out of X)." is also still there, but is now useless. Amqui (talk) 06:09, 12 June 2013 (CEST)
No it is not hard to do, and makes sense ;-) --Kip (talk) 10:00, 12 June 2013 (CEST)
Done. --Kip (talk) 20:14, 13 June 2013 (CEST)
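
The behaviour agreed on above (only show the navigation buttons when there is something to navigate to) can be sketched like this. It is a minimal illustration in Python, not the actual Data search code; the function name pagination_links and its parameters are assumptions.

```python
def pagination_links(offset, total, page_size=100):
    """Decide which navigation buttons to show for a page of results.

    offset    -- index of the first result shown on the current page
    total     -- total number of matches for the search
    page_size -- number of results per page (100 in the discussion above)
    """
    return {
        # A "previous" button only makes sense after the first page.
        "previous": offset > 0,
        # A "next" button only makes sense if results remain after this page.
        "next": offset + page_size < total,
    }


# A search with 42 matches needs no buttons at all.
print(pagination_links(offset=0, total=42))
```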

Fusion[edit]

Not sure if there is a specific place to propose such things, but I think those two DMs should be merged: DefinedMeaning:blacksmith (430574) and DefinedMeaning:fabbro (699761). Thanks, Amqui (talk) 07:06, 12 June 2013 (CEST)

There is this dead page http://www.omegawiki.org/Meta:Questions_about_words , but I'll take care of the merging (we still lack a tool for that) --Kip (talk) 10:05, 12 June 2013 (CEST)
Done. --Kip (talk) 21:15, 12 June 2013 (CEST)
Thanks, Amqui (talk) 22:38, 12 June 2013 (CEST)

Add Atikamekw as an editable language[edit]

I'm helping the development of the Atikamekw Wikipedia and I thought I could add some words in it to OmegaWiki the same time, could you add it as an editable language please? Its ISO 639-3 code is atj. Thanks, Amqui (talk) 22:01, 13 June 2013 (CEST)

Done --Kip (talk) 12:20, 15 June 2013 (CEST)
Thanks, Amqui (talk) 23:50, 17 June 2013 (CEST)

Wikidata[edit]

The page Wikidata:Wiktionary has just been made into a new proposal for expanding Wikidata to provide basically everything OmegaWiki offers. -- Ypnypn (talk) 23:44, 19 June 2013 (CEST)

I also blogged about it. --Kip (talk) 10:22, 20 June 2013 (CEST)

There is an alternative proposal. Do you think we can get something from these and improve on OmegaWiki? --Hiong3-eng5 (talk) 06:46, 2 July 2013 (CEST)

The first proposal does not care about translations (which is at the center of OmegaWiki...) and so is pretty far apart from what we do (i.e. useless for us). I'll write a longer blog post about it when I find some time.
The alternative proposal looks much better prima facie, but I am waiting a bit to understand it better. --Kip (talk) 10:45, 2 July 2013 (CEST)
They write about us as well, but not very positively. I wonder why, and what we do wrong in their opinion. See http://www.wikidata.org/wiki/Wikidata:Wiktionary_(alternative_proposal)#How_is_this_different_from_Omegawiki
 Klaas|Z4␟V:  22:33, 8 July 2013 (CEST)
Probably because the current structure of OmegaWiki seems to be geared towards experts (well, that is how we are perceived; I am no expert). I know that when I first looked at it, while it was still WiktionaryZ, I was a little scared to use it. But the second time I looked at it (now as OmegaWiki), after meeting the people and tinkering a little with what OmegaWiki can do, I liked it. Like Kip, I see that the alternative proposal has some nice points, at least at first glance, once we understand it better. If we can assimilate it in a way that will benefit OmegaWiki, I say we go with it, though we do lack people who can help make OmegaWiki, how do I say it in their terms... more user-friendly, perhaps. The point of the alternative proposal is that contributors do not want to be weighed down by rigid structures (everything must be added to a defined meaning). We are comfortable with the structure; most people are not. Any ideas for the following: 1. How to make OmegaWiki more appealing to contributors 2. How to find programmers who will make OmegaWiki less rigid (their word). --Hiong3-eng5 (talk) 06:34, 9 July 2013 (CEST)
Well, I still don't really fully understand the alternative proposal. I only understand that it killed our "adopt OmegaWiki" proposal. --Kip (talk) 08:27, 9 July 2013 (CEST)
So, would we go now for an autonomous association and try to get money ourselves, or wait again a few years? As already said long ago by someone else, the term "DefinedMeaning" is a bit upsetting. What about "notion" or "concept" or just separating the 2 words: "defined meaning"? A way to make things clearer would be to have 2 different background colours for definedMeanings and for expressions. The 2 have similar interfaces, and this is confusing at the beginning. The white could be just for non-definedMeaning non-expression pages, as the home page, the parlours etc.. And, if we want to be efficient, we definitely need inflexions in OmegaWiki. Another user-friendly improvement would be to add a small version of the picture beside each definedMeaning in the expression page, so that the reader could choose quicker the meaning he is looking for. --Fiable.biz (talk) 06:25, 28 July 2013 (CEST)

Alexa ranking[edit]

Our global ranking improved. As of today we are ranked 815,940 (3 July 2013 +0800) against 1,559,100 (8 May 2013). fyi --Hiong3-eng5 (talk) 03:28, 3 July 2013 (CEST)

It is because I installed the alexa toolbar in my browsers... --Kip (talk) 09:11, 3 July 2013 (CEST)

internal code change[edit]

I slightly changed the way comboboxes work internally. I replaced a few hidden HTML inputs with HTML attributes to pass parameters. The effects are:

  • html pages are now a bit smaller in edit mode
  • fewer parameters are sent to the server when clicking "send", so it could be a tiny bit faster.
  • you need to refresh your cache (Ctrl+F5 or Shift+F5) if the comboboxes don't work anymore for you. --Kip (talk) 12:26, 6 July 2013 (CEST)

Research[edit]

Yes I had read it as well. I don't know them personally, but they have been using OW data for a few years. --Kip (talk) 17:44, 16 July 2013 (CEST)

Dynamic editing of annotations[edit]

The annotations in the right column of the translation tables have been changed to allow for dynamic editing. The main idea behind it was to not load all the annotations when a page is displayed. It saves a lot of SQL queries, makes the HTML pages smaller, and creates fewer HTML inputs to send back to the server when saving. So now:

  • a page like Expression:water can be viewed and edited much faster.
  • the annotations are loaded only when one clicks on "show" (so it typically takes half a second to one second to display, depending on how busy the server is)
  • the annotation panel now has edit and save functionalities, independent of the main "save" button.
  • a first side effect of this is that it is now possible to edit annotations in view mode, without the need to edit the entire page, which should save us some time (but it takes some time to get used to it).
  • a second side effect is that the "show/hide" button is now always displayed, even if there are no annotations (so that you can click "show" and then edit).

The main annotations ("lexical annotations" and "semantic annotations" sections) are not affected (yet). --Kip (talk) 17:42, 16 July 2013 (CEST)

Oh, and you need to refresh your browser's cache if it does not work. --Kip (talk) 17:43, 16 July 2013 (CEST)
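
The idea of loading annotations only on "show" can be sketched as follows. This is a rough illustration in Python of the general lazy-loading pattern, not the actual OmegaWiki implementation (which runs in the browser and on the server); the class AnnotationPanel and the fetch callable are hypothetical names.

```python
class AnnotationPanel:
    """Annotation panel whose content is fetched only on first display."""

    def __init__(self, syntrans_id, fetch):
        self.syntrans_id = syntrans_id
        self._fetch = fetch        # hypothetical callable: id -> annotations
        self._annotations = None   # nothing is loaded when the page is built

    def show(self):
        # The (slow) query runs only the first time "show" is clicked;
        # later clicks reuse the annotations that were already loaded.
        if self._annotations is None:
            self._annotations = self._fetch(self.syntrans_id)
        return self._annotations


def fake_fetch(syntrans_id):
    # Stand-in for the server query; the returned data is made up.
    print(f"querying annotations for syntrans {syntrans_id}")
    return {"part of speech": "noun"}


panel = AnnotationPanel(42, fake_fetch)  # building the page: no query yet
panel.show()                             # first click: one query
panel.show()                             # second click: no new query
```

A page with many translations then costs no annotation queries at all until a reader actually opens a panel.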

Request[edit]

I want to request:

  • - part of speech for all Mandarin and Min Nan Languages. (btw, why is this not automatic for all Languages?)
  • - a syntrans relation script variant. I think this will be useful for Languages using many scripts, like Min Nan, Japanese etc.

--Hiong3-eng5 (talk) 06:34, 23 July 2013 (CEST)

    • "pos not automatic" => because parts of speech are different for different languages. For example, Mandarin has a part of speech "classifier" which has no equivalent in e.g. English. Many languages have a part of speech "article", which does not exist in Mandarin.
    • Mandarin and Min Nan languages have "verb", "noun", "adjective", "adverb", "classifier", ...?
    • "script variant": should we enable it for all languages, or only those that have several scripts? Could we use "alternate spelling" for that, or is it better to have a separate "script variant" annotation? --Kip (talk) 10:26, 24 July 2013 (CEST)


      • Mandarin and Min Nan have:
  1. Expression:noun
  2. Expression:pronoun
  3. Classifier/measure word
  4. Stative verb/adjective (some of my references use "stative verb")
  5. Expression:verb
  6. Expression:coverbs
  7. Expression:adverb
  8. Expression:place word
  9. Expression:question word
  10. Expression:time word
  11. Expression:particle
  12. Expression:auxiliary verb
  13. Expression:verb object
  14. Expression:interjection
  15. Expression:numeral
  16. Expression:postposition
  17. Expression:preposition (some of my references point to preposition, others to postposition)
  18. Expression:particle
  19. Expression:conjunction
  20. Expression:onomatopoeia
Maybe only for those with script variants? I think alternative spelling should be used within a single script of a language, so I still prefer having an Expression:script variant. This way, I would have a way to separate these two things when using the database, say when creating dictionary files.
-- 向榮 /Hiong3-eng5/ (talk) 07:06, 19 August 2013 (CEST)
I've just added the part of speech for Mandarin (simplified + traditional) and Min Nan (simplified + traditional + POJ).
For "script variant", I think it would be better to give it a name that depends on the language. For example "traditional script" for the simplified->traditional relation. Otherwise, for Min Nan you would have "script variant" for both "simplified->traditional" and "simplified->POJ". Or is that OK? --Kip (talk) 22:26, 7 September 2013 (CEST)

Yes, that would be nice. So I would request Expression:traditional script, Expression:simplified script, Expression:POJ script, Expression:Han Romanization script where they are applicable. Thanks! --向榮 /Hiong3-eng5/ (talk) 23:30, 7 September 2013 (CEST)

Policy to create separate definedMeanings or not[edit]

We need a more precise policy to create defined meanings (DMs). A few examples:

  • should "king" and "reigning queen" be one or two DMs?
  • In Malagasy, the participle "vonòina" ("killed") is not derived from the active verb "mamòno" ("to kill") and is listed as a different word by the standard reference dictionary of Abinal and Malzac, though the 2 words or forms derive from a common root. Do we need 1 or 2 DMs for this?
  • In English, there are nearly no declensions, so "whose" is considered a different word from "who". In Latin, the equivalent words are just the genitive and nominative of one word. Do we need 1 or 2 DMs in such a case?

Remember that OmegaWiki is not only for Indo-European languages, so please think, then vote on the different proposals there: Policy to create separate definedMeanings or not. It is a good idea to read the discussions on that page before voting. OK, it's a bit long, but it's important, isn't it? If you share the fruit of your reflection on that page, even better. Thank you in advance. --Fiable.biz (talk) 12:05, 28 July 2013 (CEST)