International Beer Parlour/Archive20091130


Title of a page

Hi, I have changed what appears at the top of a page from "Multiple meanings:" to "Definitions of:" (MediaWiki:Ow Multiple meanings). The idea is that it should show up better in search engines, for example [1], or when someone types "definition gewittrig" into Google. Of course we can revert it and/or discuss it. I just wanted to see first if it worked just by changing MediaWiki:Ow Multiple meanings. --Kipcool 14:22, 24 August 2009 (UTC)

I like the idea, it is much more straightforward than "multiple meanings" and its translations. Changed the German version as well. --Tosca 15:23, 24 August 2009 (UTC)
I think it's a little awkward to have the colon after "of". "Definitions of gewittrig" is made to be a continuous phrase whereas "Multiple meanings: gewittrig" makes sense because "Multiple meanings" is more of a label than the start of a phrase. Maybe even "Definitions of 'gewittrig'", although it doesn't look like the template, as it's set up now, supports that --rappo 05:32, 25 August 2009 (UTC)
I removed the colon as suggested.
I tried "Definitions of 'gewittrig'", but that would need to have a template like "Definitions of '$1'", meaning it depends on a parameter. That is possible to do, but the php code has to be changed for this. I can try it on my testserver first and see how it looks. --Kipcool 08:40, 25 August 2009 (UTC)
Google has already started indexing some of the pages with the new title [2]. So, for example someone googling Definition aguarana would obtain [3]. --Kipcool 11:57, 25 August 2009 (UTC)
I would recommend changing it at http://translatewiki.net, the centralized repository for MediaWiki. Like this, we can get the benefit of updating all translations. Malafaya 16:15, 3 November 2009 (UTC)
Actually, the way it is built, it may not work for every language. Ideally, we should transform the message so that it gets a parameter (i.e. "Definitions of $1").
Done as suggested by Malafaya (with a parameter). --Kipcool 21:29, 9 November 2009 (UTC)
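
The parameterised message discussed above can be sketched as follows. This is only an illustration in Python, not the MediaWiki PHP code: the `MESSAGES` table, the German and Japanese strings, and the `render_title` helper are all hypothetical. It shows why a `$1` placeholder helps, since some languages need the expression somewhere other than at the end.

```python
# Hypothetical message table: one template per interface language.
# "$1" marks where the expression goes, as in "Definitions of $1".
MESSAGES = {
    "en": "Definitions of $1",
    "de": "Definitionen von $1",  # assumed wording, for illustration only
    "ja": "$1 の定義",             # assumed wording; the expression comes first
}


def render_title(lang: str, expression: str) -> str:
    """Fill the $1 placeholder, falling back to the English template."""
    template = MESSAGES.get(lang, MESSAGES["en"])
    return template.replace("$1", expression)


print(render_title("en", "gewittrig"))
```

With a fixed prefix such as "Definitions of" followed by the expression, the third template could not be expressed at all.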

Bot/Mass Database Insert?

I've been working on transliterating Serbian Cyrillic expressions and defined meanings in Serbian Latin for a while. Now that the "Expressions needing translation" shows how many entries there actually are, I see that there are 1,666 expressions alone that need transliterating, which makes me not really want to continue this by hand. It's a fairly mechanical process going from Cyrillic->Latin (there are no irregularities as in going from Latin->Cyrillic) and I can easily make a script to perform this.

So basically my question is - what resources are available? Am I able to make a bot that can add expressions on OmegaWiki? If not, is there a way I could supply someone with the proper SQL file to insert the new expressions into the database (which I would generate with my own local PHP script)? Thanks! --rappo 00:06, 26 August 2009 (UTC)
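
A minimal sketch of the mechanical Cyrillic->Latin step described above, written in Python rather than the PHP mentioned in the thread. The names `LOWER`, `TABLE` and `to_latin` are hypothetical, and the handling of capitals is an assumption: an upper-case Cyrillic letter becomes a capitalised Latin letter or digraph, so all-caps words would need extra care.

```python
# Serbian Cyrillic -> Latin: every Cyrillic letter maps to exactly one
# Latin letter or digraph, so no context is needed in this direction.
LOWER = {
    "а": "a", "б": "b", "в": "v", "г": "g", "д": "d", "ђ": "đ", "е": "e",
    "ж": "ž", "з": "z", "и": "i", "ј": "j", "к": "k", "л": "l", "љ": "lj",
    "м": "m", "н": "n", "њ": "nj", "о": "o", "п": "p", "р": "r", "с": "s",
    "т": "t", "ћ": "ć", "у": "u", "ф": "f", "х": "h", "ц": "c", "ч": "č",
    "џ": "dž", "ш": "š",
}

# Assumption: upper-case letters become capitalised Latin ("Љ" -> "Lj").
TABLE = dict(LOWER)
TABLE.update({cyr.upper(): lat.capitalize() for cyr, lat in LOWER.items()})


def to_latin(text: str) -> str:
    """Transliterate letter by letter; other characters pass through."""
    return "".join(TABLE.get(ch, ch) for ch in text)


print(to_latin("Београд"))
```

The reverse direction is not a simple table lookup, because a Latin digraph such as "nj" may stand for one Cyrillic letter or for two.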

It seems to me that it would be complicated to make a bot adding expressions and translations in OmegaWiki. But if you can make one, I am interested to see (and use) it :-)
So, I think the easiest way is to do an SQL insertion directly. As far as I know, the active people with access to the OmegaWiki server are User:GerardM, User:RobertL and User:Malafaya (and maybe me soon...?).
At some point, the GEMET data was imported in OmegaWiki, also via SQL if I remember well. This was done by User:Leftmostcat. You might ask him about the details. He is not active in OmegaWiki, but you can send him an e-mail.
In any case, I am interested as well to learn how to import data. --Kipcool 07:35, 26 August 2009 (UTC)
Thanks for the info, I'll just go with the SQL route for now! I've already got all the expressions made so I'll just format it for the SQL file and hopefully one of the other users you mentioned can take it from there. I'll test the SQL fully and then report back here later on! --rappo 16:28, 26 August 2009 (UTC)
So after some trial and error I realized that SQL wasn't going to cut it, and I'd have to make a PHP script. I did that and it seems to be working out correctly (the expression is added in the right spot, the addition goes into the Recent Changes and also into the user's contributions page)... it all seems to check out. Kipcool, do you know if/how I can use this script? My plan is to create a button on the "Special:NeedsTranslation" page that only appears when you're translating from Serbian (Cyrillic) -> Serbian (Latin) - clicking the button will automatically add the Serbian (Latin) expressions for all the Serbian (Cyrillic) entries found. Is it possible to add something like this to the site, and does it sound like a viable solution? --rappo 19:28, 26 August 2009 (UTC)
By the way, if there is no "DefinedMeanings needing translation" page at the moment, I could probably work on creating that (along with the Serbian automation, since there is a need for that as well). --rappo 19:32, 26 August 2009 (UTC)
  1. In fact, the "Expressions needing translation" page is already a "DefinedMeanings needing translation" page. Only when the links are created, an expression link is displayed instead of a direct link to the DM needing translation. I have been asked recently by Gerard to change the expression link into a DM link, which I did last weekend [4]. I just have to send the patch to a developer... Is this what you intend to do?
  2. Yes, it is possible to add a button to "Special:NeedsTranslation" which appears only for this pair of languages. And yes, it would be nice to have such a button :-) The opinion of Gerard on the matter would also be nice... Time to sleep now --Kipcool 21:57, 26 August 2009 (UTC)
Oh hmmmm I'm mixing up the word DefinedMeaning with Definition then. I meant the text of the definition inside the DefinedMeaning - for example, Adang has a Serbian Cyrillic definition, but not Serbian Latin. The page I suggested would let you select "Source: Serbian (Cyrillic), Destination: Serbian (Latin)" and this DefinedMeaning would be displayed, even though the synonym/translation is there.
And as for the button I'll work on that then and make sure it's all bug-free :P --rappo 22:53, 26 August 2009 (UTC)
Ah, ok with "Definitions needing translation". It was also in my todo list, but I've not started working on it yet, so please do :-) --Kipcool 07:15, 27 August 2009 (UTC)
We had a chat about it with Gerard, and there are two possible ways of approaching the problem: (1) "definitions in English that have to be translated into French" and (2) "expressions in French that have no definition in French". We considered that (2) was more important, but of course in your case, you are more interested in (1)... Or maybe we should combine (1) and (2), i.e. "expressions in French that have no definition in French but have a definition in English" (with source=English, target=French), keeping the possibility of entering no source, corresponding to (2).
I don't know if I am clear... If not, just forget it and do what you have in mind ;-) --Kipcool 07:15, 27 August 2009 (UTC)
Hi Kipcool, I understand what you were asking - I've been trying to accomplish this and what I had in mind today but it really seems like the SQL query is too complicated... the page never loads for me! I'm not sure if the query can be simplified, if you know some SQL maybe you can take a look at it. Here is the query that I want to run to check if there's a definition in one language without a translation in another language:
  SELECT translated_content_id
  FROM (uw_translated_content source_translated_content, uw_defined_meaning source_defined_meaning)
  WHERE source_translated_content.language_id = 128
    AND NOT EXISTS (
      SELECT destination_defined_meaning.defined_meaning_id
      FROM (uw_defined_meaning destination_defined_meaning, uw_translated_content destination_translated_content)
      WHERE source_defined_meaning.defined_meaning_id = destination_defined_meaning.defined_meaning_id
        AND destination_defined_meaning.meaning_text_tcid = destination_translated_content.translated_content_id
        AND destination_translated_content.language_id = 129)
  LIMIT 100
Anyway for now I'm going to revert my code to just have the button for transliterating the Serbian expressions and forget about the definitions. What should I do with my PHP file at that point? --rappo 22:38, 28 August 2009 (UTC)
Isn't there a "source_defined_meaning.meaning_text_tcid = source_translated_content.translated_content_id" missing? --Kipcool 06:42, 31 August 2009 (UTC)
Discussion continued on User_talk:Rappo#Working_SQL_query
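
For readers without access to that talk page, here is a hedged sketch of the query with the join condition Kipcool pointed out. It is not the query that was finally used. It runs against an in-memory SQLite database with a reduced, hypothetical two-column version of each table (the real OmegaWiki tables have more columns), and it returns defined_meaning_id rather than translated_content_id. Without the join, every language-128 row is paired with every defined meaning, which would explain why the page never finished loading.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
-- Reduced, hypothetical schema: only the columns the query touches.
CREATE TABLE uw_defined_meaning (defined_meaning_id INTEGER, meaning_text_tcid INTEGER);
CREATE TABLE uw_translated_content (translated_content_id INTEGER, language_id INTEGER);
-- DM 1 has a definition in languages 128 and 129, DM 2 only in 128.
INSERT INTO uw_defined_meaning VALUES (1, 10), (2, 20);
INSERT INTO uw_translated_content VALUES (10, 128), (10, 129), (20, 128);
""")

QUERY = """
SELECT dm.defined_meaning_id
FROM uw_defined_meaning AS dm
JOIN uw_translated_content AS src
  ON dm.meaning_text_tcid = src.translated_content_id  -- the missing join
WHERE src.language_id = ?
  AND NOT EXISTS (
      SELECT 1
      FROM uw_translated_content AS dst
      WHERE dst.translated_content_id = dm.meaning_text_tcid
        AND dst.language_id = ?)
LIMIT 100
"""

# Defined meanings with a definition in language 128 but none in 129.
missing = [row[0] for row in conn.execute(QUERY, (128, 129))]
print(missing)
```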

Saving OmegaWiki

It seems that OmegaWiki is slowly dying. The only programmer seems to be Kipcool; when you post anything, you need weeks to get an answer (even when it's just to get another language); the server has problems; some parlours have died; and the rank of OmegaWiki in Google PageRank or Alexa is poor. The site doesn't say which legal entity or natural person is legally responsible for it (which is required by the law of many countries), but whois says the domain name is registered by Gerard Meijssen, who is probably the one to decide. I think radical measures should be taken soon if we want to avoid the fate of GNUpedia or others. I suggested making this site more attractive with photos; I'd add "icons", as in Wiktionary. But the main thing is to think of finances. Did the Wikipedia foundation refuse OmegaWiki as one of its projects? --Fiable.biz 00:30, 26 September 2009 (UTC)

  • OmegaWiki is not slowly dying, it is as always, with about 5 regular contributors (the names are changing but the number is always about 5).
  • I am not the only programmer, I am like the others, we are programming OW in our free time... I also have a job, I have a personal life, I contribute to other wikis (just like almost everybody), which does not leave much free time to program. In fact I have just started with programming OW.
  • when you don't get answers to your questions, the best is to harass GerardM on IRC :p
  • the server has problems...
  • the parlours for individual languages were never really active, because a critical number of contributors has not been reached. When I am the only French contributor, I don't discuss with myself in the French parlour...
  • Alexa rating is not relevant... I prefer considering http://omegawiki.org/stats/
  • Gerard Meijssen is the boss :-)
  • OmegaWiki was refused as a Wikimedia project, I think because they already have Wiktionaries, so if they adopt OW, they would have to stop the Wiktionaries, which is not possible at this point since OW does not provide all the functionalities that the Wiktionaries have. (this is my point of view, so it may be wrong...). --Kipcool 09:26, 24 September 2009 (UTC)
I proposed it again inside a machine translation proposal. --Fiable.biz 01:48, 26 September 2009 (UTC)
I like the proposal, thanks :-) --Kipcool 23:02, 9 October 2009 (UTC)

Finances

OmegaWiki does not have enough developers and, much too often, the server has problems or is much too slow. A way to finance the project could be to accept a reasonable amount of advertisements. Another possibility would be to accept services (programming, interface translation, massive data input, hosting) in exchange for adverts. --Fiable.biz 01:34, 28 August 2009 (UTC)

  • I am not in favor of advertisements. I would prefer giving some of my own money instead... The current problem is mostly that I don't know how much money is needed. --Kipcool 09:26, 24 September 2009 (UTC)

Meta:Wikiproteins

I would be interested in participating in the project Meta:Wikiproteins, but I do not understand where the articles of this project are. Why is there only one page in Category:Wikiproteins? Or is this project not hosted here? SergeyJ 19:09, 9 September 2009 (UTC)

Hi. I don't know much about WikiProteins but you can find it at http://www.wikiproteins.org . Cheers, Malafaya 23:09, 9 October 2009 (UTC)

Today

Today:

  • A big part of this page has been archived. If any ongoing conversation is lost, sorry and don't hesitate to bring it back.
  • Same applies mutatis mutandis to Insect room
  • I have fixed the bug that when you want to add a class and type the first letters, it doesn't work (now it does)
  • and more importantly, thanks to Malafaya it is again possible to add new annotations. For the record, the problem was that the table uw_bootstrapped_defined_meanings was found empty in the database for some obscure reason. It should contain User:Kipcool/uw_bootstrapped_defined_meanings :-) --Kipcool 22:59, 9 October 2009 (UTC)
Hi, thanks to Malafaya and Kipcool. However, annotations are absent again. --Ascánder 12:47, 12 October 2009 (UTC)
Maybe it has something to do with the full server reboot that we've just performed? I'll try and fix it this evening. --Kipcool 13:55, 12 October 2009 (UTC)
fixed, it was the same problem.
In the process, I have made it impossible to access a particular script, "Create wikidata.php", which could be accessed from the outside and potentially empty this table in the database. Maybe this was the problem... --Kipcool 16:58, 12 October 2009 (UTC)
Thanks again. --Ascánder 23:21, 12 October 2009 (UTC)

Annotations and relations

Hi, I read most of the guides here, but I still can't figure how to add annotations and relations. Some help please? --Anime Addict 18:09, 13 October 2009 (UTC)

Hi, it depends what kind of annotation you want to add.
When you edit a DM or an expression, one kind of annotation is available in the list of synonyms and translations, in the last column on the right (next to "identical meaning?"). You should see "Annotation>>". Clicking it opens a box with three lines, each of which can be clicked to reveal some comboboxes and textboxes where you can add your annotations.
Some annotations are language dependent (part of speech for example), and it is quite possible that they have not been made available for the Romanian language. So, check first if you can add an annotation for an English word.
Below the list of synonyms and translations, there are also the "class" annotation (clicking on it reveals a combobox) and below it an "Annotation" where the available values depend on the chosen class.
If there are still problems, tell us. You can also come on IRC for a live explanation. --Kipcool 18:38, 13 October 2009 (UTC)

Hmph, I noticed them now. I edited a few weeks ago and didn't notice them, thanks for the whole explanation.

There's one more problem though: I've made some articles on Wikinfo (very similar to Wikipedia, only it allows original research and other stuff), but OmegaWiki seems to only let me add links to Wikipedia. Can I be helped with that? Thanks again for such a prompt reply! --Anime Addict 19:11, 13 October 2009 (UTC)

Not really... At the moment there are no plans to link to anything other than Wikipedia. --Kipcool 05:51, 14 October 2009 (UTC)

Change to the mySql server

Hi, mysql has been unavailable again today. So, I have decided to try changing the configuration file a bit. In particular, I have increased the number of allowed (simultaneous?) connections. My hope is that it will prevent mysql from going down so often. However, I don't know what the real consequences can be. Slower server? If you notice anything, please bug me. In any case, I have kept a copy of the previous configuration, so that I can restore it easily. --Kipcool 18:34, 15 October 2009 (UTC)

hmm the problem is still there :-( --Kipcool 14:47, 26 October 2009 (UTC)

Statistics (sorry)

Hoi, today I have fixed the statistics page [5] which was correct for the individual languages, but showed wrong overall numbers for DM and Exp, because the deleted exp and DMs were counted. In the process, we have "lost" some 1000 DM and 20k exp. We have to reach the 400k milestone again :-( --Kipcool 15:49, 25 October 2009 (UTC)

Ouch, the truth hurts. But the adjustment was necessary, so thank you. --Tosca 16:55, 25 October 2009 (UTC)

Class "Instrument"

IMO the class instrument is wrongly defined as "A device constructed or modified with the purpose of making music.", which actually is the definition of "musical instrument", which is a sub-class of instrument. The proper definition of "instrument" should be something like "a device that requires skill for proper use". I ran into this problem when I wanted to add "measuring instrument" to a class. Dh 05:31, 27 October 2009 (UTC)

At least in French, and I believe it's the same in English, "instrument" can be used as a synonym for "instrument de musique" (see Expression:instrument de musique). As you say, it also has another more general meaning which is closer to the concept of "tool" (see DefinedMeaning:instrument (337210)).
The problem here is that when we have a concept with two words, here "instrument" and "instrument de musique", and we tell the software that this concept is a class, it is not possible at the moment to say which expression is preferred when displaying the class combobox ("instrument de musique" should be written in the list of classes instead of just "instrument"). --Kipcool 08:28, 27 October 2009 (UTC)
Yes, in German instrument is also used as a synonym for "music instrument", or better to say, it is used as a short form.
The software limitation is unfortunate. But isn't there a tick-box to indicate that a word is or is not an exact translation, and wouldn't that be applicable here? I mean, "instrument" is not really an exact translation/term for the concept "music instrument". Or rather, it is a colloquial term for that concept. Dh 11:59, 27 October 2009 (UTC)
Yes, there is a checkbox for exact translations, but no, it is not applicable here. It is an exact translation, i.e. one of the many meanings of "instrument" which appears as well in standard dictionaries (see for example: Wiktionary, def 2). --Kipcool 18:29, 28 October 2009 (UTC)
I disagree that instrument is an exact translation/word for the definition of the class that describes a music instrument. It's a colloquial term, but not an exact one. And, with all due respect, I don't care what standard dictionaries say, especially wiktionary. If they say that instrument is an exact word/translation for the concept "music instrument", they are wrong. A music instrument is just one kind among other kinds of instruments, and it is only unambiguous if used in a sentence like: "I play an instrument". If I'd say: "I have an instrument" no one can possibly know what I mean, though it might be most likely that I mean a music instrument.
Besides, as I see it, one can't compare omegawiki to other dictionaries, because where other dictionaries circle around words to which they assign meanings, omegawiki centers on meanings to which it assigns words. Or am I mistaken? If I am not, then "instrument" shouldn't be assigned as an exact translation for the DM that describes "musical instrument", it's only a colloquial one (because it's the only kind of instrument that many people have/use and shorter). Dh 06:03, 29 October 2009 (UTC)
Well, at some point we have to care about what standard dictionaries do. They have been around for several centuries and developed by professional linguists. Wiktionary was of course just an example. Another dictionary I know in German is dwds, and its definition of Instrument also indicates musical instrument as a meaning (see second definition). In French, the same goes with the well-known dictionary Tlfi. In the three dictionaries I have looked at, there is no indication whatsoever of it being a colloquial use of the term.
Nevertheless, the opinions and arguments of other contributors would be welcome at this point, because I guess the same problem will occur for other words, so we have to establish some guidelines (or case law ;-) ). --Kipcool 07:20, 29 October 2009 (UTC)
True, they've been around for long and were developed by professionals, and their input is definitely welcomed by me, but what omegawiki does is something new which wasn't possible before and it will inevitably change our understanding of (our native) language.
And I am not saying that "instrument" is not a word for "musical instrument", it just isn't an exact one and shouldn't turn up as the exact translation for the defined meaning of "Musical Instrument" (the dictionary I've got at home also gives "Musikinstrument" ("musical instrument") as second meaning of "Instrument", but indicates that it is a short form of it). And this would be accomplished by simply unticking the exact translation box. Only "musical instrument" is the exact translation of this DM, because only when I read "musical instrument" (without any further info) do I exactly know what is meant. This is not the case for "instrument" (even if one assumes to know what is meant and is likely to be right, actually it is not possible).
But you are right, the same problem will occur for other words and other contributors should join the argument to be heard. Dh 07:52, 29 October 2009 (UTC)
One last thing from me: If you enter Instrument in the search box you will see that "Instrument" shows up with an exact meaning (general device) and an approximate (musical instrument). This is because I unticked the exact-translation-box for "Instrument" under the DM of musical instrument and this is how it should be IMO. Dh 08:16, 29 October 2009 (UTC)

This is about meaning. Musikinstrument has only one meaning: a device to play music. Instrument has several meanings: a device to play music, a device to measure something, a device to treat or diagnose diseases etc. So if you uncheck the box what you're saying is that Instrument does not mean "device to play music" and that would be wrong, because it does. It just means other things as well. What we should probably do is note somewhere that here "Instrument" is a synecdoche (a general class of things is used to refer to a smaller, more specific class). Of course we can't include all synecdoches as synonyms, only the most commonly used ones. A good indication would be if a traditional dictionary includes the synecdoche. The exact translation box is meant for different cases. For example language A has a word, but there is no exact translation in language B, only a word that comes close. I'm not really a fan of this whole checkbox because it's confusing and I think it's especially useless to add inexact translations or synonyms when there are already exact ones. --Tosca 13:16, 29 October 2009 (UTC)

All right, that makes sense. I'll check the box again, though that doesn't solve the original problem. I guess what is needed as well is a way to indicate which translation should be the "head" or "title" of the DM (and thus of the class, if the DM is one). In the case of the DM at hand it should be "music instrument" in English and "Musikinstrument" in German. Right now it appears that simply the first word (alphanumeric-wise) is automatically chosen. --Dh 17:51, 29 October 2009 (UTC)
Yes, that is in the list of much-needed features. You seem to be a programmer, so if you have an idea of how we could implement that in the interface (a new check box column, or radio button or I don't know what), your opinion is welcome. Regarding the implementation in the database, it does not sound too complex (a new table with "DM number", "language", "number of the preferred expression"). Another question would be: do we need several preferred expressions for a DM, or only one? --Kipcool 18:08, 29 October 2009 (UTC)
Well, I have never written anything in PHP, but generally all that is needed is to copy one of the existing functions and just change the variables accordingly. In the case at hand I would say the "exact translation" code would be right, though it must be made exclusive (the word doesn't come to mind now, but there is a special kind of tick-box in HTML) since only one word of a given language can be the head word (this seems to be the answer to your question). Regarding the database, I've already downloaded a dump and just need to clean up my hard-drive a bit to import it. Do you know the exact size of the current db? The last time I tried, I ran out of space... I'll see, since I have several things in mind I probably have to get familiar with php and the omegawiki code anyways... --Dh 18:48, 29 October 2009 (UTC)
Another, related improvement would be to add a hierarchy of meanings for a given expression, so that the most common would come first and the least common last. --Dh 19:37, 29 October 2009 (UTC)
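
The "preferred expression" table suggested above (DM number, language, number of the preferred expression) could look roughly like this. It is a sketch in Python with an in-memory SQLite database, not OmegaWiki's schema: the table names `expression` and `preferred_expression`, their columns, the sample rows and the `head_word` helper are all hypothetical. The primary key on (DM, language) encodes Dh's answer that only one word per language can be the head word, and the fallback to the alphabetically first spelling mirrors the behaviour Dh observed.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
-- Hypothetical tables; names and columns are illustrative only.
CREATE TABLE expression (
    expression_id INTEGER PRIMARY KEY, dm_id INTEGER,
    language TEXT, spelling TEXT);
CREATE TABLE preferred_expression (
    dm_id INTEGER, language TEXT, expression_id INTEGER,
    PRIMARY KEY (dm_id, language)  -- at most one head word per DM and language
);
INSERT INTO expression VALUES
    (1, 100, 'en', 'instrument'), (2, 100, 'en', 'musical instrument'),
    (3, 100, 'de', 'Instrument'), (4, 100, 'de', 'Musikinstrument');
INSERT INTO preferred_expression VALUES (100, 'en', 2);
""")


def head_word(dm_id, language):
    """Return the preferred spelling, else the alphabetically first one."""
    row = conn.execute(
        """SELECT e.spelling
           FROM expression AS e
           LEFT JOIN preferred_expression AS p
             ON p.dm_id = e.dm_id
            AND p.language = e.language
            AND p.expression_id = e.expression_id
           WHERE e.dm_id = ? AND e.language = ?
           ORDER BY (p.expression_id IS NULL), e.spelling
           LIMIT 1""",
        (dm_id, language),
    ).fetchone()
    return row[0] if row else None


print(head_word(100, "en"))  # a preference is stored for English
print(head_word(100, "de"))  # no preference stored: alphabetical fallback
```

In the interface this would correspond to a radio button per language rather than a check box, since selecting one head word deselects the others.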

Random Expressions

Does "Random Expression" link intentionally to strange expressions without definitions, like Expression:CEFUROXIME_SODIUM_1.5_GM_/_CEFUROXIME_SODIUM_1.5_MG_INTRAVENOUS_INJECTION,_POWDER,_FOR_SOLUTION, Expression:CLFB-STAAW, Expression:Malignant_neoplasm_of_supraglottis, Expression:Rothia_amarae_Fan_et_al._2002, Expression:System,_MNSs_Blood-Group, Expression:Nwi_2809, Expression:YLL2-EBVA8, Expression:Skimmia_anquetilia etc. (these examples all turned up in a row)? --Dh 12:19, 29 October 2009 (UTC)

There are definitions if you click on "UMLS" on the right side of the screen. UMLS was imported a while ago. There's something wrong with the random script though, it should link to non-UMLS stuff as well. --Tosca 13:19, 29 October 2009 (UTC)
It does, but there is an overwhelming number of UMLS entries. --Kipcool 17:07, 29 October 2009 (UTC)
Well, I guess it doesn't make a very good impression on visitors. Is there a way to exclude expressions without definitions? I figure it should be quite easy to adjust the mysql query. Dh 17:30, 29 October 2009 (UTC)
It has included these UMLS expressions only since the recent mediawiki upgrade. At the moment, I don't know how the index for random expressions is generated, so I have no idea of the solution... --Kipcool 18:02, 29 October 2009 (UTC)

unitizing usage annotation

IMO it would be better to unitize usage annotation options and have a drop down menu instead of a text box. Some of the needed annotations that come to mind are:

  • colloquial
  • children's speech
  • technical
  • medical
  • vulgar
  • linguistic
  • logic

Or do I misunderstand the use of the usage box? --Dh 18:23, 29 October 2009 (UTC)

I have never used it, I don't know what it's for... Anybody?
In any case, I agree with the idea of a drop down menu. --Kipcool 20:29, 29 October 2009 (UTC)
I assumed it is for that as I needed to explain the usage for [Expression:Piepmatz] and there isn't any other option to do so. Anyway, we do need something like it, and I wasn't satisfied with the free text solution. If you go to the example you probably will understand why not. In French you probably would use a French word to express "kindersprachlich", in English it would be another word again, and even a German would maybe use yet another one, etc. --Dh 23:27, 29 October 2009 (UTC)
A drop-down menu would be great. But I think we should also keep the box for free text. Sometimes a little explanation is necessary, though I can't think of an example right now :) --Tosca 21:13, 29 October 2009 (UTC)
Then the explanation should be a separate field, like "additional information" or whatever. The idea behind having a fixed list of words is to make it more computer friendly, i.e. possible for a computer to easily read, categorize etc. the entries; free text would disturb that. Another reason (though actually it is the same, just viewed from a different angle) is that we would otherwise end up with several different words, spelling(mistake)s etc. for the same meaning. And being more computer friendly also means that it will be more human friendly. At least in this case :) --Dh 23:27, 29 October 2009 (UTC)
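
The difference between a fixed list and free text can be sketched like this. It is an illustration in Python, not OmegaWiki code: the `Usage` enumeration reuses the terms proposed in the list above, while `UsageAnnotation`, its `note` field and `parse_usage` are hypothetical names for the "drop-down plus separate explanation field" idea.

```python
from dataclasses import dataclass
from enum import Enum


class Usage(Enum):
    """Closed vocabulary, as a drop-down menu would offer it."""
    COLLOQUIAL = "colloquial"
    CHILDRENS_SPEECH = "children's speech"
    TECHNICAL = "technical"
    MEDICAL = "medical"
    VULGAR = "vulgar"
    LINGUISTIC = "linguistic"
    LOGIC = "logic"


@dataclass
class UsageAnnotation:
    usage: Usage    # machine-readable term from the fixed list
    note: str = ""  # optional free-text explanation, kept separate


def parse_usage(label: str) -> Usage:
    """Map a label to a term; labels outside the list raise ValueError."""
    return Usage(label.strip().lower())


annotation = UsageAnnotation(parse_usage("Children's speech"), note="e.g. Piepmatz")
print(annotation.usage.value)
```

A free-text value such as "kindersprachlich" is rejected by `parse_usage`, which is exactly the variation in wording and spelling that a fixed list is meant to prevent.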

If you think this is a good idea, then let's start with refining and adding needed usage terms to the above list. dh 08:39, 30 October 2009 (UTC)

It's a great idea, but we need someone to program it... Unfortunately, I can't program. --Tosca 15:44, 31 October 2009 (UTC)
Note: an extensive list of usage terms can be found in Wiktionary [6]. It is not needed to redo the list from scratch. --Kipcool 09:53, 3 November 2009 (UTC)
That's great. Thanks! --dh 23:22, 3 November 2009 (UTC)

Hmm, I've just realized that this feature was quite easy to implement: All I had to do was to add a class attribute to DefinedMeaning:lexical_item_(402295) and we now can annotate translations with usage categories. For now the list is rather short, check it out (you'll find it under annotations/option values/usage) and tell me what you think. Also, it would be good to know if and how it is represented in the database itself, something I can't find out myself right now since I still didn't manage to import the damn database... --dh 21:10, 22 November 2009 (UTC)

Apparently it is also possible to add language-specific usage context terms that only show up for the specified language(s), so that it is possible to denote a region where a certain translation is in use (for example, in Germany there are words that are only known/used in the south, while others are only known/used in the north etc.). A problem I see right now is that it seems necessary to create extra DMs for the usage terms as they (at least sometimes) need further explanation, though personally I'd like to avoid that. --dh 21:53, 22 November 2009 (UTC)

Usage terms as annotations seem good, thanks. Just a question: in Expression:lexical item, I see that you have created usage term twice: as plain text and as a combobox. I'd say that the plain text one should be removed, right?
In the database, all annotations are a mess... You can have a look at User:Kipcool/From DefinedMeaning to attributes to have an idea. --Kipcool 09:09, 23 November 2009 (UTC)
Of course each usage term needs a DM... I have the feeling I have not understood your concern here. Could you give an example?
You can have a try with the regionalisms. If it ends up not working, I can always remove it from the database. --Kipcool 09:09, 23 November 2009 (UTC)
No, I've only created one new usage field, the other (plain text) one was already there and I left it because Tosca said something about it being needed or useful. But the name probably should be changed to something like "usage note".
Why are all annotations a mess in the database? I only took a quick look, but it looks like a normal normalized database. Of course, it is a bit messy for humans to read, but that is not a problem since we are not supposed to (with the exception of developing or debugging it).
You have the answer: because it is a bit messy for humans to read, for example when debugging ;-) Before I wrote the above page, I had to discover by myself the relations between the different tables. (Kip)
Sure, each usage term needs a DM. What I meant was an extra DM for the use as an OW term. Take a look at the English DMs for Expression:usage and you'll see what I mean. The fourth DM has just been created for and only makes sense as an "internal" OmegaWiki DM. Or look at DefinedMeaning:Wikipedia article (740663). It does not give a general definition of what a Wikipedia article is, but defines it in relation to OW. To be honest, I do not like that very much. These DMs shouldn't show up with the other, 'normal' DMs and in my opinion they shouldn't exist at all and instead 'normal' DMs should be used, as is now the case with the usage context options I've created. Imagine one wants to use the OW data as a vocabulary trainer or whatever. The data is 'contaminated' by such DMs IMO. The problem I see is that while using 'normal' DMs works nicely for most of the terms, it seems necessary to explicitly explain, for example, the differences between "dated", "ancient" and "obsolete" when used as a usage description. One way to solve this could be to simply have a help page explaining it. What do you think?
And since we are at it, is there something that speaks against assigning the POS to the DefinedMeanings instead to the SynTrans as it is now? It could be changed easily, though I have no clue what happens to the already existing POS tags then.
--dh 10:05, 23 November 2009 (UTC)
It is not so bad as long as the DMs specific for OW are in the class "Community class attributes".
What do you mean by that? Being in the class "Community class attributes" doesn't distinguish them from regular DMs since those can also be assigned to it, and in fact have to be in order to be usable as a class, if I understand correctly. --dh 19:41, 23 November 2009 (UTC)
Oops, yes you're right... I was mixing with something else which exists maybe only in my mind. I need some rest ;-) --Kipcool 20:37, 23 November 2009 (UTC)
But it was not very far away from a solution: We could create a class for DMs that are merely for internal use. --dh 22:22, 23 November 2009 (UTC)
POS is language dependent. For example, in Chinese there is the POS "classifier" which does not have an equivalent in German, French or English grammar. Or sometimes, an adverb in English is translated as an adjective in French. I guess there are some more subtleties, but I am not a linguist, and I don't know many non-European languages. Though I agree that in most cases, they are the same POS for all languages. --Kipcool 17:53, 23 November 2009 (UTC)
Yeah, I was afraid that it's that complicated... --dh 19:41, 23 November 2009 (UTC)
Now I am curious, do you have an example of an English adverb that translates to a French adjective? How is that possible? I mean, an adverb qualifies a verb and an adjective a noun...
Maybe it would be possible if we implemented a way to override the DM level annotation with a SynTrans level one in cases where it is needed? I am not a linguist either, but I somehow doubt that a word like, for example, the English noun 'chair' is a verb or something in another language. Things are things, acts are acts etc., no matter in which language. No? --dh 20:23, 23 November 2009 (UTC)
I found it again, it was this word DefinedMeaning:away (442312). --Kipcool 20:30, 23 November 2009 (UTC)
Ok, thanks. But ... why is it an adjective? Isn't 'sont' an inflected form of 'est', which would be a verb modified by 'distants' and the latter therefore an adverb? Just like in English 'away' is an adverb because it modifies 'is'? Though it is tricky, because in the sentence "Santa Claus is away", 'away' is an adjective, but in the sentence 'Christmas is two weeks away', it's an adverb. I don't know how that is in French though... --dh 22:12, 23 November 2009 (UTC)

Bulk import?[edit]

Since there are several free (in the GNU sense of free) datasets available on the net and one of the aims of omegawiki is to "provide information on all words of all languages", I wonder how people think about a mass import of some of this data. Some datasets that come to mind are:

  • geonames: It offers the names of countries, cities and other locations in several languages. It would be possible to import the names and automatically add descriptions and translations in several languages, as well as links to Wikipedia, geonames, CIA World Factbook(?) etc. Things like population, geocoordinates, spoken languages etc. could be added as well, but since at least populations change, this kind of info might be better given through a link to geonames. This could look like this (the italic part of the sentences would be standard sentences that have to be provided for the different languages beforehand):
  • New York
  • de: Eine Stadt in den USA.
  • en: A city in the USA.
  • etc.
  • Berlin
  • de: Die Hauptstadt von Deutschland.
  • en: The capital of Germany.
  • Germany
  • de: Ein Land in Europa.
  • en: A country in Europe.
  • etc.
  • CMUdict: This is a free American English pronunciation dictionary with 133791 words. Though the pronunciations are given in the Arpabet notation, they can easily be converted to IPA.
  • BOMP: This is a German pronunciation dictionary. Though it is not free per se, they've already allowed other dictionary projects to use their data and license the end product under the GPL. So maybe if we'd ask nicely... :)
  • Census data: They offer all family and given names used in the USA. But since they are without any descriptions etc., they might be of no interest for omegawiki.
  • Musicbrainz: I guess we do not want to import all names of musicians and bands (though they are words as well...) and therefore this data might be of no interest either.
  • Wiktionaries: The only safely parseable and reusable data from the several wiktionaries are probably the basic grammatical properties, i.e. conjugation of verbs, case of nouns etc. This makes Wiktionary unfortunately pretty useless and a waste of time. Though with some effort one might be able to safely extract IPA pronunciation and hyphenation.
  • Wordnet: The wordnet data could be imported similar to the UMLS and Swiss-Prot datasets.
  • There are probably more datasets that could be of use for us. If you know one, add it. --dh 08:35, 30 October 2009 (UTC)
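To illustrate the Arpabet-to-IPA conversion mentioned for CMUdict above, here is a minimal Python sketch. It is only an illustration under assumptions: the mapping table is a commonly used correspondence and not an official CMU table, the function name `arpabet_to_ipa` is made up for this example, and stress marks are placed directly before the vowel rather than before the syllable, which is a simplification.

```python
# Commonly used Arpabet -> IPA correspondence (an assumption, not an official table).
ARPABET_TO_IPA = {
    "AA": "ɑ", "AE": "æ", "AH": "ʌ", "AO": "ɔ", "AW": "aʊ", "AY": "aɪ",
    "B": "b", "CH": "tʃ", "D": "d", "DH": "ð", "EH": "ɛ", "ER": "ɝ",
    "EY": "eɪ", "F": "f", "G": "ɡ", "HH": "h", "IH": "ɪ", "IY": "i",
    "JH": "dʒ", "K": "k", "L": "l", "M": "m", "N": "n", "NG": "ŋ",
    "OW": "oʊ", "OY": "ɔɪ", "P": "p", "R": "ɹ", "S": "s", "SH": "ʃ",
    "T": "t", "TH": "θ", "UH": "ʊ", "UW": "u", "V": "v", "W": "w",
    "Y": "j", "Z": "z", "ZH": "ʒ",
}

# CMUdict marks vowels with 0 (no stress), 1 (primary) or 2 (secondary).
STRESS_MARKS = {"1": "ˈ", "2": "ˌ"}


def arpabet_to_ipa(phonemes: str) -> str:
    """Convert a space-separated Arpabet transcription to an IPA string."""
    result = []
    for symbol in phonemes.split():
        stress = ""
        if symbol[-1].isdigit():
            symbol, digit = symbol[:-1], symbol[-1]
            if symbol == "AH" and digit == "0":
                # Unstressed AH is conventionally rendered as schwa.
                result.append("ə")
                continue
            stress = STRESS_MARKS.get(digit, "")
        result.append(stress + ARPABET_TO_IPA[symbol])
    return "".join(result)


print(arpabet_to_ipa("HH AH0 L OW1"))
```

A real import would still need a decision on where to place stress marks (syllable boundaries are not part of the CMUdict data) and on which IPA variant to use for American English vowels.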
On my wishlist I wrote that I'd like to have some kind of semiautomatic import program. So we could, for example, rapidly import translations from Wikipedia. Semiautomatic, because we should still be able to check translations and remove those that seem doubtful. We already imported the GEMET database and while it's a great resource it also contains many errors and needs a lot of work. Look at this for example, or this. It's been a while since the import and we still couldn't fix all problems because it's very time and work-intensive. So I'm hesitant to do more bulk imports. Also, I think a small, but well-written dictionary is more attractive than a large database full of errors and bad formatting. --Tosca 15:53, 31 October 2009 (UTC)
From the examples you gave, it seems that the problem with GEMET was in the original data, and that could easily be fixed by just dropping the whole table(s) with GEMET and reimporting them without any fields/rows containing "(" or ";" or whatever it is you don't want. Or, if these "anomalies" follow a consistent scheme, one could convert the data and, let's say, put everything in parentheses into a separate field or just delete the parentheses (that's, by the way, why it is crucial to use as few free-text boxes as possible in omegawiki). As far as I can tell, the geonames database, for example, is quite consistent. And the BOMP and CMU data are even more so, as these databases are specifically meant to be computer readable (they are used in speech synthesis and recognition). To get an impression of the quality of the automatic mapping of pronunciation data, try this link. The pronunciation part comes from the BOMP dictionary and was automatically added to the translation dictionary.
The problem with Wiktionary is that the data is not in a relational database (well, the pages themselves are, but not the data fields like IPA, translations, hyphenation etc.; besides, the way the data is/can be written is highly inconsistent), so that it is impossible to import the data, even semi-automatically. Maybe it would be possible, but one would need some sort of advanced AI program to analyze the pages. That's why I am strongly opposed to Wiktionary (and even Wikipedia) and consider it a waste of time and obsolete. It is nearly criminal to continue to hype it the way it is done, because people are wasting their energies by entering data that can't be parsed or reused properly. As I said, the only things that would be possible to import from Wiktionary are some grammatical properties, as they are written in box templates which can be easily parsed (though even they are inconsistent: for example, in the German wiktionary the grammatical gender is often indicated by prefixing the article, but sometimes it is not. If I imported this data, I'd simply drop all entries without a given article and convert the given ones to the proper values in the database). One could also import translations with some effort, but the problem here is that the translations are linked from meanings to words, while omegawiki links meanings. An additional problem here is that while in the German wiktionary the translations are numbered and the number could be used to link them to the meaning, the English wiktionary doesn't number the translations but only gives a short description of the meaning they translate. Compare this English entry with this German entry for example.
But in general I agree with you: quality is better than quantity. I just think that it is possible to have both by importing some of the data I mentioned. It is at least worth trying. If it doesn't work out, the tables can simply be dropped again... --dh 09:52, 1 November 2009 (UTC)
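The clean-up idea described above (skip rows containing ";" and move anything in parentheses into a separate field) could look roughly like the following Python sketch. The function names and sample rows are invented for illustration, and nothing here reflects the actual GEMET import code or table layout.

```python
import re

# Matches one parenthesized group together with any whitespace before it.
PAREN = re.compile(r"\s*\(([^()]*)\)")


def split_qualifier(expression: str):
    """Return (cleaned expression, list of parenthesized notes)."""
    notes = PAREN.findall(expression)
    cleaned = PAREN.sub("", expression).strip()
    return cleaned, notes


def clean_rows(rows):
    """Skip rows containing ';' and split parenthesized notes off the rest."""
    for row in rows:
        if ";" in row:
            continue  # looks like several expressions in one field; leave for manual review
        yield split_qualifier(row)


# Hypothetical sample rows, not taken from the real data.
for cleaned, notes in clean_rows(["soil; ground", "noise (acoustics)", "water"]):
    print(cleaned, notes)
```

Whether a parenthesized note should become an annotation, a usage note, or simply be dropped would still have to be decided per dataset.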
Bulk import is not possible for most datasets, for several reasons that I see:
  1. it would have to be merged with the existing data = too complicated + manual check needed anyway
  2. often definitions are not good or circular. For example definitions that are lists of synonyms, or a word1 defined by a word2 defined by a word1.
  3. translations (in Wiktionary for example, or in LEO) are not sorted by meanings.
I think the main problem is that nothing like OW has been done before, in particular in avoiding circular definitions (by having the same def for all the synonyms), which makes import a mess. Even Gemet, as said above, still needs a lot of manual checking.
The best I can think of would be WordNet, which is not free, so not possible to import.
Thus, while a bulk import is very tempting (at the beginning, I promised to import all of the Wiktionaries...), I believe that all our data (at least definitions and translations) have to be entered manually... Considering one entry per person and 6 billion people on the planet, this is not so much :p --Kipcool 09:46, 3 November 2009 (UTC)
Why do I have the impression we all are talking about different things?
  1. True, it would have to be merged. But I can't see why this should be too complicated and why manual checks are needed (apart from basic checks to ensure that it is imported properly).
  2. The CMU and BOMP dictionaries are not about definitions at all. And the geonames database is quite straightforward, that is, they do not have any circular definitions.
  3. Yes, that's why I said that Wiktionary is not a useful database (if one can call it a database at all). The only things that could be used from Wiktionary are... but I already wrote that. Twice actually. Please read again what I've written above since I really got the impression I am not understood.
I agree that nothing like OW has been done before, but I can't see what this has to do with what I've proposed. The datasets I suggested mostly aren't ordinary dictionaries with definitions etc., so circular definitions aren't an issue.
Regarding GEMET: From the examples given by Tosca it seems that the data was a mess in the first place, but none of the datasets I suggested are like that at all.
Regarding Wiktionaries: Of course you weren't able to import them, nobody could since, I've already said it but apparently it can't be said too often: Wiktionary (and also Wikipedia) is a mess and a waste of time since the data is not stored in a sane manner and thus can't be parsed and reused properly. It is a crime to hype it further and make people contribute anymore. It should have been abandoned a long time ago.
Regarding Wordnet: What makes you think that Wordnet is not free? You can read their license that says: "Permission to use, copy, modify and distribute this software and database and its documentation for any purpose and without fee or royalty is hereby granted" at this site. It is used and converted by several projects out there.
--dh 22:34, 3 November 2009 (UTC)
WordNet: but then it continues with "provided that..." (but I don't really understand it). I said it is not free because it is not licensed under GFDL or CC-BY, but maybe I am wrong?
The other databases: Ok, I was a bit too fast ;-) If they are not definitions it should be ok.
Note on geonames: I think we already have all the names of countries and capitals in OW (countries, capitals). I don't know if it makes sense to have the names of all cities of the world in here as well. In any case, it would be nice if we first implement a way to indicate the geo-coordinates of a city in OW. --Kipcool 08:13, 4 November 2009 (UTC)

Maybe I should put the whole issue in another way: As soon as I've solved my disk issues and imported the omegawiki db, I'll try to import English (CMUdict) and German (BOMP, if they allow us to use the data) pronunciations and geonames. If I am able to do so successfully and the data checks out all right, would there be interest to do the same with the main database? --dh 22:48, 3 November 2009 (UTC)

Yes, I'd be interested to know if this works. :-) --Tosca 23:05, 3 November 2009 (UTC)
Ok. --Kipcool 08:13, 4 November 2009 (UTC)
Geonames: Sure, all cities would be too much, but I thought about cities above a certain population. And the ones that are already in the db could just be skipped.
Wordnet: Well, though the GNU licenses are the mothers of all free licenses, there are several licenses by now that are considered free, and it's not unusual for big organizations to have their custom-made license, like Apache, MySQL, BSD etc. Though IANAL, so I can't really assure you that it is free, but the part after the "provided that" just seems to be to make sure that if you use the data, Wordnet is credited, the data remains free, and that the name of Princeton University is not used to advertise your product. Wordnet could be interesting to "fill in" all English words (and their definitions) that are not yet in the database. Since we'd have to write translations for each of them anyway, any "circular definition" or other problem could be corrected then. But I don't know how others feel about that.
--dh 18:22, 4 November 2009 (UTC)
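The geonames selection described above (only cities above a certain population, skipping those already in the database) can be sketched in Python as follows. This is only a sketch under assumptions: the records are simplified dictionaries rather than the real geonames dump format, the threshold of 100,000 is an arbitrary example value, and `select_cities` is a hypothetical helper name.

```python
def select_cities(records, existing_names, min_population=100_000):
    """Keep records above the population threshold that are not yet in the database.

    records: iterable of dicts with 'name' and 'population' keys (simplified, assumed layout).
    existing_names: names already present in the target database.
    """
    existing = {name.casefold() for name in existing_names}
    selected = []
    for record in records:
        if record["population"] < min_population:
            continue  # too small to import
        if record["name"].casefold() in existing:
            continue  # already in the database, skip
        selected.append(record)
    return selected


# Hypothetical sample data for illustration only.
sample = [
    {"name": "Berlin", "population": 3_400_000},
    {"name": "Potsdam", "population": 150_000},
    {"name": "Kleinstadt", "population": 5_000},
]
print(select_cities(sample, existing_names=["berlin"]))
```

Matching by name alone is naive, since several places can share a name; a real import would more likely match on a stable identifier.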
Update: I've looked a bit closer into the BOMP pronunciation dictionary and apparently it does not include primary and secondary stresses, so it is questionable whether it would be a good idea to use it. Though adding stresses is definitely less work than adding the whole pronunciation transcription. And in case we'd want to use the data, it already is available under the GPL via the freedict dictionary that I've linked to earlier.
(I still did not succeed with importing the database since even 2.2 GB free space weren't enough. I am not sure if I can free any more and probably have to come up with another solution, like moving the mysql data to another partition)
--dh 21:20, 9 November 2009 (UTC)

Relations again[edit]

I'm reposting a question I asked here already back in June. Namely, I can't understand why the Relations feature (including the hypernym/hyponym relations between DefinedMeanings) has been deprecated, or how the use of Classes can substitute for this. As I wrote back then, recording semantic relations between DefinedMeanings (especially "broader/narrower" and "close meaning" relations) would open up OmegaWiki to a variety of compelling applications. In addition, it would ameliorate the overall inflexibility of OmegaWiki's DefinedMeaning/SynTrans system compared to traditional dictionaries. --Studyclerk 00:59, 3 November 2009 (UTC)

I might be wrong, but I agree with you that classes are not meant to replace the hypernym/hyponym relations. Hypernym/hyponym relations are still to be implemented. The "broader/narrower" ones that we see are only the ones that have been automatically imported with Gemet, but I don't think it has ever been possible to add them manually.
If you have programming skills with php+java+sql+freetime, you can implement it, the code is opensource ;-). --Kipcool 09:37, 3 November 2009 (UTC)
The problem with hyponyms and hyperonyms is that they are true in one language and not necessarily in another. So what is needed is a function where you relate these within one language. As we cannot do that we cannot support these structures. The notion that we should because dictionaries have them is nice and we would if we could. More relevant to me is that we can support conjugations and inflections. They are also language dependent BUT they have substantial benefit because you can link the structures from within one language to the structures from another and this helps with translation. Thanks, GerardM 16:11, 3 November 2009 (UTC)
Could you clarify this claim (that the hyperonym/hyponym relation is true in one language and not necessarily in another)? I know about some hard-to-model cases, such as the Italian word topo which may mean either mouse or rat, but this would cause problems in the OmegaWiki structure as-is. If anything, conjugations and inflections seem more difficult since grammar differs so much across language families. --Studyclerk 18:19, 3 November 2009 (UTC)
Maybe I don't get it, but I don't see the problem with hyponyms and hyperonyms. All our expressions are linked to meanings. So topo=rat and topo=mouse would be different DMs and have different hyponyms and hyperonyms. --Tosca 23:10, 3 November 2009 (UTC)
Some languages do not have the word for the other relation.. and consequently it falls foul. We do indicate that a translation is not exact, but it does not have implications in the software and consequently it just does not work. GerardM 10:50, 4 November 2009 (UTC)

geo coordinates[edit]

Hoi, when I know what the format and the name for the geo coordinates should be, we can have it. Thanks, GerardM 10:48, 4 November 2009 (UTC)

Well, there are not that many options. You have latitude, longitude (and if you want even altitude) and the values are either given in degrees or decimal/float (with decimal being the better option IMO). But personally I can't quite see the point of having geo coordinates and think that a link to the geonames entity would be more useful. --dh 18:33, 4 November 2009 (UTC)
I have to correct myself: The geo coordinate values are always given in degrees, but there are two different notation formats: degrees, minutes and seconds (e.g. 51° 40' 32'') and decimal (e.g. 51.53231). --dh 23:20, 8 November 2009 (UTC)
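For reference, the two notations can be converted into each other with simple arithmetic: one degree is 60 minutes and one minute is 60 seconds. The following Python sketch shows both directions; the function names are made up for this example, and the hemisphere handling (negative values for south and west) is a common convention assumed here, not something decided in this discussion.

```python
def dms_to_decimal(degrees, minutes, seconds, hemisphere="N"):
    """Convert degrees/minutes/seconds to decimal degrees.

    South and west hemispheres yield negative values (assumed convention).
    """
    value = abs(degrees) + minutes / 60 + seconds / 3600
    return -value if hemisphere in ("S", "W") else value


def decimal_to_dms(value):
    """Convert decimal degrees to (is_negative, degrees, minutes, seconds)."""
    is_negative = value < 0
    value = abs(value)
    degrees = int(value)
    minutes_full = (value - degrees) * 60
    minutes = int(minutes_full)
    seconds = (minutes_full - minutes) * 60
    return is_negative, degrees, minutes, seconds


# 51 degrees, 40 minutes, 32 seconds north, rounded to six decimal places.
print(round(dms_to_decimal(51, 40, 32), 6))
```

Storing the decimal form and deriving the other notation for display avoids having to keep two fields in sync.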

ISO 639-6 entity[edit]

While translating entries to Spanish, I have found a few defined meanings whose definitions are "ISO 639-6 entity" (for instance, DefinedMeaning:Mozarabic_(604617)). Is there a reason for that, or are those defined meanings just lacking accurate descriptions? Saludos/Greetings --Galeote 16:23, 4 November 2009 (UTC)

The names of languages + iso codes were imported by a bot (from World Language Documentation Centre if I remember correctly). Since they had no definition, "ISO 639-6 entity" was put in the definition field. Feel free to replace it by a real definition if you can/like, that's what I do myself. --Kipcool 21:49, 4 November 2009 (UTC)

Basic shortcomings[edit]

IMO the project has issues with definitions being inaccurate or at least too unspecific. For example, DefinedMeaning:borrar (471047) has the English definition: "To remove markings or information.", and apart from "or" being problematic in definitions, when I want to translate the definition and check the translations, I realize that I have to know what exactly is meant by "markings". Wordnet for example gives four different meanings and depending on which of them is meant, I would have to add or remove certain German translations, and the translation of the definition depends on that as well. Or another randomly picked example would be DefinedMeaning:delicacy (1040439), which is defined as "The state or quality of being delicate." But which of the several meanings (Wordnet indicates seven) of delicate is meant here?

The only solution I can see for this is to have every(!) word of a definition defined properly, that is, it is not enough to write "marking", but one would also have to indicate which sense of marking is meant, and for this the definition already has to be in the database. Now, I understand that this would require changing the software and starting all over again, and most people do not want this or even see the need. But I just thought I'd share my frustration and that I am more and more convinced that if we continue like this, we'll end up with a lot of definitions and translations that are simply wrong or at best inaccurate. --dh 23:24, 8 November 2009 (UTC)

Yes, linking words of a definition to their corresponding DM is in the todo list... --Kipcool 12:33, 9 November 2009 (UTC)
Sometimes definitions are badly written (by people who don't understand the project or who didn't research the definition enough). If you find bad definitions, you can either improve them or delete them (or put something on the discussion page), that's what I do. This is normal I think, you can't always get it right on first try. --Tosca 15:39, 9 November 2009 (UTC)
Re: Kipcool: Oh really? That's great to hear!
Re: Tosca: Sure, we are not perfect etc. and improving definitions is the way I handle this as well. It's just that there are a lot of unclear definitions, many of them already translated into several languages and having lots of translations, but the discussion pages are not touched at all. So it sometimes seems futile and something that needs general attention.
--dh 21:09, 9 November 2009 (UTC)
The eXtended WordNet project could be inspirational in this context. --dh 22:56, 9 November 2009 (UTC)

Some changes[edit]

While I am still desperately trying to enable adding several translations at once (it is really more complicated than I imagined, but it takes me through the whole code so I am not really wasting time), I have made some small changes:

  1. The title of the Expression pages are as suggested above, i.e. with a parameter: Definitions of '$1'. This allows more flexibility. Drawback, at the moment, the title does not contain the name of the page anymore. Consequently, the message MediaWiki:Ow_Multiple_meanings needs updating according to what I changed in the English version. Two possibilities:
    • you can wait until the people at translatewiki translate it (I don't know how long it takes)
    • or you can translate it locally (but in the end, it is better to use the version of translatewiki)
  2. The page Special:DataSearch now tells how many results there are (the message MediaWiki:Datasearch_showing_only needs to be updated for it to be visible. As above, you can translate it locally or on translatewiki.net)
  3. In Special:NeedsTranslation, I have added a bit of randomness. Each time you click on the button, new results will be shown. The way it is done: it no longer always shows the first 100 results of the SQL query as before, but takes 100 consecutive results starting at a random location. This way, you don't always see the same expressions.
  4. In the statistics page, I have removed a superfluous SQL query, so that the page loads faster (though still not that fast).
--Kipcool 21:44, 9 November 2009 (UTC)
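The random window described in point 3 (100 consecutive results at a random location) can be sketched as follows. This is a Python illustration of the idea only, not the PHP code actually used on the site; the function name `random_window` and the use of a LIMIT/OFFSET clause are assumptions.

```python
import random


def random_window(total_rows, window=100, rng=random):
    """Pick a random offset so that `window` consecutive rows fit in the result set.

    Returns (offset, limit), suitable for a query ending in
    'LIMIT <limit> OFFSET <offset>' (an assumed way to apply it).
    """
    max_offset = max(total_rows - window, 0)
    offset = rng.randint(0, max_offset)  # inclusive on both ends
    return offset, window


offset, limit = random_window(total_rows=25_000)
print(f"... LIMIT {limit} OFFSET {offset}")
```

Compared with ordering the whole result set randomly, picking only a random starting point keeps the query cheap, at the price that neighbouring rows always appear together.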
Wow they are fast at translatewiki : in one day, we have already the translations in 19 languages :-) --Kipcool 21:33, 10 November 2009 (UTC)
Merci beaucoup pour les améliorations ! --Tosca 21:42, 10 November 2009 (UTC)

One more change[edit]

Today, I have changed the sql query that retrieves the list of languages, for displaying in the combobox. The result is still the same (language in your user language, and, if not available, in English). However, I believe that the new query is faster. I was not really able to measure the real query time on my local server. The only measure I did said about 3 times faster, which seems a lot to me.

Anyway, if you notice an improvement, tell me so that I can modify all the comboboxes in the same way. If you notice nothing, or think it is slower, I can revert the change. If you notice a bug, tell me as well ;) --Kipcool 20:36, 11 November 2009 (UTC)
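The behaviour described above (language name in the user's language and, if not available, in English) can be expressed in a single query with two outer joins and COALESCE. The Python/SQLite sketch below only illustrates that idea: the table and column names are hypothetical, the sample data is invented, and this is not the query actually used by OmegaWiki.

```python
import sqlite3

# Hypothetical schema: language(language_id) and
# language_names(language_id, name_language_id, language_name).
QUERY = """
    SELECT l.language_id,
           COALESCE(u.language_name, e.language_name) AS display_name
    FROM language AS l
    LEFT JOIN language_names AS u
           ON u.language_id = l.language_id AND u.name_language_id = ?
    LEFT JOIN language_names AS e
           ON e.language_id = l.language_id AND e.name_language_id = ?
    ORDER BY display_name
"""


def language_names(conn, user_language_id, english_id):
    """Return (language_id, name) pairs, falling back to English names."""
    return conn.execute(QUERY, (user_language_id, english_id)).fetchall()


def demo_connection():
    """Build an in-memory database with invented sample data."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE language (language_id INTEGER PRIMARY KEY);
        CREATE TABLE language_names (
            language_id INTEGER, name_language_id INTEGER, language_name TEXT);
        -- 1 = German, 2 = English, 3 = French
        INSERT INTO language VALUES (1), (2), (3);
        INSERT INTO language_names VALUES
            (1, 2, 'German'), (2, 2, 'English'), (3, 2, 'French'),
            (1, 1, 'Deutsch'), (2, 1, 'Englisch');
    """)
    return conn


# French has no German name in the sample data, so the English name is used.
print(language_names(demo_connection(), user_language_id=1, english_id=2))
```

Whether such a query is faster than the alternatives depends on the indexes and the database engine, so it would need to be measured on the real server.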

Though it's hard to tell if there is an improvement since the speed depends on several factors, like connection speed etc., I believe it is a bit faster. However, thanks anyway for your efforts.
While I am not that familiar with JavaScript, I wonder why do the languages (and other values for that matter) need an extra query anyway? Every extra query is rather expensive (network and sql server wise) and it should be possible and would be enough to load the languages (and other values) into an array once when the edit page is loaded. Not only would that reduce the server and network load, but also eliminate any delay when editing and therefore ease the whole process.
And in case browsers do deal with JavaScript in a similar way as with CSS, it should be possible to load the languages (and all other variables) in an external JavaScript which should stay in memory and could be reused even when changing the page. That would mean that one had to load all the values only once per session. This should reduce the load dramatically. Though as I said, I do not know if it works with JavaScript that way. CSS can be loaded as an external file (which this site does, by the way) which is retrieved only once and cached by the browser and one can do the same with JavaScript files.
Another improvement in that context, though probably not that trivial and rather an increase of convenience for users than a decrease of load, would be to only load the languages that are specified in the babel template of the user and maybe set the native language as a default.
--dh 00:22, 14 November 2009 (UTC)
I already have a similar idea, but I am not sure how to implement it either. Therefore, I did the quick - better than nothing - improvement that I knew how to implement ;-). The code is open source, you can give it a try yourself :p. --Kipcool 17:41, 16 November 2009 (UTC)

I've just applied the same change to the attribute comboboxes. Tell me if any problem is noticed. --Kipcool 15:59, 22 November 2009 (UTC)

I've also made one more change: I've added a text field for etymology. I think it's fine, but if you have any objections, let me know. --dh 21:58, 23 November 2009 (UTC)

Class and class attributes[edit]

Am I allowed to make an expression to a class or can only people with special privileges do so? If I am allowed, how do I do it? I wanted to make a class "vehicle" to group all parts and things associated with vehicles (obviously) such as "track width", "wheel base" etc. to it. --dh 18:00, 20 November 2009 (UTC)

Yes, you have the privileges to do it. I have just discovered that we have a page explaining it Class. I have never used it myself, so I don't know if it is clear enough. If not I can give more details.
Thanks! --dh 11:33, 22 November 2009 (UTC)
Now I do have some questions:
  • Is there a place to discuss and list classes? It seems to be necessary to be restrictive in that regard and also to clarify what should be associated to a certain class and what not, lest we end up with too many classes etc.
  • What is the relationship between classes and annotations? When should one use annotations (and how) and when classes (see also the next point)?
  • According to the Class page, which says: "A class defines a group of concepts that have an 'is a' relation with the class.", the above mentioned grouping would be wrong since, for example, "track width" is not a vehicle, but a subject of the vehicle topic. Instead, DefinedMeaning:ship (3403) should be a member of the class 'vehicles', because a ship is a vehicle, and 'track width' should be annotated with "is theme of" 'vehicle'. Right now, DefinedMeaning:ship (3403) has the relationships "broader terms" and "is theme of" to 'vehicles'. To make it short, the whole thing is quite messy and confusing and it seems that some clarifications and simplifications are needed. Or is it just me and everybody else has a perfect grip of the whole matter? --dh 16:37, 22 November 2009 (UTC)
  • Originally, classes have been made to allow to add class specific attributes. For example, if a concept belongs to the class "language", you can add the attribute "spoken in" and a link to the country. If a concept belongs to the class "country", you can add "capital is" and a link to the capital. The "is a" relation is needed in that respect.
  • The proper name of what you are adding now are meronyms of "vehicle" (Expression:meronym). We don't have at the moment the possibility to specify the meronym relationship in OW, but we could add this as a relation (though I am not sure how at the moment, so depending on how easy it is, my motivation, and the time I have, it can take more or less time).
  • If your aim is to add class specific attributes, as I described above, then I'd say you need to use the classes.
  • If your aim is (just) to list the part of a vehicle and indicate that relationship, then you can use classes for now because we don't have anything better, but probably they will have to be changed to a meronym relationship if we have the feature someday.
  • The place to discuss and list classes could be Classes? Though you might get more replies if you discuss it here. --Kipcool 17:48, 22 November 2009 (UTC)
  1. So, English, French, German etc. all belong to the class 'language' (or linguistic entity) and since this class has attributes like 'spoken in' etc., those can be assigned to the several class members (German, English etc.).
    Yes (Kip)
  2. If the above is true, shouldn't DefinedMeaning:motor vehicle (2416) then be an instance of the class vehicle (because a motor vehicle is a vehicle) and car an instance and/or subclass of the class 'motor vehicle' (since a car is a 'motor vehicle' and therefore, by inference, also a vehicle), instead of being assigned rather unspecific relationship descriptors like 'broader terms'?
    The relationship "broader terms" comes from the automatic import of GEMET, and is not perfect. It should be changed to something like hyperonym/hyponym. For the moment, just consider that these relationships are not there. (Kip)
    Yep, it's part of the skos vocabulary used by the GEMET RDF representation. But why was it imported like this in the first place? I hope you are not thinking about manually changing it. It should be possible to do this automatically. When (or if) I succeed importing the database, I'll have a look. --dh 09:33, 23 November 2009 (UTC)
  3. And aren't the several languages or vehicles meronyms of 'language' or 'vehicle', respectively?
    No: meronym is "part of", different from hyponym which is "kind of". A wheel is a meronym of a car (wheel is a part of a car). A car is an hyponym of a vehicle (car is a kind of vehicle). (Kip)
    Yes, that's how I understood it first, but then I looked up meronym in wordnet and... misunderstood the example. It says "brim and crown are meronyms of hat" and I took brim and crown as being kind of hats, instead of names for parts of a hat. But now I understand, thanks! --dh 09:33, 23 November 2009 (UTC)
    Could you please have a look at DefinedMeaning:jinepuedan_(658684) to see if I've assigned everything correctly. I have assigned 'profession' as a class of 'Prostitute', but it also is a hyperonym, isn't it? Should we assign both in such cases? Right now the only hyperonym I've assigned is 'woman', and while wiktionary also assigns 'person' as a hyperonym, I did not because I figured that it is implicit since 'woman' is a hyponym of 'person'. I feel we need some guidelines for that and maybe I will write some as soon as I have a sufficient insight and overview in the whole matter. --dh 13:29, 24 November 2009 (UTC)
  4. Couldn't the meronym problem (if it still exists, see above) simply be solved by creating a class, let's say, 'technical part' and/or 'technical term/concept' with attributes like 'is part of' and/or 'is used in connection with', so that, in order to properly list all vehicle parts, one could assign 'steering wheel' to the class 'technical part' and use an attribute (of the class 'technical part') like 'is part of' to connect it to 'vehicles' (or, more specifically, a subclass of 'motor vehicles' (and/or whatever actually has a steering wheel))? Such an ontology would of course need to be well thought out and takes time to be developed, that's why I asked about a place to discuss classes. Also, we'd need a visualization of this tree or hierarchy ;)
    It could probably be solved this way, but I would call it a dirty solution. I think the clean solution is to implement a true "meronym" relationship, as I said above. (Kip)
    Then why not just add 'meronym', 'hyperonym', 'hyponym' etc. as a class attribute to DefinedMeaning:lexical_item_(402295), just like 'Antonym'? --dh 09:33, 23 November 2009 (UTC)
    Instead of awaiting an answer, I've just added 'meronym', 'hyperonym' etc. as a class attribute. Take a look and tell me if that'll do. --dh 16:47, 23 November 2009 (UTC)
    Nice! Just some comments:
    Comment 1: to make the relation clear, maybe it should be called "meronym of", "hyponym of", otherwise we don't know in which direction the relation goes.
    Yes, we can do that. Though we have to create extra DMs for that. --dh 20:12, 23 November 2009 (UTC)
    Though now that I think about it, the relationship is rather "has meronym" etc. instead of "meronym of" etc. Compare with the annotation attribute "Wikipedia-Commons-Category": The relationship is "has Wikipedia-Commons-Category" and not "Wikipedia-Commons-Category of". Personally I find it clear and intuitive as it is now, just like with the other annotations (for example we have "Genus" and not "has Genus"). But if you feel it is confusing then we can create new DMs for "has meronym" etc. --dh 13:21, 24 November 2009 (UTC)
    A problem with changing the DM from "hyponym" etc. to "has hyponym" etc. could be that the many annotations that have already been assigned automatically when importing GEMET or whatever would then be lost or would have to be changed as well. It seems they use the same DM I've used when creating the hyponym etc. relations, so that it is nicely, even though unwittingly, consistent right now. --dh 13:58, 24 November 2009 (UTC)
    Comment 2: We only need one of 'hyperonym', 'hyponym', I'd say only "hyponym (of)", the other will show up as incoming relation (otherwise you have to do the job twice). Holonym is also not needed.
    I understand what you mean, but I still think we should leave them all because this way we can assign the relationship from wherever we are at the moment. (Otherwise one would have to go to the DM which is the hyperonym while editing the hyponym.) Ideally the software should be able to resolve these relations, so that if I annotate DM A as "hyponym of" DM B, it will automatically annotate DM B with "hyperonym of" DM A (or the other way round). This would then replace the "incoming relations" thing. --dh 20:05, 23 November 2009 (UTC)
    Ok, this is another possibility. (Kip)
    Comment 3: Read also [7] but I am still not convinced. As long as we are linking DMs together I don't see any problem.
    I agree, it is not a problem, because it does not matter if another language has a word for the hyponym/hyperonym/whatever or not, since we are, as you said, linking DMs, not words. And the DMs are actually language independent (though of course they are expressed in languages). Besides, we could relate hyponym/hyperonym/whatever within one language if we wanted (by just assigning them on the SynTrans level), but I don't think that this would be right since these relations apply on the meaning level, not the word level. --dh 20:05, 23 November 2009 (UTC)
  5. Am I right to assume that the class scheme is replacing the (DM) annotations?
    Uh, yes/no? I am not sure what you mean here. Classes are not replacing annotations, but are linked with annotations (because some annotations are made available only when a class is given). See for example DefinedMeaning:Paris (6724), having the class city, and several city-specific annotations. (Kip)
    Forget this question, it's a result of a misunderstanding. --dh 09:33, 23 November 2009 (UTC)
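
The automatic resolution of inverse relations wished for under Comment 2, and the idea that 'person' is implied once 'woman' is assigned as hyperonym, can be sketched roughly as follows. This is only an illustration in Python under stated assumptions: the class `RelationStore`, the `INVERSE` table and the relation labels are hypothetical names chosen for the example and do not reflect the actual OmegaWiki code or database.

```python
from collections import defaultdict

# Hypothetical inverse pairs; the labels follow the discussion above,
# not the real OmegaWiki schema.
INVERSE = {
    "hyponym of": "hyperonym of",
    "hyperonym of": "hyponym of",
    "meronym of": "holonym of",
    "holonym of": "meronym of",
}


class RelationStore:
    """Stores directed relations between DefinedMeanings (DMs)."""

    def __init__(self):
        # (dm, relation) -> set of target DMs
        self.relations = defaultdict(set)

    def annotate(self, source, relation, target):
        """Record a relation and its inverse in one step."""
        self.relations[(source, relation)].add(target)
        self.relations[(target, INVERSE[relation])].add(source)

    def targets(self, dm, relation):
        """Return a copy of the DMs that `dm` points to via `relation`."""
        return set(self.relations.get((dm, relation), set()))

    def all_hyperonyms(self, dm):
        """Follow 'hyponym of' links transitively."""
        seen = set()
        stack = [dm]
        while stack:
            current = stack.pop()
            for parent in self.targets(current, "hyponym of"):
                if parent not in seen:
                    seen.add(parent)
                    stack.append(parent)
        return seen


store = RelationStore()
store.annotate("car", "hyponym of", "vehicle")
store.annotate("wheel", "meronym of", "car")
store.annotate("prostitute", "hyponym of", "woman")
store.annotate("woman", "hyponym of", "person")

# The inverse shows up on the other DM without a second manual edit.
print(store.targets("vehicle", "hyperonym of"))
# 'person' is reachable from 'prostitute' even though it was never assigned directly.
print(store.all_hyperonyms("prostitute"))
```

With something like this, only the nearest hyperonym needs to be assigned by hand, and it does not matter from which of the two DMs the relation is entered.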

Another question: What exactly is the gender attribute in DefinedMeaning:profession (744566) for? It seems to implement what I proposed here. --dh 10:33, 23 November 2009 (UTC)

I don't know who did that. It was not explained to me, so I cannot tell... --Kipcool 17:41, 23 November 2009 (UTC)
Well, I still think that this would be the better way of dealing with this... --dh 20:12, 23 November 2009 (UTC)

Personalized css and colors in the interface

I have enabled personalized css. For example, I have created User:Kipcool/monobook.css and added some colors in the interface.

  • Does it look nice, should I make it default, any suggestions?
  • If you have more ideas for nice customizations, you can try them on your monobook, and if necessary we can then make them default.
  • --Kipcool 10:17, 22 November 2009 (UTC)
Since you are all happy with it ( :p ) I have made it default. --Kipcool 09:19, 23 November 2009 (UTC)

How to deal with numerus, casus, conjugation etc.?

??? No one? Aren't there any thoughts about this already? I find the lack of inflections or plural forms etc. a major drawback of OW, but somehow have a feeling that to implement this would require a major change of the software or even the database structure. --14:02, 24 November 2009 (UTC)

Now moved there: International Linguists Beer Parlour/Inflexions

Etymology

Dh has added a text field for etymology. An example of a word with etymology annotation: Expression:Uraufführung.

I find it a nice initiative and/but I see the following contra/pros in it:

Contra

  1. It is monolingual: only one text field for one syntrans. I have seen that he put the etymology in German for German words and in English for English words. It makes sense, but it has to be specified on some help page, and I would prefer something multilingual.
  2. Also I think we wanted to have the etymology fields with links to the other expressions (are they called etymons or something?). But GerardM knows more about it, I don't remember what the initial plan was.

Pros

  1. It is better than nothing
  2. It can always be moved to a more complex system later on, with a bot (or sql query or call it as you like).

So, the question is: do we keep it as it is or do we wait for a more complex system? (personally I am undecided at the moment) --Kipcool 09:16, 26 November 2009 (UTC)

Re: Contra:
  1. Multilinguality can be easily achieved by simply changing the field type from 'plain text' to 'translatable text' and I first did or wanted to do that but I figured that since most words do not even have an indication of the POS, it won't be used anyways.
  2. Linking to etymons (or whatever they are called) could be achieved by allowing (or better to say translating) hyperlinks in plain text fields.
Re: Pro:
  1. 'Better than nothing' is not really a pro IMO ;)
  2. That depends on what exactly you want to move to a more complex system. To convert plain text entries to translatable text seems doable, to convert them to link to the etymons not.
--dh
Well, I like it. :-) Multilingual etymologies would be nice, but is it really necessary? Who is going to translate an etymology like "Zusammengesetzt aus Erst- und Aufführung." into every language? I think etymologies are mostly relevant to the people who already speak the language or want to learn it. Still, at least an English translation would be great. Links - I agree that we need them. But for now this solution is not bad. --Tosca 21:34, 26 November 2009 (UTC)
Ok, but if I want to put the etymology of a Latin word, I will definitely not write it in Latin. Then I have the choice between English and French, and there is no reason to prefer English. So, I'll do the conversion to multilingual text when I have some time (I have to look in the database for existing etymologies to do that, to be sure not to miss one). Then, it's up to whoever to translate it into his language if he is interested.
Good point. Then let's change it to translatable text. We can do that without losing any entries. And probably we do not even need a change in the database itself since if someone wants to translate an entry, s/he can just copy&paste the original one to the 'translatable text' field, set the language and delete the 'plain text' entry.--dh 22:32, 28 November 2009 (UTC)
'Better than nothing' <= Sorry, I use too much irony sometimes...
'move to a more complex system' <= I had the idea to implement it with a special table in the database dedicated to etymology, but maybe we don't need it after all. --Kipcool 22:50, 27 November 2009 (UTC)
I've changed it to translatable text. See DefinedMeaning:Rummel (1145572) for example. --dh 23:57, 28 November 2009 (UTC)
Ok, but you should have waited a bit before changing it, because then it is harder for me to find the old etymology option in the database.
Also, an option should not be removed when there are still entries using this option. There is still no automatic check by the software, and sometimes it creates an error.
Anyway, I have changed all of the existing etymologies to translatable text. --Kipcool 18:34, 29 November 2009 (UTC)
Ok, sorry. But I figured that there were just very few etymologies already entered, and there were no problems with any errors when I did the same with some annotation options I've added. Hope everything's all right now. --dh 00:36, 30 November 2009 (UTC)
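
For reference, the conversion discussed in this thread (copy the plain text entry to the translatable text field, set the language, delete the plain text entry) could look roughly like the sketch below. It is written in Python against a made-up in-memory structure: `Annotations`, `migrate_etymology` and the language code "de" are assumptions for illustration only and say nothing about how the OmegaWiki database actually stores these fields.

```python
from dataclasses import dataclass, field
from typing import Dict, Optional


@dataclass
class Annotations:
    """Hypothetical stand-in for the etymology annotations of one entry."""
    plain_etymology: Optional[str] = None
    # language code -> etymology text
    translatable_etymology: Dict[str, str] = field(default_factory=dict)


def migrate_etymology(entry: Annotations, language: str) -> bool:
    """Move a plain-text etymology into the translatable field.

    Returns True if the text was moved. An existing text in the same
    language is never overwritten; in that case nothing is changed.
    """
    if entry.plain_etymology is None:
        return False
    if language in entry.translatable_etymology:
        return False
    entry.translatable_etymology[language] = entry.plain_etymology
    entry.plain_etymology = None  # remove the old entry only after copying
    return True


entry = Annotations(plain_etymology="Zusammengesetzt aus Erst- und Aufführung.")
moved = migrate_etymology(entry, "de")
print(moved, entry)
```

Deleting the old entry only after the copy has succeeded means that no etymology is lost if the conversion is interrupted or a translation already exists.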

Loading classes made faster

There was something strange with loading the list of classes, which was taking a lot of time when clicking on the combobox. I have removed the part "AND synt.identical_meaning=1" and now it is wayyy faster, though I am not sure why. --Kipcool 21:06, 26 November 2009 (UTC)

Great, this problem has come up several times for me. Now it works much faster. --Tosca 21:12, 26 November 2009 (UTC)
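
Since the reason for the speed-up is not known, it may be worth checking that dropping the condition did not also change which classes are listed. A minimal sketch of such a check, using Python's sqlite3 module and a made-up `syntrans` table with invented sample rows: only the alias `synt` and the column `identical_meaning` come from the post above, everything else (table layout, the function `class_names`, the data) is an assumption and not the real OmegaWiki query.

```python
import sqlite3

# In-memory database with a hypothetical, simplified table.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE syntrans ("
    "defined_meaning_id INTEGER, spelling TEXT, identical_meaning INTEGER)"
)
# Invented sample rows, for illustration only.
conn.executemany(
    "INSERT INTO syntrans VALUES (?, ?, ?)",
    [
        (1, "city", 1),
        (2, "profession", 1),
        (3, "lexical item", 0),
    ],
)


def class_names(connection, only_identical):
    """List spellings, optionally keeping only identical_meaning = 1."""
    sql = "SELECT synt.spelling FROM syntrans AS synt"
    if only_identical:
        sql += " WHERE synt.identical_meaning = 1"
    sql += " ORDER BY synt.spelling"
    return [row[0] for row in connection.execute(sql)]


with_filter = class_names(conn, True)
without_filter = class_names(conn, False)
# Entries that appear only once the condition is removed.
extra = sorted(set(without_filter) - set(with_filter))
print(extra)
```

If the list of extra entries is empty on the real data, removing the condition changed only the speed and not the content of the combobox.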