International Beer Parlour/Archive20101106

From OmegaWiki

New Annotations

I've added the following new annotation options for all DefinedMeaning:lexical item (402295)s (See also Meta:International_Beer_Parlour#unitizing_usage_annotation):

Apparently, meronyms and holonyms can be subdivided into "(is|has) part (of)", "(is|has) member (of)" and "(is|has) substance (of)" meronyms and holonyms. And I think we should do that.--dh 18:14, 13 December 2009 (UTC)

And these for DefinedMeaning:state (3609):

The intended entry in the plain text field is the unique Geonames number that identifies a given feature in the Geonames database. In the case of France this is '3017382'. This number can be used in several ways:
  1. It can be used to import or update data from the geonames database, like population, neighbouring country, alternative names, children, geocoordinates etc.
  2. It can be used to unambiguously identify a DM in OW if an external resource identifies a spatial thing by its geonames id (or a geonames link). (An address book or a restaurant guide, for example, could, instead of giving a country, city, district, postal code etc., just give one geonames id that identifies the area.)
  3. It can be used to construct several URIs, like:
    1. http://sws.geonames.org/3017382/ will, depending on the request header of the requesting application, return either an RDF document describing France or a link to a map showing France.
    2. http://sws.geonames.org/3017382/about.rdf returns a computer-readable rdf document describing the feature.
    3. http://sws.geonames.org/3017382/neighbours.rdf returns a computer-readable rdf document describing the neighbours of the feature, etc.
    4. http://geotree.geonames.org/3017382/ shows France in a tree representation.
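
The URI patterns listed above can be sketched in code. This is a minimal sketch that assumes only the four URL patterns quoted in the list; the function name `geonames_uris` and the dictionary keys are made up for illustration and belong neither to OmegaWiki nor to any Geonames API.

```python
# Hypothetical helper: builds the URIs listed above from a Geonames id.
# The URL patterns come from the discussion; names and keys are illustrative.
SWS_BASE = "http://sws.geonames.org"
GEOTREE_BASE = "http://geotree.geonames.org"


def geonames_uris(geonames_id: str) -> dict:
    """Return the URIs that can be derived from a numeric Geonames id."""
    # Reject anything that is not a plain run of ASCII digits.
    if not (geonames_id.isascii() and geonames_id.isdigit()):
        raise ValueError(f"not a numeric Geonames id: {geonames_id!r}")
    return {
        "feature": f"{SWS_BASE}/{geonames_id}/",
        "about_rdf": f"{SWS_BASE}/{geonames_id}/about.rdf",
        "neighbours_rdf": f"{SWS_BASE}/{geonames_id}/neighbours.rdf",
        "tree": f"{GEOTREE_BASE}/{geonames_id}/",
    }


if __name__ == "__main__":
    # 3017382 is the id given above for France.
    for name, uri in geonames_uris("3017382").items():
        print(name, uri)
```

Storing only the id in the plain text field and deriving the URIs on demand would keep the annotation independent of any one URL scheme.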

Proposals:

Adding a class attribute on the DM level to enter the corresponding Wordnet 3.0 synset URI on lexvo.org. See [1] for example.

--dh

Wow, that's a lot of stuff. I like etymologies and I like the indication of register, but I'm not convinced about the rest. This is first and foremost a dictionary and I don't think it's helpful to overcomplicate things. --Tosca 14:03, 29 November 2009 (UTC) (I restructured the whole thing a bit, it was hard to read)
(I find it hard to read as it is now) --dh 01:22, 30 November 2009 (UTC)
Also, I appreciate your enthusiasm to develop things, but please, could you slow down a little? You've created tons of new DMs and you have added annotations to existing DMs, but not all changes are good/agreed upon and it won't be easy to fix all this. So please, one thing at a time. --Tosca 15:05, 29 November 2009 (UTC)
I think the same. --Kipcool 15:21, 29 November 2009 (UTC)
'Tons of new DMs'? Oh please, you really tend to overstate things, Tosca. I've just created a few (5 if I'm not mistaken) that were necessary for the implementation and I've just annotated a few as examples and tests. The new DMs are all listed on this page and I can easily remove them in case we decide so, though I can't see what exactly is complicated about linking DMs to concepts or instances of a few external authoritative ontologies and still think that it would add value to OW. As I see it, there are two possibilities right now: Either I am wrong with my assertion that OW can and should be part of the semantic web, or you do not understand what I mean. And I haven't yet seen an argument that proves my assertion wrong. --dh 01:15, 30 November 2009 (UTC)
The problem here is that other contributors will want to link to other ontologies or resources or whatever they consider important. If four people each create 5 links to other resources, we already have 20 links, and we'll be lost in them. As an idea, see a list of ontologies on Wikipedia. I don't think we can possibly make links to all of them. This would need to be another project specifically dedicated to linking ontologies together.
I think a link to an external resource has to bring useful information (such as links to Wikipedia). At the moment, I don't see what the ones you propose bring. I have clicked on a few but I have not been impressed so far. This has to be discussed of course.
Yes, sure, the links I proposed do not bring any useful information to the reader. But that is not the intention. The intention or purpose is to unambiguously identify DMs and thus make them accessible and usable by external applications etc. One major drawback of Wordnet for example is that it does not give unique identifiers to concepts and while what I propose does not do that for OW either, it at least makes it possible to identify concepts in a broader context. --dh 09:24, 2 December 2009 (UTC)
If you want to implement something like semantic web, I think we should link to only ONE ontology or mother-ontology or super-ontology whatever the name, which links back to all the other ontologies (including us). OpenCyc seems to want to do that? or SUMO? See Suggested Upper Merged Ontology (Wikipedia). This has to be discussed. --Kipcool 09:17, 30 November 2009 (UTC)
I agree, it is hardly productive to link to all or even many ontologies and it has to be decided which one to use. I think opencyc is a good choice since it is one of the oldest, most complete and most developed around (at least amongst the free ones). Dbpedia for example has started to link to opencyc and the UMBEL ontology is entirely based on it. SUMO probably would be a good choice too, though as far as I know they do not have permanent URIs for the concepts. Besides, opencyc offers an API to search for concepts, which could be used in some way in the future.--09:24, 2 December 2009 (UTC)
Ok, I do tend to exaggerate a little. ;-) I admit that I don't really know much about ontologies and such. What would be the concrete use of a link to an ontology? Because IMO a link such as this one: http://umbel.org/umbel/sc/Restaurant_Organization.rdf isn't very useful for the average dictionary reader. If we do link to ontologies we should select the best ones, because too many links create clutter. --Tosca 16:05, 30 November 2009 (UTC)
See what I wrote above. I hope that makes the usefulness a bit clearer. Something more concrete would be the Musicbrainz example. They have an instrument list and started to translate it. While the instrument list is quite complete, the translations are far from it and if OW had identifiable DMs, they could instead simply link to or import from OW. (and maybe even contribute...) --dh 09:24, 2 December 2009 (UTC)
Another link I find important and would like to keep is the geonames id. --dh 15:25, 3 December 2009 (UTC)
I will try and post something about it (linking to ontologies, geonames, etc.) on the OW blog with a link to the discussions here, in order to get more feedback.
My personal opinion for now (but I'll try to be neutral on the blog I promise ;-) ) is to just have a link to opencyc. For Geonames I don't know, it is too bad that Opencyc does not link to Geonames.
So: post coming probably this week-end. --Kipcool 19:35, 8 December 2009 (UTC)
I agree with having just a link to opencyc for all lexical items, but still think that we should have a link to geonames for geographical entities since it is an external, authoritative source for geographic information. --dh 09:37, 9 December 2009 (UTC)
Likewise we should have links for animals and plants to at least one external authoritative source with permanent IDs or URIs (not sure if any and which are available, I think geospecies has permanent URIs (and also a free license)). Those links would only be available for certain classes and do not intersect, that is, one won't have the option to put both a link to geonames and to geospecies (or any other resource we might link to) for any entity. --dh 09:37, 9 December 2009 (UTC)
I have posted a summary yesterday. I hope it is ok.
Now that I had time to think about it, links to geonames are ok for me as well. Just wait a week or two maybe for more reactions.
For species, since we are MediaWiki friendly, we could maybe link to Wikispecies. They try to link to several external resources. However, they are not very complete. In fact geospecies links to more resources including Wikipedia and Wikispecies. So ok, just wait also a week or two ;-) --Kipcool 08:11, 14 December 2009 (UTC)
Great, thanks!
Regarding Geospecies and Wikimedia:
  • Wikimedia is not computer readable, Geospecies is and offers its entire content in HTML as well as RDF.
  • As you've correctly noted, Geospecies includes, amongst others, computer readable, consistent links to Wikimedia, while Wikimedia does not link to Geospecies and all links it might have to other resources are not computer readable and consistent.
  • Geospecies tries to tie together disparate data about species and be a reliable, consistent means to identify a species, resistant to possible changes in naming conventions etc.
  • Geospecies is quite complete.
In my opinion we should consider importing the Geospecies data, as I can't see the use of repeating work others have already done. And as I've already mentioned elsewhere, I think we should do the same with Geonames. The problem is that we already have some entries that are also in these datasets, and right now it is only partly or imperfectly possible to identify them properly to avoid duplicates. This is one of the reasons I think we need to link to other datasets or ontologies with a "same as" relationship.
Regarding Opencyc and Geonames: It is true that Opencyc does not link to geonames, but it does link to wikipedia, as geonames does...
--dh 21:16, 14 December 2009 (UTC)
I've taken a long wikivacation from OmegaWiki after I graduated in mid-2008, but I recently had some ideas bubble up in my head, mostly to do with language learning. I won't go into detail about that here now, but one of my conclusions (which is also relevant to the above) is that the biggest problem with OmegaWiki right now is maintainability. Adding mappings to as many different sources as possible is fine, but when the underlying DM is changed, everything needs to be rechecked. IMHO, it's moot to discuss advanced features before this problem of maintainability is fixed. I have a beautiful idea that may be able to fix this, but this space is too limited to explain how it could work. The keyword is crowdsourcing, though. Hopefully I will have time to explain in more detail sometime soon. László 16:04, 19 December 2009 (UTC)
Well, welcome back then.
Regarding maintainability of mappings to external sources I can't see the problem since the mapping is done on DM-level so that in case a DM is changed, the mapping can, and has to be changed too, along with translations, other links, hyponyms, antonyms etc. That's a lot of work and therefore definitions should be constantly checked and questioned when adding translations so that more and more stable definitions that won't change anymore will crystallize over time. And mapping to a developed and well thought out ontology like opencyc can actually help with that by comparing our DMs to their concepts.
Changing DMs in OW would only be a problem if it wanted to offer stable, reliable, constant URIs for use by semantic web applications.
Regarding "crowdsourcing" I am really curious what your idea is, considering that crowdsourcing itself is not the solution but the problem (since, at least in our context, it is just another word for the "wiki approach" Omegawiki takes) --dh 05:48, 20 December 2009 (UTC)
I'm very happy with these ideas of very useful annotations for lexical items, though I still don't like the complicated access to them (and I didn't notice the change before reading this discussion). Thank you very much. --Fiable.biz 18:43, 11 February 2010 (UTC)

Two things

1) Do we still need the message "Edit conflicts can lead to overwriting of data. It is recommended to wait at least 10 minutes before editing a recently changed Definition, or find the person on IRC to ask if they've finished editing." displayed above the recent changes? It looks a bit ugly. ;-)
2) How can we fix the server time? There is a 30 min difference between server time and real time.

--Tosca 14:09, 29 November 2009 (UTC)

I fixed the server time. However, there is something strange with ntpd (the daemon that is supposed to correct time from time to time ;-) ), which is not working. --Kipcool 16:41, 29 November 2009 (UTC)
Danke dir. So the server time will be off after a while again? --Tosca 16:46, 29 November 2009 (UTC)
I don't know, but I fixed it already about 2 months ago, and now it was off... I don't know if it is a normal drift, or if another program updated the time in a wrong manner. --Kipcool 16:55, 29 November 2009 (UTC)

Word class vs. grammatical property

Right now this is a bit of a mess. If you want to assign a word class to an English word, you are presented with 14 different options. But traditional English grammar only knows 8 word classes (noun, verb, adjective, adverb, pronoun, preposition, conjunction, and article (some grammars have interjection instead of article)). I know that this is debatable from a linguistic point of view, but most people are familiar with these terms and we should stick to them and leave the debate to the linguists. ;-) The drop-down list includes "Expression:article", but also "Expression:definite article" and "Expression:indefinite article", so it's not very clear to the user what to use. Then there's also the grammatical property. In German I just removed "transitive verb" from the list of word classes because this can be indicated via the grammatical property (that is, you choose "verb" as the word class and then "transitive" as grammatical property). But several other languages have this duplication too.

So my proposal is this: A word should first be put into a general word class. Then you can indicate which subclass it belongs to and what special grammatical properties it has. --Tosca 17:51, 29 November 2009 (UTC)

Good idea. Though I'm afraid it is not possible to implement this without changing the software. All we can do is to split up word class and subclass/properties, but no matter which class a word is in, all properties will show up, if they make sense or not. Just as it is possible now to assign several word classes to the same word/DM. --dh 01:42, 30 November 2009 (UTC)
Yes, a software change is needed to implement that some options are dependent on other options. This is not impossible, the code is opensource... ;-) --Kipcool 09:20, 30 November 2009 (UTC)
If we do not implement grammatical properties (of which word class is only one) in a hierarchical manner, we end up with enormous grammatical property lists for some languages, having entries such as preposition, requiring grammatical case x, numerus y and being (not) lenient towards plurale/singulare tantum, requiring an (in)definite + (un)pointed article, (dis)allowing/requiring a numeral, (dis)allowing cardinals, (dis)allowing adverbial constructs to follow before the target, (dis)allowing pronomina as target, (dis)allowing/requiring a target having the 'container' property, (dis)allowing/requiring a target having the 'container' property when being used in conjunction with a verb of movement only, … -- not all possible combinations of case, numerus, lenience, … exist, but many do. If we want to support machine-aided translation engines in the future, we will certainly need to collect more grammatical properties such as 'container' so as to provide useful data. So currently, I suppose, it is not useful to add detailed grammatical properties of the kind mentioned above?
No, please don't add detailed grammatical properties! First I need to program something better (probably during winter) and clean the mess in the database. --Kip 14:51, 14 September 2010 (UTC)
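
The dependency discussed in this section (a general word class first, then only the subclasses and properties that make sense for it) could be modelled roughly as follows. This is only a sketch and not how the OmegaWiki software works: the table `PROPERTIES_BY_CLASS` and both function names are hypothetical, and "intransitive" is added purely for illustration.

```python
# Hypothetical table: which grammatical properties make sense for which
# general word class. Only a few classes from the discussion are shown.
PROPERTIES_BY_CLASS = {
    "verb": {"transitive", "intransitive"},
    "article": {"definite", "indefinite"},
    "noun": set(),
}


def allowed_properties(word_class: str) -> set:
    """Return the properties the editor should offer for this word class."""
    if word_class not in PROPERTIES_BY_CLASS:
        raise KeyError(f"unknown word class: {word_class!r}")
    return set(PROPERTIES_BY_CLASS[word_class])


def invalid_properties(word_class: str, chosen) -> set:
    """Return the chosen properties that do not fit the word class."""
    return set(chosen) - allowed_properties(word_class)


if __name__ == "__main__":
    print(sorted(allowed_properties("verb")))
    print(invalid_properties("noun", {"transitive"}))
```

With such a table the drop-down for properties would depend on the word class already selected, instead of listing every property for every word.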

Proposal to add DefinedMeaning:is part of theme (3) as annotation option on DM-level for all lexical items

... with a limited set of themes, like: biology, computer science, mathematics, chemistry etc. Right now it is only possible if an item is assigned to a certain class. See DefinedMeaning:Integer_(1146835) for example. First I added it to the class 'computer science', but then I realized that this would be wrong as the relationship is not that "'integer' is 'computer science'", but rather "'integer' 'is part of theme' 'computer science'", as it is now annotated. But I only could do this annotation because the DM was a member of the class 'computer science', which I changed again since it is wrong. --dh 07:05, 3 December 2009 (UTC)

(By the way, though they exist, I can't see any use of classes like DefinedMeaning:informatics (1934), biology etc. as the relationship of "something 'is' 'informatics'" does not make much sense. Or do I misunderstand the whole issue?) --dh 07:18, 3 December 2009 (UTC)

No, you are right. It seems that classes have been used both as "is a" relationship (which is its true aim), and as a theme/field/whatever it is called. I also think that classes that are not true classes (not a "is a") have to be changed to something else.
Yes, I agree. But for that we first have to introduce this "something else". --dh 12:02, 3 December 2009 (UTC)
I am not sure if "theme" is a good word. I think "field" or "semantic field" is the word commonly used in linguistics to indicate such idea (here Wikipedia).
To clarify further the distinction between class/field, I'd say for example that Germany is in the class "Country", but in the field "Geography". --Kipcool 08:52, 3 December 2009 (UTC)
This makes sense and touches on something I've already asked elsewhere: Isn't "Country" also a hypernym of "Germany"? And if that's the case, aren't the classes a replacement for the hypernym relation? --dh 12:02, 3 December 2009 (UTC)
Not sure if we can say hypernym here. I think WordNet uses the term "instance" because Germany is not really a concept, but one specific country which has only one existence (like me, I am an instance of a man, not a hyponym).
But I see your point. Classes are indeed a kind of hypernym (because "is a" is the definition of hypernym), but in my opinion, with hypernyms, we should only indicate direct hypernyms, therefore ideally allowing to display a tree like in Wordnet. With classes, we are interested in objects having a common property.
To take an example, for the expression "German shepherd", I would give "dog" as a direct hypernym, but "animal" as a class, allowing me to give annotation attributes specific to animals (like all the taxonomical information).
In this case, the class is an indirect hypernym, and probably the attributes could be indirectly inherited, but I believe it would be less practical than it is now. --Kipcool 12:40, 3 December 2009 (UTC)
Yes, I think I got it. And you are right, it is called "instance", though not only in wordnet. But what about continent - country - city? Are these meronym/holonym relationships? Though no matter if they are or not, it seems better to have more explicit annotation options. For example for geographical entities, we could need something like "located in". Right now this is expressed through "is part of theme". See "Germany" for example. --dh 15:39, 3 December 2009 (UTC)
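
The alternative mentioned here, inheriting class attributes through indirect hypernyms, can be sketched as below. This is an illustration only, using the "German shepherd" example from the comment above; the two tables and the function names are hypothetical and do not reflect the OmegaWiki database.

```python
# Hypothetical data: each term points to its direct hypernym only.
DIRECT_HYPERNYM = {"German shepherd": "dog", "dog": "animal"}

# Hypothetical data: annotation attributes attached to a class.
CLASS_ATTRIBUTES = {"animal": ["taxonomical information"]}


def hypernym_chain(term: str) -> list:
    """Walk up the direct hypernyms and return all ancestors in order."""
    chain = []
    seen = {term}  # guards against accidental cycles in the data
    current = DIRECT_HYPERNYM.get(term)
    while current is not None and current not in seen:
        chain.append(current)
        seen.add(current)
        current = DIRECT_HYPERNYM.get(current)
    return chain


def inherited_attributes(term: str) -> list:
    """Collect the attributes of every class found among the ancestors."""
    attributes = []
    for ancestor in hypernym_chain(term):
        attributes.extend(CLASS_ATTRIBUTES.get(ancestor, []))
    return attributes


if __name__ == "__main__":
    print(hypernym_chain("German shepherd"))
    print(inherited_attributes("German shepherd"))
```

The walk has to be repeated for every lookup, which is one way to read the remark that inheritance would be less practical than assigning the class directly.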
I've just realized we already have DefinedMeaning:located in (755407) as annotation option for (at least) countries, though it is rarely used. It seems many instances of "is theme of" need to be replaced by "located in". But still, isn't this also a meronym/holonym relationship? --dh 15:50, 3 December 2009 (UTC)
Agreed for: "is theme of" needs to be replaced by "located in".
I don't know for meronym/holonym relationship in this case. --Kipcool 08:56, 4 December 2009 (UTC)

Are there any objections to me starting to implement something like a "Topic" annotation on DM-level for all lexical items? There is a subject list on my user page and I would start with the terms I need. Anybody could then do the same. --dh 15:58, 11 December 2009 (UTC)

Ok for me, but I am not sure about the name "topic". I like "theme" or "field" better... I have to think about it. What are you considering in German: "Thema", "Fachgebiet" or something else? --Kipcool 07:58, 14 December 2009 (UTC)
Well, Wordnet does call this information "topic" (which by the way translates to "Thema" in German, just like "theme") and I think it is fitting, but if you can come up with and agree on a better name we can change it. Until now I have just implemented a few "topics" but for some reason I do not understand, it does not work, that is, "topic" shows up as an annotation option but I can't select anything. It's a mystery to me since I did the same with the usage option list and there it works without any problems. --dh 00:06, 18 December 2009 (UTC)
Maybe, instead of using a noun as relation descriptor, we should use a real predicate like "has topic" and do the same for synonyms, hyponyms etc. --dh 00:11, 18 December 2009 (UTC)

Annotations for animals

like species etc. shouldn't be plain text but fixed meanings since typing the names is error prone and needless if we want to have all words in the database anyways. --dh 18:01, 6 December 2009 (UTC)

The problem is that we need a language for "scientific name". Conventionally expressions like "Equus ferus" etc. are called Latin, but they aren't real Latin. --Tosca 17:41, 8 December 2009 (UTC)
Well, it shouldn't be that hard to add something like "modern scientific Latin" as a language for binomial classifications. --dh 22:20, 8 December 2009 (UTC)
It is possible, but we need to be sure about the name of the language, and find an ISO 639-3 code (does one already exist?). --Kipcool 07:46, 14 December 2009 (UTC)
It's called "New Latin", a subgroup of "Contemporary Latin" and apparently the ISO-639 system does not distinguish between different kinds of Latin as they all have ISO 639-1 code "la" and ISO 639-2 and ISO 639-3 codes "lat". If that is indeed the case then I would suggest we simply call it "New Latin" or "Scientific Latin" (and make up an ISO-code in case the software requires one). As long as it is coherent it can always be easily changed later.
In this context I would suggest to import the different families, orders etc. and their translations from GeoSpecies along with links to Wikimedia Commons and Wikipedia and the corresponding GeoSpecies URI. I would even go so far to import all the individual species as well since it is a lot of work and takes a lot of time to enter all this information manually and the time and effort could be better invested in other areas, like adding translations and definitions to the imported GeoSpecies data. --dh 23:51, 17 December 2009 (UTC)

Persons and their gender

As I already mentioned elsewhere, I am not really satisfied with the way we handle the gender specific forms/DMs of persons, and while by now I think the way it is handled is correct (at least partially), I am bothered that not native but foreign words show up in gender neutral DMs for languages which distinguish gender. See Expression:Musiker for example. One way to solve this would be to indicate a word which denotes both male and female instances and mark this as identical translation. For DefinedMeaning:muzikant (401479) this would be "Musiker/in" in German. See DefinedMeaning:cubano (868982) for an example. --dh 08:42, 8 December 2009 (UTC)

I really don't like this solution: DefinedMeaning:cubano_(868982). It isn't "clean", because Kubaner/in is not one word, but two (there's also a spelling error). The other thing is that a word like "Musiker" has two meanings in German: a) someone who plays music b) a man who plays music. For example when we say: "Die Musiker bereiten sich auf das Konzert vor" the musicians could all be male or there could be males and females. This is considered sexist by some, but well, this is how it is. Their workaround is to say "Die MusikerInnen bereiten sich auf das Konzert vor" but this is very artificial and not widely used. So I think we should use the standard (male) form as translation for both the gender-neutral and the female DM. The gender-neutral DM should link to the male and female DM.
Right, it might not be one word (and expression would probably be the more fitting term), but there are many translations that consist of two or even more words. And somehow I think these forms should be in OW as well, since they are used (at least in writing) and have a meaning. But I agree that we should use the standard (male (and thus sexist)) forms as "identical meaning" for both the gender-neutral and the male DM (assuming you meant "we should use the standard (male) form as translation for both the gender-neutral and the male form"), and use the female form only in the female DMs (and link the female, male and neutral DMs by something similar to what I suggested below). --dh 09:23, 9 December 2009 (UTC)
Yes, some translations consist of more than one word. But in those cases all the words are needed. For example the English translation for "Waschmaschine" is "washing machine". Only the 2 words "washing" + "machine" together describe the concept. That is not the same for "Kubaner/in" which contains 2 different concepts. Your last phrase: That's what I meant. :-) --Tosca 13:41, 12 December 2009 (UTC)
So, what about languages that do not make a gender distinction (or not always)? This (Expression:Musiker) is not a good solution IMO, because it is wrong. "Musician" is a perfect translation for "A man who makes music, either by using his voice or by playing an instrument.". Checking off the identical meaning box just doesn't make sense. We could a) not translate the gender-specific DMs into English b) translate all three DMs (gender-neutral, male, female) into English and put "musician" in as translation. I tend to prefer b).--Tosca 17:35, 8 December 2009 (UTC)
What about c) to only use "male musician" (which IMO is the only perfect, that is really identical and unambiguous (even if complicated) translation of "A man who makes music...") for the male DM, "female musician" for the female DM and "musician" for the neutral DM? But then... in German one would say "Meine Freundin ist Musikerin" and not "Meine Freundin ist Musiker", and in English one would say "My girlfriend is a musician" and not "My girlfriend is a female musician", so maybe your solution b) is the best. --dh 09:23, 9 December 2009 (UTC)
True "male musician" would be correct. But as you said that's not how it is used. You would only add the adjective if you really want to put an emphasis on it (for example: "My girlfriend is the only female musician in the orchestra").
The most elegant solution would be if we had SubDMs for cases like this, when some languages are more specific than others. But I'm sure it's not easy to program. --Tosca 13:41, 12 December 2009 (UTC)


Another proposal in this context is to assign those kind of DMs to the class DefinedMeaning:person (333730) and add class attributes on DM-level to indicate the gender of the given DM and point to the male and female forms. But I do not yet have a clue how to name these attributes/DMs. --dh 08:42, 8 December 2009 (UTC)

Good idea. --Tosca 17:35, 8 December 2009 (UTC)
Any idea how to express this? Right now I simply indicate the female and male DMs as hyponyms of the neutral DMs, but that puts "Musiker" and "Musikerin" on the same level as for example "Gitarrist" and "Geiger" and it somehow does not seem right. It's a special relationship and should be indicated as such. --dh 09:23, 9 December 2009 (UTC)
Maybe it would make sense to have extra classes for male and female persons, so that instead of putting "Musiker" in the person class we'd put it in the "male person" class. --dh 11:36, 9 December 2009 (UTC)
I'm not really sure how to do that. We have DMs for "man" and "woman", maybe we could use those as classes. I hope Kipcool answers, he's more of a technical guy. :-) --Tosca 13:41, 12 December 2009 (UTC)
Yes, I thought of these DMs as well. The problem with them is that they define "man" and "woman" as "adult", but this isn't necessarily true for "persons". --dh 18:18, 12 December 2009 (UTC)

Agreeing with Dh, a possible implementation would be:

  1. To the class DefinedMeaning:person (333730), we can add attributes "male person" and "female person", pointing to the male and female forms.
    note 1 : it seems that we don't have these 2 DMs yet? we have "man" and "woman" but they are specific to adult humans, and here we don't need the "adult" part.
    note 2 : indeed, as Dh said, they are a kind of hyponymy, but I agree that they deserve to be indicated with a specific relation (class attribute).
  2. I agree with having a "male person" class. We can then add an attribute "female person" to this class, pointing to the female form of a given male form (and mutatis mutandis for female/male). We can also have a further attribute to point to the gender-unspecific DM, but I can't find a good name for it; or we can rely only on the "incoming relations" part, or only indicate it as a hypernym. --Kipcool 18:18, 12 December 2009 (UTC)

Why not link the genderless version of expressions into the 'person' class? We can have three relations: 'male version of', 'female version of', 'genderless version of' - so as to satisfy German needs, the genderless version would have to list each of "Kubaner oder Kubanerin", "Kubaner/in", "KubanerIn", and "Kubaner" as exact matches, but would assign them different levels of speech so as to assist choosing the right one for a given audience or occasion or era. Note, by the way, that the plural of "Kubaner oder Kubanerin" is ambiguous: it can be either "Kubaner oder Kubanerinnen" or "Kubaner und Kubanerinnen" depending on context. --Purodha Blissenbach 09:08, 14 September 2010 (UTC)

There is another complication worth mentioning in this context. During the 1970s and 1980s, the German language differed between the laws and administrations of the different states (it still does), and there was an interesting difference concerning gender: In Germany, Austria, and Switzerland, since the 1800s at least, you could learn, get state approval, and become a "Kaufmann" (trade-man), whatever your sex. In West Germany, the title was changed for women; after a certain date they got a certificate as "Kauffrau" (trade-woman). One of the other German-speaking countries disappeared and was incorporated into West Germany. Now ex-GDR women are certified as "Kaufmann" whilst their ex-and-still-BRD colleagues are "Kauffrau". (Many ex-GDR women are proud of that because they consider "Kauffrau" and similar "typical western" genderisms quite ridiculous.) Thus, the correct translations of these words in resumes etc. would have to differ depending on the year and place of the underlying certificate. We do not have an official language code for East German of the GDR in OmegaWiki, but I think we should. The proper code would have to be de-DD, since DD was (and still is) the ISO code for the GDR, and here is one of the (imho few) use cases. This also serves as an example of why ISO codes that became obsolete must not be reassigned. --Purodha Blissenbach 09:08, 14 September 2010 (UTC)

Example sentences

The following has already been posted on DefinedMeaning_talk:example_sentence_(402304) and any discussion should probably take place there:

IMO it would be good to make the translation of example sentences mandatory, that means, if there already is an example sentence for one translation in a language, and you want to add an example sentence for a translation in another language, you have to first translate it and use the word the example sentence is for. In case you can't translate it (because you do not know the source language etc.) you simply fill in a question mark (?), and in case it is not possible to translate it (because of gender etc.), fill in a hyphen (-). For an example see DefinedMeaning:Orwellian (354083) and for an example including synonyms, see DefinedMeaning:yet (5937) --dh 16:15, 10 December 2009 (UTC)

I believe it is a good idea (because I already had the same idea ;-) ), well, not really of making it mandatory, but of making it recommended.
What I had in mind was to change the software a little bit, by creating a namespace for example sentences, which would be similar to the current list of a definition translated in several languages. With this, it would be clear which example sentence is the translation of which other example sentence. At the moment, in DefinedMeaning:Orwellian (354083), there are 2 sentences which are translated in several languages, but it is not straightforward (and not explicit in the database) to know which corresponds to which. --Kipcool 16:40, 10 December 2009 (UTC)
Well, it works since OW keeps the order of the sentences, at least seemingly. And that's also the reason why I spoke of making it mandatory: It's the only way right now to make it work. Of course it would be better to have a more straightforward way of doing this. Personally I think this should have a high priority as it appears that having and translating example sentences is a great, and maybe the only way to ensure synonymity of a translation and often is simply needed in order to understand the DM at all. See Expression:yet for example... --dh 19:30, 10 December 2009 (UTC)
By the way, parallel translations in more than 10 languages can be found with the search engine of the OPUS Corpus project. --16:05, 11 December 2009 (UTC)


Dear colleagues (Kipcool and others),

I think it’s an excellent idea by Kipcool to create a namespace for example sentences, similar to the present style of creating articles (expressions) with translations and synonyms for definitions. I agree completely with the reasoning that example sentences do help greatly in the process of understanding the definition of a given word. This way we have theory (isolated word) and practice (example sentence) duly brought together. Indeed, example sentences presented in “annotations” are not so effective as I believe they would be on a separate namespace.

Concerning my experience in terms of phraseology (groups of example sentences in a language), I quote below a note I have published here on my user page [2]:

“Notice that I have a list of around 100.000 bilingual English-Portuguese sentences I have been compiling in the past 28 years (since 1979) – most of them written in the period March 1979 / August 1968 – when I lived in Washington, D.C. I started gradually publishing this list at the pt-Wiktionary (Portuguese-language dictionary of the Wiktionary Project), at the “English-Portuguese Phraseological Dictionary – IPDF” [3], a bilingual English-Portuguese dictionary whose entries (phraseology included) I intend to share with OmegaWiki as well. I have also a list of hundreds of French-Portuguese and Spanish-Portuguese sentences, also compiled by me since 1974;” (See more details on my user page).

In case the idea of a separate namespace works out, I have some suggestions we might possibly discuss:

1. A possible name for each example-sentence article could be, say, “Phraseology:nameOfcollocation” (general example). Specific example = Phraseology:leave well enough alone (for creating an entry for the idiomatic expression to leave well enough alone). For each entry we could have:

EXAMPLE SENTENCE = Phraseology: regard with skepticism

a) defined meaning = not to believe;

b) example sentence: Many people would regard this intention with skepticism. – Synonym sentence: A lot of people would not believe this idea;

c) synonym sentences (that is, how to say a given sentence in other words, mainly in simple English). Example: not to take seriously. As the case may be, a synonym sentence (when a collocation) would be presented in link form, in case someone would like to go into details about it;

d) etc

Each item above (a, b, c etc) would be translated, by users, into many other languages (most often into each user’s mother tongue).

Well, let’s see what our colleagues have to say about this topic. Waltter Manoel da Silva wtz 21:44, 17 December 2009 (UTC)

Wow, 100.000 sentences is a lot. I hope nobody thinks about manually entering them into the database :) The best way to utilize them would probably be to import them to a separate table and make them searchable in a way that if one wants to enter an example sentence in Portuguese or English, the given word is searched in the sentences and the matching sentence pairs are automatically imported as example sentences in the corresponding languages. Unfortunately these would only cover two languages at a time, unless your translations to French and Spanish share the same original Portuguese sentence so that they could be aligned.
I had a similar idea with the Europarl3 corpus (which is not only aligned but also POS-tagged!): Import it (or at least the part of it that contains sentences that are translated in all of the eleven available languages) and make it searchable in the way described above. Of course that would require some programming efforts, but it would be a great aid and gradually lead to a parallel corpus in which the words are not only POS-tagged but also linked to their meanings! --dh 21:46, 18 December 2009 (UTC)
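A minimal sketch of the lookup described above, assuming the aligned sentences sit in a simple list of (English, Portuguese) pairs. The sample sentences, the `SENTENCE_PAIRS` name and the `find_pairs` helper are made up for illustration and are not part of OmegaWiki or of any imported corpus:

```python
import re

# Hypothetical stand-in for the "separate table" of aligned sentence pairs.
# Each entry is (English sentence, Portuguese sentence); the data is invented.
SENTENCE_PAIRS = [
    ("The cat sleeps.", "O gato dorme."),
    ("I drink tea.", "Eu bebo chá."),
    ("The tea is hot.", "O chá está quente."),
]


def find_pairs(word, pairs, side=0):
    """Return the pairs whose sentence on `side` contains `word` as a whole word.

    side=0 searches the English sentence, side=1 the Portuguese one.
    """
    pattern = re.compile(r"\b" + re.escape(word) + r"\b", re.IGNORECASE)
    return [pair for pair in pairs if pattern.search(pair[side])]


if __name__ == "__main__":
    for english, portuguese in find_pairs("tea", SENTENCE_PAIRS):
        print(english, "|", portuguese)
```

Matching on whole words keeps "at" from matching "cat"; a real import would also need lemmatisation so that inflected forms are found, which this sketch does not attempt.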

Old Spellings[edit]

Dh recently created this DM (DefinedMeaning:alte_deutsche_Schreibweise_(1154013)) and added it to Expression:lexical item as an annotation. It is meant to allow us to add German spellings that were correct prior to the orthography reform of 1996. For example see this: DefinedMeaning:strenggenommen (1153979). I see several problems with this:

1) Not everybody is going to read the annotations, so for a casual reader all given spellings seem correct.
2) There were 2 other German spelling reforms (1876 and 1901). How do we incorporate them? What about spelling reforms in other languages? French had one in 1990 for example.
3) The translations of DefinedMeaning:alte_deutsche_Schreibweise_(1154013) aren't very clear. "Old spelling" can refer to any old spelling of any language.

I thought about the problem of old spellings before because I want to add them as well (:-). But I haven't really found a solution. Maybe we could add them as languages, the way we do it with regional language variants (French Canada, German Austria etc.) and different writing systems (Serbian Latin and Serbian Cyrillic). --Tosca 12:05, 1 January 2010 (UTC)

@1 Agreed, right now it is not obvious enough, but this is not really a problem of how it is annotated but of how annotations are displayed by the software and actually is true for other annotations as well. The important thing is that it is annotated, not how it is displayed. The latter can and hopefully will be improved in the future.
@2 The DM right now explicitly states that the reform of 1996 is meant, and if you want you can create DMs that refer to the reforms of 1876 and 1901. The problem here seems not to be how to incorporate them but how to distinguish them when annotating a word. This could be done by changing the translation from, for example, "alte Schreibweise" to "alte Schreibweise (vor 1996)". With regard to reforms in other languages I can't see the problem, since the one I created only shows up for German words, and if you or anybody else wants to add another one, a DM referring to the given reform in the given language can easily be created and added to the usage option list for the given language. This way all reforms in all languages can be incorporated without clogging the option menu, since only the reforms relevant for the language at hand will show up.
@3 True, just like other expressions can mean several things. That's what the DMs are for, and the one I created makes abundantly clear what exactly is meant. The only problem here is the same as in 2) and for a possible solution see there.
In regard to your last point, I would rather go the other direction: remove regional language variants from the language list and add them as annotation option on a per language basis. Right now they are not sufficient anyhow, for example there is no way to mark a word as Canadian or New Zealand English.
--dh 21:02, 1 January 2010 (UTC)
For me, "old spelling" is a bit different from the other annotations in that an old spelling of German does not form part of the current German language anymore.
At the moment, I am wondering about the two alternative options (alternative to using an annotation field):
  • create a new language with an explicit name: "German (old spelling 1996)". In this way, it is clear to the reader that it is an old spelling
  • or, do something in the software to have old spelling greyed out or in italic or something. This could be implemented with a tick box similar to "identical spelling", that would say "old spelling" and is unticked by default (or "current spelling" and ticked by default). In this case, the year of the reform would need to be indicated somewhere else. --Kipcool 09:09, 7 January 2010 (UTC)
N.B.: for French, the reform of 1990 is different because at the moment, the two spellings are still considered valid (so a standard annotation is enough). However, there were some real reforms in the 19th century.
Is it true that "old spelling of German does not form a part of the German language anymore"? And what does that mean? There are thousands if not millions of texts out there that use these words and the same is true for spellings prior to other, older reforms. --04:44, 8 January 2010 (UTC)
In regard to how to insert or mark them (annotation versus new language) I'd still prefer the annotation method since it is not really a new language, but I don't think that a default field should be added since it would be unused most of the time, and I would opt for a way to only show values ("old spelling", "colloquial" etc.) that are actually used for a word (and while we are at it, the "identical spelling" field should go as well and be replaced by indicating something like "near synonym" (or even sorting them under their own heading) for expressions that now have this box unticked). This could be achieved by creating abbreviations for each entry (for example "coll." for colloquial) and showing them in parentheses, maybe italicized and/or smaller, next to (after) the words in question. Something similar should be done with the POS, but its indication probably should precede the words (or better yet, sort expressions by POS so that for example we have a heading for nouns which lists all noun forms of an expression of a certain language). --dh 04:44, 8 January 2010 (UTC)
Is it true that "old spelling of German does not form a part of the current German language anymore"? In the sense that using an old German spelling is considered as an error, yes (otherwise, tell my teacher about it). And when copying my sentence, you omitted "current" which is of importance here ;-). If you copy a text from Goethe or whoever else, and try to publish it now with your own name on it, the publisher will ask you to correct the spellings. --Kipcool 09:12, 8 January 2010 (UTC)
Many new spellings of the so-called "reform of 1996" (it was finally finalized in 2005 or 2006, and grossly modified in between, and was inconsistently used, adopted, and rejected in different parts of the states, and by different communities and/or publishers) are now "allowed as alternate spellings" while only a few new ones are mandatory, and some of the mandatory ones are nevertheless not widely used. The reality is also that German spellings were not at all constant in the times between reforms. E.g. Bureau → Büro (and similar), Telephon → Telefon (and similar), lots of dropped trailing 'e's (im Kriege → im Krieg) (many -e words changed their class from "normal" to "pathetic" speech) and many other changes occurred by and by during the 20th century. German spelling is very democratic, and dynamic, too. Authors write the way they want, and scientists report that in dictionaries, which have annually or biannually updated editions. Only schoolchildren and governmental offices are obliged to follow a certain spelling system. Thus, e.g. "Bureau" should be annotated as "majority spelling until the 1950's + regional" while "Büro" should be annotated as "minority spelling introduced in the 1930's (iirc) + regional + majority spelling since about 1960" while "Kontor" is only marked as "regional" and "Office" is marked "Neologism, since the 1990s + genre-specific foreign word since the 1960s" and all four are synonymous. Agreed, that is a lot of detail, but it has to be as it is, imho. -- Purodha Blissenbach 13:59, 14 September 2010 (UTC)

FYI: grammatical properties[edit]

I've created DefinedMeaning:predicative_(1154701) & DefinedMeaning:attributive (1154724) and added them as grammatical property annotation to the lexical item class. Criticism and praise are welcome. --dh 18:53, 2 January 2010 (UTC) For an example, see the expression "legion" at DefinedMeaning:zahlreich (495137). --dh 19:28, 2 January 2010 (UTC)

Nice! --Kipcool 08:54, 7 January 2010 (UTC)
Nice. Sometime in the near or far future we should probably write a "Guide to Omegawiki" in order to document all features and annotations and their usage. There we could note that this annotation is only needed for adjectives that cannot be used attributively + predicatively. --Tosca 16:17, 10 January 2010 (UTC)
A functionality needed in this respect would be to only display annotations that are applicable for a given POS, so that for example predicative and attributive only show up as annotation option for words that are identified as adjectives. And in addition, one should only be able to select one of two for property pairs that exclude each other, like in this case. --dh 20:56, 13 January 2010 (UTC)
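A rough sketch of the two rules proposed above: only offer annotation options that apply to the word's part of speech, and reject selections that contain both members of a mutually exclusive pair. The tables and function names are hypothetical, and the noun options are invented for illustration; none of this reflects the actual OmegaWiki code:

```python
# Hypothetical configuration: annotation options applicable to each part of speech.
ANNOTATIONS_BY_POS = {
    "adjective": {"predicative", "attributive"},
    "noun": {"countable", "uncountable"},  # invented example options
}

# Hypothetical list of property pairs that exclude each other.
EXCLUSIVE_PAIRS = [frozenset({"predicative", "attributive"})]


def options_for(pos):
    """Return the annotation options to display for a given part of speech."""
    return sorted(ANNOTATIONS_BY_POS.get(pos, set()))


def validate(pos, selected):
    """Return a list of error messages for a proposed selection (empty if valid)."""
    chosen = set(selected)
    allowed = ANNOTATIONS_BY_POS.get(pos, set())
    errors = []
    # Rule 1: every selected option must be applicable to this part of speech.
    for name in sorted(chosen - allowed):
        errors.append(f"'{name}' is not applicable to {pos}")
    # Rule 2: at most one member of each exclusive pair may be selected.
    for pair in EXCLUSIVE_PAIRS:
        if pair <= chosen:
            errors.append("mutually exclusive: " + " / ".join(sorted(pair)))
    return errors


if __name__ == "__main__":
    print(options_for("adjective"))
    print(validate("adjective", ["predicative", "attributive"]))
```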

Topic[edit]

As some may have noted, I've created the annotation option "topic" on DM level for all lexical items. Unfortunately it does not work and I just can't figure out why. Any suggestions are welcome. --dh 19:28, 2 January 2010 (UTC)

I also cannot find the source of the problem. I'll have a look into the code and the sql queries involved (some time before Sunday evening). --Kipcool 08:52, 7 January 2010 (UTC)
Much appreciated, thanks! --dh 05:01, 8 January 2010 (UTC)
There was a bug in the software. It works now :-) --Kipcool 15:38, 10 January 2010 (UTC)
Well done, exterminateur! --dh 19:39, 12 January 2010 (UTC)

Etymology (2)[edit]

As Kipcool noted, it would be good to have links to original expressions in the etymology fields. Now, since this requires a software change and it does not seem that this is going to happen in the near future, a temporary workaround could be to create an additional etymology annotation for simple etymologies like for example the one for the expression "Sekretär" DefinedMeaning:escritoire_(1128186). Right now, the etymology field says: "Von lateinisch "secretarius" (= "Geheimschreiber").", and this could be easily changed to a relational annotation. All that needs to be done is to create a DM describing the relation and insert it as a relational class attribute to "lexical item". And in case a better solution for etymologies is found/created, it should be trivial to convert them. --dh 00:25, 7 January 2010 (UTC)

I appreciate your effort, but I'm kind of wary of this kind of workaround. This would be an abuse of what a DM is supposed to be. And I also think of the work to undo all this once we have a better solution. I'm not a programmer, but is it really so difficult to allow wikification in the input boxes? --Tosca 16:29, 10 January 2010 (UTC)
I also don't think we need this workaround right now, as it is not really urgent to have this (in the sense that there will not be many people using it, so it can wait).
Allowing wikification should not be too complicated. I am just wondering if there would be bad consequences (why has it not been implemented in the first place?) --Kipcool 07:48, 11 January 2010 (UTC)
If I remember correctly, Gerard was opposed to it because he wanted definitions to stand on their own and not rely on links to other definitions. Maybe there were also other reasons that I can't recall right now. --Tosca 15:57, 11 January 2010 (UTC)

Identical meaning[edit]

So that it is clear: the field "identical meaning" has been implemented for, and should be used only, when there is no exact translation of a DM in a given language. If there is an exact translation, then there should be no approximate translations.

For example, in Expression:Latte macchiato, "Milchkaffee" should not be in the list. If we want to make it appear somewhere it would be under a section "see also" or something similar.

Maybe this is not clear from the help pages, because I've seen many DM having exact translations and approximate translations for a given language (from several users). If this is a hint that a "see also" feature is wanted, then we should think about how to implement it. --Kipcool 07:58, 11 January 2010 (UTC)

Good point. I really need to write this Guide to Omegawiki, there's another thing to mention. :-) --Tosca 15:59, 11 January 2010 (UTC)
Oh, ok. I was the one that added "Milchkaffee" and if the way it is added to the DM right now is not the right one then there should indeed be another way to do this since "Milchkaffee" can in fact mean a "Latte macchiato" (maybe it's rather a hypernym?). I think what we need is a "near synonym" section (this is what EuroWordNet calls it), as well as some other relations like "xpos synonym" (indicating a synonymous relationship across different POS), "has subevent"/"is subevent of", "causes"/"is cause of" etc. (See User:Dh#Relations) --dh 09:36, 12 January 2010 (UTC)
And I don't think we should have relationships like "see also" since it is not specific and does not describe the kind of relationship. --dh 12:34, 12 January 2010 (UTC)
Actually, now that I think about it, isn't "identical meaning" synonymous with "synonym"? If so, then unticking the box means it is not synonym but "near synonym" . My point is, there isn't actually a difference and it does not really matter if an identical meaning exists or not. Hence, what we call "identical meaning" means "synonym" and unticking the box means "near synonym". No? --dh 13:12, 12 January 2010 (UTC)
You're right that "identical meaning" could be used to indicate a near synonym. But I don't think that we should do it, because it dilutes the DM. If we add Expression:Milchkaffee, Expression:Kaffee and other Expressions that are vaguely synonymous it will become increasingly difficult to spot the correct translation under all the stuff that accumulates. Imagine if there are near synonyms for all languages, this would easily double the number of entries in the SynTrans! And if someone searches for Expression:Latte macchiato he will not only find the correct DM but also many other DMs he didn't really ask for. So IMHO a separate section for near synonyms or related words is a much better solution.
"See also" isn't specific, but that isn't necessarily a bad thing. We need to accept a certain level of fuzziness because that's what language is like. If we try to make relations and structures for everything, OmegaWiki may end up too complex to be useful. --Tosca 17:00, 12 January 2010 (UTC)
Well, "Kaffee" is clearly not a near-synonym of Expression:Latte macchiato but rather a hypernym, therefore the problem here seems not to be that there would be too many near synonyms that dilute the DM, but rather that we need to define what exactly constitutes a "near synonym", "hypernym"/"hyponym", "meronym"/"holonym" etc. and what does not. This is what the documentation pages should do. And regarding the number of entries in the SynTrans I am of the opinion that all non-identical meanings (however we use them) should either be automatically sorted at the end of the list, or, even better, go into a separate section so that one can clearly distinguish between exact translations and near synonyms. Apart from that, there should be a way to indicate the "best" or most common translation of a DM, a problem we've already discussed some time ago (regarding "Instrument" and "Musikinstrument").
Regarding "see also": Hmm, maybe you are right. It certainly is not useful to have too many relationships, though I think we should have a few basic ones (see my user page for examples) and I can't think of anything that isn't covered by them and would need a "see also" relation. Can you? --dh 19:29, 12 January 2010 (UTC)
The advantage of the web is the possibility of contextual help. How many of you read all the help and instructions of a forum, wiki etc. before contributing? And even if you do, do you really remember all that stuff? I'm not against adding what you want to the help, but I think there are much better methods. One is to block the possibility of an approximate synonym once one exact synonym already exists, adding an explanation (for instance a balloonhelp). Another possibility is to turn each field name into a hyperlink and, in this case, to add a hyperlink beside the checkbox, the link leading to an explanation of what is expected there. Instead of a page, it can be a balloonhelp.--Fiable.biz 17:41, 11 February 2010 (UTC)
I think we need precise relations, so we need many, not things like "see also". It'll be very important for automatic translation. But if these relations are referred to by uncommon words such as "antonym", "hyperonym", an explanation should be provided, for instance by turning these words into hyperlinks to their definitions.--Fiable.biz 17:41, 11 February 2010 (UTC)

Topic, classes, help pages and new features[edit]

Now that the topics are working, several points have to be made clear, and help pages should be written/updated accordingly:

  1. What is a topic, what is a class? (At the moment not very clear: several classes should be renamed as topics)
  2. What name do we give to topic? (I still like "semantic field" better, but it is only a name anyway).
  3. There are "is part of theme" relations from Gemet, what do we do with them? If we have both "is part of theme" and "topic", it becomes not very clear to the reader.
  4. Each topic in the list of topics should be discussed in a dedicated page.
  5. For example the page Topics exists, it is old and does not correspond to the current topics we have, it should be rewritten entirely I think.
  6. The page Class should also be rewritten because it is written as if classes and hypernyms are the same thing (but we have the two fields, so it should not be the same...).
  7. Each class in the list of classes should be discussed in a dedicated page. We have the problem that many classes have been added by individual users, and the others don't know what they are (and for many classes, I don't think they are relevant).
  8. etymologies are not documented
  9. usage field is not documented (also here there is a list of usage to discuss)

Please don't answer here (or very shortly please). I think it is better to start writing the help pages, and discuss it then (to have a basis to discuss). I can do it myself if I find enough energy and motivation in this cold grey winter of Munich. I have some ideas, involving reviewing and updating all the help pages.

If I forgot a point that is not clear, and should be in the help pages, please indicate it here. It would be good that we stop adding new features until the many new features that we have since Dh arrived are documented. --Kipcool 09:00, 11 January 2010 (UTC)

You are right, and I've already started User:Dh#USAGE and User:Dh#TOPIC, though it's rather a listing than a documentation, but maybe it can be of use.
Just some thoughts on the points you've made:
2. I've already made my standpoint clear: As I see it "topic" is both correct (Wordnet for example calls it this way too) and comprehensible (for non-linguists it probably makes more sense than "semantic field")
3. It seems that the "is part of theme" relationship from Gemet is rather unspecific and sometimes it's what we call "topic" (for) now, other times it's a hypernym and the rest are relationships we do not have yet (but probably should introduce, see above). Therefore they should probably be (automatically) deleted.
6. Sometimes "hypernym" and class are indeed the same and in these cases the DM in question should be indicated both as "hypernym" and as class.
--dh 09:53, 12 January 2010 (UTC)
I just wanted to start a top-level help page (with an overview and/or table of contents), but the "Help" link at the top of the page links to the main page and can't be edited. Could someone please change this or tell me where to start? Thanks. --dh 19:35, 12 January 2010 (UTC)
I don't know which link you are talking about. If it is a redirect, the way to correct it is: when you click on it, it says, in small letters "redirect from XXX", and then you can click on XXX and edit the redirect, change it, or delete it or write a page.
For where to start, I have created the ugly page Help:Index with links to the help pages I could find. All help pages should begin with "Help:" which is a dedicated namespace.
I have also put two links to Help:Index on the mainpage. Probably it would be good to have a link on the left panel as well (can do later). --Kipcool 21:43, 12 January 2010 (UTC)
The skin I use, "Kölnisch Blau" ("Cologne Blue"?) displays a "Hilfe" (help) link at the top of the page, and this one linked/redirected to the main page. But now it works as expected, so thanks and never mind... --dh 20:47, 13 January 2010 (UTC)

Proposal: Get rid of the text in DM names[edit]

Very often the name of the DM has become obsolete, for example here: DefinedMeaning:element_of_group_II_(alkaline_earth_metals)_(1131). But we can't rename it and so it's very confusing for users. The text doesn't really serve a purpose anyway because you may enter anything you like, only the DM number is important. Example: DefinedMeaning:randomtext_(1131). (The link appears red but you will still be taken to the same DM). Thoughts?--Tosca

I absolutely agree and have already thought about it myself. IMO we should convert the DM URLs to use Universally Unique Identifiers (UUIDs) instead and maybe even turn the URLs into URIs as used by the semantic web. For example, http://www.omegawiki.org/DefinedMeaning:infiltration_(1930) would become something like http://www.omegawiki.org/DefinedMeaning/4d80f288-cdbd-4806-a09a-98797a386891. Later the software could then be altered to return either an HTML page readable by humans if the request comes from a browser or an RDF document readable by computers if the request comes from an RDF-aware application.
For DMs we would end up having URIs like:

And for Expressions:

Though eventually it might be necessary to convert Expressions to numbers or UUIDs as well, since I suspect that we need to treat Expressions as separate entities to distinguish and/or group together words with the same or different annotation properties on expression level. For example, the German word "Tee" needs two identifiers or pages, which have different etymologies, genders, plurals etc., but each of the two has several meanings that share all annotation properties on the expression level, like IPA, etymology, gender etc. See [[4]]
--dh 09:41, 17 January 2010 (UTC)
I agree with having the DM number only. I don't really see the advantage of UUIDs: a DM number is shorter, easier to remember, and also corresponds to the DM number in the database, which makes it easier to debug, and which ensures that this number is unique and will not be changed.
I know the place in the php code where the DM title is generated. I can fix it quickly, but in that case, the DM pages with numbers only will appear red for some time, until I figure out the best way to delete the current pages (of the form DefinedMeaning:element_of_group_II_(alkaline_earth_metals)_(1131)) and create the new pages (of the form DefinedMeaning:(1131)).
I need to know whether this would bother anyone. --Kipcool 17:03, 17 January 2010 (UTC)
I for my part am willing to take whatever it takes to get rid of the names, though I am not really sure what you are talking about...
Regarding UUIDs: Well, UUIDs are maybe not really necessary, though I think we really should consider making OW semantic web compatible, at least in the long run. And for that we probably would need to get rid of the colon that separates the namespace from the DM number or expression. That is, "http://www.omegawiki.org/DefinedMeaning:(1131)" needs to become "http://www.omegawiki.org/DefinedMeaning/1131". This can then be abbreviated to DefinedMeaning:1131 or DM:1131 (depending on which alias is given for "http://www.omegawiki.org/DefinedMeaning/") in RDF documents or within OW. For an example and explanation of what I am trying to say, take a look at the Geospecies website, where it says:
For instance, the Universal Resource Locator or URI for the mosquito Aedes vexans is:
http://lod.geospecies.org/ses/4XSQO
Your browser will automatically resolve to this page:
http://lod.geospecies.org/ses/4XSQO.html
A semantic web crawler will resolve to this page:
http://lod.geospecies.org/ses/4XSQO.rdf
--dh 18:12, 17 January 2010 (UTC)
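The behaviour quoted from Geospecies is ordinary HTTP content negotiation. As a minimal sketch, assuming the server only looks for the RDF/XML media type in the request's Accept header (the function name is hypothetical, and a real server would also honour q-values and other RDF serialisations):

```python
def choose_representation(resource_uri, accept_header):
    """Pick the document to serve for a resource URI.

    Simplified assumption: clients that explicitly accept application/rdf+xml
    get the .rdf document; everyone else (e.g. browsers) gets the .html page.
    """
    if "application/rdf+xml" in accept_header.lower():
        return resource_uri + ".rdf"
    return resource_uri + ".html"


if __name__ == "__main__":
    uri = "http://lod.geospecies.org/ses/4XSQO"
    print(choose_representation(uri, "text/html,application/xhtml+xml"))
    print(choose_representation(uri, "application/rdf+xml"))
```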


In the English Gemet RDF file, the concept "abandoned industrial site" has the URI "http://www.eionet.europa.eu/gemet/concept/7"

I am talking about: "DefinedMeaning/4d80f288-cdbd-4806-a09a-98797a386891" is not nice (for me as a human), "DefinedMeaning/1131" is simple and enough. Though, it will be DefinedMeaning:(1131) for the time being, because the ":" is a requirement of MediaWiki, so I cannot get rid of it so easily. As for the parentheses, I don't know, maybe I can get rid of them. It depends on the code... --Kipcool 18:49, 17 January 2010 (UTC)
I think UUIDs, URIs etc would be a little overkill and not easy to implement. The DM number should suffice. How does that number work incidentally? We have 42941 DMs right now, but the most recent DM I created has the number 1159790. --Tosca 20:37, 19 January 2010 (UTC)
I think the UMLS and Swiss-prot are also in there, but they are not counted in the statistics. --Kipcool 21:22, 19 January 2010 (UTC)
No, I am wrong: apparently, there is a number that is incremented when a new expression is created, or a new DM, or a new relation, or whatever else. It is stored in a table called "uw_objects", and this table already stores a UUID. But still, I prefer to display the number. --Kipcool 21:30, 19 January 2010 (UTC)
FYI: mapping an (external) URI such as http://omegawiki.org/DM/1131.html to a URL such as http://omegawiki.org/DefinedMeaning:(1131) can most easily be done at the OW web server level (but it inhibits having a "real" wikipage DM/1131.html, which is of little concern unless we make stupid choices). So fiddling around with brackets and/or the colon is not worth much labor. --Purodha Blissenbach 14:43, 14 September 2010 (UTC)
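To illustrate the two points made in this thread (only the trailing number of a DM title matters, and an external path can be mapped onto the existing page title), here is a small sketch. It only shows the string mapping; on a live site this would more likely be a rewrite rule in the web server configuration, and both function names are hypothetical:

```python
import re

# Trailing "(digits)" at the end of a DM page title, e.g. "..._(1131)".
_DM_ID_IN_TITLE = re.compile(r"\((\d+)\)$")

# External path of the form /DM/1131 or /DM/1131.html (assumed URL scheme).
_EXTERNAL_PATH = re.compile(r"^/DM/(\d+)(?:\.html)?$")


def dm_id_from_title(title):
    """Extract the DM number from a page title, ignoring the descriptive text."""
    match = _DM_ID_IN_TITLE.search(title)
    return int(match.group(1)) if match else None


def to_wiki_path(external_path):
    """Map an external path such as /DM/1131.html to /DefinedMeaning:(1131)."""
    match = _EXTERNAL_PATH.match(external_path)
    if match is None:
        return None
    return f"/DefinedMeaning:({match.group(1)})"


if __name__ == "__main__":
    print(dm_id_from_title("DefinedMeaning:randomtext_(1131)"))
    print(to_wiki_path("/DM/1131.html"))
```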

Lighter pages (update of today)[edit]

Today I did a bit of programming on OmegaWiki.

The visible part is that the ids in the html pages are smaller, resulting in lighter pages (and possibly faster editing). I have measured that pages, when editing a DM or an expression, are up to 50% lighter (the bigger the page, the better the percentage).

By doing it, I also cleaned the code (adding some global variables instead of hard-coded strings), touching several PHP files. It is possible that it results in some functionality not working anymore (e.g. impossibility to add whatever attribute?). I have tested it on my local system before committing, and solved all the problems I noticed, but if you notice something, please advise me so that I can fix it. --Kipcool 19:01, 17 January 2010 (UTC)

I'm not sure if this is related, but I tried to link DefinedMeaning:lysergic acid diethylamide (1159424) (OW) to DefinedMeaning:Lysergic acid diethylamide (51205) (UMLS) and failed since no expressions to choose from show up for the UMLS dataset. --dh 07:09, 19 January 2010 (UTC)
It was related. Bug fixed. --Kipcool 18:35, 19 January 2010 (UTC)

Romanizations

We can implement romanizations in OmegaWiki as attributes. However, there are several competing romanization systems for each non-Latin script, and it has to be decided which systems we are going to use.

There are several ISO norms for romanizations: [5]. There are also many other romanization systems which are not ISO norms.

Among these languages, I know a bit of Mandarin and Japanese. So, I can discuss them. For the other languages, I have no clue.

For Mandarin Chinese, it is clear that the ISO norm, i.e. pinyin, is to be used. It is what is in the books when we learn the language and what appears in almost all dictionaries; it is the most used system to write Chinese on a computer, and it is even taught to Chinese pupils at school. If there is no opposition, I'll simply implement pinyin.

For Japanese, the solution is not easy. The ISO norm is the Kunrei-shiki romanization [6]. However, the most widely used seems to be the Hepburn romanization [7]. It is the one that is used in my books at home. So I have a doubt:

  • should we use the Kunrei-shiki only, because it is ISO
  • should we use the Hepburn only, because it is the most widely used
  • or should we implement both?

If you know about other languages/scripts that need romanization, such as Cyrillic, Arabic, Hebrew, Greek, Georgian, Armenian, Thai, Korean, Indic scripts, you can help by answering the following question: "Should we use the ISO norm, or another system, or several systems?". --Kipcool 09:11, 19 January 2010 (UTC)

Though I do not know anything about this topic and thus can't really say anything about it, do the different romanization systems allow translation/transcription among each other, that is, is it possible to get from the correct transcription of a given non-Latin script word in one system to the correct transcription in another system? If so, a fourth solution would be to implement one, for example the easiest or most widely used, and add some code that transcribes it automatically to the other system. (Something like this is possible for pronunciation systems, at least for Arpabet to IPA). --dh 11:39, 19 January 2010 (UTC)
No, there is alas not always a one to one relationship, at least for the Japanese transcriptions I mentioned above. --Kipcool 12:10, 19 January 2010 (UTC)
Chinese seems to be a fairly straightforward case: Pinyin. For Japanese I would prefer Hepburn, because I have never actually seen Kunrei used anywhere. According to Wikipedia not even the Japanese government uses it... Cyrillic looks very complicated, I'm not sure which system is most widely used. --Tosca 20:52, 19 January 2010 (UTC)
In the blog, someone asked about the following additional transcriptions:
  • Hiragana for words in kanji (i.e. Japanese words imported from Chinese) : I think we have to implement it, it is widely used, even in Japan.
  • Bopomofo : interesting, we could implement it if a contributor is interested.
  • fully vowelled versions of Arabic and Hebrew : I don't know much about it. --Kipcool

Although it's not widely used, I think the best choice for Cyrillic is ISO 9. Otherwise, we will have to deal with problems such as one Russian word, borrowed by other languages using Cyrillic, being romanized differently depending on the language which borrowed it. Note that ISO 9 is a one-to-one transliteration, so that, from the romanized word, you can get back the original form. Romanization of Cyrillic can be automated easily: don't ask us to type the romanized form. There is no consensus about the romanization of Mongolian Cyrillic words. Mongolian is a special case because it currently uses 2 non-Latin scripts, whose transliterations into Latin characters will be different. I don't know of any standard romanization of the Uighur-Mongolian script. The main question is: "Is romanization a so important feature?". I think dealing with inflections is a much more urgent task. --Fiable.biz 09:36, 14 February 2010 (UTC)
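
The point about ISO 9 being one-to-one, and therefore automatable and reversible, can be sketched as follows. This is a minimal illustration in Python, assuming only a partial table of Russian lowercase letters from ISO 9; the names `romanize` and `deromanize` are hypothetical and nothing here reflects code that exists in OmegaWiki.

```python
import unicodedata

# Partial ISO 9 table (Russian lowercase letters only; a subset for illustration).
_RAW = {
    "а": "a", "б": "b", "в": "v", "г": "g", "д": "d", "е": "e",
    "ж": "ž", "з": "z", "и": "i", "к": "k", "л": "l", "м": "m",
    "н": "n", "о": "o", "п": "p", "р": "r", "с": "s", "т": "t",
    "у": "u", "ф": "f", "х": "h", "ц": "c", "ч": "č", "ш": "š",
    "щ": "ŝ", "ы": "y", "э": "è", "ю": "û", "я": "â",
}
# Normalize so that every Latin value is a single precomposed character.
ISO9 = {cyr: unicodedata.normalize("NFC", lat) for cyr, lat in _RAW.items()}
# Because the table is one-to-one, it can simply be inverted.
ISO9_REVERSE = {lat: cyr for cyr, lat in ISO9.items()}


def romanize(word):
    """Transliterate character by character; unknown characters pass through unchanged."""
    return "".join(ISO9.get(ch, ch) for ch in word)


def deromanize(word):
    """Recover the Cyrillic form from a romanized word produced by romanize()."""
    word = unicodedata.normalize("NFC", word)
    return "".join(ISO9_REVERSE.get(ch, ch) for ch in word)
```

With a one-to-one table, `deromanize(romanize(w))` gives back `w` for any word made of letters in the table, which is the reversibility property mentioned above. Systems such as Hepburn do not have this property, so the same trick would not work for them.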

The answer to "Is romanization a so important feature?" is yes for all people who speak Mandarin or Japanese: they cannot do without it. Now, the feature is there anyway, and it's up to the users to add new romanizations (i.e. no more programming is needed in this direction). --Kipcool 12:21, 14 February 2010 (UTC)
Romanization is only a specific aspect of transliteration as defined on the CLDR page on transliterations, which gives a good overview. There are varying transliteration systems matching varying needs. Thus I think we should be able to accommodate them all. Some available and/or applicable transliterations can be automatically derived from script, others are language (or group) dependent, and of course, editors should be able to individually add ones per language in a manner similar to other attributes. There are implementations of many transliteration systems available, thus we should 'import' those into OmegaWiki, and have the transliterated strings added automatically or semiautomatically, where possible. --Purodha Blissenbach 15:23, 14 September 2010 (UTC)

Implementation

When trying to implement pinyin, I discovered that we don't have a way to indicate that an annotation is for a specific language only (apart from options in an option list, but this would not work for romanizations). Or did I miss something?

So, it needs a bit of programming. I am considering doing something similar to what we have in "lexical item" (class attributes), but I'd do that on the language pages (e.g. Expression:Mandarin (simplified), I have put pinyin attribute there, but it does not work yet), where we would specify language specific attributes. Is there a problem with that? --Kipcool 17:40, 23 January 2010 (UTC)

Done. There is now the possibility to add a pinyin text attribute, only for Chinese words. --Kipcool 19:03, 24 January 2010 (UTC)

IPA

What exactly do the "/" at the beginning and the end of IPA transcriptions mean and which rule do or should we have for their usage? --dh 11:56, 19 January 2010 (UTC)

I am not sure... Personally, I don't put any "/" when entering an IPA in OW. But I'd be interested to know if this is wrong. --Kipcool 12:16, 19 January 2010 (UTC)
It's complicated. There is Expression:phonetic transcription and there is Expression:phonemic transcription. [] is used for phonetic transcriptions which is more detailed. For example, phonetic transcription notes a difference between clear l and dark l (lip vs. hill). Phonemic transcription uses // and notes only phonemes. In English it doesn't change the meaning of a word whether you use a clear l or a dark l, so phonemic transcription notes both variations of the phoneme /l/ (= allophones) as "l".
Phonetic transcription can also be narrow (= detailed):[ˈpʰɹ̥ʷɛʔt.sɫ̩] or broad (=less detailed): [ˈpʰɹɛt.sɫ̩]. (The transcribed word here is Expression:pretzel)
So the brackets should always be included to know what type of transcription it is. --15:43, 19 January 2010 (UTC)
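
The convention explained above (slashes for phonemic, square brackets for phonetic transcriptions) can be checked mechanically. The following is a minimal sketch in Python; the function name `transcription_type` and the label "unmarked" for strings without delimiters are assumptions made for illustration.

```python
def transcription_type(ipa):
    """Classify an IPA string by its delimiters: /.../ phonemic, [...] phonetic."""
    text = ipa.strip()
    if len(text) >= 2 and text[0] == "/" and text[-1] == "/":
        return "phonemic"
    if len(text) >= 2 and text[0] == "[" and text[-1] == "]":
        return "phonetic"
    # No recognizable delimiters: the type cannot be told from the string alone.
    return "unmarked"
```

A check like this could flag existing entries that were stored without delimiters, so that an editor can decide which kind of transcription they are.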
Oh dear... Seems I've done it wrong up until now. Thanks for enlightening me! --dh 18:03, 19 January 2010 (UTC)

Is "ˈliːbəsˌkʊmɐ" (Expression:Liebeskummer) a phonemic or a phonetic transcription? --dh 17:38, 21 January 2010 (UTC)

Looks like it was taken from here, so it's phonetic. I'll add the brackets. --Tosca 14:45, 24 January 2010 (UTC)
OK, thanks. --dh 17:52, 24 January 2010 (UTC)

Comboboxes

I repaired a bug in some comboboxes, where it would not work when you type in a letter (mostly for attributes). Tell me if you still see some similar bug. --Kipcool 17:43, 23 January 2010 (UTC)

Great, thanks. This was an annoying little bug (and I thought I was the only one who, out of impatience or carelessness, types letters into those boxes...)

Language specific annotations, und so weiter

It is a bit hidden above, so I announce it here and develop:

  • it is now possible to have language specific annotations, i.e. annotations that will only appear for a word (syntrans) of a given language. For example, pinyin should only be available for words in Mandarin, and "Hepburn romanization" for words in Japanese.
  • This feature is not limited to pronunciation, though I am not sure what else we can do with it. For example we could add links to authoritative or public domain online dictionaries, for attestation purpose. If the dictionary is English, the link should only appear for expressions in English.
  • On a related aspect, I now see a way to implement annotations dependent on other annotations. For example, "transitive" and "intransitive" should only appear for verbs. The solution would be to have "transitive" and "intransitive" as an option list (the annotation could be called something like "verb property" or "transitivity") in DefinedMeaning:verb (6100) instead of having it in DefinedMeaning:lexical item (402295). Does this look reasonable? Question with this: do we need annotation that depend on a combination of two annotations? If yes, do you have an example please? --Kipcool 21:33, 25 January 2010 (UTC)
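
The two kinds of conditional annotations described in the list above (tied to a language, or tied to another annotation such as the part of speech) can be sketched as simple lookups. This is a minimal illustration in Python and not the actual OmegaWiki implementation; the registry names, the annotation name "transitivity" and the function `available_annotations` are assumptions.

```python
# Hypothetical registry: annotation name -> languages for which it is offered.
LANGUAGE_ANNOTATIONS = {
    "pinyin": {"Mandarin (simplified)"},
    "Hepburn romanization": {"Japanese"},
}

# Hypothetical registry: annotation name -> parts of speech for which it is offered.
POS_ANNOTATIONS = {
    "transitivity": {"verb"},
}


def available_annotations(language, part_of_speech=None):
    """Return the sorted names of the conditional annotations offered for one syntrans."""
    names = [name for name, langs in LANGUAGE_ANNOTATIONS.items() if language in langs]
    if part_of_speech is not None:
        names += [name for name, kinds in POS_ANNOTATIONS.items() if part_of_speech in kinds]
    return sorted(names)
```

For example, `available_annotations("Japanese", "verb")` offers both the Hepburn romanization and the transitivity option list, while an English noun gets neither. An annotation depending on a combination of two annotations would need a key made of both values, which this sketch does not cover.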
Thank you, it works great! --Tosca 20:53, 26 January 2010 (UTC)
Are there grammatical properties depending on combinations of annotations? I believe so, but I am not certain at the moment.
Imho, (in)transitivity of verbs is a formally poor concept. I'd suggest to rather add a set or ordered list (depending on language) of the grammatical cases that a verb commands, for languages having the like. For example, the German verb "spielen" in the sense of "playing a musical instrument" can be used in these ways:
see: User:Purodha/Grammar of a German verb "spielen" like 'play a musical instrument'
Note that other meanings of "spielen", e.g. "act, play a role", have somewhat differing grammars, also in part depending on (some of) their objects' grammatical classes. Though both mentioned instances of the verb are called transitive, their grammatical use is different, so this term obviously does not convey the precision needed, e.g. for the syntactical and grammatical analysis and composition of texts required in machine-aided translations of some decency.
I know, this is talking about the far future. But we should not bar such applications now by poor or superficial or hasty design decisions. --Purodha Blissenbach 20:51, 14 September 2010 (UTC)
Please move this table to some personal page of yours, you are just flooding the discussion page here. Thanks. --Kip 22:06, 14 September 2010 (UTC)
Done. When I started it, I did not at all expect it to be that long. --Purodha Blissenbach 18:57, 15 September 2010 (UTC)

Kurdish & Norwegian?

Kurdish and Norwegian do not turn up amongst the languages. But they are languages, aren't they? --dh 17:25, 27 January 2010 (UTC)

They are meta languages.
The two main Norwegian languages are Bokmål and Nynorsk (which we have in the list of languages). Cf. en.wikipedia. I understood that Bokmål is the more widely used of the two, so if you see somewhere a word that is identified as Norwegian, it is probably Bokmål.
I know less about Kurdish, but one of the Kurdish languages is Sorani, that we also have in the list of languages. --Kipcool 18:02, 27 January 2010 (UTC)
OK, thanks. --dh 09:35, 31 January 2010 (UTC)

Ancient Greeks

Question of renaming, dividing the language and the spelling to choose: see Portal_talk:grc.--Fiable.biz 10:44, 14 February 2010 (UTC)

Meta:Community

I am hesitating between:

  1. Replacing the International Beer Parlour link on the left with a link to Meta:Community
  2. And/or Adding Meta:Community as a header of all discussion pages.
  3. Or (alternatively to 2), having a panel on the right on each discussion page, linking to all other discussion pages. It would be similar to this.
Opinions, comments? --Kipcool 18:07, 27 January 2010 (UTC)
Did 3, as you may have noticed on the present page. --Kipcool 18:12, 30 January 2010 (UTC)
It's nice! --dh 09:34, 31 January 2010 (UTC)

Attributes dependent on other attributes

I think I know how to implement that. However, to be sure, I'd like to have a list of all interesting attributes (dependent on other attributes) that you can think of. Please complete "User:Kipcool/Attributes dependent on other attributes" or put them here on the beer parlour. Thanks. --Kipcool 18:13, 27 January 2010 (UTC)

Etymology: nice website

I've just found the website www.myetymology.com. I like the hierarchical presentation of the etymology (however it is not a wiki). I think we should aim at something similar in the long (long) run. In any case, it is a nice resource. --Kipcool 10:00, 29 January 2010 (UTC)

little changes

I coded some small functionalities:

  • the statistics page now shows the number of languages
  • the options comboboxes (for example the list of topics, or the list of PoS) are now sorted. I have seen that there are problems in French with "é", which appears after "z". I guess it will be the same for German. I have to discover the right sort function... In English there should be no problem ;-) --Kipcool 21:44, 2 February 2010 (UTC)
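
The sorting problem mentioned in the last bullet comes from comparing raw code points, where "é" is greater than "z". One common workaround is to sort on an accent-stripped key. The sketch below shows the idea in Python, only as an illustration; the OmegaWiki code is PHP and its actual fix may differ, and the function name `accent_insensitive_key` is an assumption. This is an approximation of proper locale-aware collation, not a replacement for it.

```python
import unicodedata


def accent_insensitive_key(text):
    """Sort key that ignores accents and case, with the original text as a tie-breaker."""
    decomposed = unicodedata.normalize("NFD", text)
    # Drop combining marks (accents) left over after decomposition.
    stripped = "".join(ch for ch in decomposed if not unicodedata.combining(ch))
    return (stripped.casefold(), text)


def sort_options(options):
    """Return the options in an order where accented letters sort next to their base letter."""
    return sorted(options, key=accent_insensitive_key)
```

With this key, a French option starting with "é" is placed among the entries starting with "e" instead of after "z".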

Customizing CSS

Hi, I have created a new page: Help:Customizing CSS. --Kipcool 18:39, 10 February 2010 (UTC)

New feature: adding multiple translations at once

Hi, I finally succeeded in trying to get this green "+" button to do something.

Now, when you click on it, it adds a new row. So, it is now possible to add multiple translations at once, without reloading the page each time. I give translations as an example, but it works also with all other fields having the "+" button: annotations, definitions, relations, etc. (except "new language" and "new exact meaning", but I don't think it is necessary).

For it to work, you might need to refresh your cache to reload the JavaScript (in Firefox : Shift + click on the refresh button, in IE : Control + F5 ?).

Of course, I've tested it as hard as I could, but there might be bugs. In that case, tell me. Thanks. --Kipcool 21:06, 15 February 2010 (UTC)

Yes! Thank you! It works great and it saves a lot of time. --Tosca 21:38, 15 February 2010 (UTC)
Fabulous! Splendid! You are the man! --dh 23:38, 15 February 2010 (UTC)
Congratulations - this had been on our dev wishlist for a long time. Really great to see all the improvements you've been making.--Eloquence 19:58, 16 February 2010 (UTC)

Finances

OmegaWiki's server has problems and was recently down for about 3 weeks. I'm not interested any more in spending time on a dictionary which is only sometimes available. I thought it would improve slowly, but it hasn't. I proposed to accept a reasonable amount of adverts to fund it. Kipcool wrote that he doesn't like adverts, and nobody else answered anything. My question is: "Has any measure been taken to improve the situation?". If nothing has been done nor is done soon, I will probably just give up, or, improbably, make a commercial fork. --Fiable.biz 15:58, 28 March 2010 (UTC)

Hoi, the problem was that our host shut down the servers without giving notice. Things got bad when they were not able to get the server back on line. We collected the system in the USA and got it to work. The hosting is now on a server provided to us by Erik Moeller. The downtime was really unfortunate but it is unlikely that this will repeat itself. Thanks, GerardM 13:49, 29 March 2010 (UTC)
It's good to have OmegaWiki back! Thanks for the work put into it! --dh 07:49, 30 March 2010 (UTC)
The server seems to work better than the previous one, thank you. But the usage statistics, which used to work before the change, have not been updated since March. --Fiable.biz 07:44, 18 July 2010 (UTC)

Strange Babel categories

I have heb-5 in my Babel code and i see myself in Category:User he.

User:Dovi also has heb-5, but he doesn't appear in Category:User he.

User:Nadav has heb-N and he also doesn't appear in Category:User he.

What's wrong? --Amir E. Aharoni 13:58, 3 April 2010 (UTC)

It is a problem with the categories not being updated. I am not sure why it happens, but I know how to solve it: with a null edit.
So, I edited the page of the users you mention, and saved the page without changing anything, and now they appear in the category. --Kipcool 15:41, 3 April 2010 (UTC)
Thanks. Now i see that there's also Category:User heb and i don't appear there.
I made a null edit to User:Drork and he moved from Category:User heb to Category:User he.
It probably has something to do with migration from two-letter codes to three-letter codes, but it looks like it's going in the wrong direction. --Amir E. Aharoni 15:53, 3 April 2010 (UTC)
I think it is due to the introduction of the Babel extension. Before that, we defined the classes locally, with three-letter codes. Now, the classes are defined by the extension, i.e. are the same as in Wikipedia (which preferably uses 2-letter codes when available). So, what you see is "normal", or at least it's not our fault ;-) --Kipcool 18:26, 3 April 2010 (UTC)
So under which extension should i report it on Bugzilla - OmegaWiki or Babel?
I want to avoid fingerpointing preemptively :) --Amir E. Aharoni 18:29, 3 April 2010 (UTC)
Hmm, now I am not sure what you want to report ;-)
  • If you want to report that the users do not appear in Category:User he, i.e. that a null edit is needed for the user pages to update the classes, then this is a bug of OmegaWiki, but I'll probably be the one who fixes it, so you don't really have to report it.
  • if you want to report that 2-letter codes are used instead of 3-letter codes, then this should be reported to Babel, but I am not sure if they will consider it as a bug. --Kipcool 19:18, 3 April 2010 (UTC)
I'm not quite experienced at OmegaWiki, but if i understood correctly, this project wants to use three-letter codes for all languages, which is a good idea. If the Babel extension classifies users by two-letter codes, then it goes against the goal of migration to three-letter codes.
It should probably be possible to classify users both ways, and to be able to configure the Babel extension accordingly. Since i don't know much about extensions configuration, i don't know whether it's already possible to do that in the current version of Babel. If it's possible, then it's a bug for OmegaWiki. If it's impossible, then it's a feature request for Babel. --Amir E. Aharoni 19:25, 3 April 2010 (UTC)
Yes, it is a bit strange that Babel uses a mix of 2-letter and 3-letter codes. I don't think it is configurable [8]. I think you can file a feature request then. --Kipcool 21:49, 3 April 2010 (UTC)
Reported: https://bugzilla.wikimedia.org/show_bug.cgi?id=23034 (doesn't OmegaWiki support :bugzilla: links?).
Feel free to comment and correct me. --Amir E. Aharoni 22:04, 3 April 2010 (UTC)

Why isn't OmegaWiki a Wikimedia Foundation project?

Why isn't OmegaWiki a WMF project?

There is one possible answer here, but it is pretty vague and it has no links to the actual discussion about it.

On the other hand, at least some of the people here are closely related to WMF projects - Gerard, Erik, Dovi and others.

I'm not saying that OmegaWiki's being or not being a WMF project is good or bad; i am just saying that one would expect it to be a WMF project and it's not, so it's kinda confusing.

Can anyone write a few words about it in the FAQ? Thanks in advance. --Amir E. Aharoni 23:08, 3 April 2010 (UTC)

The easiest answer is "why would that be a WMF project?".
I tried to answer in the FAQ nevertheless. --Kipcool 14:35, 12 April 2010 (UTC)

Translation of the interface

I found mistakes in the translation of system messages to Hebrew. For example, the translation of "annotation" in DefinedMeaning pages is not quite correct.

Where can i correct it?

I couldn't find it here and i couldn't find it at translatewiki.net.

I think that it should be in the FAQ. --Amir E. Aharoni 12:35, 4 April 2010 (UTC)

It is at translatewiki.net, in the group named Wikidata. Link: [9]. When you correct it, please tell me so that I update the OW server with the new translations. --Kipcool 12:46, 4 April 2010 (UTC)
Thanks. I started correcting existing translations and adding missing ones.
Can you please explain to me the meaning of "Concept mapping"? I need to understand it before i can translate it. --Amir E. Aharoni 13:17, 4 April 2010 (UTC)
We have 3 databases coexisting at omegawiki.org : the community database, which we can edit, and two others: UMLS and Swiss-Prot.
When a concept of the community database is the same as the concept in for example UMLS, we do what we call a "concept mapping" to indicate that the two concepts of these two databases are the same thing (e.g. DefinedMeaning:AIDS (105)).
The term "Concept mapping" that you have to translate is the title of the page which allows one to define these mappings (not sure how this works, I don't use it). --Kipcool 13:41, 4 April 2010 (UTC)
KTHX. I overhauled the Hebrew translation and you may update the translation. --Amir E. Aharoni 14:06, 4 April 2010 (UTC)
It is updated. I hope it is ok. --Kipcool 17:56, 8 April 2010 (UTC)
Perfect. Thanks. --Amir E. Aharoni 12:50, 9 April 2010 (UTC)
I put a word in the FAQ. --Kipcool 14:57, 12 April 2010 (UTC)

OmegaWiki in Bugzilla

Why is OmegaWiki listed in Bugzilla as a MediaWiki extension? It is rather confusing. If i understand correctly, it is not a MediaWiki extension, but a separate project. It should be directly on this page. --Amir E. Aharoni 12:44, 4 April 2010 (UTC)

Well, from a "code" point of view, it is an extension: http://svn.wikimedia.org/viewvc/mediawiki/trunk/extensions/Wikidata/OmegaWiki/... --Kipcool 12:50, 4 April 2010 (UTC)
OK. Maybe to make it less confusing the page http://www.mediawiki.org/wiki/Extension:OmegaWiki can be created. --Amir E. Aharoni 12:54, 4 April 2010 (UTC)

"Usage"

If one looks at DefinedMeaning:outrage (1188476), it can be seen that I've annotated the translations/synonyms with "+ at" and "+ über", respectively, since these words are usually used in connection with the translation. Now, I am sure there is a name for such words and if so, we should probably add an extra annotation field for them. --dh 11:30, 9 April 2010 (UTC)

They are called "prepositions" (Präposition auf Deutsch).
I agree that it is nice to have them, however I am not sure how to enter such information in the database. One problem is that prepositions are not unique. For example "I write on a paper" (auf?), "I write to my sister" (an), "I write about war" (über). I think the best is to have an example sentence for each preposition, but the software is not really good for this now (much work is needed on example sentences).
Also, we cannot have a combobox in this case, because its content would then be translated into the language of the interface (which is usually what we want, but not in this case). Hmmhmm
extra remark: I thought we were not using the "usage text field", because we now have a "usage combo box"?... and I was planning to remove it. If we use it, we have to agree on a clear definition and write about it in the help pages. --Kipcool 13:02, 10 April 2010 (UTC)
Well, it seems that we'd need a way to connect prepositions to example sentences in a given language as well as a way to connect prepositions and example sentences across different languages. This probably requires some serious programming efforts...
In regard to the "usage text field": I am not sure if and for what we use it. When I created the "usage combo box" I did not delete the text field since Tosca said that it might be needed/useful and I agree, since this offers the possibility to give some additional free text explanation as to how or when to use a certain word (and I think I've already used it for this in some cases). --dh 10:47, 12 April 2010 (UTC)

Apertium and other wordlists

Following Gerard's blog post about Apertium English wordlist: Are there other wordlists in OmegaWiki?

They can be created from spelling dictionaries for Mozilla and OpenOffice.org.

Surely i'm not the first one who comes up with the idea... --Amir E. Aharoni 14:22, 9 April 2010 (UTC)

That's all I know.
The problem with spelling dictionaries is that they include inflexions, which we don't support yet. --Kipcool 13:10, 10 April 2010 (UTC)
The Free (GPL) Hebrew spelling dictionary hspell has a file with non-inflected forms, from which another file with all the inflections is created automatically.
Is GPL compatible with the license of OmegaWiki? Quite ridiculously, GPL is partially incompatible with the Mozilla tri-license, for example.
If such a wordlist is welcome and the license is compatible, i'll try to upload it. --Amir E. Aharoni 13:34, 10 April 2010 (UTC)
Now that i think of it, it is a bit weird: GPL is not quite compatible with the non-copyleft CC-BY, but it seems that Apertium is GPL-only, unless i'm missing something. Can anyone explain it? --Amir E. Aharoni 14:17, 11 April 2010 (UTC)
We have been granted permission to use the LIST of Apertium words. Apertium needs more information than we currently provide. GerardM 14:57, 11 April 2010 (UTC)
But if a wordlist - or anything else - is saved on OmegaWiki, it becomes GFDL & CC-BY, not just for OmegaWiki, but for the whole world. Of course, this is not true if it was done without the consent of the rights owner, but you say that this was done with Apertium's consent, so this essentially makes Apertium's wordlist GFDL & CC-BY. And now, if i understand correctly, the wordlist can be incorporated in a non-free application, provided that Apertium is properly credited.
Did i get this right? --Amir E. Aharoni 15:06, 11 April 2010 (UTC)
It becomes available from OmegaWiki with a new license. It is still available from Apertium as GPL. From my perspective, licenses and copyright are a pain; when people want to use material from OmegaWiki, I will only be pleased. Thanks, GerardM 15:53, 11 April 2010 (UTC)

Logos

I was looking for "competitors" of OmegaWiki and a friend pointed me to http://www.logos.it/ . It also claims to be a volunteer-written universal multilingual dictionary, but it is quite unclear what its license is. Also, while it is quite clear that OmegaWiki is centered around DefinedMeanings, it is not so clear for Logos.

Can anyone add anything about this project? --Amir E. Aharoni 13:55, 11 April 2010 (UTC)

It is indeed volunteer driven. It does not publish a particular license. The drawback is that it only provides an English interface to its data. GerardM 14:58, 11 April 2010 (UTC)
Did anyone try to contact them about the license? Did anyone consider collaboration with them?
An English interface is not such a bad drawback and it can be easily fixed. If that project's license were compatible with OmegaWiki, its data could just be imported into OW and use OW's internationalized interface.
The lack of a Free, or for that matter - any license, and the lack of a specification of their database seem to me much worse drawbacks.
Competition is usually a Good Thing, but this is an extremely huge project and OW's resources are scarce, so any collaboration would be good. --Amir E. Aharoni 15:14, 11 April 2010 (UTC)

Aramaic

Is it possible to enter Aramaic words in OW?

I couldn't find any kind of Aramaic in the list of languages. I tried Aramaic, Syriac, Official Aramaic, Biblical Aramaic and Babylonian Aramaic.

I'm not sure about the naming and the ISO codes. These should probably be separate:

  • Syriac - i don't know it, but it is widely studied and it has its own script. The code is syc.
  • Biblical Aramaic - i do know it. It is a limited and well-documented corpus - ~10 Bible chapters. I'm not sure about the ISO code - maybe arc. Written in the Hebrew alphabet.
  • The Aramaic of the Talmud and the Zohar - i know a bit of it. The code for at least one variety of it is tmr; i'm not sure that it covers them all. Written in the Hebrew alphabet.

There are many other ancient and modern varieties of Aramaic of which i don't know anything, but the ancient languages mentioned above would be useful for many students and there are dictionaries for them, which are in the public domain for Biblical and Talmudic, and probably for Syriac, too. --Amir E. Aharoni 20:41, 11 April 2010 (UTC)

OK, better:
  • syc - Syriac
  • arc - Aramaic (Official, Hebrew). This is for Biblical Aramaic. It was written with other scripts, so the script must be noted, like with Serbian.
  • jpa - Aramaic (Jewish Palestinian) - this is for the Jerusalem Talmud and related works.
  • trm - Aramaic (Jewish Babylonian) - this is for the Babylonian Talmud.
I'm still not sure about the Zohar.
Feel free to change the order of words and the parentheses placement in the language names. --Amir E. Aharoni 10:08, 12 April 2010 (UTC)
I added some lines in the FAQ ;-)
I will add the above languages maybe tonight (I believe you for the ISO codes and names, but I'll do a quick check nevertheless... ;-) ) --Kipcool 15:02, 12 April 2010 (UTC)
Now come the picky details ;-)
  1. In OW, we have the syc code for Expression:Classical Syriac. Does that mean that there are several kinds of Syriac? Or is it the same?
  2. In OW, we have the arc code for two expressions Expression:Aramaic and Expression:Official Aramaic. Should they be merged? This is less problematic, since we will create the code "arc-heb" to take the script into account.
    I created Expression:Aramaic (Official, Hebrew) and enabled it for editing.
  3. trm : do you mean tmr? We have Expression:Jewish Babylonian Aramaic which looks similar.
  4. jpa : we do not seem to have it.
    I created Expression:Aramaic (Jewish Palestinian) and enabled it for editing.
Thanks.
Classical Syriac is probably the only one, but if you're unsure, don't enable it just for me. Many students of the Bible and of Semitic languages learn it, and it's probably one of the most "popular" kinds of Aramaic in today's academic world, but i didn't get to it yet and i won't edit in it any time soon.
trm is, of course, tmr and Jewish Babylonian Aramaic is exactly it.
"Aramaic (Official, Hebrew)" and "arc-heb" seem OK. --Amir E. Aharoni 23:34, 12 April 2010 (UTC)
I found another macrolanguage named "Syriac" [10], with code syr. So I am giving the name "Classical Syriac" to the language which I added for editing.
I also added Expression:Jewish Babylonian Aramaic for editing, so your list is complete.
It would be nice if you could add at least one word in each language, so that it shows up in the statistics ;-) --Kipcool 08:48, 13 April 2010 (UTC)

Unknown meaning

What can we do with a word whose existence is certain, but whose meaning is uncertain?

In modern Hebrew the word חַשְׁמַל (khashmál) means "electricity", but in Biblical Hebrew its meaning is not clear. One of the theories is "amber", and the modern meaning is based on it, but there are other theories, such as "gold alloy", "shining substance", "angel" (although this probably pertains to a different period) etc.

Should i just write something like "an uncertain concept appearing in the Book of Ezekiel, possibly shining substance, amber or gold"? Or is there a smarter solution?

In ancient Hebrew and ancient Aramaic there are many more words like this one. --Amir E. Aharoni 23:20, 11 April 2010 (UTC)

I'd say you should probably just do that, i.e. write a definition that explains its possible meanings, reasons as to why it is uncertain, occurrences, etc. Though I think it probably doesn't make sense to attempt to add translations when they are not certain. --dh 10:53, 12 April 2010 (UTC)
Thanks.
Certainly, translations cannot be added in such cases as "Identical meaning", but what about non-identical meanings, to relate it somehow?
This brings up the problem of identical vs. non-identical - there are only two options. Maybe another option could be added. --Amir E. Aharoni 10:58, 12 April 2010 (UTC)
As per dh.
I don't know about non-identical meanings. We should already be able to guess a translation from the definition, so it may not be useful.
There also seem to be some arguments against the use of non-identical meaning, because it seems to create more problems than it solves: everybody has their own definition of non-identical meaning and how it should be used, which makes it a bit useless. --Kipcool 15:09, 12 April 2010 (UTC)

FAQs

Hi, it would be interesting to include in the FAQ the meaning of classes, collections and subjects, how to create them and their recommended use. --Ascánder 13:44, 13 April 2010 (UTC)

Thanks for the effort in clarifying the use of collections, classes and topics. It will be very useful.
Up to now, I've used classes based on the model of the GEMET collection: a "place" where terms centered on or related to a field can be found. For instance, class:computer science actually contains the terms which most frequently occur in abstracts of published papers from a representative part of the subfields of computer science. It also includes terms that also belong to related subfields. Now I am not clear whether this is a right use of the class concept.--Ascánder 20:14, 15 April 2010 (UTC)
No, that would be what OW calls a topic (see also DefinedMeaning_talk:isomorphic_(1193839)). In my view, the class should only denote an "is instance of" relation as opposed to an "is a type of" (or "is a subclass of") relation, which should be expressed by assigning a hyponym or hypernym, respectively. But, besides making some software changes necessary, there seems to be no consensus on this question. --dh 21:37, 15 April 2010 (UTC)
Hi Ascander, you're welcome for the clarifications ;-) I am still not very convinced by my own definition of "collections". If someone has some ideas...?
Indeed, there are a number of classes which should be changed to topics, but it is not your fault, since topics were introduced only recently. Before that, we only had classes to indicate such information (and "is part of theme"? but this relation should disappear).
I also still have to implement a "topic browser" (and "class browser" and "collection browser"), which would allow us to see what is in which topic / class ... --Kipcool 09:16, 16 April 2010 (UTC)
The definition of topics, in particular the rule that one DefinedMeaning should have one and only one topic clause, does not correspond to what is needed in order to build GEMET-style groups of concepts.
Now if classes should not be used for that either, and collections (as I remember) do not allow terms to be related, we are missing some kind of structure.
With respect to topics, I tried to use them; however, it seems to me that the hierarchical structure, the one shown manually in the list of topics, is missing. I would prefer topics to be based on an OmegaWiki class (or a collection?) in which members are topics and an inclusion relation is available. Then, creating a new topic would amount to adding the topic to the class and relating it. In this way we could use a "delete by consensus" instead of a "create by consensus" scheme.
Even better would be to use the category structure of Mediawiki, but I guess meanings and expressions can not be categorized now. --Ascánder 03:56, 19 April 2010 (UTC)
First of all, I am not sure if it is desirable to copy the GEMET structure as it seems to be quite vague and often inexact. But that's only an impression without having studied the database thoroughly.
Then, yes, unfortunately the hierarchical structure of topics is missing. But making topics a class or collection won't change that, though I think it would be good to have something similar to classes just for topics (and maybe for hypernyms as well), that is, have an extra field like the class field just for topic. The downside would probably be that it would easily end up messy, like the classes, since it becomes too easy to add topics. --dh 12:31, 1 May 2010 (UTC)
"...it becomes too easy to add topics": It's a similar debate to whether categories should be created by users or not in Wikipedia, and I'm also against pre-control on that point. With respect to GEMET, its aim, as I understand it, is to offer in one place all of the terms in or related to environmental themes, which of course has to include terms on the borderline or clearly in other topics, without including the other topics completely, and that can only be modelled with structures (different from topics) in which the same term can belong to several of them.--Ascánder 12:59, 4 May 2010 (UTC)
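
The structure discussed in this thread could be sketched roughly as follows: topics carry an inclusion relation (a parent topic), and a term may be attached to several topics at once. This is only an illustration under those assumptions; the data and the names `topics`, `termTopics`, `ancestorsOf` and `termsUnder` are hypothetical and are not part of the OmegaWiki software.

```javascript
// Hypothetical data: each topic names its parent topic (null for a root).
const topics = {
  "science": { parent: null },
  "environmental science": { parent: "science" },
  "computer science": { parent: "science" },
};

// A term may be attached to several topics (GEMET-style groups).
const termTopics = {
  "simulation": ["computer science", "environmental science"],
  "algorithm": ["computer science"],
};

// Walk the inclusion relation upwards and return all ancestors of a topic.
function ancestorsOf(topic) {
  const result = [];
  let current = topics[topic] ? topics[topic].parent : null;
  while (current !== null) {
    result.push(current);
    current = topics[current].parent;
  }
  return result;
}

// List the terms attached to a topic or to any of its sub-topics.
function termsUnder(topic) {
  return Object.keys(termTopics)
    .filter((term) =>
      termTopics[term].some(
        (t) => t === topic || ancestorsOf(t).includes(topic)
      )
    )
    .sort();
}
```

With such a structure, adding a new topic only means adding one record and naming its parent, which fits the "delete by consensus" idea above.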

Creating a new expression

When an expression is not found, there are (now) two links proposed: one to create an expression, and one to create a standard wiki page.

  • If you are using the interface in English, German, Spanish or French, this is not new, since the message was already translated locally at OW. I have just changed the message a bit, and put it at translatewiki for localization.
  • If you are using another language, this is a new feature. Before that, it would not propose to create an expression.
  • If the message is displayed in English, translatewiki.net is the place to do the localization. --Kipcool 19:18, 18 April 2010 (UTC)

url title and special characters

Two things:

  1. I have changed the url titles and links to display what we call "nice url". It was activated for normal pages, but not for pages in the namespaces of Expression and Namespace. Put more simply, it means that, for example in DefinedMeaning:etc_(867340), the links will look like http://www.omegawiki.org/Expression:ens?dataset=uw instead of http://www.omegawiki.org/index.php?title=Expression:ens&dataset=uw (both URLs work, but this is not new).
  2. I have also changed the url encoding. It means that links like Expression:&c. now work (before, it said "database error"). --Kipcool 20:25, 21 April 2010 (UTC)
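
To illustrate why encoding matters for a title like Expression:&c., here is a minimal sketch. It assumes (this is a guess, not a description of the actual fix) that the difficulty came from "&" being read as a query-string separator; `buildNiceUrl` is a hypothetical helper, not an OmegaWiki function.

```javascript
// Base URL taken from the examples above; buildNiceUrl is a hypothetical helper.
const BASE = "http://www.omegawiki.org/";

// Build a "nice url" for an expression, percent-encoding the spelling so that
// characters such as "&" cannot be mistaken for query-string separators.
function buildNiceUrl(spelling, dataset) {
  return (
    BASE +
    "Expression:" +
    encodeURIComponent(spelling) +
    "?dataset=" +
    encodeURIComponent(dataset)
  );
}

// Without encoding, "&" splits the old-style query string and cuts the title.
const unencoded = new URLSearchParams("title=Expression:&c.&dataset=uw");
const encoded = new URLSearchParams("title=Expression:%26c.&dataset=uw");

console.log(buildNiceUrl("ens", "uw")); // http://www.omegawiki.org/Expression:ens?dataset=uw
console.log(buildNiceUrl("&c.", "uw")); // http://www.omegawiki.org/Expression:%26c.?dataset=uw
console.log(unencoded.get("title")); // "Expression:" (the title is truncated)
console.log(encoded.get("title")); // "Expression:&c."
```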

What does saccense mean?

Italians, please help me.
Kind regards,
Patio 10:44, 30 April 2010 (UTC)

I have created Expression:saccense and Expression:Saccense. Does it fit your context? --Kipcool 13:26, 30 April 2010 (UTC)

Probably :-) Thank you very much. Patio 08:37, 4 May 2010 (UTC)

Databank error!?

I tried to add Judaism as a sub-topic of Religion to the "lexical item" class, but I repeatedly failed and got the following error message:

A database error has occurred. The cause may be a programming error. The last database query was: SELECT option_id FROM uw_option_attribute_options WHERE attribute_id = 1150889 AND option_mid = 405008 AND language_id = AND uw_option_attribute_options.remove_transaction_id IS NULL

from within the function "". MySQL returned the error "1064: You have an error in your SQL syntax; check the manual that corresponds to your MySQL server version for the right syntax to use near 'AND uw_option_attribute_options.remove_transaction_id IS NULL' at line 1 (localhost)".

--dh 12:21, 1 May 2010 (UTC)

It was a bug (probably caused by me?). It should work now. --Kipcool 19:55, 1 May 2010 (UTC)
Thanks! --dh 17:34, 3 May 2010 (UTC)
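
The query in the error message has nothing after `language_id =`, which suggests that an empty value was interpolated into the SQL text. The sketch below shows how validating the values first would turn that into a clear error instead of a syntax error. The table and column names come from the message above; `buildOptionQuery` is a hypothetical helper and is not OmegaWiki's actual code.

```javascript
// Hypothetical helper: the table and column names come from the error message
// above, everything else is illustrative and is not OmegaWiki's actual code.
function buildOptionQuery(attributeId, optionMid, languageId) {
  const values = { attributeId, optionMid, languageId };
  for (const [name, value] of Object.entries(values)) {
    // Reject missing or non-integer values before they reach the SQL text.
    if (!Number.isInteger(value)) {
      throw new TypeError(name + " must be an integer, got: " + String(value));
    }
  }
  return (
    "SELECT option_id FROM uw_option_attribute_options" +
    " WHERE attribute_id = " + attributeId +
    " AND option_mid = " + optionMid +
    " AND language_id = " + languageId +
    " AND uw_option_attribute_options.remove_transaction_id IS NULL"
  );
}

// Naive concatenation with an empty value leaves nothing between "=" and "AND",
// which is the kind of fragment MySQL rejected.
const brokenFragment = "language_id = " + "" + " AND";
```

Calling `buildOptionQuery(1150889, 405008, "")` throws a `TypeError` naming `languageId`, so the missing language is reported before any query is sent.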

A new statistics page

See Special:Ow_statistics. (I know that Gerard is fond of statistics...).

  • the language names, etc. are now localized
  • you can now choose your favorite ranking where your language appears best (Ascander leads anyway)
  • the page is fully integrated as a special page (for security reasons, it is better, and it also looks better)
  • About speed : even though it seemed a bit faster than the previous page on my personal system, I don't really notice a difference here, it is slow anyway. So please, don't click too often on the statistics. An idea is to cache the data for a given time to improve speed. todo... --Kipcool 22:28, 3 May 2010 (UTC)
They are great! Thanks Kipcool. Maybe a short explanatory paragraph on each subpage or on the main statistics page would complement them appropriately.--Ascánder 12:38, 4 May 2010 (UTC)
This is of course possible, though we have to consider localization.
I cannot put the message in English only, because it will look odd.
Thus, if I include an explanation in the Special page, I think it will have to be a system message, i.e. something that can be translated at translatewiki.net. Since it would be a long message, I don't know if this is appropriate (or at least I have to think about it a bit before giving more work to the translatewiki people ;-) )
Another possibility is to create a page Help:Statistics with explanations and screenshots, where we (the users) can translate the content, and then link to this page from the Special:Ow_statistics with a short message that can be translated at translatewiki. --Kipcool 13:29, 4 May 2010 (UTC)
Both possibilities will complement the statistics pages. Translatewiki is what I had in mind, as system messages are the standard MediaWiki way to localize pages, and messages that are whole paragraphs are not uncommon. --Ascánder 15:29, 4 May 2010 (UTC)
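
The idea mentioned above of caching the statistics for a given time could look roughly like this. It is a generic time-based cache, not the code of Special:Ow_statistics; `cacheFor` and its parameters are hypothetical names.

```javascript
// Wrap an expensive function so its result is reused for ttlMs milliseconds.
// `now` is injectable so the behaviour can be checked without waiting.
function cacheFor(ttlMs, compute, now = Date.now) {
  let value;
  let expiresAt = -Infinity;
  return function cached() {
    const t = now();
    if (t >= expiresAt) {
      // The cached value is missing or too old: recompute and restart the timer.
      value = compute();
      expiresAt = t + ttlMs;
    }
    return value;
  };
}
```

For example, `const getStats = cacheFor(60 * 60 * 1000, computeStatistics)` would run the slow queries at most once per hour, where `computeStatistics` stands for whatever function produces the numbers.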

Translation of help pages

With the above discussion, I am wondering if all the help pages should maybe be like system messages translated at translatewiki or use a similar system, i.e.:

  • Automatically see if a translation is up to date or not,
  • Display the help page in the user language if available, in English otherwise.

Gerard? --Kipcool 13:29, 4 May 2010 (UTC)

I think that kind of translation work is better done here than on translatewiki.net, since translators should have some knowledge about OmegaWiki. We could use the page translation feature of the Translate extension of MediaWiki, provided it works together with OmegaWiki and is deemed stable enough. --Purodha Blissenbach 21:49, 14 September 2010 (UTC)
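
The two requirements listed above can be sketched as follows, assuming each translation records the revision of the English page it was made from. The record fields (`sourceRevision`, `currentRevision`) and the function names are hypothetical; neither translatewiki.net nor the Translate extension is modelled here.

```javascript
// Hypothetical records: a translation remembers which revision of the English
// page it was translated from.
function isUpToDate(translation, englishPage) {
  return translation.sourceRevision >= englishPage.currentRevision;
}

// Pick the page to display: the user's language if a translation exists,
// English otherwise. Also report whether the chosen translation is stale.
function pickHelpPage(englishPage, translations, userLang) {
  const translation = translations[userLang];
  if (!translation) {
    return { title: englishPage.title, outdated: false };
  }
  return {
    title: englishPage.title + "/" + userLang,
    outdated: !isUpToDate(translation, englishPage),
  };
}
```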

Automatically displaying the page in your language

1. Now, when you click on "main page" on the left panel, it will automatically show the main page in your language (when available). For me it shows Meta:Main Page/fr. It also works with Help:Expression. There are however some subtleties:

  • For languages that have a 2-letter code, this code has to be used instead of the 3-letter code. This means that pages like Meta:Main Page/deu, Meta:Main Page/spa, ... have to be moved to Meta:Main Page/de, Meta:Main Page/es.
  • It is now a bit difficult to access the original English page. A ?redirect=no (or &redirect=no) has to be manually inserted in the url, for example http://www.omegawiki.org/Meta:Main_Page?redirect=no but then if you click on history, it will take you back again to the history in your language. It still needs a bit of coding.
  • It should also be possible to have it work like a normal redirect: When it redirects automatically to your language, it should say "this is a redirect from ..." and give the correct link to the English page, already with the redirect=no. (coming soon)

2. I think we can therefore get rid of Template:Mainpage/Translations. What do you think?

3. I have also developed a template Template:TranslatedFrom to help with the translation of pages. It uses three parameters and shows like this Meta:Main Page/fr (see on top). It provides a link to the original English page, tells when the translation was done, and proposes to show the changes in the original English page that have occurred since the translation was done. --Kipcool 20:02, 5 May 2010 (UTC)


Why move the 3-letter code subpages to 2-letter code ones? It would be better to make the 2-letter code subpages redirect to the 3-letter code subpages, or would that create problems? --Purodha Blissenbach 21:54, 14 September 2010 (UTC)
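
The title resolution described in this thread can be sketched as below. The code table is a small illustrative subset, and `subpageCode`, `resolveTitle` and `englishUrl` are hypothetical names rather than the functions actually used on the site.

```javascript
// Small illustrative subset of ISO 639-3 to ISO 639-1 mappings.
const TWO_LETTER = { deu: "de", spa: "es", fra: "fr", eng: "en" };

// Use the 2-letter code when the language has one, otherwise keep the code.
function subpageCode(code) {
  return TWO_LETTER[code] || code;
}

// Return the title to show for a user: the language subpage when it exists,
// the English base page otherwise. `existingTitles` is a Set of page titles.
function resolveTitle(baseTitle, userCode, existingTitles) {
  const candidate = baseTitle + "/" + subpageCode(userCode);
  return existingTitles.has(candidate) ? candidate : baseTitle;
}

// Build the URL that reaches the English page without the automatic redirect.
function englishUrl(baseTitle) {
  return (
    "http://www.omegawiki.org/" +
    baseTitle.replace(/ /g, "_") +
    "?redirect=no"
  );
}
```

Supporting the redirect direction suggested in the last comment would only require checking both the 2-letter and the 3-letter subpage in `resolveTitle`.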

Recent spam and measures against it

We have recently been spammed by someone creating lots of accounts from lots of different IP addresses (probably trojan horses) and adding content.

Therefore, I have installed some new spam-fighting tools:

  • SpamBlacklist : blocks any edit that tries to link to websites specified in that list
  • reCaptcha : when you create an account, you have to type 2 words from an image (up to now, you only had to do a mathematical operation which could be done by a bot)
  • CheckUser: allows seeing the IP addresses used by users. At the moment, I am the only one with checkuser rights, which is not a security problem since I have access to the server anyway. If need be, I can also give the rights to trusted contributors (or bureaucrats can appoint themselves...). --Kipcool 19:19, 19 July 2010 (UTC)
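
As an illustration of the blacklist idea only (not the SpamBlacklist extension's actual implementation), an edit can be checked by extracting its links and testing them against a list of patterns. The patterns and function names below are made up for the example.

```javascript
// Hypothetical blacklist entries; real lists would be maintained on the wiki.
const blacklist = [/\bcheap-pills\.example\b/i, /\bcasino-[a-z]+\.example\b/i];

// Extract every http(s) URL from the submitted text.
function extractUrls(text) {
  return text.match(/https?:\/\/[^\s\]<>"]+/g) || [];
}

// An edit is rejected when any of its links matches a blacklist pattern.
function isEditAllowed(text) {
  return !extractUrls(text).some((url) =>
    blacklist.some((pattern) => pattern.test(url))
  );
}
```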

Links and icon on the right for Wikipedia links

Yesterday, I made Wikipedia links more visible: they now appear on the right (see the blog post about it).

Basically, the code is a piece of javascript (i.e. on the client side, no additional load for the server) that looks into the list of translations of a concept to find links to Wikipedia articles. When a link to Wikipedia is found, an icon and a link are created on the right of the page. This will be in the language of the user if available. Otherwise, it is displayed in English, or at least in any other language available.

When several definitions are present on a page, each has a different link to Wikipedia, such as in Expression:dólar. Enjoy! --Kipcool 11:33, 21 July 2010 (UTC)

  • Note 1: it only shows up on expression pages, not on DefinedMeaning pages. (I don't think it is necessary, and it would need some more coding)
  • Note 2: now, the links to Commons also show up in the same way (cf. Expression:horse) --Kipcool 22:27, 21 July 2010 (UTC)
  • Note 3: if the Wikipedia link is not in the user language, it will show in Italic (to inform the reader discreetly). --Kipcool 16:53, 22 July 2010 (UTC)
Very useful, thank you! --Tosca 11:25, 22 July 2010 (UTC)
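
The selection rule described above (user language first, then English, then any other language, with italics when the link is not in the user's language) can be sketched as a small function. This is a reconstruction from the description, not the script running on the site; `pickWikipediaLink` and the shape of `links` are assumptions.

```javascript
// `links` maps a language code to a Wikipedia URL found among the translations
// of one definition. Returns null when there is no link at all.
function pickWikipediaLink(links, userLang) {
  const available = Object.keys(links);
  if (available.length === 0) return null;
  // Preference order: user language, then English, then any other language.
  const lang = [userLang, "en"].find((code) => code in links) || available[0];
  // Links that are not in the user's language are flagged for italic display.
  return { lang, url: links[lang], italic: lang !== userLang };
}
```

On a page with several definitions, the function would simply be called once per definition with that definition's own links.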

Catalan word class

The "word class" option for Catalan words only allows "prefix" and "suffix". Should it not have "noun", "verb", etc., like the other languages? --Celestianpower Háblame 19:24, 22 July 2010 (UTC)

Hi, I have added adjective, noun, adverb, verb.
I tried a new method (so there might be problems), i.e. instead of adding them at the page Expression:lexical item, where the list becomes a bit too long, I added them at DefinedMeaning:Catalan_(7665). If you need more options, you can add them there.
It seems to work well even though suffix and prefix are defined at Expression:lexical item.
In the long term, I want to move all the attribute definitions to the language pages. Don't try to help, I'll do it with SQL queries. --Kipcool 18:18, 28 July 2010 (UTC)

Loading the pages faster

To load the pages (a bit) faster, it is possible to disable the automatic sorting of the columns, which takes time and typically freezes the browser for a second. To do that, simply insert the following in your personal monobook.js (or some other name):

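// Replace the site's column-sorting function with an empty one,
// so that tables are no longer sorted automatically on page load.
function toSort(elementName, skipRows, columnIndex) {
}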

Recent problems with the server

Dear all, you may have noticed that OmegaWiki has had some problems recently. And when it is back online, it almost always means that I have rebooted the mysql and apache servers manually... Some info, just to keep you posted:

Yesterday, for example, the problem was that the memory (1GB) was full and the swap as well (2GB). When the server begins to swap, it becomes slow, and if it is slow, it is not able to cope with the demand, and becomes even slower.

Every day, I try something new to have a functional server. I thought it was because of too many bots indexing the website, so I forbade access to most of them (basically, at the moment only Google and Msn are allowed). I also have played around with the mysql parameters (by changing the memory allocated to the cache mostly), and now I am trying with the apache parameters.

I have no experience with servers (OW is the first server that I can play with), so I am learning by doing. Thus, I don't know when I'll find a good set of parameters (maybe the ones now are good?). If someone with experience passes by, you can help me ;-). What I still don't understand is why the problems appear now. We have had these servers since March (thus my first idea of connecting it with the indexing bots). --Kip 12:03, 9 September 2010 (UTC)

Do you know whether Apache or MySQL is responsible for the memory consumption? Malafaya 13:27, 20 September 2010 (UTC)
It was Apache. I reduced the number of servers it can create, and it seems to be ok now because I was away for a week and it's still working. --Kip 18:24, 23 September 2010 (UTC)
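
A rough way to reason about the limit on Apache processes is to divide the memory left for Apache by the memory one process uses. The sketch below shows that arithmetic; only the 1 GB of RAM comes from the discussion above, while the reserved amount and the per-process size are assumed figures that would have to be measured on the real server.

```javascript
// Estimate how many Apache worker processes fit in RAM without swapping.
// All figures except the 1 GB of RAM are assumptions for illustration.
function maxWorkers(totalMb, reservedMb, perWorkerMb) {
  return Math.max(1, Math.floor((totalMb - reservedMb) / perWorkerMb));
}

// 1024 MB total, 400 MB kept for MySQL and the system, 40 MB per worker.
console.log(maxWorkers(1024, 400, 40)); // prints 15
```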

Question about the empty word

The empty word, or nullword, is assumed to be 'used' occasionally, usually for systematic reasons, in some languages. For example, in many Germanic languages, the empty word is an article that precedes plural nouns. In English, in I eat the apple / an apple / apples, the nullword precedes apples.

Now I want to make a table listing all declined forms of an English noun with their respective articles linked to their DefinedMeanings. I hit problems: I don't know how to link to an empty word having the meaning "English article preceding all plural nouns", and I need to know how to mark the empty word so that there is something clickable. --Purodha Blissenbach 00:50, 19 September 2010 (UTC)

The empty word is just a grammatical concept and not a word per se. Therefore, no entry should be created for such a word. (thus solving your problem :p ) --Kip 18:27, 23 September 2010 (UTC)
No, it's not solving my problem. It means that grammar rules have to be more complicated, especially those being applied in semiautomatic translation suggestions. --Purodha Blissenbach 14:43, 24 September 2010 (UTC)
But never mind, I am not proposing the empty word now for OmegaWiki :-) I know, it's a complicated subject matter, that can of course be dealt with elsewhere. For the time being, I only have to annotate my grammatical tables appropriately. --Purodha Blissenbach 21:17, 25 September 2010 (UTC)

related term?

I would like to add the relation "related term" that exists elsewhere, but it does not appear in the list of possible choices on DM edit pages. What can I do? --Purodha Blissenbach 12:02, 20 September 2010 (UTC)

We decided not to use the relation called "related term" because it is too vague, and does not indicate in which way the word is related. The problem would then be that everybody interprets it his own way, creating a mess... --Kip 18:30, 23 September 2010 (UTC)
There are such relations there, and they are useful for me. If we had more precise ones, fine, I would certainly use them. But without a better replacement (or set of replacements), I want that functionality back. Specifically, I remember that I had a case recently where I could not think of anything better, or more distinctive, than "related term". --Purodha Blissenbach 14:43, 24 September 2010 (UTC)
Among grammatical terms, we have sets of antonyms having more than 2 members each (e.g. collective name is a member of one), which may also be called siblings, as I read. "Is a sibling of" might thus be a more precise relation than "related term". I personally do not care to have sets of antonyms of larger cardinality than two. --Purodha Blissenbach 21:09, 25 September 2010 (UTC)

theme: grammar?

I've added some grammatical terms. I had to put them under the "theme: linguistics" even though this is far too broad a concept, imho. I want them put under the "theme: grammar", which in turn is under the "theme: linguistics", so as to be precise and not to waste information. Note also that there are words that are ambiguous in the realm of linguistics, but no longer so when tied to specific linguistic fields. What can I do to achieve the wanted precision? --Purodha Blissenbach 12:02, 20 September 2010 (UTC)

I think it is ok to add a topic "grammar". We already have topics that are part of a larger topic, cf. Help:Topic. --Kip 18:32, 23 September 2010 (UTC)
OK, I'll go ahead and do that. --Purodha Blissenbach 14:43, 24 September 2010 (UTC)
If you have added the topic, please put it in the list at Help:Topic. Thanks. --Kip 08:20, 25 September 2010 (UTC)
Done. Still adding DMs to the topic, there are so many… --Purodha Blissenbach

The word of the day

I see a different word of the day when I am logged out.--AivoK 12:22, 10 October 2010 (UTC)

I think that the IP users (i.e. logged out) see a cached version of the pages, whereas logged-in users see a current version. This reduces the load on the server.
I am not sure how to have the word of the day updated every day for IP users as well. --Kip 14:52, 10 October 2010 (UTC)
Purge the page cache after midnight? Could be done by a cron job. --Purodha Blissenbach 00:13, 11 October 2010 (UTC)
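
The check behind such a purge can be sketched as follows, assuming the word of the day changes at midnight UTC (the timezone is an assumption, as are the function names). A scheduled job could run this test and purge the cached main page whenever it returns true.

```javascript
// Return the UTC calendar day ("YYYY-MM-DD") of a Date.
function utcDay(date) {
  return date.toISOString().slice(0, 10);
}

// A cached page showing the word of the day is stale once the UTC day changes.
function needsPurge(cachedAt, now) {
  return utcDay(cachedAt) !== utcDay(now);
}
```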

New page for a Dutch professional

It's been ages since I created a new page. I made one in the old school Wikipedia way for ketellapper, but it did not look good so I removed it...
Please help me, Dutch natives and specialists in this odd language ;-) thx in advance Patio 10:00, 11 October 2010 (UTC)

You need to be in the "Expression" namespace, i.e. Expression:ketellapper. Then, remember to select a language in the upper combobox (with red background) and a language in the definition combobox.
Not sure of the definition. Is it "A person who repairs kettles"? --Kip 14:25, 12 October 2010 (UTC)
I took the definition from the deleted page and created Expression:ketellapper from it, and added some translations.
In connection with that, I created some somewhat related expressions. Please have a look at Expression:Wännläpper and at my attempt to write a Dutch definition of one of the meanings; it may need an overhaul. Thank you. --Purodha Blissenbach 14:19, 16 October 2010 (UTC)