
Meta:Functionality wanted


Create a new section for new functionality.

Implemented requests are moved to Meta:Functionality wanted/Archive.

See also OmegaWiki related bugs at phabricator.



Counting characters of words

The idea of OmegaWiki is that it will be useful to as many people as possible. Different communities bring different strengths to WZ, but also weaknesses. Professional translators are interested in finding and adding translations; they are not really interested in adding definitions. People who do puzzles are very much interested in definitions and relations (for example: a cat, six characters, starting with a “j”). The number of characters of an Expression that does not include a space is easy to calculate. By providing query options for finding cats sorted by number of characters, you add a whole new group of people with an interest in WZ.
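A minimal sketch of such a length query, in Python. The sample data and the function name `find_by_length` are assumptions made for illustration; nothing here reflects the actual OmegaWiki schema, and the category restriction ("a cat") is left out for brevity.

```python
# Hypothetical sample data: (expression, language) pairs, not real database rows.
EXPRESSIONS = [
    ("jaguar", "eng"),
    ("jackal", "eng"),
    ("lion", "eng"),
    ("black panther", "eng"),
    ("Jaguar", "deu"),
]

def find_by_length(expressions, length, prefix="", language=None):
    """Return space-free expressions with exactly `length` characters."""
    matches = []
    for text, lang in expressions:
        if " " in text:
            continue  # the request only covers expressions without a space
        if language is not None and lang != language:
            continue
        if len(text) == length and text.lower().startswith(prefix.lower()):
            matches.append(text)
    return sorted(matches)

six_letter_j = find_by_length(EXPRESSIONS, 6, prefix="j", language="eng")
print(six_letter_j)
```

Note that `len` counts Unicode code points, which may differ from what a puzzle solver counts as one character in scripts that use combining marks.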

Benefit

More translations, and more linking to all kinds of relations. (remember, categories ARE relations)

Interface

Selecting several 'display languages'

A 'display language' has to be a language plus, where applicable, a variety selector, such as a geographic region or a specific script. Allowing a user to specify a set of such selectors in his preferences or via URL, independently of his interface language selection, should greatly improve usability for some tasks compared with having only the interface language.

Benefit

Reduces the number of routinely performed consecutive lookups. Increases the amount of useful information visible on one page for certain kinds of uses.

Allowing e.g. a Serbian contributor to see Cyrillic and Latin script data simultaneously lets him fill in missing data more quickly. In etymology lookups, it's often necessary to review current and historic language data routinely. A translator working bidirectionally may wish not to respecify his target language all the time.

It is desirable to separate interface language and target language for many tasks.

Technical supervision is easier, if there is a way to have all information of all languages available on one page rather than having to dig through thousands of specific pages. --Purodha Blissenbach 19:52, 18 July 2006 (CEST)

Varieties

In addition, the expression in "English (United Kingdom)" or "English (United States)" should be displayed automatically when the interface language is "English" and there is no English expression.

Example: The page Expression:Sulfid currently looks like this:
+– сулфид: Any compound that includes one or more sulfur atoms with a more electroposi...

--Ortografix 13:46, 14 August 2007 (EDT)

Languages choice

I'm not sure I understand the above, so I'll express my idea, which may be the same. I'm not interested in all these languages cluttering up my screen and making it difficult for me to find what I'm looking for. I'm only interested in very few languages. I'd like to be able to choose, in my preferences, the languages that will appear, and even the order in which they will appear on the screens and in the drop-down lists when editing a page. This would be a concrete improvement for the reader, compared to Wiktionary. --Fiable.biz 22:51, 26 August 2009 (UTC)

I agree, I had the same idea :-) Not sure how to implement this though... --Kipcool 09:22, 2 September 2009 (UTC)
It doesn't seem very difficult to me:
  1. You add a few fields to the users table: one field indicating whether the user has already chosen the languages he's interested in, has not chosen yet, or has stated he's interested in all languages. Optionally, this field could have a 4th state saying that he hasn't chosen but the languages he's interested in have been guessed by the system. Then you add as many language fields as a human user of OmegaWiki could need if not interested in all languages. To decide the number of fields to add, look at the user declaring the most languages in his "Babel user information" and increase that number a bit. I guess 20 would be a good number.
  2. You write a form allowing the user to choose and sort the languages he's interested in. They may be different from the languages he has a competence in.
  3. For each table containing a language field, you write a query filtering the table according to the languages the user is interested in. If all, then the query just returns the table. Possibly some existing queries aimed at sorting results will have to be dropped.
  4. You replace all the accesses to the said tables by accesses to the queries. This part may be a bit long.
The advantage of this is that it would lighten the following processes (and thus the server work): no need to sort and display language names, expressions, definitions etc. for these undesired languages. Am I missing something important? All the best. --Fiable.biz 11:55, 27 July 2010 (UTC)
Then do it, it is opensource. Thanks. --Kipcool 16:04, 6 August 2010 (UTC)
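The filtering part of the four steps proposed above could be sketched roughly as follows, in Python rather than in the wiki's own code base. The row layout, the function name `filter_by_languages` and the choice to show everything when no preference is stored are all assumptions for illustration.

```python
# Hypothetical rows as they might come out of a table with a language field.
ROWS = [
    {"language": "eng", "expression": "water"},
    {"language": "deu", "expression": "Wasser"},
    {"language": "fra", "expression": "eau"},
    {"language": "jpn", "expression": "水"},
]

def filter_by_languages(rows, preferred, all_languages=False):
    """Keep only rows in the user's languages, in the user's own order.

    `preferred` is the ordered list from the (hypothetical) preference form.
    Users interested in all languages, or with no choice stored yet,
    simply get every row back unchanged.
    """
    if all_languages or not preferred:
        return list(rows)
    rank = {lang: position for position, lang in enumerate(preferred)}
    kept = [row for row in rows if row["language"] in rank]
    # sorted() is stable, so rows within one language keep their original order.
    return sorted(kept, key=lambda row: rank[row["language"]])

shown = [row["expression"] for row in filter_by_languages(ROWS, ["fra", "eng"])]
print(shown)
```

In a real deployment the filter would more likely live in the database query itself, as step 3 suggests, so that undesired rows are never fetched at all.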

Fields explanation

Each field name could be a link to a wiki page explaining what is expected there. --Fiable.biz 10:41, 6 September 2009 (UTC)

Transitivity and reflexivity of Relations

Currently a Relation is seen as going only from A to B, no matter what the type of the Relation is. That is, if a is related to b, then the opposite (b is related to a) does not automatically hold, but has to be entered separately.

It is highly desirable to be able to specify a type of the relationship; for example, related to would be a reflexive (more precisely, symmetric) relationship, so that adding it between a and b would make it visible from both a and b. Other relationships (for example broader term) would not be reflexive, but transitive, so that if a is a broader term than b, and b is a broader term than c, then it holds that a is a broader term than c.
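A small Python sketch of what typed relations could derive automatically. The relation type names come from the text; the triple layout, the two sets and the function `expand` are assumptions for illustration. What the request calls "reflexive" is treated here as symmetric.

```python
# Assumed classification of relation types; not taken from the OmegaWiki software.
SYMMETRIC = {"related term"}
TRANSITIVE = {"broader term"}

def expand(relations):
    """Add the relations implied by symmetric and transitive relation types.

    `relations` is a set of (a, relation_type, b) triples, read as
    "a is a <relation_type> of b", e.g. "metal is a broader term than iron".
    """
    result = set(relations)
    # Symmetric types: a -> b implies b -> a.
    for a, rel, b in relations:
        if rel in SYMMETRIC:
            result.add((b, rel, a))
    # Transitive types: a -> b and b -> c imply a -> c; repeat until stable.
    changed = True
    while changed:
        changed = False
        for a, rel, b in list(result):
            if rel not in TRANSITIVE:
                continue
            for c, rel2, d in list(result):
                if rel2 == rel and c == b and (a, rel, d) not in result:
                    result.add((a, rel, d))
                    changed = True
    return result

entered = {
    ("iron", "related term", "steel"),
    ("metal", "broader term", "iron"),
    ("iron", "broader term", "cast iron"),
}
derived = expand(entered)
```

Only the three entered triples would need to be stored; the reverse and the chained relations are computed on demand.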

Benefit

It would greatly reduce the editors' workload if Relations of this type did not need to be added in several places. --Sannab 10:22, 19 July 2006 (CEST)

See also previous entry #narrower terms / broader terms. HenkvD 16:33, 19 July 2006 (CEST)

Discussion from Insect room

narrower terms / broader terms

I added narrower term iron on metal, which is shown correctly on there. I expected on iron to show metal as a broader term, but that is not the case. Is this a bug, am I impatient or am I expecting too much? HenkvD 21:55, 1 June 2006 (CEST)

This was not considered in the software. The point is that not all relations are two way. This has to be considered on the RelationType level. It is not a bug, you are impatient as we all are .. :) GerardM 12:13, 3 June 2006 (CEST)
I've been discussing this with Peter-Jan. There are different implementation strategies -- we can add the relations redundantly, or we can treat certain relations as working in both directions on the database level. In the case of the former strategy, the software still needs to be aware that relation B is the inverse relation of A -- otherwise any future rollback capability will only roll back one side of the addition. For now, please add them from both pages, but I hope we'll have a solution ready soon.--Erik 00:07, 14 June 2006 (CEST)
As far as I can see, these are the relations:
  • broader terms <--> narrower terms
  • related terms <--> related terms
HenkvD 08:56, 14 June 2006 (CEST)
These relations are different:
  • (Relation:broader term, owl:inverseOf, Relation:narrower term)
  • (Relation:related term, rdf:type, owl:SymmetricProperty)
but as you can see, OWL has a solution. The only thing that is missing is the programmer that is implementing this logic. MovGP0 11:48, 28 February 2007 (EST)
Should not be so difficult. I recently wrote an article about Knowledge Extraction on Wikipedia. Lots of tools there to get the data into RDF/OWL from the relational table. After that you can load it into a triple store and run additional queries on it. I would recommend Triplify and Virtuoso. Just my 50 cents. I was looking for an RDF dump of OmegaWiki, but I do not have time to make one right now; please mail me if there is one. I am converting Wiktionary to RDF at the moment with the DBpedia framework. Sebastian Hellmann

Presentation

I would add that ideally, most Relations and Incoming Relations should be presented in one section, by reversing the relation.

The reason for this is that it is extremely disturbing for a newcomer to read:

Expression: ship
Relations: hypernym: vehicle
Incoming relation: tanker: hypernym

Most people have never learnt the concept of a directed graph (especially writers or translators, I guess), so what they will understand here is that both vehicle and tanker are hypernyms of ship and that, for some reason they don't want to mess with, they are presented separately. Ideally, all relations should be expressed taking the entry as the base.

hyponym of vehicle.
hypernym of tanker.
part of theme navigation (and navigation "is one of the themes of ship", if we want to express that).

Just my two cents... I know that the writing guide speaks a bit about oriented relations, but most readers won't ever read that page or pay attention to this part. Eden 15:30, 12 April 2007 (EDT)
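The entry-based presentation suggested above could be produced along these lines. This is a Python sketch under assumptions: the triple layout, the `INVERSE` table and the function `describe` are invented for illustration and say nothing about how OmegaWiki stores relations.

```python
# Assumed table of inverse relation names.
INVERSE = {"hypernym": "hyponym", "hyponym": "hypernym"}

def describe(entry, triples):
    """Phrase every relation from the point of view of `entry`.

    A triple (source, relation, target) is read as "the <relation> of
    source is target", e.g. ("ship", "hypernym", "vehicle").
    """
    lines = []
    for source, relation, target in triples:
        if source == entry:
            # Outgoing relation: flip the label so the entry is the subject.
            lines.append(f"{INVERSE[relation]} of {target}")
        elif target == entry:
            # Incoming relation: the stored label already describes the entry.
            lines.append(f"{relation} of {source}")
    return lines

TRIPLES = [
    ("ship", "hypernym", "vehicle"),
    ("tanker", "hypernym", "ship"),
]
ship_lines = describe("ship", TRIPLES)
print(ship_lines)
```

With this, outgoing and incoming relations end up in one list and the reader never has to think about the direction of the stored link.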

"Dirty" mark on related Expressions when DefinedMeaning is altered

Moved here from Insect room--Sannab 16:36, 21 July 2006 (CEST)

It would be very useful if there was a flag that would automatically be set to 'dirty' or 'needs verification' on Relations between Expressions and DefinedMeanings if the DefinedMeaning was in any way altered. Without such a mark, ensuring the quality of the related Expressions seems to me basically impossible.--Sannab 09:48, 3 May 2006 (CEST)

It should be part of a QA drive; when certain things are changed to a DefinedMeaning, the content needs validation.
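One way such a flag might behave, sketched in Python. The class and field names (`ExpressionLink`, `needs_verification`, `update_definition`) are hypothetical and chosen only to mirror the wording of the request.

```python
from dataclasses import dataclass, field

@dataclass
class ExpressionLink:
    """Link between one expression and a DefinedMeaning (hypothetical model)."""
    expression: str
    language: str
    needs_verification: bool = False

@dataclass
class DefinedMeaning:
    definition: str
    links: list = field(default_factory=list)

    def update_definition(self, new_text):
        """Change the definition and mark every linked expression as dirty."""
        if new_text == self.definition:
            return  # nothing changed, so nothing needs re-checking
        self.definition = new_text
        for link in self.links:
            link.needs_verification = True

dm = DefinedMeaning(
    "A large vessel for travel on water.",
    [ExpressionLink("ship", "eng"), ExpressionLink("Schiff", "deu")],
)
dm.update_definition("A large seagoing vessel.")
dirty = [link.expression for link in dm.links if link.needs_verification]
```

A reviewer who confirms that an expression still fits the altered definition would then reset the flag on that one link.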

Relation weight, Attribute weight

At times relations are of different strength; e.g. an expression may be 'central' or vital to a domain, while another one, although generally in the domain, is seldom used and is weakly bound. If such an expression is also in another domain, where it has another DefinedMeaning but probably identical grammatical properties, it might be helpful to attach a weight factor to the binding of the expression to either domain.

Samples of possible uses

So a translator e.g. has some advance indication of the probability of one domain over the other, or (s)he can work on ambiguous passages of text from the most uncertain (i.e. the most decisive, most discriminating) ones downward, thus possibly saving an amount of research on expressions that are not likely to give clues about the domain.

Some types of comedy use sudden, unexpected switches of domain context; that is what makes them funny. "Good morning sir, what is your profession? - I am a psychiatrist. - Oh, I see. How am I today?", or "My cock popped out of my pants again … and hopped back over in the chicken shack". If we can relate the likelihood of double-bound expressions (in two domains) to the acquaintance of an audience with either domain, we can tell how many in the audience will likely understand the joke using them. We can even take the reverse algorithm to construct new artificial expressions of the type "Alfons, der viertel-vor-zwölfte" or the "bottle of good old british spinster insective" with some likelihood to be funny.

This is too complex to use. Even humans can't always translate jokes while keeping them funny. When dubbing films, humans don't even try a 1:1 translation, but instead write jokes of their own for the target language. I don't think that it's worth the effort to teach AI things like humor (at least not at the current development stage of AI systems). MovGP0 12:03, 28 February 2007 (EST)

Language filter for Special:Allpages

It would be very nice if I could reduce the pages displayed by Special:Allpages to one selected language, for example, all German words and all pages in German. That would make finding a specific word or checking for broken expressions in one language so much easier. --Mkill 02:18, 5 August 2006 (CEST)

Inflexions (conjugations, declensions)

Inflexions are a functionality much wanted. The discussion about this has been moved to International Linguists Beer Parlour/Inflexions --Fiable.biz 12:41, 21 December 2009 (UTC)

Different meaning of the same word vs. entirely different word

It would be great if we had some way to indicate whether two different DMs under the same lemma describe different meanings for one word or different words altogether. This distinction is necessary once we have grammar functionality: While DM's which indicate different meanings of one word would have the same grammar, different words which only share the same spelling would not. Example: German "das Band" vs. "die Band", English "to walk" vs. "a walk". --Mkill 11:31, 5 August 2006 (CEST)


I believe the entire idea is mistaken. Grammar attributes cannot be linked to DMs; they're only available for a DM+language+expression combination (i.e. what you call 'word'), so we have (re-using a recent example), simplified:
DM (short) | language | expression | syntacto-grammatical info (excerpt) | example sentence
Emanation | eng | birth | linked-by-genitive → the-object-emanating | the birth of a star
Childbirth | eng | birth | linked-by-genitive → child | the birth of Don Niclas by Regina Niclas
Childbirth | deu | Geburt | linked-by-genitive → child | die Geburt des Don Niclas durch Regina Niclas
Childbirth | eng | delivery | linked-by-genitive → child | the delivery of her son Don Niclas by Regina Niclas
Childbirth | deu | Niederkunft | linked-by-genitive → mother | die Niederkunft der Regina Niclas mit ihrem Sohn Don Niclas
Here you can see how you can translate eng:birth ←→ deu:Geburt, and eng:delivery ←→ deu:Niederkunft; but you could also do it crosswise; in any case you've got to get the grammar right. While a human translator would usually not think much about the adjustments (they might, if not at all fluent), we have to consider and document them. At times, grammar is sufficient to exclude specific DefinedMeanings in many languages; your sample words are of that kind.
 
What if we have identical DM+language+expression and yet different grammar or syntax going with them? This is more common than one might think at first sight. E.g. a verb may have transitive and intransitive forms sharing a DefinedMeaning; if so, two sets of syntacto-grammatical properties will be linked to one key DM+language+expression. It is then up to the user to decide which of them she is analysing, or which he must (or wants to) use as a translation or synonym. -- Purodha Blissenbach 01:12, 6 August 2006 (CEST)

While it's of course possible to limit grammar to DM+language+expression, this will generate a lot of redundant data and it's just not the way languages work. I think that's the main point: the structure of OmegaWiki will have to follow the structure of languages, or it won't work.

But first, to make it clearer, I think we talk about two different things: inflection and usage (syntactical information).

Let's start with inflection and look at an example: Expression:be. Currently, it lists 5 DMs. Note that all 5 DMs cover the same word: the English verb "to be". Entering information about inflection (1st person singular: am; present participle: being / past participle: been and so on) in all five DM's is possible, yes. It's just not a clever way to do it. It's much simpler to link all 5 DMs as "same word in English" and have the information about inflection stored once.

The other field is usage: Yes, syntactical information like linked-by-genitive is in fact closely linked to DM and the example Expression:be shows this very clearly. Three of them indicate usage as a normal verb, two of them usage as an auxiliary verb. --Mkill 11:19, 6 August 2006 (CEST)

Yes, we talk about both inflection and (syntactical) use at the same time. From a formal point of view they're interrelated, and at our current level of generality they can be discussed collectively. More specifically, these are of course language dependent and exist only in some languages as separable concepts of either grammar or syntax.
There is no redundant data if these are linked to DM+lang+expr; where would the redundancy be? (Wiktionaries currently do replicate the grammatical forms very often unless templates are used, but where would WZ?)
I did not in any way imply how the 5 DM+(eng:"be") were to be linked to their grammatical info, and I do not care. If there is only one set of inflections applying to all of them, then sure it would be economical to have in the end only 1 database record describing them (and I am in favour of only having one in this case).
Forget the idea of defining something as "same word in English" - expressions may have 'all properties in common', that is all there is to it.
As to usage / syntax vs. grammar: call me a mathematical linguist in this field. There is no real distinction between them generally. In Latin or in English you can say there is a Grammar describing inflections of words, and there is a Syntax describing how words are arranged into sentences. Yet as letters form words, so do words make sentences, sentences make paragraphs, etc. It is one principle iterated. The mathematical concept of grammar can be applied on every level. -- Purodha Blissenbach 20:39, 7 August 2006 (CEST)
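Both positions in this thread seem compatible with a layout in which grammatical information hangs off the DM+language+expression key but is stored only once. A Python sketch follows; the DM identifiers, table ids and function name are placeholders, and only the three inflected forms are taken from the discussion.

```python
# One inflection record, stored once. The three forms are the ones quoted above.
INFLECTION_TABLES = {
    "eng:be": {
        "1st person singular": "am",
        "present participle": "being",
        "past participle": "been",
    },
}

# Placeholder ids standing in for the five DMs listed under Expression:be.
DM_IDS = ["DM-1", "DM-2", "DM-3", "DM-4", "DM-5"]

# Each DM+language+expression key points at a list of table ids. It is a list
# because one key may carry several property sets (e.g. transitive and
# intransitive use of the same verb).
LINKS = {(dm, "eng", "be"): ["eng:be"] for dm in DM_IDS}

def inflections_for(dm, language, expression):
    """Return the inflection records linked to one DM+language+expression key."""
    table_ids = LINKS.get((dm, language, expression), [])
    return [INFLECTION_TABLES[table_id] for table_id in table_ids]

forms = inflections_for("DM-3", "eng", "be")
print(forms[0]["past participle"])
```

All five keys resolve to the same record, so correcting a form once corrects it for every DM that shares it.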

biological taxonomy

We have already a few entries on species. At the moment, the expressions are rather garbled. Example: Expression:chiropteran. To clean this up, it would be nice to have a means of adding the scientific names of species, families, and so on.

Expression:Chiroptera will be the same in every language, even those that do not use Latin script, like Japanese and Chinese.

I beg to differ. Expression:Chiroptera is 翼手目 in Japanese. Other taxa use a katakana version of the binominal name. Gon-no-suke 03:45, 14 February 2007 (EST)
翼手目 is the Japanese name of the order, like Fledertiere in German. Still, Japanese sources always list the scientific name in Latin script, Wikipedia:コウモリ. It clearly says "Chiroptera" and not チロプテラ in Katakana (transcribing biologic taxonomy into katakana would be too confusing anyway).

So, what I wanted to say is, the scientific terminology is independent of individual languages and can be considered a language of its own. And it should be handled this way in Omegawiki. --Mkill 04:27, 16 February 2007 (EST)

Please look again. It says 翼手目 for the scientific name, and next to it there is an interwiki link to the English Wikipedia at Wikipedia:Chiroptera. This is an English expression, not a Japanese one. Japanese scientific texts often list the English version of scientific expressions for more clarity. Since the scientific name is 翼手目 there is of course no need for an expression like チロプテラ in this case. For examples of transliterated Latin taxonomy, check out the Wikipedia:アメーバ page. There are some Latin names, but those are probably there because it is a bother, or not clear how, to transcribe them into katakana. And yes, it is confusing... Gon-no-suke 09:32, 16 February 2007 (EST)

The idea is to add "scientific name" or something like that to the list of languages. Maybe this can be used for other purposes, too: Numbers, chemical formulae ... --Mkill 11:50, 6 August 2006 (CEST)

collocations

Another feature request... well, it can't hurt to collect them all, even if they will see implementation much much later...

What are collocations? A collocation is a combination of two or more words. For example, if you have "nuclear" and "accident", "nuclear accident" is a collocation of the two. We already have a lot of these in the database, including the example I just gave: Expression:nuclear accident. But so far, the database does not know that this expression is in fact a collocation. Adding this feature would help to create a better network between expressions.

So, collocations link an expression+language+DM entry (the collocation) with two or more other expression+language+DMs. Links should be added in both directions: if I define "nuclear waste" as a collocation of "nuclear" and "waste", it should appear in the collocation list of both entries.

It should be possible to create a cascade of collocations, for example defining "nuclear waste disposal" as "nuclear waste" + "disposal".

Collocations should not be transferred across languages and synonyms, because what is a collocation in one language, might be one word in another. (cf. Expression:little sister).

Note: it is already possible to link these expressions by using "broader terms" or other entries, but this does not always work. Example: while "nuclear waste" is of course a narrower term of "waste", it is not a narrower term of "nuclear". Also, as far as I know, "broader term" and "narrower term" don't create bidirectional links. --Mkill 12:29, 6 August 2006 (CEST)

Just a thought I get when talking about "nuclear waste disposal": being able to define this as a collocation of collocations would also remove some of the present ambiguity. Is it "disposal" of "nuclear waste", or is it "waste disposal" which is "nuclear"? The latter is sort of nonsensical, but I'm sure there are situations where several different combinations are plausible. Each of these combinations would then be at their own DM. László 06:02, 14 February 2007 (EST)
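The bidirectional links and the cascade described in this section could be modelled roughly like this. It is a Python sketch with invented names (`CollocationIndex`, `used_in`, `expand`); for brevity it keys entries by the expression string alone, whereas the proposal links expression+language+DM entries.

```python
from collections import defaultdict

class CollocationIndex:
    """Hypothetical in-memory index of collocations and their parts."""

    def __init__(self):
        self.parts = {}                   # collocation -> tuple of its parts
        self.used_in = defaultdict(set)   # part -> collocations containing it

    def add(self, collocation, parts):
        """Register a collocation and link it back from each of its parts."""
        self.parts[collocation] = tuple(parts)
        for part in parts:
            self.used_in[part].add(collocation)

    def expand(self, expression):
        """Resolve a cascade of collocations down to its individual words."""
        if expression not in self.parts:
            return [expression]
        words = []
        for part in self.parts[expression]:
            words.extend(self.expand(part))
        return words

index = CollocationIndex()
index.add("nuclear waste", ["nuclear", "waste"])
index.add("nuclear waste disposal", ["nuclear waste", "disposal"])
words = index.expand("nuclear waste disposal")
```

Because "nuclear waste disposal" is registered as "nuclear waste" + "disposal", the reading "waste disposal that is nuclear" is not implied; it would have to be added as its own entry.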

unique placeholder DM content

Following up on Sabine's note in International_Beer_Parlour#Adding_terms_while_translating, I suggest, as a software extension, enabling the usual wiki use of ~~~~ inside DMs, so as to create a unique placeholder for a series of words to be filled in later. -- Purodha Blissenbach 19:51, 7 August 2006 (CEST)

Huh, I don't understand what you mean by "enable the usual wiki use of ~~~~". Does it do something else than adding this : 150.214.57.42 16:44, 11 August 2006 (CEST) ? (Kip)
No, in relational data it does not do this, it simply remains as 4 tildes. -- Purodha Blissenbach 11:37, 23 August 2006 (CEST)

"Variant Spelling" Flag

Following up on International_Beer_Parlour#Variant_spellings, I'd like to suggest that if an expression is entered as a synonym with 'Identical meaning' checked, then a 2nd flag becomes available which, when checked, puts both expressions in an equivalence class sharing most properties but their spelling and a class of attributes specifically related to classifying/grouping spelling variants. If shared properties are already there, they need to be merged. If this cannot be handled safely by software (e.g. due to omissions or contradictions in preexisting data), the user must be prompted to solve them, or the flag cannot be set. The availability of options in this field might be subject to user preferences or privileges. -- Purodha Blissenbach 12:29, 8 August 2006 (CEST)
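The merge step with its safety check might look like the following Python sketch. The property names and the function `merge_properties` are made up for illustration; the point is only that contradictions are reported rather than silently resolved.

```python
def merge_properties(first, second):
    """Merge the shared properties of two spelling variants.

    Returns (merged, conflicts). `conflicts` maps each contradictory
    property to the pair of differing values; when it is non-empty the
    flag should not be set until a user has resolved the contradiction.
    """
    merged = dict(first)
    conflicts = {}
    for key, value in second.items():
        if key in merged and merged[key] != value:
            conflicts[key] = (merged[key], value)
        else:
            merged[key] = value
    return merged, conflicts

# Hypothetical property sets for two variant spellings.
colour = {"part of speech": "noun", "countable": "yes"}
color = {"part of speech": "noun", "plural suffix": "-s"}
merged, conflicts = merge_properties(colour, color)

clashing, problems = merge_properties({"gender": "n"}, {"gender": "m"})
```

Properties missing on one side are simply filled in from the other, which covers the "omissions" case; only true contradictions block the flag.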

Feature requests cleaned out from Insect room 11 august 2006

nice to have

  1. Special:Statistics should have a counter of the pages from Special:Allpages/GEMET:!.

Table with symbols to insert

When editing, a table with symbols to insert would be helpful. For example, see the table "Insert:" that appears in Wikipedia when you start to edit an entry. Such table is very helpful for editors switching between several languages. That looks like us! Miguel Andrade 22:28, 22 August 2006 (CEST)

Yes, that would be nice.. an example could be the way it is done on the English Wiktionary.. The missing Armenian alphabet is something Connel is about to include there .. :) GerardM 22:32, 22 August 2006 (CEST)
Caveat: The JavaScripts (indirectly) used in the Wikipedias' MediaWiki:edittools entries only work on the edit box (single html <textarea>) but not on all other data entry fields. When I quickly glanced into it so as to estimate the time needed to adapt them to work on the field currently having the focus, I noticed that this is not easily done, and might need some care so as not to break other things. WZ would need this adaptation, so as to have them available for the editing of relational data.
Done: mostly, meanwhile, by the ULS input methods. --Purodha Blissenbach (talk) 21:58, 4 February 2016 (CET)

Search for special characters

This is of minor priority.
Some special characters are not found, likely due to MediaWiki treating them specially, even though they can be part of expressions. Samples:

  • [{{fullurl:Search:[}} Expressions containing an opening square bracket]
  • [{{fullurl:Search:]}} Expressions containing a closing square bracket]
  • [{{fullurl:Search:<}} Expressions containing an opening angle bracket]
  • [{{fullurl:Search:>}} Expressions containing a closing angle bracket]

--Purodha Blissenbach 21:57, 27 September 2006 (CEST)
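For reference, this is how the four characters look when percent-encoded for use in a URL query. The Python sketch below uses the standard library's `urllib.parse.quote`; the base address and the helper `search_url` are placeholders, and whether encoding alone would make the search find these expressions is not known from this discussion.

```python
from urllib.parse import quote

# Placeholder address; not the real OmegaWiki URL.
BASE = "https://example.org/index.php"

def search_url(term):
    """Build a search URL with the term fully percent-encoded."""
    return f"{BASE}?search={quote(term, safe='')}"

# One URL per problematic character from the list above.
urls = {character: search_url(character) for character in "[]<>"}
for character, url in urls.items():
    print(character, url)
```

If the encoded URLs still return nothing, the problem would lie in how the search treats these characters rather than in how the link is written.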

Request for addition of CharInsert plugin

It would be very nice if the CharInsert plugin were added to this MediaWiki instance. This would enable us to add special characters in edit mode. Siebrand 22:31, 19 October 2006 (CEST)

see #Table with symbols to insert above. --Purodha Blissenbach 21:28, 19 November 2006 (CET)
The problem is that there is not one but MANY boxes into which the characters need to be inserted. It therefore needs quite a lot of work before it works .. Yes we want this .. who wants to do this .. ?? GerardM 22:16, 19 November 2006 (CET)

Maybe a convenient way would be to use jQuery and add this functionality with JavaScript on the browser side? --Toka 10:38, 28 May 2007 (EDT)

Done: mostly, meanwhile, by the ULS input methods. --Purodha Blissenbach (talk) 22:00, 4 February 2016 (CET)

Copying example sentences

I discussed this with Leftmost and he suggested I place it on this page. Here goes:

Currently, if you add an example sentence to an expression, it is added to that expression only, and not to its direct translations. It is possible to add translations of the whole sentence, at that expression, but while that exact same sentence might be applicable at another expression, I still have to copy it manually, effectively creating a new example sentence, instead of a link to the original example.

Take Expression:voorbeeldzin. The Dutch translation, voorbeeldzin, is annotated with an example sentence. That example sentence is translated in English. However, the English translation of voorbeeldzin, which is example sentence, does not currently have an annotation. The English translation that is at voorbeeldzin would do perfectly at example sentence, but currently there is no mechanism to quickly put it there. A way to quickly copy it would be good, a way to link it would be better.

I hope I'm making sense. László 06:37, 18 December 2006 (EST)


  • So as to make the idea of linking more concrete, we should have a look into the general possibility of using a mechanism like #redirect [[somewhere]] with relational data, as this may likely show up at other places, too.
  • While I would not mind doing it by typing, as suggested above, imho a better way of handling redirects, or forwarding links, would be by means of a " [ ] Link " checkbox which, when checked, opens a drop-down list for target selection.
  • There needs to be a specification of whether or not referential integrity is required for such links. It likely varies depending on the type of link.
- just my 2ct. --Purodha Blissenbach 09:33, 18 December 2006 (EST)

Special:Newpages

I would like to have a page like Special:Newpages for new DefinedMeanings. It could also be useful to have this for new Expressions. HenkvD 14:43, 10 January 2007 (EST)

With the new Wikimedia version of 5/2/2007 it is possible to select other namespaces. This works fine for all namespaces, except for Expressions or DefinedMeanings. HenkvD 15:52, 5 February 2007 (EST)
See T125883. --Purodha Blissenbach (talk) 22:12, 4 February 2016 (CET)

List of DM's without relations

I would like to have a possibility to list DM's without Relations. I am especially interested in DM's with neither is part of theme nor broader terms, to be able to find DM's that are currently roots or that might need linking. HenkvD 14:53, 10 January 2007 (EST)
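Such a list amounts to a set difference. A minimal Python sketch follows; the DM identifiers, the triple layout and the function name `unlinked` are hypothetical, while the two relation type names come from the request.

```python
def unlinked(dm_ids, relations, relation_types):
    """Return the DMs that have none of the given relation types.

    `relations` holds (dm_id, relation_type, target) triples, stored on the
    DM that carries the relation.
    """
    linked = {dm for dm, relation, _target in relations if relation in relation_types}
    return sorted(set(dm_ids) - linked)

# Placeholder data for illustration only.
DM_IDS = ["iron", "metal", "ship", "tanker"]
RELATIONS = [
    ("iron", "broader terms", "metal"),
    ("tanker", "is part of theme", "navigation"),
    ("ship", "related terms", "boat"),
]
candidates = unlinked(DM_IDS, RELATIONS, {"is part of theme", "broader terms"})
print(candidates)
```

The result mixes genuine roots with entries that merely lack linking, so it is a work list for editors rather than a definitive list of roots.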

RSS Feeds

I'm looking for an RSS feed from OmegaWiki to extract "new words for language XXX". It should only be words linking to their page, with none of the typing parts (like expression:) because they make the entry unfit to show in side bars (it's too long). Any idea? It would help in spreading participation by being published in local sites. It would also do good if more people checked new definitions. --Bèrto 'd Sèra 17:42, 29 January 2007 (EST)

RSS feeds do not have our priority at the moment, certainly not when we are to filter as well. There is plenty of stuff that I would rather have; inflections and conjugations come first. This does not stop someone from developing this. GerardM 03:06, 30 January 2007 (EST)
Wikimedia Foundation wikis have "recent changes" IRC channels on irc.wikimedia.org. If the same functionality is in OW, or is an extension which would fit into OW, I believe it'd be pretty easy for a programmer to filter and format the data stream in the manner wanted. Whether or not it's a good idea to do it at this place is outside my scope to tell. --Purodha Blissenbach 15:43, 31 January 2007 (EST)

Language filter for Special:Allpages

I'd like to have a selection box with the languages on Special:Allpages, so that if you select a language it only shows expressions of that language.

This would be very helpful for an overview how well-stocked the dictionary is in a certain language. It also helps to search for expressions with spelling errors etc. --Mkill 01:49, 1 February 2007 (EST)

See T124780 --Purodha Blissenbach (talk) 22:16, 4 February 2016 (CET)

Namespace for sentences

Recent updates gave us example sentences. Now something that would be very handy to have would be a namespace to access those sentences, something like:

Sentence:To be or not to be. or Sentence:To be or not to be. (18384) or Phrase:To be or not to be.

What for? First of all, it allows easier access. You could click on an example sentence and you could see all existing translations, and click edit to add new ones. Of course, there could also be same-language sentences with the same meaning.

The second handy feature would be added relations. As it stands, each sentence is only connected to one DM. But sentences consist of more than one word, so many sentences could be referenced by multiple DMs. So the above example could be connected to be (DefinedMeaning:be (6480)) and or (DefinedMeaning:ou (433943)).

Third, there could be a function to annotate example sentences, such as marking the above example as a Shakespeare quote.--Mkill 21:51, 8 February 2007 (EST)

You can already have example sentences. The Shakespeare quote is only a part of what people remember; I do not think it is the whole sentence. GerardM 06:07, 11 February 2007 (EST)
I know that example sentences are already included, I was requesting additional functionality to handle them better. --Mkill 00:51, 13 February 2007 (EST)
Making example sentences searchable might be such a task. But I think you want them as different entities, so the sentences can get reused with other words. Right? MovGP0 12:19, 28 February 2007 (EST)
Yes, exactly. --Mkill 00:57, 7 March 2007 (EST)

Auto-delete empty pages

So far, when you remove "expression1" from a DM, and there is nothing left under Expression:expression1, the page still exists. It would be great if the database had a self-cleaning function that removes the page of a deleted expression the same way it creates a page when the expression is added. --Mkill 05:04, 11 February 2007 (EST)
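The requested clean-up could follow this pattern, shown as a Python sketch. The data structures and the function `remove_expression` are assumptions; languages are ignored for brevity, and a real implementation would have to check per language.

```python
def remove_expression(dm_expressions, pages, dm_id, expression):
    """Remove an expression from one DM and drop its page if it became empty.

    `dm_expressions` maps a DM id to the set of expressions attached to it;
    `pages` is the set of existing page titles. Returns True when the page
    was deleted.
    """
    dm_expressions[dm_id].discard(expression)
    still_used = any(expression in attached for attached in dm_expressions.values())
    if not still_used:
        pages.discard(f"Expression:{expression}")
    return not still_used

# Placeholder data for illustration only.
dm_expressions = {
    "DM-1": {"expression1", "expression2"},
    "DM-2": {"expression2"},
}
pages = {"Expression:expression1", "Expression:expression2"}

deleted_first = remove_expression(dm_expressions, pages, "DM-1", "expression1")
deleted_second = remove_expression(dm_expressions, pages, "DM-1", "expression2")
```

The second call leaves the page in place because "expression2" is still attached to another DM.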

New relation: opposite[edit]

I would like to request a new kind of relation between DMs: opposite. That way, "equality" and "inequality", "high" and "low", "give" and "take" etc. could be set in relation. It would increase the dictionary's usefulness as a thesaurus a great deal. --Mkill 05:07, 11 February 2007 (EST)

Why not call it "antonym of"? (unsigned)
An antagonism was another choice of wording. What an antonym is depends in part on context, or domain of speech, and "antonym" is not necessarily identical to "opposite" - think e.g. of width, depth, height when talking of measurements of three-dimensional objects; or high and low land versus high and flat mountains, etc. - So to remedy those interrelations better, which may even lead to more DefinedMeanings when done, it might be wise to introduce those relation types together with domain-of-speech functionality. --Purodha Blissenbach 09:46, 11 February 2007 (EST)

The Oxford Thesaurus uses "opposite", and I think that's an easily understandable term. Well, a rose is a rose, so call it whatever as long as we get the functionality :) --Mkill 00:52, 13 February 2007 (EST)

I thought that the opposite of high is deep!
Therefore we need to clarify the question "what is an opposite?" before starting to do any implementation. MovGP0 12:27, 28 February 2007 (EST)
No problem here. You can easily add two opposites to one DM.
high <-> low is the opposite pair when referring to land height and everything else that does not go below zero, such as prices.
high <-> deep is the opposite pair when referring to something that can go below zero, i.e. altitude when including the oceans.
Also, don't forget high <-> sober. --Mkill 22:53, 1 March 2007 (EST)

All we need to do for implementation is add the existing DefinedMeaning:antonym (7574) to the list of selectable relations. Please? --Mkill 23:10, 4 March 2007 (EST)

diff function[edit]

Now that we have a working history, it's time for step two: A diff function for edits in Expression. If you go to Special:Recentchanges and click on any (diff) in an Expression you'll notice we still need this. --Mkill 04:29, 16 February 2007 (EST)

I think the history function should be changed into a list of edits, like on standard wikimedia. The diff function could then be like the current history, but should be enhanced to be able to select the difference between two versions. A further enhancement to the logging could be that all changes are logged in the user's contributions. Currently, only creations of Expressions or DefinedMeanings are logged. HenkvD 13:54, 16 February 2007 (EST)
Actually, I like the current format. What the developers could include is a list of changes in chronological order as an additional page for the history. --Mkill 13:03, 18 February 2007 (EST)
I would like to keep the current history too, but maybe under a different name. I wrote: The diff function could then be like the current history. HenkvD 13:39, 18 February 2007 (EST)

When speaking about the history, I think it's useful to group edits done by the same user. Therefore the history should not show something like this:

# 17:27, 28. Feb. 2007 MovGP0
# 17:19, 28. Feb. 2007 MovGP0
# 17:08, 28. Feb. 2007 MovGP0
# 17:03, 28. Feb. 2007 MovGP0
# 16:48, 28. Feb. 2007 MovGP0
# 19:40, 26. Feb. 2007 HenkvD
# 15:48, 26. Feb. 2007 Mkill

but something like this:

# 17:27, 28. Feb. 2007 MovGP0 (5x) [show]
# 19:40, 26. Feb. 2007 HenkvD
# 15:48, 26. Feb. 2007 Mkill

MovGP0 12:36, 28 February 2007 (EST)
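The grouping suggested above could be sketched roughly as follows. This is only an illustration in Python, not OmegaWiki code: the `history` list and the `collapse_history` helper are hypothetical names, and the entries are copied from the example list in this section.

```python
from itertools import groupby

# Hypothetical history entries (timestamp, user), newest first,
# copied from the example list above.
history = [
    ("17:27, 28. Feb. 2007", "MovGP0"),
    ("17:19, 28. Feb. 2007", "MovGP0"),
    ("17:08, 28. Feb. 2007", "MovGP0"),
    ("17:03, 28. Feb. 2007", "MovGP0"),
    ("16:48, 28. Feb. 2007", "MovGP0"),
    ("19:40, 26. Feb. 2007", "HenkvD"),
    ("15:48, 26. Feb. 2007", "Mkill"),
]


def collapse_history(entries):
    """Collapse runs of consecutive edits by the same user into one row."""
    rows = []
    # groupby only merges *adjacent* entries, so edits by the same user
    # that are interrupted by someone else stay separate rows.
    for user, run in groupby(entries, key=lambda entry: entry[1]):
        run = list(run)
        label = f"{run[0][0]} {user}"  # show the newest timestamp of the run
        if len(run) > 1:
            label += f" ({len(run)}x) [show]"
        rows.append(label)
    return rows


for row in collapse_history(history):
    print(row)
```

Only adjacent edits are merged, so the chronological order of the history is preserved.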

Alternative expressions[edit]

For a number of languages, there are alternative ways to display a single translation (syntrans) in text. Some of these are limited to same-language dictionary entries, but they still are valid Expressions. In the current system there is no satisfactory way to add this information and I would like to propose to add this functionality.

This discussion started from the perspective of Japanese. In Japanese there are often multiple ways to write the same syntrans although they share the same DefinedMeaning, using hiragana, katakana, and different kanji. In addition there is a need to add a universal reading using hiragana, since it is impossible to be certain of the reading for an Expression by looking at the kanji. This problem is not one of mere transcription, since even native speakers need this functionality.

However, the functionality is not limited to Japanese. A similar problem is found in Korean and I believe this solution can also be of use in other circumstances, cf. benefits.

Current usage[edit]

In the current system there are no ways to relate different expressions for the same DefinedMeaning. There are two solutions used so far, but both have serious shortcomings.

  1. Adding the connected expressions as synonyms (e.g. DefinedMeaning:bathroom (5916)). Problems:
    • No way to know what hiragana expression goes with what kanji expression where there are additional synonyms for the same DefinedMeaning.
    • No way to know if a hiragana expression is an acceptable translation by itself or just there for purpose of searching.
  2. Adding the reading as a separate DefinedMeaning with the hiragana as an expression, and link back to the kanji expression through relations (e.g. DefinedMeaning:ちっそ_(456162)). Problem:

Since users are already adding questionable data to OmegaWiki, we need to deal with these questions before the problem explodes in our face.

Proposal[edit]

There is no such thing as an annotation of a translated_content type. There is a separate table called SynTrans; in it you find both the translations and synonyms. This negates the whole notion that is being proposed; it has to fit in the existing database, which is different from how you think it is. GerardM 05:35, 23 February 2007 (EST)

Then please explain to me of what type the annotation in the table uw_translated_content_attribute_values is. It looks like translated_content to me. The following SQL:

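-- Example sentences attached to the syntranses of expression 357612
SELECT st.syntrans_sid,
       ex.spelling AS expression,
       ev.spelling AS annotation,
       tc.translated_content_id,
       tx.old_text
FROM   uw_syntrans st
       -- spelling of the annotated expression
       JOIN uw_expression_ns ex ON st.expression_id = ex.expression_id
       -- annotation rows pointing at the syntrans
       JOIN uw_translated_content_attribute_values av ON st.syntrans_sid = av.object_id
       -- annotation value and its stored text
       JOIN translated_content tc ON av.value_tcid = tc.translated_content_id
       JOIN text tx ON tc.text_id = tx.old_id
       -- name of the attribute (itself a DM with an expression)
       JOIN uw_defined_meaning dm ON av.attribute_mid = dm.defined_meaning_id
       JOIN uw_expression_ns ev ON dm.expression_id = ev.expression_id
WHERE  st.expression_id = 357612;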

Gives this result set:

         syntrans_sid: 357613
           expression: Belgrado
           annotation: example sentence
translated_content_id: 402592
             old_text: The flight to Belgrade has unfortunately been delayed because of freezing fog.
1 row in set (0.04 sec)

As far as I can tell this is an annotation with an attribute value of the type translated_content that is stored in the table text. The object_id links this annotation to the SynTrans Belgrado. Accordingly I propose a similar table structure where a new annotation table uw_expression_attribute_values links SynTranses to annotations of type Expression. I would very much appreciate it if GerardM or some developer could explain in more detail where I am misunderstanding things. Regards, Gon-no-suke 21:50, 25 February 2007 (EST)

Benefits[edit]

  • Using Expressions for alternative orthography will allow for the search function to work as before without changes.
  • You can add multiple alternative expressions.
  • You avoid showing invalid expressions in the list of translations.
  • You are not limited to showing "transcriptions" of a word.

Here are some usage cases for different languages. Some of these go beyond the proposal, but they could easily be accommodated afterwards.

  • Hiragana readings for Japanese kanji words.
  • Alternative kanji representations of Japanese words.
  • Hanja versions of sino-Korean words.
  • Pinyin or Bopomofo transliterations for Chinese.
  • Linking traditional and simplified Mandarin expression together.
  • Linking Latin and Cyrillic orthographies for Serbian together.
  • Non-standard orthographies for Kölsch.
  • Chu nho and chu nôm versions of Vietnamese expressions.
  • Actual readings for modern Tibetan compared to classic Tibetan?
  • Inflections/conjugations??

The alternative expressions could eventually be marked with options to classify them as historical, non-standard, declension, etc.

Minimal implementation[edit]

CREATE TABLE uw_expression_attribute_values (
    value_id int NOT NULL,
    object_id int NOT NULL,
    attribute_mid int NOT NULL,
    expression_id int NOT NULL,
    add_transaction_id int NOT NULL,
    remove_transaction_id int
);

And of course some tweaking of the interface. In the initial version attribute Expressions could automatically get the same language as the annotated syntrans.

Gon-no-suke 02:50, 23 February 2007 (EST)
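To make the proposal easier to discuss, here is a small sketch of how the extra table might be used. It runs against an in-memory SQLite database from Python. The two helper tables are heavily simplified stand-ins for uw_expression_ns and uw_syntrans, all IDs are invented, and the function names are hypothetical; only the layout of uw_expression_attribute_values follows the proposal above.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
-- Simplified stand-ins for the existing tables (assumption, not the real schema)
CREATE TABLE uw_expression_ns (
    expression_id INTEGER PRIMARY KEY,
    spelling      TEXT NOT NULL
);
CREATE TABLE uw_syntrans (
    syntrans_sid       INTEGER PRIMARY KEY,
    defined_meaning_id INTEGER NOT NULL,
    expression_id      INTEGER NOT NULL
);
-- The proposed table: links a syntrans to an alternative expression
CREATE TABLE uw_expression_attribute_values (
    value_id              INTEGER PRIMARY KEY,
    object_id             INTEGER NOT NULL,  -- the annotated syntrans
    attribute_mid         INTEGER NOT NULL,  -- DM naming the kind of annotation
    expression_id         INTEGER NOT NULL,  -- the alternative expression
    add_transaction_id    INTEGER NOT NULL,
    remove_transaction_id INTEGER
);
-- Invented sample data: one syntrans with two alternative expressions
INSERT INTO uw_expression_ns VALUES (1, '寿司'), (2, 'すし'), (3, '鮨');
INSERT INTO uw_syntrans VALUES (100, 5000, 1);
INSERT INTO uw_expression_attribute_values VALUES
    (1, 100, 9001, 2, 1, NULL),
    (2, 100, 9002, 3, 1, NULL);
""")


def translations(conn, defined_meaning_id):
    """Spellings listed as synonyms/translations of a DM (syntrans table only)."""
    rows = conn.execute(
        """SELECT ex.spelling
           FROM uw_syntrans st
           JOIN uw_expression_ns ex ON ex.expression_id = st.expression_id
           WHERE st.defined_meaning_id = ?
           ORDER BY st.syntrans_sid""",
        (defined_meaning_id,),
    ).fetchall()
    return [row[0] for row in rows]


def alternative_expressions(conn, syntrans_sid):
    """Alternative expressions attached to one syntrans and not yet removed."""
    rows = conn.execute(
        """SELECT alt.spelling
           FROM uw_expression_attribute_values av
           JOIN uw_expression_ns alt ON alt.expression_id = av.expression_id
           WHERE av.object_id = ? AND av.remove_transaction_id IS NULL
           ORDER BY av.value_id""",
        (syntrans_sid,),
    ).fetchall()
    return [row[0] for row in rows]


print(translations(conn, 5000))            # only the syntrans itself
print(alternative_expressions(conn, 100))  # its alternative writings
```

Because the alternatives are ordinary rows in the expression table, a search by spelling would still find them, while the list of translations for the DM stays limited to real syntranses.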

I second this request. Thank you Gon-no-suke! --Mkill 04:44, 23 February 2007 (EST)
This proposal cannot be implemented in this way; the way it is proposed is incompatible with the database. GerardM 05:36, 23 February 2007 (EST)
Then please explain why. I added my understanding of the database schema above. Gon-no-suke 03:54, 26 February 2007 (EST)
The idea is good, but the implementation has to be done in another way. I see it in this way:
  • Words are Expressions
  • Phrases are Expressions
  • Sentences are Expressions
  • IPA, Kanji, Pinyin, etc. are Expressions
  • Description of a defined Meaning is an Expression
But even though they are all Expressions, they need different handling. Providing IPA is easier, because this is about adding an additional tablerow in the database and an additional field in the input form, because you can use IPA with any language. But e.g. Pinyin is specifically for Chinese (and Japanese?), so it doesn't make sense to provide a special tablerow, but instead a special table for the annotation. Therefore a table like:
              Table:Speech
+--------------+--------+--------------+
| ExpressionID | Type   | AnnotationID |
+--------------+--------+--------------+
| 耳           | IPA    | əɻ           |
| 耳           | Zhuyin | ㄦ           |
| 耳           | Pinyin | er           |
...
could work. In this example the Annotations are still a kind of Expression, but stored in a separate table. The advantage is that this solution is more flexible and doesn't affect the current database design.
The search also has to be extended for this. If an Expression is not found in the Expressions table, then the Annotation table will be searched too. To improve speed, the search could skip the Annotation table when there is a perfect match within the Expressions, unless advanced search is used. MovGP0 13:28, 28 February 2007 (EST)
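The two-stage search described here could look roughly like the following sketch. The data structures are plain Python stand-ins for the Expressions table and the suggested Speech/Annotation table, and the `search` function with its `advanced` flag is a hypothetical name, not existing OmegaWiki functionality.

```python
# Stand-in for the Expressions table
expressions = ["耳"]

# Stand-in for the suggested annotation table: (expression, type, annotation)
annotations = [
    ("耳", "IPA", "əɻ"),
    ("耳", "Zhuyin", "ㄦ"),
    ("耳", "Pinyin", "er"),
]


def search(term, expressions, annotations, advanced=False):
    """Look in the expressions first; fall back to annotations if needed."""
    hits = [expression for expression in expressions if expression == term]
    # Skip the annotation lookup on a perfect match, unless advanced
    # search was requested.
    if not hits or advanced:
        for expression, _kind, value in annotations:
            if value == term and expression not in hits:
                hits.append(expression)
    return hits


print(search("耳", expressions, annotations))  # found directly
print(search("er", expressions, annotations))  # found via the Pinyin annotation
```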

@MovGP0: Please read my solution once more. I propose an extra table linking an expression (actually a SynTrans; object_id) with its alternative representation (another expression; expression_id) in a similar way to your table above. The difference is that the AnnotationID is another ExpressionID, so that the annotations are kept in the same table as the "normal" expressions. A lot of them are normal expressions, so I think this makes sense. They are not kept in a new field, they are separate entries in the table. Another difference is that I haven't added any type information, but this could be possible in a similar way to the POS tags used now. As you say, the handling has to be different, and it will be, since alternative expressions are linked through a different table than the SynTranses, and can thus be handled differently when displaying. As they are not listed in the SynTrans table they will not be shown as direct synonyms/translations of the defined meaning. This proposal will also not affect the current database design, it only adds one table.

As a side note, your view of OmegaWiki is not in sync with the database.

  • Sentences are Expressions
  • Description of a defined Meaning is an Expression

These are not handled as expressions in the database, they are stored in a free-text table (the same as the Wiki pages) separate from the words and phrases. Gon-no-suke 18:15, 28 February 2007 (EST)

Directly linking DMs / entering DMs in the address bar[edit]

So far, a link like DefinedMeaning:chocolate (without the correct number) creates a database error. What I was thinking about is a way to redirect these to a useful result.

It would be great if such a link would instead open a page with a search result of "chocolate", "English:A rich, sweet foodstuff...", to lead the user where he wanted to go.

The page could also include a link "create a new DM for 'chocolate'". --Mkill 12:24, 25 February 2007 (EST)

The proposed link does not allow for homonyms. The DM will always include a number. It is the text part that is not needed. GerardM 01:13, 26 February 2007 (EST)
I see. Well, it would also be handy if links like DefinedMeaning:100 would work. Bots might use that a lot. --Mkill 10:48, 26 February 2007 (EST)
DefinedMeaning:(100) does already work, although the link is red and in the DM it looks red too. HenkvD 14:40, 26 February 2007 (EST)
Maybe the software could auto-redirect DefinedMeaning:(100) to DefinedMeaning:agriculture framework plan (100) ? --Mkill 23:12, 28 February 2007 (EST)
Done - redirecting works by now. --Purodha Blissenbach (talk) 00:27, 5 February 2016 (CET)
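For illustration only, the title handling discussed in this thread can be sketched as below. This is not how the MediaWiki extension resolves titles; the `resolve` function and its return values are hypothetical. It only shows the idea that the number identifies the DM, while a title without a number can fall back to a search.

```python
import re

# "DefinedMeaning:<text> (<id>)" or "DefinedMeaning:(<id>)"; the text part is optional
_WITH_ID = re.compile(r"^DefinedMeaning:(.*?)[ _]*\((\d+)\)$")
# "DefinedMeaning:<id>" with a bare number
_BARE_ID = re.compile(r"^DefinedMeaning:(\d+)$")
_PREFIX = "DefinedMeaning:"


def resolve(title):
    """Return ("dm", id) if the title carries a number, else ("search", text)."""
    match = _WITH_ID.match(title)
    if match:
        return ("dm", int(match.group(2)))
    match = _BARE_ID.match(title)
    if match:
        return ("dm", int(match.group(1)))
    # No number: homonyms make the text ambiguous, so offer a search instead.
    text = title[len(_PREFIX):] if title.startswith(_PREFIX) else title
    return ("search", text.replace("_", " ").strip())


print(resolve("DefinedMeaning:chocolate"))
print(resolve("DefinedMeaning:(100)"))
print(resolve("DefinedMeaning:agriculture framework plan (100)"))
```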

Stylistic level[edit]

To start a discussion on how to add stylistic level / level of speech functionality, I created stylistic level. Please have a look. --Mkill 00:58, 7 March 2007 (EST)

Definitions needing translation Special page[edit]

It would be absolutely great if Special:NeedsTranslation would get a sister page that allows you to search for missing definitions in a language instead of missing translations. --Mkill 09:18, 10 March 2007 (EST)

Language quick-pick[edit]

It would be great if you could select three or four languages that the language drop-down box displays before you enter anything into the edit field. This would speed up editing a great deal. After all, most editors contribute in only one, two, maybe three languages. The languages shown in the box should be selectable in the user preferences. --Mkill 09:18, 10 March 2007 (EST)

"Advanced" implementation: known languages in preferences[edit]

I've been thinking about this again, and I've come up with a better implementation:

If users could select their language abilities in the preferences, we could add this and even more features:

  • automatic creation of babel boxes
  • preselected languages in edit fields
  • expressions and definitions in known languages are displayed at the top
  • In word search fields, such as when adding relations to DMs, it currently displays the definition either in the interface language, or, when that one does not exist, in the topmost existing language. Instead, the software could select an alternative known language. --Mkill 01:36, 13 March 2007 (EDT)
This has my full support! It would be so much easier to see the Expression and Meaning in both the languages that I use most; it would really start to make OW actually useful! Sergio.ballestrero 17:15, 9 June 2007 (EDT)

Done - already for a while now. --Purodha Blissenbach (talk) 00:35, 5 February 2016 (CET)
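The fallback order suggested in the last bullet point (interface language first, then the user's known languages, then whatever exists) can be written down as a short sketch. The function name, the language codes and the sample definitions are invented for the example and do not reflect the actual implementation.

```python
def pick_definition(definitions, interface_language, known_languages):
    """Pick the definition to display as a (language, text) pair.

    definitions maps language codes to definition texts, in page order.
    """
    if not definitions:
        return None
    # Prefer the interface language, then the languages the user knows.
    for language in [interface_language, *known_languages]:
        if language in definitions:
            return language, definitions[language]
    # Last resort: the topmost existing definition.
    language = next(iter(definitions))
    return language, definitions[language]


# Invented sample: the DM has no English definition.
definitions = {"fra": "Aliment sucré à base de cacao.", "deu": "Süßware aus Kakao."}
print(pick_definition(definitions, "eng", ["deu", "jpn"]))
```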

"core expression" tag[edit]

It would be great if OmegaWiki had a flag to mark one expression as "best fitting the definition" in one particular language.

This flag could be automatically set for the first expression that is added in a foreign language, but it should be possible to change the flag to a different expression in the same language, kind of like a radio button.

What do we need this for?

First of all, OmegaWiki depends on certain DMs to translate part of its user interface. To prevent the software from picking a term at random, we can let it choose the one with the "core expression" flag.

OmegaWiki has no practical limit on how many same-language synonyms it can collect under one DM. For some uses, this is an advantage, for example when you use it as a single-language thesaurus. When you write a text in a foreign language, and the word that comes to your mind does not perfectly fit what you want to express, you need synonyms.

For other uses, a large list of synonyms is a disadvantage. When you are just a beginner in a foreign language, you don't want a long list of terms that contains slang, poetic and outdated expressions. All you want is the best-fitting and most widely applicable word for what you want to say. Identifying a core meaning helps here, too.

The core expression flag would also help in cases where the definition is only given in an obscure language, but there is a long list of expressions. --Mkill 01:00, 13 March 2007 (EDT)

Absolutely agree. We can call it "most-used". --Synagonism 12:22, 30 December 2011 (UTC)
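The radio-button behaviour of such a flag (exactly one core expression per DM and language, defaulting to the first one added) could be modelled like this. The class and its method names are made up for illustration, and the DM number in the usage lines is invented.

```python
class CoreExpressions:
    """Tracks one "core expression" per (DM, language) pair."""

    def __init__(self):
        self._expressions = {}  # (dm_id, language) -> list of spellings
        self._core = {}         # (dm_id, language) -> flagged spelling

    def add(self, dm_id, language, spelling):
        key = (dm_id, language)
        self._expressions.setdefault(key, []).append(spelling)
        # The first expression added in a language gets the flag automatically.
        self._core.setdefault(key, spelling)

    def set_core(self, dm_id, language, spelling):
        key = (dm_id, language)
        if spelling not in self._expressions.get(key, []):
            raise ValueError(f"{spelling!r} is not an expression of this DM")
        # Overwriting the single slot moves the flag, like a radio button.
        self._core[key] = spelling

    def core(self, dm_id, language):
        return self._core.get((dm_id, language))


flags = CoreExpressions()
flags.add(1234, "eng", "automobile")
flags.add(1234, "eng", "car")
flags.set_core(1234, "eng", "car")
print(flags.core(1234, "eng"))
```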

A new look at the Japanese kanji/reading issue[edit]

I use a digital Japanese dictionary (Canon Wordtank G55) every day and I was wondering how that one handles Japanese in its database, especially the Super Daijirin. The Daijirin is a good model because it is a Japanese-Japanese dictionary, i.e. it links Japanese expressions with definitions.

Using Hiragana as base

What I found is that entries are based on their Hiragana readings. That makes sense: In the spoken language, if you say すし (sushi) it always has the same meaning, regardless of how you represent the word in writing (寿司、鮨、酸し、寿し). On the other hand, different readings for the same Kanji often mark a difference in meaning, i.e. different DMs.

The different ways to represent a word in writing (different kanji, hiragana, katakana, or romaji) are tied to this hiragana expression+definition entry (or syntrans in OmegaWiki terms).

Advantages of the hiragana-based approach:

  • expressions can easily be sorted using the Gojūon
  • Only one syntrans entry represents all ways to write a Japanese word/expression
  • Hiragana can be auto-converted to Katakana, Kunrei-shiki (a romanization) and the Hepburn-system (another romanization)

If we follow this structure for OmegaWiki, we need some basic attribute annotation logic, as we need a special syntrans annotation that is only used for Japanese.

Basically, it's a variant of the string annotation we already have for syntranses. For the sushi example, 「すし」 in Hiragana would be the syntrans, and 寿司、鮨、酸し、寿し would be added as Annotation->string properties->Property "Kanji"->Text "寿司".

Where do we need advanced functionality?

Selecting the displayed expression

A word could be written in different kanji, with hiragana, katakana, romaji, or any combination, but there is always one most common way to write it. So, we need a second annotation that specifies how the syntrans is represented in the interface.

Option properties->displayed term->hiragana / katakana / kanji1 / kanji2 ...

Display

When opening an Expression or DM with a Japanese syntrans, the entry should look like "[selected display term] ([Hiragana reading])", i.e. "寿司 (すし)". Advanced functionality would be that the Hiragana reading is replaced by a transliteration selected by the user/the interface language. I.e. somebody who can't read Hiragana could opt to have Japanese terms transliterated using the Hepburn system.

In Special:Allpages, Japanese entries should be sorted by the hiragana gojuon table, but the term displayed on the page should be the selected most common one.

Access

First of all, the Japanese syntrans should appear, no matter which of the different expressions you enter in the address bar. "Sushi" should appear under Expression:すし, Expression:寿司, Expression:鮨, Expression:酸し, Expression:寿し, even though the latter terms are not syntrans entries on their own, but annotation text.

Auto-conversion

We need a function to auto-convert Hiragana to Katakana (for terms that are normally written in Katakana), and to Kunrei-shiki and the Hepburn system.
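The Hiragana-to-Katakana part is straightforward, because the two Unicode blocks are laid out in parallel: each katakana letter sits 0x60 code points after its hiragana counterpart. A minimal sketch (the function name is made up; romanization to Kunrei-shiki or Hepburn needs real rules and is not attempted here):

```python
HIRAGANA_FIRST = "\u3041"  # ぁ
HIRAGANA_LAST = "\u3096"   # ゖ
KANA_OFFSET = 0x60         # distance between the hiragana and katakana blocks


def hiragana_to_katakana(text):
    """Replace every hiragana letter by the matching katakana letter."""
    return "".join(
        chr(ord(ch) + KANA_OFFSET) if HIRAGANA_FIRST <= ch <= HIRAGANA_LAST else ch
        for ch in text
    )


print(hiragana_to_katakana("すし"))
print(hiragana_to_katakana("いそきかく"))
```

Characters outside the hiragana range (kanji, Latin letters, punctuation) pass through unchanged.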

Romaji terms

A few Japanese expressions include capital letters of the Latin script, such as 「ISO規格」(ISO standard). The easiest solution is to allow Latin letters in Japanese expressions. These would be treated like Kanji and other ways of writing. ISO規格 would be entered as "いそきかく" (isokikaku). --Mkill 01:13, 18 March 2007 (EDT)

What can be done technically and how[edit]

First of all any string that you want to be able to search for is an Expression. You want to be able to search for ANY string in any script. This is basic functionality that we do not want to compromise. It means that not having 寿司 as a Syntrans is not a good idea because this is where the connection to a DefinedMeaning happens.

Having functionality that allows for connecting the Syntrans records in different scripts for Japanese needs programming. The first language-based functionality we have is our part-of-speech implementation. This will need to be further developed to satisfy you. One of the things required is to recognise the script. This is doable given that we use UTF-8.

What comes directly to mind is to link Expressions that differ in script together in a relation kind of way. The big challenge is that relations link DefinedMeanings, here we want to link Expressions. This takes some serious thinking and development.

I do not understand why annotation text should be findable as an Expression; it is either an Expression or it is not. Annotations are not findable in their own right. There may be some query options for certain attributes, but that is for later. GerardM 03:28, 18 March 2007 (EDT)
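As a rough idea of what "recognising the script" could mean in practice, the sketch below classifies the characters of a string by their Unicode character names, using Python's standard unicodedata module. The function name and the four script labels are assumptions made for the example; a real implementation would probably rely on proper Unicode script properties.

```python
import unicodedata


def scripts_in(text):
    """Return the set of scripts (of the four handled here) used in text."""
    found = set()
    for ch in text:
        name = unicodedata.name(ch, "")  # "" for unnamed characters
        if name.startswith("HIRAGANA"):
            found.add("hiragana")
        elif name.startswith("KATAKANA"):
            found.add("katakana")
        elif name.startswith("CJK UNIFIED IDEOGRAPH"):
            found.add("kanji")
        elif name.startswith("LATIN"):
            found.add("latin")
    return found


for spelling in ["すし", "寿司", "酸し", "ISO規格"]:
    print(spelling, sorted(scripts_in(spelling)))
```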

Etymology[edit]

Part of speech has been added as an annotation, which is quite sensible at first sight, but it would be much better if we could consider word families. Knowing that acquérir is a verb is nice already, but better yet is if it can tell you that acquérir is the verb from the noun acquisition (or acquisition and acquis are nouns based on the verb acquérir) or that acquis is also the adjective derived from the same verb.

For this, we can add proper relations between words of the same family. Or, if we had etymology, looking at the stem would give us, thanks to relations (based-on-stem? verb-based-on-stem?) or to "what links here", all words based on a stem. This is somewhat equivalent to the relations mentioned above, but better yet: you don't have to enter the information n times (though you don't necessarily know what part of speech it is, depending on how it is implemented). Eden 08:00, 15 April 2007 (EDT)

Collection statistics don't include non-English translated DMs[edit]

Currently, DMs that are not translated to English don't appear in the missing translations for any language, in the collections statistics and missing translations list. It would be nice if they appeared instead of having to look for the missing ones trying to guess what languages they are already translated to. (Example: DefinedMeaning:fleuriste_(447049) is part of OLPC collection and it's not listed in any language as a missing translation -- because it's not translated to English) Malafaya 00:18, 9 June 2007 (EDT)

Hoi, they do not. In a way it is a shame, however this functionality was created by a volunteer effort and it is much better than what we had before.. nothing. When someone is willing to take up this code, and make it better, then it would be cool. GerardM 06:51, 9 June 2007 (EDT)

It sure is much better than the nothing we had before. No doubt about it! :) And many thanks to Zdenek Broz for the job! I was wondering how I could find the 12 DMs whose Portuguese translation is missing for OLPC though. Malafaya 11:15, 9 June 2007 (EDT)

Corrected! ;) Malafaya 11:26, 21 June 2007 (EDT)

Function tabs[edit]

A + tab to add a new section on this page, even though it is not a talk/discussion page.
A "SHOW SOURCE" tab next to the "EDIT" tab. Why? Because I often like to use samples of code while editing another page (standard wiki-page). It lessens the risk of a mistaken edit with multiple windows/tabs open.
I would add these myself (from W:User Scripts) but I'm afraid of breaking my MonoBook.js --Chief Mike 06:49, 9 July 2007 (EDT)

Some exports[edit]

I would like to be able to export a list of all the pages I've edited -- with the comment I made at that time.

I would like to be able to export all the words in Khmer, with the English definition and the Khmer definition if there is one. Well, that would not be difficult now, since Khmer has relatively few words entered in OmegaWiki. I guess there'd need to be some way to keep this practical if there were 25,000 words.

I'd like to be able to keep a list of words (or DM's, or Expressions) that would be a collection of Rsperberg words. And then I could (A) export my list or (B) restrict my searches to words in my list.

For that matter, I'd like to be able to get the export in XML or in PDF (that is, a format immediately suitable for printing). -- Rsperberg 18:55, 22 February 2008 (EST)

Wikipedia article link should be language sensitive[edit]

The Wikipedia article link (present in annotations) should be language sensitive. See DefinedMeaning talk:Wikipedia article (740663) --Purodha Blissenbach 07:08, 8 April 2008 (EDT)

I find it better as it is now. --Kipcool 11:23, 27 August 2009 (UTC)

Khmer numerals[edit]

I see at DefinedMeaning:hundred (6685) that the word "hundred" has annotations to Arabic numerals (100) and Roman numerals (C).

Khmer is a language that does not use Arabic numerals but instead its own: ០ ១ ២ ៣ ៤ ៥ ៦ ៧ ៨ ៩ (i.e., 0 1 2 3 4 5 6 7 8 9). (Unless you have a Khmer font installed, the Khmer digits will probably not display. They are the ten consecutive glyphs in the Unicode range U+17E0 through U+17E9.)

It would be useful if Khmer numerals could be added as a property under Annotation - Plain Text, as Arabic and Roman numerals are. Rsperberg 10:53, 7 June 2008 (EDT)
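Since the ten Khmer digits are consecutive code points starting at U+17E0, such an annotation value could even be generated from the Arabic-numeral one. A small Python sketch, with a made-up function name:

```python
KHMER_ZERO = 0x17E0  # code point of the Khmer digit zero

# Translation table: "0".."9" -> the ten Khmer digits
_TO_KHMER = {ord("0") + i: chr(KHMER_ZERO + i) for i in range(10)}


def to_khmer_digits(text):
    """Replace the ASCII digits in text by Khmer digits."""
    return text.translate(_TO_KHMER)


hundred = to_khmer_digits("100")
print(hundred)
# Python's int() accepts any Unicode decimal digits, so the way back is built in.
print(int(hundred))
```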

Added. :) GerardM 18:54, 7 June 2008 (EDT)
Thanks! Rsperberg 14:38, 8 June 2008 (EDT)

Add 'usage' to annotation[edit]

Khmer has two words meaning "yes." There is no difference in their meaning whatsoever. But one should be used only by a man and the other should be used only by a woman.

Similarly there are many expressions in Khmer that have the identical defined meaning -- some words are for common speech, others should only be used when addressing a monk, others when addressing royalty, others still only by royalty. So the king's chauffeur and the king use different words for "car" and the like even when conversing with each other.

For that matter, if I recall correctly, there are about 18 different terms for "you," each reserved for a particular class of person (for instance, a female relative older than you).

Lastly, there are many words whose usage is restricted to formal circumstances, and others that carry an insulting or obscene meaning, and so on.

I think the annotations for plain text ought to include "Usage" so that this information can be carried with the Khmer word. Apart from the complicated community and familial relations, there is a fairly well circumscribed set of usage instances for Khmer: formal, royal, colloquial, insulting, obscene, archaic, religious, and mystical are ones I have encountered. (I will see if I can locate a complete list used by various dictionaries.) Rsperberg 22:53, 8 June 2008 (EDT)

Regionalism[edit]

It'd be useful to mark a regional term as being in competition with the mainstream term or replacing it totally in the concerned region. For instance, in the French Midi-Pyrénées region, the term "pain au chocolat" is unknown (I think I was about 25 when I first heard "pain au chocolat", in another region, and I learnt that day that "chocolatine" is regional.), replaced by "chocolatine", while the regional word "poutou" (kiss) is used as well as the main French word "bisou", and locals are usually aware that "poutou" is local. --Fiable.biz 07:38, 13 September 2009 (UTC)

API!?[edit]

I am thinking of writing an application for KDE to ease the editing/adding process of omegawiki and to hopefully make the project known amongst KDE folks. But this would need to interface with the database and my questions are:

  1. Is there maybe already an API?
  2. Is it generally wanted to have applications interface with omegawiki?

Dh 11:03, 29 October 2009 (UTC)

Hi (again me answering :p), the short answer is "no" to question 1 and "YES!!!" to question 2 ;-)
There has been work some time ago about creating an API, but as far as I know this work was not completed. I'll ask around to know the status of the project. --Kipcool 17:13, 29 October 2009 (UTC)
Guess I've to disappoint you: I was talking about writing a C++ app for KDE. Writing an API for omegawiki does require knowledge of PHP and familiarity with the omegawiki/mediawiki code, neither of which I have. This can change, but since the codebase is quite complex and writing a stable API is not trivial, I don't know if I am up to it. Fortunately the wiki already has a kind of API which could be used: the wiki itself! I'll see. --Dh 19:13, 29 October 2009 (UTC)


Hebrew spelling with vowels[edit]

User:Drork (a very valued contributor to the Hebrew Wikipedia) tried to raise this issue in the past, and so did User:Dirk gently (see here), but it went unanswered: How should different variants of Hebrew spelling be entered here?

Hebrew has two standard spellings: with and without vowel dots, a.k.a niqqud. In most books, newspapers, websites and emails the vowel dots are not written, because a Hebrew speaker can usually guess the pronunciation without them. In professional publishing - newspapers and books - they are sometimes printed when ambiguity may arise, e.g. between דָּבָר (thing) and דֶּבֶר (plague), both spelled דבר without niqqud. They are almost never used in emails, because most people simply don't know how to type them.

They are, however, practically always used in dictionaries, because these have to provide the complete information about pronunciation. It must be noted that spelling with niqqud is not a system for professional linguists only, like IPA, but an integral part of the language.

The situation with Hebrew is rather different from the situation with Uzbek or Serbian, which can be written in Cyrillic or in Latin. To the best of my knowledge these two can be almost losslessly converted both ways. Not so Hebrew - converting from vowel-less to vocalized is practically impossible to do automatically (i get paid for developing a heuristic engine that does it with limited success) and converting from vocalized to vowelless is also problematic, because מִזּוּג can be both "from a pair" (vowel-less מזוג) and "merging" (מיזוג). So both vowel-less and vocalized versions must be entered manually for every Hebrew expression. Certain automation is possible, but limited.

Another problem is that writing correctly with niqqud is something that few people know, because it is a rather outdated and complicated system, the intricacies of which are hardly taught in schools. I, luckily, studied it in the university, but there are many people who may want to contribute here, but don't know how to write with niqqud, so it must be possible to contribute a word and mark it "niqqud needed" or something.

To make things even worse, the spelling without niqqud, though standardized by the Academy of the Hebrew Language, is not actually employed consistently by all people. There are common deviations from it, and sometimes the same person can write a word inconsistently within the same sentence. These deviations appear in most printed dictionaries as references to the standard spelling, for example from תוכנית to תכנית, from להיפך to להפך, from אוניה to אנייה etc. They must also appear here, either as redirections, or better, as what i call "qualified redirections", one which is marked as "a nonstandard spelling for X" (as in Shoshanna Bahat's dictionary).

To sum things up, a structured solution for these things is much-needed. If anyone can point me to the relevant code and DB tables, i can try to think of something.

A comment about other languages: It is not quite relevant for Yiddish, which uses the same script as Hebrew, but doesn't have the vowel duplicity found in Hebrew. Unfortunately i know very little Arabic, but the situation there is probably very similar to that in Hebrew, although the Standard Arabic harakat are not as outdated as the Hebrew niqqud and writing with them is probably easier (the situation in modern colloquial variants of Arabic is probably more complicated, but still comparable). I'm not sure about the situation with languages like Urdu, Pashto and Persian, which use almost the same script as Arabic, but have a completely different phonetic and grammatical structure. --Amir E. Aharoni 10:57, 5 April 2010 (UTC)
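A minimal sketch of why the vocalized-to-vowel-less direction is lossy, assuming the only automatic step available is deleting the combining points. The function name `strip_niqqud` and the code point range are assumptions made for illustration, not anything taken from the OmegaWiki code:

```python
import unicodedata


def strip_niqqud(text: str) -> str:
    """Remove Hebrew combining marks (points and cantillation) from text.

    Only nonspacing marks (category Mn) inside U+0591..U+05C7 are dropped,
    so letters and punctuation such as maqaf are kept.
    """
    return "".join(
        ch
        for ch in text
        if not (0x0591 <= ord(ch) <= 0x05C7 and unicodedata.category(ch) == "Mn")
    )


# mem + hiriq, zayin + dagesh, vav + dagesh, gimel (the vocalized form quoted above)
vocalized = "\u05de\u05b4\u05d6\u05bc\u05d5\u05bc\u05d2"
print(strip_niqqud(vocalized))  # prints the bare skeleton מזוג
```

Stripping always yields the skeleton מזוג and can never produce the fuller spelling מיזוג, so it cannot tell which of the two readings was meant. That is the reason both forms would still have to be entered by hand.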

Well well... ;-)
First of all, please note that I have no knowledge of Hebrew...
I see several solutions at the moment:
  1. If both forms can be found in published material, then they should be expressions at the same level, i.e. considered like synonyms. It's what we do now for example for alternative spellings. We could then add optionally several annotations to each form
    • one annotation would say "this is a form with niqqud", or "this is a form without niqqud" (it could be a combo box like "masculine/feminine")
    • one annotation could be a link to the corresponding niqqud form, or non-niqqud form. But I don't think it is useful for the Hebrew reader if the two forms are already apparent as synonyms/alternatives
  2. A variant of solution 1 is to artificially create two languages, one called "Hebrew" or "Hebrew (without niqqud)" and the other called "Hebrew (with niqqud)". For example, it is what we do for Serbian: we have "Serbian (latin)" and "Serbian (cyrillic)"
  3. Another solution is to decide that you only enter e.g. the non-niqqud form as an expression, and then have an annotation to enter the niqqud form. This is similar to what we do for Japanese, where we indicate the Hiragana reading in an annotation. However, the situation is a bit different with Japanese, because the Hiragana reading would not be written in a sentence (usually, it is written in small characters above a word).
Now you can tell me whether one of these solutions suits you. Note that they can be implemented already without any programming. If they don't suit you and you need some programming done, then, hmm... Do you know some php and sql? ;-) --Kipcool 12:51, 5 April 2010 (UTC)
I know some php and sql and i'll gladly contribute.
Currently solution 3 looks close enough to what is needed, but i would do it the other way around: In a perfect entry the form with niqqud should be the primary and the form without niqqud can appear as an annotation, because that's how Hebrew dictionaries usually work. People who can't write with niqqud can simply write the niqqud-less form as the primary and then someone can fix it.
Also, it must be possible to search for words in the niqqud-less form. If in the case of Japanese someone can search for the word using only Hiragana, then this problem is probably already solved.
What about the redirections for non-standard spellings? --Amir E. Aharoni 13:10, 5 April 2010 (UTC)
In the case of Japanese, it is also a requested feature to be able to search by Hiragana. In a more general context, this means that the possibility to search for (free-text) annotations needs to be implemented. Programming this would also allow searching for Chinese words by entering pinyin. So, a nice feature to have for many languages.
I am less sure about what to do for "redirections for non-standard spellings". If we go for solution 3, maybe we can have a further field called "qualified redirection" or "non-standard spelling for X" or whatever you want to call it, similar to the field "niqqud-less form", which will be searchable as well. --Kipcool 16:13, 5 April 2010 (UTC)
If you want to try to implement it, you can follow the example of Expression:Japanese, i.e. put Expression:Hebrew in the collection "lexical functionalities" and then define some class attributes. --Kipcool 17:52, 12 April 2010 (UTC)
After almost a year i recalled this discussion :)
I put Expression:Hebrew in the collection "lexical functionalities", but how do i define class attributes? --Amir E. Aharoni 21:00, 3 March 2011 (UTC)
Well, now when you edit this expression, you have the possibility to add class attributes just below the definitions (and above alternative definitions).
But first you'll need a name for the annotation, and to create a DM for it. Then you select Level => Syntrans, Attribute => the DM you created with the name of your annotation, Type => free text (or whatever it is called in English).
I am thinking about writing a simpler interface to define all these annotations, but it'll take time... --Kip 22:40, 3 March 2011 (UTC)

Number - not just singular and plural

If i understand correctly, in annotations it is only possible to choose singular and plural number. For Hebrew, Arabic, Slovenian, Lithuanian and others the dual number is needed and for other languages there are also other categories (see w:Grammatical number).

We can also go further and generalize "plurale tantum" and "singulare tantum" by marking in which numbers a word can appear. In Hebrew, for example:

  1. most words can be singular or plural and the dual is almost unused.
  2. some words are plurale tantum or singulare tantum.
  3. some words are nearly always plural and rarely singular, for example sg. מָגוֹר pl. מְגוּרִים. There should be a way to mark this, too.
  4. some words, for example שָׁבוּעַ 'week', can be singular, dual and plural.
  5. some words, for example גֶּרֶב 'sock', can be only singular and dual.
  6. some words, for example רֶגֶל 'leg', can be singular and dual, and in plural they have a different meaning (in this case, 'holiday'). This can probably be solved simply by putting it under a different DM and indicating the same etymology.

For every language possible number categories should be defined; for example, Hebrew doesn't need "paucal". There should also be a default setting for the most common option - for Hebrew it's number 1 above. --Amir E. Aharoni 12:22, 5 April 2010 (UTC)
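The per-language number categories and the per-word marking proposed above could be sketched roughly as follows. Everything here (the language codes, the inventories, the function names) is a hypothetical illustration and not the OmegaWiki data model:

```python
# Hypothetical per-language inventories of grammatical number.
NUMBERS_BY_LANGUAGE = {
    "heb": {"singular", "dual", "plural"},
    "eng": {"singular", "plural"},
}

# Hypothetical default: the most common case for the language
# (for Hebrew, option 1 in the list above).
DEFAULT_NUMBERS = {
    "heb": {"singular", "plural"},
    "eng": {"singular", "plural"},
}


def allowed_numbers(language, marked=None):
    """Return the numbers a word can appear in, falling back to the default."""
    if marked is None:
        return set(DEFAULT_NUMBERS[language])
    unknown = set(marked) - NUMBERS_BY_LANGUAGE[language]
    if unknown:
        raise ValueError(f"{language} has no number category: {sorted(unknown)}")
    return set(marked)


def is_plurale_tantum(numbers):
    """A word is plurale tantum when plural is its only possible number."""
    return set(numbers) == {"plural"}


print(allowed_numbers("heb"))                        # default: singular and plural
print(allowed_numbers("heb", {"singular", "dual"}))  # option 5: singular and dual only
```

With such a marking, "plurale tantum" and "singulare tantum" need not be separate fields: they fall out of the set of numbers recorded for the word.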

We do not support inflexions at the moment. This is the next thing I am supposed to implement, but I still have no precise idea how to do it... There are too many exceptions and differences between the languages :-( --Kipcool 12:55, 5 April 2010 (UTC)
I am not talking about actual inflected forms. I am just talking about having a way to mark the number of the word as it is entered and about how it can be inflected.
Some words in Hebrew are, strangely enough, dual in their base form, for example מֹאזְנַיִם 'scales' and have no singular form. There should be a way to mark it. --Amir E. Aharoni 13:14, 5 April 2010 (UTC)
Ah ok! We can add "dual" without much effort, but we have to agree on the name first (Expression:dual? ;-) ) and it has to be defined only for languages that have dual.
The options currently existing (plurale tantum, singulare tantum) have to be changed to become language specific as well (language specific annotations have been implemented only recently).
For the first point of indicating how a word can be inflected, it is also easy to have it. I think the main problem is that we need to agree on a name to identify each group. --Kipcool 16:22, 5 April 2010 (UTC)
Expression:dual looks nearly perfect. All grammar books of Hebrew and Arabic call it "dual".
"Nearly", because currently the definition says "precisely 2 objects"; in Hebrew it can in practice mean more than two, for example the plural of עַיִן 'eye' is identical to the dual עֵינַיִם and it can mean 'two eyes' or just 'x eyes' where x > 2. But it's still perfectly OK to call it "dual".
It's possible that plurale tantum won't be needed anymore as a separate field, if you just mark that the word can only have a plural form. --Amir E. Aharoni 17:42, 5 April 2010 (UTC)
I have added "dual" for Hebrew, Arabic, Slovenian, Lithuanian. --Kipcool 17:49, 12 April 2010 (UTC)

Gender marking

It is possible to mark plural and singular number, but it seems that it's totally impossible to mark gender. It's not just me, right? --Amir E. Aharoni 11:34, 12 April 2010 (UTC)

Yes it is possible! But genders are language specific, so they have to be defined for your language (Hebrew?). --Kipcool 16:16, 12 April 2010 (UTC)
For Hebrew, Arabic and Aramaic it's masculine and feminine. Quite a lot of Hebrew substantives fluctuate between both genders, but if the same property can be entered twice, then it's good enough to start.
For Russian and probably for most of the other Slavic languages it's masculine, feminine and neuter. --Amir E. Aharoni 16:54, 12 April 2010 (UTC)
I have added the genders for Hebrew, Arabic and Russian.
You can add more genders by editing DefinedMeaning:lexical item (402295). --Kipcool 17:35, 12 April 2010 (UTC)

Multiple grammatical properties and their sources

It would be useful to be able to mark multiple values for a grammatical property. For example, the Hebrew word גֶּרֶב (sock) is masculine according to normative grammar as prescribed by the Academy of the Hebrew Language, but it is used as feminine in the colloquial speech.

It would also be nice to have a structured way to note the sources for these definitions. --Amir E. Aharoni 11:39, 12 April 2010 (UTC)

It is already possible to mark multiple values for a grammatical property (mostly because we did not implement a check to have it unique).
I am not sure I understand the story about the sources. You mean saying "masculine according to Academy" and "feminine in the colloquial speech"? If so, we need a freetext for that, it is not really possible to have it structured. --Kipcool 16:19, 12 April 2010 (UTC)
Yes, that's what i meant. In the worst case i can probably put this info under "usage", but it would be nicer to have something like this (these sources are real):
property   value       source
gender     masculine   Academy Decisions: Grammar 2.4.4
gender     feminine    Isaac Avinery, Yad Hallaschon
Essentially it would be like footnotes in Wikipedia, but structured. I'm into database normalization ;-) and the lack of structure in Wikipedia's footnotes is quite a pain: if i cite the same book in two articles i actually have to copy the whole reference. If we could have a bibliographical database of reliable sources about every language - dictionaries, grammars, corpora - it would be very very neat. --Amir E. Aharoni 16:45, 12 April 2010 (UTC)
This would require some thought and some work on the database... But it seems an interesting thing to have. We do have a mysterious "Source" field when entering an alternative definition. Maybe this is a good start (for whoever is ready to implement what you suggest...). --Kipcool 17:44, 12 April 2010 (UTC)
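As a rough illustration of the normalized layout discussed in this thread, here is a sketch using an in-memory SQLite database. The table and column names are invented for the example and do not correspond to the actual OmegaWiki schema:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    -- Each source is stored exactly once.
    CREATE TABLE source (
        source_id INTEGER PRIMARY KEY,
        citation  TEXT NOT NULL UNIQUE
    );
    -- Each property value points at its source instead of copying the citation.
    CREATE TABLE property_value (
        value_id   INTEGER PRIMARY KEY,
        expression TEXT NOT NULL,
        property   TEXT NOT NULL,
        value      TEXT NOT NULL,
        source_id  INTEGER REFERENCES source(source_id)
    );
    """
)


def add_value(expression, prop, value, citation):
    """Record a property value, reusing the source row if it already exists."""
    conn.execute("INSERT OR IGNORE INTO source (citation) VALUES (?)", (citation,))
    (source_id,) = conn.execute(
        "SELECT source_id FROM source WHERE citation = ?", (citation,)
    ).fetchone()
    conn.execute(
        "INSERT INTO property_value (expression, property, value, source_id) "
        "VALUES (?, ?, ?, ?)",
        (expression, prop, value, source_id),
    )


add_value("גֶּרֶב", "gender", "masculine", "Academy Decisions: Grammar 2.4.4")
add_value("גֶּרֶב", "gender", "feminine", "Isaac Avinery, Yad Hallaschon")

rows = conn.execute(
    "SELECT p.property, p.value, s.citation "
    "FROM property_value p JOIN source s USING (source_id) "
    "ORDER BY p.value_id"
).fetchall()
for row in rows:
    print(row)
```

Citing the same book for a second word then adds only one row to `property_value`; the citation text itself is not copied, which is the footnote problem described above.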

Grammatical properties varying regionally

According to Duden, the grammatical gender of some German nouns varies between regions, such as "Triangel" being feminine in the Rhineland but not elsewhere. I read something similar about Dutch, where Flemish and southern Nederlands genders and pronunciations are said to occasionally deviate from northern (aka official) ones, if I recall that right. --Purodha Blissenbach 06:52, 22 September 2010 (UTC)

Linking alternate spellings to each other

I'll have dictionary data in the foreseeable future which could be imported. The dictionaries cover the same language at different times in history, and use different spelling systems depending on time and author. Expressions covered by several dictionaries, when spelt differently, will show up as synonyms of each other.

Some of their annotations, such as grammatical properties (word class, declension/conjugation type, etc.) should be identical - not even propagated, since that is bound to create a potential for data base consistency issues. This is functionality currently missing.

Some of their annotations, such as sample sentences, should be individual to either spelling, obviously.

Some of their annotations that we are going to have in the future, such as 'grammatical' relations to other expressions like "… is genitive of …" must be individual for each spelling, while their existence or nonexistence depend on grammar properties that are common to all spelling variants.

Currently, it seems to me that we may want to have an equivalence class "alternate spellings of otherwise identical expressions" among spellings, which would have to have a (predominantly language dependent) set of annotations of its own. --Purodha Blissenbach 04:05, 17 September 2010 (UTC)

Linking alternate spellings to collections

I'll have dictionary data in the foreseeable future which could be imported. The dictionaries cover the same language at different times in history, and use different spelling systems. Likely, we must give publishers and/or authors ways to refer to "their" material on OmegaWiki, which can be easily done via collections linked to from spellings. We could even index the collections by "volume #, page #, column #, line #" in OmegaWiki. However, we can currently have only definedmeanings as collection members, which obviously does not fit here, since we must account for the different spellings having identical definedmeanings. Not finding the actual spelling used in a specific dictionary would both be detrimental to scientific treatment and hardly motivate publishers and living authors. Variant spellings inside a dictionary are often reported on a single line. Thus I suggest adding a "collection-of-spellings" type having an optional non-unique index. --Purodha Blissenbach 04:05, 17 September 2010 (UTC)

Assessment of varying orthographic systems by having orthography codes registered with ISO, which can be used as subtags in identifiers of the type "en-GB-scouse", was an option, but does not solve several problems well:

  • Many spellings are identical between the systems
  • All dictionaries record several variant spellings of several expressions on their own.
  • There are additional modern spelling variants that do not belong to either dictionary, but can be found in a variety of literature, newspapers, record covers, etc.

Thus, e.g. an exhaustive reverse lookup "what exactly is in which dictionary" would not work based on these codes alone. --Purodha Blissenbach 04:05, 17 September 2010 (UTC)

Dependencies among grammatical annotations

It would be nice if the grammatical properties "pluraletantum" and "singularetantum" could be set to imply the respective grammatical properties "plural" or "singular" automatically, for the languages having those. --Purodha Blissenbach 11:09, 18 September 2010 (UTC)
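The implication asked for here could be sketched as a small lookup applied when annotations are saved. The names `IMPLIES` and `expand_properties`, and the idea of passing in the set of properties defined for a language, are assumptions made for this illustration:

```python
# Hypothetical implication table: a property on the left implies the one on the right.
IMPLIES = {
    "pluraletantum": "plural",
    "singularetantum": "singular",
}


def expand_properties(properties, defined_for_language):
    """Add implied properties, but only those the language actually defines."""
    expanded = set(properties)
    for prop in properties:
        implied = IMPLIES.get(prop)
        if implied is not None and implied in defined_for_language:
            expanded.add(implied)
    return expanded


defined = {"singular", "plural", "pluraletantum", "singularetantum"}
print(sorted(expand_properties({"pluraletantum"}, defined)))
# prints ['plural', 'pluraletantum']
```

Computing the implied value rather than storing it twice would avoid the two annotations drifting apart after later edits.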

Sound samples

I would like to link sound samples to expressions, predominantly ones on Commons. That would mean to add another link type to annotations. Would that be ok to do? --Purodha Blissenbach 17:42, 19 September 2010 (UTC)

For sound samples, as well as for images, I'd like to have something that works with InstantCommons. This needs a bit of programming (the programming being the same for sound and images). --Kip 18:41, 23 September 2010 (UTC)

related term

I would like to add the relation "related term" that exists elsewhere, but it does not appear in the list of possible choices. What can I do? --Purodha Blissenbach 13:34, 20 September 2010 (UTC)

cf. answer on the beer parlour. --Kip 18:40, 23 September 2010 (UTC)

theme: grammar

I've added some grammatical terms. I had to put them under "theme: linguistics" even though this is far too broad a concept, imho. I want them put under "theme: grammar", which in turn is under "theme: linguistics", so as to be precise and not to waste information. Note also that there are words that are ambiguous in the realm of linguistics, but no longer so when tied to specific linguistic fields (accent, e.g.). What can I do to achieve the wanted precision? --Purodha Blissenbach 13:34, 20 September 2010 (UTC)

cf. answer on the beer parlour. --Kip 18:40, 23 September 2010 (UTC)

Showing selected expressions only

With links such as [[Expression:something]], we get a list of languages, and lists of meanings of the expression in those languages. Sometimes, one is interested in one, or some, language(s) only, and not interested in accidental hits in other languages. Since translations have to be precise, we often have several definedmeanings of an expression even though it is seen as a single thing in its language. Nevertheless, the language may have unrelated homographs of this expression. Sometimes, one may be interested in the set of meanings of one homograph only, but not others. Thus:

  1. It may be desirable, to restrict Expression display to a language, or a language set.
  2. It may be desirable, to restrict Expression display to a group of definedmeanings, excluding other ones.

For showing selected languages only, I could imagine using Links, or URLs, having a language code list attached in brackets, such as [[Expression:something (lang1, lang2, lang3, …)]], where anything not in the character set of language code tags and their possible subtags (regular expression: [-_0-9a-zA-Z]) is considered a separator. Implementation would be pretty straightforward, imho. Selecting a set of definedmeanings only could be handled with DM numbers in brackets. Since language codes cannot start with digits (regexp: [0-9]), while DM numbers always do, ambiguities cannot occur. For the same reason, both selection methods can be combined. --Purodha Blissenbach 06:31, 22 September 2010 (UTC)
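The bracket syntax proposed above is simple enough to sketch. The function name `parse_selector` and the return shape are assumptions for illustration; the token character class is the one given in the proposal:

```python
import re

TOKEN = re.compile(r"[-_0-9a-zA-Z]+")  # anything else counts as a separator


def parse_selector(link):
    """Split 'Expression:x (de, en-GB, 123)' into title, language codes, DM numbers."""
    match = re.fullmatch(r"(?P<title>.*?)\s*\((?P<selectors>[^()]*)\)\s*", link)
    if match is None:
        return link.strip(), [], []  # no bracketed selector list
    languages, dm_numbers = [], []
    for token in TOKEN.findall(match.group("selectors")):
        if token[0] in "0123456789":
            # DM numbers always start with a digit; language codes never do.
            if not token.isdigit():
                raise ValueError(f"not a valid DM number: {token!r}")
            dm_numbers.append(int(token))
        else:
            languages.append(token)
    return match.group("title"), languages, dm_numbers


print(parse_selector("Expression:something (de, en-GB; 5450)"))
# prints ('Expression:something', ['de', 'en-GB'], [5450])
```

Because the first character alone decides whether a token is a language code or a DM number, both kinds of selector can be mixed freely inside one pair of brackets, as suggested.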

There is also the problem that when an expression is the same in many languages, it will be a long page (for example "Berlin"). I already have an idea on how to change the display when several languages are involved. Just wait for the cold winter months when I have nothing else to do than programming ;-) --Kip 18:37, 23 September 2010 (UTC)

ParserFunctions

Is there a chance to install the Extension:ParserFunctions in this wiki? I've seen that there is a lot of work to do getting the help pages together in an orderly manner. Parserfunctions could make life a lot easier on it. --Purodha Blissenbach 17:19, 22 September 2010 (UTC)

Yes, why not, I can install it. Can you just explain how it will help with the help pages? --Kip 18:40, 23 September 2010 (UTC)
For example, currently, most translated pages have individual lists of their respective translations. With parserfunctions, these can all be generated using a single template with conditionals. Since pages are cached, this adds to the server load only when they have to be re-evaluated, which is negligible, since that happens only immediately after each edit, when a saved page is shown. See also Insect room#Can't acces English Main Page. I would not like to do away with lists of translations, only have them less bold by making them collapsible, or an HTML <select> now, and once we have the preferred languages of users in their preferences, make them account for these.
Another example would be to automatically show when a translated page is older than its source, but that's not cacheable, and thus adds to the server load each time a translated page is rendered, so we may not want it now.
There are some more uses that I have occasionally come across. --Purodha Blissenbach 22:19, 25 September 2010 (UTC)

SynTrans with PartOfSpeech Column

I think that the part-of-speech annotation, in the syntrans table, because of its importance, should be moved in a new column, after "identical meaning?". --Synagonism 12:55, 30 December 2011 (UTC)

The current idea of the majority is rather to have it as a header, in order to then group definitions by part of speech. This means many changes in the database. --Kip (talk) 10:44, 30 January 2013 (CET)

Links in Definition's Text

By setting links to other DMs in a definition's text, the author of this definition can make it less ambiguous because s|he can show what exactly s|he means with the expressions used in her|his text. --Synagonism 09:46, 1 January 2012 (UTC)

Etymologies not normalized

For an expression like "long" that has many defined meanings, it doesn't make sense to enter the same etymology for each one. The information is not stored efficiently in the database either, and it is not easy to make a modification later. It should be made possible to enter an etymology once and then link to it from wherever it is relevant. --InfoCan 12:24, 17 March 2012 (CET)

It depends on the level of details that we enter in an etymology.
Of course an etymology like "from Latin longus" is probably the same for all meanings, but if we want to give details on each meaning, and where it comes from (generally, it derives from another meaning of the same word), then we need different etymologies for each.
Anyway, the idea of storing data once and linking to it from multiple words will be needed for inflexions, and would also be nice for example sentences. So, hopefully it will be implemented at some point (but you have probably already noticed by now that I have many ideas, but actually little free time to implement them...). --Kip 13:06, 17 March 2012 (CET)

"Add New" button on Expression page

The page for an expression with many definitions takes a long time to open if you want to add a new definition (about 30 sec in my case, for opening the page Expression:long). However, clicking on the Edit link of any one of the meanings opens that section by itself much faster. I assume that if only a blank section were to open to add a new DefinedMeaning, the Edit page would open much faster too. Could you add an "Add New" button at the bottom of an Expression page to minimize the waiting? --InfoCan 21:41, 17 March 2012 (CET)

Hierarchical category structure

Could we add Category to the list of available attributes? I don't mean Commons Category, which is not very developed for non-visual concepts. Nor do I mean the relation "is part of theme", which is not easily scalable. I think a hierarchical category structure similar to that in Wikipedia or Wiktionary would be useful to find other words belonging to a general concept. --InfoCan 17:40, 25 March 2012 (CEST)

Something like a hierarchical hypernym tree viewer (or whatever name we can give it) is planned. This would look like what Wordnet is doing. --Kip 13:48, 26 March 2012 (CEST)

"usage" should move to "Translatable texts"

The SynTrans annotation "usage" should be moved from Plain texts to Translatable texts. See [1] for an example of this need. --InfoCan 17:28, 16 April 2012 (CEST)

This has been done some months ago (not sure when). --Kip (talk) 11:30, 20 September 2013 (CEST)

Palatinean language wanted enabled

Can we have the Palatinean language (pfl) admitted for additions of translations? I occasionally have ones. Thank you. --Purodha Blissenbach (talk) 23:10, 29 January 2013 (CET)

According to Wikipedia.de, "Pfälzisch (in Palatine: Pälzisch) is a collective term for the dialects of the two Rhine Franconian dialect groups Westpfälzisch and Vorderpfälzisch".
Depending on how much the difference between Westpfälzisch and Vorderpfälzisch is, I see two possibilities.
  1. we create two separate languages in OmegaWiki, "Westpfälzisch" and "Vorderpfälzisch".
  2. we create one language "Palatinean" and add an annotation "area" that can be used to annotate the few words that are used only in a specific region (like we have now for French, English, Minnan and some other languages).
Tell me the one you prefer, I cannot judge about that language. --Kip (talk) 10:52, 30 January 2013 (CET)

I tried hard to find out how natives see it and there is no clear-cut decision. Some want more areas, up to five, some feel they are superfluous, and no one is willing to join OmegaWiki. I cannot decide without further research. If I have something to add in Palatinean, it is likely genuine Palatinean, or sometimes Vorderpfälzisch. --Purodha Blissenbach (talk) 13:19, 19 September 2013 (CEST)

I did overlook this request a few times it seems... Now added. --Kip (talk) 20:21, 16 December 2015 (CET)

Colognian Main Page

I would like to be able to clean the Colognian Main Page up. It has been in need of some cleanup work for years, but although I created the page and edited it for a while, I was denied editing rights for it several years ago. --Purodha Blissenbach (talk) 13:19, 19 September 2013 (CEST)

I unprotected the page, and gave you back admin rights that you'll probably need for deleting obsolete pages. Thanks for updating that page :) --Kip (talk) 11:30, 20 September 2013 (CEST)
Thank you & welcome :-) --Purodha Blissenbach (talk) 15:18, 20 September 2013 (CEST)

Another annotation type: gloss

In the Discussion about connecting Wiktionaries with WikiData we find various references to glosses. See Wikipedia for an introduction to glosses in linguistics. In OmegaWiki, we do not have (linguistic) glosses at the moment, although adding them would be fairly easy. They are one of:

  • one text field with its language related to a DM only
  • a pair of text fields, the 2nd of which has a language, related to spellings and DMs

As long as we have not made a final decision about how we want to keep grammatically inflected forms, the question of how and where to store detailed glosses has to remain open to some extent. We do already know what they look like. According to the Leipzig Glossing Rules, we expect e.g. the English word 'feeds' to appear at least twice:

feeds (DM0) : 'feed-s' , en:'feed-prs.act.3sg' (i.e. 3rd person singular, active voice, present tense, of a verb meaning 'to feed')
feeds (DM1) : 'feed-s' , en:'feed-pl' (i.e. plural numerus of a noun meaning 'the feed')

Note that you could switch the language of the 2nd field by extracting the word 'feed' from it, going to the syntrans list of 'feed' with DM1, and finding a corresponding word in another language. Having grammatical annotations would likely enable us to determine the grammatical part of glosses automatically and vice versa. There can be several distinct glosses for one spelling+DM combination. Consider e.g. the Latin word:

puellae (DM2) : 'puell-ae' , 'girl-gen.sg' (i.e. the genitive case and singular numerus of a word corresponding to 'girl')
puellae (DM2) : 'puell-ae' , 'girl-dat.sg' (i.e. the dative case and singular numerus of a word corresponding to 'girl')
puellae (DM2) : 'puell-ae' , 'girl-nom.pl' (i.e. the nominative case and plural numerus of a word corresponding to 'girl')

Here grammatical endings are ambiguous. --Purodha Blissenbach (talk) 19:36, 28 September 2013 (CEST)
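The pair-of-text-fields variant above can be sketched as follows, assuming the Leipzig conventions that a hyphen separates morphemes and a period separates several gloss elements belonging to one morpheme. The function name `align_gloss` and the returned structure are illustrative assumptions only:

```python
def align_gloss(segmented, gloss):
    """Pair each morpheme of the segmented form with its gloss elements."""
    morphemes = segmented.split("-")
    gloss_parts = gloss.split("-")
    if len(morphemes) != len(gloss_parts):
        raise ValueError("segmented form and gloss must have the same number of morphemes")
    return [(m, g.split(".")) for m, g in zip(morphemes, gloss_parts)]


# One spelling+DM combination may carry several glosses, as with 'puellae' above.
glosses_for_puellae = [
    align_gloss("puell-ae", "girl-gen.sg"),
    align_gloss("puell-ae", "girl-dat.sg"),
    align_gloss("puell-ae", "girl-nom.pl"),
]
for entry in glosses_for_puellae:
    print(entry)
# first line printed: [('puell', ['girl']), ('ae', ['gen', 'sg'])]
```

Keeping the stem gloss ('girl') separate from the grammatical elements is what would allow the stem to be swapped for a translation in another language while the grammatical part stays the same.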

Reconstructed & proto-languages, dead languages, corpus languages, Trümmersprachen, artificial reference pseudo languages

There are various languages, or bodies of references to words that resemble languages, that are not currently spoken or were never spoken at all. They are somehow closely related to real spoken languages of today, usually being their predecessors in history. Often they provide valuable etymological data, or are used in academia to study ancient texts, groups of languages, and so on. What types do we have? This is a sketchy overview:

  • There are languages that fell out of use, such as Old Egyptian. We have written evidence of them. Unless new artefacts are discovered, there is a known and limited set of texts which could eventually be completely available electronically in the foreseeable future.
    • Sometimes we know little about a dead language, having only some inscriptions or other fragmentary relics.
    • Old people sometimes let us record an often very limited set of words of their languages before both die.
  • There are languages that may have never been spoken, or at least we do not know, which nevertheless must have existed in one form or another, such as the astonishingly well researched Proto-Indo-European. They were reconstructed, often only partially, from our knowledge about later languages. They may have words based on assumptions and likelihood, decreasing as research progresses.
  • There are 'languages' with words that have in part been constructed as a common denominator of reference for almost identical words in closely related languages or dialect groups, so as to have a single lemma for all of them. This approach to dictionary making is e.g. known from ancient Indian Vedic, modern West Middle German of the pre-computer era, and others.

For all these languages or pseudo languages there are dictionaries available, including many license-free ones. Some of the older prints are already available as scans, some have been processed into better searchable digital forms. The majority of these languages or pseudo languages do not have ISO 639 codes. I plead for including all of those in OmegaWiki. There are benefits:

  • We get a chest of etymological relations.
  • Allowing the lot of dead or dying languages having only few words does not cost much beyond maybe finding an ISO 639 code for them, but may be of great help to researchers.
  • With the inclusion of ancient and protolanguages, we become attractive for the academic research community currently having to struggle to get their findings publicised at acceptable cost. If OmegaWiki relieves them of considerable parts of their technical and administrative workloads, some are likely able to become donors of their savings.
  • With enough data on corpus languages, testing and proving theories on inheritance and reconstruction become possible almost in a snap. You only need to make database queries to find examples or counterexamples for your ideas in great numbers. If nothing else, imho this argument alone warrants the inclusion of these kinds of languages.

Think over it. With love -- Purodha 19:36, 28 September 2013 (CEST)

Downloadable lists of languages, and language names in each language

If not too expensive to make, it would make sense to add a set "Download list of languages" to Special:Ow_downloads. There are imho two useful versions of it. One is the bare table data, the other has additional columns giving the expression/definition data of each language name in a given language, for easier use by humans. --Purodha Blissenbach (talk) 08:40, 11 October 2013 (CEST)

Produce error messages when attempted edits or additions of lexical data fail

This has imho a low priority.

There are various occasions and circumstances when attempted edits or additions fail in one way or another but editors are not notified.

  1. type a definition but forget to specify its language
  2. type another definition in a language already having one
  3. for a DM, add another key into a collection

There are likely many more. --Purodha Blissenbach (talk) 17:41, 12 October 2013 (CEST)

Etymological relations: loans and borrowings

We do not deal much with etymology yet. This acknowledged, I want to suggest introducing a new relation type and these four relations:

<expressionA> of <languageA> is a loan from <languageB>'s <expressionB>

and the reverse relation:

<expressionB> of <languageB> became a loan to <languageA> as <expressionA>

as well as:

<expressionC> of <languageC> is borrowed from <languageD>'s <expressionD>

and the reverse relation:

<expressionD> of <languageD> is used as a borrowing in <languageC> as <expressionC>

One difference between loan and borrowing is that usually the spelling of <expressionD> and <expressionC> will be identical (except when transliterated) while <expressionA> and <expressionB> are mostly spelled differently. Another, higher ranking, distinction between loan and borrowing is that the speakers of <languageA> are usually unaware that they are using a word which came from <languageB>, while the opposite is likely for <languageC> versus <languageD>. --Purodha Blissenbach (talk) 13:37, 7 December 2013 (CET)
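The four relations above form two inverse pairs, which could be modelled so that only one direction is entered and the other is derived. The relation names and the `Relation` class below are hypothetical and only mirror the placeholders used in the proposal:

```python
from dataclasses import dataclass

# Hypothetical relation names; each maps to its reverse relation.
INVERSE = {
    "is_loan_from": "became_loan_to",
    "became_loan_to": "is_loan_from",
    "is_borrowed_from": "is_used_as_borrowing_in",
    "is_used_as_borrowing_in": "is_borrowed_from",
}


@dataclass(frozen=True)
class Relation:
    kind: str
    expression: str
    language: str
    other_expression: str
    other_language: str

    def inverse(self):
        """Return the reverse relation with both sides swapped."""
        return Relation(
            INVERSE[self.kind],
            self.other_expression,
            self.other_language,
            self.expression,
            self.language,
        )


loan = Relation("is_loan_from", "expressionA", "languageA", "expressionB", "languageB")
print(loan.inverse().kind)  # prints became_loan_to
```

Deriving the reverse relation instead of storing it separately would keep the two directions from contradicting each other.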

Palatinean language

I would like to have the Palatinean (pfl) language enabled. --14:28, 4 November 2014 (CET)

Done --Kip (talk) 20:10, 16 December 2015 (CET)

Allow the Central American Spanish language.

Please add Central American Spanish as a language that we can add words for. The web page http://www.buzzfeed.com/ailbhemalone/language-facts-that-will-blow-your-mind mentions it in its item #34, the contents of which I would like to copy to OmegaWiki, and there are likely many more distinctions from Standard Spanish that we should be able to document here. --Purodha Blissenbach (talk) 13:22, 10 November 2014 (CET)

Done. --Kip (talk) 20:16, 16 December 2015 (CET)

Additional language wanted.

I can occasionally add translations in the Palatinean German language (pfl) and I want to be able to do so. Please enable it, or allow me to do that myself. --Purodha Blissenbach (talk) 11:54, 28 November 2014 (CET)

Done above. Sorry for the wait... :) --Kip (talk) 20:17, 16 December 2015 (CET)

Add an annotation type "source"

Often I enter data in OmegaWiki which I obtained from some external source, such as a web site, a broadcast, a newspaper, a journal, a book, etc. I would like to:

  1. give proper credit to the source,
  2. allow fellow editors to do further research in the sources that were used, if they want to.

Since sources such as dictionaries, word books and word lists on websites can be used repeatedly, we need to allow them to be entered in the database once, and then to be referred to in a convenient way when entering the annotations that refer to them. There can be several sources for a single piece of information, e.g. a single translation can be listed in several bilingual dictionaries. Some information that is stored as annotations in OmegaWiki, such as the word class of an expression, may need to be annotated with sources, too. --Purodha Blissenbach (talk) 17:04, 18 October 2015 (CEST)

I am not sure how to implement this at this point, i.e. what kind of annotation (plain text, or drop-down list), and at what level (syntrans or definedmeaning; probably mostly syntrans). A fixed list would get too huge with the current system, so we would actually need a special interface to define available sources (and probably a new field in the database). --Kip (talk) 20:26, 16 December 2015 (CET)
In German at least, we have many books or similar works that have well-known proper names, such as "der Duden", "der Brockhaus", "der Langenscheidt Englisch Deutsch" or "der Stowasser" (Latin←→German), and others that are less widely known but well known among experts. We can enter them as expressions and thus refer to them via relations. This may or may not be expandable into a general solution. --Purodha Blissenbach (talk) 00:09, 17 December 2015 (CET)
Indeed, a DM relation at syntrans level might work in that case (then edit Expression:lexical item if you want to add it).
A URL could also be used for online sources. --Kip (talk) 20:57, 17 December 2015 (CET)
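
To make the requirement concrete (a source is entered once, is referred to many times, and one item can have several sources), here is a rough sketch of a possible data layout. Everything in it is an assumption for illustration: the names `Source` and `SourceRegistry`, the source ids, and the `("syntrans", 1)` item key do not correspond to existing OmegaWiki tables or identifiers.

```python
from dataclasses import dataclass, field


@dataclass
class Source:
    name: str
    url: str = ""  # optional, for online sources


@dataclass
class SourceRegistry:
    sources: dict = field(default_factory=dict)    # source id -> Source
    citations: dict = field(default_factory=dict)  # item key -> list of source ids

    def add_source(self, source_id, source):
        # A source is stored once; entering the same id again keeps the first entry.
        self.sources.setdefault(source_id, source)

    def cite(self, item_key, source_id):
        if source_id not in self.sources:
            raise KeyError(f"unknown source: {source_id}")
        ids = self.citations.setdefault(item_key, [])
        if source_id not in ids:  # do not list the same source twice for one item
            ids.append(source_id)

    def sources_for(self, item_key):
        return [self.sources[i] for i in self.citations.get(item_key, [])]


registry = SourceRegistry()
registry.add_source("duden", Source("der Duden"))
registry.add_source("langenscheidt-en-de", Source("der Langenscheidt Englisch Deutsch"))

# One translation (a hypothetical syntrans with id 1) backed by two sources.
registry.cite(("syntrans", 1), "duden")
registry.cite(("syntrans", 1), "langenscheidt-en-de")
```

Keying citations by a (level, id) pair leaves open whether sources attach at syntrans or definedmeaning level, which is the question raised above.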

Please allow SVG files to be uploaded.

Needed e.g. in Template:not done. --Purodha Blissenbach (talk) 11:10, 15 January 2016 (CET)