Word Net - Word Synonyms & related word constructs - Java or Python?


It is easiest to understand the WordNet data by looking at the Prolog files. They are documented here: wordnet.princeton.edu/wordnet/man/prolog... WordNet terms are grouped into synsets. A synset is a maximal synonym set.

Synsets have a primary key so that they can be used in semantic relationships. So, answering your first question, you can list the different senses and corresponding synonyms of a word as follows:

Input X: Term
Output Y: Sense
Output L: Synonyms in this Sense

    s_helper(X,Y) :- s(X,_,Y,_,_,_).

    ?- setof(H,(s_helper(Y,X),s_helper(Y,H)),L).

Example:

    ?- setof(H,(s_helper(Y,'discouraged'),s_helper(Y,H)),L).
    Y = 301664880,
    L = [demoralised, demoralized, discouraged, disheartened] ;
    Y = 301992418,
    L = [discouraged] ;
    No

For the second part of your question, WordNet terms are sequences of words. So you can search the WordNet terms for words as follows:

Input X: Word
Output Y: Term

    s_helper(X) :- s(_,_,X,_,_,_).

    word_in_term(X,Y) :- atom_concat(X,' ',H), sub_atom(Y,0,_,_,H).
    word_in_term(X,Y) :- atom_concat(' ',X,H), atom_concat(H,' ',J), sub_atom(Y,_,_,_,J).
    word_in_term(X,Y) :- atom_concat(' ',X,H), sub_atom(Y,_,_,0,H).

    ?- s_helper(Y), word_in_term(X,Y).

Example:

    ?- s_helper(X), word_in_term('beat',X).
    X = 'beat generation' ;
    X = 'beat in' ;
    X = 'beat about' ;
    X = 'beat around the bush' ;
    X = 'beat out' ;
    X = 'beat up' ;
    X = 'beat up' ;
    X = 'beat back' ;
    X = 'beat out' ;
    X = 'beat down' ;
    X = 'beat a retreat' ;
    X = 'beat down' ;
    X = 'beat down' ;
    No

This would give you potential n-grams, but not so much morphological variation. WordNet does also exhibit some lexical relations, which could be useful.
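For readers who would rather follow the two lookups in Python, here is a rough sketch of the same logic over a tiny hand-made sample. The tuple layout follows the s/6 facts from the Prolog files; the two synset ids and their words are taken from the example above, while the other field values and the names S_FACTS, synonyms_by_sense and word_in_term are only placeholders chosen for illustration.

```python
from collections import defaultdict

# Tiny sample shaped like the Prolog s/6 facts:
# (synset_id, w_num, word, ss_type, sense_number, tag_count).
# Ids and words come from the 'discouraged' example; the remaining
# fields are placeholder values, not real WordNet data.
S_FACTS = [
    (301664880, 1, "demoralised", "s", 1, 0),
    (301664880, 2, "demoralized", "s", 1, 0),
    (301664880, 3, "discouraged", "s", 1, 0),
    (301664880, 4, "disheartened", "s", 1, 0),
    (301992418, 1, "discouraged", "s", 2, 0),
]


def synonyms_by_sense(term, facts=S_FACTS):
    """Return {synset_id: sorted synonyms} for every synset containing term."""
    members = defaultdict(set)
    for synset_id, _w_num, word, *_rest in facts:
        members[synset_id].add(word)
    return {sid: sorted(words) for sid, words in members.items() if term in words}


def word_in_term(word, term):
    """True if word is one of the words of a multi-word term.

    Like the three Prolog clauses, this requires a space next to the
    word, so a single-word term never matches.
    """
    parts = term.split(" ")
    return len(parts) > 1 and word in parts
```

Like the Prolog version, synonyms_by_sense scans every fact on each call, so it is only meant to show the logic, not to be fast.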

But both Prolog queries I have given are not very efficient. The problem is the lack of some word indexing. A Java implementation could of course implement something better.

Just imagine something along the lines of: class Synset { static Hashtable<Integer,Synset> synset_access; static Hashtable<String,Vector<Synset>> term_access; } Some Prolog systems can do the same: with an indexing directive, it is possible to instruct the Prolog system to index on multiple arguments of a predicate. Putting up a web service shouldn't be that difficult, either in Java or Prolog. Many Prolog systems easily allow embedding Prolog programs in web servers, and Java champions servlets.
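The indexing idea can be sketched in a few lines of Python as well. This is only an illustration under my own assumptions: the class name SynsetIndex, its methods, and the extra word_access table (a per-word index, which is what the slow queries above are missing) are not part of WordNet or of any library.

```python
from collections import defaultdict


class SynsetIndex:
    """In-memory indexes, loosely mirroring the two hash tables above."""

    def __init__(self):
        self.synset_access = defaultdict(list)  # synset id -> terms in that synset
        self.term_access = defaultdict(set)     # term -> ids of synsets containing it
        self.word_access = defaultdict(set)     # single word -> terms containing it

    def add(self, synset_id, term):
        """Register one (synset id, term) pair in all three indexes."""
        self.synset_access[synset_id].append(term)
        self.term_access[term].add(synset_id)
        for word in term.split():
            self.word_access[word].add(term)

    def synonyms(self, term):
        """Return {synset_id: sorted terms} for each sense of term."""
        return {
            sid: sorted(self.synset_access[sid])
            for sid in self.term_access.get(term, ())
        }

    def terms_containing(self, word):
        """Return the sorted terms that contain word as a whole word."""
        return sorted(self.word_access.get(word, ()))
```

After loading every s/6 fact once through add, both kinds of lookup become dictionary accesses instead of full scans. Note that terms_containing also returns the single-word term itself if it was added, unlike word_in_term in the Prolog code.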

A list of Prologs that support web servers can be found here: http://en.wikipedia.org/wiki/Comparison_of_Prolog_implementations#Operating_system_and_Web-related_features Best Regards.

Thanks for this info, very very helpful. Quick question, I'm still not 100% sure on this, but if possible, what would be the best way to link a word like 'discouraged' to 'beat down' programmatically? – NightWolf Aug 8 at 17:36

Depends on the "link" and the programming language. In WordNet there are two kinds of "links", semantic and lexical. Via Prolog, for a semantic link, you would assert a fact link(synset_id1, synset_id2). For a lexical link you would assert a fact link(synset_id1, word_num1, synset_id2, word_num2). In Java you would use your appropriate data structure/service. Or you put the two words into the same synset, or create a new synset where they are together in it. The latter applies in case your "link" should represent the same sense. – Countably Infinite Aug 8 at 20:00
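As a rough illustration of the two kinds of link facts described in the comment above, here is a small Python sketch. The LinkStore class and its method names are invented for this example, and the id used for 'beat down' is a made-up placeholder, not a real WordNet synset id.

```python
class LinkStore:
    """Holds semantic links (synset to synset) and lexical links (word to word)."""

    def __init__(self):
        self.semantic = set()  # (synset_id1, synset_id2)
        self.lexical = set()   # (synset_id1, word_num1, synset_id2, word_num2)

    def add_semantic(self, synset_id1, synset_id2):
        self.semantic.add((synset_id1, synset_id2))

    def add_lexical(self, synset_id1, word_num1, synset_id2, word_num2):
        self.lexical.add((synset_id1, word_num1, synset_id2, word_num2))

    def synsets_linked(self, synset_id1, synset_id2):
        """True if any link of either kind goes from synset_id1 to synset_id2."""
        if (synset_id1, synset_id2) in self.semantic:
            return True
        return any(a == synset_id1 and b == synset_id2
                   for a, _wa, b, _wb in self.lexical)


DISCOURAGED = 301664880  # id from the 'discouraged' example in the answer
BEAT_DOWN = 999999999    # hypothetical placeholder id for a 'beat down' synset
```

Links are stored as directed pairs here, so a symmetric relation would need both directions added.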

These are two different problems.

1) WordNet and Python. Use NLTK, it has a nice interface to WordNet. You could write something on your own, but honestly why make life difficult? LingPipe probably also has something built in, but NLTK is much easier to use. I think NLTK just downloads an NLTK database, but I'm pretty sure there are APIs to talk to WordNet.

2) To get bigrams in NLTK follow this tutorial. In general you tokenize text and then just iterate over the sentence getting all the n-grams for each word by looking forward and backward.
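The tokenize-then-slide approach can be sketched in plain Python without NLTK. The tokenizer below is a deliberately naive regular expression of my own and the function names are just for illustration; NLTK ships its own tokenizers and n-gram helpers that would normally replace this.

```python
import re


def tokenize(text):
    """Lowercase the text and split it into simple word tokens."""
    return re.findall(r"[a-z0-9']+", text.lower())


def ngrams(tokens, n):
    """Return every run of n consecutive tokens as a tuple."""
    return [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]


def candidate_terms(text, max_n=3):
    """Join all 1..max_n-grams with spaces, giving candidate lookup terms."""
    tokens = tokenize(text)
    return [" ".join(gram)
            for n in range(1, max_n + 1)
            for gram in ngrams(tokens, n)]
```

The strings returned by candidate_terms could then be looked up against WordNet terms, so a phrase such as 'beat down' is tried as a unit rather than as two separate words.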

Thanks for the links. From my testing with WordNet, certain phrases such as "beat down" can't be identified, is this correct? – NightWolf Aug 8 at 16:16

If you use WordNet online you can see some synonyms: wordnetweb.princeton.edu/perl/… – nflacco Aug 8 at 16:26

Maybe the online version is a newer db of words? – nflacco Aug 8 at 16:27

As an alternative to NLTK, you can use one of the available WordNet SPARQL endpoints to retrieve such information. Query example:

    PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
    PREFIX wordnet: <http://www.w3.org/2006/03/wn/wn20/schema/>

    SELECT DISTINCT ?label {
      ?input_word a wordnet:WordSense ;
                  rdfs:label ?input_label .
      FILTER (?input_label = 'run')
      ?synset wordnet:containsWordSense ?input_word .
      ?synset wordnet:containsWordSense ?synonym .
      ?synonym rdfs:label ?label .
    } LIMIT 100

In the Java universe, the Jena and Sesame frameworks can be used.

