I’m learning Natural Language Processing in Python with NLTK, and I’m trying to process text in Nigerian languages. For now, I’m using the Google Translate API.
Does anyone know of a module that includes a library of Nigerian languages, or an NLTK lexicon for them? I want to know if one exists before attempting to compile my own.
Cool to see someone showing interest in this. I really doubt there is anything reasonable out there for Nigerian languages. What exactly are you looking for? A dictionary? A tag library? Translations? I once read about some research done by folks at UI; they used Moses for translation, but the results didn’t look too good. The Ife people also did something around rule-based translation, but that doesn’t look too good either.
If this is something you’re very interested in, we can meet up to discuss ideas around this.
I’m working with the NLTK module, and I’d like to know if anyone has trained it on Nigerian languages. I’d also like to see other projects that have worked with Nigerian languages, whether speech-to-text or language processing.
Resources and projects on language processing for Nigerian languages exist.
I have been working on text-to-speech for Nigerian languages.
If you are interested, kindly let me know.
NLP is a really broad and growing topic, so it would be useful if you specified the particular area that interests you.
I doubt you’ll find any publicly available lexicons or models trained on any Nigerian language. Even Google Translate still struggles to get things right (e.g. it fails miserably with Hausa translations).
It’s a challenging path, but you can pick a Nigerian language and then get a couple of language experts to guide you on its syntactic and semantic features. In Hausa, for example, the phrases “bana so” and “ba na so” are generally accepted to mean the same thing (i.e. “I don’t want”). Knowing such peculiarities will greatly help whatever heuristics you put into building your language preprocessors and whatever comes afterward.
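To make the preprocessing idea concrete, here is a minimal sketch of one such heuristic in plain Python (no NLTK needed): it rewrites known variant spellings to a single canonical form before any further processing. The `VARIANTS` table and the `normalize` function are hypothetical names for illustration. Only the “bana” / “ba na” pair comes from the example above; a real table would have to be compiled with help from language experts.

```python
# Hypothetical variant table: maps a variant spelling to its canonical form.
# Only this one entry is taken from the discussion; everything else a real
# preprocessor needs would have to come from language experts.
VARIANTS = {
    "bana": "ba na",
}


def normalize(text: str) -> str:
    """Lowercase, collapse whitespace, and rewrite known variant spellings."""
    # Naive whitespace tokenization; punctuation handling is left out.
    tokens = text.lower().split()
    return " ".join(VARIANTS.get(token, token) for token in tokens)


if __name__ == "__main__":
    # Both spellings end up as the same string after normalization.
    print(normalize("bana so") == normalize("ba na so"))  # True
```

Running every text through a step like this before tokenizing or counting words means both spellings are treated as the same phrase downstream. A plain lookup table is the simplest possible choice here; it will not catch variants that depend on context.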
You’ll definitely find inspiration if you read some key academic papers that focus on NLP problems in other languages besides English.
This might be a long shot, but you can talk to Kola Tubosun, who’s doing a lot of work on bringing the Yoruba language online. He might be able to point you in the right direction.