IRP Tutorial (3,7): Slides
IRR Tutorial1: Slides
IRR Tutorial2: Slides
IRR Tutorial3: Slides
IRR Tutorial5: Slides
IRR Tutorial6: Slides
Visual Question Answering Dataset: Image-specific question/answer pairs generated from gold-standard human captions.
Telugu transliteration parallel corpus: Generously contributed by a public Facebook group called Telugu Inspiration.
Location Recognizer: A recent tool that recognizes locations using string-matching algorithms. It can also handle misspelled location names.
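To give a feel for what string-matching location recognition with spelling tolerance can look like, here is a minimal sketch. It is not the Location Recognizer's actual algorithm: the gazetteer `KNOWN_LOCATIONS`, the function name `recognize_location`, and the 0.8 similarity cutoff are all assumptions made for illustration, and the matching relies on Python's standard `difflib` rather than whatever the tool itself uses.

```python
import difflib

# Hypothetical gazetteer; a real recognizer would load a much larger list.
KNOWN_LOCATIONS = ["Hyderabad", "Bengaluru", "Chennai", "Mumbai", "Kolkata"]


def recognize_location(token, gazetteer=KNOWN_LOCATIONS, cutoff=0.8):
    """Return the gazetteer entry closest to `token`, or None if nothing is close.

    Matching is case-insensitive and tolerates small spelling errors by
    using difflib's similarity ratio (an assumption, not the tool's method).
    """
    # Map lowercased names back to their canonical spelling.
    lowered = {name.lower(): name for name in gazetteer}
    matches = difflib.get_close_matches(
        token.lower(), list(lowered), n=1, cutoff=cutoff
    )
    return lowered[matches[0]] if matches else None
```

Calling `recognize_location("Hyderbad")` maps the misspelling back to the gazetteer entry, while a token that resembles none of the entries yields `None`. Raising the cutoff makes the matcher stricter; lowering it accepts noisier spellings at the cost of more false matches.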
Indic Unicode Equivalence: A simple script that handles Unicode and transliteration equivalence issues in Indian languages (Hindi, Gujarati, and Bengali). Thanks to Jatin Sharma for help with understanding the cases to be handled.
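One common kind of Unicode equivalence issue in these scripts is that the same letter can be encoded either as a single precomposed code point or as a base letter plus a nukta. The sketch below shows how such cases can be compared using Python's standard `unicodedata` module. It is only an illustration under that assumption: the helper name `canonical_equal` is hypothetical, this is not the script's own code, and it does not cover the transliteration equivalence cases the script also handles.

```python
import unicodedata


def canonical_equal(a, b):
    """Compare two strings after Unicode NFC normalization.

    Canonically equivalent encodings (for example a precomposed nukta
    letter versus base letter + nukta sign) compare equal.
    """
    return unicodedata.normalize("NFC", a) == unicodedata.normalize("NFC", b)


# Devanagari QA: precomposed U+0958 versus KA (U+0915) + NUKTA (U+093C).
hindi_precomposed = "\u0958"
hindi_decomposed = "\u0915\u093c"

# Bengali YYA: precomposed U+09DF versus YA (U+09AF) + NUKTA (U+09BC).
bengali_precomposed = "\u09df"
bengali_decomposed = "\u09af\u09bc"
```

A plain `==` comparison treats each pair as different strings, whereas `canonical_equal` treats them as the same text. Normalizing once at ingestion time, rather than at every comparison, is usually the simpler design when building an index or a lookup table.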
Other Useful Datasets
- Crowdflower data library: A publicly released collection of datasets from Crowdflower for research