Technology at IIIT-A

The following technologies are being developed for the Universal Digital Library (UDL) under the supervision of Dr. U. S. Tiwari, Dr. S. Sanyal and Dr. R. Sanyal.

1. Development and upgrading of Optical Character Recognition systems for Indian languages.

2. Cross-lingual Information Retrieval System and search engines between
Indian languages.

3. Multimedia authoring tools in Indian languages.

4. Multilingual, multimedia, multi-interface information access for the physically
challenged.

The above four technologies cover the following development topics:

Video conferencing along with messaging facilities: This utility uses the RTP and SIP protocols to set up streaming video and video conferencing.

Text to speech (both in Hindi and English): The English TTS is based on the approach and database provided by CMU. The audio database for the Hindi TTS is being developed. An initial prototype of the Hindi TTS is ready.

Content development, content management and user management system: A Content Management System has been developed as part of the Annotation System to maintain the available records (e-books, media files, etc., or collections of these represented as a course file) in a well-organized way and to manage the users involved in the process.

Multimodal user interface for deaf-dumb people: Deaf and dumb persons can read text only if they are literate, whereas sign language is known to almost all of them. This module addresses the problem by providing an interface in sign language. The text to sign language transliteration tool was implemented successfully using the Java3D API. New words can be added, and existing words deleted or updated, through an easy-to-use interface that even a layperson can operate.
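The word-maintenance facility described above can be pictured as a small lookup table from words to sign animations. The following Python sketch is purely illustrative and is not the project's Java3D implementation: the class name `SignDictionary`, the sign identifiers, and the letter-by-letter fingerspelling fallback for unknown words are all assumptions made for this example.

```python
class SignDictionary:
    """Hypothetical word-to-sign lookup with add, update and delete."""

    def __init__(self):
        self._signs = {}  # lowercase word -> sign animation identifier

    def add_word(self, word, sign_id):
        key = word.lower()
        if key in self._signs:
            raise ValueError(f"'{word}' already exists; use update_word")
        self._signs[key] = sign_id

    def update_word(self, word, sign_id):
        key = word.lower()
        if key not in self._signs:
            raise KeyError(word)
        self._signs[key] = sign_id

    def delete_word(self, word):
        # Raises KeyError if the word is not in the dictionary.
        del self._signs[word.lower()]

    def translate(self, text):
        """Return the sequence of sign identifiers for the given text.

        Assumption: unknown words are fingerspelled letter by letter.
        """
        sequence = []
        for word in text.lower().split():
            if word in self._signs:
                sequence.append(self._signs[word])
            else:
                sequence.extend(f"LETTER_{ch.upper()}" for ch in word if ch.isalpha())
        return sequence


signs = SignDictionary()
signs.add_word("hello", "SIGN_HELLO")
print(signs.translate("Hello world"))
```

In a real system each identifier would select an animation to render; here the identifiers are plain strings so that the lookup logic can be read on its own.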

Editor (multilingual): A complete multilingual word processor has been developed for various Indian languages and English. The output of the editor is comparable with that of a standard word processor.

Development of Optical Character Recognition systems for Indian Languages: The individual characters of the original text have to be recovered from the scanned image by the system. The OCR system extracts features from the scanned image and performs the recognition task. A basic architecture for multilingual OCR has been developed.

Cross-lingual Information Retrieval System and search engines between Indian languages: Documents in three languages, namely Hindi, English and Sanskrit, will be taken. Initially, only the retrieval of information from such documents will be considered. An interface will also be developed. An agent-based architecture will be developed for cognitive modeling and retrieval. A basic search engine has been developed for the IRS; it currently works for English search phrases.
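As a rough illustration of what a basic search engine for English search phrases involves, the Python sketch below builds an inverted index and returns the documents that contain every word of a query. It is a minimal example under stated assumptions, not the project's engine: the function names, the sample documents, and the all-words-must-match rule are invented for illustration, and no cross-lingual or agent-based component is modeled.

```python
import re
from collections import defaultdict


def tokenize(text):
    """Lowercase the text and split it into alphanumeric word tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def build_index(documents):
    """Map each token to the set of document ids that contain it."""
    index = defaultdict(set)
    for doc_id, text in documents.items():
        for token in tokenize(text):
            index[token].add(doc_id)
    return index


def search(index, phrase):
    """Return the ids of documents containing every token of the phrase."""
    tokens = tokenize(phrase)
    if not tokens:
        return set()
    result = set(index.get(tokens[0], set()))
    for token in tokens[1:]:
        result &= index.get(token, set())
    return result


# Hypothetical sample collection, used only to exercise the sketch.
documents = {
    "d1": "Digital library of Indian languages",
    "d2": "Optical character recognition for Hindi",
    "d3": "Search engines for a digital library",
}
index = build_index(documents)
print(sorted(search(index, "digital library")))  # prints ['d1', 'd3']
```

Extending such an index to Hindi or Sanskrit would at least require a tokenizer suited to those scripts, which this sketch does not attempt.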

Semantic analysis of English documents: The goal is to generate a concise document description that is more revealing than the title but short enough to be absorbed at a single glance, so that it can be used for rapid "relevance assessment". The user specifies the amount of summarization desired, and the software generates an extract of the given document that conveys its main idea. The user can also opt to display the "important keywords" in the document. This Document Summarizer can be used as an API by other software and also has a well-developed graphical user interface. Rules for interrogative sentences in sentence splitting still need to be worked out, as does pronoun resolution.
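The behaviour described above (a user-chosen amount of summarization, an extract made of original sentences, and a list of important keywords) can be sketched with a simple word-frequency scorer. The Python example below is only an assumption about how such a tool might work, not the project's actual method: the stop-word list, the scoring rule, the naive sentence splitter and all function names are hypothetical. Like the system described, it does not handle pronoun resolution.

```python
import re
from collections import Counter

# Small illustrative stop-word list (an assumption, not the project's list).
STOPWORDS = {"the", "a", "an", "of", "and", "is", "in", "to", "for", "it",
             "that", "this", "on", "are", "as", "by", "with"}


def split_sentences(text):
    """Naive splitter: break after '.', '!' or '?' followed by whitespace."""
    return [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]


def content_words(text):
    """Lowercase alphabetic words with stop words removed."""
    return [w for w in re.findall(r"[a-z]+", text.lower()) if w not in STOPWORDS]


def keywords(text, count=5):
    """Return the most frequent content words in the text."""
    return [w for w, _ in Counter(content_words(text)).most_common(count)]


def summarize(text, ratio=0.3):
    """Keep roughly `ratio` of the sentences, chosen by average word frequency."""
    sentences = split_sentences(text)
    if not sentences:
        return ""
    freq = Counter(content_words(text))
    keep = max(1, round(len(sentences) * ratio))

    def score(sentence):
        words = content_words(sentence)
        return sum(freq[w] for w in words) / len(words) if words else 0.0

    ranked = sorted(range(len(sentences)),
                    key=lambda i: score(sentences[i]), reverse=True)
    chosen = sorted(ranked[:keep])  # restore original sentence order
    return " ".join(sentences[i] for i in chosen)


sample = ("Digital libraries store books. "
          "Digital libraries help readers find books. "
          "The weather was pleasant.")
print(summarize(sample, ratio=0.34))
print(keywords(sample, count=3))
```

The `ratio` parameter plays the role of the user-specified amount of summarization; selected sentences are emitted in their original order so that the extract stays readable.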
