Technology at IIIT-A
Technologies are being developed for the Universal Digital Library (UDL)
under the supervision of Dr. U. S. Tiwari, Dr. S. Sanyal and Dr. R. Sanyal.
1. Development and upgrading of Optical Character Recognition systems
   for Indian languages.
2. Cross-lingual Information Retrieval System and search engines
   between Indian languages.
3. Multimedia authoring tools in Indian languages.
4. Multilingual, multimedia, multi-interface information access for the
   physically challenged.
The four technologies above cover the following development topics:
Video conferencing along with messaging facilities: This utility uses
the RTP and SIP protocols to set up streaming video and video
conferencing.
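As background on the RTP side, the sketch below packs and parses the 12-byte fixed RTP header defined in RFC 3550. It is an illustration only, not the utility's own code: the function names `pack_rtp_header` and `unpack_rtp_header` are hypothetical, and the header is assumed to carry no padding, extension, or CSRC entries.

```python
import struct

RTP_VERSION = 2  # version field value required by RFC 3550


def pack_rtp_header(payload_type, sequence, timestamp, ssrc, marker=False):
    """Pack the 12-byte fixed RTP header (assumes P=0, X=0, CC=0)."""
    byte0 = RTP_VERSION << 6                          # V=2 in the top two bits
    byte1 = (int(marker) << 7) | (payload_type & 0x7F)
    return struct.pack(
        "!BBHII",                                     # network byte order
        byte0,
        byte1,
        sequence & 0xFFFF,
        timestamp & 0xFFFFFFFF,
        ssrc & 0xFFFFFFFF,
    )


def unpack_rtp_header(packet):
    """Parse the first 12 bytes of an RTP packet into a dict."""
    byte0, byte1, sequence, timestamp, ssrc = struct.unpack("!BBHII", packet[:12])
    return {
        "version": byte0 >> 6,
        "marker": bool(byte1 >> 7),
        "payload_type": byte1 & 0x7F,
        "sequence": sequence,
        "timestamp": timestamp,
        "ssrc": ssrc,
    }
```

In a real session, SIP would negotiate the media parameters (such as the payload type) before any RTP packets are sent; that signalling step is not shown here.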
Text to speech (both in Hindi and English): The English TTS is based
on the approach and database provided by CMU. The audio database for
the Hindi TTS is being developed, and an initial prototype of the Hindi
TTS is ready.
Content development, content management and user management system:
A Content Management System has been developed as part of the
Annotation System. It keeps the available records (e-books, media
files, etc., or collections of these represented as a course file)
well organized and manages the users involved in the process.
Multimodal user interface for deaf-dumb people: Deaf and dumb persons
can read text only if they are literate, whereas almost all of them
know sign language. This module addresses that gap with an interface
in sign language. The text to sign language transliteration tool was
implemented successfully using the Java3D API. New words can be added,
and existing words deleted or updated, through a user interface that is
easy enough for a layman to use.
Editor (multilingual): A complete multilingual word processor has been
developed for various Indian languages and English. The output of the
editor is comparable with that of a standard word processor.
Development of Optical Character Recognition systems for Indian
Languages: The system has to recover the individual characters of the
original text from the scanned image. The OCR system extracts features
from the scanned image and performs the recognition task. A basic
architecture for multilingual OCR has been developed.
Cross-lingual Information Retrieval System and search engines between
Indian languages: Documents in three languages, namely Hindi, English
and Sanskrit, will be taken. Initially, only the retrieval of
information from such documents will be considered. The interface will
also be developed. An agent-based architecture will be developed for
cognitive modeling and retrieval. A basic search engine has been
developed for the IRS; it currently works for English search phrases.
Semantic analysis of English document: The goal is to generate a
concise document description that is more revealing than the title but
short enough to be absorbed at a single glance, so that it can be used
for rapid “relevance assessment”.
The user specifies the amount of summarization desired, and the
software generates an extract of the given document that conveys its
main idea. The user can also opt to show the “important keywords” in
the document. This Document Summarizer can be used as an API by other
software and also has a well-developed graphical user interface. Rules
for handling interrogative sentences during sentence splitting still
need to be worked out, as does pronoun resolution.
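The behaviour described above (a user-chosen summary length, an extract of original sentences, and a keyword list) can be sketched as follows. The scoring method is an assumption: the text does not say how the Document Summarizer ranks sentences, so this example uses simple word-frequency scoring. The names `summarize`, `keywords`, `split_sentences` and the small `STOPWORDS` set are hypothetical, and the sentence splitter is deliberately naive.

```python
import math
import re
from collections import Counter

# Tiny illustrative stopword list; a real system would use a fuller one.
STOPWORDS = {"the", "a", "an", "of", "in", "and", "to", "is", "was", "for", "on", "it"}


def content_words(text):
    """Lowercased alphabetic words with stopwords removed."""
    return [w for w in re.findall(r"[a-z]+", text.lower()) if w not in STOPWORDS]


def split_sentences(text):
    """Naive splitter: break after '.', '!' or '?' followed by whitespace."""
    return [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]


def keywords(text, top_n=5):
    """Most frequent content words in the document."""
    return [w for w, _ in Counter(content_words(text)).most_common(top_n)]


def summarize(text, ratio=0.3):
    """Return roughly `ratio` of the sentences, kept in document order."""
    sentences = split_sentences(text)
    if not sentences:
        return []
    freq = Counter(content_words(text))

    def score(sentence):
        words = content_words(sentence)
        # Average word frequency, so long sentences are not favoured.
        return sum(freq[w] for w in words) / len(words) if words else 0.0

    keep = max(1, math.ceil(len(sentences) * ratio))
    ranked = sorted(range(len(sentences)),
                    key=lambda i: score(sentences[i]), reverse=True)[:keep]
    return [sentences[i] for i in sorted(ranked)]
```

Calling `summarize(text, ratio=0.5)` keeps about half of the sentences, and `keywords(text)` returns the candidate “important keywords”. Pronoun resolution is not attempted here, so an extracted sentence may still contain a pronoun whose referent was dropped.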