Funding Organization: GSRT
Funding Programme: National Action «ΕΡΕΥΝΩ-ΔΗΜΙΟΥΡΓΩ-ΚΑΙΝΟΤΟΜΩ» (Research-Create-Innovate)
Funding Instrument: Business Partnerships with Research Organizations
Start Date:
Duration: 30 months
Total Budget: 843.341,06 EUR
ITI Budget: 299.620,97 EUR

The “Επικοινωνώ” (“I communicate”) project aims to implement a system, in the form of a mobile or tablet application, that enables easy and efficient real-time communication between deaf and hearing individuals. The primary objective is to recognize the hand gestures of deaf signers performed in Greek Sign Language (GSL).

For this project, a deep learning network architecture will be designed to identify human gestures and body movements, using innovative methods based on large-scale deep neural networks. The network will use multidimensional information (color, depth, movement, human skeleton) to identify sign meanings on multiple levels (whole-body movements, gestures). An important challenge is embedding the developed neural network on the smart device, which has limited processing power.

Once the signed communication has been converted into subsets of information elements, structures and parameters (such as a dictionary of hand shapes, hand movements, palm orientations, hand positions relative to the signer’s body, and body movements), the next step is to map these parameters to the syntax of the written language. This mapping will draw on linguistic knowledge of how the levels of natural languages operate and interact, in order to codify how the above parameters combine to shape the final meaning of a sentence in sign language. The written text will then be converted into speech using MLS’s Text To Speech (TTS) technology.

For the inverse process, the spoken signal will be captured by the device’s microphone and converted into written text using MLS’s Automatic Speech Recognition (ASR) technology. The written text will be “translated” into GSL using correlations between the linguistic levels of spoken Greek and the morphological, syntactic and semantic structures of GSL. The output finally displayed on the device’s screen will be an anthropomorphic animation (avatar) that performs the computed sign syntax. Alongside the avatar, the written text can be displayed as subtitles to help deaf users with relatively limited comprehension of GSL.

The proposed system will be tested in real-world cases, taking into account both the difficulty of having a GSL interpreter permanently present (e.g. in extraordinary circumstances) and the privacy concerns that arise when deaf people must resort to a non-certified interpreter. An indicative use case to be considered is access to public services.
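
The two conversion directions described above (sign to speech, and speech to sign) can be pictured as two short chains of stages. The sketch below is only an illustration of that data flow under stated assumptions: every class, field and type name is hypothetical, the stages are plain callables, and nothing here reflects the project's actual interfaces or MLS's TTS/ASR APIs.

```python
from dataclasses import dataclass
from typing import Callable, List, Tuple

# Hypothetical stand-in types, chosen only for illustration.
Frames = List[bytes]   # captured video frames
Audio = bytes          # audio signal (microphone input or synthesized speech)
Glosses = List[str]    # sequence of recognized or generated sign units


@dataclass
class SignToSpeech:
    """Sign -> written text -> speech (assumed three-stage chain)."""
    recognize_signs: Callable[[Frames], Glosses]  # gesture recognition network
    glosses_to_text: Callable[[Glosses], str]     # mapping to written-language syntax
    text_to_speech: Callable[[str], Audio]        # TTS engine

    def run(self, frames: Frames) -> Audio:
        glosses = self.recognize_signs(frames)
        text = self.glosses_to_text(glosses)
        return self.text_to_speech(text)


@dataclass
class SpeechToSign:
    """Speech -> written text -> GSL structure -> avatar plus subtitles."""
    speech_to_text: Callable[[Audio], str]        # ASR engine
    text_to_glosses: Callable[[str], Glosses]     # translation into GSL structure
    render_avatar: Callable[[Glosses], str]       # avatar animation (a string stands in here)

    def run(self, audio: Audio) -> Tuple[str, str]:
        text = self.speech_to_text(audio)
        glosses = self.text_to_glosses(text)
        # Return the animation together with the text shown as subtitles.
        return self.render_avatar(glosses), text
```

Keeping each stage behind a plain callable means a stage (for example, the recognition network running on the device) could be swapped or stubbed in tests without touching the rest of the chain.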