Abstract:
People in developing countries who lack tertiary education face hurdles in using digital platforms for communication. The linguistic diversity of this section of the population makes the design of a near-universal digital enablement methodology a challenging task. It is therefore pivotal to build a language-agnostic methodology that uses the bare minimum of text to achieve digital communication across language boundaries. This would also help in bridging the "Digital Divide". In this paper, we illustrate the building of a Multimodal Semantographic Metalanguage (MSM) using Machine Learning (ML), Natural Language Processing (NLP) and Natural Semantic Metalanguage (NSM). The proposed methodology uses pictographs and ideographs, which are visually more distinctive, simpler to understand and quicker to learn, and are appropriate for achieving digital literacy among semi-literates. We validate our claim on a dataset compiled from text messages written by semi-literates. We observe that the proposed approach can successfully communicate semantic elements across semi-literates with different linguistic backgrounds with an accuracy of more than 80%.