Language Identification and Context-based Analysis of Code-switching Behaviors in Social Media Discussions

dc.contributor.author: Sharma, Yashvardhan
dc.date.accessioned: 2024-11-15T09:11:47Z
dc.date.available: 2024-11-15T09:11:47Z
dc.date.issued: 2019
dc.description.abstract: Social media discussions see the participation of multilingual individuals who tend to utilize alternate languages in a single post (code-switching) for effective communication in a discussion. This paper attempts to characterize such discussions to analyze contextual factors related to multilingual communities. Features extracted from the posts are used to train a CRF-based sequence labeling algorithm for language identification in an intra-sentential code-switching scenario. The context of a sentence in a discussion is modeled by defining relevance through Term Frequency-Inverse Document Frequency (TF-IDF). Further context of a multilingual sentence with respect to the discussion, such as agreement and questioning between pairs of posts, is also modeled. [en_US]
dc.identifier.uri: https://ieeexplore.ieee.org/abstract/document/9006032
dc.identifier.uri: https://dspace.bits-pilani.ac.in/handle/123456789/16392
dc.language.iso: en [en_US]
dc.publisher: IEEE [en_US]
dc.subject: Computer Science [en_US]
dc.subject: Code-switching [en_US]
dc.subject: Data mining [en_US]
dc.subject: Language identification [en_US]
dc.subject: CRF [en_US]
dc.title: Language Identification and Context-based Analysis of Code-switching Behaviors in Social Media Discussions [en_US]
dc.type: Article [en_US]
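
The abstract mentions defining the relevance of a sentence within a discussion through TF-IDF. The following is only a minimal sketch of that general idea, not the paper's implementation: the whitespace tokenizer, the smoothed IDF formula, the use of cosine similarity as the relevance score, the function names `tfidf_vectors` and `cosine`, and the toy posts are all assumptions made for illustration.

```python
import math
from collections import Counter


def tfidf_vectors(posts):
    """Return one {term: weight} dict per post.

    Assumptions (not taken from the paper): lowercase whitespace tokens,
    length-normalized term frequency, and a smoothed IDF of
    log((1 + N) / (1 + df)) + 1.
    """
    tokenized = [post.lower().split() for post in posts]
    n_posts = len(tokenized)
    # Document frequency: number of posts in which each term appears.
    doc_freq = Counter(term for tokens in tokenized for term in set(tokens))
    vectors = []
    for tokens in tokenized:
        counts = Counter(tokens)
        total = len(tokens) or 1  # avoid division by zero for empty posts
        vectors.append({
            term: (count / total)
            * (math.log((1 + n_posts) / (1 + doc_freq[term])) + 1.0)
            for term, count in counts.items()
        })
    return vectors


def cosine(u, v):
    """Cosine similarity between two sparse {term: weight} vectors."""
    dot = sum(weight * v.get(term, 0.0) for term, weight in u.items())
    norm_u = math.sqrt(sum(weight * weight for weight in u.values()))
    norm_v = math.sqrt(sum(weight * weight for weight in v.values()))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


if __name__ == "__main__":
    # Hypothetical code-switched (romanized Hindi-English) posts.
    posts = [
        "yeh movie bahut achhi thi",
        "movie was really good yaar",
        "traffic is terrible today",
    ]
    vecs = tfidf_vectors(posts)
    print("post 0 vs post 1:", cosine(vecs[0], vecs[1]))
    print("post 0 vs post 2:", cosine(vecs[0], vecs[2]))
```

In this toy thread the first two posts share the term "movie" and so receive a positive score, while the first and third posts share no terms and score zero. A real system would likely need a tokenizer and normalization suited to code-switched, transliterated text.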