MixSarc: A Bangla-English Code-Mixed Corpus For Implicit Meaning Identification

dc.contributor.author: Ahmed, Tamim
dc.contributor.author: Alam, Kazi Samin Yasar
dc.contributor.author: Chowdhury, Md Tanbir
dc.date.accessioned: 2026-06-24T09:43:22Z
dc.date.issued: 2025-10-25
dc.description: Supervised by Mr. Md Rafid Haque, Lecturer, Department of Computer Science and Engineering (CSE), Islamic University of Technology (IUT), Board Bazar, Gazipur, Bangladesh. This thesis is submitted in partial fulfillment of the requirement for the degree of Bachelor of Science in Computer Science and Engineering, 2025.
dc.description.abstract: This thesis focuses on detecting humor, sarcasm, offensiveness, and vulgarity in Bangla-English code-mixed text, an area largely overlooked in existing natural language processing (NLP) research. A novel dataset is proposed, to be created by scraping and filtering social media content, followed by manual annotation across four attributes. Two transformer-based approaches were explored at small scale: multi-class and multi-label text classification. The study also proposes future directions, including dataset balancing, comparative evaluation of transformer models and large language models (LLMs), and the introduction of a SarOff Score to better capture sarcasm-offense overlap. By addressing the complexities of code-mixed tone detection, this work advances NLP in low-resource, multilingual settings.
dc.identifier.uri: https://repository.iutoic-dhaka.edu/handle/123456789/2637
dc.language.iso: en
dc.publisher: Department of Computer Science and Engineering (CSE), Islamic University of Technology (IUT), Board Bazar, Gazipur-1704, Bangladesh
dc.title: MixSarc: A Bangla-English Code-Mixed Corpus For Implicit Meaning Identification
dc.type: Thesis

Files

Original bundle

Name: 50 Fulltext_ CSE_ MixSarc A Bangla-English Code-Mixed Corpus For Implicit Meaning.pdf
Size: 1.02 MB
Format: Adobe Portable Document Format

Name: 50 Turnitin Report_ CSE_200041150_200041119_200041114_PR.pdf
Size: 561.05 KB
Format: Adobe Portable Document Format

License bundle

Name: license.txt
Size: 1.71 KB
Format: Item-specific license agreed upon to submission
