MixSarc: A Bangla-English Code-Mixed Corpus For Implicit Meaning Identification
| dc.contributor.author | Ahmed, Tamim | |
| dc.contributor.author | Alam, Kazi Samin Yasar | |
| dc.contributor.author | Chowdhury, Md Tanbir | |
| dc.date.accessioned | 2026-06-24T09:43:22Z | |
| dc.date.issued | 2025-10-25 | |
| dc.description | Supervised by Mr. Md Rafid Haque, Lecturer, Department of Computer Science and Engineering (CSE), Islamic University of Technology (IUT), Board Bazar, Gazipur, Bangladesh. This thesis is submitted in partial fulfillment of the requirement for the degree of Bachelor of Science in Computer Science and Engineering, 2025. | |
| dc.description.abstract | This thesis focuses on detecting humor, sarcasm, offensiveness, and vulgarity in Bangla-English code-mixed text, an area largely overlooked in existing natural language processing (NLP) research. A novel dataset has been proposed, which will be created by scraping and filtering social media content, followed by manual annotation across four attributes. Two transformer-based approaches were explored in small scale: multi-class and multi-label text classification. The study also proposes future directions, including dataset balancing, comparative evaluation of transformer models and large language models (LLMs), and the introduction of a SarOff Score to better capture sarcasm-offense overlap. By addressing the complexities of code-mixed tone detection, this work advances NLP in low-resource, multilingual settings. | |
| dc.identifier.uri | https://repository.iutoic-dhaka.edu/handle/123456789/2637 | |
| dc.language.iso | en | |
| dc.publisher | Department of Computer Science and Engineering (CSE), Islamic University of Technology (IUT), Board Bazar, Gazipur-1704, Bangladesh | |
| dc.title | MixSarc: A Bangla-English Code-Mixed Corpus For Implicit Meaning Identification | |
| dc.type | Thesis |
Files
Original bundle
- Name:
- 50 Fulltext_ CSE_ MixSarc A Bangla-English Code-Mixed Corpus For Implicit Meaning.pdf
- Size:
- 1.02 MB
- Format:
- Adobe Portable Document Format
- Name:
- 50 Turnitin Report_ CSE_200041150_200041119_200041114_PR.pdf
- Size:
- 561.05 KB
- Format:
- Adobe Portable Document Format
License bundle
- Name:
- license.txt
- Size:
- 1.71 KB
- Format:
- Item-specific license agreed upon to submission
