Statement of Purpose Essay - Johns Hopkins
Motivation: The inequitable nature of language models can compound the disadvantages faced by minority communities across the world. In India, for example, factors such as location, occupation, economic status, name, and native language can serve as proxies for an individual's caste and/or religion [1], among other demographic attributes. The lack of transparency around the threats AI models pose to these communities, together with the fact that access to and control of these systems lies solely with a privileged elite, is a dire cause for concern. I want to address these issues through my research, which has focused on understanding and quantifying the harms caused by large language models (LLMs) and other natural language processing systems. To build equitable systems that incorporate cultural context, I aim to focus my research on the fairness and interpretability of LLMs and their intersection with privacy and efficiency. Some of the problems I would like to work on are:

• Quantifying and mitigating harms due to allocational and representational biases. Current research has yet to consider this from an intersectional, multicultural, and multilingual standpoint.
• Using interpretability methods to highlight fairness issues in language models and modeling interpretability in various downstream tasks; creating trustworthy LMs from the perspectives of privacy, efficiency, fairness, and interpretability.
• Studying the limitations and potential harms of language models, and their influence on user interaction.

Fairness, Interpretability, Privacy, and Efficiency: Despite the limited opportunities and guidance available at my college for academic research, I took the initiative to explore and discover my interest in NLP. I took online courses and attended conferences virtually whenever possible, which introduced me to a host of new areas of research, including fairness in NLP. A fellow student and I initiated our first independent research project in this area: a thorough examination of bias in machine translation systems. In [2], we evaluated bias in Hindi-English machine translation using both intrinsic (measuring bias in pretrained representations) and extrinsic (measuring bias in downstream tasks) methods. We also found that most existing methods of evaluating bias in machine translation rely exclusively on the mistranslation of binarized pronouns [3]; they do not provide information about a model's stereotypical tendencies in the extrinsic outputs, a flaw in our own work as well. As potential solutions to some of these problems, I would be interested in developing evaluation benchmarks and datasets that account for non-binary genders, as well as studies investigating whether MT systems rely on gendered and racial stereotypes to generate their outputs. In the absence of formal supervision, we had to carefully plan and execute key aspects of the paper ourselves, such as formulating the problem statement, crafting the experiments, and ensuring the paper addressed the problem thoroughly from both a technical and an ethical perspective. We modeled our approach on similar research and sought input from the relevant communities on ethical concerns.
Our research is among the first to address fairness evaluation for any task in the Indian context, which is indicative of the sheer scarcity of work on diagnosing biases in these systems. During my graduate studies, I would like to address problems in the Indic context through an intersectional lens, particularly given that the additional dimension of caste has been almost entirely excluded from the fairness narrative thus far.

My stint at Microsoft Research India, under the guidance of Dr. Sunayana Sitaram and Dr. Monojit Choudhury, has allowed me to work on bias in multilingual NLP and on the effects of model compression on LMs. Having kept up with research on bias in LMs, I sought to provide a critical overview of the literature on fairness in multilingual settings and in languages other than English. This culminated in a work outlining the challenges and unaddressed problems from technical, cultural, and linguistic perspectives. As we point out in this study, much of the existing work is Anglo-centric and does not account for cultural nuance. Because of linguistic differences, metrics cannot always be adapted across languages. Axes of discrimination that are more prominent in specific regions also tend to be ignored in the process, and even with respect to gender, research needs to acknowledge communities outside the Western purview. These pitfalls reinforce social hierarchies through technology. Bias metrics have themselves been shown to be fallible, and few papers perform the benchmarking required to understand how choices in the research design can influence measured bias. Moreover, given the diversity of languages and dialects and the variation in cultural nuance between them, both our evaluation methods and data-based approaches (which would require extensive curation) are unlikely to be sufficient in their current state for these systems to account for racial and cultural nuance. The vast expanse of this space requires us to consider strategies that are also practically feasible.

This study necessitated a comprehensive review of the literature on fairness and sociolinguistics, which gave me a broad understanding of the challenges in fairness evaluation and mitigation. Conducting the survey and critically analyzing the current state of research in this area enabled me to contribute significantly to the writing process. A research direction that stood out to me in particular was the development of parallel corpora for fairness evaluation: while several automated methods have been used to generate parallel corpora in other languages, recent research has demonstrated that this may be a flawed approach [4].

The need to consider these problems in deep learning from a practical standpoint has propelled me toward research on efficiency in LMs. Compressed models are heavily used because of the resource constraints imposed in real-world settings; however, most studies only consider the cost of compression in terms of the drop in performance. Although model compression has been shown to be detrimental to fairness in models trained on tabular data, no equivalent extensive study has been conducted for language models. I conceived the idea of carrying out a first-of-its-kind benchmarking analysis of pruned, quantized, and distilled language models for text classification, observing trends across these methods.
Being at an institution like MSR gave me access to the computational resources needed to pursue this work, and it made me realize how valuable such resources are to research. I also used this opportunity to connect with and coordinate other collaborators, whose experience with compression methods helped bring this idea to fruition. Previous studies such as [5], [6], and [7] delivered inconclusive and contradictory results and experimented with only a limited set of variables in their research design. To answer this question definitively, we i) incorporated multiple model architectures and methods of model compression, ii) conducted preliminary studies from a multilingual perspective, and iii) utilized a wide range of intrinsic and extrinsic evaluation measures. Both of these projects are currently under review at EACL 2023. Additionally, having gained a thorough understanding of the model compression methods used in NLP and how they compare to one another, I have begun looking at the development of compression techniques that minimize the resultant bias.

Prior to this, I interned on MSR's EzPC team under Dr. Rahul Sharma. The EzPC framework uses secure multi-party computation to compute functions over data without divulging information about that data. My work was a collaboration with Dr. Rijurekha Sen from IIT Delhi in which I used this framework for the secure training and inference of models for urban sensing problems, with the objective of balancing privacy, accuracy, and efficiency tradeoffs. This work is motivated by the need for traffic congestion estimation and air pollution monitoring in large cities, where data can be contributed by rival fleet companies; it is under review at PoPETS '23. I have since gained an understanding of the privacy elements critical to PPML, the costs and benefits of developing secure ML models, and the overall practical feasibility of PPML. Simultaneously, my forays into privacy have led me to question how differential privacy (DP) affects disparate treatment by LLMs, in addition to the performance tradeoff it would involve. However, DP in NLP is still in its infancy, with open questions surrounding its application to unstructured data and how it can be adapted to modern NLP architectures and contextual representations. Existing research in this field also asks which parts of a text must be kept private and how these models can be used to retrieve sensitive information. Although new to this field, I believe that privacy in language models is an essential element of responsible AI.

Another question I would like to probe further during my Ph.D. is the interpretability of LLMs. It would be interesting to observe whether the results from interpretability methods align with the results from fairness measures in indicating that a model is indeed making biased decisions. [8] demonstrates how differences in dialect can result in biased automated hate speech detection, and interpretability techniques could help highlight such biases. This could also be used to assess the efficacy of model interpretability algorithms. Relatedly, I would be interested in investigating whether hate speech detection and text generation systems are also biased against other dialects or variants within a language (such as code-mixed language). For instance, are text generation systems more prone to generating certain types of language in a negative or positive context?
Other Applications of AI for Social Good: The societal ramifications of models beyond FATE-related domains are also relevant to the motivation behind my work. Since 2021, I have worked with Dr. Sumeet Kumar and Dr. Ashique KhudaBukhsh on related problems. My initial foray aimed to quantify gender bias in videos for children; however, this quickly evolved into a critique of how SoTA ASR systems used by platforms like YouTube produce inappropriate, abusive transcriptions. While our original work [9] proposes an MLM-based algorithm to correct these slurs, we have been working on extending it by developing a fix at the ASR level, thus addressing the problem at its root. One issue I noticed and objected to during this work was the inclusion of words related to the queer community, including reclaimed terms, in existing abusive language lexicons. Tangentially, we discovered the vast degree of inconsistency across these lexicons [10] and the harm they have been known to cause when used without ethical consideration. Most of these lexicons do not mention crucial details about the annotation process used to develop them. Our ongoing extension of this work seeks to analyze these issues from a multilingual, multicultural standpoint and to highlight points that must be considered when developing such tools.

Future Goals: [Single paragraph regarding professors whose work I'm interested in and why.]

References

[1] Nithya Sambasivan, Erin Arnesen, Ben Hutchinson, Tulsee Doshi, and Vinodkumar Prabhakaran. Re-imagining algorithmic fairness in India and beyond. In Proceedings of the 2021 ACM Conference on Fairness, Accountability, and Transparency, 2021.

[2] Krithika Ramesh, Gauri Gupta, and Sanjay Singh. Evaluating gender bias in Hindi-English machine translation. In Proceedings of the 3rd Workshop on Gender Bias in Natural Language Processing, pages 16–23, Online, August 2021. Association for Computational Linguistics. doi: 10.18653/v1/2021.gebnlp-1.3. URL https://aclanthology.org/2021.gebnlp-1.3.

[3] Karolina Stanczak and Isabelle Augenstein. A survey on gender bias in natural language processing. CoRR, abs/2112.14168, 2021. URL https://arxiv.org/abs/2112.14168.

[4] Cristina España-Bonet and Alberto Barrón-Cedeño. The (undesired) attenuation of human biases by multilinguality. In Proceedings of the 2022 Conference on Empirical Methods in Natural Language Processing, pages 2056–2077, Online and Abu Dhabi, UAE, December 2022. Association for Computational Linguistics. URL https://aclanthology.org/2022.emnlp-main.133.

[5] Andrew Silva, Pradyumna Tambwekar, and Matthew Gombolay. Towards a comprehensive understanding and accurate evaluation of societal biases in pre-trained transformers. In Proceedings of the 2021 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, pages 2383–2389, Online, June 2021. Association for Computational Linguistics. doi: 10.18653/v1/2021.naacl-main.189. URL https://aclanthology.org/2021.naacl-main.189.

[6] Jaimeen Ahn, Hwaran Lee, Jinhwa Kim, and Alice Oh. Why knowledge distillation amplifies gender bias and how to mitigate from the perspective of DistilBERT. In Proceedings of the 4th Workshop on Gender Bias in Natural Language Processing (GeBNLP), pages 266–272, Seattle, Washington, July 2022. Association for Computational Linguistics. doi: 10.18653/v1/2022.gebnlp-1.27. URL https://aclanthology.org/2022.gebnlp-1.27.
[7] Yarden Tal, Inbal Magar, and Roy Schwartz. Fewer errors, but more stereotypes? The effect of model size on gender bias. In Proceedings of the 4th Workshop on Gender Bias in Natural Language Processing (GeBNLP), pages 112–120, Seattle, Washington, July 2022. Association for Computational Linguistics. doi: 10.18653/v1/2022.gebnlp-1.13. URL https://aclanthology.org/2022.gebnlp-1.13.

[8] Maarten Sap, Dallas Card, Saadia Gabriel, Yejin Choi, and Noah A. Smith. The risk of racial bias in hate speech detection. In Proceedings of the 57th Annual Meeting of the Association for Computational Linguistics, pages 1668–1678, Florence, Italy, July 2019. Association for Computational Linguistics. doi: 10.18653/v1/P19-1163. URL https://aclanthology.org/P19-1163.

[9] Krithika Ramesh, Ashiqur R. KhudaBukhsh, and Sumeet Kumar. 'Beach' to 'bitch': Inadvertent unsafe transcription of kids' content on YouTube. Proceedings of the AAAI Conference on Artificial Intelligence, 36(11):12108–12118, June 2022. doi: 10.1609/aaai.v36i11.21470. URL https://ojs.aaai.org/index.php/AAAI/article/view/21470.

[10] Krithika Ramesh, Sumeet Kumar, and Ashiqur KhudaBukhsh. Revisiting queer minorities in lexicons. In Proceedings of the Sixth Workshop on Online Abuse and Harms (WOAH), pages 245–251, Seattle, Washington (Hybrid), July 2022. Association for Computational Linguistics. doi: 10.18653/v1/2022.woah-1.23. URL https://aclanthology.org/2022.woah-1.23.