Abhinav Rao

I am heading over to LREC-CoLING 2024! Feel free to catch up with me in Torino from the 22nd to the 24th of May 🇮🇹


Language Technologies Institute,

School of Computer Science,

Gates and Hillman Centre,

Carnegie Mellon University,

Pittsburgh, PA, 15213

abhinavr (at) cs.cmu.edu

abhinav (dot) 797c (at) gmail.com

I am a Master’s student at Carnegie Mellon University majoring in Natural Language Processing. I am currently working in Prof. Maarten Sap’s Lab on Harmlessness and Factuality in Language Models in the context of social cultures. I am particularly interested in studying biases and toxicity in language models, and I hope to extend this work to multiple languages and modalities (RAI).

Prior to joining Carnegie Mellon, I spent a year as a predoctoral research fellow at Microsoft Turing, where I worked on responsible AI for Large Language Models. I was part of Dr. Monojit Choudhury’s group, where we worked on exciting problems such as evaluating the ethical reasoning capabilities of Large Language Models and Jailbreaking LLMs. As part of my product work at Turing India, I was involved in building Bing Chat, where I developed multilingual classifiers for detecting harmful content!

Before this, I interned at Microsoft Research India with Dr. Sunayana Sitaram on project LITMUS, where I worked with the Bing Defensive team at Microsoft’s Search Technology Center of India (STCI) to develop a multilingual query expansion pipeline.

As part of my undergraduate thesis at SpeechLab, NTU, I was advised by Prof. Chng Eng-Siong on Multilingual natural language processing. [Paper] [Code]

In my free time, I like to walk around campus and Pittsburgh. I also like learning about languages and their origins; I am involved in a project with Monojit and Quincy Amoah on the Sumerian Language! I also closely follow the demoscene (Site’s down! Alternate link). I’ve not gotten into coding anything yet, but I’d like to some day in the near or far future. I usually prefer demos on the Commodore or PC-DOS, and am especially awed by size-constrained ones.


Apr 2024 Our work at CMU, titled “NORMAD: A Benchmark for Measuring the Cultural Adaptability of Large Language Models”, is out as a preprint; here’s our X thread. Code and Data to be out soon!
Apr 2024 Our work from Microsoft, titled “Tricking LLMs into Disobedience: Formalizing, Analyzing and Detecting Jailbreaks”, is accepted at LREC-CoLING 2024. Heading over there to present it! Preprint, X thread.
Oct 2023 Our work from Microsoft, titled “Ethical Reasoning over Moral Alignment: A Case and Framework for In-Context Ethical Policies in LLMs” is accepted at EMNLP Findings 2023. Find the preprint here, and the Twitter (X?) thread here.
Aug 2023 Joined Carnegie Mellon University as a Master’s student in LTI
May 2023 Preprint out on jailbreaking large language models.
Nov 2022 Paper out on Multilingual Punctuation Restoration at APSIPA 2022
Aug 2022 Joined Microsoft as a Research Fellow!
Jan 2022 Joined Microsoft Research as a Research Intern!
Aug 2021 Joined SpeechLab, Nanyang Technological University as a Research Intern!
Jun 2021 Joined Oracle Corporation as an SDE Intern!

Academic Advisors

I have been fortunate to work with great researchers at and outside Microsoft

Selected Papers

  1. Ethical Reasoning over Moral Alignment: A Case and Framework for In-Context Ethical Policies in LLMs
    Abhinav Rao*, Aditi Khandelwal*, Kumar Tanmay*, and 2 more authors
  2. Tricking LLMs into Disobedience: Understanding, Analyzing, and Preventing Jailbreaks
    Abhinav Rao*, Sachin Vashistha*, Atharva Naik, and 2 more authors
  3. Punctuation Restoration for Singaporean Spoken Languages: English, Malay, and Mandarin
    Abhinav Rao, Ho Thi-Nga, and Chng Eng Siong
    In 2022 Asia-Pacific Signal and Information Processing Association Annual Summit and Conference (APSIPA ASC), 2022