Generative AI (GAI) systems such as ChatGPT have revolutionised the way we interact with AI systems. These models can provide precise and detailed answers to our information needs, expressed in the form of brief text-based prompts. However, some of the responses generated by GAI systems can contain harmful social biases, such as gender or racial biases. Detecting and mitigating such biased responses is an important step towards establishing user trust in GAI. In this talk, I will describe the latest developments in methodologies for detecting social biases in texts generated by GAI systems. First, I will describe methods that can detect social biases expressed not only in English but also in other languages, with minimal human intervention. This is particularly important when scaling social bias evaluation to many languages. Second, I will describe methods that can be used to mitigate the identified social biases in large-scale language models. Experiments show that although some social biases can be identified and mitigated with high accuracy, the existing techniques are not perfect, and indirect associations remain in generative NLP models. Finally, I will describe ongoing work in the NLP community to address these shortcomings and develop AI systems that are not only accurate but also trustworthy.
