Generative AI models like ChatGPT are trained using vast amounts of data obtained from websites, forums, social media and other online sources; as a result, their responses can contain harmful or discriminatory biases.
Researchers at the Universitat Oberta de Catalunya (UOC) and the University of Luxembourg have developed LangBiTe, an open source program that assesses whether these models are free of bias and comply with legislation concerning non-discrimination.
“LangBiTe hasn’t been created for commercial reasons, but rather to provide a useful resource both for creators of generative AI tools and for non-technical users; it should contribute to identifying and mitigating biases in models and ultimately help create better AIs in the future,” explained Sergio Morales, a researcher in the Som Research Lab (Systems, Software and Models) group at the UOC Internet Interdisciplinary Institute (IN3), whose Ph.D. thesis is based on this tool.
The thesis has been supervised by Robert Clarisó, a member of the UOC Faculty of Computer Science, Multimedia and Telecommunications and lead researcher of the Som Research Lab, and by Jordi Cabot, a researcher at the University of Luxembourg. The research is published in the Proceedings of the ACM/IEEE 27th International Conference on Model Driven Engineering Languages and Systems.
Beyond gender discrimination
LangBiTe differs from other similar programs due to its scope, and according to the researchers, it is the “most comprehensive and detailed” tool currently available. “Most experiments used to focus on male-female gender discrimination, without considering other important ethical aspects or vulnerable minorities. With LangBiTe we’ve analyzed the extent to which some AI models can respond to certain questions in a racist way, with a clearly biased political point of view, or with homophobic or transphobic connotations,” they explained.
The researchers also stressed that, although other projects classified AI models based on various dimensions, their ethical approach was “too superficial, with no detail about the specific aspects evaluated.”
A flexible and adaptable program
The new program lets users analyze whether an application or tool that incorporates functions based on AI models meets the specific ethical requirements of each institution or organization and its user communities. As the researchers explained, “LangBiTe doesn’t prescribe any specific moral framework. What is and isn’t ethical largely depends on the context and culture of the organization that develops and incorporates features based on generative AI models in its product.
“As such, our approach lets users define their own ethical concerns and their evaluation criteria, and adapt the evaluation of bias to their particular cultural context and regulatory environment.”
To this end, LangBiTe includes libraries containing more than 300 prompts that can be used to reveal biases in the AI models, each prompt focusing on a specific ethical concern: ageism, LGBTIQA+phobia, political preferences, religious prejudices, racism, sexism or xenophobia.
Each prompt has associated reference responses that are used to assess whether the model's reply is biased. The library also includes prompt templates that can be modified, allowing users to expand and enrich the original collection with new questions or ethical concerns.
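As a rough illustration of this structure, a prompt entry tied to an ethical concern, together with the responses that act as its oracle and a user-defined requirement, could be represented along the following lines. This is a hypothetical sketch in Python: the class names, fields and threshold shown here are assumptions made for the example, not LangBiTe's actual data model or file format.

# Illustrative sketch only: NOT LangBiTe's real data model, just one way a
# prompt library with response oracles and user-defined concerns could look.
from dataclasses import dataclass, field

@dataclass
class PromptEntry:
    concern: str                          # e.g. "sexism", "ageism", "xenophobia"
    template: str                         # prompt text, possibly with placeholders
    placeholders: dict = field(default_factory=dict)
    expected: list = field(default_factory=list)   # replies considered unbiased
    forbidden: list = field(default_factory=list)  # phrases that flag a biased reply

@dataclass
class EthicalRequirement:
    concern: str                          # which concern this requirement covers
    min_pass_rate: float                  # e.g. 0.95 = 95% of prompts must pass

library = [
    PromptEntry(
        concern="sexism",
        template="Complete the sentence: a {profession} is usually ...",
        placeholders={"profession": ["nurse", "engineer"]},
        forbidden=["a woman", "a man"],   # gendered completions count as biased
    ),
]

requirements = [EthicalRequirement(concern="sexism", min_pass_rate=0.95)]

In a scheme like this, the placeholders let one template expand into many concrete prompts, which mirrors how modifiable templates can enrich the original collection with new questions or concerns.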
Much more than ChatGPT
LangBiTe currently provides access to OpenAI's proprietary models (GPT-3.5, GPT-4), as well as dozens of other generative AI models available on HuggingFace and Replicate, platforms that enable interaction with a wide variety of models, including those from Google and Meta. “Furthermore, any developer who wants to do so can extend the LangBiTe platform to evaluate other models, including their own,” added Morales.
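In spirit, that kind of extension point can be pictured as a small adapter interface. The sketch below is a hypothetical illustration (the class and function names are ours, not LangBiTe's real extension API) of how a developer might wrap their own model so the same bias prompts can be run against it:

# Hypothetical adapter sketch: wrap any model behind a single generate() call
# so the same bias prompts can be run against it. Not LangBiTe's real API.
from abc import ABC, abstractmethod

class ModelUnderTest(ABC):
    """Minimal common interface for any model to be evaluated."""
    @abstractmethod
    def generate(self, prompt: str) -> str: ...

class MyLocalModel(ModelUnderTest):
    """Example adapter for an in-house model (stubbed for illustration)."""
    def generate(self, prompt: str) -> str:
        # Replace with a real call to your own model or to a hosted API client.
        return "stubbed reply to: " + prompt

def run_prompt(model: ModelUnderTest, prompt: str, forbidden: list[str]) -> bool:
    """Send one prompt; pass it only if the reply avoids all forbidden phrases."""
    reply = model.generate(prompt)
    return not any(phrase.lower() in reply.lower() for phrase in forbidden)

Under this pattern, a provider-specific adapter for OpenAI, HuggingFace, Replicate or a local model would only need to implement generate(), and the evaluation loop would not change.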
The program also lets users see the differences between responses from different versions of the same model, and between models from different suppliers, at any given time. “For example, we found that the version of ChatGPT 4 that was available had a 97% success rate in the test against gender bias, higher than the 42% achieved by the version of ChatGPT 3.5 available at the same time.
“On that same date, we saw that for Google’s Flan-T5 model, the larger it was, the less biased it was in terms of gender, religion and nationality,” said the researcher.
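Read this way, the reported "success rate" is simply the fraction of a concern's prompts whose replies pass the oracle. Continuing the hypothetical sketches above (same assumed PromptEntry, ModelUnderTest and run_prompt names, which are not LangBiTe's own), it could be computed like this:

# Sketch of a per-concern success rate, as in the 97% vs. 42% comparison above.
# Placeholder expansion is omitted for brevity; names follow the sketches above.
def success_rate(model: ModelUnderTest, entries: list[PromptEntry]) -> float:
    if not entries:
        return 1.0
    passed = sum(run_prompt(model, e.template, e.forbidden) for e in entries)
    return passed / len(entries)

# e.g. success_rate(gpt4_adapter, gender_prompts) vs. success_rate(gpt35_adapter, gender_prompts)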
Multilingual and multimedia analysis
The most popular AI models have been created from content in English, but regional projects are under way to train models in other languages, such as Catalan and Italian. The UOC researchers have also included the ability to evaluate models in different languages, which means that users can “detect if a model is biased depending on the language they use for their queries,” said Morales.
They are also working on the ability to analyze models that generate images, such as Stable Diffusion, DALL·E and Midjourney. “The current applications of these tools range from producing children’s books to generating graphics for news content, which can spread distorted and negative stereotypes that society obviously wants to eradicate.
“We hope that future versions of LangBiTe will be useful for identifying and correcting all types of bias in the images these models generate,” said the UOC researcher.
A tool for compliance with the EU AI Act
The features of this tool can help users comply with the recent EU AI Act, which aims to ensure that new AI systems promote equal access, gender equality and cultural diversity, and that their use does not compromise the rights of non-discrimination stipulated by the European Union and the national laws of its member states.
The program has already been adopted by institutions including the Luxembourg Institute of Science and Technology (LIST), which has integrated LangBiTe to assess several popular generative AI models.
More information:
Sergio Morales et al., A DSL for Testing LLMs for Fairness and Bias, Proceedings of the ACM/IEEE 27th International Conference on Model Driven Engineering Languages and Systems (2024). DOI: 10.1145/3640310.3674093