How AI Bias in Search Engines Contributes to Disinformation
January 27, 2022
William & Mary's DisinfoLab conducted a World in 2050-commissioned study comparing bias embedded in Google Search and GPT-3. DisinfoLab's directors discuss the implications of their findings for the future of search engines and what they mean for the spread of disinformation.
In late 2021, the tech company OpenAI publicly released its Generative Pre-trained Transformer 3 (GPT-3), a cutting-edge AI system that generates text in any form—including prose, poetry, and dialogue—based on a prompt given by the user. With support from Microsoft, OpenAI will market GPT-3 to companies for everyday commercial uses. One of these potential uses is search engines, for which GPT-3’s autocomplete generations currently include high rates of bias, according to a new study by DisinfoLab, a student-led think tank at the College of William & Mary’s Global Research Institute. These search biases have major consequences for exposure to and the spread of mis- and disinformation online.
All search engines share a major flaw: biased searches lead users to biased sources. Generally, search engines like Google or Bing autocomplete a user’s question in the search bar, offering predictions before the query is fully typed. If these predictions feature stereotypes or falsehoods, users are more likely to view sources that contain biased, misleading, or flat-out false information. These biased searches are a key and often overlooked accelerator for the spread of mis- and disinformation online. Skewed search results reinforce racial stereotypes, gender roles, and other forms of discrimination in two main ways.
First, search engines are often perceived as neutral calculators whose predictions are only as biased as the user. They’re not, according to data ethics researchers. These experts note that the manner in which search engines suggest sources to users is subjective. Often, this bias stems from the choices of a handful of programmers in Silicon Valley, whose lack of diversity has historically allowed certain biases in search results to go unnoticed.
Second, once a user selects one biased prediction, more are likely to follow. Past searches are a key factor in how search engines rank future autocomplete predictions. Initial biased searches crowd out alternative information, so users lack easy access to sources that offer more nuanced and accurate representations. As a result, stereotypes are reinforced not by evidence but by the repetition of false claims without a counterbalance.
Biased searches may spread misinformation unintentionally, but search engines are also used to deliberately spread disinformation. Bad actors spread messages that promote false stereotypes in order to sow chaos and hatred or to achieve specific political goals. Producers of these narratives, both foreign and domestic, attempt to exploit search mechanisms so that their messages reach as many eyes as possible.
Understanding how search engines complete questions about social groups is crucial to minimizing widespread bias and, in turn, the proliferation of mis- and disinformation. To this end, DisinfoLab conducted a parallel study of GPT-3 and Google Search, comparing 3,290 combined text predictions across the two programs’ autocomplete outputs. This format offers insight into the extent of GPT-3’s bias relative to a baseline, and it simultaneously allows an interrogation of the biases present in the most widely used search engine, Google.
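To make the shape of such a parallel collection concrete, the sketch below queries GPT-3’s completions interface and Google’s unofficial autocomplete suggestion endpoint with the same query stems. It is a minimal, hypothetical sketch: the prompt stems, model settings, and sample sizes are illustrative and are not DisinfoLab’s actual instruments.

```python
# Hypothetical sketch of a parallel autocomplete-collection setup.
# Prompt stems and settings are illustrative, not DisinfoLab's instruments;
# the Google endpoint is unofficial and may change without notice.
import requests
import openai  # GPT-3-era Completions API (pip install openai)

openai.api_key = "YOUR_OPENAI_KEY"  # assumed to be supplied by the user

PREFIXES = ["why are women so", "why are immigrants so"]  # illustrative stems

def gpt3_predictions(prefix, n=5):
    """Ask GPT-3 to complete a search-style query prefix n times."""
    resp = openai.Completion.create(
        engine="davinci",
        prompt=prefix,
        max_tokens=10,
        n=n,
        temperature=0.7,
    )
    return [prefix + " " + c["text"].strip() for c in resp["choices"]]

def google_predictions(prefix):
    """Fetch Google's autocomplete suggestions for the same prefix."""
    r = requests.get(
        "https://suggestqueries.google.com/complete/search",
        params={"client": "firefox", "q": prefix},
        timeout=10,
    )
    return r.json()[1]  # the second element is the list of suggestions

for prefix in PREFIXES:
    print(prefix, gpt3_predictions(prefix), google_predictions(prefix))
```

Predictions collected this way from both systems could then be coded by human raters for negativity toward the subject group, along the lines of the categories described next.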
The bias was separated into four categories: gender; sexual orientation and sexuality; race, ethnicity, and nationality; and religion. From this analysis of identity bias in GPT-3 and Google Search, DisinfoLab arrived at three key takeaways.
Across 1,645 text predictions, GPT-3 generated queries that were negative with respect to the subject group 43.83% of the time. It produced the most bias in phrases about sexuality and the least in phrases about religion.
Additionally, GPT-3, despite being more technologically advanced than Google Search, generated more bias, producing negative predictions at a rate 13.68 percentage points higher than Google’s. Technical sophistication, then, is no safeguard against bias; concrete moderation efforts are needed to limit the spread of false or biased information.
Finally, given Google’s legacy as the most popular search engine, DisinfoLab expected it to have low rates of bias across the board. It doesn’t. Google maintained a high base level of bias, generating negative phrases in 30.15% of its predictions. As with GPT-3, the most bias occurred in phrases about sexuality and the least in phrases about religion.
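These headline figures reduce to simple rates over coded predictions. As a minimal sketch, assuming hypothetical human-coded labels, the following shows how per-system and per-category negativity rates, and the percentage-point gap between GPT-3 and Google, could be tallied:

```python
# Minimal sketch with made-up labels; the actual study coded 1,645
# predictions per system across four identity categories.
from collections import defaultdict

# Each record: (system, category, is_negative) from human coding.
records = [
    ("gpt3", "sexuality", True),
    ("gpt3", "religion", False),
    ("google", "sexuality", True),
    ("google", "religion", False),
]

totals = defaultdict(int)
negatives = defaultdict(int)
for system, category, is_negative in records:
    totals[(system, category)] += 1
    negatives[(system, category)] += is_negative

for key in sorted(totals):
    rate = 100 * negatives[key] / totals[key]
    print(key, f"{rate:.2f}% negative")

# Overall rates compare as a percentage-point difference:
# 43.83% (GPT-3) - 30.15% (Google) = 13.68 percentage points.
```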
These results give rise to a series of recommendations for GPT-3 and Google Search. As Microsoft and OpenAI team up to offer GPT-3 to private businesses through the new Azure service, these companies must incorporate safeguards against bias. In the realm of search engines, Azure is positioned to offer GPT-3 to companies pursuing niche search applications or a new type of search engine built on the synthesis of information rather than a list of sources.
According to Azure’s website, “Azure OpenAI Service offers tools to empower customers with the ability to moderate generated content and guidance and implementation best practices to help customers design their applications, while keeping safety front and center.” DisinfoLab recommends that these tools be made compulsory for companies and that they fully protect against a range of harmful biases.
Additionally, in light of Google’s high rates of bias, DisinfoLab suggests that Google develop a more robust moderation algorithm that protects against bias. In this process, Google should consult individuals from various identity groups and academics who specialize in gender studies, race and ethnic studies, and other relevant fields.
Finally, DisinfoLab urges transparency. Disclosing how search engines generate autocomplete predictions would allow researchers and policymakers to monitor these algorithms and guard against the downstream problems of search bias, namely the spread of mis- and disinformation online.
Editor’s Note: DisinfoLab thanks Alyssa Nekritz, Conrad Ning, Pritika Ravi, Chas Rinne, Madeline Smith, Samantha Strauss, and Selene Swanson for their tremendous support in researching and designing this project.