The Problem of Invisibility of Underrepresented Groups
Invisibility and Discoverability of Underrepresented Changemakers is More Than an Algorithmic Problem
Discoverability on the internet determines whose voices are heard, whose experiences are shared, and whose perspectives are acknowledged. It also shapes which role models young people find, and which interlocutors are granted the authority to narrate the way we see the world.
But it is fair to question that authority, granted, apparently, by the Gods of Google and OpenAI and then, we assume, legitimized by the daily endorsement of users themselves.
In fact, however, as we know, not everyone or everything that should be found is indeed findable. More accurately, some of the world’s greatest changemakers, and some of the most innovative work being done around the world to confront significant problems such as poverty, climate change, and racism, are likely to appear far “below the fold” in internet searches. Too many inspiring and instructive stories are, in this sense, “invisible”.
Often referred to as what Safiya Umoja Noble calls “technological redlining” (Algorithms of Oppression: How Search Engines Reinforce Racism, NYU Press, 2018), the gaps in findability of content on the internet are hardly by chance or, even less so, purely the result of an objective assessment of “notability”. See, for example, “Why Do Many Nigerians Fail the Wikipedia Notability Criteria”. These empty spaces reflect longstanding patterns of marginalization based on race, gender, and geographic location, among other factors, and that creates a significant problem: the story of humanity is skewed. When journalists, donors, researchers, tourists, and students search for any given theme, what they find is a misrepresentation of human endeavors.

For example: “Africa is a country riddled with terrorism, corruption, poverty and stolen elections - at least according to how the media and entertainment shows depict Africa and Africans. These are some of the findings made by the ‘Africa in the U.S Media’ report published in 2019”, according to Africa No Filter. Similar silences, invisibilities, misrepresentations, and lacunae can be found in every area of human endeavor, as is made abundantly clear in Catherine D’Ignazio and Lauren Klein's Data Feminism (MIT Press, 2020), as well as in Joy Buolamwini's remarkable autobiographical jaunt through the biases in AI, Unmasking AI: My Mission to Protect What is Human in a World of Machines (Random House, 2023).
The work of Francesca Tripodi is also instructive; her article "Ms. Categorized: Gender, notability, and inequality on Wikipedia" reminds us that even biographies of women who meet Wikipedia’s criteria for inclusion "are more frequently considered non-notable and nominated for deletion compared to men’s biographies".
How Invisibility Is Perpetuated by Technology
Discoverability relies heavily on search engines like Google, which employ complex algorithms to index and rank web pages. A similar dynamic takes place with generative AI chatbots like ChatGPT, Claude, or Google Bard, which rely on Large Language Models (LLMs) and curated datasets to produce narrative results. When a user enters a query, algorithms scan vast indexes and return a list of results ordered, in the case of conventional search, by relevance. The notion of "relevance" is central to understanding why larger and more established sites often dominate these results: the algorithms prioritize websites that are considered authoritative and credible. Established sites often have a long history, a significant number of backlinks from other reputable sites, and a substantial volume of content. This creates a positive feedback loop: the more authoritative a site is perceived to be, the more links it attracts, and the higher it ranks in search results. Educational institutions, government websites, and well-known news outlets, for example, are typically treated as authoritative sources. Statistics reveal this phenomenon: a study conducted by Moz in 2021 found that the first page of Google search results received over 70% of all clicks, while the second page received just 5.6%. This stark drop in user engagement underscores the importance of appearing on the first page, where established sites predominantly reside.
User behavior further exacerbates the situation. People tend to click on the top search results, as they are perceived as the most relevant and trustworthy. This click-through behavior reinforces the prominence of large, well-established sites, making it even more challenging for smaller or lesser-known websites to gain visibility. A study by Advanced Web Ranking in 2021 showed that the first result on Google received a click-through rate of nearly 30%, while the tenth result received only about 2.5%.
Algorithms Contribute to Invisibility
The global majority -- including historically marginalized communities (or "minoritized" communities, as D’Ignazio and Klein put it in Data Feminism), racial and ethnic minorities, indigenous communities, LGBTQ+ individuals, people with disabilities, people from poor communities or from the Global South, and other underrepresented groups -- face significant hurdles in achieving discoverability on the internet. These groups are often "invisible" because they have limited access to resources, making it difficult for them to create and maintain authoritative websites. As a result, their online presence remains limited, perhaps many pages below the fold of a Google search, further marginalizing their experiences and perspectives.
Additionally, the bias in Google's algorithms can perpetuate existing inequalities. Algorithms are trained on data from the past, which reflects historical biases and discrimination. When marginalized voices are underrepresented in the data, the algorithms are less likely to prioritize their content in search results, deepening the digital divide.
Invisibility of Underrepresented Groups in History
These biases also have an impact on historical narratives. When we search for topics about the past, our searches seek and find sources from dominant narrative-producers. In the stories of victims of mass atrocity, for example -- whether authoritarian rule in Chile, the trans-Atlantic slave trade, or the Mau Mau massacres in Kenya -- the "memory" of victims is often lost, in spite of the best efforts of truth and reconciliation commissions or activities under the heading of transitional justice.
Finally, there are more subtle but equally important forms of marginalization that are reinforced by processes of discoverability. For example, D’Ignazio and Klein (in Data Feminism) point to “subjugated knowledge”, i.e. “forms of knowledge that have been pushed out of mainstream institutions and the conversations that they encourage”. These might include “‘music, literature, daily conversations, and everyday behavior’ as a result of being excluded from ‘white male controlled social institutions’”.
Addressing the Problem of Invisibility
Discoverability on the internet, in short, is a complex interplay of algorithms, user behavior, and historical biases. At its root, however, it is more accurate to say that discoverability replicates and reinforces systems of hegemonic and structural power, rendering invisible far too many inspiring and instructive stories of people, trends, and activities that should be found. Dominant search engines like Google favor established and authoritative sites, creating a cycle that reinforces the prominence of already influential voices. This has profound implications for historically marginalized individuals and communities, who find themselves at a disadvantage in the digital realm. To address this issue, we are part of a movement toward a more equitable online ecosystem that values and amplifies diverse voices, experiences, and perspectives.
For example, our partner Invisible Giants focuses on documenting the stories of amazing but sometimes under-recognized individuals, further emphasizing the importance of celebrating the contributions of new role models, heroes/Sheroes, and teachers for us all as we navigate life. Indeed, as we collectively try to address the enormous problems of our planet and humanity, we need these stories and voices to shine through.
The emerging movement to confront these interwoven problems includes, for example, the Overlooked initiative by The New York Times, which acknowledges the lives of remarkable individuals who were left out of the newspaper's obituary section, primarily because of their gender, race, or social status, and Whose Knowledge, a “global campaign to center the knowledge of marginalized communities (the majority of the world) on the internet”. Similarly, the Wikimedia Foundation, which oversees Wikipedia, has provided grants and formed partnerships with organizations to encourage diversity among editors and to improve content on underrepresented topics, such as Women in Red, a WikiProject aimed at increasing the number of biographies about women on Wikipedia, and Art+Feminism, which is building a "community of activists that is committed to closing information gaps related to gender, feminism, and the arts, beginning with Wikipedia". From a different angle, we join entities like The Female Quotient’s Algorithm for Equality and Latimer.ai, a large language model trained with diverse histories and inclusive voices, in developing technological and content-based strategies to confront the problem of underrepresentation of unsung heroes and Sheroes.
Invisibility is, without hyperbole, one of the greatest problems of our era. In our quotidian searches and casual acceptance of generative AI, we are constructing narratives of "who we are" and who we should be. In order to do that well, we need a fuller and more inclusive picture of the world than is currently possible with the limitations imposed by current processes of discoverability.