2024 | Original Paper | Book Chapter

Measuring Bias in Search Results Through Retrieval List Comparison

Written by: Linda Ratz, Markus Schedl, Simone Kopeinik, Navid Rekabsaz

Published in: Advances in Information Retrieval

Publisher: Springer Nature Switzerland

Abstract

Many IR systems project harmful societal biases, including gender bias, in their retrieved contents. Uncovering and addressing such biases requires grounded bias measurement principles. However, defining reliable bias metrics for search results is challenging, particularly due to the difficulties in capturing gender-related tendencies in the retrieved documents. In this work, we propose a new framework for search result bias measurement. Within this framework, we first revisit the current metrics for representative search result bias (RepSRB) that are based on the occurrence of gender-specific language in the search results. Addressing their limitations, we additionally propose a metric for comparative search result bias (ComSRB) measurement and integrate it into our framework. ComSRB defines bias as the skew in the set of retrieved documents in response to a non-gendered query toward those for male/female-specific variations of the same query. We evaluate ComSRB against RepSRB on a recent collection of bias-sensitive topics and documents from the MS MARCO collection, using pre-trained bi-encoder and cross-encoder IR models. Our analyses show that, while existing metrics are highly sensitive to the wordings and linguistic formulations, the proposed ComSRB metric mitigates this issue by focusing on the deviations of a retrieval list from its explicitly biased variants, avoiding the need for sub-optimal content analysis processes.
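
The comparative idea in the abstract can be illustrated with a minimal sketch. It is not the paper's ComSRB formula, which this page does not reproduce. It assumes, purely for illustration, that the skew is the difference between two top-k Jaccard overlaps: the overlap of the non-gendered list with the male-specific list, minus its overlap with the female-specific list. The function names, the cutoff `k`, and the document IDs are all hypothetical.

```python
def jaccard(a, b):
    """Set overlap between two retrieval lists (rank order is ignored)."""
    a, b = set(a), set(b)
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def comparative_skew(neutral, male, female, k=10):
    """Signed skew of the neutral list: positive means closer to the male list.

    Assumed formulation for illustration only: the difference of the
    top-k Jaccard overlaps with the two gender-specific lists.
    """
    n, m, f = neutral[:k], male[:k], female[:k]
    return jaccard(n, m) - jaccard(n, f)


# Hypothetical retrieval lists for a non-gendered query and its two
# gender-specific variations.
neutral_list = ["d1", "d2", "d3", "d4"]
male_list = ["d1", "d2", "d3", "d9"]
female_list = ["d1", "d7", "d8", "d9"]

skew = comparative_skew(neutral_list, male_list, female_list, k=4)
print(f"skew = {skew:+.3f}")  # skew = +0.457
```

A value near zero would indicate that the non-gendered list is about equally close to both variants. A rank-aware similarity could be substituted for the Jaccard overlap without changing the overall structure.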

Footnotes
1
The authors also suggest using term frequencies, but they find that both methods exhibit similar behaviour, and hence we here focus on using the Boolean approach.
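
The difference between the two variants mentioned in this footnote can be sketched as follows. This is an illustrative assumption about what "Boolean" and "term frequency" mean here, not the cited authors' implementation. The word lists are hypothetical and far smaller than any real gender lexicon.

```python
import re

# Hypothetical, deliberately tiny word lists for illustration only.
MALE_TERMS = {"he", "him", "his", "man", "men"}
FEMALE_TERMS = {"she", "her", "hers", "woman", "women"}


def tokenize(text):
    """Lowercase the text and split it into alphabetic tokens."""
    return re.findall(r"[a-z]+", text.lower())


def tf_counts(text):
    """Term-frequency variant: count occurrences of gendered terms."""
    tokens = tokenize(text)
    male = sum(t in MALE_TERMS for t in tokens)
    female = sum(t in FEMALE_TERMS for t in tokens)
    return male, female


def boolean_flags(text):
    """Boolean variant: record only whether any gendered term occurs."""
    male, female = tf_counts(text)
    return int(male > 0), int(female > 0)


doc = "She said her manager thanked him."
print(tf_counts(doc))      # (1, 2)
print(boolean_flags(doc))  # (1, 1)
```

The Boolean variant discards how often a term occurs, so a document with many gendered terms counts the same as one with a single mention.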
 
References
1. Bender, E.M.: On achieving and evaluating language-independence in NLP. Linguist. Issues Lang. Technol. 6 (2011)
2. Bigdeli, A., Arabzadeh, N., Seyedsalehi, S., Zihayat, M., Bagheri, E.: On the orthogonality of bias and utility in ad hoc retrieval. In: Proceedings of the 44th International ACM SIGIR Conference on Research and Development in Information Retrieval, pp. 1748–1752 (2021)
3. Bigdeli, A., Arabzadeh, N., Seyedsalehi, S., Zihayat, M., Bagheri, E.: A light-weight strategy for restraining gender biases in neural rankers. In: Advances in Information Retrieval, pp. 47–55 (2022)
4. Bigdeli, A., Arabzadeh, N., Zihayat, M., Bagheri, E.: Exploring gender biases in information retrieval relevance judgement datasets. In: Hiemstra, D., Moens, M.-F., Mothe, J., Perego, R., Potthast, M., Sebastiani, F. (eds.) ECIR 2021. LNCS, vol. 12657, pp. 216–224. Springer, Cham (2021). https://doi.org/10.1007/978-3-030-72240-1_18
5. Crawford, K.: The trouble with bias. In: Keynote at Annual Conference on Neural Information Processing Systems (NIPS) (2017)
6. Devinney, H., Björklund, J., Björklund, H.: Theories of “gender” in NLP bias research. In: 2022 ACM Conference on Fairness, Accountability, and Transparency, pp. 2083–2102 (2022)
7. Dixon, L., Li, J., Sorensen, J., Thain, N., Vasserman, L.: Measuring and mitigating unintended bias in text classification. In: Proceedings of the 2018 AAAI/ACM Conference on AI, Ethics, and Society, pp. 67–73 (2018)
8. Ekstrand, M.D., Das, A., Burke, R., Diaz, F.: Fairness in information access systems. Found. Trends® Inf. Retriev. 16, 1–177 (2022)
9. Fabris, A., Purpura, A., Silvello, G., Susto, G.A.: Gender stereotype reinforcement: measuring the gender bias conveyed by ranking algorithms. Inf. Process. Manag. 57(6), 102377 (2020)
10. Feng, Y., Shah, C.: Has CEO gender bias really been fixed? Adversarial attacking and improving gender fairness in image search. In: Proceedings of the AAAI Conference on Artificial Intelligence, vol. 36, no. 11, pp. 11882–11890 (2022)
11. Gezici, G., Lipani, A., Saygin, Y., Yilmaz, E.: Evaluation metrics for measuring bias in search engine results. Inf. Retrieval J. 24(2), 85–113 (2021)
12. Kay, M., Matuszek, C., Munson, S.A.: Unequal representation and gender stereotypes in image search results for occupations. In: Proceedings of the Annual ACM Conference on Human Factors in Computing Systems, pp. 3819–3828 (2015)
13. Kopeinik, S., Mara, M., Ratz, L., Krieg, K., Schedl, M., Rekabsaz, N.: Show me a “Male Nurse”! how gender bias is reflected in the query formulation of search engine users. In: Proceedings of the CHI Conference on Human Factors in Computing Systems (CHI 2023), pp. 1–15 (2023)
14. Krieg, K., Parada-Cabaleiro, E., Medicus, G., Lesota, O., Schedl, M., Rekabsaz, N.: Grep-BiasIR: a dataset for investigating gender representation bias in information retrieval results. In: Proceedings of the 2023 Conference on Human Information Interaction and Retrieval (CHIIR 2023), pp. 444–448 (2023)
15. Li, Y., et al.: Debiasing neural retrieval via in-batch balancing regularization. In: Proceedings of the 4th Workshop on Gender Bias in Natural Language Processing (GeBNLP), pp. 58–66 (2022)
16. Lin, J., Ma, X., Lin, S.C., Yang, J.H., Pradeep, R., Nogueira, R.: Pyserini: a Python toolkit for reproducible information retrieval research with sparse and dense representations. In: Proceedings of the 44th Annual International ACM SIGIR Conference on Research and Development in Information Retrieval (SIGIR 2021), pp. 2356–2362 (2021)
17. Maughan, K., Near, J.P.: Towards a measure of individual fairness for deep learning. arXiv e-prints pp. arXiv-2009 (2020)
18. Nguyen, T., et al.: MS MARCO: a human generated machine reading comprehension dataset. In: Proceedings of the Workshop on Cognitive Computation: Integrating Neural and Symbolic Approaches 2016 co-located with the 30th Annual Conference on Neural Information Processing Systems (NIPS 2016), Barcelona, Spain, 9 December 2016. CEUR Workshop Proceedings, vol. 1773 (2016)
19. Otterbacher, J., Bates, J., Clough, P.: Competent men and warm women: gender stereotypes and backlash in image search results. In: Proceedings of the 2017 CHI Conference on Human Factors in Computing Systems, pp. 6620–6631 (2017)
20. Reimers, N., Gurevych, I.: Sentence-BERT: sentence embeddings using Siamese BERT-networks. In: Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing, pp. 3982–3992 (2019)
21. Rekabsaz, N., Kopeinik, S., Schedl, M.: Societal biases in retrieved contents: measurement framework and adversarial mitigation of BERT rankers. In: Proceedings of the 44th International ACM SIGIR Conference on Research and Development in Information Retrieval, pp. 306–316 (2021)
22. Rekabsaz, N., Schedl, M.: Do neural ranking models intensify gender bias? In: Proceedings of the 43rd International ACM SIGIR Conference on Research and Development in Information Retrieval, pp. 2065–2068 (2020)
23. Rosch, E.H.: Natural categories. Cogn. Psychol. 4(3), 328–350 (1973)
24. Vlasceanu, M., Amodio, D.M.: Propagation of societal gender inequality by internet search algorithms. Proc. Natl. Acad. Sci. 119(29), e2204529119 (2022)
25. Wang, W., Wei, F., Dong, L., Bao, H., Yang, N., Zhou, M.: MiniLM: deep self-attention distillation for task-agnostic compression of pre-trained transformers. In: Advances in Neural Information Processing Systems, vol. 33, pp. 5776–5788 (2020)
27. Zerveas, G., Rekabsaz, N., Cohen, D., Eickhoff, C.: Mitigating bias in search results through contextual document reranking and neutrality regularization. In: Proceedings of the International ACM SIGIR Conference on Research and Development in Information Retrieval, pp. 2532–2538 (2022)
Metadata
Title
Measuring Bias in Search Results Through Retrieval List Comparison
Written by
Linda Ratz
Markus Schedl
Simone Kopeinik
Navid Rekabsaz
Copyright Year
2024
DOI
https://doi.org/10.1007/978-3-031-56069-9_2