Discover Computing

1 June 2010 | Learning to rank for information retrieval

Adapting boosting for information retrieval measures

Authors: Qiang Wu, Christopher J. C. Burges, Krysta M. Svore, Jianfeng Gao

Published in: Discover Computing | Issue 3/2010

Abstract

We present a new ranking algorithm that combines the strengths of two previous methods: boosted tree classification, and LambdaRank, which has been shown to be empirically optimal for a widely used information retrieval measure. Our algorithm is based on boosted regression trees, although the ideas apply to any weak learners, and it is significantly faster in both train and test phases than the state of the art, for comparable accuracy. We also show how to find the optimal linear combination for any two rankers, and we use this method to solve the line search problem exactly during boosting. In addition, we show that starting with a previously trained model, and boosting using its residuals, furnishes an effective technique for model adaptation, and we give significantly improved results for a particularly pressing problem in web search—training rankers for markets for which only small amounts of labeled data are available, given a ranker trained on much more data from a larger market.
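The adaptation idea in the abstract, starting from a previously trained model and boosting on its residuals, can be illustrated with a minimal sketch. It is not the authors' implementation: it swaps in plain squared-error residuals and depth-1 regression stumps where the paper uses LambdaRank gradients and boosted regression trees. The names `fit_stump`, `stump_predict`, `adapt_by_boosting`, `rounds` and `shrinkage` are hypothetical and chosen only for this example.

```python
def fit_stump(xs, residuals):
    """Fit a depth-1 regression tree (one feature, one threshold) by squared error."""
    best = None
    for f in range(len(xs[0])):
        values = sorted(set(x[f] for x in xs))
        # Candidate thresholds lie midway between consecutive distinct feature values.
        for lo, hi in zip(values, values[1:]):
            t = (lo + hi) / 2
            left = [r for x, r in zip(xs, residuals) if x[f] <= t]
            right = [r for x, r in zip(xs, residuals) if x[f] > t]
            lm, rm = sum(left) / len(left), sum(right) / len(right)
            err = sum((r - lm) ** 2 for r in left) + sum((r - rm) ** 2 for r in right)
            if best is None or err < best[0]:
                best = (err, f, t, lm, rm)
    if best is None:
        # No feature has two distinct values: fall back to a constant prediction.
        mean = sum(residuals) / len(residuals)
        return (0, float("inf"), mean, mean)
    return best[1:]


def stump_predict(stump, x):
    f, t, left_value, right_value = stump
    return left_value if x[f] <= t else right_value


def adapt_by_boosting(base_model, xs, ys, rounds=20, shrinkage=0.5):
    """Keep base_model fixed and add stumps fitted to its residuals on the new data."""
    scores = [base_model(x) for x in xs]  # start from the background model's scores
    stumps = []
    for _ in range(rounds):
        residuals = [y - s for y, s in zip(ys, scores)]
        stump = fit_stump(xs, residuals)
        stumps.append(stump)
        scores = [s + shrinkage * stump_predict(stump, x) for s, x in zip(scores, xs)]

    def adapted(x):
        return base_model(x) + shrinkage * sum(stump_predict(s, x) for s in stumps)

    return adapted


# Toy data (an assumption for illustration): the background model uses only
# feature 0, while the adaptation data also depends on feature 1.
background = lambda x: x[0]
xs = [[0.0, 0.0], [1.0, 0.0], [2.0, 1.0], [3.0, 1.0]]
ys = [0.0, 1.0, 3.0, 4.0]
adapted = adapt_by_boosting(background, xs, ys)
print([round(adapted(x), 3) for x in xs])
```

Because the background model is left untouched and only the correction terms are learned, a small adaptation set has to explain just the difference between the two markets, which is the intuition behind using this scheme when labeled data is scarce.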

We use query length to mean the number of words in the query.

We could also merge the data sets and train a single model on the merged data. In our experiments, however, linearly interpolating models trained separately on the background and adaptation data sets achieves better results than training on the merged data.
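Linear interpolation of two rankers, together with the abstract's claim that the best linear combination can be found exactly, can be sketched as follows. The ranking produced by (1 - α)·s1 + α·s2 changes only at values of α where two documents' combined scores cross, so it is enough to evaluate the measure once inside each interval between crossings. This is a sketch under assumptions, not the paper's procedure: it restricts α to [0, 1], uses NDCG with gain 2^label - 1, and the function names (`ndcg`, `mean_ndcg`, `best_interpolation`) are hypothetical.

```python
import math
from itertools import combinations


def dcg(labels_in_rank_order):
    return sum((2 ** l - 1) / math.log2(i + 2) for i, l in enumerate(labels_in_rank_order))


def ndcg(scores, labels):
    order = sorted(range(len(scores)), key=lambda i: -scores[i])
    ideal = dcg(sorted(labels, reverse=True))
    return dcg([labels[i] for i in order]) / ideal if ideal > 0 else 0.0


def mean_ndcg(alpha, queries):
    total = 0.0
    for s1, s2, labels in queries:
        mixed = [(1 - alpha) * a + alpha * b for a, b in zip(s1, s2)]
        total += ndcg(mixed, labels)
    return total / len(queries)


def best_interpolation(queries):
    """queries: list of (s1, s2, labels), one tuple per query.

    Returns (alpha, mean NDCG) for the best mix (1 - alpha) * s1 + alpha * s2
    with alpha restricted to [0, 1].
    """
    crossings = set()
    for s1, s2, labels in queries:
        for i, j in combinations(range(len(labels)), 2):
            d1, d2 = s1[i] - s1[j], s2[i] - s2[j]
            if d1 != d2:
                # Documents i and j tie where (1 - a) * d1 + a * d2 = 0.
                a = d1 / (d1 - d2)
                if 0.0 < a < 1.0:
                    crossings.add(a)
    points = [0.0] + sorted(crossings) + [1.0]
    # The ranking is constant between consecutive crossings, so one midpoint
    # per interval (plus the two endpoints) covers every distinct ranking.
    candidates = [0.0, 1.0] + [(lo + hi) / 2 for lo, hi in zip(points, points[1:])]
    best = max(candidates, key=lambda a: mean_ndcg(a, queries))
    return best, mean_ndcg(best, queries)


# Toy query (an assumption for illustration): each ranker makes a different mistake.
queries = [([1.0, 3.0, 0.0], [3.0, 0.0, 1.0], [2, 1, 0])]
alpha, score = best_interpolation(queries)
print(alpha, score)
```

In this toy query neither ranker alone orders the three documents correctly, but any α strictly between 0.4 and 0.75 does, which a coarse grid over α could miss if the winning interval were narrow.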

Title: Adapting boosting for information retrieval measures
Authors: Qiang Wu, Christopher J. C. Burges, Krysta M. Svore, Jianfeng Gao
Publication date: 1 June 2010
Publisher: Springer Netherlands
Published in: Discover Computing | Issue 3/2010
Print ISSN: 2948-2984
Electronic ISSN: 2948-2992
DOI: https://doi.org/10.1007/s10791-009-9112-1
