nach oben

Erschienen in:

2024 | OriginalPaper | Buchkapitel

lil’HDoC: An Algorithm for Good Arm Identification Under Small Threshold Gap

verfasst von : Tzu-Hsien Tsai, Yun-Da Tsai, Shou-De Lin

Erschienen in: Advances in Knowledge Discovery and Data Mining

Verlag: Springer Nature Singapore

Einloggen

Aktivieren Sie unsere intelligente Suche, um passende Fachinhalte oder Patente zu finden.

search-config

KI-gestützte Suche

Aus

Abstract

Good arm identification (GAI) is a pure-exploration bandit problem in which a single learner outputs an arm as soon as it is identified as a good arm. A good arm is defined as an arm with an expected reward greater than or equal to a given threshold. This paper focuses on the GAI problem under a small threshold gap, which refers to the distance between the expected rewards of arms and the given threshold. We propose a new algorithm called lil’HDoC to significantly improve the total sample complexity of the HDoC algorithm. We demonstrate that the sample complexity of the first \(\lambda \) output arm in lil’HDoC is bounded by the original HDoC algorithm, except for one negligible term, when the distance between the expected reward and threshold is small. Extensive experiments confirm that our algorithm outperforms the state-of-the-art algorithms in both synthetic and real-world datasets.

Sie haben noch keine Lizenz? Dann Informieren Sie sich jetzt über unsere Produkte:

Springer Professional "Wirtschaft+Technik"

Online-Abonnement

Mit Springer Professional "Wirtschaft+Technik" erhalten Sie Zugriff auf:

über 102.000 Bücher
über 537 Zeitschriften

aus folgenden Fachgebieten:

Automobil + Motoren
Bauwesen + Immobilien
Business IT + Informatik
Elektrotechnik + Elektronik
Energie + Nachhaltigkeit
Finance + Banking
Management + Führung
Marketing + Vertrieb
Maschinenbau + Werkstoffe
Versicherung + Risiko

Jetzt Wissensvorsprung sichern!

Jetzt informieren

Springer Professional "Technik"

Online-Abonnement

Mit Springer Professional "Technik" erhalten Sie Zugriff auf:

über 67.000 Bücher
über 390 Zeitschriften

aus folgenden Fachgebieten:

Automobil + Motoren
Bauwesen + Immobilien
Business IT + Informatik
Elektrotechnik + Elektronik
Energie + Nachhaltigkeit
Maschinenbau + Werkstoffe

Jetzt Wissensvorsprung sichern!

Jetzt informieren

Springer Professional "Wirtschaft"

Online-Abonnement

Mit Springer Professional "Wirtschaft" erhalten Sie Zugriff auf:

über 67.000 Bücher
über 340 Zeitschriften

aus folgenden Fachgebieten:

Bauwesen + Immobilien
Business IT + Informatik
Finance + Banking
Management + Führung
Marketing + Vertrieb
Versicherung + Risiko

Jetzt Wissensvorsprung sichern!

Jetzt informieren

Vorheriges Kapitel Adversarial Text Purification: A Large Language Model Approach for Defense

Nächstes Kapitel ScaleViz: Scaling Visualization Recommendation Models on Large Data

Nur mit Berechtigung zugänglich

Asuncion, A., Newman, D.: UCI machine learning repository (2007)

Auer, P., Cesa-Bianchi, N., Fischer, P.: Finite-time analysis of the multiarmed bandit problem. Mach. Learn. 47(2), 235–256 (2002)CrossRef

Chen, S., Lin, T., King, I., Lyu, M.R., Chen, W.: Combinatorial pure exploration of multi-armed bandits. Adv. Neural. Inf. Process. Syst. 27, 379–387 (2014)

Da Tsai, Y., De Lin, S.: Fast online inference for nonlinear contextual bandit based on generative adversarial network. arXiv preprint arXiv:2202.08867 (2022)

Dudík, M., Langford, J., Li, L.: Doubly robust policy evaluation and learning. arXiv preprint arXiv:1103.4601 (2011)

Goldberg, K., Roeder, T., Gupta, D., Perkins, C.: Eigentaste: a constant time collaborative filtering algorithm. Inf. Retrieval 4(2), 133–151 (2001)CrossRef

Harper, F.M., Konstan, J.A.: The movielens datasets: history and context. ACM Trans. Interact. Intell. Syst. (TIIS) 5(4), 1–19 (2015)

Jamieson, K., Malloy, M., Nowak, R., Bubeck, S.: lil’ucb: an optimal exploration algorithm for multi-armed bandits. In: Conference on Learning Theory, pp. 423–439. PMLR (2014)

Jiang, H., Li, J., Qiao, M.: Practical algorithms for best-k identification in multi-armed bandits (2017)

10.

Kalyanakrishnan, S., Tewari, A., Auer, P., Stone, P.: PAC subset selection in stochastic multi-armed bandits. In: ICML, vol. 12, pp. 655–662 (2012)

11.

Kano, H., Honda, J., Sakamaki, K., Matsuura, K., Nakamura, A., Sugiyama, M.: Good arm identification via bandit feedback. Mach. Learn. 108(5), 721–745 (2019)MathSciNetCrossRef

12.

Kaufmann, E., Cappé, O., Garivier, A.: On the complexity of best-arm identification in multi-armed bandit models. J. Mach. Learn. Res. 17(1), 1–42 (2016)MathSciNet

13.

Locatelli, A., Gutzeit, M., Carpentier, A.: An optimal algorithm for the thresholding bandit problem. In: International Conference on Machine Learning, pp. 1690–1698. PMLR (2016)

14.

Tsai, Y.D., Tsai, T.H., Lin, S.D.: Differential good arm identification. arXiv preprint arXiv:2303.07154 (2023)

Titel: lil’HDoC: An Algorithm for Good Arm Identification Under Small Threshold Gap
verfasst von: Tzu-Hsien Tsai
Yun-Da Tsai
Shou-De Lin
Verlag: Springer Nature Singapore
Buch: Advances in Knowledge Discovery and Data Mining
Print ISBN: 978-981-9722-64-8

Electronic ISBN: 978-981-9722-62-4

Copyright-Jahr: 2024
DOI: https://doi.org/10.1007/978-981-97-2262-4_7

Springer Professional

Abstract

Bitte loggen Sie sich ein, um Zugang zu Ihrer Lizenz zu erhalten.

Sie haben noch keine Lizenz? Dann Informieren Sie sich jetzt über unsere Produkte:

Springer Professional "Wirtschaft+Technik"

Springer Professional "Technik"

Springer Professional "Wirtschaft"

Premium Partner