Abstract
We introduce a classical-quantum hybrid approach to computation, allowing for a quadratic performance improvement in the decision process of a learning agent. Using the paradigm of quantum accelerators, we introduce a routine that runs on a quantum computer and allows for the encoding of probability distributions. This quantum routine is then employed, in a reinforcement learning set-up, to encode the distributions that drive action choices. Our routine is well suited to the case of a large, although finite, number of actions and can be employed in any scenario where a probability distribution with a large support is needed. We describe the routine and assess its performance in terms of computational complexity, required quantum resources, and accuracy. Finally, we design an algorithm showing how to exploit it in the context of Q-learning.
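To make the idea of "encoding the distributions that drive action choices" concrete, the following is a minimal classical sketch, not the routine described in the paper. It assumes a Boltzmann (softmax) policy over the Q-values of a single state and simulates the standard amplitude encoding, in which a state with amplitudes sqrt(p_i) returns outcome i with probability p_i when measured. The function names, the temperature parameter, and the example Q-values are all hypothetical, and a classical simulation like this one cannot exhibit the quadratic improvement claimed in the abstract.

```python
import math
import random

def boltzmann_distribution(q_values, temperature=1.0):
    """Turn the Q-values of one state into action probabilities (softmax).

    Assumption: a Boltzmann policy; the abstract does not name the policy.
    """
    m = max(q_values)  # subtract the maximum for numerical stability
    weights = [math.exp((q - m) / temperature) for q in q_values]
    total = sum(weights)
    return [w / total for w in weights]

def amplitude_encode(probs):
    """Return amplitudes sqrt(p_i), so that |amplitude_i|^2 equals p_i."""
    return [math.sqrt(p) for p in probs]

def measure(amplitudes, rng):
    """Classically simulate a computational-basis measurement (Born rule)."""
    born = [a * a for a in amplitudes]
    return rng.choices(range(len(born)), weights=born, k=1)[0]

rng = random.Random(0)            # seeded generator for reproducibility
q_row = [0.1, 0.5, 0.2, 0.9]      # hypothetical Q-values for four actions
probs = boltzmann_distribution(q_row, temperature=0.5)
amps = amplitude_encode(probs)
action = measure(amps, rng)       # index of the sampled action
```

In a Q-learning loop, `action` would be executed in the environment and the corresponding Q-value updated in the usual way; only the sampling step is what a quantum routine of this kind would replace.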
| Original language | English |
|---|---|
| Article number | 3913 |
| Journal | Scientific Reports |
| Volume | 13 |
| Issue number | 1 |
| DOIs | |
| Publication status | Published - Dec 2023 |
| MoE publication type | A1 Journal article-refereed |
Funding
This work was partially funded by the Italian MUR Ministry under the project PNRR National Centre on HPC, Big Data and Quantum Computing, PUN: B93C22000620006; by the Spanish State Research Agency, through the QUARESC project (PID2019-109094GB-C21/AEI/10.13039/501100011033) and the Severo Ochoa and María de Maeztu Program for Centers and Units of Excellence in R&D (MDM-2017-0711); and by CAIB through the QUAREC project (PRD2018/47).