Pointer Networks with Q-Learning for Combinatorial Optimization

Barro, Alessandro

arXiv:2311.02629 (cs)

[Submitted on 5 Nov 2023 (v1), last revised 24 Oct 2024 (this version, v4)]

Abstract: We introduce the Pointer Q-Network (PQN), a hybrid neural architecture that integrates model-free Q-value policy approximation with Pointer Networks (Ptr-Nets) to improve the optimality of attention-based sequence generation with respect to long-term outcomes. This integration proves particularly effective for combinatorial optimization (CO) tasks, especially the Travelling Salesman Problem (TSP), which is the focus of our study. We address this problem by defining a Markov Decision Process (MDP) compatible with PQN, which involves iterative graph embedding, followed by encoding and decoding with an LSTM-based recurrent neural network. This process generates a context vector and computes raw attention scores, which are dynamically adjusted by the Q-values calculated for all available state-action pairs before softmax is applied. The resulting attention vector is used as an action distribution, with actions selected according to PQN's dynamically adapted exploration-exploitation balance. Our empirical results demonstrate the efficacy of this approach, including tests of the model in unstable environments.
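The decoding step described in the abstract can be illustrated with a minimal sketch. The abstract does not state how the Q-values adjust the raw attention scores or how exploration is scheduled, so the additive combination and the epsilon-greedy rule below are assumptions made for illustration only, and the function names `q_adjusted_distribution` and `select_action` are hypothetical rather than taken from the paper.

```python
import math
import random


def q_adjusted_distribution(scores, q_values, visited):
    """Adjust raw attention scores with Q-values, mask visited nodes, apply softmax.

    ASSUMPTION: the adjustment is additive (score + Q). The abstract only says
    the scores are "adjusted by Q-values" before softmax; other combinations
    are equally consistent with that wording.
    """
    adjusted = [
        -math.inf if seen else s + q
        for s, q, seen in zip(scores, q_values, visited)
    ]
    peak = max(adjusted)
    if peak == -math.inf:
        raise ValueError("no available actions: every node is already visited")
    # Subtract the peak for numerical stability; masked entries become exp(-inf) = 0.
    exps = [math.exp(a - peak) for a in adjusted]
    total = sum(exps)
    return [e / total for e in exps]


def select_action(probs, visited, epsilon, rng=random):
    """Pick the next node from the action distribution.

    ASSUMPTION: epsilon-greedy selection stands in for the exploration-exploitation
    mechanism, which the abstract does not specify.
    """
    available = [i for i, seen in enumerate(visited) if not seen]
    if rng.random() < epsilon:
        return rng.choice(available)  # explore: uniform over unvisited nodes
    return max(available, key=lambda i: probs[i])  # exploit: most probable node


if __name__ == "__main__":
    scores = [1.0, 2.0, 3.0]          # raw attention scores from the decoder
    q_values = [0.5, 0.0, -1.0]       # Q-value estimates for each candidate node
    visited = [False, True, False]    # node 1 is already in the tour
    probs = q_adjusted_distribution(scores, q_values, visited)
    print(probs, select_action(probs, visited, epsilon=0.1))
```

In this toy call the visited node receives zero probability, and the Q-values narrow the gap between the remaining candidates without changing their order; in a TSP setting the mask plays the role of restricting the distribution to the available state-action pairs.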

Subjects:	Machine Learning (cs.LG); Optimization and Control (math.OC)
Cite as:	arXiv:2311.02629 [cs.LG]
	(or arXiv:2311.02629v4 [cs.LG] for this version)
	https://doi.org/10.48550/arXiv.2311.02629

Submission history

From: Alessandro Barro
[v1] Sun, 5 Nov 2023 12:03:58 UTC (242 KB)
[v2] Tue, 19 Mar 2024 09:15:36 UTC (199 KB)
[v3] Mon, 17 Jun 2024 10:27:54 UTC (1,702 KB)
[v4] Thu, 24 Oct 2024 17:25:19 UTC (1,703 KB)
