CARMI: A Cache-Aware Learned Index with a Cost-based Construction Algorithm

Zhang, Jiaoyi; Gao, Yihan

arXiv:2103.00858v1 (cs)

[Submitted on 1 Mar 2021 (this version), latest version 19 May 2022 (v4)]

Abstract: Learned indexes, which use machine learning models to replace traditional index structures, have shown promising results in recent studies. However, our understanding of this new type of index structure is still at an early stage, with many details that need to be carefully examined and improved. In this paper, we propose a cache-aware learned index (CARMI) design to improve the efficiency of the Recursive Model Index (RMI) framework proposed by Kraska et al., and a cost-based construction algorithm to construct the optimal indexes in a wide variety of application scenarios. We formulate the problem of finding the optimal design of a learned index as an optimization problem and propose a dynamic programming algorithm for solving it, along with a partial greedy step to speed it up. Experiments show that our index construction strategy can construct indexes with significantly better performance than the baselines under various data distributions and workload requirements. In these experiments, CARMI achieves an average speedup of 2.52X over B-tree while using only about 0.56X of the memory space of B-tree on average.

Comments:	16 pages, 15 figures
Subjects:	Databases (cs.DB); Machine Learning (cs.LG)
Cite as:	arXiv:2103.00858 [cs.DB]
	(or arXiv:2103.00858v1 [cs.DB] for this version)
	https://doi.org/10.48550/arXiv.2103.00858

Submission history

From: Jiaoyi Zhang
[v1] Mon, 1 Mar 2021 09:20:53 UTC (844 KB)
[v2] Thu, 11 Mar 2021 13:08:52 UTC (1,030 KB)
[v3] Fri, 21 Jan 2022 13:05:37 UTC (2,339 KB)
[v4] Thu, 19 May 2022 09:51:10 UTC (2,204 KB)
