KKANs: Kurkova-Kolmogorov-Arnold Networks and Their Learning Dynamics

Toscano, Juan Diego; Wang, Li-Lian; Karniadakis, George Em

arXiv:2412.16738 (cs)

[Submitted on 21 Dec 2024]

Abstract: Inspired by the Kolmogorov-Arnold representation theorem and Kurkova's principle of using approximate representations, we propose the Kurkova-Kolmogorov-Arnold Network (KKAN), a new two-block architecture that combines robust multi-layer perceptron (MLP) based inner functions with flexible linear combinations of basis functions as outer functions. We first prove that KKAN is a universal approximator, and then we demonstrate its versatility across scientific machine-learning applications, including function regression, physics-informed machine learning (PIML), and operator-learning frameworks. The benchmark results show that KKANs outperform MLPs and the original Kolmogorov-Arnold Networks (KANs) in function approximation and operator learning tasks and achieve performance comparable to fully optimized MLPs for PIML. To better understand the behavior of the new representation models, we analyze their geometric complexity and learning dynamics using information bottleneck theory, identifying three universal learning stages (fitting, transition, and diffusion) across all types of architectures. We find a strong correlation between geometric complexity and signal-to-noise ratio (SNR), with optimal generalization achieved during the diffusion stage. Additionally, we propose self-scaled residual-based attention weights to maintain high SNR dynamically, ensuring uniform convergence and prolonged learning.
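
The two-block structure described in the abstract can be illustrated with a minimal NumPy sketch. This is not the authors' implementation; it only mirrors the general shape suggested by the Kolmogorov-Arnold representation, f(x) = sum_q Phi_q(sum_p psi_{q,p}(x_p)). The following are assumptions made for illustration: one small tanh MLP per input coordinate as the inner block, a Chebyshev basis for the outer functions, a tanh squashing step to keep the basis inputs in [-1, 1], the choice m = 2d + 1, and all function names (`init_mlp`, `mlp`, `chebyshev_basis`, `kkan_forward`).

```python
import numpy as np

rng = np.random.default_rng(0)

def init_mlp(sizes):
    """Random parameters for a small tanh MLP, e.g. sizes = [1, 16, m]."""
    return [(rng.normal(0.0, 1.0 / np.sqrt(a), (a, b)), np.zeros(b))
            for a, b in zip(sizes[:-1], sizes[1:])]

def mlp(params, x):
    """Apply the MLP: tanh on hidden layers, linear output layer."""
    for W, b in params[:-1]:
        x = np.tanh(x @ W + b)
    W, b = params[-1]
    return x @ W + b

def chebyshev_basis(z, degree):
    """Chebyshev polynomials T_0..T_degree at z; shape z.shape + (degree + 1,)."""
    T = [np.ones_like(z), z]
    for _ in range(2, degree + 1):
        T.append(2.0 * z * T[-1] - T[-2])  # T_k = 2 z T_{k-1} - T_{k-2}
    return np.stack(T[:degree + 1], axis=-1)

def kkan_forward(inner, coeffs, x):
    """Hypothetical KKAN-style forward pass for x of shape (batch, d)."""
    # Inner block: one MLP per coordinate (scalar -> m values), summed over coordinates.
    xi = sum(mlp(p, x[:, [i]]) for i, p in enumerate(inner))   # (batch, m)
    # Assumption: squash to [-1, 1] so the Chebyshev basis is well conditioned.
    z = np.tanh(xi)
    # Outer block: each outer function is a linear combination of basis functions.
    basis = chebyshev_basis(z, coeffs.shape[1] - 1)            # (batch, m, degree + 1)
    return np.einsum("bmk,mk->b", basis, coeffs)               # sum over outer functions

d, degree = 2, 5
m = 2 * d + 1                                   # number of outer functions (assumed)
inner = [init_mlp([1, 16, m]) for _ in range(d)]
coeffs = rng.normal(0.0, 0.1, (m, degree + 1))  # trainable outer coefficients
x = rng.uniform(-1.0, 1.0, (8, d))
y = kkan_forward(inner, coeffs, x)              # one scalar prediction per sample
```

In this sketch the MLP weights in `inner` and the coefficient matrix `coeffs` would be the trainable parameters; the training loop, loss, and the self-scaled residual-based attention weights mentioned in the abstract are omitted.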

Comments:	Kolmogorov-Arnold representation theorem; physics-informed neural networks; Kolmogorov-Arnold networks; optimization algorithms; self-adaptive weights; information bottleneck theory
Subjects:	Machine Learning (cs.LG); Numerical Analysis (math.NA); Machine Learning (stat.ML)
Cite as:	arXiv:2412.16738 [cs.LG]
	(or arXiv:2412.16738v1 [cs.LG] for this version)
	https://doi.org/10.48550/arXiv.2412.16738

Submission history

From: Juan Toscano
[v1] Sat, 21 Dec 2024 19:01:38 UTC (15,632 KB)
