Numerical Analysis


Total of 53 entries


[1] arXiv:2405.15799 [pdf, ps, html, other]: Title: An improved Halton sequence for implementation in quasi-Monte Carlo methods

Nathan Kirk, Christiane Lemieux

Comments: 12 pages, 4 figures

Subjects: Numerical Analysis (math.NA); Number Theory (math.NT)

Despite possessing the low-discrepancy property, the classical d-dimensional Halton sequence is known to exhibit poorly distributed projections when d becomes even moderately large. This, in turn, often implies bad performance when implemented in quasi-Monte Carlo (QMC) methods in comparison to, for example, the Sobol' sequence. As an attempt to eradicate this issue, we propose an adapted Halton sequence built from integer- and irrational-based van der Corput sequences and show empirically improved performance with respect to the accuracy of estimates in numerical integration and simulation. In addition, for the first time, a scrambling algorithm is proposed for irrational-based digital sequences.
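For orientation, the classical construction that this abstract starts from can be sketched in a few lines. The sketch below builds the standard Halton sequence from integer-base van der Corput (radical inverse) sequences only. It does not implement the irrational-based sequences or the scrambling proposed by the authors, and the function names are illustrative.

```python
def radical_inverse(n, base):
    """Van der Corput radical inverse of the integer n in an integer base."""
    result, scale = 0.0, 1.0 / base
    while n > 0:
        n, digit = divmod(n, base)   # peel off the least significant digit
        result += digit * scale      # mirror it about the radix point
        scale /= base
    return result


def halton(n_points, primes):
    """First n_points of the classical Halton sequence (indices start at 1)."""
    return [[radical_inverse(i, p) for p in primes]
            for i in range(1, n_points + 1)]


# Two-dimensional example with the usual prime bases 2 and 3.
points = halton(4, [2, 3])
```

Each coordinate uses its own prime base, which is why projections onto pairs of large-base coordinates can look poorly distributed for small sample sizes.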
[2] arXiv:2405.15993 [pdf, ps, other]: Title: Efficient Multifidelity Uncertainty Propagation in the Presence of Process Noise

Alberto Fossà, Roberto Armellin, Emmanuel Delande, Francesco Sanfedino

Comments: Submitted to the Journal of Guidance, Control, and Dynamics

Subjects: Numerical Analysis (math.NA)

A multifidelity method for the nonlinear propagation of uncertainties in the presence of stochastic accelerations is presented. The proposed algorithm treats the uncertainty propagation (UP) problem by separating the propagation of the initial uncertainty from that of the process noise. The initial uncertainty is propagated using an adaptive Gaussian mixture model (GMM) method which exploits a low-fidelity dynamical model to minimize the computational costs. The effects of process noise are instead computed using the PoLynomial Algebra Stochastic Moments Analysis (PLASMA) technique, which considers a high-fidelity model of the stochastic dynamics. The main focus of the paper is on the latter and on the key idea to approximate the probability density function (pdf) of the solution by a polynomial representation of its moments, which are efficiently computed using differential algebra (DA) techniques. The two estimates are finally combined to restore the accuracy of the low-fidelity surrogate and account for both sources of uncertainty. The proposed approach is applied to the problem of nonlinear orbit UP and its performance compared to that of Monte Carlo (MC) simulations.
[3] arXiv:2405.16040 [pdf, ps, html, other]: Title: Iterative Thresholding Methods for Longest Minimal Length Partitions

Shilong Hu, Hao Liu, Dong Wang

Subjects: Numerical Analysis (math.NA)

In this paper, we introduce two iterative methods for the longest minimal length partition problem, which asks whether the disc (ball) is the set maximizing the total perimeter of the shortest partition that divides the total region into sub-regions with given volume proportions, under a volume constraint. The objective functional is approximated by a short-time heat flow using indicator functions of regions and Gaussian convolution. The problem is then represented as a constrained max-min optimization problem. Auction dynamics is used to find the shortest partition in a fixed region, and threshold dynamics is used to update the region. Numerical experiments in two-dimensional and three-dimensional cases are shown with different numbers of partitions, unequal volume proportions, and different initial shapes. The results of both methods are consistent with the conjecture that the disc in two dimensions and the ball in three dimensions are the solution of the longest minimal length partition problem.
[4] arXiv:2405.16076 [pdf, ps, html, other]: Title: Online randomized interpolative decomposition with a posteriori error estimator for temporal PDE data reduction

Angran Li, Stephen Becker, Alireza Doostan

Subjects: Numerical Analysis (math.NA)

Traditional low-rank approximation is a powerful tool to compress the huge data matrices that arise in simulations of partial differential equations (PDE), but suffers from high computational cost and requires several passes over the PDE data. The compressed data may also lack interpretability thus making it difficult to identify feature patterns from the original data. To address this issue, we present an online randomized algorithm to compute the interpolative decomposition (ID) of large-scale data matrices in situ. Compared to previous randomized IDs that used the QR decomposition to determine the column basis, we adopt a streaming ridge leverage score-based column subset selection algorithm that dynamically selects proper basis columns from the data and thus avoids an extra pass over the data to compute the coefficient matrix of the ID. In particular, we adopt a single-pass error estimator based on the non-adaptive Hutch++ algorithm to provide real-time error approximation for determining the best coefficients. As a result, our approach only needs a single pass over the original data and thus is suitable for large and high-dimensional matrices stored outside of core memory or generated in PDE simulations. We also provide numerical experiments on turbulent channel flow and ignition simulations, and on the NSTX Gas Puff Image dataset, comparing our algorithm with the offline ID algorithm to demonstrate its utility in real-world applications.
[5] arXiv:2405.16111 [pdf, ps, html, other]: Title: Computation of tensors generalized inverses under $M$-product and applications

Jajati Keshari Sahoo, Saroja Kumar Panda, Ratikanta Behera, Predrag S. Stanimirović

Comments: 27

Subjects: Numerical Analysis (math.NA)

This paper introduces notions of the Drazin and the core-EP inverses on tensors via the M-product. We propose a few properties of the Drazin and the core-EP inverse of tensors, as well as effective tensor-based algorithms for calculating these inverses. In addition, definitions of composite generalized inverses are presented in the framework of the M-product, including CMP, DMP, and MPD inverse of tensors. Tensor-based higher-order Gauss-Seidel and Gauss-Jacobi iterative methods are designed. Algorithms for these two iterative methods to solve multilinear equations are developed. Certain multilinear systems are solved using the Drazin inverse, core-EP inverses, and composite generalized inverses such as CMP, DMP, and MPD inverse. A tensor M-product-based regularization technique is applied to solve the color image deblurring problem.
[6] arXiv:2405.16117 [pdf, ps, html, other]: Title: Positivity and Maximum Principle Preserving Discontinuous Galerkin Finite Element Schemes for a Coupled Flow and Transport

Shihua Gong, Young-Ju Lee, Yukun Li, Yue Yu

Subjects: Numerical Analysis (math.NA)

We introduce a new concept of the locally conservative flux and investigate its relationship with the compatible discretization pioneered by Dawson, Sun and Wheeler [11]. We then demonstrate how the new concept of the locally conservative flux can play a crucial role in obtaining the L2 norm stability of the discontinuous Galerkin finite element scheme for the transport in the coupled system with flow. In particular, the lowest order discontinuous Galerkin finite element for the transport is shown to inherit the positivity and maximum principle when the locally conservative flux is used, which has been elusive for many years in the literature. The theoretical results established in this paper are based on the equivalence between the Lesaint-Raviart discontinuous Galerkin scheme and the Brezzi-Marini-Suli discontinuous Galerkin scheme for the linear hyperbolic system as well as the relationship between the Lesaint-Raviart discontinuous Galerkin scheme and the characteristic method along the streamline. Sample numerical experiments have also been performed to justify our theoretical findings.
[7] arXiv:2405.16172 [pdf, ps, html, other]: Title: Existence and nonexistence of solutions for underdetermined generalized absolute value equations

Cairong Chen, Xuehua Li, Ren-Cang Li

Comments: 24 pages

Subjects: Numerical Analysis (math.NA); Optimization and Control (math.OC)

Underdetermined generalized absolute value equations (GAVE) have real applications. The underdetermined GAVE may have no solution, one solution, finitely many solutions or infinitely many solutions. This paper aims to give some sufficient conditions which guarantee the existence or nonexistence of solutions for the underdetermined GAVE. Particularly, sufficient conditions under which certain or each sign pattern possesses infinitely many solutions of the underdetermined GAVE are given. In addition, iterative methods are developed to compute a solution of the underdetermined GAVE. Some existing results about the square GAVE are extended.
[8] arXiv:2405.16201 [pdf, ps, html, other]: Title: Further study on two fixed point iterative schemes for absolute value equations

Jiayu Liu, Tingting Luo, Cairong Chen

Comments: 13 pages

Subjects: Numerical Analysis (math.NA)

In this paper, we reconsider two new iterative methods for solving absolute value equations (AVE), which were proposed by Ali and Pan (Jpn. J. Ind. Appl. Math. 40: 303-314, 2023). Convergence results of the two iterative schemes and new sufficient conditions for the unique solvability of AVE are presented. In addition, for a special case, the optimal iteration parameters of the two algorithms are analyzed, respectively. Numerical results demonstrate our claims.
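To make the notion of a fixed point iteration for an AVE concrete, here is a minimal sketch of the textbook Picard iteration x_{k+1} = A^{-1}(|x_k| + b). It assumes the standard form A x - |x| = b and is not either of the two schemes of Ali and Pan studied in the paper. The 2x2 matrix, the right-hand side and the helper names are made up for illustration.

```python
def solve_2x2(a, rhs):
    """Solve the 2x2 linear system a x = rhs by Cramer's rule."""
    det = a[0][0] * a[1][1] - a[0][1] * a[1][0]
    x0 = (rhs[0] * a[1][1] - a[0][1] * rhs[1]) / det
    x1 = (a[0][0] * rhs[1] - a[1][0] * rhs[0]) / det
    return [x0, x1]


def picard_ave(a, b, iterations=60):
    """Iterate x_{k+1} = A^{-1}(|x_k| + b) for the AVE  A x - |x| = b."""
    x = [0.0, 0.0]
    for _ in range(iterations):
        rhs = [abs(x[0]) + b[0], abs(x[1]) + b[1]]
        x = solve_2x2(a, rhs)
    return x


A = [[4.0, 1.0], [1.0, 4.0]]   # singular values 3 and 5, both larger than 1
b = [1.0, -9.0]                # chosen so that x* = (1, -2) solves the AVE
x = picard_ave(A, b)
```

Because the smallest singular value of this A is 3, the map is a contraction with factor 1/3, so the iterates converge linearly to the unique solution.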
[9] arXiv:2405.16232 [pdf, ps, html, other]: Title: Numerical scheme for delay-type stochastic McKean-Vlasov equations driven by fractional Brownian motion

Shuaibin Gao, Qian Guo, Zhuoqi Liu, Chenggui Yuan

Subjects: Numerical Analysis (math.NA); Probability (math.PR)

This paper focuses on the numerical scheme for delay-type stochastic McKean-Vlasov equations (DSMVEs) driven by fractional Brownian motion with Hurst parameter $H\in (0,1/2)\cup (1/2,1)$. The existence and uniqueness of the solutions to such DSMVEs whose drift coefficients contain polynomial delay terms are proved by exploiting the Banach fixed point theorem. Then the propagation of chaos between the interacting particle system and the non-interacting system in the $\mathcal{L}^p$ sense is shown. We find that even if the delay term satisfies the polynomial growth condition, the unmodified classical Euler-Maruyama scheme still can approximate the corresponding interacting particle system without the particle corruption. The convergence rates are revealed for $H\in (0,1/2)\cup (1/2,1)$. Finally, as an example that closely fits the original equation, a stochastic opinion dynamics model with both extrinsic memory and intrinsic memory is simulated to illustrate the plausibility of the theoretical result.
[10] arXiv:2405.16291 [pdf, ps, html, other]: Title: Transparent boundary condition and its high frequency approximation for the Schrödinger equation on a rectangular computational domain

Samardhi Yadav, Vishal Vaibhav

Comments: 36 pages, 8 figures and 4 tables. arXiv admin note: text overlap with arXiv:2403.07787

Subjects: Numerical Analysis (math.NA); Computational Physics (physics.comp-ph)

This paper addresses the numerical implementation of the transparent boundary condition (TBC) and its various approximations for the free Schrödinger equation on a rectangular computational domain. In particular, we consider the exact TBC and its spatially local approximation under high frequency assumption along with an appropriate corner condition. For the spatial discretization, we use a Legendre-Galerkin spectral method where Lobatto polynomials serve as the basis. Within variational formalism, we first arrive at the time-continuous dynamical system using spatially discrete form of the initial boundary-value problem incorporating the boundary conditions. This dynamical system is then discretized using various time-stepping methods, namely, the backward-differentiation formula of order 1 and 2 (i.e., BDF1 and BDF2, respectively) and the trapezoidal rule (TR) to obtain a fully discrete system. Next, we extend this approach to the novel Padé based implementation of the TBC presented by Yadav and Vaibhav [arXiv:2403.07787(2024)]. Finally, several numerical tests are presented to demonstrate the effectiveness of the boundary maps (incorporating the corner conditions) where we study the stability and convergence behavior empirically.
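The three time-stepping methods named in this abstract can be compared on a scalar toy problem. The sketch below applies BDF1, BDF2 and the trapezoidal rule to y' = λy with a purely imaginary λ, which mimics a single spatial mode of the free Schrödinger equation. The scalar model, the single trapezoidal step used to start BDF2, and all names are assumptions for illustration and are not taken from the paper.

```python
def bdf1_step(y, lam, h):
    """Backward Euler (BDF1): (y_new - y) / h = lam * y_new."""
    return y / (1 - h * lam)


def tr_step(y, lam, h):
    """Trapezoidal rule: (y_new - y) / h = lam * (y_new + y) / 2."""
    return y * (1 + 0.5 * h * lam) / (1 - 0.5 * h * lam)


def bdf2_step(y, y_prev, lam, h):
    """BDF2: (3 y_new - 4 y + y_prev) / (2 h) = lam * y_new."""
    return (4 * y - y_prev) / (3 - 2 * h * lam)


def run(lam, h, steps):
    """Advance y(0) = 1 by `steps` steps with each of the three methods."""
    y_bdf1 = y_tr = 1.0 + 0.0j
    # BDF2 needs two starting values; one trapezoidal step supplies the second.
    y2_prev, y2 = 1.0 + 0.0j, tr_step(1.0 + 0.0j, lam, h)
    for n in range(steps):
        y_bdf1 = bdf1_step(y_bdf1, lam, h)
        y_tr = tr_step(y_tr, lam, h)
        if n >= 1:
            y2_prev, y2 = y2, bdf2_step(y2, y2_prev, lam, h)
    return y_bdf1, y2, y_tr


# The exact solution exp(-i t) has modulus 1 for all t.
y_bdf1, y_bdf2, y_tr = run(lam=-1j, h=0.1, steps=100)
moduli = (abs(y_bdf1), abs(y_bdf2), abs(y_tr))
```

On this toy problem the trapezoidal rule preserves the modulus up to rounding, BDF1 damps it strongly, and BDF2 damps it only slightly.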
[11] arXiv:2405.16400 [pdf, ps, html, other]: Title: Weighted sampling recovery of functions with mixed smoothness

Dinh Dũng

Comments: arXiv admin note: text overlap with arXiv:2309.04994

Subjects: Numerical Analysis (math.NA)

We study sparse-grid linear sampling algorithms and their optimality for approximate recovery of functions with mixed smoothness on $\mathbb{R}^d$ from a set of $n$ of their sampled values in two different settings: (i) functions to be recovered are in weighted Sobolev spaces $W^r_{p,w}(\mathbb{R}^d)$ of mixed smoothness and the approximation error is measured by the norm of the weighted Lebesgue space $L_{q,w}(\mathbb{R}^d)$, and (ii) functions to be recovered are in Sobolev spaces with measure $W^r_p(\mathbb{R}^d; \mu_w)$ of mixed smoothness and the approximation error is measured by the norm of the Lebesgue space with measure $L_q(\mathbb{R}^d; \mu_w)$. Here, the function $w$, a tensor-product Freud-type weight, is the weight in the setting (i), and the density function of the measure $\mu_w$ in the setting (ii). The optimality of linear sampling algorithms is investigated in terms of the relevant sampling $n$-widths. We construct sparse-grid linear sampling algorithms which are completely different for the settings (i) and (ii) and which give upper bounds of the corresponding sampling $n$-widths. We prove that in the one-dimensional case, these algorithms realize the right convergence rate of the sampling widths. In the setting (ii) for the high dimensional case ($d\ge 2$), we also achieve the right convergence rate of the sampling $n$-widths for $1\le q \le 2 \le p \le \infty$ through a non-constructive method.
[12] arXiv:2405.16582 [pdf, ps, html, other]: Title: A comparison of the Coco-Russo scheme and $\mathghost$-FEM for elliptic equations in arbitrary domains

Clarissa Astuto, Armando Coco, Umberto Zerbinati

Subjects: Numerical Analysis (math.NA)

In this paper, a comparative study between the Coco-Russo scheme (based on a finite-difference scheme) and the $\mathghost$-FEM (based on the finite-element method) is presented for solving the Poisson equation in arbitrary domains. The comparison between the two numerical methods is carried out by presenting analytical results from the literature \cite{cocoStissi,astuto2024nodal}, together with numerical tests in various geometries and boundary conditions.
[13] arXiv:2405.16768 [pdf, ps, html, other]: Title: Time-dependent complex variable solution on quasi three-dimensional shallow tunnelling in gravitational geomaterial with reasonable far-field displacement

Luobin Lin, Fuquan Chen, Changjie Zheng

Comments: 38 pages, 11 figures

Subjects: Numerical Analysis (math.NA)

The three-dimensional effect of the tunnel face and gravitational excavation generally occur in shallow tunnelling, but they are not adequately considered in present complex variable solutions. In this paper, a new time-dependent complex variable solution on quasi three-dimensional shallow tunnelling in gravitational geomaterial is derived, and the far-field displacement singularity is eliminated by fixing the far-field ground surface over the whole excavation time span. With an equivalent coefficient of three-dimensional effect, the quasi three-dimensional shallow tunnelling is transformed into a plane strain problem with time-dependent virtual traction along the tunnel periphery. The mixed boundaries of the fixed far-field ground surface and the nearby free segment form a homogeneous Riemann-Hilbert problem with extra constraints of the virtual traction along the tunnel periphery, which is simultaneously solved using an iterative linear system with good numerical stability. The mixed boundary conditions along the ground surface in the whole excavation time span are well satisfied in a numerical case, which is further examined by comparing with the corresponding finite element solution. The results are in good agreement, and the proposed solution illustrates high efficiency. More discussions are made on excavation rate, viscosity, and solution convergence. A latent paradox is disclosed for objectivity.
[14] arXiv:2405.16827 [pdf, ps, html, other]: Title: Structure-preserving finite element methods for computing dynamics of rotating Bose-Einstein condensate

Meng Li, Junjun Wang, Zhen Guan, Zhijie Du

Subjects: Numerical Analysis (math.NA)

This work is concerned with the construction and analysis of structure-preserving Galerkin methods for computing the dynamics of rotating Bose-Einstein condensate (BEC) based on the Gross-Pitaevskii equation with angular momentum rotation. Due to the presence of the rotation term, constructing finite element methods (FEMs) that preserve both mass and energy remains an unresolved issue, particularly in the context of nonconforming FEMs. Furthermore, in comparison to existing works, we provide a comprehensive convergence analysis, offering a thorough demonstration of the methods' optimal and high-order convergence properties. Finally, extensive numerical results are presented to check the theoretical analysis of the structure-preserving numerical method for rotating BEC, and the quantized vortex lattice's behavior is scrutinized through a series of numerical tests.
[15] arXiv:2405.16864 [pdf, ps, other]: Title: Sparsity comparison of polytopal finite element methods

Christoph Lehrenfeld, Paul Stocker, Maximilian Zienecker

Comments: 15 pages, 12 figures, 15 tables

Subjects: Numerical Analysis (math.NA); Computational Complexity (cs.CC); Computational Engineering, Finance, and Science (cs.CE)

In this work we compare crucial parameters for efficiency of different finite element methods for solving partial differential equations (PDEs) on polytopal meshes. We consider the Virtual Element Method (VEM) and different Discontinuous Galerkin (DG) methods, namely the Hybrid DG and Trefftz DG methods. The VEM is a conforming method that can be seen as a generalization of the classic finite element method to arbitrary polytopal meshes. DG methods are non-conforming methods that offer high flexibility, but also come with high computational costs. Hybridization reduces these costs by introducing additional facet variables, onto which the computational costs can be transferred. Trefftz DG methods achieve a similar reduction in complexity by selecting a special and smaller set of basis functions on each element. The association of computational costs to different geometrical entities (elements or facets) leads to differences in the performance of these methods on different grid types. This paper aims to compare the dependency of these approaches across different grid configurations.
[16] arXiv:2405.16894 [pdf, ps, other]: Title: An Unconstrained Formulation of Some Constrained Partial Differential Equations and its Application to Finite Neuron Methods

Jiwei Jia, Young Ju Lee, Ruitong Shan

Subjects: Numerical Analysis (math.NA)

In this paper, we present a new framework in which a PDE with constraints can be formulated as a sequence of PDEs with no constraints, whose solutions converge to the solution of the PDE with constraints. This framework is then used to build a novel finite neuron method to solve the 2nd order elliptic equations with the Dirichlet boundary condition. Our algorithm is the first proven to lead to shallow neural network solutions with an optimal H1 norm error. We show that a widely used penalized PDE, which imposes the Dirichlet boundary condition weakly, can be interpreted as the first element of the sequence of PDEs within our framework. Furthermore, numerically, we show that it may not lead to the solution with the optimal H1 norm error bound in general. On the other hand, we theoretically demonstrate that the second and later elements of a sequence of PDEs can lead to an adequate solution with the optimal H1 norm error bound. A number of sample tests are performed to confirm the effectiveness of the proposed algorithm and the relevant theory.
[17] arXiv:2405.16917 [pdf, ps, html, other]: Title: LRAMM -- Low precision approximates GEMM via RSVD

Hongyaoxing Gu

Subjects: Numerical Analysis (math.NA); Performance (cs.PF)

Matrix multiplication computation acceleration has been a research hotspot across various domains. Due to the characteristics of some applications, approximate matrix multiplication can achieve significant performance improvements without losing much precision.
In this paper, we propose LRAMM - a high-performance matrix multiplication approximation algorithm that combines mixed-precision quantized matrix multiplication with RSVD techniques, further enhancing efficiency within the error range of low-precision matrix multiplication by utilizing matrix low-rank decomposition technology.
[18] arXiv:2405.16985 [pdf, ps, other]: Title: Optimal error bounds for the two point flux approximation finite volume scheme

Robert Eymard (LAMA), Thierry Gallouët (I2M), Raphaele Herbin (I2M)

Subjects: Numerical Analysis (math.NA)

We consider a finite volume scheme with two-point flux approximation (TPFA) to approximate a Laplace problem when the solution exhibits no more regularity than belonging to $H^1_0(\Omega)$. We establish in this case some error bounds for both the solution and the approximation of the gradient component orthogonal to the mesh faces. This estimate is optimal, in the sense that the approximation error has the same order as that of the sum of the interpolation error and a conformity error. A numerical example illustrates the error estimate in the context of a solution with minimal regularity. This result is extended to evolution problems discretized via the implicit Euler scheme in an appendix.
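As a point of reference for the TPFA scheme, the sketch below applies it to the one-dimensional model problem -u'' = f on (0, 1) with homogeneous Dirichlet data on a uniform mesh. This smooth 1D setting, the midpoint quadrature for the source term and the function name are assumptions for illustration. The paper itself treats the Laplace problem with solutions that are only in $H^1_0(\Omega)$.

```python
def tpfa_poisson_1d(f, n):
    """Cell-centred TPFA scheme for -u'' = f on (0, 1), u(0) = u(1) = 0 (n >= 2)."""
    h = 1.0 / n
    centers = [(i + 0.5) * h for i in range(n)]
    # Two-point fluxes: transmissibility 1/h between neighbouring cell centres,
    # 2/h between a boundary cell centre and the boundary face (distance h/2).
    t_int, t_bnd = 1.0 / h, 2.0 / h
    lower = [-t_int] * n                 # sub-diagonal (lower[0] unused)
    upper = [-t_int] * n                 # super-diagonal (upper[n-1] unused)
    diag = [2.0 * t_int] * n
    diag[0] = diag[n - 1] = t_int + t_bnd
    rhs = [h * f(x) for x in centers]    # midpoint rule for the cell integral of f

    # Thomas algorithm: forward elimination, then back substitution.
    for i in range(1, n):
        m = lower[i] / diag[i - 1]
        diag[i] -= m * upper[i - 1]
        rhs[i] -= m * rhs[i - 1]
    u = [0.0] * n
    u[n - 1] = rhs[n - 1] / diag[n - 1]
    for i in range(n - 2, -1, -1):
        u[i] = (rhs[i] - upper[i] * u[i + 1]) / diag[i]
    return centers, u


# Manufactured test: f = 1 has the exact solution u(x) = x (1 - x) / 2.
centers, u = tpfa_poisson_1d(lambda x: 1.0, 100)
max_error = max(abs(ui - 0.5 * x * (1.0 - x)) for x, ui in zip(centers, u))
```

Each row of the tridiagonal system is a flux balance over one cell, so the discrete fluxes are conserved across interior faces by construction.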
[19] arXiv:2405.17090 [pdf, ps, html, other]: Title: Positivity preserving finite element method for the Gross-Pitaevskii ground state: discrete uniqueness and global convergence

Moritz Hauck, Yizhou Liang, Daniel Peterseim

Comments: 22 pages, 5 figures

Subjects: Numerical Analysis (math.NA)

We propose a positivity preserving finite element discretization for the nonlinear Gross-Pitaevskii eigenvalue problem. The method employs mass lumping techniques, which allow one to transfer the uniqueness up to sign and positivity properties of the continuous ground state to the discrete setting. We further prove that every non-negative discrete excited state up to sign coincides with the discrete ground state. This allows one to identify the limit of fully discretized gradient flows, which are typically used to compute the discrete ground state, and thereby establish their global convergence. Furthermore, we perform a rigorous a priori error analysis of the proposed non-standard finite element discretization, showing optimal orders of convergence for all unknowns. Numerical experiments illustrate the theoretical results of this paper.
[20] arXiv:2405.17204 [pdf, ps, other]: Title: Numerical solution of the boundary value problem of elliptic equation by Levi function scheme

Jinchao Pan, Jijun Liu

Subjects: Numerical Analysis (math.NA); Mathematical Physics (math-ph)

For the boundary value problem of an elliptic equation with variable coefficients describing the physical field distribution in inhomogeneous media, the Levi function can represent the solution in terms of volume and surface potentials, with the drawback that the volume potential involved in the solution expression requires heavy computational costs as well as the solvability of the integral equations with respect to the density pair. We introduce a modified integral expression for the solution to an elliptic equation in divergence form under the Levi function framework. The well-posedness of the linear integral system with respect to the density functions to be determined is rigorously proved. Based on the singularity decomposition for the Levi function, we propose two schemes to deal with the volume integrals so that the density functions can be solved efficiently. One method is an adaptive discretization scheme (ADS) for computing the integrals with continuous integrands, leading to the uniform accuracy of the integrals in the whole domain, and consequently the efficient computations for the density functions. The other method is the dual reciprocity method (DRM) which is a meshless approach converting the volume integrals into boundary integrals equivalently by expressing the volume density as the combination of the radial basis functions determined by the interior grids. The proposed schemes are justified numerically to be of satisfactory computation costs. Numerical examples in 2-dimensional and 3-dimensional cases are presented to show the validity of the proposed schemes.

[21] arXiv:2405.15955 (cross-list from astro-ph.EP) [pdf, ps, html, other]: Title: Anomaly distinguishability in an asteroid analogue using quasi-monostatic experimental radar measurements

Yusuf Oluwatoki Yusuf, Astrid Dufaure, Liisa-Ida Sorsa, Christelle Eyraud, Jean-Michel Geffrin, Alain Hérique, Sampsa Pursiainen

Comments: 11 pages, 13 figures, and 1 table

Subjects: Earth and Planetary Astrophysics (astro-ph.EP); Instrumentation and Methods for Astrophysics (astro-ph.IM); Numerical Analysis (math.NA)

This study conducts a quantitative distinguishability analysis using quasi-monostatic experimental radar data to find a topographic and backpropagated tomographic reconstruction for an analogue of asteroid Itokawa (25143). In particular, we consider a combination of travel-time and wavefield backpropagation tomography using the time-frequency representation (TFR) and principal component analysis (PCA) approaches as filtering techniques. Furthermore, we hypothesise that the travel time of the main peaks in the signal can be projected as a topographic imaging of the analogue asteroid while also presenting a tomographic reconstruction based on the main peaks in the signal. We compare the performance of several different filtering approaches covering several noise levels and two hypothetical interior structures: homogeneous and detailed. Our results suggest that wavefield information is vital for obtaining an appropriate reconstruction quality regardless of the noise level and that different filters affect the distinguishability under different assumptions of the noise. The results also suggest that the main peaks of the measured signal can be used to topographically distinguish the signatures in the measurements, hence the interior structure of the different analogue asteroids. Similarly, a tomographic reconstruction with the main peaks of the measured signal can be used to distinguish the interior structure of the different analogue asteroids.
[22] arXiv:2405.15986 (cross-list from cs.LG) [pdf, ps, other]: Title: Accelerating Diffusion Models with Parallel Sampling: Inference at Sub-Linear Time Complexity

Haoxuan Chen, Yinuo Ren, Lexing Ying, Grant M. Rotskoff

Subjects: Machine Learning (cs.LG); Distributed, Parallel, and Cluster Computing (cs.DC); Numerical Analysis (math.NA); Machine Learning (stat.ML)

Diffusion models have become a leading method for generative modeling of both image and scientific data. As these models are costly to train and evaluate, reducing the inference cost for diffusion models remains a major goal. Inspired by the recent empirical success in accelerating diffusion models via the parallel sampling technique \cite{shih2024parallel}, we propose to divide the sampling process into $\mathcal{O}(1)$ blocks with parallelizable Picard iterations within each block. Rigorous theoretical analysis reveals that our algorithm achieves $\widetilde{\mathcal{O}}(\mathrm{poly} \log d)$ overall time complexity, marking the first implementation with provable sub-linear complexity w.r.t. the data dimension $d$. Our analysis is based on a generalized version of Girsanov's theorem and is compatible with both the SDE and probability flow ODE implementations. Our results shed light on the potential of fast and efficient sampling of high-dimensional data on fast-evolving modern large-memory GPU clusters.
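The idea of a parallelizable Picard iteration can be illustrated on a plain scalar ODE. The sketch below treats a single block for x'(t) = f(x) on a uniform time grid: every sweep evaluates the drift at all grid points of the previous iterate, which are independent evaluations, and then forms a cumulative sum. The scalar ODE, the grid and the names are assumptions for illustration. This is not the authors' diffusion sampler and it omits the block decomposition and the stochastic terms.

```python
def picard_sweeps(f, x0, h, n_steps, n_sweeps):
    """Picard iteration on a time grid for x'(t) = f(x), x(0) = x0."""
    x = [x0] * (n_steps + 1)               # initial guess: constant trajectory
    for _ in range(n_sweeps):
        # All drift evaluations use the previous iterate only, so they are
        # independent of each other and could run in parallel.
        drift = [f(xi) for xi in x[:-1]]
        new_x, total = [x0], 0.0
        for d in drift:                     # cumulative sum (a prefix scan)
            total += h * d
            new_x.append(x0 + total)
        x = new_x
    return x


# Linear test problem x' = -x, x(0) = 1, on [0, 1] with 10 grid intervals.
trajectory = picard_sweeps(lambda x: -x, 1.0, 0.1, n_steps=10, n_sweeps=10)
```

After k sweeps the first k grid values agree with sequential forward Euler, so n_steps sweeps reproduce the Euler trajectory. Any speedup comes from cases where far fewer sweeps than steps already give an accurate trajectory.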
[23] arXiv:2405.15992 (cross-list from cs.LG) [pdf, ps, html, other]: Title: Data Complexity Estimates for Operator Learning

Nikola B. Kovachki, Samuel Lanthaler, Hrushikesh Mhaskar

Subjects: Machine Learning (cs.LG); Numerical Analysis (math.NA)

Operator learning has emerged as a new paradigm for the data-driven approximation of nonlinear operators. Despite its empirical success, the theoretical underpinnings governing the conditions for efficient operator learning remain incomplete. The present work develops theory to study the data complexity of operator learning, complementing existing research on the parametric complexity. We investigate the fundamental question: How many input/output samples are needed in operator learning to achieve a desired accuracy $\epsilon$? This question is addressed from the point of view of $n$-widths, and this work makes two key contributions. The first contribution is to derive lower bounds on $n$-widths for general classes of Lipschitz and Fréchet differentiable operators. These bounds rigorously demonstrate a "curse of data-complexity", revealing that learning on such general classes requires a sample size exponential in the inverse of the desired accuracy $\epsilon$. The second contribution of this work is to show that "parametric efficiency" implies "data efficiency"; using the Fourier neural operator (FNO) as a case study, we show rigorously that on a narrower class of operators, efficiently approximated by FNO in terms of the number of tunable parameters, efficient operator learning is attainable in data complexity as well. Specifically, we show that if only an algebraically increasing number of tunable parameters is needed to reach a desired approximation accuracy, then an algebraically bounded number of data samples is also sufficient to achieve the same accuracy.
[24] arXiv:2405.16020 (cross-list from math.OC) [pdf, ps, other]: Title: Block Acceleration Without Momentum: On Optimal Stepsizes of Block Gradient Descent for Least-Squares

Liangzu Peng, Wotao Yin

Comments: 36 pages, accepted to ICML 2024

Subjects: Optimization and Control (math.OC); Numerical Analysis (math.NA)

Block coordinate descent is a powerful algorithmic template suitable for big data optimization. This template admits a lot of variants including block gradient descent (BGD), which performs gradient descent on a selected block of variables, while keeping other variables fixed. For a very long time, the stepsize for each block has tacitly been set to one divided by the block-wise Lipschitz smoothness constant, imitating the vanilla stepsize rule for gradient descent (GD). However, such a choice for BGD has not yet been able to theoretically justify its empirical superiority over GD, as existing convergence rates for BGD have worse constants than GD in the deterministic cases.
To discover such theoretical justification, we set up a simple environment where we consider BGD applied to least-squares with two blocks of variables. Assuming the data matrix corresponding to each block is orthogonal, we find optimal stepsizes of BGD in closed form, which provably lead to asymptotic convergence rates twice as fast as GD with Polyak's momentum; this means, under that orthogonality assumption, one can accelerate BGD by just tuning stepsizes and without adding any momentum. An application that satisfies this assumption is generalized alternating projection between two subspaces, and applying our stepsizes to it improves the prior convergence rate that was once claimed, slightly inaccurately, to be optimal. The main proof idea is to minimize, in stepsize variables, the spectral radius of a matrix that controls convergence rates.
[25] arXiv:2405.16359 (cross-list from stat.CO) [pdf, ps, html, other]: Title: A First Course in Monte Carlo Methods

Daniel Sanz-Alonso, Omar Al-Ghattas

Comments: 150 pages, 21 figures

Subjects: Computation (stat.CO); History and Overview (math.HO); Numerical Analysis (math.NA)

This is a concise mathematical introduction to Monte Carlo methods, a rich family of algorithms with far-reaching applications in science and engineering. Monte Carlo methods are an exciting subject for mathematical statisticians and computational and applied mathematicians: the design and analysis of modern algorithms are rooted in a broad mathematical toolbox that includes ergodic theory of Markov chains, Hamiltonian dynamical systems, transport maps, stochastic differential equations, information theory, optimization, Riemannian geometry, and gradient flows, among many others. These lecture notes celebrate the breadth of mathematical ideas that have led to tangible advancements in Monte Carlo methods and their applications. To accommodate a diverse audience, the level of mathematical rigor varies from chapter to chapter, giving only an intuitive treatment to the most technically demanding subjects. The aim is not to be comprehensive or encyclopedic, but rather to illustrate some key principles in the design and analysis of Monte Carlo methods through a carefully-crafted choice of topics that emphasizes timeless over timely ideas. Algorithms are presented in a way that is conducive to conceptual understanding and mathematical analysis -- clarity and intuition are favored over state-of-the-art implementations that are harder to comprehend or rely on ad-hoc heuristics. To help readers navigate the expansive landscape of Monte Carlo methods, each algorithm is accompanied by a summary of its pros and cons, and by a discussion of the type of problems for which they are most useful. The presentation is self-contained, and therefore adequate for self-guided learning or as a teaching resource. Each chapter contains a section with bibliographic remarks that will be useful for those interested in conducting research on Monte Carlo methods and their applications.
[26] arXiv:2405.16563 (cross-list from cs.LG) [pdf, ps, other]: Title: Reality Only Happens Once: Single-Path Generalization Bounds for Transformers

Yannick Limmer, Anastasis Kratsios, Xuwei Yang, Raeid Saqur, Blanka Horvath

Comments: 11 pages (+30 appendix), 3 figures, 6 tables

Subjects: Machine Learning (cs.LG); Neural and Evolutionary Computing (cs.NE); Numerical Analysis (math.NA); Probability (math.PR); Machine Learning (stat.ML)

One of the inherent challenges in deploying transformers on time series is that \emph{reality only happens once}; namely, one typically only has access to a single trajectory of the data-generating process comprised of non-i.i.d. observations. We derive non-asymptotic statistical guarantees in this setting through bounds on the \textit{generalization} of a transformer network at a future-time $t$, given that it has been trained using $N\le t$ observations from a single perturbed trajectory of a Markov process. Under the assumption that the Markov process satisfies a log-Sobolev inequality, we obtain a generalization bound which effectively converges at the rate of ${O}(1/\sqrt{N})$. Our bound depends explicitly on the activation function ($\operatorname{Swish}$, $\operatorname{GeLU}$, or $\tanh$ are considered), the number of self-attention heads, depth, width, and norm-bounds defining the transformer architecture. Our bound consists of three components: (I) The first quantifies the gap between the stationary distribution of the data-generating Markov process and its distribution at time $t$; this term converges exponentially to $0$. (II) The next term encodes the complexity of the transformer model and, given enough time, eventually converges to $0$ at the rate ${O}(\log(N)^r/\sqrt{N})$ for any $r>0$. (III) The third term guarantees that the bound holds with probability at least $1-\delta$, and converges at a rate of ${O}(\sqrt{\log(1/\delta)}/\sqrt{N})$.
[27] arXiv:2405.16651 (cross-list from quant-ph) [pdf, ps, html, other]: Title: Variational Quantum Framework for Partial Differential Equation Constrained Optimization

Amit Surana, Abeynaya Gnanasekaran

Subjects: Quantum Physics (quant-ph); Analysis of PDEs (math.AP); Numerical Analysis (math.NA)

We present a novel variational quantum framework for partial differential equation (PDE) constrained design optimization problems. Such problems arise in simulation based design in many scientific and engineering domains. For instance, in aerodynamic design, the PDE constraints are the conservation laws such as momentum, mass and energy balance, the design variables are vehicle shape parameters and material properties, and the objective could be to minimize the effect of transient heat loads on the vehicle or to maximize the lift. The proposed framework utilizes the variational quantum linear system (VQLS) algorithm and a black box optimizer as its two main building blocks. VQLS is used to solve the linear system, arising from the discretization of the PDE constraints for given design parameters, and evaluate the design cost/objective function. The black box optimizer is used to select the next set of parameter values based on this evaluated cost, leading to a nested bi-level optimization structure within a hybrid classical-quantum setting. We present a detailed complexity analysis to highlight the potential advantages of our proposed framework over classical techniques. We implement our framework using the PennyLane library, apply it to solve a prototypical heat transfer optimization problem, and present simulation results using Bayesian optimization as the black box
[28] arXiv:2405.16679 (cross-list from math.AP) [pdf, ps, html, other]: Title: Aggregation-Diffusion Equations for Collective Behaviour in the Sciences

Rafael Bailo, José A. Carrillo, David Gómez-Castro

Subjects: Analysis of PDEs (math.AP); Numerical Analysis (math.NA)

This is a survey article based on the content of the plenary lecture given by José A. Carrillo at the ICIAM23 conference in Tokyo. It is devoted to producing a snapshot of the state of the art in the analysis, numerical analysis, simulation, and applications of the vast area of aggregation-diffusion equations. We also discuss the implications in mathematical biology explaining cell sorting in tissue growth as an example of this modelling framework. This modelling strategy is quite successful in other timely applications such as global optimisation, parameter estimation and machine learning.
[29] arXiv:2405.16841 (cross-list from math.AP) [pdf, ps, html, other]: Title: Approximation of arbitrarily high-order PDEs by first-order hyperbolic relaxation

David I. Ketcheson, Abhijit Biswas

Subjects: Analysis of PDEs (math.AP); Numerical Analysis (math.NA)

We present a framework for constructing a first-order hyperbolic system whose solution approximates that of a desired higher-order evolution equation. Constructions of this kind have received increasing interest in recent years, and are potentially useful as either analytical or computational tools for understanding the corresponding higher-order equation. We perform a systematic analysis of a family of linear model equations and show that for each member of this family there is a stable hyperbolic approximation whose solution converges to that of the model equation in a certain limit. We then show through several examples that this approach can be applied successfully to a very wide range of nonlinear PDEs of practical interest.
[30] arXiv:2405.16903 (cross-list from math.OC) [pdf, ps, html, other]: Title: Kaczmarz Projection Algorithms in Moving Window: Performance Improvement via Extended Orthogonality & Forgetting

Alexander Stotsky

Comments: J Sign Process Syst, 2024

Subjects: Optimization and Control (math.OC); Dynamical Systems (math.DS); Numerical Analysis (math.NA); Spectral Theory (math.SP)

New Kaczmarz algorithms with rank two gain update, extended orthogonality property and forgetting mechanism which includes both exponential and instantaneous forgetting (implemented via a proper choice of the forgetting factor and the window size) are introduced and associated in this report with well-known Kaczmarz algorithms with rank one update.
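For orientation, the well-known Kaczmarz projection step that this abstract takes as its baseline can be sketched as below. This is only the classical row-projection iteration for a consistent linear system; it does not implement the rank-two gain update, extended orthogonality, or forgetting mechanisms of the report. The function name `kaczmarz` and the small test system are assumptions made for this example.

```python
import numpy as np

def kaczmarz(A, b, sweeps=500):
    """Classical Kaczmarz iteration: project onto one row equation at a time."""
    x = np.zeros(A.shape[1])
    for _ in range(sweeps):
        for a_i, b_i in zip(A, b):
            # Orthogonal projection of x onto the hyperplane a_i @ x = b_i.
            x = x + (b_i - a_i @ x) / (a_i @ a_i) * a_i
    return x

# Hypothetical consistent test system (assumption, for illustration only).
A = np.array([[4.0, 1.0, 0.0],
              [1.0, 3.0, 1.0],
              [0.0, 1.0, 2.0]])
x_true = np.array([1.0, -2.0, 3.0])
b = A @ x_true

x_est = kaczmarz(A, b)
```

Each inner step changes the estimate only along the current row, which is the sense in which the classical update is a projection.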
[31] arXiv:2405.17018 (cross-list from cs.CE) [pdf, ps, html, other]: Title: Structural cohesive element for the modelling of delamination in composite laminates without the cohesive zone limit

Xiaopeng Ai, Boyang Chen, Christos Kassapoglou

Subjects: Computational Engineering, Finance, and Science (cs.CE); Numerical Analysis (math.NA)

Delamination is a critical mode of failure that occurs between plies in a composite laminate. The cohesive element, developed based on the cohesive zone model, is widely used for modeling delamination. However, standard cohesive elements suffer from a well-known limit on the mesh density: the element size must be much smaller than the cohesive zone size. This work develops a new set of elements for modelling composite plies and their interfaces in 3D. A triangular Kirchhoff-Love shell element is developed for orthotropic materials to model the plies. A structural cohesive element, conforming to the shell elements of the plies, is developed to model the interface delamination. The proposed method is verified and validated on the classical benchmark problems of Mode I, Mode II, and mixed-mode delamination of unidirectional laminates, as well as on the single-leg bending problem of a multi-directional laminate. All the results show that the element size in the proposed models can be ten times larger than that in the standard cohesive element models, with more than 90% reduction in CPU time, while retaining prediction accuracy. This would then allow more effective and efficient modeling of delamination in composites without worrying about the cohesive zone limit on the mesh density.
[32] arXiv:2405.17068 (cross-list from cs.LG) [pdf, ps, html, other]: Title: The Poisson Midpoint Method for Langevin Dynamics: Provably Efficient Discretization for Diffusion Models

Saravanan Kandasamy, Dheeraj Nagaraj

Comments: "One often meets his destiny on the road he takes to avoid it" - Master Oogway. My destiny seems to be to write triangle inequalities for the rest of my life

Subjects: Machine Learning (cs.LG); Numerical Analysis (math.NA); Machine Learning (stat.ML)

Langevin Dynamics is a Stochastic Differential Equation (SDE) central to sampling and generative modeling and is implemented via time discretization. Langevin Monte Carlo (LMC), based on the Euler-Maruyama discretization, is the simplest and most studied algorithm. LMC can suffer from slow convergence, requiring a large number of steps of small step-size to obtain good quality samples. This becomes stark in the case of diffusion models where a large number of steps gives the best samples, but the quality degrades rapidly with a smaller number of steps. The Randomized Midpoint Method has been recently proposed as a better discretization of Langevin dynamics for sampling from strongly log-concave distributions. However, important applications such as diffusion models involve non-log-concave densities and contain time-varying drift. We propose its variant, the Poisson Midpoint Method, which approximates a small step-size LMC with large step-sizes. We prove that this can obtain a quadratic speed up of LMC under very weak assumptions. We apply our method to diffusion models for image generation and show that it maintains the quality of DDPM with 1000 neural network calls with just 50-80 neural network calls and outperforms ODE based methods with similar compute.
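As a point of reference, the baseline Langevin Monte Carlo scheme mentioned in this abstract (the Euler-Maruyama discretization) can be sketched as below. This is not the Poisson Midpoint Method. The function name `langevin_monte_carlo`, the standard Gaussian target with potential $U(x) = x^2/2$, and the step size are assumptions chosen for this example.

```python
import numpy as np

def langevin_monte_carlo(grad_potential, x0, step, n_steps, rng):
    """Euler-Maruyama discretization of dX = -grad U(X) dt + sqrt(2) dW."""
    x = np.array(x0, dtype=float)
    for _ in range(n_steps):
        noise = rng.standard_normal(x.shape)
        x = x - step * grad_potential(x) + np.sqrt(2.0 * step) * noise
    return x

# Hypothetical target: standard Gaussian, so grad U(x) = x (assumption).
# 20000 independent chains are advanced in parallel as one array.
rng = np.random.default_rng(0)
step = 0.1
samples = langevin_monte_carlo(lambda x: x, np.zeros(20000), step, 200, rng)

# For this linear drift the discretized chain has stationary variance
# 1 / (1 - step / 2), slightly above the target variance of 1.
biased_variance = 1.0 / (1.0 - step / 2.0)
```

The gap between `biased_variance` and 1 is the discretization bias of this baseline scheme for the chosen target, and it shrinks as the step size decreases at the cost of more steps.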
[33] arXiv:2405.17087 (cross-list from math.DS) [pdf, ps, html, other]: Title: Symbolic dynamics for the Kuramoto-Sivashinsky PDE on the line II

Daniel Wilczak, Piotr Zgliczyński

Comments: 57 pages

Subjects: Dynamical Systems (math.DS); Analysis of PDEs (math.AP); Numerical Analysis (math.NA)

We present a new algorithm for the rigorous integration of the variational equation (i.e. producing $\mathcal C^1$ estimates) for a class of dissipative PDEs on the torus. As an application, for some parameter value of the Kuramoto-Sivashinsky PDE on the line with odd and periodic boundary conditions, we prove the existence of an infinite number of homo- and heteroclinic orbits to two periodic orbits. The proof is computer assisted.
[34] arXiv:2405.17211 (cross-list from cs.LG) [pdf, ps, html, other]: Title: Spectral-Refiner: Fine-Tuning of Accurate Spatiotemporal Neural Operator for Turbulent Flows

Shuhao Cao, Francesco Brarda, Ruipeng Li, Yuanzhe Xi

Subjects: Machine Learning (cs.LG); Numerical Analysis (math.NA); Fluid Dynamics (physics.flu-dyn)

Recent advancements in operator-type neural networks have shown promising results in approximating the solutions of spatiotemporal Partial Differential Equations (PDEs). However, these neural networks often entail considerable training expenses, and may not always achieve the desired accuracy required in many scientific and engineering disciplines. In this paper, we propose a new Spatiotemporal Fourier Neural Operator (SFNO) that learns maps between Bochner spaces, and a new learning framework to address these issues. This new paradigm leverages wisdom from traditional numerical PDE theory and techniques to refine the pipeline of commonly adopted end-to-end neural operator training and evaluations. Specifically, in the learning problems for the turbulent flow modeling by the Navier-Stokes Equations (NSE), the proposed architecture initiates the training with a few epochs for SFNO, concluding with the freezing of most model parameters. Then, the last linear spectral convolution layer is fine-tuned without the frequency truncation. The optimization uses a negative Sobolev norm for the first time as the loss in operator learning, defined through a reliable functional-type \emph{a posteriori} error estimator whose evaluation is almost exact thanks to the Parseval identity. This design allows the neural operators to effectively tackle low-frequency errors while the relief of the de-aliasing filter addresses high-frequency errors. Numerical experiments on commonly used benchmarks for the 2D NSE demonstrate significant improvements in both computational efficiency and accuracy, compared to end-to-end evaluation and traditional numerical PDE solvers.
[35] arXiv:2405.17277 (cross-list from cs.LG) [pdf, ps, html, other]: Title: Gradients of Functions of Large Matrices

Nicholas Krämer, Pablo Moreno-Muñoz, Hrittik Roy, Søren Hauberg

Subjects: Machine Learning (cs.LG); Numerical Analysis (math.NA); Machine Learning (stat.ML)

Tuning scientific and probabilistic machine learning models -- for example, partial differential equations, Gaussian processes, or Bayesian neural networks -- often relies on evaluating functions of matrices whose size grows with the data set or the number of parameters. While the state-of-the-art for evaluating these quantities is almost always based on Lanczos and Arnoldi iterations, the present work is the first to explain how to differentiate these workhorses of numerical linear algebra efficiently. To get there, we derive previously unknown adjoint systems for Lanczos and Arnoldi iterations, implement them in JAX, and show that the resulting code can compete with Diffrax when it comes to differentiating PDEs, GPyTorch for selecting Gaussian process models and beats standard factorisation methods for calibrating Bayesian neural networks. All this is achieved without any problem-specific code optimisation. Find the code at this https URL and install the library with pip install matfree.

[36] arXiv:2205.07583 (replaced) [pdf, ps, html, other]: Title: A least-squares Galerkin approach to gradient recovery for Hamilton-Jacobi-Bellman equation with Cordes coefficients

Omar Lakkis, Amireh Mousavi

Comments: 24 pages, 2 Figures (6 graphs)

Subjects: Numerical Analysis (math.NA); Analysis of PDEs (math.AP); Optimization and Control (math.OC); Adaptation and Self-Organizing Systems (nlin.AO)

We propose a conforming finite element method to approximate the strong solution of the second order Hamilton-Jacobi-Bellman equation with Dirichlet boundary and coefficients satisfying the Cordes condition. We show the convergence of the continuum semismooth Newton method for the fully nonlinear Hamilton-Jacobi-Bellman equation. Applying this linearization to the equation yields a recursive sequence of linear elliptic boundary value problems in nondivergence form. We deal numerically with such BVPs via the least-squares gradient recovery of Lakkis & Mousavi [2021, arXiv:1909.00491]. We provide optimal-rate a priori and a posteriori error bounds for the approximation. The a posteriori error estimates are used to drive an adaptive refinement procedure. We close with computer experiments on uniform and adaptive meshes to reconcile the theoretical findings.
[37] arXiv:2210.02432 (replaced) [pdf, ps, html, other]: Title: Coercive second-kind boundary integral equations for the Laplace Dirichlet problem on Lipschitz domains

Simon N. Chandler-Wilde, Euan A. Spence

Subjects: Numerical Analysis (math.NA); Analysis of PDEs (math.AP)

We present new second-kind integral-equation formulations of the interior and exterior Dirichlet problems for Laplace's equation. The operators in these formulations are both continuous and coercive on general Lipschitz domains in $\mathbb{R}^d$, $d\geq 2$, in the space $L^2(\Gamma)$, where $\Gamma$ denotes the boundary of the domain. These properties of continuity and coercivity immediately imply that (i) the Galerkin method converges when applied to these formulations; and (ii) the Galerkin matrices are well-conditioned as the discretisation is refined, without the need for operator preconditioning (and we prove a corresponding result about the convergence of GMRES). The main significance of these results is that it was recently proved (see Chandler-Wilde and Spence, Numer. Math., 150(2):299-371, 2022) that there exist 2- and 3-d Lipschitz domains and 3-d starshaped Lipschitz polyhedra for which the operators in the standard second-kind integral-equation formulations for Laplace's equation (involving the double-layer potential and its adjoint) $\textit{cannot}$ be written as the sum of a coercive operator and a compact operator in the space $L^2(\Gamma)$. Therefore there exist 2- and 3-d Lipschitz domains and 3-d starshaped Lipschitz polyhedra for which Galerkin methods in $L^2(\Gamma)$ do $\textit{not}$ converge when applied to the standard second-kind formulations, but $\textit{do}$ converge for the new formulations.
[38] arXiv:2305.08221 (replaced) [pdf, ps, html, other]: Title: Validated integration of semilinear parabolic PDEs

Jan Bouwe van den Berg, Maxime Breden, Ray Sheombarsing

Comments: Revised accepted version

Subjects: Numerical Analysis (math.NA); Analysis of PDEs (math.AP); Dynamical Systems (math.DS)

Integrating evolutionary partial differential equations (PDEs) is an essential ingredient for studying the dynamics of the solutions. Indeed, simulations are at the core of scientific computing, but their mathematical reliability is often difficult to quantify, especially when one is interested in the output of a given simulation, rather than in the asymptotic regime where the discretization parameter tends to zero. In this paper we present a computer-assisted proof methodology to perform rigorous time integration for scalar semilinear parabolic PDEs with periodic boundary conditions. We formulate an equivalent zero-finding problem based on a variation of constants formula in Fourier space. Using Chebyshev interpolation and domain decomposition, we then finish the proof with a Newton--Kantorovich type argument. The final output of this procedure is a proof of existence of an orbit, together with guaranteed error bounds between this orbit and a numerically computed approximation. We illustrate the versatility of the approach with results for the Fisher equation, the Swift--Hohenberg equation, the Ohta--Kawasaki equation and the Kuramoto--Sivashinsky equation. We expect that this rigorous integrator can form the basis for studying boundary value problems for connecting orbits in partial differential equations.
[39] arXiv:2306.05975 (replaced) [pdf, ps, html, other]: Title: Efficient Tensor-Product Spectral-Element Operators with the Summation-by-Parts Property on Curved Triangles and Tetrahedra

Tristan Montoya, David W. Zingg

Comments: 27 pages, 5 figures

Subjects: Numerical Analysis (math.NA)

We present an extension of the summation-by-parts (SBP) framework to tensor-product spectral-element operators in collapsed coordinates. The proposed approach enables the construction of provably stable discretizations of arbitrary order which combine the geometric flexibility of unstructured triangular and tetrahedral meshes with the efficiency of sum-factorization algorithms. Specifically, a methodology is developed for constructing triangular and tetrahedral spectral-element operators of any order which possess the SBP property (i.e. satisfying a discrete analogue of integration by parts) as well as a tensor-product decomposition. Such operators are then employed within the context of discontinuous spectral-element methods based on nodal expansions collocated at the tensor-product quadrature nodes as well as modal expansions employing Proriol-Koornwinder-Dubiner polynomials, the latter approach resolving the time step limitation associated with the singularity of the collapsed coordinate transformation. Energy-stable formulations for curvilinear meshes are obtained using a skew-symmetric splitting of the metric terms, and a weight-adjusted approximation is used to efficiently invert the curvilinear modal mass matrix. The proposed schemes are compared to those using non-tensorial multidimensional SBP operators, and are found to offer comparable accuracy to such schemes in the context of smooth linear advection problems on curved meshes, but at a reduced computational cost for higher polynomial degrees.
[40] arXiv:2307.07486 (replaced) [pdf, ps, other]: Title: Global sensitivity analysis with limited data via sparsity-promoting D-MORPH regression: Application to char combustion

Dongjin Lee, Elle Lavichant, Boris Kramer

Comments: 26 pages, 11 figures

Subjects: Numerical Analysis (math.NA)

In uncertainty quantification, variance-based global sensitivity analysis quantitatively determines the effect of each input random variable on the output by partitioning the total output variance into contributions from each input. However, computing conditional expectations can be prohibitively costly when working with expensive-to-evaluate models. Surrogate models can accelerate this, yet their accuracy depends on the quality and quantity of training data, which is expensive to generate (experimentally or computationally) for complex engineering systems. Thus, methods that work with limited data are desirable. We propose a diffeomorphic modulation under observable response preserving homotopy (D-MORPH) regression to train a polynomial dimensional decomposition surrogate of the output that minimizes the number of training data. The new method first computes a sparse Lasso solution and uses it to define the cost function. A subsequent D-MORPH regression minimizes the difference between the D-MORPH and Lasso solution. The resulting D-MORPH based surrogate is more robust to input variations and more accurate with limited training data. We illustrate the accuracy and computational efficiency of the new surrogate for global sensitivity analysis using mathematical functions and an expensive-to-simulate model of char combustion. The new method is highly efficient, requiring only 15% of the training data compared to conventional regression.
[41] arXiv:2307.07780 (replaced) [pdf, ps, html, other]: Title: Accuracy Controlled Schemes for the Eigenvalue Problem of the Radiative Transfer Equation

Wolfgang Dahmen, Olga Mula

Subjects: Numerical Analysis (math.NA); Analysis of PDEs (math.AP); Spectral Theory (math.SP)

The criticality problem in nuclear engineering asks for the principal eigen-pair of a Boltzmann operator describing neutron transport in a reactor core. Being able to reliably design and control such reactors requires assessing these quantities within quantifiable accuracy tolerances. In this paper we propose a paradigm that deviates from the common practice of approximately solving the corresponding spectral problem with a fixed, presumably sufficiently fine discretization. Instead, the present approach is based on first contriving iterative schemes, formulated in function space, that are shown to converge at a quantitative rate without assuming any a priori excess regularity properties, and that exploit only properties of the optical parameters in the underlying radiative transfer model. We develop the analytical and numerical tools for approximately realizing each iteration step within judiciously chosen accuracy tolerances, verified by a posteriori estimates, so as to still warrant quantifiable convergence to the exact eigen-pair. This is carried out in full first for a Newton scheme. Since this is only locally convergent, we analyze in addition the convergence of a power iteration in function space to produce sufficiently accurate initial guesses. Here we have to deal with intrinsic difficulties posed by compact but unsymmetric operators preventing standard arguments used in the finite dimensional case. Our main point is that we can avoid any condition on an initial guess to be already in a small neighborhood of the exact solution. We close with a discussion of remaining intrinsic obstructions to a certifiable numerical implementation, mainly related to not knowing the gap between the principal eigenvalue and the next smaller one in modulus.
[42] arXiv:2401.03245 (replaced) [pdf, ps, other]: Title: Deep learning algorithms for FBSDEs with jumps: Applications to option pricing and a MFG model for smart grids

Clémence Alasseur, Zakaria Bensaid, Roxana Dumitrescu, Xavier Warin

Subjects: Numerical Analysis (math.NA); Optimization and Control (math.OC); Probability (math.PR)

In this paper, we introduce various machine learning solvers for (coupled) forward-backward systems of stochastic differential equations (FBSDEs) driven by a Brownian motion and a Poisson random measure. We provide a rigorous comparison of the different algorithms and demonstrate their effectiveness in various applications, such as cases derived from pricing with jumps and mean-field games. In particular, we show the efficiency of the deep-learning algorithms to solve a coupled multi-dimensional FBSDE system driven by a time-inhomogeneous jump process with stochastic intensity, which describes the Nash equilibria for a specific mean-field game (MFG) problem for which we also provide the complete theoretical resolution. More precisely, we develop an extension of the MFG model for smart grids introduced in Alasseur, Campi, Dumitrescu and Zeng (Annals of Operations Research, 2023) to the case when the random jump times correspond to the jump times of a doubly Poisson process. We first provide an existence result for an equilibrium and derive its semi-explicit characterization in terms of a system of FBSDEs in the linear-quadratic setting. We then compare the MFG solution to the optimal strategy of a central planner and provide several numerical illustrations using the deep-learning solvers presented in the first part of the paper.
[43] arXiv:2402.04407 (replaced) [pdf, ps, html, other]: Title: Sharp Lower Bounds on the Manifold Widths of Sobolev and Besov Spaces

Jonathan W. Siegel

Subjects: Numerical Analysis (math.NA)

We consider the problem of determining the manifold $n$-widths of Sobolev and Besov spaces with error measured in the $L_p$-norm. The manifold widths control how efficiently these spaces can be approximated by general non-linear parametric methods with the restriction that the parameter selection and parameterization maps must be continuous. Existing upper and lower bounds only match when the Sobolev or Besov smoothness index $q$ satisfies $q\leq p$ or $1 \leq p \leq 2$. We close this gap and obtain sharp lower bounds for all $1 \leq p,q \leq \infty$ for which a compact embedding holds. A key part of our analysis is to determine the exact value of the manifold widths of finite dimensional $\ell^M_q$-balls in the $\ell_p$-norm when $p\leq q$, which complements existing results that handle the case $q\leq p$. Our results show that the Bernstein widths, which are typically used to lower bound the manifold widths, decay asymptotically faster than the manifold widths in many cases.
[44] arXiv:2402.06429 (replaced) [pdf, ps, html, other]: Title: Exact a posteriori error control for variational problems via convex duality and explicit flux reconstruction

Sören Bartels, Alex Kaltenbach

Comments: To appear in Advances in Applied Mechanics

Subjects: Numerical Analysis (math.NA)

A posteriori error estimates are an important tool to bound discretization errors in terms of computable quantities avoiding regularity conditions that are often difficult to establish. For non-linear and non-differentiable problems, problems involving jumping coefficients, and finite element methods using anisotropic triangulations, such estimates often involve large factors, leading to sub-optimal error estimates. By making use of convex duality arguments, exact and explicit error representations are derived that avoid such effects.
[45] arXiv:2403.04329 (replaced) [pdf, ps, html, other]: Title: A mechanism-driven reinforcement learning framework for shape optimization of airfoils

Jingfeng Wang, Guanghui Hu

Comments: 25 pages

Subjects: Numerical Analysis (math.NA); Computational Engineering, Finance, and Science (cs.CE); Machine Learning (cs.LG)

In this paper, a novel mechanism-driven reinforcement learning framework is proposed for airfoil shape optimization. To validate the framework, a reward function is designed and analyzed, from which the equivalence between maximizing the cumulative reward and achieving the optimization objectives is guaranteed theoretically. To establish a quality exploration, and to obtain an accurate reward from the environment, an efficient solver for steady Euler equations is employed in the reinforcement learning method. The solver utilizes the Bézier curve to describe the shape of the airfoil, and a Newton-geometric multigrid method for the solution. In particular, a dual-weighted residual-based h-adaptive method is used for efficient calculation of the target functional. To effectively streamline the airfoil shape during the deformation process, we introduce the Laplacian smoothing, and propose a Bézier fitting strategy, which not only remits mesh tangling but also guarantees a precise manipulation of the geometry. In addition, a neural network architecture is designed based on an attention mechanism to make the learning process more sensitive to minor changes of the airfoil geometry. Numerical experiments demonstrate that our framework can handle the optimization problem with hundreds of design variables. It is worth mentioning that, prior to this work, there are limited works combining such a high-fidelity partial differential equations framework with advanced reinforcement learning algorithms for design problems with such high dimensionality.
[46] arXiv:2403.07787 (replaced) [pdf, ps, html, other]: Title: Transparent boundary condition and its effectively local approximation for the Schrödinger equation on a rectangular computational domain

Samardhi Yadav, Vishal Vaibhav

Comments: 53 pages, 16 figures, 5 tables

Subjects: Numerical Analysis (math.NA); Computational Physics (physics.comp-ph)

The transparent boundary condition for the free Schrödinger equation on a rectangular computational domain requires implementation of an operator of the form $\sqrt{\partial_t-i\triangle_{\Gamma}}$ where $\triangle_{\Gamma}$ is the Laplace-Beltrami operator. It is known that this operator is nonlocal in time as well as space, which poses a significant challenge in developing an efficient numerical method of solution. The computational complexity of the existing methods scales with the number of time-steps, which can be attributed to the nonlocal nature of the boundary operator. In this work, we report an effectively local approximation for the boundary operator such that the resulting complexity remains independent of the number of time-steps. At the heart of this algorithm is a Padé approximant based rational approximation of certain fractional operators that handles corners of the domain adequately. For the spatial discretization, we use a Legendre-Galerkin spectral method with a new boundary adapted basis which ensures that the resulting linear system is banded. A compatible boundary-lifting procedure is also presented which accommodates the segments as well as the corners on the boundary. The proposed novel scheme can be implemented within the framework of any one-step time-marching scheme. In particular, we demonstrate these ideas for two one-step methods, namely, the backward-differentiation formula of order 1 (BDF1) and the trapezoidal rule (TR). For the sake of comparison, we also present a convolution quadrature based scheme conforming to the one-step methods which is computationally expensive but serves as a gold standard. Finally, several numerical tests are presented to demonstrate the effectiveness of our novel method as well as to verify the order of convergence empirically.
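To illustrate what verifying the order of convergence empirically can look like for the two one-step methods named above, here is a minimal sketch on a scalar model problem. The test equation $y' = \lambda y$ with $\lambda = -i$, the step counts, and all function names are assumptions made for this illustration; nothing here reproduces the boundary operator or the spectral discretization of the paper.

```python
import cmath
import math

def bdf1_step(y, h, lam):
    """BDF1 (backward Euler) for y' = lam*y: solve y_new = y + h*lam*y_new."""
    return y / (1 - h * lam)

def tr_step(y, h, lam):
    """Trapezoidal rule for y' = lam*y: y_new = y + (h/2)*lam*(y + y_new)."""
    return y * (1 + 0.5 * h * lam) / (1 - 0.5 * h * lam)

def final_error(step, lam, T, n):
    """Error at time T after n uniform steps, starting from y(0) = 1."""
    h = T / n
    y = 1.0 + 0.0j
    for _ in range(n):
        y = step(y, h, lam)
    return abs(y - cmath.exp(lam * T))

def observed_order(step, lam=-1j, T=1.0, n=200):
    """Estimate the convergence order from the error ratio when h is halved."""
    return math.log2(final_error(step, lam, T, n) / final_error(step, lam, T, 2 * n))

order_bdf1 = observed_order(bdf1_step)  # expected to be close to 1
order_tr = observed_order(tr_step)      # expected to be close to 2
```

Halving the step size should roughly halve the BDF1 error and quarter the TR error, which is what the logarithm of the error ratio measures.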
[47] arXiv:2405.05844 (replaced) [pdf, ps, html, other]: Title: Structure-preserving parametric finite element methods for simulating axisymmetric solid-state dewetting problems with anisotropic surface energies

Meng Li, Chunjie Zhou

Subjects: Numerical Analysis (math.NA)

Solid-state dewetting (SSD), a widespread phenomenon in solid-solid-vapor systems, could be used to describe the accumulation of solid thin films on the substrate. In this work, we consider the sharp interface model for axisymmetric SSD with anisotropic surface energy. By introducing two types of surface energy matrices from the anisotropy functions, we aim to design two structure-preserving algorithms for the axisymmetric SSD. The newly designed schemes are applicable to a broader range of anisotropy functions, and we can theoretically prove their volume conservation and energy stability. In addition, based on a novel weak formulation for the axisymmetric SSD, we further build another two numerical schemes that have good mesh properties. Finally, numerous numerical tests are reported to showcase the accuracy and efficiency of the numerical methods.
[48] arXiv:2405.06464 (replaced) [pdf, ps, other]: Title: Single-seed generation of Brownian paths and integrals for adaptive and high order SDE solvers

Andraž Jelinčič, James Foster, Patrick Kidger

Subjects: Numerical Analysis (math.NA); Machine Learning (cs.LG); Probability (math.PR); Computation (stat.CO)

Despite the success of adaptive time-stepping in ODE simulation, it has so far seen few applications for Stochastic Differential Equations (SDEs). To simulate SDEs adaptively, methods such as the Virtual Brownian Tree (VBT) have been developed, which can generate Brownian motion (BM) non-chronologically. However, in most applications, knowing only the values of Brownian motion is not enough to achieve a high order of convergence; for that, we must compute time-integrals of BM such as $\int_s^t W_r \, dr$. With the aim of using high order SDE solvers adaptively, we extend the VBT to generate these integrals of BM in addition to the Brownian increments. A JAX-based implementation of our construction is included in the popular Diffrax library (this https URL).
Since the entire Brownian path produced by VBT is uniquely determined by a single PRNG seed, previously generated samples need not be stored, which results in a constant memory footprint and enables experiment repeatability and strong error estimation. Based on binary search, the VBT's time complexity is logarithmic in the tolerance parameter $\varepsilon$. Unlike the original VBT algorithm, which was only precise at some dyadic times, we prove that our construction exactly matches the joint distribution of the Brownian motion and its time integrals at any query times, provided they are at least $\varepsilon$ apart.
We present two applications of adaptive high order solvers enabled by our new VBT. Using adaptive solvers to simulate a high-volatility CIR model, we achieve more than twice the convergence order of constant stepping. We apply an adaptive third order underdamped or kinetic Langevin solver to an MCMC problem, where our approach outperforms the No U-Turn Sampler, while using only a tenth of its function evaluations.
[49] arXiv:2405.14772 (replaced) [pdf, ps, other]: Title: Vortex-capturing multiscale spaces for the Ginzburg-Landau equation

Maria Blum, Christian Döding, Patrick Henning

Subjects: Numerical Analysis (math.NA)

This paper considers minimizers of the Ginzburg-Landau energy functional in particular multiscale spaces which are based on finite elements. The spaces are constructed by localized orthogonal decomposition techniques and their usage for solving the Ginzburg-Landau equation was first suggested in [Dörich, Henning, SINUM 2024]. In this work we further explore their approximation properties and give an analytical explanation for why vortex structures of energy minimizers can be captured more accurately in these spaces. We quantify the necessary mesh resolution in terms of the Ginzburg-Landau parameter $\kappa$ and a stabilization parameter $\beta \ge 0$ that is used in the construction of the multiscale spaces. Furthermore, we analyze how $\kappa$ affects the necessary locality of the multiscale basis functions and we prove that the choice $\beta=0$ typically yields the highest accuracy. Our findings are supported by numerical experiments.
[50] arXiv:2405.15344 (replaced) [pdf, ps, html, other]: Title: Adaptive Finite Element Method for a Nonlinear Helmholtz Equation with High Wave Number

Run Jiang, Haijun Wu, Yifeng Xu, Jun Zou

Subjects: Numerical Analysis (math.NA)

A nonlinear Helmholtz (NLH) equation with high frequencies and corner singularities is discretized by the linear finite element method (FEM). After deriving some wave-number-explicit stability estimates and the singularity decomposition for the NLH problem, a priori stability and error estimates are established for the FEM on shape regular meshes including the case of locally refined meshes. Then a posteriori upper and lower bounds using a new residual-type error estimator, which is equivalent to the standard one, are derived for the FE solutions to the NLH problem. These a posteriori estimates confirm a significant fact that is also valid for the NLH problem, namely that the residual-type estimator seriously underestimates the error of the FE solution in the preasymptotic regime, which was first observed by Babuška et al. [Int J Numer Methods Eng 40 (1997)] for a one-dimensional linear problem. Based on the new a posteriori error estimator, both the convergence and the quasi-optimality of the resulting adaptive finite element algorithm are proved for the first time for the NLH problem, when the initial mesh size lies in the preasymptotic regime. Finally, numerical examples are presented to validate the theoretical findings and demonstrate that applying the continuous interior penalty (CIP) technique with appropriate penalty parameters can reduce the pollution errors efficiently. In particular, the nonlinear phenomenon of optical bistability with Gaussian incident waves is successfully simulated by the adaptive CIPFEM.
[51] arXiv:2303.04045 (replaced) [pdf, ps, other]: Title: Observer-based data assimilation for barotropic gas transport using distributed measurements

Jan Giesselmann, Martin Gugat, Teresa Kunkel

Subjects: Analysis of PDEs (math.AP); Numerical Analysis (math.NA)

We consider a state estimation problem for gas pipeline flow modeled by the one-dimensional barotropic Euler equations. In order to reconstruct the system state, we construct an observer system of Luenberger type based on distributed measurements of one state variable. First, we show the existence of Lipschitz-continuous semi-global solutions of the observer system and of the original system for initial and boundary data satisfying smallness and compatibility conditions for a single pipe and for general networks. Second, based on an extension of the relative energy method we prove that the state of the observer system converges exponentially in the long time limit towards the original system state. We show this for a single pipe and for star-shaped networks.
[52] arXiv:2311.00907 (replaced) [pdf, ps, html, other]: Title: New vector transport operators extending a Riemannian CG algorithm to generalized Stiefel manifold with low-rank applications

Xuejie Wang, Kangkang Deng, Zheng Peng, Chengcheng Yan

Subjects: Optimization and Control (math.OC); Numerical Analysis (math.NA)

This paper proposes two innovative vector transport operators, leveraging the Cayley transform, for the generalized Stiefel manifold embedded with a non-standard metric. Specifically, it introduces the differentiated retraction and an approximation of the Cayley transform to the differentiated matrix exponential. These vector transports are demonstrated to satisfy the Ring-Wirth non-expansive condition under non-standard metrics, and one of them is also isometric. Building upon the novel vector transport operators, we extend the modified Polak-Ribière-Polyak (PRP) conjugate gradient method to the generalized Stiefel manifold. Under a non-monotone line search condition, we prove our algorithm globally converges to a stationary point. The efficiency of the proposed vector transport operators is empirically validated through numerical experiments involving generalized eigenvalue problems and canonical correlation analysis.
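As background for the Cayley transform mentioned above, the sketch below shows a Cayley-type retraction that keeps a point on the generalized Stiefel manifold $\{X : X^\top B X = I\}$. It assumes the update $Y = (I - \tfrac{t}{2}W)^{-1}(I + \tfrac{t}{2}W)X$ with $W = AB$ and $A$ skew-symmetric, which makes $BW$ skew-symmetric and hence preserves the constraint. This is a generic construction chosen for illustration, not the vector transport operators proposed in the paper; the matrices `B`, `X`, `A` are hypothetical, and exact rational arithmetic is used so the constraint can be checked without rounding.

```python
from fractions import Fraction

def matmul(P, Q):
    """Matrix product of two list-of-lists matrices."""
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*Q)] for row in P]

def transpose(P):
    return [list(r) for r in zip(*P)]

def solve(M, R):
    """Solve M Z = R by Gauss-Jordan elimination (exact when entries are Fractions)."""
    n = len(M)
    aug = [list(M[i]) + list(R[i]) for i in range(n)]
    for c in range(n):
        p = next(r for r in range(c, n) if aug[r][c] != 0)  # pivot row
        aug[c], aug[p] = aug[p], aug[c]
        piv = aug[c][c]
        aug[c] = [x / piv for x in aug[c]]
        for r in range(n):
            if r != c and aug[r][c] != 0:
                f = aug[r][c]
                aug[r] = [x - f * y for x, y in zip(aug[r], aug[c])]
    return [row[n:] for row in aug]

def cayley_retraction(X, B, A, t):
    """Y = (I - t/2 W)^{-1} (I + t/2 W) X with W = A B and A skew-symmetric."""
    n = len(B)
    W = matmul(A, B)
    half = Fraction(t) / 2
    eye = [[Fraction(int(i == j)) for j in range(n)] for i in range(n)]
    left = [[eye[i][j] - half * W[i][j] for j in range(n)] for i in range(n)]
    right = [[eye[i][j] + half * W[i][j] for j in range(n)] for i in range(n)]
    return solve(left, matmul(right, X))

# Hypothetical data: B is symmetric positive definite and X satisfies X^T B X = I.
B = [[1, 0, 0], [0, 4, 0], [0, 0, 9]]
X = [[Fraction(1), Fraction(0)], [Fraction(0), Fraction(1, 2)], [Fraction(0), Fraction(0)]]
A = [[0, 1, -2], [-1, 0, 3], [2, -3, 0]]  # skew-symmetric direction
Y = cayley_retraction(X, B, A, Fraction(1, 2))
gram = matmul(matmul(transpose(Y), B), Y)  # should equal the 2x2 identity
```

Because $B$ is positive definite and $BW$ is skew-symmetric, the matrix $I - \tfrac{t}{2}W$ is invertible for every real $t$, so the linear solve does not break down.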
[53] arXiv:2402.03460 (replaced) [pdf, ps, html, other]: Title: Approximation Rates and VC-Dimension Bounds for (P)ReLU MLP Mixture of Experts

Anastasis Kratsios, Haitz Sáez de Ocáriz Borde, Takashi Furuya, Marc T. Law

Subjects: Machine Learning (stat.ML); Machine Learning (cs.LG); Neural and Evolutionary Computing (cs.NE); Combinatorics (math.CO); Numerical Analysis (math.NA)

Mixtures-of-Experts (MoEs) can scale up beyond traditional deep learning models by employing a routing strategy in which each input is processed by a single "expert" deep learning model. This strategy allows us to scale up the number of parameters defining the MoE while maintaining sparse activation, i.e., MoEs only load a small number of their total parameters into GPU VRAM for the forward pass depending on the input. In this paper, we provide an approximation and learning-theoretic analysis of mixtures of expert MLPs with (P)ReLU activation functions. We first prove that for every error level $\varepsilon>0$ and every Lipschitz function $f:[0,1]^n\to \mathbb{R}$, one can construct a MoMLP model (a Mixture-of-Experts consisting of (P)ReLU MLPs) which uniformly approximates $f$ to $\varepsilon$ accuracy over $[0,1]^n$, while only requiring networks of $\mathcal{O}(\varepsilon^{-1})$ parameters to be loaded in memory. Additionally, we show that MoMLPs can generalize since the entire MoMLP model has a (finite) VC dimension of $\tilde{O}(L\max\{nL,JW\})$, if there are $L$ experts and each expert has a depth and width of $J$ and $W$, respectively.
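The routing idea, where each input is handled by exactly one expert and only that expert's parameters are touched, can be sketched with a one-dimensional toy model. The construction below is an assumption made for illustration (equal-width cells on $[0,1]$, one tiny ReLU unit per expert interpolating $f$ on its cell); it is not the construction used in the paper's proofs, and the names `make_experts`, `route`, and `moe_forward` are hypothetical.

```python
import math

def relu(z):
    return max(0.0, z)

def make_experts(f, num_experts):
    """One expert per cell [k*h, (k+1)*h]; each stores only (left, bias, slope)."""
    h = 1.0 / num_experts
    experts = []
    for k in range(num_experts):
        left = k * h
        slope = (f(left + h) - f(left)) / h
        experts.append((left, f(left), slope))
    return experts

def route(x, num_experts):
    """Hard routing: index of the single expert responsible for x in [0, 1]."""
    return min(int(x * num_experts), num_experts - 1)

def moe_forward(x, experts):
    """Forward pass that reads the parameters of exactly one expert."""
    left, bias, slope = experts[route(x, len(experts))]
    # On its own cell x >= left, so the ReLU unit acts as a linear interpolant.
    return bias + slope * relu(x - left)

# Approximate sin (Lipschitz constant 1) with 100 experts on a grid of test points.
experts = make_experts(math.sin, 100)
max_err = max(abs(moe_forward(i / 1000, experts) - math.sin(i / 1000))
              for i in range(1001))
```

For an $L_f$-Lipschitz target, linear interpolation on a cell of width $h$ has uniform error at most $L_f h$, so adding experts reduces the error while the number of parameters read per input stays fixed at three.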


New submissions for Tuesday, 28 May 2024 (showing 20 of 20 entries )

Cross submissions for Tuesday, 28 May 2024 (showing 15 of 15 entries )

Replacement submissions for Tuesday, 28 May 2024 (showing 18 of 18 entries )