Computational Finance
- [1] arXiv:2405.13102 (cross-list from cs.GT) [pdf, ps, html, other]
Title: Trading Volume Maximization with Online Learning
Subjects: Computer Science and Game Theory (cs.GT); Machine Learning (cs.LG); Computational Finance (q-fin.CP)
We explore brokerage between traders in an online learning framework. At any round $t$, two traders meet to exchange an asset, provided the exchange is mutually beneficial. The broker proposes a trading price, and each trader tries to sell their asset or buy the asset from the other party, depending on whether the price is higher or lower than their private valuations. A trade happens if one trader is willing to sell and the other is willing to buy at the proposed price. Previous work provided guidance to a broker aiming at enhancing traders' total earnings by maximizing the gain from trade, defined as the sum of the traders' net utilities after each interaction. In contrast, we investigate how the broker should behave to maximize the trading volume, i.e., the total number of trades. We model the traders' valuations as an i.i.d. process with an unknown distribution. If the traders' valuations are revealed after each interaction (full-feedback), and the traders' valuations cumulative distribution function (cdf) is continuous, we provide an algorithm achieving logarithmic regret and show its optimality up to constant factors. If only their willingness to sell or buy at the proposed price is revealed after each interaction ($2$-bit feedback), we provide an algorithm achieving poly-logarithmic regret when the traders' valuations cdf is Lipschitz and show that this rate is near-optimal. We complement our results by analyzing the implications of dropping the regularity assumptions on the unknown traders' valuations cdf. If we drop the continuous cdf assumption, the regret rate degrades to $\Theta(\sqrt{T})$ in the full-feedback case, where $T$ is the time horizon. If we drop the Lipschitz cdf assumption, learning becomes impossible in the $2$-bit feedback case.
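To make the trading rule concrete: for i.i.d. valuations with a continuous cdf $F$, a price $p$ leads to a trade exactly when it falls between the two valuations, which happens with probability $2F(p)(1-F(p))$, and this is largest at any median of $F$. The sketch below is only an illustration of that observation under full feedback; the function names, the "post the empirical median" rule, and the uniform valuations are assumptions made here and are not claimed to be the algorithm analyzed in the paper.

```python
import random
import statistics

def trade_happens(price, v, w):
    """A trade occurs when the proposed price lies between the two valuations."""
    return min(v, w) <= price <= max(v, w)

def run_median_broker(horizon, sample_valuation, rng, first_price=0.5):
    """Hypothetical full-feedback broker: post the empirical median of past valuations."""
    seen = []
    trades = 0
    for _ in range(horizon):
        # Before any data is available, fall back to an arbitrary first price.
        price = statistics.median(seen) if seen else first_price
        v, w = sample_valuation(rng), sample_valuation(rng)
        trades += trade_happens(price, v, w)
        seen.extend((v, w))  # full feedback: both valuations are revealed
    return trades

if __name__ == "__main__":
    rng = random.Random(0)
    # Uniform valuations on [0, 1] are an assumption for this demo only.
    print(run_median_broker(2000, lambda r: r.random(), rng))
```

With uniform valuations the best fixed price is the median $1/2$, which trades with probability $1/2$, so the count printed for 2000 rounds should land near 1000.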
- [2] arXiv:2405.13513 (cross-list from q-fin.RM) [pdf, ps, html, other]
Title: An Asymptotic CVaR Measure of Risk for Markov Chains
Comments: 9 pages, 5 figures
Subjects: Risk Management (q-fin.RM); Computational Finance (q-fin.CP)
Risk-sensitive decision making finds important applications in current-day use cases. Existing risk measures consider a single or finite collection of random variables, and so do not account for the asymptotic behaviour of underlying systems. Conditional Value at Risk (CVaR) is the most commonly used risk measure, and has been extensively utilized for modelling rare events in finite-horizon scenarios. Naive extension of existing risk criteria to asymptotic regimes faces fundamental challenges, where basic assumptions of existing risk measures fail. We present a complete simulation-based approach for sequentially computing Asymptotic CVaR (ACVaR), a risk measure we define on limiting empirical averages of Markovian rewards. Large deviations theory, density estimation, and two-time scale stochastic approximation are utilized to define a 'tilted' probability kernel on the underlying state space to facilitate ACVaR simulation. Our algorithm enjoys theoretical guarantees, and we numerically evaluate its performance over a variety of test cases.
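For orientation, the ordinary finite-sample CVaR that this abstract contrasts with can be sketched as follows. This is one common empirical convention (the quantile index and the "losses at or above VaR" tail are choices made here); it is not the ACVaR procedure of the paper, and the function name is hypothetical.

```python
import math

def empirical_var_cvar(losses, alpha):
    """Return (VaR, CVaR) at level alpha for a finite sample of losses.

    VaR is taken as the empirical alpha-quantile; CVaR is the average of
    the sampled losses at or above that quantile.
    """
    ordered = sorted(losses)
    k = max(math.ceil(alpha * len(ordered)) - 1, 0)  # index of the empirical quantile
    tail = ordered[k:]
    return ordered[k], sum(tail) / len(tail)

if __name__ == "__main__":
    print(empirical_var_cvar([1, 2, 3, 4], 0.5))
```

On the four losses 1, 2, 3, 4 at level 0.5, the quantile is 2 and the tail average is (2 + 3 + 4) / 3 = 3.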
- [3] arXiv:2405.13609 (cross-list from cs.LG) [pdf, ps, html, other]
Title: Tackling Decision Processes with Non-Cumulative Objectives using Reinforcement Learning
Subjects: Machine Learning (cs.LG); Computational Finance (q-fin.CP); Quantum Physics (quant-ph)
Markov decision processes (MDPs) are used to model a wide variety of applications ranging from game playing and robotics to finance. Their optimal policy typically maximizes the expected sum of rewards given at each step of the decision process. However, a large class of problems does not fit straightforwardly into this framework: non-cumulative Markov decision processes (NCMDPs), where instead of the expected sum of rewards, the expected value of an arbitrary function of the rewards is maximized. Example functions include the maximum of the rewards or their mean divided by their standard deviation. In this work, we introduce a general mapping of NCMDPs to standard MDPs. This allows all techniques developed to find optimal policies for MDPs, such as reinforcement learning or dynamic programming, to be directly applied to the larger class of NCMDPs. Focusing on reinforcement learning, we show applications in a diverse set of tasks, including classical control, portfolio optimization in finance, and discrete optimization problems. With our approach, we can improve both final performance and training time compared to relying on standard MDPs.
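One way to see how a non-cumulative objective can be turned into a cumulative one is the maximum of the rewards: if the running maximum is carried along as extra state, the per-step increase of that maximum is a reward whose plain sum telescopes to the overall maximum. The sketch below illustrates only this special case with a hypothetical helper; it is an assumption made for illustration and is not claimed to be the general mapping introduced in the paper.

```python
def max_objective_rewards(rewards):
    """Rewrite a reward sequence so that its plain sum equals max(rewards).

    The running maximum plays the role of the extra state variable.
    """
    adjusted = []
    running_max = None
    for r in rewards:
        if running_max is None:
            adjusted.append(r)  # first step: the maximum so far is r itself
            running_max = r
        else:
            new_max = max(running_max, r)
            adjusted.append(new_max - running_max)  # increase of the running maximum
            running_max = new_max
    return adjusted

if __name__ == "__main__":
    print(max_objective_rewards([1, 3, 2, 5, 4]))
```

For the rewards 1, 3, 2, 5, 4 the adjusted rewards are 1, 2, 0, 2, 0, which sum to 5, the maximum of the original sequence.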
Cross submissions for Friday, 24 May 2024 (showing 3 of 3 entries)
- [4] arXiv:2207.00949 (replaced) [pdf, ps, html, other]
Title: Stochastic arbitrage with market index options
Comments: 19 pages, 7 figures, 7 tables
Subjects: Computational Finance (q-fin.CP)
Opportunities for stochastic arbitrage in an options market arise when it is possible to construct a portfolio of options which provides a positive option premium and which, when combined with a direct investment in the underlying asset, generates a payoff which stochastically dominates the payoff from the direct investment in the underlying asset. We provide linear and mixed-integer linear programs for computing the stochastic arbitrage opportunity providing the maximum option premium to an investor. We apply our programs to 18 years of data on monthly put and call options on the Standard & Poor's 500 index, confining attention to options with moderate moneyness, and using two specifications of the underlying asset return distribution, one symmetric and one skewed. The pricing of market index options with moderate moneyness appears to be broadly consistent with our skewed specification of market returns.
- [5] arXiv:2301.05886 (replaced) [pdf, ps, html, other]
Title: Efficient Risk Estimation for the Credit Valuation Adjustment
Comments: 35 pages, 2 figures
Subjects: Computational Finance (q-fin.CP)
The valuation of over-the-counter derivatives is subject to a series of valuation adjustments known as xVA, which pose additional risks for financial institutions. Associated risk measures, such as the value-at-risk of an underlying valuation adjustment, play an important role in managing these risks. Monte Carlo methods are often regarded as inefficient for computing such measures. As an example, we consider the value-at-risk of the Credit Valuation Adjustment (CVA-VaR), which can be expressed using a triple nested expectation. Traditional Monte Carlo methods are often inefficient at handling several nested expectations. Utilising recent developments in multilevel nested simulation for probabilities, we construct a hierarchical estimator of the CVA-VaR which reduces the computational complexity by 3 orders of magnitude compared to standard Monte Carlo.
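The cost of nested expectations is easiest to see in a toy case. The sketch below estimates a probability of the form $P(E[X \mid Y] > c)$ with plain nested Monte Carlo: every outer scenario needs its own batch of inner samples, so the work grows as the product of the two sample sizes. The Gaussian model, the function name, and the sample sizes are assumptions chosen for illustration; this is neither a CVA model nor the multilevel estimator of the paper.

```python
import random

def nested_mc_probability(n_outer, n_inner, threshold, rng):
    """Estimate P(E[X | Y] > threshold) with plain nested Monte Carlo.

    Toy model (assumption): Y ~ N(0, 1) and X | Y ~ N(Y, 1), so E[X | Y] = Y.
    """
    hits = 0
    for _ in range(n_outer):
        y = rng.gauss(0.0, 1.0)  # outer scenario
        # Inner simulation: approximate the conditional expectation given y.
        inner_mean = sum(rng.gauss(y, 1.0) for _ in range(n_inner)) / n_inner
        hits += inner_mean > threshold  # indicator evaluated on the inner estimate
    return hits / n_outer

if __name__ == "__main__":
    rng = random.Random(1)
    print(nested_mc_probability(2000, 50, 0.0, rng))
```

In this toy model the exact answer for a threshold of 0 is 0.5, and the run above already uses 2000 × 50 inner samples; adding a third level of nesting would multiply the cost again.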
- [6] arXiv:2401.17472 (replaced) [pdf, ps, html, other]
Title: Convergence of the deep BSDE method for stochastic control problems formulated through the stochastic maximum principle
Subjects: Optimization and Control (math.OC); Numerical Analysis (math.NA); Computational Finance (q-fin.CP)
It is well known that decision-making problems from stochastic control can be formulated by means of a forward-backward stochastic differential equation (FBSDE). Recently, Ji et al. (2022) proposed an efficient deep learning algorithm based on the stochastic maximum principle (SMP). In this paper, we provide a convergence result for this deep SMP-BSDE algorithm and compare its performance with other existing methods. In particular, by adopting a strategy as in Han and Long (2020), we derive an a posteriori estimate and show that the total approximation error can be bounded by the value of the loss functional and the discretization error. We present numerical examples for high-dimensional stochastic control problems, in the cases of both drift and diffusion control, which showcase superior performance compared to existing algorithms.