Adaptive GMRES(m) for the Electromagnetic Scattering Problem

In this article, an adaptive version of restarted GMRES (GMRES(m)) is introduced for solving the finite difference approximation of the Helmholtz equation. It has been observed that the choice of the restart parameter m strongly affects the convergence of standard GMRES(m). To overcome this problem, GMRES(m) is formulated as a control problem that adaptively combines two strategies: (a) varying the restart parameter m when stagnation of the convergence is detected; and (b) augmenting the search subspace with vectors obtained at previous cycles. The proposal is compared with similar iterative methods from the literature based on standard GMRES(m) with fixed parameters. Numerical results for selected matrices suggest that the proposed adaptive switching method can overcome the stagnation observed in standard methods, and even improve performance in terms of computational time and memory requirements.


INTRODUCTION
This article deals with the numerical resolution of a Helmholtz scattering problem by an adaptive version of restarted GMRES. Solving the Helmholtz scattering equation with iterative methods is particularly difficult: the problem is ill-posed for a set of frequencies that physically correspond to the resonance modes of the domain, the discretization grid has to be refined as a function of the frequency of the operator, and the oscillatory and non-local structure of the solution affects the numerical methods [10]. As a consequence, fast methods (like multigrid) and preconditioners (like incomplete LU) fail to give fast convergence for discretizations of the Helmholtz equation; in fact, the improvement of numerical methods and preconditioners for this kind of problem is an active area of research [7,10,11].
The Generalized Minimal Residual method (GMRES) is an iterative method frequently selected for its robustness in problems whose discretization results in a large sparse non-Hermitian linear system [20]. This method approximates the solution of the linear system Au = f, where A ∈ C^{n×n} is nonsingular and u, f ∈ C^n. GMRES is a Krylov subspace method that obtains at each iteration an approximate solution minimizing the 2-norm of the residual, building an orthonormal basis of the Krylov subspace. To keep the computational and memory requirements of the orthogonalization process under control, GMRES is restarted: the dimension of the Krylov subspace is allowed to grow to a certain maximum m and then, using the obtained approximation of the solution, a new residual is computed and a new Krylov subspace is built. This procedure keeps the dimension of the Krylov subspace at most m, and consequently keeps the orthogonalization cost of each GMRES(m) cycle under control.
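The restart mechanics described above can be sketched with SciPy's built-in GMRES, which exposes the restart parameter m directly. The matrix below is a small synthetic non-symmetric system chosen only for illustration (it is not the cavity matrix of this paper, which was solved in MATLAB):

```python
import numpy as np
import scipy.sparse as sp
import scipy.sparse.linalg as spla

# Hypothetical sparse non-symmetric test system (not the cavity matrix).
n = 200
A = sp.diags([-1.0, 2.5, -1.3], [-1, 0, 1], shape=(n, n), format="csr")
rng = np.random.default_rng(0)
f = rng.standard_normal(n)

# GMRES(m): the Arnoldi basis is discarded and rebuilt from the current
# residual every `restart` iterations, capping orthogonalization cost and
# storage at m basis vectors per cycle.
m = 20
u, info = spla.gmres(A, f, restart=m, maxiter=3000)
residual = np.linalg.norm(f - A @ u) / np.linalg.norm(f)
```

For a well-conditioned system like this one, `info == 0` signals convergence to the default tolerance; for the Helmholtz matrices studied here, small fixed `restart` values can stall, which is the behavior the paper addresses.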
Unfortunately, convergence to the solution is not guaranteed if the fixed parameter m is chosen inappropriately, causing a slowdown or, even worse, stagnation of the rate of convergence [21]. If the convergence stagnates, a simple alternative consists of enlarging the maximum allowed dimension m, and thereby the subspace information. However, this strategy does not always ensure faster convergence [9,21]; it is necessary to include another kind of information or to modify the search subspace. Several adaptive strategies for modifying the restart parameter are encountered in the literature (see for instance [15]). In [12], the use of a stagnation test was proposed; [25] was based on the difference between the Ritz and harmonic Ritz values; [1] presented a simple strategy based on the angles between consecutive residual vectors for modifying the restart parameter; and [6] introduced a proportional-derivative control-inspired strategy for choosing the parameter m adaptively. Other modification strategies are hybrid iterative methods and acceleration techniques [2]. Our work falls into the category of augmented methods, which is a class of acceleration techniques. Some augmented methods are presented in [2,16,17,23].
In this work, we concentrate our efforts on the improvement of the convergence of the GMRES method itself. To this end, an adaptive control strategy is proposed which includes information from previous cycles for overcoming the stagnation. The proposed strategy is used for the resolution of a linear system of equations obtained from the discretization of an electromagnetic cavity problem [14]. For this problem, the standard GMRES(m) (and modified implementations using fixed parameters [2,16]) have a poor rate of convergence for large wave numbers [7]. Numerical results for the electromagnetic cavity problem with large wave numbers illustrate the efficiency of the proposed method. This paper is organized as follows. In §2, the formulation of GMRES(m) is introduced and the Electromagnetic Scattering Problem is characterized. In §3, the strategies for overcoming the stagnation are described. The numerical results (presented in §4 with conclusion in §5), show that the adaptive strategy improves the convergence of GMRES(m).
Throughout this paper, C^{n×m} is the set of all n × m complex matrices. I_n is the n × n identity matrix, and e_j is its j-th column. Given a matrix M, M^T denotes its transpose and M^* its conjugate (Hermitian) transpose. The notation ‖·‖ denotes the 2-norm for vectors and the induced norm for matrices. The inner product is denoted ⟨·, ·⟩.

GMRES(M) AND ITS MODIFICATIONS
GMRES(m) approximates the solution of the linear system at the j-th restart cycle using the previous residual, r_{j−1} = f − A u_{j−1}, to construct a Krylov subspace K_m(A, r_{j−1}) = span{r_{j−1}, A r_{j−1}, . . . , A^{m−1} r_{j−1}} of dimension m. The j-th approximation is built as

u_j = u_{j−1} + z_j,   z_j ∈ K_m(A, r_{j−1}),   (2.1)

where the index m denotes that the restart parameter was set to the value m. GMRES(m) obtains the approximate solution that minimizes the 2-norm of the residual r_j, i.e.,

min_{z_j ∈ K_m(A, r_{j−1})} ‖r_{j−1} − A z_j‖.   (2.2)

To solve this problem, the Arnoldi process is normally used to obtain an orthonormal basis of the Krylov subspace. At the j-th cycle, the first m steps of this procedure can be expressed as

A V_m = V_{m+1} H̄_m,   (2.3)

where V_m ∈ C^{n×m} and V_{m+1} := [V_m v_{m+1}] ∈ C^{n×(m+1)} have orthonormal columns and H̄_m ∈ C^{(m+1)×m} is the upper Hessenberg matrix formed by a square Hessenberg matrix H_m of dimension m × m and an entry h_{m+1,m} placed at position (m + 1, m). If the Arnoldi process starts with v_1 = (1/β) r_{j−1}, where β = ‖r_{j−1}‖, then by construction the columns of V_m are an orthonormal basis of the subspace K_m(A, r_{j−1}).
The approximate solution u_j is obtained by u_j = u_{j−1} + V_m y_j [20], and expression (2.2) becomes

min_{y_j} ‖r_{j−1} − A V_m y_j‖.   (2.4)

Therefore, the approximate solution u_j minimizes the residual norm. GMRES(m) does not maintain orthogonality between the approximation spaces generated at successive restarts [2,20]. As a result, GMRES(m) exhibits a slower rate of convergence than its counterpart, full GMRES. A problem arises when the residual norm does not decrease adequately between consecutive cycles, i.e., when there is an abrupt slowdown in the rate of convergence. A more extreme situation appears when the convergence stagnates, i.e., when the reduction of the residual norm at consecutive cycles is very small, ‖r_j‖ ≈ ‖r_{j−1}‖. In that case the residual vectors point in nearly the same direction at the end of every restart cycle, ∠(r_j, r_{j−1}) ≈ 0, meaning that GMRES(m) did not achieve a large decrease of the residual norm between consecutive cycles [2,6].
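The Arnoldi process and the small least-squares problem (2.4) can be sketched as follows; the matrix and residual below are hypothetical real-valued stand-ins (the cavity matrices are complex), and modified Gram-Schmidt is one standard choice of orthogonalization:

```python
import numpy as np

def arnoldi(A, r, m):
    """m steps of Arnoldi: A @ V[:, :m] = V @ H, with H an (m+1) x m upper
    Hessenberg matrix and V having orthonormal columns (modified Gram-Schmidt)."""
    n = r.shape[0]
    V = np.zeros((n, m + 1))
    H = np.zeros((m + 1, m))
    V[:, 0] = r / np.linalg.norm(r)          # v_1 = (1/beta) r
    for k in range(m):
        w = A @ V[:, k]
        for i in range(k + 1):               # orthogonalize against v_1..v_{k+1}
            H[i, k] = V[:, i] @ w
            w -= H[i, k] * V[:, i]
        H[k + 1, k] = np.linalg.norm(w)
        if H[k + 1, k] < 1e-14:              # lucky breakdown: exact invariant subspace
            return V[:, : k + 1], H[: k + 2, : k + 1]
        V[:, k + 1] = w / H[k + 1, k]
    return V, H

# Hypothetical system; solve (2.4) as the equivalent small least-squares
# problem min_y ||beta*e_1 - H y|| in the Arnoldi basis.
rng = np.random.default_rng(1)
n, m = 50, 10
A = rng.standard_normal((n, n)) + 5 * np.eye(n)
r = rng.standard_normal(n)
V, H = arnoldi(A, r, m)
beta = np.linalg.norm(r)
e1 = np.zeros(H.shape[0]); e1[0] = beta
y, *_ = np.linalg.lstsq(H, e1, rcond=None)
```

Because the columns of V are orthonormal, ‖r − A V_m y‖ = ‖β e_1 − H̄_m y‖, so the n-dimensional minimization in (2.4) reduces to an (m+1) × m least-squares problem.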
There are several strategies for modifying GMRES(m) in order to accelerate convergence and avoid stagnation. In this paper, we focus on the so-called augmented methods. The general idea consists in determining a subspace of dimension s as the direct sum of two spaces of smaller dimension, denoted span{ν_1, . . . , ν_m, ω_1, . . . , ω_k}, where the first m vectors are determined by the standard Arnoldi procedure starting from the normalized current residual r_{j−1}/‖r_{j−1}‖, while the remaining vectors, which constitute the augmented part, contain relevant information saved from previous cycles. Two alternative methods are explored. The first one is GMRES-E(m, d), proposed in [16]. This method computes {ω_1, . . . , ω_d} as the harmonic Ritz vectors associated with the smallest harmonic Ritz values. This strategy seems to be particularly effective when a priori information on the problem confirms the presence of a group of relatively small eigenvalues, which occurs in the Electromagnetic Scattering Problem [7]. The second method is LGMRES(m, l), proposed in [2]. In this case, at each restart cycle an approximate solution is constructed using the first m vectors ν_1, . . . , ν_m and an additional basis ω_1, . . . , ω_l, in which each vector contains error information from one of the l previously built subspaces. Next, both methods are described.
GMRES-E(m, d): Including approximate eigenvectors. The goal is to eliminate the components that supposedly slow down convergence [16]. The strategy consists of enriching the subspace with eigenvectors associated with the problematic eigenvalues. In practice, the approximate eigenvectors are the harmonic Ritz vectors associated with the harmonic Ritz values of each cycle [3,16,18].
At the j-th cycle, a harmonic Ritz value λ̃_k with associated harmonic Ritz vector ϑ_k = W_s g_k, where g_k ∈ C^s, with respect to the subspace A K_s(A, r_{j−1}), satisfies the orthogonality condition

A ϑ_k − λ̃_k ϑ_k ⊥ A K_s(A, r_{j−1}).   (2.6)

This condition is exploited through the Arnoldi-like relation

A W_s = V_{s+1} H̄_s,   (2.7)

where W_s is an n × s matrix whose first m columns are Arnoldi vectors and whose last d columns correspond to the approximate eigenvectors ϑ_k; V_{s+1} is the n × (s + 1) orthonormal matrix whose first m + 1 columns are the Arnoldi vectors and whose last d columns result from orthogonalizing the d approximate eigenvectors (ϑ_k, k = 1, . . . , d) against the previous columns of Arnoldi vectors; and H̄_s is an (s + 1) × s upper Hessenberg matrix whose elements are constructed by the new orthogonalization process. At the j-th cycle, the reduced generalized eigenvalue problem is defined using expression (2.6) and the Arnoldi relation (2.7).
The reduced problem reads

H̄_s^* H̄_s g_k = λ̃_k H_s^* g_k,   (2.8)

where H̄_s^* is the conjugate transpose of the upper Hessenberg matrix, of dimension s × (s + 1), and H_s^* is the conjugate transpose of the square Hessenberg matrix, of dimension s × s. Usually the value of s is much smaller than n. These eigenvalues λ̃_k are called harmonic Ritz values and are the roots of the GMRES residual polynomial [21]; the eigenvectors associated with them are called harmonic Ritz vectors. The vectors g_k associated with the smallest λ̃_k are needed to deflate the smallest eigenvalues and thus improve the convergence [16,17]. Since the harmonic Ritz vectors are useful only at a specific cycle, they need to be recomputed at each cycle. For this reason, only the residuals are stored to be reused in the next cycle [16].
At the end of the j-th restart cycle, GMRES-E(m, d) seeks an approximate solution u_j of the form

u_j = u_{j−1} + W_s y_j,   (2.9)

such that y_j is obtained by solving the minimization problem

min_{y_j} ‖β e_1 − H̄_s y_j‖,   (2.10)

where β = ‖r_{j−1}‖. The sequence of residual norms of GMRES-E(m, d) is non-increasing, but convergence cannot be guaranteed [16,18].
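The reduced generalized eigenvalue problem H̄_s^* H̄_s g = λ̃ H_s^* g can be solved directly with a dense generalized eigensolver, since s is small. The sketch below uses a random upper Hessenberg matrix as a hypothetical stand-in for the one produced by the augmented Arnoldi process:

```python
import numpy as np
import scipy.linalg as sla

def harmonic_ritz(Hbar):
    """Harmonic Ritz pairs from the (s+1) x s upper Hessenberg matrix of the
    Arnoldi relation: solve Hbar^* Hbar g = lam H^* g, with H the leading
    s x s block, sorted by increasing |lam| (the smallest pairs are the ones
    used for deflation in GMRES-E)."""
    s = Hbar.shape[1]
    H = Hbar[:s, :]
    lam, G = sla.eig(Hbar.conj().T @ Hbar, H.conj().T)
    order = np.argsort(np.abs(lam))
    return lam[order], G[:, order]

# Hypothetical small Hessenberg matrix (s = 8).
rng = np.random.default_rng(2)
s = 8
Hbar = np.triu(rng.standard_normal((s + 1, s)), -1)
lam, G = harmonic_ritz(Hbar)
```

The harmonic Ritz vectors ϑ_k = W_s g_k are then recovered by multiplying the returned g_k by the basis W_s; in the paper's experiments this small eigenproblem is instead delegated to MATLAB's eigs.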
LGMRES(m, l): Including error approximations in the search subspace. The motivation of LGMRES(m, l) is to prevent an alternating behavior observed in the GMRES(m) residuals at consecutive cycles, which deteriorates convergence [2].
LGMRES(m, l) includes approximations to the error in the current search subspace. The error approximation at the j-th restart cycle is defined using the approximate solutions of previous cycles as

ϕ_j ≡ u_j − u_{j−1},   (2.11)

and ϕ_j ≡ 0 for j < 1. This error approximation vector is used to augment the search subspace K_m(A, r_j) at the next cycle. Note that ϕ_j ∈ K_m(A, r_{j−1}). Therefore, the error approximation ϕ_j in some sense represents the space K_m(A, r_{j−1}) generated in the current cycle and subsequently discarded by the restarting procedure; hence it is a natural choice for enriching the next approximation space K_m(A, r_j).

The augmented approximation space
This method, instead of using K_m(A, r_{j−1}) in equation (2.1), uses the augmented subspace S = K_m(A, r_{j−1}) ⊕ span{ϕ_{j−1}, . . . , ϕ_{j−l}}. The matrix V_{s+1} is the n × (s + 1) orthonormal matrix whose first m + 1 columns are the Arnoldi vectors and whose last l columns result from orthogonalizing the l error approximation vectors (ϕ_k, k = j − l, . . . , j − 1) against the previous columns of Arnoldi vectors. W_s is an n × s matrix whose first m columns are the Arnoldi vectors and whose last l columns are the l most recent error approximations (typically normalized so that all columns have unit length). Then the Arnoldi-like relation

A W_s = V_{s+1} H̄_s   (2.12)

holds at the j-th cycle of LGMRES(m, l), where H̄_s denotes an (s + 1) × s upper Hessenberg matrix. In practice, l ≪ m; following [2], l ≤ 3 is a good choice for LGMRES(m, l). Similarly to expressions (2.9) and (2.10), the approximate solution at the j-th cycle is u_j = u_{j−1} + W_s y_j, where y_j minimizes the residual norm ‖r_j‖.
Remark. It is important to note that LGMRES is not helpful in the following situations: (a) when the GMRES(m) skip angles ∠(r_j, r_{j−2}) are not small; (b) when the GMRES(m) sequential angles ∠(r_j, r_{j−1}) vary greatly from cycle to cycle; (c) when GMRES(m) converges in a small number of iterations; or (d) when the GMRES(m) skip angles and sequential angles are near zero (indicating stagnation).
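The sequential and skip angles used in this remark are simple to compute; a minimal sketch (with made-up two-dimensional residual vectors for illustration) is:

```python
import numpy as np

def residual_angle(r_a, r_b):
    """Angle (in radians) between two residual vectors. Sequential angles
    angle(r_j, r_{j-1}) near zero indicate that a restart cycle barely
    rotated the residual direction, i.e., stagnation."""
    c = abs(np.vdot(r_a, r_b)) / (np.linalg.norm(r_a) * np.linalg.norm(r_b))
    return float(np.arccos(np.clip(c, 0.0, 1.0)))  # clip guards rounding

# Nearly parallel residuals (stagnation) vs. orthogonal ones (good progress).
stagnant = residual_angle(np.array([1.0, 0.0]), np.array([0.999, 0.001]))
healthy = residual_angle(np.array([1.0, 0.0]), np.array([0.0, 1.0]))
```

Using |⟨r_a, r_b⟩| makes the diagnostic insensitive to a sign flip of the residual, and the `clip` protects `arccos` from floating-point values marginally above 1.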
LGMRES is not typically a substitute for preconditioning and does not help when a problem stagnates for a given restart parameter. Possible improvements to the algorithm include a robust adaptive variant [2].
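SciPy ships an implementation of this algorithm, scipy.sparse.linalg.lgmres, where `inner_m` plays the role of m and `outer_k` the number l of augmentation vectors carried between cycles. The system below is a synthetic stand-in, not the cavity matrix:

```python
import numpy as np
import scipy.sparse as sp
import scipy.sparse.linalg as spla

# Hypothetical sparse non-symmetric system for illustration.
n = 300
A = sp.diags([-1.0, 3.0, -1.5], [-1, 0, 1], shape=(n, n), format="csr")
f = np.ones(n)

# LGMRES(m, l) with m = inner_m = 10 Arnoldi vectors per cycle and
# l = outer_k = 3 error-approximation vectors kept across restarts.
u, info = spla.lgmres(A, f, inner_m=10, outer_k=3, maxiter=1000)
```

As the remark above warns, this augmentation accelerates cycles that are making slow but steady progress; it does not by itself rescue a run that stagnates for a given restart parameter.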
A-LGMRES-E(m, d, l): Including approximate eigenvectors and error approximations simultaneously with an adaptive restart parameter. The proposed method, an adaptive version of GMRES(m) denoted A-LGMRES-E(m, d, l), is inspired by the augmented method presented in [18], which combines GMRES-E(m, d) and LGMRES(m, l) with fixed parameters. In problems with stagnation, according to remark (d) of LGMRES, the error approximation vectors do not help to improve the rate of convergence, i.e., ϕ_{j−1} = u_{j−1} − u_{j−2} ≈ 0. Hence, these error vectors are discarded and only the harmonic Ritz vectors are kept to enrich the search subspace.
As can be seen in the numerical results of LGMRES-E, keeping m constant and enriching the search subspace does not necessarily avoid stagnation. To solve this problem, a controller that augments the size of the search subspace is proposed for enlarging the Arnoldi basis, since decreasing the restart parameter does not contribute to an improvement in the convergence [4]. This is done by adding a positive integer value α to the value of m. Thus the search subspace is formed by m_{j−1} + α Arnoldi vectors and d harmonic Ritz vectors, where m_{j−1} is the restart parameter of the previous cycle. The dimension of the search subspace at the j-th cycle is defined by the following rule:

if ‖y_{j−1}‖ ≥ δ :  m_j = m_{j−1},      s_j = m_j + l + d;
if ‖y_{j−1}‖ < δ :  m_j = m_{j−1} + α,  s_j = m_j + d;   (2.13)

where δ is the stagnation threshold parameter; convergence is considered slow when ‖y_{j−1}‖ < δ. The values of α and δ are constant and are selected according to previous works and numerical experiments (see Section 4). According to this rule, the matrix W_s at the j-th cycle is formed as

W_s = [v_1, . . . , v_{m_j}, ϕ_{j−1}, . . . , ϕ_{j−l}, ϑ_1, . . . , ϑ_d]  if ‖y_{j−1}‖ ≥ δ;
W_s = [v_1, . . . , v_{m_j}, ϑ_1, . . . , ϑ_d]                           if ‖y_{j−1}‖ < δ.   (2.14)

Finally, the approximate solution at the j-th cycle is

u_j = u_{j−1} + W_s y_j,   (2.15)

where y_j is the vector that minimizes the residual norm ‖r_j‖ and W_s is an n × s_j matrix containing the Krylov subspace basis enriched with information from previous cycles. Baker [2] and Morgan [16] suggest values between 1 and 3 for the number of error approximations l and of harmonic Ritz vectors d, respectively, since larger values do not significantly reduce the number of cycles necessary for convergence. In this work, the values l = 1 and d = 3 are chosen following these suggestions and giving more importance to the harmonic Ritz vectors, since the error approximations accelerate convergence but do not avoid stagnation.
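The switching rule (2.13) reduces to a few lines of control logic. The sketch below is a minimal rendering of that rule; the clamping to m_max follows the bound m_min ≤ m_j ≤ m_max stated in the experiments section, and all parameter values in the example are illustrative:

```python
def update_subspace(m_prev, y_norm, delta, alpha, l, d, m_max):
    """One application of rule (2.13): on slow convergence (||y_{j-1}|| < delta)
    grow the Arnoldi basis by alpha and drop the l error-approximation vectors,
    keeping only the d harmonic Ritz vectors; otherwise keep everything."""
    if y_norm < delta:                       # stagnation detected
        m_j = min(m_prev + alpha, m_max)     # enlarge restart parameter
        s_j = m_j + d                        # Arnoldi basis + Ritz vectors only
    else:                                    # normal progress
        m_j = m_prev
        s_j = m_j + l + d                    # error vectors kept as well
    return m_j, s_j
```

For instance, with m_{j−1} = 10, δ = 0.5, α = 3, l = 1 and d = 3, a healthy cycle (‖y_{j−1}‖ = 0.9) keeps m_j = 10 with s_j = 14, while a stagnating one (‖y_{j−1}‖ = 0.1) moves to m_j = 13 with s_j = 16.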
When the matrix is non-Hermitian and non-normal, as in the problem addressed in this work, the best selection of the aforementioned parameters is difficult to obtain, since the a priori behavior of the iterative method with respect to the parameters is not completely understood.
The pseudocode for the j-th cycle of the proposed method, denoted A-LGMRES-E(m, d, l), is presented in Algorithm 1.

AN ELECTROMAGNETIC CAVITY PROBLEM
The electromagnetic problem shown in Figure 1 is restricted to a 2-D geometry by assuming that the medium is invariant in the z-direction and nonmagnetic, with constant magnetic permeability µ(x, y) = µ_0. The ground plane (the x-axis) and the walls of the cavity are perfect electric conductors, and the interior of the cavity is filled with an inhomogeneous material characterized by its relative permittivity ε_r(x, y) [7].

Under these assumptions, the scattering problem yields the Helmholtz equation (3.1), with a Dirichlet condition (3.2) on the conducting walls and a non-local boundary condition (3.3) on the aperture accounting for Sommerfeld's radiation condition imposed at infinity:

∆u + k_0² ε_r u = f   in Ω,   (3.1)
u = 0   on S,   (3.2)
∂u/∂y = T(u) + g(x)   on Γ,   (3.3)
where k_0 is the free-space wave number, Ω = [0, a] × [−b, 0] ⊂ R² is the problem domain, f is the source term (with f = 0 in free space), S denotes the walls of the cavity, T is a non-local boundary operator, Γ is the aperture between the cavity and free space, and g(x) = −2iβ e^{iαx}.
Discretizing equation (3.1) by second-order finite differences yields the standard 5-point stencil at the interior grid points, while the discrete form of the non-local boundary condition (3.3) couples all the unknowns on the aperture Γ. For more details about the resulting system of equations, see [7].
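The interior part of the discretization can be sketched by assembling the 5-point Laplacian plus the k_0² ε_r term with Kronecker products. This is a simplified illustration: it imposes homogeneous Dirichlet conditions on all four sides and omits the non-local aperture operator T, which in the actual problem makes the matrix non-Hermitian; the grid sizes and permittivity below are hypothetical:

```python
import numpy as np
import scipy.sparse as sp

def helmholtz_interior(M, N, a, b, k0, eps_r):
    """Sketch of the interior 5-point discretization of (3.1) on an M x N grid
    over [0, a] x [-b, 0], with Dirichlet walls only (the non-local aperture
    condition (3.3) is omitted here for brevity)."""
    hx, hy = a / (M + 1), b / (N + 1)
    Dx = sp.diags([1.0, -2.0, 1.0], [-1, 0, 1], shape=(M, M)) / hx**2
    Dy = sp.diags([1.0, -2.0, 1.0], [-1, 0, 1], shape=(N, N)) / hy**2
    L = sp.kron(sp.eye(N), Dx) + sp.kron(Dy, sp.eye(M))   # 2-D Laplacian
    return (L + k0**2 * sp.diags(eps_r.ravel())).tocsc()

# Hypothetical cavity-like parameters: 40 x 10 grid, k0 = 30, eps_r = 1.5.
A = helmholtz_interior(40, 10, 1.0, 0.25, 30.0, np.full((10, 40), 1.5))
```

Even this simplified operator becomes indefinite once k_0² ε_r exceeds the smallest eigenvalues of −∆, which is the source of the difficulties for iterative solvers discussed in §1.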
The discretization of the Helmholtz problem (3.1)-(3.3) by finite differences yields a linear system of equations Au = f in which the coefficient matrix A is non-Hermitian and highly indefinite for large values of the wave number k_0 [11]. In addition, the obtained linear system is ill-conditioned, and the growth of the mesh size (M and N) leads to very large matrices, and hence to high computational costs. Usually, direct methods do not perform well, and a general iterative method is required. GMRES is normally used in this context (non-Hermitian and indefinite matrices [20]), but due to its high computational and memory requirements the restarted GMRES(m) is used instead. Unfortunately, as mentioned in the introduction, §1, the possible convergence problems of GMRES(m) are exacerbated by the particularities of the Helmholtz problem described above. This is the reason why A-LGMRES-E(m, d, l) is designed and numerically analyzed for the resolution of this problem.
The mesh size is determined by accuracy requirements of the discretization. The quantities

n_x = 2πM / (k_0 a),   n_y = 2πN / (k_0 b),   (3.8)

are the numbers of mesh points per wavelength in the two directions. A commonly employed engineering rule [8] states that, for a second-order finite difference scheme, on the order of ten points per wavelength are required to obtain satisfactory results. This rule is used in this work; hence the number of points per wavelength is kept constant, which means that the grid is refined as the wave number increases, i.e., when the wave number is increased, the values of M and N are also increased.
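The rule above translates into a simple sizing computation; the threshold of ten points per wavelength in the sketch is an assumption standing in for the engineering rule of [8]:

```python
import math

def grid_for_wavenumber(k0, a, b, points_per_wavelength=10):
    """Grid dimensions M, N that maintain a fixed number of mesh points per
    wavelength lambda = 2*pi/k0 in each direction of [0, a] x [-b, 0]
    (the threshold of 10 is an assumed engineering value)."""
    h_max = (2 * math.pi / k0) / points_per_wavelength   # largest admissible spacing
    M = math.ceil(a / h_max)
    N = math.ceil(b / h_max)
    return M, N
```

Doubling k_0 halves the wavelength and therefore roughly doubles both M and N, so the matrix dimension M·N grows quadratically with the wave number; this is why the experiments with large k_0 in §4 are so demanding.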

NUMERICAL EXPERIMENTS
The numerical experiments consider Ω = [0, 1] × [−0.25, 0] filled with a non-homogeneous medium. The experiments were run on a computer with an Intel Core i7-6700T and 8 GB of RAM, using MATLAB R2016a for Windows 10. The resulting linear system Au = f was solved with the following iterative methods: GMRES(m) [20]; GMRES-E(m, d) [16], which uses d harmonic Ritz vectors to augment the search subspace, whose dimension at each cycle is s = m + d; LGMRES(m, l) [2], which uses l error approximation vectors to augment the search subspace, whose dimension at each cycle is s = m + l; and LGMRES-E(m, l, d) [18], which uses l error approximation vectors and d harmonic Ritz vectors to augment the search subspace, whose dimension at each cycle is s = m + l + d. For enriching the subspace in the proposed A-LGMRES-E(m_j, l, d) method, if ‖y_j‖ ≥ δ an adaptive restart parameter m_j = m_{j−1} is used together with d harmonic Ritz vectors and l error approximation vectors, while if ‖y_j‖ < δ a restart parameter m_j = m_{j−1} + α is used with d harmonic Ritz vectors only. In both cases, the dimension at each cycle follows rule (2.13).
A modified version of the standard GMRES(m) has also been implemented: it uses rule (2.13) to adaptively modify the restart parameter, without any enrichment of the search subspace (i.e., l = d = 0). This method is denoted GMRES(m_j) and is used for comparison purposes. The strategies that modify the restart parameter bound it between a minimum m_min and a maximum m_max, i.e., m_min ≤ m_j ≤ m_max. The initial restart parameter is denoted m_0 and is chosen as m_0 = m_min for these strategies.
Table 1 presents the wave numbers k_0 and the grid sizes for the tested matrices of the problem introduced in §3. In this table, the size, the number of nonzero elements and the condition number of the matrix A are denoted size(A), nnz(A) and cond(A), respectively; M and N are the discretization parameters. The minimum and maximum eigenvalues (in magnitude) are denoted λ_1 and λ_n, respectively. The initial solution is x_0 = 0 for all numerical experiments in this section; the stopping criterion is a relative residual norm ‖r_j‖/‖r_0‖ < 10⁻⁶ or a maximum of 3000 restart cycles. The parameters considered for each method are summarized in Table 2. The harmonic Ritz vectors are obtained using the eigs MATLAB function with an initial vector of ones instead of the random vector used by default; this keeps the number of restart cycles the same when a problem is run several times. The remaining parameters are those defined by default in the eigs function, chiefly a tolerance of 10⁻⁶ and a maximum of 300 iterations [22]. For the cases where a method does not reach convergence within 3000 restart cycles, the method is stopped and its time is denoted NC.
Experimentally, it is observed that the values ‖y_j‖ lie between 1E−01 and 1E−07 in the numerical tests of GMRES(m) (see Table 3). Note that only four problems converged to the prescribed tolerance, which shows that the cavity problem is difficult for GMRES(m) with fixed m and without subspace enrichment. The problems where GMRES(m) exhibits stagnation (cavity05 to cavity12) have, in general, a lower mean(‖y_j‖) than the problems where it converges faster (cavity01 and cavity02). This gives an idea of how susceptible the method is to stagnation when the value of m remains constant. In this work, the stagnation threshold parameter is set to δ = 0.5, which is of the order of max(‖y_j‖) for the problems without stagnation (see Table 3). The value of δ is chosen heuristically: smaller values of δ could modify the value of m unnecessarily (recall that the modification of m is given by rule (2.13) and is directly affected by the value of δ). With reference to the value of α, which is the increment of the restart parameter m when stagnation is detected, previous works have used very small increments, such as ∆m = 2 [13] and ∆m = 3 [6]; the value used in this work is given in Table 2. Table 3 presents the behavior of the norm of the vector y_j when the GMRES(m) method is used on all the tested problems. It is observed that the norm of y_j (see equation (2.15)) is small (less than 10⁻²), indicating a slow rate of convergence, i.e., ‖r_j‖ ≈ ‖r_{j−1}‖. Table 4 shows, for the problems from Table 1, the number of cycles required to converge to ‖r_j‖/‖r_0‖ < 10⁻⁶, as well as min(‖y_j‖), mean(‖y_j‖) and max(‖y_j‖), which are the minimal, mean and maximal values of ‖y_j‖, respectively. A-LGMRES-E converged to the prescribed tolerance in all cases except three problems: cavity08, cavity09 and cavity12.
For these problems, the proposed method achieves the lowest relative norm with respect to the other iterative methods tested (see Table 6).

Figures 2-(a) and 2-(b) show the convergence curves and the value of ‖y_j‖ versus the number of restart cycles for the problems cavity03 and cavity04. It is observed that the A-LGMRES-E method, which includes information from previous cycles and updates the restart parameter according to rule (2.13), presents a rate of convergence similar to that of the GMRES(m_j) method, which updates only the restart parameter without including information from previous cycles. Also in Figure 2, the search subspace dimensions of the adaptive methods are compared in the second sub-figure of columns (a) and (b), that is, the value m_j for GMRES(m_j) and the value s_j for A-LGMRES-E(s_j). The value s_j grows more slowly than m_j, which allows a smaller number of matrix-vector multiplications for A-LGMRES-E(s_j) in the first fifty cycles. In the third sub-figure, the controlled variable ‖y_j‖ is presented for the two methods. The values ‖y_j‖ of A-LGMRES-E(s_j) are relatively larger than the corresponding values of GMRES(m_j) but below the threshold δ in some cases, allowing s_j to increase more slowly than the m_j of GMRES(m_j).
Comparing the augmented methods that use information from previous cycles (see Figure 3), i.e., LGMRES(m, l), GMRES-E(m, d), LGMRES-E(m, l, d) and A-LGMRES-E(m_j, l, d), the A-LGMRES-E has the best rate of convergence for the problems shown, with the lowest execution times and numbers of restart cycles. Furthermore, LGMRES(m, l) does not converge for the problems cavity06 and cavity11 (Figures 3-(c) and 3-(d)). This is also true for LGMRES-E(m, l, d), which combines error approximations and harmonic Ritz vectors [18], when constant values of m, l and d are used. This shows that adding information vectors from previous cycles with a fixed restart parameter is not enough to obtain convergence, so an adjustment of m is needed to improve the rate of convergence.
All implemented methods are compared in Table 5. It is observed that the A-LGMRES-E(m_j, l, d) method needs the lowest number of cycles to converge and produces the steepest convergence curves with respect to the methods that use fixed parameters. For the problems that do not converge, the proposed method achieves the best reduction of the relative residual norm ‖r_j‖/‖r_0‖ among the tested methods; the best reductions for these difficult problems are indicated in boldface in Table 6. It is observed that the methods with fixed parameters achieved a smaller reduction of the relative residual norm in the cycles carried out, compared with the other tested methods.

CONCLUSION
An adaptive method based on a threshold criterion was introduced for identifying stagnation in GMRES(m). The criterion triggers the expansion of the search subspace, both to improve the speed of convergence and to overcome the stagnation. The proposed method, as well as several standard methods, was implemented for the resolution of the finite difference approximation of the Helmholtz equation. Numerical experiments for different discrete domain sizes and values of the wave number k_0 were compared, and the results show that the proposal improves convergence when compared with other iterative methods that use either fixed parameters and an enriched subspace, or adaptive parameters without subspace enrichment. The computation is especially challenging when k_0 is increased; in this case, more exhaustive research is necessary for linear systems with large k_0 in order to identify what is more convenient: modifying the restart parameter m or an appropriate augmentation of the search subspace of GMRES(m).