Algorithm Reference¶

This document provides mathematical details of the algorithms implemented in Sparse Kappa.

1-Norm Algorithms¶

Hager-Higham Algorithm¶

Reference: Higham & Tisseur (2000), SIAM J. Matrix Anal. Appl.

Problem: Estimate \(\kappa_1(A) = \|A\|_1 \cdot \|A^{-1}\|_1\)

Algorithm:

Compute \(\|A\|_1\) exactly:

\[\|A\|_1 = \max_j \sum_{i=1}^n |a_{ij}|\]
Initialize \(x = \mathbf{1}/n\) where \(\mathbf{1} = (1,1,\ldots,1)^T\)
For \(k = 1, 2, \ldots, k_{\max}\):
1. Solve \(Aw = x\) for \(w\)
2. Estimate \(\gamma = \|w\|_1\)
3. Compute \(\text{sign}(w) = (\text{sgn}(w_1), \ldots, \text{sgn}(w_n))^T\)
4. Solve \(A^T z = \text{sign}(w)\) for \(z\)
5. Stopping criterion: If \(\max_i |z_i| \leq z^T x\), stop
6. Find \(j = \arg\max_i |z_i|\)
7. Update \(x = e_j\) (standard basis vector)
8. If \(x\) unchanged, stop
Return \(\kappa_1(A) = \|A\|_1 \cdot \gamma\)

Complexity: \(O(k \cdot \text{nnz}(A))\) where \(k\) is iterations

Why LU is crucial: - Requires \(2k + 2\) solves with \(A\) and \(A^T\) - LU factorization: Compute once, solve \(O(n^2)\) each - LSMR: Each solve is \(O(m \cdot \text{nnz}(A))\)

Power Iteration (1-norm)¶

Problem: Estimate \(\|A^{-1}\|_1\)

Algorithm:

Initialize \(x\) randomly, normalize by \(\|x\|_1\)
For \(k = 1, 2, \ldots\):
1. Solve \(Ay = x\) for \(y\)
2. Estimate \(\gamma = \|y\|_1\)
3. Normalize \(x = y / \|y\|_1\)
4. If converged, stop
Return \(\|A^{-1}\|_1 \approx \gamma\)

Convergence: Linear, depends on eigenvalue gap

Oettli-Prager Method¶

Reference: Oettli & Prager (1964), Numer. Math.

Idea: Sample columns to estimate \(\|A^{-1}\|_1\)

Variants:

Adaptive (original):

Start with column j* = argmax ||A(:,j)||₁
Solve A x = e_j*
Use dual problem A^T y = sign(x) to select next column
Repeat until converged

Random:

Sample random vectors b₁, b₂, ..., b_m
Solve A x_i = b_i
Return max ||x_i||₁

Hybrid:

Combines adaptive and random strategies

2-Norm Algorithms¶

Singular Value Decomposition (SVDS)¶

Problem: Compute \(\kappa_2(A) = \sigma_{\max}(A) / \sigma_{\min}(A)\)

Method: Partial SVD using Lanczos bidiagonalization

\[A = U \Sigma V^T\]

Algorithm (simplified):

Compute \(k\) largest singular values via Lanczos on \(A^T A\)
Compute \(k\) smallest singular values:
- Build \(A^T A\) as linear operator
- Use ARPACK with shift-invert for smallest eigenvalues
- \(\sigma = \sqrt{\lambda(A^T A)}\)
Return \(\kappa_2(A) = \sigma_{\max} / \sigma_{\min}\)

Accuracy: Exact for computed singular values

Complexity: \(O(k \cdot \text{nnz}(A))\) where \(k\) is number of singular values

Lanczos Method¶

Reference: Golub & Van Loan (2013), Matrix Computations

Idea: Build tridiagonal matrix \(T\) whose eigenvalues approximate those of \(A^T A\)

Algorithm:

Start with random vector \(v_1\), \(\|v_1\|_2 = 1\)
For \(j = 1, 2, \ldots, k\):

\[\begin{split}w &= A^T A v_j - \beta_{j-1} v_{j-1} \\ \alpha_j &= v_j^T w \\ w &= w - \alpha_j v_j \\ \beta_j &= \|w\|_2 \\ v_{j+1} &= w / \beta_j\end{split}\]
Build tridiagonal matrix:

\[\begin{split}T = \begin{pmatrix} \alpha_1 & \beta_1 & & \\ \beta_1 & \alpha_2 & \beta_2 & \\ & \ddots & \ddots & \ddots \\ & & \beta_{k-1} & \alpha_k \end{pmatrix}\end{split}\]
Compute eigenvalues \(\lambda_1 \geq \lambda_2 \geq \cdots \geq \lambda_k\) of \(T\)
Estimate:

\[\sigma_{\max}(A) \approx \sqrt{\lambda_1}, \quad \sigma_{\min}(A) \approx \sqrt{\lambda_k}\]

For symmetric \(A\): Apply Lanczos directly to \(A\)

Golub-Kahan Bidiagonalization¶

Reference: Golub & Kahan (1965), SIAM J. Numer. Anal.

Advantage: More stable than forming \(A^T A\) explicitly

Algorithm:

Start with \(u_1, v_1\) random unit vectors
For \(j = 1, 2, \ldots, k\):

\[\begin{split}v_j^* &= A^T u_j - \beta_{j-1} v_{j-1} \\ \alpha_j &= \|v_j^*\|_2, \quad v_j = v_j^* / \alpha_j \\ u_{j+1}^* &= A v_j - \alpha_j u_j \\ \beta_j &= \|u_{j+1}^*\|_2, \quad u_{j+1} = u_{j+1}^* / \beta_j\end{split}\]
Build bidiagonal matrix:

\[\begin{split}B = \begin{pmatrix} \alpha_1 & \beta_1 & & \\ & \alpha_2 & \beta_2 & \\ & & \ddots & \ddots \\ & & & \alpha_k \end{pmatrix}\end{split}\]
Singular values of \(B\) approximate those of \(A\)

Stability: Better conditioned than \(A^T A\)

Power Method (2-norm)¶

Problem: Estimate \(\sigma_{\max}(A)\)

Algorithm:

Start with random vector \(v\), \(\|v\|_2 = 1\)
For \(k = 1, 2, \ldots\):

\[\begin{split}w &= A^T A v \\ \lambda &= v^T w \\ v &= w / \|w\|_2\end{split}\]
Return \(\sigma_{\max} \approx \sqrt{\lambda}\)

For \(\sigma_{\min}\): Use inverse iteration or shift-invert

Convergence Analysis¶

Hager-Higham¶

Typical iterations: 2-5

Convergence: Depends on matrix structure, not proven in general

In practice: Usually converges quickly

Power Iteration¶

Convergence rate:

\[\text{error}_k \approx \left|\frac{\lambda_2}{\lambda_1}\right|^k \text{error}_0\]

where \(\lambda_1, \lambda_2\) are two largest eigenvalues.

Fast if: \(\lambda_1 \gg \lambda_2\) (well-separated)

Lanczos¶

Accuracy: Extremal eigenvalues converge first

Iterations needed: Typically \(k = 10-50\) for good estimates

Numerical Stability¶

Loss of Orthogonality¶

Problem: In Lanczos, \(v_j\) vectors lose orthogonality due to rounding

Solution:

Reorthogonalization
Use more iterations
Check residuals

Ill-Conditioned Systems¶

Challenge: Solving \(Ax = b\) when \(\kappa(A)\) is large

Solutions:

Use iterative refinement
Increase solver tolerance
Use preconditioners
Accept approximate results

References¶

Hager, W. W. (1984). “Condition estimates.” SIAM J. Sci. Stat. Comput., 5(2), 311-316.
Higham, N. J., & Tisseur, F. (2000). “A block algorithm for matrix 1-norm estimation.” SIAM J. Matrix Anal. Appl., 21(4), 1185-1201.
Golub, G. H., & Van Loan, C. F. (2013). Matrix Computations (4th ed.). Johns Hopkins University Press.
Golub, G. H., & Kahan, W. (1965). “Calculating the singular values and pseudo-inverse of a matrix.” SIAM J. Numer. Anal., 2(2), 205-224.
Saad, Y. (2011). Numerical Methods for Large Eigenvalue Problems (2nd ed.). SIAM.
Oettli, W., & Prager, W. (1964). “Compatibility of approximate solution of linear equations.” Numer. Math., 6(1), 405-409.