Algorithm Reference¶
This document provides mathematical details of the algorithms implemented in Sparse Kappa.
1-Norm Algorithms¶
Hager-Higham Algorithm¶
Reference: Higham & Tisseur (2000), SIAM J. Matrix Anal. Appl.
Problem: Estimate \(\kappa_1(A) = \|A\|_1 \cdot \|A^{-1}\|_1\)
Algorithm:
Compute \(\|A\|_1\) exactly:
\[\|A\|_1 = \max_j \sum_{i=1}^n |a_{ij}|\]Initialize \(x = \mathbf{1}/n\) where \(\mathbf{1} = (1,1,\ldots,1)^T\)
For \(k = 1, 2, \ldots, k_{\max}\):
Solve \(Aw = x\) for \(w\)
Estimate \(\gamma = \|w\|_1\)
Compute \(\text{sign}(w) = (\text{sgn}(w_1), \ldots, \text{sgn}(w_n))^T\)
Solve \(A^T z = \text{sign}(w)\) for \(z\)
Stopping criterion: If \(\max_i |z_i| \leq z^T x\), stop
Find \(j = \arg\max_i |z_i|\)
Update \(x = e_j\) (standard basis vector)
If \(x\) unchanged, stop
Return \(\kappa_1(A) = \|A\|_1 \cdot \gamma\)
Complexity: \(O(k \cdot \text{nnz}(A))\) where \(k\) is iterations
Why LU is crucial: - Requires \(2k + 2\) solves with \(A\) and \(A^T\) - LU factorization: Compute once, solve \(O(n^2)\) each - LSMR: Each solve is \(O(m \cdot \text{nnz}(A))\)
Power Iteration (1-norm)¶
Problem: Estimate \(\|A^{-1}\|_1\)
Algorithm:
Initialize \(x\) randomly, normalize by \(\|x\|_1\)
For \(k = 1, 2, \ldots\):
Solve \(Ay = x\) for \(y\)
Estimate \(\gamma = \|y\|_1\)
Normalize \(x = y / \|y\|_1\)
If converged, stop
Return \(\|A^{-1}\|_1 \approx \gamma\)
Convergence: Linear, depends on eigenvalue gap
Oettli-Prager Method¶
Reference: Oettli & Prager (1964), Numer. Math.
Idea: Sample columns to estimate \(\|A^{-1}\|_1\)
Variants:
Adaptive (original):
Start with column j* = argmax ||A(:,j)||₁ Solve A x = e_j* Use dual problem A^T y = sign(x) to select next column Repeat until converged
Random:
Sample random vectors b₁, b₂, ..., b_m Solve A x_i = b_i Return max ||x_i||₁
Hybrid:
Combines adaptive and random strategies
2-Norm Algorithms¶
Singular Value Decomposition (SVDS)¶
Problem: Compute \(\kappa_2(A) = \sigma_{\max}(A) / \sigma_{\min}(A)\)
Method: Partial SVD using Lanczos bidiagonalization
Algorithm (simplified):
Compute \(k\) largest singular values via Lanczos on \(A^T A\)
Compute \(k\) smallest singular values:
Build \(A^T A\) as linear operator
Use ARPACK with shift-invert for smallest eigenvalues
\(\sigma = \sqrt{\lambda(A^T A)}\)
Return \(\kappa_2(A) = \sigma_{\max} / \sigma_{\min}\)
Accuracy: Exact for computed singular values
Complexity: \(O(k \cdot \text{nnz}(A))\) where \(k\) is number of singular values
Lanczos Method¶
Reference: Golub & Van Loan (2013), Matrix Computations
Idea: Build tridiagonal matrix \(T\) whose eigenvalues approximate those of \(A^T A\)
Algorithm:
Start with random vector \(v_1\), \(\|v_1\|_2 = 1\)
For \(j = 1, 2, \ldots, k\):
\[\begin{split}w &= A^T A v_j - \beta_{j-1} v_{j-1} \\ \alpha_j &= v_j^T w \\ w &= w - \alpha_j v_j \\ \beta_j &= \|w\|_2 \\ v_{j+1} &= w / \beta_j\end{split}\]Build tridiagonal matrix:
\[\begin{split}T = \begin{pmatrix} \alpha_1 & \beta_1 & & \\ \beta_1 & \alpha_2 & \beta_2 & \\ & \ddots & \ddots & \ddots \\ & & \beta_{k-1} & \alpha_k \end{pmatrix}\end{split}\]Compute eigenvalues \(\lambda_1 \geq \lambda_2 \geq \cdots \geq \lambda_k\) of \(T\)
Estimate:
\[\sigma_{\max}(A) \approx \sqrt{\lambda_1}, \quad \sigma_{\min}(A) \approx \sqrt{\lambda_k}\]
For symmetric \(A\): Apply Lanczos directly to \(A\)
Golub-Kahan Bidiagonalization¶
Reference: Golub & Kahan (1965), SIAM J. Numer. Anal.
Advantage: More stable than forming \(A^T A\) explicitly
Algorithm:
Start with \(u_1, v_1\) random unit vectors
For \(j = 1, 2, \ldots, k\):
\[\begin{split}v_j^* &= A^T u_j - \beta_{j-1} v_{j-1} \\ \alpha_j &= \|v_j^*\|_2, \quad v_j = v_j^* / \alpha_j \\ u_{j+1}^* &= A v_j - \alpha_j u_j \\ \beta_j &= \|u_{j+1}^*\|_2, \quad u_{j+1} = u_{j+1}^* / \beta_j\end{split}\]Build bidiagonal matrix:
\[\begin{split}B = \begin{pmatrix} \alpha_1 & \beta_1 & & \\ & \alpha_2 & \beta_2 & \\ & & \ddots & \ddots \\ & & & \alpha_k \end{pmatrix}\end{split}\]Singular values of \(B\) approximate those of \(A\)
Stability: Better conditioned than \(A^T A\)
Power Method (2-norm)¶
Problem: Estimate \(\sigma_{\max}(A)\)
Algorithm:
Start with random vector \(v\), \(\|v\|_2 = 1\)
For \(k = 1, 2, \ldots\):
\[\begin{split}w &= A^T A v \\ \lambda &= v^T w \\ v &= w / \|w\|_2\end{split}\]Return \(\sigma_{\max} \approx \sqrt{\lambda}\)
For \(\sigma_{\min}\): Use inverse iteration or shift-invert
Convergence Analysis¶
Hager-Higham¶
Typical iterations: 2-5
Convergence: Depends on matrix structure, not proven in general
In practice: Usually converges quickly
Power Iteration¶
Convergence rate:
where \(\lambda_1, \lambda_2\) are two largest eigenvalues.
Fast if: \(\lambda_1 \gg \lambda_2\) (well-separated)
Lanczos¶
Accuracy: Extremal eigenvalues converge first
Iterations needed: Typically \(k = 10-50\) for good estimates
Numerical Stability¶
Loss of Orthogonality¶
Problem: In Lanczos, \(v_j\) vectors lose orthogonality due to rounding
Solution:
Reorthogonalization
Use more iterations
Check residuals
Ill-Conditioned Systems¶
Challenge: Solving \(Ax = b\) when \(\kappa(A)\) is large
Solutions:
Use iterative refinement
Increase solver tolerance
Use preconditioners
Accept approximate results
References¶
Hager, W. W. (1984). “Condition estimates.” SIAM J. Sci. Stat. Comput., 5(2), 311-316.
Higham, N. J., & Tisseur, F. (2000). “A block algorithm for matrix 1-norm estimation.” SIAM J. Matrix Anal. Appl., 21(4), 1185-1201.
Golub, G. H., & Van Loan, C. F. (2013). Matrix Computations (4th ed.). Johns Hopkins University Press.
Golub, G. H., & Kahan, W. (1965). “Calculating the singular values and pseudo-inverse of a matrix.” SIAM J. Numer. Anal., 2(2), 205-224.
Saad, Y. (2011). Numerical Methods for Large Eigenvalue Problems (2nd ed.). SIAM.
Oettli, W., & Prager, W. (1964). “Compatibility of approximate solution of linear equations.” Numer. Math., 6(1), 405-409.