General Radii Polynomial Theorem #
This file generalizes the radii polynomial approach to maps between potentially different Banach spaces (E, F). This corresponds to Theorem 7.6.2 in the informal proof.
Main differences from the E to E case: #
- Maps f : E → F between potentially different Banach spaces
- Approximate inverse A : F →L[ℝ] E (goes in reverse direction)
- Approximate derivative A† : E →L[ℝ] F (approximates Df)
- The composition AA† : E →L[ℝ] E must be close to identity on E
- Additional bound Z₁ for ‖A[Df(x̄) - A†]‖
Banach Space Setup #
We work with two Banach spaces E and F over ℝ:
For each space X ∈ {E, F}:
NormedAddCommGroup X: X has a norm satisfying definiteness, symmetry, triangle inequalityNormedSpace ℝ X: The norm is compatible with scalar multiplicationCompleteSpace X: Every Cauchy sequence converges (crucial for fixed point theorems)
This framework supports:
- Fréchet derivatives (via the norm structure)
- Fixed point theorems (via completeness)
- Mean Value Theorem (via the metric structure)
- Linear operator theory (via the vector space structure)
Equations
Instances For
Equations
Instances For
Neumann Series and Operator Invertibility #
The Neumann series results establish when operators close to the identity are invertible. These results apply to operators on a single space (E →L[ℝ] E or F →L[ℝ] F).
In the general E to F setting, we use these results for the composition AA† : E →L[ℝ] E, which must be close to the identity I_E for the method to work.
Key insight: If ‖I - B‖ < 1 for an operator B : E →L[ℝ] E, then B is invertible. This is the Neumann series theorem, already available in Mathlib.
If an operator is close to identity, it is a unit (invertible in the multiplicative sense).
This is a direct application of Mathlib's isUnit_one_sub_of_norm_lt_one.
Version for operators on F
Newton-Like Operator for E to F Maps #
For a function f : E → F and an approximate inverse A : F →L[ℝ] E, we define: T(x) = x - A(f(x))
This operator T : E → E transforms the zero-finding problem f(x) = 0 into a fixed point problem T(x) = x.
Key differences from E to E case:
- f : E → F (not E → E)
- A : F →L[ℝ] E (approximate inverse, goes "backwards")
- T : E → E (the Newton operator still maps E to itself)
The Newton-like operator T(x) = x - Af(x) for maps from E to F
Equations
- NewtonLikeMap A f x = x - A (f x)
Instances For
Fixed Points ⟺ Zeros (Proposition 2.3.1) #
This fundamental equivalence holds in the general E to F setting: T(x) = x ⟺ f(x) = 0
when A : F →L[ℝ] E is injective.
The proof is identical to the E to E case - injectivity of A is the key requirement, not invertibility.
Proposition 2.3.1: Fixed points of Newton operator ⟺ Zeros of f
Let T(x) = x - Af(x) be the Newton-like operator where:
- f : E → F
- A : F →L[ℝ] E is an injective linear map
Then: T(x) = x ⟺ f(x) = 0
This fundamental equivalence allows us to:
- Convert zero-finding problems (f(x) = 0) to fixed point problems (T(x) = x)
- Apply fixed point theorems (like Banach's) to find zeros of f
Radii Polynomial Definitions #
The radii polynomial approach uses polynomial inequalities to verify contraction conditions.
For the general E to F case (Theorem 7.6.2), we have an additional parameter Z₁: p(r) = Z₂(r)r² - (1 - Z₀ - Z₁)r + Y₀
Where:
- Y₀: bound on ‖A(f(x̄))‖ (initial defect)
- Z₀: bound on ‖I_E - AA†‖ (how close AA† is to identity)
- Z₁: bound on ‖A[Df(x̄) - A†]‖ (how well A† approximates Df(x̄))
- Z₂(r): bound on ‖A[Df(c) - Df(x̄)]‖ for c ∈ B̄ᵣ(x̄) (Lipschitz-type bound)
The simpler case (when E = F and A† = Df(x̄)) has Z₁ = 0.
The general radii polynomial with Z₁ parameter (Theorem 7.6.2, equation 7.34)
p(r) = Z₂(r)r² - (1 - Z₀ - Z₁)r + Y₀
This is used when we have:
- f : E → F with E and F potentially different
- A : F →L[ℝ] E (approximate inverse)
- A† : E →L[ℝ] F (approximate derivative)
Instances For
Alternative formulation: p(r) = (Z(r) - 1)r + Y₀
This connects the polynomial form to the contraction constant bound. When p(r₀) < 0, we get Z(r₀) < 1, which means T is a contraction.
Simple radii polynomial (when Z₁ = 0, corresponds to Theorem 2.4.1/2.4.2)
This is the special case when E = F or when A† = Df(x̄) exactly.
p(r) = Z₂(r)r² - (1 - Z₀)r + Y₀
Instances For
Simple radii polynomial as special case of general one
Key Implications of Radii Polynomial Negativity #
If p(r₀) < 0, this implies the contraction constant Z(r₀) < 1, which is the key requirement for the Banach fixed point theorem.
If the general radii polynomial is negative, then Z(r₀) < 1
Operator Bounds for Newton-Like Map #
These lemmas establish the key bounds needed to apply the contraction mapping theorem:
- Y₀ bound: ‖T(x̄) - x̄‖ ≤ Y₀ (initial displacement)
- Z bound: ‖DT(c)‖ ≤ Z(r) for c ∈ B̄ᵣ(x̄) (derivative bound)
In the general E to F setting, the derivative is DT(x) = I_E - A ∘ Df(x).
Y₀ bound for Newton operator: ‖T(x̄) - x̄‖ ≤ Y₀
This reformulates equation (7.30) for the Newton-like operator. For T(x) = x - A(f(x)), we have T(x̄) - x̄ = -A(f(x̄)).
Derivative of the Newton-like operator in the E to F setting
For T(x) = x - A(f(x)) where f : E → F and A : F →L[ℝ] E: DT(x) = I_E - A ∘ Df(x)
The composition A ∘ Df(x) has type E →L[ℝ] E since:
- Df(x) : E →L[ℝ] F
- A : F →L[ℝ] E
- A ∘ Df(x) : E →L[ℝ] E
General derivative bound for Newton operator on closed ball
‖DT(c)‖ ≤ Z₀ + Z₁ + Z₂(r)·r for all c ∈ B̄ᵣ(x̄)
This uses the decomposition (equation 7.35): DT(c) = I_E - A∘Df(c) = [I_E - A∘A†] + A∘[A† - Df(x̄)] + A∘[Df(x̄) - Df(c)]
Applying bounds:
- ‖I_E - A∘A†‖ ≤ Z₀ (eq. 7.31)
- ‖A∘[A† - Df(x̄)]‖ ≤ Z₁ (eq. 7.32)
- ‖A∘[Df(c) - Df(x̄)]‖ ≤ Z₂(r)·r (eq. 7.33)
Simple derivative bound (when A† = Df(x̄), so Z₁ = 0)
This is used in Theorem 2.4.2 when E = F or when we can set A† = Df(x̄).
‖DT(c)‖ ≤ Z₀ + Z₂(r)·r for all c ∈ B̄ᵣ(x̄)
Helper Lemmas for Fixed Point Theorems #
These technical lemmas are needed to apply the Banach fixed point theorem:
- Completeness of closed balls
- Finiteness of extended distance
- Construction of inverses from compositions
Closed balls in complete spaces are complete
Extended distance is finite in normed spaces Needed to apply Banach fixed point theorem
Construct inverse of Df(x̃) from invertibility of A∘Df(x̃)
Key insight: If A : F →L[ℝ] E is injective and A∘B : E →L[ℝ] E is invertible, then B : E →L[ℝ] F is invertible with inverse B⁻¹ = (A∘B)⁻¹ ∘ A.
This is used to show Df(x̃) is invertible without requiring A to be invertible.
T maps the closed ball into itself when the radii polynomial is negative
This is a key step in applying the Banach fixed point theorem.
Given:
- ‖T(x̄) - x̄‖ ≤ Y₀ (initial displacement bound)
- ‖DT(c)‖ ≤ Z(r₀) for all c ∈ B̄ᵣ₀(x̄) (derivative bound)
- p(r₀) < 0 where p(r) = (Z(r) - 1)r + Y₀ (radii polynomial condition)
We prove: T : B̄ᵣ₀(x̄) → B̄ᵣ₀(x̄) (T maps the ball to itself)
Strategy:
- From p(r₀) < 0, extract: Z(r₀)·r₀ + Y₀ < r₀
- For x ∈ B̄ᵣ₀(x̄), use Mean Value Theorem: ‖T(x) - T(x̄)‖ ≤ Z(r₀)·‖x - x̄‖ ≤ Z(r₀)·r₀
- Triangle inequality: ‖T(x) - x̄‖ ≤ ‖T(x) - T(x̄)‖ + ‖T(x̄) - x̄‖ ≤ Z(r₀)·r₀ + Y₀ < r₀
- Therefore T(x) ∈ B̄ᵣ₀(x̄)
General Fixed Point Theorem (Theorem 7.6.1) #
This is the fixed point theorem for differentiable maps T : E → E on Banach spaces, corresponding to Theorem 7.6.1 in the informal proof.
Given:
- T : E → E (differentiable map on same space)
- Bounds on ‖T(x̄) - x̄‖ and ‖DT(x)‖
- Radii polynomial p(r) = (Z(r) - 1)r + Y₀
If p(r₀) < 0, then there exists a unique fixed point x̃ ∈ B̄ᵣ₀(x̄).
This is used as a key step in proving Theorem 7.6.2.
Theorem 7.6.1: General Fixed Point Theorem for Banach Spaces
Let T : E → E be Fréchet differentiable and x̄ ∈ E. Suppose:
- ‖T(x̄) - x̄‖ ≤ Y₀ (eq. 7.27)
- ‖DT(x)‖ ≤ Z(r) for all x ∈ B̄ᵣ(x̄) (eq. 7.28)
Define p(r) := (Z(r) - 1)r + Y₀.
If there exists r₀ > 0 such that p(r₀) < 0, then there exists a unique x̃ ∈ B̄ᵣ₀(x̄) such that T(x̃) = x̃.
This is the Banach space version of Theorem 2.4.1. (which is in ℝⁿ)
General Radii Polynomial Theorem (Theorem 7.6.2) #
This is the main theorem for the E to F case, corresponding to Theorem 7.6.2 in the informal proof.
Given:
- f : E → F (potentially different spaces)
- A : F →L[ℝ] E (approximate inverse, must be injective)
- A† : E →L[ℝ] F (approximation to Df(x̄))
- Bounds Y₀, Z₀, Z₁, Z₂ satisfying the radii polynomial condition
If p(r₀) < 0, then there exists a unique zero x̃ ∈ B̄ᵣ₀(x̄).
The proof strategy: Apply Theorem 7.6.1 to the Newton-like operator T(x) = x - A(f(x)), then use Proposition 2.3.1 to convert the fixed point to a zero.
Note: We don't claim Df(x̃) is invertible in this general version without additional assumptions. The derivative Df(x̃) : E →L[ℝ] F may not have an inverse in the usual sense when E and F are different.
Theorem 7.6.2: General Radii Polynomial Theorem for E to F maps
Given f : E → F with E, F Banach spaces, approximate inverse A : F →L[ℝ] E, and approximate derivative A† : E →L[ℝ] F satisfying:
- ‖A(f(x̄))‖ ≤ Y₀ (eq. 7.30: initial defect)
- ‖I_E - A∘A†‖ ≤ Z₀ (eq. 7.31: AA† close to identity)
- ‖A∘[Df(x̄) - A†]‖ ≤ Z₁ (eq. 7.32: A† approximates Df(x̄))
- ‖A∘[Df(c) - Df(x̄)]‖ ≤ Z₂(r)·r for c ∈ B̄ᵣ(x̄) (eq. 7.33: Lipschitz bound)
If p(r₀) < 0 where p(r) = Z₂(r)r² - (1-Z₀-Z₁)r + Y₀ (eq. 7.34), then there exists a unique x̃ ∈ B̄ᵣ₀(x̄) with f(x̃) = 0.
RM: It turns out we only need need there exists some r₀ > 0 such that p(r₀) < 0, not that p(r) < 0 for all r ∈ (0, r₀]. This is a slight generalization over the original statement.
Proof strategy:
- Define Newton-like operator T(x) = x - A(f(x))
- Show T satisfies conditions of Theorem 7.6.1 (general_fixed_point_theorem)
- Apply Theorem 7.6.1 to get unique fixed point x̃
- Use Proposition 2.3.1 to show f(x̃) = 0
The key requirement is that A is injective.
Simplified Radii Polynomial Theorem #
This is the simpler version when we don't need an explicit A†, or when we can effectively set A† = Df(x̄). This corresponds to Theorem 2.4.2 in the original formalization.
The key simplification: Z₁ = 0, so the polynomial becomes: p(r) = Z₂(r)r² - (1 - Z₀)r + Y₀
This still works for f : E → F with different spaces, but requires tighter bounds.
Theorem 2.4.2 (Generalized): Simple Radii Polynomial Theorem for E to F
Given f : E → F and approximate inverse A : F →L[ℝ] E satisfying:
- ‖A(f(x̄))‖ ≤ Y₀ (eq. 2.14)
- ‖I_E - A∘Df(x̄)‖ ≤ Z₀ (eq. 2.15)
- ‖A∘[Df(c) - Df(x̄)]‖ ≤ Z₂(r)·r for c ∈ B̄ᵣ(x̄) (eq. 2.16)
If p(r₀) < 0 where p(r) = Z₂(r)r² - (1-Z₀)r + Y₀ (eq. 2.17), then there exists a unique x̃ ∈ B̄ᵣ₀(x̄) with f(x̃) = 0.
This is a special case of the general theorem with A† = Df(x̄) (so Z₁ = 0).
Proof strategy:
- Define Newton-like operator T(x) = x - A(f(x))
- Apply Theorem 7.6.1 (general_fixed_point_theorem) to T
- Use Proposition 2.3.1 to convert fixed point to zero
Version for same space (E = F) with invertibility claim
When E = F and Df(x̃) : E →L[ℝ] E, we can additionally prove that Df(x̃) is invertible using the Neumann series.
Proof strategy:
- Apply Theorem 7.6.1 (general_fixed_point_theorem) to get fixed point x̃
- Use Proposition 2.3.1 to show f(x̃) = 0
- Show ‖I_E - A∘Df(x̃)‖ < 1 using the derivative bound
- Apply Neumann series to construct inverse of Df(x̃)