Documentation

RadiiPolynomial.RadiiPolyGeneral

General Radii Polynomial Theorem #

This file generalizes the radii polynomial approach to maps between potentially different Banach spaces (E, F). This corresponds to Theorem 7.6.2 in the informal proof.

Main differences from the E to E case: #

Banach Space Setup #

We work with two Banach spaces E and F over ℝ:

For each space X ∈ {E, F}:

  1. NormedAddCommGroup X: X has a norm satisfying definiteness, symmetry, triangle inequality
  2. NormedSpace ℝ X: The norm is compatible with scalar multiplication
  3. CompleteSpace X: Every Cauchy sequence converges (crucial for fixed point theorems)

This framework supports:

@[reducible, inline]
abbrev I_E {E : Type u_1} [NormedAddCommGroup E] [NormedSpace E] :
Equations
Instances For
    @[reducible, inline]
    abbrev I_F {F : Type u_2} [NormedAddCommGroup F] [NormedSpace F] :
    Equations
    Instances For

      Neumann Series and Operator Invertibility #

      The Neumann series results establish when operators close to the identity are invertible. These results apply to operators on a single space (E →L[ℝ] E or F →L[ℝ] F).

      In the general E to F setting, we use these results for the composition AA† : E →L[ℝ] E, which must be close to the identity I_E for the method to work.

      Key insight: If ‖I - B‖ < 1 for an operator B : E →L[ℝ] E, then B is invertible. This is the Neumann series theorem, already available in Mathlib.

      If an operator is close to identity, it is a unit (invertible in the multiplicative sense). This is a direct application of Mathlib's isUnit_one_sub_of_norm_lt_one.

      theorem invertible_of_norm_sub_id_lt_one {E : Type u_1} [NormedAddCommGroup E] [NormedSpace E] [CompleteSpace E] {B : E →L[] E} (h : I_E - B < 1) :
      ∃ (B_inv : E →L[] E), B * B_inv = 1 B_inv * B = 1

      Alternative formulation: explicit existence of a two-sided inverse

      theorem invertible_comp_form {E : Type u_1} [NormedAddCommGroup E] [NormedSpace E] [CompleteSpace E] {B : E →L[] E} (h : I_E - B < 1) :
      ∃ (B_inv : E →L[] E), B.comp B_inv = I_E B_inv.comp B = I_E

      Composition form: useful for working with .comp notation

      Version for operators on F

      Newton-Like Operator for E to F Maps #

      For a function f : E → F and an approximate inverse A : F →L[ℝ] E, we define: T(x) = x - A(f(x))

      This operator T : E → E transforms the zero-finding problem f(x) = 0 into a fixed point problem T(x) = x.

      Key differences from E to E case:

      def NewtonLikeMap {E : Type u_1} {F : Type u_2} [NormedAddCommGroup E] [NormedSpace E] [NormedAddCommGroup F] [NormedSpace F] (A : F →L[] E) (f : EF) (x : E) :
      E

      The Newton-like operator T(x) = x - Af(x) for maps from E to F

      Equations
      Instances For

        Fixed Points ⟺ Zeros (Proposition 2.3.1) #

        This fundamental equivalence holds in the general E to F setting: T(x) = x ⟺ f(x) = 0

        when A : F →L[ℝ] E is injective.

        The proof is identical to the E to E case - injectivity of A is the key requirement, not invertibility.

        theorem fixedPoint_injective_iff_zero {E : Type u_1} {F : Type u_2} [NormedAddCommGroup E] [NormedSpace E] [NormedAddCommGroup F] [NormedSpace F] {f : EF} {A : F →L[] E} (hA : Function.Injective A) (x : E) :
        NewtonLikeMap A f x = x f x = 0

        Proposition 2.3.1: Fixed points of Newton operator ⟺ Zeros of f

        Let T(x) = x - Af(x) be the Newton-like operator where:

        • f : E → F
        • A : F →L[ℝ] E is an injective linear map

        Then: T(x) = x ⟺ f(x) = 0

        This fundamental equivalence allows us to:

        • Convert zero-finding problems (f(x) = 0) to fixed point problems (T(x) = x)
        • Apply fixed point theorems (like Banach's) to find zeros of f

        Radii Polynomial Definitions #

        The radii polynomial approach uses polynomial inequalities to verify contraction conditions.

        For the general E to F case (Theorem 7.6.2), we have an additional parameter Z₁: p(r) = Z₂(r)r² - (1 - Z₀ - Z₁)r + Y₀

        Where:

        The simpler case (when E = F and A† = Df(x̄)) has Z₁ = 0.

        def generalRadiiPolynomial (Y₀ Z₀ Z₁ : ) (Z₂ : ) (r : ) :

        The general radii polynomial with Z₁ parameter (Theorem 7.6.2, equation 7.34)

        p(r) = Z₂(r)r² - (1 - Z₀ - Z₁)r + Y₀

        This is used when we have:

        • f : E → F with E and F potentially different
        • A : F →L[ℝ] E (approximate inverse)
        • A† : E →L[ℝ] F (approximate derivative)
        Equations
        Instances For
          def Z_bound_general (Z₀ Z₁ : ) (Z₂ : ) (r : ) :

          Combined bound Z(r) = Z₀ + Z₁ + Z₂(r)·r

          This represents the total bound on ‖DT(c)‖ for c ∈ B̄ᵣ(x̄). See equation (7.35) in the proof of Theorem 7.6.2.

          Equations
          Instances For
            theorem generalRadiiPolynomial_alt_form (Y₀ Z₀ Z₁ : ) (Z₂ : ) (r : ) :
            generalRadiiPolynomial Y₀ Z₀ Z₁ Z₂ r = (Z_bound_general Z₀ Z₁ Z₂ r - 1) * r + Y₀

            Alternative formulation: p(r) = (Z(r) - 1)r + Y₀

            This connects the polynomial form to the contraction constant bound. When p(r₀) < 0, we get Z(r₀) < 1, which means T is a contraction.

            def radiiPolynomial (Y₀ Z₀ : ) (Z₂ : ) (r : ) :

            Simple radii polynomial (when Z₁ = 0, corresponds to Theorem 2.4.1/2.4.2)

            This is the special case when E = F or when A† = Df(x̄) exactly.

            p(r) = Z₂(r)r² - (1 - Z₀)r + Y₀

            Equations
            Instances For
              def Z_bound (Z₀ : ) (Z₂ : ) (r : ) :

              Simple Z bound: Z(r) = Z₀ + Z₂(r)·r

              Equations
              Instances For
                theorem radiiPolynomial_is_special_case (Y₀ Z₀ : ) (Z₂ : ) (r : ) :
                radiiPolynomial Y₀ Z₀ Z₂ r = generalRadiiPolynomial Y₀ Z₀ 0 Z₂ r

                Simple radii polynomial as special case of general one

                theorem radiiPolynomial_alt_form (Y₀ Z₀ : ) (Z₂ : ) (r : ) :
                radiiPolynomial Y₀ Z₀ Z₂ r = (Z_bound Z₀ Z₂ r - 1) * r + Y₀

                Alternative form for simple polynomial

                def simpleRadiiPolynomial (Y₀ : ) (Z : ) (r : ) :

                Simple polynomial for fixed point theorem (used in Theorem 2.4.1)

                Equations
                Instances For

                  Key Implications of Radii Polynomial Negativity #

                  If p(r₀) < 0, this implies the contraction constant Z(r₀) < 1, which is the key requirement for the Banach fixed point theorem.

                  theorem general_radii_poly_neg_implies_Z_lt_one {Y₀ Z₀ Z₁ : } {Z₂ : } {r₀ : } (hY₀ : 0 Y₀) (hr₀ : 0 < r₀) (h_poly : generalRadiiPolynomial Y₀ Z₀ Z₁ Z₂ r₀ < 0) :
                  Z_bound_general Z₀ Z₁ Z₂ r₀ < 1

                  If the general radii polynomial is negative, then Z(r₀) < 1

                  theorem radii_poly_neg_implies_Z_bound_lt_one {Y₀ Z₀ : } {Z₂ : } {r₀ : } (hY₀ : 0 Y₀) (hr₀ : 0 < r₀) (h_poly : radiiPolynomial Y₀ Z₀ Z₂ r₀ < 0) :
                  Z_bound Z₀ Z₂ r₀ < 1

                  Simple version: if p(r₀) < 0 then Z(r₀) < 1

                  theorem simple_radii_poly_neg_implies_Z_lt_one {Y₀ : } {Z : } {r₀ : } (hY₀ : 0 Y₀) (hr₀ : 0 < r₀) (h_poly : simpleRadiiPolynomial Y₀ Z r₀ < 0) :
                  Z r₀ < 1

                  Simple polynomial version

                  Operator Bounds for Newton-Like Map #

                  These lemmas establish the key bounds needed to apply the contraction mapping theorem:

                  1. Y₀ bound: ‖T(x̄) - x̄‖ ≤ Y₀ (initial displacement)
                  2. Z bound: ‖DT(c)‖ ≤ Z(r) for c ∈ B̄ᵣ(x̄) (derivative bound)

                  In the general E to F setting, the derivative is DT(x) = I_E - A ∘ Df(x).

                  theorem newton_operator_Y_bound {E : Type u_1} {F : Type u_2} [NormedAddCommGroup E] [NormedSpace E] [NormedAddCommGroup F] [NormedSpace F] {f : EF} {xBar : E} {A : F →L[] E} {Y₀ : } (h_bound : A (f xBar) Y₀) :
                  have T := NewtonLikeMap A f; T xBar - xBar Y₀

                  Y₀ bound for Newton operator: ‖T(x̄) - x̄‖ ≤ Y₀

                  This reformulates equation (7.30) for the Newton-like operator. For T(x) = x - A(f(x)), we have T(x̄) - x̄ = -A(f(x̄)).

                  theorem newton_operator_fderiv {E : Type u_1} {F : Type u_2} [NormedAddCommGroup E] [NormedSpace E] [NormedAddCommGroup F] [NormedSpace F] {f : EF} {A : F →L[] E} {x : E} (hf_diff : Differentiable f) :

                  Derivative of the Newton-like operator in the E to F setting

                  For T(x) = x - A(f(x)) where f : E → F and A : F →L[ℝ] E: DT(x) = I_E - A ∘ Df(x)

                  The composition A ∘ Df(x) has type E →L[ℝ] E since:

                  • Df(x) : E →L[ℝ] F
                  • A : F →L[ℝ] E
                  • A ∘ Df(x) : E →L[ℝ] E
                  theorem newton_operator_derivative_bound_general {E : Type u_1} {F : Type u_2} [NormedAddCommGroup E] [NormedSpace E] [NormedAddCommGroup F] [NormedSpace F] {f : EF} {xBar : E} {A : F →L[] E} {A_dagger : E →L[] F} {Z₀ Z₁ : } {Z₂ : } {r : } (hf_diff : Differentiable f) (h_Z₀ : I_E - A.comp A_dagger Z₀) (h_Z₁ : A.comp (A_dagger - fderiv f xBar) Z₁) (h_Z₂ : cMetric.closedBall xBar r, A.comp (fderiv f c - fderiv f xBar) Z₂ r * r) (c : E) (hc : c Metric.closedBall xBar r) :

                  General derivative bound for Newton operator on closed ball

                  ‖DT(c)‖ ≤ Z₀ + Z₁ + Z₂(r)·r for all c ∈ B̄ᵣ(x̄)

                  This uses the decomposition (equation 7.35): DT(c) = I_E - A∘Df(c) = [I_E - A∘A†] + A∘[A† - Df(x̄)] + A∘[Df(x̄) - Df(c)]

                  Applying bounds:

                  • ‖I_E - A∘A†‖ ≤ Z₀ (eq. 7.31)
                  • ‖A∘[A† - Df(x̄)]‖ ≤ Z₁ (eq. 7.32)
                  • ‖A∘[Df(c) - Df(x̄)]‖ ≤ Z₂(r)·r (eq. 7.33)
                  theorem newton_operator_derivative_bound_simple {E : Type u_1} {F : Type u_2} [NormedAddCommGroup E] [NormedSpace E] [NormedAddCommGroup F] [NormedSpace F] {f : EF} {xBar : E} {A : F →L[] E} {Z₀ : } {Z₂ : } {r : } (hf_diff : Differentiable f) (h_Z₀ : I_E - A.comp (fderiv f xBar) Z₀) (h_Z₂ : cMetric.closedBall xBar r, A.comp (fderiv f c - fderiv f xBar) Z₂ r * r) (c : E) (hc : c Metric.closedBall xBar r) :

                  Simple derivative bound (when A† = Df(x̄), so Z₁ = 0)

                  This is used in Theorem 2.4.2 when E = F or when we can set A† = Df(x̄).

                  ‖DT(c)‖ ≤ Z₀ + Z₂(r)·r for all c ∈ B̄ᵣ(x̄)

                  Helper Lemmas for Fixed Point Theorems #

                  These technical lemmas are needed to apply the Banach fixed point theorem:

                  Closed balls in complete spaces are complete

                  theorem edist_ne_top_of_normed {E : Type u_1} [NormedAddCommGroup E] (x y : E) :

                  Extended distance is finite in normed spaces Needed to apply Banach fixed point theorem

                  theorem construct_derivative_inverse {E : Type u_1} {F : Type u_2} [NormedAddCommGroup E] [NormedSpace E] [CompleteSpace E] [NormedAddCommGroup F] [NormedSpace F] {A : F →L[] E} {B : E →L[] F} (hA_inj : Function.Injective A) (h_norm : I_E - A.comp B < 1) :

                  Construct inverse of Df(x̃) from invertibility of A∘Df(x̃)

                  Key insight: If A : F →L[ℝ] E is injective and A∘B : E →L[ℝ] E is invertible, then B : E →L[ℝ] F is invertible with inverse B⁻¹ = (A∘B)⁻¹ ∘ A.

                  This is used to show Df(x̃) is invertible without requiring A to be invertible.

                  theorem simple_maps_closedBall_to_itself {E : Type u_1} [NormedAddCommGroup E] [NormedSpace E] {T : EE} {xBar : E} {Y₀ : } {Z : } {r₀ : } (hT_diff : Differentiable T) (hr₀ : 0 < r₀) (h_bound_Y : T xBar - xBar Y₀) (h_bound_Z : cMetric.closedBall xBar r₀, fderiv T c Z r₀) (h_Z_nonneg : 0 Z r₀) (h_radii : simpleRadiiPolynomial Y₀ Z r₀ < 0) :

                  T maps the closed ball into itself when the radii polynomial is negative

                  This is a key step in applying the Banach fixed point theorem.

                  Given:

                  • ‖T(x̄) - x̄‖ ≤ Y₀ (initial displacement bound)
                  • ‖DT(c)‖ ≤ Z(r₀) for all c ∈ B̄ᵣ₀(x̄) (derivative bound)
                  • p(r₀) < 0 where p(r) = (Z(r) - 1)r + Y₀ (radii polynomial condition)

                  We prove: T : B̄ᵣ₀(x̄) → B̄ᵣ₀(x̄) (T maps the ball to itself)

                  Strategy:

                  1. From p(r₀) < 0, extract: Z(r₀)·r₀ + Y₀ < r₀
                  2. For x ∈ B̄ᵣ₀(x̄), use Mean Value Theorem: ‖T(x) - T(x̄)‖ ≤ Z(r₀)·‖x - x̄‖ ≤ Z(r₀)·r₀
                  3. Triangle inequality: ‖T(x) - x̄‖ ≤ ‖T(x) - T(x̄)‖ + ‖T(x̄) - x̄‖ ≤ Z(r₀)·r₀ + Y₀ < r₀
                  4. Therefore T(x) ∈ B̄ᵣ₀(x̄)

                  General Fixed Point Theorem (Theorem 7.6.1) #

                  This is the fixed point theorem for differentiable maps T : E → E on Banach spaces, corresponding to Theorem 7.6.1 in the informal proof.

                  Given:

                  If p(r₀) < 0, then there exists a unique fixed point x̃ ∈ B̄ᵣ₀(x̄).

                  This is used as a key step in proving Theorem 7.6.2.

                  theorem general_fixed_point_theorem {E : Type u_1} [NormedAddCommGroup E] [NormedSpace E] [CompleteSpace E] {T : EE} {xBar : E} {Y₀ : } {Z : } {r₀ : } (hT_diff : Differentiable T) (hr₀ : 0 < r₀) (h_bound_Y : T xBar - xBar Y₀) (h_bound_Z : cMetric.closedBall xBar r₀, fderiv T c Z r₀) (h_radii : simpleRadiiPolynomial Y₀ Z r₀ < 0) :
                  ∃! xTilde : E, xTilde Metric.closedBall xBar r₀ T xTilde = xTilde

                  Theorem 7.6.1: General Fixed Point Theorem for Banach Spaces

                  Let T : E → E be Fréchet differentiable and x̄ ∈ E. Suppose:

                  • ‖T(x̄) - x̄‖ ≤ Y₀ (eq. 7.27)
                  • ‖DT(x)‖ ≤ Z(r) for all x ∈ B̄ᵣ(x̄) (eq. 7.28)

                  Define p(r) := (Z(r) - 1)r + Y₀.

                  If there exists r₀ > 0 such that p(r₀) < 0, then there exists a unique x̃ ∈ B̄ᵣ₀(x̄) such that T(x̃) = x̃.

                  This is the Banach space version of Theorem 2.4.1. (which is in ℝⁿ)

                  General Radii Polynomial Theorem (Theorem 7.6.2) #

                  This is the main theorem for the E to F case, corresponding to Theorem 7.6.2 in the informal proof.

                  Given:

                  If p(r₀) < 0, then there exists a unique zero x̃ ∈ B̄ᵣ₀(x̄).

                  The proof strategy: Apply Theorem 7.6.1 to the Newton-like operator T(x) = x - A(f(x)), then use Proposition 2.3.1 to convert the fixed point to a zero.

                  Note: We don't claim Df(x̃) is invertible in this general version without additional assumptions. The derivative Df(x̃) : E →L[ℝ] F may not have an inverse in the usual sense when E and F are different.

                  theorem general_radii_polynomial_theorem {E : Type u_1} {F : Type u_2} [NormedAddCommGroup E] [NormedSpace E] [CompleteSpace E] [NormedAddCommGroup F] [NormedSpace F] {f : EF} {xBar : E} {A : F →L[] E} {A_dagger : E →L[] F} {Y₀ Z₀ Z₁ : } {Z₂ : } {r₀ : } (hr₀ : 0 < r₀) (h_Y₀ : A (f xBar) Y₀) (h_Z₀ : I_E - A.comp A_dagger Z₀) (h_Z₁ : A.comp (A_dagger - fderiv f xBar) Z₁) (h_Z₂ : cMetric.closedBall xBar r₀, A.comp (fderiv f c - fderiv f xBar) Z₂ r₀ * r₀) (hf_diff : Differentiable f) (h_radii : generalRadiiPolynomial Y₀ Z₀ Z₁ Z₂ r₀ < 0) (hA_inj : Function.Injective A) :
                  ∃! xTilde : E, xTilde Metric.closedBall xBar r₀ f xTilde = 0

                  Theorem 7.6.2: General Radii Polynomial Theorem for E to F maps

                  Given f : E → F with E, F Banach spaces, approximate inverse A : F →L[ℝ] E, and approximate derivative A† : E →L[ℝ] F satisfying:

                  • ‖A(f(x̄))‖ ≤ Y₀ (eq. 7.30: initial defect)
                  • ‖I_E - A∘A†‖ ≤ Z₀ (eq. 7.31: AA† close to identity)
                  • ‖A∘[Df(x̄) - A†]‖ ≤ Z₁ (eq. 7.32: A† approximates Df(x̄))
                  • ‖A∘[Df(c) - Df(x̄)]‖ ≤ Z₂(r)·r for c ∈ B̄ᵣ(x̄) (eq. 7.33: Lipschitz bound)

                  If p(r₀) < 0 where p(r) = Z₂(r)r² - (1-Z₀-Z₁)r + Y₀ (eq. 7.34), then there exists a unique x̃ ∈ B̄ᵣ₀(x̄) with f(x̃) = 0.

                  RM: It turns out we only need need there exists some r₀ > 0 such that p(r₀) < 0, not that p(r) < 0 for all r ∈ (0, r₀]. This is a slight generalization over the original statement.

                  Proof strategy:

                  1. Define Newton-like operator T(x) = x - A(f(x))
                  2. Show T satisfies conditions of Theorem 7.6.1 (general_fixed_point_theorem)
                  3. Apply Theorem 7.6.1 to get unique fixed point x̃
                  4. Use Proposition 2.3.1 to show f(x̃) = 0

                  The key requirement is that A is injective.

                  Simplified Radii Polynomial Theorem #

                  This is the simpler version when we don't need an explicit A†, or when we can effectively set A† = Df(x̄). This corresponds to Theorem 2.4.2 in the original formalization.

                  The key simplification: Z₁ = 0, so the polynomial becomes: p(r) = Z₂(r)r² - (1 - Z₀)r + Y₀

                  This still works for f : E → F with different spaces, but requires tighter bounds.

                  theorem simple_radii_polynomial_theorem_EtoF {E : Type u_1} {F : Type u_2} [NormedAddCommGroup E] [NormedSpace E] [CompleteSpace E] [NormedAddCommGroup F] [NormedSpace F] {f : EF} {xBar : E} {A : F →L[] E} {Y₀ Z₀ : } {Z₂ : } {r₀ : } (hr₀ : 0 < r₀) (h_Y₀ : A (f xBar) Y₀) (h_Z₀ : I_E - A.comp (fderiv f xBar) Z₀) (h_Z₂ : cMetric.closedBall xBar r₀, A.comp (fderiv f c - fderiv f xBar) Z₂ r₀ * r₀) (hf_diff : Differentiable f) (h_radii : radiiPolynomial Y₀ Z₀ Z₂ r₀ < 0) (hA_inj : Function.Injective A) :
                  ∃! xTilde : E, xTilde Metric.closedBall xBar r₀ f xTilde = 0

                  Theorem 2.4.2 (Generalized): Simple Radii Polynomial Theorem for E to F

                  Given f : E → F and approximate inverse A : F →L[ℝ] E satisfying:

                  • ‖A(f(x̄))‖ ≤ Y₀ (eq. 2.14)
                  • ‖I_E - A∘Df(x̄)‖ ≤ Z₀ (eq. 2.15)
                  • ‖A∘[Df(c) - Df(x̄)]‖ ≤ Z₂(r)·r for c ∈ B̄ᵣ(x̄) (eq. 2.16)

                  If p(r₀) < 0 where p(r) = Z₂(r)r² - (1-Z₀)r + Y₀ (eq. 2.17), then there exists a unique x̃ ∈ B̄ᵣ₀(x̄) with f(x̃) = 0.

                  This is a special case of the general theorem with A† = Df(x̄) (so Z₁ = 0).

                  Proof strategy:

                  1. Define Newton-like operator T(x) = x - A(f(x))
                  2. Apply Theorem 7.6.1 (general_fixed_point_theorem) to T
                  3. Use Proposition 2.3.1 to convert fixed point to zero
                  theorem simple_radii_polynomial_theorem_same_space {E : Type u_1} [NormedAddCommGroup E] [NormedSpace E] [CompleteSpace E] {f : EE} {xBar : E} {A : E →L[] E} {Y₀ Z₀ : } {Z₂ : } {r₀ : } (hr₀ : 0 < r₀) (h_Y₀ : A (f xBar) Y₀) (h_Z₀ : I_E - A.comp (fderiv f xBar) Z₀) (h_Z₂ : cMetric.closedBall xBar r₀, A.comp (fderiv f c - fderiv f xBar) Z₂ r₀ * r₀) (hf_diff : Differentiable f) (h_radii : radiiPolynomial Y₀ Z₀ Z₂ r₀ < 0) (hA_inj : Function.Injective A) :
                  ∃! xTilde : E, xTilde Metric.closedBall xBar r₀ f xTilde = 0 (fderiv f xTilde).IsInvertible

                  Version for same space (E = F) with invertibility claim

                  When E = F and Df(x̃) : E →L[ℝ] E, we can additionally prove that Df(x̃) is invertible using the Neumann series.

                  Proof strategy:

                  1. Apply Theorem 7.6.1 (general_fixed_point_theorem) to get fixed point x̃
                  2. Use Proposition 2.3.1 to show f(x̃) = 0
                  3. Show ‖I_E - A∘Df(x̃)‖ < 1 using the derivative bound
                  4. Apply Neumann series to construct inverse of Df(x̃)