Question
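Let $f$ be a differentiable and $\mu$-strongly-convex function whose minimum is achieved at $x^*$. Let us assume that the variance on the gradients is controlled: there exist $\sigma > 0$ and $L \ge 0$ such that

$$\mathbb{E}_i\left[\|\nabla f_i(x_k)\|^2 \mid x_k\right] \le \sigma^2 + L\,\|x_k - x^*\|^2.$$

Prove the following statements:

1. If $\sigma > 0$ and $L = 0$, SGD with step size $\eta_k$ satisfies

$$\mathbb{E}\left[f(z_k) - f^*\right] \le \frac{\mathbb{E}\|x_0 - x^*\|^2 + \sigma^2 \sum_{j=0}^{k} \eta_j^2}{2 \sum_{j=0}^{k} \eta_j} \qquad (1)$$

where

$$z_k = \frac{\sum_{j=0}^{k} \eta_j x_j}{\sum_{j=0}^{k} \eta_j}. \qquad (2)$$

In particular, $\mathbb{E}[f(z_k) - f^*]$ converges to 0 if and only if $\sum_j \eta_j = \infty$ and $\lim_{k \to \infty} \frac{\sum_{j=0}^{k} \eta_j^2}{\sum_{j=0}^{k} \eta_j} = 0$.

2. If $\sigma > 0$ and $L > 0$, SGD with a constant step size $\eta$ satisfies

$$\mathbb{E}\|x_{k+1} - x^*\|^2 \le (1 - 2\eta\mu + \eta^2 L)^{k+1}\, \mathbb{E}\|x_0 - x^*\|^2 + \frac{\eta \sigma^2}{2\mu - \eta L}. \qquad (3)$$

What is the restriction on the step size?

3. Observe that, by definition, SGD with step size $\eta_k$ satisfies

$$\|x_{k+1} - x^*\|^2 = \|x_k - x^*\|^2 + \eta_k^2\, \|\nabla f_i(x_k)\|^2 - 2\eta_k \langle x_k - x^*, \nabla f_i(x_k) \rangle. \qquad (4)$$

Derive the optimal step size and comment on it.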
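
As a numerical sanity check on the bound in statement 2, here is a minimal Python sketch. Everything in it is an illustrative assumption rather than part of the exercise: the one-dimensional finite-sum objective $f(x) = \frac{1}{n}\sum_i \frac{1}{2} a_i (x - b_i)^2$ with uniform sampling of $i$, the function names, and the constants $L = \frac{2}{n}\sum_i a_i^2$ and $\sigma^2 = \frac{2}{n}\sum_i a_i^2 (x^* - b_i)^2$, which follow from $(u+v)^2 \le 2u^2 + 2v^2$. It also assumes $\eta < 2\mu/L$, so that the contraction factor $1 - 2\eta\mu + \eta^2 L$ is below 1, and it reads the exponent in (3) as $k+1$. Rather than sampling, it propagates the first and second moments of the error $x_k - x^*$ exactly, so the comparison involves no Monte Carlo noise.

```python
def problem_constants(a, b):
    """Return (x_star, mu, L, sigma2) for f(x) = mean_i 0.5 * a_i * (x - b_i)**2."""
    n = len(a)
    x_star = sum(ai * bi for ai, bi in zip(a, b)) / sum(a)  # minimizer of f
    mu = sum(a) / n  # f''(x) = mean(a_i) is the strong-convexity constant
    # (u + v)^2 <= 2u^2 + 2v^2 applied to a_i * ((x - x_star) + (x_star - b_i))
    L = 2 * sum(ai * ai for ai in a) / n
    sigma2 = 2 * sum((ai * (x_star - bi)) ** 2 for ai, bi in zip(a, b)) / n
    return x_star, mu, L, sigma2


def exact_second_moments(a, b, x0, eta, steps):
    """Exact values of E|x_k - x_star|^2 for k = 0..steps under constant-step SGD."""
    n = len(a)
    x_star = problem_constants(a, b)[0]
    # One SGD step on the error e = x - x_star reads e_next = p_i * e - q_i,
    # where i is drawn uniformly from the n components.
    p = [1 - eta * ai for ai in a]
    q = [eta * ai * (x_star - bi) for ai, bi in zip(a, b)]
    mean_p = sum(p) / n
    mean_q = sum(q) / n  # zero up to rounding, since x_star is optimal
    mean_pp = sum(pi * pi for pi in p) / n
    mean_pq = sum(pi * qi for pi, qi in zip(p, q)) / n
    mean_qq = sum(qi * qi for qi in q) / n
    m1 = x0 - x_star          # E[e_k]
    m2 = (x0 - x_star) ** 2   # E[e_k^2]
    moments = [m2]
    for _ in range(steps):
        # Both right-hand sides use the moments from the previous iterate.
        m1, m2 = mean_p * m1 - mean_q, mean_pp * m2 - 2 * mean_pq * m1 + mean_qq
        moments.append(m2)
    return moments


def bound_3(k, m2_0, eta, mu, L, sigma2):
    """Right-hand side of (3) for E|x_{k+1} - x_star|^2, assuming eta < 2 * mu / L."""
    rho = 1 - 2 * eta * mu + eta ** 2 * L
    return rho ** (k + 1) * m2_0 + eta * sigma2 / (2 * mu - eta * L)


if __name__ == "__main__":
    a, b = [1.0, 2.0, 3.0], [0.0, 1.0, -1.0]  # hypothetical problem data
    x_star, mu, L, sigma2 = problem_constants(a, b)
    eta = 0.1  # chosen below 2 * mu / L for this data
    moments = exact_second_moments(a, b, x0=2.0, eta=eta, steps=30)
    for k in (0, 9, 29):
        print(k + 1, moments[k + 1], bound_3(k, moments[0], eta, mu, L, sigma2))
```

Running the script prints the exact second moment next to the right-hand side of (3) for a few iterates. On this toy problem the first term decays geometrically while the second stays fixed at $\eta\sigma^2/(2\mu - \eta L)$, which illustrates that a constant step size reaches only a neighborhood of $x^*$ whose size shrinks with $\eta$.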