Resource Introduction
An excellent regularization textbook from Stanford (Statistical Learning with Sparsity: The Lasso and Generalizations, by Hastie, Tibshirani, and Wainwright). It covers the relevant statistical background as well as sparsity, and gives a clear account of much of the underlying theory.
Contents

Preface
1 Introduction
2 The Lasso for Linear Models
  2.1 Introduction
  2.2 The Lasso Estimator
  2.3 Cross-Validation and Inference
  2.4 Computation of the Lasso Solution
    2.4.1 Single Predictor: Soft Thresholding
    2.4.2 Multiple Predictors: Cyclic Coordinate Descent
    2.4.3 Soft-Thresholding and Orthogonal Bases
  2.5 Degrees of Freedom
  2.6 Uniqueness of the Lasso Solutions
  2.7 A Glimpse at the Theory
  2.8 The Nonnegative Garrote
  2.9 ℓq Penalties and Bayes Estimates
  2.10 Some Perspective
  Exercises
3 Generalized Linear Models
  3.1 Introduction
  3.2 Logistic Regression
    3.2.1 Example: Document Classification
    3.2.2 Algorithms
  3.3 Multiclass Logistic Regression
    3.3.1 Example: Handwritten Digits
    3.3.2 Algorithms
    3.3.3 Grouped-Lasso Multinomial
  3.4 Log-Linear Models and the Poisson GLM
    3.4.1 Example: Distribution Smoothing
  3.5 Cox Proportional Hazards Models
    3.5.1 Cross-Validation
    3.5.2 Pre-Validation
  3.6 Support Vector Machines
    3.6.1 Logistic Regression with Separable Data
  3.7 Computational Details and glmnet
  Bibliographic Notes
  Exercises
4 Generalizations of the Lasso Penalty
  4.1 Introduction
  4.2 The Elastic Net
  4.3 The Group Lasso
    4.3.1 Computation for the Group Lasso
    4.3.2 Sparse Group Lasso
    4.3.3 The Overlap Group Lasso
  4.4 Sparse Additive Models and the Group Lasso
    4.4.1 Additive Models and Backfitting
    4.4.2 Sparse Additive Models and Backfitting
    4.4.3 Approaches Using Optimization and the Group Lasso
    4.4.4 Multiple Penalization for Sparse Additive Models
  4.5 The Fused Lasso
    4.5.1 Fitting the Fused Lasso
      4.5.1.1 Reparametrization
      4.5.1.2 A Path Algorithm
      4.5.1.3 A Dual Path Algorithm
      4.5.1.4 Dynamic Programming for the Fused Lasso
    4.5.2 Trend Filtering
    4.5.3 Nearly Isotonic Regression
  4.6 Nonconvex Penalties
  Bibliographic Notes
  Exercises
5 Optimization Methods
  5.1 Introduction
  5.2 Convex Optimality Conditions
    5.2.1 Optimality for Differentiable Problems
    5.2.2 Nondifferentiable Functions and Subgradients
  5.3 Gradient Descent
    5.3.1 Unconstrained Gradient Descent
    5.3.2 Projected Gradient Methods
    5.3.3 Proximal Gradient Methods
    5.3.4 Accelerated Gradient Methods
  5.4 Coordinate Descent
    5.4.1 Separability and Coordinate Descent
    5.4.2 Linear Regression and the Lasso
    5.4.3 Logistic Regression and Generalized Linear Models
  5.5 A Simulation Study
  5.6 Least Angle Regression
  5.7 Alternating Direction Method of Multipliers
  5.8 Minorization-Maximization Algorithms
  5.9 Biconvexity and Alternating Minimization
  5.10 Screening Rules
  Bibliographic Notes
  Appendix
  Exercises
6 Statistical Inference
  6.1 The Bayesian Lasso
  6.2 The Bootstrap
  6.3 Post-Selection Inference for the Lasso
    6.3.1 The Covariance Test
    6.3.2 A General Scheme for Post-Selection Inference
      6.3.2.1 Fixed-λ Inference for the Lasso
      6.3.2.2 The Spacing Test for LAR
    6.3.3 What Hypothesis Is Being Tested?
    6.3.4 Back to Forward Stepwise Regression
  6.4 Inference via a Debiased Lasso
  6.5 Other Proposals for Post-Selection Inference
  Bibliographic Notes
  Exercises
7 Matrix Decompositions, Approximations, and Completion
  7.1 Introduction
  7.2 The Singular Value Decomposition
  7.3 Missing Data and Matrix Completion
    7.3.1 The Netflix Movie Challenge
    7.3.2 Matrix Completion Using Nuclear Norm
    7.3.3 Theoretical Results for Matrix Completion
    7.3.4 Maximum Margin Factorization and Related Methods
  7.4 Reduced-Rank Regression
  7.5 A General Matrix Regression Framework
  7.6 Penalized Matrix Decomposition
  7.7 Additive Matrix Decomposition
  Bibliographic Notes
  Exercises
8 Sparse Multivariate Methods
  8.1 Introduction
  8.2 Sparse Principal Components Analysis
    8.2.1 Some Background
    8.2.2 Sparse Principal Components
      8.2.2.1 Sparsity from Maximum Variance
      8.2.2.2 Methods Based on Reconstruction
    8.2.3 Higher-Rank Solutions
      8.2.3.1 Illustrative Application of Sparse PCA
    8.2.4 Sparse PCA via Fantope Projection
    8.2.5 Sparse Autoencoders and Deep Learning
    8.2.6 Some Theory for Sparse PCA
  8.3 Sparse Canonical Correlation Analysis
    8.3.1 Example: Netflix Movie Rating Data
  8.4 Sparse Linear Discriminant Analysis
    8.4.1 Normal Theory and Bayes' Rule
    8.4.2 Nearest Shrunken Centroids
    8.4.3 Fisher's Linear Discriminant Analysis
      8.4.3.1 Example: Simulated Data with Five Classes
    8.4.4 Optimal Scoring
      8.4.4.1 Example: Face Silhouettes
  8.5 Sparse Clustering
    8.5.1 Some Background on Clustering
      8.5.1.1 Example: Simulated Data with Six Classes
    8.5.2 Sparse Hierarchical Clustering
    8.5.3 Sparse K-Means Clustering
    8.5.4 Convex Clustering
  Bibliographic Notes
  Exercises
9 Graphs and Model Selection
  9.1 Introduction
  9.2 Basics of Graphical Models
    9.2.1 Factorization and Markov Properties
      9.2.1.1 Factorization Property
      9.2.1.2 Markov Property
      9.2.1.3 Equivalence of Factorization and Markov Properties
    9.2.2 Some Examples
      9.2.2.1 Discrete Graphical Models
      9.2.2.2 Gaussian Graphical Models
  9.3 Graph Selection via Penalized Likelihood
    9.3.1 Global Likelihoods for Gaussian Models
    9.3.2 Graphical Lasso Algorithm
    9.3.3 Exploiting Block-Diagonal Structure
    9.3.4 Theoretical Guarantees for the Graphical Lasso
    9.3.5 Global Likelihood for Discrete Models
  9.4 Graph Selection via Conditional Inference
    9.4.1 Neighborhood-Based Likelihood for Gaussians
    9.4.2 Neighborhood-Based Likelihood for Discrete Models
    9.4.3 Pseudo-Likelihood for Mixed Models
  9.5 Graphical Models with Hidden Variables
  Bibliographic Notes
  Exercises
10 Signal Approximation and Compressed Sensing
  10.1 Introduction
  10.2 Signals and Sparse Representations
    10.2.1 Orthogonal Bases
    10.2.2 Approximation in Orthogonal Bases
    10.2.3 Reconstruction in Overcomplete Bases
  10.3 Random Projection and Approximation
    10.3.1 Johnson–Lindenstrauss Approximation
    10.3.2 Compressed Sensing
  10.4 Equivalence between ℓ0 and ℓ1 Recovery
    10.4.1 Restricted Nullspace Property
    10.4.2 Sufficient Conditions for Restricted Nullspace
    10.4.3 Proofs
      10.4.3.1 Proof of Theorem 10.1
      10.4.3.2 Proof of Proposition 10.1
  Bibliographic Notes
  Exercises
11 Theoretical Results for the Lasso
  11.1 Introduction
    11.1.1 Types of Loss Functions
    11.1.2 Types of Sparsity Models
  11.2 Bounds on Lasso ℓ2-Error
    11.2.1 Strong Convexity in the Classical Setting
    11.2.2 Restricted Eigenvalues for Regression
    11.2.3 A Basic Consistency Result
  11.3 Bounds on Prediction Error
  11.4 Support Recovery in Linear Regression
    11.4.1 Variable-Selection Consistency for the Lasso
      11.4.1.1 Some Numerical Studies
    11.4.2 Proof of Theorem 11.3
  11.5 Beyond the Basic Lasso
  Bibliographic Notes
  Exercises
Bibliography
Author Index
Index

Preface

In this monograph, we have attempted to summarize the actively developing field of statistical learning with sparsity. A sparse statistical model is one having only a small number of nonzero parameters or weights. It represents a classic case of "less is more": a sparse model can be much easier to estimate and interpret than a dense model. In this age of big data, the number of features measured on a person or object can be large, and might be larger than the number of observations. The sparsity assumption allows us to tackle such problems and extract useful and reproducible patterns from big datasets.

The ideas described here represent the work of an entire community of researchers in statistics and machine learning, and we thank everyone for their continuing contributions to this exciting area. We particularly thank our colleagues at Stanford, Berkeley, and elsewhere; our collaborators; and our past and current students working in this area. These include Alekh Agarwal, Arash Amini, Francis Bach, Jacob Bien, Stephen Boyd, Andreas Buja, Emmanuel Candès, Alexandra Chouldechova, David Donoho, John Duchi, Brad Efron, Will Fithian, Jerome Friedman, Max G'Sell, Iain Johnstone, Michael Jordan, Ping Li, Po-Ling Loh, Michael Lim, Jason Lee, Richard Lockhart, Rahul Mazumder, Balasubramanian Narasimhan, Sahand Negahban, Guillaume Obozinski, Mee-Young Park, Junyang Qian, Garvesh Raskutti, Pradeep Ravikumar, Saharon Rosset, Prasad Santhanam, Noah Simon, Dennis Sun, Yukai Sun, Jonathan Taylor, Ryan Tibshirani,¹ Stefan Wager, Daniela Witten, Bin Yu, Yuchen Zhang, Ji Zhou, and Hui Zou. We also thank our editor John Kimmel for his advice and support.

Stanford University: Trevor Hastie, Robert Tibshirani
University of California, Berkeley: Martin Wainwright

¹ Some of the bibliographic references, for example in Chapters 4 and 6, are to Tibshirani2, R.J., rather than Tibshirani, R.; the former is Ryan Tibshirani, the latter is Robert (son and father).
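To give a flavor of the book's core material, here is a minimal sketch (not taken from the book) of the lasso fitted by cyclic coordinate descent with soft thresholding, the approach outlined in Sections 2.4.1 and 2.4.2 of the contents above. The function names and the synthetic data are illustrative assumptions; the book's own computational tool is the R package glmnet (Section 3.7).

```python
import numpy as np

def soft_threshold(z, gamma):
    # Soft-thresholding operator: S(z, gamma) = sign(z) * max(|z| - gamma, 0)
    return np.sign(z) * np.maximum(np.abs(z) - gamma, 0.0)

def lasso_cd(X, y, lam, n_iter=200):
    # Cyclic coordinate descent for (1/(2n)) * ||y - X b||^2 + lam * ||b||_1.
    # Assumes the columns of X are centered and scaled and y is centered.
    n, p = X.shape
    beta = np.zeros(p)
    for _ in range(n_iter):
        for j in range(p):
            # Partial residual with predictor j removed from the current fit
            r_j = y - X @ beta + X[:, j] * beta[j]
            rho_j = X[:, j] @ r_j / n
            beta[j] = soft_threshold(rho_j, lam)  # unit-variance columns, so no rescaling
    return beta

# Hypothetical synthetic example: only 2 of 10 true coefficients are nonzero.
rng = np.random.default_rng(0)
X = rng.standard_normal((200, 10))
X = (X - X.mean(axis=0)) / X.std(axis=0)
y = 3.0 * X[:, 0] - 2.0 * X[:, 1] + 0.5 * rng.standard_normal(200)
y -= y.mean()
print(np.round(lasso_cd(X, y, lam=0.1), 2))  # most coefficients shrink exactly to zero
```

The ℓ1 penalty is what produces the exact zeros in the printed coefficient vector; with a ridge (ℓ2) penalty the small coefficients would merely shrink toward zero without vanishing.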