Building a Machine Learning System with Python in Six Steps (Building Machine Learning Systems with Python_ Sec.pdf)

  • Development language: Python
  • File size: 6.90 MB
  • Downloads: 16
  • Views: 164
  • Published: 2020-02-27
  • Category: Python language basics
  • Posted by: lao.
  • File format: .pdf
  • Points required: 1
 Tags: pdf

Description

[Overview] Building Machine Learning Systems with Python_ Sec.pdf

[Core content] The book's table of contents:

Table of Contents
Preface vii
Chapter 1: Getting Started with Python Machine Learning 1
Machine learning and Python – a dream team 2
What the book will teach you (and what it will not) 3
What to do when you are stuck 4
Getting started 5
Introduction to NumPy, SciPy, and matplotlib 6
Installing Python 6
Chewing data efficiently with NumPy and intelligently with SciPy 6
Learning NumPy 7
Indexing 9
Handling nonexisting values 10
Comparing the runtime 11
Learning SciPy 12
Our first (tiny) application of machine learning 13
Reading in the data 14
Preprocessing and cleaning the data 15
Choosing the right model and learning algorithm 17
Before building our first model… 18
Starting with a simple straight line 18
Towards some advanced stuff 20
Stepping back to go forward – another look at our data 22
Training and testing 26
Answering our initial question 27
Summary 28
Chapter 2: Classifying with Real-world Examples 29
The Iris dataset 30
Visualization is a good first step 30
Building our first classification model 32
Evaluation – holding out data and cross-validation 36
Building more complex classifiers 39
A more complex dataset and a more complex classifier 41
Learning about the Seeds dataset 41
Features and feature engineering 42
Nearest neighbor classification 43
Classifying with scikit-learn 43
Looking at the decision boundaries 45
Binary and multiclass classification 47
Summary 49
Chapter 3: Clustering – Finding Related Posts 51
Measuring the relatedness of posts 52
How not to do it 52
How to do it 53
Preprocessing – similarity measured as a similar number of common words 54
Converting raw text into a bag of words 54
Counting words 55
Normalizing word count vectors 58
Removing less important words 59
Stemming 60
Stop words on steroids 63
Our achievements and goals 65
Clustering 66
K-means 66
Getting test data to evaluate our ideas on 70
Clustering posts 72
Solving our initial challenge 73
Another look at noise 75
Tweaking the parameters 76
Summary 77
Chapter 4: Topic Modeling 79
Latent Dirichlet allocation 80
Building a topic model 81
Comparing documents by topics 86
Modeling the whole of Wikipedia 89
Choosing the number of topics 92
Summary 94
Chapter 5: Classification – Detecting Poor Answers 95
Sketching our roadmap 96
Learning to classify classy answers 96
Tuning the instance 96
Tuning the classifier 96
Fetching the data 97
Slimming the data down to chewable chunks 98
Preselection and processing of attributes 98
Defining what is a good answer 100
Creating our first classifier 100
Starting with kNN 100
Engineering the features 101
Training the classifier 103
Measuring the classifier's performance 103
Designing more features 104
Deciding how to improve 107
Bias-variance and their tradeoff 108
Fixing high bias 108
Fixing high variance 109
High bias or low bias 109
Using logistic regression 112
A bit of math with a small example 112
Applying logistic regression to our post classification problem 114
Looking behind accuracy – precision and recall 116
Slimming the classifier 120
Ship it! 121
Summary 121
Chapter 6: Classification II – Sentiment Analysis 123
Sketching our roadmap 123
Fetching the Twitter data 124
Introducing the Naïve Bayes classifier 124
Getting to know the Bayes' theorem 125
Being naïve 126
Using Naïve Bayes to classify 127
Accounting for unseen words and other oddities 131
Accounting for arithmetic underflows 132
Creating our first classifier and tuning it 134
Solving an easy problem first 135
Using all classes 138
Tuning the classifier's parameters 141
Cleaning tweets 146
Taking the word types into account 148
Determining the word types 148
Successfully cheating using SentiWordNet 150
Our first estimator 152
Putting everything together 155
Summary 156
Chapter 7: Regression 157
Predicting house prices with regression 157
Multidimensional regression 161
Cross-validation for regression 162
Penalized or regularized regression 163
L1 and L2 penalties 164
Using Lasso or ElasticNet in scikit-learn 165
Visualizing the Lasso path 166
P-greater-than-N scenarios 167
An example based on text documents 168
Setting hyperparameters in a principled way 170
Summary 174
Chapter 8: Recommendations 175
Rating predictions and recommendations 175
Splitting into training and testing 177
Normalizing the training data 178
A neighborhood approach to recommendations 180
A regression approach to recommendations 184
Combining multiple methods 186
Basket analysis 188
Obtaining useful predictions 190
Analyzing supermarket shopping baskets 190
Association rule mining 194
More advanced basket analysis 196
Summary 197
Chapter 9: Classification – Music Genre Classification 199
Sketching our roadmap 199
Fetching the music data 200
Converting into a WAV format 200
Looking at music 201
Decomposing music into sine wave components 203
Using FFT to build our first classifier 205
Increasing experimentation agility 205
Training the classifier 207
Using a confusion matrix to measure accuracy in multiclass problems 207
An alternative way to measure classifier performance using receiver-operator characteristics 210
Improving classification performance with Mel Frequency Cepstral Coefficients 214
Summary 218
Chapter 10: Computer Vision 219
Introducing image processing 219
Loading and displaying images 220
Thresholding 222
Gaussian blurring 223
Putting the center in focus 225
Basic image classification 228
Computing features from images 229
Writing your own features 230
Using features to find similar images 232
Classifying a harder dataset 234
Local feature representations 235
Summary 239
Chapter 11: Dimensionality Reduction 241
Sketching our roadmap 242
Selecting features 242
Detecting redundant features using filters 242
Correlation 243
Mutual information 246
Asking the model about the features using wrappers 251
Other feature selection methods 253
Feature extraction 254
About principal component analysis 254
Sketching PCA 255
Applying PCA 255
Limitations of PCA and how LDA can help 257
Multidimensional scaling 258
Summary 262
Chapter 12: Bigger Data 263
Learning about big data 264
Using jug to break up your pipeline into tasks 264
An introduction to tasks in jug 265
Looking under the hood 268
Using jug for data analysis 269
Reusing partial results 272
Using Amazon Web Services 274
Creating your first virtual machines 276
Installing Python packages on Amazon Linux 282
Running jug on our cloud machine 283
Automating the generation of clusters with StarCluster 284
Summary 288
Appendix: Where to Learn More Machine Learning 291
Online courses 291
Books 291
Question and answer sites 292
Blogs 292
Data sources 293
Getting competitive 293
All that was left out 293
Summary 294
Index 295
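
As a rough illustration of the kind of first exercise Chapter 1 walks through ("Starting with a simple straight line"), here is a minimal NumPy sketch. It is not taken from the PDF; the data and variable names are hypothetical, synthetic stand-ins.

    import numpy as np

    # Hypothetical data: 100 hourly measurements with a roughly linear trend plus noise
    x = np.arange(1, 101)
    rng = np.random.default_rng(0)
    y = 3.5 * x + 20.0 + rng.normal(0.0, 10.0, size=x.shape)

    # Fit a degree-1 polynomial (a straight line) by least squares
    coeffs = np.polyfit(x, y, deg=1)
    line = np.poly1d(coeffs)

    # Inspect the fitted parameters and the squared error of the fit
    print("slope, intercept:", coeffs)
    print("sum of squared errors:", float(np.sum((y - line(x)) ** 2)))

The book then extends this idea with higher-degree polynomial models and a proper train/test split (see the entries "Towards some advanced stuff" and "Training and testing" above).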
