
《Learning Haskell Data Analysis》

  • Language: Others
  • File size: 6.79M
  • Downloads: 2
  • Views: 21
  • Published: 2023-01-16
  • Category: General programming questions
  • Uploaded by: 老刘
  • File format: .pdf
  • Points required: 0

Description

[Summary] 《Learning Haskell Data Analysis》

[Core content]

Table of Contents
Preface vii
Chapter 1: Tools of the Trade 1
Welcome to Haskell and data analysis! 1
Why Haskell? 3
Getting ready 5
Installing the Haskell platform on Linux 5
The software used in addition to Haskell 7
SQLite3 7
Gnuplot 7
LAPACK 8
Nearly essential tools of the trade 8
Version control software – Git 8
Tmux 10
Our first Haskell program 11
Interactive Haskell 15
An introductory problem 16
Summary 18
Chapter 2: Getting Our Feet Wet 19
Type is king – the implications of strict types in Haskell 19
Computing the mean of a list 20
Computing the sum of a list 20
Computing the length of a list 21
Attempting to compute the mean results in an error 21
Introducing the Fractional class 21
The fromIntegral and realToFrac functions 22
Creating our average function 22
The genericLength function 23
Metadata is just as important as data 24
www.it-ebooks.info
Working with csv files 25
Preparing our environment 25
Describing our needs 26
Crafting our solution 26
Finding the column index of the specified column 27
The Maybe and Either monads 29
Applying a function to a specified column 30
Converting csv files to the SQLite3 format 33
Preparing our environment 33
Describing our needs 34
Inspecting column information 34
Crafting our functions 36
Summary 39
Chapter 3: Cleaning Our Datasets 41
Structured versus unstructured datasets 41
How data analysis differs from pattern recognition 42
Creating your own structured data 43
Counting the number of fields in each record 43
Filtering data using regular expressions 45
Creating a simplified version of grep in Haskell 46
Exhibit A – a horrible customer database 47
Searching fields based on a regular expression 48
Locating empty fields in a csv file based on a regular expression 52
Crafting a regular expression to match dates 53
Summary 55
Chapter 4: Plotting 57
Plotting data with EasyPlot 57
Simplifying access to data in SQLite3 59
Plotting data from a SQLite3 database 60
Exploring the EasyPlot library 62
Plotting a subset of a dataset 64
Plotting data passed through a function 66
Plotting multiple datasets 69
Plotting a moving average 72
Plotting a scatterplot 74
Summary 76
Chapter 5: Hypothesis Testing 77
Data in a coin 77
Hypothesis test 78
Establishing the magic coin test 78
Understanding data variance 79
Probability mass function 80
Determining our test interval 82
Establishing the parameters of the experiment 83
Introducing System.Random 83
Performing the experiment 84
Does a home-field advantage really exist? 84
Converting the data to SQLite3 85
Exploring the data 86
Plotting what looks interesting 87
Returning to our test 90
The standard deviation 90
The standard error 91
The confidence interval 91
An introduction to the Erf module 94
Using Erf to test the claim 95
A discussion of the test 96
Summary 96
Chapter 6: Correlation and Regression Analysis 97
The terminology of correlation and regression 98
The expectation of a variable 98
The variance of a variable 99
Normalizing a variable 100
The covariance of two variables 100
Finding the Pearson r correlation coefficient 101
Finding the Pearson r² correlation coefficient 102
Translating what we've learned to Haskell 102
Study – is there a connection between scoring and winning? 103
A consideration before we dive in – do any games end in a tie? 103
Compiling the essential data 104
Searching for outliers 105
Plot – runs per game versus the win percentage of each team 106
Performing correlation analysis 107
Regression analysis 107
The regression equation line 108
Estimating the regression equation 108
Translate the formulas to Haskell 109
Returning to the baseball analysis 109
Plotting the baseball analysis with the regression line 110
The pitfalls of regression analysis 111
Summary 113
Chapter 7: Naive Bayes Classification of Twitter Data 115
An introduction to Naive Bayes classification 117
Prior knowledge 117
Likelihood 118
Evidence 118
Putting the parts of the Bayes theorem together 119
Creating a Twitter application 119
Communicating with Twitter 120
Creating a database to collect tweets 123
A frequency study of tweets 125
Cleaning our tweets 126
Creating our feature vectors 126
Writing the code for the Bayes theorem 128
Creating a Naive Bayes classifier with multiple features 130
Testing our classifier 133
Summary 135
Chapter 8: Building a Recommendation Engine 137
Analyzing the frequency of words in tweets 140
A note on the importance of removing stop words 141
Working with multivariate data 143
Describing bivariate and multivariate data 144
Eigenvalues and eigenvectors 145
The airplane analogy 146
Preparing our environment 148
Performing linear algebra in Haskell 148
Computing the covariance matrix of a dataset 149
Discovering eigenvalues and eigenvectors in Haskell 151
Principal Component Analysis in Haskell 153
Building a recommendation engine 155
Finding the nearest neighbors 155
Testing our recommendation engine 157
Summary 159
Appendix: Regular Expressions in Haskell 161
A crash course in regular expressions 161
The three repetition modifiers 162
Anchors 163
The dot 164
Character classes 165
Groups 166
Alternations 166
A note on regular expressions 167
Index 169
