Example description
If you know how to program with Python and also know a little about probability, you’re ready to tackle Bayesian statistics. With this book, you'll learn how to solve statistical problems with Python code instead of mathematical notation, and use discrete probability distributions instead of continu
Think Bayes
Bayesian Statistics Made Simple
Version 1.0.1

Allen B. Downey
Green Tea Press
Needham, Massachusetts

Copyright 2012 Allen B. Downey

Green Tea Press
9 Washburn Ave
Needham MA 02492

Permission is granted to copy, distribute, and/or modify this document under the terms of the Creative Commons Attribution-NonCommercial 3.0 Unported License, which is available at http://creativecommons.org/licenses/by-nc/3.0/.

Preface

0.1 My theory, which is mine

The premise of this book, and the other books in the Think X series, is that if you know how to program, you can use that skill to learn other topics.

Most books on Bayesian statistics use mathematical notation and present ideas in terms of mathematical concepts like calculus. This book uses Python code instead of math, and discrete approximations instead of continuous mathematics. As a result, what would be an integral in a math book becomes a summation, and most operations on probability distributions are simple loops.

I think this presentation is easier to understand, at least for people with programming skills. It is also more general, because when we make modeling decisions, we can choose the most appropriate model without worrying too much about whether the model lends itself to conventional analysis.

Also, it provides a smooth development path from simple examples to real-world problems. Chapter 3 is a good example. It starts with a simple example involving dice, one of the staples of basic probability. From there it proceeds in small steps to the locomotive problem, which I borrowed from Mosteller's Fifty Challenging Problems in Probability with Solutions, and from there to the German tank problem, a famously successful application of Bayesian methods during World War II.

0.2 Modeling and approximation

Most chapters in this book are motivated by a real-world problem, so they involve some degree of modeling.
Before we can apply Bayesian methods (or any other analysis), we have to make decisions about which parts of the real-world system to include in the model and which details we can abstract away.

For example, in Chapter 7 the motivating problem is to predict the winner of a hockey game. I model goal-scoring as a Poisson process, which implies that a goal is equally likely at any point in the game. That is not exactly true, but it is probably a good enough model for most purposes.

In Chapter 12 the motivating problem is interpreting SAT scores (the SAT is a standardized test used for college admissions in the United States). I start with a simple model that assumes that all SAT questions are equally difficult, but in fact the designers of the SAT deliberately include some questions that are relatively easy and some that are relatively hard. I present a second model that accounts for this aspect of the design, and show that it doesn't have a big effect on the results after all.

I think it is important to include modeling as an explicit part of problem solving because it reminds us to think about modeling errors (that is, errors due to simplifications and assumptions of the model).

Many of the methods in this book are based on discrete distributions, which makes some people worry about numerical errors. But for real-world problems, numerical errors are almost always smaller than modeling errors.

Furthermore, the discrete approach often allows better modeling decisions, and I would rather have an approximate solution to a good model than an exact solution to a bad model.

On the other hand, continuous methods sometimes yield performance advantages, for example by replacing a linear- or quadratic-time computation with a constant-time solution.

So I recommend a general process with these steps:

1. While you are exploring a problem, start with simple models and implement them in code that is clear, readable, and demonstrably correct. Focus your attention on good modeling decisions, not optimization.

2. Once you have a simple model working, identify the biggest source of error. You might need to increase the number of values in a discrete approximation, or increase the number of iterations in a Monte Carlo simulation, or add details to the model.

3. If the performance of your solution is good enough for your application, you might not have to do any optimization. But if you do, there are two approaches to consider. You can review your code and look for optimizations; for example, if you cache previously computed results you might be able to avoid redundant computation. Or you can look for analytic methods that yield computational shortcuts.

One benefit of this process is that Steps 1 and 2 tend to be fast, so you can explore several alternative models before investing heavily in any of them.

Another benefit is that if you get to Step 3, you will be starting with a reference implementation that is likely to be correct, which you can use for regression testing (that is, checking that the optimized code yields the same results, at least approximately).

0.3 Working with the code

Many of the examples in this book use classes and functions defined in thinkbayes.py. You can download this module from http://thinkbayes.com/thinkbayes.py.

Most chapters contain references to code you can download from http://thinkbayes.com. Some of those files have dependencies you will also have to download. I suggest you keep all of these files in the same directory so they can import each other without changing the Python search path.

You can download these files one at a time as you need them, or you can download them all at once from http://thinkbayes.com/thinkbayes_code.zip. This file also contains the data files used by some of the programs.
When you unzip it, it creates a directory named thinkbayes_code that contains all the code used in this book.

Or, if you are a Git user, you can get all of the files at once by forking and cloning this repository: https://github.com/AllenDowney/ThinkBayes.

One of the modules I use is thinkplot.py, which provides wrappers for some of the functions in pyplot. To use it, you need to install matplotlib. If you don't already have it, check your package manager to see if it is available. Otherwise you can get download instructions from http://matplotlib.org.

Finally, some programs in this book use NumPy and SciPy, which are available from http://numpy.org and http://scipy.org.

0.4 Code style

Experienced Python programmers will notice that the code in this book does not comply with PEP 8, which is the most common style guide for Python (http://www.python.org/dev/peps/pep-0008/).

Specifically, PEP 8 calls for lowercase function names with underscores between words, like_this. In this book and the accompanying code, function and method names begin with a capital letter and use camel case, LikeThis.

I broke this rule because I developed some of the code while I was a Visiting Scientist at Google, so I followed the Google style guide, which deviates from PEP 8 in a few places. Once I got used to Google style, I found that I liked it. And at this point, it would be too much trouble to change.

Also on the topic of style, I write "Bayes's theorem" with an s after the apostrophe, which is preferred in some style guides and deprecated in others. I don't have a strong preference. I had to choose one, and this is the one I chose.

And finally one typographical note: throughout the book, I use PMF and CDF for the mathematical concept of a probability mass function or cumulative distribution function, and Pmf and Cdf to refer to the Python objects I use to represent them.

0.5 Prerequisites

There are several excellent modules for doing Bayesian statistics in Python, including pymc and OpenBUGS.
I chose not to use them for this book because you need a fair amount of background knowledge to get started with these modules, and I want to keep the prerequisites minimal. If you know Python and a little bit about probability, you are ready to start this book.

Chapter 1 is about probability and Bayes's theorem; it has no code. Chapter 2 introduces Pmf, a thinly disguised Python dictionary I use to represent a probability mass function (PMF). Then Chapter 3 introduces Suite, a kind of Pmf that provides a framework for doing Bayesian updates. And that's just about all there is to it.

Well, almost. In some of the later chapters, I use analytic distributions including the Gaussian (normal) distribution, the exponential and Poisson distributions, and the beta distribution. In Chapter 15, I break out the less common Dirichlet distribution, but I explain it as I go along. If you are not familiar with these distributions, you can read about them on Wikipedia. You could also read the companion to this book, Think Stats, or an introductory statistics book (although I'm afraid most of them take a mathematical approach that is not particularly helpful for practical purposes).

Contributor List

If you have a suggestion or correction, please send email to downey@allendowney.com. If I make a change based on your feedback, I will add you to the contributor list (unless you ask to be omitted).

If you include at least part of the sentence the error appears in, that makes it easy for me to search. Page and section numbers are fine, too, but not as easy to work with. Thanks!

First, I have to acknowledge David MacKay's excellent book, Information Theory, Inference, and Learning Algorithms, which is where I first came to understand Bayesian methods.
With his permission, I use several problems from his book as examples.

This book also benefited from my interactions with Sanjoy Mahajan, especially in fall 2012, when I audited his class on Bayesian Inference at Olin College.

I wrote parts of this book during project nights with the Boston Python User Group, so I would like to thank them for their company and pizza.

Jonathan Edwards sent in the first typo.
George Purkins found a markup error.
Olivier Yiptong sent several helpful suggestions.
Yuriy Pasichnyk found several errors.
Kristopher Overholt sent a long list of corrections and suggestions.
Robert Marcus found a misplaced
Max Hailperin suggested a clarification in Chapter 1.
Markus Dobler pointed out that drawing cookies from a bowl with replacement is an unrealistic scenario.
Tom Pollard and Paul A. Giannaros spotted a version problem with some of the numbers in the train example.
Ram Limbu found a typo and suggested a clarification.
In spring 2013, students in my class, Computational Bayesian Statistics, made many helpful corrections and suggestions: Kai Austin, Claire Barnes, Kari Bender, Rachel Boy, Kat Mendoza, Arjun Iyer, Ben Kroop, Nathan Lintz, Kyle McConnaughay, Alec Radford, Brendan Ritter, and Evan Simpson.
Greg Marra and Matt Aasted helped me clarify the discussion of The Price is Right problem.
Marcus Ogren pointed out that the original statement of the locomotive problem was ambiguous.
Jasmine Kwityn and Dan Fauxsmith at O'Reilly Media proofread the book and found many opportunities for improvement.
James Lawry spotted a math error.
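
The preface describes Pmf as "a thinly disguised Python dictionary", Suite as a framework for Bayesian updates, and integrals turning into summations. The following is a minimal sketch of those three ideas, not the code from thinkbayes.py. The function names (normalize, update, likelihood_dice, mean) and the particular set of dice are hypothetical, chosen only for illustration, and the sketch follows PEP 8 naming rather than the book's camel case.

```python
def normalize(pmf):
    """Scale the probabilities in a dict-based PMF so they sum to 1."""
    total = sum(pmf.values())
    if total == 0:
        raise ValueError("total probability is zero")
    return {value: prob / total for value, prob in pmf.items()}


def update(prior, likelihood, data):
    """Bayesian update: multiply each prior by its likelihood, then renormalize."""
    unnormalized = {hypo: prob * likelihood(data, hypo)
                    for hypo, prob in prior.items()}
    return normalize(unnormalized)


def likelihood_dice(data, hypo):
    """Probability of rolling `data` on a fair die with `hypo` sides."""
    return 0.0 if data > hypo else 1.0 / hypo


def mean(pmf):
    """Expected value as a summation over discrete values (no integral needed)."""
    return sum(value * prob for value, prob in pmf.items())


# Assumption for illustration: five dice with 4, 6, 8, 12, and 20 sides,
# each equally likely to have been picked.
prior = normalize({sides: 1 for sides in (4, 6, 8, 12, 20)})

# Observe a roll of 6; the 4-sided die becomes impossible.
posterior = update(prior, likelihood_dice, 6)

for sides, prob in sorted(posterior.items()):
    print(sides, round(prob, 3))
```

Keeping the distribution in a plain dictionary means every operation here is a loop or a sum over its items, which is the trade the preface argues for: an approximate, easy-to-check solution first, with optimization deferred.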