
Foundations of Statistical Natural Language Processing.pdf

  • Development language: Others
  • File size: 6.97 MB
  • Downloads: 5
  • Views: 239
  • Published: 2020-08-04
  • Category: General programming
  • Publisher: robot666
  • File format: .pdf
  • Points required: 2
 

Example Introduction

【Example Summary】
Foundations of Statistical Natural Language Processing.pdf
From the book's front matter (copyright page and table of contents):

Second printing, 1999
© 1999 Massachusetts Institute of Technology
Second printing with corrections, 2000

All rights reserved. No part of this book may be reproduced in any form by any electronic or mechanical means (including photocopying, recording, or information storage and retrieval) without permission in writing from the publisher.

Typeset in 10/13 Lucida Bright by the authors using LaTeX2e.
Printed and bound in the United States of America.

Library of Congress Cataloging-in-Publication Information
Manning, Christopher D.
Foundations of statistical natural language processing / Christopher D. Manning, Hinrich Schütze.
p. cm.
Includes bibliographical references (p. ) and index.
ISBN 0-262-13360-1
1. Computational linguistics--Statistical methods. I. Schütze, Hinrich. II. Title.
P98.5.S83 M36 1999
410'.285--dc21 99-21137 CIP

Brief Contents

I  Preliminaries  1
   1  Introduction  3
   2  Mathematical Foundations  39
   3  Linguistic Essentials  81
   4  Corpus-Based Work  117
II  Words  149
   5  Collocations  151
   6  Statistical Inference: n-gram Models over Sparse Data  191
   7  Word Sense Disambiguation  229
   8  Lexical Acquisition  265
III  Grammar  315
   9  Markov Models  317
   10  Part-of-Speech Tagging  341
   11  Probabilistic Context Free Grammars  381
   12  Probabilistic Parsing  407
IV  Applications and Techniques  461
   13  Statistical Alignment and Machine Translation  463
   14  Clustering  495
   15  Topics in Information Retrieval  529
   16  Text Categorization  575

Contents

List of Tables  xv
List of Figures
Table of Notations  xxv
Preface  xxix
Road Map  xxxv

I  Preliminaries  1

1  Introduction  3
  1.1  Rationalist and Empiricist Approaches to Language  4
  1.2  Scientific Content  7
    1.2.1  Questions that linguistics should answer  8
    1.2.2  Non-categorical phenomena in language  11
    1.2.3  Language and cognition as probabilistic phenomena  15
  1.3  The Ambiguity of Language: Why NLP Is Difficult  17
  1.4  Dirty Hands  19
    1.4.1  Lexical resources  19
    1.4.2  Word counts  20
    1.4.3  Zipf's laws  23
    1.4.4  Collocations  29
    1.4.5  Concordances  31
  1.5  Further Reading  34
  1.6  Exercises  35

2  Mathematical Foundations  39
  2.1  Elementary Probability Theory  40
    2.1.1  Probability spaces  40
    2.1.2  Conditional probability and independence  42
    2.1.3  Bayes' theorem  43
    2.1.4  Random variables  45
    2.1.5  Expectation and variance  46
    2.1.6  Notation  47
    2.1.7  Joint and conditional distributions  48
    2.1.8  Determining P  48
    2.1.9  Standard distributions  50
    2.1.10  Bayesian statistics  54
    2.1.11  Exercises  59
  2.2  Essential Information Theory  60
    2.2.1  Entropy  61
    2.2.2  Joint entropy and conditional entropy  63
    2.2.3  Mutual information  66
    2.2.4  The noisy channel model  68
    2.2.5  Relative entropy or Kullback-Leibler divergence  72
    2.2.6  The relation to language: Cross entropy  73
    2.2.7  The entropy of English  76
    2.2.8  Perplexity  78
    2.2.9  Exercises  78
  2.3  Further Reading  79

3  Linguistic Essentials  81
  3.1  Parts of Speech and Morphology  81
    3.1.1  Nouns and pronouns  83
    3.1.2  Words that accompany nouns: Determiners and adjectives  87
    3.1.3  Verbs  88
    3.1.4  Other parts of speech  91
  3.2  Phrase Structure  93
    3.2.1  Phrase structure grammars  96
    3.2.2  Dependency: Arguments and adjuncts  101
    3.2.3  X' theory  106
    3.2.4  Phrase structure ambiguity  107
  3.3  Semantics and Pragmatics  109
  3.4  Other Areas  112
  3.5  Further Reading  113
  3.6  Exercises  114

4  Corpus-Based Work  117
  4.1  Getting Set Up  118
    4.1.1  Computers  118
    4.1.2  Corpora  118
    4.1.3  Software  120
  4.2  Looking at Text  123
    4.2.1  Low-level formatting issues  123
    4.2.2  Tokenization: What is a word?  124
    4.2.3  Morphology  131
    4.2.4  Sentences  134
  4.3  Marked-up Data  136
    4.3.1  Markup schemes  137
    4.3.2  Grammatical tagging  139
  4.4  Further Reading  145
  4.5  Exercises  147

II  Words  149

5  Collocations  151
  5.1  Frequency  153
  5.2  Mean and Variance  157
  5.3  Hypothesis Testing  162
    5.3.1  The t test  163
    5.3.2  Hypothesis testing of differences  166
    5.3.3  Pearson's chi-square test  169
    5.3.4  Likelihood ratios  172
  5.4  Mutual Information  178
  5.5  The Notion of Collocation  183
  5.6  Further Reading  187

6  Statistical Inference: n-gram Models over Sparse Data  191
  6.1  Bins: Forming Equivalence Classes  192
    6.1.1  Reliability vs. discrimination  192
    6.1.2  n-gram models  192
    6.1.3  Building n-gram models  195
  6.2  Statistical Estimators  196
    6.2.1  Maximum Likelihood Estimation (MLE)  197
    6.2.2  Laplace's law, Lidstone's law and the Jeffreys-Perks law  202
    6.2.3  Held out estimation  205
    6.2.4  Cross-validation (deleted estimation)  210
    6.2.5  Good-Turing estimation  212
    6.2.6  Briefly noted  216
  6.3  Combining Estimators  217
    6.3.1  Simple linear interpolation  218
    6.3.2  Katz's backing-off  219
    6.3.3  General linear interpolation  220
    6.3.4  Briefly noted  222
    6.3.5  Language models for Austen  223
  6.4  Conclusions  224
  6.5  Further Reading  225
  6.6  Exercises  225

7  Word Sense Disambiguation  229
  7.1  Methodological Preliminaries  232
    7.1.1  Supervised and unsupervised learning  232
    7.1.2  Pseudowords  233
    7.1.3  Upper and lower bounds on performance  233
  7.2  Supervised Disambiguation  235
    7.2.1  Bayesian classification  235
    7.2.2  An information-theoretic approach  239
  7.3  Dictionary-Based Disambiguation  241
    7.3.1  Disambiguation based on sense definitions  242
    7.3.2  Thesaurus-based disambiguation  244
    7.3.3  Disambiguation based on translations in a second-language corpus  247
    7.3.4  One sense per discourse, one sense per collocation  249
  7.4  Unsupervised Disambiguation  252
  7.5  What Is a Word Sense?  256
  7.6  Further Reading  260
  7.7  Exercises  262

8  Lexical Acquisition  265
  8.1  Evaluation Measures  267
  8.2  Verb Subcategorization  271
  8.3  Attachment Ambiguity  278
    8.3.1  Hindle and Rooth (1993)  280
    8.3.2  General remarks on PP attachment  284
  8.4  Selectional Preferences  288
  8.5  Semantic Similarity  294
    8.5.1  Vector space measures  296
    8.5.2  Probabilistic measures  303
  8.6  The Role of Lexical Acquisition in Statistical NLP  308
  8.7  Further Reading  312

III  Grammar  315

9  Markov Models  317
  9.1  Markov Models  318
  9.2  Hidden Markov Models  320
    9.2.1  Why use HMMs?  322
    9.2.2  General form of an HMM  324
  9.3  The Three Fundamental Questions for HMMs  325
    9.3.1  Finding the probability of an observation  326
    9.3.2  Finding the best state sequence  331
    9.3.3  The third problem: Parameter estimation  333
  9.4  HMMs: Implementation, Properties, and Variants  336
    9.4.1  Implementation  336
    9.4.2  Variants  337
    9.4.3  Multiple input observations  338
    9.4.4  Initialization of parameter values  339
  9.5  Further Reading  339

10  Part-of-Speech Tagging  341
  10.1  The Information Sources in Tagging  343
  10.2  Markov Model Taggers  345
    10.2.1  The probabilistic model  345
    10.2.2  The Viterbi algorithm  349
    10.2.3  Variations  351
  10.3  Hidden Markov Model Taggers  356

Download Link

Foundations of Statistical Natural Language Processing.pdf
