Description
The classic information retrieval textbook: Introduction to Information Retrieval, by Christopher D. Manning (Stanford University), Prabhakar Raghavan (Yahoo! Research), and Hinrich Schütze (University of Stuttgart).
P1: KRU/IRP ibook CUUS232/Manning 9780521865715 May 27, 2008

Introduction to Information Retrieval

Christopher D. Manning, Stanford University
Prabhakar Raghavan, Yahoo! Research
Hinrich Schütze, University of Stuttgart

CAMBRIDGE UNIVERSITY PRESS
Cambridge, New York, Melbourne, Madrid, Cape Town, Singapore, São Paulo, Delhi

Cambridge University Press
32 Avenue of the Americas, New York, NY 10013-2473, USA

www.cambridge.org
Information on this title: www.cambridge.org/9780521865715

© Cambridge University Press 2008

This publication is in copyright. Subject to statutory exception and to the provisions of relevant collective licensing agreements, no reproduction of any part may take place without the written permission of Cambridge University Press.

First published 2008
Printed in the United States of America

A catalog record for this publication is available from the British Library.

Library of Congress Cataloging in Publication data
Manning, Christopher D.
Introduction to information retrieval / Christopher D. Manning, Prabhakar Raghavan, Hinrich Schütze.
Includes bibliographical references and index.
ISBN 978-0-521-86571-5 (hardback)
1. Text processing (Computer science) 2. Information retrieval. 3. Document clustering. 4. Semantic Web. I. Raghavan, Prabhakar. II. Schütze, Hinrich. III. Title.
QA76.9.T48M26 2008
025.04-dc22 2008001257

ISBN 978-0-521-86571-5 hardback

Cambridge University Press has no responsibility for the persistence or accuracy of URLs for external or third-party Internet Web sites referred to in this publication and does not guarantee that any content on such Web sites is, or will remain, accurate or appropriate.

Contents

Table of notation

1 Boolean retrieval
  1.1 An example information retrieval problem
  1.2 A first take at building an inverted index
  1.3 Processing Boolean queries
  1.4 The extended Boolean model versus ranked retrieval
  1.5 References and further reading

2 The term vocabulary and postings lists
  2.1 Document delineation and character sequence decoding
  2.2 Determining the vocabulary of terms
  2.3 Faster postings list intersection via skip pointers
  2.4 Positional postings and phrase queries
  2.5 References and further reading

3 Dictionaries and tolerant retrieval
  3.1 Search structures for dictionaries
  3.2 Wildcard queries
  3.3 Spelling correction
  3.4 Phonetic correction
  3.5 References and further reading

4 Index construction
  4.1 Hardware basics
  4.2 Blocked sort-based indexing
  4.3 Single-pass in-memory indexing
  4.4 Distributed indexing
  4.5 Dynamic indexing
  4.6 Other types of indexes
  4.7 References and further reading

5 Index compression
  5.1 Statistical properties of terms in information retrieval
  5.2 Dictionary compression
  5.3 Postings file compression
  5.4 References and further reading

6 Scoring, term weighting and the vector space model
  6.1 Parametric and zone indexes
  6.2 Term frequency and weighting
  6.3 The vector space model for scoring
  6.4 Variant tf-idf functions
  6.5 References and further reading

7 Computing scores in a complete search system
  7.1 Efficient scoring and ranking
  7.2 Components of an information retrieval system
  7.3 Vector space scoring and query operator interaction
  7.4 References and further reading

8 Evaluation in information retrieval
  8.1 Information retrieval system evaluation
  8.2 Standard test collections
  8.3 Evaluation of unranked retrieval sets
  8.4 Evaluation of ranked retrieval results
  8.5 Assessing relevance
  8.6 A broader perspective: System quality and user utility
  8.7 Results snippets
  8.8 References and further reading

9 Relevance feedback and query expansion
  9.1 Relevance feedback and pseudo relevance feedback
  9.2 Global methods for query reformulation
  9.3 References and further reading

10 XML retrieval
  10.1 Basic XML concepts
  10.2 Challenges in XML retrieval
  10.3 A vector space model for XML retrieval
  10.4 Evaluation of XML retrieval
  10.5 Text-centric versus data-centric XML retrieval
  10.6 References and further reading

11 Probabilistic information retrieval
  11.1 Review of basic probability theory
  11.2 The probability ranking principle
  11.3 The binary independence model
  11.4 An appraisal and some extensions
  11.5 References and further reading

12 Language models for information retrieval
  12.1 Language models
  12.2 The query likelihood model
  12.3 Language modeling versus other approaches in information retrieval
  12.4 Extended language modeling approaches
  12.5 References and further reading

13 Text classification and Naive Bayes
  13.1 The text classification problem
  13.2 Naive Bayes text classification
  13.3 The Bernoulli model
  13.4 Properties of Naive Bayes
  13.5 Feature selection
  13.6 Evaluation of text classification
  13.7 References and further reading

14 Vector space classification
  14.1 Document representations and measures of relatedness in vector spaces
  14.2 Rocchio classification
  14.3 k nearest neighbor
  14.4 Linear versus nonlinear classifiers
  14.5 Classification with more than two classes
  14.6 The bias-variance tradeoff
  14.7 References and further reading

15 Support vector machines and machine learning on documents
  15.1 Support vector machines: The linearly separable case
  15.2 Extensions to the support vector machine model
  15.3 Issues in the classification of text documents
  15.4 Machine-learning methods in ad hoc information retrieval
  15.5 References and further reading

16 Flat clustering
  16.1 Clustering in information retrieval
  16.2 Problem statement
  16.3 Evaluation of clustering
  16.4 K-means
  16.5 Model-based clustering
  16.6 References and further reading

17 Hierarchical clustering
  17.1 Hierarchical agglomerative clustering
  17.2 Single-link and complete-link clustering
  17.3 Group-average agglomerative clustering
  17.4 Centroid clustering
  17.5 Optimality of hierarchical agglomerative clustering
  17.6 Divisive clustering
  17.7 Cluster labeling
  17.8 Implementation notes
  17.9 References and further reading

18 Matrix decompositions and latent semantic indexing
  18.1 Linear algebra review
  18.2 Term-document matrices and singular value decompositions
  18.3 Low-rank approximations
  18.4 Latent semantic indexing
  18.5 References and further reading

19 Web search basics
  19.1 Background and history
  19.2 Web characteristics
  19.3 Advertising as the economic model
  19.4 The search user experience
  19.5 Index size and estimation
  19.6 Near-duplicates and shingling
  19.7 References and further reading

20 Web crawling and indexes
  20.1 Overview
  20.2 Crawling
  20.3 Distributing indexes
  20.4 Connectivity servers
  20.5 References and further reading

21 Link analysis
  21.1 The Web as a graph
  21.2 PageRank
  21.3 Hubs and authorities
  21.4 References and further reading

Bibliography
Index