Springer Handbook of Speech Processing


Language: Others
File size: 36.35 MB
Downloads: 7
Views: 266
Published: 2021-03-04
Category: General programming questions
Posted by: 好学IT男
File format: .pdf
Points required: 2


Description

[Overview]
A classic book on audio processing, very systematic and comprehensive.
Springer Handbook of Speech Processing
Jacob Benesty, M. Mohan Sondhi, Yiteng Huang (Eds.)
With DVD-ROM, 456 Figures and 113 Tables
Springer

Editors

Jacob Benesty, INRS-EMT, University of Quebec, 800 de la Gauchetière Ouest, Suite 69, Montreal, Quebec, H5A 1K6, Canada; benesty@emt.inrs.ca
M. Mohan Sondhi, Avaya Labs Research, 233 Mount Airy Road, Basking Ridge, NJ 07920, USA; mms@research.avayalabs.com
Yiteng Huang, Bell Laboratories, Alcatel-Lucent, 600 Mountain Avenue, Murray Hill, NJ 07974, USA; arden_huang@ieee.org

Library of Congress Control Number: 2007931999
ISBN: 978-3-540-49125-5
e-ISBN: 978-3-540-49127-9

This work is subject to copyright. All rights reserved, whether the whole or part of the material is concerned, specifically the rights of translation, reprinting, reuse of illustrations, recitation, broadcasting, reproduction on microfilm or in any other way, and storage in data banks. Duplication of this publication or parts thereof is permitted only under the provisions of the German Copyright Law of September 9, 1965, in its current version, and permission for use must always be obtained from Springer-Verlag. Violations are liable for prosecution under the German Copyright Law.

Springer is a part of Springer Science+Business Media
springer.com
© Springer-Verlag Berlin Heidelberg 2008

The use of designations, trademarks, etc. in this publication does not imply, even in the absence of a specific statement, that such names are exempt from the relevant protective laws and regulations and therefore free for general use.

Product liability: The publisher cannot guarantee the accuracy of any information about dosage and application contained in this book. In every individual case the user must check such information by consulting the relevant literature.

Typesetting and production: LE-TeX Jelonek, Schmidt & Vöckler GbR, Leipzig
Senior Manager Springer Handbook: Dr. W. Skolaut, Heidelberg
Typography and layout: schreiberVIS, Seeheim
Illustrations: Hippmann GbR, Schwarzenbruck
Cover design: eStudio Calamar Steinen, Barcelona
Cover production: WMXDesign GmbH, Heidelberg
Printing and binding: Stürtz GmbH, Würzburg
Printed on acid-free paper
SPIN 11544036 60/3180/YL 543210

Foreword

Over the past three decades digital signal processing has emerged as a recognized discipline. Much of the impetus for this advance stems from research in representation, coding, transmission, storage and reproduction of speech and image information. In particular, interest in voice communication has stimulated central contributions to digital filtering and discrete-time spectral transforms.

This dynamic development was built upon the convergence of three then-evolving technologies: (i) sampled-data theory and representation of information signals (which led directly to digital telecommunication that provides signal quality independent of transmission distance); (ii) electronic binary computation (aided in early implementation by pulse-circuit techniques from radar design); and (iii) invention of solid-state devices for exquisite control of electronic current (transistors, which now, through microelectronic materials, scale to systems of enormous size and complexity). This timely convergence was soon followed by optical fiber methods for broadband information transport.

These advances impact an important aspect of human activity: information exchange. And, over man's existence, speech has played a principal role in human communication. Now, speech is playing an increasing role in human interaction with complex information systems. Automatic services of great variety exploit the comfort of voice exchange, and, in the corporate sector, sophisticated audio/video teleconferencing is reducing the necessity of expensive, time-consuming business travel. In each instance an overarching target is a user environment that captures some of the naturalness and spatial realism of face-to-face communication. Again, speech is a core element, and new understanding from diverse research sectors can be brought to bear.

Editors-in-Chief Benesty, Sondhi and Huang have organized a timely engineering handbook to answer this need. They have assembled a remarkable compendium of current knowledge in speech processing. And, this accumulated understanding can be focused upon enlarging the human capacity to deal with a world ever increasing in complexity. Benesty, Sondhi and Huang are renowned researchers in their own right, and they have attracted an international cadre of over 80 fellow authors and collaborators who constitute a veritable Who's Who of world leaders in speech processing research. The resulting book provides under one cover authoritative treatments that commence with the basic physics and psychophysics of speech and hearing, and range through the related topics of computational tools, coding, synthesis, recognition, and signal enhancement, concluding with discussions on capture and projection of sound in enclosures.

The book can be expected to become a valuable resource for researchers, engineers and speech scientists throughout the global community. It should equally serve teachers and students in human communication, especially delimiting knowledge frontiers where graduate thesis research may be appropriate.

Warren, New Jersey, October 2007
Jim Flanagan (J. L. Flanagan, Professor Emeritus, Electrical and Computer Engineering, Rutgers University)

Preface

The achievement of this Springer Handbook is the result of a wonderful journey that started in March 2005 at the 30th International Conference on Acoustics, Speech, and Signal Processing (ICASSP).
Two of the editors-in-chief (Benesty and Huang) met in one of the long corridors of the Pennsylvania Convention Center in Philadelphia with Dr. Dieter Merkle from Springer. Together we had a very nice discussion about the conference and immediately an idea came up for a handbook. After a short discussion we converged without too much hesitation on a handbook of speech processing. It was quite surprising to see that, even after 30 years of ICASSP and more than half a century of research in this fundamental area, there was still no major book summarizing the important aspects of speech processing. We thought that the time was ripe for such a large project. Soon after we got home, a third editor-in-chief (Sondhi) joined the efforts.

We had a very clear objective in our minds: to summarize, in a reasonable number of pages, the most important and useful aspects of speech processing. The content was then organized accordingly. This task was not easy since we had to find a good balance between feasible ideas and new trends. As we all know, practical ideas can be viewed as old stuff while emerging ideas can be criticized for not having passed the test of time; we hope that we have succeeded in finding a good compromise. For this we relied on many authors who are well established and are recognized as experts in their field, from all over the world, and from academia as well as from industry.

From simple consumer products such as cell phones and MP3 players to more sophisticated projects such as human-machine interfaces and robots that can obey orders, speech technologies are now everywhere. We believe that it is just a matter of time before more applications of the science of speech become impossible to miss in our daily life. So we believe that this Springer Handbook will play a fundamental role in the sustainable progress of speech research and development.

This handbook is targeted at three categories of readers: graduate students of speech processing, professors and researchers in academia and research labs who are active in this field, and engineers in industry who need to understand or implement specific algorithms for their speech-related products. The handbook could also be used as a text for one or more graduate courses on signal processing for speech and various aspects of speech processing and applications.

For the completion of such an ambitious project we have many people to thank. First, we would like to thank the many authors who did a terrific job in delivering very high-quality chapters. Second, we are very grateful to the members of the editorial board who helped us so much in organizing the content and structure of this book, taking part in all phases of this project from conception to completion. Third, we would like to thank all the reviewers, who helped us to improve the quality of the material. Last, but not least, we would like to thank the Springer team for their availability and very professional work. In particular, we appreciated the help of Dieter Merkle, Christoph Baumann, Werner Skolaut, Petra Jantzen, and Claudia Rau.

We hope this Springer Handbook will inspire many great minds to find new research ideas or to implement algorithms in products.

Montreal, Basking Ridge, Murray Hill, October 2007
Jacob Benesty
M. Mohan Sondhi
Yiteng Huang

List of Editors

Editors-in-Chief
Jacob Benesty, Montreal
M. Mohan Sondhi, Basking Ridge
Yiteng (Arden) Huang, Murray Hill

Part Editors
Part A: Production, Perception, and Modeling of Speech: M. M. Sondhi, Basking Ridge
Part B: Signal Processing for Speech: Y. Huang, Murray Hill; J. Benesty, Montreal
Part C: Speech Coding: W. B. Kleijn, Stockholm
Part D: Text-to-Speech Synthesis: S. Narayanan, Los Angeles
Part E: Speech Recognition: L. Rabiner, Piscataway; B.-H. Juang, Atlanta
Part F: Speaker Recognition: S. Parthasarathy, Sunnyvale
Part G: Language Recognition: C.-H. Lee, Atlanta
Part H: Speech Enhancement: J. Chen, Murray Hill; S. Gannot, Ramat-Gan; J. Benesty, Montreal
Part I: Multichannel Speech Processing: J. Benesty, Montreal; I. Cohen, Haifa; Y. Huang, Murray Hill

List of Authors

Alex Acero, Microsoft Research, One Microsoft Way, Redmond, WA 98052, USA; e-mail: alexa@microsoft.com
Jont B. Allen, University of Illinois, ECE, Urbana, IL 61801, USA; e-mail: JontAllen@ieee.org
Jacob Benesty, University of Quebec, INRS-EMT, 800 de la Gauchetière Ouest, Montreal, Quebec H5A 1K6, Canada; e-mail: benesty@emt.inrs.ca
Frédéric Bimbot, IRISA (CNRS & INRIA) - METISS, Pièce C 320, Campus Universitaire de Beaulieu, 35042 Rennes, France; e-mail: bimbot@irisa.fr
Thomas Brand, Carl von Ossietzky Universität Oldenburg, Sektion Medizinphysik, Haus des Hörens, Marie-Curie-Str. 2, 26121 Oldenburg, Germany; e-mail: thomas.brand@uni-oldenburg.de
Nick Campbell, Knowledge Creating Communication Research Centre, Acoustics & Speech Research Project, Spoken Language Communication Group, 2-2-2 Hikaridai, 619-0288 Keihanna Science City, Japan; e-mail: nick@nict.go.jp
William M. Campbell, MIT Lincoln Laboratory, Information Systems Technology Group, 244 Wood Street, Lexington, MA 02420-9108, USA; e-mail: wcampbell@ll.mit.edu
Rolf Carlson, Royal Institute of Technology (KTH), Department of Speech, Music and Hearing, Lindstedtsvägen 24, 10044 Stockholm, Sweden; e-mail: rolf@speech.kth.se
Jingdong Chen, Bell Laboratories, Alcatel-Lucent, 600 Mountain Ave, Murray Hill, NJ 07974, USA; e-mail: jingdong@research.bell-labs.com
Juin-Hwey Chen, Broadcom Corp., 5300 California Avenue, Irvine, CA 92617, USA; e-mail: rchen@broadcom.com
Israel Cohen, Technion-Israel Institute of Technology, Department of Electrical Engineering, Technion City, Haifa 32000, Israel; e-mail: cohen@ee.technion.ac.il
Jordan Cohen, SRI International, 300 Ravenswood Drive, Menlo Park, CA 94019, USA; e-mail: jrc@speech.sri.com
Corinna Cortes, Google, Inc., Google Research, 76 9th Avenue, 4th Floor, New York, NY 10011, USA; e-mail: corinna@google.com
Eric J. Diethorn, Avaya Labs Research, Multimedia Technologies Research Department, 233 Mt. Airy Road, Basking Ridge, NJ 07920, USA; e-mail: ejd@avaya.com
Simon Doclo, Katholieke Universiteit Leuven, Department of Electrical Engineering (ESAT-SCD), Kasteelpark Arenberg 10 bus 2446, 3001 Leuven, Belgium; e-mail: simon.doclo@esat.kuleuven.be
Jasha Droppo, Microsoft Research, Speech Technology Group, One Microsoft Way, Redmond, WA 98052, USA; e-mail: jdroppo@microsoft.com
Thierry Dutoit, Faculté Polytechnique de Mons FPMs, TCTS Laboratory, Bvd Dolez, 31, 7000 Mons, Belgium; e-mail: thierry.dutoit@fpms.ac.be
Gary W. Elko, mh acoustics LLC, 25A Summit Ave, Summit, NJ 07901, USA; e-mail: gwe@mhacoustics.com
Sadaoki Furui, Tokyo Institute of Technology, Department of Computer Science, 2-12-1 Okayama, Meguro-ku, 152-8552 Tokyo, Japan; e-mail: furui@cs.titech.ac.jp
Sharon Gannot, Bar-Ilan University, School of Electrical Engineering, Ramat-Gan 52900, Israel; e-mail: gannot@eng.biu.ac.il
Mazin E. Gilbert, AT&T Labs, Inc., Research, 180 Park Ave, Florham Park, NJ 07932, USA; e-mail: mazin@research.att.com
Michael M. Goodwin, Creative Advanced Technology Center, Audio Research, 1500 Green Hills Road, Scotts Valley, CA 95066, USA; e-mail: mgoodwin@atc.creative.com
Volodya Grancharov, Multimedia Technologies, Ericsson Research, Ericsson AB, Torshamnsgatan 23, Kista, KI/EAB/TVA/A, 16480 Stockholm, Sweden; e-mail: volodya.grancharov@ericsson.com
Björn Granström, Royal Institute of Technology (KTH), Department for Speech, Music and Hearing, Lindstedtsvägen 24, 10044 Stockholm, Sweden; e-mail: bjorn@speech.kth.se
Patrick Haffner, AT&T Labs-Research, IP and Voice Services, 200 S Laurel Ave, Middletown, NJ 07748, USA; e-mail: haffner@research.att.com
Roar Hagen, Global IP Solutions, Magnus Ladulåsgatan 63B, 118 27 Stockholm, Sweden; e-mail: roarhagen@gipscorp.com
Mary P. Harper, University of Maryland, Center for Advanced Study of Language, 7005 52nd Avenue, College Park, MD 20742, USA; e-mail: harper@cas.umd.edu
Jürgen Herre, Fraunhofer Institute for Integrated Circuits (Fraunhofer IIS), Audio and Multimedia, Am Wolfsmantel 33, 91058 Erlangen, Germany; e-mail: hrr@iis.fraunhofer.de
Wolfgang J. Hess, University of Bonn, Institute for Communication Sciences, Dept. of Communication, Language, and Speech, Poppelsdorfer Allee 47, 53115 Bonn, Germany; e-mail: wgh@ifk.uni-bonn.de
Kiyoshi Honda, Université de la Sorbonne Nouvelle-Paris III, Laboratoire de Phonétique et de Phonologie, ATR Cognitive Information Laboratories, UMR-7018-CNRS, 46 rue Barrault, 75634 Paris, France; e-mail: honda@atr.jp
Yiteng (Arden) Huang, Bell Laboratories, Alcatel-Lucent, 600 Mountain Avenue, Murray Hill, NJ 07974, USA; e-mail: arden_huang@ieee.org
Matthieu Hébert, Network ASR Core Technology, Nuance Communications, 1500 Université, Montreal, Quebec H3A-3S7, Canada; e-mail: hebert@nuance.com
Biing-Hwang Juang, Georgia Institute of Technology, School of Electrical & Computer Engineering, 777 Atlantic Dr. NW, Atlanta, GA 30332-0250, USA; e-mail: juang@ece.gatech.edu
Tatsuya Kawahara, Kyoto University, Academic Center for Computing and Media Studies, Sakyo-ku, 606-8501 Kyoto, Japan; e-mail: kawahara@i.kyoto-u.ac.jp
Ulrik Kjems, Oticon A/S, 9 Kongebakken, 2765 Smørum, Denmark; e-mail: uk@oticon.dk
Esther Klabbers, Oregon Health & Science University, Center for Spoken Language Understanding, OGI School of Science and Engineering, 20000 NW Walker Rd, Beaverton, OR 97006, USA; e-mail: klabbers@cslu.ogi.edu
W. Bastiaan Kleijn, Royal Institute of Technology (KTH), School of Electrical Engineering, Sound and Image Processing Lab, Osquldas väg 10, 10044 Stockholm, Sweden; e-mail: bastiaan.kleijn@ee.kth.se
Birger Kollmeier, Universität Oldenburg, Medizinische Physik, 26111 Oldenburg, Germany; e-mail: birger.kollmeier@uni-oldenburg.de
Ermin Kozica, Royal Institute of Technology (KTH), School of Electrical Engineering, Sound and Image Processing Laboratory, Osquldas väg 10, 10044 Stockholm, Sweden; e-mail: ermin.kozica@ee.kth.se
Sen M. Kuo, Northern Illinois University, Department of Electrical Engineering, DeKalb, IL 60115, USA; e-mail: kuo@ceet.niu.edu
Jan Larsen, Technical University of Denmark, Informatics and Mathematical Modelling, Richard Petersens Plads, 2800 Kongens Lyngby, Denmark; e-mail: jl@imm.dtu.dk
Chin-Hui Lee, Georgia Institute of Technology, School of Electrical and Computer Engineering, 777 Atlantic Drive NW, Atlanta, GA 30332-0250, USA; e-mail: chl@ece.gatech.edu