The CUDA Handbook: A Comprehensive Guide to GPU Programming


Development language: Others
File size: 4.66M
Downloads: 13
Views: 485
Published: 2020-06-05
Category: General programming questions
Uploaded by: robot666
File format: .pdf
Points required: 2


Description

[Summary]
The CUDA Handbook: A Comprehensive Guide to GPU Programming, original English edition, PDF.
The CUDA Handbook
A Comprehensive Guide to GPU Programming
Nicholas Wilt

Addison-Wesley
Upper Saddle River, NJ · Boston · Indianapolis · San Francisco · New York · Toronto · Montreal · London · Munich · Paris · Madrid · Capetown · Sydney · Tokyo · Singapore · Mexico City

Many of the designations used by manufacturers and sellers to distinguish their products are claimed as trademarks. Where those designations appear in this book, and the publisher was aware of a trademark claim, the designations have been printed with initial capital letters or in all capitals.

The author and publisher have taken care in the preparation of this book, but make no expressed or implied warranty of any kind and assume no responsibility for errors or omissions. No liability is assumed for incidental or consequential damages in connection with or arising out of the use of the information or programs contained herein.

The publisher offers excellent discounts on this book when ordered in quantity for bulk purchases or special sales, which may include electronic versions and/or custom covers and content particular to your business, training goals, marketing focus, and branding interests. For more information, please contact:

U.S. Corporate and Government Sales
(800) 382-3419
corpsales@pearsontechgroup.com

For sales outside the United States, please contact:

International Sales
international@pearsoned.com

Visit us on the Web: informit.com/aw

Cataloging in Publication Data is on file with the Library of Congress.

Copyright © 2013 Pearson Education, Inc.

All rights reserved. Printed in the United States of America. This publication is protected by copyright, and permission must be obtained from the publisher prior to any prohibited reproduction, storage in a retrieval system, or transmission in any form or by any means, electronic, mechanical, photocopying, recording, or likewise. To obtain permission to use material from this work, please submit a written request to Pearson Education, Inc., Permissions Department, One Lake Street, Upper Saddle River, New Jersey 07458, or you may fax your request to (201) 236-3290.

ISBN-13: 978-0-321-80946-9
ISBN-10: 0-321-80946-7

Text printed in the United States on recycled paper at RR Donnelley in Crawfordsville, Indiana.
First printing, June 2013

For Robin

Contents

Preface
Acknowledgments
About the Author

PART I

Chapter 1: Background
  1.1 Our Approach
  1.2 Code
    1.2.1 Microbenchmarks
    1.2.2 Microdemos
    1.2.3 Optimization Journeys
  1.3 Administrative Items
    1.3.1 Open Source
    1.3.2 CUDA Handbook Library (chLib)
    1.3.3 Coding Style
    1.3.4 CUDA SDK
  1.4 Road Map

Chapter 2: Hardware Architecture
  2.1 CPU Configurations
    2.1.1 Front-Side Bus
    2.1.2 Symmetric Multiprocessors
    2.1.3 Nonuniform Memory Access
    2.1.4 PCI Express Integration
  2.2 Integrated GPUs
  2.3 Multiple GPUs
  2.4 Address Spaces in CUDA
    2.4.1 Virtual Addressing: A Brief History
    2.4.2 Disjoint Address Spaces
    2.4.3 Mapped Pinned Memory
    2.4.4 Portable Pinned Memory
    2.4.5 Unified Addressing
    2.4.6 Peer-to-Peer Mappings
  2.5 CPU/GPU Interactions
    2.5.1 Pinned Host Memory and Command Buffers
    2.5.2 CPU/GPU Concurrency
    2.5.3 The Host Interface and Intra-GPU Synchronization
    2.5.4 Inter-GPU Synchronization
  2.6 GPU Architecture
    2.6.1 Overview
    2.6.2 Streaming Multiprocessors
  2.7 Further Reading

Chapter 3: Software Architecture
  3.1 Software Layers
    3.1.1 CUDA Runtime and Driver
    3.1.2 Driver Models
    3.1.3 nvcc, PTX, and Microcode
  3.2 Devices and Initialization
    3.2.1 Device Count
    3.2.2 Device Attributes
    3.2.3 When CUDA Is Not Present
  3.3 Contexts
    3.3.1 Lifetime and Scoping
    3.3.2 Preallocation of Resources
    3.3.3 Address Space
    3.3.4 Current Context Stack
    3.3.5 Context State
  3.4 Modules and Functions
  3.5 Kernels (Functions)
  3.6 Device Memory
  3.7 Streams and Events
    3.7.1 Software Pipelining
    3.7.2 Stream Callbacks
    3.7.3 The NULL Stream
    3.7.4 Events
  3.8 Host Memory
    3.8.1 Pinned Host Memory
    3.8.2 Portable Pinned Memory
    3.8.3 Mapped Pinned Memory
    3.8.4 Host Memory Registration
  3.9 CUDA Arrays and Texturing
    3.9.1 Texture References
    3.9.2 Surface References