000 06230cam a2200565Mi 4500
001 ocn962064304
003 OCoLC
005 20190719103346.0
006 m o d
007 cr |||||||||||
008 161006t20172017ne a o 001 0 eng d
040 _aIDEBK
_beng
_erda
_cIDEBK
_dCOO
_dOCLCO
_dYDX
_dUMI
_dSTF
_dTOH
_dMERUC
_dDEBBG
_dOPELS
_dN$T
_dUPM
_dDEBSZ
_dGZM
_dOCLCQ
_dUAB
_dNLE
_dOCLCF
_dDXU
_dK6U
_dLIV
_dD6H
_dSNK
_dOCL
_dVVB
_dU3W
_dOCLCQ
_dUOK
_dWYU
019 _a958865658
_a959427707
_a960211532
_a961309123
_a962007727
020 _a0128037881
_q(ebk)
020 _a9780128037881
020 _z0128037385
020 _a9780128037386
020 _a0128037385
035 _a(OCoLC)962064304
_z(OCoLC)958865658
_z(OCoLC)959427707
_z(OCoLC)960211532
_z(OCoLC)961309123
_z(OCoLC)962007727
050 4 _aT385
072 7 _aCOM
_x000000
_2bisacsh
082 0 4 _a006.6869
_223
245 0 0 _aAdvances in GPU research and practice /
_cedited by Hamid Sarbazi-Azad.
264 1 _aAmsterdam :
_bElsevier,
_c[2017]
264 4 _c©2017
300 _a1 online resource (776 pages) :
_billustrations (some color)
336 _atext
_btxt
_2rdacontent
337 _acomputer
_bc
_2rdamedia
338 _aonline resource
_bcr
_2rdacarrier
490 1 _aEmerging Trends in Computer Science and Applied Computing
504 _aReferences; Chapter 2: SnuCL: A unified OpenCL framework for heterogeneous clusters; 1 Introduction; 2 OpenCL; 2.1 Platform Model; 2.2 Execution Model; 2.3 Memory Model; 2.4 Synchronization; 2.5 Memory Consistency; 2.6 OpenCL ICD; 3 Overview of SnuCL framework; 3.1 Limitations of OpenCL; 3.2 SnuCL CPU; 3.3 SnuCL Single; 3.4 SnuCL Cluster; 3.4.1 Processing synchronization commands; 4 Memory management in SnuCL Cluster; 4.1 Space Allocation to Memory Objects; 4.2 Minimizing Copying Overhead; 4.3 Processing Memory Commands; 4.4 Consistency Management.
505 0 _aFront Cover; Advances in GPU Research and Practice; Copyright; Dedication; Contents; List of Contributors; Preface; Acknowledgments; Part 1: Programming and tools; Chapter 1: Formal analysis techniques for reliable GPU programming: current solutions and call to action; 1 GPUs in Support of Parallel Computing; Bugs in parallel and GPU code; 2 A quick introduction to GPUs; Organization of threads; Memory spaces; Barrier synchronization; Warps and lock-step execution; Dot product example; 3 Correctness issues in GPU programming; Data races; Lack of forward progress guarantees.
505 8 _aFloating-point accuracy; 4 The need for effective tools; 4.1 A Taxonomy of Current Tools; 4.2 Canonical Schedules and the Two-Thread Reduction; Race freedom implies determinism; Detecting races: "all for one and one for all"; Restricting to a canonical schedule; Reduction to a pair of threads; 4.3 Symbolic Bug-Finding Case Study: GKLEE; 4.4 Verification Case Study: GPUVerify; 5 Call to Action; GPUs will become more pervasive; Current tools show promise; Solving basic correctness issues; Equivalence checking; Clarity from vendors and standards bodies; User validation of tools; Acknowledgments.
505 8 _a4.5 Detecting Memory Objects Written by a Kernel; 5 SnuCL extensions to OpenCL; 6 Performance evaluation; 6.1 Evaluation Methodology; 6.2 Performance; 6.2.1 Scalability on the medium-scale GPU cluster; 6.2.2 Scalability on the large-scale CPU cluster; 7 Conclusions; Acknowledgments; References; Chapter 3: Thread communication and synchronization on massively parallel GPUs; 1 Introduction; 2 Coarse-Grained Communication and Synchronization; 2.1 Global Barrier at the Kernel Level; 2.2 Local Barrier at the Work-Group Level; 2.3 Implicit Barrier at the Wavefront Level.
505 8 _a3 Built-In Atomic Functions on Regular Variables; 4 Fine-Grained Communication and Synchronization; 4.1 Memory Consistency Model; 4.1.1 Sequential consistency; 4.1.2 Relaxed consistency; 4.2 The OpenCL 2.0 Memory Model; 4.2.1 Relationships between two memory operations; 4.2.2 Special atomic operations and stand-alone memory fence; 4.2.3 Release and acquire semantics; 4.2.4 Memory order parameters; 4.2.5 Memory scope parameters; 5 Conclusion and Future Research Direction; References; Chapter 4: Software-level task scheduling on GPUs; 1 Introduction, Problem Statement, and Context.
520 _aAdvances in GPU Research and Practice focuses on research and practice in GPU-based systems. The topics range from hardware and architectural issues to high-level issues such as application systems, parallel programming, middleware, and power and energy. Divided into six parts, this edited volume provides the latest research on GPU computing. Part I: Architectural Solutions focuses on architectural topics that improve the performance of GPUs. Part II: System Software discusses the OS, compilers, libraries, programming environments, languages, and paradigms proposed and analyzed to support GPU programmers. Part III: Power and Reliability Issues covers different aspects of energy, power, and reliability concerns in GPUs. Part IV: Performance Analysis illustrates mathematical and analytical techniques for predicting different performance metrics in GPUs. Part V: Algorithms presents how to design efficient algorithms for GPUs and analyze their complexity. Part VI: Applications and Related Topics provides use cases and examples of how GPUs are used across many sectors.
650 0 _aGraphics processing units
_xProgramming.
650 0 _aImaging systems.
650 0 _aComputer graphics.
650 0 _aImage processing
_xDigital techniques.
650 7 _aCOMPUTERS
_xGeneral.
_2bisacsh
650 7 _aImaging systems.
_2fast
_0(OCoLC)fst00967605
650 7 _aImage processing
_xDigital techniques.
_2fast
_0(OCoLC)fst00967508
650 7 _aComputer graphics.
_2fast
_0(OCoLC)fst00872119
655 4 _aElectronic books.
700 1 _aSarbazi-Azad, Hamid,
_eeditor.
776 0 8 _iPrint version:
_aAzad, Hamid Sarbazi.
_tAdvances in GPU Research and Practice.
_dSaint Louis : Elsevier Science, ©2016
_z9780128037386
830 0 _aEmerging trends in computer science & applied computing.
856 4 0 _3ScienceDirect
_uhttp://www.sciencedirect.com/science/book/9780128037386
999 _c504246
_d504181