OpenCL Evaluation for Numerical Linear Algebra
Library Development
Peng Du∗, Piotr Luszczek∗, Jack Dongarra∗†‡
∗University of Tennessee Innovative Computing Laboratory
†Oak Ridge National Laboratory
‡University of Manchester
I. PORTABLE GPU PROGRAMMING

With the help of CUDA [7], [6], many applications have improved their performance by using GPUs. In our project, called Matrix Algebra on GPU and Multicore Architectures (MAGMA) [10], we mainly focus on dense linear algebra routines similar to those from LAPACK [1]. Besides CUDA, there are other frameworks that allow platform-independent programming for GPUs. The three main frameworks are:
1) DirectCompute from Microsoft,
2) OpenGL Shading Language (GLSL), and
3) OpenCL.
The first one allows access to graphics cards from multiple vendors. However, it is specific to Microsoft Windows and is therefore not portable between host operating systems (OS).

CUDA term                        OpenCL term
host CPU                         host
streaming multiprocessor (SM)    compute unit (CU)
scalar core                      processing element (PE)
host thread                      host program
thread                           work-item
thread block                     work-group
grid                             NDRange
shared memory                    local memory
constant memory space            constant memory
texture memory space             constant memory
TABLE I
COMPARISON OF TERMS USED BY CUDA AND OPENCL TO DESCRIBE VERY SIMILAR CONCEPTS.

• Obtaining the ID for the thread/work-item and
OpenGL Shading Language [8] is portable a