Autotuning OpenCL kernels – CLTune on Windows 7

The OpenCL logo.

CLTune is a C++ library for automatically tuning OpenCL kernels to extract the maximum speed from your device. I'm going to try building and using it on Windows 7 with MinGW-w64 (GCC 4.9.1) to see what I can achieve with it. While properly written OpenCL code should work on any conformant device and platform, there's …

Correctly enabling cl_khr_fp64 in both OpenCL 1.1 and 1.2

I started most of my OpenCL development on Nvidia GPUs, which still only support OpenCL 1.1. When I started testing code that used double precision arithmetic on AMD Radeon GPUs, I kept running into a warning about the cl_khr_fp64 extension. The reason for this is, of course, that in OpenCL 1.2 cl_khr_fp64 moved from an …
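As a rough illustration of the kind of fix involved, the sketch below prepends a version-guarded pragma to kernel source before it is compiled. It assumes the warning comes from enabling cl_khr_fp64 on a compiler that reports OpenCL C 1.2 or later, where the pragma is no longer needed; this may not be the exact approach the full post takes. The names `FP64_PREAMBLE`, `with_fp64`, and the `scale` kernel are hypothetical.

```python
# Hypothetical helper: names here are illustrative, not taken from the post.
# __OPENCL_VERSION__ is 110 for OpenCL C 1.1 and 120 for OpenCL C 1.2, so the
# pragma is only emitted for compilers older than 1.2.
FP64_PREAMBLE = """\
#if __OPENCL_VERSION__ < 120
#pragma OPENCL EXTENSION cl_khr_fp64 : enable
#endif
"""


def with_fp64(kernel_source):
    """Return kernel_source with the guarded fp64 pragma prepended."""
    return FP64_PREAMBLE + kernel_source


# A small double-precision kernel used only as an example.
KERNEL = """\
__kernel void scale(__global double *x, const double a)
{
    size_t i = get_global_id(0);
    x[i] *= a;
}
"""

source = with_fp64(KERNEL)
```

The resulting `source` string could then be handed to whatever host API builds the program; keeping the guard in one preamble avoids repeating it in every kernel file.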

Making PyOpenCL handle NumPy arrays as images

PyOpenCL Image objects take a shape tuple that gives (width, height, depth), but NumPy arrays specify shape in the order (rows, columns, ...) a.k.a. (height, width, ...) where the ellipsis indicates higher dimensions. What's important is that the width and height dimensions have been swapped. The PyOpenCL documentation suggests creating the NumPy arrays in the …
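The swap described above can be sketched for the simplest case, a 2D single-channel array. The helper name `image_shape_from_numpy_shape` is hypothetical, and the sketch deliberately rejects anything other than a two-element shape, since how extra axes (depth, channels) should map is not settled by the excerpt.

```python
def image_shape_from_numpy_shape(shape):
    """Convert a 2D NumPy-style shape (height, width) to (width, height).

    Hypothetical helper for illustration; only the 2D case is handled.
    """
    if len(shape) != 2:
        raise ValueError("only 2D (height, width) shapes are handled here")
    height, width = shape
    return (width, height)


# A NumPy array with 480 rows and 640 columns has shape (480, 640);
# the corresponding image shape tuple is (640, 480).
numpy_shape = (480, 640)
image_shape = image_shape_from_numpy_shape(numpy_shape)
```

In practice the input would be `arr.shape` from a NumPy array, and the returned tuple is what would be supplied as the image's shape.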