Tag Archives: C++

Autotuning OpenCL kernels – CLTune on Windows 7

CLTune is a C++ library for automatically tuning OpenCL kernels to extract the maximum speed from your device. I’m going to try building and using it on Windows 7 with MinGW-w64 (GCC 4.9.1) to see what I can achieve with it. While properly written OpenCL code should work on any conformant device and platform, there’s no guarantee it will be fast. What’s fast on an Nvidia GTX 560 Ti isn’t going to get maximum speed out of an Intel CPU. A kernel that squeezes the maximum throughput out of an Intel CPU when using the Intel OpenCL runtime probably won’t do so well on the AMD CPU runtime. This problem even exists between different versions of Nvidia GPUs – each new compute capability requires different tuning. Continue reading