SmallptCPU vs SmallptGPU
Written by David Bucciarelli
SmallptGPU is a small and simple demo written in
OpenCL in order to test the performance of this new standard. It
is based on Kevin Beason's Smallpt available at
http://www.kevinbeason.com/smallpt/.
SmallptGPU was written using the ATI OpenCL SDK 2.0 on Linux, but it should work on any
platform/implementation (e.g. NVIDIA). Some discussion about this little toy can be found at Luxrender's forum.
A video of SmallptGPU is available here: http://vimeo.com/8013005 (the old low quality version is available here: http://vimeo.com/8013005)
History
V1.6 - Thanks to Jens and all the discussion at Luxrender's forum,
SmallptGPU now works fine with MacOS and NVIDIA cards. A bug in Apple's OpenCL
compiler has been found (Khronos's OpenCL forum)
and a workaround has been applied to SmallptGPU.
Added a new kernel with a direct lighting surface integrator (very fast indeed).
V1.5 - Thanks to the discussion at Beyond3D (http://forum.beyond3d.com/showthread.php?t=55913), the performance on NVIDIA GPUs has been improved. It is not yet where it should be but is a lot better now.
V1.4 - Updated for ATI SDK 2.0, fixed a problem in object selection
V1.3 - Jens's patch for MacOS, added on-screen help, fixed performance estimation, removed movie recording, added Windows binaries
V1.2 - Indirect diffuse path can now be disabled/enabled (available only on the CPU version because of a bug in ATI's compiler), optimized buffer reallocation, added keys to select/move objects
V1.1 - Fixed a few portability problems, added support for saving a movie, fixed a problem in the window resize code
V1.0 - First release
The following test has been done at 1024x768 with
scenes/cornell.scn.
SmallptCPU
This is just a simple single-threaded CPU implementation (no OpenCL
involved). Result:
Sample/sec 446836
SmallptGPU on CPU device
This is the OpenCL implementation using only the CPU device.
Result:
Reading scene: scenes/cornell.scn
Scene size: 9
For test only: Expires on Sun Feb 28 00:00:00 2010
OpenCL Device 0: Type = TYPE_CPU
OpenCL Device 0: Name = Intel(R) Core(TM)2 Quad CPU Q6600 @ 2.40GHz
OpenCL Device 0: Compute units = 4
OpenCL Device 0: Max. work group size = 1024
Reading file 'rendering_kernel.cl' (size 2634 bytes)
[...]
Rendering time 1.870000 sec (pass 7) Sample/sec 420552
It uses all 4 cores but has about the same performance as
SmallptCPU (with only one core). I guess CPU devices are useful
only for development purposes (e.g. when you don't have a fast GPU
available).
SmallptGPU on GPU device
This is the OpenCL implementation using only the GPU device.
Result:
Reading scene: scenes/cornell.scn
Scene size: 9
For test only: Expires on Sun Feb 28 00:00:00 2010
OpenCL Device 0: Type = TYPE_GPU
OpenCL Device 0: Name = ATI RV770
OpenCL Device 0: Compute units = 10
OpenCL Device 0: Max. work group size = 256
Reading file 'rendering_kernel.cl' (size 2634 bytes)
[...]
Rendering time 1.040000 sec (pass 236) Sample/sec 4537108
It is about 10 times faster than the single-threaded CPU
implementation.
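The ratios between the three results can be checked with a few lines of C. This is only a sketch of the arithmetic: the sample/sec figures are the ones reported above, while the function names speedup and print_speedups are made up for this illustration and are not part of SmallptGPU.

```c
#include <assert.h>
#include <stdio.h>

/* Ratio of a measured throughput to a baseline throughput (both in samples/sec).
   Hypothetical helper, not taken from the SmallptGPU sources. */
double speedup(double samples_per_sec, double baseline_samples_per_sec)
{
    assert(baseline_samples_per_sec > 0.0);
    return samples_per_sec / baseline_samples_per_sec;
}

/* Print both ratios against the single-threaded SmallptCPU result. */
void print_speedups(void)
{
    const double cpu_single = 446836.0;   /* SmallptCPU, one thread */
    const double ocl_cpu    = 420552.0;   /* SmallptGPU on the OpenCL CPU device */
    const double ocl_gpu    = 4537108.0;  /* SmallptGPU on the OpenCL GPU device */

    printf("OpenCL CPU device vs single thread: %.2fx\n",
           speedup(ocl_cpu, cpu_single));
    printf("OpenCL GPU device vs single thread: %.2fx\n",
           speedup(ocl_gpu, cpu_single));
}
```

Calling print_speedups() from main should report roughly 0.94x for the OpenCL CPU device and roughly 10.15x for the GPU device.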
How to compile
Just edit the Makefile and use an appropriate value
for ATISTREAMSDKROOT.
Key bindings
'p' - save image.ppm
ESC - exit
Arrow keys - rotate camera left/right/up/down
'a' and 'd' - move camera left and right
'w' and 's' - move camera forward and backward
'r' and 'f' - move camera up and down
PageUp and PageDown - move camera target up and down
' ' - refresh the window
'+' and '-' - to select next/previous object
'2', '3', '4', '5', '6', '8', '9' - to move selected object
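For readers curious about what saving image.ppm with 'p' might involve, here is a minimal sketch of a plain-text (P3) PPM writer in C. It is not taken from the SmallptGPU sources: the names to_byte and save_ppm, the float RGB buffer layout and the simple linear mapping to 0..255 (no gamma correction) are all assumptions made for this illustration.

```c
#include <assert.h>
#include <stdio.h>

/* Clamp a radiance value to [0,1] and scale it to 0..255.
   Assumption: linear mapping, no gamma correction. */
int to_byte(float x)
{
    if (x < 0.f) x = 0.f;
    if (x > 1.f) x = 1.f;
    return (int)(x * 255.f + .5f);
}

/* Write width*height RGB float triples as a plain-text (P3) PPM file.
   Returns 0 on success, -1 on failure. Hypothetical helper. */
int save_ppm(const char *path, const float *rgb, int width, int height)
{
    assert(path && rgb && width > 0 && height > 0);

    FILE *f = fopen(path, "w");
    if (!f)
        return -1;

    /* Header: magic number, image size, maximum channel value. */
    fprintf(f, "P3\n%d %d\n255\n", width, height);

    /* One "R G B" triple per pixel, in row-major order. */
    for (int i = 0; i < width * height; i++)
        fprintf(f, "%d %d %d ",
                to_byte(rgb[3 * i]),
                to_byte(rgb[3 * i + 1]),
                to_byte(rgb[3 * i + 2]));

    return fclose(f) == 0 ? 0 : -1;
}
```

With a buffer of width * height * 3 floats, save_ppm("image.ppm", pixels, 1024, 768) would write a file that most image viewers can open. The plain-text variant is easy to inspect by hand, at the cost of larger files than the binary (P6) variant.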
Download: smallptgpu-v1.6.tgz (includes sources, Linux 64bit binaries and Windows 32bit binaries)