Why you should never use NVIDIA OptiX: bugs

In my most recent project, I implemented a GPU path tracer and bidirectional path tracer. I had already used OptiX for my bachelor thesis, and I thought it would be a good idea to use it for this as well. I mean, it was created for ray tracing, so it should be predestined for this kind of GPU application.

One of the biggest advantages for me was the reliable and fast acceleration structures, which are crucial for ray tracing. But OptiX proved to cause more pain than benefit due to several bugs. And the overall speed was not that great in the end, probably due to insufficient optimisation in my code, which in turn is at least partly caused by the lack of profilers (which would be available for plain CUDA).

But the big issue really was the bugs, which actively prevented me from implementing things. Some of the bugs are totally ridiculous and don't get fixed.

I encountered the first bug two years ago, while coding for my bachelor thesis. Unfortunately I ignored it and decided on OptiX again. There is a special printf function, rtPrintf, which takes the massively parallel architecture of the graphics card into account. For example, it is possible to limit the printing to just one pixel. But, ehm:

// the following line works:
rtPrintf("some text\n");

// on the contrary, the following two don't:
rtPrintf("some text\n");
rtPrintf("some text\n");

The latter crashes with

OptiX Error: Invalid value (Details: Function “RTresult _rtContextLaunch2D(RTcontext, unsigned int, RTsize, RTsize)” caught exception: Error in rtPrintf format string: “”, [7995632])

The difference really is only in calling rtPrintf once versus twice with the same text. Sometimes it could even be a different text, and sometimes it worked with the same text. I reduced the code to a minimal example in order to eliminate possible stack corruption, but to no avail. This problem was
reported on the NVIDIA forum in June 2013. As proposed in the forum, it's possible to use stdio's printf in conjunction with some 'if' guards as a workaround.

The second one is also connected to rtPrintf:

RT_PROGRAM void pathtrace_camera() {
   BiDirSubPathVertex lightVertices[2];
   lightVertices[0].existing = false;
   lightVertices[1].existing = false;

   for(unsigned int i=0; i<2; i++) {
      if(!(lightVertices[i].existing)) break;
      // rtPrintf("something\n");
   }

   output_buffer[launch_index] = make_float4(1.f, 1.f, 1.f, 1.f);
}

This code would crash on two tested operating systems (Windows 7 and Linux) and on two different computers (my own workstation and one from the uni).

OptiX Error: Unknown error (Details: Function “RTresult _rtContextLaunch2D(RTcontext, unsigned int, RTsize, RTsize)” caught exception: Encountered a CUDA error: Kernel launch returned (700): Launch failed, [6619200])

It runs fine, though, when the rtPrintf is not commented out. I reported it here. Again I made a minimal example; in the end the source was only two files of roughly 30 and 60 lines, but it crashed constantly and reliably depending on whether the rtPrintf was commented out or not.

So, later, whenever the program crashed, the first attempt to resolve the issue was to randomly sprinkle rtPrintf calls through the code. This was also one of the solutions to another problem, presented in the next paragraphs.

But before I come to it, I quickly have to explain the compilation process. There are two steps: first, during "compile time", the C++ code is turned into binaries and the OptiX source into intermediate .ptx files. Those contain a sort of GPU assembly, which is then compiled at runtime by the NVIDIA GPU driver into the actual binaries executed on the device. This is triggered by a C++ function call, usually context->compile().
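As a sketch of this pipeline (the file name, program name, and launch dimensions are made up; the calls are from the OptiX C++ wrapper of that era):

```cpp
// Build time, outside this program: nvcc --ptx pathtracer.cu -o pathtracer.ptx
// Runtime: load the PTX and let the driver compile it for the installed GPU.
optix::Context context = optix::Context::create();
optix::Program raygen =
    context->createProgramFromPTXFile("pathtracer.ptx", "pathtrace_camera");
context->setRayGenerationProgram(0, raygen);
context->compile();                 // driver JIT-compiles the PTX here
context->launch(0, width, height);  // runs the ray generation program
```

It is that context->compile() call where the hangs described below occur.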

Now, the problem is that these context->compile() calls don't always return. Sometimes they run until the host memory is full. Once it helped to sprinkle calls to the mentioned rtPrintf through the code; another time the resolution was an extra RT_CALLABLE_PROGRAM construct; report with additional information here.

Generally, those problems are painful and demotivating, especially considering that NVIDIA doesn't seem to care; at least they didn't answer any of my reports. But the reason for this could also be that they are already phasing out OptiX due to little success. Anyway, it was certainly the last time I used OptiX, and I don't recommend that anybody start with it.

2 thoughts on “Why you should never use NVIDIA OptiX: bugs”

  1. Hi,

    I’m David McAllister, the engineering manager for NVIDIA OptiX. I’m sorry rtPrintf didn’t work for you, and I’m glad we were able to unblock you by suggesting that you use CUDA printf. Your other bug with rtContextCompile is concerning and I’ll take a look at it. The community support forum is mostly for discussion among the community, and we, the OptiX development team, only interact on the forum as time permits. The fact that we don’t hang out there much indicates that OptiX is growing rapidly and we are busy growing it. When you posted your bug last summer we were busy working on the award-winning real-time lighting preview with Pixar that was shown at Siggraph (http://www.nvidia.com/object/siggraph2013-theater.html) and in a GTC keynote talk (http://www.gputechconf.com/attendees/keynotes-replay).

    I like your Mandelbulb distance estimator optimization. Christian Buchner used OptiX and our Julia sample within just a few days of the Mandelbulb being published, four years ago, to make the first interactive renderer of the Mandelbulb. See Christian’s stereo video here: http://vimeo.com/8043636. While that was performant, our hardware has improved dramatically for ray tracing since that time. I expect the reason you were having performance trouble, while Christian didn’t, is that your optimization is not well suited to OptiX’s single ray programming model. I’m glad you were able to express it well in CUDA.

    David McAllister

  2. Hi,
I was flattered to see a reply by an NVIDIA engineer :) I had a lot on my mind recently and in addition lost my glasses; that's why I couldn't answer earlier.

I didn't post any bugs last summer, because I thought they were problems on my end. This rtPrintf bug was posted by somebody else. But that's only a detail.

As to the performance of the Mandelbulb: it was interactive from the beginning when not using shadows. But the performance was below 24 fps in certain situations and therefore not real-time. In fact the performance was very similar to the Julia example, and I don't think there was any performance trouble with OptiX in that case (trouble as in the GPU not being used optimally). Sorry that was not clear.

    Adam Celarek
