Photo rendering engines performance test
Concept
The purpose of this article is to present some performance and output comparisons of open source ray tracing software that could be used as alternatives to the current photo generation engine embedded in Sweet Home 3D, i.e. SunFlow.
Background
Sweet Home 3D is widely praised in the design community for its good photo rendering capabilities. This is largely due to the excellent open source library SunFlow, which is embedded in the program. As SunFlow is no longer maintained, it has not been, and will not be, brought up to date with the latest techniques delivered by academic research and by progress in graphics processing hardware; in particular, it makes no use of the graphics processor at all. This article was therefore initiated to see whether an alternative rendering engine can be found (and eventually added to Sweet Home 3D) to provide more options in the future. This newer engine would need to satisfy several requirements in order to be considered a valuable addition to the program; in particular it must have these attributes:
- Equal or better performance to SunFlow
- Good memory management
- Able to run cross platform
- Have an open source license
Requirements for candidates
- Must create output images via a ray tracing algorithm, so that existing knowledge within the community keeps its value; this excludes options such as the Reyes rendering seen in RenderMan
- Must be open source; preferably, LGPL, MIT or Apache license (or equivalent)
- Must be written in a language that is appropriate to the community: Java or C++ (possibly C); there are JavaScript and Python options that have not been considered
- Must have a feature set comparable to SunFlow: for example photon mapping, ray casting, caustics, translucent scattering, several global illumination options, various sky light options, depth of field, ambient occlusion, etc.
- Must be able to provide high quality output, photo realistic is the benchmark
- Must be currently maintained; a significant release should have been made within roughly the past year
- Must run under Windows, Mac OS X and Linux, with minimal code differences, no quality differences, and minimal speed differences between platforms
- Must be able to handle OBJ 3D models
- Must be able to handle JPEG and PNG images
- Must provide feedback during rendering computation (at minimum % of operation complete, but for preference image parts already computed)
- Should be able to have settings altered at runtime to allow customization of output by users
- Should be faster than SunFlow, or at least of comparable speed; this suggests that the ability to use graphics processor acceleration would be an advantage
- If code is required to drive the renderer, it should be a simple, plain-text, interpreted scripting language
- In order to allow a staged approach to research and development, the ray tracing engine must allow standalone binary and command line operation, or be a plugin to a well-established 3D design program; if it requires compilation into an executable it will be excluded. Note there is a chance that a good candidate that provides all other features will be excluded on this basis, but there is no real workaround to this requirement.
Candidates
The following candidate list comes primarily from this Wikipedia article but also from other Internet searches.
Table 1. Candidates
Name | License | Notes | Level of support
OSPRay | Apache 2.0 | | 13 committers with 10+ commits, many with less. Source updated recently
YafaRay | LGPL | | 12 committers with 10+ commits, 6 with less
Blender Cycles | GPL | | 18 committers with 10+ commits, many with less
LuxRender | GPLv3 | | The site shows 19 members; recent activity shows 11 commits in the past year by 3 members. The current BitBucket release note of 1.5 is 2.5 years old
Radiance | BSD | Written in C | CVS repo shows 2 users committing since 2017-01-01, the bulk by 1 of them. Last release August 2017
POV-Ray | AGPLv3 | | 3 committers with 10+ commits, many with less; GitHub shows 2 contributors this year. Not released frequently, last release 4 years ago, though source code updates and beta releases still continue
Mitsuba | GPLv3 | | Online git repo shows 2 contributors. Last update Feb 2016
BRL-CAD | BSD, LGPL | Focused on constructive solid geometry (CSG) | SourceForge shows 5 contributors in the past few months. Actively being developed on SourceForge. Last release 7.26.0, 2016-09-02
Visionaray | MIT | States it is not Windows only; might be difficult to verify support on Mac OS X and Linux | GitHub shows 1 recent contributor. The only compiled release is old
Embree | | OSPRay is built on this, so only one of the two should be tested |
BEAM4 | | | Last GitHub update 2 years ago
Art of Illusion | GPL | Low quality example images; editor focus, no reference to ray trace features | Last release Dec 2016
Manta Interactive Ray Tracer | MIT | | Last source update 4 years ago
Open Ray Trace | | Written in Python | Last release 2007
Ray Trace 3.0 | | | Wayback Machine shows release 3.0 prior to 2006
Raja | | | Last update 2003
Tonatiuh Project | | Not a full-featured scene renderer; designed for solar collectors |
McXtrace | GPL | Not a ray tracer; creates compilable C code |
Picogen | GPLv3 | Terrain focused |
Pixie | GPL | Not a ray trace renderer; uses ray tracing for hidden surface determination |
Tachyon | GPL | | Last release 2013
Methodology
The following scene, shown in the Sweet Home 3D 5.0 splash screen and based on http://www.sweethome3d.com/examples/SweetHome3DExample7.sh3d, was used as the reference scene to be loaded and rendered by each engine.
Figure 1. Reference scene
The scene was exported to the Wavefront (OBJ) format for import.
- Each candidate is installed.
- The scene is loaded in whatever manner the candidate requires.
- The engine's unit scale is factored against the OBJ scale (cm) and the resulting ratios are applied to transforms and lighting powers (see the sketch after this list).
- The 32 lights from the reference scene are added via scripts (as these are not exported to OBJ) and adjusted to produce reasonable light powers.
- The camera position, view angle and settings are replicated.
- The sun light, ambient light and background texture are set up.
- The render settings are made to match the SunFlow settings as closely as possible.
- A test render is run and…
- Settings are iteratively adjusted in cases where the scene output is not similar enough to the reference image.
- The scene is rendered at sizes of 360x360, 720x720 and 1024x1024 and metrics are recorded.
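The scale-factoring step is simple but easy to get wrong, particularly for light powers. Below is a minimal Java sketch of the idea, assuming a target engine that works in meters while the OBJ export is in centimeters; the class and method names are illustrative only, not taken from any engine's real API.

```java
// Illustrative unit conversion: OBJ scene exported in centimeters,
// target engine assumed to work in meters.
public final class UnitConversion {
    static final double CM_TO_M = 0.01;

    // Positions (and other transform translations) scale linearly.
    static double[] positionToMeters(double[] cm) {
        return new double[] { cm[0] * CM_TO_M, cm[1] * CM_TO_M, cm[2] * CM_TO_M };
    }

    // Point-light intensity follows an inverse-square falloff, so when
    // distances shrink by a factor k, the power needed for the same
    // apparent brightness shrinks by roughly k^2 (assuming the engine
    // expresses light power that way).
    static double lightPowerToMeters(double powerInCmUnits) {
        return powerInCmUnits * CM_TO_M * CM_TO_M;
    }
}
```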
It is expected that the output images will not be the same as the reference image, for a variety of reasons. The engines use different techniques that do not produce exactly the same results; an engine's treatment of light may require extra input data (for example light portals) to make it accurate; and the expertise required in each engine to maximize output quality certainly cannot be gained in the short timeframe of the experiment.
With the above in mind, the approach will be to ensure that each engine is doing all of the work that SunFlow is doing and, as far as possible, to ensure that settings are not allowing work to be short-cut, which would reduce the processing effort required. Trivial examples of this type of short cut might be a low ambient occlusion setting allowing early cutoffs for intersection tests, or a low light energy allowing a light's sphere of influence to cover fewer objects than required.
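To make the first short cut concrete, here is a minimal Java sketch of ambient occlusion sampling, under the assumption of a hypothetical Scene interface (none of these names come from SunFlow or any candidate engine). The point is that a low maximum occlusion distance lets every shadow ray stop its intersection tests early, reducing the work done and potentially the quality of the result.

```java
import java.util.Random;

public final class AmbientOcclusionSketch {

    /** Hypothetical scene hook: does any geometry block this ray within maxDist?
     *  A small maxDist lets the ray traversal stop early — the short cut
     *  described above. */
    interface Scene {
        boolean occludedWithin(double[] origin, double[] dir, double maxDist);
    }

    /** Monte Carlo ambient occlusion at a surface point: fire random rays over
     *  the hemisphere and measure the fraction that escape within maxDist.
     *  Returns 1 for fully open, 0 for fully occluded. */
    static double ambientOcclusion(Scene scene, double[] p, int samples,
                                   double maxDist, Random rng) {
        int unoccluded = 0;
        for (int i = 0; i < samples; i++) {
            if (!scene.occludedWithin(p, randomHemisphereDir(rng), maxDist)) {
                unoccluded++;
            }
        }
        return (double) unoccluded / samples;
    }

    /** Uniform direction on the upper hemisphere (surface normal assumed +Z),
     *  via rejection sampling of the unit ball. */
    static double[] randomHemisphereDir(Random rng) {
        while (true) {
            double x = 2 * rng.nextDouble() - 1;
            double y = 2 * rng.nextDouble() - 1;
            double z = rng.nextDouble();
            double len = Math.sqrt(x * x + y * y + z * z);
            if (len > 1e-9 && len <= 1) {
                return new double[] { x / len, y / len, z / len };
            }
        }
    }
}
```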
Rendering results
The 4 candidates selected for this test were OSPRay, YafaRay, Cycles and LuxRender. These engines along with SunFlow were tested on the following PC:
- Windows 10 64 bit
- AMD 6 core 3.5 GHz
- 16 GB ram
- NVidia GeForce GTX 750 1 GB GDDR5 memory
Table 2. Sample outputs
Engine | Sample | Progress
SunFlow | |
OSPRay | |
Cycles | |
YafaRay | |
LuxRender | |
Each output image was checked to ensure that lights were giving off light and that shadows appeared in the test output as they did in the reference. Cases such as the very strong, crisp sunlight shadows in the reference image not being reproduced by any test engine may indicate that more effort is needed with those engines.
Performance results
Although the CPU (main processor) and GPU (graphics processor) usage at each stage was gathered during the tests, it varied greatly depending on phase and engine, and doesn't provide any further clarification regarding performance. Those engines that did use the GPU never appeared to be bottlenecked by the GPU itself. Use of the OpenCL API failed in all cases.
Every candidate tested kept memory to a reasonable level, well below that used by SunFlow. In some candidates, like LuxRender and YafaRay, this may be because of aggressive intermediate file writing/reading, so memory use might rise for those engines if they were moved to a (faster) in-memory-only technique. Details and comparison of these figures are not considered useful for this article.
Table 3. Performance results
Engine | Resolution | Average render time (seconds) | Average render time (hh:mm:ss) | Time for pre-processing OBJ file (approximate) | Progress feedback | GPU usage
SunFlow | 360x360 | 379 | 6:19 | No intermediate files | Complete buckets | None
SunFlow | 720x720 | 1309 | 21:49 | | |
SunFlow | 1024x1024 | 2333 | 38:53 | | |
OSPRay | 360x360 | 25 | 0:25 | 20 s, no intermediate files | Incremental whole image | Strong usage with peaks of 45%
OSPRay | 720x720 | 41 | 0:41 | | |
OSPRay | 1024x1024 | 68 | 1:08 | | |
Cycles | 360x360 | 570 | 9:30 | 25 s, no intermediate files | Incremental buckets | None
Cycles | 720x720 | 1950 | 32:30 | | |
Cycles | 1024x1024 | 3974 | 1:06:14 | | |
YafaRay | 360x360 | 300 | 5:00 | 290 s including intermediate file write then read | Complete buckets | GPU memory used and low processor use
YafaRay | 720x720 | 324 | 5:24 | | |
YafaRay | 1024x1024 | 347 | 5:47 | | |
LuxRender | 360x360 | 794 | 13:14 | 420 s including intermediate file write then read | Incremental whole image | None
LuxRender | 720x720 | 3189 | 53:09 | | |
LuxRender | 1024x1024 | 4527 | 1:15:27 | | |
Figure 2. Rendering times
Examining the two best performers and fitting a linear trend line to each confirms the overhead figures in Table 3 and shows that the scaling is indeed linear.
Figure 3. Rendering times trend lines for top 2
The two trend line equations shown in Figure 3 reveal that the scaling performance of the two engines is both linear and reasonably similar. Each equation should be read as "time = cost per megapixel plus constant overhead", where the time is "y", the cost per megapixel is the coefficient of "x", and the constant overhead is the number after the "+".
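In symbols (a restatement added here for clarity, keeping y and x as they appear in Figure 3):

$$ y = a\,x + b $$

where y is the render time in seconds, x the image size in megapixels, a the cost per megapixel, and b the constant overhead.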
These equations suggest that removing the writing/reading of intermediate files from YafaRay would result in performance and scaling similar to that of OSPRay. Keeping the intermediate file data in memory may be a viable way to remove this overhead cost, though source code investigation would be needed to confirm that.
Conclusion
It's fairly obvious that OSPRay and YafaRay are the standout performers under these test conditions. Both engines show around a tenfold speed improvement over SunFlow, so pushing more work onto them should still result in better speed than SunFlow. It's possible that getting a good result will require more data in the scene, which will be a factor in the development effort required to get the engine integrated.
There are several factors that suggest YafaRay is the better choice:
- OSPRay produces images with the materials (and associated textures) obviously mis-indexed in some manner, so that they are applied to the wrong objects. The materials themselves appear to be reasonably correct.
- The output quality (achieved by this study) for the YafaRay renderer is superior to that produced for the OSPRay renderer.
- The ability to integrate YafaRay with Blender is an advantage: development work on integrating this engine, and on community plugins and extensions, will be facilitated by a known working environment.
Unsurprisingly, these two candidates utilized the GPU and had excellent performance results, though the low GPU processor use by YafaRay might be an area to investigate further and attempt to increase.
[Update: a rendering plug-in based on YafaRay was developed in 2019 and released on January 31, 2020]
Notes on each candidate
SunFlow
In order to set up each of the other renderers, I extracted (as best I could) the configuration that Sweet Home 3D uses to drive SunFlow.
Values from the PhotoRender.properties file:
```
# High quality parameters
highQuality.antiAliasing.min=1
highQuality.antiAliasing.max=2
# Filter used to control oversampled image: "box", "triangle", "gaussian",
# "mitchell", "catmull-rom", "blackman-harris", "sinc", "lanczos" or "bspline"
highQuality.filter=blackman-harris
# Global illumination algorithm: "default" or "path"
# "default" uses ambient occlusion during day hours in virtual visit mode
# "path" takes much longer to compute but gives more realistic view at day hours
highQuality.globalIllumination=default
# Maximum bounces done by light rays when global illumination "path" is used
# Increasing this value greatly slows down rendering process
highQuality.diffusedBounces=1
# Caustics photons count, with 0 producing no caustics
# If different from 0, should be higher than 1000000 to obtain some visible effect
highQuality.causticsPhotons=0
# Shader used to render shiny materials: "default", "glossy" or "silk"
# "default" uses silk shader at high quality level
# and in virtual view mode, glossy otherwise
highQuality.shininessShader=default
highQuality.normalLens.focusDistance=250.
highQuality.normalLens.radius=1
# Algorithm used by the renderer: "bucket", "fast" or "ipr"
highQuality.samplerAlgorithm=bucket
```
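As a quick illustration (a minimal sketch, not the actual Sweet Home 3D loading code; the resource path used here is an assumption), these keys can be read with the standard java.util.Properties API:

```java
import java.io.InputStream;
import java.util.Properties;

public final class ReadPhotoRenderProperties {
    public static void main(String[] args) throws Exception {
        Properties props = new Properties();
        // PhotoRender.properties ships inside the Sweet Home 3D application jar;
        // the exact resource path is an assumption for this sketch.
        try (InputStream in = ReadPhotoRenderProperties.class
                .getResourceAsStream("/PhotoRender.properties")) {
            props.load(in);
        }
        // Keys and default values as listed above
        System.out.println(props.getProperty("highQuality.filter"));             // blackman-harris
        System.out.println(props.getProperty("highQuality.globalIllumination")); // default
        System.out.println(props.getProperty("highQuality.samplerAlgorithm"));   // bucket
    }
}
```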
Values from inspecting the code path in the PhotoRenderer class:
sunDirection = 45° up in the east
useSunSky = false (due to using the observer camera, i.e. the indoors camera)
So Direct Lighting and Ambient Occlusion will be the primary lighting system, with some distant sunlight coming from a sun sphere with settings sunPower = 40, samples = 4, radius = 1000, and center very high.
useSunskyLight = false and globalIllumination = default rather than path, therefore "Direct Lighting with Ambient Occlusion" is set as the gi.engine, with white as the bright color, 1/200th of white as the dark color, and ambocc.samples = 1. Discussion of this rendering method can be found here.
camera = PINHOLE; the camera location, fov, aspect and image resolution are all straightforward.
noCeilingLights
bucket.size = 64, bucket.order = spiral
The effects of these depth parameters aren't clear: depths.reflection = 4, depths.refraction = 16
The silk shader is set to be used, so the export of nodes to SunFlow uses the uber shader for shiny materials; I don't know if this affects performance or whether it could be relevant to other engines.
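To make the mapping concrete, here is a hedged sketch of how settings like these are passed to SunFlow through its Java API. The parameter names are taken from the list above; the exact names and call sequence used by Sweet Home 3D's PhotoRenderer may differ (for instance, SunFlow itself prefixes some ambient occlusion parameters with "gi.").

```java
import org.sunflow.SunflowAPI;

public final class SunflowSetupSketch {
    public static void main(String[] args) {
        SunflowAPI sunflow = new SunflowAPI();

        // Direct Lighting with Ambient Occlusion as the GI engine
        sunflow.parameter("gi.engine", "ambocc");
        sunflow.parameter("gi.ambocc.samples", 1);

        // Ray depths whose effect was unclear in the notes above
        sunflow.parameter("depths.reflection", 4);
        sunflow.parameter("depths.refraction", 16);

        // Bucket rendering, spiral order
        sunflow.parameter("sampler", "bucket");
        sunflow.parameter("bucket.size", 64);
        sunflow.parameter("bucket.order", "spiral");

        sunflow.options(SunflowAPI.DEFAULT_OPTIONS);
    }
}
```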
OSPRay
OSPRay is a set of pre-compiled programs from the open source project, targeted at various operating systems. It is not designed to be a fully comprehensive solution, but rather to give developers easier access to what might be possible with the project's source code.
OSPRay is built on Embree which appears to be a very well regarded ray tracing engine.
As it is basically a demonstration system for the underlying engine, it has a very limited set of operations and controls and comes with just a few examples of what can be achieved.
OSPRay has a pre-compiled onscreen viewing tool, which was used to test and measure the engine's capabilities. The only mechanism available to register when a render had completed (for this tool) was a screen update, which is done once processing is complete. Unfortunately, render buffers larger than the physical monitor size would not output results on screen, and therefore the test render sizes had to be equal to or smaller than the resolution of the testing device's screen.
Thus 1024x1024 was the maximum output achievable, and this limit was incorporated into the testing methodology for all engines.
At higher resolutions OSPRay did continue to do the work and the CPU/GPU were utilized, so it can be assumed that the engine itself produces larger images.
This engine has a progress output of iterative full screen, which is not as good as buckets.
The OSPRay renders shown above in Table 2 have some very obvious material mis-assignments; it is assumed that this could be solved once the code is accessed. The texture coordinates appear correct. The materials are still used, so this mismatch doesn't in itself invalidate the results, though it makes them less reliable. Examples of bad materials:
- The couch is missing portions, and the birthday cake decorations are located incorrectly.
- The glasses and bottle are not transparent, but the windows are.
Another quirk of note is that the French bread has a very poor shadow under it. This may be an issue with the ambient occlusion technique and distances, as the dining chairs appear to have reasonable shadows.
Cycles
Cycles is the default renderer for Blender which is an open source 3D modelling and rendering tool that is very widely used. It is built-in and therefore required no setup effort beyond installing Blender.
This engine has a progress output of iterative buckets, which is slightly better than the complete buckets of SunFlow.
Cycles produced very grainy images at 64 samples per pixel, so this was increased to 512 for these tests, which slowed the renders down considerably.
Figure 4. Cycles 720x720, samples per pixel set to 64 on left and 512 on right
Cycles has a good progress update output and also supplies an estimated time remaining figure.
YafaRay
YafaRay is primarily a plugin to Blender.
This engine has a progress output of complete buckets, which is the same as SunFlow.
The installation of YafaRay suffered from an apparently infrequent issue where the material preview image would crash Blender when the material editor panel was opened. This made it difficult to get all materials accurate; for example, it prevented making all glass objects correctly transparent in the time allocated.
Figure 5. YafaRay showing some objects not correctly transparent
An extra test was run at 7200x7200 to ensure YafaRay scales well up to the sizes needed. The time taken was 1885 s, or 31:25.
The trend line equation for YafaRay over all results, including this larger one, is y = 30.455x + 306.39, showing that the scaling continues linearly.
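As a sanity check (a worked example added here, assuming x is measured in megapixels and y in seconds, consistent with Figure 3), plugging the 7200x7200 test into the equation reproduces the measured time:

$$ x = \frac{7200 \times 7200}{10^6} = 51.84, \qquad y = 30.455 \times 51.84 + 306.39 \approx 1885\ \text{s} \approx 31{:}25 $$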
YafaRay (like most engines) has an antialiasing option which is applied per sample, so it affects render times. A test was also run at 720x720 with 4x antialiasing (AA). This added roughly 15% to the render time and appeared to scale linearly.
Figure 6. YafaRay at 720x720, left side is no AA, right side is 4x AA
To investigate the effect of the many lights in the scene, a YafaRay 720x720 render was run with half of the lights removed (chosen randomly); this decreased the workload by roughly 20%.
LuxRender
LuxRender is primarily a plugin to Blender.
This engine has a progress output of iterative full screen, which is not as good as buckets.
LuxRender's "Physically Based Rendering" (PBR) design requirement means Ambient Occlusion is not implemented.
A question was raised during the testing of LuxRender: is there a render method other than Direct Lighting that performs faster, and if so, could a change of technique allow some engines to produce comparable images in better times? To answer this, all of LuxRender's render methods were tested for speed using a very trivial setup: 180x180, 64 samples per pixel, with intermediate files reused.
Table 4. All render methods performance comparison
Render method | Time (mm:ss)
LuxCore BidirVCM | 5:04
LuxCore Bidir | 4:56 |
LuxCore Biased Path OpenCL | Never finished |
LuxCore Biased Path | 5:09 |
LuxCore Path OpenCL | RUNTIME ERROR: The SampleData buffer is too big... |
LuxCore Path | 5:08 |
Hybrid Path | 4:56 |
SPPM (Experimental) | 6:02 |
Ex-photon map | 8:34 |
Distributed Path | 5:20 |
Direct Lighting | 4:58 |
Path | 5:12 |
Bidirectional | 5:33 |
None of them performed significantly faster than Direct Lighting, though many were of similar performance. Note that the above figures cannot be compared to the results in Table 3 in any manner.
Radiance
Radiance is a set of pre-compiled executables for a given OS.
Initially Radiance was in the list of candidates. It was chosen because it uses a different model from the predominantly Blender-plugin systems, and it also has a long pedigree.
After a lot of effort was expended, it was not possible to make Radiance produce a useful output. This is due to three factors:
- The design ideology of using only piped input/output streams between many small, tightly scoped programs
- The use of proprietary file formats for object definition, material definition and (rather strangely) even for image file definition
- A low quantity and quality of documentation, and a fairly inactive user community.
Given the first two issues, good documentation should have made transformation and use of pre-existing content possible. After over a week of research and experimentation, I was unable to reliably convert HDR image files into and out of TIFF files. At that point the candidate had to be removed from consideration.