Sweet Home 3D Blog

This blog presents news and tips about Sweet Home 3D.

Photo rendering engines performance test

Concept

The purpose of this article is to present some performance and output comparisons of open source ray tracing software that could be used as alternatives to the current photo generation engine embedded in Sweet Home 3D, i.e. SunFlow.

Background

Sweet Home 3D is widely praised in the design community for its good photo rendering capabilities. This is largely due to the excellent open source library SunFlow which is embedded and used. As SunFlow is no longer maintained, it has not been and will not be brought up to date to use the latest techniques delivered by both academic research and graphic processing hardware progress. In particular it can make no attempt to use the graphics processor in any manner. So this article was initiated to see if an alternative rendering engine can be found (and eventually added to Sweet Home 3D) to provide more options in the future. This newer engine would need to satisfy several requirements in order to be considered a valuable addition to the program; in particular it must have these attributes:

  • Performance equal to or better than SunFlow
  • Good memory management
  • Able to run cross platform
  • Have an open source license

Requirements for candidates

  • Must create output images via a ray tracing algorithm so that existing knowledge within the community keeps its value; this excludes options such as Reyes rendering as seen in RenderMan
  • Must be open source; preferably, LGPL, MIT or Apache license (or equivalent)
  • Must be written in a language that is appropriate to the community: Java or C++ (possibly C); there are JavaScript and Python options that have not been considered
  • Must have a feature set comparable to SunFlow. For example photon mapping, ray casting, caustics, translucent scattering, several global illumination options, various sky light options, depth of field, ambient occlusion, etc.
  • Must be able to provide high quality output, photo realistic is the benchmark
  • Must be currently maintained; a significant release should have been made within about the last year
  • Must run under Windows, Mac OS X and Linux, with minimal code differences, no quality difference and minimal speed difference between platforms
  • Must be able to handle OBJ 3D models
  • Must be able to handle JPEG and PNG images
  • Must provide feedback during rendering computation (at minimum % of operation complete, but for preference image parts already computed)
  • Should be able to have settings altered at runtime to allow customization of output by users
  • Should be faster than SunFlow, or at least of comparable speed; this suggests that graphics processor acceleration would be an advantage
  • If code is required to drive the renderer, it should be written in a simple, plain text, interpreted scripting language
  • In order to allow a staged approach to research and development, the ray tracing engine must allow a standalone binary and command line operation, or be a plugin to a well-established 3D design program; if it requires compilation into an executable it will be excluded. Note that a good candidate that provides all other features may be excluded on this basis, but there is no real workaround for this requirement.

Candidates

The following candidate list comes primarily from this Wikipedia article but also from other Internet searches. 

Table 1. Candidates

| Engine | License | Notes | Activity |
|---|---|---|---|
| OSPRay | Apache 2.0 | | 13 committers with 10+ commits, many with less. Source updated recently |
| YafaRay | LGPL | | 12 committers with 10+ commits, 6 with less |
| Blender Cycles | GPL | | 18 committers with 10+ commits, many with less |
| LuxRender | GPLv3 | | The site shows 19 members, recent activity shows 11 commits in the past year by 3 members. The current BitBucket release note of 1.5 is 2.5 years old |
| Radiance | BSD | Written in C | CVS repo shows since 2017-01-01 there have been 2 users committing, the bulk by 1 of them. Last release August 2017 |
| POV-Ray | AGPLv3 | | 3 committers with 10+ commits, many with less, GitHub shows 2 contributors this year. Not released frequently, last release 4 years ago. Source code update and beta release still continue however |
| Mitsuba | GPLv3 | | Online git repo shows 2 contributors. Last update Feb 2016 |
| BRL-CAD | BSD, LGPL | Focused on constructive solid geometry (CSG) | SourceForge shows 5 contributors in the past few months. Actively being developed on SourceForge. Last release 2016_09_02 7.26.0 |
| Visionaray | MIT | States it is not Windows only, might be difficult to verify support on Mac OS X and Linux | GitHub shows 1 recent contributor. The only compiled release is old |
| Embree | | OSPRay is built on this, so only one should be tested | |
| BEAM4 | | | Last GitHub update 2 years ago |
| Art of Illusion | GPL | Low quality example images, editor focus, no reference to ray trace features | Last release Dec 2016 |
| Manta Interactive Ray Tracer | MIT | | Last source update 4 years ago |
| Open Ray Trace | | Python | Last release 2007 |
| Ray Trace 3.0 | | | Wayback machine shows release 3.0 prior to 2006 |
| Raja | | | Last update 2003 |
| Tonatiuh Project | | Not full featured, designed for solar collectors | |
| McXtrace | GPL | Not a ray tracer, creates compilable C code | |
| Picogen | GPLv3 | Terrain focused | |
| Pixie | GPL | Not a ray trace renderer, uses ray tracing for hidden surface determination | |
| Tachyon | GPL | | Last release 2013 |

Methodology

The following scene, shown in the Sweet Home 3D 5.0 splash screen and based on http://www.sweethome3d.com/examples/SweetHome3DExample7.sh3d, was used as the reference scene to be loaded and rendered by each engine.


Figure 1. Reference scene

The scene was exported to the Wavefront (OBJ) format for import.

  1. Each candidate is installed.
  2. The scene is loaded in whatever manner that requires.
  3. The scale of the engine is factored versus the OBJ scale (cm) and ratios applied to transforms and lighting powers.
  4. The 32 lights from the reference scene are added via scripts (as these are not exported to OBJ) and adjusted to produce reasonable light powers.
  5. The camera position, view angle and settings are replicated.
  6. The sun light, ambient light and background texture are set up.
  7. The render settings are made to match SunFlow settings as much as possible.
  8. A test render is run and…
  9. Iteratively adjust settings in cases where the scene output is not similar enough to the reference image.
  10. The scene is rendered at 360x360, 720x720 and 1024x1024 and the metrics are recorded.
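
Step 3 can be illustrated with a minimal sketch. It assumes the target engine works in metres and that the lights are point lights with inverse-square falloff; both are assumptions made for illustration, as are the helper names. In the tests themselves the light powers were additionally adjusted by hand (step 4).

```python
# Hypothetical helpers: rescale an OBJ-derived scene from centimetres
# (the OBJ export unit) to metres (assumed engine unit).
CM_TO_M = 0.01


def scale_position(position_cm, factor=CM_TO_M):
    """Scale an (x, y, z) position by the unit conversion factor."""
    return tuple(coordinate * factor for coordinate in position_cm)


def scale_point_light_power(power, factor=CM_TO_M):
    """Scale a point light's power so surfaces receive the same illumination.

    With inverse-square falloff, multiplying every distance by `factor`
    changes received light by 1 / factor**2, so the power is multiplied
    by factor**2 to compensate (assumption: a plain point light).
    """
    return power * factor ** 2


print(scale_position((250.0, 100.0, 50.0)))
print(scale_point_light_power(40.0))
```

Whether a given engine expects this kind of correction depends on how it defines light power, so the result is only a starting point for the iterative adjustment in step 9.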

It is expected that the output images will not be the same as the reference image for a variety of reasons. The engines may use different techniques that do not produce exactly the same results, an engine's treatment of light may require extra input data to make it accurate (for example light portals), and the expertise required to maximize output quality in each engine certainly cannot be gained in the short timeframe of the experiment.

With the above in mind, the approach will be to ensure that each engine is doing all of the work that SunFlow is doing and, as far as possible, that settings are not allowing work to be shortcut and so reducing the processing effort required. Some trivial examples of this type of shortcut might be a low setting for ambient occlusion allowing early cutoffs for intersection tests, or a low light energy causing the light's sphere of influence to cover fewer objects than required.

Rendering results

The 4 candidates selected for this test were OSPRay, YafaRay, Cycles and LuxRender. These engines along with SunFlow were tested on the following PC:

  • Windows 10 64 bit
  • AMD 6 core 3.5 GHz
  • 16 GB RAM
  • NVIDIA GeForce GTX 750 with 1 GB GDDR5 memory

Table 2. Sample outputs

SunFlow
OSPRay
Cycles
YafaRay
LuxRender

Each output image was checked to ensure that lights were giving off light and shadows appeared in the test output as they did in the reference. Cases like the reference image having very strong and crisp shadows from the sunlight that were not reproduced in any test engine may indicate a need for more effort by those engines.

Performance results

Although the CPU (main processor) and GPU (graphics processor) usage at each stage was gathered during the tests, it varied greatly depending on phase and engine, and doesn't provide any further clarification regarding performance. Those engines that did use the GPU never appeared to be bottlenecked by the GPU itself. Use of the OpenCL API failed in all cases.

Every candidate tested kept memory use at a reasonable level, well below that of SunFlow. For some candidates, like LuxRender and YafaRay, this may be because of aggressive intermediate file writing/reading, so memory use may rise for those engines if they were moved to a (faster) in-memory only technique. Details and comparisons of these figures are not considered useful for this article.

Table 3. Rendering times

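| Engine | Image size | Time (s) | Time ([h:]m:ss) | Overhead | Progress output | GPU usage |
|---|---|---|---|---|---|---|
| SunFlow | 360x360 | 379 | 6:19 | No intermediate files | Complete buckets | None |
| | 720x720 | 1309 | 21:49 | | | |
| | 1024x1024 | 2333 | 38:53 | | | |
| OSPRay | 360x360 | 25 | 0:25 | 20 s, no intermediate files | Incremental whole image | Strong usage with peaks of 45% |
| | 720x720 | 41 | 0:41 | | | |
| | 1024x1024 | 68 | 1:08 | | | |
| Cycles | 360x360 | 570 | 9:30 | 25 s, no intermediate files | Incremental buckets | None |
| | 720x720 | 1950 | 32:30 | | | |
| | 1024x1024 | 3974 | 1:06:14 | | | |
| YafaRay | 360x360 | 300 | 5:00 | 290 s, including intermediate file write then read | Complete buckets | GPU memory used and low processor use |
| | 720x720 | 324 | 5:24 | | | |
| | 1024x1024 | 347 | 5:47 | | | |
| LuxRender | 360x360 | 794 | 13:14 | 420 s, including intermediate file write then read | Incremental whole image | None |
| | 720x720 | 3189 | 53:09 | | | |
| | 1024x1024 | 4527 | 1:15:27 | | | |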


Figure 2. Rendering times

Adding a linear trend line to the results of the 2 best performers confirms the overhead figures in Table 3 and shows that the scaling appears to be linear.


Figure 3. Rendering times trend lines for top 2

The two trend line equations shown in Figure 3 reveal that the scaling performance of both engines is linear and reasonably similar. Each equation should be read as "time = cost per megapixel × image size in megapixels + constant overhead", where time is "y", the cost per megapixel is the number adjacent to "x" and the constant overhead is the number after the "+".
These equations suggest that removing the writing/reading of intermediate files from YafaRay will result in a similar performance and scaling to that of OSPRay. It may be possible that keeping the intermediate files data in memory is a viable option that would reduce this overhead cost, though source code investigation will be needed to confirm that.
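
As a rough cross-check, the trend lines can be recomputed from the Table 3 timings with an ordinary least-squares fit. The sketch below assumes x is the image size in megapixels and y the time in seconds; the helper name `linear_fit` is illustrative, and the coefficients it prints may differ slightly from those shown in Figure 3.

```python
def linear_fit(xs, ys):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


# Square image sides from Table 3, converted to megapixels.
SIDES = [360, 720, 1024]
MEGAPIXELS = [side * side / 1e6 for side in SIDES]

# Render times in seconds from Table 3.
TIMES = {"OSPRay": [25, 41, 68], "YafaRay": [300, 324, 347]}

FITS = {name: linear_fit(MEGAPIXELS, times) for name, times in TIMES.items()}
for name, (slope, intercept) in FITS.items():
    print(f"{name}: time = {slope:.1f} * megapixels + {intercept:.1f}")
```

With these inputs the cost per megapixel comes out at roughly 47 s for OSPRay and 51 s for YafaRay, with constant overheads near 18 s and 295 s, close to the 20 s and 290 s overheads recorded in Table 3.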

Conclusion

It's fairly obvious that OSPRay and YafaRay are the standout best performers under these test conditions. At 1024x1024, OSPRay is roughly 34 times faster than SunFlow and YafaRay nearly 7 times faster (Table 3), and the gap grows with image size, so pushing more work onto them should still result in better speed than SunFlow. It's possible that getting a good result will require more data in the scene, which will be a factor in the development effort required to get the engine integrated.
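
To put numbers on the comparison, the speed ratios can be computed directly from the Table 3 timings. This is a small sketch; the function name is illustrative.

```python
# Render times in seconds, copied from Table 3.
TIMES_S = {
    "SunFlow": {"360x360": 379, "720x720": 1309, "1024x1024": 2333},
    "OSPRay": {"360x360": 25, "720x720": 41, "1024x1024": 68},
    "YafaRay": {"360x360": 300, "720x720": 324, "1024x1024": 347},
}


def speedup_over_sunflow(engine, size):
    """Return how many times faster `engine` is than SunFlow at `size`."""
    return TIMES_S["SunFlow"][size] / TIMES_S[engine][size]


for engine in ("OSPRay", "YafaRay"):
    for size in TIMES_S[engine]:
        print(f"{engine} {size}: {speedup_over_sunflow(engine, size):.1f}x")
```

The ratio depends strongly on image size because YafaRay's time is dominated by its constant overhead at small resolutions.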

There are several factors that suggest YafaRay is the better choice:

  • OSPRay produces images with the materials (and associated textures) obviously mis-indexed in some manner such that they are applied to the wrong objects. It is noted that the materials themselves appear to be reasonably correct.
  • The output quality (achieved by this study) for the YafaRay renderer is superior to that produced for the OSPRay renderer.
  • The ability to integrate YafaRay with Blender is an advantage. It means development work for (the integration of) this engine and of plugins and extensions within the community will be facilitated by having a known working environment.

Unsurprisingly, these two candidates utilized the GPU and had excellent performance results, though the low GPU processor use by YafaRay might be an area to investigate further and attempt to increase.

[Update: a rendering plug-in based on YafaRay was developed in 2019 and released on January 31, 2020]

Notes on each candidate

SunFlow

In order to set up each of the other renderers I extracted (as best I could) the configuration that Sweet Home 3D uses to drive SunFlow.

Values from the PhotoRender.properties file:

# High quality parameters
highQuality.antiAliasing.min=1
highQuality.antiAliasing.max=2
# Filter used to control oversampled image: "box", "triangle", "gaussian", 
# "mitchell", "catmull-rom", "blackman-harris", "sinc", "lanczos" or "bspline"
highQuality.filter=blackman-harris
# Global illumination algorithm: "default" or "path"
# "default" uses ambient occlusion during day hours in virtual visit mode
# "path" takes much longer to compute but gives more realistic view at day hours
highQuality.globalIllumination=default
# Maximum bounces done by light rays when global illumination "path" is used
# Increasing this value greatly slows down rendering process
highQuality.diffusedBounces=1
# Caustics photons count, with 0 producing no caustics
# If different from 0, should be higher than 1000000 to obtain some visible effect
highQuality.causticsPhotons=0
# Shader used to render shiny materials: "default", "glossy" or "silk"
# "default" uses silk shader at high quality level
# and in virtual view mode, glossy otherwise
highQuality.shininessShader=default
highQuality.normalLens.focusDistance=250.
highQuality.normalLens.radius=1
# Algorithm used by the renderer: "bucket", "fast" or "ipr"
highQuality.samplerAlgorithm=bucket
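
When replicating these values in another engine's setup script, it can help to read them programmatically. The following is a minimal sketch that handles only the simple `key=value` and `#` comment lines shown above; it is not a full implementation of the Java properties format (no line continuations or escapes), and the function name is illustrative.

```python
def parse_properties(text):
    """Parse simple 'key=value' lines, skipping blank lines and '#' comments."""
    settings = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        key, _, value = line.partition("=")
        settings[key.strip()] = value.strip()
    return settings


# A few of the values listed above, used as sample input.
SAMPLE = """\
# High quality parameters
highQuality.antiAliasing.min=1
highQuality.antiAliasing.max=2
highQuality.filter=blackman-harris
highQuality.globalIllumination=default
highQuality.diffusedBounces=1
highQuality.causticsPhotons=0
"""

settings = parse_properties(SAMPLE)
print(settings["highQuality.filter"])
print(int(settings["highQuality.antiAliasing.max"]))
```

All values are returned as strings, so numeric settings such as the anti-aliasing bounds need an explicit conversion.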

Values from inspecting the code path in PhotoRenderer class:

sunDirection = 45° up in the east
useSunSky = false due to using the observer camera (the indoors camera)
So Direct Lighting and Ambient Occlusion will be the primary lighting system with some distant sunlight from a sun sphere with settings: sunPower = 40, samples = 4, radius = 1000, center very high
useSunskyLight = false and globalIllumination = default rather than path therefore "Direct Lighting with Ambient Occlusion" is set as the gi.engine; with white as the bright color and a 1/200th as the dark
ambocc.samples = 1
Discussion of this rendering method can be found here.
camera = PINHOLE; camera location, fov, aspect and image resolution are all straightforward
noCeilingLights
bucket.size = 64
bucket.order = spiral
The effects of these depths parameters aren't clear:
depths.reflection = 4
depths.refraction = 16
The silk shader is set to be used, so the export of nodes to SunFlow uses the uber shader for shiny materials. I don't know whether this affects performance or could be relevant to other engines.

OSPRay

OSPRay is a set of pre-compiled programs from the open source project targeted at various operating systems. It is not designed to be a fully comprehensive solution, but rather to give developers easier access to what might be possible with the project's source code.
OSPRay is built on Embree which appears to be a very well regarded ray tracing engine.

As it is basically a demonstration system of the underlying engine it has a very limited set of operations and controls and comes with just a few examples to show what can be achieved.

OSPRay has a pre-compiled onscreen viewing tool, which was used to test and measure the engine's capabilities. The only mechanism available to register when a render had completed (for this tool) was a screen update, which happened once processing was complete. Unfortunately render buffers larger than the physical monitor size would not output results on screen, and therefore the test render sizes had to be equal to or smaller than the resolution of the testing device's screen.
So 1024x1024 was the maximum output achievable, and this limit was therefore incorporated in the testing methodology for all engines.
At higher resolutions OSPRay did continue to do the work and the CPU/GPU were utilised, so it can be assumed it produces larger images in the engine itself.

This engine has a progress output of iterative full screen, which is not as good as buckets.

The OSPRay renders shown above in Table 2 have some very obvious material mis-assignments; it is assumed that this could be solved once the code is accessed. The texture coordinates appear correct. The materials are still used, so this mismatch doesn't in itself invalidate the results, though it makes them less reliable. Examples of bad materials:

  • The couch is missing portions, and the birthday cake decorations are located incorrectly.
  • The glasses and bottle are not transparent, but the windows are.

Another quirk of note is that the French bread has a very poor shadow under it. This may be an issue with the ambient occlusion technique and distances as the dining chairs appear to have reasonable shadows.

Cycles

Cycles is the default renderer for Blender, a very widely used open source 3D modelling and rendering tool. It is built in and therefore required no setup effort beyond installing Blender.
This engine has a progress output of iterative buckets, which is slightly better than the complete buckets of SunFlow.

Cycles produces very grainy images at the samples per pixel of 64, so it was increased to 512 for these tests which slowed the renders down considerably.

 
Figure 4. Cycles 720x720, samples per pixel set to 64 on left and 512 on right

Cycles has a good progress update output and also supplies an estimated time remaining figure.

YafaRay

YafaRay is primarily a plugin to Blender.
This engine has a progress output of complete buckets, which is the same as SunFlow.

The installation of YafaRay suffered from an apparently infrequent issue, where the preview image would crash Blender when attempting to view the material editor panel. This made it difficult to get all materials accurate. For example, it was not possible to make all glass objects correctly transparent in the time allocated.


Figure 5. YafaRay showing some objects not correctly transparent

An extra test was run at 7200x7200 to ensure good scaling for YafaRay up to the sizes needed. The time taken was 1885 s or 31:25.
The trend line equation for YafaRay for all results including this larger one (x in megapixels, y in seconds) is
y = 30.455x + 306.39
showing that the scaling continues linearly.
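
The quality of this fit can be checked by evaluating the equation at each tested size and comparing with the measured times. The sketch below assumes x is the image size in megapixels; the function name is illustrative.

```python
def predicted_seconds(megapixels, slope=30.455, intercept=306.39):
    """Evaluate the YafaRay trend line y = 30.455x + 306.39."""
    return slope * megapixels + intercept


# Measured YafaRay times in seconds, keyed by square image side in pixels.
MEASURED = {360: 300, 720: 324, 1024: 347, 7200: 1885}

RESIDUALS = {}
for side, seconds in MEASURED.items():
    megapixels = side * side / 1e6
    RESIDUALS[side] = seconds - predicted_seconds(megapixels)
    print(f"{side}x{side}: measured {seconds} s, "
          f"predicted {predicted_seconds(megapixels):.0f} s")
```

Every measurement lies within about 10 s of the line, and the constant term of about 306 s is close to the 290 s overhead recorded in Table 3.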

YafaRay (like most engines) has an antialiasing option, which is applied per sample, so it affects the render times. A test was also run at 720x720 with 4x antialiasing (AA). This added roughly 15% to the render time, and appeared to scale linearly.


Figure 6. YafaRay at 720x720, left side is no AA, right side is 4x AA

To investigate the effect of the many lights in the scene, a YafaRay 720x720 render was run with half the lights removed (chosen at random); this decreased the workload by roughly 20%.

LuxRender

LuxRender is primarily a plugin to Blender.
This engine has a progress output of iterative full screen, which is not as good as buckets.
LuxRender's "Physically Based Rendering" (PBR) design requirement means Ambient Occlusion is not implemented.

A question was raised during the testing of LuxRender: is there perhaps a render method other than Direct Lighting that performs faster, and if so, could a change of technique allow some engines to produce comparable images in better times? To answer that, all possible render methods for LuxRender were tested for speed using a very trivial setup: 180x180, 64 samples per pixel, intermediate files reused.

Table 4. All render methods performance comparison

| Render method | Time (m:ss) |
|---|---|
| LuxCore BidirVCM | 5:04 |
| LuxCore Bidir | 4:56 |
| LuxCore Biased Path OpenCL | Never finished |
| LuxCore Biased Path | 5:09 |
| LuxCore Path OpenCL | RUNTIME ERROR: The SampleData buffer is too big... |
| LuxCore Path | 5:08 |
| Hybrid Path | 4:56 |
| SPPM (Experimental) | 6:02 |
| Ex-photon map | 8:34 |
| Distributed Path | 5:20 |
| Direct Lighting | 4:58 |
| Path | 5:12 |
| Bidirectional | 5:33 |

None of them performed significantly faster than Direct Lighting, though many were of a similar performance. Note that the above figures cannot be compared to the results in Table 3 in any manner.
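
The comparison with Direct Lighting can be made explicit by converting the Table 4 times to seconds and expressing each as a ratio of the Direct Lighting time. This is a small sketch; the two OpenCL methods that did not complete are left out, and the names are illustrative.

```python
def to_seconds(mmss):
    """Convert an 'm:ss' string to a number of seconds."""
    minutes, seconds = mmss.split(":")
    return int(minutes) * 60 + int(seconds)


# Completed runs from Table 4.
TABLE_4 = {
    "LuxCore BidirVCM": "5:04",
    "LuxCore Bidir": "4:56",
    "LuxCore Biased Path": "5:09",
    "LuxCore Path": "5:08",
    "Hybrid Path": "4:56",
    "SPPM (Experimental)": "6:02",
    "Ex-photon map": "8:34",
    "Distributed Path": "5:20",
    "Direct Lighting": "4:58",
    "Path": "5:12",
    "Bidirectional": "5:33",
}

BASELINE = to_seconds(TABLE_4["Direct Lighting"])
RELATIVE = {method: to_seconds(t) / BASELINE for method, t in TABLE_4.items()}

# List methods from fastest to slowest relative to Direct Lighting.
for method, ratio in sorted(RELATIVE.items(), key=lambda item: item[1]):
    print(f"{method:22s} {ratio:.3f}")
```

The fastest methods finish only 2 s ahead of Direct Lighting (296 s against 298 s), a difference of under 1%.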

Radiance

Radiance is a set of pre-compiled executables for a given OS.
Initially Radiance was in the list of candidates. It was chosen as it uses a different model from the predominantly Blender plugin systems and it also has a long pedigree.

Even after a lot of effort was expended, it was not possible to make Radiance produce a useful output. This is due to three factors:

  1. The design ideology of using only piped input/output streams between many small, tightly scoped programs.
  2. The use of proprietary file formats for object definition, material definition and (rather strangely) even for image file definition.
  3. A low quantity and quality of documentation, and a fairly inactive user community.

Given the first two issues, good documentation should have made transformation and use of pre-existing content possible. After over a week's research and experimentation I was unable to reliably convert HDR image files into and out of TIFF files. At that point the candidate had to be removed from consideration.


Avatar: Enko Nyito

Re: Photo rendering engines performance test

Nice work of comparative study!
Avatar: hansmex

Re: Photo rendering engines performance test

Impressive and very enlightening! I would be very interested to see this implemented in SH3D, directly in the program, or as a plug-in. <Looking left at Enko...>
Avatar: Anonymous

Eevee

Interesting, but speaking as a Blender user: Cycles is much cleaner in terms of noise, but it would be healthier to try to implement Blender Eevee. It would not be ray tracing, but it would be faster to work with.

Another test in the future

Are you going to perform this test again, this time with updated rendering software? Blender EEVEE can be a candidate, and LuxCoreRender replaced LuxRender.
Avatar: Emmanuel Puybaret

Another test in the future

Rather than losing more time on new born libraries, I think it would be more valuable to start programming something based on the conclusions of this article. Logical, no?
Avatar: abhifx

Another test in the future

That could be true, but by considering better options, time will be saved in future endeavors. LuxCoreRender is imho much improved, which basically skews this comparison. Further, YafaRay's future is still dubious, with a single developer pushing for updates, and it may end up with a SunFlow-like status. Personally, I think realtime preview engines like Eevee, armour3d, Godot etc. should be best suited, as they reduce the rendering time to almost realtime (considering hardware is good enough). Of course this is just my opinion, as exporting objects to any rendering engine is also available.
© Copyright 2024 Space Mushrooms - All rights reserved