Improving Cycles CPU Performance with Clang and Hardware Profiled Optimization

Despite the growing adoption of GPU rendering, CPU rendering remains crucial, particularly for large scenes and movie production on render farms and in cloud environments. CPU efficiency is therefore still a key factor for cost-efficient rendering. Standard compiler optimizations typically target generic workloads and hardware and lack visibility into how and where the software is used in practice.
Hardware Profile-Guided Optimization (HWPGO) addresses this with a two-step compilation process. In the first step, a profile is collected on specific hardware while running a representative workload, for example a single frame of a scene with a fixed rendering configuration. In the second step, that profile is used to compile an optimized binary. The execution profile enables more effective compiler decisions, such as function inlining, conditional branch reordering, and basic block layout, tuned to real workloads.
By applying HWPGO to Blender Cycles, we optimize not just for the underlying hardware but also for specific usage patterns, enabling targeted optimization of Blender to significantly increase rendering performance for particular shots or scenes from the Blender benchmark.
We demonstrate up to an 18% performance uplift on Intel’s latest CPUs compared to the MSVC baseline used in public binaries: +9% from switching to Clang, and an additional +9% from HWPGO. In this talk, we explain how to integrate HWPGO into a custom build and analyze the workload characteristics behind the performance gains, aiming to enable Blender contributors, studio build engineers, and performance-focused users to adopt HWPGO in their own builds.
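
As a rough illustration of the two-step process described above, the following sketch assembles the command lines of a generic sample-based PGO flow with upstream Clang, Linux perf, and llvm-profgen. It is not the build recipe from the talk: the abstract does not state the exact tooling or flags, and the file names (render.cpp, render), the workload arguments, and the helper function are hypothetical. It also assumes a Linux host whose CPU supports branch sampling for perf record -b.

```python
import shlex


def sample_pgo_commands(source, binary, workload_args, profile="cycles.prof"):
    """Return command lines for a two-step, sample-based PGO build.

    Assumption: a single-file program stands in for a real build system.
    """
    perf_data = "perf.data"
    # Debug info and unique linkage names help map samples back to source.
    common = [
        "clang++", "-O2", "-g",
        "-fdebug-info-for-profiling",
        "-funique-internal-linkage-names",
    ]
    return [
        # Step 1a: build a normal optimized binary that can be profiled.
        common + [source, "-o", binary],
        # Step 1b: run the representative workload with branch sampling.
        ["perf", "record", "-b", "-o", perf_data, "--",
         f"./{binary}", *workload_args],
        # Step 1c: convert the raw samples into a compiler-readable profile.
        ["llvm-profgen", f"--binary=./{binary}",
         f"--perfdata={perf_data}", f"--output={profile}"],
        # Step 2: rebuild, letting the profile guide optimization decisions.
        common + [f"-fprofile-sample-use={profile}", source,
                  "-o", f"{binary}_pgo"],
    ]


if __name__ == "__main__":
    # Hypothetical workload: render one frame with a fixed configuration.
    for cmd in sample_pgo_commands("render.cpp", "render", ["--frame", "1"]):
        print(shlex.join(cmd))
```

The script only prints the commands rather than running them, so they can be reviewed and adapted to a real build system before use.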

Speaker

  • Leonhard Rannabauer has been an Application Engineer at Intel for 4.5 years, directing Intel's technical partnership with Independent Software Vendors such as Blender.

    Before his time at Intel, Leonhard was a research assistant at the Technical University of Munich and part of the group for hardware-aware algorithms and software for high performance computing. He researched algorithmic and hardware-close opt…
