Obi 3.3: What’s new

Obi 3.3 (cloth, rope and fluid) has just been released! This time we have focused on the small details, squeezing the last drop of performance out of it and laying the foundation for what will be Obi 4.0.

  • Better performance (up to 10x faster in some scenes)
  • Faster convergence (less stretchy ropes/cloth)
  • Arbitrary mesh deformation for ropes
  • Fast and accurate collision detection using distance fields
  • Lots of bug fixes and small corrections

We’ve packed a lot of goodies into this release. Let’s go into a bit of detail about each of them:

Substepping

Prior to 3.3, Obi was updated only once per FixedUpdate(). This ensured perfect synchronization with Unity’s own physics, but it also severely limited what you could do with Obi. Decreasing the timestep (calling FixedUpdate() more often) is a very good way of improving accuracy, but doing so affects both Obi and regular rigidbody physics.

We’ve introduced substepping, which allows Obi to update more than once per physics step. This effectively decouples Unity’s update frequency from Obi’s, while still keeping both in sync: Obi’s update frequency will always be a whole multiple of Unity’s.

Increasing the number of substeps allows you to dramatically reduce the number of constraint iterations needed, increasing both accuracy and performance. This is especially noticeable in long ropes and dense cloth pieces.
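To make the idea concrete, here is a minimal Python sketch of substepping. It is not Obi’s solver code: the names `integrate` and `fixed_update`, the single particle on a spring and the stiffness value are all assumptions made for illustration. It only shows how one host physics step can be split into several equal solver steps, and how the smaller timestep reduces integration error.

```python
import math
from typing import Tuple

State = Tuple[float, float]  # (position, velocity) of one unit-mass particle

STIFFNESS = 50.0  # hypothetical spring constant, chosen for illustration


def integrate(state: State, dt: float) -> State:
    """Advance the particle by dt using semi-implicit Euler."""
    x, v = state
    v += -STIFFNESS * x * dt  # spring force pulls the particle towards x = 0
    x += v * dt
    return x, v


def fixed_update(state: State, fixed_dt: float, substeps: int) -> State:
    """One host physics step, split into `substeps` equal solver steps."""
    sub_dt = fixed_dt / substeps
    for _ in range(substeps):
        state = integrate(state, sub_dt)
    return state


def position_error(substeps: int, fixed_dt: float = 0.02, steps: int = 50) -> float:
    """Absolute position error against the analytic solution after `steps` host steps."""
    state: State = (1.0, 0.0)
    for _ in range(steps):
        state = fixed_update(state, fixed_dt, substeps)
    exact = math.cos(math.sqrt(STIFFNESS) * fixed_dt * steps)
    return abs(state[0] - exact)


if __name__ == "__main__":
    for n in (1, 4):
        print(f"{n} substep(s): error = {position_error(n):.4f}")
```

The host step (`fixed_dt`) never changes, so anything else driven by it stays untouched. Only the solver sees the smaller `sub_dt`.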

Less memory allocation

We’ve achieved near-zero garbage generation for cloth and ropes. Ropes will now only allocate memory when dynamically increasing their length. This means fewer GC spikes and better performance on less powerful devices.

Fine-grained parallelism

Obi’s task-based threading model dynamically splits large tasks (such as detecting collisions between all particles and colliders) into smaller ones, waking threads up as needed to attend to these tasks. Despite this, threads in 3.2 spent a noticeable amount of time awake, just waiting in a queue for other threads to pick a task. This forced us to ensure tasks were big enough to amortize the time spent waiting for them: you don’t want your threads to spend more time waiting than actually working!

In 3.3 we’ve introduced a special kind of task, optimized for embarrassingly parallel calculations. These tasks use atomic counters to hand the correct piece of work to each thread, reducing synchronization costs to virtually zero. This enables us to efficiently perform extremely small tasks in parallel, improving load balancing and overall performance.
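The following Python sketch illustrates the general pattern of handing out work through a shared counter. It is an assumption about the shape of the technique, not Obi’s implementation. Python has no lock-free fetch-and-add, so the hypothetical `SharedCounter` class uses a lock as a stand-in for the atomic instruction a native solver would use. `parallel_for` and its parameters are likewise made up for this example.

```python
import threading
from typing import Callable


class SharedCounter:
    """Stand-in for an atomic fetch-and-add (a lock replaces the atomic op)."""

    def __init__(self) -> None:
        self._value = 0
        self._lock = threading.Lock()

    def fetch_add(self, amount: int) -> int:
        """Add `amount` and return the value held before the addition."""
        with self._lock:
            previous = self._value
            self._value += amount
            return previous


def parallel_for(count: int, body: Callable[[int], None],
                 workers: int = 4, chunk: int = 64) -> None:
    """Run body(i) for every i in [0, count), in small chunks claimed on demand."""
    counter = SharedCounter()

    def worker() -> None:
        while True:
            start = counter.fetch_add(chunk)  # claim the next chunk of indices
            if start >= count:
                return  # nothing left to claim
            for i in range(start, min(start + chunk, count)):
                body(i)

    threads = [threading.Thread(target=worker) for _ in range(workers)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()


if __name__ == "__main__":
    squares = [0] * 1000

    def square(i: int) -> None:
        squares[i] = i * i

    parallel_for(len(squares), square)
    print(squares[:5])
```

Each thread claims its next chunk with a single counter increment, so no thread waits on a queue. A thread that finishes early simply claims more work, which is where the improved load balancing comes from.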

Here’s Obi’s profiler during a granular material simulation, in 3.2 and 3.3. Time runs from left to right, each row in the graph is a thread, and each green bar is a task:

3.2 tasks are much bigger, which means more wasted time waiting.

3.3 tasks are extremely small: threads accept a new task as soon as they finish the current one.

As a result, the new solver is between 1.5 and 1.7 times faster.

Combined, all these performance improvements mean, in layman’s terms, that 3.3 is extremely fast. Here’s an FPS comparison between 3.2 and 3.3 standalone builds, simulating a piece of cloth made up of 7200 triangles on a 4-core CPU. (The number of iterations in 3.2 was adjusted to yield quality similar to 3.3, which uses 4 substeps per frame.)

As you can see, 3.2 hovers around 30 fps. Obi 3.3 goes over 400 fps, mainly thanks to substepping.

Custom rope meshes

We’ve added a new, much-requested render mode to ropes: custom mesh. This allows ropes to deform arbitrary meshes, so you’re no longer limited to procedural tubes or prefab chains.

You can deform the mesh along any axis (x, y or z), repeat it a fixed number of times, or stretch it along the entire length of the rope.

Per-particle color and thickness

Procedural render modes (thick rope and line) now support per-particle color and thickness. This allows for pretty cool effects like this one, done using a sine wave to alter both particle colors and thickness:

Note that your shader will need to support per-vertex colors in order to reflect the underlying particle colors!
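As a rough sketch of how such an effect could be computed, the Python snippet below derives a thickness and an RGB color for every particle from a sine wave. It does not use Obi’s API. The function name `sine_profile`, its parameters and the color ramp are all hypothetical, and the resulting values would still have to be written to the rope’s particles by whatever means the engine provides.

```python
import math
from typing import List, Tuple

Color = Tuple[float, float, float]  # RGB, each channel in [0, 1]


def sine_profile(particle_count: int, waves: float = 3.0, phase: float = 0.0,
                 base_thickness: float = 0.1,
                 amplitude: float = 0.05) -> Tuple[List[float], List[Color]]:
    """Return per-particle thickness and color driven by one sine wave."""
    thickness: List[float] = []
    colors: List[Color] = []
    for i in range(particle_count):
        # Normalized position of the particle along the rope, in [0, 1].
        t = i / max(particle_count - 1, 1)
        # Remap the sine from [-1, 1] to [0, 1].
        s = 0.5 + 0.5 * math.sin(2.0 * math.pi * waves * t + phase)
        thickness.append(base_thickness + amplitude * s)
        colors.append((s, 0.2, 1.0 - s))  # blend from blue-ish to red-ish
    return thickness, colors


if __name__ == "__main__":
    widths, tints = sine_profile(8)
    for w, c in zip(widths, tints):
        print(f"thickness={w:.3f} color=({c[0]:.2f}, {c[1]:.2f}, {c[2]:.2f})")
```

Advancing `phase` a little every frame would make the pattern travel along the rope.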

Signed distance fields

This is a new, experimental feature that enables the use of huge, complex colliders. Signed distance fields (SDFs) are a special collision primitive that stores the distance to the surface of an object from every point in space. This allows collision detection to be performed by simply reading a value from an array. If the value is positive, we are outside of the collider. If it is negative, we are inside. Simple, easy, fast.
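The lookup described above can be sketched in a few lines of Python. This is a naive, uncompressed dense grid with nearest-cell sampling, written purely for illustration. It is not Obi’s data structure, and the class `GridSDF`, the helper `sphere_distance` and the grid parameters are assumptions.

```python
import math
from typing import Callable, List, Tuple

Vec3 = Tuple[float, float, float]


def sphere_distance(center: Vec3, radius: float) -> Callable[[Vec3], float]:
    """Signed distance to a sphere: negative inside, positive outside."""
    return lambda p: math.dist(p, center) - radius


class GridSDF:
    """Dense signed distance field sampled at the cell centers of a cubic grid."""

    def __init__(self, origin: Vec3, cell_size: float, resolution: int,
                 distance_fn: Callable[[Vec3], float]) -> None:
        self.origin = origin
        self.cell_size = cell_size
        self.resolution = resolution
        self.values: List[float] = [0.0] * resolution ** 3
        for k in range(resolution):
            for j in range(resolution):
                for i in range(resolution):
                    center = (origin[0] + (i + 0.5) * cell_size,
                              origin[1] + (j + 0.5) * cell_size,
                              origin[2] + (k + 0.5) * cell_size)
                    self.values[self._index(i, j, k)] = distance_fn(center)

    def _index(self, i: int, j: int, k: int) -> int:
        n = self.resolution
        return i + n * (j + n * k)

    def sample(self, point: Vec3) -> float:
        """Read the stored distance of the cell containing `point` (clamped)."""
        cell = []
        for axis in range(3):
            c = math.floor((point[axis] - self.origin[axis]) / self.cell_size)
            cell.append(min(max(c, 0), self.resolution - 1))
        return self.values[self._index(cell[0], cell[1], cell[2])]

    def is_inside(self, point: Vec3) -> bool:
        return self.sample(point) < 0.0


if __name__ == "__main__":
    sdf = GridSDF((-2.0, -2.0, -2.0), 0.25, 16,
                  sphere_distance((0.0, 0.0, 0.0), 1.0))
    print(sdf.is_inside((0.0, 0.0, 0.0)), sdf.is_inside((1.8, 0.0, 0.0)))
```

All the expensive work happens once, when the grid is built. At simulation time each query is a single array read, no matter how complex the original shape was. A real implementation would typically interpolate between neighbouring cells rather than snapping to the nearest one.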

The problem with distance fields is that they’re usually stored as huge volume textures and consume a lot of memory. They also waste a lot of space on very uninteresting areas. We’ve solved this by adopting an aggressive hierarchical compression scheme that reduces storage needs by more than 85% in most cases. Here’s a cutaway view of a human model SDF:

Red areas = inside, grey areas = outside

In addition to improving performance, SDFs also improve accuracy: unlike regular MeshColliders, they allow for inside/outside tests at no additional cost, even for concave shapes.

Like MeshColliders, SDFs do not support deformable objects (skinned meshes, blend shapes, etc). Here are 6500 particles colliding against an SDF and sticking to it, and the exact same scene failing miserably, in both performance and accuracy, when using a MeshCollider instead:

UI Improvements

As a bonus, we’ve cleaned up ObiSolver’s inspector. 

Manual constraint ordering has been removed and is now automatically taken care of. To reduce visual clutter, constraint parameter foldouts now automatically fold when the constraints are disabled, and unfold when you enable them:

Upgrading to 3.3 is completely FREE. New and exciting stuff is up ahead on our development roadmap!

The VM Team

PS: Check out our other assets, as you might find something useful!