Instanced Drawing With Unity, Part 1

Back in 2017, during the Unite Austin presentation, Unity revealed a fully performance-oriented approach to programming, namely the Entity Component System (ECS). The showcase demonstrated that they could place 100,000 units on screen with logic and graphics and run the code at over 30 FPS. This was really impressive, but what was more impressive was that they started open sourcing a lot of it.

ECS at Unite Austin 2017

Having looked into the ECS samples, I noticed that the boids demo was being rendered using Graphics.DrawMeshInstanced(), as you can see here:

This was really interesting as it showed that a large number of meshes could be drawn with very little overhead compared to having each one be an individual object.

What is Instanced Drawing

Generally speaking, instanced drawing is one way to render a large forest or a number of buildings. You draw the same mesh multiple times in a single command, varying a few per-instance parameters (color, or even animation frames for meshes) instead of repeating a draw call for every object.

Unity introduced this in Unity 5.4 as a very welcome feature!

There are a few samples out there showing how to use Graphics.DrawMeshInstanced(), but I’d like to present a minimal code sample with just what you need in case you want to start looking into Instanced Drawing.

Instanced Shader Properties

First, you’ll need a shader that’s set up to work with instancing.

This is achieved by adding a number of instancing specific properties to it, namely:

#pragma multi_compile_instancing
UNITY_VERTEX_INPUT_INSTANCE_ID
UNITY_SETUP_INSTANCE_ID(v);
UNITY_INSTANCING_BUFFER_START(name) / UNITY_INSTANCING_BUFFER_END(name)
UNITY_DEFINE_INSTANCED_PROP(float4, _Color)
UNITY_ACCESS_INSTANCED_PROP(arrayName, _Color)

From the Unity Documentation on GPU Instancing:

#pragma multi_compile_instancing

Use this to instruct Unity to generate instancing variants. It is not necessary for surface Shaders.

UNITY_VERTEX_INPUT_INSTANCE_ID

Use this in the vertex Shader input/output structure to define an instance ID. See SV_InstanceID for more information.

UNITY_SETUP_INSTANCE_ID(v);

Use this to make the instance ID accessible to Shader functions. It must be used at the very beginning of a vertex Shader, and is optional for fragment Shaders.

UNITY_INSTANCING_BUFFER_START(name) / UNITY_INSTANCING_BUFFER_END(name)

Every per-instance property must be defined in a specially named constant buffer. Use this pair of macros to wrap the properties you want to be made unique to each instance.

UNITY_DEFINE_INSTANCED_PROP(float4, _Color)

Use this to define a per-instance Shader property with a type and a name. In this example, the _Color property is unique.

UNITY_ACCESS_INSTANCED_PROP(arrayName, _Color)

Use this to access a per-instance Shader property declared in an instancing constant buffer. It uses an instance ID to index into the instance data array. The arrayName in the macro must match the one in UNITY_INSTANCING_BUFFER_END(name) macro.

Now that we have identified the minimum set of properties a shader needs in order to be compatible with instancing, it’s time to put them to use in a shader!

Creating the shader

Here is a shader that uses the properties mentioned above. It’s also available on GitHub under the MinimalInstancing project.

MinimalInstanced Shader

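Shader "Custom/MinimalInstancedShader"
{
    Properties
    {
        _Color("Color", Color) = (1, 1, 1, 1)
    }
    SubShader
    {
        Pass
        {
            CGPROGRAM
            #pragma vertex vert
            #pragma fragment frag
            #pragma multi_compile_instancing
            #include "UnityCG.cginc"

            struct appdata
            {
                float4 vertex : POSITION;
                UNITY_VERTEX_INPUT_INSTANCE_ID
            };

            struct v2f
            {
                float4 vertex : SV_POSITION;
                UNITY_VERTEX_INPUT_INSTANCE_ID // needed so frag() can read per-instance data
            };

            UNITY_INSTANCING_BUFFER_START(Props)
                UNITY_DEFINE_INSTANCED_PROP(float4, _Color)
            UNITY_INSTANCING_BUFFER_END(Props)

            v2f vert(appdata v)
            {
                v2f o;
                UNITY_SETUP_INSTANCE_ID(v);
                UNITY_TRANSFER_INSTANCE_ID(v, o); // pass the instance ID on to the fragment shader
                o.vertex = UnityObjectToClipPos(v.vertex);
                return o;
            }

            fixed4 frag(v2f i) : SV_Target
            {
                UNITY_SETUP_INSTANCE_ID(i);
                float4 color = UNITY_ACCESS_INSTANCED_PROP(Props, _Color);
                return color;
            }
            ENDCG
        }
    }
}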

Graphics.DrawMeshInstanced Script

Next, we’ll need a script that uses the shader combined with Graphics.DrawMeshInstanced() to really show off just how much geometry you can draw with this API!

We need to use a material that uses the shader we created above, and we also need a model to draw, in this case a prefab I created with ProBuilder.

Draw Mesh Instanced gotchas

When it comes to actually invoking Graphics.DrawMeshInstanced() there are a couple things to point out:

    1. An array of Matrix4x4 is required. These matrices represent where the meshes are to be drawn (translation, rotation & scale).

    2. A single call accepts at most 1023 matrices, so as the number of entities goes up we’ll need to issue more batches, which could hinder performance, though 40,000 instances run without a hitch.

    3. The number of instances to draw is passed as a separate argument. It can be anything up to the 1023 limit, as you can see in the code below.

using UnityEngine;

public class DrawInstancedScript : MonoBehaviour
{
    // Graphics.DrawMeshInstanced() accepts at most 1023 instances per call.
    const float BATCH_MAX_FLOAT = 1023f;
    const int BATCH_MAX = 1023;

    public GameObject prefab;
    public Material meshMaterial;
    public int width;
    public int depth;
    public float spacing;
    private MeshFilter mMeshFilter;
    private MeshRenderer mMeshRenderer;
    private Matrix4x4[] matrices;

    void Start ()
    {
        mMeshFilter = prefab.GetComponent<MeshFilter>();
        mMeshRenderer = prefab.GetComponent<MeshRenderer>();

        // Build the matrices once, up front, so Update() has something to draw.
        InitData();
    }

    // Lays the instances out on a width x depth grid on the XZ plane.
    private void InitData()
    {
        int count = width * depth;

        matrices = new Matrix4x4[count];
        Vector3 pos = new Vector3();
        Vector3 scale = new Vector3(1, 1, 1);

        for (int i = 0; i < width; ++i)
        {
            for (int j = 0; j < depth; ++j)
            {
                int idx = i * depth + j;

                matrices[idx] = Matrix4x4.identity;

                pos.x = i * spacing;
                pos.y = 0;
                pos.z = j * spacing;

                matrices[idx].SetTRS(pos, Quaternion.identity, scale);
            }
        }
    }

    void Update ()
    {
        int total = width * depth;
        int batches = Mathf.CeilToInt(total / BATCH_MAX_FLOAT);

        for (int i = 0; i < batches; ++i)
        {
            // Each batch starts where the previous one ended.
            int start = i * BATCH_MAX;
            int batchCount = Mathf.Min(BATCH_MAX, total - start);

            Matrix4x4[] batchedMatrices = GetBatchedMatrices(start, batchCount);
            Graphics.DrawMeshInstanced(mMeshFilter.sharedMesh, 0, meshMaterial, batchedMatrices, batchCount);
        }
    }

    // Copies the window [offset, offset + batchCount) out of the full matrix array.
    private Matrix4x4[] GetBatchedMatrices(int offset, int batchCount)
    {
        Matrix4x4[] batchedMatrices = new Matrix4x4[batchCount];

        for(int i = 0; i < batchCount; ++i)
        {
            batchedMatrices[i] = matrices[i + offset];
        }

        return batchedMatrices;
    }
}
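
The batch arithmetic is easy to get wrong by one, so it can help to check it away from Unity. Below is a minimal sketch of how the windows should line up, written in C++ only so it compiles on its own. The Batch struct and the splitIntoBatches() helper are hypothetical names for this illustration and are not part of the project.

```cpp
#include <algorithm>
#include <vector>

// One draw call: the window [start, start + count) into the matrix array.
struct Batch {
    int start;
    int count;
};

// Splits `total` instances into windows of at most `batchMax` entries.
// Hypothetical helper: 1023 is the per-call limit mentioned in the text.
std::vector<Batch> splitIntoBatches(int total, int batchMax = 1023) {
    std::vector<Batch> batches;
    for (int start = 0; start < total; start += batchMax) {
        // The last window is usually shorter than batchMax.
        batches.push_back(Batch{start, std::min(batchMax, total - start)});
    }
    return batches;
}
```

For 40,000 instances this gives 40 windows: 39 full ones of 1023 matrices, followed by a final one of 103.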

DrawInstanced Script setup

Just one more step before we can see the result: assigning values to the DrawInstancedScript.

For this demo, I went with a 200×200 sized grid for a total of 40,000 prisms being drawn on screen, with these settings:

With just enough spacing to be able to tell that the prisms are individual.

Once we have this all set up, this is what we can see:

Batching 40,000 prisms

This project (as well as the upcoming Part02) are available on GitHub:


In Part 2, I’ll show you how to use the Material Property Block combined with DrawMeshInstanced in order to create a scene with more variation, and here’s a sneak peek at it:

Batching 40,000 prisms

You can always follow me on Twitter @JavDev



Facebook’s 3D Posts

Facebook has been creating a lot of innovative content lately.

From immersive panoramic posts and live video feeds to the Facebook Game Room, there are a lot of different technologies that one can look into, sometimes with great explanations (e.g. the Thundering Herd Problem) offered by the teams at Facebook themselves!

A feature that particularly interested me is the recently announced 3D Posts. Not only because they are in 3D, but also because there’s the potential to bring a lot of technologies together in order to create content.

3D posts are essentially 3D models displayed on your newsfeed which you can rotate, as can be seen below:

A Treasure chest of 3D posts!

glTF and 3D Posts

It turns out that Facebook’s 3D posts use the glTF 2.0 format. This is a format that the Khronos Group (OpenGL, Vulkan) introduced not too long ago for WebGL, and they have kept iterating on it since its inception.

Naturally, this looked like a good opportunity to learn about the glTF ecosystem, while also getting some results by creating some content that I could publish on Facebook!

Initial Investigation

I first downloaded the glTF-Sample-Models repo that the Khronos Group released (a mere 500MB zip file!)

The 2.0 folder is of particular interest in this repo, and the models they have there have been exported in the following formats:

  1. glTF-Binary
  2. glTF-Draco
  3. glTF-Embedded
  4. glTF-pbrSpecularGlossiness
  5. glTF

A collection of glTF Samples

After a few failed attempts at uploading some glTF files to my Facebook status update, I managed to create a post using the glTF-Binary format.

Dragging the glTF-Binary file to your status update works best, and you can get a preview for it where you get a chance to edit the background color too:

Showing a 3D cube

The next thing I wanted to try out was to see if I could take an existing 3D model and convert it to something that Facebook could use.

FBX2glTF converter

Fortunately, Facebook has recently added an FBX2glTF converter on their GitHub page!

Building it was relatively simple if you have:

Once you’ve gone through the building process, you end up with an executable that lets you convert FBX files to glTF-Binary files!

This is achieved by running the FBX2glTF.exe with the following parameters:

> FBX2glTF --binary --input YOUR_MODEL_NAME.fbx --output YOUR_MODEL_NAME

There are a few things to keep in mind, however:

  1. Your textures must be in the same folder as the FBX file
  2. Try to have 1 material per model, as it looks like having multiple materials per model is not fully supported yet.
  3. Keep your models simple. It looks like animation support is not there yet either.

Once all that’s done, you can upload your glTF-Binary file and make a 3D post!

Taking it further

I wanted to see if I could create a sort of authoring pipeline for models, so that I didn’t need to download an FBX file made by someone else in order to create a glTF-Binary.


Turns out, earlier in 2017, Unity teamed up with Autodesk to allow the creation of FBX files right from the Editor!

Export FBX files right from Unity!

Not only that, but earlier this week Unity also announced that they had incorporated ProBuilder into the editor itself!

This meant that I could quickly prototype making a 3D mesh, export it as FBX and get it converted to a glTF-Binary!

ProBuilder + Unity

The steps I took were the following:

  1. Download Unity’s FBX Exporter (Beta) from the Asset Store
  2. Download ProBuilder from the Asset Store
  3. Once you’re in the Unity Editor, create a cube and edit the model with the texture you want
  4. Right click on the GameObject you’d like to Export…

Last step on the editor!

  5. Invoke the FBX2glTF exporter to convert your new FBX file to glTF-Binary (with the --binary flag!)
  6. Upload your glTF-Binary file to Facebook and publish it!

Here’s the end result from the conversion:

If you enjoyed reading this, and want to see what other tech. things I get into, you can always follow me on Twitter @JavDev!


Animated Outline Shader

Growing up watching all sorts of anime, it was great to see when the protagonists would power up and do their most powerful attacks. Having watched a lot of Dragon Ball Z and Saint Seiya episodes has sort of ingrained an idea in me of how a character should be powered up.

Not too long ago, this game called Saint Seiya: Soldier’s Soul came out and they really turned up the fidelity on all the graphics, especially when it comes to the energy auras.

This is how the energy aura looks on the Saint Seiya TV show:

This is how the energy aura looks on the game:

The idea here being that there is some sort of animated outline to show off the state of a character.

After researching the web, and finding various posts asking about how one would implement it, I came across Glow Highlighting in Unity where a great starting point for the implementation is shown using Unity’s Command Buffers.

Luckily, I’d already spent some time looking at MRT, which helped a lot to understand the theory behind this technique.

Essentially, once the outline is set up on your geometry and it’s placed on a render target, there are a lot of things that can be done to it, for example, using UV scrolling with a pre-made texture to give it a sense of movement.

This is the animated version of this, all made with Unity:

You can check out the WebGL demo version of this here:

The way this works is the following, in shader code:

1. Render your target geometry, but make sure it’s filled with only 1 color that you pick
2. Blur the flood filled image a few times to give it some additional width
3. Apply a distorted texture to the blurred render target, which is scrolled over time (and try to make the scrolling non-repetitive)
4. Combine this with the back buffer that has your already rendered scene
5. Apply bloom!
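
To make step 2 more concrete, here is a minimal CPU-side sketch in C++ of blurring a flat-colored silhouette so that it grows past the edges of the geometry. The real effect runs in shaders on a render target. The Mask type, the 3x3 box kernel and the function names are assumptions made for this illustration, not the blur the demo actually uses.

```cpp
#include <vector>

// Single-channel image: 1.0 where the flat-colored geometry was drawn, 0.0 elsewhere.
using Mask = std::vector<std::vector<float>>;

// One 3x3 box-blur pass. Pixels outside the image are treated as 0.
Mask blurOnce(const Mask& src) {
    const int height = static_cast<int>(src.size());
    const int width = height > 0 ? static_cast<int>(src[0].size()) : 0;
    Mask dst(height, std::vector<float>(width, 0.0f));
    for (int y = 0; y < height; ++y) {
        for (int x = 0; x < width; ++x) {
            float sum = 0.0f;
            for (int dy = -1; dy <= 1; ++dy) {
                for (int dx = -1; dx <= 1; ++dx) {
                    const int sy = y + dy;
                    const int sx = x + dx;
                    if (sy >= 0 && sy < height && sx >= 0 && sx < width) {
                        sum += src[sy][sx];
                    }
                }
            }
            dst[y][x] = sum / 9.0f;
        }
    }
    return dst;
}

// Each extra pass spreads the silhouette outwards by one more pixel.
Mask blur(Mask mask, int passes) {
    for (int i = 0; i < passes; ++i) {
        mask = blurOnce(mask);
    }
    return mask;
}
```

Running more passes widens the soft border, which is the region the scrolling texture of step 3 gets applied to.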

You can see on this image how the frame is built up step by step:

And presto, you have an animated outline, in realtime 🙂


Shadow Art with Unity

A few days ago I saw a video titled Shadow art is better with Legos

Watching this video got me thinking about how something like this could be done, and I started to look for more information on what this was all about.

I was interested in seeing what other possibilities for Shadow Art were available and found some interesting ideas.

That’s only a subset of what can be described as Shadow Art!

The way I understood it, you arrange shapes in multiple ways to block the light and end up with a shadow that forms a familiar shape.

The great thing about programming is that, usually, you can take an idea and turn that into some sort of demo! Even better when graphics are involved 😉

So, I decided to go ahead and make a demo that would result in some sort of Shadow Art!

Using Unity allowed me to focus on the core of the problem rather than on building all the supporting pieces that the idea depended on.

Here’s the result:


Thanks to Unity’s WebGL export capabilities I’ve put up a runnable version of this code:

Shadow Art with Unity Demo, check it out!

Launch the Demo!


Here’s a more technical explanation of how this was achieved

The setup

I went for having 2 spot lights set up, to project the shadows


Then have a number of textures that are used to create the geometry


Since there are 2 spotlights, I wanted to see if 2 separate shadows could be generated from the same mesh, similar to what was being done on the video shown for the Magical Angle Sculptures, except I went for just 2 shadows instead of 3!

The resulting mesh

The way this is computed is in 3 passes:

First Pass

This is the most important step, since this is where the overlap is calculated, which sets up the base for showing 2 shadows at the same time.

To figure out the overlap:

  1. Iterate through the pixels on one of the textures
  2. As soon as there’s an opaque pixel do a look up on the other texture
  3. If the lookup results in a pixel that’s also opaque, then generate the vertices for a cube at that position (x,y) and then center the z position
  4. Lastly, the overlapping pixel is stored so that it is not accessed again
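
The four steps above can be sketched in a few lines of C++. This is only an illustration under stated assumptions: both textures are the same size, the lookup on the other texture uses the same (x, y) coordinates, and the Texture, Cube and firstPass names are hypothetical rather than taken from the demo’s source.

```cpp
#include <vector>

// true = opaque pixel, indexed as texture[y][x].
using Texture = std::vector<std::vector<bool>>;

// Where a cube should be generated. Building its vertices is left out.
struct Cube {
    int x;
    int y;
    float z;
};

struct FirstPassResult {
    std::vector<Cube> cubes;
    Texture used;  // pixels that already produced geometry
};

// Emits one centered cube for every pixel that is opaque in BOTH textures.
FirstPassResult firstPass(const Texture& a, const Texture& b, float centerZ) {
    const int height = static_cast<int>(a.size());
    const int width = height > 0 ? static_cast<int>(a[0].size()) : 0;

    FirstPassResult result;
    result.used.assign(height, std::vector<bool>(width, false));

    for (int y = 0; y < height; ++y) {
        for (int x = 0; x < width; ++x) {
            if (a[y][x] && b[y][x]) {
                result.cubes.push_back(Cube{x, y, centerZ});
                result.used[y][x] = true;  // later passes skip this pixel
            }
        }
    }
    return result;
}
```

The `used` grid is what lets the later passes create geometry only where none exists yet.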

Second & Third Pass

These are more generic, and essentially create geometry for opaque pixels where geometry hasn’t been created before.

To give it a less uniform feel, there’s a random depth value applied to each new piece of geometry.

Image Effects

Unity comes with a number of pre-built Image Effects which helped to make this look more presentable.

I’m using the Vignette and Bloom Image Effects to create the final look for the presentation. You can see how adding them up looks below.


Hope you find this fun to play around with, and if you have some cool ideas let me know!

And remember, you can always follow me on twitter @JavDev


Multiple render targets and Stage3D

For some time I’ve been curious about how to do things with the depth buffer using Stage3D.

As far as I could find, there is no real “direct” way to access the depth buffer with Stage3D, so I went ahead and did the next best thing, which was to build my own Depth Buffer in a shader.

I saw that this could be done thanks to Flare3D’s MRT demo and started learning how I could use this to test out some things I’ve been thinking about.



Now that I had a depth buffer in place, the next step was to use this to see what sort of techniques I could combine it with.

I’ve been following Dan Moran on Patreon  and decided to try out an intersection highlight shader which he describes in one of his videos. This looked fun to do, so I went ahead and tried to implement a basic form of it using Flare3D’s shading language FLSL.


Here’s how the shader turned out:


You can also get it from here

This shader works as follows:

  1. Provide a texture with depth information
  2. Check if the difference between your current position and the value on the depth buffer is within a threshold
    1. If it is within the threshold, then use the smoothStep function to create a sort of “fall-off” effect, which at the maximum value makes the color white, and if not it fades out into the color of your mesh (or the tint being applied to it)
  3. One more thing to keep in mind is that you need to use screen space coordinates so that you test against the texture’s UVs and your own position
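
As a rough CPU-side illustration of step 2, here is the per-fragment math in C++ (the actual shader is written in FLSL). The Color struct, the function names and the linear blend towards white are assumptions made for this sketch. sceneDepth stands for the value sampled from the depth texture at the fragment’s screen-space UV.

```cpp
#include <algorithm>
#include <cmath>

struct Color {
    float r;
    float g;
    float b;
};

// Same shape as the shader smoothstep: 0 below edge0, 1 above edge1.
float smoothStep(float edge0, float edge1, float x) {
    const float t = std::min(std::max((x - edge0) / (edge1 - edge0), 0.0f), 1.0f);
    return t * t * (3.0f - 2.0f * t);
}

// White where the effect mesh touches the scene, fading to the mesh color
// once the depth difference reaches `threshold`.
Color intersectionHighlight(const Color& meshColor, float sceneDepth,
                            float fragmentDepth, float threshold) {
    const float diff = std::fabs(sceneDepth - fragmentDepth);
    const float glow = 1.0f - smoothStep(0.0f, threshold, diff);
    return Color{meshColor.r + (1.0f - meshColor.r) * glow,
                 meshColor.g + (1.0f - meshColor.g) * glow,
                 meshColor.b + (1.0f - meshColor.b) * glow};
}
```

A smaller threshold gives a thinner highlight line around the intersection, and a larger one gives a wider, softer band.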

There are some more things to consider, such as the color format of your depth texture. If you use regular 32 bit RGBA values, then you will get some banding since the data in the depth texture won’t be as precise as you need it to be, so using an RGBA_HALF_FLOAT format is recommended.

The final part comes by composing the 2 buffers together to create the final image. This is achieved by performing additive blending of the 2 render targets using another shader that outputs it to a final 3rd render target, which is then drawn on screen.






But, in practice how is this all achieved?

  1. Render all the geometry that you don’t want to use for effects together to a render target
    1. Also, use the MRT technique to be able to export a 2nd texture which is the equivalent of your depth buffer
    2. mrt_output
  2. Render your effect meshes to another render target and supply the depth texture as a parameter
    1. effect_mesh
  3. Finally, take the outputs of steps 1 & 2, and place them into a 3rd shader that does the additive blending for you and “composes” the final image for your shader.
    1. compose_material
  4. Draw a full screen quad with your final composed image!

Using Intel GPA you can see how the Render Targets all look:

In this case, you are only drawing what you need once to a number of buffers, and in the end compositing an image from all the various steps.

I’ve created a repo on GitHub that you can download and check out and hopefully extend for your own needs 🙂

You can also follow me on @jav_dev


3D Printing your game code

A while ago I heard that Shapeways launched a format called SVX that allowed for the exporting of voxel geometry to an intermediate format. This format could then be uploaded to their servers and they would then convert it to something that could be 3D printed!

This sounded like it would be a lot of fun to investigate, so I started looking at their format so that I could go ahead with some of the ideas I had.

The format is pretty straightforward: if a voxel occupies a region, that region is marked as white on a black & white image (PNG).



So, for the last 2 years I’ve been part of a team working on a game that lets users create all sorts of wacky levels using blocks. Not only that, but we’ve had real time multiplayer from the start and we’ve also allowed you to customize your avatar, so pretty much everything in the game is user generated content.

We’ve had a number of releases, and at one point we had a level with Bots in it, where you would go in and see how long you could survive after BLASTING endless waves of bots!

Seeing how this was the most played level at the time, I thought it would be great to use that as my go-to model for my little 3D printing experiment.

This is what the level looks like now:


The SVX format

The idea of the SVX is quite simple, since it just wants you to mark a block as occupied by placing a white color, like so:


This would be the top down view of the bot level (notice the fountain in the center?)

However, there are a few things that I learned while doing this:

  1. Base: I need to have a base, just placing these pieces on the bottom most layer won’t work as there’s nothing they’re attached to (i.e. the structures would be floating)
  2. Distance: Shapeways has a smoothing algorithm that will make things really smooth, but if the voxels are placed too close together they get smoothed out a bit too much (a block can end up looking like a cone, for example). To counter this, for every block in the level I added 8 blocks to the SVX file!
  3. Loose Shells: There’s a final step to the process which detects something called loose shells. This is basically to ensure that there are no floating pieces in a level.
    1. This makes it quite hard to take just any level since everything has to be connected, and at times the insides of structures are not connected since we allow free-form level creation.
    2. Fortunately, the Shapeways model editor tells you exactly where the loose shells are, so it’s only a matter of editing your level a little bit until there are no more loose shells!
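
To show how the first two points might translate into slice images, here is a small C++ sketch. It makes several assumptions that are not confirmed above: the level is a 3D grid of booleans indexed as [layer][z][x], the 8 blocks per game block are a 2x2x2 cube (scale = 2), and the base is a single solid layer. Writing the slices out as PNG files is left out, and the buildSlices name is hypothetical.

```cpp
#include <cstdint>
#include <vector>

// One top-down slice image: 255 (white) = occupied, 0 (black) = empty. Indexed as [z][x].
using Slice = std::vector<std::vector<std::uint8_t>>;
// Game level: level[layer][z][x] is true where a block exists, layer 0 at the bottom.
using Level = std::vector<std::vector<std::vector<bool>>>;

std::vector<Slice> buildSlices(const Level& level, int scale = 2) {
    const int layers = static_cast<int>(level.size());
    const int depth = layers > 0 ? static_cast<int>(level[0].size()) : 0;
    const int width = depth > 0 ? static_cast<int>(level[0][0].size()) : 0;

    std::vector<Slice> slices;
    // Solid base so that nothing on the lowest layer is left floating.
    slices.push_back(Slice(depth * scale, std::vector<std::uint8_t>(width * scale, 255)));

    for (int layer = 0; layer < layers; ++layer) {
        Slice slice(depth * scale, std::vector<std::uint8_t>(width * scale, 0));
        for (int z = 0; z < depth; ++z) {
            for (int x = 0; x < width; ++x) {
                if (!level[layer][z][x]) continue;
                // Each game block covers scale x scale pixels in the slice...
                for (int dz = 0; dz < scale; ++dz) {
                    for (int dx = 0; dx < scale; ++dx) {
                        slice[z * scale + dz][x * scale + dx] = 255;
                    }
                }
            }
        }
        // ...and is repeated over `scale` consecutive slices.
        for (int k = 0; k < scale; ++k) {
            slices.push_back(slice);
        }
    }
    return slices;
}
```

Loose shells are not handled here. They still have to be found and fixed in the level itself, as described in point 3.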

This is part of what the files I used for the model look like:


Once I uploaded the collection of files to Shapeways, I ended up with this model:


Now, it was just a matter of getting it printed and delivered.

It was an awesome moment when I finally got a 3D Printed version of the code that I’d been writing together with my team for such a long time!

The finished piece


If you’re interested, you can check out the level here, and even get a copy of it printed:

  1. RoboBlastPlanet Bot Level

You can also check out other cool stuff that they have in Shapeways, which helped to inspire me to make this 3D print!

You can also follow me on Twitter on @JavDevGames!

WebGL and shaders

Though I’ve heard about ShaderToy for a while, and looked at various examples of what they do, I had never really spent much time trying it out.

I recently started looking into Javascript and ThreeJS, and very shortly the idea of making a cool background animation came to mind.

I wanted to do a copy of the PlayStation3’s background menu, but didn’t find much about it online. I ended up going to ShaderToy and doing my own, based off of another shader:

I think it came out pretty well, and made it using ShaderToy. Now I’ve managed to port it over to ThreeJS:

XBM Shader

How does this work?

Main Function #1

The meat of the work is done in the fragment shader.

[code language="javascript"]
color += calcSine(uv, 0.20, 0.2, 0.0, 0.5, vec3(0.5, 0.5, 0.5), 0.1, 15.0,false);
color += calcSine(uv, 0.40, 0.15, 0.0, 0.5, vec3(0.5, 0.5, 0.5), 0.1, 17.0,false);
color += calcSine(uv, 0.60, 0.15, 0.0, 0.5, vec3(0.5, 0.5, 0.5), 0.05, 23.0,false);
[/code]


The main function calls into calcSine. Every call to calcSine() creates a new “line” in the shader.

Calc Sine

The calcSine() function simply calculates the value to pass into the sin() function, and the result is then scaled and shifted by two parameters (amplitude & offset). Amplitude is how much to stretch the values by in the y-axis, and offset moves the line up by that much of the screen (5%, 10%, etc.)

[code language="javascript"]
float angle = time * speed * frequency + (shift + uv.x) * 3.14;
[/code]


The next trick is to set a sort of exit criteria, which is diffY. What this asks is how far the value we got from sin() is from our current uv.y value. This is important since it determines if we’re below or above the uv.y coordinate we’re calculating this for.

dsqr simply figures out how far we are from the uv.y coordinate

[code language="javascript"]
float y = sin(angle) * amplitude + offset;
float diffY = y - uv.y;
float dsqr = distance(y,uv.y);
[/code]


The if-statement I put in there determines whether we’re below or above the line, and how to handle it. Multiplying dsqr by 8.0 helps to make the cut off smoother than multiplying by something higher. Multiplying by 12 would create a sharper cut, while multiplying by something below 8 makes the image more blurry.

[code language="javascript"]
if(dir && diffY > 0.0)
    dsqr = dsqr * 8.0;
else if(!dir && diffY < 0.0)
    dsqr = dsqr * 8.0;
[/code]


The last step is to apply a power function & smoothstep to make the final effect and cause that “fadeout” effect around the lines.

[code language="javascript"]
scale = pow(smoothstep(width * widthFactor, 0.0, dsqr), exponent);
[/code]
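
To experiment with these numbers outside of a shader, the pieces shown so far can be put together into a plain C++ function. This is only a sketch: lineIntensity() is a hypothetical name, widthFactor is passed in because its value isn’t shown above, and smoothStep() is written so that it also accepts the reversed edges used in the shader.

```cpp
#include <algorithm>
#include <cmath>

// smoothstep that also works when edge0 > edge1, as in the shader call.
float smoothStep(float edge0, float edge1, float x) {
    const float t = std::min(std::max((x - edge0) / (edge1 - edge0), 0.0f), 1.0f);
    return t * t * (3.0f - 2.0f * t);
}

// Brightness contributed by one sine line at pixel (uvX, uvY), from 0 to 1.
float lineIntensity(float uvX, float uvY, float time, float speed, float frequency,
                    float amplitude, float shift, float offset,
                    float width, float widthFactor, float exponent, bool dir) {
    const float angle = time * speed * frequency + (shift + uvX) * 3.14f;
    const float y = std::sin(angle) * amplitude + offset;  // height of the line at uvX
    const float diffY = y - uvY;                           // sign tells us which side we are on
    float dsqr = std::fabs(diffY);                         // distance(y, uv.y) for two scalars

    // One side of the line gets a much sharper falloff than the other.
    if (dir && diffY > 0.0f) {
        dsqr *= 8.0f;
    } else if (!dir && diffY < 0.0f) {
        dsqr *= 8.0f;
    }

    // 1 on the line itself, fading to 0 at a distance of width * widthFactor.
    return std::pow(smoothStep(width * widthFactor, 0.0f, dsqr), exponent);
}
```

With dir set to false, a pixel with uv.y just above the line’s y value fades out about 8 times faster than one just below it, which is what gives each line one crisp edge and one soft edge.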


Main Function #2

What calling calcSine() multiple times allows us to do is to shift the color being returned up closer to white. This creates an effect of almost having “additive” blending, and makes the faded-out regions shine brighter when they overlap.

At the end of the main function, a final step is done to tint the entire background with a color, and make this into a gradient color.

[code language="javascript"]
color.x += t1 * (1.0-uv.y);
color.y += t2 * (1.0-uv.y);

gl_FragColor = vec4(color,1.0);
[/code]


Putting all this together, and updating the “time” uniform value finally gives us the final effect.

I’ve uploaded the sample to GitHub as well, so have a look there!

Intro to removing code branches


In Intel’s 64 and IA-32 Architectures Optimization Reference Manual there is a section on how to optimize code via branch removal.

I found this interesting and started looking around for some places that would explain this in a bit more detail. I found a great post on Stack Overflow explaining how to Remove Branching via Bitwise Select, and some more examples of branch removal in the bit-twiddling hacks article that has been referenced many times.

I haven’t yet found a place that explains this in more detail so I thought that I’d give it a go.

Example #1:

Code with branch is:

[code language="cpp"]
if(value > some_other_value)
    value *= 23;
else
    value -= 5;
[/code]

Solution #1:

[code language="cpp"]
const int Mask = (some_other_value-value)>>31;
value = ((value * 23) & Mask) | ((value - 5) & ~Mask);
[/code]

How does this work?

  1. Get the difference between some_other_value and value
  2. Shifting that difference right by 31 bits copies its sign bit into every position, so Mask will be a string of 32 1’s if the difference is negative (due to 2’s complement), and 0 otherwise.
  3. Both results (value * 23 and value - 5) are computed either way, but by doing a Bitwise AND with the Mask (and its complement via ~Mask) we set one of them to 0 and leave the other alone: ANDing with 32 1’s keeps every bit, and ANDing with 0 clears them all. The Bitwise OR then combines the two, and since one side is always 0, only the selected result survives.

Though this example is too specific, if it were more general then we could create some #defines to easily let you write branchless code, while keeping some sort of readability by naming the defines something like:

NO_BRANCH_EVAL_LESS_THAN_MULT_SUB(value, some_other_value, 23, 5)

Which would expand itself to be equal to the expression in Solution #1
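
One way to gain confidence in a trick like this is to compare it against the branching version. The sketch below assumes Example #1 is an if/else that picks between value * 23 and value - 5. The function names are hypothetical, and the shift relies on >> being an arithmetic shift for negative numbers (guaranteed since C++20, and what mainstream compilers did before that).

```cpp
#include <cstdint>

// Reference version with a branch.
std::int32_t withBranch(std::int32_t value, std::int32_t other) {
    if (value > other) {
        return value * 23;
    }
    return value - 5;
}

// Branch-free version: the sign of (other - value) becomes a select mask.
// Assumes other - value and value * 23 do not overflow.
std::int32_t withoutBranch(std::int32_t value, std::int32_t other) {
    const std::int32_t mask = (other - value) >> 31;  // all 1s if value > other, else 0
    return ((value * 23) & mask) | ((value - 5) & ~mask);
}
```

Both functions should return the same result for any pair of inputs that stays clear of overflow, including the case where the two values are equal (the mask is then 0, so value - 5 is selected).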

Example #2:

Calculate the absolute value of an integer without branching.

Solution #2:

[code language="cpp"]
#include <climits> // provides CHAR_BIT
int v; // we want to find the absolute value of v
unsigned int r; // the result goes here
int const mask = v >> sizeof(int) * CHAR_BIT - 1;

r = (v + mask) ^ mask;
[/code]


How does this work?

  1. Again, get the MSB out of this by doing a 31 bit right shift
  2. Add the mask to the original value, and XOR it with the mask
  3. If the value is positive (or zero), the mask is 0, so adding it and XORing with it leaves the value unmodified
  4. However, if the value was negative, then things get interesting as this approach exploits the concept of 2’s complement
  5. The value of the mask in this case becomes -1 (or 32 1s as represented by 2’s complement)
  6. By adding -1 to the original value v, we get v - 1
  7. Lastly, doing an XOR with 32 1s flips every bit, and in 2’s complement flipping every bit of v - 1 gives -v, which is the original value with no negative sign.

Using -2 as an example:
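
Spelled out as a small C++ sketch, with the intermediate values for -2 in the comments. The function name branchlessAbs is made up for this illustration, and the bit patterns assume a 32 bit int with an arithmetic right shift.

```cpp
#include <climits>

// Absolute value without a branch.
// Not valid for INT_MIN, where v + mask would overflow.
unsigned int branchlessAbs(int v) {
    // -1 (all bits set) if v is negative, 0 otherwise.
    const int mask = v >> (sizeof(int) * CHAR_BIT - 1);
    return static_cast<unsigned int>((v + mask) ^ mask);
}

// For v = -2:
//   v        = -2   (0xFFFFFFFE)
//   mask     = -1   (0xFFFFFFFF)
//   v + mask = -3   (0xFFFFFFFD)
//   ^ mask   =  2   (0x00000002)
```

For a positive input the mask is 0, so both the addition and the XOR leave the value untouched.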


Until next time!

Wormhole effect in flash

A few years ago I started attempting to learn some of the stuff that the demoscene guys do. I did a little test with a wormhole effect that I’d seen in a number of demos, but nobody had really done a flash version (or at least my google-fu was not good enough to find one). So, I went ahead and implemented it. Here it is below:

Click on the images to switch the bitmap being applied to it
Press left/right on your keyboard to spin the wormhole around!

If you’d like to see more on how this effect is made, check out this website:

Creating Demos – Coder Tutorial #3 [Movement without Motion]