User Interface Design Part 4: The Model-View-Controller paradigm

Thus far we’ve talked about designing a user interface: the visual language that tells users what they can do, the elements of the language (nouns and verbs), the way these things contribute to consistency and discoverability and how they can be used to create a user interface that seems… inevitable. Simple. Invisible.

This attention to detail really shines when it comes time to build our interface.

But before we can do this, let’s talk a little bit about a common way to organize interfaces, and discuss the non-interface elements of our code.


The single most commonly used design pattern in application development today is Model-View-Controller. In fact, it has been called “the seminal insight of the whole field of graphical user interfaces.” Originally invented at Xerox PARC in the 1970s, it was first used to help build Smalltalk-80 user interfaces, and it was first described in the paper A Cookbook for Using the Model-View-Controller User Interface Paradigm in Smalltalk-80 by Glenn Krasner and Stephen Pope.

This paradigm is so important that nearly every modern application API (from iOS to Android to MacOS to Microsoft Windows) either explicitly provides interfaces that build on this model, or implicitly provides a way by which programmers can use this model.

The key idea is that we can think of our programs as being made up of three major parts:

Model View Controller

At the bottom is our “Model.” This is the “domain-specific software” which implements our application–and is the collection of code which does stuff like load a file, or provide in-memory editing of a text file, or (in our case) connects to the physical hardware that controls an HVAC system or which provides the time.

Our user interface is then built on top of this model code.

Our user interface roughly comprises two parts. The first part is the “views”; these deal with everything graphical. They grab data from the model and display it on the screen–and they can contain “subviews” and be contained in “superviews.”

The second part is the “control” code. Controllers contain the logic which coordinates our views; they deal with user interactions and translate those interactions into meaningful actions. (So, for example, a view may represent a button, but it is the control code which determines what pressing that button “means” in terms of updating the interface and updating the model.)


For our Arduino application we don’t fully implement “views,” since some of the overhead may not quite fit in our small memory footprint. But the concept still helps us segregate our code and organize our thinking about it, by treating the graphical parts as separate from the control code and from our model code.

And notice one nice property of our visual language–of thinking of a visual language in terms of ‘nouns’ and ‘verbs’ and the consistent use of visual designs. All of this maps very nicely onto our Model-View-Controller paradigm.

Our “Model” is the concrete implementation of our “nouns”: the thermostat settings. The schedule. The current time.

Our “Views” represent our visual language: how we show the user our “nouns” and our “verbs”; how we represent the actions a user can take.

And our “Control” code represents how our visual language works: how a sequence of taps on a bunch of icons translates into the user “changing the time” or “turning down the heater.”


Let’s give a concrete example of one of these “nouns”: the actual thermostat control code which determines if we need to turn the heater, air conditioner or fan on or off.

Our AdaThermostat class, declared as a single global GThermostat object, defines the actual code for setting the target temperature, for getting the current temperature, and for turning on and off the three control lines that run our HVAC unit.

class AdaThermostat
{
    public:
                        AdaThermostat();

        void            periodicUpdate();

        uint8_t         heatSetting;        /* Target heat to temperature */
        uint8_t         coolSetting;        /* Target cool to temperature */
        uint8_t         fanSetting;         /* ADAVAC_OFF/AUTO/ON */
        uint8_t         curTemperature;     /* Current interior temperature */
        uint8_t         lastSet;            /* Last schedule used or 0xFF */

        bool            heat;               /* True if heater runs */
        bool            cool;               /* True if aircon runs */
        bool            fan;                /* True if fan runs */
};

Ultimately, when you change the thermostat setting–say, to heat the room to 70°–at the very core of our software, this sets GThermostat.heatSetting to 70.

Our class has only one entry point, periodicUpdate, which must be called periodically in order to run our logic: to read the temperature sensor and set the curTemperature variable, and to decide when to turn the heater and the air conditioner on and off.
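To make this concrete, here is a minimal sketch of what periodicUpdate might look like. This is an illustration rather than the shipping firmware: readTemperatureSensor is an assumed helper, and the one-degree hysteresis is an assumption to keep the relays from rapidly cycling.

void AdaThermostat::periodicUpdate()
{
    curTemperature = readTemperatureSensor();   /* assumed helper */

    /* Heat: turn on below the target, off once we overshoot a degree */
    if (curTemperature < heatSetting) heat = true;
    else if (curTemperature > heatSetting + 1) heat = false;

    /* Cool: the mirror image */
    if (curTemperature > coolSetting) cool = true;
    else if (curTemperature < coolSetting - 1) cool = false;

    /* Fan runs when forced on, or whenever we heat or cool */
    fan = (fanSetting == ADAVAC_ON) || heat || cool;
}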


All of our model code looks like this: a loose collection of classes which control the hardware on our thermostat.

For example, our model includes code for getting and setting the time in AdaTime.h:

extern void AdaTimeInitialize();

extern void AdaSetTime(uint32_t t);     // Time (seconds elapsed since 1/1/2017)
extern uint32_t AdaGetTime();           // Time (seconds elapsed since 1/1/2017)
extern uint32_t AdaGetElapsedTime();    // Time (seconds elapsed since power on)

This uses the TIMER2 hardware module on the ATmega328 CPU. We set the timer to generate an interrupt every 1/125th of a second, and the interrupt itself counts ticks until one full second has elapsed before updating two global variables–one with the current time (measured as seconds from January 1, 2017), and the other with the number of seconds since the device was turned on.
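The interrupt handler itself can be tiny. Here is a sketch of the idea (the variable names are illustrative, not the actual source), assuming TIMER2 has been configured to fire 125 times a second:

#include <avr/interrupt.h>

static volatile uint8_t GTick;          /* ticks within the current second */
static volatile uint32_t GTime;         /* seconds since 1/1/2017 */
static volatile uint32_t GElapsed;      /* seconds since power-on */

ISR(TIMER2_COMPA_vect)
{
    if (++GTick >= 125) {               /* 125 ticks == one full second */
        GTick = 0;
        ++GTime;
        ++GElapsed;
    }
}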

Our code also provides support routines for converting the time to a Gregorian calendar date and for quickly finding the day of the week. This allows us to determine quickly which day of the week it is when we update the thermostat according to the schedule.
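Because January 1, 2017 happened to fall on a Sunday, the quick day-of-week calculation needs nothing more than integer arithmetic on the second count. A sketch (the function name here is illustrative):

uint8_t AdaDayOfWeek(uint32_t t)
{
    /* 86400 seconds per day; day 0 (January 1, 2017) was a Sunday */
    return (uint8_t)((t / 86400UL) % 7);    /* 0 = Sunday ... 6 = Saturday */
}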

We also provide a class for getting and setting the thermostat’s schedule, which is stored in EEPROM on the ATmega 328.

class AdaSchedule
{
    public:
                        AdaSchedule();

        void            periodicUpdate();

        void            setCurSchedule(uint8_t value);
        uint8_t         getCurSchedule();

        AdaScheduleDay  getSchedule(uint8_t schedule, uint8_t dow);
        void            setSchedule(uint8_t schedule, uint8_t dow, const AdaScheduleDay &item);
};

This allows us to get the schedule for a single day, which is an array of four settings, each with a time and a pair of temperatures:

struct AdaScheduleItem
{
    uint8_t hour;           /* 0-23 = hour of day, 0xFF = skip */
    uint8_t minute;         /* 0-59 = minute */
    uint8_t heat;           /* temperature to heat to */
    uint8_t cool;           /* temperature to cool to */
};

struct AdaScheduleDay
{
    struct AdaScheduleItem setting[4];  // 4 settings per day
};
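As a sketch of how this gets used (an illustration, not the shipping logic), the thermostat can scan today’s four slots for the last item whose time has already passed, skipping the unused 0xFF entries. This assumes the slots are stored in time order:

int8_t FindActiveItem(AdaSchedule &sched, uint8_t dow, uint8_t hour, uint8_t minute)
{
    AdaScheduleDay day = sched.getSchedule(sched.getCurSchedule(),dow);
    int8_t found = -1;

    for (uint8_t i = 0; i < 4; ++i) {
        const AdaScheduleItem &item = day.setting[i];
        if (item.hour == 0xFF) continue;    /* unused slot; skip */
        if ((item.hour < hour) || ((item.hour == hour) && (item.minute <= minute))) {
            found = i;                      /* latest item already past */
        }
    }
    return found;                           /* -1 if nothing has fired yet today */
}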

Our user interface code for running the main display of our thermostat then makes use of these model classes to get the current state of the system.

For example, in the part of our home page which displays the temperature buttons, our code is:

    FormatNumber(buffer,GThermostat.heatSetting);
    GC.drawButton(RECT(160,200,40,37),buffer,28,KCornerUL | KCornerLL,KCenterAlign);
    FormatNumber(buffer,GThermostat.coolSetting);
    GC.drawButton(RECT(201,200,40,37),buffer,28,KCornerUR | KCornerLR,KCenterAlign);

This reads the current heat and cool settings and displays them as a pair of buttons on our device.
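FormatNumber itself can be as simple as a wrapper around the avr-libc integer-to-string conversion; a plausible sketch:

#include <stdlib.h>

static void FormatNumber(char *buffer, uint8_t value)
{
    utoa(value,buffer,10);      /* render the setting as a decimal string */
}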


By using the Model-View-Controller paradigm and separating the parts which control our hardware from the parts that control our display, we make creating our user interface code that much easier. It also means we can test the parts of our code that control our hardware separately from the rest of our application.

The complete source code, which strips out most of our user interface software and illustrates the models of our thermostat software, is checked into GitHub and is available for download.

Next time we’ll discuss the code we use for managing screens and the controller code which draws our home screen.

Building a 3D Graphics Pipeline on an Arduino, Part 5

One of the advantages of our design is that we can rapidly port our 3D drawing code to another platform. In fact, we could port our code off the Arduino entirely and move it to a different microprocessor or even to your desktop computer.

All we have to do is to alter how we initialize our 3D library, change our begin/end calls, and change our low-level drawing routines. While we’re at it, we can also refine our 3D library to only draw in a subset of our screen. We may wish to do this, for example, if we’re creating a game and we want a separate status area.

So let’s begin.

For this exercise we’ll be porting the code to an Arduboy, a small handheld Arduino game unit with built-in game control buttons and a small black-and-white screen.


Altering the header.

Our current constructor for the G3D class contains a reference to our library, and we assume we use the entire screen. Let’s change the code in two ways. First, we want to change the drawing library we pass in. Second, we want to pass in the area of the screen we’re drawing in.

So let’s update our class. I’m going to define a preprocessor token:

#define USELIBRARY	        2   // 1 = Adafruit, 2 = Arduboy

Now let’s update our G3D class. In our header file we’ll alter our constructor and create the appropriate definitions:

class G3D
{
    public:
#if USELIBRARY == 1
                G3D(Adafruit_GFX &lib, uint16_t x, uint16_t y, uint16_t width, uint16_t height);
#elif USELIBRARY == 2
                G3D(Arduboy &lib, uint16_t x, uint16_t y, uint16_t width, uint16_t height);
#endif
                ~G3D();

        void    setColor(uint16_t c)
                    {
                        color = c;
                    }

        void    begin();
        void    end();

We also want to update the color declaration, since the Arduboy uses an 8-bit color value (which is set to either BLACK or WHITE), while the Adafruit library uses a 16-bit color value.

class G3D
{
    public:
#if USELIBRARY == 1
                G3D(Adafruit_GFX &lib, uint16_t x, uint16_t y, uint16_t width, uint16_t height);
#elif USELIBRARY == 2
                G3D(Arduboy &lib, uint16_t x, uint16_t y, uint16_t width, uint16_t height);
#endif
                ~G3D();

#if USELIBRARY == 1
        void    setColor(uint16_t c)
                    {
                        color = c;
                    }
#elif USELIBRARY == 2
        void    setColor(uint8_t c)
                    {
                        color = c;
                    }
#endif

        void    begin();
        void    end();

These two changes require corresponding changes to the private values in our class. We need to update the library we reference, update the screen size parameters and the color:

    private:
        /*
         *  Internal state
         */
#if USELIBRARY == 1
        Adafruit_GFX &lib;
#elif USELIBRARY == 2
        Arduboy &lib;
#endif

        uint16_t xoffset;
        uint16_t yoffset;
        uint16_t width;
        uint16_t height;

        /*
         *  Current drawing color
         */

#if USELIBRARY == 1
        uint16_t color;
#elif USELIBRARY == 2
        uint8_t color;
#endif

        /*
         *  Stage 4 pipeline; 3D transformation
         */

These are all the changes we need to make to our header. From a public interface perspective, we simply update the value of USELIBRARY, and create our class linked to the correct library–and that’s it. There is no step 3.
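For example, client code might construct the class along these lines (the pin assignments and screen dimensions here are illustrative):

#if USELIBRARY == 1
Adafruit_ILI9341 tft(TFT_CS, TFT_DC);
G3D draw(tft,0,0,320,240);          // full Adafruit ILI9341 display
#elif USELIBRARY == 2
Arduboy arduboy;
G3D draw(arduboy,0,0,128,64);       // full 128×64 Arduboy display
#endif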


G3D code changes.

For the body of our code, the changes are similarly easy.

First, of course, we need to update our constructor. The body of the constructor is the same; we simply need to change the type of the library passed in. We also stash the screen position and size parameters that we’ve added.

/*  G3D::G3D
 *
 *      Construct our pipeline
 */

#if USELIBRARY == 1
G3D::G3D(Adafruit_GFX &l, uint16_t x, uint16_t y, uint16_t w, uint16_t h) : lib(l)
#elif USELIBRARY == 2
G3D::G3D(Arduboy &l, uint16_t x, uint16_t y, uint16_t w, uint16_t h) : lib(l)
#endif
{
    xoffset = x;
    yoffset = y;
    width = w;
    height = h;

    /* Initialize components of pipeline */
    p1init();
    p2init();
    p3init();
}

Now our implementation on the Adafruit GFX library called the internal startWrite and endWrite methods in order to reduce the number of times we had to turn the Adafruit drawing engine on and off. But the Arduboy library doesn’t require the same thing–so for the Arduboy these calls effectively become ‘no-ops’:

void G3D::begin()
{
#if USELIBRARY == 1
    lib.startWrite();
#endif
}

void G3D::end()
{
#if USELIBRARY == 1
    lib.endWrite();
#endif
}

Finally we need to update our p1 move/draw and point routines. We need to do two things here: first, we need to offset our drawing to the pixel coordinates we supplied to our constructor. Second, we need to call the correct line drawing routines.

void G3D::p1movedraw(bool drawFlag, uint16_t x, uint16_t y)
{
    /*
     *  We use the p1draw flag and the drawFlag parameter to determine the
     *  way to draw our line. For us, we're always drawing single
     *  segments, but we theoretically could roll up our lines into a
     *  collection of line segments and send them on close. (This
     *  requires hooking G3D::end().)
     */
    
    if (drawFlag) {
#if USELIBRARY == 1
        lib.writeLine(xoffset + p1x,yoffset + p1y,xoffset + x,yoffset + y,color);
#elif USELIBRARY == 2
        lib.drawLine(xoffset + p1x,yoffset + p1y,xoffset + x,yoffset + y,color);
#endif
    }
    
    p1draw = drawFlag;
    p1x = x;
    p1y = y;
}

void G3D::p1point(uint16_t x, uint16_t y)
{
#if USELIBRARY == 1
    lib.writePixel(xoffset + x,yoffset + y,color);
#elif USELIBRARY == 2
    lib.drawPixel(xoffset + x,yoffset + y,color);
#endif
}

And that’s it. Now our library can be used to generate 3D drawings on either an Arduboy or using the Adafruit GFX library.


Testing it out.

We can redo our cube test from last time, but on the Arduboy. Because the Arduboy is double-buffered, the animation is smooth and quick–though it lacks the color of the Adafruit display.

Our example needs to be rewritten to use the double-buffer drawing support of the Arduboy. We can also make use of the viewport feature to draw on only part of the screen; we’d do this, for example, if we wanted to reserve the rest of the screen for a status area.

Most of the changes are relatively simple: we switch libraries and change how we start up our drawing system. The loop has the most changes, since we need to handle the double-buffering of the Arduboy API.

For the record, the complete source kit (just showing the Arduboy elements and their changes) is:

#include "G3D.h"

/*
 *  Drawing globals
 */

Arduboy arduboy;

// Graphics setup
G3D draw(arduboy,0,0,100,64);

// Rotation
static float GXAngle;
static float GYAngle;

void setup() 
{
    arduboy.beginNoLogo();
    arduboy.setFrameRate(50);
    arduboy.clear();
    arduboy.display();
    
    GXAngle = 0;
    GYAngle = 0;
}

void transform()
{
    draw.transformation.setIdentity();
    draw.perspective(1.0f,0.5f);
    draw.translate(0,0,-3.5);
    draw.rotate(AXIS_X,GXAngle);
    draw.rotate(AXIS_Y,GYAngle);
}

void drawBox(int x, int y, int z)
{
    draw.move(x-1,y-1,z-1);
    draw.draw(x+1,y-1,z-1);
    draw.draw(x+1,y+1,z-1);
    draw.draw(x-1,y+1,z-1);
    draw.draw(x-1,y-1,z-1);
    draw.draw(x-1,y-1,z+1);
    draw.draw(x+1,y-1,z+1);
    draw.draw(x+1,y+1,z+1);
    draw.draw(x-1,y+1,z+1);
    draw.draw(x-1,y-1,z+1);
    draw.move(x+1,y-1,z-1);
    draw.draw(x+1,y-1,z+1);
    draw.move(x+1,y+1,z-1);
    draw.draw(x+1,y+1,z+1);
    draw.move(x-1,y+1,z-1);
    draw.draw(x-1,y+1,z+1);
}

void loop()
{
    if (!arduboy.nextFrame()) return;
    
    arduboy.clear();
    
    draw.begin();
    draw.setColor(WHITE);
    transform();
    drawBox(0,0,0);
    draw.end();
    
    arduboy.display();

    GXAngle += 0.01;
    GYAngle += 0.02;
}

When we run this we get a smoothly spinning wireframe cube on the Arduboy’s display.

All the code here is at GitHub, so feel free to check it out. The main branch is the latest and greatest, and can be compiled (by changing the flag in G3D.h) on either the Arduboy or on any of Adafruit’s GFX compatible displays.

A quick note about the cube demo. If you watch it spin, sometimes the point of the cube nearest to you “disappears.” This is expected, and demonstrates clipping not just to the four boundaries of our screen, but to the near clipping plane. Think of it as the corner of the cube getting so close that it “clips” against the lens of our virtual camera.


This has been a somewhat hastily written adventure into 3D, but I hope it has been an instructive one. If you found this series useful, please let me know. And if you find any problems with any of the code, or if you want to share any games you put together with this code, please let me know!

Building a 3D Graphics Pipeline on an Arduino, Part 4

We have all the pieces. Let’s put them together, shall we?

In the last article we built a clipping algorithm which clips lines so that only the visible portions are displayed. Now let’s put the whole thing together.


Some Math in Code.

In our second article we introduced the concept of Homogeneous Coordinates. This is how we manipulate points in 3D, and you see the same matrix math used everywhere in 3D graphics, including OpenGL.

Let’s start by creating the methods for manipulating our matrices and vectors.

First, we need a 3D matrix, which we define as:

class G3DMatrix
{
    public:
        float           a[4][4];
};

Recall our 3D matrices are 4×4 matrices:

$$\begin{bmatrix} a_{00} & a_{01} & a_{02} & a_{03} \\ a_{10} & a_{11} & a_{12} & a_{13} \\ a_{20} & a_{21} & a_{22} & a_{23} \\ a_{30} & a_{31} & a_{32} & a_{33} \end{bmatrix}$$

Our convention is that the first index selects the row (counting down the matrix), and the second index selects the column (counting across).

Since we’re using C++, we can also pack our matrix class with the methods which make our life easier. Specifically we can add routines for initializing our matrix as a translation matrix, a rotation matrix, or a scale matrix, as well as our perspective matrix. We can also add our multiply method, just to keep things tidy.

class G3DMatrix
{
    public:
                        G3DMatrix();

        // Initialize matrix
        void            setIdentity();
        void            setTranslate(float x, float y, float z);
        void            setScale(float x, float y, float z);
        void            setScale(float x);
        void            setRotate(uint8_t axis, float angle);
        void            setPerspective(float fov, float near);

        // Inline multiply transformation matrix
        void            multiply(const G3DMatrix &m);
        
        // Raw contents of the matrix
        float           a[4][4];
};
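The simplest of these methods, setIdentity, presumably looks something like this:

void G3DMatrix::setIdentity()
{
    for (uint8_t i = 0; i < 4; ++i) {
        for (uint8_t j = 0; j < 4; ++j) {
            a[i][j] = (i == j) ? 1 : 0;     /* ones down the diagonal */
        }
    }
}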

We also need our 3D vector as a homogeneous coordinate, and we include our vector/matrix multiply routines as well.

struct G3DVector {
    float x;
    float y;
    float z;
    float w;

    // Math support
    void                multiply(const G3DMatrix &m, const G3DVector &v);
};

Everything we’re writing here should seem familiar if you made it to the end of my article on homogeneous coordinates. Basically, the idea is that a 3D point (x,y,z) becomes our 3D homogeneous vector (x,y,z,1), and any 3D homogeneous coordinate (x,y,z,w) becomes (x/w, y/w, z/w).

The implementation of each of the matrix initialization methods follows pretty quickly from our homogeneous matrices in the prior article. For example, our implementation of setTranslate, which creates a translation matrix:

$$\begin{bmatrix} 1 & 0 & 0 & x \\ 0 & 1 & 0 & y \\ 0 & 0 & 1 & z \\ 0 & 0 & 0 & 1 \end{bmatrix}$$

looks like:

void G3DMatrix::setTranslate(float x, float y, float z)
{
    setIdentity();
    a[0][3] = x;
    a[1][3] = y;
    a[2][3] = z;
}
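And setScale follows the same pattern; a sketch:

void G3DMatrix::setScale(float x, float y, float z)
{
    setIdentity();
    a[0][0] = x;        /* scale factors go down the diagonal */
    a[1][1] = y;
    a[2][2] = z;
}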

Our Transformations

In our G3D class, we create methods for translation, scaling and rotation, as well as a transformation matrix used to transform points in our 3D coordinate space.

class G3D
{
    public:
                G3D(Adafruit_GFX &lib);
                ~G3D();

        void    setColor(int16_t c)
                    {
                        color = c;
                    }
        
        void    begin();
        void    end();
        void    move(float x, float y, float z)
                    {
                        p4movedraw(false,x,y,z);
                    }
        void    draw(float x, float y, float z)
                    {
                        p4movedraw(true,x,y,z);
                    }
        void    point(float x, float y, float z)
                    {
                        p4point(x,y,z);
                    }

        void    translate(float x, float y, float z);
        void    scale(float x, float y, float z);
        void    scale(float s);
        void    rotate(uint8_t axis, float angle);
        void    perspective(float fov, float nclip);
        void    orthographic(void);

        G3DMatrix transformation;
    private:

We expose all of our transformation stuff here to simplify our work, by keeping the pieces all in one place. For example, our translate method looks like:

void G3D::translate(float x, float y, float z)
{
    G3DMatrix m;
    m.setTranslate(x,y,z);
    transformation.multiply(m);
}

This will post-multiply our matrix into our current transformation matrix. We’ll talk about the ramifications of this below.
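The rotate, scale and perspective methods presumably follow the same pattern; for example, a sketch of rotate:

void G3D::rotate(uint8_t axis, float angle)
{
    G3DMatrix m;
    m.setRotate(axis,angle);        /* build the rotation... */
    transformation.multiply(m);     /* ...and post-multiply it in */
}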

Finally we need our p4movedraw and p4point methods for our class:

void G3D::p4movedraw(bool drawFlag, float x, float y, float z)
{
    G3DVector t;

    t.x = transformation.a[0][0] * x + transformation.a[0][1] * y + transformation.a[0][2] * z + transformation.a[0][3];
    t.y = transformation.a[1][0] * x + transformation.a[1][1] * y + transformation.a[1][2] * z + transformation.a[1][3];
    t.z = transformation.a[2][0] * x + transformation.a[2][1] * y + transformation.a[2][2] * z + transformation.a[2][3];
    t.w = transformation.a[3][0] * x + transformation.a[3][1] * y + transformation.a[3][2] * z + transformation.a[3][3];

    p3movedraw(drawFlag,t);
}

void G3D::p4point(float x, float y, float z)
{
    G3DVector t;

    t.x = transformation.a[0][0] * x + transformation.a[0][1] * y + transformation.a[0][2] * z + transformation.a[0][3];
    t.y = transformation.a[1][0] * x + transformation.a[1][1] * y + transformation.a[1][2] * z + transformation.a[1][3];
    t.z = transformation.a[2][0] * x + transformation.a[2][1] * y + transformation.a[2][2] * z + transformation.a[2][3];
    t.w = transformation.a[3][0] * x + transformation.a[3][1] * y + transformation.a[3][2] * z + transformation.a[3][3];

    p3point(t);
}

This simply does our matrix/vector multiplication, then passes the resulting point into our 3D clipping engine.


Testing the whole thing.

Our test code is very simple.

First, we need to set up our drawing system: we create our drawing engine, then pass it as a construction parameter to our 3D drawing engine:

// For the Adafruit shield, these are the default.
#define TFT_DC 9
#define TFT_CS 10

// Use hardware SPI (on Uno, #13, #12, #11) and the above for CS/DC
Adafruit_ILI9341 tft = Adafruit_ILI9341(TFT_CS, TFT_DC);

// Graphics setup
G3D draw(tft);

And finally we set up a couple of global variables for the rotation angle.

// Rotation
static float GXAngle;
static float GYAngle;

Our setup code simply erases the screen to black and resets our globals.

void setup() 
{
    tft.begin();
    tft.fillScreen(ILI9341_BLACK);
    
    GXAngle = 0;
    GYAngle = 0;
}

And our loop draws our box at the origin, after setting up the transformation matrix. We wait a little bit, then we redraw the box in black; this is faster than erasing the screen to black. We rotate our box, and try again.

void loop() 
{
    draw.begin();
    draw.setColor(ILI9341_RED);
    transform();
    drawBox(0,0,0); 
    draw.end();

    delay(100);

    draw.begin();
    draw.setColor(ILI9341_BLACK);
    drawBox(0,0,0); 
    draw.end();

    GXAngle += 0.01;
    GYAngle += 0.02;
}

Our transformation matrix.

The code we use to set up our transformation matrix is:

void transform()
{
    draw.transformation.setIdentity();
    draw.perspective(1.0f,0.5f);
    draw.translate(0,0,-6);
    draw.rotate(AXIS_X,GXAngle);
    draw.rotate(AXIS_Y,GYAngle);
}

We initialize the matrix in the opposite order from that in which the transformations are applied to our points. In other words, our transformation matrix first rotates our cube around the Y axis. Then it rotates around the X axis. We then move the object 6 units away from our viewer, and finally apply the perspective matrix.

The order of this is important, because it means we could animate multiple objects by saving and restoring the matrix. For example, we could represent two cubes rotating around each other independently by partially setting up our transformation matrix, saving the intermediate result, and then setting up the specific transformations for the two separate cubes.

void transform()
{
    draw.transformation.setIdentity();
    draw.perspective(1.0f,0.5f);
    draw.translate(0,0,-6);

    G3DMatrix save = draw.transformation;
    draw.rotate(AXIS_X,GXAngle);
    drawFirstCube();

    draw.transformation = save;
    draw.rotate(AXIS_Y,GYAngle);
    drawSecondCube();
}

Put it all together and we get a rotating cube on our display.

The complete sources are at GitHub.

Building a 3D Graphics Pipeline on an Arduino, Part 3

Now let’s talk clipping: the process of making sure we don’t write a line that’s not visible on the screen.

We’re so used to having clipping done for us that sometimes we can forget that clipping is not something that “just happens.” When we write a line to the screen what we’re really doing, at the bottom of the stack, is writing a bunch of bytes to memory that is being displayed using video hardware. And if we don’t stop at the edges, we’ll have a buffer overrun error: at the very least we’ll see bits of a line wrapping around the screen. Worst case: we’ll drop garbage into global memory or onto our stack, crashing our application.

Clipping a line ultimately requires us to find the point along the line that hits the edge of the screen. There are multiple ways we can do this: different algorithms which either modify Bresenham’s line drawing algorithm, or perform line clipping calculations before drawing.

Pipeline

For our purposes, since we are drawing 3D lines and our coordinates are specified with floating point values (because we’re now considering the third stage of our pipeline, clipping), we’re going to use the latter strategy of finding the point where our lines intersect the edges.


Some Math.

Before we can describe how we clip lines, we need to introduce some vector math. (Yes, that dreaded 4 letter word.)

In this case, let’s talk [dot products](https://en.wikipedia.org/wiki/Dot_product). Now I won’t get too deep into an explanation as to how this all works; for that, the incredible series The essence of linear algebra by Grant Sanderson of 3Blue1Brown is absolutely fantastic, and explains all this stuff, with pretty pictures, at a level of detail that can be understood by nearly anyone.

But the essence of it is this:

The dot product of two equal-length vectors–two values (a,b,c) and (x,y,z) for example–is given by the formula:

$$A \cdot B = \sum_{i} A_i B_i$$

That is, you multiply all the items inside the two vectors together, and add them all up. So the dot product of (a,b,c) and (x,y,z) would be:

$$(a,b,c) \cdot (x,y,z) = ax + by + cz$$

Now the dot product has a very interesting property which will be useful to us when we talk about clipping: geometrically speaking, the value of the dot product is equal to the length of the projection of one vector onto another, times the length of the other vector. That is:

$$A \cdot B = \|A\| \, \|B\| \cos\theta$$

where ||A|| is the length of vector A, and θ is the angle between the two vectors.

Graphically this looks like:

Dot Products

That is, if we were to take a vector A and “project” it into vector B, then the dot product is the length of that projection times the length of the vector B.

Now there is a second interesting property which we will use in clipping, which is this: if A and B are pointed more or less in the same direction, then the dot product will be positive. However, if (as in the diagram below) they are pointed roughly in opposite directions (that is, the angle is greater than 90 degrees), then the dot product will be negative:

Dot Products 2

This property will come in handy in a second.


Equation of a line.

While we’re on the subject, it’s useful to contemplate the “equation of a line” on a 2D surface.

The general form of a line equation is:

$$ax + by + c = 0$$

Where the variables a, b and c are selected so that any point (x,y) is on the line if it satisfies the relationship above.

Now…

Notice something familiar?

Doesn’t it look like a dot product?

$$(a,b) \cdot (x,y) + c = 0$$

And since we know the dot product is essentially the projected distance along a vector, this gives us a hint as to a geometric interpretation of the equation of a line. For a line (a,b,c), we could interpret our line equation as follows.

First, remember that the dot product (a,b)·(x,y) is the projected distance of one of the vectors along the other–for example, the projection of (x,y) onto the vector (a,b):

Line Equation

Since our point (x,y) is on the line, this implies the dot product satisfies our equation above: that the dot product plus c equals zero.

It also means something else.

It means all the points on our line satisfy this same projection relationship–which means our line is the line that runs perpendicular to the vector (a,b) that intersects (x,y):

Line Equation 2

More importantly, consider all the lines that run parallel to our line. The closer they are to the origin in our diagram, the smaller the dot product is for the points along that line; the farther they are from the origin, the bigger the dot product. This means our equation for a line is not just a test for whether points are on the line; it also indicates how far a point is from our line, and which side of the line the point is on.

Line Equation 3

Both of these ideas become very useful to our screen clipping code.


Clipping.

Let’s now consider the operation of clipping a line against a single edge, and part of the line goes across the edge.

Clip Edge

In the above diagram, we’re clipping a line from A to B on the left side of our screen along the edge.

First, we need to consider if the line from A to B is even visible and if it intersects the edge. Then, if it intersects the edge, we need to find the point C where it intersects the edge.

Notice both questions could be answered if we had the equation (a,b,c) that represents the edge x = -1, since from above we know if we’re inside or outside based on the sign of the result (a,b,c)·(x,y,1), and the distance can help us calculate the point C.

The value for (a,b,c) should be obvious: we have the line for our edge along x = -1, and a little algebra later we get a = 1, b = 0 and c = 1. Ideally we want points inside our screen to be greater than or equal to zero–that is, for (a,b,c)·(x,y,1) to be greater than or equal to zero if x >= -1, and from a simple inspection our values of (a,b,c) work: if we plug in the coordinates (0,0), we get c = 1, which is positive.

This gives us our tests to see if our two points are visible on our screen:

    if (x + 1 >= 0) { /* Point is inside screen */ }

Where x+1 is a compact way of writing (a,b,c)·(x,y,1), where (a,b,c) = (1,0,1).

This also gives us a way to calculate the position of C if points A and B cross our edge, by using similar triangles:

Clip Edge 2

We know the triangle ACJ is proportionally smaller than the triangle ABK. We can figure out how much smaller the triangle is, proportionally speaking, from the lengths of the two edges we know–that is, the length of AJ, which is -A·(a,b,c), and the length of AK, which is the length of JK (B·(a,b,c)) plus the length of AJ.

The relative size of ACJ is:

$$\alpha = \frac{-A \cdot (a,b,c)}{B \cdot (a,b,c) - A \cdot (a,b,c)}$$

And we can use this to find the position of C. The value α ranges from 0 to 1 and gives the size of triangle ACJ relative to triangle ABK. If α were 1, then ACJ would be the same as ABK (because B·(a,b,c) would be 0), and if α were 0, then triangle ACJ would effectively be a point.

So we can find C through the following linear interpolation operation:

$$C = (1 - \alpha)A + \alpha B$$

This operation is so important to our clipper we write it as a separate subroutine:

static void Lerp(const Point &a, const Point &b, float alpha, Point &c)
{
    float a1 = 1.0f - alpha;
    c.x = a1 * a.x + alpha * b.x;
    c.y = a1 * a.y + alpha * b.y;
}

Where “Lerp” is short for “linear interpolation”–a term you’ll hear a lot in computer graphics.


Clipping in 2D

We now can write our 2D clipping routine, which basically clips against the four edges of our screen.

If we assume our four edges are along x = ±1 and y = ±1, then we can sketch our solution.

So let’s assume we’re given two points A and B, and we want to determine if we draw the line, draw a clipped line, or not draw a line at all.

Find the “out code”

First, we need to determine which edges the two points A and B are outside of. Since we have four edges, we need to compute a 4-bit value, with each bit representing if A and B are outside a particular edge. This we’ll call our “out code”, because in the four bits, a bit is set if the point is “outside” the screen at that edge.

Now we have four edges x = ±1 and y = ±1. Since we want a point at (0,0) to have a positive value (by convention), this gives our four edges at (-1,0,1), (1,0,1), (0,-1,1) and (0,1,1).

Remember that a point is outside each edge if the dot product is negative. So now we can quickly calculate our out code, with a little rearranging of our math:

static uint8_t OutCode(const Point &v)
{
    uint8_t m = 0;

    if (v.x < -1) m |= 1;   // v * (1,0,1) < 0
    if (v.x > 1) m |= 2;    // v * (-1,0,1) < 0
    if (v.y < -1) m |= 4;   // v * (0,1,1) < 0
    if (v.y > 1) m |= 8;    // v * (0,-1,1) < 0

    return m;
}

Next, quickly screen stuff that is not visible.

Notice one nice property of our out code: two points with the same bit set are both outside the same edge.

Clip Edge 3

We can test for this quickly by taking the bit-wise ‘and’ of both out codes:

static void DrawClippedLine(const Point &a, const Point &b)
{
    uint8_t a_out = OutCode(a);
    uint8_t b_out = OutCode(b);
    
    if (a_out & b_out) return;

Quickly find lines that are not clipped

We can also quickly test if both points are inside our screen–by testing to see if both out codes are zero. One way we can test for this–because we’ll use the intermediate result later–is to take the bit-wise ‘or’ of both out codes and test if zero:

    uint8_t out_bits = a_out | b_out;
    if (out_bits == 0) {
        DrawLine(a,b);
    } else {

If both out codes are 0, we can just draw our line.

Now find clipped lines.

But what happens if one or more bits are set? Well, that means that one (or both) of our points are outside the screen, and we need to clip our line. For our line drawing routine we have, basically, three different cases, illustrated below:

Clipping cases

The first case, edge S, has one point inside our screen. We only need to clip one end of the line.

The second case, edge R, has two points outside our screen, but the middle piece is visible inside our view.

And the third case, edge P, has two points outside our screen and no part of the line is visible.

So how do we screen against all these cases?

First, notice our out_bits variable above. It has the nice property that a bit is set only if we have to clip against a particular edge. This allows us to skip those calculations we don’t need to bother to do.

So let’s iterate across the four edges and calculate our α for each edge our line crosses:

        uint8_t m = 1;
        uint8_t i;
        float alpha;
        for (i = 0; i < 4; ++i, m <<= 1) {
            if (out_bits & m) {
                // Calculate alpha; the intersection along the line
                // vector intersecting the specified edge
                // 
                // These are specific cases of the general equation
                // alpha = (c - old)/(new - old), which yields
                // alpha == 0 if c == old, and alpha == 1 if c == new,
                // and with alpha as a linear scale with the intersection
                // point sliding from old to new.

                switch (i) {
                    default:
                    case 0:         // clip (1,0,1)
                        alpha = 1 + p3pos.x;
                        alpha = alpha/(alpha - (1 + v.x));
                        break;
                    case 1:         // clip (-1,0,1)
                        alpha = 1 - p3pos.x;
                        alpha = alpha/(alpha - (1 - v.x));
                        break;
                    case 2:         // clip (0,1,1)
                        alpha = 1 + p3pos.y;
                        alpha = alpha/(alpha - (1 + v.y));
                        break;
                    case 3:         // clip (0,-1,1)
                        alpha = 1 - p3pos.y;
                        alpha = alpha/(alpha - (1 - v.y));
                        break;
                }

Now we want to track which point is inside and which point is outside, and for the outside point on our line, track the ‘alpha’ for that side, so we can determine which side needs to be clipped.

We do this by tracking an ‘alpha’ for each side of our line, which we’ll call alpha_a and alpha_b. By convention we’ll assume alpha is 0 if our line clips at point A and 1 if we clip at point B.

We also know which side (the side with A or B) is being clipped by looking at our original outcode: if the bit is set in a_out, then we need to clip on side a. If b_out, then on side b. Notice it will only be one or the other; if neither are set we don’t clip (because out_bits would be zero at that bit), and if both are set–well, we skipped this line earlier.

So let’s add the following; the new pieces are the alpha_a and alpha_b declarations and the clip-side test at the bottom of the loop:

        float alpha_a = 0;
        float alpha_b = 1;

        uint8_t m = 1;
        uint8_t i;
        float alpha;
        for (i = 0; i < 4; ++i, m <<= 1) {
            if (out_bits & m) {
                // Calculate alpha; the intersection along the line
                // vector intersecting the specified edge
                // 
                // These are specific cases of the general equation
                // alpha = (c - old)/(new - old), which yields
                // alpha == 0 if c == old, and alpha == 1 if c == new,
                // and with alpha as a linear scale with the intersection
                // point sliding from old to new.

                switch (i) {
                    default:
                    case 0:         // clip (1,0,1)
                        alpha = 1 + p3pos.x;
                        alpha = alpha/(alpha - (1 + v.x));
                        break;
                    case 1:         // clip (-1,0,1)
                        alpha = 1 - p3pos.x;
                        alpha = alpha/(alpha - (1 - v.x));
                        break;
                    case 2:         // clip (0,1,1)
                        alpha = 1 + p3pos.y;
                        alpha = alpha/(alpha - (1 + v.y));
                        break;
                    case 3:         // clip (0,-1,1)
                        alpha = 1 - p3pos.y;
                        alpha = alpha/(alpha - (1 - v.y));
                        break;
                }

                if (a_out & m) {
                    if (alpha_a < alpha) alpha_a = alpha;
                } else {
                    if (alpha_b > alpha) alpha_b = alpha;
                }

One more thing: if alpha_a has become bigger than alpha_b, that means we have case P above; our line is not visible on the screen. So we can stop processing right away:

                
                if (alpha_a > alpha_b) return;
            }
        }

Now that we’ve done our calculations, we can now determine the new endpoints and draw our line. To do this we again make use of our outcodes to determine if we even need to clip one side or another.

        
        Point c1;
        if (a_out) {
            Lerp(a,b,alpha_a,c1);
        } else {
            c1 = a;
        }
        
        Point c2;
        if (b_out) {
            Lerp(a,b,alpha_b,c2);
        } else {
            c2 = b;
        }

And now–finally!–we can draw our line.

        DrawLine(c1,c2);
    }
}

There are some interesting things we can say about our line clipping algorithm.

First, it follows the principle of only doing work if we have to. If we don’t have to clip against an edge, we don’t clip against the edge.

Second, notice that we could in theory extend this to 3D very easily. If we wanted to clip to a cube where x, y and z range from -1 to 1, our 6 different clipping dot products would be

    1: ( 1, 0, 0, 1)
    2: (-1, 0, 0, 1)
    3: ( 0, 1, 0, 1)
    4: ( 0,-1, 0, 1)
    5: ( 0, 0, 1, 1)
    6: ( 0, 0,-1, 1)

Clipping in homogeneous coordinates

Notice something else very interesting about our list of vectors we’re clipping against: they sort of look like homogeneous coordinates, don’t they?

And in fact, with a minor modification to our X or Y clipping (depending on the aspect ratio of the screen, since our screen is not square), this becomes our clipping algorithm for drawing lines in 3D.

We can make a few optimizations. For example, rather than supplying both points A and B as we draw our line, if we were to adopt a ‘move/draw’ mechanism, we can reuse some of the calculations we perform with the previous point as we draw the next.

Note: This is why we use a move/draw mechanism for drawing our lines.

We can also define a homogeneous vector for convenience sake:

struct G3DVector {
    float x;
    float y;
    float z;
    float w;
};

And voilà! We have our line clipping algorithm for homogeneous coordinates.

First, we need our linear interpolation algorithm and our out code calculator for homogeneous coordinates:

void G3D::Lerp(const G3DVector &a, const G3DVector &b, float alpha, G3DVector &c)
{
    float a1 = 1.0f - alpha;
    c.x = a1 * a.x + alpha * b.x;
    c.y = a1 * a.y + alpha * b.y;
    c.z = a1 * a.z + alpha * b.z;
    c.w = a1 * a.w + alpha * b.w;
}
uint8_t G3D::OutCode(const G3DVector &v)
{
    uint8_t m = 0;
    float px = p2xsize * v.x;
    float py = p2ysize * v.y;

    if (px < -v.w) m |= 1;      // (p2xsize,0,0,1)
    if (px > v.w) m |= 2;       // (-p2xsize,0,0,1)
    if (py < -v.w) m |= 4;      // (0,p2ysize,0,1)
    if (py > v.w) m |= 8;       // (0,-p2ysize,0,1)
    if (v.z < -v.w) m |= 16;
    if (v.z > v.w) m |= 32;

    return m;
}

Next, we need to run the same clipping algorithm, but using homogeneous coordinates:

void G3D::p3movedraw(bool drawFlag, const G3DVector &v)
{
    uint8_t newOutCode = OutCode(v);
    G3DVector lerp;
    if (drawFlag) {
        uint8_t mask = newOutCode | p3outcode;

        /*
         *  Fast accept/reject
         */

        if (0 == (newOutCode & p3outcode)) {
            if (0 == mask) {
                // Fast accept. Both points are inside; we assume
                // the previous point was already passed upwards, 
                // so we only draw to the current vector location
                p2movedraw(true,v.x/v.w,v.y/v.w);
            } else {
                // At this point we have a line that crosses
                // a boundary. We calculate the alpha between
                // 0 and 1 for each point along the line.
                //
                // (This is the Liang-Barsky optimization of
                // the Cohen-Sutherland algorithm)

                float aold = 0;     // (1 - alpha) * old + alpha * new = v
                float anew = 1;     // in the above, 0 == old, 1 == new.
                float alpha;

                uint8_t m = 1;
                uint8_t i;
                for (i = 0; i < 6; ++i) {
                    if (mask & m) {
                        // Calculate alpha; the intersection along the line
                        // vector intersecting the specified edge
                        // 
                        // These are specific cases of the general equation
                        // alpha = (c - old)/(new - old), which yields
                        // alpha == 0 if c == old, and alpha == 1 if c == new,
                        // and with alpha as a linear scale with the intersection
                        // point sliding from old to new.

                        switch (i) {
                            default:
                            case 0:         // clip (p2xsize,0,0,1)
                                alpha = p2xsize * p3pos.x + p3pos.w;
                                alpha = alpha/(alpha - (p2xsize * v.x + v.w));
                                break;
                            case 1:         // clip (-p2xsize,0,0,1)
                                alpha = - p2xsize * p3pos.x + p3pos.w;
                                alpha = alpha/(alpha - (-p2xsize * v.x + v.w));
                                break;
                            case 2:         // clip (0,p2ysize,0,1)
                                alpha = p2ysize * p3pos.y + p3pos.w;
                                alpha = alpha/(alpha - (p2ysize * v.y + v.w));
                                break;
                            case 3:         // clip (0,-p2ysize,0,1)
                                alpha = - p2ysize * p3pos.y + p3pos.w;
                                alpha = alpha/(alpha - (- p2ysize * v.y + v.w));
                                break;
                            case 4:         // clip (0,0,1,1)
                                alpha = p3pos.z + p3pos.w;
                                alpha = alpha/(alpha - (v.z + v.w));
                                break;
                            case 5:         // clip (0,0,-1,1)
                                alpha = - p3pos.z + p3pos.w;
                                alpha = alpha/(alpha - (-v.z + v.w));
                                break;
                        }

                        if (p3outcode & m) {
                            if (aold < alpha) aold = alpha;
                        } else {
                            if (anew > alpha) anew = alpha;
                        }

                        if (aold > anew) {
                            // We have a case where the line is not visible
                            // because it's outside the visible frustum.
                            // abort.
                            break;
                        }
                    }
                    m <<= 1;
                }

                if (i >= 6) {
                    // Ran all clipping edges.
                    if (p3outcode) {
                        Lerp(p3pos,v,aold,lerp);
                        p2movedraw(false,lerp.x/lerp.w,lerp.y/lerp.w);
                    }

                    // Draw to the new point
                    if (newOutCode) {
                        Lerp(p3pos,v,anew,lerp);
                        p2movedraw(true,lerp.x/lerp.w,lerp.y/lerp.w);
                    } else {
                        p2movedraw(true,v.x/v.w,v.y/v.w);
                    }
                }
            }
        }
    } else {
        if (newOutCode == 0) {
            p2movedraw(false,v.x/v.w,v.y/v.w);
        }
    }

    p3outcode = newOutCode;
    p3pos = v;
}

This assumes we’ve added a couple of new fields to our class:

        G3DVector p3pos;
        uint8_t   p3outcode;

And now we have our clipping routine!


The code for our clipping routine is in GitHub; you can download it from here.

Next time: we’ll take all that stuff we talked about with homogeneous coordinates and put the whole thing together to draw 3D objects!

Building a 3D Graphics Pipeline on an Arduino, Part 2

Let’s talk coordinate systems.

Yes, I understand that this will involve that most dreaded of four letter words, math. I promise I’ll be gentle, and I’ll try my best to make this as easy to follow as possible.

Some of this may be a little basic; feel free to skip ahead if you already know something I’m about to talk about.


2D Coordinates

First, let’s talk about finding the position of a pixel on a computer screen. This is something most of us are pretty familiar with.

Pixels

So if we want to specify the location of the yellow pixel above, well, we’d find the column (“x”) under which the pixel is located, and we’d find the row (“y”). For the yellow pixel above, x = 3, y = 1, and that’s our pixel’s location.

We can generalize this idea. For example, if you want to find the location of a pin on a piece of paper you can use a ruler and measure its distance from the left edge and its distance from the top edge; those two measurements (like, for example, x = 1.3 inches, y = 4.5 inches) give us the location of that point on the piece of paper.

Notice in both cases our position is a measure of distance from some relative location: for our point on a piece of paper our “units” are inches, and the relative location is the upper-left corner of the piece of paper. For our computer display, our “units” are pixels, and it’s all measured relative to the upper-left corner of the screen.

Some terms.

With this in mind, it’s useful to define some terms.

We will call the point from which we’re measuring our point’s relative location the Origin, because this is where our measurements originate. Often we represent the origin in a diagram with a zero.

We will call the relative directions we measure along (horizontally and vertically across the page, across rows and down columns of our display) our Axes. (In math, an “axis” is a line that serves to orient ourselves–and it can also be something we rotate around.)

Note: The whole concept of axes in math is a bit of a rabbit hole, but we’ll ignore that for now. For now we’ll just treat them as “up/down”, “left/right” and “near/far.”

And we’ll introduce some compact notation:

(x, y)

This is a compact way for us to represent x and y.

So our yellow pixel is at (3,1), and our point on a piece of paper is at (1.3 inches, 4.5 inches). Usually, however, our units are implied–so we’d write our point on paper as at (1.3, 4.5).


3D Coordinates

This whole idea of 2D coordinates can be extended to the world of 3D by considering “depth.”

Depth

Take our yellow dot above. Relative to our X and Y axes, the object is 4 units along X and 3 units up along Y. (What this means is, looking at the diagram above, if you were to measure from the wall that goes up along the Y-Z plane, the yellow dot is 4 units away from that wall.)

From the floor (that is, from the X-Z plane), our yellow dot is 3 units up.

And it’s 5 units away from the X-Y plane.

We would represent the location of our yellow dot as x = 4 units, y = 3 units and z = 5 units, or–more compactly, as

(4, 3, 5)

On the math below.

I think at this point it is fair to say that we’re about to dive into a topic called “linear algebra”, and while I’ll only write out the things that are useful to us, the whole topic of linear algebra contains some interesting stuff that can be useful if you decide to do more interesting things in computer graphics than just draw a handful of pretty pictures with half-understood equations.

And for that, if you want, the wonderful series The essence of linear algebra by Grant Sanderson of 3Blue1Brown is absolutely fantastic, and explains all this stuff, with pretty pictures, at a level of detail that can be understood by nearly anyone.

For our purposes we’ll introduce the stuff we need–but if you want to understand “why”, you’ll want to check out his videos. They’re well done, well thought out and, I think, very easy for anyone to understand.


Perspective

Now when we talk about computer graphics, what we’re really thinking about is the whole idea of “perspective:” the idea that as things get farther away they appear smaller.

The moon, for example, is 2,150 miles in diameter. A quarter, on the other hand, is slightly less than an inch in diameter. But if you hold up the quarter to the night sky, you can cover the moon with your quarter. That’s perspective: your quarter, being inches from your eye, is a lot closer than the moon, which is 240,000 miles away.

You can think of this by drawing a line from a far away object to your eye:

Perspective1

The blue dot is your eye looking at the far away object.

Now suppose you are looking at the object through a computer screen some distance in front of your eye. The object would appear smaller–because the screen is nearer than the object you are looking at:

Perspective2

So how big is that object on the screen?

For convenience’s sake, let’s consider the distance from the screen to your eye to be 1 unit. A little basic geometry gives us an answer: if the distance to the object from your eye is Z and its height is Y, then the law of similar triangles gives the height Y’ on the screen by the formula below.

$$Y' = \frac{Y}{Z}$$

This means the size on our screen (that is one unit away) would be Y/Z.
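In code, projecting onto a screen one unit away is just a divide; a tiny sketch:

/* Project a 3D point onto a screen one unit in front of the eye */
void Project(float x, float y, float z, float &sx, float &sy)
{
    sx = x / z;         /* law of similar triangles */
    sy = y / z;
}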


Perspective and Homogeneous Coordinates

When we talk about perspective, we are talking about Projective Geometry, or the geometric principles of projecting stuff.

Which is what we are doing when we project our object onto a computer screen.

And in 1827, August Ferdinand Möbius devised the concept of homogeneous coordinates as a means of representing projective coordinates–the coordinates of things as they are projected somewhere else.

Now the whole topic of homogeneous coordinates, like mathematical axis, is a rabbit hole we can easily spend a lot of time on. But for our purposes, the way this works is as follows:

First, we add a new dimension, w, associated with all of our points. Generally we can take a coordinate (x, y, z) and map it into homogeneous coordinates by adding w = 1, which we show as a fourth value in our coordinates.

(x, y, z) -> (x, y, z, 1)

And we can map any homogeneous point (x, y, z, w) back to 3D coordinates by:

(x, y, z, w) -> (x/w, y/w, z/w)

Now why divide by w and not z? That’s in part a matter of convention, and in part because it works well with our 3D clipping for hand-wavey reasons.

(Basically, w serves as a projection in our 3D coordinate system. And when it comes to clipping we want to preserve all the information we can so that we can represent clipping with as much accuracy as possible, and because we may wish to represent things like points on an “infinite” sphere–like stars in the sky–through using ‘w = 0’.)
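In code, the two mappings are trivial; a sketch using the G3DVector structure we use elsewhere in this series:

/* Map a 3D point into homogeneous coordinates */
void ToHomogeneous(float x, float y, float z, G3DVector &v)
{
    v.x = x;
    v.y = y;
    v.z = z;
    v.w = 1;            /* by convention, w starts at 1 */
}

/* Map a homogeneous point back into 3D (assumes w is not zero) */
void ToCartesian(const G3DVector &v, float &x, float &y, float &z)
{
    x = v.x / v.w;
    y = v.y / v.w;
    z = v.z / v.w;
}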

Why is this interesting?

One thing you can use homogeneous coordinates for is dealing with all your rotations, scaling operations, and translation operations (moving things around in your system) using matrix multiply operations.

Here’s an example. Suppose we have an object at (3, 4, 5)–and we want to represent it moving over by 2 units along X.

Normally we’d do this by addition: (3, 4, 5) + (2, 0, 0) = (3+2, 4, 5) = (5, 4, 5).

Matrix multiplication may seem a little more convoluted–but trust me, this will make our lives easier.


Small rabbit hole: matrix multiplication.

Before we talk about translation with matrix multiplication, let’s define a few terms.

First, for our purposes, a matrix is just a two-dimensional array of numbers. Throughout computer graphics we always use 4×4 matrices, so whenever you hear “matrix” in computer graphics, usually what you should hear is:

$$\begin{bmatrix} a_{00} & a_{01} & a_{02} & a_{03} \\ a_{10} & a_{11} & a_{12} & a_{13} \\ a_{20} & a_{21} & a_{22} & a_{23} \\ a_{30} & a_{31} & a_{32} & a_{33} \end{bmatrix}$$

That is, a 4×4 array of numbers.

Now when we multiply a matrix by a vector, we essentially treat the vector as a 4×1 matrix (if you’re following along by looking stuff up on Wikipedia), and we do the following:

$$\begin{bmatrix} a_{00} & a_{01} & a_{02} & a_{03} \\ a_{10} & a_{11} & a_{12} & a_{13} \\ a_{20} & a_{21} & a_{22} & a_{23} \\ a_{30} & a_{31} & a_{32} & a_{33} \end{bmatrix} \begin{bmatrix} x \\ y \\ z \\ w \end{bmatrix} = \begin{bmatrix} a_{00}x + a_{01}y + a_{02}z + a_{03}w \\ a_{10}x + a_{11}y + a_{12}z + a_{13}w \\ a_{20}x + a_{21}y + a_{22}z + a_{23}w \\ a_{30}x + a_{31}y + a_{32}z + a_{33}w \end{bmatrix}$$

We can see this with the code we use for multiplying matrices, though we store a matrix as a 2D array:

void G3DVector::multiply(const G3DMatrix &m, const G3DVector &v)
{
    x = m.a[0][0] * v.x + m.a[0][1] * v.y + m.a[0][2] * v.z + m.a[0][3] * v.w;
    y = m.a[1][0] * v.x + m.a[1][1] * v.y + m.a[1][2] * v.z + m.a[1][3] * v.w;
    z = m.a[2][0] * v.x + m.a[2][1] * v.y + m.a[2][2] * v.z + m.a[2][3] * v.w;
    w = m.a[3][0] * v.x + m.a[3][1] * v.y + m.a[3][2] * v.z + m.a[3][3] * v.w;
}

The ways, whys and wherefores of matrix multiplication aren’t really that important here, though you can read more at the above linked Wikipedia article.


Back to Translations

Notice something interesting about our matrix multiplication results: they include addition as well as multiplication. We can use this to our advantage by constructing our matrix appropriately.

So back to our point at (3, 4, 5).

If we represent this as a homogeneous coordinate (3, 4, 5, 1) (because remember: in general when we go from a point in 3D space to a homogeneous coordinate we append a w = 1 at the end), then we can construct a matrix to handle translating by x = 2:

Translate Matrix Example
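
Written out (using the column-vector convention from the multiply code above), the matrix is simply the identity with the translation amount 2 in the last column of the x row:

$$T = \begin{pmatrix} 1 & 0 & 0 & 2 \\ 0 & 1 & 0 & 0 \\ 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & 1 \end{pmatrix}$$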

(Follow along with the animation if you need to convince yourself this is the correct answer.)

Now if we perform the actual multiplications and additions we get our final result (5, 4, 5, 1), and from our rules above, in 3D space, this would be (5/1, 4/1, 5/1) = (5, 4, 5).

We can generalize this by moving through Y and Z, giving us our translation matrix (that is, a matrix which moves our object by a distance in X, Y and Z) as:

Translation Matrix
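
Written out, this is the identity matrix with the translation amounts in the last column:

$$T(x,y,z) = \begin{pmatrix} 1 & 0 & 0 & x \\ 0 & 1 & 0 & y \\ 0 & 0 & 1 & z \\ 0 & 0 & 0 & 1 \end{pmatrix}$$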

There is an interesting property of these translation matrices: we can chain them together using matrix multiplication. And it just works out the way we would think: if we have two translation matrices, one that translates by (x,y,z) and another that translates by (a,b,c), then multiplying the two matrices together gives us:

Multiply Translation Matrices
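
Multiplied out, the product is itself a translation matrix–one which translates by the sum of the two offsets:

$$T(x,y,z)\,T(a,b,c) = \begin{pmatrix} 1 & 0 & 0 & x+a \\ 0 & 1 & 0 & y+b \\ 0 & 0 & 1 & z+c \\ 0 & 0 & 0 & 1 \end{pmatrix} = T(x+a,\,y+b,\,z+c)$$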

(If you have to read the article on matrix multiplications and follow along, that’s fine. I can wait.)

This hints at something terribly clever going on here:

With a single matrix you can represent the concatenation of a whole bunch of rotations, scale operations and translations.

And it means we aren’t constantly moving things around the screen a pixel at a time, through a whole chain of “rotate”, “move” and “scale” operations. We simply concatenate all of this into a single matrix through multiplication, and then every point multiplied by that matrix is moved according to our whole chain of rotations, movements and scale operations.


Scaling and Rotations and chaining it all together

Scaling an object by (sx,sy,sz)–that is, multiplying each object’s (x,y,z) coordinate by (sx,sy,sz)–has the following matrix representation:

Scale Matrix
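
Written out, the scale factors run down the diagonal:

$$S(s_x,s_y,s_z) = \begin{pmatrix} s_x & 0 & 0 & 0 \\ 0 & s_y & 0 & 0 \\ 0 & 0 & s_z & 0 \\ 0 & 0 & 0 & 1 \end{pmatrix}$$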

And rotation around the X, Y and Z axis looks like:

Rotate Matrices
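
For reference, the standard right-handed rotation matrices for rotating by an angle $\theta$ around each axis (under the same column-vector convention) are:

$$R_x(\theta) = \begin{pmatrix} 1 & 0 & 0 & 0 \\ 0 & \cos\theta & -\sin\theta & 0 \\ 0 & \sin\theta & \cos\theta & 0 \\ 0 & 0 & 0 & 1 \end{pmatrix} \qquad R_y(\theta) = \begin{pmatrix} \cos\theta & 0 & \sin\theta & 0 \\ 0 & 1 & 0 & 0 \\ -\sin\theta & 0 & \cos\theta & 0 \\ 0 & 0 & 0 & 1 \end{pmatrix} \qquad R_z(\theta) = \begin{pmatrix} \cos\theta & -\sin\theta & 0 & 0 \\ \sin\theta & \cos\theta & 0 & 0 \\ 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & 1 \end{pmatrix}$$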

Now we can chain these matrices together by pre-multiplying. Meaning if we want to first rotate our object around the Y axis by an angle, then translate the whole thing by (tx,ty,tz), we multiply our vector first by the rotation matrix, then by the translation matrix–reading right to left:

Chaining
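
Symbolically, with $v$ our point, $R_y$ the rotation and $T$ the translation, that’s:

$$v' = T\,(R_y\,v) = (T\,R_y)\,v$$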

Or, you know, we could multiply the matrices together first–then multiply by all of our points.

‘Cause like I said above:

With a single matrix you can represent the concatenation of a whole bunch of rotations, scale operations and translations.

This is, by the way, what happens in an OpenGL graphics pipeline.

Display drivers which render 3D in hardware are very good at multiplying 4×4 matrices together, and at multiplying 4×4 matrices with vectors. This math allows them to move objects around in real time on your screen.

Of course we’re not going to be moving stuff around in real time with an Arduino, but the same principles apply.


But what about perspective?

Oh, yeah. So we’ve gone down a rabbit hole using 4×4 matrices to move our objects around in three-dimensional space. But what about displaying objects in perspective?

There are a number of perspective matrices out there which essentially put the z depth into the w column, so at the end we get (x/z, y/z, ...)–we then can display our lines using the (x/z, y/z) value at the end.

The perspective matrix I prefer–and you would multiply this as the last multiplication operation before displaying your stuff–is:

Perspective Matrix
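
The simplest matrix consistent with the description below–a sketch only, since the author’s preferred matrix may also fold in field-of-view scaling–just swaps the z and w entries:

$$P = \begin{pmatrix} 1 & 0 & 0 & 0 \\ 0 & 1 & 0 & 0 \\ 0 & 0 & 0 & 1 \\ 0 & 0 & 1 & 0 \end{pmatrix} \qquad P \begin{pmatrix} x \\ y \\ z \\ w \end{pmatrix} = \begin{pmatrix} x \\ y \\ w \\ z \end{pmatrix}$$

so the final divide-by-w step from our homogeneous coordinate rules divides by the original z.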

Notice what we do here: we move the z coordinate into the w column, and the w column into the z column. This has the nice result that we (eventually) divide by z–and all that perspective stuff we did before works very well.

Note: This is not the perspective matrix used by OpenGL. I provide a link above explaining why this is a preferable way to represent perspective.


A word about computer graphics math

I once worked with a computer graphics library as part of a project a long time ago–it was required by the project manager, even though I could have rolled my own more quickly.

But the manual did make the following observation:

In the end, you either have an image, or you do not.

There are a lot of things that can go wrong on your path towards drawing something. You can accidentally flip the sign of something–and think you’re rendering an object in front of you when it is behind you and invisible. You can accidentally rotate left when you intended to rotate right. You can stack the perspective matrices backwards, or multiply the matrices together wrong.

So here are some helpful hints to keep in mind when we get to the part where we start putting the code together.

Test the translation matrix first.

This will allow you to make sure you haven’t transposed the matrix: that you haven’t flipped the rows and the columns. If you take a point at (1,2,3,1) and multiply it by the translation matrix for moving an object by x=5, and you get (6,2,3,1), you’re probably on the right path. If you get (1,2,3,6), you’ve transposed your matrix.
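
Here’s a minimal sketch of that test using the G3DVector and G3DMatrix classes from the multiply routine earlier (this assumes the matrix’s a[][] array is directly accessible; adjust for however your matrix class is actually initialized):

G3DMatrix m;
for (int i = 0; i < 4; ++i)             // start with the identity matrix
    for (int j = 0; j < 4; ++j)
        m.a[i][j] = (i == j) ? 1 : 0;
m.a[0][3] = 5;                          // translate by x = 5

G3DVector v;
v.x = 1; v.y = 2; v.z = 3; v.w = 1;     // our test point (1,2,3,1)

G3DVector result;
result.multiply(m, v);                  // expect (6,2,3,1); if you get
                                        // (1,2,3,6), the matrix is transposed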

Test to make sure you haven’t flipped a sign somewhere.

It’s so inevitable that it’s nearly a joke amongst those involved in the computer graphics industry: you will flip the sign of some term somewhere. Look through your code for a “+” where there should be a “-” and vice-versa.

Right-handed rule versus left-handed rule.

This is sort of related to the last rule.

Take your right hand, and make a “gun” with your fingers–with your thumb pointing up, your pointing finger pointing out. Now your middle finger points at a right angle to your pointing finger, and your last two fingers curl around.

If your thumb is “X”, your pointing finger “Y” and your middle finger “Z”, this is the right-handed coordinate system. And this is the coordinate system our perspective matrix works in. Notice something interesting here: if you rotate your hand so your thumb is pointing horizontally to the right, your pointing finger is pointing up–well, the middle finger is pointing towards you, not away.

This means if you want to move your object out in front of your virtual camera, you must subtract some value from the Z axis.

Build your translations slowly.

When you start building your translations for display–say, you want to move a block representing an arm of a robot to the shoulder–don’t be afraid to build each of the translations slowly, using small values. Bump your arm out by 1 unit. Or rotate it by 5 to 10 degrees. Small motions help you visualize if you’re building the transformation stack (what we call all that matrix-to-matrix multiplication) correctly.

And start simply: it’s easier to start with an image (even if it’s a boring cube) and make something cool with it, than to start with a blank screen and pull your hair out trying to figure out why nothing is showing up.

Premultiply your perspective matrix last.

Because we move stuff around in 3D space before we take its perspective, right?

When in doubt, comment it out.

This helps you reduce the number of variables in the problem, so you can see if you have some crazy value being multiplied into your system.


End Notes

This was a lot, I’m sure. But next time we’ll talk about clipping and clipping in 3D. And afterwards, we’ll throw all this together to create our first 3D drawings.

Building a 3D Graphics Pipeline on an Arduino, Part 1

This is a series of articles showing the creation of a basic 3D graphics pipeline on an Arduino. By the end of this series, hopefully you’ll have a good understanding of basic 3D graphics, including coordinate transformations, rendering and perspective.


Things you need to follow along.

Of course, if you want to write software for an Arduino, you need one.

For this series of exercises I used the Adafruit METRO 328, which can be had (as of this writing) for $17.50. You can also use any other Arduino or Arduino clone, but you may need to fiddle with the settings.

I also used the 2.8″ TFT Touch Shield for Arduino. The advantage of all of this is that we can simply plug the two parts together, plug it into our computer, download the Arduino IDE, and we have a basic development environment.

Adafruit also does a very good job integrating with the Arduino IDE, providing libraries that can be quickly downloaded to test your display. And they provide a great on-line tutorial to get you up and running in just a few minutes.

Of course by using these components we create a dependency on their libraries; if you decide to use a different display shield from a different company you’ll also need to make a few changes in our code to make it all hang together. We’ll talk about that when we get there.

When you get it all set up, you should see a series of test screens.

Image

* Bear not included.


Introduction.

The reason why we call this a “graphics pipeline” is because we break the problem of drawing 3D objects down into a series of steps–a “pipeline” where we do a small but well-defined unit of work at each step along the way.

By organizing our drawing as a pipeline we simplify each step of the process. Each step may not do much, but when added together we can create incredible things.

The pipeline we will build will allow us to draw 3D wireframes. We’ll do a series of articles covering polygonal rendering later. At the end of this series of articles we will have created a pipeline that looks like the following:

Pipeline

So let’s get started.

Hardware drawing.

At the very bottom of our pipeline is the code which draws directly to the hardware. For this we will rely on the Adafruit GFX library to actually handle the drawing–but we still call this layer out because it is the only point in our code where we talk to the Adafruit GFX library in a meaningful way. (Our code on construction interrogates the library to get the screen size, and we expose the start and stop drawing methods and color methods–but that’s it.)

We could, in theory, write our own line drawing routine. Bresenham’s line drawing algorithm can draw lines very quickly using only integer arithmetic, and is the algorithm used by the Adafruit library. Our line drawing routine could also be wired to another library, such as the graphics library used by the Arduboy handheld game.

Line Drawing

Our primary line drawing method for the first part of our 3D pipeline is the ‘p1movedraw’ method. This contains all the logic for moving a virtual pen to a given location, or drawing from the current pen location to a new pen location.

We use this ‘pen’ drawing model because it allows us to keep track of the state of the current location; this comes in handy when we do our 3D calculations or our line clipping.

This means our class G3D must keep track of the current pen location (and all the state revolving around the current pen location).

So let us create our 3D graphics class. First, we need to set up the constructor and the destructor to initialize and shut down our 3D pipeline, as well as entry points for setting the current color and for starting and stopping line drawing.

class G3D
{
    public:
                G3D(Adafruit_GFX &lib);
                ~G3D();

        void    setColor(int16_t c);
        void    begin();
        void    end();
        void    move(uint16_t x, uint16_t y);
        void    draw(uint16_t x, uint16_t y);
        void    point(uint16_t x, uint16_t y);
};

The three routines move, draw and point are test methods; we will replace them when we build the next level of our graphics engine.

Now let’s add the internal state we need to track, as well as our constructor and destructor. We want to keep track of our drawing library as well as the screen size and color:

class G3D
{
    public:
                G3D(Adafruit_GFX &lib);
                ~G3D();

        void    setColor(int16_t c)
                    {
                        color = c;
                    }

        void    begin();
        void    end();
        void    move(uint16_t x, uint16_t y);
        void    draw(uint16_t x, uint16_t y);
        void    point(uint16_t x, uint16_t y);
    private:
        /*
         *  Internal state
         */
        Adafruit_GFX &lib;
        uint16_t width;
        uint16_t height;

        /*
         *  Current drawing color
         */

        uint16_t color;
};

Our constructor simply notes the width and height of the screen as well as the drawing library; our destructor has nothing to do yet:

G3D::G3D(Adafruit_GFX &l) : lib(l)
{
    width = l.width();
    height = l.height();
}

G3D::~G3D()
{
}

We also want to build the begin and end methods. These call into the Adafruit GFX library to start and end drawing to our device. The reason why we do this is to optimize drawing; we don’t want to unnecessarily start up and shut down communications to our hardware every time we draw a line.

void G3D::begin()
{
    lib.startWrite();
}

void G3D::end()
{
    lib.endWrite();
}

Our setColor method can be handled inline above.


Now all of this so far has just been plumbing; creating our class and tracking state.

The real meat of our first level of drawing is the p1movedraw method and its associated state. The idea here is to create a single entry point which handles both moving and drawing.

Our p1movedraw method needs to keep track of the current pen location, as well as whether we moved or drew the last time it was called. So let’s add the additional methods and state needed to handle moving and drawing:

class G3D
{
    public:
                G3D(Adafruit_GFX &lib);
                ~G3D();

        void    setColor(int16_t c)
                    {
                        color = c;
                    }

        void    begin();
        void    end();
        void    move(uint16_t x, uint16_t y);
        void    draw(uint16_t x, uint16_t y);
        void    point(uint16_t x, uint16_t y);
    private:
        /*
         *  Internal state
         */
        Adafruit_GFX &lib;
        uint16_t width;
        uint16_t height;

        /*
         *  Current drawing color
         */

        uint16_t color;

        /*
         *  State 1 pipeline
         */

        bool    p1draw;
        uint16_t p1x;
        uint16_t p1y;

        void    p1init();
        void    p1movedraw(bool drawFlag, uint16_t x, uint16_t y);
        void    p1point(uint16_t x, uint16_t y);
};

The internal variable p1draw is true if the last time we were called we were drawing a line. The values (p1x,p1y) are the pixel location of the pen. We also have an initialization method for our drawing, along with our p1movedraw method, which either moves the pen or draws a line, and our p1point method, which plots a single pixel.

With our methods defined for our level 1 code, we can add our initialization method to our class constructor:

G3D::G3D(Adafruit_GFX &l) : lib(l)
{
    width = l.width();
    height = l.height();

    /* Initialize components of pipeline */
    p1init();
}

And our test methods simply call into our internal drawing routines:

void G3D::move(uint16_t x, uint16_t y)
{
    p1movedraw(false,x,y);
}

void G3D::draw(uint16_t x, uint16_t y)
{
    p1movedraw(true,x,y);
}

void G3D::point(uint16_t x, uint16_t y)
{
    p1point(x,y);
}

Hardware drawing

This is a lot of stuff, but it gets us to the core of our drawing methods. And we’ll reuse a lot of this machinery when we get to the next phase of our drawing code.

First is initialization. Our initialization routine should set our system to a known state; in this case, we move the pen to (0,0). Recall that the variable p1draw is true if we drew last, false if we moved.

Our initialization code then sets everything to zero:

void G3D::p1init()
{
    p1draw = false;
    p1x = 0;
    p1y = 0;
}

Our point drawing does not move the pen; it simply sets the pixel. This is a simple wrapper into our Adafruit GFX library to write a pixel:

void G3D::p1point(uint16_t x, uint16_t y)
{
    lib.writePixel(x,y,color);
}

Note: We call the ‘writePixel’ method rather than the ‘drawPixel’ method because our ‘begin’ and ‘end’ methods already call ‘startWrite’ and ‘endWrite’ for us.

And our move/draw routine will draw a line only if the pen is down; we then update our current state.

void G3D::p1movedraw(bool drawFlag, uint16_t x, uint16_t y)
{
    if (drawFlag) {
        lib.writeLine(p1x,p1y,x,y,color);
    }

    p1draw = drawFlag;
    p1x = x;
    p1y = y;
}

Note that this is the place we would need to change our code if we needed to draw our own line using Bresenham’s line drawing algorithm, or if we needed to create a draw list to send to a piece of connected hardware.
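
For example, if we had to roll our own, p1movedraw could be rewired along these lines using Bresenham’s integer-only algorithm (a sketch for illustration only–the Adafruit GFX library already does this for us inside writeLine):

void G3D::p1movedraw(bool drawFlag, uint16_t x, uint16_t y)
{
    if (drawFlag) {
        /*
         *  Bresenham: step one pixel at a time along the major axis,
         *  tracking the accumulated error along the minor axis
         */
        int16_t x0 = p1x, y0 = p1y, x1 = x, y1 = y;
        int16_t dx = abs(x1 - x0), sx = (x0 < x1) ? 1 : -1;
        int16_t dy = -abs(y1 - y0), sy = (y0 < y1) ? 1 : -1;
        int16_t err = dx + dy;
        for (;;) {
            lib.writePixel(x0,y0,color);
            if ((x0 == x1) && (y0 == y1)) break;
            int16_t e2 = 2 * err;
            if (e2 >= dy) { err += dy; x0 += sx; }
            if (e2 <= dx) { err += dx; y0 += sy; }
        }
    }

    p1draw = drawFlag;
    p1x = x;
    p1y = y;
}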


We can put all this together with a simple test:

#include <SPI.h>
#include <Adafruit_GFX.h>
#include <Adafruit_ILI9341.h>
#include "G3D.h"            // our G3D class from above (file name assumed)

#define TFT_DC   9          // data/command pin; adjust for your wiring
#define TFT_CS  10          // chip select pin; adjust for your wiring

// Use hardware SPI (on Uno, #13, #12, #11) and the above for CS/DC
Adafruit_ILI9341 tft = Adafruit_ILI9341(TFT_CS, TFT_DC);

// Graphics setup
G3D draw(tft);

void setup() 
{
}

void loop() 
{
    tft.begin();
    tft.fillScreen(ILI9341_BLACK);
    
    draw.begin();
    draw.setColor(ILI9341_RED);
    draw.move(0,0);
    draw.draw(0,100);
    draw.draw(100,100);
    draw.draw(100,0);
    draw.draw(0,0);
    draw.move(100,120);
    draw.draw(100,140);
    draw.end();

    for (;;) ;
}

And if all works well, we should see:

L1

Okay, it’s not much. But you have to learn to crawl before you walk.

The source code for this version of the 3D graphics pipeline is on GitHub; the linked branch shows just the first part of the graphics pipeline.


Now this is rather boring. If this was all we wanted to do, big deal.

But remember our original principle:

Each step may not do much, but when added together we can create incredible things.

The next step in our pipeline is viewport drawing.

The purpose of viewport drawing is to abstract away the pixels. That is, what we want to do is to redefine our drawing coordinates so that we get more or less the same drawing regardless of the dimensions of the physical hardware attached to our Arduino. And we do this by defining a new viewport coordinate system that ranges from -1 to 1.

Note: We will be using floating point math for our operations, which may not necessarily be the fastest on an Arduino. However, it should be fast enough for us to do some simple things like rotating a cube. When designing for an Arduino, the most important thing you can do is figure out what not to draw; that way you can avoid doing math for stuff that is not visible.

Now our viewport coordinate system imposes a virtual square on our rectangular screen; the top and bottom of our screen (or the left and right sides, depending on the orientation of our screen) have a smaller coordinate range:

Viewport

This means we have a little math to do.

Basically we need to figure out two things when our 3D class starts up:

  1. What are the viewport drawing dimensions of the screen? We need this in order to determine the dimensions of the screen we are clipping to.
  2. To convert from a viewport coordinate (x,y) to pixel coordinates (px,py), we need to calculate px = A + B*x and py = C + D*y for some values A, B, C, D. What are those values?

Things get particularly interesting on the fringes. Note that when we round from a floating point value to an integer, we generally truncate (or round down); this means the value 12.75 becomes 12 when converted to an integer. So we can’t, for a screen that is 320 pixels wide, write:

    px = 160 + 160 * x;  // 160 = 320 / 2

Because if we plug in a value 1 for x, we get px = 320–and our pixel coordinate range is from 0 to 319.

So we solve this problem by bumping the width down by 1 pixel, and we calculate our pixel range (in floating point values) from 0.5 to 319.5. (This will then convert to integer values from 0 to 319.)

Thus we’d write:

    px = 160 + 159.5 * x;  // 159.5 = 319/2

Plugging in -1 for x gives us px = (int)0.5 = 0, and 1 for x gives us px = (int)319.5 = 319.

We have a second problem in that in our pixel coordinates (0,0) is in the upper-left corner of the screen–but we want our drawing to put (-1,-1) at the bottom-left. This means we need to flip the sign of the y value during our calculations:

    px = A + B * x;
    py = C - D * y;

Our third problem, of course, is that our display is not square; it’s rectangular. But we assume our pixels are square. And we assume whichever dimension is larger (the width or the height) will map from -1 to 1; the other side will map from roughly -0.75 to 0.75 or so.

This implies the scale values B and D which we multiply by in our equations above are the same. (If our pixels are not square we need to do something different.)


So let’s put all this together.

For our viewport coordinate system we need to track five variables: two giving the (+/-) width and (+/-) height of our viewport:

        float	p2xsize;      // viewport width +/-
        float	p2ysize;      // viewport height +/-

And three that handle the transform: the amount we multiply and add to our viewport coordinate to give our pixel coordinate:

        float	p2scale;      // coordinate transform scale.
        float	p2xoff;      // coordinate transform offset
        float	p2yoff;

Of course we also need to add our level 2 pipeline code.

All of this looks like:

class G3D
{
    public:
                G3D(Adafruit_GFX &lib);
                ~G3D();

        void    setColor(int16_t c)
                    {
                        color = c;
                    }

        void    begin();
        void    end();
        void    move(float x, float y);
        void    draw(float x, float y);
        void    point(float x, float y);
    private:
        /*
         *  Internal state
         */
        Adafruit_GFX &lib;
        uint16_t width;
        uint16_t height;

        /*
         *  Current drawing color
         */

        uint16_t color;

        /*
         *  State 2 pipeline; map -1/1 to screen coordinates
         */

        void    p2init();
        void    p2movedraw(bool drawFlag, float x, float y);
        void    p2point(float x, float y);

        float   p2xsize;        // viewport width +/-
        float   p2ysize;        // viewport height +/-
        float   p2scale;        // coordinate transform scale.
        float   p2xoff;         // coordinate transform offset
        float   p2yoff;

        /*
         *  State 1 pipeline
         */

        bool    p1draw;
        uint16_t p1x;
        uint16_t p1y;

        void    p1init();
        void    p1movedraw(bool drawFlag, uint16_t x, uint16_t y);
        void    p1point(uint16_t x, uint16_t y);
};

Notice we also change our test routines (move, draw and point), which now take floating point coordinates.


Our p2movedraw and p2point routines look similar: they perform our math calculation using the constants we’ve calculated during startup to transform from viewport coordinates to pixel coordinates:

void G3D::p2point(float x, float y)
{
    int16_t xpos = (int16_t)(p2xoff + x * p2scale);
    int16_t ypos = (int16_t)(p2yoff - y * p2scale);
    p1point(xpos,ypos);
}

void G3D::p2movedraw(bool drawFlag, float x, float y)
{
    int16_t xpos = (int16_t)(p2xoff + x * p2scale);
    int16_t ypos = (int16_t)(p2yoff - y * p2scale);
    p1movedraw(drawFlag,xpos,ypos);
}

Now the heavy lifting is in our initialization code, which must find all our constants given the size of the screen. As we noted before, we assume pixels are square.

Our initialization code first needs to set up some constants and find the screen dimensions for our clipping method. We then need to calculate the scale parameter and ultimately the offsets. In the end our code looks like:

void G3D::p2init()
{
    /*
     *  We subtract one because we want our mapping to work so that
     *  virtual coordinate -1 is in the middle of the 0th pixel, and
     *  +1 is in the middle of the width-1 pixel for wide displays.
     *
     *  This offset of half a pixel implies we're effectively drawing
     *  on a display 1 pixel narrower and shorter, but with 1/2 added
     *  to the pixel coordinate.
     */

    uint16_t h1 = height - 1;
    uint16_t w1 = width - 1;

    /*
     *  Calculate the width, height in abstract coordinates. This
     *  allows me to quickly clip at the clipping level
     */

    if (w1 > h1) {
        p2xsize = 1;
        p2ysize = ((float)h1)/((float)w1);
    } else {
        p2xsize = ((float)w1)/((float)h1);
        p2ysize = 1;
    }

    /*
     *  Calculate the scale, offset to transform virtual to real.
     *  Note that -1 -> 0 and 1 -> (width-1) or (height-1).
     */

    if (w1 > h1) {
        p2scale = ((float)w1)/2;
    } else {
        p2scale = ((float)h1)/2;
    }

    p2xoff = ((float)width)/2;
    p2yoff = ((float)height)/2;
}
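
To make this concrete: on a 320×240 display in landscape orientation, w1 = 319 and h1 = 239, so p2scale = 159.5, p2xoff = 160 and p2yoff = 120. Plugging x = 1 into p2xoff + x * p2scale gives 319.5, which truncates to pixel 319; plugging in x = -1 gives 0.5, which truncates to pixel 0. And p2ysize works out to 239/319, or roughly 0.75–matching the viewport illustration above.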

Of course we need to change our main Arduino test code to use the new floating point values for coordinates:

void loop() 
{
    tft.begin();
    tft.fillScreen(ILI9341_BLACK);

    draw.begin();
    draw.setColor(ILI9341_RED);
    draw.move(-0.5,-1);
    draw.draw(0.5,-1);
    draw.draw(0.5,1);
    draw.draw(-0.5,1);
    draw.draw(-0.5,-1);

    draw.move(-0.75,-0.75);
    draw.draw(-0.5,-0.75);
    draw.draw(-0.5,-0.5);
    draw.draw(-0.75,-0.5);
    draw.draw(-0.75,-0.75);
    draw.end();

    for (;;) ;
}

And if everything works correctly, you should see the following:

Image 2

Again, not all that exciting.

Except for one thing: This is what your image will look like regardless of your screen size. This becomes important as we move on and start building our 3D rendering engine.

All of this code is available at GitHub in a separate branch.


This has been an extremely long post, but a lot of this was setting the stage for future posts. The next post will discuss homogeneous coordinates, the coordinate system we use for representing objects in 3D. This will be followed by a post discussing clipping in homogeneous coordinates, then we’ll move into some 3D math.