User Interface Design Part 1: Defining Our Visual Language

I have always wanted to write a series of articles on user interface design from the ground up, but I never seem to have the time. Well, it’s time to take the time, and now that I have this beautiful little 2.8″ TFT Touch Shield that thus far I’ve been using to show how to render 3D graphics, I figured it was time we put it through its paces for what it was intended to be used for: as the input touch screen controlling some cool little gadget.

Blank canvas

So we have a blank canvas–and we toss in a button here, some text there, a cute icon below that–and we call it a day. Simple, right?


I want to discuss a more systematic way to think about user interface design, one which allows us to assemble our interface so that users find it easy to use, one which supports complex interactions, and one which can be extended to provide a consistent visual language that gives your gadgets pizzazz.

First, let’s define what we actually mean by “easy to use.”

Waaaaay back in the dark days before there was an Internet, at the very beginning of computer user interfaces, a division of Xerox was doing research into creating easy to use computers for the workplace. Faced with a blank screen, they decided to model their interface after real-world physical things we encounter on a day to day basis. Apple used this as the basis for their own computer interface, and crafted human interface guidelines based on that research and on an established set of standard user interface controls.

This set of standard controls included radio buttons–so named because they resembled the preset buttons of old mechanical car radios, which could tune to a station quickly, but pressing one button popped out the others, so only one was active at a time.

Radio Buttons

They created check-boxes, borrowing the visual language of paper forms, where you check off the boxes which match your choices.

We also got buttons, which were modeled after their electronic counterparts; pull-down menus (which borrow their name from restaurant menus, where you have a list of things you can pick from); and text fields, which don’t have a really good counterpart in the real world.

And the notion was that so long as you used these controls–and used them consistently, according to the detailed human interface guidelines provided by a company like Apple–a user would know how to use your application.


What Apple did was create a set of affordances–things which, once we learned what they meant, we would remember how to use.

Affordances are, in essence, the hints we have in the objects that surround us which give us an immediate idea how that object is supposed to be used. A teapot has a handle, and the shape of the handle tells us we should grab it if we want to pick up the teapot. A door has a knob or handle; the shape of the knob tells us to rotate it to open the door.

At a more fundamental level, however, an affordance is–in a very real sense–a visual language. We know if we see a circle with some text next to it, it’s a radio button–and if we tap on it, we’ll change its state and deactivate the state of the other related buttons.

Radio Buttons

But it doesn’t mean this is the only way we can represent picking one of a list of items. On the iPhone, one can use a picker view instead.

Picker

Some affordances may not even have a particular shape or color or size–they may be based solely on the position the thing occupies on the screen. For example, on the iPhone, the upper part of the screen generally has a button on the left or the right. The button itself is not marked out; we only know the button exists because we’ve become accustomed to the idea that the words in the upper left and the upper right of the screen are buttons–and generally the button on the upper left backs us out a screen.


The point is not that we should use a particular design.

The point is that simplicity comes from consistency.

If we know “cancel” or “back” is in the upper left, we know to always look there for a cancel or back button. Placing the cancel button in the lower right would be confusing once we’ve become used to the cancel button being somewhere else.

And “language” can even extend to gestures or physical controls. For example, on the side of our phones we know there are two buttons: the top one turns the volume up, the bottom one turns the volume down. If the “volume up” button suddenly turned your phone off, you’d think something was broken.

We could, in theory, conceive of a user interface that consists of a screen full of information, a rotary knob that can be pressed, and a back button. If the back button takes you back a screen–it should always take you back a screen. If suddenly going back means twisting the knob sharply to the left–users will be confused. And they may think your system is broken.


In essence, our user interface is a language, like English, like Latin–but the “domain” of our language is rather limited. You’re not going to talk to your house thermostat about what’s for dinner–but in its domain (what temperature you want your house to be at), your thermostat provides a language: turn the knob to change the temperature. Flip the switch to turn on heat or air conditioning. Flip another switch to turn the system on or off. And a third switch for changing the fan settings.

And that interface is “easy to use”–because it is discoverable: you can find the switch–and when you see a switch, you know how to flip it. You can see the knob–and the dial indicates the temperature in the room and the temperature you want it to be at. And it is consistent: when you flip the fan switch, the fan changes state: flipping the fan switch doesn’t order pizza from Domino’s or open the garage door.

Now of course we’re talking about a small 2.8″ touch display. But the same principles apply. You don’t have to make your interface look like anything else–but you have to make your interface discoverable: does the user know what can be tapped on? Does the user have an idea when something is tapped what it will do? Does the user know what cannot be tapped on?

And it’s consistent: do things that look like they can be tapped on do something when they’re tapped? Do they do the same thing when they’re tapped–does a radio button always behave like a radio button and a check box like a check box? Do things that cannot be tapped do nothing when tapped–or are you hiding behavior where the user cannot discover it by being inconsistent?


Further, like any other language, our user interface language has nouns–that is, things we are acting on, and verbs–actions we are taking. More complex applications can even have adjectives: modifiers that describe the nouns of our language, and adverbs: modifiers that alter our verbs.

And when we “talk” to our computer, we are in essence using our interface to construct simple sentences to describe what we want to do. For example, in a drawing application, our “nouns” may be the lines, circles and blocks of text we’re manipulating, and our “verbs” are the ways we are modifying our objects.

Our language may look like this: “Please move that line” (picks the line) “over there” (drags the line to its new location). Or “please draw a new line” (picks the line drawing tool) “over here” (clicks and drags a new line on the screen).

Sometimes the object we’re acting on is implicit: the thermostat knob only changes the temperature. Sometimes the order in which we pick the noun (the object of our action) and the verb (the action we are taking) can be reversed: we may pick a tool which shrinks everything we touch in half.

And sometimes our language has contexts. For example, we may have a programmable thermostat; on the ‘main screen’ the knob controls temperature. But on another ‘screen’, the knob controls the time during the day when we want to raise or lower the temperature. This context can be thought of as a way to pick nouns on a limited screen (what temperature do you want for your house?), or it can be thought of as a “context” in which we are having a conversation: in the kitchen, a ‘cup’ may mean something entirely different than a ‘cup’ on a football uniform.
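To make the “context” idea concrete, here is a minimal sketch in C++ of a thermostat whose single knob means different things on different screens. The names and structure are hypothetical–this isn’t firmware from any real thermostat:

```cpp
#include <cstdint>

// Hypothetical sketch: the same physical knob means different things
// depending on which "screen" (context) is currently active.
enum class Screen : uint8_t { Main, Schedule };

struct Thermostat {
    Screen screen = Screen::Main;
    int targetTemp = 70;    // degrees; the 'noun' on the main screen
    int scheduleHour = 6;   // hour of day; the 'noun' on the schedule screen

    // One physical input (the verb), interpreted by context.
    void knobTurned(int delta) {
        if (screen == Screen::Main)
            targetTemp += delta;
        else
            scheduleHour = (scheduleHour + delta + 24) % 24;
    }
};
```

The context (which screen is active) selects the noun; turning the knob is the verb.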


As a side note, it is the failure to adhere to these two basic principles–discoverability and consistency (for example, consistency in how you construct complex sentences as you “talk” to your computer)–which makes some applications harder and harder to use.

In other words, you are not stupid. The designers of the application you are having a hard time with failed to adhere to these basic principles, and they failed to adhere to some of the other principles outlined in the book Designing with the Mind in Mind, such as responsiveness–the fact that when users interact with a computer, they expect the computer to respond quickly, even if the response is “stand by.”

And increasingly, “UI Designers” are being hired out of art school with no formal training in affordance theory or in the necessity of discoverability and consistency.


It’s also worth talking about color. Color is a great way to augment a user interface–and certainly a colorful interface is an attractive one. But you need to consider color as part of your language–as a way to augment the information you are presenting, such as using color to represent adjectives modifying the nouns of your language.

But you need to be careful when using color. Partially because some people are color-blind. And partially because you may not get to use a beautiful color display when developing your language–so you need to have another way to specify your adjectives.

B&W Display

Where is your color God now?


All of this is a long-winded way to say we need to think of what we want our visual language to look like, and we need to consider how we pick the things we act on and how we act on them. That includes determining the appearance and behavior of our controls–with consistency being key–and the way we specify the things we are operating on and the things we can do to them.

And today we will start designing our visual language by taking inspiration from Star Trek.

Of course there are two immediate problems we face here. First, the original LCARS interface is copyrighted by CBS Studios. And second, there is no real consistency between screens; the layout of the screens varies according to the artistic needs of the episodes rather than being designed according to the principles of discoverability and consistency.

But we certainly can use the ideas here as a jumping-off point.

For our design we’ll use some of the shapes but then design a standard language for our visual interface that relies on these basic shapes to function.

First, we’ll start with a sloped inverted L shape. We’ll use this to divide the screen into a top part, which contains a title and an optional ‘cancel’ button in the upper left, and a bottom content area. If we have ‘nouns’ we intend to operate on, we can put them along the left column of the inverted L pattern; pressing one can either navigate to a new screen or update the content area as appropriate. Otherwise, we can hide the left part and simply have a bar.

Basic Inverted L

No Inverted L

We will also create basic button groups. The idea is that these are all buttons that are related to some piece of functionality. We group them by using rounded corners.

Grouped Buttons

Finally we’ll use a grouping that contains a rectangular region on the left to indicate a group of radio buttons.

Radio Buttons

We’ll use this as the start of our visual language for buttons and radio buttons, as well as navigating between screens and picking the ‘nouns’ we’re operating on.
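To sketch how the inverted L might carve up our 320×240 display, here’s a hypothetical layout helper; the bar height and column width are placeholder values of my own choosing, not final design decisions:

```cpp
#include <cstdint>

struct Rect { int16_t x, y, w, h; };

// Hypothetical layout helper: split the display into the inverted L's
// title bar, left 'noun' column, and content area. The 32-pixel bar
// and 80-pixel column are placeholder values, not final design choices.
struct InvertedL {
    Rect title, leftColumn, content;

    InvertedL(int16_t w, int16_t h, bool hasLeftColumn)
    {
        const int16_t barH = 32;
        const int16_t colW = hasLeftColumn ? 80 : 0;
        title      = { 0, 0, w, barH };
        leftColumn = { 0, barH, colW, static_cast<int16_t>(h - barH) };
        content    = { colW, barH, static_cast<int16_t>(w - colW),
                       static_cast<int16_t>(h - barH) };
    }
};
```

A screen with ‘nouns’ would draw its buttons inside leftColumn and its content inside content; hiding the left part is just a matter of passing false.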


Now that we’ve sketched some ideas–and have a framework in which to use these ideas–next time we’ll start building the code, including our custom font and some changes to the Adafruit GFX library to reduce flicker when pressing a button.

The Earth-Moon Orrery: a quick status update.

My ultimate goal is to recreate the 3D printed object in brass, creating a little tabletop orrery for display. That means, of course, cutting gears.

So there has been a little bit of a pause while I order the correct things for gear cutting. Ultimately my goal is to set up a Sherline mill as a gear cutting system. I have 32DP gear cutters on order–which means I’m going to be resizing the machine I make even smaller than its current 3D printed size. And that means a lot of waiting and tinkering–since I’m a complete noob when it comes to metal working. But my goal is to learn along the way.

I also have some plate aluminum on order; my theory is to learn on the cheap stuff–though I bet I’ll waste a lot of (more expensive) engraver’s brass when it comes time to cut the final gears.

Meanwhile, I do have a 7″ lathe and mill, both from these guys, which I bought a couple of years ago (but circumstances in my life meant they got to sit around gathering dust), and it’s time to cut the gear arbors.

I’m using something I first saw on Clickspring. The principle of cutting a gear is more or less the same: cut a roughly circular blank out of sheet metal, super-glue it to a gear arbor, then true the blank to the correct diameter. Mount on the Sherline (and yes, I know; this means a lot of fiddling to re-center the blank), align the cutters, and start cutting gears.

So today I’m making my superglue gear arbors.

For our orrery we need three arbors: one that is 1.5 inches in diameter (to hold the 52-tooth wheels), one that is 0.75 inches in diameter (to hold the 26- and 29-tooth wheels), and one that is 0.375 inches in diameter (to hold the 14-tooth wheels).

Each is made from aluminum stock I have on hand; the 0.375 inch arbor was turned on the lathe down from 1/2 inch diameter aluminum stock. And while my arbors may not have the same fit and finish as the Clickspring arbors, hey–I’m still learning how to do this stuff. My hope is that by the time I get to working with brass, I’ll have learned enough from working with aluminum to get the fit and finish right.


As a footnote, since I’m using 32DP gear cutters (which have the nice property that they should work well with stock gears from Sparkfun), this means in the future, if I get interested in doing some robotics work, I’m well equipped to cut my own custom gears in case I need a custom tooth ratio.

But it does mean going through and resizing the gear parameters in Kythera for the Earth-Moon Orrery.

After doing a little research I decided to cut the gears from 1/16″ thick and 1/8″ thick brass sheet. After a little fiddling with the document settings–setting the gear and layer thicknesses to 1/16″ and 1/8″, respectively–I’ve come up with a tentative update to the Earth-Moon orrery design, which you can download below.

Downloads: EMOrrery2_32p.zip

Building a 3D Graphics Pipeline on an Arduino, Part 5

One of the advantages of our design is that we can rapidly port our 3D drawing code to another platform. In fact, we could port our code off the Arduino entirely and move it to a different microprocessor or even to your desktop computer.

All we have to do is to alter how we initialize our 3D library, change our begin/end calls, and change our low-level drawing routines. While we’re at it, we can also refine our 3D library to only draw in a subset of our screen. We may wish to do this, for example, if we’re creating a game and we want a separate status area.

So let’s begin.

For this exercise we’ll be porting the code to an Arduboy, a small hand held Arduino game unit with built-in game control buttons and a small black and white screen.


Altering the header.

Our current constructor for the G3D class contains a reference to our library, and we assume we use the entire screen. Let’s change the code in two ways. First, we want to change the drawing library we pass in. Second, we want to pass in the area of the screen we’re drawing in.

So let’s update our class. I’m going to define a preprocessor token:

#define USELIBRARY	        2   // 1 = Adafruit, 2 = Arduboy

Now let’s update our G3D class. In our header file we’ll alter our constructor and create the appropriate definitions:

class G3D
{
    public:
#if USELIBRARY == 1
                G3D(Adafruit_GFX &lib, uint16_t x, uint16_t y, uint16_t width, uint16_t height);
#elif USELIBRARY == 2
                G3D(Arduboy &lib, uint16_t x, uint16_t y, uint16_t w, uint16_t h);
#endif
                ~G3D();

        void    setColor(uint16_t c)
                    {
                        color = c;
                    }

        void    begin();
        void    end();

We also want to update the color declaration, since the Arduboy uses an 8-bit color value (which is set to either BLACK or WHITE), while the Adafruit library uses a 16-bit color value.

class G3D
{
    public:
#if USELIBRARY == 1
                G3D(Adafruit_GFX &lib, uint16_t x, uint16_t y, uint16_t width, uint16_t height);
#elif USELIBRARY == 2
                G3D(Arduboy &lib, uint16_t x, uint16_t y, uint16_t width, uint16_t height);
#endif
                ~G3D();

#if USELIBRARY == 1
        void    setColor(uint16_t c)
                    {
                        color = c;
                    }
#elif USELIBRARY == 2
        void    setColor(uint8_t c)
                    {
                        color = c;
                    }
#endif

        void    begin();
        void    end();

These two changes require corresponding changes to the private values in our class. We need to update the library we reference, update the screen size parameters and the color:

    private:
        /*
         *  Internal state
         */
#if USELIBRARY == 1
        Adafruit_GFX &lib;
#elif USELIBRARY == 2
        Arduboy &lib;
#endif

        uint16_t xoffset;
        uint16_t yoffset;
        uint16_t width;
        uint16_t height;

        /*
         *  Current drawing color
         */

#if USELIBRARY == 1
        uint16_t color;
#elif USELIBRARY == 2
        uint8_t color;
#endif

        /*
         *  Stage 4 pipeline; 3D transformation
         */

These are all the changes we need to make to our header. From a public interface perspective, we simply update the value of USELIBRARY, and create our class linked to the correct library–and that’s it. There is no step 3.


G3D code changes.

For the body of our code, the changes are similarly easy.

First, of course, we need to update our constructor. The body of the constructor is the same; we simply need to change the type of the library passed in. We also stash the screen position and size parameters we added.

/*  G3D::G3D
 *
 *      Construct our pipeline
 */

#if USELIBRARY == 1
G3D::G3D(Adafruit_GFX &l, uint16_t x, uint16_t y, uint16_t w, uint16_t h) : lib(l)
#elif USELIBRARY == 2
G3D::G3D(Arduboy &l, uint16_t x, uint16_t y, uint16_t w, uint16_t h) : lib(l)
#endif
{
    xoffset = x;
    yoffset = y;
    width = w;
    height = h;

    /* Initialize components of pipeline */
    p1init();
    p2init();
    p3init();
}

Now our implementation on the Adafruit GFX library used the internal startWrite and endWrite calls in order to reduce the number of times we had to turn the Adafruit drawing engine on and off. But the Arduboy library doesn’t require the same thing–so these calls effectively become ‘no-ops’:

void G3D::begin()
{
#if USELIBRARY == 1
    lib.startWrite();
#endif
}

void G3D::end()
{
#if USELIBRARY == 1
    lib.endWrite();
#endif
}

Finally we need to update our p1 move/draw and point routines. We need to do two things here: first, we need to offset our drawing to the pixel coordinates we supplied in our initializer. Second, we need to actually call the correct line drawing routines.

void G3D::p1movedraw(bool drawFlag, uint16_t x, uint16_t y)
{
    /*
     *  We use the p1drawflag and the drawFlag objects to determine the
     *  way to draw our line. For us, we're always drawing single
     *  segments, but we theoretically could roll up our lines into a
     *  collection of line segments and send them on close. (This
     *  requires hooking G3D::end().)
     */
    
    if (drawFlag) {
#if USELIBRARY == 1
        lib.writeLine(xoffset + p1x,yoffset + p1y,xoffset + x,yoffset + y,color);
#elif USELIBRARY == 2
        lib.drawLine(xoffset + p1x,yoffset + p1y,xoffset + x,yoffset + y,color);
#endif
    }
    
    p1draw = drawFlag;
    p1x = x;
    p1y = y;
}

void G3D::p1point(uint16_t x, uint16_t y)
{
#if USELIBRARY == 1
    lib.writePixel(xoffset + x,yoffset + y,color);
#elif USELIBRARY == 2
    lib.drawPixel(xoffset + x,yoffset + y,color);
#endif
}

And that’s it. Now our library can be used to generate 3D drawings on either an Arduboy or using the Adafruit GFX library.


Testing it out.

We can redo our cube test from last time, but on the Arduboy. Because the Arduboy is double-buffered, the animation is smooth and quick–though it lacks the color of the Adafruit display.

Our example needs to be rewritten to use the double-buffered drawing support of the Arduboy. We can also make use of the viewport feature to draw on only part of the screen–say, if we want to reserve room for a separate status area.

Most of the changes are relatively simple: we switch libraries and change how we start up our drawing system. The loop has the most changes, since we need to handle the double-buffering of the Arduboy API.

For the record, the complete source kit (just showing the Arduboy elements and their changes) is:

#include "G3D.h"

/*
 *  Drawing globals
 */

Arduboy arduboy;

// Graphics setup
G3D draw(arduboy,0,0,100,64);

// Rotation
static float GXAngle;
static float GYAngle;

void setup() 
{
    arduboy.beginNoLogo();
    arduboy.setFrameRate(50);
    arduboy.clear();
    arduboy.display();
    
    GXAngle = 0;
    GYAngle = 0;
}

void transform()
{
    draw.transformation.setIdentity();
    draw.perspective(1.0f,0.5f);
    draw.translate(0,0,-3.5);
    draw.rotate(AXIS_X,GXAngle);
    draw.rotate(AXIS_Y,GYAngle);
}

void drawBox(int x, int y, int z)
{
    draw.move(x-1,y-1,z-1);
    draw.draw(x+1,y-1,z-1);
    draw.draw(x+1,y+1,z-1);
    draw.draw(x-1,y+1,z-1);
    draw.draw(x-1,y-1,z-1);
    draw.draw(x-1,y-1,z+1);
    draw.draw(x+1,y-1,z+1);
    draw.draw(x+1,y+1,z+1);
    draw.draw(x-1,y+1,z+1);
    draw.draw(x-1,y-1,z+1);
    draw.move(x+1,y-1,z-1);
    draw.draw(x+1,y-1,z+1);
    draw.move(x+1,y+1,z-1);
    draw.draw(x+1,y+1,z+1);
    draw.move(x-1,y+1,z-1);
    draw.draw(x-1,y+1,z+1);
}

void loop()
{
    if (!arduboy.nextFrame()) return;
    
    arduboy.clear();
    
    draw.begin();
    draw.setColor(WHITE);
    transform();
    drawBox(0,0,0);
    draw.end();
    
    arduboy.display();

    GXAngle += 0.01;
    GYAngle += 0.02;
}

When we run this we get the following:

All the code here is at GitHub, so feel free to check it out. The main branch is the latest and greatest, and can be compiled (by changing the flag in G3D.h) on either the Arduboy or on any of Adafruit’s GFX compatible displays.

A quick note about the cube demo. If you watch it spin, sometimes the point of the cube nearest to you “disappears.” This is expected, and demonstrates clipping not just to the four boundaries of our screen, but to the near clipping plane. Think of it as the cube being really small and “clipping” against the camera lens of our virtual camera.


This has been a somewhat hastily written adventure into 3D, but I hope it has been an instructive one. If you found this series useful, please let me know. And if you find any problems with any of the code, or if you want to share any games you put together with this code, please let me know!

Building a 3D Graphics Pipeline on an Arduino, Part 4

We have all the pieces. Let’s put them together, shall we?

In the last article we built a clipping algorithm which clips lines so that only the visible portions are displayed. Now let’s put the whole thing together.


Some Math in Code.

In our second article we introduced the concept of Homogeneous Coordinates. This is how we manipulate points in 3D, and you see the same matrix math used everywhere in 3D graphics, including OpenGL.

First, let’s create the methods for manipulating our matrices and vectors.

To start, we need a 3D matrix, which we define as:

class G3DMatrix
{
    public:
        float           a[4][4];
};

Recall our 3D matrices are 4×4 matrices:

Matrix Stuff

Our convention is that the first index of our matrix selects the row (down), and the second index selects the column (across).

Since we’re using C++, we can also pack our matrix full of all those declarations which make our life easier. Specifically we can add routines for initializing our matrix as a translation matrix, a rotation matrix, or a scale matrix, as well as our perspective matrix. We can also add our multiply method as well, just to keep things tidy.

class G3DMatrix
{
    public:
                        G3DMatrix();

        // Initialize matrix
        void            setIdentity();
        void            setTranslate(float x, float y, float z);
        void            setScale(float x, float y, float z);
        void            setScale(float x);
        void            setRotate(uint8_t axis, float angle);
        void            setPerspective(float fov, float near);

        // Inline multiply transformation matrix
        void            multiply(const G3DMatrix &m);
        
        // Raw contents of the matrix
        float           a[4][4];
};

We also need our 3D vector as a homogeneous coordinate, and we include our vector/matrix multiply routines as well.

struct G3DVector {
    float x;
    float y;
    float z;
    float w;

    // Math support
    void                multiply(const G3DMatrix &m, const G3DVector &v);
};

Everything we’re writing here should seem familiar if you made it to the end of my article on homogeneous coordinates. Basically, the idea is that a 3D point (x,y,z) becomes our 3D homogeneous vector (x,y,z,1), and any 3D homogeneous coordinate (x,y,z,w) becomes (x/w, y/w, z/w).
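As a standalone sketch of those two conversions (re-declaring a minimal G3DVector here so the example is self-contained):

```cpp
// Standalone sketch: a minimal homogeneous vector and the two
// conversions described above.
struct G3DVector {
    float x, y, z, w;
};

// Promote a 3D point to homogeneous coordinates: (x,y,z) -> (x,y,z,1)
G3DVector toHomogeneous(float x, float y, float z)
{
    return G3DVector{ x, y, z, 1.0f };
}

// Perspective divide back to 3D: (x,y,z,w) -> (x/w, y/w, z/w)
void toPoint(const G3DVector &v, float &x, float &y, float &z)
{
    x = v.x / v.w;
    y = v.y / v.w;
    z = v.z / v.w;
}
```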

The implementation of each of the matrix initialization methods follows pretty quickly from our homogeneous matrices in the prior article. For example, our implementation of setTranslate which creates a translation matrix:

Translation Matrix

looks like:

void G3DMatrix::setTranslate(float x, float y, float z)
{
    setIdentity();
    a[0][3] = x;
    a[1][3] = y;
    a[2][3] = z;
}
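The other initializers work the same way: set the identity, then fill in the handful of interesting entries. As an illustration, here is a sketch of a Z-axis rotation in a stripped-down, standalone G3DMatrix–the real setRotate also handles AXIS_X and AXIS_Y by placing the sine and cosine terms in different rows and columns:

```cpp
#include <cmath>

// Standalone sketch of a rotation initializer, Z axis only. The real
// G3D::setRotate(axis, angle) also handles AXIS_X and AXIS_Y by
// placing the sine/cosine terms in different rows and columns.
struct G3DMatrix {
    float a[4][4];

    void setIdentity()
    {
        for (int i = 0; i < 4; ++i)
            for (int j = 0; j < 4; ++j)
                a[i][j] = (i == j) ? 1.0f : 0.0f;
    }

    void setRotateZ(float angle)
    {
        setIdentity();
        float c = cosf(angle), s = sinf(angle);
        a[0][0] = c; a[0][1] = -s;
        a[1][0] = s; a[1][1] = c;
    }
};
```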

Our Transformations

To our G3D class, we add methods for translation, rotation, scaling and perspective, along with the transformation matrix used to transform points in our 3D coordinate space.

class G3D
{
    public:
                G3D(Adafruit_GFX &lib);
                ~G3D();

        void    setColor(uint16_t c)
                    {
                        color = c;
                    }
        
        void    begin();
        void    end();
        void    move(float x, float y, float z)
                    {
                        p4movedraw(false,x,y,z);
                    }
        void    draw(float x, float y, float z)
                    {
                        p4movedraw(true,x,y,z);
                    }
        void    point(float x, float y, float z)
                    {
                        p4point(x,y,z);
                    }

        void    translate(float x, float y, float z);
        void    scale(float x, float y, float z);
        void    scale(float s);
        void    rotate(uint8_t axis, float angle);
        void    perspective(float fov, float nclip);
        void    orthographic(void);

        G3DMatrix transformation;
    private:

We expose all of our transformation stuff here to simplify our work, by keeping the pieces all in one place. For example, our translate method looks like:

void G3D::translate(float x, float y, float z)
{
    G3DMatrix m;
    m.setTranslate(x,y,z);
    transformation.multiply(m);
}

This post-multiplies our matrix into our current transformation matrix. We’ll talk about the ramifications of this below.
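While the multiply implementation isn’t shown in this article, here is a sketch of what a post-multiply looks like (a standalone reconstruction, so details may differ from the actual library code). Because points are transformed as (T·m)·v = T·(m·v), the most recently multiplied matrix is applied to each point first:

```cpp
// Standalone reconstruction of a post-multiply: this = this * m.
// Because points are transformed as (T * m) * v = T * (m * v), the
// most recently multiplied matrix m is applied to each point first.
struct G3DMatrix {
    float a[4][4];

    void multiply(const G3DMatrix &m)
    {
        float t[4][4];
        for (int i = 0; i < 4; ++i) {
            for (int j = 0; j < 4; ++j) {
                float sum = 0;
                for (int k = 0; k < 4; ++k)
                    sum += a[i][k] * m.a[k][j];
                t[i][j] = sum;
            }
        }
        for (int i = 0; i < 4; ++i)
            for (int j = 0; j < 4; ++j)
                a[i][j] = t[i][j];
    }
};
```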

Finally we need our p4movedraw and p4point methods for our class:

void G3D::p4movedraw(bool drawFlag, float x, float y, float z)
{
    G3DVector t;

    t.x = transformation.a[0][0] * x + transformation.a[0][1] * y + transformation.a[0][2] * z + transformation.a[0][3];
    t.y = transformation.a[1][0] * x + transformation.a[1][1] * y + transformation.a[1][2] * z + transformation.a[1][3];
    t.z = transformation.a[2][0] * x + transformation.a[2][1] * y + transformation.a[2][2] * z + transformation.a[2][3];
    t.w = transformation.a[3][0] * x + transformation.a[3][1] * y + transformation.a[3][2] * z + transformation.a[3][3];

    p3movedraw(drawFlag,t);
}

void G3D::p4point(float x, float y, float z)
{
    G3DVector t;

    t.x = transformation.a[0][0] * x + transformation.a[0][1] * y + transformation.a[0][2] * z + transformation.a[0][3];
    t.y = transformation.a[1][0] * x + transformation.a[1][1] * y + transformation.a[1][2] * z + transformation.a[1][3];
    t.z = transformation.a[2][0] * x + transformation.a[2][1] * y + transformation.a[2][2] * z + transformation.a[2][3];
    t.w = transformation.a[3][0] * x + transformation.a[3][1] * y + transformation.a[3][2] * z + transformation.a[3][3];

    p3point(t);
}

This simply does our matrix/vector multiplication, then passes the resulting point into our 3D clipping engine.


Testing the whole thing.

Our test code is very simple.

First, we need to set up our drawing system. We first need to set up our drawing engine, and then pass it as a construction parameter to our 3D drawing engine:

// For the Adafruit shield, these are the default.
#define TFT_DC 9
#define TFT_CS 10

// Use hardware SPI (on Uno, #13, #12, #11) and the above for CS/DC
Adafruit_ILI9341 tft = Adafruit_ILI9341(TFT_CS, TFT_DC);

// Graphics setup
G3D draw(tft);

And finally we set up a couple of global variables for the rotation angle.

// Rotation
static float GXAngle;
static float GYAngle;

Our setup code simply erases the screen to black and resets our globals.

void setup() 
{
    tft.begin();
    tft.fillScreen(ILI9341_BLACK);
    
    GXAngle = 0;
    GYAngle = 0;
}

And our loop draws our box at the origin, after setting up the transformation matrix. We wait a little bit, then we redraw the box in black; this is faster than erasing the screen to black. We rotate our box, and try again.

void loop() 
{
    draw.begin();
    draw.setColor(ILI9341_RED);
    transform();
    drawBox(0,0,0); 
    draw.end();

    delay(100);

    draw.begin();
    draw.setColor(ILI9341_BLACK);
    drawBox(0,0,0); 
    draw.end();

    GXAngle += 0.01;
    GYAngle += 0.02;
}

Our transformation matrix.

The code we use to set up our transformation matrix is:

void transform()
{
    draw.transformation.setIdentity();
    draw.perspective(1.0f,0.5f);
    draw.translate(0,0,-6);
    draw.rotate(AXIS_X,GXAngle);
    draw.rotate(AXIS_Y,GYAngle);
}

We initialize the matrix in the opposite order from the order in which the transformations are applied to our points. In other words, our transformation matrix first rotates our cube around the Y axis. Then it rotates around the X axis. We then move the object 6 units away from our view, and finally apply the perspective matrix.

The order of this is important, because it means we could animate multiple objects by saving and restoring the matrix. For example, we could represent two cubes rotating around each other independently by partially setting up our transformation matrix, saving the intermediate result, and then setting up the specific transformations for the two separate cubes.

void transform()
{
    draw.transformation.setIdentity();
    draw.perspective(1.0f,0.5f);
    draw.translate(0,0,-6);

    G3DMatrix save = draw.transformation;
    draw.rotate(AXIS_X,GXAngle);
    drawFirstCube();

    draw.transformation = save;
    draw.rotate(AXIS_Y,GYAngle);
    drawSecondCube();
}

Put it all together and we get a rotating cube on our display:

The complete sources are at GitHub.

Building a 3D Graphics Pipeline on an Arduino, Part 3

Now let’s talk clipping: the process of making sure we don’t write a line that’s not visible on the screen.

We’re so used to having clipping done for us that sometimes we can forget that clipping is not something that “just happens.” When we write a line to the screen what we’re really doing, at the bottom of the stack, is writing a bunch of bytes to memory that is being displayed using video hardware. And if we don’t stop at the edges, we’ll have a buffer overrun error: at the very least we’ll see bits of a line wrapping around the screen. Worst case: we will drop garbage into global memory or onto our stack, crashing our application.

Clipping a line ultimately requires us to find the point along the line that hits the edge of the screen. There are multiple ways we can do this: different algorithms which either modify Bresenham’s line drawing algorithm, or perform line clipping calculations before drawing.

Pipeline

For our purposes, since we are drawing 3D lines and our coordinates are specified with floating point values (because we’re now considering the third stage of clipping), we’re going to use the latter strategy of finding the point where our lines intersect the edges.


Some Math.

Before we can describe how we clip lines, we need to introduce some vector math. (Yes, that dreaded 4 letter word.)

In this case, let’s talk [dot products.](https://en.wikipedia.org/wiki/Dot_product) Now I won’t get too deep into an explanation as to how this all works; for that, the incredible series on The essence of linear algebra by Grant Sanderson of 3Blue1Brown is absolutely fantastic and explains all this stuff at a level of detail with pretty pictures that can be understood by nearly anyone.

But the essence of it is this:

The dot product of two equal-length vectors–two values (a,b,c) and (x,y,z) for example–is given by the formula:

Dot

That is, you multiply all the items inside the two vectors together, and add them all up. So the dot product of (a,b,c) and (x,y,z) would be:

Dotexample
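In code, the dot product is just a multiply-and-sum. Here is a minimal sketch (the Vec3 struct and function names here are stand-ins for illustration, not part of the article’s G3D library):

```cpp
#include <cmath>

// A stand-in 3-component vector, for illustration only.
struct Vec3 { float x, y, z; };

// The dot product: multiply matching components, then sum.
float dot(const Vec3 &a, const Vec3 &b)
{
    return a.x * b.x + a.y * b.y + a.z * b.z;
}

// The length of a vector is the square root of its dot
// product with itself.
float length(const Vec3 &v)
{
    return sqrtf(dot(v, v));
}
```

So dot applied to (1,2,3) and (4,5,6) gives 1·4 + 2·5 + 3·6 = 32.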

Now the dot product has a very interesting property which will be useful to us when we talk about clipping: geometrically speaking, the value of the dot product is equal to the length of the projection of one vector onto the other, times the length of that other vector. That is:

Dot2

where ||A|| is the length of vector A, and θ is the angle between the two vectors.

Graphically this looks like:

Dot Products

That is, if we were to take a vector A and “project” it onto vector B, then the dot product is the length of that projection times the length of the vector B.

Now there is a second interesting property which we will use in clipping, which is this: if A and B are pointed more or less in the same direction, then the dot product will be positive. However, if (as in the diagram below) they are pointed roughly in opposite directions (that is, the angle is greater than 90 degrees), then the dot product will be negative:

Dot Products 2

This property will come in handy in a second.


Equation of a line.

While we’re on the subject, it’s useful to contemplate the “equation of a line” on a 2D surface.

The general form of a line equation is:

Generalline

Where the variables a, b and c are selected so that any point (x,y) is on the line if it satisfies the relationship above.

Now…

Notice something familiar?

Doesn’t it look like a dot product?

Linedot

And since we know the dot product is essentially the projected distance along a vector, this gives us a hint as to a geometric interpretation of the equation of a line. For a line (a,b,c), we could interpret our line equation as follows.

First, remember that the dot product (a,b)˙(x,y) is the projected distance of one of the vectors along the other–for example, the projection of (x,y) onto the vector (a,b):

Line Equation

Since our point (x,y) is on the line, this implies the dot product satisfies our equation above: that the dot product plus c equals 0.

It also means something else.

It means all the points on our line satisfy this same projection relationship–which means our line is the line that runs perpendicular to the vector (a,b) that intersects (x,y):

Line Equation 2

More importantly, consider all the lines that run parallel to our line. The closer they are to the origin in our diagram, the smaller the dot product is for the points along that line; the farther they are from the origin, the bigger the dot product. This means our equation for a line is not just a test for whether points are on the line: it also indicates how far a point is from our line, and which side of the line the point is on.

Line Equation 3

Both of these ideas become very useful to our screen clipping code.
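A tiny sketch of that signed-distance test (the function name edgeValue is mine, for illustration):

```cpp
// Evaluate a*x + b*y + c for the line (a,b,c) at the point (x,y).
// This is the dot product (a,b,c) . (x,y,1): zero on the line,
// positive on one side, negative on the other, with the magnitude
// growing as the point moves away from the line.
float edgeValue(float a, float b, float c, float x, float y)
{
    return a * x + b * y + c;
}
```

For the screen edge x = -1, with (a,b,c) = (1,0,1), edgeValue(1,0,1, 0,0) is positive (inside the screen), while edgeValue(1,0,1, -2,0) is negative (outside).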


Clipping.

Let’s now consider the operation of clipping a line against a single edge, where part of the line crosses the edge.

Clip Edge

In the above diagram, we’re clipping a line from A to B on the left side of our screen along the edge.

First, we need to consider if the line from A to B is even visible and if it intersects the edge. Then, if it intersects the edge, we need to find the point C where it intersects the edge.

Notice both questions could be answered if we had the equation (a,b,c) that represents the edge x = -1, since from above we know if we’re inside or outside based on the sign of the result (a,b,c) ˙ (x,y,1), and the distance can help us calculate the point C.

The value for (a,b,c) should be obvious: we have the line for our edge along x = -1, and a little algebra later we get a = 1, b = 0 and c = 1. Ideally we want points inside our screen to evaluate as greater than or equal to zero–that is, for (a,b,c)˙(x,y,1) to be greater than or equal to zero whenever x >= -1–and from a simple inspection our values of (a,b,c) work: if we plug in the coordinates (0,0), we get the value c = 1, which is positive.

This gives us our tests to see if our two points are visible on our screen:

    if (x + 1 >= 0) { /* Point is inside screen */ }

Where x+1 is a compact way of writing (a,b,c)˙(x,y,1), where (a,b,c) = (1,0,1).

This also gives us a way to calculate the position of C if points A and B cross our edge, by using similar triangles:

Clip Edge 2

We know the triangle ACJ is proportionally smaller than the triangle ABK. We can figure out how much smaller the triangle is, proportionally speaking, from the two edge lengths we know–that is, the length of AJ, which is -A˙(a,b,c), and the length of AK, which is the length of JK (B˙(a,b,c)) plus the length of AJ.

The relative size of ACJ is:

Eqn1

And we can use this to find the position of C. The value α ranges from 0 to 1 and gives the relative size of triangle ACJ compared to triangle ABK. If α were 1, then ACJ would be the same as ABK (because B˙(a,b,c) would be 0), and if α were 0, then triangle ACJ would effectively be a point.

So we can find C through the following linear interpolation operation:

Lerp

This operation is so important to our clipper we write it as a separate subroutine:

static void Lerp(const Point &a, const Point &b, float alpha, Point &c)
{
    float a1 = 1.0f - alpha;
    c.x = a1 * a.x + alpha * b.x;
    c.y = a1 * a.y + alpha * b.y;
}

Where “Lerp” is short for “Linear interpolation”, and you’ll hear that term a lot in computer graphics.
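For example, clipping the segment from A = (-2, 0) to B = (0, 0) against the left screen edge x = -1: the edge values are -1 at A and 1 at B, so α = 1/2 and the clip point C comes out at (-1, 0). A sketch (reusing Lerp; the clipAgainstLeftEdge name is mine, and this assumes the segment actually crosses the edge):

```cpp
// A 2D point, matching the article's usage.
struct Point { float x, y; };

// Linear interpolation between a and b (as in the article).
static void Lerp(const Point &a, const Point &b, float alpha, Point &c)
{
    float a1 = 1.0f - alpha;
    c.x = a1 * a.x + alpha * b.x;
    c.y = a1 * a.y + alpha * b.y;
}

// Clip against the edge x = -1, i.e. (a,b,c) = (1,0,1):
// alpha = d(A) / (d(A) - d(B)), where d(p) = p.x + 1.
// Assumes the segment crosses the edge (d(A) != d(B)).
Point clipAgainstLeftEdge(const Point &a, const Point &b)
{
    float da = a.x + 1;
    float db = b.x + 1;
    float alpha = da / (da - db);

    Point c;
    Lerp(a, b, alpha, c);
    return c;
}
```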


Clipping in 2D

We now can write our 2D clipping routine, which basically clips against the four edges of our screen.

If we assume our four edges are along x = ±1 and y = ±1, then we can sketch our solution.

So let’s assume we’re given two points A and B, and we want to determine if we draw the line, draw a clipped line, or not draw a line at all.

Find the “out code”

First, we need to determine which edges the two points A and B are outside of. Since we have four edges, we compute a 4-bit value for each point, with each bit representing whether that point is outside a particular edge. This we’ll call our “out code”, because in the four bits, a bit is set if the point is “outside” the screen at that edge.

Now we have four edges x = ±1 and y = ±1. Since we want a point at (0,0) to have a positive value (by convention), this gives our four edges as (1,0,1), (-1,0,1), (0,1,1) and (0,-1,1).

Remember that a point is outside each edge if the dot product is negative. So now we can quickly calculate our out code, with a little rearranging of our math:

static uint8_t OutCode(const Point &v)
{
    uint8_t m = 0;

    if (v.x < -1) m |= 1;   // v * (1,0,1) < 0
    if (v.x > 1) m |= 2;    // v * (-1,0,1) < 0
    if (v.y < -1) m |= 4;   // v * (0,1,1) < 0
    if (v.y > 1) m |= 8;    // v * (0,-1,1) < 0

    return m;
}

Next, quickly screen stuff that is not visible.

Notice one nice property of our out code: two points with the same bit set are both outside the same edge.

Clip Edge 3

We can test for this quickly by taking the bit-wise ‘and’ of both out codes:

static void DrawClippedLine(const Point &a, const Point &b)
{
    uint8_t a_out = OutCode(a);
    uint8_t b_out = OutCode(b);
    
    if (a_out & b_out) return;

Quickly find lines that are not clipped

We can also quickly test if both points are inside our screen–by testing to see if both out codes are zero. One way we can test for this–because we’ll use the intermediate result later–is to take the bit-wise ‘or’ of both out codes and test if zero:

    uint8_t out_bits = a_out | b_out;
    if (out_bits == 0) {
        DrawLine(a,b);
    } else {

If both out codes are 0, we can just draw our line.

Now find clipped lines.

But what happens if one or more bits are set? Well, that means that one (or both) of our points are outside the screen, and we need to clip our line. For our line drawing routine we have, basically, three different cases, illustrated below:

Clipping cases

The first case, edge S, has one point inside of our screen. We only need to clip one edge.

The second case, edge R, has two points outside our screen, but the middle piece is visible inside our view.

And the third case, edge P, has two points outside our screen and no part of the line is visible.

So how do we screen against all these cases?

First, notice our out_bits variable above. It has the nice property that a bit is set only if we have to clip against a particular edge. This allows us to skip the calculations we don’t need to do.

So let’s iterate across all of the visible edges and calculate our α for each edge:

        uint8_t m = 1;
        uint8_t i;
        float alpha;
        for (i = 0; i < 4; ++i, m <<= 1) {
            if (out_bits & m) {
                // Calculate alpha; the intersection along the line
                // vector intersecting the specified edge
                // 
                // These are specific cases of the general equation
                // alpha = (c - old)/(new - old), which yields
                // alpha == 0 if c == old, and alpha == 1 if c == new,
                // and with alpha as a linear scale with the intersection
                // point sliding from old to new.

                switch (i) {
                    default:
                    case 0:         // clip (1,0,1)
                        alpha = 1 + a.x;
                        alpha = alpha/(alpha - (1 + b.x));
                        break;
                    case 1:         // clip (-1,0,1)
                        alpha = 1 - a.x;
                        alpha = alpha/(alpha - (1 - b.x));
                        break;
                    case 2:         // clip (0,1,1)
                        alpha = 1 + a.y;
                        alpha = alpha/(alpha - (1 + b.y));
                        break;
                    case 3:         // clip (0,-1,1)
                        alpha = 1 - a.y;
                        alpha = alpha/(alpha - (1 - b.y));
                        break;
                }

Now we want to track which point is inside and which point is outside, and for the outside point on our line, track the ‘alpha’ for that side, so we can determine which side needs to be clipped.

We do this by tracking an ‘alpha’ for each side of our line, which we’ll call alpha_a and alpha_b. By convention we’ll assume alpha is 0 if our line clips at point A and 1 if we clip at point B.

We also know which side (the side with A or B) is being clipped by looking at our original outcode: if the bit is set in a_out, then we need to clip on side a. If b_out, then on side b. Notice it will only be one or the other; if neither are set we don’t clip (because out_bits would be zero at that bit), and if both are set–well, we skipped this line earlier.

So let’s add the following changes–declaring alpha_a and alpha_b before the loop, and updating them after the switch:

        float alpha_a = 0;
        float alpha_b = 1;

        uint8_t m = 1;
        uint8_t i;
        float alpha;
        for (i = 0; i < 4; ++i, m <<= 1) {
            if (out_bits & m) {
                // Calculate alpha; the intersection along the line
                // vector intersecting the specified edge
                // 
                // These are specific cases of the general equation
                // alpha = (c - old)/(new - old), which yields
                // alpha == 0 if c == old, and alpha == 1 if c == new,
                // and with alpha as a linear scale with the intersection
                // point sliding from old to new.

                switch (i) {
                    default:
                    case 0:         // clip (1,0,1)
                        alpha = 1 + a.x;
                        alpha = alpha/(alpha - (1 + b.x));
                        break;
                    case 1:         // clip (-1,0,1)
                        alpha = 1 - a.x;
                        alpha = alpha/(alpha - (1 - b.x));
                        break;
                    case 2:         // clip (0,1,1)
                        alpha = 1 + a.y;
                        alpha = alpha/(alpha - (1 + b.y));
                        break;
                    case 3:         // clip (0,-1,1)
                        alpha = 1 - a.y;
                        alpha = alpha/(alpha - (1 - b.y));
                        break;
                }

                if (a_out & m) {
                    if (alpha_a < alpha) alpha_a = alpha;
                } else {
                    if (alpha_b > alpha) alpha_b = alpha;
                }

One more thing: if alpha_a has become bigger than alpha_b, that means we have case P above; our line is not visible on the screen. So we can stop processing right away:

                
                if (alpha_a > alpha_b) return;
            }
        }

Now that we’ve done our calculations, we can determine the new endpoints and draw our line. To do this we again make use of our outcodes to determine if we even need to clip one side or another.

        
        Point c1;
        if (a_out) {
            Lerp(a,b,alpha_a,c1);
        } else {
            c1 = a;
        }
        
        Point c2;
        if (b_out) {
            Lerp(a,b,alpha_b,c2);
        } else {
            c2 = b;
        }

And now–finally!–we can draw our line.

        DrawLine(c1,c2);
    }
}
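Assembled into one compilable sketch, the whole 2D clipper looks like this. Point, Lerp, OutCode and DrawClippedLine follow the article; the DrawLine stand-in, which simply records its endpoints, is mine, so the sketch can run without a display:

```cpp
#include <cstdint>

struct Point { float x, y; };

// Stand-in for the real line drawing routine: record the last
// segment so we can inspect the clipper's output.
static Point gLastA, gLastB;
static bool gDrew = false;
static void DrawLine(const Point &a, const Point &b)
{
    gLastA = a; gLastB = b; gDrew = true;
}

static void Lerp(const Point &a, const Point &b, float alpha, Point &c)
{
    float a1 = 1.0f - alpha;
    c.x = a1 * a.x + alpha * b.x;
    c.y = a1 * a.y + alpha * b.y;
}

// One bit per screen edge; set if the point is outside that edge.
static uint8_t OutCode(const Point &v)
{
    uint8_t m = 0;
    if (v.x < -1) m |= 1;
    if (v.x > 1)  m |= 2;
    if (v.y < -1) m |= 4;
    if (v.y > 1)  m |= 8;
    return m;
}

static void DrawClippedLine(const Point &a, const Point &b)
{
    uint8_t a_out = OutCode(a);
    uint8_t b_out = OutCode(b);

    if (a_out & b_out) return;          // both outside the same edge

    uint8_t out_bits = a_out | b_out;
    if (out_bits == 0) {
        DrawLine(a, b);                 // fully visible
    } else {
        float alpha_a = 0;
        float alpha_b = 1;

        uint8_t m = 1;
        for (uint8_t i = 0; i < 4; ++i, m <<= 1) {
            if (out_bits & m) {
                float alpha;
                switch (i) {
                    default:
                    case 0: alpha = 1 + a.x; alpha = alpha/(alpha - (1 + b.x)); break;
                    case 1: alpha = 1 - a.x; alpha = alpha/(alpha - (1 - b.x)); break;
                    case 2: alpha = 1 + a.y; alpha = alpha/(alpha - (1 + b.y)); break;
                    case 3: alpha = 1 - a.y; alpha = alpha/(alpha - (1 - b.y)); break;
                }

                if (a_out & m) {
                    if (alpha_a < alpha) alpha_a = alpha;
                } else {
                    if (alpha_b > alpha) alpha_b = alpha;
                }
                if (alpha_a > alpha_b) return;  // no visible piece
            }
        }

        Point c1, c2;
        if (a_out) Lerp(a, b, alpha_a, c1); else c1 = a;
        if (b_out) Lerp(a, b, alpha_b, c2); else c2 = b;
        DrawLine(c1, c2);
    }
}
```

A horizontal segment from (-2,0) to (2,0), for example, gets clipped to the visible piece from (-1,0) to (1,0).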

There are some interesting things we can say about our line clipping algorithm.

First, it follows the principle of only doing work if we have to. If we don’t have to clip against an edge, we don’t clip against the edge.

Second, notice that we could in theory extend this to 3D very easily. If we wanted to clip to a cube where x, y and z range from -1 to 1, our 6 different clipping dot products would be

    1: ( 1, 0, 0, 1)
    2: (-1, 0, 0, 1)
    3: ( 0, 1, 0, 1)
    4: ( 0,-1, 0, 1)
    5: ( 0, 0, 1, 1)
    6: ( 0, 0,-1, 1)

Clipping in homogeneous coordinates

Notice something else very interesting about our list of vectors we’re clipping against: they sort of look like homogeneous coordinates, don’t they?

And in fact, with a minor modification to our X or Y clipping (depending on the aspect ratio of the screen, since our screen is not square), this becomes our clipping algorithm for drawing lines in 3D.

We can make a few optimizations. For example, rather than supplying both points A and B as we draw our line, if we were to adopt a ‘move/draw’ mechanism, we can reuse some of the calculations we perform with the previous point as we draw the next.

Note: This is why we use a move/draw mechanism for drawing our lines.

We can also define a homogeneous vector for convenience’s sake:

struct G3DVector {
    float x;
    float y;
    float z;
    float w;
};

And voilà! We have our line clipping algorithm for homogeneous coordinates.

First, we need our linear interpolation algorithm and our out code calculator for homogeneous coordinates:

void G3D::Lerp(const G3DVector &a, const G3DVector &b, float alpha, G3DVector &c)
{
    float a1 = 1.0f - alpha;
    c.x = a1 * a.x + alpha * b.x;
    c.y = a1 * a.y + alpha * b.y;
    c.z = a1 * a.z + alpha * b.z;
    c.w = a1 * a.w + alpha * b.w;
}
uint8_t G3D::OutCode(const G3DVector &v)
{
    uint8_t m = 0;
    float px = p2xsize * v.x;
    float py = p2ysize * v.y;

    if (px < -v.w) m |= 1;      // (p2xsize,0,0,1)
    if (px > v.w) m |= 2;       // (-p2xsize,0,0,1)
    if (py < -v.w) m |= 4;      // (0,p2ysize,0,1)
    if (py > v.w) m |= 8;       // (0,-p2ysize,0,1)
    if (v.z < -v.w) m |= 16;    // (0,0,1,1)
    if (v.z > v.w) m |= 32;     // (0,0,-1,1)

    return m;
}
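Before wiring this into the pipeline, a quick sanity check helps. Here is a free-function sketch of the same out code; the article stores p2xsize and p2ysize as class members, so passing them in as parameters (and using a square aspect of 1 below) is my simplification for illustration:

```cpp
#include <cstdint>

struct G3DVector { float x, y, z, w; };

// Free-function version of the homogeneous out code, with the
// aspect scale factors passed in rather than stored as members.
uint8_t OutCode(const G3DVector &v, float p2xsize, float p2ysize)
{
    uint8_t m = 0;
    float px = p2xsize * v.x;
    float py = p2ysize * v.y;

    if (px < -v.w) m |= 1;      // outside left edge
    if (px > v.w) m |= 2;       // outside right edge
    if (py < -v.w) m |= 4;      // outside bottom edge
    if (py > v.w) m |= 8;       // outside top edge
    if (v.z < -v.w) m |= 16;    // outside near plane
    if (v.z > v.w) m |= 32;     // outside far plane
    return m;
}
```

With a square aspect, a point at the origin gets out code 0, a point with z < -w trips the near-plane bit, and a point far off to the right trips the right-edge bit.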

Next, we need to run the same clipping algorithm, but using homogeneous coordinates:

void G3D::p3movedraw(bool drawFlag, const G3DVector &v)
{
    uint8_t newOutCode = OutCode(v);
    G3DVector lerp;
    if (drawFlag) {
        uint8_t mask = newOutCode | p3outcode;

        /*
         *  Fast accept/reject
         */

        if (0 == (newOutCode & p3outcode)) {
            if (0 == mask) {
                // Fast accept. Both points are inside; we assume
                // the previous point was already passed upwards, 
                // so we only draw to the current vector location
                p2movedraw(true,v.x/v.w,v.y/v.w);
            } else {
                // At this point we have a line that crosses
                // a boundary. We calculate the alpha between
                // 0 and 1 for each point along the line.
                //
                // (This is the Liang-Barsky optimization of
                // the Cohen-Sutherland algorithm)

                float aold = 0;     // (1 - alpha) * old + alpha * new = v
                float anew = 1;     // in the above, 0 == old, 1 == new.
                float alpha;

                uint8_t m = 1;
                uint8_t i;
                for (i = 0; i < 6; ++i) {
                    if (mask & m) {
                        // Calculate alpha; the intersection along the line
                        // vector intersecting the specified edge
                        // 
                        // These are specific cases of the general equation
                        // alpha = (c - old)/(new - old), which yields
                        // alpha == 0 if c == old, and alpha == 1 if c == new,
                        // and with alpha as a linear scale with the intersection
                        // point sliding from old to new.

                        switch (i) {
                            default:
                            case 0:         // clip (p2xsize,0,0,1)
                                alpha = p2xsize * p3pos.x + p3pos.w;
                                alpha = alpha/(alpha - (p2xsize * v.x + v.w));
                                break;
                            case 1:         // clip (-p2xsize,0,0,1)
                                alpha = - p2xsize * p3pos.x + p3pos.w;
                                alpha = alpha/(alpha - (-p2xsize * v.x + v.w));
                                break;
                            case 2:         // clip (0,p2ysize,0,1)
                                alpha = p2ysize * p3pos.y + p3pos.w;
                                alpha = alpha/(alpha - (p2ysize * v.y + v.w));
                                break;
                            case 3:         // clip (0,-p2ysize,0,1)
                                alpha = - p2ysize * p3pos.y + p3pos.w;
                                alpha = alpha/(alpha - (- p2ysize * v.y + v.w));
                                break;
                            case 4:         // clip (0,0,1,1)
                                alpha = p3pos.z + p3pos.w;
                                alpha = alpha/(alpha - (v.z + v.w));
                                break;
                            case 5:         // clip (0,0,-1,1)
                                alpha = - p3pos.z + p3pos.w;
                                alpha = alpha/(alpha - (-v.z + v.w));
                                break;
                        }

                        if (p3outcode & m) {
                            if (aold < alpha) aold = alpha;
                        } else {
                            if (anew > alpha) anew = alpha;
                        }

                        if (aold > anew) {
                            // We have a case where the line is not visible
                            // because it's outside the visible frustum.
                            // abort.
                            break;
                        }
                    }
                    m <<= 1;
                }

                if (i >= 6) {
                    // Ran all clipping edges.
                    if (p3outcode) {
                        Lerp(p3pos,v,aold,lerp);
                        p2movedraw(false,lerp.x/lerp.w,lerp.y/lerp.w);
                    }

                    // Draw to the new point
                    if (newOutCode) {
                        Lerp(p3pos,v,anew,lerp);
                        p2movedraw(true,lerp.x/lerp.w,lerp.y/lerp.w);
                    } else {
                        p2movedraw(true,v.x/v.w,v.y/v.w);
                    }
                }
            }
        }
    } else {
        if (newOutCode == 0) {
            p2movedraw(false,v.x/v.w,v.y/v.w);
        }
    }

    p3outcode = newOutCode;
    p3pos = v;
}

This assumes we’ve added a couple of new fields to our class:

        G3DVector p3pos;
        uint8_t   p3outcode;

And now we have our clipping routine!


The code for our clipping routine is in GitHub; you can download it from here.

Next time: we’ll take all that stuff we talked about with homogeneous coordinates and put the whole thing together to draw 3D objects!

Building a 3D Graphics Pipeline on an Arduino, Part 2

Let’s talk coordinate systems.

Yes, I understand that this will involve that most dreaded of four letter words, math. I promise I’ll be gentle, and I’ll try my best to make this as easy to follow as possible.

Some of this may be a little basic; feel free to skip ahead if you already know something I’m about to talk about.


2D Coordinates

First, let’s talk about finding the position of a pixel on a computer screen. This is something most of us are pretty familiar with.

Pixels

So if we want to specify the location of the yellow pixel above, well, we’d find the column (“x”) under which the pixel is located, and we’d find the row (“y”). For the yellow pixel above, x = 3, y = 1, and that’s our pixel’s location.

We can generalize this idea. For example, if you want to find the location of a pin on a piece of paper, you can use a ruler and measure its distance from the left edge and its distance from the top edge; those two measurements (like, for example, x = 1.3 inches, y = 4.5 inches) give us the location of that point on the piece of paper.

Notice in both cases our position is a measure of distance from some relative location: for our point on a piece of paper our “units” are inches, and the relative location is the upper-left corner of the piece of paper. For our computer display, our “units” are pixels, and it’s all measured relative to the upper-left corner of the screen.

Some terms.

With this in mind, it’s useful to define some terms.

We will call the point from which we’re measuring our point’s relative location the Origin, because this is where our measurements originate. Often we represent the origin in a diagram with a zero.

We will call the relative directions we measure along (horizontally and vertically across the page, across rows and down columns of our display) our Axes. (In math, an “axis” is a line that serves to orient ourselves–and it can also be something we rotate around.)

Note: The whole concept of axes in math is a bit of a rabbit hole, but we’ll ignore that for now and just treat our axes as “up/down”, “left/right” and “near/far.”

And we’ll introduce some compact notation:

(x, y)

This is a compact way for us to represent x and y.

So our yellow pixel is at (3,1), and our point on the piece of paper is at (1.3 inches, 4.5 inches). Usually, however, our units are implied–so we’d write our point on paper as (1.3, 4.5).


3D Coordinates

This whole idea of 2D coordinates can be extended to the world of 3D by considering “depth.”

Depth

Take our yellow dot above. Relative to our X and Y axes, the object is 4 units along X and 3 units up along Y. (What this means is, looking at the diagram above, if you were to measure from the wall that goes up along the Y-Z plane, the yellow dot is 4 units away from that wall.)

From the floor (that is, from the X-Z plane), our yellow dot is 3 units up.

And it’s 5 units away from the X-Y plane.

We would represent the location of our yellow dot as x = 4 units, y = 3 units and z = 5 units, or–more compactly, as

(4, 3, 5)

On the math below.

I think at this point it is fair to say that we’re about to dive into a topic called “linear algebra”, and while I’ll only write out the things that are useful to us, the whole topic of linear algebra contains some interesting stuff that can be useful if you decide to do more interesting things in computer graphics than just draw a handful of pretty pictures with half-understood equations.

And for that, if you want, there is a wonderful series, The essence of linear algebra by Grant Sanderson of 3Blue1Brown, which is absolutely fantastic and explains all this stuff, with pretty pictures, at a level of detail that can be understood by nearly anyone.

For our purposes we’ll introduce the stuff we need–but if you want to understand “why”, you’ll want to check out his videos. They’re well done, well thought out and, I think, very easy for anyone to understand.


Perspective

Now when we talk about computer graphics, what we’re really thinking about is the whole idea of “perspective:” the idea that as things get farther away they appear smaller.

The moon, for example, is 2,150 miles in diameter. A quarter, on the other hand, is slightly less than an inch in diameter. But if you hold up the quarter to the night sky, you can cover the moon with your quarter. That’s perspective: your quarter, being inches from your eye, is a lot closer than the moon, which is 240,000 miles away.

You can think of this by drawing a line from a far away object to your eye:

Perspective1

The blue dot is your eye looking at the far away object.

Now suppose you are looking at the object at a computer screen some distance near your eye. The object would appear smaller–because the screen is nearer than the object you are looking at:

Perspective2

So how big is that object on the screen?

For convenience’s sake, let’s say the distance from the screen to your eye is 1 unit. A little basic geometry gives us an answer: suppose the distance from your eye to the object is Z and its height is Y; then the law of similar triangles tells us the height Y’ on the screen is given by the formula below.

Perspective3

This means the size on our screen (that is one unit away) would be Y/Z.
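A quick numeric check of the moon-and-quarter example (the quarter’s distance from the eye–about 24 inches at arm’s length–is my own rough figure; units just have to match within each pair):

```cpp
// Projected height on a screen one unit from the eye: by similar
// triangles, Y' = Y / Z.
float apparentSize(float height, float distance)
{
    return height / distance;
}
```

The moon (2,150 miles across at 240,000 miles) projects to about 0.009 units; a roughly one-inch quarter at 24 inches projects to about 0.04 units–more than four times larger, which is why the quarter can cover the moon.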


Perspective and Homogeneous Coordinates

When we talk about perspective, we are talking about Projective Geometry, or the geometric principles of projecting stuff.

Which is what we are doing when we project our object onto a computer screen.

And in 1827, August Ferdinand Möbius devised the concept of homogeneous coordinates as a means of representing projective coordinates–the coordinates of things as they are projected somewhere else.

Now the whole topic of homogeneous coordinates, like mathematical axes, is a rabbit hole we can easily spend a lot of time on. But for our purposes, the way this works is as follows:

First, we add a new dimension, w, associated with all of our points. Generally we can take a coordinate (x, y, z) and map it into homogeneous coordinates by adding w = 1, which we show as a fourth value in our coordinates.

(x, y, z) -> (x, y, z, 1)

And we can map any homogeneous point (x, y, z, w) back to 3D coordinates by:

(x, y, z, w) -> (x/w, y/w, z/w)

Now why divide by w and not z? That’s in part a matter of convention, and in part because it works well with our 3D clipping for hand-wavey reasons.

(Basically, w serves as a projection in our 3D coordinate system. And when it comes to clipping we want to preserve all the information we can so that we can represent clipping with as much accuracy as possible, and because we may wish to represent things like points on an “infinite” sphere–like stars in the sky–through using ‘w = 0’.)
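In code, the two mappings are one line each (the struct and function names here are illustrative, not part of any library):

```cpp
// Illustrative structs for ordinary and homogeneous coordinates.
struct Vec3 { float x, y, z; };
struct Vec4 { float x, y, z, w; };

// (x, y, z) -> (x, y, z, 1)
Vec4 toHomogeneous(const Vec3 &v)
{
    return Vec4{ v.x, v.y, v.z, 1.0f };
}

// (x, y, z, w) -> (x/w, y/w, z/w)
// Note: w == 0 represents a point "at infinity" and cannot be
// mapped back to an ordinary 3D point.
Vec3 fromHomogeneous(const Vec4 &v)
{
    return Vec3{ v.x / v.w, v.y / v.w, v.z / v.w };
}
```

So (6, 8, 10, 2) maps back to the 3D point (3, 4, 5).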

Why is this interesting?

One thing you can use homogeneous coordinates for is dealing with all your rotations, scaling operations, and translation operations (moving things around in your system) using matrix multiply operations.

Here’s an example. Suppose we have an object at (3, 4, 5)–and we want to represent it moving over by 2 units along X.

Normally we’d do this by addition: (3, 4, 5) + (2, 0, 0) = (3+2, 4, 5) = (5, 4, 5).

Matrix multiplication may seem a little more convoluted–but trust me, this will make our lives easier.


Small rabbit hole: matrix multiplication.

Before we talk about translation with matrix multiplication, let’s define a few terms.

First, for our purposes, a matrix is just a two-dimensional array of numbers. In 3D computer graphics we almost always use 4×4 matrices, so whenever you hear “matrix” in computer graphics, usually what you should hear is:

A Matrix

That is, a 4×4 array of numbers.

Now when we multiply a matrix by a vector, we treat the vector as a 4×1 matrix (if you’re following along by looking stuff up on Wikipedia), and we essentially do the following:

We can see this with the code we use for multiplying matrices, though we store a matrix as a 2D array:

void G3DVector::multiply(const G3DMatrix &m, const G3DVector &v)
{
    x = m.a[0][0] * v.x + m.a[0][1] * v.y + m.a[0][2] * v.z + m.a[0][3] * v.w;
    y = m.a[1][0] * v.x + m.a[1][1] * v.y + m.a[1][2] * v.z + m.a[1][3] * v.w;
    z = m.a[2][0] * v.x + m.a[2][1] * v.y + m.a[2][2] * v.z + m.a[2][3] * v.w;
    w = m.a[3][0] * v.x + m.a[3][1] * v.y + m.a[3][2] * v.z + m.a[3][3] * v.w;
}

The ways, whys and wherefores of matrix multiplication aren’t really that important here, though you can read more at the above linked Wikipedia article.


Back to Translations

Notice something interesting about our matrix multiplication results: they include addition as well as multiplication. We can use this to our advantage by constructing our matrix appropriately.

So back to our point at (3, 4, 5).

If we represent this as a homogeneous coordinate (3, 4, 5, 1) (because remember: in general when we go from a point in 3D space to a homogeneous coordinate we append a w = 1 at the end), then we can construct a matrix to handle translating by x = 2:

Translate Matrix Example

(Follow along with the animation if you need to convince yourself this is the correct answer.)

Now if we perform the actual multiplications and additions we get our final result (5, 4, 5, 1), and from our rules above, in 3D space, this would be (5/1, 4/1, 5/1) = (5, 4, 5).

We can generalize this by moving through Y and Z, giving us our translation matrix (that is, a matrix which moves our object by a distance in X, Y and Z) as:

Translation Matrix
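Written out, the general translation matrix has this standard form:

```latex
T(t_x,t_y,t_z)=\begin{pmatrix}1&0&0&t_x\\0&1&0&t_y\\0&0&1&t_z\\0&0&0&1\end{pmatrix}
```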

There is an interesting property of these translation matrices. We can chain them together using matrix multiplication. And it just works out the way we would think: if we have two translation matrices, one that translates by (x,y,z) and another that translates by (a,b,c), if we multiply the two matrices together we get:

Multiply Translation Matrices

(If you have to read the article on matrix multiplications and follow along, that’s fine. I can wait.)

This hints at something terribly clever going on here:

With a single matrix you can represent the concatenation of a whole bunch of rotations, scale operations and translations.

And it means we aren’t constantly moving things around the screen a pixel at a time, through a whole chain of “rotate”, “move” and “scale” operations. We simply concat all this into a single matrix through multiplication, and then all points multiplied by that matrix are moved around according to our whole chain of rotations, movements and scale operations.


Scaling and Rotations and chaining it all together

Scaling an object by (sx,sy,sz)–that is, multiplying each object’s (x,y,z) coordinate by (sx,sy,sz)–has the following matrix representation:

Scale Matrix

And rotation around the X, Y and Z axis looks like:

Rotate Matrices
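In their standard forms (consistent with the translation matrix above), the scale matrix and, as one example, the rotation about the Z axis look like this; rotations about X and Y place the sine and cosine terms in the other two rows and columns analogously:

```latex
S(s_x,s_y,s_z)=\begin{pmatrix}s_x&0&0&0\\0&s_y&0&0\\0&0&s_z&0\\0&0&0&1\end{pmatrix},\qquad
R_z(\theta)=\begin{pmatrix}\cos\theta&-\sin\theta&0&0\\\sin\theta&\cos\theta&0&0\\0&0&1&0\\0&0&0&1\end{pmatrix}
```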

Now we can chain these matrices together by pre-multiplying. Meaning if we want to first rotate our object around the Y axis by an angle, then translate the whole thing by (tx,ty,tz), we could multiply our vector first by the rotation matrix, then by the translation matrix, reading right to left:

Chaining

Or, you know, we could multiply the matrices together first–then multiply by all of our points.

‘Cause like I said above:

With a single matrix you can represent the concatenation of a whole bunch of rotations, scale operations and translations.

This is, by the way, what happens in an OpenGL graphics pipeline.

Display drivers which render 3D in hardware are very good at multiplying 4×4 matrices together, and at multiplying 4×4 matrices with vectors. This math allows them to move objects around in real time on your screen.

Of course we’re not going to be moving stuff around in real time with an Arduino, but the same principles apply.


But what about perspective?

Oh, yeah. So we’ve gone down a rabbit hole using 4×4 matrices to move our objects around in three-dimensional space. But what about displaying objects in perspective?

There are a number of perspective matrices out there which essentially put the z depth into the w column, so at the end we get (x/z, y/z, ...)–we then can display our lines using the (x/z, y/z) value at the end.

The perspective matrix I prefer–and you would multiply this as the last multiplication operation before displaying your stuff–is:

Perspective Matrix

Notice what we do here: we move the z coordinate into the w column, and the w column into the z column. This has the nice result that we (eventually) divide by z–and all that perspective stuff we did before works very well.
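One minimal matrix that performs this z/w swap looks like the following; this is a sketch of the idea only, and not necessarily the exact matrix the linked article recommends:

```latex
P=\begin{pmatrix}1&0&0&0\\0&1&0&0\\0&0&0&1\\0&0&1&0\end{pmatrix},\qquad
P\begin{pmatrix}x\\y\\z\\1\end{pmatrix}=\begin{pmatrix}x\\y\\1\\z\end{pmatrix}
\;\longrightarrow\;\left(\frac{x}{z},\;\frac{y}{z},\;\frac{1}{z}\right)
```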

Note: This is not the perspective matrix used by OpenGL. I provide a link above explaining why this is a preferable way to represent perspective.


A word about computer graphics math

I once worked with a computer graphics library as part of a project a long time ago–it was required by the project manager, even though I could have rolled my own more quickly.

But the manual did make the following observation:

In the end, you either have an image, or you do not.

There are a lot of things that can go wrong on your path towards drawing something. You can accidentally flip the sign of something–and think you’re rendering an object in front of you when it is behind you and invisible. You can accidentally rotate left when you intended to rotate right. You can stack the perspective matrices backwards, or multiply the matrices together wrong.

So here are some helpful hints to keep in mind when we get to the part where we start putting the code together.

Test the translation matrix first.

This will allow you to make sure you haven’t transposed the matrix: that you haven’t flipped the rows and the columns. If you take a point at (1,2,3,1) and multiply it by the transformation matrix for moving an object by x=5, if you get (6,2,3,1) you’re probably on the right path. If you get (1,2,3,6), you’ve transposed your matrix.

Test to make sure you haven’t flipped a sign somewhere.

It’s so inevitable that it’s nearly a joke among those in the computer graphics industry: you will flip the sign of some term somewhere. Look through your code for a “+” where there should be a “-” and vice-versa.

Right-handed rule versus left-handed rule.

This is sort of related to the last rule.

Take your right hand, and make a “gun” with your fingers–with your thumb pointing up, your pointing finger pointing out. Now your middle finger points at a right angle to your pointing finger, and your last two fingers curl around.

If your thumb is “X”, your pointing finger “Y” and your middle finger “Z”, this is the right-handed coordinate system. And this is the coordinate system our perspective matrix works in. Notice something interesting here: if you rotate your hand so your thumb is pointing horizontally to the right, your pointing finger is pointing up–well, the middle finger is pointing towards you, not away.

This means if you want to move your object out in front of your virtual camera, you must subtract some value from the Z axis.

Build your translations slowly.

When you start building your translations for display–say, you want to move a block representing an arm of a robot to the shoulder–don’t be afraid to build each of the translations slowly, using small values. Bump your arm out by 1 unit. Or rotate it by 5 to 10 degrees. Small motions help you visualize if you’re building the transformation stack (what we call all that matrix to matrix multiplication) correctly.

And start simply: it’s easier to start with an image (even if it’s a boring cube) and make something cool with it, than to start with a blank screen and pull your hair out trying to figure out why nothing is showing up.

Premultiply your perspective matrix last.

Because we move stuff around on the screen before we take its perspective, right?

When in doubt, comment it out.

This helps you reduce the amount of variables in the problem, so you can see if you have some crazy value being multiplied into your system.


End Notes

This was a lot, I’m sure. But next time we’ll talk about clipping and clipping in 3D. And afterwards, we’ll throw all this together to create our first 3D drawings.

Building a 3D Graphics Pipeline on an Arduino, Part 1

This is a series of articles I’ve intended to write showing the creation of a basic 3D graphics pipeline on an Arduino. By the end of this series hopefully you’ll have a good understanding of basic 3D graphics, including coordinate transformations, rendering and perspective rendering.


Things you need to follow along.

Of course, if you want to write software for an Arduino, you need one.

For this series of exercises I used the Adafruit METRO 328, which can be had (as of this writing) for $17.50. You can also use any other Arduino or Arduino clone, but you may need to fiddle with the settings.

I also used the 2.8″ TFT Touch Shield for Arduino. The advantage of all of this is that we can simply plug the two parts together, plug it into our computer, download the Arduino IDE, and we have a basic development environment.

Adafruit also does a very good job integrating with the Arduino IDE, providing libraries that can be quickly downloaded to test your display. And they provide a great on-line tutorial to get you up and running in just a few minutes.

Of course by using these components we create a dependency on their libraries; if you decide to use a different display shield from a different company you’ll also need to make a few changes in our code to make it all hang together. We’ll talk about that when we get there.

When you get it all set up, you should see a series of test screens.

Image

* Bear not included.


Introduction.

The reason why we call this a “graphics pipeline” is because we’ll break the problem of drawing 3D objects down into a series of steps–a “pipeline” where we do a small but well-defined unit of work at each step along the way.

By organizing our drawing as a pipeline we simplify each step of the process. Each step may not do much, but when added together we can create incredible things.

The pipeline we will build will allow us to draw 3D wireframes. We’ll do a series of articles covering polygonal rendering later. At the end of this series of articles we will have created a pipeline that looks like the following:

Pipeline

So let’s get started.

Hardware drawing.

At the very bottom of our pipeline is the code which draws directly to the hardware. For this we will rely on the Adafruit GFX library to actually handle the drawing–but we still call this layer out because it is the only point in our code where we talk to the Adafruit GFX library in a meaningful way. (Our code on construction interrogates the library to get the screen size, and we expose the start and stop drawing methods and color methods–but that’s it.)

We could, in theory, write our own line drawing routine. Bresenham’s line drawing algorithm can draw lines very quickly using only integer math, and is the algorithm used by the Adafruit library. Our line drawing routine could be wired to another library as well, such as the graphics library used by the Arduboy hand held game.

Line Drawing

Our primary line drawing routine for the first part of our 3D pipeline is the ‘p1movedraw’ method. This contains all the logic for moving a virtual pen to a given location, or drawing from the current pen location to a new pen location.

We use this ‘pen’ drawing model because it allows us to keep track of the state of the current location; this comes in handy when we do our 3D calculations or our line clipping.

This means our class G3D must keep track of the current pen location (and all the state revolving around the current pen location).

So let us create our 3D graphics class. First, we need to set up the constructor and the destructor to initialize and shut down our 3D pipeline, as well as entry points for setting the current color and for starting and stopping line drawing.

class G3D
{
    public:
                G3D(Adafruit_GFX &lib);
                ~G3D();

        void    setColor(int16_t c);		
        void    begin();
        void    end();
        void    move(uint16_t x, uint16_t y);
        void    draw(uint16_t x, uint16_t y);
        void    point(uint16_t x, uint16_t y);
};

The three routines move, draw and point are test methods; we will replace them when we build the next level of our graphics engine.

Now let’s add the internal state we need to track, as well as our constructor and destructor. We want to keep track of our drawing library as well as the screen size and color:

class G3D
{
    public:
                G3D(Adafruit_GFX &lib);
                ~G3D();

        void    setColor(int16_t c)
                    {
                        color = c;
                    }

        void    begin();
        void    end();
        void    move(uint16_t x, uint16_t y);
        void    draw(uint16_t x, uint16_t y);
        void    point(uint16_t x, uint16_t y);
    private:
        /*
         *  Internal state
         */
        Adafruit_GFX &lib;
        uint16_t width;
        uint16_t height;

        /*
         *  Current drawing color
         */

        uint16_t color;
};

Our constructor simply notes the width and height of the screen as well as the drawing library; our destructor does nothing for now:

G3D::G3D(Adafruit_GFX &l) : lib(l)
{
    width = l.width();
    height = l.height();
}

G3D::~G3D()
{
}

We also want to build the begin and end methods. These call into the Adafruit GFX library to start and end drawing to our device. The reason why we do this is to optimize drawing; we don’t want to unnecessarily start up and shut down communications to our hardware every time we draw a line.

void G3D::begin()
{
    lib.startWrite();
}

void G3D::end()
{
    lib.endWrite();
}

Our setColor method can be handled inline above.


Now all of this so far has just been plumbing; creating our class and tracking state.

The real meat of our first level of drawing is the p1movedraw method and its associated state. The idea here is to create a single entry point that handles both moving and drawing.

Our p1movedraw method needs to keep track of the current pen location, as well as whether the last operation moved the pen or drew a line. So let’s add the additional methods needed to handle moving and drawing:

class G3D
{
    public:
                G3D(Adafruit_GFX &lib);
                ~G3D();

        void    setColor(int16_t c)
                    {
                        color = c;
                    }

        void    begin();
        void    end();
        void    move(uint16_t x, uint16_t y);
        void    draw(uint16_t x, uint16_t y);
        void    point(uint16_t x, uint16_t y);
    private:
        /*
         *  Internal state
         */
        Adafruit_GFX &lib;
        uint16_t width;
        uint16_t height;

        /*
         *  Current drawing color
         */

        uint16_t color;

        /*
         *  State 1 pipeline
         */

        bool    p1draw;
        uint16_t p1x;
        uint16_t p1y;

        void    p1init();
        void    p1movedraw(bool drawFlag, uint16_t x, uint16_t y);
        void    p1point(uint16_t x, uint16_t y);
};

The internal variable p1draw is true if the last call drew a line. The values (p1x,p1y) are the pixel location of the pen. We have an initialization method for our drawing, as well as our p1movedraw method, which either moves the pen or draws a line, and our p1point method, which plots a single pixel.

With our methods defined for our level 1 code, we can add our initialization method to our class constructor:

G3D::G3D(Adafruit_GFX &l) : lib(l)
{
    width = l.width();
    height = l.height();

    /* Initialize components of pipeline */
    p1init();
}

And our test methods simply call into our internal drawing routines:

void G3D::move(uint16_t x, uint16_t y)
{
    p1movedraw(false,x,y);
}

void G3D::draw(uint16_t x, uint16_t y)
{
    p1movedraw(true,x,y);
}

void G3D::point(uint16_t x, uint16_t y)
{
    p1point(x,y);
}

Hardware drawing

This is a lot of stuff, but it gets us to the core of our drawing methods. And we’ll reuse a lot of this machinery when we get to the next phase of our drawing code.

First is initialization. Our initialization routine should set our system to a known state; in this case, we move the pen to (0,0). Recall that the variable p1draw is true if we drew last, false if we moved.

Our initialization code then sets everything to zero:

void G3D::p1init()
{
    p1draw = false;
    p1x = 0;
    p1y = 0;
}

Our point drawing does not move the pen; it simply sets the pixel. This is a simple wrapper into our Adafruit GFX library to write a pixel:

void G3D::p1point(uint16_t x, uint16_t y)
{
    lib.writePixel(x,y,color);
}

Note: We call the ‘writePixel’ method rather than the ‘drawPixel’ method because we are calling the ‘startWrite’ and ‘endWrite’ methods inside a ‘begin’/’end’ pair.

And our move/draw routine will draw a line only if the pen is down; we then update our current state.

void G3D::p1movedraw(bool drawFlag, uint16_t x, uint16_t y)
{
    if (drawFlag) {
        lib.writeLine(p1x,p1y,x,y,color);
    }

    p1draw = drawFlag;
    p1x = x;
    p1y = y;
}

Note that this is the place we would need to change our code if we needed to draw our own line using Bresenham’s line drawing algorithm, or if we needed to create a draw list to send to a piece of connected hardware.


We can put all this together with a simple test:

// Use hardware SPI (on Uno, #13, #12, #11) and the above for CS/DC
Adafruit_ILI9341 tft = Adafruit_ILI9341(TFT_CS, TFT_DC);

// Graphics setup
G3D draw(tft);

void setup() 
{
}

void loop() 
{
    tft.begin();
    tft.fillScreen(ILI9341_BLACK);
    
    draw.begin();
    draw.setColor(ILI9341_RED);
    draw.move(0,0);
    draw.draw(0,100);
    draw.draw(100,100);
    draw.draw(100,0);
    draw.draw(0,0);
    draw.move(100,120);
    draw.draw(100,140);
    draw.end();

    for (;;) ;
}

And if all works well, we should see:

L1

Okay, it’s not much. But you have to learn to crawl before you walk.

The source code for this version of the 3D graphics pipeline is available on GitHub; the linked branch shows just the first part of the graphics pipeline.


Now this is rather boring. If this was all we wanted to do, big deal.

But remember our original principle:

Each step may not do much, but when added together we can create incredible things.

The next step in our pipeline is viewport drawing.

The purpose of viewport drawing is to abstract away the pixels. That is, what we want to do is to redefine our drawing coordinates so that we get more or less the same drawing regardless of the dimensions of the physical hardware attached to our Arduino. And we do this by defining a new viewport coordinate system that ranges from -1 to 1.

Note: We will be using floating point math for our operations, which may not necessarily be the fastest on an Arduino. However, it should be fast enough for us to do some simple things like rotating a cube. When designing for an Arduino, the most important thing you can do is figure out what not to draw; that way you can avoid doing math for stuff that is not visible.

Now our virtual coordinate system imposes a virtual square; the top and bottom of our screen (or the left and right sides, depending on the orientation of our screen) have a smaller coordinate range:

Viewport

This means we have a little math to do.

Basically we need to figure out two things when our 3D class starts up:

  1. What are the viewport drawing dimensions of the screen? We need this in order to determine the dimensions of the screen we are clipping to.
  2. To convert from a viewport coordinate (x,y) to pixel coordinates (px,py), we need to calculate px = A + B*x and py = C + D*y for some values A, B, C, D. What are those values?

Things get particularly interesting on the fringes. Note that when we round from a floating point value to an integer, we generally truncate (or round down); this means the value 12.75 becomes 12 when converted to an integer. So we can’t, for a screen that is 320 pixels wide, write:

    px = 160 + 160 * x;  // 160 = 320 / 2

Because if we plug in a value 1 for x, we get px = 320–and our pixel coordinate range is from 0 to 319.

So we solve this problem by bumping the width down by 1 pixel, and we calculate our pixel range (in floating point values) from 0.5 to 319.5. (This will then convert to integer values from 0 to 319.)

Thus we’d write:

    px = 160 + 159.5 * x;  // 159.5 = 319/2

Plugging in -1 for x gives us px = (int)0.5 = 0, and 1 for x gives us px = (int)319.5 = 319.

We have a second problem in that in our pixel coordinates (0,0) is in the upper-left corner of the screen–but we want our drawing to put (-1,-1) at the bottom-left. This means we need to flip the sign of the y value during our calculations:

    px = A + B * x;
    py = C - D * y;

Our third problem, of course, is that our display is not square; it’s rectangular. But we assume our pixels are square. And we assume whichever is larger (the width or the height) will map from -1 to 1; the other side will map from roughly -0.75 to 0.75 or so.

This implies our scale value B and D which we multiply in our equation above is the same. (If our pixels are not square we need to do something different.)


So let’s put all this together.

For our viewport coordinate system we need to track five variables: two giving the (+/-) width and (+/-) height of our viewport:

        float	p2xsize;      // viewport width +/-
        float	p2ysize;      // viewport height +/-

And three that handle the transform: the amount we multiply and add to our viewport coordinate to give our pixel coordinate:

        float	p2scale;      // coordinate transform scale.
        float	p2xoff;      // coordinate transform offset
        float	p2yoff;

Of course we also need to add our level 2 pipeline code.

All of this looks like:

class G3D
{
    public:
                G3D(Adafruit_GFX &lib);
                ~G3D();

        void    setColor(int16_t c)
                    {
                        color = c;
                    }

        void    begin();
        void    end();
        void    move(float x, float y);
        void    draw(float x, float y);
        void    point(float x, float y);
    private:
        /*
         *  Internal state
         */
        Adafruit_GFX &lib;
        uint16_t width;
        uint16_t height;

        /*
         *  Current drawing color
         */

        uint16_t color;

        /*
         *  State 2 pipeline; map -1/1 to screen coordinates
         */

        void    p2init();
        void    p2movedraw(bool drawFlag, float x, float y);
        void    p2point(float x, float y);

        float   p2xsize;        // viewport width +/-
        float   p2ysize;        // viewport height +/-
        float   p2scale;        // coordinate transform scale.
        float   p2xoff;         // coordinate transform offset
        float   p2yoff;

        /*
         *  State 1 pipeline
         */

        bool    p1draw;
        uint16_t p1x;
        uint16_t p1y;

        void    p1init();
        void    p1movedraw(bool drawFlag, uint16_t x, uint16_t y);
        void    p1point(uint16_t x, uint16_t y);
};

Notice we also change our test routines (move, draw and point) to take floating point coordinates.


Our p2movedraw and p2point routines look similar: they perform our math calculation using the constants we’ve calculated during startup to transform from viewport coordinates to pixel coordinates:

void G3D::p2point(float x, float y)
{
    int16_t xpos = (int16_t)(p2xoff + x * p2scale);
    int16_t ypos = (int16_t)(p2yoff - y * p2scale);
    p1point(xpos,ypos);
}

void G3D::p2movedraw(bool drawFlag, float x, float y)
{
    int16_t xpos = (int16_t)(p2xoff + x * p2scale);
    int16_t ypos = (int16_t)(p2yoff - y * p2scale);
    p1movedraw(drawFlag,xpos,ypos);
}

Now the heavy lifting is in our initialization code, which must find all our constants given the size of the screen. As we noted before, we assume pixels are square.

Our initialization code first needs to set up some constants and find the screen dimensions for our clipping method. We then need to calculate the scale parameter and ultimately the offsets. In the end our code looks like:

void G3D::p2init()
{
    /*
     *  We subtract one because we want our mapping to work so that
     *  virtual coordinate -1 is in the middle of the 0th pixel, and
     *  +1 is in the middle of the width-1 pixel for wide displays.
     *
     *  This offset of 1/2 a pixel implies we're drawing on
     *  a display 1 pixel narrower and shorter, but then with 1/2 added
     *  to the pixel coordinate
     */

    uint16_t h1 = height - 1;
    uint16_t w1 = width - 1;

    /*
     *  Calculate the width, height in abstract coordinates. This
     *  allows me to quickly clip at the clipping level
     */

    if (w1 > h1) {
        p2xsize = 1;
        p2ysize = ((float)h1)/((float)w1);
    } else {
        p2xsize = ((float)w1)/((float)h1);
        p2ysize = 1;
    }

    /*
     *  Calculate the scale, offset to transform virtual to real.
     *  Note that -1 -> 0 and 1 -> (width-1) or (height-1).
     */

    if (w1 > h1) {
        p2scale = ((float)w1)/2;
    } else {
        p2scale = ((float)h1)/2;
    }

    p2xoff = ((float)width)/2;
    p2yoff = ((float)height)/2;
}

Of course we need to change our main Arduino test code to use the new floating point values for coordinates:

void loop() 
{
    tft.begin();
    tft.fillScreen(ILI9341_BLACK);

    draw.begin();
    draw.setColor(ILI9341_RED);
    draw.move(-0.5,-1);
    draw.draw(0.5,-1);
    draw.draw(0.5,1);
    draw.draw(-0.5,1);
    draw.draw(-0.5,-1);

    draw.move(-0.75,-0.75);
    draw.draw(-0.5,-0.75);
    draw.draw(-0.5,-0.5);
    draw.draw(-0.75,-0.5);
    draw.draw(-0.75,-0.75);
    draw.end();

    for (;;) ;
}

And if everything works correctly, you should see the following:

Image 2

Again, not all that exciting.

Except for one thing: This is what your image will look like regardless of your screen size. This becomes important as we move on and start building our 3D rendering engine.

All of this code is available at GitHub in a separate branch.


This has been an extremely long post, but a lot of this was setting the stage for future posts. The next post will discuss homogeneous coordinates, the coordinate system we use for representing objects in 3D. This will be followed by a post discussing clipping in homogeneous coordinates, then we’ll move into some 3D math.

The test print of the Earth-Moon Orrery

So after a couple of days of printing, and a little sanding, some assembly, and not a tiny bit of cursing, I have a printed orrery gear assembly for our Earth-Moon Orrery.

Assembled

It’s a little stiff, but that’s to be expected since I haven’t perfected the gears through sanding. (That means carefully going through each tooth with a tiny little strip of sandpaper and polishing down all the little rough edges.) Of course the mechanism also needs to set for a few days; I’ve found 3D prints out of the FormLabs Form 2 printer generally take a couple of days to set before the parts seem totally cured. (Of course I could buy the system FormLabs sells to harden parts–but I haven’t.)

The ideal way to build this mechanism, of course, is out of brass–and the next thing for me to do is figure out how to cut brass gears and replicate this mechanism in aluminum and brass.

But as of now, our very simple little mechanism works.

Designing a simple Earth-Moon Orrery.

My eventual goal: to build a simple Tellurion, an Orrery which shows the relative position of the moon around the earth as the earth revolves around the sun.

This will be a series of blog posts, documenting the good, the bad and the ugly.


For me the first step is to fire up Kythera and design the gear chain that controls the motion of the moon around the earth as the earth goes around the sun. (Now many such Orreries show the rotation of the earth, but we’re going to skip that for now.)

Ultimately we want a gear chain that then allows us to rotate the moon around the earth relative to the earth moving around the sun, with both moving in a counter-clockwise relative motion:

E-M Diagram


So here’s a question: what’s the relative motion we should use?

Well, it takes approximately 365.25 days for the Earth to move around the Sun, relative to the fixed stars around the solar system. And it takes 29.530 days between new moons–which means it takes that long for the moon to rotate to the far side of the Earth.

This means that there are 12.36874634 passes of the moon per solar orbit–which agrees with the Wikipedia article.

(Now the difference between a Sidereal month and a Synodic month is that the Sidereal month represents the movement of the moon relative to the stars. Think for a moment: as our Sun/Earth arm revolves once around the sun, the moon revolves 12.368 times around the Earth/Sun arm.

And as the moon revolves 12.368 times around the Earth/Sun arm, the Earth/Sun arm goes around once. This means the moon goes 13.368 times total around the sun–which gives us a sidereal month of 27.321 days.)
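That reasoning is the standard relation between the synodic month, the sidereal month, and the year:

```latex
\frac{1}{T_{\text{sidereal}}}=\frac{1}{T_{\text{synodic}}}+\frac{1}{T_{\text{year}}}
\quad\Longrightarrow\quad
\frac{1}{29.530}+\frac{1}{365.25}\approx\frac{1}{27.321}
```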


Lets find a gear system which does this.

So, opening the Kythera Gear Calculator screen, we enter our target fraction:

Emratio

There is an interesting ratio, which is extremely accurate, with some very interesting prime factors at the bottom of the list.

Since we need an odd set of gears (because we want both the Earth/Sun arm and the Earth/Moon arm to rotate counter-clockwise), the last gear ratio turns out to be a very interesting ratio, because the top factor 17576 = 26³.

A little fiddling later, and I came up with the following gears:

Gear 1: 52 teeth.
Gear 2: 29 teeth.
Gear 3 attaches to Gear 2 with 52 teeth.
Gear 4 with 14 teeth.
Gear 5 attaches to Gear 4 with 26 teeth.
Gear 6 with 14 teeth.

Gear 1 is a fixed gear attached to our sun. The other gears attach to an arm which swings around our sun, and Gear 6 rotates with a ratio of 17576/1421 = 12.368754398, a difference from our target fraction of 6.515·10⁻⁵%.
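We can check that ratio from the tooth counts–the fixed 52-tooth gear drives the 29-tooth gear, the second 52-tooth gear drives the first 14-tooth gear, and the 26-tooth gear drives the final 14-tooth gear:

```latex
\frac{52}{29}\cdot\frac{52}{14}\cdot\frac{26}{14}
=\frac{70304}{5684}
=\frac{17576}{1421}\approx 12.368754
```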


Some fiddling later in OpenSCAD, and we come up with the following model.

Draft Mechanism

All the files are attached here: EMOrrery2.zip.

And the gears are out of the FormLabs Form 2 printer. Right now I’m printing the rest of the components, and we’ll see how this works out.


Ultimately a mechanism like this needs to be cut using brass gears; that’s how you get the lovely appearance and the lovely movement and beautiful presentation seen in commercially available Orreries. But the beauty of using a 3D printer is it makes prototyping fairly quick and easy.

How well this works out, I’ll know tomorrow.

Testing 3D Printing of Gears

Kythera is a product sold by Glenview Software for $10 which allows you to string together complex mechanisms using spur gears. It helps you design complex mechanisms quickly on your computer and export .STL files for printing on a 3D printer.

And today we’ll use it to build a test mechanism, in order to test how well we can manipulate the gears to create a simple example mechanism; in this case, a gear train which translates the motion of a minute hand to an hour hand.

Such motion works are used in clock making and watch making. And we can very quickly build the chain of gears, to see the whole thing in action.


First, fire up Kythera. Since we’re just creating a naked motion work, we can create a new document and delete the gear that is automatically created for us. Switch to the layer menu and add a new layer, since our gear system will require two layers of gears.

Next under the “Gears” menu, select “Create Motion Work”. Create a new motion work with a 12 to 1 ratio, and for our system, we’ll select the first proposed option labeled 8:32 10:30. This will create a simple four-gear system:

Kythera Motion Work

Switch to the Document tab and verify a few settings: our axle radius should be 1.1mm; this will fit a 2mm pin. The screw radius should be set at 2.1mm; this will fit an M4 screw. Let’s bump up the “shrinkage” value to 0.625mm; this will reduce our gears by 0.625mm radius per gear, so the gears “rattle.” Remember, if it rattles, it runs; objects printed on a 3D printer come out slightly larger than designed.

(Note that I’m doing this on a FormLabs Form 2 printer; you may need to fiddle with these settings depending on the technology you’re using to print your mechanism, including fiddling with the size settings for your gears.)

Now we want to export our gear train; call it “MotionWork.gbom”. This will also design a basic frame for us.


Open the layout.scad file with OpenSCAD that is generated in the MotionWork.gbom directory. In order to make our mechanism work we’ll need to make a few adjustments.

First, we’ll want to add cylinders to our first and fourth gear; this allows us to create axles on which we could in theory attach arms. We also want to join the second and third gear so they are printed as a single gear. We need to modify the top plate so our cylinders attached to our first and fourth gears pass through the top.

In OpenSCAD the results of all this editing looks like:

Modlayout


Now we need to generate the STL files for our mechanism. As we’re printing everything as separate components, we need to selectively comment out all of the pieces, only leaving uncommented the top plate, bottom plate, support cylinders, the first gear, the second and third gear fused together, and the fourth gear.

We then print each of the parts out, and assemble using 2 25mm x 2mm pins, and four 30mm long M4 screws and M4 nuts.


This is another test print, but one of the nice elements of Kythera’s “shrinkage” feature is that we know the gears should mesh and freely spin in a 3D printed mechanism without a lot of fussing around.

Of course the actual values you want to use for your gear size and shrinkage will depend on the printing technology. But the results (printed so that spurs are not inserted between the gear teeth, which in the FormLabs PreForm application requires a lot of fiddling with the spur locations) do spin freely, and would make a good starting point for a gear train for an hour hand and minute hand on a clock.


All the files used in this model, including generated .stl files and OpenSCAD files, can be downloaded here: MotionWork.zip