Apr 1, 2016

Happy Bleeding Edge

Good day to you, dear friends!

The upcoming C++19 standard is upcoming very soon, and the word is it's going to have many exciting new features!

Just to mention a few:

  • The language becomes garbage-collected. It means you never need to use "delete" anymore. For the backwards compatiblity, it's not an error if you still do, though
  • The curly braces syntax becomes deprecated (in an attempt to get rid of the "curly brace languages" club bad rep). The Python-like syntax is now preferred
  • One Definition Rule is now relaxed. You can define all you want; the winner is picked randomly at runtime using the Mersenne Twister
  • Going forward, friends are not anymore allowed to see each other privates
  • It is a compilation error to have tabs in the code. If the amount of comment lines in a file is less than 33.3%, it is also a compilation error (this should force good coding practices)
  • Substitution Failure Is Now An Error
  • Resource Acquisition Is No Longer an Initialization
  • The syntax for the main() function has been modernized (the correct syntax is now "public static void main ...")
  • A new modifier keyword, thread_safe, has been introduced. Insert it anywhere into your code to make it thread-safe
  • The standard library has been extended with many new useful algorithms. For example, std::fast_sort allows to sort arrays in a constant time (the precondition is that the input should be ordered)
  • Now it is OK to divide by zero, dereference NULL pointers and compare floating point values for equality: nothing bad will ever happen!

In order to keep our skills up to date and bleeding edge, let's practice some C++19.

I've prepared a little C++19 quiz for you:

What does this code do?

// note that this code requires a modern C++19 compliant compiler
#include <stdio.h>

int O = 5
char buf[6]
public static void main() thread_safe:
    float m[] = { 0.16437186, 0.57526314, 0.20729951, 0.13258759, 0.23461139 }
    for (int i = O; i --> 0;):
        for (int j = O, _= *(int*)(i + m); j --> 0; _>>=O):
            j[buf] = O*13 + (_&31)
        printf("%s\n", buf)

Enjoy!

NOTE: if you don't have an access to a bleeding-edge C++19 compiler, the code could be easily backported to a C++98 standard - just replace the Pythonic code blocks and Javaesque main() signature with an old C++ one!


UPDATE, APR 6 2016:

Posted on the 1st of April, the list of “new C++19 standard features” was indeed an "obviously non-funny" april’s fool thing.

Of course, there is no such thing as C++19, neither the committee would ever consider features as ridiculous as those mentioned (which is a shame, I would personally love to have tabs banished from existence :)).

As silly as the joke was, I tried to allude to a few topics there, in particular about “syntax vs semantics” in programming languages.

Or “form vs shape”, to put it in a philosophical perspective.

One can often hear arguments about the language syntax. The discussions are usually more heated at the levels of the “curly braces”, since this is the area where everyone is entitled to their own opinion (pretty much like on “tabs vs spaces”).

Let’s look at several forms of the same piece of code, in some hypothetical programming languages:

snippet 1:

for (i = 0; i < 5; i++) {
    for (j = 0; j < 5; j++) {
        println(i*j);
    }
}

snippet 2:

(for [i 0] (< i 5) (inc i) 
    (for [j 0] (< j 5) (inc j)
        (println (* i j))))

snippet 3:

for i = 0; i < 5; i++:
    for j = 0; j < 5; j++:
        println(i*j)

snippet 4:

for i = 0, i < 5, i = i + 1 do
    for j = 0, j < 5, j = j + 1 do
        println(i*j)
    end
end

snippet 5:

DO 1 i=0,4
   DO 1 j=0,4
      1 PRINT *,i*j

snippet 6:

For i = 0 To 5 Step 1
    For j = 0 To 5 Step 1
        Print(i*j)
    Next j
Next i

Which one is the “best”?..

Whatever your opinion is, let’s not forget that language syntax is purely a human, psychological construct!

In that light, it’s a straightforward code transformation to come back from the “C++19” syntax to a valid C++ syntax. We just replace Python-style code blocks with C++ curly braces, and of course remove the pseudo-Java main() function signature:

#include <stdio.h>

int O = 5;
char buf[6];
int main() {
    float m[] = { 0.16437186, 0.57526314, 0.20729951, 0.13258759, 0.23461139 };
    for (int i = O; i --> 0;) {
        for (int j = O, _ = *(int*)(i + m); j --> 0; _>>=O) {
            j[buf] = O*13 + (_&31);
        }
        printf("%s\n", buf);
    }
}

This code does not require a “C++19-compliant compiler” anymore. In fact, it compiles via a C99 compiler, runs and prints something!

It’s still not comprehensible, though. So let’s remove some more cruft, including the infamous “--> operator” and quirky C array indexing and try to find better names for the variables:

#include <stdio.h>

const int   BITS_PER_CHAR = 5;
const int   MASK = 31; // binary b11111, corresponding to BITS_PER_CHAR
const float WORDS_F[] = { 0.16437186, 0.57526314, 0.20729951, 0.13258759, 0.23461139 };
const int   NUM_WORDS = 5;
const int   NUM_CHARS = 5;

char buf[6];
int main() {
    for (int i = NUM_WORDS - 1; i >= 0; i--) {
        int bits = *(int*)&WORDS_F[i]; // reinterpret float value as int, bitwise
        for (int j = NUM_CHARS - 1; j >= 0; j--) {
            buf[j] = 'A' + (bits&MASK);
            bits >>= BITS_PER_CHAR;
        }
        printf("%s\n", buf);
    }
}

Now we can see (well, more or less) what does it try to do.

It extracts bit patterns (5 bits each) from the binary representation of the floating point values, interprets them as characters and prints that as words, one word per floating point value.

The floating point values are picked up specifically to encode some message.

Assuming that float is a single precision IEEE-754 floating point value (which most of the time is the case, but is not ultimately guaranteed), it is represented via 32 bits of data:

  • Mantissa in the lowest 23 bits
  • 8 bits for the shift-negative exponent
  • 1 bit for sign in the highest bit

Each character in the text is encoded via 5 bits, which is an offset from the character 'A' (i.e. 0 is 'A', 1 is 'B', 2 is 'C' etc.)

So in total each text word, such as "HAPPY", would take 25 bits. This is the whole 23 bits of mantissa plus cutting extra two bits into the exponent:

Note that unused exponent bits are padded with "011111", which is done in order to have the magic numbers look "nice":

So in the end, given a few assumptions we make about the current platform, the code would print:

HAPPY
APRIL
FIRST
RGRDS
CQUIZ

In addition to the "syntax argument fallacy", I'll throw in a few more maximas I was trying to allude to:

  • Garbage collection does not necessarily magically help to solve all the memory problems
  • It is essential to have a deterministic behaviour in the code
  • Thread safety is not something that can be achieved by magic, one still has to understand what's going on in a concurrent environment
  • Encapsulation is essential in design, but just having a "private" keyword in the language does not magically enable it
  • One has to know what are the complexity guarantees of the standard (or any other) library algorithms, if any. Otherwise it's just relying on magic (see above)
  • Certain things are fundamental to the way hardware works, and programming language can't be blamed for them not working "as expected"

There are many more programming language design related fallacies that people, myself included, may be subjected to.

I believe that working with such a complicated language as C++ does not necessarily make one a better programmer.

But it may, in a sense, force one into a certain awareness about such fallacies' existence.

Jul 12, 2015

Prime factorization Elms

Some time ago I played with the idea of "plant-like" prime factorization visualization, whereas prime decomposition of a number is rendered as a plant, with branching that corresponds to its prime factors.

There was a little C++/OpenGL demo producing amusing pictures and all, however one of the obvious problems was that one had to download and compile the code to try it.

So I figured I'd give it another shot, from a different angle.

The Elm programming language looked like an interesting candidate for a few reasons:

  • It's a web-frontend language, so there we go - the demo accessibility problem is sorted out
  • It's high level, succinct, focused on visuals and interaction, so presumably fewer lines of code to do the same thing
  • It's built upon the Functional Reactive Programming paradigm, and I heard that learning new paradigms is a good thing to do
  • It's kind of like Haskell, and Haskell is hard, but these are more like nice pictures and Marios, so I could do those instead and look a little bit like the cool kids who can do Haskell!

Anyway, here's the online demo of the result (SPACE toggles pause mode, arrows allow to swap numbers back and forth):

The code itself is on GitHub, it's pastable verbatim into "try elm", so feel free to... well, try :)

(update: also on share-elm)

Now, step by step, about how the code works.

Drawing a trunk and a fruit

Let's start from drawing a "trunk" and a "fruit" on top (the numbers are hardcoded to make thing simpler):

import Color
import Graphics.Collage exposing (..)
import Graphics.Element exposing (..)
 
-- draws a trunk
trunk: Form 
trunk =
  polygon [(-5, 0), (-2, 80), (2, 80), (5, 0)]
      |> filled (Color.hsl 0.3 0.3 0.5)
 
-- draws a fruit
fruit: Form 
fruit =
  let grad =
    Color.radial
        (0, 05
        (4, -416
        [(0, Color.hsl 1.5 0.4 0.9),
        (1, Color.hsl 1.5 0.4 0.7)]
  in
    circle 15
    |> gradient grad
 
main: Element 
main = collage 400 300 [trunk, fruit |> moveY 80]

There are two functions "trunk" and "fruit" that create corresponding graphical elements ("Forms").

Trunk is just a polygon (trapezoid), filled with a brown color. Fruit is a circle, filled with a green-ish gradient (specified by an inner center/radius, outer center/radius and two gradient stops).

These forms are combined into an "Element" (like a DOM element) using "collage".

The fruit is moved 80 pixels upwards. Note that the center of coordinates is in the middle of the collage, and Y axis is looking upwards.

This collage is what the main program produces and what we see on the screen.

Elm borrowed the pipe function ("|>") from F#, and it's used quite a few times in the code to "thread" an argument through several modifier functions.

Drawing a tree

Now let's repeat this trunk/fruit drawing recursively, traversing a list of integers.

Each integer is interpreted as a branching factor on every level:

--  finds a rotation angle for i-th branch from n,
-- so that the branches are evenly fanned out in a cone
fanOutAngle: Float -> Int -> Int -> Float 
fanOutAngle cone i n =
  if <= 1 then 0
  else degrees ((toFloat i)/((toFloat n) - 1- 0.5)*cone
 
-- draws a subtree recursively,
-- according to the list of integer factors
subTree: Float -> List Int -> Form 
subTree cone s =
  case of
    n::xs ->
      let fork = \->
            [ trunk
            , subTree (cone*0.95) xs
                 |> scale 0.8
                 |> moveY 80
            ]
            |> group
            |> rotate (fanOutAngle cone k n)
      in
      [0..(n - 1)]
      |> List.map fork
      |> group
    _ -> fruit
 
main: Element 
main = collage 400 300 [subTree 100 [1, 2, 3, 3, 5, 7|> moveY -150]

That makes the recursive magic happen:

The gist of the recursion step for a "branch" is:

  • draw first whatever is supposed to be on the tip of the branch (recursively)
  • scale the result down a bit
  • move it upwards to give some space for the parent trunk
  • draw the parent trunk itself
  • rotate all together with the trunk by a given angle (fanOutAngle)
  • repeat for all the children (the current number n from the list)

That's it. In order to understand recursion, one needs to first understand recursion.

The fanOutAngle is a little helper function which finds how much to rotate a branch with index i, so that the total of n branches are evenly distributed inside a cone. For example, that's how it looks for fanOutAngle(cone,1,6):

Drawing a bud

We've got to tackle the situation when a number becomes too big, and drawing it as a fan of branches produces too cluttered image.

The solution is to draw it as a "bud" using a Fermat spiral layout. This both gives a nice evenly looking distribution and actually does happen in nature (the so-called phyllotaxis, or an arrangement of leaves in e.g. sunflowers), so it's a good plant metaphor.

Code-wise, the change is minimal. We introduce an alternative layout function that instead of branch-fan-out layout does the Fermat Spiral one, and switch between two based on a number threshold:

goldenPhi: Float 
goldenPhi = 137.5/180.0*pi
 
-- draws a subtree recursively,
-- according to the list of integer factors
subTree: Float -> List Int -> Form 
subTree cone s =
  case of
    n::xs ->
      let fork = \->
            [ trunk
            , subTree (cone*0.95) xs
                 |> scale 0.8
                 |> moveY 80
            ]
            |> group
            |> rotate (fanOutAngle cone k n)
          bud = \->
            [subTree cone xs
                |> moveY (sqrt(goldenPhi*(toFloat k))*3)
                |> scale 0.4
            ]
            |> group
            |> rotate (goldenPhi*(toFloat k))
            |> moveY 30
      in
      [0..(n - 1)]
      |> List.map (if < 17 then fork else bud)
      |> group
    _ -> fruit
 
main: Element 
main = collage 400 300 [subTree 100 [1, 2, 3, 79|> moveY -150]

Prime number decomposition

Now let's step aside from drawing for a moment and look at the "data model" of what we draw.

The used algorithm for prime factorization is simple and rather inefficient, but it does the job just fine:

  • divide by 2 until it divides (gathering the corresponding 2s as factors)
  • continue from trying to divide by 3 and upwards, with step 2
  • repeat until the current divisor is bigger than square root of the original number

There is also a bit of code to format the description string for the prime decomposition:

import Text
import String
import Color
import Graphics.Element exposing (..)
 
primeFactors: Int -> Int -> List Int 
primeFactors n s =
  if | s*> n -> [n]
     | n%== 0 -> s::(primeFactors (n//s) s)
     | s == 2 -> primeFactors n (s + 1)
     | otherwise -> primeFactors n (s + 2)
 
primes: Int -> List Int 
primes = 1::(primeFactors n 2)
 
primeDescription: Int -> List Int -> Element 
primeDescription n factors =
    let fstr = factors
        |> List.drop 1
        |> List.map toString
        |> String.join "x"
    in
      toString n ++
         (if List.length factors == 2
          then " - PRIME!"
          else " = " ++ fstr)
        |> Text.fromString
        |> centered
 
main =
  flow down (List.map (\-> primeDescription k (primes k)) [2..99])

which displays

2 - PRIME!
3 - PRIME!
4 = 2x2
5 - PRIME!
6 = 2x3
7 - PRIME!
8 = 2x2x2
9 = 3x3
10 = 2x5
11 - PRIME!
12 = 2x2x3
...

(by the way, one is not a prime number, but we want that one in the list anyway)

Now, we can get all of the earlier code together and display 42, decomposed into prime factors:

primeView: Int -> Element 
primeView =
  let factors = primes n
  in
    flow down
    [ primeDescription n factors |> width  400
    , collage 400 300 [subTree 100 factors |> moveY -150]
    ]
 
main: Element 
main = primeView 42

Signals plumbing

So far we've been producing a static image as the program output. Now, getting to the actual "Functional Reactive Programming" part.

We'd like to do at least a few things:

  • animate the current number, starting from 2 upwards, with a certain delay between the consecutive numbers
  • allow to pause the animation with space
  • allow to step the current number forward/backward with the arrow buttons
  • have the graphics adjust to the current window size

This is done via "signals" in Elm - streams of data (such as time and user input) that get transformed, filtered, merged etc. And in the end, the result of the whole program is still in a sense a static image, or rather a sequence of them, built as a function of these signals.

The signals we employ:

  • Time.every - fires periodically, with specified period
  • Keyboard.arrows - fires whenever arrow buttons are pressed
  • Keyboard.presses - general "button pressed" signal (to know when SPACE is pressed)
  • Window.dimensions - when window size is changed

Essentially, our plant image gets updated correspondingly whenever any of the signals gets fired:

The code looks like:

primeView: (IntBool-> (IntInt-> Element 
primeView (n, paused) (w, h) =
  -- ...
  -- mostly as before, only dependent on "paused" and window dimensions now
 
-- the stream of updates from timer and keyboard
type Update = Tick Float | Arrows {x: Int, y:Int| Press Keyboard.KeyCode
updates: Signal Update 
updates =
  mergeMany
  [ map Tick (Time.every (Time.second*3))
  , map Arrows Keyboard.arrows
  , map Press Keyboard.presses
  ]
 
-- evaluate the current number and paused status based on the updates stream
foldUpdates: Update -> (IntBool-> (IntBool) 
foldUpdates update (i, paused) =
  case update of
    Arrows {x, y} -> (max 2 ((i + x + y)%1000), paused)
    Press 32 -> (i, not paused)
    Tick _ -> ((i + (if paused then 0 else 1))%1000, paused)
    _ -> (i, paused)
 
 
main: Signal Element 
main =
  primeView
  <~ foldp foldUpdates (2, False) updates
   ~ Window.dimensions

The first three signals get merged into a single signal updates via Signal.mergeMany function. A discriminated union Update is used to identify which event in the merged signal belongs to which original signal.

Then, this merged signal is threaded through Signal.foldp, which in a sense allows to maintain a current "state", represented by tuple (i, paused), where i is the current number to display, and paused tells whether the animation is currently paused.

So after the foldp we get a signal which is a sequence of states (i, paused) to display, together with Window.dimensions signal it gets mapped into the primeView function (it has a corresponding signature):

Other tidbits

A couple of other things to mention.

One is that the we don't really want to have all these magic numbers in the code, so it's better to have them as named "constants" (which is a kind of oxymoron in our case, since in Elm everything is, you know, "constant").

In the demo code I grouped all these "tweakable constants" into a record, having in mind trying to later expose them to JavaScript via ports, so we could have an additional "tweakables" signal coming from an external JavaScript part, using e.g. dat.GUI.

Another thing is the fruit coloring: rotating hue by some constant factor depending on the branch index gives some nice visual variety:

So there we go: the beauty of mathematics combined with the beauty of functional (reactive) programming. All in your web-browser.

10/10 will try Elm again.