FSMs are needed

Finit State Machines (FSMs) are great to model behaviour and control flow. Probably it is one of the most used design patterns; some developers are not even aware they are using it (when using the State pattern). And non of the well-known system-programming-languages does support it directly – it’s a shame;-)

This leads to sub-optimal, often hard to maintain code. In Castle, you can use define a FSM directly. Let’s see why that is essential.

FSMs, an intro

The well known State pattern is basically an FSM. It has a finit (!) number of states, inputs (often called events) and rules; the latter define the next state, which only depend on the current state and the current input. Optionally, but typically for software: there are also actions: code that is executed on a state-transition or the entry/exit of a state.

With a FSM one can model simple “automata”. It does not have real memory –it does not remember how it gets in a state– but does act depending on this internal state. Therefore a FSM can act differently on a second button-push: With the first push it goes to a state as <pressed-once>, and with the second one to <pressed-twice>. And we can even define a states as <pressed-three-times, <pressed-four-times, etc; but (a given) FSM has always a limited number of states.
As soon we add variable (eg counter) to the code; it’s no longer a FSM!

Conceptually coding a FSM is simple; but in practise it is troublesome to make it solid and maintainable. Typically the result is a lot of code. Even a trivial FSM with only 3 states, 3 events, and a single function-call on all possible transactions (but no entry/leave actions) will give about 30 lines. See the UML-diagram and (python) code examples in the side-bar, for a first impression.
This straightforward implementation used a nested switch: for every \(S\) states we divert for every \(E\) events; where we need 2 lines: one to update the state and one function-call. This leads to \(2*S*E\) lines. Or bit less when using a compact notation (although the lines become longer).
When one event is added, we have to update at \(E\) distinct locations.

Some use a revered approach: fist switch on the event, then on the state. This helps when need to add one event; as we only have to add \(2*S\) lines in 1 place – but fails when we need to add a state.
Some languages support a “table” approach. Mathematical, the FSM-rules can be given in a table with state and input on the axes. The next state and the transition-action is filled-in in the cell. This results in compact (and easy to maintain) data-structure; but needs some generic code – which is hardly ever generic, as there many variants.
Last & least, there are many “OO” templates 1; where inheritance is used to distribute code over many subclasses and files. It helps at bit, but only limited – as the many available alternatives already show.

Kinds of FSMs

The theory of FSMs is old: it predates modernd computers. With a few memory-cells and some relays (or other combinational logic) one can build an “electronic FSM”. Even there is lot of theory available it is often assuming you are using such electronic-one – or even only mathematical model.
This post will not repeat all that; we give a short overview, focusing on what we need for a SW-FSM and with many (wikipedia) links for more theory.

More vs Mealy

Most FSM-theory is already developed in the 1950ties, by Moore and Mealy; both have a machine named to them. Conceptually, the have equal power –meaning a Moore machine can’t do more as a Mealy machine, nor the other-way around. But the Mealy typically has less states – which can be relevant for SW-Developers.
Hardware-developers typically prefers the Moore machine, as it is safer – for SW this advantage does not exist.

The big distinction is actions. They can depend both on the (current) state and the input – at least in a Mealy machine. In a Moore machine the action may depend only on the state.
In the diagram (and in code) the difference is where the actions are located. When they are “on” the arrows –that is: a combi of the current-state and the input– then it is a Mealy; when they are “in” the state (only) it’s a Moore machine. (as the input does not directly influence the action).

For SW-FSMs we can also differentiate (Moore machines) between Entry and Leave actions; this does not apply in electronics, nor in the old concept – they are not event-driveren (but level-active: there is an output as long as the state is active).
And often, SW-FSMs use both Moore and Mealy kind of actions; which is fine.

FSM vs NFA

Most developers ony know the deterministic FSM: it has only one next state for a given current state and input and can’t switch to another state without an input. There are also non-deterministic FSMs (NFA), however. When a (at least one) input-sequence can lead to more-as-one states, the FSM is a NFA!
Despite that a NFA can’t be realised directly and is mathematically equivalent to a FSM, they have more expressions-power: the table to describe them can be shorter! They are also used to implement regular-expressions

In the NFA-theory the input-sequence is ‘valid’ as there is a path (of internal states) that is valid; the others options are ignored. It kind of magic: we just assume the NFA gambles correctly which path to follow. For mathematics that will do; the practical issues, that we as engineers need to solve, is not there concern.

Transformations

We could be tempted to implement a NFA by using lookahead; but there easier ways as the same theory gives us. It is possible to “transform” (rebuild) a FSM into another kind. We can transform a Mealy into a Moore machine, or back. That is easy: just a bit of mathematics …

The same for a NFA. To implement a NFA, we can transform it into a deterministic FSM first. Again, this uses a well-know (old, 1959) algorithm, that calculates super-sets and constructs a new FSM. That, newly constructed (deterministic) FSM can have up-to \(S**2\) states and lot of rules. It might sound complicated, But, with a lot of patience everybody can do this (manually) by following some simple steps.
Would’t it be great when those transformations can be applied automatical? Then we can describe the FSM in the most conviant way. And the computer will convert it into one that is deterministic and easy to execute 2

Epsilon transitions

Another interesting concept are the \(\epsilon\)-transitions –even less known by programmers. Then we allow rules with state-transition without input. This makes the FSM always a NFA; as we can’t predict when/whether this None input happens.
Again, it’s main use is to make the “FSM-table” shorter – albeit one has to add those \(\epsilon\)-rules into the table!

A curious transition is the “instantly \(\epsilon\)-transition”, where no other rules are allowed (for states that has uses it). As its needs no input, we have designed a NFA. However, as it is the only possibility and the spec is that this \(\epsilon\)-input happens directly, it’s easy to implement in software: when state==C: { state:=N; act_appropriate(); }.
This is often easier then removing that state and combining the actions (one step back in all paths). And quite popular by SW-developers – one of the few places where the practise is smoother then the formal theory.

More energetic FSMs

As described above a non-deterministic NFA has more expression-power then a deterministic FSM and can be converted into a deterministic FSM automatically. This makes designing a FSM more easy. Likewise, there are more relative new know-how to make the job of the (SW) developers easy. We mention a few and link to more theory.

Statecharts

Classically, a FSM has atomic states only. In 198X a OO-variant of was invented called the “statechart diagram”, by David Harel. This is also know as UML-FSM, or “UML statechart”. Aside of being “OO” it has a few smart concepts, that makes defining a FSM easier. We give you the most relevants two.

Hierarchically superstates

When we study (or define) a system we typically start with a few “main statuses”: the system can be off, working halted or in-error, by example. And we change mode with a few event as TurnOn TurnOff, and InteralError. The ‘off’-state is quite clear, but how about ‘on’? And perhaps the are a few steps between off and on.
We can solve this by adding more states In almost all of those of those states, an event as TurnOff can be expected, and in those cases it should result in off –as before.

This make the FSM bigger and so less pleasant to design and maintain. Wouldn’t it be great to a kind generic “on-state” with one generic TurnOff event? And have kind of sub-state of “on-state” that can implement the details?
With hierarchically superstates this is possible. One groups a set of states together and call that group a superstate. One even repeat that process, and add higher-level group – or most developers prefer: zoom in and create “sub” states within a state.

So the on state is kind of split into booting, warming-up and operational. And operational contains do-prog_1, do-prog_2, do-prog_3, ect. And in all cases the event TurnOff will turn-off the system – but you only need to add one rule!

Concurent states (Orthogonal regions)

Concurent states are more complex as super/subs-states. It are kind of local states are active simultaneously. Wikipedia has a nice example with the num-lock and caps-lock key: both change state – one to select capital on/of, and one to prefer arrow over number – but those states are independent and concurent.

Again, the advantage is it makes the “FSM table” shorter and better to maintain.

All kind of actions

For a SW-designer the main difference between a Moore and a Mealy is where to put actions, as described above. But why choose? In software it possible to use both Mealy actions (‘on’ the transition) as well as Moore actions (‘in’ a state). And we can even use ‘Entry’ and ‘Leave’ (or exit – but that has the same 1-character abbreviation).

When using nested-stated we can define all of those kind-of actions on all levels. David Harel was so kind to define the sematics. Such that we now agree on the order in which the rules and actions should be executed, when many are valid. The result is quite simple: see UML-FSM for the “transition execution sequence”.

Castle: Build-in FSM syntax

After all this theory and seeing all options, we call conclude “the best programming-language (for system engineers), ever” needs to have build-in suport for all those options.
We want do program Finit State Machines directly. Not implement the details, but just describe how the FSM (or NFA) should behave; the computer can fill in the details – all theory does exit (for years)!

UseCase: Castle has generic FSM syntax build-in U_FSM_Syntax ../../_images/arrow-right-circle.svg
links incoming: U_FSM_extensions

In Castle one can defines FSM directly in code.

As argued above FSMs and the state (design) pattern are wildly used and there a no excuses to not support that in a modern programming-language. Castle will have syntax this

For the same reason, there is no need to restrict the syntax to one kind of FSM (including the NFA), or prefer one kind of actions above others. Castle will have syntax support for “all” options (unless this conflicts with other language-rules).

Tip

non deterministic rules are not excluded

Having syntax support for non-deterministic NFA rules does not imply the “compiler” will resolves all conflicts (see e.g. [#converted_actions]). But those potential conflicts will not restrict the Castle syntax.

Compare this by “devision by zero”. Everybody know that isn’t possible, but no language (syntax) will disallow it. But a compiler may warn for it.
Castle has the same approach: the language/syntax allow “the generic case”. The SW-developers are responsible to define a sound FSM. And Castle will support her/him by doing trivial transformations, and giving errors/warnings when they cant’ be resolved

UseCase: Castle supports NFA_ and UML-FSM_ extensions U_FSM_extensions ../../_images/arrow-right-circle.svg
links outgoing: U_FSM_Syntax

To clarify Castle has generic FSM synt... (U_FSM_Syntax) even more. Castle will support:

  1. non-deterministic rules

  2. Epsilon transitions

  3. ‘Instantly’ transitions (see above: a special kind of \(\epsilon\)-transition)

  4. Hierarchically superstates (or sub-states)

  5. Concurent states (Orthogonal regions)

  6. Moore and Mealy transitions

  7. Entry and Leave transitions

This list is not restrictive, and may be extended

Tip

Once …

Having syntax-support is meaningless without the proper compile/run-time support. That however is not demand is thise needs. One may expect that early implementations of the Castle-compiler can “parse” all syntax, but ony really compiler (and/or optimise) the easier parts of NFA.
Once, all will be fully supported!


Footnotes

1

The OO-variants are not shown, as even this 3 by 3 FSM will result is many classes, files an even more lines-of code than the trivial one. When the FSM becomes bigger, the OO variants have some advantages; as it scales a bit better. But scatters the FSM-behaviour over many files, and so does hardly solves the problem of understandability.

2

There is still a “small” practical issue we have to consider: converted-actions. When converting we got more and other states and transition. Most algorithm kind-of ignore the actions; but we can’t. We have to convert them to; which is conceptually simple (just move them along with the arrows), but more work (but as we will automatte this anyhow, we can ignore that.
When converting a non-deterministic NFA to a deterministic FSM, we kind of “know” which arrow had to be selected only a bit later, when more inputs become available. Remember, the NFA “gambles” always correctly, but we can’t! A practical machine has wait until enough input is available, before selecting. This implies, worst case, all actions are executed after the final input is processed.

This may be an options, but when (e.g.) real-time behaviour is needed 3 this is not acceptable. Then, the developers has no other options the to converting (partially) by hand (an head).

3

As a typical example: an elevator. Probably we can really optimise the movement of the elevator when we know whether next passengier needs to up or down. With a NFA that is easy: define an up and a down and decide later, on that next button. This will not be practical, at least not for that first passengier (and especially not for the last one that day).
There is no way, we can automatically convert that “optimised” NFA into a practical one. Therefore we need human-creativity.

Comments

comments powered by Disqus