Architectural overview
The (new) CastleCompiler(s) share a common “pipeline” architecture. Which is flexible, both in functionality and implementation, as they share the AIGR –see below – for input and/or output.
AIGR pipeline
The pipeline starts with source, in text format, that is explored by the Readers [ToDo] and translated in the
Abstract Intermediate Graph Representation. The next components all read this format, like the Transformers, that transform it to a “better”
form – see later. Last, the AIGR is converted into a binary by (one of the) the Backends [ToDo]
Most Backends [ToDo] consist of two main parts: the Writers [ToDo] (which is part of CCastle), and a Translator: an
external compiler that translate/compiles the generated intermediate code into a binary.
As the AIGR is a format (not a call-interface!), this architecture gives flexibility on deployment of the components.
A simple (Castle)compiler can hold them as (plug-in) libraries in one process.
Alternatively, each component can be a process, where te several processes communicate with e.g. unix-pipes, or
network-connections). And there are many more option, not only separated in space, but also in time: As the AIGR can be
serialised [1], it is possible to save it in file, and read it for the next component, later …
Important
Although it is possible to saving a (pickled) AIGR, that component (nor action) is NOT a Writer!
One should typically speak about “saving” the AIGR, and “loading the (AIGR) file”. It is a feature of the The AIGR auxiliary component (see below).
The Reader(s)
A typical reader reads (some) source-files and then translate that, in a few steps, into the Abstract Intermediate Graph Representation, as
shown below.
The mockReader [ToDo] is different: it does output (needed) ‘AIGR’, and so can act as the starts of a
pipeline (and therefor considered as a ‘Reader’), but has no input.
Some sub-components in the ‘Reader’ may also work on the ‘AIGR’, as shown. The difference (to a ‘Translator’) is
simple: the ‘Reader’ should do all error-checking, etc, to make sure the inputs (so the code of the developer) is
valid. A normal Translator (nor the ‘Backend’) should ever find errors.
When implementing that (Reader) functionality is more convenient as after converting the ATS into the AIGR an “AIGR-analyser” is build.
Transformers
All ‘Transformers’ receive and post (a “dialect” of) the`AIGR`, pushing it into a form that van be handled by the backend to create an efficient binary. There can be many Transformers, and typically several of them are run in sequence. Other sets of Transformers exclude each other.
A Transformer is often triggered by one of the Rewriters (TODO)
We show two examples.
FSM
In Castle, one can directly describe a FSM (see: FSMs are needed) including advance/extended variants. Like
the non-deterministic “NFA”s, and the “State-Charts” (known from UML), with orthogonale regions and hierarchically
‘superstates’. See Castle has generic FSM synt... (U_FSM_Syntax) for the demands.
Those FSM are initially stored “asis” in the AIGR, and step-by-step rewritten by several FSM-Transformers.
The FSM.NFA_2_FSM
Transformer reworks a NFA into a (bigger, deterministic) FSM. This is a well know algorithm, such
that non-deterministic edges and epsilon-transitions are gone.
The resulting AIGR has the same functionality, but is simpler to translate into a binary
Similarly, the FSM.SuperStates
Transformer can “flatten” complex hierarchical FSM’s into ones that are easier to
translate into executable code.
This is an examples of set of Transformers that can work collectively. First, remove the non-determinisms, then handle the SuperStates and completely transformer the (simple) FSM into regulair routines.
Machinery
Caution
The Machinery part is still in development. And so, it’s not sure that the Machinery will be implemented as a Transformer!
‘The Machinery (ToDo)’ is an abstraction of the technology to connect ports and send data (like events) over them. Several implementations are possible, like direct function-calls, dispatching to concurent thread-pools, or distributing them over a network.
Typically, one wil only use one Machinery: connecting two port with DDS, ‘sending’ an event by dispatching it whereas the receiving event-handler expect a traditional call, will not work. By choosing one Machinery-Transformer, all bolt-and nuts will fit.
Caution
This is not an requirement!
One can imagine, that (eventually) a Mixed-Machinery is used. Ultimately, only the details of each (individual) connection should be aligned.
Advanced example
The Machinery-kind can be seen an attribute of the (super)component that holds the connections (and
sub-component, with ports). By using that Machinery for those connections, it will work.
But the (external) ports and connections of that super(component) can use another Machinery; when the
supper-super-component’s machinery-kind attribute has another value.
Again, this makes it complicated. But it gives flexibility: for deep-down connections we might prefer direct calls, but use concurent options at a (bit) higher level. And use maximal decoupling for networking-applications.
Note
Here, we see an example of having the “connected/concurent components” abstraction and ‘The Machinery (ToDo)’ abstraction.
The CCastle code, use “components, ports and connections” only. Later, we compiling it, the details of the
Machinery are added.
And by implementing it in/as a Transformer we can add more-and-more advanced options without the need to change
the source. Only some (global) “compiler options” have to be improved.
(many) sor
Writers & Backends
The ‘Backends’ read the (simplified & optimised) AIGR and transform it to a binary that can be executed. Typically this is a two-step approach: A ‘Writer’ renders the AIGR into a low-level intermediate [2] language [3]
The interface between the ‘Writer’ and the ‘Translator’ is typically file-based, and depend heavily on the chosesn
‘Translator’ – which is not part of CCastle.
As those two are very depending on each other, there is little commonality between various Backend variants.
Some examples
The rPY: Use (r)Python as backend backend/writer renders to RPython, such that (PyY’s) rpython-translator can handle it. The intermediate file-format is fully described by RPython: the rPY: Use (r)Python as backend Writer needs to emit exactly that format.
CC2Cpy (now defunct [4]) generates standard C code, that can be translated into binaries by many (standard, C) compilers. So, it’s a bit more generic (then RPy), but still the writer is limited to C – and so has to emulate namespaces, as that isn’t handled in C
A possible variant is using C++, both as lll and translator (but as I’m not a a fan of it, somebody else has to make itBoth mentioned writers are implemented in python, for now (that is the ‘py’ part of CC2Cpy).
Future variants of those Writers will be implemented in Castle itself. This does not change the input (AIGR) not the output of the Writers (rpython and C). And such we can use easily upgrade the Backend as the Translator does not change.
The AIGR auxiliary component
The AIGR auxiliary component describes & handles the ‘Abstract Intermediate Graph Representation’ and is used by all regulair components. Sometimes it (the AIGR) is called an “intermediate language” (‘IM’), or “intermediate representation” (‘IR’). Many existing IMs are quite “flat”, low-level and very operational, like RTL or SSA – they are great to convert code to assembly.
For the CCastle Workshop Tools, a more abstract representation is chosen, with more structure. Visually, it resembles
a tree, but without the need to have a single root (making it a “forest”), and with interconnects (making it a
graph). Structurally, it is not dissimular to the XLM/DOM, known by many webpages; but again without the “single
document-root” – the DOM has interconnects, known as “links” (a term not used in the AIGR).
The AIGR reminds also to the AST (of CCastle), after all, each language construct is “stored” in the AIGR. Some see it as a
semantically parsed AST. A namedID (like a variable, or function) in the source, when in the same namespace denotes the
same artifact – even it is mention at several places (and can have aliases). In the AIGR it is the same ‘node’ having
multiple incoming ‘edges’ – and so, violates the tree’s non-cycle rule.
The AIGR-component describes all possible elements, and the relations (so it is a bit like the XML DTD or Schema). And has (will have) general routines to facilitate handling the AIGR.
By example, you can expect routines to “save” an AIGR to file, and “load” is later.
Warning
Although the AIGR is a graph, the AIGR-component will not be able to visualize that graph.
Other workshop tools may do that, and probably use the AIGR-component to read it. The visualising is part of that tool. For the example above an manual conversion to plantUML is made.
Currently, the AIGR is in the design phase, and may change.
For that reason, only a Python dataclass reference model is available (Work-in-Progress). The (unit & behaviour) tests
and TesDoubles make it quite understandable. Once, it will be fully documented (and versioned) And available for
multiple languages (including Castle :-)
Footnotes
Comments
comments powered by Disqus