C++ hardware library for (small) 32-bit micro-controllers

Wouter van Ooijen · Dec 4, 2013

(posted to comp.lang.c++, comp.arch.embedded, yahoo-groups/lpc2000)

I am working on a portable C++ hardware library for real-time
applications on small (but still 32-bit) micro-controllers. My typical
(and currently only) target is the Cortex M0 LPC1114FN28 (4k RAM, 32K
Flash) using GCC. Efficiency is very important, in run-time, RAM use,
and ROM use. The typical restrictions for small microcontrollers apply:
no RTTI, no exceptions, and no heap (or at most just an allocate-only heap).

So far I have found nice and efficient abstractions for I/O pins and I/O
ports (using static class templates), implemented those on the LPC1114,
and used these abstractions to implement protocols (I2C, SPI) and
interfaces for some external chips like 74HC595 and MCP23017.

Like any would-be author of a serious piece of work I am looking for an
audience to read, test, criticize (ridicule if necessary), maybe even
contribute, and eventually use my work.

To get an idea of what I want to make, a very initial version can be
found at http://www.voti.nl/hwlib

Any takers?

If this is all too abstract, I have a more concrete question, mainly for
C++ architects. For I/O pins and ports I use class templates with (only)
static functions. This matches well with reality (the hardware circuit
is generally fixed), and is much more efficient than using inheritance
and virtual functions.
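
To make the static-class-template idea concrete, here is a minimal host-side sketch. It is not code from the library: the names fake_gpio_reg, pin_out< bit >, toggle, led and demo are hypothetical, and a plain variable stands in for the memory-mapped GPIO register of a real target.

```cpp
#include <cassert>
#include <cstdint>

// Stand-in for a memory-mapped GPIO data register (assumption: on a real
// target this would be a fixed hardware address, not a variable).
static std::uint32_t fake_gpio_reg = 0;

// A pin is a type, not an object: all members are static, so a pin
// occupies no RAM and every call is visible to the optimizer.
template< unsigned bit >
struct pin_out {
    static void set( bool value ) {
        if( value ) {
            fake_gpio_reg |= ( 1u << bit );
        } else {
            fake_gpio_reg &= ~( 1u << bit );
        }
    }
    static bool get() {
        return ( fake_gpio_reg & ( 1u << bit ) ) != 0;
    }
};

// Code that uses a pin receives the pin type as a template parameter.
template< class pin >
void toggle() {
    pin::set( ! pin::get() );
}

typedef pin_out< 3 > led;   // hypothetical: LED wired to bit 3

void demo() {
    led::set( true );
    assert( led::get() );
    toggle< led >();
    assert( ! led::get() );
}
```

Because the pin is fixed at compile time, toggle< led >() can collapse into a few register operations with no call overhead and no per-pin data.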

However, I am now working on graphics for small LCD screens. At the
lowest level (configuring an LCD driver for the size of the LCD and the
pins used) the static-class-template approach is fine. But at some level
the higher abstractions using the screen (for instance a subscreen,
button, etc.) must be more dynamic and created ad-hoc. So somewhere
(probably at multiple levels in the abstraction tree) I must make a
transition from static class templates to the classic objects in an
inheritance tree with virtual functions. Does anyone have any experience
with such a dual hierarchy? One very stupid but basic problem is naming:
a template and a class can't have the same name, so when the natural name
would for instance be 'subframe', who gets that name, the template, the
class, or neither?

Wouter van Ooijen

Alf P. Steinbach · Dec 4, 2013

{ Multi-posted clc++ and cae, follow ups set to comp.lang.c++ }

At the
lowest level (configuring an LCD driver for the size of the LCD and the
pins used) the static-class-template approach is fine. But at some level
the higher abstractions using the screen (for instance a subscreen,
button, etc.) must be more dynamic and created ad-hoc.

Is it the case that the same /compiled/ software has to be used with
different hardware?

If not then simply using separate compilation (sort of "indirection at
compile time") should suffice for the necessary decoupling, at the cost
of needing a specific build for each supported hardware mix.

You then link with the compiled classes that are specific to the
relevant hardware items.

So somewhere
(probably at multiple levels in the abstraction tree) I must make a
transition from static class templates to the classic objects in an
inheritance tree with virtual functions. Does anyone have any experience
with such a dual hierarchy?

The Java approach where the compiled code is used with various hardware,
is to have an object factory and then using an object with virtual
member functions. That can still be done without a heap. But (1) it
implies needlessly having available all the "drivers" (whatever, name)
that will not be used for this HW, and (2) chances are that you can
avoid this, as outlined above.
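
For readers wondering how a factory can work without a heap, the following is one possible sketch using placement new into static storage. All names (lcd_driver, lcd_small, lcd_large, make_driver) and the widths are made up for illustration and do not come from the thread.

```cpp
#include <cassert>
#include <new>

struct lcd_driver {
    virtual int width() const = 0;
protected:
    ~lcd_driver() {}   // drivers are never destroyed through the base
};
struct lcd_small : lcd_driver {
    int width() const override { return 84; }    // hypothetical size
};
struct lcd_large : lcd_driver {
    int width() const override { return 128; }   // hypothetical size
};

// Static storage, large and aligned enough for any supported driver.
union driver_storage {
    alignas( lcd_small ) unsigned char small_[ sizeof( lcd_small ) ];
    alignas( lcd_large ) unsigned char large_[ sizeof( lcd_large ) ];
};
static driver_storage storage;

// Factory: constructs the selected driver inside the static buffer,
// so no heap is involved.
lcd_driver * make_driver( int hw_id ) {
    if( hw_id == 0 ) {
        return new( &storage ) lcd_small;
    }
    return new( &storage ) lcd_large;
}

void demo() {
    lcd_driver * lcd = make_driver( 1 );
    assert( lcd->width() == 128 );
}
```

Note that both driver classes end up in the binary even though only one is used at run time, which is exactly drawback (1) mentioned above.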

One very stupid but basic problem is naming:
a template and a class can't have the same name, so when the natural name
would for instance be 'subframe', who gets that name, the template, the
class, or neither?

Use namespaces?

But again, you can probably avoid all that.

Cheers & hth.,

- Alf

Wouter van Ooijen · Dec 4, 2013

Is it the case that the same /compiled/ software has to be used with
different hardware?

I am targeting micro-controllers. An application is always compiled for
a specific target, it must be, because there is nothing else on the target.

If not then simply using separate compilation (sort of "indirection at
compile time") should suffice for the necessary decoupling, at the cost
of needing a specific build for each supported hardware mix.

You then link with the compiled classes that are specific to the
relevant hardware items.

That is not my problem. I use static class templates for all things that
are 'fixed', and that approach works perfectly. The problem arises when I
want to use a more traditional (classes, objects, virtual functions)
based abstraction on top of a static class based abstraction.

The Java approach where the compiled code is used with various hardware,
is to have an object factory and then using an object with virtual
member functions.

The static class approach avoids all runtime objects, and exposes
everything to the compiler for optimization. I don't want to spend 8 or
12 RAM bytes for each I/O pin on a 1k RAM chip!

As an illustration (somewhat simplified):

typedef ... port;
kitt< port >();

The above code runs the 'kitt' style back-and-forth scanning on the pins
that comprise the port.

typedef ... port;
kitt< invert< port > >();

The above does the same, but instead of one-led-on at a time it is now
one-led-off at a time. With the template approach this generates the
same amount of code, because the invert< > is optimized 'away'. (It does
not really disappear, but somewhere a MOV becomes a MVN or something
similar.)
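
A minimal sketch of how such an invert< > decorator could look, assuming a port interface with a static set() and an n_pins constant. The names port8, fake_port_reg and kitt_step are hypothetical; kitt_step performs a single step of the scan so it can be checked on a host, whereas a real kitt would loop with a delay.

```cpp
#include <cassert>
#include <cstdint>

static std::uint8_t fake_port_reg = 0;   // simulated 8-bit output latch

// Hypothetical 8-bit port with a static interface.
struct port8 {
    static const unsigned n_pins = 8;
    static void set( std::uint8_t value ) { fake_port_reg = value; }
};

// Decorator: same interface, but every bit written is inverted.
template< class port >
struct invert {
    static const unsigned n_pins = port::n_pins;
    static void set( std::uint8_t value ) {
        port::set( static_cast< std::uint8_t >( ~value ) );
    }
};

// One step of the back-and-forth scan: activate the pin at 'step'.
template< class port >
void kitt_step( unsigned step ) {
    const unsigned period = 2 * ( port::n_pins - 1 );
    unsigned pos = step % period;
    if( pos >= port::n_pins ) {
        pos = period - pos;   // on the way back
    }
    port::set( static_cast< std::uint8_t >( 1u << pos ) );
}

void demo() {
    kitt_step< port8 >( 0 );
    assert( fake_port_reg == 0x01 );   // one LED on
    kitt_step< invert< port8 > >( 0 );
    assert( fake_port_reg == 0xFE );   // one LED off
}
```

Since invert< port >::set() is a trivial static function, the compiler can inline it and fold the bitwise NOT into the code that builds the value.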

Use namespaces?
But again, you can probably avoid all that.

I can definitely not avoid it, but putting the things in different
namespaces is indeed a possibility. I'll think about that.
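
As an illustration of the namespace suggestion, a sketch in which both variants keep the natural name 'subframe'. The namespace names fixed and dynamic, the x_offset parameter and to_parent_x are assumptions made up for this example.

```cpp
#include <cassert>

namespace fixed {
    // Compile-time variant: the offset is a template parameter,
    // so no object (and no RAM) is needed.
    template< int x_offset >
    struct subframe {
        static int to_parent_x( int x ) { return x + x_offset; }
    };
}

namespace dynamic {
    // Run-time variant: the offset is stored in an object and can be
    // chosen while the program runs.
    class subframe {
    public:
        explicit subframe( int x_offset ) : x_offset( x_offset ) {}
        int to_parent_x( int x ) const { return x + x_offset; }
    private:
        int x_offset;
    };
}

void demo() {
    typedef fixed::subframe< 10 > left_margin;
    assert( left_margin::to_parent_x( 5 ) == 15 );

    dynamic::subframe popup( 10 );
    assert( popup.to_parent_x( 5 ) == 15 );
}
```

User code then picks the flavour by qualification (fixed::subframe< ... > or dynamic::subframe), at the price of a slightly longer name at each use.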

Wouter van Ooijen

Clifford Heath · Dec 8, 2013

QT 5.x has ~100M runtime. And it is slow, too. This is price of GUI
portability.

In the late 80's I designed OpenUI, a cross-platform UI toolkit with
interpreter for an embedded language, and asynchronous rich messaging IO
both internally and across the Internet. It ran the new international
trading system of NASDAQ for ten years, initially on 486/66 PCs with 8MB
ram, with Windows 3.11. Equivalent versions ran the exact same apps on
character terminals, X11/Motif, Macs, OS/2, Windows NT, etc, on large
enterprises around the world, including some of the first Internet (pre
web-browser) banking and share trading apps.

The entire engine fit in 1MB, even though it was written in C++.

Not looking so smart now, are you Vladimir?

A tool is only as good as the people who use it. Especially a sharp tool.

Wouter van Ooijen · Dec 8, 2013

I am working on a portable C++ hardware library

1. Portable hardware is myth.

I have a lot of hardware that I can carry if I want to

Seriously (I am not sure you are, but I'll be), a lot of
hardware-related code can be portable, except for the frustrating aspect
of accessing the I/O pins (and dealing with timing).

2. Universal solutions and modular designs don't work.

I don't think any comment is needed.

3. Trying to cover everything instead of doing particular task is waste
of time and effort.

First part is true, second part is nonsense, otherwise no libraries
would exist or be used. Library design is the art of balancing between
doing everything and doing a specific task well.

Useless. Abstract not I/O operation but function that it does.

#define RED_LED_ON SETBIT(PORTA, 7)

I've been there, check for instance http://www.voti.nl/rfm73/. It works
up to a point. It runs into problems

- when setting a pin involves more than a simple operation (BTW, PORTA
hints that you use a PIC, in which case this is plain wrong due to the
RMW problem!)

- when you need more than one instance of your library (like interfacing
to two identical radio modules)

- it uses macros, which are evil (according to some they are THE evil)

How about life without C++ ?

Been there, ran up against the limitations of an assembler, even wrote a
compiler. Happy with C++ now. A pity concepts did not make it (yet).

QT 5.x has ~100M runtime. And it is slow, too. This is price of GUI
portability.

That's not the kind of GUI I am targeting. I'm mainly into
microcontrollers, the thingies that count their memory in kilobytes, not
megabytes.

Accept whatever style and stick to it. It doesn't matter as long as you
are consistently following your design rules.

The two styles I mention (static class templates and (traditional)
classes with virtual functions) are both needed to get a balance between
(code and data) size and run-time flexibility.

(to the rest of the world: sorry if I am feeding a troll)

Wouter

Jorgen Grahn · Dec 8, 2013

["Followup-To:" header set to comp.lang.c++.]

On Sun, 2013-12-08, Wouter van Ooijen wrote:

(attributions lost, not my fault)

I have a lot of hardware that I can carry if I want to

Seriously (I am not sure you are, but I'll be), a lot of
hardware-related code can be portable, except for the frustrating aspect
of accessing the I/O pins (and dealing with timing).

I don't think any comment is needed.

First part is true, second part is nonsense, otherwise no libraries
would exist or be used. Library design is the art of balancing between
doing everything and doing a specific task well.

Yes, but it's a difficult art, and too many people do it badly. I hope
that was what Vladimir(?) tried to say.

I used to do it -- badly -- but nowadays I try to fit my code to the
design I'm working on, in an elegant way if possible. When I've done
similar things in two or three different projects, I stop to see if it
makes sense to split it out into a library. At that point I have real
world experience.

How this applies to you I cannot tell. Perhaps you've seen enough
different hardware already so you can tell what's the common metaphor
for most of it.

/Jorgen

Clifford Heath · Dec 8, 2013

What are you arguing to?
What point are you trying to make?

Problems reading your own words?

You wrote: "This is the price of GUI portability".
You were wrong. Your example showed the price of using Qt,
a price which, incidentally, is incredibly damaging and
may explain why Nokia is in decline - so I agree.

It is however not the fault of either C++ or of GUI portability, but of
bad design.

John Devereux · Dec 8, 2013

Wouter van Ooijen said:
I have a lot of hardware that I can carry if I want to

Seriously (I am not sure you are, but I'll be), a lot of
hardware-related code can be portable, except for the frustrating
aspect of accessing the I/O pins (and dealing with timing).

I don't think any comment is needed.

First part is true, second part is nonsense, otherwise no libraries
would exist or be used. Library design is the art of balancing between
doing everything and doing a specific task well.

I've been there, check for instance http://www.voti.nl/rfm73/. It
works up to a point. It runs into problems

- when setting a pin involves more than a simple operation (BTW, PORTA
hints that you use a PIC, in which case this is plain wrong due to the
RMW problem!)

- when you need more than one instance of your library (like
interfacing to two identical radio modules)

- it uses macros, which are evil (according to some they are THE
evil)

So far I have indeed done this using C macros. I can define them using
standardized names so I can end up doing

/* this header specific to each target architecture */
#include "pio.h"

/* now can use generic versions of commands */

#define RED_LED PIO(A,1)
#define SERIAL_DATA PIO(A,2)

pio_out(RED_LED, 1);
pio_dd(SERIAL_DATA, 0); /* bidirectional data pin */

....And so forth.

When it involves more than a simple operation, I find you end up having
to do things in an application-specific way anyway. So to take your
example of an SPI driven I/O expander, usually I would update this
periodically rather than every time the application writes a bit. Yes,
your mileage may vary but that is the point, you are likely to end up
having to rewrite it each time in reality.
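
The "update periodically" idea can be sketched as a shadow register that is only sent to the chip on flush(). This is an illustration under assumptions, written in C++ rather than with C macros: fake_spi, buffered_expander, transfers and chip_outputs are hypothetical names, and the SPI write is simulated.

```cpp
#include <cassert>
#include <cstdint>

static unsigned transfers = 0;          // counts simulated bus writes
static std::uint8_t chip_outputs = 0;   // what the chip currently drives

// Stand-in for a real SPI write to the I/O expander.
struct fake_spi {
    static void write( std::uint8_t value ) {
        chip_outputs = value;
        ++transfers;
    }
};

// Writes only touch a RAM shadow copy; flush() sends it to the chip,
// and only when something changed since the previous flush.
template< class bus >
struct buffered_expander {
    static void set( unsigned pin, bool value ) {
        const std::uint8_t mask = static_cast< std::uint8_t >( 1u << pin );
        const std::uint8_t next = value
            ? static_cast< std::uint8_t >( shadow() | mask )
            : static_cast< std::uint8_t >( shadow() & ~mask );
        if( next != shadow() ) {
            shadow() = next;
            dirty() = true;
        }
    }
    static void flush() {
        if( dirty() ) {
            bus::write( shadow() );
            dirty() = false;
        }
    }
private:
    // Function-local statics avoid out-of-class member definitions.
    static std::uint8_t & shadow() { static std::uint8_t s = 0; return s; }
    static bool & dirty() { static bool d = false; return d; }
};

void demo() {
    typedef buffered_expander< fake_spi > expander;
    expander::set( 0, true );
    expander::set( 3, true );
    assert( transfers == 0 );        // nothing sent yet
    expander::flush();               // e.g. called from a periodic tick
    assert( transfers == 1 );
    assert( chip_outputs == 0x09 );
}
```

Whether flush() is called from a timer tick or at the end of a main-loop pass is the application-specific part.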

[...]

The two styles I mention (static class templates and (traditional)
classes with virtual functions) are both needed to get a balance
between (code and data) size and run-time flexibility.

I found it an interesting idea, and thanks for posting it.

But a bit daunting to comprehend unless one is very well versed in C++
template metaprogramming, which I am not. I am reading the latest "The
C++ Programming Language" (C++11 based). So I will look again after
that.

Öö Tiib · Dec 8, 2013

However, I am now working on graphics for small LCD screens. At the
lowest level (configuring an LCD driver for the size of the LCD and the
pins used) the static-class-template approach is fine.

Yes, it is fine for a lot of cases and not only for embedded systems.

But at some level
the higher abstractions using the screen (for instance a subscreen,
button, etc.) must be more dynamic and created ad-hoc. So somewhere
(probably at multiple levels in the abstraction tree) I must make a
transition from static class templates to the classic objects in an
inheritance tree with virtual functions. Does anyone have any experience
with such a dual hierarchy?

The virtual functions are nothing magical. If there is a long switch-case
or if-else-if chain in code then that is typically indicating missing
dynamic polymorphism. The compilers implement it quite efficiently so
usually virtual functions outperform such chains. There are several
patterns how to mix the static polymorphism (templates/overloads) and
dynamic polymorphism (virtuals/callbacks).
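
One such pattern, sketched here under assumptions, is an adapter template that wraps a static pin type in an object implementing a virtual interface (a simple form of type erasure). The names pin_interface, pin_adapter and blink_once are hypothetical, and gpio_1_0 is given a simulated body only so the example is self-contained.

```cpp
#include <cassert>

// Static side: a pin is a type with static functions (no object, no RAM).
static bool fake_pin_level = false;
struct gpio_1_0 {
    static void set( bool v ) { fake_pin_level = v; }
    static bool get() { return fake_pin_level; }
};

// Dynamic side: a classic interface with virtual functions.
class pin_interface {
public:
    virtual void set( bool v ) = 0;
    virtual bool get() = 0;
protected:
    ~pin_interface() {}   // never destroyed through the interface
};

// Bridge: turns any static pin type into an object with that interface.
template< class static_pin >
class pin_adapter : public pin_interface {
public:
    void set( bool v ) override { static_pin::set( v ); }
    bool get() override { return static_pin::get(); }
};

// Run-time code sees only the interface, so a single compiled copy of
// this function serves every pin.
void blink_once( pin_interface & p ) {
    p.set( true );
    p.set( false );
}

void demo() {
    pin_adapter< gpio_1_0 > led;   // small object holding a vtable pointer
    blink_once( led );
    assert( ! gpio_1_0::get() );
}
```

The cost is paid only where the transition is made: code below the adapter stays fully static, and code above it pays one virtual call per access.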

One very stupid but basic problem is naming:
a template and a class can't have the same name, so when the natural name
would for instance be 'subframe', who gets that name, the template, the
class, or neither?

Hard to understand what is asked here. The name should fit with what the
thing does. So if it is a base class that contains only a virtual function
'move' then it is 'movable', not 'subframe'.

Wouter van Ooijen · Dec 8, 2013

The two styles I mention (static class templates and (traditional)

I found it an interesting idea, and thanks for posting it.

But a bit daunting to comprehend unless one is very well versed in C++
template metaprogramming, which I am not. I am reading the latest "The
C++ Programming Language" (C++11 based). So I will look again after
that.

There is very little metaprogramming involved in the basics. My interface
for an open-collector/drain input/output pin is (leaving out
initialization and type identification) just

struct pin_out {
    static void set( bool );
    static bool get();
};

Again leaving out some details, the template for a PCF8574 (8-bit I2C
I/O expander) takes 2 such pins (SCL and SDA) and provides 8 such pins.

template< class scl, class sda >
struct pcf8574 {
    typedef ... pin0;
    ...
    typedef ... pin7;
};

So if I want to connect one PCF8574 to two pins of the micro-controller,
and a second one to two pins of the first one, I declare

typedef pcf8574< target::pin_0_4, target::pin_0_5 > chip1;
typedef pcf8574< chip1::pin0, chip1::pin1 > chip2;

Now I can use the pins on chip2 (e.g. chip2::pin0::set(1)) and each use
will be written to the pin via the cascaded I2C busses.

No metaprogramming, just straightforward templates.
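
A much simplified host-side sketch of such cascading, to show that it needs nothing beyond ordinary templates. It is not the real PCF8574 code: expander stands in for pcf8574, target_scl and target_sda stand in for the target pins, only two of the eight pins are exposed, and the I2C framing (start, address, ack, stop) is left out so that a write is just eight clocked bits.

```cpp
#include <cassert>
#include <cstdint>

static unsigned target_clock_pulses = 0;

// Simulated micro-controller pins (assumed names).
struct target_scl {
    static void set( bool v ) { if( v ) { ++target_clock_pulses; } }
};
struct target_sda {
    static void set( bool ) {}
};

// Takes two pins, provides pins with the same static interface.
template< class scl, class sda >
struct expander {
private:
    // One shadow register per instantiation, i.e. per chip.
    static std::uint8_t & shadow() { static std::uint8_t s = 0; return s; }
    // Shift the value out MSB first, one clock pulse per bit.
    static void write( std::uint8_t value ) {
        for( int i = 7; i >= 0; --i ) {
            sda::set( ( ( value >> i ) & 1 ) != 0 );
            scl::set( true );
            scl::set( false );
        }
    }
public:
    template< unsigned n >
    struct pin {
        static void set( bool v ) {
            std::uint8_t & s = shadow();
            s = static_cast< std::uint8_t >(
                v ? ( s | ( 1u << n ) ) : ( s & ~( 1u << n ) ) );
            write( s );
        }
    };
    typedef pin< 0 > pin0;
    typedef pin< 1 > pin1;
};

typedef expander< target_scl, target_sda > chip1;
typedef expander< chip1::pin0, chip1::pin1 > chip2;

void demo() {
    chip1::pin0::set( true );    // 8 clock pulses on the target
    assert( target_clock_pulses == 8u );
    target_clock_pulses = 0;
    chip2::pin0::set( true );    // every chip2 bit costs 3 writes on chip1
    assert( target_clock_pulses == 8u * 3u * 8u );
}
```

The pulse count also makes the price of cascading visible: in this simplified model one write to chip2 turns into 24 complete writes to chip1.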

Wouter

Wouter van Ooijen · Dec 8, 2013

There are several
patterns how to mix the static polymorphism (templates/overloads) and
dynamic polymorphism (virtuals/callbacks).

Can you name a few? It is difficult to google if one doesn't know the terms.

Hard to understand what is asked here. The name should fit with what the
thing does. So if it is base class that contains only virtual function
'move' then it is 'movable', not 'subframe'.

That's understood, but when I have both a template and a normal class
that fulfill the same task (but using compile time versus run time
mechanisms) how should each be named?

Wouter

Stefan Reuther · Dec 8, 2013

Hi,

I am working on a portable C++ hardware library for real-time
applications on small (but still 32-bit) micro-controllers. My typical
(and current only) target is the Cortex M0 LPC1114FN28 (4k RAM, 32K
Flash) using GCC. Efficiency is very important, in run-time, RAM use,
and ROM use. The typical restrictions for small microcontrollers apply:
no RTTI, no exceptions, and no heap (or at most just an allocate-only
heap).

I've been using an implementation using classes with virtual functions
for that, in programs from bootloaders to application programs (well,
actually I did a little template magic to implement vtbls "by hand", so
I can control when and how things are constructed). But effectively, I
have classes
class InputPin { public: virtual int get() = 0; };
class OutputPin { public: virtual void set(int) = 0; };
and their descendants.

Your fully template-based approach looks neat and appropriate for things
with as little "meat" as an I/O pin, but for more complicated things
like "SPI transaction", "parallel NOR flash", "NAND flash" I'd like to
know where in the object files my code ends up.

Plus, an I/O pin may end up to be more than just read-bit-from-register:
for applications that occasionally read a pin, I've got an
implementation of the InputPin interface that performs a remote
procedure call into the driver, saving the application from having to
map physical memory.

If this is all too abstract, I have a more concrete question, mainly for
C++ architects. For I/O pins and ports I use class templates with (only)
static functions. This matches well with reality (the hardware circuit
is generally fixed), and is much more efficient than using inheritance
and virtual functions.

"The hardware circuit is generally fixed" is one of the biggest lies of
embedded software development

At least I didn't yet encounter a project where hardware assignments
didn't change over time. Pins get moved, get inverted, new flash chip,
etc. So it's good I'm able to adapt by changing a (runtime) initialisation.

I'm paying one virtual dispatch per access. So, I wouldn't want to do
bit-banged SPI or IIC with my drivers. Thank god I don't have to

It's probably not appropriate for 8-bitters, but it's efficient enough
to be useful in production bootloaders with a few k code.

Stefan

Öö Tiib · Dec 8, 2013

Can you name a few? It is difficult to google if one doesn't know the terms.

Terms feel often self-coined. More often I have seen used "manifest
contracts", "hybrid types" and "gradual types".

That's understood, but when I have both a template and a normal class
that fulfill the same task (but using compile time versus run time
mechanisms) how should each be named?

I do not understand why both are needed side by side in conditions
where any overhead is expensive (like an embedded system). If the
compiler knows the type of an object then it does not generate virtual calls
but ordinary calls (even though the function is declared 'virtual'). Perhaps
you should give an example.
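
A small sketch of the situation described, with hypothetical names (pin_base, pin_high, read_known, read_unknown). Whether a particular call is really devirtualized depends on the compiler and optimization level, so the comments state what the compiler is allowed to do, not what it is guaranteed to do.

```cpp
#include <cassert>

struct pin_base {
    virtual int get() { return 0; }
};

// 'final' tells the compiler that nothing can override get() further.
struct pin_high final : pin_base {
    int get() override { return 1; }
};

// The static type is the concrete, final class: the compiler may call
// pin_high::get directly and can inline it.
int read_known( pin_high & p ) { return p.get(); }

// Only the base class is known here: in general this goes through the
// vtable, unless the optimizer can prove the dynamic type.
int read_unknown( pin_base & p ) { return p.get(); }

void demo() {
    pin_high p;
    assert( read_known( p ) == 1 );
    assert( read_unknown( p ) == 1 );
}
```

Even when the call itself is resolved statically, the object still carries its vtable pointer, so the RAM cost per object does not go away.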

Wouter van Ooijen · Dec 8, 2013

Your fully template-based approach looks neat and appropriate for things
with as little "meat" as an I/O pin, but for more complicated things
like "SPI transaction", "parallel NOR flash", "NAND flash" I'd like to
know where in the object files my code ends up.

You mean that it is a problem that you can't see which parts use how
much ROM?

Plus, an I/O pin may end up to be more than just read-bit-from-register:
for applications that occasionally read a pin, I've got an
implementation of the InputPin interface that performs a remote
procedure call into the driver, saving the application from having to
map physical memory.

I don't see why a templatized implementation could not do that too?

"The hardware circuit is generally fixed" is one of the biggest lies of
embedded software development

At least I didn't yet encounter a project where hardware assignments
didn't change over time. Pins get moved, get inverted, new flash chip,
etc. So it's good I'm able to adapt by changing a (runtime) initialisation.

Would it be a problem to re-compile?

I'm paying one virtual dispatch per access. So, I wouldn't want to do
bit-banged SPI or IIC with my drivers. Thank god I don't have to
It's probably not appropriate for 8-bitters, but it's efficient enough
to be useful in production bootloaders with a few k code.

I started on such a class/object/vtable approach 2 years ago, but found
a number of problems:

- it is slow for very simple operations
- it hinders optimization
- it requires an object, which takes RAM, of which there is precious
little on small micro-controllers
- compilers seem to be bad at eliminating virtual functions that are
never used

About optimization and speed: suppose I have a LED on a certain pin:

typedef gpio_1_0 LED;
LED::set( 0 ); // LED off

OOPs, the LED is connected to Vcc, not to ground! So I change this to

typedef invert< gpio_1_0 > LED;
LED::set( 0 ); // LED off

With the template approach the second version generates exactly the same
amount of code, and is exactly as fast, as the first version.

Wouter

Wouter van Ooijen · Dec 9, 2013

Hey gasbag

Who are you addressing?
I did not notice anyone with that name in the discussion.

Wouter

Ian Collins · Dec 9, 2013

Wouter said:
Who are you addressing?
I did not notice anyone with that name in the discussion.

Pot, kettle?

Please don't snip attributions!

Stefan Reuther · Dec 9, 2013

Wouter said:
You mean that it is a problem that you can't see which parts use how
much ROM?

Yep. It probably depends a lot on the toolchain and its configuration,
but having two-pass template compilation and linkonce sections doesn't
actually make writing linker script files easier unless you can get the
compiler to inline everything.

I don't see why a templatized implementation could not do that too?

It could do that, but assuming that the RPC operation is more than just
dereference-a-pointer-and-set-a-bit, it would risk duplicating more
code, see below.

Would it be a problem to re-compile?

Yes. Our software usually has to run on half a dozen board versions at
least.

I started on such a class/object/vtable approach 2 years ago, but found
a number of problems:

- it is slow for very simple operations
- it hinders optimization
- it requires an object, which takes RAM, of which there is precious
little on small micro-controllers
- compilers seem to be bad at eliminating virtual functions that are
never used

About optimization and speed: suppose I have a LED on a certain pin:

typedef gpio_1_0 LED;
LED::set( 0 ); // LED off

OOPs, the LED is connected to Vcc, not to ground! So I change this to

typedef invert< gpio_1_0 > LED;
LED::set( 0 ); // LED off

With the template approach the second version generates exactly the same
amount of code, and is exactly as fast, as the first version.

Now imagine you have a flash driver, and want to drive two flashes.

typedef spi_flash<spi<gpio_1_0, gpio_1_1, gpio_1_2> > Left_Flash;
typedef spi_flash<spi<gpio_2_0, gpio_2_1, gpio_2_2> > Right_Flash;

This would duplicate the flash driver. Sure, each one would have
incredible speed. I decided to rather pay the virtual dispatch than the
code space, because speed is good enough for the things I do. Your
mileage will probably vary.

Stefan

Wouter van Ooijen · Dec 9, 2013

It could do that, but assuming that the RPC operation is more than just
dereference-a-pointer-and-set-a-bit, it would risk duplicating more
code, see below.

When the code is small and/or has substantial possibility for
optimization due to inlining the template approach wins. When the code
is large and needs to be used with two or more sets of pins (or run-time
configurable pins or other resources) the OO approach wins.

In between, the contest is on. Hence my quest for a good way to make a
dual hierarchy.

Now imagine you have a flash driver, and want to drive two flashes.

typedef spi_flash<spi<gpio_1_0, gpio_1_1, gpio_1_2> > Left_Flash;
typedef spi_flash<spi<gpio_2_0, gpio_2_1, gpio_2_2> > Right_Flash;

One approach here is the template-inherits-from-non-template-class
approach. Put the bulk in the non-template, and the few fast things in
the template.

I think in this case the template should be the SPI part, the flash
driver should not be a template. Separating the SPI makes sense anyway,
because you want to be able to use the flash driver over a bit-banged
SPI, a hardware SPI, and tomorrow over yet something else.
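
A minimal sketch of the template-inherits-from-non-template-class idea, under assumptions: flash_base, spi_flash, read_status and the two fake SPI buses are hypothetical, and 0x05 is used as the "read status" opcode because it is common on SPI NOR flash chips, not because any particular chip is implied.

```cpp
#include <cassert>
#include <cstdint>

// Non-template base: holds the bulk of the driver and is compiled once,
// no matter how many flash chips are connected.
class flash_base {
public:
    // Hypothetical command sequence: send an opcode, clock in the reply.
    std::uint8_t read_status() {
        transfer( 0x05 );          // assumed "read status" opcode
        return transfer( 0x00 );   // dummy byte out, status byte in
    }
protected:
    virtual std::uint8_t transfer( std::uint8_t out ) = 0;
    ~flash_base() {}               // never destroyed through the base
};

// Thin template layer: only this bus-specific part is generated per bus.
template< class spi >
class spi_flash : public flash_base {
protected:
    std::uint8_t transfer( std::uint8_t out ) override {
        return spi::transfer( out );
    }
};

// Two simulated SPI buses that answer with a fixed byte.
struct fake_spi_left {
    static std::uint8_t transfer( std::uint8_t ) { return 0x11; }
};
struct fake_spi_right {
    static std::uint8_t transfer( std::uint8_t ) { return 0x22; }
};

void demo() {
    spi_flash< fake_spi_left >  left;
    spi_flash< fake_spi_right > right;
    assert( left.read_status()  == 0x11 );
    assert( right.read_status() == 0x22 );
}
```

The large routines live in flash_base and exist once; each extra bus adds only a small transfer() override and a vtable, at the cost of one virtual call per byte transferred.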

Wouter
