# Status of this project

This is an early preview. I think the structure of the API is mostly
stable (i.e. the relationships between data types should not change
radically), but names might change for better consistency, and semantics
(how errors are handled, side effects like logging, etc.) will change in
subtle ways too. And of course lots of symbols will be added to make a
consistent and convenient interface.

What is currently missing completely are algorithms for data processing
(e.g. filtering), standard feedback loops, etc. While these are clearly
within the scope of this project, it turns out many applications don't
need them. Therefore implementation (and API design) will happen later,
hopefully in a way that fits the existing API.

# What libautomation is: Design decisions and lessons learned

libautomation was created by abstracting common code from several
industrial and home automation applications. The assumption is that this
code will be useful for writing more automation applications.

libautomation is intended to do as little as possible, relying on Linux
functionality and external libraries (like libev) as much as possible.
It is written entirely in C, and every part of it should be easily
replaceable by custom code if the application has special needs.

The goal here is to make easy tasks simple and complex projects
possible and easily manageable.

As such libautomation is not only (maybe not even mainly) about the
code provided, but very much about the lessons learned from writing
automation applications: how to do things, and what not to do because
it leads to problems down the road. These lessons have been turned
into design decisions.

## Use a system daemon instead of a watchdog

On embedded automation solutions it might seem tempting to use the
watchdog directly in the main application, because this automatically
tests the entire software stack. This approach is too inflexible, though:
any additional feature needs to be integrated into the main application,
which conflicts with trying to keep libautomation very modular.

Instead, a system daemon (like procd from OpenWRT) should feed the
watchdog and in turn monitor all applications running on the system.

## An automation system should be resilient against crashing/hanging processes

libautomation aims at custom-made (home-brewed) automation solutions.
These often don't run on a 10k Euro PLC, yet might be used in an industrial
environment with high levels of EMI, which might cause occasional hardware
faults in cheap hardware. It should be easy to write automation applications
in such a way that the system can recover from such errors by restarting
processes or resetting the system.

## Library functions might crash instead of fail

This might seem like an unconventional choice at first. But in addition to
the remarks about reliability above, think about the following points:
* Usually nobody attends the automation system to read error messages.
* If a syscall returns an error that can be handled, the library should
  handle it. If it can't be handled, then restarting the application or the
  system seems to be a plausible fix.
* The library API is a lot easier to use when functions can't fail ... ;)

Of course the above is only a guideline, not a principle. If in doubt,
failing and non-failing versions of the same function should be provided.

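
To illustrate the idea of offering a failing and a non-failing version of
the same function, here is a minimal sketch. The names `sketch_try_alloc`
and `sketch_alloc` are made up for this example and are not part of the
libautomation API.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>

/* Failing variant (hypothetical name): reports the problem via NULL. */
static void *sketch_try_alloc(size_t size)
{
    assert(size > 0);
    return malloc(size);
}

/* Non-failing variant (hypothetical name): never returns NULL. If the
 * request cannot be satisfied, the process terminates, so that whatever
 * supervises it can restart it. */
static void *sketch_alloc(size_t size)
{
    void *p = sketch_try_alloc(size);

    if (p == NULL) {
        perror("sketch_alloc");
        abort();
    }
    return p;
}
```

Callers of `sketch_alloc()` need no error path at all, which is the point
of the last bullet above. Callers that can do something sensible on failure
use `sketch_try_alloc()` instead.
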
## Multiple processes need to cooperate

There are many constraints on how an automation algorithm is split into
threads and processes:
* Some actions can interfere with reading sensors, so you typically
  want to have everything in one thread, to control their relative timing.
* However, not every I/O operation can be made non-blocking, e.g. reading
  a sensor via sysfs might block for a long time until it runs into some
  timeout. Therefore at least multiple threads are needed, if not multiple
  processes.
* Some input is expensive to read (it might come over the network or a slow
  bus), so the effort should not be duplicated needlessly.
* Sometimes it becomes desirable to run the same thread twice, controlling
  two machines at the same time. When people write multithreaded
  applications, they often don't anticipate this case, and if they do, it
  usually is untested and more complicated than simply running the
  application twice.

To make it possible to meet all these constraints, an ability to share
(sensor) data between multiple processes is clearly necessary. To that end
libautomation provides a shared memory interface.

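
The following sketch shows the general idea of one process publishing a
reading that another process picks up. It does not use libautomation. It
assumes a Linux/POSIX system and uses an anonymous shared mapping that is
inherited across `fork()`, which is simpler than the named regions
described later in this document. `struct sketch_value` and all function
names are hypothetical.

```c
#define _DEFAULT_SOURCE
#include <assert.h>
#include <stddef.h>
#include <sys/mman.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical record: a reading plus the time it was taken. */
struct sketch_value {
    int value;
    int timestamp;
};

/* Map one record that stays shared across fork(). NULL on failure. */
static struct sketch_value *sketch_map_shared(void)
{
    void *p = mmap(NULL, sizeof(struct sketch_value),
                   PROT_READ | PROT_WRITE,
                   MAP_SHARED | MAP_ANONYMOUS, -1, 0);

    return p == MAP_FAILED ? NULL : p;
}

/* The child plays the process that does the (possibly blocking) sensor
 * read and publishes the result. The parent waits for it to finish.
 * Returns 0 on success, -1 on failure. */
static int sketch_publish_from_child(struct sketch_value *shared,
                                     int value, int timestamp)
{
    int status;
    pid_t pid;

    assert(shared != NULL);
    pid = fork();
    if (pid < 0)
        return -1;
    if (pid == 0) {
        shared->value = value;
        shared->timestamp = timestamp;
        _exit(0);
    }
    if (waitpid(pid, &status, 0) < 0)
        return -1;
    return (WIFEXITED(status) && WEXITSTATUS(status) == 0) ? 0 : -1;
}
```

After `sketch_publish_from_child()` returns 0, the parent sees the values
written by the child in the mapped record, although the two never shared a
thread.
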
# What to find in the directory tree

The repository currently contains the sources and header files of the
library itself in the folder `lib`, and some small demo applications
that are either helpful utilities or small automation applications
in the folders `tools` and `examples`.

### atmdump SHMID
The `atmdump` tool prints the contents of the libautomation shared memory
domain SHMID in human-readable form to standard output. This is mostly
useful for ad-hoc testing and inspection, but also serves as a demo for
shared memory clients.

### atmd /path/to/configfile
`atmd` (the libautomation daemon) reads any number of data sources (i.e.
devices) from the configuration file specified on the command line and
makes their periodically updated values available via shared memory. This
is the demo for providing values in shared memory.

`atmd` is also a very important part of the libautomation ecosystem:
in the case where reading a sensor might block for an unacceptably long
time, `atmd` allows reading the sensor in a separate process, without
any change to the primary application.

Furthermore, `atmd` allows organizing data sources into so-called
data source groups, each of which can have its own policy regarding the
handling of errors when reading a sensor. Each data source group lives in
its own config file. Config files can be loaded recursively with the

### humiditycontrol /path/to/configfile
is a small air humidity regulator. It measures the relative humidity and
temperature in two places (typically inside a building and outdoors) and
calculates the absolute humidities. If the absolute humidity outside
is lower than inside, a fan is turned on.

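
The calculation can be sketched as follows. This is not the code from
`humiditycontrol`. It assumes the common Magnus approximation for the
saturation vapour pressure over water and the usual conversion to grams of
water per cubic metre. All names are hypothetical. To keep the sketch free
of a libm dependency it carries its own small series for the exponential
function; real code would simply call `exp()` from `<math.h>`.

```c
#include <assert.h>
#include <stdbool.h>

/* Taylor series for e^x, accurate enough for the |x| < 4 that occurs
 * for temperatures between roughly -40 and +60 degrees Celsius. */
static double sketch_exp(double x)
{
    double term = 1.0, sum = 1.0;

    for (int n = 1; n <= 40; n++) {
        term *= x / n;
        sum += term;
    }
    return sum;
}

/* Absolute humidity in g/m^3 from temperature (deg C) and relative
 * humidity (percent), using the Magnus approximation. */
static double sketch_abs_humidity(double temp_c, double rel_humidity_pct)
{
    double saturation_hpa, vapour_hpa;

    assert(rel_humidity_pct >= 0.0 && rel_humidity_pct <= 100.0);
    saturation_hpa = 6.112 * sketch_exp(17.62 * temp_c / (243.12 + temp_c));
    vapour_hpa = saturation_hpa * rel_humidity_pct / 100.0;
    return 216.7 * vapour_hpa / (273.15 + temp_c);
}

/* Ventilating only dries the building if the outside air carries less
 * water per volume than the inside air. */
static bool sketch_fan_on(double in_temp_c, double in_rh_pct,
                          double out_temp_c, double out_rh_pct)
{
    return sketch_abs_humidity(out_temp_c, out_rh_pct)
         < sketch_abs_humidity(in_temp_c, in_rh_pct);
}
```

Note that cold outside air at 80% relative humidity can still be much drier
in absolute terms than warm inside air at 50%, which is why the comparison
is done on absolute humidities.
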
There are several configuration settings to control target values, allowed
energy loss when it is cold outside, etc. This demo is actually useful
for real-world applications and can easily be extended and customized to
specific needs, like adding a dehumidifier.

### line_monitor /path/to/configfile
is a building block for an alert system. It monitors some input (typically
a GPIO) indicating the status of some equipment. When the input changes
state, it triggers execution of some external program like a script doing

The alert command is executed periodically until either the error condition
is fixed or the operator acknowledges the error.

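
The repeat-until-fixed-or-acknowledged behaviour can be pictured as a small
state machine that is stepped once per period. This is only a sketch of the
described behaviour, not the actual `line_monitor` source, and all names
are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

enum sketch_alert_state { SKETCH_OK, SKETCH_ALERTING, SKETCH_ACKED };

struct sketch_monitor {
    enum sketch_alert_state state;
    int alerts_sent;
};

/* One periodic step. `fault` is the current state of the monitored
 * input, `ack` is true if the operator acknowledged the error since the
 * last step. Returns true if the alert command should be run now. */
static bool sketch_monitor_step(struct sketch_monitor *m, bool fault, bool ack)
{
    assert(m != NULL);
    if (!fault) {
        m->state = SKETCH_OK;          /* condition fixed: stop alerting */
        return false;
    }
    if (m->state == SKETCH_OK)
        m->state = SKETCH_ALERTING;    /* input just changed to faulty */
    if (ack)
        m->state = SKETCH_ACKED;       /* operator knows: stay quiet */
    if (m->state != SKETCH_ALERTING)
        return false;
    m->alerts_sent++;
    return true;
}
```

An acknowledged fault stays silent until the input has returned to normal
once. A later fault then starts a fresh series of alerts.
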
This application is actually used verbatim in a district heating plant.
Well, since I changed the libautomation API after deploying the system,
it isn't actually verbatim any longer ...

## Important data structures

A value (machine size integer) together with a timestamp, when the value

Data Source: Descriptor of how to obtain/update a single value. This is
mostly used internally. Users typically either update an ATM_VALUE
manually or have this managed completely by libautomation.

A data source group is a list of data sources together with a policy for
handling errors when reading data sources. Some useful policies are
predefined, but it is possible to implement arbitrary policies in the
application via a callback.

Data source groups can also be stacked by treating data source groups as
pseudo data sources. This allows building complex policies: e.g. you can
have an inner data source group that retries reading a sensor three times
before failing, and an outer data source group that resets the data bus
to recover from errors.

In the above example, you would have all sensors on the bus as children
of the data source group, to prevent sensor access while the bus is down.

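
The stacking of policies can be sketched without any library code. The
following example simulates a bus whose reads fail a given number of times.
The inner policy retries up to three times, the outer policy treats the
inner one as a single source and resets the bus when it fails. None of
these names or types come from libautomation; they are assumptions made for
the illustration.

```c
#include <assert.h>
#include <stddef.h>

/* A source returns 0 on success and nonzero on failure. */
typedef int (*sketch_read_fn)(void *ctx);

/* Inner policy: try the source up to `attempts` times. */
static int sketch_retry(sketch_read_fn read_fn, void *ctx, int attempts)
{
    assert(attempts > 0);
    for (int i = 0; i < attempts; i++)
        if (read_fn(ctx) == 0)
            return 0;
    return -1;
}

/* Simulated bus: reads fail while failures_left is positive. */
struct sketch_bus {
    int failures_left;
    int resets;
    int reads;
};

static int sketch_bus_read(void *ctx)
{
    struct sketch_bus *bus = ctx;

    bus->reads++;
    if (bus->failures_left > 0) {
        bus->failures_left--;
        return -1;
    }
    return 0;
}

/* Outer policy: the inner retry group is treated like one source. If it
 * fails, reset the bus (simulated by clearing the fault) and try again. */
static int sketch_outer_update(struct sketch_bus *bus)
{
    if (sketch_retry(sketch_bus_read, bus, 3) == 0)
        return 0;
    bus->resets++;
    bus->failures_left = 0;
    return sketch_retry(sketch_bus_read, bus, 3);
}
```

A short glitch (two failed reads) is absorbed by the inner policy alone,
while a longer outage triggers exactly one bus reset by the outer policy.
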
A task is a repeating timer together with a data source group and an
optional function. Every time the timer fires, all values in the data
source group are updated. As the last step, the optional function is called.

A task can be type cast to its `ev_timer` and thus can be started and
stopped using the usual libev facilities.

libautomation automatically initializes a task with the global symbol
`atm_main_task`. This task is started when `atm_main()` is called.

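
Casting a task to its timer works in C when the timer is the first member
of the task structure, because a pointer to a structure may be converted to
a pointer to its first member. The sketch below shows this pattern with
stand-in types. `struct sketch_timer` only plays the role of libev's
`ev_timer`, and the layout of the real task structure is an assumption.

```c
#include <assert.h>
#include <stddef.h>

/* Stand-in for ev_timer, just enough for the illustration. */
struct sketch_timer {
    double repeat;
    int active;
};

struct sketch_task {
    struct sketch_timer timer;            /* must stay the first member */
    int (*update_group)(void *group);     /* updates all values */
    void *group;
    void (*after_update)(struct sketch_task *task);  /* optional */
};

/* Plays the role of a timer API that knows nothing about tasks. */
static void sketch_timer_start(struct sketch_timer *t)
{
    t->active = 1;
}

/* What happens each time the timer fires. */
static void sketch_task_fire(struct sketch_task *task)
{
    assert(task->update_group != NULL);
    task->update_group(task->group);
    if (task->after_update != NULL)
        task->after_update(task);
}

/* Demo callbacks: the "group" is just a counter here. */
static int sketch_count_update(void *group)
{
    (*(int *)group)++;
    return 0;
}

static void sketch_add_ten(struct sketch_task *task)
{
    *(int *)task->group += 10;
}
```

With these definitions, `sketch_timer_start((struct sketch_timer *)&task)`
starts the timer embedded in `task` without the timer code having to know
about tasks.
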
### TODO: Write something about filters

## Shared memory interface

As noted above, sharing sensor data between applications is an important
requirement for automation applications. Therefore the shared memory
interface is a core part of libautomation and most features make use of
it.

Each shared memory region has a unique id. Typically an application will
create one shared memory region to export its data and connect to N other
memory regions for data input. The first shared memory region created is
assigned to the global symbol `atm_shm_stdmem`.

Each shared memory region is organized as a key-value database, where keys
can be arbitrary strings and values are of type `struct ATM_VALUE`.

### Creating shared memory regions
`atm_shm_create(id)` creates a new shared memory region with the unique id
`id`. It is an error if a memory region with the same id has already been
created by another process. If `id` is NULL, then a memory region is
allocated on the heap instead, making the memory region effectively private.

It is good policy to use the name of the config file as the id. This is
under the assumption that two instances of the same application surely
would need different config files, because they should not access the same

### Exporting locally calculated values
`atm_shm_register(shmr, key)` registers a new key-value pair with the shared
memory region `shmr` and returns a pointer to the value.

Use `atm_shm_update(struct ATM_VALUE *var, int value)` to automatically
update the timestamp together with the value.

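
To make the register/update pattern more concrete, here is a toy key-value
region that lives in ordinary memory. It only mimics the shape of the calls
described above. The structure layout, the fixed sizes, the explicit `now`
parameter, and the choice to return the existing entry when a key is
registered twice are all assumptions of this sketch, not statements about
libautomation.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define SKETCH_MAX_ENTRIES 8
#define SKETCH_KEY_LEN 32

struct sketch_value {
    int value;
    int timestamp;
};

struct sketch_entry {
    char key[SKETCH_KEY_LEN];
    struct sketch_value val;
};

struct sketch_region {
    size_t used;
    struct sketch_entry entries[SKETCH_MAX_ENTRIES];
};

/* Return the value slot for `key`, creating it if needed.
 * Returns NULL if the region is full or the key is too long. */
static struct sketch_value *sketch_register(struct sketch_region *r,
                                            const char *key)
{
    struct sketch_entry *e;

    assert(r != NULL && key != NULL);
    for (size_t i = 0; i < r->used; i++)
        if (strcmp(r->entries[i].key, key) == 0)
            return &r->entries[i].val;
    if (r->used == SKETCH_MAX_ENTRIES || strlen(key) >= SKETCH_KEY_LEN)
        return NULL;
    e = &r->entries[r->used++];
    strcpy(e->key, key);
    e->val.value = 0;
    e->val.timestamp = 0;
    return &e->val;
}

/* Store value and timestamp together, so readers can always tell how
 * old a value is. */
static void sketch_update(struct sketch_value *v, int value, int now)
{
    v->value = value;
    v->timestamp = now;
}
```

The important property is that the writer keeps the returned pointer and
only ever touches the value through the update function, so value and
timestamp never drift apart.
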
### Exporting sensor data
Registering a new data source with
`atm_ds_register(struct ATM_DSGRP *grp, const char *url, const char *key)`
automatically registers a key-value pair in `atm_shm_stdmem`.

### Reading data from shared memory
`atm_ds_register(struct ATM_DSGRP *grp, const char *url, const char *key)`,
where `url` is of the form "shm:id/key", returns a pointer to the
associated `struct ATM_VALUE`. In this case, the value of `key` is
ignored and no key-value pair is exported.

Alternatively there is also `atm_shm_get(const char *id, const char *key)`
with the same effect.

## Some notes on time keeping

There are three different conventions to handle time. Sorry for the

### ev_tstamp, ev_time() from libev
`ev_tstamp` is the double precision floating point data type used by libev
(it is what `ev_time()` returns) and typically stores the time in something
close to seconds since the UNIX epoch. The upside is: no danger of overflow
and good precision at the same time. The downside is: floating point
arithmetic. :-(

Since we are using libev, we have to use this.

### atm_time, atm_timestamp()
This uses native integer variables. `atm_timestamp()` returns the time
since booting the system in 1/10 seconds. This is also quite safe from
overflows (assuming at least 32-bit integers) and precise enough to keep
track of typical hardware like relays.

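
A timestamp of this kind could be obtained as in the sketch below. This is
not the implementation of `atm_timestamp()`. It assumes a POSIX system and
uses `CLOCK_MONOTONIC`, which on Linux in practice counts from around boot
time. The function names are hypothetical. The second function shows the
overflow arithmetic: with a 32-bit `int` and ten ticks per second, the
counter lasts about 2485 days, i.e. roughly 6.8 years of uptime.

```c
#define _POSIX_C_SOURCE 199309L
#include <assert.h>
#include <limits.h>
#include <time.h>

/* Monotonic time in tenths of a second as a plain int. */
static int sketch_timestamp(void)
{
    struct timespec ts;
    int rc = clock_gettime(CLOCK_MONOTONIC, &ts);

    assert(rc == 0);
    (void)rc;
    return (int)(ts.tv_sec * 10 + ts.tv_nsec / 100000000L);
}

/* Whole days until a counter of 1/10 s ticks exceeds INT_MAX. */
static int sketch_days_until_overflow(void)
{
    return INT_MAX / 10 / 86400;
}
```

Integer timestamps like this can be compared and subtracted without any
floating point arithmetic, which is convenient for simple timing logic such
as minimum on/off times.
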
`atm_time` is a global variable and automatically set from `atm_timestamp()`
each time an ATM_TASK is run. The idea is to have a rough estimate of the
current time without having to force a context switch (i.e. calling into the
kernel).

### ATM_TIMER_RES
This macro is defined as 0.001, meaning one millisecond.

Data sources (or actually data source groups) return a positive integer
value when they need to be called again to complete their operation
(e.g. because they had to reset some bus and need to wait for everything to
initialize). This return value times `ATM_TIMER_RES` is the waiting time
in seconds until the task is resumed.

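
As a small worked example of this arithmetic: a return value of 250 means
"call me again in 250 * 0.001 = 0.25 seconds". The sketch below uses its
own macro that mirrors the documented value of `ATM_TIMER_RES`. The
function name is hypothetical, and treating non-positive return values as
"no wait needed" is an assumption of this sketch.

```c
#include <assert.h>

/* Mirrors the documented value of ATM_TIMER_RES (one millisecond). */
#define SKETCH_TIMER_RES 0.001

/* Convert a data source group's return value into a delay in seconds,
 * suitable for a timer that counts in floating point seconds. */
static double sketch_resume_delay(int rc)
{
    return rc > 0 ? rc * SKETCH_TIMER_RES : 0.0;
}
```

One tick of the 1/10 second timestamps corresponds to a return value of
100, so return values below 100 express delays shorter than the timestamp
resolution.
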
This allows for more fine-grained control than 1/10 of a second. Think of
1/10 second as the fastest sensible interval to repeat a task (but nothing
stops you from repeating faster). Obviously, interruptions of tasks
then have to support a much shorter time scale.

libautomation is currently hosted at https://repo.or.cz/libautomation.git.
Please ask me if you want push access. I'm easily reachable via e-mail.

If there is sufficient interest, I can open a mailing list for this
project, but at the moment you need to send all questions and bug
reports to me personally.