emonhub's data buffers

emonHub has built-in buffers to cope with short network interruptions. They were introduced in the OEM gateway and are one of the major benefits of using emonHub with a remote emonCMS. The default buffer size is 1000 frames, and this can be changed by the user. OEM gateway gave many months of reliable service and never gave me great cause to question the buffer size, as I very rarely, if ever, noticed any periods of missing data. I did, however, notice periods of "sparse" data: while OEM gateway was trying in vain to deliver data to an unavailable server, it was missing some incoming data. Not a big problem, as some data is better than no data.

Now that emonHub runs several threads, so that it doesn't get bogged down with failed delivery attempts, no incoming data is missed. The buffers therefore fill up much quicker and older data is dropped to make room for new. The end result is that when the network comes back up, the last 1000 frames are all present at the correct intervals and all earlier data is lost, rather than the same number of frames being spread more sparsely over a longer period. IMO this is a negative side effect of a great improvement.
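To illustrate the "older data is dropped for new" behaviour, here is a minimal Python sketch. It is not emonHub's actual buffer code; it just assumes a fixed-size, drop-oldest queue, and the function name `simulate_outage` is made up for the example.

```python
from collections import deque


def simulate_outage(buffer_size, frames_produced):
    """Buffer numbered frames during an outage, dropping the oldest once full.

    Hypothetical helper for illustration only, not emonHub code.
    """
    # A deque with maxlen silently discards the oldest item when a new
    # one is appended to a full buffer.
    buffer = deque(maxlen=buffer_size)
    for frame_number in range(frames_produced):
        buffer.append(frame_number)
    return buffer


# One hour of frames at 2160 frames per hour into a 1000-frame buffer.
retained = simulate_outage(buffer_size=1000, frames_produced=2160)
print(len(retained), retained[0], retained[-1])  # 1000 1160 2159
```

Only the newest 1000 frames survive, with no gaps between them; everything older is gone.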

Last night I lost my internet connection (about 15 mins after the ISP's helpline closed, typically!). Being aware of the above, I changed the default buffer size on one of my Pis as an experiment, hoping to reduce the potential data loss. Having just got my connection back after an outage of 10 hrs+ and checked my emonCMS for data loss, I thought I would share my findings.

I currently have 2 Pis running emonHub that post data to emonCMS.org: Pi A is live (valuable) data and Pi B is experimental (expendable) data. Both were using the default buffer size of 1000 until I changed Pi A's buffer size to 10000 whilst it was running, at around an hour into the outage. I didn't do any math at the time, I just "made it bigger".

Pi A has a single node reporting every 5 seconds (12 frames per min, or 720 per hour) and Pi B is currently reporting from 3 nodes every 5 seconds (12 frames per min per node, or 2160 per hour in total).

Pi A retained absolutely everything, with no loss of data before, during or after changing the buffer size. The whole 10 hrs+ took seconds to catch up when the network came up this morning. Pi B, by contrast, only managed to retain the preceding 27 mins of data. Here's the math...

Pi A's larger buffer of 10000 frames at just 720 frames per hour could last for 13.9 hrs, whereas Pi B's default buffer of 1000 frames at a whopping 2160 frames per hour lasts 0.46 hrs, or just under 28 minutes.
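The same arithmetic as a small Python sketch, in case anyone wants to plug in their own numbers. The function name and parameters are my own for illustration, and it assumes every node sends exactly one frame per interval.

```python
def buffer_duration_hours(buffer_size, nodes, interval_seconds):
    """Hours of outage a buffer can cover before old frames are dropped.

    Assumes each node produces exactly one frame per interval.
    """
    frames_per_hour = nodes * 3600 / interval_seconds
    return buffer_size / frames_per_hour


pi_a = buffer_duration_hours(buffer_size=10000, nodes=1, interval_seconds=5)
pi_b = buffer_duration_hours(buffer_size=1000, nodes=3, interval_seconds=5)

print(f"Pi A: {pi_a:.1f} hrs")  # Pi A: 13.9 hrs
print(f"Pi B: {pi_b:.2f} hrs")  # Pi B: 0.46 hrs
```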

Although I had no memory monitoring software on either Pi at the time, I have roughly calculated that, since all the payloads are the same size (5 values), Pi A's buffer in memory must have been 10x that of Pi B, and yet it did not choke, hang or crash, so it was probably OK with the increase. I think this is because the buffers use minimal space: Pi A's 5 values plus node id, unix timestamp and internal emonhub packet ref, all at 4 bytes each, is 32 bytes per frame x 10000 = 320 kB (plus overheads ??). I have 2 destinations set up, which is still only 0.64 MB total RAM used by buffered data during a 13.9 hr outage.
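That back-of-envelope estimate as a Python sketch. The 4 bytes per field and the 3 extra fields (node id, timestamp, packet ref) are the assumptions stated above, not measured values, and the function name is just for illustration. Real Python objects carry extra overhead per value, so treat the result as a lower bound.

```python
BYTES_PER_FIELD = 4  # assumption from the post, not a measured figure


def estimated_buffer_bytes(values_per_frame, buffer_size,
                           destinations=1, extra_fields=3):
    """Rough lower bound on RAM used by full buffers.

    extra_fields covers node id, unix timestamp and packet ref.
    """
    fields_per_frame = values_per_frame + extra_fields
    bytes_per_frame = fields_per_frame * BYTES_PER_FIELD
    return bytes_per_frame * buffer_size * destinations


total = estimated_buffer_bytes(values_per_frame=5, buffer_size=10000,
                               destinations=2)
print(total)  # 640000
```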

The conclusion is that my buffer sizes need to be recalculated and set to a level determined by the rate I'm producing data, and if that's a significant amount I should also consider the RAM'ifications (sorry).
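Working it the other way round, a rough sketch for picking a buffer size from the data rate and the longest outage you want to ride out. The function name and the 10 hour target are only examples, and it again assumes one frame per node per interval.

```python
import math


def required_buffer_size(nodes, interval_seconds, outage_hours):
    """Smallest buffer (in frames) that covers an outage of the given length."""
    frames_per_hour = nodes * 3600 / interval_seconds
    return math.ceil(frames_per_hour * outage_hours)


# Frames needed to cover a 10 hour outage at a 5 second interval.
print(required_buffer_size(nodes=1, interval_seconds=5, outage_hours=10))  # 7200
print(required_buffer_size(nodes=3, interval_seconds=5, outage_hours=10))  # 21600
```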

The RAM used will of course depend heavily on the size of the buffers, which in turn depends on both the number of frames and the size of each frame, but in general I think the buffer sizes could be increased sensibly. Overcook it and the Pi will choke; at what level depends on what else you have running, but small increases should not cause an issue. Also play fair with emoncms.org: if you have a dodgy network connection, every time the network comes up emoncms.org will get the backlog dumped on its doorstep.

The buffer size is set in emonhub.conf in each reporter's init_settings. There isn't currently a value set there, as the default of 1000 is hardcoded. To change the buffer size, add a "buffer_size = " line, for example:


        Type = EmonHubEmoncmsReporter
            buffer_size = 5000

I'm not an expert on memory management and do not know the overheads involved so if anyone can expand on or confirm the math or assumptions I've made, please do.


PS Just as a yardstick for comparison, the same Pi A with the same one node would previously retain over an hour and a half of data when running OEM gateway, so I guess I was losing around 2 out of every 3 frames due to an unavailable server, thus extending the life of the buffer.


CidiRome's picture

Re: emonhub's data buffers


This is quite an old post, but today it interested me.

I tried to use it

    Type = EmonHubEmoncmsHTTPInterfacer
        buffer_size = 10000

But I got

2015-12-27 02:31:56,159 ERROR    MainThread Unable to create 'emoncmsvieirasoft' interfacer: __init__() got an unexpected keyword argument 'buffer_size'

Is any change to the main code necessary to use the buffer_size option?
I had to change the code for interval = 5 to work, so maybe this is the same situation.


pb66's picture

Re: emonhub's data buffers

Basically the "buffers" are a feature of emonhub's reporters, and the "emon-pi" variant no longer has reporters. The replacement interfacers don't do the same job, so reintroducing the buffers would definitely require some coding.


CidiRome's picture

Re: emonhub's data buffers

Is there any alternative way to stop emonhub from sending the data while the server is unavailable, and have it hold the data to send later when the server becomes available again?


pb66's picture

Re: emonhub's data buffers

Not easily that I can think of.

