Over the last few weeks I have been investigating how to reduce the write load of emoncms to explore if it might be possible to achieve long term logging on SD cards.
A brief summary
Through a mix of buffered writing of feed data to disk, a reduction in the number of meta and average data files written to, and use of the FAT filesystem, my home Raspberry Pi now has a write load of 0.4 kb_wrtn/s, down from 197 kb_wrtn/s. That is almost 500x less, and the writes are all append only (there's no regular writing of the same part of a file again and again). The question is: could this mean years of SD card logging rather than months?
Write buffering
A single Emoncms PHPFina (PHP Fixed Interval No averaging) or PHPTimeSeries datapoint uses between 4 and 9 bytes. The write load on the disk, however, is a bit more complicated than that. Most filesystems and disks have a minimum IO size that is much larger than 4-9 bytes. On a FAT filesystem the minimum IO size is 512 bytes, which means that if you try to write 4 bytes the operation will actually cause 512 bytes of write load. But it's not just the data file that gets written to: every file also has metadata (an inode on Ext filesystems, a directory entry on FAT), and updating it can result in a further 512 bytes of write load. A single 4 byte write can therefore cause 1 kb (1024 bytes) of write load.
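To make that arithmetic concrete, here is a rough sketch of the estimate in Python. The function name and the assumption of exactly one block-sized metadata update per write are mine, for illustration only; real filesystems are messier than this.

```python
import math

def write_load_bytes(payload_bytes, io_size=512, metadata_writes=1):
    """Estimate the bytes physically written for one small write.

    Simplified model (an assumption, not a measurement): the payload is
    rounded up to whole IO blocks, and each write also triggers
    `metadata_writes` block-sized metadata updates.
    """
    data_blocks = math.ceil(payload_bytes / io_size)
    return (data_blocks + metadata_writes) * io_size

# One 4-byte datapoint written on its own, 512-byte IO size:
print(write_load_bytes(4))                 # 1024

# Same datapoint with a 4096-byte block size:
print(write_load_bytes(4, io_size=4096))   # 8192

# 30 datapoints (120 bytes) written one at a time vs. in one buffered write:
print(30 * write_load_bytes(4))            # 30720
print(write_load_bytes(30 * 4))            # 1024
```

Under this model, 30 unbuffered writes cost 30x more than a single buffered write of the same data, which is the effect the buffering described next is trying to exploit.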
By buffering writes in memory for as long as we can and then writing in larger blocks, it's possible to reduce the write load significantly. The full investigation, with a lot of benchmarking of different configurations including differences between FAT and Ext4, can be found here:
https://github.com/openenergymonitor/documentation/blob/master/BuildingB...
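As an illustration of the buffering idea (not the actual emoncms feed engine code), a minimal append-only buffered writer might look like the sketch below. The class name `BufferedFeedWriter`, the 4-byte little-endian float format, and the 300 second commit interval are all assumptions made for the example.

```python
import os
import struct
import tempfile
import time

class BufferedFeedWriter:
    """Holds datapoints in memory and appends them to disk in one write."""

    def __init__(self, path, commit_interval=300):
        self.path = path
        self.commit_interval = commit_interval  # seconds between disk writes
        self.buffer = bytearray()
        self.last_commit = time.monotonic()

    def post(self, value):
        # Assumed on-disk format: one little-endian 32-bit float per datapoint.
        self.buffer += struct.pack("<f", value)
        if time.monotonic() - self.last_commit >= self.commit_interval:
            self.commit()

    def commit(self):
        if self.buffer:
            # "ab" = append, binary: existing bytes are never rewritten.
            with open(self.path, "ab") as f:
                f.write(self.buffer)
            self.buffer.clear()
        self.last_commit = time.monotonic()

# Usage: three datapoints are held in memory and reach the disk in one write.
path = os.path.join(tempfile.mkdtemp(), "feed_1.dat")
writer = BufferedFeedWriter(path, commit_interval=300)
for watts in (250.0, 251.5, 249.0):
    writer.post(watts)
writer.commit()
print(os.path.getsize(path))  # 12
```

The trade-off is that anything still in the buffer is lost if power fails before the next commit, so the commit interval sets the maximum amount of data at risk.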
A minimal version of emoncms in python
To make the investigation easier I simplified emoncms down to the core parts needed on a Raspberry Pi to get basic timeseries graphing: a serial listener, a node decoder, a basic feed engine, and a UI consisting of a node list and a single rawdata visualisation type. I wrote the result in Python, making use of Jerome and Paul's (pb66) excellent work on the emonhub serial listener and node decoder.
[Diagram: main components of the Python app]
[Screenshot: node list]
[Screenshot: basic graphing]
The source code and an installation guide for this minimal Python version of emoncms can be found here:
https://github.com/emoncms/development/tree/master/experimental/emon-py
I posted about this on the EmonHub issues list on Friday and have been discussing it there with Paul: https://github.com/emonhub/emonhub/issues/48
Interested to hear people's thoughts on it.
Further development questions (copied from the end of the write load investigation page)
Is the reduced write load and longer SD card lifespan that might result from using the FAT filesystem worth the increased chance of data corruption from power failure that Ext4 might prevent?
It would be interesting to compare the performance of the FAT filesystem with a 5 min application-based commit time against the Ext4 filesystem with journaling turned off and filesystem delayed allocation set to 5 min, instead of write buffering in the application.
Re: Reducing emoncms write load for long term SD card logging and a minimal python version of emoncms
I've done some further testing, this time with the Ext2 filesystem, which is a non-journaling filesystem. The results appear to be exactly the same as Ext4, suggesting that journaling is not as large a source of write load as I initially thought, and that the main difference between the write load on FAT and Ext4/Ext2 is probably the block size (512 vs 4096 bytes), which sounds logical to me.
Re: Reducing emoncms write load for long term SD card logging and a minimal python version of emoncms
Looking into it, it's possible to set the Ext2/3/4 filesystem block size to 1024 rather than 4096. After backing up the data partition (the 3rd partition on the SD card) I just ran:
sudo umount /dev/mmcblk0p3
sudo mkfs.ext4 -b 1024 /dev/mmcblk0p3
sudo mount -a
I then copied the data back onto it. It takes a while for the Ext4 partition to fully set up (a background process called ext4lazyinit writes quite heavily for a few minutes), but after it finishes the write load drops to ~1.4 kb_wrtn/s, down from ~3.2 kb_wrtn/s.
So potentially a 2x reduction by dropping the block size from 4096 to 1024 on Ext4.
The FAT filesystem with a blocksize of 512 bytes writes at 0.4 kb_wrtn/s for the same commit rate and number of feeds.
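For anyone wanting to reproduce figures like these without iostat, the sketch below shows one way a kb/s write rate could be derived from two snapshots of /proc/diskstats on Linux. The function names are hypothetical and the two sample lines are made-up illustrative values, not measurements from my Pi. The 10th field of each line is the cumulative count of sectors written, and those sectors are always 512 bytes regardless of the filesystem block size.

```python
def sectors_written(diskstats_text, device):
    """Return the cumulative sectors-written counter for `device`."""
    for line in diskstats_text.splitlines():
        fields = line.split()
        # Fields: major, minor, name, then the I/O counters.
        if len(fields) >= 10 and fields[2] == device:
            return int(fields[9])  # 10th field: sectors written
    raise ValueError(f"device {device!r} not found")

def kb_written_per_second(before, after, device, seconds):
    """Average write rate in kB/s between two /proc/diskstats snapshots."""
    delta = sectors_written(after, device) - sectors_written(before, device)
    return delta * 512 / 1024 / seconds

# Illustrative snapshots taken 300 seconds apart (values invented for the example).
before = "179 0 mmcblk0 1000 10 20000 500 200 5 8000 300 0 400 800"
after = "179 0 mmcblk0 1000 10 20000 500 230 5 8240 320 0 420 820"

rate = kb_written_per_second(before, after, "mmcblk0", 300)
print(round(rate, 2))  # 0.4
```

On a real system the two snapshots would come from reading /proc/diskstats, sleeping for the measurement period, and reading it again.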
Re: Reducing emoncms write load for long term SD card logging and a minimal python version of emoncms
Hi, really nice work on reducing the write load! I'm currently testing your low-write-load image and so far it all seems fine.
Just a little question: the feeds "Input on-time" and "Power to kwhd" do not seem to be present in this version. Can I add them somehow?
Or is there another way to count the on-time of an input?
Re: Reducing emoncms write load for long term SD card logging and a minimal python version of emoncms
simsasaile: I will try to do a blog post on this. The approach for calculating kWh/d is different: instead of calculating daily values, you either calculate accumulating watt hours on the emontx or use the power to kwh input processor. That generates an ever-increasing graph. You can then use either the myelectric page, which pulls out the daily values, or the newly updated bargraph visualisation, which has a property called delta under vis > bargraph. If you set that to 1 it will calculate the difference between the total watt hours of each day, giving you daily watt hours.
The input on-time process needs to be re-implemented in the same way, producing an accumulating graph, but there isn't yet a process that produces an accumulating graph of on-time.
The reason for the change is that the low-write image feed engines are append only, which makes the buffering mechanism easier to implement and helps to reduce the number of writes to the same sectors on the disk. But the move to calculating daily data on the fly from accumulating graphs has its own advantages too: you can pull out daily totals in any timezone, or extract hourly or monthly totals.
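As a rough illustration of how an accumulating watt hour feed can be built from power readings, here is a small Python sketch. It is not the emoncms input processor itself; the function name and the assumption that each reading's power applies to the interval since the previous reading are mine.

```python
def power_to_wh(samples):
    """Turn (unix_time, watts) samples into (unix_time, cumulative_wh) points.

    Assumption: each sample's power is taken to apply to the whole
    interval since the previous sample.
    """
    total_wh = 0.0
    last_time = None
    points = []
    for t, watts in samples:
        if last_time is not None:
            # watts * seconds / 3600 = watt hours
            total_wh += watts * (t - last_time) / 3600.0
        last_time = t
        points.append((t, total_wh))  # only ever appended, never edited
    return points

samples = [(0, 1000.0), (1800, 1000.0), (3600, 2000.0)]
print(power_to_wh(samples))
# [(0, 0.0), (1800, 500.0), (3600, 1500.0)]
```

Because the total only ever increases and each new point is added at the end, a feed like this fits the append-only engines, and any daily, hourly or monthly total is just the difference between two points.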
Re: Reducing emoncms write load for long term SD card logging and a minimal python version of emoncms
Hi Trystan,
Thanks for your really fast reply and the help! I'm glad there is a way (or even two) to calculate the kWh/d values. The on-time is not so important.
Cheers, Simsasaile
Re: Reducing emoncms write load for long term SD card logging and a minimal python version of emoncms
Hi,
I would like to test the new emonSD-13-08-14 image on my Raspberry Pi model B.
One can see here that this ready-to-go SD card image has been tested to work on the new model B+.
Will it work on my Raspberry Pi model B?
Thanks
Eric
Re: Reducing emoncms write load for long term SD card logging and a minimal python version of emoncms
Hi Eric, Yes it will work on either.
Paul
Re: Reducing emoncms write load for long term SD card logging and a minimal python version of emoncms
Thanks Paul.
I tried it and it works as expected on the model B.
Some points that could be added to this great guide:
- how to expand the file system (I'm using an 8 GB SD card)
- how to change the mysql root password (the guide only mentions how to change the root password and the ssh password)
In this low-write buffer version there are no kwh/d and history processes. I'm wondering why these processes (and their associated engines) have not been implemented yet in this version.
Is it because the disk write load can't be reduced when using these engines/processes?
Or is it because these engines/processes were not needed to validate the low-write buffer approach?
Regards,
Eric
Re: Reducing emoncms write load for long term SD card logging and a minimal python version of emoncms
This is a great job. I was looking for something like this to make my SD card last. Thanks a lot, and I hope to see more updates on this :D
Re: Reducing emoncms write load for long term SD card logging and a minimal python version of emoncms
Hello Eric, the main reason for not yet implementing the power to kwh/d input processors is that the way the buffer works in the low-write version means the writes can only be append only. It would be great to improve on this initial implementation to support editing/updating the last or older datapoints. It's in the development plan I started writing here: https://github.com/emoncms/emoncms/issues/244
The need for power to kwh/d input processors is reduced, however, as you can now either calculate accumulating watt hours on the emontx or use the power to kwh input processor to generate an ever-increasing watt hour graph. The myelectric page and the new version of the bar graph visualisation can then pull out daily, monthly, or any other time-division totals, with the advantage that this can be done in any timezone the user wishes at visualisation time rather than being fixed before storage.
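To show why the timezone can be chosen at visualisation time, here is a minimal Python sketch that cuts an accumulating watt hour feed into daily totals at local midnight. It is not the myelectric or bar graph code; the function name, the fixed UTC offset (rather than a full timezone database), and the evenly spaced test data are assumptions for the example.

```python
from datetime import datetime, timedelta, timezone

def daily_totals(points, utc_offset_hours=0):
    """Daily Wh from (unix_time, cumulative_wh) points, cut at local midnight.

    A day's total is the first reading of the next day minus the first
    reading of that day, so the final (incomplete) day gets no total.
    """
    tz = timezone(timedelta(hours=utc_offset_hours))
    first_of_day = {}
    for t, wh in sorted(points):
        day = datetime.fromtimestamp(t, tz).date().isoformat()
        first_of_day.setdefault(day, wh)  # keep only the first reading per day
    days = sorted(first_of_day)           # ISO dates sort chronologically
    return {day: first_of_day[nxt] - first_of_day[day]
            for day, nxt in zip(days, days[1:])}

# Hourly readings for 72 hours, a steady 100 Wh per hour, starting at unix time 0.
points = [(h * 3600, 100.0 * h) for h in range(73)]

print(daily_totals(points))
# {'1970-01-01': 2400.0, '1970-01-02': 2400.0, '1970-01-03': 2400.0}

print(daily_totals(points, utc_offset_hours=2))
# {'1970-01-01': 2200.0, '1970-01-02': 2400.0, '1970-01-03': 2400.0}
```

The same stored points give different day boundaries depending on the offset passed in (the first day is two hours shorter at UTC+2), which would not be possible if daily values had been computed and stored up front.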
There's some more on this here: http://openenergymonitor.org/emon/node/3995
Trystan
Re: Reducing emoncms write load for long term SD card logging and a minimal python version of emoncms
Hi,
Thanks for replying, Trystan. I understand that the need for kwh/d input processes is reduced, but they will still be needed in some cases, for example to plot a multigraph that uses both daily and instantaneous data (see attached). EDIT: kwh/d is also needed for the zoom graph, which is a very nice graph.
I'm having difficulty using the new version of the bar graph visualisation. On EmonCMS.org I tried to plot the kwh used per day in my home but got a wrong graph (see attached). I guess I'm using the wrong config (see attached). Is there any documentation on how to use the new bar graph vis?
Eric
Re: Reducing emoncms write load for long term SD card logging and a minimal python version of emoncms
Eric,
for me the daily kWh bargraph works if I set the delta to 1. Don't ask why (or what the delta value means). Probably 1 for 'Yes'?
Jörg.
PS: the descriptions behind the edit fields are shifted. Behind the Delta setting there should be the description 'Show diff (Wh feeds)'. Then it makes sense that 1 means yes.
Re: Reducing emoncms write load for long term SD card logging and a minimal python version of emoncms
Yes, the documentation and naming there need improving; ideally there would be a dropdown for yes/no. Delta means show the difference between each bar.