Use a tempfile / rename for atomic writes in order to prevent dataloss during a HardFault on a repeater#2386
Use a tempfile / rename for atomic writes in order to prevent dataloss during a HardFault on a repeater#2386winnieXY wants to merge 2 commits intomeshcore-dev:devfrom
Conversation
…s on a HardFault on a repeater.
|
@oltaco : Could you review this patch? For repeaters and roomservers the 1MB flash ram is sufficient for this operation. I've tested that on a P1 and it's working. |
| #include "ClientACL.h" | ||
|
|
||
| static File openWrite(FILESYSTEM* _fs, const char* filename) { | ||
| static File openWriteTemp(FILESYSTEM* _fs, const char* filename, const char* tmp_filename) { |
There was a problem hiding this comment.
A cleaner way would be to keep method sig the same, but for NRF52 case, concat ".tmp" to the given filename, as use that to open in write mode
There was a problem hiding this comment.
I agree, it would be much cleaner to just let openWrite handle the atomicity for the NRF52 case. All the openWrite callsites could then just stay as they are.
There was a problem hiding this comment.
Done.
I added a getTmpPath() function to handle the tmp path name generation and moved back to openWrite().
| } | ||
| file.close(); | ||
|
|
||
| #if defined(NRF52_PLATFORM) || defined(STM32_PLATFORM) |
There was a problem hiding this comment.
@oltaco Can you verify that _fs->rename() definitely will work in this case? ie. does the real_path file get overwritten/replaced?
There was a problem hiding this comment.
Confirmed, the underlying lfs_rename() call will point the old directory entry for file at the blocks referenced by file.tmp and then remove the directory entry for file.tmp
I think the total maximum size of files on disk for repeaters/rooms is much less than that, I think less than 10kB in total? The UserData area that repeaters and room servers use is only 28kb total (actual usable space is less than due to overheads). |
…ate function instead.
This is the first patch to make repeaters more reliable in the field - even if errors occur.
On a repeater the atomic write (tmpfile + rename) should work even on nrf52 Repeaters with internal flash, the amount of data stored is much smaller than the data used on a companion. I estimated the total size needed to be around 40-50kB with MAX_CLIENTS and MAX_REGIONS = 32.
This patch should ensure that no setting is lost during a failed write operation into the flash on a repeater. The repeater will be still stuck - but after restart all settings are preserved.
This is most likely the first of several smaller patches to mitigate this bug #2283.