Minimizing the CPU usage impact of multi_index.modify() in loops
We were asked to do an efficiency audit of some EOSIO smart-contract code for a dapp, and we were quite shocked by the amount of nested loops they had around a user table, each iteration making its own modifications to some of the users. Knowing that multi_index.modify() is fairly expensive, that implementation would consume a huge amount of CPU and could fail with a higher number of users, exceeding the per-transaction CPU limit on EOSIO.
At first we wanted to suggest a complete rewrite, but that code was so confusing and inter-dependent that even a rewrite would have required a complete redesign of the logic itself, not just the implementation. This is usually the problem with copy/pasted “spaghetti” code, so we decided to run some experiments around multi_index tables and see if we could get some improvements out of them.
Initial state
The starting point of this experiment gave these results for one thousand table writes of a uint64_t field:
$ eoslime test
meaning that between 12 ms and 14.5 ms of CPU were being billed for the “count one thousand” action when using a multi_index table. These values come from the transaction receipt, so they are as accurate as possible according to the EOSIO internal accounting.
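For context, the kind of action being benchmarked looks roughly like the sketch below. The table name, row layout and function name are assumptions for illustration, not the actual test code: the point is that every iteration pays for a full multi_index.modify(), i.e. one serialization and one database write per loop pass.

```cpp
#include <eosio/eosio.hpp>

// Hypothetical row and table for illustration; the real benchmark code differs.
struct [[eosio::table]] counter_row {
  uint64_t id;
  uint64_t value;
  uint64_t primary_key() const { return id; }
};
using counters_table = eosio::multi_index<"counters"_n, counter_row>;

// Baseline: one multi_index::modify() per iteration, i.e. one serialization
// and one database write each time around the loop.
void count_thousand(eosio::name self) {
  counters_table counters(self, self.value);

  auto itr = counters.find(0);
  if (itr == counters.end())
    itr = counters.emplace(self, [&](auto& r) { r.id = 0; r.value = 0; });

  for (int i = 0; i < 1000; ++i)
    counters.modify(itr, self, [&](auto& r) { r.value += 1; });
}
```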
Routes identified:
- Tables caching
- Smart tables
- …
Tables caching
The idea is to cache the reads/writes from/to multi_index tables into a cache map, so that many reads of the same object require only one deserialization, and many writes to the same object do only one final serialization and write to EOSIO RAM.
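As a rough illustration of the idea (table and field names are hypothetical, and this is a simplified sketch rather than the actual caching layer), the loop work happens entirely on cached copies and only the final values are written back:

```cpp
#include <eosio/eosio.hpp>
#include <map>

// Hypothetical row and table, for illustration only.
struct [[eosio::table]] user_row {
  uint64_t id;
  uint64_t counter;
  uint64_t primary_key() const { return id; }
};
using users_table = eosio::multi_index<"users"_n, user_row>;

// Sketch of the caching idea: deserialize each row once into a std::map,
// run every loop iteration against the cached copies, and only at the end
// issue one modify() (one serialization + one RAM write) per touched row.
void bump_all(eosio::name self, uint64_t rounds) {
  users_table users(self, self.value);

  std::map<uint64_t, user_row> cache;
  for (auto itr = users.begin(); itr != users.end(); ++itr)
    cache.emplace(itr->id, *itr);            // one read per row

  for (uint64_t r = 0; r < rounds; ++r)
    for (auto& kv : cache)
      kv.second.counter += 1;                // all work happens in memory

  for (auto& [id, row] : cache)              // one write per row at the end
    users.modify(users.find(id), self, [&](auto& dbrow) { dbrow = row; });
}
```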
The final results from the same operations are:
$ eoslime test
So this experiment lowered the CPU billing by roughly 10%. It could be improved a bit further by using an unordered_map instead of a map, which is sorted by nature, but the unordered_map container in eosio is faulty, as you can check in this GitHub issue of EOSIO.CDT.
So it turns out that there is not much difference, from an efficiency point of view, between using multi_index tables and maps.
Smart tables
The idea is to make tables lazy, by using a deque to store all the entries and flushing the contents of the cache when the object is destroyed, performing only one write to the multi_index table per cache entry that was modified.
You can check the code for this experiment in the eosio-smart-table repo.
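The sketch below captures the core mechanism under the same assumptions as before (hypothetical row type, simplified interface); the actual smart_table in the repo is more complete:

```cpp
#include <eosio/eosio.hpp>
#include <deque>

// Minimal sketch of a lazy table: everything is loaded into a deque up front,
// modifications only touch the in-memory copies, and the dirty entries are
// flushed with one multi_index::modify() each when the wrapper is destroyed.
template <eosio::name::raw TableName, typename T>
class lazy_table {
  using table_t = eosio::multi_index<TableName, T>;
  table_t          _table;
  eosio::name      _payer;
  std::deque<T>    _rows;    // all entries, loaded once
  std::deque<bool> _dirty;   // which entries were modified

 public:
  lazy_table(eosio::name code, uint64_t scope, eosio::name payer)
      : _table(code, scope), _payer(payer) {
    for (auto itr = _table.begin(); itr != _table.end(); ++itr) {
      _rows.push_back(*itr);
      _dirty.push_back(false);
    }
  }

  size_t size() const { return _rows.size(); }
  T&     at(size_t pos) { return _rows[pos]; }

  // Mutate the in-memory copy; no database write happens here.
  template <typename Lambda>
  void modify(size_t pos, Lambda&& updater) {
    updater(_rows[pos]);
    _dirty[pos] = true;
  }

  // Flush: one real multi_index::modify() per dirty row, once per action.
  ~lazy_table() {
    for (size_t i = 0; i < _rows.size(); ++i) {
      if (!_dirty[i]) continue;
      auto itr = _table.find(_rows[i].primary_key());
      _table.modify(itr, _payer, [&](auto& dbrow) { dbrow = _rows[i]; });
    }
  }
};
```

With something like this, a loop that touches the same records many times pays the multi_index write cost only once per record, when the wrapper goes out of scope at the end of the action.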
The final results from the same operations, and also with the loop count multiplied by 10, are as follows:
$ eoslime test
Please notice that for one thousand updates the difference in efficiency is around 5x (an 80% saving on billed CPU), and it is also non-linear: for ten thousand updates (10x more) the total CPU billed grew by less than 9x.
smart_table limitations
- only supports primary key, no other indexes
- focuses on modify optimization, so other operations like emplace are not optimized
- right now it fills the whole internal cache at once; it might be interesting to read in chunks for large datasets
Statistics helper wrappers
In order to identify how many operations were being executed, and which ones, we also included some stats wrappers:
- smart_table_stats
- multi_index_stats
- singleton_stats
These wrappers provide operation counters for specific functionalities of each of the base classes, so that we can measure what is going on in some intensive code and decide whether it makes sense to use smart_table or not.
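As an idea of what such a wrapper can look like (a simplified sketch with hypothetical names, not the actual multi_index_stats code), it simply forwards every call to the underlying table and bumps a counter per operation:

```cpp
#include <eosio/eosio.hpp>
#include <utility>

// Sketch of an operation-counting wrapper around multi_index; the names and
// the exact set of counters are illustrative only.
template <eosio::name::raw TableName, typename T>
class counting_table {
  using table_t = eosio::multi_index<TableName, T>;
  table_t _table;

 public:
  struct counters { uint32_t finds = 0; uint32_t modifies = 0; uint32_t emplaces = 0; } stats;

  counting_table(eosio::name code, uint64_t scope) : _table(code, scope) {}

  auto find(uint64_t pk) {
    ++stats.finds;
    return _table.find(pk);
  }

  template <typename Lambda>
  void modify(typename table_t::const_iterator itr, eosio::name payer, Lambda&& updater) {
    ++stats.modifies;
    _table.modify(itr, payer, std::forward<Lambda>(updater));
  }

  template <typename Lambda>
  auto emplace(eosio::name payer, Lambda&& ctor) {
    ++stats.emplaces;
    return _table.emplace(payer, std::forward<Lambda>(ctor));
  }
};
```

Printing the counters at the end of an action (with eosio::print, for example) gives a quick view of how many table operations a given code path really performs.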
Conclusions
There’s a lot of room to improve the efficiency of smart-contract code. multi_index tables are expensive in CPU but are well optimized compared to STL counterparts like std::map. Where we can get more benefit in terms of efficiency is in minimizing the writes by using internal structures that do not require hashing/sorting, like std::vector or, even better, std::deque, if we do not require traversing the data in a single contiguous data block.
Actually, in that dapp, the impact of smart_table was not as drastic as expected, due to the multiple updates spread across multiple records, but it still reduced the repeated modifications of the same record. We had to contribute many other optimizations in order to minimize the CPU usage on that project.
Possible future improvements
- full test suite
- cache filling and flushing in chunks
- perform operations over the currently loaded chunk of data, using a lambda for instance
Hope this shows the importance of optimization and also sheds some light on the internals of eosio and what/where we can improve.