EOSIO efficiency experiments

Minimizing the CPU usage impact of multi_index.modify() inside loops

We were asked to do an efficiency audit on some EOSIO smart-contract code for a dapp, and we were quite shocked by the number of nested loops they had around a user table, each iteration making its own modifications to some of the users. Knowing that multi_index.modify() is a bit expensive, that implementation would consume a huge amount of CPU and could fail with a higher number of users by exceeding the per-transaction CPU limit on EOSIO.
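
To illustrate the kind of pattern we found, here is a hypothetical, heavily simplified action (the table, field names and logic are made up for illustration) that calls multi_index.modify() inside nested loops, paying one serialization and one RAM write per call:

#include <eosio/eosio.hpp>
using namespace eosio;

class [[eosio::contract("example")]] example : public contract {
public:
   using contract::contract;

   struct [[eosio::table]] user {
      uint64_t id;
      uint64_t score;
      uint64_t primary_key() const { return id; }
   };
   typedef multi_index<"users"_n, user> users_table;

   // Every modify() call serializes the row and writes it back to eosio RAM,
   // so the billed CPU grows roughly with users * users.
   [[eosio::action]]
   void rebalance() {
      users_table users(get_self(), get_self().value);
      for (auto outer = users.begin(); outer != users.end(); ++outer) {
         for (auto inner = users.begin(); inner != users.end(); ++inner) {
            users.modify(inner, get_self(), [&](auto& u) {
               u.score += outer->score;   // one serialization + RAM write per call
            });
         }
      }
   }
};

With n users this performs n² modify() calls, which is exactly what blows up the billed CPU.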

At first we thought of suggesting a complete rewrite, but that code was so confusing and inter-dependent that even that would require a complete redesign of the logic itself, not just the implementation. This is usually the problem with copy/pasted “spaghetti” code, so we decided to run some experiments around multi_index tables and see if we could get some improvements out of them.

Initial state

The starting point of this experiment produced these results for a thousand table writes of a uint64_t field:

$ eoslime test

Test profile contract
CPU: 354 NET: 184
✓ update John Doe profile (140ms)
CPU: 283 NET: 192
✓ update Jane Doe profile (185ms)
CPU: 12009 NET: 104
✓ John Doe count one thousand (228ms)
CPU: 14588 NET: 104
✓ Jane Doe count one thousand (213ms)

4 passing (3s)

meaning that between 12ms and 14.5ms of CPU were being billed for the “count one thousand” action when using a multi_index table. These values come from the transaction receipt, so they are the most accurate possible according to the EOSIO internal accounting.

Routes identified:

  • Tables caching
  • Smart tables

Tables caching

The idea is to cache the reads/writes from/to multi_index tables in a cache map, so that many reads of the same object require only one deserialization and many writes to the same object result in only one final serialization and write to eosio RAM.
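
A minimal sketch of that caching idea, assuming a row type with the usual primary_key() member (the class and member names here are illustrative, not the exact code we benchmarked):

#include <eosio/eosio.hpp>
#include <map>

// Illustrative cache around an existing multi_index table: reads deserialize a
// row at most once into a std::map, repeated writes only touch the cached copy,
// and flush() issues a single multi_index.modify() per dirty row at the end.
template <typename MultiIndex, typename T>
class cached_table {
public:
   cached_table(MultiIndex& table, eosio::name payer) : _table(table), _payer(payer) {}

   // Read through the cache; eosio RAM is only touched on a cache miss.
   const T& get(uint64_t pk) { return load(pk).value; }

   // Apply the update to the cached copy only and mark it dirty.
   template <typename Lambda>
   void modify(uint64_t pk, Lambda&& updater) {
      entry& e = load(pk);
      updater(e.value);
      e.dirty = true;
   }

   // One real table write per modified row, no matter how often it was updated.
   void flush() {
      for (auto& kv : _cache) {
         if (!kv.second.dirty) continue;
         _table.modify(_table.find(kv.first), _payer, [&](T& row) { row = kv.second.value; });
         kv.second.dirty = false;
      }
   }

private:
   struct entry { T value; bool dirty; };

   entry& load(uint64_t pk) {
      auto it = _cache.find(pk);
      if (it == _cache.end())
         it = _cache.emplace(pk, entry{_table.get(pk, "row not found"), false}).first;
      return it->second;
   }

   MultiIndex& _table;
   eosio::name _payer;
   std::map<uint64_t, entry> _cache;   // sorted by nature; unordered_map would be
                                       // preferable but is problematic in eosio.cdt
};

The action then reads and modifies through the cache inside its loops and calls flush() once at the end, so each row is serialized and written at most once per transaction.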

The final results from the same operations are:

$ eoslime test

Test profile contract
CPU: 390 NET: 184
✓ update John Doe profile (197ms)
CPU: 342 NET: 192
✓ update Jane Doe profile (237ms)
CPU: 10602 NET: 104
✓ John Doe count one thousand (194ms)
CPU: 11437 NET: 104
✓ Jane Doe count one thousand (202ms)

4 passing (3s)

So this experiment brought the CPU billing down by roughly 10%. It could be improved a bit further by using an unordered_map instead of a map, which is sorted by nature, but the unordered_map container in eosio is faulty, as you can check in this github issue of EOSIO.CDT.

So it turns out that, from an efficiency point of view, there is not much difference between using multi_index tables and maps.

Smart tables

The idea is to make tables lazy by using a deque to store all the entries and flushing the contents of the cache when the object is destroyed, performing only one write to the multi_index table per modified entry in the cache.

You can check the code for this experiment in the eosio-smart-table repo.
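
For reference, here is a stripped-down sketch of the concept (the names and details below are illustrative; the actual implementation lives in the repo above):

#include <eosio/eosio.hpp>
#include <deque>

// Illustrative lazy table: load every row into a std::deque up front, let the
// contract modify the in-memory copies as often as it wants, and write each
// modified row back to the multi_index table exactly once when flushed.
template <typename MultiIndex, typename T>
class smart_table_sketch {
public:
   smart_table_sketch(MultiIndex& table, eosio::name payer) : _table(table), _payer(payer) {
      for (const T& row : _table)            // current behaviour: fill the cache all at once
         _rows.push_back(entry{row, false});
   }

   ~smart_table_sketch() { flush(); }        // lazy write-back when the object is destroyed

   // Look a row up by primary key in the cache; nullptr when it does not exist.
   const T* find(uint64_t pk) const {
      for (const auto& e : _rows)
         if (e.value.primary_key() == pk) return &e.value;
      return nullptr;
   }

   // Repeated modifications only touch the in-memory copy.
   template <typename Lambda>
   void modify(uint64_t pk, Lambda&& updater) {
      for (auto& e : _rows) {
         if (e.value.primary_key() != pk) continue;
         updater(e.value);
         e.dirty = true;
         return;
      }
      eosio::check(false, "row not found");
   }

   // Exactly one multi_index.modify() per entry that was actually changed.
   void flush() {
      for (auto& e : _rows) {
         if (!e.dirty) continue;
         _table.modify(_table.find(e.value.primary_key()), _payer, [&](T& row) { row = e.value; });
         e.dirty = false;
      }
   }

private:
   struct entry { T value; bool dirty; };
   MultiIndex& _table;
   eosio::name _payer;
   std::deque<entry> _rows;
};

The deque avoids the hashing/sorting overhead of a map, at the cost of linear lookups in this simplified sketch.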

The final results for the same operations, with the last test scaled up to 10x the iterations, are as follows:

$ eoslime test

Test profile contract
CPU: 346 NET: 184
✓ update John Doe profile (114ms)
CPU: 408 NET: 192
✓ update Jane Doe profile (134ms)
CPU: 2961 NET: 104
✓ John Doe count one thousand (186ms)
CPU: 27497 NET: 104
✓ Jane Doe count ten thousand (151ms)

4 passing (3s)

Please notice that for one thousand updates the difference in efficiency is around 5x (roughly 80% saving on billed CPU), and the behaviour is also sub-linear: for ten thousand updates (10x more) the total CPU billed only grew by about 9.3x.

smart_table limitations

  • only supports a primary key, no other indexes
  • focuses on optimizing modify, so other operations like emplace are not optimized
  • right now it fills the whole internal cache at once; it might be interesting to read in chunks for large datasets

Statistics helper wrappers

In order to identify which operations were being executed and how many times, we also included some stats wrappers.

  • smart_table_stats
  • multi_index_stats
  • singleton_stats

These wrappers provide operation counters for specific functionalities of each of the base classes, so that we can measure what is going on in some intensive code and decide whether it makes sense to use smart_table or not.
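
As an illustration of the approach (the real wrappers differ in details, so take the names below as hypothetical), a counting shim around a table type can be as simple as this:

#include <eosio/eosio.hpp>
#include <utility>

// Hypothetical counting shim: forwards a few operations to the wrapped table
// and bumps a counter per operation, so an intensive action can report how
// many finds/modifies/emplaces it actually performed.
template <typename Table>
class table_stats {
public:
   template <typename... Args>
   explicit table_stats(Args&&... args) : _table(std::forward<Args>(args)...) {}

   uint32_t finds = 0, modifies = 0, emplaces = 0;

   auto find(uint64_t pk) { ++finds; return _table.find(pk); }

   template <typename Lambda>
   void modify(typename Table::const_iterator itr, eosio::name payer, Lambda&& f) {
      ++modifies;
      _table.modify(itr, payer, std::forward<Lambda>(f));
   }

   template <typename Lambda>
   auto emplace(eosio::name payer, Lambda&& f) {
      ++emplaces;
      return _table.emplace(payer, std::forward<Lambda>(f));
   }

   // Dump the counters to the action console.
   void report() const {
      eosio::print("find: ", finds, " modify: ", modifies, " emplace: ", emplaces, "\n");
   }

private:
   Table _table;
};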

Conclusions

There is a lot of room to improve the efficiency of smart contract code. multi_index tables are expensive in CPU, but they are well optimized compared with STL counterparts like std::map. Where we can gain the most efficiency is in minimizing writes, by using internal structures that do not require hashing/sorting, like std::vector or, even better, std::deque, if we do not require traversing the data as a single contiguous data block.

In that dapp, the impact of smart_table was actually not as drastic as expected, because the updates were spread across multiple records, but it still reduced the cost of the repeated modifications to the same record. We had to contribute many other optimizations in order to minimize the CPU usage on that project.

Possible future improvements

  • full test suite
  • cache filling and flushing in chunks
  • perform operations over the currently loaded chunk of data, using a lambda for instance (a sketch of this follows below)
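
As a purely hypothetical illustration of that last point, a chunked traversal helper could take roughly this shape:

#include <cstddef>
#include <type_traits>
#include <vector>

// Hypothetical shape for the chunked idea: walk the table in fixed-size chunks,
// hand each chunk to a caller-supplied lambda, and keep the in-memory footprint
// bounded instead of caching the whole table up front.
template <typename MultiIndex, typename Lambda>
void for_each_chunk(MultiIndex& table, std::size_t chunk_size, Lambda&& fn) {
   using row_t = std::decay_t<decltype(*table.begin())>;
   std::vector<row_t> chunk;
   chunk.reserve(chunk_size);
   for (auto itr = table.begin(); itr != table.end(); ++itr) {
      chunk.push_back(*itr);
      if (chunk.size() == chunk_size) {
         fn(chunk);          // operate on the currently loaded chunk
         chunk.clear();
      }
   }
   if (!chunk.empty()) fn(chunk);   // last partial chunk
}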

We hope this shows the importance of optimization and also sheds some light on the internals of eosio and what/where we can improve.
