Levin Zimmermann authored
There are two formats for saving data with a ZBigFile: ZBlk0 and ZBlk1. They trade access time against disk-space growth: ZBlk1 uses less disk space, while ZBlk0 has better access time. Wendelin.core users may not always know, or care, which format fits their data better. In this case it may be easier for users to let the program select the ZBlk format automatically. With this patch and the new 'h' (for heuristic) option of the 'ZBlk' argument of ZBigFile, this is now possible.

The 'h' option is not a new ZBlk format in itself; it tries to automatically select the best ZBlk format according to the characteristics of the changes that the user applies to the ZBigFile.

In its current implementation, the heuristic targets the use case of large arrays with many small append-only changes. In this case 'h' takes less space than ZBlk0, but is faster to read than ZBlk1. It does so by initially using ZBlk1 until a blk is filled up. Once a blk is full, it switches to ZBlk0, as recommended by @kirr in nexedi/wendelin.core!20 (comment 196084).

With this patch comes a test (bigfile/tests/test-zblk-fmt) that creates benchmarks for different combinations and zblk formats. The test aims to check how the 'heuristic' format performs in contrast to 'ZBlk0' and 'ZBlk1':

    ---
    Run append tests
    ---------------------------------------------
    ---------------------------------------------
    Set change_percentage_set to 0.15
    Set change_count to 500
    Set arrsize to 500000
    Set change_type to append

    Run tests with format h:
      ZODB storage size: 318.565101 MB
      Access time: 0.747 ms / blk (initially cold; might get warmer during benchmark)

    Run tests with format ZBlk0:
      ZODB storage size: 704.347196 MB
      Access time: 0.737 ms / blk (initially cold; might get warmer during benchmark)

    Run tests with format ZBlk1:
      ZODB storage size: 163.367072 MB
      Access time: 74.628 ms / blk (initially cold; might get warmer during benchmark)
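The per-block decision that the heuristic makes for append-only workloads can be illustrated with a small toy model. This is only a sketch under stated assumptions: the names `BLKSIZE`, `choose_zblk_format` and `simulate_appends` are hypothetical and are not part of wendelin.core, the block size is deliberately tiny, and "full" is modelled simply as "all bytes of the block have been written". It does not reproduce how the patch itself detects a full block.

```python
# Toy model of the append-only heuristic described in this commit message.
# All names here are hypothetical; none of them is wendelin.core API.

BLKSIZE = 8  # hypothetical block size in bytes, kept tiny for illustration


def choose_zblk_format(used_bytes, blksize=BLKSIZE):
    """Pick a format for one block: ZBlk1 while it is partially filled
    (small changes stay cheap in space), ZBlk0 once it is full (fast reads)."""
    return "ZBlk0" if used_bytes >= blksize else "ZBlk1"


def simulate_appends(append_sizes, blksize=BLKSIZE):
    """Append chunks of the given sizes to an initially empty file and
    return a list of (block_index, used_bytes, chosen_format) entries,
    one per block touched by each append."""
    history = []
    total = 0  # number of bytes written so far
    for n in append_sizes:
        first_blk = total // blksize
        total += n
        last_blk = (total - 1) // blksize
        for blk in range(first_blk, last_blk + 1):
            used = min(blksize, total - blk * blksize)
            history.append((blk, used, choose_zblk_format(used, blksize)))
    return history


if __name__ == "__main__":
    for blk, used, fmt in simulate_appends([3, 3, 3]):
        print("blk %d: %d/%d bytes used -> %s" % (blk, used, BLKSIZE, fmt))
```

In this model a block is rewritten in ZBlk1 for every small append and is saved as ZBlk0 only once, at the moment it becomes full. That is the intuition behind the benchmark above, where 'h' needs less storage than ZBlk0 while keeping an access time close to it.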
5b50a2fe