Commit 8e043523 authored by Yoni Fogel

Refs Tokutek/ft-index#46 Added some comments, deleted unused/commented out code, renamed variables

parent b0ccec78
See wiki:
https://github.com/Tokutek/ft-index/wiki/Improving-in-memory-query-performance---Design
ft/bndata.{cc,h} The basement node was heavily modified to split keys from values and to inline the keys.
bn_data::initialize_from_separate_keys_and_vals
This is effectively the deserialize routine for the format with inlined keys.
The bn_data::omt_* functions (probably badly named) more or less treat the basement node as an omt of key+leafentry pairs.
There are many references to 'omt' that could be renamed to dmt if it's worth it.
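To make the key/value split concrete, here is a minimal sketch of a record that stores the key inline next to an offset into a separate leafentry buffer. It is only an illustration of the idea: toy_klpair_header, toy_keylen_from_klpair_len, toy_pack_klpair and toy_get_le are hypothetical names, and std::vector stands in for the real dmt and mempool.

```cpp
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <vector>

// Hypothetical stand-in for a key/leafentry pair: a fixed-size header holding
// the leafentry's offset in a separate value buffer, followed by the key bytes.
struct toy_klpair_header {
    uint32_t le_offset;  // offset of the leafentry within the value buffer
};

// The stored record length covers header + key, so the key length is derived
// from it instead of being stored separately.
static constexpr uint32_t toy_keylen_from_klpair_len(uint32_t klpair_len) {
    return klpair_len - static_cast<uint32_t>(sizeof(toy_klpair_header));
}

// Pack the header and the key into one contiguous record.
static std::vector<uint8_t> toy_pack_klpair(const void *key, uint32_t keylen, uint32_t le_offset) {
    std::vector<uint8_t> rec(sizeof(toy_klpair_header) + keylen);
    toy_klpair_header h{le_offset};
    std::memcpy(rec.data(), &h, sizeof(h));
    if (keylen > 0) {
        std::memcpy(rec.data() + sizeof(h), key, keylen);
    }
    return rec;
}

// Resolve the leafentry through the stored offset. Because the record holds an
// offset rather than a pointer, it stays valid if the value buffer is moved.
static const uint8_t *toy_get_le(const std::vector<uint8_t> &rec, const std::vector<uint8_t> &vals) {
    toy_klpair_header h;
    std::memcpy(&h, rec.data(), sizeof(h));
    return vals.data() + h.le_offset;
}
```

For example, packing the 3-byte key "key" with le_offset 2 yields a 7-byte record, and toy_get_le still finds the same leafentry after the value buffer has been reallocated.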
util/dmt.{cc,h} The new DMT structure
Possible questions:
1-Should we merge dmt<> & omt<>? (delete omt entirely)
2-Should omt<> become a wrapper for dmt<>?
3-Should we just keep both around?
If we plan to keep both around for a while (option 3), should we get rid of any scaffolding that exists only to make 1 or 2 easier?
The dmt is basically an omt with dynamically sized nodes/values.
There are two representations: an array of values, or a tree of nodes.
The high-level algorithm is basically the same for dmt and omt, except that in tree form the dmt tries not to move values around.
Instead, it moves the nodes' metadata around.
Insertion into a dmt requires a functor that can provide the size of the value, since values are expected to be (at least potentially) dynamically sized.
The dmt does not revert to array form when rebalancing the root, but it CAN revert to array form when it prepares for serializing (if it notices everything is fixed length).
The dmt can also serialize and deserialize the set of values it represents. It saves no information about the dmt itself, just the values.
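The size-providing functor and the fixed-length check can be sketched with a toy append-only builder. This is not the real dmt::builder: string_functor, toy_builder, get_size and write_to are hypothetical names, loosely modeled on the get_dmtdatain_t_size/write_dmtdata_t_to pair in this commit, and the flat byte vector stands in for the array representation.

```cpp
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <string>
#include <vector>

// Hypothetical functor: reports how many bytes the value needs and writes
// the value into memory that the container provides.
class string_functor {
public:
    explicit string_functor(const std::string &s) : m_s(s) {}
    size_t get_size(void) const { return m_s.size(); }
    void write_to(uint8_t *dest) const { std::memcpy(dest, m_s.data(), m_s.size()); }
private:
    const std::string &m_s;  // caller keeps the string alive during append()
};

// Toy append-only builder. It copies each value into a flat buffer and
// tracks whether every value appended so far has the same length.
class toy_builder {
public:
    template<typename functor_t>
    void append(const functor_t &f) {
        const uint32_t len = static_cast<uint32_t>(f.get_size());
        if (m_offsets.empty()) {
            m_fixed_length = len;          // first value defines the candidate length
        } else if (len != m_fixed_length) {
            m_values_same_size = false;    // once variable, always variable
        }
        m_offsets.push_back(static_cast<uint32_t>(m_bytes.size()));
        m_bytes.resize(m_bytes.size() + len);
        if (len > 0) {
            f.write_to(m_bytes.data() + m_offsets.back());
        }
    }
    bool value_length_is_fixed(void) const { return m_values_same_size; }
    uint32_t size(void) const { return static_cast<uint32_t>(m_offsets.size()); }
private:
    std::vector<uint8_t> m_bytes;     // packed values, in append order
    std::vector<uint32_t> m_offsets;  // start of each value in m_bytes
    uint32_t m_fixed_length = 0;
    bool m_values_same_size = true;
};
```

Appending "aaaa" and "bbbb" leaves value_length_is_fixed() true; appending "aaaa" and then "cc" makes it false. A container that knows all values share one length can serialize them as a plain array, which is the case the optimized basement-node format relies on.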
Some comments about what's in each file.
ft/CMakeLists.txt
add dmt-wrapper (test wrapper, nearly identical to ft/omt.cc which is also a test wrapper)
ft/dmt-wrapper.cc/h
Just like ft/omt.{cc,h}: a test wrapper for the dmt that implements a version of the old (non-templated) omt tests.
ft/ft-internal.h
Additional engine status
ft/ft-ops.cc/h
Additional engine status
in ftnode_memory_size()
fix a minor bug where we didn't count all the memory.
comments
ft/ft_layout_version.h
Update comment describing version change.
NOTE: May need to add version 26 if 25 is sent to customers before this goes live.
Adding 26 requires additional code changes (limited to a subset of places where version 24/25 are referred to)
ft/ft_node-serialize.cc
Changes calculation of size of a leaf node to include basement-node header
Adds optimized serialization for basement nodes with fixed-length keys
Maintains old method when not using fixed-length keys.
rebalance_ftnode_leaf()
Minor changes since key/leafentries are separated
deserialize_ftnode_partition()
Minor changes, including passing rbuf directly to child function (so ndone calculation is done by child)
ft/memarena.cc
Changes so that toku_memory_footprint is more accurate. (Not exactly related to this project.)
ft/rollback.cc
Just uses new memarena function for memory footprint
ft/tests/dmt-test.cc
"clone" of old omt-test (non templated) ported to dmt
Basically not worth looking at except to make sure it imports dmt instead of omt.
ft/tests/dmt-test2.cc
New dmt tests.
You might decide not enough new tests were implemented.
ft/tests/ft-serialize-benchmark.cc
Minor improvements so that you can take an average of a bunch of runs.
ft/tests/ft-serialize-test.cc
Just ported to changed api
ft/tests/test-pick-child-to-flush.cc
The new basement-node headers reduce available memory, so reduce the max size of the test appropriately.
ft/wbuf.h
Added wbuf_nocrc_reserve_literal_bytes()
Gives you a pointer into the wbuf to write to directly, and marks that memory as used.
util/mempool.cc
Made mempool allocations aligned to cachelines
Minor 'const' changes to help compilation
Some utility functions to get/give offsets
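As an illustration of the "reserve" idea described for ft/wbuf.h above, here is a minimal sketch with a toy write buffer. The struct and function names (toy_wbuf, toy_wbuf_literal_bytes, toy_wbuf_reserve_literal_bytes) are hypothetical and only mirror the described behavior; the real struct wbuf and wbuf_nocrc_reserve_literal_bytes() may differ in signature and details.

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>

// Toy write buffer: a caller-owned byte array plus a count of bytes written.
struct toy_wbuf {
    uint8_t *buf;
    uint32_t size;
    uint32_t ndone;
};

// Copy path: the caller already has the bytes and they are copied in.
static inline void toy_wbuf_literal_bytes(toy_wbuf *w, const void *bytes, uint32_t nbytes) {
    assert(nbytes <= w->size - w->ndone);
    if (nbytes > 0) {
        std::memcpy(w->buf + w->ndone, bytes, nbytes);
    }
    w->ndone += nbytes;
}

// Reserve path: mark nbytes as used and hand back a pointer to them, so the
// caller can produce the bytes in place instead of building them elsewhere
// and copying.
static inline uint8_t *toy_wbuf_reserve_literal_bytes(toy_wbuf *w, uint32_t nbytes) {
    assert(nbytes <= w->size - w->ndone);
    uint8_t *dest = w->buf + w->ndone;
    w->ndone += nbytes;
    return dest;
}
```

A caller would reserve the space first and then fill it, for example by letting a container write its packed values straight into the returned pointer.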
@@ -120,6 +120,8 @@ void bn_data::remove_key(uint32_t keylen) {
     m_disksize_of_keys -= sizeof(keylen) + keylen;
 }
+// Deserialize from format optimized for keys being inlined.
+// Currently only supports fixed-length keys.
 void bn_data::initialize_from_separate_keys_and_vals(uint32_t num_entries, struct rbuf *rb, uint32_t data_size, uint32_t version UU(),
                                                      uint32_t key_data_size, uint32_t val_data_size, bool all_keys_same_length,
                                                      uint32_t fixed_key_length) {
@@ -149,17 +151,18 @@ void bn_data::initialize_from_separate_keys_and_vals(uint32_t num_entries, struc
     invariant(rb->ndone - ndone_before == data_size);
 }
-// static inline void rbuf_literal_bytes (struct rbuf *r, bytevec *bytes, unsigned int n_bytes) {
+// If we have fixed-length keys, we prepare the dmt and mempool.
+// The mempool is prepared by removing any fragmented space and ordering leafentries in the same order as their keys.
 void bn_data::prepare_to_serialize(void) {
-    if (m_buffer.is_value_length_fixed()) {
+    if (m_buffer.value_length_is_fixed()) {
         m_buffer.prepare_for_serialize();
         omt_compress_kvspace(0, nullptr, true); // Gets it ready for easy serialization.
     }
 }
 void bn_data::serialize_header(struct wbuf *wb) const {
-    bool fixed = m_buffer.is_value_length_fixed();
+    bool fixed = m_buffer.value_length_is_fixed();
     //key_data_size
     wbuf_nocrc_uint(wb, m_disksize_of_keys);
@@ -175,27 +178,31 @@ void bn_data::serialize_header(struct wbuf *wb) const {
 void bn_data::serialize_rest(struct wbuf *wb) const {
     //Write keys
-    invariant(m_buffer.is_value_length_fixed()); //Assumes prepare_to_serialize was called
+    invariant(m_buffer.value_length_is_fixed()); //Assumes prepare_to_serialize was called
     m_buffer.serialize_values(m_disksize_of_keys, wb);
     //Write leafentries
-    paranoid_invariant(toku_mempool_get_frag_size(&m_buffer_mempool) == 0); //Just ran omt_compress_kvspace
+    //Just ran omt_compress_kvspace so there is no fragmentation and also leafentries are in sorted order.
+    paranoid_invariant(toku_mempool_get_frag_size(&m_buffer_mempool) == 0);
     uint32_t val_data_size = toku_mempool_get_used_space(&m_buffer_mempool);
     wbuf_nocrc_literal_bytes(wb, toku_mempool_get_base(&m_buffer_mempool), val_data_size);
 }
+// No optimized (de)serialize method implemented (yet?) for non-fixed length keys.
 bool bn_data::need_to_serialize_each_leafentry_with_key(void) const {
-    return !m_buffer.is_value_length_fixed();
+    return !m_buffer.value_length_is_fixed();
 }
+// Deserialize from rbuf
 void bn_data::initialize_from_data(uint32_t num_entries, struct rbuf *rb, uint32_t data_size, uint32_t version) {
-    uint32_t key_data_size = data_size; // overallocate if < version 25
-    uint32_t val_data_size = data_size; // overallocate if < version 25
+    uint32_t key_data_size = data_size; // overallocate if < version 25 (best guess that is guaranteed not too small)
+    uint32_t val_data_size = data_size; // overallocate if < version 25 (best guess that is guaranteed not too small)
     bool all_keys_same_length = false;
     bool keys_vals_separate = false;
     uint32_t fixed_key_length = 0;
+    // In version 24 and older there is no header. Skip reading header for old version.
     if (version >= FT_LAYOUT_VERSION_25) {
         uint32_t ndone_before = rb->ndone;
         key_data_size = rbuf_int(rb);
@@ -214,6 +221,7 @@ void bn_data::initialize_from_data(uint32_t num_entries, struct rbuf *rb, uint32
             return;
         }
     }
+    // Version >= 25 and version 24 deserialization are now identical except that <= 24 might allocate too much memory.
     bytevec bytes;
     rbuf_literal_bytes(rb, &bytes, data_size);
     const unsigned char *CAST_FROM_VOIDP(buf, bytes);
@@ -259,7 +267,7 @@ void bn_data::initialize_from_data(uint32_t num_entries, struct rbuf *rb, uint32
             curr_src_pos += keylen;
         }
         uint32_t le_offset = curr_dest_pos - newmem;
-        dmt_builder.insert_sorted(toku::dmt_functor<klpair_struct>(keylen, le_offset, keyp));
+        dmt_builder.append(toku::dmt_functor<klpair_struct>(keylen, le_offset, keyp));
         add_key(keylen);
         // now curr_dest_pos is pointing to where the leafentry should be packed
@@ -285,8 +293,8 @@ void bn_data::initialize_from_data(uint32_t num_entries, struct rbuf *rb, uint32
             curr_src_pos += num_rest_bytes;
         }
     }
-    dmt_builder.build_and_destroy(&this->m_buffer);
-    toku_note_deserialized_basement_node(m_buffer.is_value_length_fixed());
+    dmt_builder.build(&this->m_buffer);
+    toku_note_deserialized_basement_node(m_buffer.value_length_is_fixed());
 #if TOKU_DEBUG_PARANOID
     uint32_t num_bytes_read = (uint32_t)(curr_src_pos - buf);
@@ -298,6 +306,8 @@ void bn_data::initialize_from_data(uint32_t num_entries, struct rbuf *rb, uint32
     toku_mempool_init(&m_buffer_mempool, newmem, (size_t)(curr_dest_pos - newmem), allocated_bytes_vals);
     paranoid_invariant(get_disk_size() == data_size);
+    // Versions older than 25 might have allocated too much memory. Try to shrink the mempool now that we
+    // know how much memory we need.
     if (version < FT_LAYOUT_VERSION_25) {
         //Maybe shrink mempool. Unnecessary after version 25
         size_t used = toku_mempool_get_used_space(&m_buffer_mempool);
@@ -336,7 +346,7 @@ void bn_data::delete_leafentry (
     uint32_t idx,
     uint32_t keylen,
     uint32_t old_le_size
     )
 {
     remove_key(keylen);
     m_buffer.delete_at(idx);
@@ -361,11 +371,15 @@ static int move_it (const uint32_t, klpair_struct *klpair, const uint32_t idx UU
 }
 // Compress things, and grow the mempool if needed.
+// May (always if force_compress) have a side effect of putting contents of mempool in sorted order.
 void bn_data::omt_compress_kvspace(size_t added_size, void **maybe_free, bool force_compress) {
     uint32_t total_size_needed = toku_mempool_get_used_space(&m_buffer_mempool) + added_size;
     // set the new mempool size to be twice of the space we actually need.
     // On top of the 25% that is padded within toku_mempool_construct (which we
     // should consider getting rid of), that should be good enough.
+    // If there is no fragmentation, e.g. in serial inserts, we can just increase the size of the mempool
+    // with a realloc. (force_compress means we NEED the side effect that all contents are put in sorted order).
     if (!force_compress && toku_mempool_get_frag_size(&m_buffer_mempool) == 0) {
         // Skip iterate, just realloc.
         toku_mempool_realloc_larger(&m_buffer_mempool, 2*total_size_needed);
@@ -401,7 +415,6 @@ LEAFENTRY bn_data::mempool_malloc_and_update_omt(size_t size, void **maybe_free)
     return (LEAFENTRY)v;
 }
-//TODO: probably not free the "maybe_free" right away?
 void bn_data::get_space_for_overwrite(
     uint32_t idx,
     const void* keyp UU(),
@@ -418,13 +431,13 @@ void bn_data::get_space_for_overwrite(
         );
     toku_mempool_mfree(&m_buffer_mempool, nullptr, old_le_size); // Must pass nullptr, since le is no good any more.
     KLPAIR klp = nullptr;
-    uint32_t klpair_len; //TODO: maybe delete klpair_len
+    uint32_t klpair_len;
     int r = m_buffer.fetch(idx, &klpair_len, &klp);
     invariant_zero(r);
     paranoid_invariant(klp!=nullptr);
     // Key never changes.
     paranoid_invariant(keylen_from_klpair_len(klpair_len) == keylen);
-    paranoid_invariant(!memcmp(klp->key_le, keyp, keylen)); // TODO: can keyp be pointing to the old space? If so this could fail
+    paranoid_invariant(!memcmp(klp->key, keyp, keylen)); // TODO: can keyp be pointing to the old space? If so this could fail
     size_t new_le_offset = toku_mempool_get_offset_from_pointer_and_base(&this->m_buffer_mempool, new_le);
     paranoid_invariant(new_le_offset <= UINT32_MAX - new_size); // Not using > 4GB
@@ -439,7 +452,6 @@ void bn_data::get_space_for_overwrite(
     }
 }
-//TODO: probably not free the "maybe_free" right away?
 void bn_data::get_space_for_insert(
     uint32_t idx,
     const void* keyp,
@@ -499,7 +511,7 @@ void bn_data::move_leafentries_to(
         void* new_le = toku_mempool_malloc(dest_mp, le_size, 1);
         memcpy(new_le, old_le, le_size);
         size_t le_offset = toku_mempool_get_offset_from_pointer_and_base(dest_mp, new_le);
-        dest_bd->m_buffer.insert_at(dmt_functor<klpair_struct>(keylen_from_klpair_len(curr_kl_len), le_offset, curr_kl->key_le), i-lbi);
+        dest_bd->m_buffer.insert_at(dmt_functor<klpair_struct>(keylen_from_klpair_len(curr_kl_len), le_offset, curr_kl->key), i-lbi);
         this->remove_key(keylen_from_klpair_len(curr_kl_len));
         dest_bd->add_key(keylen_from_klpair_len(curr_kl_len));
@@ -534,7 +546,6 @@ void bn_data::destroy(void) {
     m_disksize_of_keys = 0;
 }
-//TODO: Splitting key/val requires changing this
 void bn_data::replace_contents_with_clone_of_sorted_array(
     uint32_t num_les,
     const void** old_key_ptrs,
@@ -557,10 +568,10 @@ void bn_data::replace_contents_with_clone_of_sorted_array(
         void* new_le = toku_mempool_malloc(&m_buffer_mempool, le_sizes[idx], 1);
         memcpy(new_le, old_les[idx], le_sizes[idx]);
         size_t le_offset = toku_mempool_get_offset_from_pointer_and_base(&m_buffer_mempool, new_le);
-        dmt_builder.insert_sorted(dmt_functor<klpair_struct>(old_keylens[idx], le_offset, old_key_ptrs[idx]));
+        dmt_builder.append(dmt_functor<klpair_struct>(old_keylens[idx], le_offset, old_key_ptrs[idx]));
         add_key(old_keylens[idx]);
     }
-    dmt_builder.build_and_destroy(&this->m_buffer);
+    dmt_builder.build(&this->m_buffer);
 }
 LEAFENTRY bn_data::get_le_from_klpair(const klpair_struct *klpair) const {
@@ -586,7 +597,7 @@ int bn_data::fetch_klpair(uint32_t idx, LEAFENTRY *le, uint32_t *len, void** key
     int r = m_buffer.fetch(idx, &klpair_len, &klpair);
     if (r == 0) {
         *len = keylen_from_klpair_len(klpair_len);
-        *key = klpair->key_le;
+        *key = klpair->key;
         *le = get_le_from_klpair(klpair);
     }
     return r;
@@ -608,7 +619,7 @@ int bn_data::fetch_le_key_and_len(uint32_t idx, uint32_t *len, void** key) {
     int r = m_buffer.fetch(idx, &klpair_len, &klpair);
     if (r == 0) {
         *len = keylen_from_klpair_len(klpair_len);
-        *key = klpair->key_le;
+        *key = klpair->key;
     }
     return r;
 }
......
@@ -96,39 +96,29 @@ PATENT RIGHTS GRANT:
 #include <util/dmt.h>
 #include "leafentry.h"
-#if 0 //for implementation
-static int
-UU() verify_in_mempool(OMTVALUE lev, uint32_t UU(idx), void *mpv)
-{
-    LEAFENTRY CAST_FROM_VOIDP(le, lev);
-    struct mempool *CAST_FROM_VOIDP(mp, mpv);
-    int r = toku_mempool_inrange(mp, le, leafentry_memsize(le));
-    lazy_assert(r);
-    return 0;
-}
-toku_omt_iterate(bn->buffer, verify_in_mempool, &bn->buffer_mempool);
-#endif
+// Key/leafentry pair stored in a dmt. The key is inlined, the offset (in leafentry mempool) is stored for the leafentry.
 struct klpair_struct {
     uint32_t le_offset; //Offset of leafentry (in leafentry mempool)
-    uint8_t key_le[0]; // key, followed by le
+    uint8_t key[0]; // key, followed by le
 };
 static constexpr uint32_t keylen_from_klpair_len(const uint32_t klpair_len) {
-    return klpair_len - __builtin_offsetof(klpair_struct, key_le);
+    return klpair_len - __builtin_offsetof(klpair_struct, key);
 }
 typedef struct klpair_struct KLPAIR_S, *KLPAIR;
-static_assert(__builtin_offsetof(klpair_struct, key_le) == 1*sizeof(uint32_t), "klpair alignment issues");
-static_assert(__builtin_offsetof(klpair_struct, key_le) == sizeof(klpair_struct), "klpair size issues");
+static_assert(__builtin_offsetof(klpair_struct, key) == 1*sizeof(uint32_t), "klpair alignment issues");
+static_assert(__builtin_offsetof(klpair_struct, key) == sizeof(klpair_struct), "klpair size issues");
+// A wrapper for the heaviside function provided to dmt->find*.
+// Needed because the heaviside functions provided to bndata do not know about the internal types.
+// Alternative to this wrapper is to expose accessor functions and rewrite all the external heaviside functions.
 template<typename dmtcmp_t,
          int (*h)(const DBT &, const dmtcmp_t &)>
 static int wrappy_fun_find(const uint32_t klpair_len, const klpair_struct &klpair, const dmtcmp_t &extra) {
     DBT kdbt;
-    kdbt.data = const_cast<void*>(reinterpret_cast<const void*>(klpair.key_le));
+    kdbt.data = const_cast<void*>(reinterpret_cast<const void*>(klpair.key));
     kdbt.size = keylen_from_klpair_len(klpair_len);
     return h(kdbt, extra);
 }
@@ -140,17 +130,21 @@ struct wrapped_iterate_extra_t {
     const class bn_data * bd;
 };
+// A wrapper for the high-order function provided to dmt->iterate*
+// Needed because the heaviside functions provided to bndata do not know about the internal types.
+// Alternative to this wrapper is to expose accessor functions and rewrite all the external heaviside functions.
 template<typename iterate_extra_t,
-         int (*h)(const void * key, const uint32_t keylen, const LEAFENTRY &, const uint32_t idx, iterate_extra_t *const)>
+         int (*f)(const void * key, const uint32_t keylen, const LEAFENTRY &, const uint32_t idx, iterate_extra_t *const)>
 static int wrappy_fun_iterate(const uint32_t klpair_len, const klpair_struct &klpair, const uint32_t idx, wrapped_iterate_extra_t<iterate_extra_t> *const extra) {
-    const void* key = &klpair.key_le;
+    const void* key = &klpair.key;
     LEAFENTRY le = extra->bd->get_le_from_klpair(&klpair);
-    return h(key, keylen_from_klpair_len(klpair_len), le, idx, extra->inner);
+    return f(key, keylen_from_klpair_len(klpair_len), le, idx, extra->inner);
 }
 namespace toku {
 template<>
+// Use of dmt requires a dmt_functor for the specific type.
 class dmt_functor<klpair_struct> {
     public:
         size_t get_dmtdatain_t_size(void) const {
@@ -158,13 +152,13 @@ class dmt_functor<klpair_struct> {
         }
         void write_dmtdata_t_to(klpair_struct *const dest) const {
             dest->le_offset = this->le_offset;
-            memcpy(dest->key_le, this->keyp, this->keylen);
+            memcpy(dest->key, this->keyp, this->keylen);
         }
         dmt_functor(uint32_t _keylen, uint32_t _le_offset, const void* _keyp)
             : keylen(_keylen), le_offset(_le_offset), keyp(_keyp) {}
         dmt_functor(const uint32_t klpair_len, klpair_struct *const src)
-            : keylen(keylen_from_klpair_len(klpair_len)), le_offset(src->le_offset), keyp(src->key_le) {}
+            : keylen(keylen_from_klpair_len(klpair_len)), le_offset(src->le_offset), keyp(src->key) {}
     private:
         const uint32_t keylen;
         const uint32_t le_offset;
@@ -178,6 +172,8 @@ class bn_data {
 public:
     void init_zero(void);
     void initialize_empty(void);
+    // Deserialize from rbuf.
     void initialize_from_data(uint32_t num_entries, struct rbuf *rb, uint32_t data_size, uint32_t version);
     // globals
     uint64_t get_memory_size(void);
@@ -212,7 +208,7 @@ public:
         }
         if (key) {
             paranoid_invariant(keylen != NULL);
-            *key = klpair->key_le;
+            *key = klpair->key;
             *keylen = keylen_from_klpair_len(klpair_len);
         }
         else {
@@ -234,7 +230,7 @@ public:
         }
         if (key) {
             paranoid_invariant(keylen != NULL);
-            *key = klpair->key_le;
+            *key = klpair->key;
             *keylen = keylen_from_klpair_len(klpair_len);
         }
         else {
@@ -281,9 +277,22 @@ public:
     LEAFENTRY get_le_from_klpair(const klpair_struct *klpair) const;
+    // Must be called before serializing this basement node.
+    // Between calling prepare_to_serialize and actually serializing, the basement node may not be modified
     void prepare_to_serialize(void);
+    // Requires prepare_to_serialize() to have been called first.
+    // Serialize the basement node header to a wbuf
     void serialize_header(struct wbuf *wb) const;
+    // Requires prepare_to_serialize() (and serialize_header()) has been called first.
+    // Serialize all keys and leafentries to a wbuf
+    // Currently only supported when all keys are fixed-length.
     void serialize_rest(struct wbuf *wb) const;
+    // Requires prepare_to_serialize() to have been called first.
+    // Returns true if we must use the old (version 24) serialization method for this basement node
+    // In other words, the bndata does not know how to serialize the keys and leafentries.
     bool need_to_serialize_each_leafentry_with_key(void) const;
     static const uint32_t HEADER_LENGTH = 0
@@ -298,8 +307,12 @@ private:
     // Private functions
     LEAFENTRY mempool_malloc_and_update_omt(size_t size, void **maybe_free);
     void omt_compress_kvspace(size_t added_size, void **maybe_free, bool force_compress);
+    // Maintain metadata about size of memory for keys (adding a single key)
     void add_key(uint32_t keylen);
+    // Maintain metadata about size of memory for keys (adding multiple keys)
     void add_keys(uint32_t n_keys, uint32_t combined_keylen);
+    // Maintain metadata about size of memory for keys (removing a single key)
     void remove_key(uint32_t keylen);
     klpair_dmt_t m_buffer; // pointers to individual leaf entries
@@ -307,6 +320,7 @@ private:
     friend class bndata_bugfix_test;
     uint32_t klpair_disksize(const uint32_t klpair_len, const klpair_struct *klpair) const;
+    // The disk/memory size of all keys. (Note that the size of memory for the leafentries is maintained by m_buffer_mempool)
     size_t m_disksize_of_keys;
     void initialize_from_separate_keys_and_vals(uint32_t num_entries, struct rbuf *rb, uint32_t data_size, uint32_t version,
......
@@ -210,12 +210,12 @@ static void test_builder_fixed(uint32_t len, uint32_t num) {
     for (uint32_t i = 0; i < num; i++) {
         vfunctor vfun(data[i]);
-        builder.insert_sorted(vfun);
+        builder.append(vfun);
     }
-    invariant(builder.is_value_length_fixed());
+    invariant(builder.value_length_is_fixed());
     vdmt v;
-    builder.build_and_destroy(&v);
-    invariant(v.is_value_length_fixed());
+    builder.build(&v);
+    invariant(v.value_length_is_fixed());
     invariant(v.get_fixed_length() == len);
     invariant(v.size() == num);
@@ -257,12 +257,12 @@ static void test_builder_variable(uint32_t len, uint32_t len2, uint32_t num) {
     for (uint32_t i = 0; i < num; i++) {
         vfunctor vfun(data[i]);
-        builder.insert_sorted(vfun);
+        builder.append(vfun);
     }
-    invariant(!builder.is_value_length_fixed());
+    invariant(!builder.value_length_is_fixed());
     vdmt v;
-    builder.build_and_destroy(&v);
-    invariant(!v.is_value_length_fixed());
+    builder.build(&v);
+    invariant(!v.value_length_is_fixed());
     invariant(v.size() == num);
@@ -305,7 +305,7 @@ static void test_create_from_sorted_memory_of_fixed_sized_elements__and__seriali
     vdmt v;
     v.create_from_sorted_memory_of_fixed_size_elements(flat, num, len*num, len);
-    invariant(v.is_value_length_fixed());
+    invariant(v.value_length_is_fixed());
     invariant(v.get_fixed_length() == len);
     invariant(v.size() == num);
......
@@ -1180,7 +1180,7 @@ uint32_t dmt<dmtdata_t, dmtdataout_t>::get_fixed_length_alignment_overhead(void)
 }
 template<typename dmtdata_t, typename dmtdataout_t>
-bool dmt<dmtdata_t, dmtdataout_t>::is_value_length_fixed(void) const {
+bool dmt<dmtdata_t, dmtdataout_t>::value_length_is_fixed(void) const {
     return this->values_same_size;
 }
@@ -1223,7 +1223,7 @@ void dmt<dmtdata_t, dmtdataout_t>::builder::create(uint32_t _max_values, uint32_
 }
 template<typename dmtdata_t, typename dmtdataout_t>
-void dmt<dmtdata_t, dmtdataout_t>::builder::insert_sorted(const dmtdatain_t &value) {
+void dmt<dmtdata_t, dmtdataout_t>::builder::append(const dmtdatain_t &value) {
     paranoid_invariant(this->temp_valid);
     //NOTE: Always use d.a.num_values for size because we have not yet created root.
     if (this->temp.values_same_size && (this->temp.d.a.num_values == 0 || value.get_dmtdatain_t_size() == this->temp.value_length)) {
...@@ -1257,13 +1257,13 @@ void dmt<dmtdata_t, dmtdataout_t>::builder::insert_sorted(const dmtdatain_t &val ...@@ -1257,13 +1257,13 @@ void dmt<dmtdata_t, dmtdataout_t>::builder::insert_sorted(const dmtdatain_t &val
} }
template<typename dmtdata_t, typename dmtdataout_t> template<typename dmtdata_t, typename dmtdataout_t>
bool dmt<dmtdata_t, dmtdataout_t>::builder::is_value_length_fixed(void) { bool dmt<dmtdata_t, dmtdataout_t>::builder::value_length_is_fixed(void) {
paranoid_invariant(this->temp_valid); paranoid_invariant(this->temp_valid);
return this->temp.values_same_size; return this->temp.values_same_size;
} }
template<typename dmtdata_t, typename dmtdataout_t> template<typename dmtdata_t, typename dmtdataout_t>
void dmt<dmtdata_t, dmtdataout_t>::builder::build_and_destroy(dmt<dmtdata_t, dmtdataout_t> *dest) { void dmt<dmtdata_t, dmtdataout_t>::builder::build(dmt<dmtdata_t, dmtdataout_t> *dest) {
invariant(this->temp_valid); invariant(this->temp_valid);
//NOTE: Always use d.a.num_values for size because we have not yet created root. //NOTE: Always use d.a.num_values for size because we have not yet created root.
invariant(this->temp.d.a.num_values == this->max_values); // Optionally make it <= invariant(this->temp.d.a.num_values == this->max_values); // Optionally make it <=
......
/* -*- mode: C++; c-basic-offset: 4; indent-tabs-mode: nil -*- */ /* -*- mode: C++; c-basic-offset: 4; indent-tabs-mode: nil -*- */
// vim: ft=cpp:expandtab:ts=8:sw=4:softtabstop=4: // vim: ft=cpp:expandtab:ts=8:sw=4:softtabstop=4:
#ifndef UTIL_DMT_H #pragma once
#define UTIL_DMT_H
#ident "$Id$"
/* /*
COPYING CONDITIONS NOTICE: COPYING CONDITIONS NOTICE:
...@@ -210,13 +208,16 @@ public: ...@@ -210,13 +208,16 @@ public:
} }
// Each data type used in a dmt requires a dmt_functor (allows you to insert/etc with dynamic sized types).
// There is no default implementation.
template<typename dmtdata_t> template<typename dmtdata_t>
class dmt_functor { class dmt_functor {
static_assert(!std::is_same<dmtdata_t, dmtdata_t>::value, "Must use partial specialization"); // Ensures that if you forget to use partial specialization this compile error will remind you to use it.
static_assert(!std::is_same<dmtdata_t, dmtdata_t>::value, "Must use partial specialization on dmt_functor");
// Defines the interface: // Defines the interface:
//static size_t get_dmtdata_t_size(const dmtdata_t &) { return 0; } static size_t get_dmtdata_t_size(const dmtdata_t &) { return 0; }
//size_t get_dmtdatain_t_size(void) { return 0; } size_t get_dmtdatain_t_size(void) { return 0; }
//void write_dmtdata_t_to(dmtdata_t *const dest) {} void write_dmtdata_t_to(dmtdata_t *const dest) {}
}; };
template<typename dmtdata_t, template<typename dmtdata_t,
...@@ -237,10 +238,10 @@ public: ...@@ -237,10 +238,10 @@ public:
class builder { class builder {
public: public:
void insert_sorted(const dmtdatain_t &value); void append(const dmtdatain_t &value);
void create(uint32_t n_values, uint32_t n_value_bytes); void create(uint32_t n_values, uint32_t n_value_bytes);
bool is_value_length_fixed(void); bool value_length_is_fixed(void);
void build_and_destroy(dmt<dmtdata_t, dmtdataout_t> *dest); void build(dmt<dmtdata_t, dmtdataout_t> *dest);
private: private:
uint32_t max_values; uint32_t max_values;
uint32_t max_value_bytes; uint32_t max_value_bytes;
...@@ -512,7 +513,7 @@ public: ...@@ -512,7 +513,7 @@ public:
*/ */
size_t memory_size(void); size_t memory_size(void);
bool is_value_length_fixed(void) const; bool value_length_is_fixed(void) const;
uint32_t get_fixed_length(void) const; uint32_t get_fixed_length(void) const;
...@@ -533,7 +534,7 @@ private: ...@@ -533,7 +534,7 @@ private:
}; };
bool values_same_size; //TODO: is this necessary? maybe sentinel for value_length bool values_same_size;
uint32_t value_length; uint32_t value_length;
struct mempool mp; struct mempool mp;
bool is_array; bool is_array;
...@@ -678,4 +679,3 @@ private: ...@@ -678,4 +679,3 @@ private:
// include the implementation here // include the implementation here
#include "dmt.cc" #include "dmt.cc"
#endif // UTIL_DMT_H
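
Note on the renamed builder API: the diff shows that `builder::append` asks the functor for each value's size and that `value_length_is_fixed()` reports whether every appended value had the same length. The sketch below imitates only that bookkeeping in a standalone toy. `toy_functor`, `toy_builder`, and `write_to` are hypothetical names invented for illustration and are not part of util/dmt.h. The real builder stores values in a mempool and takes its capacity up front in `create()`, neither of which is modeled here.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <string>
#include <type_traits>
#include <vector>

// Primary template (hypothetical): instantiating it trips the static_assert,
// mirroring the "Must use partial specialization" guard on dmt_functor.
template<typename T>
class toy_functor {
    static_assert(!std::is_same<T, T>::value, "Must specialize toy_functor");
};

// Hypothetical specialization for std::string values.
template<>
class toy_functor<std::string> {
public:
    explicit toy_functor(const std::string &s) : src(s) {}
    // Number of bytes this functor will write.
    size_t get_dmtdatain_t_size(void) const { return src.size(); }
    // Copy the value into storage owned by the container.
    void write_to(char *dest) const { std::memcpy(dest, src.data(), src.size()); }
private:
    std::string src;
};

// Toy stand-in for dmt<>::builder; tracks only the fixed-length property.
class toy_builder {
public:
    // Append one value. Callers are assumed to supply values in sorted order.
    void append(const toy_functor<std::string> &value) {
        assert(value.get_dmtdatain_t_size() <= UINT32_MAX);
        const uint32_t len = static_cast<uint32_t>(value.get_dmtdatain_t_size());
        if (num_values == 0) {
            value_length = len;        // first value sets the candidate length
        } else if (len != value_length) {
            values_same_size = false;  // any mismatch makes the set variable-length
        }
        const size_t offset = bytes.size();
        bytes.resize(offset + len);
        if (len > 0) {
            value.write_to(&bytes[offset]);
        }
        num_values++;
    }
    bool value_length_is_fixed(void) const { return values_same_size; }
    uint32_t size(void) const { return num_values; }
private:
    bool values_same_size = true;
    uint32_t value_length = 0;
    uint32_t num_values = 0;
    std::vector<char> bytes;
};
```

Appending "ab" and then "cd" leaves `value_length_is_fixed()` true, while appending "a" and then "bcd" turns it false. This is the same property the test hunks at the top of the diff assert on through `invariant(...)`.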