Commit 19070f9c authored by Yoni Fogel

Refs Tokutek/ft-index#46 Added some comments, deleted unused/commented out code, renamed variables

parent 87da689b
See wiki:
https://github.com/Tokutek/ft-index/wiki/Improving-in-memory-query-performance---Design
ft/bndata.{cc,h} The basement node was heavily modified to split keys from values and to inline the keys
bn_data::initialize_from_separate_keys_and_vals
This is effectively the deserialize function.
The bn_data::omt_* functions (probably badly named) kind of treat the basement node as an omt of key+leafentry pairs
There are many references to 'omt' that could be renamed to dmt if it's worth it.
util/dmt.{cc,h} The new DMT structure
Possible questions:
1-Should we merge dmt<> & omt<>? (delete omt entirely)
2-Should omt<> become a wrapper for dmt<>?
3-Should we just keep both around?
If we plan to keep both around for a while (option 3), should we get rid of any scaffolding that exists only to make options 1 or 2 easier?
The dmt is basically an omt with dynamically sized nodes/values.
There are two representations: an array of values, or a tree of nodes.
The high-level algorithm is basically the same for dmt and omt, except that in tree form the dmt tries not to move values around.
Instead, it moves the node metadata around.
Insertion into a dmt requires a functor that can report the size of the value being inserted, since values are expected to be (potentially, at least) dynamically sized.
The dmt does not revert to array form when rebalancing the root, but it CAN revert to array form when it prepares for serializing (if it notices everything is fixed length)
The dmt also can serialize and deserialize the values (set) it represents. It saves no information about the dmt itself, just the values.
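The sizing functor described above can be sketched in isolation. This is a minimal illustration, not the dmt code itself: `kl_header`, `kl_functor` and `toy_store` are hypothetical names, and the toy container only appends values back to back. It shows the two things such a functor needs to provide: the number of bytes to reserve, and a way to write the value into storage that the container owns.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <vector>

// Hypothetical stored value: a fixed header followed by the key bytes.
struct kl_header {
    uint32_t le_offset;  // offset of the leafentry in some other pool
};

// Hypothetical functor, loosely modeled on the dmt_functor interface in this
// commit: it reports its size and writes itself into reserved storage.
class kl_functor {
public:
    kl_functor(uint32_t keylen, uint32_t le_offset, const void *keyp)
        : keylen_(keylen), le_offset_(le_offset), keyp_(keyp) {}

    // Total bytes the container must reserve for this value.
    size_t get_size() const { return sizeof(kl_header) + keylen_; }

    // Write the header and then the key into storage owned by the container.
    void write_to(uint8_t *dest) const {
        kl_header h{le_offset_};
        std::memcpy(dest, &h, sizeof h);
        std::memcpy(dest + sizeof h, keyp_, keylen_);
    }

private:
    uint32_t keylen_;
    uint32_t le_offset_;
    const void *keyp_;
};

// Toy container (an assumption for illustration, not the dmt): it stores
// variable-length values contiguously and remembers where each one starts.
class toy_store {
public:
    void append(const kl_functor &f) {
        const size_t off = bytes_.size();
        bytes_.resize(off + f.get_size());   // ask the functor how much to reserve
        f.write_to(bytes_.data() + off);     // let the functor fill the space
        offsets_.push_back(off);
        lengths_.push_back(f.get_size());
    }
    size_t size() const { return offsets_.size(); }
    size_t total_bytes() const { return bytes_.size(); }
    uint32_t keylen(size_t i) const {
        return static_cast<uint32_t>(lengths_[i] - sizeof(kl_header));
    }
    const uint8_t *key(size_t i) const {
        return bytes_.data() + offsets_[i] + sizeof(kl_header);
    }
    uint32_t le_offset(size_t i) const {
        kl_header h;
        std::memcpy(&h, bytes_.data() + offsets_[i], sizeof h);
        return h.le_offset;
    }

private:
    std::vector<uint8_t> bytes_;
    std::vector<size_t> offsets_;
    std::vector<size_t> lengths_;
};
```

Note that the key length is not stored with the value. It is recovered from the value's total length minus the header size, which is the same idea as `keylen_from_klpair_len` in the diff.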
Some comments about what's in each file.
ft/CMakeLists.txt
add dmt-wrapper (test wrapper, nearly identical to ft/omt.cc which is also a test wrapper)
ft/dmt-wrapper.cc/h
Just like ft/omt.cc,h. Is a test wrapper for the dmt to implement a version of the old (non-templated) omt tests.
ft/ft-internal.h
Additional engine status
ft/ft-ops.cc/h
Additional engine status
in ftnode_memory_size()
fix a minor bug where we didn't count all the memory.
comments
ft/ft_layout_version.h
Update comment describing version change.
NOTE: May need to add version 26 if 25 is sent to customers before this goes live.
Adding 26 requires additional code changes (limited to a subset of places where version 24/25 are referred to)
ft/ft_node-serialize.cc
Changes calculation of size of a leaf node to include basement-node header
Adds optimized serialization for basement nodes with fixed-length keys
Maintains old method when not using fixed-length keys.
rebalance_ftnode_leaf()
Minor changes since key/leafentries are separated
deserialize_ftnode_partition()
Minor changes, including passing rbuf directly to child function (so ndone calculation is done by child)
ft/memarena.cc
Changes so that toku_memory_footprint is more accurate. (Not strictly related to this project.)
ft/rollback.cc
Just uses new memarena function for memory footprint
ft/tests/dmt-test.cc
"clone" of old omt-test (non templated) ported to dmt
Basically not worth looking at except to make sure it imports dmt instead of omt.
ft/tests/dmt-test2.cc
New dmt tests.
You might decide not enough new tests were implemented.
ft/tests/ft-serialize-benchmark.cc
Minor improvements so that you can take an average over a bunch of runs.
ft/tests/ft-serialize-test.cc
Just ported to changed api
ft/tests/test-pick-child-to-flush.cc
The new basement-node headers reduce available memory, so reduce the max size of the test appropriately.
ft/wbuf.h
Added wbuf_nocrc_reserve_literal_bytes()
Gives you a pointer for writing directly into the wbuf, and marks that memory as used.
util/mempool.cc
Made mempool allocations aligned to cachelines
Minor 'const' changes to help compilation
Some utility functions to get/give offsets
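The mempool notes above can be illustrated with a small sketch. It assumes 64-byte cache lines and assumes that the start of each allocation is rounded up to a cache-line boundary. The real util/mempool.cc may align differently, and every name here (`toy_mempool`, `toy_mempool_malloc`, `toy_mempool_get_offset`, `toy_mempool_get_pointer`) is hypothetical.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Assumed cache-line size; the real value is platform dependent.
constexpr size_t kCacheLine = 64;

// Round off up to the next multiple of align (align must be a power of two).
constexpr size_t round_up(size_t off, size_t align) {
    return (off + align - 1) & ~(align - 1);
}

// Hypothetical bump allocator over a caller-supplied buffer.
struct toy_mempool {
    uint8_t *base;       // start of the pool; assumed cache-line aligned
    size_t free_offset;  // first unused byte
    size_t size;         // total bytes in the pool
};

// Allocate nbytes starting at the next cache-line boundary.
// Returns nullptr if the pool does not have enough room.
inline void *toy_mempool_malloc(toy_mempool *mp, size_t nbytes) {
    const size_t start = round_up(mp->free_offset, kCacheLine);
    if (start > mp->size || nbytes > mp->size - start) {
        return nullptr;
    }
    mp->free_offset = start + nbytes;
    return mp->base + start;
}

// Convert a pointer inside the pool to an offset from the pool base.
inline size_t toy_mempool_get_offset(const toy_mempool *mp, const void *p) {
    return static_cast<size_t>(static_cast<const uint8_t *>(p) - mp->base);
}

// Convert an offset back into a pointer inside the pool.
inline void *toy_mempool_get_pointer(const toy_mempool *mp, size_t offset) {
    return mp->base + offset;
}
```

Storing offsets rather than pointers, as `klpair_struct::le_offset` does in the diff, keeps references valid if the pool's backing memory is moved, since only the base pointer changes.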
......@@ -96,39 +96,29 @@ PATENT RIGHTS GRANT:
#include <util/dmt.h>
#include "leafentry.h"
#if 0 //for implementation
static int
UU() verify_in_mempool(OMTVALUE lev, uint32_t UU(idx), void *mpv)
{
LEAFENTRY CAST_FROM_VOIDP(le, lev);
struct mempool *CAST_FROM_VOIDP(mp, mpv);
int r = toku_mempool_inrange(mp, le, leafentry_memsize(le));
lazy_assert(r);
return 0;
}
toku_omt_iterate(bn->buffer, verify_in_mempool, &bn->buffer_mempool);
#endif
// Key/leafentry pair stored in a dmt. The key is inlined, the offset (in leafentry mempool) is stored for the leafentry.
struct klpair_struct {
uint32_t le_offset; //Offset of leafentry (in leafentry mempool)
uint8_t key_le[0]; // key, followed by le
uint8_t key[0]; // key, followed by le
};
static constexpr uint32_t keylen_from_klpair_len(const uint32_t klpair_len) {
return klpair_len - __builtin_offsetof(klpair_struct, key_le);
return klpair_len - __builtin_offsetof(klpair_struct, key);
}
typedef struct klpair_struct KLPAIR_S, *KLPAIR;
static_assert(__builtin_offsetof(klpair_struct, key_le) == 1*sizeof(uint32_t), "klpair alignment issues");
static_assert(__builtin_offsetof(klpair_struct, key_le) == sizeof(klpair_struct), "klpair size issues");
static_assert(__builtin_offsetof(klpair_struct, key) == 1*sizeof(uint32_t), "klpair alignment issues");
static_assert(__builtin_offsetof(klpair_struct, key) == sizeof(klpair_struct), "klpair size issues");
// A wrapper for the heaviside function provided to dmt->find*.
// Needed because the heaviside functions provided to bndata do not know about the internal types.
// Alternative to this wrapper is to expose accessor functions and rewrite all the external heaviside functions.
template<typename dmtcmp_t,
int (*h)(const DBT &, const dmtcmp_t &)>
static int wrappy_fun_find(const uint32_t klpair_len, const klpair_struct &klpair, const dmtcmp_t &extra) {
DBT kdbt;
kdbt.data = const_cast<void*>(reinterpret_cast<const void*>(klpair.key_le));
kdbt.data = const_cast<void*>(reinterpret_cast<const void*>(klpair.key));
kdbt.size = keylen_from_klpair_len(klpair_len);
return h(kdbt, extra);
}
......@@ -140,17 +130,21 @@ struct wrapped_iterate_extra_t {
const class bn_data * bd;
};
// A wrapper for the high-order function provided to dmt->iterate*
// Needed because the heaviside functions provided to bndata do not know about the internal types.
// Alternative to this wrapper is to expose accessor functions and rewrite all the external heaviside functions.
template<typename iterate_extra_t,
int (*h)(const void * key, const uint32_t keylen, const LEAFENTRY &, const uint32_t idx, iterate_extra_t *const)>
int (*f)(const void * key, const uint32_t keylen, const LEAFENTRY &, const uint32_t idx, iterate_extra_t *const)>
static int wrappy_fun_iterate(const uint32_t klpair_len, const klpair_struct &klpair, const uint32_t idx, wrapped_iterate_extra_t<iterate_extra_t> *const extra) {
const void* key = &klpair.key_le;
const void* key = &klpair.key;
LEAFENTRY le = extra->bd->get_le_from_klpair(&klpair);
return h(key, keylen_from_klpair_len(klpair_len), le, idx, extra->inner);
return f(key, keylen_from_klpair_len(klpair_len), le, idx, extra->inner);
}
namespace toku {
template<>
// Use of dmt requires a dmt_functor for the specific type.
class dmt_functor<klpair_struct> {
public:
size_t get_dmtdatain_t_size(void) const {
......@@ -158,13 +152,13 @@ class dmt_functor<klpair_struct> {
}
void write_dmtdata_t_to(klpair_struct *const dest) const {
dest->le_offset = this->le_offset;
memcpy(dest->key_le, this->keyp, this->keylen);
memcpy(dest->key, this->keyp, this->keylen);
}
dmt_functor(uint32_t _keylen, uint32_t _le_offset, const void* _keyp)
: keylen(_keylen), le_offset(_le_offset), keyp(_keyp) {}
dmt_functor(const uint32_t klpair_len, klpair_struct *const src)
: keylen(keylen_from_klpair_len(klpair_len)), le_offset(src->le_offset), keyp(src->key_le) {}
: keylen(keylen_from_klpair_len(klpair_len)), le_offset(src->le_offset), keyp(src->key) {}
private:
const uint32_t keylen;
const uint32_t le_offset;
......@@ -178,6 +172,8 @@ class bn_data {
public:
void init_zero(void);
void initialize_empty(void);
// Deserialize from rbuf.
void initialize_from_data(uint32_t num_entries, struct rbuf *rb, uint32_t data_size, uint32_t version);
// globals
uint64_t get_memory_size(void);
......@@ -212,7 +208,7 @@ public:
}
if (key) {
paranoid_invariant(keylen != NULL);
*key = klpair->key_le;
*key = klpair->key;
*keylen = keylen_from_klpair_len(klpair_len);
}
else {
......@@ -234,7 +230,7 @@ public:
}
if (key) {
paranoid_invariant(keylen != NULL);
*key = klpair->key_le;
*key = klpair->key;
*keylen = keylen_from_klpair_len(klpair_len);
}
else {
......@@ -281,9 +277,22 @@ public:
LEAFENTRY get_le_from_klpair(const klpair_struct *klpair) const;
// Must be called before serializing this basement node.
// Between calling prepare_to_serialize and actually serializing, the basement node may not be modified
void prepare_to_serialize(void);
// Requires prepare_to_serialize() to have been called first.
// Serialize the basement node header to a wbuf
void serialize_header(struct wbuf *wb) const;
// Requires prepare_to_serialize() (and serialize_header()) has been called first.
// Serialize all keys and leafentries to a wbuf
// Currently only supported when all keys are fixed-length.
void serialize_rest(struct wbuf *wb) const;
// Requires prepare_to_serialize() to have been called first.
// Returns true if we must use the old (version 24) serialization method for this basement node
// In other words, the bndata does not know how to serialize the keys and leafentries.
bool need_to_serialize_each_leafentry_with_key(void) const;
static const uint32_t HEADER_LENGTH = 0
......@@ -298,8 +307,12 @@ private:
// Private functions
LEAFENTRY mempool_malloc_and_update_omt(size_t size, void **maybe_free);
void omt_compress_kvspace(size_t added_size, void **maybe_free, bool force_compress);
// Maintain metadata about size of memory for keys (adding a single key)
void add_key(uint32_t keylen);
// Maintain metadata about size of memory for keys (adding multiple keys)
void add_keys(uint32_t n_keys, uint32_t combined_keylen);
// Maintain metadata about size of memory for keys (removing a single key)
void remove_key(uint32_t keylen);
klpair_dmt_t m_buffer; // pointers to individual leaf entries
......@@ -307,6 +320,7 @@ private:
friend class bndata_bugfix_test;
uint32_t klpair_disksize(const uint32_t klpair_len, const klpair_struct *klpair) const;
// The disk/memory size of all keys. (Note that the size of memory for the leafentries is maintained by m_buffer_mempool)
size_t m_disksize_of_keys;
void initialize_from_separate_keys_and_vals(uint32_t num_entries, struct rbuf *rb, uint32_t data_size, uint32_t version,
......
......@@ -210,12 +210,12 @@ static void test_builder_fixed(uint32_t len, uint32_t num) {
for (uint32_t i = 0; i < num; i++) {
vfunctor vfun(data[i]);
builder.insert_sorted(vfun);
builder.append(vfun);
}
invariant(builder.is_value_length_fixed());
invariant(builder.value_length_is_fixed());
vdmt v;
builder.build_and_destroy(&v);
invariant(v.is_value_length_fixed());
builder.build(&v);
invariant(v.value_length_is_fixed());
invariant(v.get_fixed_length() == len);
invariant(v.size() == num);
......@@ -257,12 +257,12 @@ static void test_builder_variable(uint32_t len, uint32_t len2, uint32_t num) {
for (uint32_t i = 0; i < num; i++) {
vfunctor vfun(data[i]);
builder.insert_sorted(vfun);
builder.append(vfun);
}
invariant(!builder.is_value_length_fixed());
invariant(!builder.value_length_is_fixed());
vdmt v;
builder.build_and_destroy(&v);
invariant(!v.is_value_length_fixed());
builder.build(&v);
invariant(!v.value_length_is_fixed());
invariant(v.size() == num);
......@@ -305,7 +305,7 @@ static void test_create_from_sorted_memory_of_fixed_sized_elements__and__seriali
vdmt v;
v.create_from_sorted_memory_of_fixed_size_elements(flat, num, len*num, len);
invariant(v.is_value_length_fixed());
invariant(v.value_length_is_fixed());
invariant(v.get_fixed_length() == len);
invariant(v.size() == num);
......
......@@ -1180,7 +1180,7 @@ uint32_t dmt<dmtdata_t, dmtdataout_t>::get_fixed_length_alignment_overhead(void)
}
template<typename dmtdata_t, typename dmtdataout_t>
bool dmt<dmtdata_t, dmtdataout_t>::is_value_length_fixed(void) const {
bool dmt<dmtdata_t, dmtdataout_t>::value_length_is_fixed(void) const {
return this->values_same_size;
}
......@@ -1223,7 +1223,7 @@ void dmt<dmtdata_t, dmtdataout_t>::builder::create(uint32_t _max_values, uint32_
}
template<typename dmtdata_t, typename dmtdataout_t>
void dmt<dmtdata_t, dmtdataout_t>::builder::insert_sorted(const dmtdatain_t &value) {
void dmt<dmtdata_t, dmtdataout_t>::builder::append(const dmtdatain_t &value) {
paranoid_invariant(this->temp_valid);
//NOTE: Always use d.a.num_values for size because we have not yet created root.
if (this->temp.values_same_size && (this->temp.d.a.num_values == 0 || value.get_dmtdatain_t_size() == this->temp.value_length)) {
......@@ -1257,13 +1257,13 @@ void dmt<dmtdata_t, dmtdataout_t>::builder::insert_sorted(const dmtdatain_t &val
}
template<typename dmtdata_t, typename dmtdataout_t>
bool dmt<dmtdata_t, dmtdataout_t>::builder::is_value_length_fixed(void) {
bool dmt<dmtdata_t, dmtdataout_t>::builder::value_length_is_fixed(void) {
paranoid_invariant(this->temp_valid);
return this->temp.values_same_size;
}
template<typename dmtdata_t, typename dmtdataout_t>
void dmt<dmtdata_t, dmtdataout_t>::builder::build_and_destroy(dmt<dmtdata_t, dmtdataout_t> *dest) {
void dmt<dmtdata_t, dmtdataout_t>::builder::build(dmt<dmtdata_t, dmtdataout_t> *dest) {
invariant(this->temp_valid);
//NOTE: Always use d.a.num_values for size because we have not yet created root.
invariant(this->temp.d.a.num_values == this->max_values); // Optionally make it <=
......
/* -*- mode: C++; c-basic-offset: 4; indent-tabs-mode: nil -*- */
// vim: ft=cpp:expandtab:ts=8:sw=4:softtabstop=4:
#ifndef UTIL_DMT_H
#define UTIL_DMT_H
#pragma once
#ident "$Id$"
/*
COPYING CONDITIONS NOTICE:
......@@ -210,13 +208,16 @@ public:
}
// Each data type used in a dmt requires a dmt_functor (allows you to insert/etc with dynamic sized types).
// There is no default implementation.
template<typename dmtdata_t>
class dmt_functor {
static_assert(!std::is_same<dmtdata_t, dmtdata_t>::value, "Must use partial specialization");
// Ensures that if you forget to use partial specialization this compile error will remind you to use it.
static_assert(!std::is_same<dmtdata_t, dmtdata_t>::value, "Must use partial specialization on dmt_functor");
// Defines the interface:
//static size_t get_dmtdata_t_size(const dmtdata_t &) { return 0; }
//size_t get_dmtdatain_t_size(void) { return 0; }
//void write_dmtdata_t_to(dmtdata_t *const dest) {}
static size_t get_dmtdata_t_size(const dmtdata_t &) { return 0; }
size_t get_dmtdatain_t_size(void) { return 0; }
void write_dmtdata_t_to(dmtdata_t *const dest) {}
};
template<typename dmtdata_t,
......@@ -237,10 +238,10 @@ public:
class builder {
public:
void insert_sorted(const dmtdatain_t &value);
void append(const dmtdatain_t &value);
void create(uint32_t n_values, uint32_t n_value_bytes);
bool is_value_length_fixed(void);
void build_and_destroy(dmt<dmtdata_t, dmtdataout_t> *dest);
bool value_length_is_fixed(void);
void build(dmt<dmtdata_t, dmtdataout_t> *dest);
private:
uint32_t max_values;
uint32_t max_value_bytes;
......@@ -512,7 +513,7 @@ public:
*/
size_t memory_size(void);
bool is_value_length_fixed(void) const;
bool value_length_is_fixed(void) const;
uint32_t get_fixed_length(void) const;
......@@ -533,7 +534,7 @@ private:
};
bool values_same_size; //TODO: is this necessary? maybe sentinel for value_length
bool values_same_size;
uint32_t value_length;
struct mempool mp;
bool is_array;
......@@ -678,4 +679,3 @@ private:
// include the implementation here
#include "dmt.cc"
#endif // UTIL_DMT_H