// Copyright (C) 2018-2020  Nexedi SA and Contributors.
//                          Kirill Smelkov <kirr@nexedi.com>
//
// This program is free software: you can Use, Study, Modify and Redistribute
// it under the terms of the GNU General Public License version 3, or (at your
// option) any later version, as published by the Free Software Foundation.
//
// You can also Link and Combine this program with other software covered by
// the terms of any of the Free Software licenses or any of the Open Source
// Initiative approved licenses and Convey the resulting work. Corresponding
// source of such a combination shall include the source code for all other
// software used.
//
// This program is distributed WITHOUT ANY WARRANTY; without even the implied
// warranty of MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.
//
// See COPYING file for full licensing terms.
// See https://www.nexedi.com/licensing for rationale and options.

// Package wcfs provides WCFS client integrated with user-space virtual memory manager.
// See wcfs.h for package overview.


// Wcfs client organization
//
// Wcfs client provides to its users isolated bigfile views backed by data on
// WCFS filesystem. In the absence of Isolation property, wcfs client would
// reduce to just directly using OS-level file wcfs/head/f for a bigfile f. On
// the other hand there is a simple, but inefficient, way to support isolation:
// for @at database view of bigfile f - directly use OS-level file wcfs/@at/f.
// The latter works, but is very inefficient because OS-cache for f data is not
// shared in between two connections with @at1 and @at2 views. The cache is
// also lost when connection view of the database is resynced on transaction
// boundary. To support isolation efficiently, wcfs client uses wcfs/head/f
// most of the time, but injects wcfs/@revX/f parts into mappings to maintain
// f@at view driven by pin messages that wcfs server sends to client in
// accordance with WCFS isolation protocol(*).
//
// Wcfs server sends pin messages synchronously triggered by access to mmaped
// memory. That means that a client thread, that is accessing wcfs/head/f mmap,
// is completely blocked while wcfs server sends pins and waits to receive acks
// from all clients. In other words on-client handling of pins has to be done
// in separate thread, because wcfs server can also send pins to client that
// triggered the access.
//
// Wcfs client implements pins handling in so-called "pinner" thread(+). The
// pinner thread receives pin requests from wcfs server via watchlink handle
// opened through wcfs/head/watch. For every pin request the pinner finds
// corresponding Mappings and injects wcfs/@revX/f parts via Mapping._remmapblk
// appropriately.
//
// The same watchlink handle is used to send client-originated requests to wcfs
// server. The requests are sent to tell wcfs that client wants to observe a
// particular bigfile as of particular revision, or to stop watching for it.
// Such requests originate from regular client threads - not pinner - via entry
// points like Conn.open, Conn.resync and FileH.close.
//
// Every FileH maintains fileh._pinned {} with currently pinned blk -> rev. This
// dict is updated by pinner driven by pin messages, and is used when either
// new fileh Mapping is created (FileH.mmap) or refreshed due to request from
// virtmem (Mapping.remmap_blk, see below).
//
// In wendelin.core a bigfile has semantic that it is infinite in size and
// reads as all zeros beyond region initialized with data. Memory-mapping of
// OS-level files can also go beyond file size, however accessing memory
// corresponding to file region after file.size triggers SIGBUS. To preserve
// wendelin.core semantic wcfs client mmaps-in zeros for Mapping regions after
// wcfs/head/f.size. For simplicity it is assumed that bigfiles only grow and
// never shrink. It is indeed currently so, but will have to be revisited
// if/when wendelin.core adds bigfile truncation. Wcfs client restats
// wcfs/head/f at every transaction boundary (Conn.resync) and remembers f.size
// in FileH._headfsize for use during one transaction(%).
//
//
// Integration with wendelin.core virtmem layer
//
// Wcfs client integrates with virtmem layer to let virtmem handle
// dirtying pages of read-only base-layer that wcfs client provides via
// isolated Mapping. For wcfs-backed bigfiles every virtmem VMA is interlinked
// with Mapping:
//
//       VMA     -> BigFileH -> ZBigFile -----> Z
//        ↑↓                                    O
//      Mapping  -> FileH    -> wcfs server --> DB
//
// When a page is write-accessed, virtmem mmaps in a page of RAM in place of
// accessed virtual memory, copies base-layer content provided by Mapping into
// there, and marks that page as read-write.
//
// Upon receiving pin message, the pinner consults virtmem, whether
// corresponding page was already dirtied in virtmem's BigFileH (call to
// __fileh_page_isdirty), and if it was, the pinner does not remmap Mapping
// part to wcfs/@revX/f and just leaves dirty page in its place, remembering
// pin information in fileh._pinned.
//
// Once dirty pages are no longer needed (either after discard/abort or
// writeout/commit), virtmem asks wcfs client to remmap corresponding regions
// of Mapping in its place again via calls to Mapping.remmap_blk for previously
// dirtied blocks.
//
// The scheme outlined above does not need to split Mapping upon dirtying an
// inner page.
//
// See bigfile_ops interface (wendelin/bigfile/file.h) that explains base-layer
// and overlaying from virtmem point of view. For wcfs this interface is
// provided by small wcfs client wrapper in bigfile/file_zodb.cpp.
//
// --------
//
// (*) see wcfs.go documentation for WCFS isolation protocol overview and details.
// (+) currently, for simplicity, there is one pinner thread for each connection.
//     In the future, for efficiency, it might be reworked to be one pinner thread
//     that serves all connections simultaneously.
// (%) see _headWait comments on how this has to be reworked.


// Wcfs client locking organization
//
// XXX locking -> explain atMu + slaves and refer to "Locking" in wcfs.go
//
// Conn.atMu > Conn.filehMu > FileH.mmapMu
//
// Several locks are RWMutex instead of just Mutex not only to allow more
// concurrency, but, in the first place for correctness: pinner thread being
// core element in handling WCFS isolation protocol, is effectively invoked
// synchronously from other threads via messages coming through wcfs server.
// For example Conn.resync sends watch request to wcfs and waits for the
// answer. Wcfs server, in turn, might send corresponding pin messages to the
// pinner and _wait_ for the answer before answering to resync:
//
//        - - - - - -
//       |       .···|·····.        ---->   = request
//          pinner <------.↓        <····   = response
//       |           |   wcfs
//          resync -------^↓
//       |      `····|·····
//        - - - - - -
//       client process
//
// This creates the necessity to use RWMutex for locks that pinner and other
// parts of the code could be using at the same time in synchronous scenarios
// similar to the above. These locks are:
//
//      - Conn.atMu
//      - Conn.filehMu
//
// XXX pinner takes the following locks (XXX recheck)
//
//      - wconn.filehMu.W
//      - wconn.filehMu.R
//
//      - virt_lock
//      - wconn.atMu.R
//      - wconn.filehMu.R
//      - fileh.mmapMu (R:.mmaps W:.pinned)
//
//
// XXX note on virt_lock in pinner and deadlocks.


#include "wcfs_misc.h"
#include "wcfs.h"
#include "wcfs_watchlink.h"

#include <wendelin/bigfile/virtmem.h>
#include <wendelin/bigfile/ram.h>

#include <golang/errors.h>
#include <golang/fmt.h>
#include <golang/io.h>
#include <golang/time.h>

#include <algorithm>
#include <string>
#include <vector>

#include <sys/types.h>
#include <sys/mman.h>
#include <sys/stat.h>
#include <unistd.h>

using std::min;
using std::max;
using std::vector;

namespace ioutil = io::ioutil;


#define TRACE 0
#if TRACE
#  define trace(format, ...) log::Debugf(format, ##__VA_ARGS__)
#else
#  define trace(format, ...) do {} while (0)
#endif

// trace with op prefix taken from E.
#define etrace(format, ...) trace("%s", v(E(fmt::errorf(format, ##__VA_ARGS__))))


// wcfs::
namespace wcfs {

static error mmap_zero_into_ro(void *addr, size_t size);
static error mmap_efault_into(void *addr, size_t size);
static tuple<uint8_t*, error> mmap_ro(os::File f, off_t offset, size_t size);
static error mmap_into_ro(void *addr, size_t size, os::File f, off_t offset);

// _headWait waits till wcfs/head/at becomes ≥ at.
//
// _headWait is currently needed, because client stats wcfs/head/f to get f
// size assuming that f size only ↑. The assumption is not generally valid
// (e.g. f might be truncated = hole punched for block at tail), but holds true
// for now. However to get correct results wcfs/head/f has to be stat'ed
// _after_ wcfs view of the database becomes ≥ wconn.at.
//
// TODO extend isolation protocol to report f size as of @at database state at
// watch init/update(*). This way there won't be need for headWait as correct
// file size @at will be returned by wcfs itself, which will also work if
// wcfs/head/f size is changed arbitrarily.
//
// (*) equivalent might be to send something like "pin #<bsize>.. Z" (pin
//     blocks bsize till ∞ to zeros).
error WCFS::_headWait(zodb::Tid at) {
    WCFS *wc = this;
    xerr::Contextf E("%s: headWait @%s", v(wc), v(at));
    etrace("");

    zodb::Tid xat;
    string    xatStr;
    error     err;

    // XXX dumb implementation, because _headWait should go away.
    while (1) {
        tie(xatStr, err) = ioutil::ReadFile(wc->_path("head/at"));
        if (err != nil)
            return E(err);

        tie(xat, err) = xstrconv::parseHex64(xatStr);
        if (err != nil)
            return E(fmt::errorf("head/at: %w", err));

        if (xat >= at)
            break;

        time::sleep(1*time::millisecond);
    }

    return nil;
}

// connect creates new Conn viewing WCFS state as of @at.
pair<Conn, error> WCFS::connect(zodb::Tid at) {
    WCFS *wc = this;
    xerr::Contextf E("%s: connect @%s", v(wc), v(at));
    etrace("");

    error err;

    // TODO support !isolated mode

    // need to wait till `wcfs/head/at ≥ at` because e.g. Conn.open stats
    // head/f to get f.headfsize.
    err = wc->_headWait(at);
    if (err != nil) {
        return make_pair(nil, E(err));
    }

    WatchLink wlink;
    tie(wlink, err) = wc->_openwatch();
    if (err != nil)
        return make_pair(nil, E(err));

    Conn wconn = adoptref(new _Conn());
    wconn->_wc      = wc;
    wconn->at       = at;
    wconn->_wlink   = wlink;

    context::Context pinCtx;
    tie(pinCtx, wconn->_pinCancel) = context::with_cancel(context::background());
    wconn->_pinWG = sync::NewWorkGroup(pinCtx);
    wconn->_pinWG->go([wconn](context::Context ctx) -> error {
        return wconn->_pinner(ctx);
    });

    return make_pair(wconn, nil);
}

static global<error> errConnClosed = errors::New("connection closed");

// close releases resources associated with wconn.
//
// opened fileh and mappings become invalid to use except close and unmap.
error _Conn::close() {
    _Conn& wconn = *this;

    // lock virtmem early. TODO more granular virtmem locking (see __pin1 for
    // details and why virt_lock currently goes first)
    virt_lock();
    bool virtUnlocked = false;
    defer([&]() {
        if (!virtUnlocked)
            virt_unlock();
    });

    wconn._atMu.RLock();
    defer([&]() {
        wconn._atMu.RUnlock();
    });

    xerr::Contextf E("%s: close", v(wconn));
    etrace("");

    error err, eret;
    auto reterr1 = [&eret](error err) {
        if (eret == nil && err != nil)
            eret = err;
    };

    // mark wconn as closed, so that no new wconn.open might be spawned.
    bool alreadyClosed = false;
    wconn._filehMu.Lock();
    alreadyClosed = (wconn._downErr == errConnClosed);
    wconn._downErr = errConnClosed;
    wconn._filehMu.Unlock();
    if (alreadyClosed)
        return nil;

    // close all files - both that have no mappings and that still have opened
    // mappings. We have to close files before shutting down pinner, because
    // wcfs might send pin messages due to file access by other clients. So to
    // avoid being killed we have to unwatch all files before stopping the
    // pinner.
    //
    // NOTE after file is closed, its mappings could continue to survive, but
    // we can no longer maintain consistent view. For this reason we change
    // mappings to give EFAULT on access.
    while (1) {
        FileH f = nil;
        bool opening;

        // pick up any fileh
        wconn._filehMu.Lock();
        if (!wconn._filehTab.empty()) {
            f = wconn._filehTab.begin()->second;
            opening = (f->_state < _FileHOpened);
        }
        wconn._filehMu.Unlock();

        if (f == nil)
            break; // all closed

        // if fileh was "opening" - wait for the open to complete before calling close.
        if (opening) {
            f->_openReady.recv();
            if (f->_openErr != nil)
                continue; // failed open; f should be removed from wconn._filehTab by Conn.open itself
        }

        // force fileh close.
        // - virt_lock
        // - wconn.atMu.R
        // - wconn.filehMu unlocked
        err = f->_closeLocked(/*force=*/true);
        if (err != nil)
            reterr1(err);

        // wait for f close to complete, as it might be that f.close was called
        // simultaneously to us or just before. f is removed from
        // wconn.filehTab only after close is complete.
        f->_closedq.recv();
    }

    // close wlink and signal to pinner to stop.
    // we have to release virt_lock, to avoid deadlocking with pinner.
    virtUnlocked = true;
    virt_unlock();

    err = wconn._wlink->close();
    if (err != nil)
        reterr1(err);
    wconn._pinCancel();
    err = wconn._pinWG->wait();
    if (!errors::Is(err, context::canceled)) // canceled - ok
        reterr1(err);

    return E(eret);
}

// _pinner receives pin messages from wcfs and adjusts wconn file mappings.
error _Conn::_pinner(context::Context ctx) {
    Conn wconn = newref(this); // newref for go
    error err = wconn->__pinner(ctx);

    // if pinner fails, wcfs will kill us.
    // log pinner error so that the error is not hidden.
    // print to stderr as well, because by default log does not print there.
    // XXX also catch panic/exc ?
    if (!(err == nil || errors::Is(err, context::canceled))) { // canceled = .close asks pinner to stop
        log::Fatalf("CRITICAL: %s", v(err));
        log::Fatalf("CRITICAL: wcfs server will likely kill us soon.");
        fprintf(stderr, "CRITICAL: %s\n", v(err));
        fprintf(stderr, "CRITICAL: wcfs server will likely kill us soon.\n");

        // mark the connection non-operational if pinner fails.
        //
        // XXX go because wconn.close might deadlock wrt Conn.resync on
        // wconn._filehMu, because Conn.resync sends "watch" updates under
        // wconn._filehMu (however Conn.open and FileH.close send "watch"
        // _without_ wconn._filehMu). If pinner fails - we already have serious
        // problems... TODO try to resolve the deadlock.
        go([wconn]() {
            wconn->close();
        });
    }

    return err;
}

error _Conn::__pinner(context::Context ctx) {
    _Conn& wconn = *this;
    xerr::Contextf E("pinner"); // NOTE pinner error goes to Conn::close who has its own context
    etrace("");

    PinReq req;
    error  err;

    while (1) {
        err = wconn._wlink->recvReq(ctx, &req);
        if (err != nil) {
            // it is ok if we receive EOF due to us (client) closing the connection
            if (err == io::EOF_) {
                wconn._filehMu.RLock();
                err = (wconn._downErr == errConnClosed) ? nil : io::ErrUnexpectedEOF;
                wconn._filehMu.RUnlock();
            }
            return E(err);
        }

        // we received request to pin/unpin file block. handle it
        err = wconn._pin1(&req);
        if (err != nil) {
            return E(err);
        }
    }
}

// _pin1 handles one pin request received from wcfs.
error _Conn::_pin1(PinReq *req) {
    _Conn& wconn = *this;
    xerr::Contextf E("pin f<%s> #%ld @%s", v(req->foid), req->blk, v(req->at));
    etrace("");

    error err = wconn.__pin1(req);

    // reply either ack or nak on error
    string ack = "ack";
    if (err != nil)
        ack = fmt::sprintf("nak: %s", v(err));
    // NOTE ctx=bg to always send reply even if we are canceled
    error err2 = wconn._wlink->replyReq(context::background(), req, ack);
    if (err == nil)
        err = err2;

    return E(err);
}

error _Conn::__pin1(PinReq *req) {
    _Conn& wconn = *this;

    FileH f;
    bool  ok;

    // lock virtmem first.
    //
    // The reason we do it here instead of closely around call to
    // mmap->_remmapblk() is to avoid deadlocks: virtmem calls FileH.mmap,
    // Mapping.remmap_blk and Mapping.unmap under virt_lock locked. In those
    // functions the order of locks is
    //
    //      virt_lock, wconn.atMu.R, fileh.mmapMu
    //
    // So if we take virt_lock right around mmap._remmapblk(), the order of
    // locks in pinner would be
    //
    //      wconn.atMu.R, wconn.filehMu.R, fileh.mmapMu, virt_lock
    //
    // which means there is AB-BA deadlock possibility.
    //
    // TODO try to take virt_lock only around virtmem-associated VMAs and with
    // better granularity. NOTE it is possible to teach virtmem to call
    // FileH.mmap and Mapping.unmap without virtmem locked. However reworking
    // virtmem to call Mapping.remmap_blk without virt_lock is not so easy.
    virt_lock();
    defer([&]() {
        virt_unlock();
    });

    wconn._atMu.RLock();
    defer([&]() {
        wconn._atMu.RUnlock();
    });

    // lock wconn.filehMu.R to lookup fileh in wconn.filehTab.
Kirill Smelkov's avatar
.  
Kirill Smelkov committed
504 505 506
    //
    // keep wconn.filehMu.R locked during whole __pin1 run to make sure that
    // e.g.  simultaneous FileH.close does not remove f from wconn.filehTab.
Kirill Smelkov's avatar
.  
Kirill Smelkov committed
507
    // TODO keeping filehMu.R during whole pin1 is not needed and locking can be made more granular.
Kirill Smelkov's avatar
.  
Kirill Smelkov committed
508
    //
Kirill Smelkov's avatar
.  
Kirill Smelkov committed
509
    // NOTE no deadlock wrt Conn.resync, Conn.open, FileH.close - they all send
Kirill Smelkov's avatar
.  
Kirill Smelkov committed
510
    // "watch" requests to wcfs server outside of wconn.filehMu.
Kirill Smelkov's avatar
.  
Kirill Smelkov committed
511
    wconn._filehMu.RLock();
Kirill Smelkov's avatar
.  
Kirill Smelkov committed
512 513 514 515
    defer([&]() {
        wconn._filehMu.RUnlock();
    });

    tie(f, ok) = wconn._filehTab.get_(req->foid);
    if (!ok) {
        // why wcfs sent us this update?
        return fmt::errorf("unexpected pin: f<%s> not watched", v(req->foid));
    }

    // NOTE no need to check f._state as we need to go only through f.mmaps, and
    // wcfs server can send us pins at any state, including "opening" - to pin
    // our view to requested @at, and "closing" - due to other clients
    // accessing wcfs/head/f simultaneously.

    f->_mmapMu.lock();
    defer([&]() {
        f->_mmapMu.unlock();
    });

    for (auto mmap : f->_mmaps) {   // TODO use ↑blk_start for binary search
        if (!(mmap->blk_start <= req->blk && req->blk < mmap->blk_stop()))
            continue;   // blk ∉ mmap

        trace("\tremmapblk %d @%s", req->blk, (req->at == TidHead ? "head" : v(req->at)));

        // pin only if virtmem has not already dirtied the page corresponding to this block.
        // if virtmem dirtied the page - it will ask us to remmap it again after commit or abort.
        bool do_pin = true;
        error err;
        if (mmap->vma != nil) {
            mmap->_assertVMAOk();

            // see ^^^ about deadlock
            //virt_lock();

            BigFileH *virt_fileh = mmap->vma->fileh;
            TODO (mmap->fileh->blksize != virt_fileh->ramh->ram->pagesize);
            do_pin = !__fileh_page_isdirty(virt_fileh, req->blk);
        }

        if (do_pin)
            err = mmap->_remmapblk(req->blk, req->at);

        // see ^^^ about deadlock
        //if (mmap->vma != nil)
        //    virt_unlock();

        // on error don't need to continue with other mappings - all fileh and
        // all mappings become marked invalid on pinner failure.
        if (err != nil)
            return err;

        trace("\t-> remmaped");
    }

    // update f._pinned
    if (req->at == TidHead) {
        f->_pinned.erase(req->blk);      // unpin to @head
    }
    else {
        f->_pinned[req->blk] = req->at;
    }

    return nil;
}
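
// Illustrative sketch (not part of wcfs): the f._pinned bookkeeping done at the
// end of __pin1, reduced to a standalone function. SketchTidHead and
// sketch_update_pinned, and the plain integer types, are assumptions made for
// this example only; the real code works with zodb::Tid, TidHead and f->_pinned.
#include <cstdint>
#include <map>

const uint64_t SketchTidHead = 0;   // hypothetical stand-in for TidHead

// sketch_update_pinned records that blk is pinned to revision at, or forgets
// the pin when at is the head sentinel, i.e. the block follows head again.
void sketch_update_pinned(std::map<int64_t, uint64_t>& pinned, int64_t blk, uint64_t at) {
    if (at == SketchTidHead)
        pinned.erase(blk);      // unpin to @head
    else
        pinned[blk] = at;       // pin, or re-pin to a different revision
}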

// resync resyncs connection and its file mappings onto different database view.
//
// bigfile/_file_zob.pyx arranges to call Conn.resync at transaction boundaries
// to keep Conn view in sync with updated zconn database view.
error _Conn::resync(zodb::Tid at) {
    _Conn& wconn = *this;
    error err;

    wconn._atMu.RLock();
    xerr::Contextf E("%s: resync -> @%s", v(wconn), v(at));
    wconn._atMu.RUnlock();
    etrace("");

    // XXX _downErr -> E
    // XXX at ^ (increases)     -> rejected by wcfs     XXX or also precheck here?

    // wait for wcfs/head to be >= at.
    // we need this e.g. to be sure that head/f.size is at least as big as it will be in @at state.
    err = wconn._wc->_headWait(at);
    if (err != nil)
        return E(err);

    // bring wconn + fileh + mmaps down on error
    bool retok = false;
    defer([&]() {
        if (!retok)
            wconn.close(); // ignore error
    });

    // lock wconn._atMu.W . This excludes everything else, and in
    // particular _pinner_, from running and mutating files and mappings.
    //
    // NOTE we'll relock atMu as R in the second part of resync, so we prelock
    // wconn._filehMu.R as well while under atMu.W, to be sure that set of opened
    // files and their states stay the same during whole resync.
    bool atMuWLocked = true;
    wconn._atMu.Lock();
    wconn._filehMu.RLock();
    defer([&]() {
        wconn._filehMu.RUnlock();
        if (atMuWLocked)
            wconn._atMu.Unlock();
        else
            wconn._atMu.RUnlock();
    });

    err = wconn._downErr;
    if (err != nil)
        return E(err);

    // set new wconn.at early, so that e.g. Conn.open running simultaneously
    // to second part of resync (see below) uses new at.
    wconn.at = at;

    // go through all files opened under wconn and pre-adjust their mappings
    // for viewing data as of new @at state.
    //
    // We are still holding atMu.W, so we are the only mutators of mappings,
    // because, in particular, pinner is not running.
    //
    // Don't send watch updates for opened files to wcfs yet - without running
    // pinner those updates will get stuck.
    for (auto fit : wconn._filehTab) {
        //zodb::Oid  foid = fit.first;
        FileH        f    = fit.second;

        // TODO if file has no mappings and was not used during whole prev
        // cycle - forget and stop watching it?

        // "opening" or "closing" fileh - their setup/teardown is currently
        // handled by Conn.open and FileH.close correspondingly.
        if (f->_state != _FileHOpened)
            continue;

        // update f._headfsize and remmap to head/f zero regions that are now covered by head/f
        struct stat st;
        err = f->_headf->stat(&st);
        if (err != nil)
            return E(err);

        if ((size_t)st.st_blksize != f->blksize)    // blksize must not change
            return E(fmt::errorf("wcfs bug: blksize changed: %zd -> %ld", f->blksize, st.st_blksize));
        auto headfsize = st.st_size;
        if (!(f->_headfsize <= headfsize))          // head/file size ↑=
            return E(fmt::errorf("wcfs bug: head/file size not ↑="));
        if (!(headfsize % f->blksize == 0))
            return E(fmt::errorf("wcfs bug: head/file size %% blksize != 0"));

        // replace zero regions in f mappings in accordance with adjusted f._headfsize.
        // NOTE it is ok to access f._mmaps without locking f._mmapMu because we hold wconn.atMu.W
        for (auto mmap : f->_mmaps) {
            //trace("  resync -> %s: unzero [%lu:%lu)", v(at), f->_headfsize/f->blksize, headfsize/f->blksize);
            uint8_t *mem_unzero_start = min(mmap->mem_stop,
                            mmap->mem_start + (f->_headfsize - mmap->blk_start*f->blksize));
            uint8_t *mem_unzero_stop  = min(mmap->mem_stop,
                            mmap->mem_start + (   headfsize  - mmap->blk_start*f->blksize));

            if (mem_unzero_stop - mem_unzero_start > 0) {
                err = mmap_into_ro(mem_unzero_start, mem_unzero_stop-mem_unzero_start, f->_headf, f->_headfsize);
                if (err != nil)
                    return E(err);
            }
        }

        f->_headfsize = headfsize;
    }

    // atomically downgrade atMu.W to atMu.R before issuing watch updates to wcfs.
    // - we need atMu to be not Wlocked, because under atMu.W pinner cannot run simultaneously to us.
    // - we need to hold atMu.R to avoid race wrt e.g. other resync which changes at.
    // - we cannot just do regular `atMu.Unlock() + atMu.RLock()` because then
    //   there is e.g. a race window in between Unlock and RLock where wconn.at can be changed.
    //   Also, if we Unlock and RLock, it can deadlock, because the locking
    //   order becomes reversed: wconn._filehMu.R, then wconn._atMu.R
    //
    // Now other calls, e.g. Conn.open, can be running simultaneously to us,
    // but since we already set wconn.at to new value it is ok. For example
    // Conn.open, for not-yet-opened file, will use new at to send "watch".
    //
    // NOTE we are still holding wconn._filehMu.R, so wconn._filehTab and fileh
    // states are the same as in previous pass above.
    wconn._atMu.UnlockToRLock();
    atMuWLocked = false;

    // send watch updates to wcfs.
    // the pinner is now running and will be able to serve pin requests triggered by our watch.
    //
    // update only fileh in "opened" state - for fileh in "opening" and
    // "closing" states, watch setup/teardown is currently in-progress and
    // performed by Conn.open and FileH.close correspondingly.
    for (auto fit : wconn._filehTab) {
        zodb::Oid  foid = fit.first;
        FileH      f    = fit.second;

        if (f->_state != _FileHOpened)
            continue;

        string ack;
        tie(ack, err) = wconn._wlink->sendReq(context::background(),
                            fmt::sprintf("watch %s @%s", v(foid), v(at)));
        if (err != nil)
            return E(err);
        if (ack != "ok")
            return E(fmt::errorf("%s", v(ack)));
    }

    retok = true;
    return nil;
}
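
// Illustrative sketch (not part of wcfs): why resync downgrades atMu.W to atMu.R
// in one step. SketchRWLock is a hypothetical toy lock with non-blocking Try*
// methods so that it can be exercised from a single thread; it is not how the
// real atMu is implemented. UnlockToRLock drops the writer and registers the
// reader inside one critical section, so no other writer can get in between.
#include <mutex>

class SketchRWLock {
    std::mutex _mu;                 // protects _writer and _nreaders
    bool       _writer   = false;
    int        _nreaders = 0;

public:
    // TryLock takes the lock for writing if nobody holds it.
    bool TryLock() {
        std::lock_guard<std::mutex> g(_mu);
        if (_writer || _nreaders > 0)
            return false;
        _writer = true;
        return true;
    }

    // TryRLock takes the lock for reading if there is no writer.
    bool TryRLock() {
        std::lock_guard<std::mutex> g(_mu);
        if (_writer)
            return false;
        _nreaders++;
        return true;
    }

    void Unlock() {
        std::lock_guard<std::mutex> g(_mu);
        _writer = false;
    }

    void RUnlock() {
        std::lock_guard<std::mutex> g(_mu);
        _nreaders--;
    }

    // UnlockToRLock must be called by the current writer. Because a writer
    // excludes all readers, the caller becomes the only reader.
    void UnlockToRLock() {
        std::lock_guard<std::mutex> g(_mu);
        _writer   = false;
        _nreaders = 1;
    }
};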

// open opens FileH corresponding to ZBigFile foid.
pair<FileH, error> _Conn::open(zodb::Oid foid) {
    _Conn& wconn = *this;
    error err;

    wconn._atMu.RLock();
    defer([&]() {
        wconn._atMu.RUnlock();
    });

    xerr::Contextf E("%s: open f<%s>", v(wconn), v(foid));
    etrace("");

retry:
    wconn._filehMu.Lock();

    if (wconn._downErr != nil) {
        err = wconn._downErr;
        wconn._filehMu.Unlock();
        return make_pair(nil, E(err));
    }

    // TODO ensure f<foid>@ wconn.at exists - else we get pins to non-existing
    //      state from wcfs, pinner replies nak, wcfs sends SIGBUS.
    // TODO -> better teach wcfs to reject "watch <foid> @at" for @at where f did not exist.
    //      (see test_wcfs_watch_before_create)

    FileH f; bool ok;
    tie(f, ok) = wconn._filehTab.get_(foid);
    if (ok) {
        bool closing;

        if (f->_state <= _FileHOpened) {
            f->_nopen++;
            closing = false;
        } else {
            closing = true;
        }
        wconn._filehMu.Unlock();

        // if the file was closing|closed, we should wait for the close to
        // complete and retry the open.
        if (closing) {
            f->_closedq.recv();
            goto retry;
        }

        // the file was opening|opened. wait for open to complete and return the result.
        // we can be sure there won't be last close simultaneous to us as we did ._nopen++
        f->_openReady.recv();
        if (f->_openErr != nil) {
            // don't care about f->_nopen-- since f is not returned anywhere
            return make_pair(nil, E(f->_openErr));
        }

        return make_pair(f, nil);
    }

    // create "opening" FileH entry and perform open with wconn._filehMu released.
    // NOTE wconn._atMu.R is still held because FileH._open relies on wconn.at being stable.
    f = adoptref(new _FileH());
    f->wconn      = newref(&wconn);
    f->foid       = foid;
    f->_openReady = makechan<structZ>();
    f->_closedq   = makechan<structZ>();
    f->_openErr   = nil;
    f->_headf     = nil;
    f->blksize    = 0;
    f->_headfsize = 0;
    f->_state     = _FileHOpening;
    f->_nopen     = 1;

    bool retok = false;
    wconn._filehTab[foid] = f;
    wconn._filehMu.Unlock();
    defer([&]() {
        wconn._filehMu.Lock();
        if (wconn._filehTab.get(foid) != f) {
            wconn._filehMu.Unlock();
            panic("BUG: wconn.open: wconn.filehTab[foid] mutated while file open was in progress");
        }
        if (!retok) {
            // don't care about f->_nopen-- since f is not returned anywhere
            wconn._filehTab.erase(foid);
        } else {
            f->_state = _FileHOpened;
        }
        wconn._filehMu.Unlock();
        f->_openReady.close();
    });

    // do the actual open.
    // we hold only wconn.atMu.R, but neither wconn.filehMu, nor f.mmapMu .
    f->_openErr = f->_open();
    if (f->_openErr != nil)
        return make_pair(nil, E(f->_openErr));

    // NOTE no need to recheck that wconn was not closed while the open was in
    // progress: we'll return "success" but Conn.close will close the fileh.
    // However it is indistinguishable from the following scenario:
    //
    //      T1                  T2
    //
    //  f = wconn.open()
    //  # completes ok
    //                      wconn.close()
    //
    //  # use f -> error

    retok = true;
    return make_pair(f, nil);
}
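
// Illustrative sketch (not part of wcfs): the "retok" rollback idiom used by
// Conn.open above. SketchDefer is a hypothetical minimal scope guard standing in
// for defer(); sketch_open and the int-valued table are assumptions made for
// this example only.
#include <functional>
#include <map>
#include <utility>

class SketchDefer {
    std::function<void()> _f;
public:
    explicit SketchDefer(std::function<void()> f) : _f(std::move(f)) {}
    ~SketchDefer() { _f(); }        // runs on every exit path of the scope
    SketchDefer(const SketchDefer&) = delete;
    SketchDefer& operator=(const SketchDefer&) = delete;
};

enum { SketchOpening = 0, SketchOpened = 1 };

// sketch_open registers key as "opening", runs the simulated open step and, on
// scope exit, either promotes the entry to "opened" or removes it again.
bool sketch_open(std::map<int, int>& tab, int key, bool open_succeeds) {
    tab[key] = SketchOpening;
    bool retok = false;
    SketchDefer cleanup([&]() {
        if (!retok)
            tab.erase(key);         // failed open leaves no entry behind
        else
            tab[key] = SketchOpened;
    });

    if (!open_succeeds)
        return false;

    retok = true;
    return true;
}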

// _open performs actual open of FileH marked as "in-flight-open" in wconn.filehTab.
//
// Called with:
// - wconn.atMu     held
// - wconn.filehMu  not locked
// - f.mmapMu       not locked
error _FileH::_open() {
    _FileH* f = this;
    Conn    wconn = f->wconn;
    error err;

    tie(f->_headf, err)
                = wconn->_wc->_open(fmt::sprintf("head/bigfile/%s", v(foid)));
    if (err != nil)
        return err;

    bool retok = false;
    defer([&]() {
        if (!retok)
            f->_headf->close();
    });

    struct stat st;
    err = f->_headf->stat(&st);
    if (err != nil)
        return err;
    f->blksize    = st.st_blksize;
    f->_headfsize = st.st_size;
    if (!(f->_headfsize % f->blksize == 0))
        return fmt::errorf("wcfs bug: %s size (%d) %% blksize (%d) != 0",
                        v(f->_headf->name()), f->_headfsize, f->blksize);

    // start watching f
    // NOTE we are _not_ holding wconn.filehMu nor f.mmapMu - only wconn.atMu to rely on wconn.at being stable.
    // NOTE wcfs will reply "ok" only after wcfs/head/at ≥ wconn.at
    string ack;
    tie(ack, err) = wconn->_wlink->sendReq(context::background(),
                        fmt::sprintf("watch %s @%s", v(foid), v(wconn->at)));
    if (err != nil)
        return err;
    if (ack != "ok")
        return fmt::errorf("watch: %s", v(ack));

    retok = true;
    return nil;
}
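
// Illustrative sketch (not part of wcfs): the request/ack exchange used to start
// and stop watching a file, following the "watch %s @%s" and "watch %s -" format
// strings used in this file. sketch_watch_request and sketch_check_ack are
// hypothetical helpers; foid and at are passed as already formatted strings, and
// an empty string stands in for a nil error.
#include <string>

// sketch_watch_request builds "watch <foid> @<at>", or "watch <foid> -" when at
// is empty, which asks the server to stop watching.
std::string sketch_watch_request(const std::string& foid, const std::string& at) {
    if (at.empty())
        return "watch " + foid + " -";
    return "watch " + foid + " @" + at;
}

// sketch_check_ack treats any reply other than "ok" as an error.
std::string sketch_check_ack(const std::string& ack) {
    if (ack != "ok")
        return "watch: " + ack;
    return "";
}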

// close releases resources associated with FileH.
//
// Any fileh mappings left after close become invalid to use; they can only be unmapped.
error _FileH::close() {
    _FileH& fileh = *this;
    Conn    wconn = fileh.wconn;

    // lock virtmem early. TODO more granular virtmem locking (see __pin1 for
    // details and why virt_lock currently goes first)
    virt_lock();
    defer([&]() {
        virt_unlock();
    });

    wconn->_atMu.RLock();
    defer([&]() {
        wconn->_atMu.RUnlock();
    });

    return fileh._closeLocked(/*force=*/false);
}

// _closeLocked serves FileH.close and Conn.close.
//
// Must be called with the following locks held by caller:
// - virt_lock
// - wconn.atMu
error _FileH::_closeLocked(bool force) {
    _FileH& fileh = *this;
    Conn    wconn = fileh.wconn;

    wconn->_filehMu.Lock();
    defer([&]() {
        wconn->_filehMu.Unlock();
    });

    // fileh.close can be called several times. just return nil for second close.
    if (fileh._state >= _FileHClosing)
        return nil;

    // decref open count; do real close only when last open goes away.
    if (fileh._nopen <= 0)
        panic("BUG: fileh.close: fileh._nopen <= 0");
    fileh._nopen--;
    if (fileh._nopen > 0 && !force)
        return nil;

    // last open went away - real close.
    xerr::Contextf E("%s: %s: close", v(wconn), v(fileh));
    etrace("");

    ASSERT(fileh._state == _FileHOpened); // there can be no open-in-progress, because
    fileh._state = _FileHClosing;         // .close() can be called only on "opened" fileh

    // unlock wconn._filehMu to stop watching the file outside of this lock.
    // we'll relock wconn._filehMu again before updating wconn.filehTab.
    wconn->_filehMu.Unlock();


    error err, eret;
    auto reterr1 = [&eret](error err) {
        if (eret == nil && err != nil)
            eret = err;
    };

    // stop watching f
    string ack;
    tie(ack, err) = wconn->_wlink->sendReq(context::background(),
                        fmt::sprintf("watch %s -", v(foid)));
    if (err != nil)
        reterr1(err);
    else if (ack != "ok")
        reterr1(fmt::errorf("unwatch: %s", v(ack)));


    // relock wconn._filehMu again and remove fileh from wconn._filehTab
    wconn->_filehMu.Lock();
    if (wconn->_filehTab.get(fileh.foid)._ptr() != &fileh)
        panic("BUG: fileh.close: wconn.filehTab[fileh.foid] != fileh");
    wconn->_filehTab.erase(fileh.foid);

    reterr1(fileh._headf->close());

    // change all fileh.mmaps to cause EFAULT on any access after fileh.close
    fileh._mmapMu.lock();
    defer([&]() {
        fileh._mmapMu.unlock();
    });

    for (auto mmap : fileh._mmaps) {
        err = mmap->__remmapAsEfault();
        if (err != nil)
            reterr1(err);
    }

    // fileh close complete
Kirill Smelkov's avatar
.  
Kirill Smelkov committed
985
    fileh._state = _FileHClosed;
Kirill Smelkov's avatar
.  
Kirill Smelkov committed
986 987
    fileh._closedq.close();

Kirill Smelkov's avatar
.  
Kirill Smelkov committed
988
    return E(eret);
Kirill Smelkov's avatar
.  
Kirill Smelkov committed
989 990
}
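
// Example (illustrative sketch, not part of wcfs): the close path above keeps
// going after a failed step and reports only the first error via reterr1.
// The same "first error wins" pattern is shown below with plain strings
// standing in for errors. closeAll and its string-based error encoding are
// assumptions made for this sketch only.

```cpp
#include <cassert>
#include <string>
#include <vector>

// closeAll is a hypothetical helper: it processes the outcome of every cleanup
// step, even if an earlier one failed, and returns the first error it saw.
// An empty string means "no error".
std::string closeAll(const std::vector<std::string>& stepErrors) {
    std::string eret;  // first error, if any
    auto reterr1 = [&eret](const std::string& err) {
        if (eret.empty() && !err.empty())
            eret = err;
    };

    for (const auto& err : stepErrors)
        reterr1(err);   // later steps are still visited; later errors are dropped

    return eret;
}
```

// With step results {"", "unwatch: busy", "close: EIO"} the sketch reports
// "unwatch: busy": cleanup is never cut short, and the caller sees the
// earliest failure.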

// mmap creates file mapping representing file[blk_start +blk_len) data as of wconn.at database state.
//
// If vma != nil, created mapping is associated with that vma of user-space virtual memory manager:
// virtmem calls FileH::mmap under virtmem lock when virtmem fileh is mmapped into vma.
pair<Mapping, error> _FileH::mmap(int64_t blk_start, int64_t blk_len, VMA *vma) {
    _FileH& f = *this;

    // NOTE virtmem lock is held by virtmem caller
    f.wconn->_atMu.RLock();     // e.g. f._headfsize
    f.wconn->_filehMu.RLock();  // f._state  TODO -> finer grained (currently too coarse)
    f._mmapMu.lock();           // f._pinned, f._mmaps
    defer([&]() {
        f._mmapMu.unlock();
        f.wconn->_filehMu.RUnlock();
        f.wconn->_atMu.RUnlock();
    });

    xerr::Contextf E("%s: %s: mmap [#%ld +%ld)", v(f.wconn), v(f), blk_start, blk_len);
    etrace("");

    if (f._state >= _FileHClosing)
        return make_pair(nil, E(os::ErrClosed));

    error err;

    if (blk_start < 0)
        panic("blk_start < 0");
    if (blk_len < 0)
        panic("blk_len < 0");

    int64_t blk_stop; // = blk_start + blk_len
    if (__builtin_add_overflow(blk_start, blk_len, &blk_stop))
        panic("blk_start + blk_len overflow int64");

    int64_t stop;//  = blk_stop *f.blksize;
    if (__builtin_mul_overflow(blk_stop, f.blksize, &stop))
        panic("(blk_start + blk_len)*f.blksize overflow int64");
    int64_t start    = blk_start*f.blksize;


    // create memory with head/f mapping and applied pins
    // mmap-in zeros after f.size (else access to memory after file.size will raise SIGBUS)
    uint8_t *mem_start, *mem_stop;
    tie(mem_start, err) = mmap_ro(f._headf, start, blk_len*f.blksize);
    if (err != nil)
        return make_pair(nil, E(err));
    mem_stop = mem_start + blk_len*f.blksize;

    bool retok = false;
    defer([&]() {
        if (!retok)
            mm::unmap(mem_start, mem_stop - mem_start); // ignore error
    });

    if (stop > f._headfsize) {
        uint8_t *zmem_start = mem_start + (max(f._headfsize/*XXX -1 ?*/, start) - start);
        err = mmap_zero_into_ro(zmem_start, mem_stop - zmem_start);
        if (err != nil)
            return make_pair(nil, E(err));
    }

    Mapping mmap = adoptref(new _Mapping());
    mmap->fileh     = newref(&f);
    mmap->blk_start = blk_start;
    mmap->mem_start = mem_start;
    mmap->mem_stop  = mem_stop;
    mmap->vma       = vma;

    for (auto _ : f._pinned) {  // TODO keep f._pinned ↑blk and use binary search
        int64_t    blk = _.first;
        zodb::Tid  rev = _.second;
        if (!(blk_start <= blk && blk < blk_stop))
            continue;   // blk ∉ this mapping
        err = mmap->_remmapblk(blk, rev);
        if (err != nil)
            return make_pair(nil, E(err));
    }

    if (vma != nil) {
        if (vma->mmap_overlay_server != nil)
            panic("vma is already associated with overlay server");
        if (!(vma->addr_start == 0 && vma->addr_stop == 0))
            panic("vma already covers !nil virtual memory area");
        mmap->incref(); // vma->mmap_overlay_server is keeping ref to mmap
        vma->mmap_overlay_server = mmap._ptr();
        vma->addr_start = (uintptr_t)mmap->mem_start;
        vma->addr_stop  = (uintptr_t)mmap->mem_stop;
        mmap->_assertVMAOk(); // just in case
    }

    f._mmaps.push_back(mmap);   // TODO keep f._mmaps ↑blk_start

    retok = true;
    return make_pair(mmap, nil);
}
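
// Example (illustrative sketch, not part of wcfs): mmap above guards the
// block -> byte conversion with __builtin_add_overflow and
// __builtin_mul_overflow (GCC/Clang builtins). The standalone helper below
// repeats that arithmetic, but returns false instead of panicking.
// blockRangeToBytes and the int64_t type chosen for blksize are assumptions
// made for this sketch.

```cpp
#include <cassert>
#include <cstdint>

// blockRangeToBytes is a hypothetical helper: it converts the block range
// [blk_start +blk_len) into the byte range [*start, *stop) and returns false
// when the input is invalid or the arithmetic would overflow int64.
bool blockRangeToBytes(int64_t blk_start, int64_t blk_len, int64_t blksize,
                       int64_t *start, int64_t *stop) {
    if (blk_start < 0 || blk_len < 0 || blksize <= 0)
        return false;

    int64_t blk_stop;  // = blk_start + blk_len
    if (__builtin_add_overflow(blk_start, blk_len, &blk_stop))
        return false;
    if (__builtin_mul_overflow(blk_stop, blksize, stop))
        return false;

    // 0 <= blk_start <= blk_stop, so this product cannot overflow once the
    // blk_stop*blksize check above has passed.
    *start = blk_start * blksize;
    return true;
}
```

// Checking only the larger product is enough because blk_start is
// non-negative and never exceeds blk_stop.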

// unmap releases mapping memory from address space.
//
// After call to unmap the mapping must no longer be used.
// The association between the mapping and the linked virtmem VMA is reset.
//
// Virtmem calls Mapping.unmap under virtmem lock when VMA is unmapped.
error _Mapping::unmap() {
    Mapping mmap = newref(this); // newref for std::remove
    FileH f = mmap->fileh;

    // NOTE virtmem lock is held by virtmem caller
    f->wconn->_atMu.RLock();
    f->_mmapMu.lock();
    defer([&]() {
        f->_mmapMu.unlock();
        f->wconn->_atMu.RUnlock();
    });

    xerr::Contextf E("%s: %s: %s: unmap", v(f->wconn), v(f), v(mmap));
    etrace("");

    if (mmap->vma != nil) {
        mmap->_assertVMAOk();
        VMA *vma = mmap->vma;
        vma->mmap_overlay_server = nil;
        mmap->decref(); // vma->mmap_overlay_server was holding a ref to mmap
        vma->addr_start = 0;
        vma->addr_stop  = 0;
        mmap->vma = nil;
    }

    error err = mm::unmap(mmap->mem_start, mmap->mem_stop - mmap->mem_start);
    mmap->mem_start = nil;
    mmap->mem_stop  = nil;
    // XXX clear other fields?

    //f->_mmaps.remove(mmap);
    f->_mmaps.erase(
        std::remove(f->_mmaps.begin(), f->_mmaps.end(), mmap),
        f->_mmaps.end());

    return E(err);
}
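
// Example (illustrative sketch, not part of wcfs): unmap above drops the
// mapping from f->_mmaps with the erase-remove idiom. The helper below shows
// the idiom on a plain std::vector; removeAll is a name made up for this
// sketch.

```cpp
#include <algorithm>
#include <cassert>
#include <vector>

// removeAll is a hypothetical helper: it drops every element equal to x from v.
// std::remove only shifts the kept elements to the front and returns the new
// logical end; vector::erase then trims the leftover tail.
template<typename T>
void removeAll(std::vector<T>& v, const T& x) {
    v.erase(std::remove(v.begin(), v.end(), x), v.end());
}
```

// Calling removeAll with a value that is not present leaves the vector
// unchanged, since std::remove then returns v.end() and erase gets an empty
// range.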

// _remmapblk remmaps mapping memory for file[blk] to view the database as of @at state.
//
// at=TidHead means unpin to head/ .
// NOTE this does not check whether virtmem already mapped blk as RW.
//
// The following locks must be held by caller:
// - f.wconn.atMu
// - f._mmapMu          XXX not needed? (f._mmaps and f._pinned are not used)
error _Mapping::_remmapblk(int64_t blk, zodb::Tid at) {
    // XXX must not be called after Mapping is switched to efault?

    _Mapping *mmap = this;
    FileH f = mmap->fileh;
    xerr::Contextf E("%s: %s: %s: remmapblk #%ld @%s", v(f->wconn), v(f), v(mmap), blk, v(at));
    etrace("");

    ASSERT(mmap->blk_start <= blk && blk < mmap->blk_stop());
    error err;

    uint8_t *blkmem = mmap->mem_start + (blk - mmap->blk_start)*f->blksize;
    os::File fsfile;
    bool fclose = false;
    if (at == TidHead) {
        fsfile = f->_headf;
    }
    else {
        // TODO share @rev fd until wconn is resynced?
        tie(fsfile, err) = f->wconn->_wc->_open(
                fmt::sprintf("@%s/bigfile/%s", v(at), v(f->foid)));
        if (err != nil)
            return E(err);
        fclose = true;
    }
    defer([&]() {
        if (fclose)
            fsfile->close();
    });

    struct stat st;
    err = fsfile->stat(&st);
    if (err != nil)
        return E(err);
    if ((size_t)st.st_blksize != f->blksize)
        return E(fmt::errorf("wcfs bug: blksize changed: %zd -> %ld", f->blksize, st.st_blksize));

    // block is beyond file size - mmap with zeros - else access to memory
    // after file.size will raise SIGBUS. (assumes head/f size ↑=)
    if ((blk+1)*f->blksize > (size_t)st.st_size) {
        err = mmap_zero_into_ro(blkmem, 1*f->blksize);
        if (err != nil)
            return E(err);
    }
    // block is inside file - mmap in file data
    else {
        err = mmap_into_ro(blkmem, 1*f->blksize, fsfile, blk*f->blksize);
        if (err != nil)
            return E(err);
    }

    return nil;
}
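
// Example (illustrative sketch, not part of wcfs): _remmapblk above maps
// zeros instead of file data whenever block #blk is not fully inside the
// file. The predicate below isolates that size check; blockNeedsZeroMap and
// its unsigned parameter types are assumptions made for this sketch.

```cpp
#include <cassert>
#include <cstdint>

// blockNeedsZeroMap is a hypothetical helper mirroring the size check above:
// block #blk covers bytes [blk*blksize, (blk+1)*blksize). If that range is
// not fully inside a file of filesize bytes, the block is mapped as zeros.
bool blockNeedsZeroMap(uint64_t blk, uint64_t blksize, uint64_t filesize) {
    return (blk + 1) * blksize > filesize;
}
```

// Note that a block only partly covered by the file also takes the zero
// path: with blksize=4096 and filesize=8191, block #1 is reported as needing
// a zero mapping.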

// remmap_blk remmaps file[blk] in its place again.
//
// Virtmem calls Mapping.remmap_blk under virtmem lock to remmap a block after
// RW dirty page was e.g. discarded.
error _Mapping::remmap_blk(int64_t blk) {
    _Mapping& mmap = *this;
    FileH f = mmap.fileh;

    // NOTE virtmem lock is held by virtmem caller
    f->wconn->_atMu.RLock();
    f->_mmapMu.lock();
    defer([&]() {
        f->_mmapMu.unlock();
        f->wconn->_atMu.RUnlock();
    });

    if (!(mmap.blk_start <= blk && blk < mmap.blk_stop()))
        panic("remmap_blk: blk out of Mapping range");

    // XXX do nothing after mapping is switched to efault? (i.e. f is closed/down)

    // blkrev = rev | @head
    zodb::Tid blkrev; bool ok;
    tie(blkrev, ok) = f->_pinned.get_(blk);
    if (!ok)
        blkrev = TidHead;

    error err = mmap._remmapblk(blk, blkrev);
    if (err != nil)
        return err; // errctx is good in _remmapblk

    return nil;
}
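
// Example (illustrative sketch, not part of wcfs): remmap_blk above looks the
// block up in f->_pinned and falls back to head when the block is not pinned.
// The lookup-with-default is sketched below on std::map. Tid, TidHeadSketch
// and blockRevision are stand-ins invented for this sketch; they do not claim
// to match zodb::Tid or the real TidHead value.

```cpp
#include <cassert>
#include <cstdint>
#include <map>

using Tid = uint64_t;                   // stand-in for zodb::Tid
const Tid TidHeadSketch = UINT64_MAX;   // hypothetical marker for "head"

// blockRevision returns the revision blk is pinned to, or TidHeadSketch when
// blk has no entry in pinned.
Tid blockRevision(const std::map<int64_t, Tid>& pinned, int64_t blk) {
    auto it = pinned.find(blk);
    return it != pinned.end() ? it->second : TidHeadSketch;
}
```

// Using find instead of operator[] keeps the lookup read-only: an unpinned
// block does not get an entry inserted as a side effect.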

// __remmapAsEfault remmaps Mapping memory to cause SIGSEGV on access.
//
// It is used on FileH shutdown to turn all fileh mappings into incorrect ones,
// because after fileh is down, it is not possible to continue to provide
// correct f@at data view.
//
// Must be called with the following locks held by caller:
// - virt_lock
// XXX more?
error _Mapping::__remmapAsEfault() {
    _Mapping& mmap = *this;
    FileH f = mmap.fileh;

    xerr::Contextf E("%s: remmap as efault", v(mmap)); // XXX +wconn, +f ?
    etrace("");

    error err = mmap_efault_into(mmap.mem_start, mmap.mem_stop - mmap.mem_start);
    return E(err);
}

// ---- WCFS raw file access ----

// _path returns path for object on wcfs.
// - str:        wcfs root + obj;
string WCFS::_path(const string &obj) {
    WCFS *wc = this;

    return wc->mountpoint + "/" + obj;
}

tuple<os::File, error> WCFS::_open(const string &path, int flags) {
    WCFS *wc = this;
    string path_ = wc->_path(path);
    return os::open(path_, flags);
}


// ---- misc ----

// mmap_zero_into serves mmap_zero_into_ro and mmap_efault_into.
static error mmap_zero_into(void *addr, size_t size, int prot) {
    xerr::Contextf E("mmap zero");
    etrace("");

    // mmap /dev/zero with MAP_NORESERVE and MAP_SHARED
    // this way the mapping will be able to be read, but no memory will be allocated to keep it.
    os::File z;
    error err;
    tie(z, err) = os::open("/dev/zero");
    if (err != nil)
        return E(err);
    defer([&]() {
        z->close();
    });
    err = mm::map_into(addr, size, prot, MAP_SHARED | MAP_NORESERVE, z, 0);
    if (err != nil)
        return E(err);
    return nil;
}

// mmap_zero_into_ro mmaps read-only zeros into [addr +size) so that the region reads as all zeros.
// The created mapping, even after it is accessed, does not consume memory.
static error mmap_zero_into_ro(void *addr, size_t size) {
    return mmap_zero_into(addr, size, PROT_READ);
}

// mmap_efault_into changes [addr +size) region to generate SIGSEGV on read/write access.
// Any previous mapping residing in that virtual address range is released.
static error mmap_efault_into(void *addr, size_t size) {
    xerr::Contextf E("mmap efault");
    etrace("");

    // mmapping /dev/zero with PROT_NONE gives what we need.
    return E(mmap_zero_into(addr, size, PROT_NONE));
}
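
// Example (illustrative sketch, not part of wcfs): mmap_zero_into_ro and
// mmap_efault_into above both overlay an existing address range with a
// /dev/zero mapping and differ only in protection. The sketch below does the
// same with raw POSIX calls on Linux. zeroInto is a made-up name, and the use
// of MAP_FIXED is an assumption about what mm::map_into does internally.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <fcntl.h>
#include <sys/mman.h>
#include <unistd.h>

// zeroInto is a hypothetical helper: it replaces [addr, addr+size) with a
// shared mapping of /dev/zero using protection prot. addr must be
// page-aligned. Returns 0 on success and -1 on failure.
int zeroInto(void *addr, size_t size, int prot) {
    int fd = open("/dev/zero", O_RDONLY);
    if (fd == -1)
        return -1;
    // MAP_FIXED discards whatever was mapped in the range before.
    void *p = mmap(addr, size, prot, MAP_FIXED | MAP_SHARED | MAP_NORESERVE, fd, 0);
    close(fd);  // the mapping stays valid after the descriptor is closed
    return (p == MAP_FAILED) ? -1 : 0;
}
```

// With prot=PROT_READ the range reads as zeros regardless of its previous
// content; with prot=PROT_NONE any later read or write of the range faults.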


// mmap_ro mmaps read-only fd[offset +size).
// The mapping is created with MAP_SHARED.
static tuple<uint8_t*, error> mmap_ro(os::File f, off_t offset, size_t size) {
    return mm::map(PROT_READ, MAP_SHARED, f, offset, size);
}

// mmap_into_ro mmaps read-only fd[offset +size) into [addr +size).
// The mapping is created with MAP_SHARED.
static error mmap_into_ro(void *addr, size_t size, os::File f, off_t offset) {
    return mm::map_into(addr, size, PROT_READ, MAP_SHARED, f, offset);
}


// _assertVMAOk() verifies that mmap and vma are related to each other and cover
// exactly the same virtual memory range.
//
// It panics if mmap and vma are not linked to each other or cover
// different virtual memory ranges.
void _Mapping::_assertVMAOk() {
    _Mapping* mmap = this;
    VMA *vma = mmap->vma;

    if (!(vma->mmap_overlay_server == static_cast<void*>(mmap)))
        panic("BUG: mmap and vma do not link to each other");
    if (!(vma->addr_start == uintptr_t(mmap->mem_start) &&
          vma->addr_stop  == uintptr_t(mmap->mem_stop)))
        panic("BUG: mmap and vma cover different virtual memory ranges");

    // verified ok
}


string WCFS::String() const {
    const WCFS& wc = *this;
    return fmt::sprintf("wcfs %s", v(wc.mountpoint));
}

// NOTE String must be called with Conn.atMu locked.
string _Conn::String() const {
    const _Conn& wconn = *this;
    // XXX don't include wcfs as prefix here?
    // (e.g. to use Conn.String in tracing without wcfs prefix)
    // (if yes -> go and correct all xerr::Contextf calls)
    return fmt::sprintf("%s: conn%d @%s", v(wconn._wc), wconn._wlink->fd(), v(wconn.at));
}

string _FileH::String() const {
    const _FileH& f = *this;
    return fmt::sprintf("f<%s>", v(f.foid));
}

string _Mapping::String() const {
    const _Mapping& mmap = *this;
    return fmt::sprintf("m[#%ld +%ld) v[%p +%lx)",
                mmap.blk_start, mmap.blk_stop() - mmap.blk_start,
                mmap.mem_start, mmap.mem_stop   - mmap.mem_start);
}

_Conn::_Conn()  {}
_Conn::~_Conn() {}
void _Conn::decref() {
    if (__decref())
        delete this;
}

_FileH::_FileH()  {}
_FileH::~_FileH() {}
void _FileH::decref() {
    if (__decref())
        delete this;
}

_Mapping::_Mapping()  {}
_Mapping::~_Mapping() {}
void _Mapping::decref() {
    if (__decref())
        delete this;
}
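
// Example (illustrative sketch, not part of wcfs): the decref methods above
// delete the object once __decref reports that the last reference is gone
// (that reading of __decref is an assumption based on how it is used here).
// The standalone class below shows the same intrusive-refcount shape with
// std::atomic. Counted and its alive counter are invented for this sketch.

```cpp
#include <atomic>
#include <cassert>

// Counted is a hypothetical intrusive-refcount object; it is born with one
// reference owned by its creator.
class Counted {
    std::atomic<int> _refcnt{1};
public:
    static int alive;       // number of live objects, for demonstration only
    Counted()  { alive++; }
    ~Counted() { alive--; }

    void incref() { _refcnt.fetch_add(1); }

    // decref drops one reference and destroys the object if it was the last.
    void decref() {
        if (_refcnt.fetch_sub(1) == 1)  // fetch_sub returns the previous value
            delete this;
    }
};
int Counted::alive = 0;
```

// Every incref must be paired with exactly one decref; after the final
// decref the pointer is dangling and must not be touched again.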

dict<int64_t, zodb::Tid> _tfileh_pinned(FileH fileh) {
    return fileh->_pinned;
}

}   // wcfs::