Commits · ab962714c935b93684d8a4c1df997ad7365c5e09 · Kirill Smelkov / wendelin.core

12 Jul, 2019 2 commits
- . · ab962714
  Kirill Smelkov authored 5 years ago
  
  ab962714
- . · e7b77669
  Kirill Smelkov authored 5 years ago
  
  e7b77669
11 Jul, 2019 7 commits
- X Transition to all VMA under 1 fileh to be either all based on wcfs or all based on !wcfs · f084ff9b
  Kirill Smelkov authored 5 years ago
```
This allows to make decision to drop a page right after writeout for
wcfs case. If vmas for one fileh were allowed to be mixed - we could not
know whether to drop the page & its memory from RAM or not - even if
there is no vma mapped in !wcfs mode, it could be created later, and by
dropping RAM cache we hit severe slowness. NOTE: in between transactions
there is usually no vmas mapped, so this case is not artificial.

By requiring that all VMAs are of the same kind under one fileh we avoid
this ambiguity.
```
  f084ff9b
- X Settled on what should happen after writeout for wcfs case · 4a20a573
  Kirill Smelkov authored 5 years ago
  
  4a20a573
- . · 23ef1568
  Kirill Smelkov authored 5 years ago
  
  23ef1568
- . · 8a00147e
  Kirill Smelkov authored 5 years ago
  
  8a00147e
- . · 3a82b464
  Kirill Smelkov authored 5 years ago
  
  3a82b464
- bigfile: Fix typos · 49fd66ad
  Kirill Smelkov authored 5 years ago
  
  49fd66ad
- . · 877bab37
  Kirill Smelkov authored 5 years ago
  
  877bab37
10 Jul, 2019 8 commits
- . · 0ff122c6
  Kirill Smelkov authored 5 years ago
  
  0ff122c6
- . · db641047
  Kirill Smelkov authored 5 years ago
  
  db641047
- . · 82f28080
  Kirill Smelkov authored 5 years ago
  
  82f28080
- . · 0d10ef28
  Kirill Smelkov authored 5 years ago
  
  0d10ef28
- . · fbd5b279
  Kirill Smelkov authored 5 years ago
  
  fbd5b279
- . · 1dcbe2a4
  Kirill Smelkov authored 5 years ago
  
  1dcbe2a4
- X Split PAGE_LOADED -> PAGE_LOADED, PAGE_LOADED_FOR_WRITE · fb6932a2
  Kirill Smelkov authored 5 years ago
```
The latter state is the only valid state in wcfs mode when pagefault
handler sees already loaded page in fileh->pagemap.
```
  fb6932a2
- . · 4d18f19b
  Kirill Smelkov authored 5 years ago
  
  4d18f19b
09 Jul, 2019 3 commits
- . · a3299f60
  Kirill Smelkov authored 5 years ago
  
  a3299f60
- . · 56411971
  Kirill Smelkov authored 5 years ago
  
  56411971
- . · 72eb63d0
  Kirill Smelkov authored 5 years ago
  
  72eb63d0
23 May, 2019 1 commit
- . · 9c39d87d
  Kirill Smelkov authored 5 years ago
  
  9c39d87d
22 Feb, 2019 2 commits
- . · a4d63fbb
  Kirill Smelkov authored 5 years ago
  
  a4d63fbb
- . · bc041be8
  Kirill Smelkov authored 5 years ago
  
  bc041be8
24 Oct, 2017 1 commit

Relicense to GPLv3+ with wide exception for all Free Software / Open Source... · f11386a4

Kirill Smelkov authored 7 years ago

Relicense to GPLv3+ with wide exception for all Free Software / Open Source projects + Business options.

Nexedi stack is licensed under Free Software licenses with various exceptions
that cover three business cases:

- Free Software
- Proprietary Software
- Rebranding

As long as one intends to develop Free Software based on Nexedi stack, no
license cost is involved. Developing proprietary software based on Nexedi stack
may require a proprietary exception license. Rebranding Nexedi stack is
prohibited unless rebranding license is acquired.

Through this licensing approach, Nexedi expects to encourage Free Software
development without restrictions and at the same time create a framework for
proprietary software to contribute to the long term sustainability of the
Nexedi stack.

Please see https://www.nexedi.com/licensing for details, rationale and options.

f11386a4

06 Jul, 2017 1 commit

bigfile/virtmem: Don't forget to release fileh->writeout_inprogress on storeblk error · 87bf4908

Kirill Smelkov authored 7 years ago

Commit fb4bfb32 (bigfile/virtmem: Do storeblk() with virtmem lock
released) added bug-protection to fileh_dirty_writeout() so that it could
not be called twice at the same time or in parallel with other functions
which modify pages.

However it missed the code path when storeblk() call returned with error
and whole writeout was thus erroring out, but with fileh->writeout_inprogress
still left set to 1 incorrectly.

This was leading to things like

    bigfile/virtmem.c:419: fileh_dirty_discard: Assertion `!(fileh->writeout_inprogress)' failed.

and crashes.

Fix it.
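
The shape of the fix can be modelled in Python as follows. This is only a minimal sketch, not the actual C code in bigfile/virtmem.c: the `FileH` class, its `dirty_pages` dict and the `storeblk(blk, data)` callback signature are assumptions made for illustration. The point it shows is that the in-progress flag has to be reset on the error path as well.

```python
class FileH:
    """Toy stand-in for virtmem's fileh; holds only the fields needed here."""

    def __init__(self, storeblk):
        self.storeblk = storeblk            # hypothetical callable(blk, data), may raise
        self.dirty_pages = {}               # blk -> bytes
        self.writeout_inprogress = False


def fileh_dirty_writeout(fileh):
    # Bug-protection: writeout must not run twice at the same time.
    assert not fileh.writeout_inprogress
    fileh.writeout_inprogress = True
    try:
        for blk, data in sorted(fileh.dirty_pages.items()):
            fileh.storeblk(blk, data)       # an error here aborts the whole writeout
        fileh.dirty_pages.clear()
    finally:
        # Reset the flag on the error path too; leaving it set is what made
        # the later assertion in fileh_dirty_discard() fail.
        fileh.writeout_inprogress = False


def fileh_dirty_discard(fileh):
    assert not fileh.writeout_inprogress
    fileh.dirty_pages.clear()
```

With the `finally` clause in place, a discard issued after a failed writeout passes its assertion instead of crashing.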

87bf4908

10 Jan, 2017 2 commits

bigfile/virtmem: Do storeblk() with virtmem lock released · fb4bfb32

Kirill Smelkov authored 8 years ago

Like with loadblk (see f49c11a3 "bigfile/virtmem: Do loadblk() with
virtmem lock released" for the reference) storeblk() calls are
potentially slow and external code that serves the call can take other
locks in addition to virtmem lock taken by virtmem subsystem.
If those "other locks" are also taken before external code calls e.g.
fileh_invalidate_page() in a different codepath - a deadlock can happen:

      T1                  T2

      commit              invalidation-from-server received
      V -> storeblk
                          Z   <- ClientStorage.invalidateTransaction()
      Z -> zeo.store
                          V   <- fileh_invalidate_page (of unrelated page)

The solution to avoid deadlock, like for loadblk case, is to call storeblk()
with virtmem lock released.

However unlike loadblk which can be invoked at any time, storeblk is
invoked at commit time only so for storeblk case we handle rules for making
sure virtmem stays consistent after virtmem lock is retaken differently:

1. We disallow several parallel writeouts for one fileh. This way dirty
   pages handling logic can not mess up. This restriction is also
   consistent with ZODB 2 phase commit protocol where for a transaction
   commit logic is invoked/handled from only 1 thread.

2. For the same reason we disallow discard while writeout is in
   progress. This is also consistent with ZODB 2 phase commit protocol
   where txn.tpc_abort() is not expected to be called at the same time
   with txn.commit().

3. While writeout is in progress, for that fileh we disallow pages
   modifications and pages invalidations - because both operations would
   change at least fileh dirty pages list which is iterated over by
   writeout code with releasing/retaking the virtmem lock. By
   disallowing them we make sure fileh dirty pages list stays constant
   during whole fileh writeout.

   These restrictions are also consistent with ZODB commit semantics:

   - while an object is being stored into ZODB it is not expected it
     will be further modified or explicitly invalidated by client via
     ._p_invalidate()

   - server initiated invalidations come into effect only at transaction
     boundaries - when new transaction is started, not during commit time.

Also since now storeblk is called with virtmem lock released, for buffer
to store we no longer can use present page mapping in some vma directly,
because while virtmem lock is released those mappings can go away.

Fixes: nexedi/wendelin.core#6
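
The rules above can be sketched as a small Python model. It is an illustration under stated assumptions, not the real implementation: `virt_lock` is a plain `threading.Lock` standing in for the virtmem lock, `FileH` and its `dirty_pages` dict are hypothetical, and errors are reported with `RuntimeError` only for the sake of the example. It shows storeblk() being called with the lock released on a private copy of the page data, while parallel writeout and invalidation are refused.

```python
import threading

virt_lock = threading.Lock()    # stands in for the virtmem lock


class FileH:
    def __init__(self, storeblk):
        self.storeblk = storeblk            # hypothetical callable(blk, buf)
        self.dirty_pages = {}               # blk -> bytearray
        self.writeout_inprogress = False


def fileh_dirty_writeout(fileh):
    virt_lock.acquire()
    try:
        if fileh.writeout_inprogress:
            raise RuntimeError("parallel writeout of one fileh is disallowed")
        fileh.writeout_inprogress = True
        try:
            for blk in sorted(fileh.dirty_pages):
                # Private copy: a vma mapping could go away while unlocked.
                buf = bytes(fileh.dirty_pages[blk])
                virt_lock.release()
                try:
                    fileh.storeblk(blk, buf)    # slow call, lock NOT held
                finally:
                    virt_lock.acquire()
            fileh.dirty_pages.clear()
        finally:
            fileh.writeout_inprogress = False
    finally:
        virt_lock.release()


def fileh_invalidate_page(fileh, blk):
    with virt_lock:
        if fileh.writeout_inprogress:
            raise RuntimeError("invalidation during writeout is disallowed")
        fileh.dirty_pages.pop(blk, None)
```

Because the dirty pages set cannot change while `writeout_inprogress` is set, iterating over it across the release/retake points stays safe in this model.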

fb4bfb32

bigfile/virtmem: Maintain dirty pages list for a fileh · 8bb7f2f2

Kirill Smelkov authored 8 years ago

This allows writeout code not to scan whole pagemap to find dirty pages
to write out, which should be faster.

But more importantly, iterating the whole pagemap on writeout would become
unsafe once, in an upcoming patch, storeblk() is called with virt_lock
released: the pagemap could then be modified, e.g. while processing
other read accesses.

So maintain fileh->dirty_pages list and use it when we need to go
through dirtied pages.

Updates: nexedi/wendelin.core#6
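
A minimal Python model of the idea, assuming hypothetical `Page` and `FileH` classes and a `page_mark_dirty()` helper that are not taken from the actual code: pages are linked into a separate dirty list when they first become dirty, so writeout never has to walk the whole pagemap.

```python
PAGE_LOADED, PAGE_DIRTY = "loaded", "dirty"


class Page:
    def __init__(self, blk):
        self.blk = blk
        self.state = PAGE_LOADED


class FileH:
    def __init__(self):
        self.pagemap = {}       # blk -> Page, every page known to the fileh
        self.dirty_pages = []   # only the pages currently in PAGE_DIRTY state


def page_mark_dirty(fileh, page):
    # Link the page into the dirty list only on the clean -> dirty transition.
    if page.state != PAGE_DIRTY:
        page.state = PAGE_DIRTY
        fileh.dirty_pages.append(page)


def fileh_dirty_writeout(fileh, storeblk):
    # Cost is proportional to the number of dirty pages, not to len(pagemap).
    for page in fileh.dirty_pages:
        storeblk(page.blk)
        page.state = PAGE_LOADED
    fileh.dirty_pages.clear()
```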

8bb7f2f2

14 Jul, 2016 1 commit

bigfile/virtmem: usleep() needs unistd.h · d9d6409f

Kirill Smelkov authored 8 years ago

The following started to appear after recent gcc upgrade on my host:

    bigfile/virtmem.c: In function `vma_on_pagefault':
    bigfile/virtmem.c:696:9: warning: implicit declaration of function `usleep' [-Wimplicit-function-declaration]
    usleep(10000); // XXX with 1000 uslepp still busywaits

d9d6409f

15 Dec, 2015 3 commits

bigfile/virtmem: Do loadblk() with virtmem lock released · f49c11a3

Kirill Smelkov authored 9 years ago

loadblk() calls are potentially slow and external code that serves the call can
take other locks in addition to virtmem lock taken by virtmem subsystem. If
those "other locks" are also taken before external code calls e.g.
fileh_invalidate_page() in a different codepath, a deadlock can happen, e.g.

      T1                  T2

      page-access         invalidation-from-server received
      V -> loadblk
                          Z   <- ClientStorage.invalidateTransaction()
      Z -> zeo.load
                          V   <- fileh_invalidate_page

The solution to avoid deadlock is to call loadblk() with virtmem lock released
and upon loadblk() completion recheck virtmem data structures carefully.

To make that happen:

- new page state is introduced:

    PAGE_LOADING                (file content loading is  in progress)

- virtmem releases virt_lock before calling loadblk() when serving pagefault

- because loading is now done with virtmem lock released, now:

1. After loading completes we need to recheck fileh/vma data structures

   The recheck is done in full - vma_on_pagefault() just asks its driver (see
   VM_RETRY and VM_HANDLED codes) to retry handling the fault completely. This
   should work as the freshly loaded page was just inserted into fileh->pagemap
   and should be found there in the cache on next lookup.

   On the other hand this also works correctly, if there was concurrent change
   - e.g. vma was unmapped while we were loading the data - in that case the
   fault will be also processed correctly - but loaded data will stay in
   fileh->pagemap (and if not used will be evicted as not-needed
   eventually by RAM reclaim).

2. A similar retrying mechanism is used for cases when two threads
   concurrently access the same page and would both try to load the corresponding
   block - only one thread issues the actual loadblk() and the other waits for the
   load to complete with polling and VM_RETRY.

3. To correctly invalidate loading-in-progress pages another new page state
   is introduced:

    PAGE_LOADING_INVALIDATED    (file content loading was in progress
                                 while request to invalidate the page came in)

   which fileh_invalidate_page() uses to propagate invalidation message to
   loadblk() caller.

4. Blocks loading can now happen in parallel with other block loading and
   other virtmem operations - e.g. invalidation. For such cases tests are added
   to test_thread.py

5. virtmem lock now becomes just regular lock, instead of being previously
   recursive.

   A recursive virtmem lock was needed for cases when code under
   loadblk() could trigger other virtmem calls, e.g. due to GC and calling
   another VMA dtor that would want to lock virtmem, but virtmem lock was
   already held.

   This is no longer needed.

6. To catch double faults we now cannot use just one static variable
   in_on_pagefault. That variable thus becomes thread-local.

7. Old test in test_thread to "test that access vs access don't overlap" no
   longer holds true - and is thus removed.

/cc @Tyagov, @klaus
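
The page-state protocol described above can be sketched in Python roughly as follows. This is a simplified model under assumptions, not the C implementation: pages are plain dicts, `virt_lock` is a `threading.Lock`, the `handle_fault()` driver loop and its attempt counter are hypothetical, and the step of actually mapping a page into a vma is left out. The state and return-code names (PAGE_LOADING, PAGE_LOADING_INVALIDATED, VM_RETRY, VM_HANDLED) follow the message.

```python
import threading
import time

VM_HANDLED, VM_RETRY = "handled", "retry"
PAGE_LOADING = "loading"
PAGE_LOADING_INVALIDATED = "loading-invalidated"
PAGE_LOADED = "loaded"

virt_lock = threading.Lock()    # plain, non-recursive lock


class FileH:
    def __init__(self, loadblk):
        self.loadblk = loadblk  # hypothetical callable(blk) -> bytes
        self.pagemap = {}       # blk -> {"state": ..., "data": ...}


def vma_on_pagefault(fileh, blk):
    with virt_lock:
        page = fileh.pagemap.get(blk)
        if page is not None:
            if page["state"] == PAGE_LOADED:
                return VM_HANDLED       # the real code would map the page here
            return VM_RETRY             # someone else is loading: caller polls
        page = {"state": PAGE_LOADING, "data": None}
        fileh.pagemap[blk] = page
    data = fileh.loadblk(blk)           # slow; virt_lock is NOT held
    with virt_lock:
        if page["state"] == PAGE_LOADING_INVALIDATED:
            if fileh.pagemap.get(blk) is page:
                del fileh.pagemap[blk]  # stale data: drop it, reload on retry
        else:
            page["state"], page["data"] = PAGE_LOADED, data
    return VM_RETRY                     # recheck everything from scratch


def fileh_invalidate_page(fileh, blk):
    with virt_lock:
        page = fileh.pagemap.get(blk)
        if page is None:
            return
        if page["state"] in (PAGE_LOADING, PAGE_LOADING_INVALIDATED):
            page["state"] = PAGE_LOADING_INVALIDATED  # tell the loadblk() caller
        else:
            del fileh.pagemap[blk]


def handle_fault(fileh, blk):
    """Driver loop: retry until the fault is handled; returns the attempt count."""
    attempts = 1
    while vma_on_pagefault(fileh, blk) == VM_RETRY:
        attempts += 1
        time.sleep(0.001)               # polling, as in the message
    return attempts
```

In this model an uncontended fault takes two attempts: the first one loads the page and asks for a retry, the second one finds it in the pagemap.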

f49c11a3

bigfile/virtmem: Factor functionality to unlock/retake GIL into own functions · 0231a65d

Kirill Smelkov authored 9 years ago

Previously we were doing virt_lock() / virt_unlock() which automatically
were making sure to unlock GIL before locking virtmem, and to restore
GIL state to previous after virtmem lock happened. virt_unlock() was
unlocking just the virtmem lock without touching GIL at all - that works
because the running code would eventually release GIL as Python
regularly does so to allow multiple threads to run.

In the next patch however, we'll need to wait for in-progress-loading
page to complete, and that wait has to be done with GIL released (so
other python threads could run), and for doing so we'll need
functionality to make sure GIL is unlocked and retake it back, not tied
to virt_lock().

So factor it out.

0231a65d

bigfile/virtmem: Remove obsolete XXX about locking · 81bf620c

Kirill Smelkov authored 9 years ago

Both comments are from the beginning - from 9a293c2d (bigfile/virtmem:
Userspace Virtual Memory Manager) - but d53271b9 patch (bigfile/virtmem:
Big Virtmem lock) failed to update them.

81bf620c

17 Aug, 2015 1 commit

bigfile/virtmem: Client API to invalidate a fileh page · cb779c7b

Kirill Smelkov authored 9 years ago

FileH is a handle representing snapshot of a file. If, for a pgoffset,
fileh already has loaded page, but we know the content of the file has
changed externally after loading has been done, we need to propagate to
fileh that such-and-such page should be invalidated (and reloaded on
next access).

This patch introduces

    fileh_invalidate_page(fileh, pgoffset)

to do just that.

In the next patch we'll use this facility to propagate invalidations of
ZBlk ZODB objects to virtmem subsystem.

NOTE

Since invalidation removes "dirtiness" from a page state, several
subsequent invalidations can make a fileh completely non-dirty
(invalidating all dirty pages). Previously fileh->dirty was just
one bit, so we needed to improve how we track dirtiness.

One way would be to have a dirty list for fileh pages and operate on
that. This has the advantage of also optimizing dirty pages processing like
fileh_dirty_writeout(), where we currently scan through all fileh pages
just to write only PAGE_DIRTY ones.

Another simpler way is to make fileh->dirty a counter and maintain that.

Since we are going to move virtmem subsystem back into the kernel, here,
a simpler less-intrusive approach is used.
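
The counter approach can be illustrated with a short Python model. Only the `fileh_invalidate_page(fileh, pgoffset)` signature comes from the message; the `FileH` class, `page_write()` and `fileh_isdirty()` are assumptions introduced for this sketch.

```python
PAGE_LOADED, PAGE_DIRTY = "loaded", "dirty"


class FileH:
    def __init__(self):
        self.pagemap = {}   # pgoffset -> page state
        self.dirty = 0      # number of PAGE_DIRTY pages (previously a single bit)


def page_write(fileh, pgoffset):
    # Count the page only on its clean -> dirty transition.
    if fileh.pagemap.get(pgoffset) != PAGE_DIRTY:
        fileh.pagemap[pgoffset] = PAGE_DIRTY
        fileh.dirty += 1


def fileh_invalidate_page(fileh, pgoffset):
    # Forget the page so it is reloaded on next access; if it was dirty,
    # its "dirtiness" goes away with it.
    state = fileh.pagemap.pop(pgoffset, None)
    if state == PAGE_DIRTY:
        fileh.dirty -= 1


def fileh_isdirty(fileh):
    return fileh.dirty > 0
```

With a single bit there would be no way to tell when the last dirty page has been invalidated; the counter reaches zero exactly at that point.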

cb779c7b

06 Aug, 2015 3 commits

bigfile/virtmem: Big Virtmem lock · d53271b9

Kirill Smelkov authored 9 years ago

At present several threads running can corrupt internal virtmem
datastructures (e.g. ram->lru_list, fileh->pagemap, etc).

This can happen even if we have zope instances with only 1 worker thread,
because there are other "system" threads, and Python garbage collection
can trigger in any thread. So if a virtmem object, e.g. a VMA or FileH, was
sitting in the GC queue waiting to be collected, its collection, and thus
e.g. vma_unmap() and fileh_close(), will be called from a thread different
from the worker.

Because of that virtmem just has to be aware of threads not to allow
internal datastructure corruption.

On the other hand, the idea of introducing userspace virtual memory
manager turned out to be not so good from performance and complexity
point of view, and thus the plan is to try to move it back into the
kernel. This way it does not make sense to do a well-optimised locking
implementation for userspace version.

So we do just a simple single "protect-all" big lock for virtmem.

Of a particular note is interaction with Python's GIL - any long-lived
lock has to be taken with GIL released, because else it can deadlock:

    t1  t2

    G
    V   G
   !G   V
    G

so we introduce helpers to make sure the GIL is not taken, and to retake
it back if we were holding it initially.

Those helpers (py_gil_ensure_unlocked / py_gil_retake_if_waslocked) are
symmetrical opposites to what Python provides to make sure the GIL is
locked (via PyGILState_Ensure / PyGILState_Release).

Otherwise, the patch is a more-or-less straightforward application of the
one-big-lock-to-protect-everything idea.

d53271b9

bigfile/virtmem: When restoring SIGSEGV, don't change procmask for other signals · d7c33cd7

Kirill Smelkov authored 9 years ago

We factored out SIGSEGV block/restore from fileh_dirty_writeout() to all
functions in cb7a7055 (bigfile/virtmem: Block/restore SIGSEGV in
non-pagefault-handling function). The restoration however just sets
whole thread sigmask.

It could be possible that between block/restore calls procmask for other
signals could be changed, and this way - setting procmask directly - we
will overwrite them.

So be careful, and when restoring SIGSEGV mask, touch mask bit for only
that signal.

( we need xsigismember helper to get this done, which is also introduced
  in this patch )
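
The difference between restoring the whole mask and restoring one bit can be shown with Python's `signal.pthread_sigmask` (Unix only). This is a sketch of the technique, not the C code from the patch: the function names `sigsegv_block()` and `sigsegv_restore()` are chosen for illustration, and set membership on the returned mask plays the role that the xsigismember helper plays in C.

```python
import signal


def sigsegv_block():
    """Block SIGSEGV; return True if it was already blocked before the call."""
    oldmask = signal.pthread_sigmask(signal.SIG_BLOCK, {signal.SIGSEGV})
    return signal.SIGSEGV in oldmask


def sigsegv_restore(was_blocked):
    """Restore only the SIGSEGV bit; mask bits of other signals stay untouched."""
    if not was_blocked:
        signal.pthread_sigmask(signal.SIG_UNBLOCK, {signal.SIGSEGV})
```

Had the restore step reinstalled the saved mask with `SIG_SETMASK`, any change made to other signals between the two calls would be overwritten; touching only the SIGSEGV bit avoids that.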

d7c33cd7

bigfile/virtmem: Block/restore SIGSEGV in non-pagefault-handling function · cb7a7055

Kirill Smelkov authored 9 years ago

Non on-pagefault code should not access any not-mmapped memory.

Here we just refactor the code we already had to block/restore
SIGSEGV from fileh_dirty_writeout() and use it in all functions called
from non-pagefaulting context, as promised.

This way, if there is an error in virtmem implementation which
incorrectly accesses memory of maps prepared for BigFile, we'll just die
with coredump instead of trying to incorrectly handle the pagefault.

cb7a7055

03 Apr, 2015 2 commits

bigfile/virtmem: Userspace Virtual Memory Manager · 9a293c2d

Kirill Smelkov authored 9 years ago

Does similar things to what the kernel does - users can mmap file parts into
address space and access them read/write. The manager will be
invoked by hardware/OS kernel for cases when there is no page loaded for
read, or when a previously read-only page is being written to.

In addition to the features provided by the kernel, it supports storing
changes back in a transactional way (see fileh_dirty_writeout()) and can
potentially use huge pages for mappings (though this is currently TODO)

9a293c2d

bigfile: Stub for virtmem · 77d61533

Kirill Smelkov authored 9 years ago

This will be the core of virtual memory subsystem. For now we just
define a structure to describe pages of memory and add utility to
allocate address space from OS.

77d61533