1. 31 Mar, 2020 1 commit
    • feat: put all temporary files in the same directory and clean them up on start · 5b09ebfb
      Steven Allen authored
      Otherwise, if we repeatedly stop the same node, we'll collect a bunch of
      temporary files and NEVER DELETE them. This will:
      
      1. Waste space.
      2. Slow down queries/GC.
      
      This patch:
      
      1. Moves all temporary files to a single `.temp` directory. The leading `.`
      means it can't conflict with any keys, and queries (even on older flatfs
      versions) will skip it.
      2. Adds an `rm -rf flatfs-dir/.temp` call on start.
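
      The startup cleanup described above can be sketched roughly as follows. This is a minimal illustration; the names `openDatastore` and `tempDirName` are assumptions, not the actual go-ds-flatfs identifiers:

      ```go
      package main

      import (
      	"fmt"
      	"os"
      	"path/filepath"
      )

      const tempDirName = ".temp" // leading dot: cannot collide with encoded keys

      // openDatastore sketches the startup path: remove any temporary files
      // left over from a previous run, then recreate the temp directory.
      func openDatastore(root string) error {
      	tempDir := filepath.Join(root, tempDirName)
      	if err := os.RemoveAll(tempDir); err != nil { // the "rm -rf" on start
      		return err
      	}
      	return os.Mkdir(tempDir, 0755)
      }

      func main() {
      	root, _ := os.MkdirTemp("", "flatfs")
      	defer os.RemoveAll(root)

      	// Simulate a leftover temp file from a previous, unclean run.
      	os.Mkdir(filepath.Join(root, tempDirName), 0755)
      	os.WriteFile(filepath.Join(root, tempDirName, "put-123"), []byte("x"), 0644)

      	if err := openDatastore(root); err != nil {
      		panic(err)
      	}
      	entries, _ := os.ReadDir(filepath.Join(root, tempDirName))
      	fmt.Println("leftover temp files:", len(entries))
      }
      ```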
  2. 14 Feb, 2020 1 commit
    • fix: be explicit about key limitations · 4ca877d4
      Steven Allen authored
      Only allow keys of the form `/[0-9A-Z+-_=]`. That is, upper-case alphanumeric
      keys in the root namespace (plus some special characters).
      
      Why? We don't encode keys before writing them to the filesystem. This change
      ensures that:
      
      1. Case sensitivity doesn't matter because we only allow upper-case keys.
      2. Path separators and special characters don't matter.
      
      For context, go-ipfs only uses flatfs for storing blocks. Every block CID is
      encoded as uppercase alphanumeric text (specifically, uppercase base32).
      
      We could be less restrictive, but this is safer and easier to understand.
      Unfortunately, we _can't_ allow mixed case (Windows filesystems are
      case-insensitive) and can't allow lowercase because we're already using
      uppercase keys.
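
      A minimal sketch of the key restriction. The pattern below is one reading of the rule stated above; the actual validation code in go-ds-flatfs may differ:

      ```go
      package main

      import (
      	"fmt"
      	"regexp"
      )

      // keyPattern mirrors the documented restriction: a single root-namespace
      // component of upper-case alphanumerics plus +, -, _ and =.
      var keyPattern = regexp.MustCompile(`^/[0-9A-Z+\-_=]+$`)

      func validKey(k string) bool { return keyPattern.MatchString(k) }

      func main() {
      	// An uppercase base32 CID, as go-ipfs stores blocks: accepted.
      	fmt.Println(validKey("/CIQOWMPLZEWN3SSRQB7RFNSNBHULCUAH4F6QX4RURQGQFM7TDPNN7AQ"))
      	fmt.Println(validKey("/blocks/ABC")) // nested namespace: rejected
      	fmt.Println(validKey("/abc"))        // lowercase: rejected
      }
      ```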
      
      fixes #23
  3. 21 Aug, 2019 2 commits
  4. 01 Mar, 2019 1 commit
  5. 18 Dec, 2018 1 commit
  6. 04 Oct, 2018 2 commits
  7. 13 Aug, 2018 1 commit
  8. 27 Mar, 2018 1 commit
  9. 24 Mar, 2018 2 commits
  10. 23 Mar, 2018 1 commit
  11. 09 Mar, 2018 1 commit
    • Feat: Implement a PersistentDatastore by adding DiskUsage method (#27) · a095ff54
      Hector Sanjuan authored
      * Feat: Implement a PersistentDatastore by adding DiskUsage method
      
      This adds DiskUsage().
      
      This datastore would take a big performance hit if we walked the
      filesystem to calculate disk usage every time.
      
      Therefore I have opted to keep track of the current disk usage by
      walking the filesystem once during "Open" and then adding/subtracting
      file sizes on Put/Delete operations.
      
      On the plus side:
        * Small perf impact
        * Always up to date values
        * No chance that race conditions will leave DiskUsage with wrong values
      
      On the minus side:
        * Slower Open() - it runs Stat() on all files in the datastore
        * Size does not match the real size if a directory grows large
          (at least on ext4 systems). We don't track directory-size changes;
          we only use the size at creation.
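
      The book-keeping described above can be sketched as an atomically maintained counter, seeded once at Open time and adjusted on Put/Delete. The type and method names are illustrative, not the actual go-ds-flatfs fields:

      ```go
      package main

      import (
      	"fmt"
      	"sync"
      	"sync/atomic"
      )

      type diskUsage struct {
      	bytes int64
      }

      func (d *diskUsage) put(size int64)    { atomic.AddInt64(&d.bytes, size) }
      func (d *diskUsage) delete(size int64) { atomic.AddInt64(&d.bytes, -size) }
      func (d *diskUsage) DiskUsage() int64  { return atomic.LoadInt64(&d.bytes) }

      func main() {
      	// Pretend Open() walked the store once and found 1000 bytes.
      	du := &diskUsage{bytes: 1000}

      	// Concurrent, balanced puts and deletes must leave the value intact.
      	var wg sync.WaitGroup
      	for i := 0; i < 100; i++ {
      		wg.Add(1)
      		go func() {
      			defer wg.Done()
      			du.put(10)
      			du.delete(10)
      		}()
      	}
      	wg.Wait()
      	fmt.Println("disk usage:", du.DiskUsage())
      }
      ```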
      
      * Update .travis.yml: latest go
      
      * DiskUsage: cache diskUsage on Close()
      
      Avoids walking the whole datastore when a clean shutdown happened.
      
      The file is removed on read, so a datastore that was not shut down
      cleanly will not find an outdated file later.
      
      * Manage diskUsage with atomic.AddInt64 (no channel). Use tmp file + rename.
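
      The "tmp file + rename" pattern mentioned above can be sketched like this: write the cached value to a temporary file, then rename it into place so readers never observe a partially written checkpoint, and consume the file on read. The file names and function names are illustrative:

      ```go
      package main

      import (
      	"fmt"
      	"os"
      	"path/filepath"
      )

      // writeCheckpoint writes the value to a temp file, then renames it into
      // place (an atomic replace on POSIX filesystems).
      func writeCheckpoint(dir string, usage int64) error {
      	tmp := filepath.Join(dir, "diskUsage.cache.tmp")
      	if err := os.WriteFile(tmp, []byte(fmt.Sprintf("%d", usage)), 0644); err != nil {
      		return err
      	}
      	return os.Rename(tmp, filepath.Join(dir, "diskUsage.cache"))
      }

      // readCheckpoint consumes the file and removes it, so an unclean
      // shutdown cannot leave a stale value to be trusted later.
      func readCheckpoint(dir string) (string, error) {
      	path := filepath.Join(dir, "diskUsage.cache")
      	b, err := os.ReadFile(path)
      	if err != nil {
      		return "", err
      	}
      	return string(b), os.Remove(path)
      }

      func main() {
      	dir, _ := os.MkdirTemp("", "flatfs")
      	defer os.RemoveAll(dir)

      	if err := writeCheckpoint(dir, 12345); err != nil {
      		panic(err)
      	}
      	v, _ := readCheckpoint(dir)
      	fmt.Println("cached usage:", v)
      	_, err := readCheckpoint(dir) // file was consumed by the first read
      	fmt.Println("second read ok:", err == nil)
      }
      ```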
      
      * Remove redundant comments
      
      * Address race conditions when writing/deleting the same key concurrently
      
      This improves diskUsage book-keeping when writing and deleting the same
      key concurrently. It does mean, however, that existing values in the
      datastore cannot be replaced without an explicit delete (before put).
      
      A new test checks that there are no double counts in a put/delete
      race-condition environment. This holds when sync is enabled; with
      syncing disabled, deleting files concurrently with puts causes small
      over-counting.
      
      * Document that datastore Put does not replace values
      
      * Comment TestPutOverwrite
      
      * Implement locking and discard for concurrent operations on the same key
      
      This implements the approach suggested by @stebalien in
      https://github.com/ipfs/go-ds-flatfs/pull/27
      
      Write operations (delete/put) to the same key are tracked in a map
      which provides a shared lock. Concurrent operations to that key
      will share that lock. If one operation succeeds, it will remove
      the lock from the map and the others using it will automatically
      succeed. If one operation fails, it will let the others waiting
      for the lock try.
      
      New operations to that key will request a new lock.
      
      A new test for putMany (batching) has been added.
      
      Worth noting: a concurrent Put+Delete on a non-existing key
      always yields Put as the winner (the delete will fail if it comes first,
      or will be skipped if it comes second).
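
      The shared-lock scheme described above can be sketched as a map from key to a per-operation lock. This is a simplified illustration of the idea (first writer does the work, others observe success or retry on failure), not the exact go-ds-flatfs implementation:

      ```go
      package main

      import (
      	"fmt"
      	"sync"
      )

      // opResult is the shared state for all writers racing on one key.
      type opResult struct {
      	mu   sync.Mutex
      	done bool
      }

      type opMap struct{ ops sync.Map } // key -> *opResult

      func (m *opMap) doWrite(key string, write func() error) error {
      	v, _ := m.ops.LoadOrStore(key, &opResult{})
      	op := v.(*opResult)
      	op.mu.Lock()
      	defer op.mu.Unlock()
      	if op.done { // another concurrent writer already succeeded
      		return nil
      	}
      	if err := write(); err != nil {
      		return err // leave the lock in place; waiters get to try
      	}
      	op.done = true
      	m.ops.Delete(key) // new operations on this key get a fresh lock
      	return nil
      }

      func main() {
      	var m opMap
      	writes := 0
      	ok := func() error { writes++; return nil }
      	fail := func() error { return fmt.Errorf("disk full") }

      	fmt.Println("first attempt failed:", m.doWrite("KEY", fail) != nil)
      	fmt.Println("retry succeeded:", m.doWrite("KEY", ok) == nil)
      	fmt.Println("physical writes:", writes)
      }
      ```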
      
      * Do fewer operations in tests (Travis fails on macOS)
      
      * Reduce counts again
      
      * DiskUsage: address comments. Use sync.Map.
      
      * Add rw and rwundo rules to Makefile
      
      * DiskUsage: use one-off locks for operations
      
      Per @stebalien 's suggestion.
      
      * DiskUsage: write checkpoint file when du changes by more than 1 percent
      
      That is, if the current value differs from the checkpointed value by
      more than one percent, we write a new checkpoint.
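
      The one-percent rule can be sketched with integer arithmetic (the function name is illustrative):

      ```go
      package main

      import "fmt"

      // shouldCheckpoint reports whether the in-memory disk usage has drifted
      // more than 1% from the last value written to the checkpoint file.
      func shouldCheckpoint(checkpointed, current int64) bool {
      	diff := current - checkpointed
      	if diff < 0 {
      		diff = -diff
      	}
      	return diff*100 > checkpointed
      }

      func main() {
      	fmt.Println(shouldCheckpoint(10000, 10050)) // 0.5% drift: skip
      	fmt.Println(shouldCheckpoint(10000, 10200)) // 2% drift: write checkpoint
      }
      ```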
      
      * Fix tests so they ignore disk usage cache file
      
      * Rename: update disk usage when rename fails too.
      
      * Improve rename comment and be less explicit on field initialization
      
      * Do not use filepath.Walk; use Readdir instead.
      
      * Estimate diskUsage for folders with more than 100 files
      
      This estimates disk usage when folders have more than
      100 files in them. Unprocessed files are assumed to have
      the average size of the processed ones.
      
      * Select file randomly when there are too many to read
      
      * Fix typo
      
      * fix tests
      
      * Set time deadline to 5 minutes.
      
      This provides a deadline for the disk-usage estimation. We will stat()
      as many files as possible until we run out of time; if that happens,
      the rest are estimated using the average size.

      The user is informed of the slow operation and, if we ran out of time,
      of how to obtain better accuracy.
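
      The sampling strategy can be sketched like this: sum the sizes of as many files as the budget allows, then extrapolate the rest at the average of the sampled sizes. The slice of sizes stands in for Stat() results; the real code also samples files randomly and uses a time deadline rather than a fixed count:

      ```go
      package main

      import "fmt"

      // estimateUsage sums the first `budget` sizes exactly and assumes the
      // remaining files have the average size of the sampled ones.
      func estimateUsage(sizes []int64, budget int) int64 {
      	if budget > len(sizes) {
      		budget = len(sizes)
      	}
      	if budget == 0 {
      		return 0
      	}
      	var sampled int64
      	for _, s := range sizes[:budget] {
      		sampled += s
      	}
      	avg := sampled / int64(budget)
      	return sampled + avg*int64(len(sizes)-budget)
      }

      func main() {
      	sizes := make([]int64, 1000)
      	for i := range sizes {
      		sizes[i] = 256 // every block is 256 bytes
      	}
      	fmt.Println("estimated:", estimateUsage(sizes, 100)) // samples 100, extrapolates 900
      	fmt.Println("exact:", estimateUsage(sizes, 1000))
      }
      ```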
  12. 19 Jan, 2017 3 commits
  13. 16 Jan, 2017 10 commits
  14. 17 Dec, 2016 1 commit
  15. 16 Dec, 2016 3 commits
  16. 07 Dec, 2016 2 commits
  17. 05 Oct, 2016 1 commit
  18. 28 Jun, 2016 1 commit
  19. 22 Jun, 2016 1 commit
  20. 11 Jun, 2016 1 commit
  21. 01 Jan, 2016 1 commit
  22. 09 Nov, 2015 1 commit
  23. 07 Jul, 2015 1 commit