1. 31 Mar, 2020 1 commit
    • feat: put all temporary files in the same directory and clean them up on start · 5b09ebfb
      Steven Allen authored
      Otherwise, if we repeatedly stop the same node, we'll collect a bunch of
      temporary files and NEVER DELETE them. This will:
      
      1. Waste space.
      2. Slow down queries/GC.
      
      This patch:
      
      1. Moves all temporary files to a single `.temp` directory. The leading `.`
      means it can't conflict with any keys, and queries (even on older flatfs
      versions) will skip it.
      2. Adds an `rm -rf flatfs-dir/.temp` call on start.
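
      The startup cleanup described above can be sketched roughly as follows. This is a minimal illustration; the names `openDatastore` and `tempDirName` are assumptions, not the actual go-ds-flatfs identifiers:

      ```go
      package main

      import (
      	"fmt"
      	"os"
      	"path/filepath"
      )

      const tempDirName = ".temp" // leading dot: cannot collide with encoded keys

      // openDatastore sketches the startup path: remove any temporary files
      // left over from a previous run, then recreate the temp directory.
      func openDatastore(root string) error {
      	tempDir := filepath.Join(root, tempDirName)
      	if err := os.RemoveAll(tempDir); err != nil { // the "rm -rf" on start
      		return err
      	}
      	return os.Mkdir(tempDir, 0755)
      }

      func main() {
      	root, _ := os.MkdirTemp("", "flatfs")
      	defer os.RemoveAll(root)

      	// Simulate a leftover temp file from a previous, unclean run.
      	os.Mkdir(filepath.Join(root, tempDirName), 0755)
      	os.WriteFile(filepath.Join(root, tempDirName, "put-123"), []byte("x"), 0644)

      	if err := openDatastore(root); err != nil {
      		panic(err)
      	}
      	entries, _ := os.ReadDir(filepath.Join(root, tempDirName))
      	fmt.Println("leftover temp files:", len(entries))
      }
      ```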
  2. 14 Feb, 2020 1 commit
    • fix: be explicit about key limitations · 4ca877d4
      Steven Allen authored
      Only allow keys of the form `/[0-9A-Z+-_=]`. That is, upper-case alphanumeric
      keys in the root namespace (plus some special characters).
      
      Why? We don't encode keys before writing them to the filesystem. This change
      ensures that:
      
      1. Case sensitivity doesn't matter because we only allow upper-case keys.
      2. Path separators and special characters don't matter.
      
      For context, go-ipfs only uses flatfs for storing blocks. Every block CID is
      encoded as uppercase alphanumeric text (specifically, uppercase base32).
      
      We could be less restrictive, but this is safer and easier to understand.
      Unfortunately, we _can't_ allow mixed case (Windows filesystems are
      case-insensitive) and can't allow lowercase because we're already using
      uppercase keys.
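
      A minimal sketch of the key restriction. The pattern below is one reading of the rule stated above; the actual validation code in go-ds-flatfs may differ:

      ```go
      package main

      import (
      	"fmt"
      	"regexp"
      )

      // keyPattern mirrors the documented restriction: a single root-namespace
      // component of upper-case alphanumerics plus +, -, _ and =.
      var keyPattern = regexp.MustCompile(`^/[0-9A-Z+\-_=]+$`)

      func validKey(k string) bool { return keyPattern.MatchString(k) }

      func main() {
      	// An uppercase base32 CID, as go-ipfs stores blocks: accepted.
      	fmt.Println(validKey("/CIQOWMPLZEWN3SSRQB7RFNSNBHULCUAH4F6QX4RURQGQFM7TDPNN7AQ"))
      	fmt.Println(validKey("/blocks/ABC")) // nested namespace: rejected
      	fmt.Println(validKey("/abc"))        // lowercase: rejected
      }
      ```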
      
      fixes #23
  3. 21 Aug, 2019 2 commits
  4. 01 Mar, 2019 1 commit
  5. 18 Dec, 2018 1 commit
  6. 04 Oct, 2018 2 commits
  7. 13 Aug, 2018 1 commit
  8. 27 Mar, 2018 1 commit
  9. 24 Mar, 2018 2 commits
  10. 23 Mar, 2018 1 commit
  11. 09 Mar, 2018 1 commit
    • Feat: Implement a PersistentDatastore by adding DiskUsage method (#27) · a095ff54
      Hector Sanjuan authored
      * Feat: Implement a PersistentDatastore by adding DiskUsage method
      
      This adds DiskUsage().
      
      This datastore would take a big performance hit if we walked the
      filesystem to calculate disk usage every time.
      
      Therefore I have opted to keep track of the current disk usage by
      walking the filesystem once during "Open" and then adding/subtracting
      file sizes on Put/Delete operations.
      
      On the plus side:
        * Small perf impact
        * Always up to date values
        * No chance that race conditions will leave DiskUsage with wrong values
      
      On the minus side:
        * Slower Open() - it runs Stat() on all files in the datastore
        * Size does not match the real size if a directory grows large
          (at least on ext4 systems). We don't track directory-size changes;
          we only use the size at creation.
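
      The book-keeping described above can be sketched as an atomically maintained counter, seeded once at Open time and adjusted on Put/Delete. The type and method names are illustrative, not the actual go-ds-flatfs fields:

      ```go
      package main

      import (
      	"fmt"
      	"sync"
      	"sync/atomic"
      )

      type diskUsage struct {
      	bytes int64
      }

      func (d *diskUsage) put(size int64)    { atomic.AddInt64(&d.bytes, size) }
      func (d *diskUsage) delete(size int64) { atomic.AddInt64(&d.bytes, -size) }
      func (d *diskUsage) DiskUsage() int64  { return atomic.LoadInt64(&d.bytes) }

      func main() {
      	// Pretend Open() walked the store once and found 1000 bytes.
      	du := &diskUsage{bytes: 1000}

      	// Concurrent, balanced puts and deletes must leave the value intact.
      	var wg sync.WaitGroup
      	for i := 0; i < 100; i++ {
      		wg.Add(1)
      		go func() {
      			defer wg.Done()
      			du.put(10)
      			du.delete(10)
      		}()
      	}
      	wg.Wait()
      	fmt.Println("disk usage:", du.DiskUsage())
      }
      ```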
      
      * Update .travis.yml: latest go
      
      * DiskUsage: cache diskUsage on Close()
      
      Avoids walking the whole datastore when a clean shutdown happened.
      
      The file is removed on read, so a datastore that was not shut down
      cleanly will not find an outdated file later.
      
      * Manage diskUsage with atomic.AddInt64 (no channel). Use tmp file + rename.
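
      The "tmp file + rename" pattern mentioned above can be sketched like this: write the cached value to a temporary file, then rename it into place so readers never observe a partially written checkpoint, and consume the file on read. The file names and function names are illustrative:

      ```go
      package main

      import (
      	"fmt"
      	"os"
      	"path/filepath"
      )

      // writeCheckpoint writes the value to a temp file, then renames it into
      // place (an atomic replace on POSIX filesystems).
      func writeCheckpoint(dir string, usage int64) error {
      	tmp := filepath.Join(dir, "diskUsage.cache.tmp")
      	if err := os.WriteFile(tmp, []byte(fmt.Sprintf("%d", usage)), 0644); err != nil {
      		return err
      	}
      	return os.Rename(tmp, filepath.Join(dir, "diskUsage.cache"))
      }

      // readCheckpoint consumes the file and removes it, so an unclean
      // shutdown cannot leave a stale value to be trusted later.
      func readCheckpoint(dir string) (string, error) {
      	path := filepath.Join(dir, "diskUsage.cache")
      	b, err := os.ReadFile(path)
      	if err != nil {
      		return "", err
      	}
      	return string(b), os.Remove(path)
      }

      func main() {
      	dir, _ := os.MkdirTemp("", "flatfs")
      	defer os.RemoveAll(dir)

      	if err := writeCheckpoint(dir, 12345); err != nil {
      		panic(err)
      	}
      	v, _ := readCheckpoint(dir)
      	fmt.Println("cached usage:", v)
      	_, err := readCheckpoint(dir) // file was consumed by the first read
      	fmt.Println("second read ok:", err == nil)
      }
      ```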
      
      * Remove redundant comments
      
      * Address race conditions when writing/deleting the same key concurrently
      
      This improves diskUsage book-keeping when writing and deleting the same
      key concurrently. It does mean, however, that existing values in the
      datastore cannot be replaced without an explicit delete (before put).
      
      A new test checks that there are no double counts in a put/delete
      race-condition environment. This holds when sync is enabled; with
      syncing disabled, deleting files concurrently with puts causes small
      over-counting.
      
      * Document that datastore Put does not replace values
      
      * Comment TestPutOverwrite
      
      * Implement locking and discard for concurrent operations on the same key
      
      This implements the approach suggested by @stebalien in
      https://github.com/ipfs/go-ds-flatfs/pull/27
      
      Write operations (delete/put) to the same key are tracked in a map
      which provides a shared lock. Concurrent operations to that key
      will share that lock. If one operation succeeds, it will remove
      the lock from the map and the others using it will automatically
      succeed. If one operation fails, it will let the others waiting
      for the lock try.
      
      New operations to that key will request a new lock.
      
      A new test for putMany (batching) has been added.
      
      Worth noting: a concurrent Put+Delete on a non-existing key
      always yields Put as the winner (the delete will fail if it comes first,
      or will be skipped if it comes second).
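
      The shared-lock scheme described above can be sketched as a map from key to a per-operation lock. This is a simplified illustration of the idea (first writer does the work, others observe success or retry on failure), not the exact go-ds-flatfs implementation:

      ```go
      package main

      import (
      	"fmt"
      	"sync"
      )

      // opResult is the shared state for all writers racing on one key.
      type opResult struct {
      	mu   sync.Mutex
      	done bool
      }

      type opMap struct{ ops sync.Map } // key -> *opResult

      func (m *opMap) doWrite(key string, write func() error) error {
      	v, _ := m.ops.LoadOrStore(key, &opResult{})
      	op := v.(*opResult)
      	op.mu.Lock()
      	defer op.mu.Unlock()
      	if op.done { // another concurrent writer already succeeded
      		return nil
      	}
      	if err := write(); err != nil {
      		return err // leave the lock in place; waiters get to try
      	}
      	op.done = true
      	m.ops.Delete(key) // new operations on this key get a fresh lock
      	return nil
      }

      func main() {
      	var m opMap
      	writes := 0
      	ok := func() error { writes++; return nil }
      	fail := func() error { return fmt.Errorf("disk full") }

      	fmt.Println("first attempt failed:", m.doWrite("KEY", fail) != nil)
      	fmt.Println("retry succeeded:", m.doWrite("KEY", ok) == nil)
      	fmt.Println("physical writes:", writes)
      }
      ```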
      
      * Do fewer operations in tests (Travis fails on macOS)
      
      * Reduce counts again
      
      * DiskUsage: address comments. Use sync.Map.
      
      * Add rw and rwundo rules to Makefile
      
      * DiskUsage: use one-off locks for operations
      
      Per @stebalien 's suggestion.
      
      * DiskUsage: write checkpoint file when du changes by more than 1 percent
      
      That is, if the current value differs from the checkpointed value by
      more than one percent, we write a new checkpoint.
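
      The one-percent rule can be sketched with integer arithmetic (the function name is illustrative):

      ```go
      package main

      import "fmt"

      // shouldCheckpoint reports whether the in-memory disk usage has drifted
      // more than 1% from the last value written to the checkpoint file.
      func shouldCheckpoint(checkpointed, current int64) bool {
      	diff := current - checkpointed
      	if diff < 0 {
      		diff = -diff
      	}
      	return diff*100 > checkpointed
      }

      func main() {
      	fmt.Println(shouldCheckpoint(10000, 10050)) // 0.5% drift: skip
      	fmt.Println(shouldCheckpoint(10000, 10200)) // 2% drift: write checkpoint
      }
      ```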
      
      * Fix tests so they ignore disk usage cache file
      
      * Rename: update disk usage when rename fails too.
      
      * Improve rename comment and be less explicit on field initialization
      
      * Do not use filepath.Walk; use Readdir instead.
      
      * Estimate diskUsage for folders with more than 100 files
      
      This estimates disk usage when folders have more than
      100 files in them. Unprocessed files are assumed to have
      the average size of the processed ones.
      
      * Select file randomly when there are too many to read
      
      * Fix typo
      
      * fix tests
      
      * Set time deadline to 5 minutes.
      
      This provides a deadline for the disk-usage estimation. We will stat()
      as many files as possible until we run out of time; if that happens,
      the rest are estimated using the average size.

      The user is informed of the slow operation and, if we ran out of time,
      of how to obtain better accuracy.
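
      The sampling strategy can be sketched like this: sum the sizes of as many files as the budget allows, then extrapolate the rest at the average of the sampled sizes. The slice of sizes stands in for Stat() results; the real code also samples files randomly and uses a time deadline rather than a fixed count:

      ```go
      package main

      import "fmt"

      // estimateUsage sums the first `budget` sizes exactly and assumes the
      // remaining files have the average size of the sampled ones.
      func estimateUsage(sizes []int64, budget int) int64 {
      	if budget > len(sizes) {
      		budget = len(sizes)
      	}
      	if budget == 0 {
      		return 0
      	}
      	var sampled int64
      	for _, s := range sizes[:budget] {
      		sampled += s
      	}
      	avg := sampled / int64(budget)
      	return sampled + avg*int64(len(sizes)-budget)
      }

      func main() {
      	sizes := make([]int64, 1000)
      	for i := range sizes {
      		sizes[i] = 256 // every block is 256 bytes
      	}
      	fmt.Println("estimated:", estimateUsage(sizes, 100)) // samples 100, extrapolates 900
      	fmt.Println("exact:", estimateUsage(sizes, 1000))
      }
      ```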
  12. 19 Jan, 2017 3 commits
  13. 16 Jan, 2017 10 commits
  14. 17 Dec, 2016 1 commit
  15. 16 Dec, 2016 3 commits
  16. 07 Dec, 2016 2 commits
  17. 05 Oct, 2016 1 commit
  18. 28 Jun, 2016 1 commit
  19. 22 Jun, 2016 1 commit
  20. 11 Jun, 2016 1 commit
  21. 01 Jan, 2016 1 commit
  22. 09 Nov, 2015 1 commit
  23. 07 Jul, 2015 1 commit