Document performance caveats of `ExtractV1File` and address comments
Since `CopyFileRange` performance is OS-dependant, we cannot guarantee that `ExtractV1File` will keep copies out of user-space. For example, on Linux with sufficiently old Kernel the current behaviour will fall back to user-space copy. Document this on the function so that it is made clear. Improve benchmarking determinism and error messages. The benchmark numbers that Daniel obtained on his laptop, running Linux 5.13 on an i5-8350U with /tmp being tmpfs, are as follows. The "old" results are BenchmarkExtractV1UsingReader, and the "new" are BenchmarkExtractV1File. name old time/op new time/op delta ExtractV1-8 1.33ms ± 1% 1.11ms ± 2% -16.48% (p=0.000 n=8+8) name old speed new speed delta ExtractV1-8 7.88GB/s ± 1% 9.43GB/s ± 2% +19.74% (p=0.000 n=8+8) name old alloc/op new alloc/op delta ExtractV1-8 34.0kB ± 0% 1.0kB ± 0% -97.09% (p=0.000 n=8+8) name old allocs/op new allocs/op delta ExtractV1-8 26.0 ± 0% 23.0 ± 0% -11.54% (p=0.000 n=8+8) So, at least in the case where the filesystem is very fast, we can see that the benefit is around 10-20%, as well as fewer allocs thanks to not needing a user-space buffer. The performance benefit will likely be smaller on slower disks. For the Linux syscall logic, see: - https://cs.opensource.google/go/go/+/refs/tags/go1.16.7:src/internal/poll/copy_file_range_linux.go;drc=refs%2Ftags%2Fgo1.16.7;l=54
Showing
Please register or sign in to comment