Eric Myhre authored
If you've been following along for a while now, you don't need to see the benchmarks to know what's coming. The long story short is: allocations are the root of all evil, we got rid of some, and now things are significantly faster.

Here are the numbers. basicnode (just for a baseline to compare to):

```
BenchmarkMapStrInt_3n_AssembleStandard-8          1988986       588 ns/op     520 B/op       8 allocs/op
BenchmarkMapStrInt_3n_AssembleEntry-8             2158921       559 ns/op     520 B/op       8 allocs/op
BenchmarkMapStrInt_3n_Iteration-8                19679841      67.0 ns/op      16 B/op       1 allocs/op
BenchmarkSpec_Marshal_Map3StrInt-8                1377094       870 ns/op     544 B/op       7 allocs/op
BenchmarkSpec_Marshal_Map3StrInt_CodecNull-8      4560031       278 ns/op     176 B/op       3 allocs/op
BenchmarkSpec_Unmarshal_Map3StrInt-8               368763      3239 ns/op    1608 B/op      32 allocs/op
```

realgen, previously, using fcb:

```
BenchmarkMapStrInt_3n_AssembleStandard-8          4293072       278 ns/op     208 B/op       5 allocs/op
BenchmarkMapStrInt_3n_AssembleEntry-8             4643892       259 ns/op     208 B/op       5 allocs/op
BenchmarkMapStrInt_3n_Iteration-8                20307603      59.9 ns/op      16 B/op       1 allocs/op
BenchmarkSpec_Marshal_Map3StrInt-8                1346115       913 ns/op     544 B/op       7 allocs/op
BenchmarkSpec_Marshal_Map3StrInt_CodecNull-8      4606304       256 ns/op     176 B/op       3 allocs/op
BenchmarkSpec_Unmarshal_Map3StrInt-8               425662      2793 ns/op    1160 B/op      27 allocs/op
```

realgen, new, improved:

```
BenchmarkMapStrInt_3n_AssembleStandard-8          6138765       183 ns/op     129 B/op       3 allocs/op
BenchmarkMapStrInt_3n_AssembleEntry-8             7276795       176 ns/op     129 B/op       3 allocs/op
BenchmarkMapStrInt_3n_Iteration-8                19593212      67.2 ns/op      16 B/op       1 allocs/op
BenchmarkSpec_Marshal_Map3StrInt-8                1309916       912 ns/op     544 B/op       7 allocs/op
BenchmarkSpec_Marshal_Map3StrInt_CodecNull-8      4579935       257 ns/op     176 B/op       3 allocs/op
BenchmarkSpec_Unmarshal_Map3StrInt-8               465195      2599 ns/op    1080 B/op      25 allocs/op
```

So! Assembly is roughly 1.5x faster with the new no-callback system than with gen using fcb (278 → 183 ns/op), and codegen structs now assemble roughly 3.2x faster than the basicnode map (588 → 183 ns/op). That's the kind of ratio I was looking for :)

As with all of these measurements, the gaps will get much bigger on bigger corpuses. Some of the improvements here are O(n) -> O(1), and some apply even more heartily in deeper trees, etc. But it's telling that even on very small corpuses, the impact is already huge.
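For the curious, here's a rough sketch of the kind of allocation the no-callback approach avoids. This is purely illustrative Go, not go-ipld-prime's actual API (every type and method name below is made up for the example): the parent assembler embeds its child assembler by value and hands out a pointer into itself for each entry, rather than taking a per-entry callback, so assembling n entries doesn't pay an extra heap allocation (closure or otherwise) per entry.

```
package main

import "fmt"

// Toy sketch of the allocation-saving shape (illustrative names only; this is
// NOT go-ipld-prime's actual API).  The parent assembler embeds its child
// assembler by value, so handing the caller a pointer to it costs nothing
// extra, whereas a per-entry callback style tends to allocate per entry.

type mapAssembler struct {
	entries map[string]int64
	child   valueAssembler // embedded by value; reused for every entry
}

type valueAssembler struct {
	parent *mapAssembler
	key    string
}

// AssembleEntry reuses the embedded child assembler instead of allocating a
// fresh assembler (or a closure) for each entry.
func (ma *mapAssembler) AssembleEntry(key string) *valueAssembler {
	ma.child.parent = ma
	ma.child.key = key
	return &ma.child
}

// AssignInt commits the value for the key this child assembler was handed out for.
func (va *valueAssembler) AssignInt(v int64) {
	va.parent.entries[va.key] = v
}

func main() {
	ma := &mapAssembler{entries: make(map[string]int64, 3)}
	ma.AssembleEntry("a").AssignInt(1)
	ma.AssembleEntry("b").AssignInt(2)
	ma.AssembleEntry("c").AssignInt(3)
	fmt.Println(ma.entries) // map[a:1 b:2 c:3]
}
```

(The numbers above are standard Go benchmark output; running the benchmarks with `go test -bench . -benchmem` is what produces the ns/op, B/op, and allocs/op columns.)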