    Remove finish callback. Much faster. Bench. · 6d31b15f
    Eric Myhre authored
    If you've been following along for a while now, you don't need to see
    the benchmarks to know what's coming.  The long story short is:
    allocations are the root of all evil, and we got rid of some, and now
    things are significantly faster.
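    
    For readers who haven't been following along, here is a minimal sketch of
    the general idea, not the generated code itself.  All type and method names
    below are hypothetical.  It assumes the "finish callback" was a closure
    handed to each child assembler, and that the replacement has the child
    write through a pointer into parent-owned state instead.  A capturing
    closure typically costs a heap allocation per child; a reusable embedded
    child does not.
    
    ```go
    package main
    
    import "fmt"
    
    // Hypothetical illustration only: these are not the types genStruct.go emits.
    
    // Callback style: each child assembler carries a closure to run on finish.
    type childWithCallback struct {
    	finish func(v int) // captures parent and index; usually heap-allocated
    }
    
    func (c *childWithCallback) AssignInt(v int) { c.finish(v) }
    
    type parentWithCallback struct{ fields [3]int }
    
    func (p *parentWithCallback) AssembleField(i int) *childWithCallback {
    	// A fresh child and a fresh closure for every field.
    	return &childWithCallback{finish: func(v int) { p.fields[i] = v }}
    }
    
    // Callback-free style: the child holds a pointer to its destination slot.
    type child struct{ dest *int }
    
    func (c *child) AssignInt(v int) { *c.dest = v }
    
    type parent struct {
    	fields [3]int
    	ca     child // embedded once and re-aimed for each field
    }
    
    func (p *parent) AssembleField(i int) *child {
    	p.ca.dest = &p.fields[i]
    	return &p.ca
    }
    
    func main() {
    	a, b := &parentWithCallback{}, &parent{}
    	for i, v := range []int{10, 20, 30} {
    		a.AssembleField(i).AssignInt(v)
    		b.AssembleField(i).AssignInt(v)
    	}
    	// Both styles produce the same result; only the allocation pattern differs.
    	fmt.Println(a.fields == b.fields)
    }
    ```
    
    Whether a given closure actually escapes to the heap depends on the
    compiler's escape analysis, so treat the allocation claim as the usual
    case rather than a guarantee.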
    
    Here are the numbers:
    
    basicnode (just for a baseline to compare to):
    
    ```
    BenchmarkMapStrInt_3n_AssembleStandard-8         1988986               588 ns/op             520 B/op          8 allocs/op
    BenchmarkMapStrInt_3n_AssembleEntry-8            2158921               559 ns/op             520 B/op          8 allocs/op
    BenchmarkMapStrInt_3n_Iteration-8               19679841                67.0 ns/op            16 B/op          1 allocs/op
    BenchmarkSpec_Marshal_Map3StrInt-8               1377094               870 ns/op             544 B/op          7 allocs/op
    BenchmarkSpec_Marshal_Map3StrInt_CodecNull-8     4560031               278 ns/op             176 B/op          3 allocs/op
    BenchmarkSpec_Unmarshal_Map3StrInt-8              368763              3239 ns/op            1608 B/op         32 allocs/op
    ```
    
    realgen, previously, using fcb (the finish callback):
    
    ```
    BenchmarkMapStrInt_3n_AssembleStandard-8         4293072               278 ns/op             208 B/op          5 allocs/op
    BenchmarkMapStrInt_3n_AssembleEntry-8            4643892               259 ns/op             208 B/op          5 allocs/op
    BenchmarkMapStrInt_3n_Iteration-8               20307603                59.9 ns/op            16 B/op          1 allocs/op
    BenchmarkSpec_Marshal_Map3StrInt-8               1346115               913 ns/op             544 B/op          7 allocs/op
    BenchmarkSpec_Marshal_Map3StrInt_CodecNull-8     4606304               256 ns/op             176 B/op          3 allocs/op
    BenchmarkSpec_Unmarshal_Map3StrInt-8              425662              2793 ns/op            1160 B/op         27 allocs/op
    ```
    
    realgen, new, improved:
    
    ```
    BenchmarkMapStrInt_3n_AssembleStandard-8         6138765               183 ns/op             129 B/op          3 allocs/op
    BenchmarkMapStrInt_3n_AssembleEntry-8            7276795               176 ns/op             129 B/op          3 allocs/op
    BenchmarkMapStrInt_3n_Iteration-8               19593212                67.2 ns/op            16 B/op          1 allocs/op
    BenchmarkSpec_Marshal_Map3StrInt-8               1309916               912 ns/op             544 B/op          7 allocs/op
    BenchmarkSpec_Marshal_Map3StrInt_CodecNull-8     4579935               257 ns/op             176 B/op          3 allocs/op
    BenchmarkSpec_Unmarshal_Map3StrInt-8              465195              2599 ns/op            1080 B/op         25 allocs/op
    ```
    
    So!  About a 1.5x speedup on assembly (278 -> 183 ns/op, 5 -> 3 allocs/op) between gen with fcb and our new-improved no-callback system.
    
    And about a 3.2x speedup in total now (588 -> 183 ns/op, 8 -> 3 allocs/op) for codegen structs over the basicnode map.
    
    That's the kind of ratio I was looking for :)
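    
    To make the arithmetic explicit, here is a small sketch that recomputes
    the ratios from the AssembleStandard ns/op figures in the tables above.
    The helper names are made up for this example.  It reports both "times
    faster" and "percent less time per op", since those are easy to mix up.
    
    ```go
    package main
    
    import "fmt"
    
    // speedup returns how many times faster the new timing is than the old one.
    func speedup(oldNs, newNs float64) float64 {
    	return oldNs / newNs
    }
    
    // timeSaved returns the percentage reduction in time per operation.
    func timeSaved(oldNs, newNs float64) float64 {
    	return (oldNs - newNs) / oldNs * 100
    }
    
    func main() {
    	// ns/op for BenchmarkMapStrInt_3n_AssembleStandard, copied from above.
    	const basicnode, genFcb, genNoCallback = 588.0, 278.0, 183.0
    
    	fmt.Printf("gen fcb -> gen no-callback: %.2fx faster, %.0f%% less time per op\n",
    		speedup(genFcb, genNoCallback), timeSaved(genFcb, genNoCallback))
    	fmt.Printf("basicnode -> gen no-callback: %.2fx faster, %.0f%% less time per op\n",
    		speedup(basicnode, genNoCallback), timeSaved(basicnode, genNoCallback))
    }
    ```
    
    That works out to roughly 1.5x and 3.2x respectively.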
    
    As with all of these measurements: the gap should get much bigger on bigger corpuses.
    Some of the improvements here are O(n) -> O(1), and some apply even more heartily in deeper trees, etc.
    But it's telling that even on a very small corpus (a three-entry map), the impact is already huge.
genStruct.go 22.5 KB