Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Move structs to classes, ~1.10x faster Ripper.lex #5093

Draft
wants to merge 3 commits into
base: master
Choose a base branch
from

Conversation

Labels
None yet
1 participant
@schneems
Copy link
Contributor

@schneems schneems commented Nov 8, 2021

Concept

I am proposing we replace the Struct implementation of data structures inside of ripper with real classes.

This will improve performance and the implementation is not meaningfully more complicated.

Example

Struct versus class comparison:

Elem = Struct.new(:pos, :event, :tok, :state, :message) do
  def initialize(pos, event, tok, state, message = nil)
    super(pos, event, tok, State.new(state), message)
  end

  # ...

  def to_a
    a = super
    a.pop unless a.empty?
    a
  end
end

class ElemClass
  attr_accessor :pos, :event, :tok, :state, :message

  def initialize(pos, event, tok, state, message = nil)
    @pos = pos
    @event = event
    @tok = tok
    @state = State.new(state)
    @message = message
  end

  def to_a
    if @message
      [@pos, @event, @tok, @state, @message]
    else
      [@pos, @event, @tok, @state]
    end
  end
end

# stub state class creation for now
class State; def initialize(val); end; end

MicroBenchmark creation

require 'benchmark/ips'
require 'ripper'

pos = [1, 2]
event = :on_nl
tok = "\n".freeze
state = Ripper::EXPR_BEG

Benchmark.ips do |x|
  x.report("struct") { Elem.new(pos, event, tok, state) }
  x.report("class ") { ElemClass.new(pos, event, tok, state) }
  x.compare!
end; nil

Gives ~1.2x faster creation:

Warming up --------------------------------------
              struct   263.983k i/100ms
              class    303.367k i/100ms
Calculating -------------------------------------
              struct      2.638M (± 5.9%) i/s -     13.199M in   5.023460s
              class       3.171M (± 4.6%) i/s -     16.078M in   5.082369s

Comparison:
              class :  3170690.2 i/s
              struct:  2638493.5 i/s - 1.20x  (± 0.00) slower

MicroBenchmark to_a (Called by Ripper.lex for every element)

require 'benchmark/ips'
require 'ripper'

pos = [1, 2]
event = :on_nl
tok = "\n".freeze
state = Ripper::EXPR_BEG

struct =  Elem.new(pos, event, tok, state)
from_class = ElemClass.new(pos, event, tok, state)

Benchmark.ips do |x|
  x.report("struct") { struct.to_a }
  x.report("class ") { from_class.to_a }
  x.compare!
end; nil

Gives 1.46x faster to_a:

Warming up --------------------------------------
              struct   612.094k i/100ms
              class    893.233k i/100ms
Calculating -------------------------------------
              struct      6.121M (± 5.4%) i/s -     30.605M in   5.015851s
              class       8.931M (± 7.9%) i/s -     44.662M in   5.039733s

Comparison:
              class :  8930619.0 i/s
              struct:  6121358.9 i/s - 1.46x  (± 0.00) slower

MicroBenchmark data access

require 'benchmark/ips'
require 'ripper'

pos = [1, 2]
event = :on_nl
tok = "\n".freeze
state = Ripper::EXPR_BEG

struct =  Elem.new(pos, event, tok, state)
from_class = ElemClass.new(pos, event, tok, state)

Benchmark.ips do |x|
  x.report("struct") { struct.pos[1] }
  x.report("class ") { from_class.pos[1] }
  x.compare!
end; nil

Gives ~1.17x faster data access:

Warming up --------------------------------------
              struct     1.694M i/100ms
              class      1.868M i/100ms
Calculating -------------------------------------
              struct     16.149M (± 6.8%) i/s -     81.318M in   5.060633s
              class      18.886M (± 2.9%) i/s -     95.262M in   5.048359s

Comparison:
              class : 18885669.6 i/s
              struct: 16149255.8 i/s - 1.17x  (± 0.00) slower

Full benchmark integration of this inside of Ripper.lex

Inside of this repo with this commit

$ cd ext/ripper
$ make
$ cat test.rb
file = File.join(__dir__, "../../array.rb")
source = File.read(file)

bench = Benchmark.measure do
  10_000.times.each do
    Ripper.lex(source)
  end
end

puts bench

Then execute with and without this change 50 times:

rm new.txt
rm old.txt
for i in {0..50}
do
  `ruby -Ilib -rripper -rbenchmark ./test.rb >> new.txt`
  `ruby -rripper -rbenchmark ./test.rb >> old.txt`
done

I used derailed benchmarks internals to compare the results:

dir = Pathname(".")
branch_info = {}
branch_info["old"]  = { desc: "Struct lex", time: Time.now, file: dir.join("old.txt"), name: "old" }
branch_info["new"]  = { desc: "Class lex", time: Time.now, file: dir.join("new.txt"), name: "new" }
stats = DerailedBenchmarks::StatsFromDir.new(branch_info)
stats.call.banner

Which gave us:

❤️ ❤️ ❤️  (Statistically Significant) ❤️ ❤️ ❤️

[new] (3.3139 seconds) "Class lex" ref: "new"
  FASTER 🚀🚀🚀 by:
    1.1046x [older/newer]
    9.4700% [(older - newer) / older * 100]
[old] (3.6606 seconds) "Struct lex" ref: "old"

Iterations per sample:
Samples: 51

Test type: Kolmogorov Smirnov
Confidence level: 99.0 %
Is significant? (max > critical): true
D critical: 0.30049534876137013
D max: 0.9607843137254902

Histograms (time ranges are in seconds):

   [new] description:                                        [old] description:
     "Class lex"                                               "Struct lex"
              ┌                                        ┐                ┌                                        ┐
   [3.0, 3.3) ┤▇ 1                                           [3.0, 3.3) ┤ 0
   [3.3, 3.6) ┤▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇ 47       [3.3, 3.6) ┤ 0
   [3.5, 3.8) ┤▇▇ 2                                          [3.5, 3.8) ┤▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇ 46
   [3.8, 4.1) ┤▇ 1                                           [3.8, 4.1) ┤▇▇▇ 4
   [4.0, 4.3) ┤ 0                                            [4.0, 4.3) ┤ 0
   [4.3, 4.6) ┤ 0                                            [4.3, 4.6) ┤▇ 1
              └                                        ┘                └                                        ┘
                         # of runs in range                                        # of runs in range

To sum this up, the "new" version of this code (using real classes instead of structs) is 10% faster across 50 runs with a statistical significance confidence
level of 99%. Histograms are for visual checksum.

Consequences

This is a breaking change as seen by the modifications needed to IRB (because the element no longer acts as an array as well). However, this constant is technically marked as "internal" via docs.

I want to gauge interest in such a change. What do you think of this opportunity to increase performance and tighten the API? If you like it, what are your thoughts on moving forwards?

At a high level, the two options I see are:

  1. Break that API

If we go this route, we can support IRB's lex usage out of the box. I don't know how many other gems might be accessing the contents of Lexer::Elem. To gauge impact it might be possible to do some kind of a code scan on github, or maybe rubygems. By default it's not exposed via Ripper.lex so searching for Lexer.new should yield high-signal-low-noise results.

It looks like via https://github.com/search?q=Ripper%3A%3ALexer.new&type=code there are 1,225 results. However it looks like there's more than a few forks of the Ruby project that github does not identify as forks. I could filter by something else like more than 5 stars, but for some reason, the UI returns no results for these cases for me.

  1. Perform a deprecation cycle where we deprecate the Struct methods (such as [] and each_with_index).

I've explored it a little, but the work is ultimately unknown. I think these are the methods we would need to implement in order to deprecate them:

[:pretty_print_cycle, :to_a, :==, :to_s, :[], :[]=, :deconstruct, :eql?, :each_pair, :select, :to_h, :filter, :values, :inspect, :deconstruct_keys, :dig, :members, :length, :size, :pretty_print, :each, :hash, :values_at]
schneems added 3 commits Nov 8, 2021
## Concept

I am proposing we replace the Struct implementation of data structures inside of ripper with real classes.

This will improve performance and the implementation is not meaningfully more complicated.

## Example

Struct versus class comparison:

```ruby
Elem = Struct.new(:pos, :event, :tok, :state, :message) do
  def initialize(pos, event, tok, state, message = nil)
    super(pos, event, tok, State.new(state), message)
  end

  # ...

  def to_a
    a = super
    a.pop unless a.empty?
    a
  end
end

class ElemClass
  attr_accessor :pos, :event, :tok, :state, :message

  def initialize(pos, event, tok, state, message = nil)
    @pos = pos
    @event = event
    @Tok = tok
    @State = State.new(state)
    @message = message
  end

  def to_a
    if @message
      [@pos, @event, @Tok, @State, @message]
    else
      [@pos, @event, @Tok, @State]
    end
  end
end

# stub state class creation for now
class State; def initialize(val); end; end
```

## MicroBenchmark creation

```ruby
require 'benchmark/ips'
require 'ripper'

pos = [1, 2]
event = :on_nl
tok = "\n".freeze
state = Ripper::EXPR_BEG

Benchmark.ips do |x|
  x.report("struct") { Elem.new(pos, event, tok, state) }
  x.report("class ") { ElemClass.new(pos, event, tok, state) }
  x.compare!
end; nil
```

Gives ~1.2x faster creation:

```
Warming up --------------------------------------
              struct   263.983k i/100ms
              class    303.367k i/100ms
Calculating -------------------------------------
              struct      2.638M (± 5.9%) i/s -     13.199M in   5.023460s
              class       3.171M (± 4.6%) i/s -     16.078M in   5.082369s

Comparison:
              class :  3170690.2 i/s
              struct:  2638493.5 i/s - 1.20x  (± 0.00) slower
```

## MicroBenchmark `to_a` (Called by Ripper.lex for every element)

```ruby
require 'benchmark/ips'
require 'ripper'

pos = [1, 2]
event = :on_nl
tok = "\n".freeze
state = Ripper::EXPR_BEG

struct =  Elem.new(pos, event, tok, state)
from_class = ElemClass.new(pos, event, tok, state)

Benchmark.ips do |x|
  x.report("struct") { struct.to_a }
  x.report("class ") { from_class.to_a }
  x.compare!
end; nil
```

Gives 1.46x faster `to_a`:

```
Warming up --------------------------------------
              struct   612.094k i/100ms
              class    893.233k i/100ms
Calculating -------------------------------------
              struct      6.121M (± 5.4%) i/s -     30.605M in   5.015851s
              class       8.931M (± 7.9%) i/s -     44.662M in   5.039733s

Comparison:
              class :  8930619.0 i/s
              struct:  6121358.9 i/s - 1.46x  (± 0.00) slower
```

## MicroBenchmark data access

```ruby
require 'benchmark/ips'
require 'ripper'

pos = [1, 2]
event = :on_nl
tok = "\n".freeze
state = Ripper::EXPR_BEG

struct =  Elem.new(pos, event, tok, state)
from_class = ElemClass.new(pos, event, tok, state)

Benchmark.ips do |x|
  x.report("struct") { struct.pos[1] }
  x.report("class ") { from_class.pos[1] }
  x.compare!
end; nil
```

Gives ~1.17x faster data access:

```
Warming up --------------------------------------
              struct     1.694M i/100ms
              class      1.868M i/100ms
Calculating -------------------------------------
              struct     16.149M (± 6.8%) i/s -     81.318M in   5.060633s
              class      18.886M (± 2.9%) i/s -     95.262M in   5.048359s

Comparison:
              class : 18885669.6 i/s
              struct: 16149255.8 i/s - 1.17x  (± 0.00) slower
```

## Full benchmark integration of this inside of Ripper.lex

Inside of this repo with this commit

```
$ cd ext/ripper
$ make
$ cat test.rb
file = File.join(__dir__, "../../array.rb")
source = File.read(file)

bench = Benchmark.measure do
  10_000.times.each do
    Ripper.lex(source)
  end
end

puts bench
```

Then execute with and without this change 50 times:

```
rm new.txt
rm old.txt
for i in {0..50}
do
  `ruby -Ilib -rripper -rbenchmark ./test.rb >> new.txt`
  `ruby -rripper -rbenchmark ./test.rb >> old.txt`
done
```

I used derailed benchmarks internals to compare the results:

```
dir = Pathname(".")
branch_info = {}
branch_info["old"]  = { desc: "Struct lex", time: Time.now, file: dir.join("old.txt"), name: "old" }
branch_info["new"]  = { desc: "Class lex", time: Time.now, file: dir.join("new.txt"), name: "new" }
stats = DerailedBenchmarks::StatsFromDir.new(branch_info)
stats.call.banner
```

Which gave us:

```
❤️ ❤️ ❤️  (Statistically Significant) ❤️ ❤️ ❤️

[new] (3.3139 seconds) "Class lex" ref: "new"
  FASTER 🚀🚀🚀 by:
    1.1046x [older/newer]
    9.4700% [(older - newer) / older * 100]
[old] (3.6606 seconds) "Struct lex" ref: "old"

Iterations per sample:
Samples: 51

Test type: Kolmogorov Smirnov
Confidence level: 99.0 %
Is significant? (max > critical): true
D critical: 0.30049534876137013
D max: 0.9607843137254902

Histograms (time ranges are in seconds):

   [new] description:                                        [old] description:
     "Class lex"                                               "Struct lex"
              ┌                                        ┐                ┌                                        ┐
   [3.0, 3.3) ┤▇ 1                                           [3.0, 3.3) ┤ 0
   [3.3, 3.6) ┤▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇ 47       [3.3, 3.6) ┤ 0
   [3.5, 3.8) ┤▇▇ 2                                          [3.5, 3.8) ┤▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇ 46
   [3.8, 4.1) ┤▇ 1                                           [3.8, 4.1) ┤▇▇▇ 4
   [4.0, 4.3) ┤ 0                                            [4.0, 4.3) ┤ 0
   [4.3, 4.6) ┤ 0                                            [4.3, 4.6) ┤▇ 1
              └                                        ┘                └                                        ┘
                         # of runs in range                                        # of runs in range
```

To sum this up, the "new" version of this code (using real classes instead of structs) is 10% faster across 50 runs with a statistical significance confidence level of 99%. Histograms are for visual checksum.
The last element in the `@buf` may be either an array or an `Elem`. In the case it is an `Elem` we iterate over every element, when we do not need to. This check guards that case by ensuring that we only iterate over an array of elements.
Instead of accessing the struct as an array, access it via methods. There are other places inside of this file already using this API (for example https://github.com/ruby/ruby/blob/e0a5c3d2b71dfad038d7562fdd33f02ffd79232d/lib/irb/ruby-lex.rb#L829-L830).

This commit moves all struct array-ish calls to use their method calls instead. It is also ~1.23 faster accessing values via a method instead of as an array according to this microbenchmark:

```ruby
Elem = Struct.new(:pos, :event, :tok, :state, :message) do
  def initialize(pos, event, tok, state, message = nil)
    super(pos, event, tok, State.new(state), message)
  end

  # ...

  def to_a
    a = super
    a.pop unless a.empty?
    a
  end
end

class ElemClass
  attr_accessor :pos, :event, :tok, :state, :message

  def initialize(pos, event, tok, state, message = nil)
    @pos = pos
    @event = event
    @Tok = tok
    @State = State.new(state)
    @message = message
  end

  def to_a
    if @message
      [@pos, @event, @Tok, @State, @message]
    else
      [@pos, @event, @Tok, @State]
    end
  end
end

# stub state class creation for now
class State; def initialize(val); end; end
```

```ruby
Benchmark.ips do |x|
  x.report("struct") { struct[1] }
  x.report("class ") { from_class.event }
  x.compare!
end; nil
```

```
Warming up --------------------------------------
              struct     1.624M i/100ms
              class      1.958M i/100ms
Calculating -------------------------------------
              struct     17.139M (± 2.6%) i/s -     86.077M in   5.025801s
              class      21.104M (± 3.4%) i/s -    105.709M in   5.015193s

Comparison:
              class : 21103826.3 i/s
              struct: 17139201.5 i/s - 1.23x  (± 0.00) slower
```
@schneems schneems marked this pull request as draft Nov 8, 2021
@schneems schneems changed the title ~1.10x faster Change Ripper.lex structs to classes Move structs to classes, ~1.10x faster Ripper.lex Nov 8, 2021
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment