Enable arm64 optimizations that exist for power/x86 #3393

AGSaidi · 2020-08-06T02:29:54Z

Enable on aarch64/arm64 systems a set of optimizations that already exist for power and x86.

Passes make check after these changes.

On the string benchmarks, the unaligned access change improves performance
by an average of 1.04x (min 0.96x, max 1.21x, median 1.01x).

The gc optimization improves benchmark/gc/hash1 by 5%.

The vm_exec changes make a large difference on some benchmarks (e.g. 1.38x on block_handler_type_iseq and 1.45x on vm_attr_ivar).

64-bit Arm platforms support unaligned accesses. On the string benchmarks, this change improves performance by an average of 1.04x (min 0.96x, max 1.21x, median 1.01x).
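To illustrate why cheap unaligned loads matter for string code, here is a minimal sketch of a word-at-a-time scan that starts at an arbitrary byte offset. It is not Ruby's implementation: `demo_has_nonascii` is a hypothetical name, and the loads go through `memcpy` so the code stays well-defined C. On targets where unaligned loads are supported, compilers typically lower that `memcpy` to a single load instruction.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical helper, not Ruby's code: returns 1 if any byte in
 * buf[0..len) has its high bit set (is non-ASCII), otherwise 0. */
int demo_has_nonascii(const char *buf, size_t len)
{
    const uint64_t high_bits = UINT64_C(0x8080808080808080);
    size_t i = 0;

    /* Word-at-a-time loop. Loads begin at buf whatever its alignment,
     * so no byte-wise prologue is needed to reach an aligned address. */
    for (; i + sizeof(uint64_t) <= len; i += sizeof(uint64_t)) {
        uint64_t word;
        memcpy(&word, buf + i, sizeof word);
        if (word & high_bits) return 1;
    }

    /* Byte-at-a-time tail for the remaining 0..7 bytes. */
    for (; i < len; i++) {
        if ((unsigned char)buf[i] & 0x80) return 1;
    }
    return 0;
}
```

Calling `demo_has_nonascii(s + 1, n)` on a buffer `s` exercises the unaligned case, since `s + 1` is generally not 8-byte aligned.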

Similar to x86 and powerpc optimizations.

| |compare-ruby|built-ruby|
|:------|-----------:|---------:|
|hash1 | 0.225| 0.237|
| | -| 1.05x|
|hash2 | 0.110| 0.110|
| | 1.00x| -|

| |compare-ruby|built-ruby|
|:------------------------------|-----------:|---------:|
|vm_array | 26.501M| 27.959M|
| | -| 1.06x|
|vm_attr_ivar | 21.606M| 31.429M|
| | -| 1.45x|
|vm_attr_ivar_set | 21.178M| 26.113M|
| | -| 1.23x|
|vm_backtrace | 6.621| 6.668|
| | -| 1.01x|
|vm_bigarray | 26.205M| 29.958M|
| | -| 1.14x|
|vm_bighash | 504.155k| 479.306k|
| | 1.05x| -|
|vm_block | 16.692M| 21.315M|
| | -| 1.28x|
|block_handler_type_iseq | 5.083| 7.004|
| | -| 1.38x|

shyouhei · 2020-08-06T05:22:19Z

vm_exec.c

+#elif defined(__GNUC__) && defined(__aarch64__)
+    DECL_SC_REG(const VALUE *, pc, "19");
+    DECL_SC_REG(rb_control_frame_t *, cfp, "20");
+#define USE_MACHINE_REGS 1
+


Does this really help? Recent compilers are smarter than they were when we wrote the sibling code for the other architectures. Read more: https://bugs.ruby-lang.org/issues/12225

cc @nurse
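For readers unfamiliar with the technique in the hunk above, the following is a minimal sketch of a GCC-style explicit register variable, which is the general mechanism that a macro like `DECL_SC_REG` appears to rely on. Everything here is an assumption for illustration: `DEMO_PIN` and `demo_sum` are hypothetical names, and the x86_64 register is picked arbitrarily. GCC documents local register variables as guaranteed only for inline-asm operands, so in plain C code like this the annotation acts as a hint at best, which is why measuring (as done in this pull request) is the only way to know whether it helps.

```c
#include <stddef.h>

/* Hypothetical macro for this sketch only. It asks the compiler to keep
 * a local variable in a specific callee-saved register where supported,
 * and falls back to an ordinary local elsewhere. */
#if defined(__GNUC__) && defined(__aarch64__)
# define DEMO_PIN(type, name) register type name __asm__("x19")
#elif defined(__GNUC__) && defined(__x86_64__)
# define DEMO_PIN(type, name) register type name __asm__("r14")
#else
# define DEMO_PIN(type, name) type name
#endif

/* Walks an array through a "program counter" style pointer, loosely
 * mimicking how an interpreter loop advances through its instructions. */
long demo_sum(const long *code, size_t n)
{
    DEMO_PIN(const long *, pc);
    const long *end = code + n;
    long acc = 0;

    pc = code;
    while (pc < end) {
        acc += *pc++;
    }
    return acc;
}
```

The result is the same with or without the pinning; only the generated code may differ, so any benefit has to be confirmed with benchmarks on the target compiler and CPU.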

AGSaidi · 2020-08-06

@shyouhei the only changes between compare-ruby and built-ruby in the numbers in the commit message above are the two hunks in vm_exec.c. I'm happy to run other benchmarks if you'd like, but it appears to improve performance substantially. I double-checked my result by removing all diffs and comparing against the ruby I built prior to my patches; the results were within ±2%. I then reapplied these two hunks, re-ran the benchmarks, and observed the improvements reported here (up to 1.38x).

nurse · 2020-08-06

As far as I remember, there is another example with clang which shows that it is still effective.
And the 1.2x reported in the commit message seems worth introducing this change for.

shyouhei · 2020-08-07

OK then. We need to investigate what is going on, but that can be handled separately from this pull request.

AGSaidi · 2020-08-08

@nurse anything else you'd like to see before you merge?

nurse · 2020-08-12

I think this is OK to merge.
@shyouhei Do you have anything else to raise?

shyouhei · 2020-08-13

@nurse No, it is LGTM.

AGSaidi added 3 commits Aug 6, 2020

Enable unaligned accesses on arm64

c5806d6

arm64 enable gc optimizations

79b7b91

nurse approved these changes Aug 6, 2020

shyouhei reviewed Aug 6, 2020

AGSaidi deleted the AGSaidi:arm64-unaligned branch Aug 19, 2020
