Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Split ruby.h #2991

Merged
merged 86 commits into from Apr 8, 2020
Merged

Split ruby.h #2991

merged 86 commits into from Apr 8, 2020

Conversation

@shyouhei
Copy link
Member

@shyouhei shyouhei commented Mar 30, 2020

Turn this:
ruby_2ruby_8h__incl

into this:
ruby_2ruby_8h__incl 1

@shyouhei
Copy link
Member Author

@shyouhei shyouhei commented Mar 30, 2020

Confirmed that edge rails can properly be bundle installed with it.

@ko1
Copy link
Contributor

@ko1 ko1 commented Mar 30, 2020

could you explain the PR more? Motivation etc.

@shyouhei
Copy link
Member Author

@shyouhei shyouhei commented Mar 30, 2020

This is almost the same as #2711, except it applies the same thing against ruby.h instead of internal.h.

  • ruby.h@master includes literally thousands of lines of macro definitions, and #includes intern.h, which has another thousands of lines of code. I have problems maintaining these files. I want them to be split into lots of small files.

  • By splitting into small files, it is now possible to convert macros into inline functions. Macros are difficult to read, hard to maintain, impossible to debug. Converting them into inline functions should at least give a degree of relief from them.

  • As a result of de-macro transformation, granular type checking is now possible. For instance ruby/spec#762 was found during developing this branch.

@shyouhei shyouhei mentioned this pull request Mar 31, 2020
@shyouhei shyouhei force-pushed the shyouhei:ruby.h branch from 514b73a to a33b313 Mar 31, 2020
@ioquatix
Copy link
Member

@ioquatix ioquatix commented Mar 31, 2020

I generally like this idea.

I wonder if we should use ruby3 or ruby/3 (or even just ruby). I guess all are fine but I feel like it's more common to use ruby3 if version number is expected. It also means that user could have "ruby2", "ruby3" and "ruby4" all installed and in theory include all headers simultaneously. Is there any example we can follow? Don't mean to bikeshed.

Also, if we are considering this, is it also open to consider moving all source ruby/*.c into ruby/src/? Because the root directory of Ruby is a bit of a mess.

I tried to move coroutine code into it's own area, using more fine grained files (partly because they are different arch). However, maybe we can rethink the layout of the repo.

@shyouhei
Copy link
Member Author

@shyouhei shyouhei commented Apr 1, 2020

Yes moving *.c files into src/ -like locations is in my TODO list. So far it is postponed because the layout of the repo as a whole is a big story. It should also impact all the core devs (their stash/local branches get ruined).

Re: directory name. It's okay to take ruby3 instead of ruby/3. I don't think side-by-side installation of headers from different versions works, though (for instance 2.6 and 2.7 provide slightly different feature sets so "ruby2" is impractical).

@ioquatix
Copy link
Member

@ioquatix ioquatix commented Apr 1, 2020

Yes moving *.c files into src/ -like locations is in my TODO list. So far it is postponed because the layout of the repo as a whole is a big story. It should also impact all the core devs (their stash/local branches get ruined).

Maybe requires more discussion, but I'd support (before Ruby 3):

  • Better organisation of source files. Maybe src/$module/$thing.c where $thing is logical grouping. We should try to make some organisation such that we have logical Init_$thing and so on, and try to ensure we prefer rb_$module_$thing_function or something similar.
  • Expand tabs across all code.

Re: directory name. It's okay to take ruby3 instead of ruby/3. I don't think side-by-side installation of headers from different versions works, though (for instance 2.6 and 2.7 provide slightly different feature sets so "ruby2" is impractical).

What does Python do? Can you have Python 3.1, 3.3 installed at the same time? Maybe we should adopt, e.g.

ruby3.1/*.h
ruby3 -> ruby3.1 (symlink to latest)

We should try to find what is best for users, OS distributions.

@shyouhei
Copy link
Member Author

@shyouhei shyouhei commented Apr 1, 2020

I started to think we should move to our bug tracker. The discussion (is fruitful so far, but) started to divert from this particular pull request.

@ioquatix
Copy link
Member

@ioquatix ioquatix commented Apr 1, 2020

I am happy for you to move the discussion. Maybe good to have a concrete proposal. We can discuss on slack? I am happy to help edit the proposal.

@shyouhei
Copy link
Member Author

@shyouhei shyouhei commented Apr 1, 2020

  • Expand tabs across all code.

Heh, you just realized that I silently removed all the tabs in the headers I touched in this pull request 😄

@ioquatix
Copy link
Member

@ioquatix ioquatix commented Apr 1, 2020

The way I think will best preserve history:

  • Expand tabs in one commit.
  • Move files in another commit.

That way, git blame and other tools should be able to go through rename/whitespace expansion easily.

@ioquatix
Copy link
Member

@ioquatix ioquatix commented Apr 1, 2020

Can you tell me if compile performance is affected?

@shyouhei
Copy link
Member Author

@shyouhei shyouhei commented Apr 1, 2020

Well yes, as I mentioned in 7b8969e I see some slowdown in compilation. For instance GitHub Actions reports it compiled current master in 2m 23s. The same thing took 5m 19s for this branch.

@ioquatix
Copy link
Member

@ioquatix ioquatix commented Apr 1, 2020

Can you get back speed using precompiled headers?

@shyouhei
Copy link
Member Author

@shyouhei shyouhei commented Apr 1, 2020

I guess so... not tried yet.

@ioquatix
Copy link
Member

@ioquatix ioquatix commented Apr 3, 2020

Did you check if incremental build is faster or slower?

@shyouhei
Copy link
Member Author

@shyouhei shyouhei commented Apr 6, 2020

@ioquatix incremental builds are also slower than before. This is because right now, source codes directly includeruby/ruby.h so that they pull everything. Becasue ruby/ruby.h is a public API we cannot make it lightweight. We need to decouple things in each .c files.

@ko1
Copy link
Contributor

@ko1 ko1 commented Apr 6, 2020

I'm not sure the path 3/ is suitable (change them for 4?).

@shyouhei
Copy link
Member Author

@shyouhei shyouhei commented Apr 6, 2020

@ko1 or leave 3 as-is and make new 4. Both are OK.

@shyouhei
Copy link
Member Author

@shyouhei shyouhei commented Apr 6, 2020

My intention to create a subdirectory is to be explicit that files under the new one are implementation details. Those files should not be considered as public APIs. Extension libraries are not expected to do something line #include <ruby/3/intern/select/posix.h>. Instead they should stick to #include "ruby/ruby.h".

@shyouhei shyouhei force-pushed the shyouhei:ruby.h branch 2 times, most recently from fe56ce8 to 3256ded Apr 6, 2020
shyouhei added 7 commits Mar 2, 2020
... on mswin.  According to 3ea6beb, it
is a wrong idea to define HAVE_SYS_TIME_H in case of mingw.
MakeMakefile#have_header do not know such restriction.  It is up to the
programmer to properly avoid such situation.

:FIXME: I suspect this is rather a bug of have_header.
I am going to modify C codes.  Must make sure that does not break
things.  This changeset adds many CI that basically just make
binaries with slightly distinct options each other.
The ruby/ruby.h, our main public header, is the biggest header file
except autogenerated ones under enc/unicode.  It has roughly 2,500 LOC.
It then includes ruby/intern.h, which is ~1,000 LOC.  Also included are
ruby/defines.h (~500 LOC), ruby/win32.h (~700 LOC), etc.

It's too big!  Nobody can understand what is going on.  We cannot
eliminate the contents for backward compatibility, but at least we can
split it into many, small parts.  I hope this improves understanding of
our public APIs.

This changeset is a pure refactoring that does not add or remove any
single LOC, except for obvious header include guards.
This header file improves consistency of macro definitions.  Let's use
it throughout the project.
This macro is worth defining becuase it elminates literally thousands of
lines of copy&paste.
When I made internal/compilers.h I was afraid of global namespace
pollutions.  Later it turned out that our public header defines
__has_attribute anyways.  Let's stop worrying.  Publicize what we
already have internally.

Doing so, however, is not a simple copy&paste business.  We only had to
consider C99 for internal headers.  But for public ones, we need to make
sure all other C/C++ versions work well.
shyouhei added 11 commits Mar 27, 2020
Minor updates, like adding empty lines for readability.
NORETURN() was missing from the included files, so fixed it.
Reduced use of ruby/backward/2/attributes.h when only one macro from
that header was used in the file.  Direct use new syntax.
The file was necessary because ruby internals tend to confuse VALUE and
void*.  That was a bad habit of 20th century.  Let's not be loose.
The header is emppty now.
It seems gcc takes more time to compile ruby than before.  That also
impacts on JIT.  Extended timeouts for them.
Don't bother complex preprocessor macros.  ST2FIX is exactly what is
needed here.
Some experiments revealed that in case of GCC, there are chances for
this function to remain not inlined.  That impacts runtime performance
negatively.  Let us force inline the function.  It was designed to be
inlined, then constant-folded.

Calculating -------------------------------------
                             before       after
Optcarrot Lan_Master.nes     37.090      39.064 fps

Comparison:
             Optcarrot Lan_Master.nes
                   after:        39.1 fps
                  before:        37.1 fps - 1.05x  slower
Not the case of Clang but when compiled using GCC, RB_FL_ABLE shines so
brightly on top of perf report.  This shall be inlined.

Calculating -------------------------------------
                             before       after
Optcarrot Lan_Master.nes     39.685      43.147 fps

Comparison:
             Optcarrot Lan_Master.nes
                   after:        43.1 fps
                  before:        39.7 fps - 1.09x  slower
@shyouhei shyouhei force-pushed the shyouhei:ruby.h branch from bbb3118 to 12f1e15 Apr 7, 2020
@shyouhei
Copy link
Member Author

@shyouhei shyouhei commented Apr 7, 2020

Now I think the branch is ready to be merged.

@shyouhei shyouhei merged commit 9e6e39c into ruby:master Apr 8, 2020
94 of 95 checks passed
94 of 95 checks passed
gcc-9
Details
make (check, --jit)
Details
make (check)
Details
check_branch
Details
make (check, ubuntu-latest)
Details
make (test, windows-2019, 2019)
Details
make (check)
Details
gcc-8
Details
make (check, --jit-wait)
Details
make (check, ubuntu-16.04)
Details
make (test-bundler)
Details
gcc-7
Details
make (test-bundler, ubuntu-latest)
Details
make (test-bundled-gems)
Details
gcc-6
Details
make (test-bundled-gems, ubuntu-latest)
Details
make (leaked-globals)
Details
gcc-5
Details
make (test-all TESTS=--repeat-count=2, ubuntu-latest)
Details
gcc-4.8
Details
make (leaked-globals, ubuntu-latest)
Details
clang-11
Details
clang-10
Details
clang-9
Details
clang-8
Details
clang-7
Details
clang-6.0
Details
clang-5.0
Details
clang-4.0
Details
clang-3.9
Details
c99
Details
c11
Details
c17
Details
c2x
Details
c++98
Details
c++11
Details
c++14
Details
c++17
Details
c++2a
Details
-O0
Details
-O3
Details
gmp
Details
jemalloc
Details
valgrind
Details
coroutine=ucontext
Details
coroutine=copy
Details
disable-mathn
Details
disable-jit-support
Details
disable-dln
Details
disable-rubygems
Details
OPT_THREADED_CODE=1
Details
OPT_THREADED_CODE=2
Details
OPT_THREADED_CODE=3
Details
NDEBUG
Details
RUBY_DEBUG
Details
ARRAY_DEBUG
Details
BUGNUM_DEBUG
Details
CCAN_LIST_DEBUG
Details
CPDEBUG=-1
Details
ENC_DEBUG
Details
GC_DEBUG
Details
HASH_DEBUG
Details
ID_TABLE_DEBUG
Details
RGENGC_DEBUG=-1
Details
SYMBOL_DEBUG
Details
THREAD_DEBUG=-1
Details
RGENGC_CHECK_MODE
Details
TRANSIENT_HEAP_CHECK_MODE
Details
VM_CHECK_MODE
Details
USE_EMBED_CI=0
Details
USE_FLONUM=0
Details
USE_LAZY_LOAD
Details
USE_RINCGC=0
Details
USE_SYMBOL_GC=0
Details
USE_THREAD_CACHE=0
Details
USE_TRANSIENT_HEAP=0
Details
DEBUG_FIND_TIME_NUMGESS
Details
DEBUG_INTEGER_PACK
Details
ENABLE_PATH_CHECK
Details
GC_DEBUG_STRESS_TO_CLASS
Details
GC_ENABLE_LAZY_SWEEP=0
Details
GC_PROFILE_DETAIL_MEMOTY
Details
GC_PROFILE_MORE_DETAIL
Details
CALC_EXACT_MALLOC_SIZE
Details
MALLOC_ALLOCATED_SIZE_CHECK
Details
IBF_ISEQ_ENABLE_LOCAL_BUFFER
Details
RGENGC_ESTIMATE_OLDMALLOC
Details
RGENGC_FORCE_MAJOR_GC
Details
RGENGC_OBJ_INFO
Details
RGENGC_OLD_NEWOBJ_CHECK
Details
RGENGC_PROFILE
Details
VM_DEBUG_BP_CHECK
Details
VM_DEBUG_VERIFY_METHOD_CACHE
Details
continuous-integration/appveyor/pr AppVeyor build failed
Details
continuous-integration/travis-ci/pr The Travis CI build passed
Details
@shyouhei shyouhei deleted the shyouhei:ruby.h branch Apr 8, 2020
@eregon
Copy link
Member

@eregon eregon commented Apr 8, 2020

Interesting PR.

Is there any intention to change the set of symbols available through #include "ruby.h"? If so I would like to be part of the discussion as TruffleRuby implements a large part of the C-API and reuses MRI headers pretty much as-is: https://github.com/oracle/truffleruby/tree/08789dca1d40158bc214d9f3a148c79b4c112baa/lib/cext/include/ruby

(this PR will likely cause me merge hell for any modification in ruby.h, but I'll have to do with it)

@eregon
Copy link
Member

@eregon eregon commented Apr 8, 2020

  • By splitting into small files, it is now possible to convert macros into inline functions. Macros are difficult to read, hard to maintain, impossible to debug. Converting them into inline functions should at least give a degree of relief from them.

Note that this needs to be done quite carefully for compatibility, because C extensions might expect it's a macro (e.g., if they #undef it, or use have_macro), or might expect it's a function (e.g., if assigning to a function pointer variable, or checking with have_func, etc).

@shyouhei
Copy link
Member Author

@shyouhei shyouhei commented Apr 10, 2020

Thank you for reviewing!

  • By splitting into small files, it is now possible to convert macros into inline functions. Macros are difficult to read, hard to maintain, impossible to debug. Converting them into inline functions should at least give a degree of relief from them.

Note that this needs to be done quite carefully for compatibility, because C extensions might expect it's a macro (e.g., if they #undef it, or use have_macro), or might expect it's a function (e.g., if assigning to a function pointer variable, or checking with have_func, etc).

  • No functions were deleted. have_func etc must not fail.
  • All the functions introduced are inline functions, so taking address of them makes no to little sense.
  • For each inline functions that were formerly macros, I made no-op macros (like this one cb70531#diff-64df151b2239dc342d11589751f3bc47R34). #ifdef or have_macro should work.
@ioquatix
Copy link
Member

@ioquatix ioquatix commented Apr 13, 2020

There has been a 30% drop in performance after this was merged. Can you investigate?

master:
Running 2s test @ http://127.0.0.1:9294/
  8 threads and 8 connections
  Thread Stats   Avg      Stdev     Max   +/- Stdev
    Latency   186.71us  417.74us  10.27ms   98.43%
    Req/Sec     6.74k     0.89k    9.00k    57.32%
  110095 requests in 2.10s, 3.99MB read
Requests/sec:  52436.50
Transfer/sec:      1.90MB

2.7.1
Running 2s test @ http://127.0.0.1:9294/
  8 threads and 8 connections
  Thread Stats   Avg      Stdev     Max   +/- Stdev
    Latency   318.78us    1.10ms  20.60ms   95.24%
    Req/Sec     9.08k     2.61k   17.76k    70.12%
  148277 requests in 2.10s, 5.37MB read
Requests/sec:  70576.54
Transfer/sec:      2.56MB
@shyouhei
Copy link
Member Author

@shyouhei shyouhei commented Apr 13, 2020

@ioquatix Can you tell us your environment (OS, compiler, flags passed to configure)?

@ioquatix
Copy link
Member

@ioquatix ioquatix commented Apr 13, 2020

It seems to affect optcarrot too.

https://benchmark-driver.github.io/benchmarks/optcarrot/commits.html

My setup has no specific configure flags.

OS: Linux (same used in both tests)
Compiler: Clang (same used in both tests)
Flags: Nothing specific.

@shyouhei
Copy link
Member Author

@shyouhei shyouhei commented Apr 13, 2020

The optcarrot benchmark drop is because it doesn't specify cppflags=-DNDEBUG at configure, @k0kubun confirmed. It seems your situation could be the same. Can you try passing that flag to configure, to see if it fixes the situation?

@ioquatix
Copy link
Member

@ioquatix ioquatix commented Apr 13, 2020

I will try it now and report back.

@ioquatix
Copy link
Member

@ioquatix ioquatix commented Apr 13, 2020

Ruby flags:

koyoko% make install
	BASERUBY = /home/samuel/.rubies/ruby-2.7.1/bin/ruby --disable=gems
	CC = gcc
	LD = ld
	LDSHARED = gcc -shared
	CFLAGS = -O3 -ggdb3 -Wall -Wextra -Werror=deprecated-declarations -Werror=duplicated-cond -Werror=implicit-function-declaration -Werror=implicit-int -Werror=misleading-indentation -Werror=pointer-arith -Werror=write-strings -Wimplicit-fallthrough=0 -Wmissing-noreturn -Wno-cast-function-type -Wno-constant-logical-operand -Wno-long-long -Wno-missing-field-initializers -Wno-overlength-strings -Wno-packed-bitfield-compat -Wno-parentheses-equality -Wno-self-assign -Wno-tautological-compare -Wno-unused-parameter -Wno-unused-value -Wsuggest-attribute=format -Wsuggest-attribute=noreturn -Wunused-variable -std=gnu99 
	XCFLAGS = -D_FORTIFY_SOURCE=2 -fstack-protector-strong -fno-strict-overflow -DRUBY_DEVEL=1 -fvisibility=hidden -fexcess-precision=standard -DRUBY_EXPORT -fPIE -DCANONICALIZATION_FOR_MATHN -I. -I.ext/include/x86_64-linux -I../include -I.. -I../enc/unicode/12.1.0
	CPPFLAGS =  -DNDEBUG 
	DLDFLAGS = -Wl,--compress-debug-sections=zlib -fstack-protector-strong -pie  
	SOLIBS = -lz -lpthread -lrt -lrt -lgmp -ldl -lcrypt -lm 
	LANG = en_NZ.utf8
	LC_ALL = 
	LC_CTYPE = 
	MFLAGS = 

I would say, that probably fixed the issue:

Running 2s test @ http://127.0.0.1:9294/
  8 threads and 8 connections
  Thread Stats   Avg      Stdev     Max   +/- Stdev
    Latency   328.03us    1.14ms  16.96ms   95.53%
    Req/Sec     8.66k     2.80k   17.24k    65.85%
  141326 requests in 2.10s, 5.12MB read
Requests/sec:  67313.64
Transfer/sec:      2.44MB
@ioquatix
Copy link
Member

@ioquatix ioquatix commented Apr 13, 2020

I also got some strange reports:

compiling ../debug.c
../complex.c:18: warning: "NDEBUG" redefined
   18 | #define NDEBUG
      | 
static inline VALUE
RB_FL_TEST_RAW(VALUE obj, VALUE flags)
{
RUBY3_ASSERT_OR_ASSUME(RB_FL_ABLE(obj));

This comment has been minimized.

@k0kubun

k0kubun Apr 18, 2020
Member

I guess this is a new assertion introduced in this PR, and I'd like to share this helped to find 04e5695. Thank you!

@eregon
Copy link
Member

@eregon eregon commented May 6, 2020

@shyouhei I suspect you might like this somewhat related change in TruffleRuby: oracle/truffleruby@d83ddc3
It's splitting the 3448-lines long C file in 36 files of ~100 lines each 😃

Now it looks like there is a small MRI in https://github.com/oracle/truffleruby/tree/master/src/main/c/cext except most C functions just call back to Ruby and not the other way around.

@shyouhei
Copy link
Member Author

@shyouhei shyouhei commented May 8, 2020

Yes, very interesting. Thank you. Let me take a closer look at them. I guess @ko1 might also be interested in it. His current approach (ruby methods are written in ruby, who call C codes via _builtin) seems like the opposite of them.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Linked issues

Successfully merging this pull request may close these issues.

None yet

5 participants
You can’t perform that action at this time.