Bug #14445

closed

MJIT: Determine path of mjit header and libruby at runtime

Added by larskanis (Lars Kanis) about 6 years ago. Updated about 6 years ago.

Status: Closed
Assignee: -
Target version: -
ruby -v: ruby 2.6.0dev (2018-02-05 trunk 62211) [x64-mingw32]
[ruby-core:85390]

Description

I'm pleased to see that the MJIT pull request has been merged and that my initial Windows port has been integrated as well! One part of that patch was to avoid built-in absolute paths in the binary. This part was not merged.

The current MJIT implementation uses the build-time install prefix as part of the include and library paths passed to gcc/clang at runtime. If ruby is distributed as a pre-compiled binary, these paths can differ between build time and runtime. This applies in particular to the Windows RubyInstaller, where the binaries are stored in a user-defined directory.

So for RubyInstaller I'm using this patch to determine the paths based on the ruby top directory and to compile only relative paths into the binary. There's probably a cleaner approach to do this, but since ruby has always been fully path-relocatable, I think this should be fixed for MJIT as well.
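
A minimal sketch of the idea, not the actual patch: the helper name join_prefix() and the relative header path are made up for illustration, and the real code in mjit.c relies on Ruby's internal TMP_RUBY_PREFIX and may look quite different. Only the relative part is compiled into the binary; the prefix is supplied at runtime.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical relative path below the install prefix; this is the only
 * path fragment that would be compiled into the binary. */
#define MJIT_HEADER_REL_PATH "/include/mjit_header.h"

/* Joins a prefix discovered at runtime with a relative path.
 * Returns a malloc'ed string that the caller must free, or NULL if
 * the allocation fails. */
static char *
join_prefix(const char *runtime_prefix, const char *relative_path)
{
    size_t prefix_len, rel_len;
    char *full;

    assert(runtime_prefix != NULL && relative_path != NULL);
    prefix_len = strlen(runtime_prefix);
    rel_len = strlen(relative_path);

    full = malloc(prefix_len + rel_len + 1);
    if (full == NULL) return NULL;

    memcpy(full, runtime_prefix, prefix_len);
    /* Copy the relative part including its terminating NUL byte. */
    memcpy(full + prefix_len, relative_path, rel_len + 1);
    return full;
}
```

With a prefix such as "C:/Ruby26-x64" found at startup, join_prefix(prefix, MJIT_HEADER_REL_PATH) would give the header location for that particular installation, wherever the user put it.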


Updated by k0kubun (Takashi Kokubun) about 6 years ago

  • Status changed from Open to Closed

Applied in changeset trunk|r62238.


mjit.c: determine prefix of MJIT header at runtime

so that MJIT can work if Ruby is distributed as prebuilt binary.

Now mjit_init() depends on the internal const TMP_RUBY_PREFIX which is
only available after ruby_init_loadpath_safe() (L1608) and before
ruby_init_prelude() (L1681). So the place of mjit_init() is moved.

Makefile.in: Removed static prefix from MJIT_HEADER_INSTALL_DIR macro.
And this removes the unused LIBRUBY_LIBDIR macro as well.
win32/Makefile.sub: ditto.

Patch by: Lars Kanis
[Bug #14445]

Updated by k0kubun (Takashi Kokubun) about 6 years ago

Thank you for the initial Windows port and your work on this patch. I agree that we need to support pre-compiled binaries.
As this patch worked well in my MinGW environment, I merged it in r62238 with an added comment and #ifdef.

Updated by larskanis (Lars Kanis) about 6 years ago

I have to thank you for moving the whole MJIT train forward!

Unfortunately, GCC's startup time on Windows is so slow that MJIT is rarely useful. One idea is to bundle several iseq units from the queue into one compiler run. In a very simple test I was able to combine more than 10 iseqs without significantly increasing the compile time. Is this something you could imagine?
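
A rough cost model of the bundling idea, as a sketch only: the function names and the assumption that each compiler run costs a fixed startup time plus a small per-iseq time are mine, not measurements from MJIT.

```c
#include <assert.h>

/* Number of compiler processes needed to drain `queued` iseqs when each
 * run may take up to `batch_size` of them (rounds up). */
static unsigned long
compiler_runs(unsigned long queued, unsigned long batch_size)
{
    assert(batch_size > 0);
    return (queued + batch_size - 1) / batch_size;
}

/* Total compile time under the assumed model: every run pays the fixed
 * startup cost once, and every iseq adds a small per-iseq cost. */
static unsigned long
total_compile_ms(unsigned long queued, unsigned long batch_size,
                 unsigned long startup_ms, unsigned long per_iseq_ms)
{
    return compiler_runs(queued, batch_size) * startup_ms
         + queued * per_iseq_ms;
}
```

With made-up values of 900 ms startup and 10 ms per iseq, ten iseqs cost 9100 ms when compiled one by one and 1000 ms when bundled into a single run, which is the kind of saving the test above hints at when startup dominates.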

Updated by vmakarov (Vladimir Makarov) about 6 years ago

On 02/05/2018 01:10 PM, wrote:

> Issue #14445 has been updated by larskanis (Lars Kanis).
>
> I have to thank you for getting the whole MJIT train forward!
>
> Unfortunately on Windows the startup times of GCC are so crazy slow, that MJIT is rarely useful. An idea is to bundle several iseq units from the queue to one compiler run. In a very simple test I was able to combine more than 10 iseqs without significantly increasing the compile time. Is this something you could imagine?

Actually, the first variant of MJIT had this feature.  It had a batch
which contained a few iseqs.  The batch was the unit of compilation.

The batch had its own drawbacks.  An iseq can be compiled several times
(e.g. because of different levels of speculation).  You can remove old
iseq code only when all other iseqs in the same batch have become
obsolete, because the code of all batch iseqs is in the same shared
object and we can remove only the whole shared object.  This results in
keeping several variants of code for one iseq in memory.  The more iseqs
a batch contains, the more memory can be wasted on obsolete code.  So I
removed batches from MJIT.  This code can be found at
https://github.com/vnmakarov/ruby in a variant before Aug 2.
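
The drawback described above can be modeled roughly as follows. This is a sketch with hypothetical names (struct batch, batch_can_unload, and so on), not code from the pre-Aug-2 branch: one batch stands for one shared object, and it can be unloaded only when every code variant in it is obsolete.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define MAX_BATCH 16

/* Hypothetical model of one compiled batch: a single shared object
 * holding the code of several iseq variants. */
struct batch {
    size_t n_units;            /* number of code variants in the object */
    bool obsolete[MAX_BATCH];  /* true once variant i is no longer used */
};

/* Marks one variant as obsolete, e.g. after the iseq was recompiled. */
static void
batch_mark_obsolete(struct batch *b, size_t i)
{
    assert(i < b->n_units);
    b->obsolete[i] = true;
}

/* The shared object can be unloaded only when all variants are obsolete. */
static bool
batch_can_unload(const struct batch *b)
{
    for (size_t i = 0; i < b->n_units; i++) {
        if (!b->obsolete[i]) return false;
    }
    return true;
}

/* Obsolete variants that still occupy memory because at least one other
 * variant in the same shared object is still live. */
static size_t
batch_wasted_units(const struct batch *b)
{
    size_t wasted = 0;
    if (batch_can_unload(b)) return 0;
    for (size_t i = 0; i < b->n_units; i++) {
        if (b->obsolete[i]) wasted++;
    }
    return wasted;
}
```

In this model a batch of size 1 never wastes anything, while a larger batch keeps obsolete code around until its last live variant goes away.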

The memory waste might not be a big deal, as we have no aggressive
speculation and iseq recompilations are rare.  It is probably worth
restoring the code.  Linux and macOS can have batches containing only 1
iseq.  For Windows, the batch can have more than 1 iseq.  I could try it,
but unfortunately not before April because I am quite busy with the GCC 8
release these days.

I only have experience with Cygwin, where gcc is very slow.  I guess
MinGW has the same problem.  I suspect acceptable compilation speed can
be achieved only by using the native Visual C compiler.

Another strategic way to solve the problem could be to implement
a simple tier 1 JIT compiler.  In this case, the current MJIT with
GCC/Clang would become a typical tier 2 JIT compiler, which generates
very optimized code but takes much more compilation time.  Zing VM is
one such example.  It replaces only the JVM tier 2 (server) compiler with
LLVM, not touching the tier 1 compiler, which is fast but generates less
optimized code.  I have thought about this approach.  By my estimate such
a JIT could be 4-5K of C, achieving 70% of GCC -O2 performance on x86-64
while being at least 10 times faster in compilation speed.  On Windows,
tier 1 could be used more frequently than on Linux/macOS.  But I still
have a lot of questions about this approach.

Updated by k0kubun (Takashi Kokubun) about 6 years ago

I agree that the batching idea above and having multiple tiers are worth trying, but as I understand it, the current bottleneck in compilation time on MinGW comes from the fact that we gave up transforming the header on Windows.
I guess adding static to non-static functions would make compilation fast, and we should solve that first.

Updated by vmakarov (Vladimir Makarov) about 6 years ago

On 02/05/2018 07:25 PM, wrote:

> Issue #14445 has been updated by k0kubun (Takashi Kokubun).
>
> I agree above batching ideas are worth trying, but I understand current bottleneck of compilation time on MinGW comes from the fact that we give up transforming header for Windows.

The header should contain only static functions. In this case, unused
functions are never optimized and not processed for code generation. N
non-static functions would slow down compilation by a factor of about N.
As there are many functions in the header, it is definitely a bottleneck.

> I guess adding static to non-static functions would make compilation fast, and we should solve it first.
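
To illustrate the point about static functions, here is a small sketch. Both function names are hypothetical and not taken from Ruby's MJIT header.

```c
/* Non-static definition: it has external linkage, so the compiler has to
 * optimize it and emit code for it in every translation unit that
 * includes it, whether or not anything there calls it. */
int
helper_with_external_linkage(int x)
{
    return x + 1;
}

/* Static definition: if nothing in the translation unit references it,
 * GCC and Clang can typically drop it without optimizing it or
 * generating code for it. */
static int
helper_with_internal_linkage(int x)
{
    return x + 1;
}
```

Both functions behave the same when called; the difference is only in how much work the compiler does for the ones that a given JIT-compiled unit never uses.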

Updated by k0kubun (Takashi Kokubun) about 6 years ago

wrote (but seems not posted to redmine):

> Thank you Vladimir for this hint! I also think that profiling gcc makes sense, to find the root cause of this slowness, before adding iseq batching. Anyhow, with the RTL-MJIT branch the compile time was similarly slow on Windows, even with its optimized header file in addition to precompiled headers.

Hmm, that's surprising to me, because giving up the transformation increases the minimum compilation time from about 50ms to about 600ms on Linux, and MinGW's compilation takes about 900ms on the same machine.

We can probably verify it on current trunk by manually transforming the header to add "static" to functions that don't have it. I'll try that later.

A note on implementation background: the RTL-MJIT branch succeeded in transforming the header because it removed many unused declarations. I dropped that removal because it made building the header too slow, and current Ruby trunk fails to transform the header because the unused declarations remain and interfere with the transformation. But the failure is a bug, and we should fix it without removing unused declarations.
