Project

General

Profile

Actions

Feature #15438

closed

Threads can't switch faster than TIME_QUANTUM_(NSEC|USEC|MSEC)

Added by sylvain.joyeux (Sylvain Joyeux) almost 6 years ago. Updated 7 months ago.

Status:
Closed
Assignee:
-
Target version:
-
[ruby-core:90624]
Tags:

Description

Thread#priority can be set to negative values, which when looking at the code is meant to reduce the time allocated to the thread. However, as far as I could understand in the codebase, the quantum of time is definitely hard-coded to 100ms (TIME_QUANTUM_...). This means that the "lower allocated time" would only work for threads that would often yield one way or the other (sleep, blocking calls, ...)

My projects would definitely benefit from a faster switching period. I was wondering how best to implement this ability ?

I thought of the following:

  1. globally using an environment variable
  2. globally using an API
  3. trying to adapt dynamically, using the highest needed period
  4. lowering the period when a priority lower than 0 is set, leaving it at the lower period.

Obviously (3) would seem to be the best, but I'm not sure I would be able to get it right in a decent amount of time. (4) seem to be a good trade-off between simplicity and performance (nothing changes if you never use priorities lower than 0, and if you were you basically get what you wanted).

What do you think ?


Files

Updated by sylvain.joyeux (Sylvain Joyeux) almost 6 years ago

This is the implementation of option (4) on 2.5.3 (since that's what I am using). If it has a chance to be accepted, I will forward-port it to 2.6.

Updated by sylvain.joyeux (Sylvain Joyeux) almost 6 years ago

Anyone interested on the subject ?

Updated by nobu (Nobuyoshi Nakada) almost 6 years ago

Your patch can't apply to the latest code.
Could you rebase it?

Updated by sylvain.joyeux (Sylvain Joyeux) almost 6 years ago

Hi nobu. Thanks for looking into this !

Here's the patch for 2.6. I would be glad if you could consider also applying the other patch to 2.5.

Updated by sylvain.joyeux (Sylvain Joyeux) almost 6 years ago

Quick question ... what is the preferred method to provide patches now ? I'm seeing that there are PR on GitHub ... Should I propose the patch here instead ?

Updated by nobu (Nobuyoshi Nakada) almost 6 years ago

sylvain.joyeux (Sylvain Joyeux) wrote:

Here's the patch for 2.6. I would be glad if you could consider also applying the other patch to 2.5.

Thank you so much.

In thread_win32.c:

-#define TIME_QUANTUM_USEC (10 * 1000)
+#define TIME_QUANTUM_USEC_BASE (100 * 1000)

Why 10 times?

+#define TIME_QUANTUM_USEC TIME_QUANTUM_USEC

_BASE is missing at the last?

+static const rb_hrtime_t TIME_QUANTUM_NSEC = &TIME_QUANTUM_USEC * 1000;

The address of a lvalue is invalid, and the multiplication too.
Just & isn't needed?

sylvain.joyeux (Sylvain Joyeux) wrote:

Quick question ... what is the preferred method to provide patches now ? I'm seeing that there are PR on GitHub ... Should I propose the patch here instead ?

Anywhere you like.

Updated by sylvain.joyeux (Sylvain Joyeux) over 5 years ago

Follow-up PR: https://github.com/ruby/ruby/pull/2087

I apologize for Windows not building, I don't have a windows machine to try it on. Unfortunately, the Azure Pipeline build on GH failed to fetch the repository, and there doesn't seem to be a way to restart it.

Updated by k0kubun (Takashi Kokubun) over 5 years ago

I apologize for Windows not building

Both Linux and macOS on Travis are red. It's fine that you ignore our unstable Azure Pipeline for now, but you can still use AppVeyor for checking Windows. Please just try to make Travis and AppVeyor green.

Unfortunately, the Azure Pipeline build on GH failed to fetch the repository, and there doesn't seem to be a way to restart it.

I can rerun that, but the fetch failure is actually definitive. Please remove v1_3_1"_990201 git tag on your fork first. We removed that tag from ruby/ruby mirror recently. Your repository might be forked before that and so still seems to have that. https://github.com/ThirteenLtda/ruby/releases/tag/v1_3_1%22_990201

Updated by sylvain.joyeux (Sylvain Joyeux) over 5 years ago

  • ruby -v changed from ruby 2.5.1p57 (2018-03-29 revision 63029) [x86_64-linux-gnu] to trunk

I'll check appveyor out

I've push --mirror from the current ruby git repo to remove all dead references, and re-pushed the branch. I didn't realize it would force me to re-create the pull request, which is now https://github.com/ruby/ruby/pull/2093.

I'll make sure the AppVeyor build passes, and report here when it does.

Updated by Eregon (Benoit Daloze) over 5 years ago

Why is RUBY_THREAD_PRIORITY_MIN changed? This should be discussed in this thread.

Other implementations have to follow this change (e.g. TruffleRuby & JRuby already handle negative priorities AFAIK),
so we need a good spec to ensure in a given Ruby versions the limit is the same for all implementations.

Updated by sylvain.joyeux (Sylvain Joyeux) over 5 years ago

RUBY_THREAD_PRIORITY_MIN is changed to give the developer the choice to make some threads really low priority, at the cost of switching overhead. Unless I'm mistaken, the semantic of 'priority' right now is very platform-specific anyways (depends on the underlying thread scheduling implementation), so I don't see a major drive in making the min and max value uniform across all Ruby implementations.

If this indeed is a problem, I'll just switch it back to -3.

Updated by Eregon (Benoit Daloze) over 5 years ago

the semantic of 'priority' right now is very platform-specific anyways

The specific semantics maybe, but the range is platform-agnostic and has always been -3..3 so far.

I don't see a major drive in making the min and max value uniform across all Ruby implementations.

I think it's important, otherwise how is Ruby code supposed to know what are the limits and which value it can use for Thread#priority= ?

Updated by sylvain.joyeux (Sylvain Joyeux) over 5 years ago

I think it's important, otherwise how is Ruby code supposed to know what are the limits and which value it can use for Thread#priority= ?

It does not have to know. It will use, say -5 because it is meaningful on MRI. The value is then clamped to -3 by e.g. JRuby where -5 is not meaningful. This is already the behavior of Thread#priority - clamp to [min, max], https://github.com/ruby/spec/pull/661 only stopped assuming that the only range is [-3, 3]

This IMO does not add any more variability than there already is between the platforms, where '-3' already maps to different behaviors.

Updated by Eregon (Benoit Daloze) over 5 years ago

It does not have to know. It will use, say -5 because it is meaningful on MRI. The value is then clamped to -3 by e.g. JRuby where -5 is not meaningful.

-5 would still mean "the highest priority", yes, but then e.g. -4 or -3 which starts to mean "high but not highest priority" on MRI would have a different meaning on other implementations, which I believe is undesirable.

FWIW, TruffleRuby and JRuby map -3..3 to Java 1..10:
https://github.com/oracle/truffleruby/blob/ed796f79515351f8c569306e01775c13e48876e8/src/main/ruby/core/thread.rb#L271-L276
So making the range larger would actually allow finer control on those implementations too.
setpriority(2) has 40 values on Linux.

Maybe we should use something else than integers, like a floating point number or Rational between 0 and 1, and internally map to the closest value?
I think that would be both more flexible and portable than extending the range on one side just for one implementation.

Updated by sylvain.joyeux (Sylvain Joyeux) over 5 years ago

-5 would still mean "the highest priority", yes, but then e.g. -4 or -3 which starts to mean "high but not highest priority" on MRI would have a different meaning on other implementations, which I believe is undesirable.

-5 means "give a timeslice of 3.1ms to this thread" while the threads at 0 will have a timeslice of 100ms and 3 have 800ms. So, -5 is lowest, not highest.

I guess we'll have to agree to disagree on the desirability issue. -3 on MRI and -3 on JRuby will already cause a different behavior in the various implementations. Someone who would want to fine-tune threading behavior will already have to find which values to use on which implementation.

Maybe we should use something else than integers, like a floating point number or Rational between 0 and 1, and internally map to the closest value?

I guess we could ... but on MRI we would still have to "guess" what are both the lowest and highest time slices we want to support forever and embed that in the value. Not really a very good prospect IMO.

Updated by sylvain.joyeux (Sylvain Joyeux) over 5 years ago

OK ... since it seems that everyone is camping on their own position, I have an alternative proposal: initialize the min and max priority values using environment variables on MRI, defaults to -3 and 3. Would that be alright ?

Updated by Eregon (Benoit Daloze) over 5 years ago

I guess we could ... but on MRI we would still have to "guess" what are both the lowest and highest time slices we want to support forever and embed that in the value. Not really a very good prospect IMO.

Does it matter what the exact timeslice is in absolute units, or what matters is how often is a high-priority Thread executed compared to a low-priority Thread?

If the first one, maybe the API should be Thread#timeslice=, but I have no idea how to map e.g. getpriority/setpriority to that (on implementations without a GIL).

Someone who would want to fine-tune threading behavior will already have to find which values to use on which implementation.

How about exposing Thread::MIN_PRIORITY and Thread::MAX_PRIORITY? Then one could know what's the valid range of values.
I would be fine with that as an evolution mechanism (and bump MAX_PRIORITY from currently 3 to 5 on MRI).

I understand the desire to keep the same timeslice values on MRI for compatibility.
I'm trying to come with an API that is still portable and doesn't silently truncate values a user cannot easily query.

In the end, this is just my opinion. Other MRI committers, please weigh in. cc @nobu (Nobuyoshi Nakada)

Updated by sylvain.joyeux (Sylvain Joyeux) over 5 years ago

Does it matter what the exact timeslice is in absolute units, or what matters is how often is a high-priority Thread executed compared to a low-priority Thread?

The actual underlying behavior matters, actually. If I'm using Linux scheduler, I know how to play to get what I want. On MRI, the control is the timeslice, so yes I want to know what timeslice I get.

What I really don't see is how you can expect to get a common set of values, the underlying scheduling system being always different.

I don't see any downside with exposing MIN_PRIORITY and MAX_PRIORITY. If that would be fine for you too, I'll just change the PR to reflect that.

Updated by Eregon (Benoit Daloze) over 5 years ago

sylvain.joyeux (Sylvain Joyeux) wrote:

Does it matter what the exact timeslice is in absolute units, or what matters is how often is a high-priority Thread executed compared to a low-priority Thread?

The actual underlying behavior matters, actually. If I'm using Linux scheduler, I know how to play to get what I want. On MRI, the control is the timeslice, so yes I want to know what timeslice I get.

OK, then I think it would be worth documenting what the timeslice is on MRI based on the priority value in the docs of Thread#priority=, and mention other implementations might interpret differently.

What I really don't see is how you can expect to get a common set of values, the underlying scheduling system being always different.

Shouldn't we be able to have e.g. a Thread with half the priority of another, regardless of the OS/Ruby implementation?

I don't see any downside with exposing MIN_PRIORITY and MAX_PRIORITY. If that would be fine for you too, I'll just change the PR to reflect that.

I think that would be a useful addition and gives a path forward for knowing what is the priority value range, so please do.

I'm not very familiar with scheduling priorities, so another review/opinion would definitely be welcome.

Actions #20

Updated by jeremyevans0 (Jeremy Evans) about 1 year ago

  • Tracker changed from Bug to Feature
  • ruby -v deleted (trunk)
  • Backport deleted (2.4: UNKNOWN, 2.5: UNKNOWN)

Updated by jhawthorn (John Hawthorn) 7 months ago

  • Status changed from Open to Closed

I think this is something we should improve more (I would like even faster switching times), but it does seem possible as of Ruby 3.3 to have threads switch faster than 100ms.

def test_switching(priority = 0)
  done = false
  started = false
  busy = Thread.new do
    Thread.current.priority = priority
    until done
      started = true
    end
  end

  Thread.pass until started

  times = []
  while times.length < 10
    before = Process.clock_gettime(Process::CLOCK_MONOTONIC)
    Thread.pass
    after = Process.clock_gettime(Process::CLOCK_MONOTONIC)
    times << (after - before)
  end
  done = true
  busy.join

  times
end

puts RUBY_VERSION
(-3).upto(3) do |priority|
  print "Priority: #{priority}"
  times = test_switching(priority)
  times = times.sort[1..-2] # drop fastest and slowest
  average = (times.sum / times.length)
  puts "  average: #{average}"
end
$ ruby test_measure_switching_times.rb
3.2.2
Priority: -3  average: 0.10010529025021242
Priority: -2  average: 0.10010091188087245
Priority: -1  average: 0.10010025675364886
Priority: 0  average: 0.10010364262052462
Priority: 1  average: 0.20017941037440323
Priority: 2  average: 0.4003785701279412
Priority: 3  average: 0.8007820281236491
$ ruby test_measure_switching_times.rb
3.3.1
Priority: -3  average: 0.02040818374371156
Priority: -2  average: 0.03023905074587674
Priority: -1  average: 0.05040335562443943
Priority: 0  average: 0.10088755612378009
Priority: 1  average: 0.20198591962616774
Priority: 2  average: 0.4033456226279668
Priority: 3  average: 0.8073137137507729
Actions

Also available in: Atom PDF

Like0
Like0Like0Like0Like0Like0Like0Like0Like0Like0Like0Like0Like0Like0Like0Like0Like0Like0Like0Like0Like0Like0