Bug #17263: Fiber context switch degrades with number of fibers, limit on number of fibers - Ruby - Ruby Issue Tracking System


Added by ciconia (Sharon Rosner) over 5 years ago. Updated over 2 years ago.

Status: Closed
Assignee:
Target version:
ruby -v: 2.7.1
Backport: 2.5: UNKNOWN, 2.6: UNKNOWN, 2.7: UNKNOWN

[ruby-core:100401]

Description

I'm working on developing Polyphony, a Ruby gem for writing
highly-concurrent Ruby programs with fibers. In the course of my work I have
come up against two problems using Ruby fibers:

1. Fiber context-switching performance seems to degrade as the number of fibers
   increases. This happens both with Fiber#transfer and with
   Fiber#resume/Fiber.yield.
2. The number of concurrent fibers that can exist at any time seems to be
   limited. Once a certain number is reached (on my system this seems to be
   31744 fibers), calling Fiber#transfer raises a FiberError with the
   message "can't set a guard page: Cannot allocate memory". This is not due to
   RAM being saturated: with 10000 fibers, my test program hovers at around 150MB
   RSS (on Ruby 2.7.1).
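
The second problem can be probed in isolation with a much smaller script. The following is only a minimal sketch: `probe_fiber_limit` is a hypothetical helper name, not part of the original test programs, and it assumes that the error surfaces when a fiber is first switched into (as in the report, where it was raised by Fiber#transfer). The cap of 1000 is an arbitrary value chosen to keep the run short; raising it should eventually hit whatever limit applies on a given system.

```ruby
# Hypothetical helper (not from the original report): starts up to `max`
# fibers, keeps them all alive, and returns how many were started before
# Ruby raised FiberError (or `max` if no error occurred).
def probe_fiber_limit(max)
  fibers = []
  max.times do
    fiber = Fiber.new { Fiber.yield }
    fiber.resume    # switch into the fiber once so that it actually runs
    fibers << fiber # keep a reference so the fiber is not collected
  end
  fibers.size
rescue FiberError => e
  warn "FiberError after #{fibers.size} fibers: #{e.message}"
  fibers.size
end

puts probe_fiber_limit(1_000)
```

If the returned number is lower than the requested cap, the limit was reached before all fibers could be started.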

Here's a program for testing the performance of Fiber#transfer:

# frozen_string_literal: true

require 'fiber'

class Fiber
  attr_accessor :next
end

def run(num_fibers)
  count = 0

  GC.start
  GC.disable

  first = nil
  last = nil
  supervisor = Fiber.current
  num_fibers.times do
    fiber = Fiber.new do
      loop do
        count += 1
        if count == 1_000_000
          supervisor.transfer
        else
          Fiber.current.next.transfer
        end
      end
    end
    first ||= fiber
    last.next = fiber if last
    last = fiber
  end

  last.next = first
  
  t0 = Time.now
  first.transfer
  elapsed = Time.now - t0

  rss = `ps -o rss= -p #{Process.pid}`.to_i

  puts "fibers: #{num_fibers} rss: #{rss} count: #{count} rate: #{count / elapsed}"
rescue Exception => e
  puts "Stopped at #{count} fibers"
  p e
end

run(100)
run(1000)
run(10000)
run(100000)

With Ruby 2.6.5 I'm getting:

fibers: 100 rss: 23212 count: 1000000 rate: 3357675.1688139187
fibers: 1000 rss: 31292 count: 1000000 rate: 2455537.056439736
fibers: 10000 rss: 127388 count: 1000000 rate: 954251.1674325482
Stopped at 22718 fibers
#<FiberError: can't set a guard page: Cannot allocate memory>

With Ruby 2.7.1 I'm getting:

fibers: 100 rss: 23324 count: 1000000 rate: 3443916.967616508
fibers: 1000 rss: 34676 count: 1000000 rate: 2333315.3862491543
fibers: 10000 rss: 151364 count: 1000000 rate: 916772.1008060966
Stopped at 31744 fibers
#<FiberError: can't set a guard page: Cannot allocate memory>

With ruby-head I get an almost identical result to that of 2.7.1.

As you can see, the performance degradation is similar in all three versions
of Ruby, going from ~3.4M context switches per second for 100 fibers to less
than 1M context switches per second for 10000 fibers. Running with 100000 fibers
fails to complete.
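
To make the slowdown easier to compare, the reported rates can be converted into an average time per context switch. The snippet below is a small illustrative calculation using only the Ruby 2.7.1 figures printed above; `ns_per_switch` is a name introduced here for illustration. It works out to roughly 290 ns per switch at 100 fibers and roughly 1,090 ns at 10000 fibers.

```ruby
# Rates (context switches per second) copied from the Ruby 2.7.1 output above.
RATES = {
  100    => 3_443_916.967616508,
  1_000  => 2_333_315.3862491543,
  10_000 => 916_772.1008060966
}.freeze

# Hypothetical helper: average nanoseconds spent per context switch.
def ns_per_switch(rate)
  1e9 / rate
end

RATES.each do |num_fibers, rate|
  puts format('fibers: %6d  ~%.0f ns per switch', num_fibers, ns_per_switch(rate))
end
```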

Here's a program for testing the performance of Fiber#resume/Fiber.yield:

# frozen_string_literal: true

require 'fiber'

class Fiber
  attr_accessor :next
end

# This program shows how the performance of Fiber#resume/Fiber.yield degrades
# as the fiber count increases

def run(num_fibers)
  count = 0

  GC.start
  GC.disable

  fibers = []
  num_fibers.times do
    fibers << Fiber.new { loop { Fiber.yield } }
  end

  t0 = Time.now

  while count < 1000000
    fibers.each do |f|
      count += 1
      f.resume
    end
  end

  elapsed = Time.now - t0

  puts "fibers: #{num_fibers} count: #{count} rate: #{count / elapsed}"
rescue Exception => e
  puts "Stopped at #{count} fibers"
  p e
end

run(100)
run(1000)
run(10000)
run(100000)

With Ruby 2.7.1 I'm getting the following output:

fibers: 100 count: 1000000 rate: 3048230.049946255
fibers: 1000 count: 1000000 rate: 2362235.6455160403
fibers: 10000 count: 1000000 rate: 950251.7621725246
Stopped at 21745 fibers
#<FiberError: can't set a guard page: Cannot allocate memory>

As I understand it, switching between fibers should, at least in theory, have
a constant cost in terms of CPU cycles, irrespective of the number of fibers
currently existing in memory. I am completely ignorant of the implementation
details of Ruby fibers, so at least for now I don't have any idea where this
problem is coming from.
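
One way to check the constant-cost expectation on a small scale is to measure the average time per switch directly. The following is a hedged sketch rather than a replacement for the programs above: `avg_switch_ns` is a hypothetical helper, the fiber counts and switch count are arbitrary, and it uses Process.clock_gettime with CLOCK_MONOTONIC instead of Time.now so that wall-clock adjustments cannot skew the result. Each counted step is one Fiber#resume plus the matching Fiber.yield.

```ruby
# Hypothetical helper: average nanoseconds per resume/yield step when
# cycling round-robin through `num_fibers` fibers for at least `switches` steps.
def avg_switch_ns(num_fibers, switches)
  fibers = Array.new(num_fibers) { Fiber.new { loop { Fiber.yield } } }
  count = 0

  t0 = Process.clock_gettime(Process::CLOCK_MONOTONIC)
  while count < switches
    fibers.each do |f|
      f.resume
      count += 1
    end
  end
  elapsed = Process.clock_gettime(Process::CLOCK_MONOTONIC) - t0

  elapsed * 1e9 / count
end

[10, 100, 1_000].each do |n|
  puts format('fibers: %5d  avg: %.0f ns', n, avg_switch_ns(n, 100_000))
end
```

If the cost per switch were truly constant, the printed averages would stay roughly flat as the fiber count grows.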

Files

clipboard-202308251514-grqb1.png (81.3 KB), ioquatix (Samuel Williams), 08/25/2023 03:15 AM
clipboard-202308251514-r7g4l.png (81 KB), ioquatix (Samuel Williams), 08/25/2023 03:15 AM
clipboard-202308251538-kmofk.png (13.8 KB), ioquatix (Samuel Williams), 08/25/2023 03:38 AM
flamegraph_make_many_fibers.png (471 KB), kjtsanaktsidis (KJ Tsanaktsidis), 09/18/2023 08:21 AM
cache_misses_vs_time.png (42.5 KB), kjtsanaktsidis (KJ Tsanaktsidis), 09/18/2023 08:21 AM


#1 [ruby-core:100402] Updated by ioquatix (Samuel Williams) over 5 years ago
#2 [ruby-core:100403] Updated by ioquatix (Samuel Williams) over 5 years ago
#3 Updated by Eregon (Benoit Daloze) over 5 years ago
#4 [ruby-core:100412] Updated by ioquatix (Samuel Williams) over 5 years ago
#5 [ruby-core:100418] Updated by ioquatix (Samuel Williams) over 5 years ago
#6 [ruby-core:100453] Updated by ciconia (Sharon Rosner) over 5 years ago
#7 [ruby-core:100499] Updated by ioquatix (Samuel Williams) over 5 years ago
#8 [ruby-core:107390] Updated by rmosolgo (Robert Mosolgo) over 4 years ago
#9 [ruby-core:114519] Updated by ioquatix (Samuel Williams) almost 3 years ago
#10 [ruby-core:114520] Updated by ioquatix (Samuel Williams) almost 3 years ago
#11 [ruby-core:114523] Updated by ioquatix (Samuel Williams) almost 3 years ago
#12 [ruby-core:114524] Updated by ioquatix (Samuel Williams) almost 3 years ago
#13 [ruby-core:114525] Updated by ioquatix (Samuel Williams) almost 3 years ago
#14 [ruby-core:114794] Updated by kjtsanaktsidis (KJ Tsanaktsidis) over 2 years ago

Making lots of fibers

Fiber memory mappings

Page faults

Transferring between existing fibers

Page faults

Cache misses

Conclusion
