Misc #20013
Open
Travis CI status
Description
I would like to use this ticket to track and report the Travis CI status.
Travis CI provides its own status page, but even when that page shows everything as OK, I actually see infra issues.
https://www.traviscistatus.com/
I will share my activities and report the Travis CI status on this ticket.
This ticket will stay open until we stop using Travis CI.
The easiest way to get a Travis infra issue fixed is to email Travis CI support at support _AT_ travis-ci.com.
You can check this ruby/ruby Travis CI wiki page for details.
Updated by jaruga (Jun Aruga) 12 months ago
Travis s390x builds are not starting right now. I am emailing Travis CI customer support to ask them to fix it.
https://app.travis-ci.com/github/ruby/ruby/builds/267381855
https://app.travis-ci.com/github/ruby/ruby/builds/267383404
Updated by jaruga (Jun Aruga) 12 months ago
It seems that s390x builds take time to start, but the builds are still running.
https://app.travis-ci.com/github/ruby/ruby/builds
Updated by jaruga (Jun Aruga) 12 months ago
I asked Travis CI support about the s390x build issue yesterday. Support replied that they are now investigating the issue.
Updated by jaruga (Jun Aruga) 12 months ago
I will enable allow_failures for s390x. I am sorry for that.
https://github.com/ruby/ruby/pull/8997
Updated by jaruga (Jun Aruga) 12 months ago
I see that the following infra component is shown in yellow (not green).
https://www.traviscistatus.com/
Pusher Webhooks - Degraded Performance
Updated by jaruga (Jun Aruga) 12 months ago
I will drop s390x temporarily. I guess that Travis CI has a maximum queue size, and because the s390x builds are stuck in the queue, builds for the other CPU architectures (arm64, arm32, ppc64le) don't even start.
https://github.com/ruby/ruby/pull/9004
Updated by jaruga (Jun Aruga) 12 months ago
I am canceling the s390x builds manually for the running Travis builds.
Updated by jaruga (Jun Aruga) 12 months ago
I can see Travis CI builds are stable except for s390x.
https://app.travis-ci.com/github/ruby/ruby/builds
I am communicating with Travis CI support. It seems they added "IBM Z Builds" under Build Processing on the Travis CI status page, and you can see that its status is Degraded Performance (yellow). It's really helpful! I am asking them to add "Arm builds" and "IBM ppc64le builds" to the page too.
https://www.traviscistatus.com/
Updated by jaruga (Jun Aruga) 12 months ago
I am testing the s390x builds on my forked repository to add it again.
https://github.com/ruby/ruby/pull/9024
Updated by jaruga (Jun Aruga) 12 months ago
I tested the PR to add the s390x on my forked repository, and merged it. Now Travis CI has the s390x pipeline again.
Updated by jaruga (Jun Aruga) 12 months ago
I was told by Travis customer support that their infra team resolved the issue with s390x builds, and the builds should work now.
Updated by jaruga (Jun Aruga) 11 months ago
I am now emailing Travis CI support about the following error message, which is printed only in Arm64 pipelines and does not seem to affect the results of the CI tests.
https://app.travis-ci.com/github/ruby/ruby/jobs/615194806#L6
sudo: unable to resolve host travis-job-ruby-ruby-615194806: Name or service not known
I opened a thread about the issue at the end of October 2023, but I haven't seen a response there.
https://travis-ci.community/t/arm64-sudo-unable-to-resolve-host-name-or-service-not-known/14028
So I emailed them today and was told that support has reached out to the Travis infra team. I will let you know here when I have updates.
Updated by jaruga (Jun Aruga) 10 months ago
It seems that Travis s390x is slow, hitting the maximum of 50 minutes (ruby_3_3 specific issue?),
https://app.travis-ci.com/github/ruby/ruby/builds/268615249
or not starting promptly.
https://app.travis-ci.com/github/ruby/ruby/builds/268616415
I am contacting Travis CI support.
Updated by jaruga (Jun Aruga) 10 months ago
I will drop the s390x case in Travis CI temporarily. I am not sure whether the issue comes from the infra or from Ruby, but right now tests failing at the 50-minute limit are not practical for CI.
https://github.com/ruby/ruby/pull/9758
Updated by jaruga (Jun Aruga) 10 months ago
jaruga (Jun Aruga) wrote in #note-14:
I will drop the s390x case in Travis CI temporarily. I am not sure whether the issue comes from the infra or from Ruby, but right now tests failing at the 50-minute limit are not practical for CI.
https://github.com/ruby/ruby/pull/9758
I got a message from Travis CI support: "Our Infra team has deployed a fix for the issue you encountered with the s390x Build environment."
Now I am testing the Travis s390x on my forked repository.
Updated by jaruga (Jun Aruga) 10 months ago
Now I am testing the Travis s390x on my forked repository.
I tested. I sent a PR to add the s390x again.
https://github.com/ruby/ruby/pull/9773
Updated by jaruga (Jun Aruga) 10 months ago
jaruga (Jun Aruga) wrote in #note-16:
Now I am testing the Travis s390x on my forked repository.
I tested. I sent a PR to add the s390x again.
https://github.com/ruby/ruby/pull/9773
Merged. The s390x is added on Travis again.
Updated by jaruga (Jun Aruga) 9 months ago
It seems some s390x builds have still not started after 3 hours. I am asking Travis customer support.
https://app.travis-ci.com/github/ruby/ruby/builds
https://app.travis-ci.com/github/ruby/ruby/builds/269093276
https://app.travis-ci.com/github/ruby/ruby/builds/269093679
https://www.traviscistatus.com/ - Builds Processing - IBM Z Builds shows operational (green).
Updated by jaruga (Jun Aruga) 9 months ago
We are seeing that one s390x build [1] is very slow, exceeding the maximum timeout of 50 minutes, with the make test-all step alone taking 1515 seconds (about 25 minutes). So far this is the only such build. This behavior is not normal, because the next build [2] takes 28 minutes 40 seconds in total, with the make test-all step taking 683 seconds (about 11 minutes). I suspect this may come from a specific slow machine, and I am asking Travis support about this issue.
[1] https://app.travis-ci.com/github/ruby/ruby/jobs/618214295#L2094
[2] https://app.travis-ci.com/github/ruby/ruby/jobs/618215618#L2262
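As a quick check of the figures above, here is a minimal shell sketch. The helper name to_minutes is hypothetical and not part of any Ruby or Travis tooling; it only converts the make test-all timings from the job logs into whole minutes and compares the two builds.

```shell
# Hypothetical helper: convert seconds to whole minutes (truncated).
to_minutes() {
  echo $(( $1 / 60 ))
}

to_minutes 1515   # slow build [1]: prints 25
to_minutes 683    # normal build [2]: prints 11

# make test-all time of build [1] as a percentage of build [2]: prints 221
echo $(( 1515 * 100 / 683 ))
```

So the slow build spent roughly 2.2 times as long in make test-all as the normal one.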
Updated by jaruga (Jun Aruga) 9 months ago
jaruga (Jun Aruga) wrote in #note-19:
We are seeing that one s390x build [1] is very slow, exceeding the maximum timeout of 50 minutes, with the make test-all step alone taking 1515 seconds (about 25 minutes). So far this is the only such build. This behavior is not normal, because the next build [2] takes 28 minutes 40 seconds in total, with the make test-all step taking 683 seconds (about 11 minutes). I suspect this may come from a specific slow machine, and I am asking Travis support about this issue.
[1] https://app.travis-ci.com/github/ruby/ruby/jobs/618214295#L2094
[2] https://app.travis-ci.com/github/ruby/ruby/jobs/618215618#L2262
Today I found another s390x build exceeding the maximum timeout of 50 minutes. Interestingly, its make test-all step took 588 seconds (about 9 minutes), which is normal.
https://app.travis-ci.com/github/ruby/ruby/jobs/618265449#L2095
But it seems that a freeze happened in the make test-spec step.
https://app.travis-ci.com/github/ruby/ruby/jobs/618265449#L3079
Updated by jaruga (Jun Aruga) 9 months ago
jaruga (Jun Aruga) wrote in #note-19:
We are seeing that one s390x build [1] is very slow, exceeding the maximum timeout of 50 minutes, with the make test-all step alone taking 1515 seconds (about 25 minutes). So far this is the only such build. This behavior is not normal, because the next build [2] takes 28 minutes 40 seconds in total, with the make test-all step taking 683 seconds (about 11 minutes). I suspect this may come from a specific slow machine, and I am asking Travis support about this issue.
[1] https://app.travis-ci.com/github/ruby/ruby/jobs/618214295#L2094
[2] https://app.travis-ci.com/github/ruby/ruby/jobs/618215618#L2262
Travis support told me in the message below that their engineers were able to check this issue.
Thanks so much for your patience here.
Our engineers were able to check on this and you should be able to see your builds are now running. Very sorry for the trouble and we will continue to monitor this!
Updated by Eregon (Benoit Daloze) 9 months ago
FYI mspec has a --timeout SECONDS option, which should help identify which spec is hanging/very slow.
Updated by jaruga (Jun Aruga) 9 months ago
Eregon (Benoit Daloze) wrote in #note-22:
FYI mspec has a --timeout SECONDS option, which should help identify which spec is hanging/very slow.
OK. Thanks for the tip!
Updated by jaruga (Jun Aruga) 9 months ago
We have observed unstable Travis ppc64le/s390x pipelines, so I added allow_failures to those pipelines in the PR https://github.com/ruby/ruby/pull/10158.
ppc64le
We have seen the following error 10 or more times in the last 1 or 2 days.
- https://app.travis-ci.com/github/ruby/ruby/builds/269211469
- https://app.travis-ci.com/github/ruby/ruby/builds/269204073
- https://app.travis-ci.com/github/ruby/ruby/builds/269211464
No output has been received in the last 10m0s, this potentially indicates a stalled build or something wrong with the build itself.
Check the details on how to adjust your build configuration on: https://docs.travis-ci.com/user/common-build-problems/#build-times-out-because-no-output-was-received
The build has been terminated
s390x
The following error happened without any output.
Updated by jaruga (Jun Aruga) 9 months ago
I found the following information on https://www.traviscistatus.com/ . Travis CI is undergoing maintenance in the week of 27 Feb - 5 Mar.
Back-end maintenance 27-Feb to 5-Mar
Update - Build status on GitHub works. Builds triggered from GitLab, BitBucket and Assembla operational. Next updates on Feb-29.
Feb 28, 2024 - 12:42 UTC
Update - Be advised: Build statsues are not passed back to GitHub after build is executed. Triggering builds from GitLab, BitBucket and Assembla not available. We are in progress with maintenance activities.
Feb 28, 2024 - 11:36 UTC
In progress - Scheduled maintenance is currently in progress. We will provide updates as necessary.
Feb 27, 2024 - 08:00 UTC
Update - Reminder: Travis CI will be undergoing a maintenance in a week of 27/Feb - 5/Mar. There may be intermittent service detoration, particularly on Feb 28th.
Feb 27, 2024 08:00 - Mar 5, 2024 08:00 UTC
Scheduled - Travis CI will be undergoing a maintenance in a week of 27/Feb - 5/Mar. We will do all that we can to not interrupt the service during this period. If you spot erratic or deteriorated service behavior please report back to our support.
Feb 27, 2024 08:00 - Mar 5, 2024 08:00 UTC
Updated by jaruga (Jun Aruga) 9 months ago
- Related to Misc #20320: Using OSU Open Source Lab native ppc64le/s390x CI services trigged on pull-requests added
Updated by jaruga (Jun Aruga) 8 months ago
jaruga (Jun Aruga) wrote in #note-21:
jaruga (Jun Aruga) wrote in #note-19:
We are seeing that one s390x build [1] is very slow, exceeding the maximum timeout of 50 minutes, with the make test-all step alone taking 1515 seconds (about 25 minutes). So far this is the only such build. This behavior is not normal, because the next build [2] takes 28 minutes 40 seconds in total, with the make test-all step taking 683 seconds (about 11 minutes). I suspect this may come from a specific slow machine, and I am asking Travis support about this issue.
For the slow s390x build issue, I received the following reply from Travis support on 1st March 2024.
Our Infra team has resolved the issue you encountered. In case it resurfaces, please reach back and we will gladly help.
Updated by jaruga (Jun Aruga) 8 months ago
I noticed the following announcement about maintenance happening this Wednesday, 6th March. So I plan to add allow_failures to ruby/ruby's arm64 and arm32 cases too before the maintenance. Ideally, I hope Travis can carry out the maintenance without stopping their service.
https://app.travis-ci.com/github/ruby/ruby
Please note: Travis CI is undergoing maintenance. On March 6 , between 08:00-12:00 UTC+0 service may be temporarily unavailable.
Updated by jaruga (Jun Aruga) 8 months ago
jaruga (Jun Aruga) wrote in #note-28:
I noticed the following announcement about maintenance happening this Wednesday, 6th March. So I plan to add allow_failures to ruby/ruby's arm64 and arm32 cases too before the maintenance. ...
I sent the PR for that.
https://github.com/ruby/ruby/pull/10180
Updated by jaruga (Jun Aruga) 8 months ago
jaruga (Jun Aruga) wrote in #note-29:
jaruga (Jun Aruga) wrote in #note-28:
I noticed the following announcement about maintenance happening this Wednesday, 6th March. So I plan to add allow_failures to ruby/ruby's arm64 and arm32 cases too before the maintenance. ...
I sent the PR for that.
https://github.com/ruby/ruby/pull/10180
As it seems that the maintenance is finished, I reverted the commit above.
https://github.com/ruby/ruby/pull/10186
Updated by jaruga (Jun Aruga) 8 months ago · Edited
For your information, I saw the following ppc64le job fail to start 10 days ago and contacted Travis support at that time. I am still waiting for a fix, though I haven't found any other failures in the last few days.
https://app.travis-ci.com/github/ruby/ruby/jobs/619005133
No output has been received in the last 10m0s, this potentially indicates a stalled build or something wrong with the build itself.
Check the details on how to adjust your build configuration on: https://docs.travis-ci.com/user/common-build-problems/#build-times-out-because-no-output-was-received
The build has been terminated
Updated by jaruga (Jun Aruga) 5 months ago
I noticed the following message is displayed on our Travis page. I will contact Travis support.
https://app.travis-ci.com/github/ruby/ruby
We are unable to start your build at this time. You exceeded the number of users allowed for your plan. Please review your plan details and follow the steps to resolution.
Updated by jaruga (Jun Aruga) 5 months ago
jaruga (Jun Aruga) wrote in #note-32:
I noticed the following message is displayed on our Travis page. I will contact Travis support.
https://app.travis-ci.com/github/ruby/ruby
We are unable to start your build at this time. You exceeded the number of users allowed for your plan. Please review your plan details and follow the steps to resolution.
Travis support responded quickly and fixed the issue, which removed the message. I can see that Travis builds have been running again since 2 hours ago.
However, I still see the message on the Travis page below for my fork of ruby, so I am asking Travis to enable builds for forks of ruby too. Previously I was able to run Travis builds in my forked repository as well.
https://app.travis-ci.com/github/junaruga/ruby/
Updated by jaruga (Jun Aruga) 5 months ago
Travis support enabled my forks of ruby/ruby, ruby/zlib and ruby/prism, where Travis is used. It seems that we need to submit a list of GitHub accounts to enable Travis for forks of the ruby/* repositories. Maybe they recently changed how Travis is enabled.
I am working to get the list of the GitHub accounts that contributed to ruby/ruby, ruby/zlib and ruby/prism in the last 2 years, and will submit the list to Travis support to enable Travis for their forked repositories.
Updated by jaruga (Jun Aruga) 5 months ago
I am working to get the list of the GitHub accounts that contributed to ruby/ruby, ruby/zlib and ruby/prism in the last 2 years, and will submit the list to Travis support to enable Travis for their forked repositories.
I am working to get the list of the GitHub accounts. For example, 496 people contributed to the ruby/ruby master branch in the last 2 years, and I want to get their GitHub accounts. If you know an easy way to get the list, please let me know.
$ git remote get-url origin
https://github.com/ruby/ruby.git
$ git branch | grep ^*
* master
$ git shortlog --summary --numbered --since="2022-06-27" | head
2238 Nobuyoshi Nakada
1397 Takashi Kokubun
1352 Kevin Newton
1133 Hiroshi SHIBATA
697 Peter Zhu
508 git[bot]
406 David Rodríguez
302 Alan Wu
252 yui-knk
242 Jemma Issroff
$ git shortlog --summary --numbered --since="2022-06-27" | wc -l
496
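One partial approach, sketched under assumptions: contributors who commit with GitHub's privacy address use author emails of the form ID+login@users.noreply.github.com (or the older login@users.noreply.github.com), so their GitHub login can be read from the commit metadata. This only covers people who use such addresses; the rest would need another source, such as GitHub's own commit data. The function name github_logins_from_emails is hypothetical.

```shell
# Hypothetical helper: read author emails on stdin and print the GitHub login
# for each address on GitHub's noreply domain, de-duplicated and sorted.
# Addresses on any other domain are skipped.
github_logins_from_emails() {
  sed -n -E 's/^([0-9]+\+)?([^@+]+)@users\.noreply\.github\.com$/\2/p' | sort -u
}

# Possible usage inside a clone of ruby/ruby, with the same date range as above:
#   git log --since="2022-06-27" --format='%ae' | github_logins_from_emails
```

Comparing the number of logins this prints against the 496 names from git shortlog would show how many accounts still have to be found some other way.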
Updated by jaruga (Jun Aruga) 5 months ago
As a note, because of the Travis change above, people can still run Travis by sending a pull request to the ruby/* repositories, but right now they cannot run Travis by pushing commits to branches in their own forks of the ruby/* repositories.
Updated by jaruga (Jun Aruga) 3 months ago
I am seeing Travis infra errors. They have been occurring since at least 13th August 2024 and are still ongoing. Currently the errors only happen in the ppc64le/s390x cases. I will contact Travis support.
13th August 2024
https://app.travis-ci.com/github/ruby/prism/builds/271850178
https://app.travis-ci.com/github/ruby/prism/builds/271851132
Updated by jaruga (Jun Aruga) 3 months ago
Sent a PR to allow failures for ppc64le/s390x on ruby/prism.
https://github.com/ruby/prism/pull/3005
Updated by jaruga (Jun Aruga) 3 months ago
jaruga (Jun Aruga) wrote in #note-37:
I am seeing Travis infra errors. They have been occurring since at least 13th August 2024 and are still ongoing. Currently the errors only happen in the ppc64le/s390x cases. I will contact Travis support.
13th August 2024
https://app.travis-ci.com/github/ruby/prism/builds/271850178
https://app.travis-ci.com/github/ruby/prism/builds/271851132
I emailed Travis support about this issue.
Updated by jaruga (Jun Aruga) 2 months ago
Right now I am seeing Travis arm64 (and arm32) jobs take around 7 hours to start. Therefore I will allow failures for these cases to avoid waiting for the jobs.
https://app.travis-ci.com/github/ruby/ruby/builds/272105049
https://app.travis-ci.com/github/ruby/ruby/builds/272105708
This is the PR. Sorry for the inconvenience.
https://github.com/ruby/ruby/pull/11509
Updated by jaruga (Jun Aruga) 2 months ago
None of the jobs in the Travis partner (non-x86_64) pipelines are starting. So I will drop all those pipelines in the 2nd commit of the following PR.
https://app.travis-ci.com/github/ruby/ruby/builds/272123937
https://github.com/ruby/ruby/pull/11509
Updated by jaruga (Jun Aruga) 2 months ago
So I will drop all those pipelines in the 2nd commit of the following PR.
Dropped Travis as a temporary workaround. The infra issue is still ongoing. Travis customer support replied with the following message.
Our team is actively looking into the reported issue and is working hard to find a solution. We will keep you posted on the progress.
Updated by jaruga (Jun Aruga) 2 months ago
We are dropping Travis CI for ruby/zlib and ruby/prism too, because the Travis infra issues have been ongoing since my last report.
The Travis CI status page is not accurate.
https://www.traviscistatus.com/ - Past Incidents
I still see jobs that do not start and show empty output.
https://app.travis-ci.com/github/ruby/prism/builds/272244955
Updated by jaruga (Jun Aruga) 2 months ago
I see a case where the jobs are running and showing output (and failing as expected).
https://app.travis-ci.com/github/ruby/ruby/builds/272246788
Updated by jaruga (Jun Aruga) 19 days ago
I received the following email from Travis support on 19th October.
Could you please try to trigger a build once again when you are ready and let us know if this issue now appears to be resolved?
We will follow up on your reply.
I am starting to test Travis on my forked repository.
Updated by jaruga (Jun Aruga) 18 days ago
We enabled Travis CI again (with allow_failures enabled) in the PR https://github.com/ruby/ruby/pull/11948, which resolves https://bugs.ruby-lang.org/issues/20810.
Updated by jaruga (Jun Aruga) 14 days ago
I sent the following PRs to enable Travis CI for ruby/prism and ruby/zlib again.
Updated by jaruga (Jun Aruga) 5 days ago · Edited
Sorry, I see Travis arm64 jobs are not starting or are taking a long time to start. They get stuck in the "Booting virtual machine" step.
https://app.travis-ci.com/github/ruby/prism/jobs/627841655
I sent a PR to drop Travis arm64 as a temporary workaround.
https://github.com/ruby/ruby/pull/12024
I am asking Travis support about it.