Description
Start 60 independent parallel act_runners (as opposed to a single runner with parallel jobs enabled):
```yaml
# docker-compose.yml
services:
  runner:
    build: .  # this Dockerfile adds self-signed certs so act_runner is able to connect
    environment:
      - GITEA_INSTANCE_URL=  # your Gitea instance to register to
      - GITEA_RUNNER_REGISTRATION_TOKEN=  # the Gitea registration token
      - GITEA_RUNNER_LABELS=label  # the labels of your runner (comma separated)
    user: root
    deploy:
      mode: replicated
      replicas: 60  # <--- required when the runners are on a different host than the Gitea server; single-runner setups do not seem to be affected
```
Now create a large 10x10 matrix that queues 100 jobs at once (see the workflow sketch below).
Notice that only 1-10 jobs, seemingly at random, get assigned to runners.
The remaining 90-99 jobs keep waiting.
Once the runners that make progress finish those jobs, they set their taskversion to 0 and get a new one.
Each job runs sleep 60 to keep the working runners busy for some amount of time.
Notice that the old workflow run may continue to queue new jobs to runners even though it should have been stopped by concurrency = 1.
This could be a side effect of a database / request timeout.
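For reference, a minimal sketch of the kind of workflow used here; the file name, matrix values and the concurrency block are illustrative assumptions, not taken from the report:

```yaml
# .gitea/workflows/matrix-load.yml (hypothetical reproduction workflow)
name: matrix-load
on: [push]
concurrency:
  group: matrix-load          # assumed way of expressing "concurrency = 1" per workflow
jobs:
  load:
    runs-on: label            # must match GITEA_RUNNER_LABELS above
    strategy:
      matrix:
        a: [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
        b: [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]   # 10x10 = 100 jobs queued at once
    steps:
      - run: sleep 60         # keeps every assigned runner busy for a while
```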
Observed internal behavior
FetchTask might return no job under load, even if jobs are available, instead of returning an error.
If this happens once, the taskversion gets updated and no pickTask calls happen until the taskversion is incremented again by new jobs.
About 50 of the 60 runners directly update their taskversion to the latest value even though jobs are still queued.
In this scenario "Re-run all jobs" returns HTTP 500, probably due to a database/request timeout.
All other Gitea features keep working.
Workaround
Patch act_runner to always send taskversion 0 to force querying the database, and set the fetch timeout to 50; then all runners got a queued job assigned (see the config sketch below).
More CPU power and more RAM for the database and Gitea works as well; tested on an M4 Pro Mac + SQLite.
Possible alternative (untested): use a single act_runner to delegate resources to other machines.
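A minimal sketch of the configuration side of that workaround, assuming the standard act_runner config.yaml keys; the 50s value mirrors the number above, and always sending taskversion 0 still requires a patch to the act_runner source:

```yaml
# act_runner config.yaml (sketch; only the relevant keys shown)
runner:
  fetch_timeout: 50s   # raised fetch timeout so FetchTask long-polls much longer
  fetch_interval: 2s   # how often the runner polls Gitea for new jobs
# Forcing taskversion 0 on every FetchTask call is not configurable here;
# it needs a change in the act_runner source.
```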
I'm not aware of other reports of this here; I'm still debugging and trying to understand why Gitea sometimes claims that no new job is available even though there are clearly tens of them.
I'm not planning to run the tests against the demo site, to avoid stress testing its resources.
EDIT
Update: with a MacBook Pro M4 as the Gitea server I got 20 parallel jobs using SQLite during debugging.
Now I need to set breakpoints to find out why FetchTask returns no error and no job.
EDIT
Need to collect more information...
EDIT
The first edit is obsolete; a more powerful device solves this problem as well.
So the good path works perfectly fine, but there must be a bad path with degraded performance as well
EDIT
Possible root cause is here: concurrent job assignments are forwarded as "no more jobs" instead of an error, in gitea/models/actions/task.go, lines 317 to 320 (commit 09a3b07).
Gitea Version
1.23.1
Can you reproduce the bug on the Gitea demo site?
No
Log Gist
No response
Screenshots
No response
Git Version
No response
Operating System
Ubuntu 22.04 arm64
How are you running Gitea?
Docker image on a Raspberry Pi 4 8GB; depending on database performance, more parallel runners might be needed to see something similar.
Database
MySQL/MariaDB