-
Notifications
You must be signed in to change notification settings - Fork 557
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We鈥檒l occasionally send you account related emails.
Already on GitHub? Sign in to your account
馃挜 engine: Remove --oci-max-parallelism
#7395
Conversation
--oci-max-parallelism
c533cee
to
3e145f0
Compare
This is not an option that we should support. Justin puts it best: I think with (Dagger) Modules, this effectively means that `max-parallelism` is pretty much just completely broken - it just fits very badly with the idea of nested containers. ... but to me this kind of highlights that we need a better way of constraining resources - max-parallelism is just really not fit for the shape that pipelines tend to take these days. Maybe we should provide some cgroup config options instead (if the end desire is to constrain CPU usage/etc). While this can still be configured if a custom Engine toml config is provided, it means that users went out of their way to have this. It's hard on purpose and we don't recommend using this setting anymore. This also removes it from the Helm chart. Signed-off-by: Gerhard Lazu <gerhard@dagger.io>
3e145f0
to
5bc265f
Compare
Speaking with @aluzzardi, these checks may have failed due to a ClickHouse migration which killed some in-flight traces. Rebasing & force pushing so that we force re-run all failed checks. |
If possible, I'd like to hold off on merging this. If I have the time, I think keeping this option would be best for people using it, but we'd need the improved semantics previously described. But we can keep this open at least as a tracking issue.
IMO, we should do this as soon as possible - it's definitely a potential candidate for some of the weird ephemeral CI failures we've seen. With this option it's possible to get every running test is a stuck state (if the right combination of tests all start at the right time) - which prevents any progress being made (which sounds like a symptom we've seen before). |
Agree with @jedevc, this option is used already by multiple users and while it's totally buggy once modules are used with too low a value, that feels more like a bug than enough justification to remove this entirely.
I'd totally be in favor of adding a scary warning message to the help for this though and tracking a long term fix to the problem.
|
This PR is stale because it has been open 14 days with no activity. Remove stale label or comment or this will be closed in 7 days. |
--oci-max-parallelism
--oci-max-parallelism
--oci-max-parallelism
--oci-max-parallelism
While we are keeping this default so that we don't break existing users, setting this value is something that we want to move away from. The problem is that this setting limits how many operations can run in parallel. It is still possible for a single operation to max out all available cores. It is also known for a value of `2` to cause deadlocks, i.e. dagger#6894 For now, we just allow this to be disabled with either `--set engine.args=''` or by explicitly setting this value to an empty list. This started as dagger#7395 which turned out to be too big of a change. We since scaled back the initial ambition & are taking a smaller step towards eventually phasing this out. FWIW, all the Dagger Engines that we run inside the Dagger infra do not use the `--oci-max-parallelism` option. This also removes the option from tekton-dagger-task docs example. Signed-off-by: Gerhard Lazu <gerhard@dagger.io>
While we are keeping this default so that we don't break existing users, setting this value is something that we want to move away from. The problem is that this setting limits how many operations can run in parallel. It is still possible for a single operation to max out all available cores. It is also known for a value of `2` to cause deadlocks, i.e. #6894 For now, we just allow this to be disabled with either `--set engine.args=''` or by explicitly setting this value to an empty list. This started as #7395 which turned out to be too big of a change. We since scaled back the initial ambition & are taking a smaller step towards eventually phasing this out. FWIW, all the Dagger Engines that we run inside the Dagger infra do not use the `--oci-max-parallelism` option. This also removes the option from tekton-dagger-task docs example. Signed-off-by: Gerhard Lazu <gerhard@dagger.io>
Fixes #6894
This is not an option that we should support. Justin puts it best:
While this can still be configured if a custom Engine toml config is provided, it means that users went out of their way to have this. It's hard on purpose and we don't recommend using this setting anymore.
This also removes it from the Helm chart.
WDYT @sipsma @jedevc @aluzzardi @vito ?
@matipan if everyone is in agreement with this change, we should remove this from all our infra. We can do that before the release goes out.