Che 7 startup should be faster on che.openshift.io #1011

ibuziuk · 2018-10-24T12:38:19Z

Info about Che 7 workspace startup (ephemeral, but without eclipse-che/che#11786 so normal PVC is still used for broker):

Broker start: ~7.5 seconds average
Pulling images: ~100 seconds on average (che-dev, che-machine-exec, che-hello, che-theia).
Start Theia: ~10 seconds, sometimes quite a bit longer
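
As a rough back-of-the-envelope reading of the averages above, the image pull accounts for most of the startup time. This is only a sketch: it assumes the three phases run strictly one after another, which the measurements here do not state explicitly.

```python
# Average phase durations in seconds, copied from the figures above.
PHASES = {
    "broker_start": 7.5,
    "image_pull": 100.0,
    "theia_start": 10.0,
}


def share_of_total(phases: dict[str, float], name: str) -> float:
    """Return the fraction of the summed duration taken by one phase."""
    total = sum(phases.values())
    return phases[name] / total


# Assumption: phases are sequential, so the total is a plain sum.
total = sum(PHASES.values())
without_pull = total - PHASES["image_pull"]

print(f"total: {total:.1f}s")
print(f"image pull share: {share_of_total(PHASES, 'image_pull'):.0%}")
print(f"without image pull: {without_pull:.1f}s")
```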

Areas for improvement:

storage
- ~~Need to improve time required for initial git clone after startup of Che 7 workspace on osio #1012~~
network
- Remove unnecessary authentication steps in client-side IDE loading on openshift.io #1013
- Make Theia webpack-based bundle packaging customizable #1065
- Customize the Che-Theia bundle packaging and main html file to use CDNs with fallback #1066
- Automatic push of Che-theia bundles files to a chosen CDN on Docker push (CI job ?) #1068
- Analyse the required actions to also enable caching of VSCode/Monaco-related JS files #1070
- Pre-Fetch fixed CDN URLs of IDE bundles in the workspace loader #1071
- Is it possible to separate client-side and server-side Theia IDE load?
image pulling
- Set up a local docker registry in the user namespace, similar to che-plugin-registry, and compare performance with quay / dockerhub. Pod startup needs to be compared, not only pulling time (the pull can be fast while extraction is slower) - Need to compare Che 7 workspace startup time when using images from docker registry deployed in the user namespace in comparison with dockerhub / quay #1014
- Image pulling for predefined images should always be fast on che.openshift.io #1056
- Investigate what causes fast image pulls on OSIO - Investigate image caching on OSIO #1059
- Che 7: Need to use distroless images (or FROM scratch) for brokers, editors and plugins in order to reduce image size and speed-up pulling #1061
- Need to carry out a test in order to get information about the time required for pod startup if image streams from the user namespace are used in deployment configs created in *-che namespace #1084
- Need to move all che 7 images for brokers, editors, plugins from dockerhub to quay.io (TODO)
loading sequence
- Test if starting Theia container first speeds up startup - Test if changing container order in workspace deployment speeds up start #1062
- Che 7: known plugin brokers should be run not inside *-che namespace, but as services like plugin-registry / che server #1067
- Split current functionality of che plugin broker into 2 different applications - service & init container #1069
- Che 7: Theia IDE should be loaded first during workspace startup #1058
- Allow to specify in a plugin meta link to a che-plugin.yaml when having a separate archive is not needed #1073
- Finish making a workspace run without any user env #1074

Noteworthy comment from @amisevsk:

To add to the Che 7 part: Without image pull times, starting the back end components could be done in around 15 seconds without changing anything -- the charts here show that without the pulls, the only significant action is Theia start. Note also that the pinkish block (plugin broker) should already be much shorter since the majority of the wait there is for the PVC mount, which is fixed by eclipse-che/che#11657


l0rd · 2018-11-09T18:06:58Z

@ibuziuk @garagatyi @gorkem @amisevsk copying here a comment from #1012 (comment) about what I think should be the next things to look at. We could use this epic to discuss new ideas to improve Che 7 startup on OSIO:

1. Images pulling: @amisevsk's analysis shows that this is by far our bottleneck. The good news is that pulling an image is, sometimes, really fast. If we understand why pulling is fast, we may be able to make it always fast! We need to continue looking at it: it's critical.
2. Load plain Theia first: we currently start Theia after all brokers have been pulled and run, all plugins have been pulled, and the containers created. We need to make it faster. We need to have Theia loaded in the user's browser first and, in a second step, start all the rest. One idea to achieve this is to create a first k8s deployment with only one pod/container/service (Theia) and, once it has started, start a rolling update of it with a new deployment definition that has all the pods/containers/services for the plugins. That's just an idea and I am sure there are other ways to achieve that, but the goal should be clear: starting a workspace should not take more than starting standalone Theia (5s or so).

Point 1. is specific to OSIO, 2. is not. It's an upstream improvement and would dramatically improve Che UX.

Another thing that we may still look at is reducing image size. This can be achieved using distroless images (or FROM scratch) for brokers, editors and plugins. This has lower priority because if we fix 1. and 2. we would not download the images on OSIO. But anyway it's a quick win, could be a stopgap until 1. and 2. get addressed, and would have a great benefit for local Che bootstrap.

garagatyi · 2018-11-10T07:51:04Z

@l0rd loading the IDE first can indeed improve the experience, but we should implement it in a natural way that is described by the tooling configs. Otherwise, we would end up with a lot of hardcoding that doesn't respect other editor implementations.

Here are other ideas on how we can improve WS start time.
We still use a broker to evaluate the Theia sidecar, and we can improve that. If a plugin/editor has only che-plugin.yaml in the archive, we can point to it in meta.yaml instead of pointing to the archive with this file. In this case, Che master can resolve the sidecar configuration even before it launches brokers.
Another improvement would be to add a flag to meta.yaml indicating whether the plugin supports running as an init container, i.e. it doesn't change the workspace configuration and doesn't need to be processed before IDE start (the IDE might or might not allow adding plugins after it starts). This is the case with plain Theia plugins.
With these improvements:

- plain Theia plugins are brokered in an init container broker
- Theia meta.yaml points to a che-plugin.yaml and doesn't need a running broker to evaluate its config

What's left is remote plugins, such as JDT.LS.
JDT.LS:
- adds a sidecar to the workspace config
- changes environment variables set in all the containers in the workspace, so requires that these env vars be applied to all the sidecars
- requires unarchiving the .theia zip archive to get workspace config changes

To run it after Theia, we would indeed need to restart deployments after the broker evaluates its influence on the workspace.
But this part would be hard to implement and could be more complex than implementing syncing.
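
The split described above could be sketched roughly as follows. The field names (`che_plugin_yaml_url`, `init_container_capable`) and the mode strings are hypothetical placeholders, not part of any existing meta.yaml schema. The point is only to show how Che master might decide, per plugin, whether a broker has to run before the editor starts.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class PluginMeta:
    """Simplified view of a plugin's meta.yaml (hypothetical fields)."""
    plugin_id: str
    archive_url: Optional[str] = None          # archive containing che-plugin.yaml
    che_plugin_yaml_url: Optional[str] = None  # hypothetical: direct link to che-plugin.yaml
    init_container_capable: bool = False       # hypothetical: the flag proposed above


def brokering_mode(meta: PluginMeta) -> str:
    """Decide how a plugin would be processed under the proposal."""
    if meta.init_container_capable:
        # Plain Theia plugins do not change the workspace config,
        # so they could be brokered in an init container.
        return "init-container"
    if meta.che_plugin_yaml_url is not None:
        # The sidecar config can be resolved without launching a broker.
        return "resolved-by-master"
    # Remote plugins such as JDT.LS still need a broker before start.
    return "broker-before-start"


plugins = [
    PluginMeta("plain-theia-plugin", init_container_capable=True),
    PluginMeta("theia-editor", che_plugin_yaml_url="https://example.com/che-plugin.yaml"),
    PluginMeta("jdt-ls", archive_url="https://example.com/jdt-ls.tar.gz"),
]
modes = {p.plugin_id: brokering_mode(p) for p in plugins}
```

Under these assumptions only the last plugin would still block workspace start on a broker run.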

The simplest implementation of a sync sidecar can be:

- implement a new volume strategy similar to the ephemeral one, but one that respects a flag from the sync sidecar saying that it needs a real volume
- implement adding the sidecar as a container in the single deployment and share ephemeral volumes, so it can sync them locally with a real gluster volume using local rsync
- this doesn't block us from moving sidecars to separate deployments if they do not require sharing project sources or other volumes
- this doesn't allow us to separate everything into deployments instead of containers in a single pod. But we have this limitation now.

The next step would be a sync sidecar that can run in a separate deployment and sync over the network.
To implement that, we could add a sidecar container with an rsync daemon to every tooling or user pod that needs a volume, and let a separate master rsync sidecar connect to those slaves to sync files to a gluster volume attached to the master rsync sidecar.
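
For the simplest variant (sidecar in the same pod, local rsync), the pod spec could look roughly like the sketch below. The image names, mount paths and container names are placeholders invented for illustration. The rsync invocation shows a single sync pass only, whereas a real sidecar would repeat it periodically or on workspace stop.

```python
def sync_sidecar_pod_spec(pvc_name: str) -> dict:
    """Build a pod spec with an ephemeral projects volume and a sync sidecar.

    All images and paths are hypothetical placeholders.
    """
    # Ephemeral volume shared by the editor and the sync sidecar.
    projects = {"name": "projects", "emptyDir": {}}
    # Real (e.g. gluster-backed) volume, mounted only by the sync sidecar.
    backing = {"name": "backing", "persistentVolumeClaim": {"claimName": pvc_name}}

    # One sync pass from the ephemeral volume to the real one.
    sync_cmd = ["rsync", "-a", "--delete", "/projects/", "/backing/projects/"]

    return {
        "volumes": [projects, backing],
        "containers": [
            {
                "name": "theia",
                "image": "example/theia:placeholder",
                "volumeMounts": [{"name": "projects", "mountPath": "/projects"}],
            },
            {
                "name": "sync",
                "image": "example/rsync:placeholder",
                "command": sync_cmd,
                "volumeMounts": [
                    {"name": "projects", "mountPath": "/projects"},
                    {"name": "backing", "mountPath": "/backing"},
                ],
            },
        ],
    }


spec = sync_sidecar_pod_spec("claim-che-workspace")
```

Only the sync container mounts the persistent volume claim, so the editor container never waits on the slow volume directly.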

benoitf · 2018-11-12T07:09:05Z

@garagatyi about init container and broker.
If I run the same workspace again, without having changed anything in my workspace config, will the broker redo everything? Or is the computed information stored somewhere and re-used directly (no need to run a broker)?

garagatyi · 2018-11-12T08:11:50Z

For the time being, we do re-run brokering on each restart of a workspace. We can implement this approach to speed up the start, but there are things that we need to consider before we start coding that. Since an archive can change without its URL changing, we should do something about that. Checking a checksum (or even the archive size) is probably not the best approach because it would require downloading the whole archive to evaluate it.
We can declare that any change in a plugin should be made by updating the plugin in the Che registry, otherwise we would not re-run brokering. Does this policy make sense?
Another problem is that something (user?!) can corrupt plugin files, and a user needs the ability to trigger a brokering restart to fix that.
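
One way to picture that policy, as a sketch under assumptions: the cache key is derived only from the plugin ids and versions published in the registry (never from archive contents, so nothing has to be downloaded), and a user-triggered flag forces re-brokering. The function names and the key format are hypothetical.

```python
import hashlib
from typing import Optional


def brokering_cache_key(plugins: list[tuple[str, str]]) -> str:
    """Hash the sorted (plugin id, version) pairs of a workspace."""
    canonical = "\n".join(f"{pid}:{ver}" for pid, ver in sorted(plugins))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


def needs_brokering(
    plugins: list[tuple[str, str]],
    cached_key: Optional[str],
    force: bool = False,
) -> bool:
    """Return True when brokers have to run again for this workspace."""
    if force or cached_key is None:
        # First start, or the user asked for a brokering restart
        # (for example after plugin files were corrupted).
        return True
    # Re-run only when the plugin set or a version changed in the registry.
    return brokering_cache_key(plugins) != cached_key


current = [("theia-editor", "1.0.0"), ("jdt-ls", "0.0.1")]
stored_key = brokering_cache_key(current)
```

With this scheme an archive that changes behind an unchanged URL is not detected, which is exactly the trade-off the proposed policy accepts.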

l0rd · 2018-11-12T11:04:12Z

@garagatyi I think we agree that starting the editor as the first container (even before brokers) is the important goal. And your idea to have the meta.yaml point to the che-plugin.yaml directly is a good one. But I am a little bit lost with your other proposals: the sync sidecar and init container don't look simple.

I am still convinced that adding a "fast startup" phase where wsmaster starts the workspace pod with only the editor container (no plugins, no brokers, no sidecars) makes things really simple. Kubernetes rolling update allows a seamless transition from the first workspace pod (with a bare theia) to the second pod (with theia, theia plugins and sidecars).

garagatyi · 2018-11-12T11:30:21Z

@l0rd Starting the editor first would probably involve a major refactoring of the code and flow, because it would make workspace start a 3-phase flow rather than a 2-phase one:

1. Start editor configuration
2. Start brokers
3. Start workspace

Even though from a k8s standpoint updating deployment is simple we don't have this flow in Che workspace start flow and we would need some time to implement this flow. I don't think it is a trivial task.

l0rd · 2018-11-12T11:40:23Z

@garagatyi but you don't need to change the current 2 phases right? You just need to add a new one. The existing phases should remain untouched.

garagatyi · 2018-11-12T12:57:46Z

@l0rd If we want to roll out the workspace properly, we would need to change the code. To change Theia container config we either need to roll it out or delete deployment, wait it gets deleted, create the new one.

l0rd · 2018-11-12T13:48:30Z

> To change Theia container config we either need to roll it out or delete deployment, wait it gets deleted, create the new one.

RollingUpdate is the default StrategyType of a Kubernetes Deployment. You do not need to delete the previous deployment, wait, and then create a new one: Kubernetes does it for you.
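
To make the idea concrete, here is a minimal sketch of the two Deployment definitions, built as plain dictionaries. The deployment name, labels and images are placeholders, and nothing here reflects how Che master actually builds its objects. The only point is that both phases share the same name and selector, so applying the second definition changes the pod template and triggers a rollout instead of requiring a delete and re-create.

```python
def workspace_deployment(name: str, containers: list[dict]) -> dict:
    """Build a Deployment manifest for a workspace (hypothetical layout)."""
    labels = {"app": name}
    return {
        "apiVersion": "apps/v1",
        "kind": "Deployment",
        "metadata": {"name": name},
        "spec": {
            "replicas": 1,
            # Spelled out for clarity; RollingUpdate is already the default.
            "strategy": {"type": "RollingUpdate"},
            "selector": {"matchLabels": labels},
            "template": {
                "metadata": {"labels": labels},
                "spec": {"containers": containers},
            },
        },
    }


# Placeholder container definitions.
theia = {"name": "theia", "image": "example/theia:placeholder"}
sidecar = {"name": "plugin-sidecar", "image": "example/sidecar:placeholder"}

# Phase 1: editor only, so the IDE can load as early as possible.
fast_start = workspace_deployment("workspace-abc", [theia])
# Phase 2: same name and selector, full container set.
full = workspace_deployment("workspace-abc", [theia, sidecar])
```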

garagatyi · 2018-11-12T13:57:03Z

@l0rd unfortunately, for that we need to edit the deployment, and we don't have this concept in Che master code. So we would still have to rework some code to allow this editing capability. We may try to save just the IDs and integrate them into a new deployment config, which would be similar to editing the deployment. But in this case, our autogenerated k8s service names and OS routes would probably change their names, which might not be tolerated by the client side. In any case, we will have to figure out how we can deal with it without rewriting a lot of code.

l0rd · 2018-11-12T18:09:51Z

@garagatyi I guess that re-using the same name for the deployment and theia routes will do the trick.

ibuziuk · 2019-11-08T09:08:28Z

Closing in favor of the upstream epic - eclipse-che/che#11476
Current data for the last month on Hosted Che:

average workspace startup: 34 seconds / 98.4 % of workspaces started faster than 60 seconds

ibuziuk added kind/epic Epic labels Oct 24, 2018

ibuziuk mentioned this issue Nov 13, 2018

Che 7: Theia IDE should be loaded first during workspace startup #1058

Closed

ibuziuk changed the title ~~Che 7 startup should be faster on OSIO~~ Che 7 startup should be faster on che.openshift.io Nov 13, 2018

davidfestal mentioned this issue Nov 14, 2018

Remove unnecessary authentication steps in client-side IDE loading on openshift.io #1013

Closed

garagatyi mentioned this issue Nov 14, 2018

How to cache results of a workspace provisioning eclipse-che/che#11936

Closed

ibuziuk removed the Epic label Feb 20, 2019

ibuziuk closed this as completed Nov 8, 2019
