Slow Rendering Performance (Astro is more than 10 times slower than Solid) #11454

Closed
1 task
axel-habermaier opened this issue Jul 11, 2024 · 13 comments · Fixed by #13195
Labels
- P2: nice to have Not breaking anything but nice to have (priority)

Comments

@axel-habermaier

axel-habermaier commented Jul 11, 2024

Astro Info

Astro                    v4.11.5
Node                     v20.14.0
System                   Linux (x64)
Package Manager          yarn
Output                   server
Adapter                  @astrojs/node
Integrations             @astrojs/tailwind
                         @astrojs/solid-js

If this issue only occurs in one browser, which browser is a problem?

server problem

Describe the Bug

Astro's rendering performance is more than 10x slower than Solid's rendering performance on the server.

For example, render the following example Astro component:

<div style="display:none">
  {
    new Array(50_000).fill(0).map((_, i) => (
      <div>
        Go to <a href={`/${i}`}>{i}</a>
      </div>
    ))
  }
</div>

vs. the following Solid component

export function SolidTest() {
  return (
      <div style="display:none">
        {new Array(50_000).fill(0).map((_, i) => (
          <div>
            Go to <a href={`/${i}`}>{i}</a>
          </div>
        ))}
      </div>
  );
}

used like so in another Astro component:

---
import { SolidTest } from "./SolidTest";
---
<SolidTest />

Note that the generated HTML is exactly the same for the first Astro component, which contains the markup itself, and the second Astro component, which defers all rendering to Solid (minus some superfluous newlines in Astro's HTML output, but that might be another bug/optimization opportunity).

Yet, when building (in production mode) and previewing the app in Node.js (using dynamic, on-the-fly page rendering), there is a stark performance difference. On my Ryzen 5800X, the Astro version takes almost a second to render, while the Solid version takes less than 100ms, making Astro more than 10 times slower.

I fully realize that for this simple example, it would be possible to prerender the HTML at build time for both cases. In my real world scenario, however, the pages do include dynamic data.

What's the expected result?

Astro should not be that much slower than Solid. Given that Solid has one of the fastest SSR renderers, some slowdown can be expected and is acceptable, but 10x slower is certainly unexpected. At least for me.

Link to Minimal Reproducible Example

Note that you might have to run the provided Stackblitz example locally for meaningful performance results.
https://stackblitz.com/edit/withastro-astro-n11u7w?file=src%2Fpages%2Fsolid.astro

Participation

  • I am willing to submit a pull request for this issue.
@github-actions github-actions bot added the needs triage Issue needs to be triaged label Jul 11, 2024
@axel-habermaier
Author

axel-habermaier commented Jul 11, 2024

An additional remark: Solid of course optimizes the JSX rendering through its Babel plugin, where they "pre-render" the static parts of the template to strings at build time. Looking at the generated code, however, it seems to me that Astro uses a similar optimization.

So what makes Astro so much slower by comparison? Is there some advanced feature compared to Solid or are there some implementation inefficiencies?
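To make the "pre-render the static parts to strings" idea concrete, here is a minimal sketch of what such hoisting amounts to. It is a hand-written illustration under my own assumptions, not the actual output of Solid's Babel plugin or of the Astro compiler; `STATIC_PARTS`, `renderRow` and `renderList` are hypothetical names.

```typescript
// Hypothetical illustration, not real compiler output.
// The static markup is hoisted into a module-level constant once;
// only the dynamic holes (the index `i`) are evaluated per row.
const STATIC_PARTS = ['<div>Go to <a href="/', '">', "</a></div>"] as const;

function renderRow(i: number): string {
  // Plain string concatenation: no wrapper objects are created per row.
  return STATIC_PARTS[0] + i + STATIC_PARTS[1] + i + STATIC_PARTS[2];
}

function renderList(count: number): string {
  let html = '<div style="display:none">';
  for (let i = 0; i < count; i++) {
    html += renderRow(i);
  }
  return html + "</div>";
}

console.log(renderList(2));
// <div style="display:none"><div>Go to <a href="/0">0</a></div><div>Go to <a href="/1">1</a></div></div>
```

With both renderers hoisting static strings in this way, any remaining difference would have to come from what happens around the holes at request time.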

@axel-habermaier axel-habermaier changed the title Slow Rendering Performance Slow Rendering Performance (Astro is more than 10 times slower than Solid) Jul 12, 2024
@Princesseuh
Member

cc @bluwy

@bluwy
Member

bluwy commented Jul 17, 2024

Debugging this, and if anyone wants to help out, here's a cpuprofile (4MB) I recorded that starts up the preview server and renders the /astro page 5 times.

@bluwy bluwy added - P2: nice to have Not breaking anything but nice to have (priority) and removed needs triage Issue needs to be triaged labels Jul 17, 2024
@friedemannsommer
Contributor

Based on the profile provided by @bluwy and my own profiling results, it seems to come down to memory allocation and therefore to garbage collection.

Since the server has to create 50,001 RenderTemplateResult instances per request (50,000 because of the new Array(50_000), plus one for the main template), and one HTMLString instance for each attribute present, multiple megabytes of memory are allocated.
In my case each request allocates about 20-25 MB (measured via allocation sampling) to render the HTML template.

JavaScript code for the template
createComponent(($$result, $$props, $$slots) => {
  return renderTemplate`${ maybeRenderHead() }<h1>ASTRO</h1> <div style="display:none"> ${ new Array(50000).fill(0).map((_, i) => renderTemplate`<div>
Go to <a${ addAttribute(`/${ i }`, 'href') }>${ i }</a> </div>`) } </div>`;
}, './src/pages/perf.astro', void 0);

The allocated memory includes many more objects (mostly async related) than just the previously mentioned objects, but I hope it provides some insight into what is going on here.
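As a rough way to picture the object counts mentioned above, the sketch below counts wrapper allocations for the 50,000-row template. `FakeTemplateResult` and `FakeHtmlString` are stand-ins invented for this illustration; they are not Astro's RenderTemplateResult or HTMLString classes, and the real renderer allocates additional (mostly async related) objects that are not modeled here.

```typescript
// Stand-ins for illustration only; these are NOT Astro's real classes.
let allocations = 0;

class FakeHtmlString {
  value: string;
  constructor(value: string) {
    this.value = value;
    allocations++; // one wrapper per attribute
  }
}

class FakeTemplateResult {
  parts: readonly string[];
  holes: readonly unknown[];
  constructor(parts: readonly string[], holes: readonly unknown[]) {
    this.parts = parts;
    this.holes = holes;
    allocations++; // one wrapper per template instance
  }
}

function countWrapperAllocations(rowCount: number): number {
  allocations = 0;
  const rows: FakeTemplateResult[] = [];
  for (let i = 0; i < rowCount; i++) {
    const href = new FakeHtmlString(` href="/${i}"`);
    rows.push(new FakeTemplateResult(["<div>Go to <a", ">", "</a></div>"], [href, i]));
  }
  // One more result for the outer template that contains all rows.
  new FakeTemplateResult(['<div style="display:none">', "</div>"], [rows]);
  return allocations;
}

console.log(countWrapperAllocations(50_000)); // 100001
```

That is 50,001 template results plus 50,000 attribute wrappers for a single request, all of which become garbage as soon as the response has been written.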

Here is a memory profile recorded via Chrome DevTools Memory "Allocation sampling" profiling.
To load this file into the DevTools change the file extension to heapprofile. (GitHub doesn't allow file uploads with this extension.)

astro-issues-11454-cleaned.json

@bluwy bluwy self-assigned this Jul 22, 2024
@ascorbic
Contributor

ascorbic commented Oct 3, 2024

Disabling streaming helps a lot here. For me on my MacBook Air it goes from ~750ms to ~350ms for a page with the example component. A lot of the benefit can be gained by buffering the stream into chunks. A 1kB chunk size results in around 450ms. The difference is even bigger when it's a load test. The notorious Platformatic SSR performance showdown only manages 40rps by default. Disabling streaming takes that to 140rps. Chunking gets it to around 100rps.
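The chunk buffering described here can be sketched roughly as follows. This is a generic illustration of coalescing many tiny fragments into chunks of at least 1 kB before they reach the transport; `bufferChunks` is a hypothetical helper, not an Astro API, and it says nothing about how Astro itself implements streaming.

```typescript
// Hypothetical helper: coalesce small string fragments into byte chunks of
// at least `minBytes`, so the transport sees far fewer writes.
function* bufferChunks(
  fragments: Iterable<string>,
  minBytes: number = 1024,
): Generator<Uint8Array> {
  const encoder = new TextEncoder();
  let pending: Uint8Array[] = [];
  let pendingBytes = 0;

  // Concatenate everything collected so far into one chunk and reset.
  const flush = (): Uint8Array => {
    const out = new Uint8Array(pendingBytes);
    let offset = 0;
    for (const part of pending) {
      out.set(part, offset);
      offset += part.byteLength;
    }
    pending = [];
    pendingBytes = 0;
    return out;
  };

  for (const fragment of fragments) {
    const bytes = encoder.encode(fragment);
    pending.push(bytes);
    pendingBytes += bytes.byteLength;
    if (pendingBytes >= minBytes) yield flush();
  }
  // Emit whatever is left, even if it is smaller than `minBytes`.
  if (pendingBytes > 0) yield flush();
}

const rows = Array.from(
  { length: 1000 },
  (_, i) => `<div>Go to <a href="/${i}">${i}</a></div>`,
);
const chunks = Array.from(bufferChunks(rows, 1024));
console.log(`${rows.length} fragments -> ${chunks.length} chunks`);
```

The trade-off is latency to first byte versus per-write overhead: a larger `minBytes` means fewer writes, but the client waits longer for the first chunk.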

@MatthewLymer
Contributor

MatthewLymer commented Dec 19, 2024

I am in a similar boat as @axel-habermaier here, my company is in the middle of evaluating frameworks and right now the rendering performance of Astro's SSR implementation is a deal breaker for us.

We tried the same setup as OP, except with a Preact renderer, with a similar disparity.

Additionally, we duplicated the test in the Qwik framework and it also was considerably faster.

---
import AsyncComponent from "./AsyncComponent.astro";
import SyncComponent from "./SyncComponent.astro";

const useAsync = Astro.url.searchParams.get("mode") === "async";
---
<html>
  <head>
    <title>Performance!</title>
  </head>
  <body>
    <div>
      {useAsync && [...new Array(100_000).keys()].map(x => (
        <AsyncComponent key={x} />
      ))}

      {!useAsync && [...new Array(100_000).keys()].map(x => (
        <SyncComponent key={x} />
      ))}
    </div>
  </body>
</html>

SyncComponent.astro:

---
const message = "Sync!";
---
<p>{message}</p>

AsyncComponent.astro:

---
await new Promise((resolve) => setTimeout(resolve, 0));
const message = "Async";
---
<p>{message}</p>

With results

% time curl -sSL 'http://localhost:4321/perf/?mode=async' -o /dev/null
curl -sSL 'http://localhost:4321/perf/?mode=async' -o /dev/null  0.00s user 0.01s system 0% cpu 3.532 total

% time curl -sSL 'http://localhost:4321/perf/?mode=sync' -o /dev/null 
curl -sSL 'http://localhost:4321/perf/?mode=sync' -o /dev/null  0.00s user 0.01s system 0% cpu 2.462 total

(Not exactly scientific, but I did run them multiple times to ensure a realistic scenario)


Comparatively, for Qwik

import { component$ } from "@builder.io/qwik";
import SyncComponent from "./SyncComponent";
import AsyncComponent from "./AsyncComponent";
import { useLocation } from "@builder.io/qwik-city";

export default component$(() => {
  const location = useLocation();

  const useAsync = location.url.searchParams.get("mode") === "async";

  return (
    <>
      {useAsync && [...new Array(100_000).keys()].map(x => (
        <AsyncComponent key={x} />
      ))}

      {!useAsync && [...new Array(100_000).keys()].map(x => (
        <SyncComponent key={x} />
      ))}
    </>
  );
});

AsyncComponent:

import { component$, useSignal, useTask$ } from "@builder.io/qwik";

export default component$(() => {
    const message = useSignal<string>();

    useTask$(async () => {
        await new Promise((resolve) => setTimeout(resolve, 0));
        message.value = "Async";
    });

    return <p>{message.value}</p>;
});

SyncComponent:

import { component$ } from "@builder.io/qwik";

export default component$(() => {
    const message = "Sync!";
    return <p>{message}</p>;
});

With results

% time curl -sSL 'http://localhost:4173/demo/perf/?mode=async' -o /dev/null
curl -sSL 'http://localhost:4173/demo/perf/?mode=async' -o /dev/null  0.01s user 0.01s system 1% cpu 1.881 total

% time curl -sSL 'http://localhost:4173/demo/perf/?mode=sync' -o /dev/null
curl -sSL 'http://localhost:4173/demo/perf/?mode=sync' -o /dev/null  0.00s user 0.01s system 2% cpu 0.530 total

From the results, Qwik completed the "sync" test in only 21.5% of the time Astro took, and the "async" test in 53.3% of the time.
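For reference, the percentages follow directly from the wall-clock totals in the `time curl` output above. A small sketch of the arithmetic (the variable and function names are mine):

```typescript
// Wall-clock totals in seconds, copied from the `time curl` output above.
const astroSeconds = { syncMode: 2.462, asyncMode: 3.532 };
const qwikSeconds = { syncMode: 0.53, asyncMode: 1.881 };

// Express `part` as a percentage of `whole`, rounded to one decimal place.
function percentOf(part: number, whole: number): string {
  return ((part / whole) * 100).toFixed(1) + "%";
}

console.log(percentOf(qwikSeconds.syncMode, astroSeconds.syncMode)); // 21.5%
console.log(percentOf(qwikSeconds.asyncMode, astroSeconds.asyncMode)); // 53.3%
```

Put the other way around, Astro took roughly 4.6x as long in the sync case and 1.9x as long in the async case.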

I realize it's a trivial test setup, but this was prompted by implementing a realistic use case for us, where we have about 15k components in total on the page, with 95% of them being synchronous in nature. The rendering in Astro took ~650ms, while the application we were porting it from (PHP) took ~100ms for equivalent work.

We'd like to stick with Astro if possible, as we find the developer experience vastly better than Qwik and other alternatives.

My questions are:

  1. Is SSR performance a priority for the Astro team?
  2. Is it realistic to expect Astro to support rendering 15,000 SSR components in the Astro templating language?
  3. Would you accept PRs to address the performance issue (if it meant major changes to the rendering pipeline)?

@datner

datner commented Dec 28, 2024

I've also reached this limit hah, I was sure the html-streaming feature would save me, but I'm not sure it's a real feature since I can't get it to happen even in a lab 😄

Would love some more eyes on this

@codejanovic

I also came across a stunningly slow 30rps when benchmarking my SSR routes with artillery. Is there a way to hook into Astro's rendering to measure and output exactly how long a render takes?
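One generic way to approach the measurement, independent of any Astro-specific hook, is to time a fetch-style handler from the call until its response body has been read to the end, since with streaming the render may still be running after the Response object is returned. The sketch below assumes you can get hold of such a handler for your route (how to do that depends on the adapter and is not covered here); `timeFullResponse` and `demoHandler` are hypothetical names, and the demo handler is only a stand-in that streams small fragments.

```typescript
// Hypothetical helper, not part of Astro: times a fetch-style handler from
// the call until the response body has been fully consumed.
type Handler = (request: Request) => Response | Promise<Response>;

async function timeFullResponse(
  handler: Handler,
  request: Request,
): Promise<{ ms: number; bytes: number }> {
  const start = performance.now();
  const response = await handler(request);
  let bytes = 0;
  if (response.body) {
    const reader = response.body.getReader();
    for (;;) {
      const { done, value } = await reader.read();
      if (done) break;
      bytes += value.byteLength;
    }
  }
  return { ms: performance.now() - start, bytes };
}

// Stand-in handler that streams 1,000 small HTML fragments.
const demoHandler: Handler = () => {
  const encoder = new TextEncoder();
  let i = 0;
  const body = new ReadableStream<Uint8Array>({
    pull(controller) {
      if (i < 1000) controller.enqueue(encoder.encode(`<p>${i++}</p>`));
      else controller.close();
    },
  });
  return new Response(body, { headers: { "content-type": "text/html" } });
};

timeFullResponse(demoHandler, new Request("http://localhost/perf")).then(
  ({ ms, bytes }) => console.log(`read ${bytes} bytes in ${ms.toFixed(1)} ms`),
);
```

Because the helper consumes the body, it is meant for a benchmark script or test harness rather than for code that sits in front of real users.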

@ematipico
Member

@MatthewLymer have you tried to turn off streaming?

@camdowney

Just found this thread since I think this issue might be affecting my site as well.

@ematipico How do you "turn off streaming"?

@MatthewLymer
Contributor

@ematipico I'm using the Node adapter (in production) which doesn't expose the option. Though it was roughly the same performance for me when disabling streaming in "dev" mode.

@ematipico
Member

Apologies, I thought it was possible, but indeed we won't expose an option for that.

@bluwy bluwy removed their assignment Jan 16, 2025
@MatthewLymer
Contributor

#13195

I've submitted a PR to help push the performance in the right direction here.

I don't think that it'll achieve SolidJS-like performance due to SolidJS being pre-rendered in this case; however, I found a ~2x improvement on my company's real-world use case.

10 participants