Describe the feature you'd like to request
If I understand correctly, turborepo currently assumes that any change to any file in a package invalidates the output cache for all of the tasks in that package.
However, it is often the case that only a subset of the files in the package are inputs to a task (in the sense that they would affect the task's output). For example, modifying the README file in a package would rarely affect its build output, but currently the build cache would be invalidated in this case.
It's also common to have multiple build stages within the same package. For example, if I have a TypeScript task and a Rollup task in the same package, then modifying the Rollup config should not necessitate re-running the TypeScript task.
Describe the solution you'd like
The ability to specify the inputs for specific tasks in the pipeline config, and for only those files to be used when computing the caching hash for that task, instead of all files in the package.
With the above config, assuming a full rollup build has just completed, I would now expect:
* Modifying the README file does not necessitate re-running either task.
* Modifying a src/ file necessitates re-running both tasks.
* Modifying only rollup.config.js necessitates re-running only rollup.
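To make the expected behavior concrete, here is a minimal sketch of per-task inputs and the invalidation they would imply. Everything in it is hypothetical: the `TaskConfig` shape, the `inputs` and `dependsOn` field names, the glob patterns, and the helpers `globToRegExp` and `tasksToRerun` are illustrative assumptions, not Turborepo's config schema or implementation. It assumes that `rollup` consumes the output of `ts`, which is why a change under `src/` invalidates both tasks.

```typescript
// Hypothetical per-task config; the field names are assumptions for illustration.
interface TaskConfig {
  inputs: string[]; // globs relative to the package root
  dependsOn: string[]; // tasks in the same package whose output this task consumes
}

const pipeline: Record<string, TaskConfig> = {
  ts: { inputs: ["src/**", "tsconfig.json"], dependsOn: [] },
  rollup: { inputs: ["rollup.config.js"], dependsOn: ["ts"] },
};

// Minimal glob support: `**` matches across directories, `*` within one path segment.
function globToRegExp(glob: string): RegExp {
  const pattern = glob
    .split("**")
    .map((part) =>
      part
        .split("*")
        .map((literal) => literal.replace(/[.+?^${}()|[\]\\]/g, "\\$&"))
        .join("[^/]*")
    )
    .join(".*");
  return new RegExp(`^${pattern}$`);
}

// A task must re-run if the changed file matches one of its inputs,
// or if any task it depends on must re-run. Assumes the graph has no cycles.
function tasksToRerun(changedFile: string): string[] {
  const dirty = new Set<string>();
  const isDirty = (task: string): boolean => {
    if (dirty.has(task)) return true;
    const { inputs, dependsOn } = pipeline[task];
    const hit =
      inputs.some((glob) => globToRegExp(glob).test(changedFile)) ||
      dependsOn.some(isDirty);
    if (hit) dirty.add(task);
    return hit;
  };
  return Object.keys(pipeline).filter(isDirty);
}
```

Under these assumptions, `tasksToRerun("README.md")` selects no tasks, `tasksToRerun("src/index.ts")` selects both `ts` and `rollup`, and `tasksToRerun("rollup.config.js")` selects only `rollup`.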
Describe alternatives you've considered
Placing each build step in its own monorepo package seems to be the pattern I'm seeing in examples that side-steps this requirement. In the above example, we could have separate packages for the ts and rollup steps, so that they would be cached independently.
However, I work on a monorepo that has ~20 packages, each with their own multi-step build processes. Having to split each of those 20 packages into 2 or more packages themselves in order to take full advantage of caching feels a bit awkward and disruptive.
I also work on simpler non-monorepo projects which have a 2 or 3 step build process. For a similar reason, I would rather not need to refactor the project into a monorepo with a package for each build step in order to take full advantage of caching.
We generate a database client library using Prisma in many projects in our monorepo.
The Prisma input file is .prisma/prisma.schema (relative to each package.json), so it would be nice to scope db:generate to changes within that directory or file rather than /src.
* Add support for `"inputs": [<file glob>]` to pipeline config
* In the case of `"inputs"`, use `git ls-files` to calculate file hashes
* Remove package-level hash and file hash
* Hash combinations of package-inputs
* Hashing of package-tasks is now done inline while walking the graph
* Update `e2e` test to highlight new hashing behavior
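One way to picture the `git ls-files` step is the sketch below. It is an assumption-laden illustration, not the code in this PR: the `--stage` flag, the helper names `parseLsFilesStage` and `hashTaskInputs`, and passing the `inputs` globs straight through as pathspecs are all guesses about how such a step could work. `git ls-files --stage` prints one `<mode> <object> <stage>` triple per tracked file, followed by a tab and the path, so the blob ID can stand in as the file hash.

```typescript
import { execFileSync } from "node:child_process";

// Parse `git ls-files --stage` output, one "<mode> <object> <stage>\t<path>" per line,
// into a map from path to git object ID.
function parseLsFilesStage(output: string): Map<string, string> {
  const hashes = new Map<string, string>();
  for (const line of output.split("\n")) {
    const tab = line.indexOf("\t");
    if (tab === -1) continue; // skip blank or unexpected lines
    const [, objectId] = line.slice(0, tab).split(" ");
    hashes.set(line.slice(tab + 1), objectId);
  }
  return hashes;
}

// Hypothetical helper: ask git for the hashes of the files matching a task's inputs.
// Assumes `packageDir` is inside a git work tree and `inputs` are valid pathspecs.
function hashTaskInputs(packageDir: string, inputs: string[]): Map<string, string> {
  const output = execFileSync("git", ["ls-files", "--stage", "--", ...inputs], {
    cwd: packageDir,
    encoding: "utf8",
  });
  return parseLsFilesStage(output);
}
```

Note that `--stage` reports what is in the index, so a real implementation would still need to account for unstaged modifications and untracked files; this sketch does not.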
Because task hashing now only takes into account the hash of task-graph dependencies, tasks that don't have task-graph dependencies won't have new hashes when package-dependencies change. The classic example is `lint`. If `a` depends on `b`, and `b` changes, that should no longer force a cache miss for `a#lint`, so long as the `lint` task has no dependencies.
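The `lint` case can be sketched as follows. This is a simplified model under stated assumptions, not the hashing code from this PR: the `Task` shape, the `hashTask` and `graph` helpers, SHA-256, and the placeholder file hashes are all hypothetical. The only idea it tries to capture is that a task's hash covers its own inputs plus the hashes of its task-graph dependencies, and nothing else.

```typescript
import { createHash } from "node:crypto";

// Hypothetical task model for illustration only.
interface Task {
  inputHashes: string[]; // hashes of the files this task reads
  dependsOn: string[]; // IDs of task-graph dependencies, e.g. "b#build"
}

// Hash a task from its own inputs and the hashes of the tasks it depends on.
// Assumes the task graph has no cycles.
function hashTask(
  id: string,
  tasks: Map<string, Task>,
  memo: Map<string, string> = new Map<string, string>()
): string {
  const cached = memo.get(id);
  if (cached !== undefined) return cached;
  const task = tasks.get(id);
  if (task === undefined) throw new Error(`unknown task: ${id}`);
  const hash = createHash("sha256");
  for (const fileHash of [...task.inputHashes].sort()) hash.update(fileHash + "\n");
  for (const dep of [...task.dependsOn].sort()) hash.update(hashTask(dep, tasks, memo) + "\n");
  const digest = hash.digest("hex");
  memo.set(id, digest);
  return digest;
}

// Package `a` depends on package `b`; only the build task crosses that edge.
function graph(bSourceHash: string): Map<string, Task> {
  return new Map<string, Task>([
    ["b#build", { inputHashes: [bSourceHash], dependsOn: [] }],
    ["a#build", { inputHashes: ["a-src"], dependsOn: ["b#build"] }],
    ["a#lint", { inputHashes: ["a-src"], dependsOn: [] }],
  ]);
}
```

In this model, changing `b`'s sources (for example `graph("b-v1")` versus `graph("b-v2")`) changes the hash of `a#build` but leaves `a#lint` untouched, because `a#lint` has no task-graph dependencies.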
Remaining work:
- [x] Update fallback for `git ls-files` to properly handle `inputs`
- [x] Add tests for `inputs`
- [x] Update global hash key (not technically necessary, but may be a good idea)
Fixes #523