Ability to specify inputs to task #523

Closed
aomarks opened this issue Jan 7, 2022 · 1 comment · Fixed by #951
Labels: area: ergonomics (issues and features impacting the developer experience of using turbo)

Comments

aomarks commented Jan 7, 2022

Describe the feature you'd like to request

If I understand correctly, turborepo currently assumes that any change to any file in a package invalidates the output cache for all of the tasks in that package.

However, it is often the case that only a subset of the files in the package are inputs to a task (in the sense that they would affect the task's output). For example, modifying the README file in a package would rarely affect its build output, but currently the build cache would be invalidated in this case.

It's also common to have multiple build stages within the same package. For example, if I have a TypeScript task and a Rollup task in the same package, then modifying the Rollup config should not necessitate re-running the TypeScript task.

Describe the solution you'd like

The ability to specify the inputs for specific tasks in the pipeline config, and for only those files to be used when computing the caching hash for that task, instead of all files in the package.

For example:

{
  "turbo": {
    "pipeline": {
      "rollup": {
        "dependsOn": ["ts"],
        "inputs": ["lib/**", "rollup.config.js"],
        "outputs": ["bundled/**"]
      },
      "ts": {
        "inputs": ["src/**", "tsconfig.json"],
        "outputs": ["lib/**"]
      }
    }
  }
}

With the above config, assuming a full rollup build has just completed, I would now expect:

  • Modifying the README file does not necessitate re-running either task.
  • Modifying a src/ file necessitates re-running both tasks.
  • Modifying only rollup.config.js necessitates re-running only rollup.
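
To make the expected behavior concrete, here is a rough Python sketch of the selection logic I have in mind. It is not how turbo works internally; `tasks_to_rerun` and `matches_any` are hypothetical helper names, and `fnmatchcase` is only an approximation of real glob matching (its `*` also matches across `/`).

```python
from fnmatch import fnmatchcase

# The pipeline from the example config above, as a plain dict.
PIPELINE = {
    "rollup": {
        "dependsOn": ["ts"],
        "inputs": ["lib/**", "rollup.config.js"],
        "outputs": ["bundled/**"],
    },
    "ts": {
        "inputs": ["src/**", "tsconfig.json"],
        "outputs": ["lib/**"],
    },
}


def matches_any(path, globs):
    """Return True if path matches at least one glob (approximate matching)."""
    return any(fnmatchcase(path, glob) for glob in globs)


def tasks_to_rerun(changed_files, pipeline=PIPELINE):
    """Return the set of task names whose cache should be considered stale."""
    # A task is directly stale if any changed file is one of its inputs.
    stale = {
        name
        for name, task in pipeline.items()
        if any(matches_any(path, task.get("inputs", [])) for path in changed_files)
    }
    # A task is also stale if anything it depends on is stale; repeat until stable.
    grew = True
    while grew:
        grew = False
        for name, task in pipeline.items():
            if name not in stale and any(dep in stale for dep in task.get("dependsOn", [])):
                stale.add(name)
                grew = True
    return stale
```

Calling `tasks_to_rerun(["README.md"])` yields an empty set, `tasks_to_rerun(["src/index.ts"])` yields both tasks, and `tasks_to_rerun(["rollup.config.js"])` yields only `rollup`, which lines up with the three bullets above.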

Describe alternatives you've considered

Placing each build step in its own monorepo package seems to be the pattern I'm seeing in examples that side-steps this requirement. In the above example, we could have separate packages for the ts and rollup steps, so that they would be cached independently.

However, I work on a monorepo that has ~20 packages, each with their own multi-step build processes. Having to split each of those 20 packages into 2 or more packages themselves in order to take full advantage of caching feels a bit awkward and disruptive.

I also work on simpler non-monorepo projects which have a 2 or 3 step build process. For a similar reason, I would rather not need to refactor the project into a monorepo with a package for each build step in order to take full advantage of caching.

jaredpalmer added the area: ergonomics and story labels on Mar 8, 2022
@chrisflatley

As a second use case example:

We generate a database client library using Prisma in many projects in our monorepo.

The Prisma input file is .prisma/prisma.schema (relative to each package.json), so it would be nice to scope db:generate to changes within that directory or file, rather than /src.

kodiakhq bot closed this as completed in #951 on Apr 11, 2022
kodiakhq bot pushed a commit that referenced this issue on Apr 11, 2022:
 * Add support for `"inputs": [<file glob>]` to pipeline config
 * In the case of `"inputs"`, use `git ls-files` to calculate file hashes
 * Remove package-level hash and file hash
 * Hash combinations of package-inputs
 * Hashing of package-tasks is now done inline while walking the graph
 * Update `e2e` test to highlight new hashing behavior

Because task hashing now only takes into account the hash of task-graph dependencies, tasks that don't have task-graph dependencies won't have new hashes when package-dependencies change. The classic example is `lint`. If `a` depends on `b`, and `b` changes, that should no longer force a cache miss for `a#lint`, so long as the `lint` task has no dependencies. 

Remaining work:
 - [x] Update fallback for `git ls-files` to properly handle `inputs`
 - [x] Add tests for `inputs`
 - [x] Update global hash key (not technically necessary, but may be a good idea)

Fixes #523
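
For readers trying to picture the `lint` example from the commit message, a toy model of the hashing may help. This is only a sketch under assumptions, not the code from #951: `task_hash` and `hash_workspace` are hypothetical names, SHA-256 is an arbitrary choice, the file contents are made up, and it assumes `a#build` depends on `b#build` while `a#lint` has no task-graph dependencies.

```python
import hashlib
from fnmatch import fnmatchcase


def task_hash(task_id, files, input_globs, dep_hashes):
    """Toy task hash over matching input files plus dependency task hashes."""
    h = hashlib.sha256()
    h.update(task_id.encode())
    for path in sorted(files):
        # Only files matching the task's inputs contribute to the hash.
        if any(fnmatchcase(path, glob) for glob in input_globs):
            h.update(b"\0" + path.encode() + b"\0")
            h.update(hashlib.sha256(files[path]).digest())
    # Hashes of task-graph dependencies are folded in as well.
    for dep in sorted(dep_hashes):
        h.update(b"\0dep\0" + dep.encode())
    return h.hexdigest()


def hash_workspace(b_files):
    """Hash three tasks in a two-package workspace where `a` depends on `b`."""
    a_files = {"src/index.ts": b"export const a = 1;", "README.md": b"# a"}
    b_build = task_hash("b#build", b_files, ["src/**"], [])
    # Assumption: a#build depends on b#build; a#lint depends on nothing.
    a_build = task_hash("a#build", a_files, ["src/**"], [b_build])
    a_lint = task_hash("a#lint", a_files, ["src/**"], [])
    return {"b#build": b_build, "a#build": a_build, "a#lint": a_lint}


before = hash_workspace({"src/index.ts": b"export const b = 1;"})
after = hash_workspace({"src/index.ts": b"export const b = 2;"})
```

In this model, changing a source file in `b` changes the hashes for `b#build` and `a#build`, while `a#lint` keeps its hash because nothing from `b` feeds into it.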
4 participants