@Daniel, good points.
There are a few things I want to comment on.
The workflow step (or build type as we're calling it) is intended to be a starting point, not the final destination. It's easy to implement, solves maybe 70% of the use cases we've seen, and allows us to drive work on a richer task execution model that can later become public.
The build types are not just about task ordering - they're declarative elements that describe the public entry points of the build, the inputs that they take (eg when I'm doing a release build, I need to provide credentials for the repository), and the conditional logic that has to happen for that build (eg when I'm doing a CI build, wire in the test coverage report).
There's a lot of value in build types even if they don't solve everyone's task ordering problems. But they are a step towards a more general solution.
The plan is certainly not to end up with separate declarative DAG and imperative pieces. Instead, we will have a high level declarative model that describes what you want to build, and a low level declarative model of how that should happen. The high level model is used to assemble the low level model. They both describe the same thing, just with different views: the high level model is a graph of things that depend on each other, and the low level model is a graph of actions (tasks) to be executed to achieve the result.
We already have this layering: when you declare a project dependency, you're not saying anything about which tasks to run. You're just stating: 'my library foo depends on library bar, please make sure it's available before attempting to compile foo'. From this high level declaration, the low level task dependencies are wired in.
And so it is with the canonical 'start jetty then run my tests' use case. There is a hard dependency here: my tests need my web app to be running. But this isn't a task dependency. The tests don't care how the web app was deployed. It's a dependency between things: my web app and my tests. So, in the high-level model, you'd state 'my tests need my web app to be running', and Gradle would turn that into the appropriate low level task dependencies. The jetty plugin, for example, might declare 'I know how to deploy a web app' and Gradle can ask it to do so.
Adding in the database schema, we see the same thing. There is a hard dependency here: my web app needs a database instance to be running with my schema. Again, it's not a task dependency, it's a dependency between things: my web app and the database instance. In the high level model you'd state: my web app needs a database instance with my schema, and Gradle would turn that into the appropriate low level task dependencies. Maybe the mysql plugin declares 'I know how to create a database instance' and the liquibase plugin declares 'I know how to apply a schema to a database instance', and Gradle can ask them to do so.
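To make the layering a bit more concrete, here is a rough Python sketch of the idea (not Gradle code). Everything in it is made up for illustration: the `needs` and `providers` tables, the task names, and the `assemble_tasks` helper. It only shows how a graph of things could be walked to produce an ordered list of tasks contributed by plugins.

```python
# Hypothetical sketch: none of these names are Gradle APIs.

# High level model: each thing lists the things it needs.
needs = {
    "tests": ["webApp"],
    "webApp": ["database"],
    "database": [],
}

# What plugins might declare: the tasks they would contribute to make a thing available.
providers = {
    "database": ["createDatabase", "applySchema"],  # e.g. a mysql and a liquibase plugin
    "webApp": ["deployWebApp"],                     # e.g. the jetty plugin
    "tests": ["runTests"],
}

def assemble_tasks(target, needs, providers):
    """Walk the high level graph depth-first and return low level tasks in execution order."""
    ordered, seen = [], set()

    def visit(thing):
        if thing in seen:
            return
        seen.add(thing)
        for dependency in needs[thing]:
            visit(dependency)          # everything a thing needs comes first
        ordered.extend(providers[thing])

    visit(target)
    return ordered

print(assemble_tasks("tests", needs, providers))
# prints ['createDatabase', 'applySchema', 'deployWebApp', 'runTests']
```

Note that the build author only ever writes the `needs` part; the task names come from whoever knows how to provide each thing.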
Why is this useful?
Firstly, by declaring the dependencies at the right level of abstraction, you keep things simple. There's no need for dependency ordering or any of that. It's also declaring the actual dependencies, not the dependencies denormalised into a set of tasks.
Secondly, you keep things flexible. Gradle can choose the appropriate tasks based on the current state of the world: If there's already a database instance with the schema applied, then just delete all the rows and reapply the test data. If there's a database instance with the wrong schema, then drop all the tables and apply the schema. If there's no database instance, create it and apply the schema and test data. If the tests are going to be run in parallel, then spin up a separate database instance for each test worker. If I'm doing a developer build, reuse the instance I have on my local machine. If I'm doing a CI build, then rebuild the database instance from scratch in the QA environment.
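As a sketch of what 'choose the appropriate tasks based on the current state of the world' could look like, here are the first three database cases written out in Python. The function and task names are hypothetical, and how the state would actually be detected is left out entirely.

```python
# Hypothetical sketch: the task names and this function are illustrative only.

def plan_database_tasks(instance_exists, schema_matches):
    """Pick the tasks needed to get from the observed state to 'database ready'."""
    if not instance_exists:
        # No instance: create it, then apply the schema and the test data.
        return ["createInstance", "applySchema", "loadTestData"]
    if not schema_matches:
        # Instance with the wrong schema: drop all the tables and apply the schema.
        return ["dropTables", "applySchema"]
    # Instance with the right schema: delete all the rows and reapply the test data.
    return ["deleteRows", "loadTestData"]

print(plan_database_tasks(instance_exists=True, schema_matches=True))
# prints ['deleteRows', 'loadTestData']
```

The same high level declaration ('my web app needs a database instance with my schema') covers all three branches; only the selected tasks differ.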
So, we have a nice high level description of the relationships between the software components. Ideally, you work only at this level and never touch the task model.
However, there will still be cases where you need to work with the task model, for whatever reason. This will still be a graph, but with several different types of edges, or relationships. There will be no orderings between the edges. Currently, we're looking at the following relationships:
- Tasks from set A must run before tasks from set B. This is our existing depends-on relationship, but generalised a touch to handle things like: clean and its dependencies must run before assemble and its dependencies, or all validation tasks should run before any other task.
- Tasks from set A should run before tasks from set B. This covers things like: honoring imported Ant task ordering, or unit tests should run before integration tests (but don't have to), or all jar tasks should run before any upload tasks.
- Tasks from set A must run if any of the tasks from set B run. This covers things like: always generate a coverage report if the coverage check task is run even if the coverage check fails, or always generate an aggregated test report if any test tasks are executed, or always stop the Jetty server at the end of the build if the Jetty server is started.
- When any task from set A is scheduled to be executed then (any of the above). This covers things like: if the coverageReport task is to be executed, then instrumentTestClasses must run before test.
There will be some conveniences for the above. Maybe even your proposed syntax, so that:
task a(dependsOn: b >> c)
is shorthand for:
tasks (b,c) must run before task a, and when task a is scheduled then task b and all its dependencies must run before task c and all its dependencies.
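For what it's worth, the first three relationship types listed above can be sketched in a few lines of Python (3.9+, for `graphlib`). This is only an illustration under some assumptions of mine, not a design: the `schedule` function and its parameter names are hypothetical, edges are between single tasks rather than sets, a 'must run if' task is assumed to run after the task that triggers it, and a 'should run before' edge is treated as a plain ordering edge (a real implementation would drop it rather than fail when it conflicts with a hard edge). The conditional fourth type is not covered.

```python
from graphlib import TopologicalSorter

# Hypothetical sketch: each relationship is a list of (a, b) pairs.
def schedule(requested, must_before, should_before, must_run_if):
    """
    must_before:   a must run before b, and scheduling b pulls in a (depends-on).
    should_before: a runs before b only if both happen to be scheduled.
    must_run_if:   scheduling b pulls in a; a is assumed to run after b.
    """
    # Step 1: work out which tasks are scheduled at all.
    scheduled, pending = set(), list(requested)
    while pending:
        task = pending.pop()
        if task in scheduled:
            continue
        scheduled.add(task)
        pending += [a for a, b in must_before if b == task]
        pending += [a for a, b in must_run_if if b == task]

    # Step 2: order the scheduled tasks; soft edges never add new tasks.
    sorter = TopologicalSorter({task: set() for task in scheduled})
    for a, b in must_before + should_before:
        if a in scheduled and b in scheduled:
            sorter.add(b, a)   # a is a predecessor of b
    for a, b in must_run_if:
        if a in scheduled and b in scheduled:
            sorter.add(a, b)   # b is a predecessor of a
    return list(sorter.static_order())

order = schedule(
    requested=["integrationTest"],
    must_before=[("startJetty", "integrationTest")],
    should_before=[("unitTest", "integrationTest"), ("integrationTest", "stopJetty")],
    must_run_if=[("stopJetty", "startJetty")],
)
print(order)
# prints ['startJetty', 'integrationTest', 'stopJetty']
```

Here unitTest is not pulled in, because 'should run before' is an ordering hint rather than a dependency, while stopJetty is added because startJetty is scheduled.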