Problem
lineage show maps bucket sharing between projects, but agents need a deeper understanding of data flow:
- Which extractor writes to this table?
- Which transformations read it?
- What downstream writers consume the output?
- What's the full execution order?
Currently, agents must manually correlate config list, config detail, and input/output mapping analysis to answer these questions.
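For illustration, the manual correlation described above might look roughly like the sketch below. The config records and their input_tables / output_tables fields are a hypothetical, simplified shape chosen for this example, not kbagent's actual output format; the helper names are likewise made up.

```python
from collections import defaultdict
from graphlib import TopologicalSorter  # Python 3.9+

# Hypothetical, simplified config records (NOT the real kbagent output shape).
CONFIGS = [
    {"component": "keboola.ex-db-mysql", "config_id": "123",
     "input_tables": [], "output_tables": ["in.c-crm.orders"]},
    {"component": "keboola.snowflake-transformation", "config_id": "456",
     "input_tables": ["in.c-crm.orders"],
     "output_tables": ["out.c-analytics.order-metrics"]},
    {"component": "keboola.wr-google-sheets", "config_id": "789",
     "input_tables": ["out.c-analytics.order-metrics"], "output_tables": []},
]


def index_by_table(configs):
    """Build table -> writer config IDs and table -> reader config IDs."""
    writers, readers = defaultdict(list), defaultdict(list)
    for cfg in configs:
        for table in cfg["output_tables"]:
            writers[table].append(cfg["config_id"])
        for table in cfg["input_tables"]:
            readers[table].append(cfg["config_id"])
    return writers, readers


def execution_order(configs):
    """Order configs so that every writer of a table runs before its readers."""
    writers, _ = index_by_table(configs)
    deps = {
        cfg["config_id"]: {w for t in cfg["input_tables"] for w in writers.get(t, [])}
        for cfg in configs
    }
    return list(TopologicalSorter(deps).static_order())


writers, readers = index_by_table(CONFIGS)
print(writers["in.c-crm.orders"])   # configs that write the table
print(readers["in.c-crm.orders"])   # configs that read the table
print(execution_order(CONFIGS))     # one valid run order
```

Every agent currently has to rebuild this kind of index itself on each run, which is the work the proposed command would fold into a single call.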
Proposal
kbagent pipeline trace --project prod --table-id in.c-crm.orders
Output: a structured data flow graph:
{
  "table": "in.c-crm.orders",
  "written_by": {"component": "keboola.ex-db-mysql", "config_id": "123", "schedule": "daily 06:00"},
  "read_by": [
    {"component": "keboola.snowflake-transformation", "config_id": "456", "output_tables": ["out.c-analytics.order-metrics"]}
  ],
  "downstream": [
    {"table": "out.c-analytics.order-metrics", "consumed_by": [{"component": "keboola.wr-google-sheets", "config_id": "789"}]}
  ]
}
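As a rough sketch of how an agent could use this output, the snippet below parses the example payload and lists every config that would be affected by a change to the table. It assumes the proposed JSON shape exactly as shown above; the affected_configs helper is hypothetical and not part of kbagent.

```python
import json

# The example payload from the proposal, embedded so the sketch is self-contained.
TRACE = json.loads("""
{
  "table": "in.c-crm.orders",
  "written_by": {"component": "keboola.ex-db-mysql", "config_id": "123", "schedule": "daily 06:00"},
  "read_by": [
    {"component": "keboola.snowflake-transformation", "config_id": "456",
     "output_tables": ["out.c-analytics.order-metrics"]}
  ],
  "downstream": [
    {"table": "out.c-analytics.order-metrics",
     "consumed_by": [{"component": "keboola.wr-google-sheets", "config_id": "789"}]}
  ]
}
""")


def affected_configs(trace):
    """Return config IDs that read the table or consume tables derived from it."""
    ids = [reader["config_id"] for reader in trace.get("read_by", [])]
    for item in trace.get("downstream", []):
        ids.extend(c["config_id"] for c in item.get("consumed_by", []))
    return ids


print(affected_configs(TRACE))
```

An agent could run a check like this before modifying the extractor config to see which transformations and writers it needs to verify afterwards.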
Why this matters
Understanding data dependencies is the #1 prerequisite for safe changes. Without it, agents risk breaking downstream consumers when modifying a pipeline.
Context
Discussion from Devil's Advocate analysis of kbagent's agentic capabilities.