doc: add sessions-vs-sessionless-decision #27
| Original file line number | Diff line number | Diff line change | ||||
|---|---|---|---|---|---|---|
| @@ -0,0 +1,261 @@ | ||||||
| # **Decision Document: To App Session, or Not to App Session?** | ||||||
|
|
||||||
| ## **Background** | ||||||
|
|
||||||
| ### **Sessions in MCP** | ||||||
|
|
||||||
| Currently, the Model Context Protocol (MCP) utilizes sessions to manage | ||||||
| client-server connections, but this concept blurs the line between two very | ||||||
| distinct use cases: | ||||||
|
|
||||||
| * **Transport-Level Use Cases:** Using sessions to track protocol versioning | ||||||
| and capability negotiation (e.g., does this server support sampling?). | ||||||
| * **Application-Level Use Cases:** Using sessions to track logical state, such | ||||||
| as a specific user context, a continuous conversation thread, or stateful tool | ||||||
| operations. | ||||||
|
|
||||||
| ### **Why Sessions Need to Change** | ||||||
|
|
||||||
| The current implementation of sessions is highly ambiguous, leading to the | ||||||
| following problems: | ||||||
|
|
||||||
| * **Inconsistent Lifecycles:** Some clients create a new session for *every | ||||||
| single tool call*, others create one per *conversation*, some use one for *all | ||||||
| conversations*, and others manage them without any clear boundaries. There are | ||||||
| no strict guarantees around what sessions provide or how long they persist. | ||||||
| * **Transport Divergences:** On STDIO, sessions are implicitly tied to the | ||||||
| process lifecycle. On HTTP, sessions are optional, and their absence could | ||||||
| mean the server is stateless *or* that the previous state was simply lost. | ||||||
| * **Coupling of State to Connection:** Application state, client capabilities, | ||||||
| and protocol versions are heavily coupled to the connection. This leads to | ||||||
| operational hazards like the "Rolling Upgrade" problem (where updating a | ||||||
| load-balanced server drops the connection and wipes the user's logical state) | ||||||
| and multiplexing failures. | ||||||
|
|
||||||
| ### **Resolving Transport-Level Sessions** | ||||||
|
|
||||||
| To resolve the ambiguities at the transport level, the working group is moving | ||||||
| toward a stateless transport architecture. The need for transport-level sessions | ||||||
| is being removed via two core SEPs: | ||||||
|
|
||||||
| * [**SEP-1442**](https://github.com/modelcontextprotocol/modelcontextprotocol/issues/1442)**:** | ||||||
| Moves all data to a "per request" basis, eliminating the need to store | ||||||
| transport state between calls. | ||||||
| * [**SEP-2322**](https://github.com/modelcontextprotocol/modelcontextprotocol/pull/2322)**:** | ||||||
| Allows for elicitation and sampling requests without relying on sessions to | ||||||
| align them. | ||||||
|
|
||||||
| ## **The Problem Statement** | ||||||
|
|
||||||
| With transport-level sessions resolved by SEP-1442 and SEP-2322, we are left | ||||||
| with a critical architectural decision regarding **Application-Level Sessions**. | ||||||
|
|
||||||
| The open question the working group must decide on is: **"Should the protocol | ||||||
| support application sessions (or not)?"** | ||||||
|
|
||||||
| Developers building agents and tools frequently need a way to track logical | ||||||
| state across multiple turns of a conversation. However, it is currently unclear | ||||||
| whose responsibility it is to maintain that state. We must decide whether to | ||||||
| standardize a formal session concept at the data/application layer, or | ||||||
| completely remove the concept of sessions from the protocol and push the | ||||||
| responsibility of tracking state references entirely to the client via explicit | ||||||
| state handles. | ||||||
|
|
||||||
| ## **Use Cases for Application Sessions** | ||||||
|
Contributor
I think we still need to take a step back and start by coming to consensus on what set of use-cases we actually need to support. These four examples are interesting as hypotheticals, but if no one is actually doing any of this today, then the examples aren't useful. I think that in order to support the complexity of sessions in the protocol, we should require evidence of concrete real-world use-cases that are common enough to justify the complexity we'd be taking on.
Collaborator
Author
I don't think that's a fair callout. Sessions today don't work. As described above, the client and server can't agree on them, which makes implementing something like this impossible.
Contributor
I agree that they don't work today in the general case, in the sense that you can't use any arbitrary client with any arbitrary server and expect it to work properly. But I think we've heard that there are specific cases where people control both the clients and servers and are leveraging sessions for some functionality where those particular clients and servers do agree on things like the scope of the session. It seems like it would be useful to get a list of those use-cases along with some idea of how many people are using them. But even setting aside examples of what people are actually doing today, I think there's still an important element of what things people want to do. We currently have no way of knowing how many people will actually want to do anything like these hypothetical use-cases. If no one (or very few people) want to implement one of these use-cases, then it's not worth the complexity of supporting it, and we should stop considering it. My overall point here is that before we commit to supporting a given use-case, we should first have confidence that enough people will actually use it to make it worth supporting it. Right now, I don't see a strong signal of that -- we've had a lot of hypothetical discussion, but very little real-world input on use-cases. In the absence of concrete use-cases that we have evidence that enough people are interested in, I would lean heavily toward option B.

Real-world context: we built an MCP proxy and hosting platform, and the servers that developers are most interested in building are inherently stateful: database connections with cursors, multi-step deployment workflows, auth flows that unlock additional tools. The simple stateless servers (search, lookup, linting) are less likely to need this. The decision about sessions disproportionately affects the servers most likely to need to run remotely, the MCP servers people will actually build businesses around. |
||||||
|
|
||||||
| To understand the need for application-level state, we can look at a few | ||||||
| progressive examples of how tools currently rely on state across multiple turns | ||||||
| of a conversation. Below are the **solution-neutral logical flows** for these | ||||||
| interactions. | ||||||
|
|
||||||
| ### **Simple Counter / Accumulator** | ||||||
|
|
||||||
| The most basic form of application state is a simple accumulator where a server | ||||||
| remembers previous interactions. For example, a `count()` tool that increments | ||||||
| every time it is called by the same user or within the same conversational | ||||||
| thread. | ||||||
|
|
||||||
| **Logical Flow:** | ||||||
|
|
||||||
| ``` | ||||||
| User: "Start counting" | ||||||
| tool/call: count() -> Returns 0 | ||||||
| tool/call: count() -> Returns 1 | ||||||
| tool/call: count() -> Returns 2 | ||||||
| ``` | ||||||
|
|
||||||
| ### **E-Commerce / Shopping Cart** | ||||||
|
|
||||||
| In more complex workflows, tools require a specific sequence of operations where | ||||||
| state is built up over time. An e-commerce agent, for instance, needs to add | ||||||
| multiple items to a shopping cart across several distinct tool calls before | ||||||
| finally executing a checkout operation. | ||||||
|
|
||||||
| **Logical Flow:** | ||||||
|
|
||||||
| ``` | ||||||
| User: "I want to buy shoes and socks" | ||||||
| tool/call: add_item("shoes") | ||||||
| tool/call: add_item("socks") | ||||||
| tool/call: checkout() -> Processes order for [shoes, socks] | ||||||
| ``` | ||||||
|
|
||||||
| ### **Progressive Discovery of Tools** | ||||||
|
|
||||||
| Some tools are deliberately hidden to prevent overwhelming the LLM's context | ||||||
| window or to enforce security gates. State allows an agent to call a tool that | ||||||
| "unlocks" deeper capabilities or different toolsets dynamically for the | ||||||
| remainder of the interaction. | ||||||
|
|
||||||
| **Logical Flow:** | ||||||
|
|
||||||
| ``` | ||||||
| User: "Query the production database" | ||||||
| tool/call: list_tools() -> Returns: [connect] | ||||||
| tool/call: execute_query("SELECT 1") -> ERROR: Tool not found | ||||||
| tool/call: connect($DATABASE_URI) -> Success | ||||||
| tool/call: list_tools() -> Returns: [execute_query, list_tables, ...] | ||||||
|
Separate from this, but what would be the trigger for the client to call list_tools() again here?
Contributor
I was going to point this out as well. This is going to be a problem, because IIUC the SDKs handle caching the tool list and won't refresh until they have some indication that the tool list has changed. While it's true that we are going to retain the ability for the client to subscribe to tool-list-changed notifications as an optional optimization, there won't be a way to tie that notification to the client that called I think this would be a problem only if we go with option A below. And under that option, if we do decide that this particular use-case is important, there are other ways we could consider handling it. For example, we could put a bit in the tool call response that tells the client to invalidate its tool list cache.
Collaborator
Author
I think we've previously discussed a combination of things like TTLs and notification-style changes (e.g. returning an indicator on a tool result that info needs to change). I agree there's future work here, but I think it's solvable (and likely has some value outside of these specific use-cases).

It's an advantage of handles, because then you don't need to hide the execute_query tool. If only the connect tool is given, then the model may not believe that using it will help it achieve its goals without the other tools being present in the context to use that connection as well. |
||||||
| tool/call: list_tables() -> Success | ||||||
| ``` | ||||||
|
|
||||||
| ### **Object Creation / Multi-step Provisioning** | ||||||
|
|
||||||
| When creating complex resources, an agent may need to iteratively build a | ||||||
| configuration before executing the final creation step. This requires holding a | ||||||
| "draft" state across multiple actions. | ||||||
|
|
||||||
| **Logical Flow:** | ||||||
|
|
||||||
| ``` | ||||||
| User: "Provision a new VM with 16GB RAM" | ||||||
| tool/call: init_vm("web-server") -> Success | ||||||
| tool/call: set_ram(16) -> Success | ||||||
| tool/call: set_cpu(4) -> Success | ||||||
| tool/call: deploy_vm() -> Success | ||||||
| ``` | ||||||
|
|
||||||
| ## **Proposed Solutions** | ||||||
|
|
||||||
| The working group is split between two primary proposals for handling these | ||||||
| application-level use cases. | ||||||
|
|
||||||
| ### **Option A: Adding Data Layer Sessions** | ||||||
|
|
||||||
| **Link:** [Transports-WG PR | ||||||
| \#20](https://github.com/modelcontextprotocol/transports-wg/pull/20) | ||||||
|
|
||||||
| **Outline:** Instead of relying on the transport layer (HTTP/STDIO) to manage | ||||||
| sessions, this proposal introduces explicit, standardized session constructs at | ||||||
| the *Data Layer*. The protocol would explicitly define how to initialize, | ||||||
| maintain, and terminate application contexts independent of the underlying | ||||||
| transport connection. | ||||||
|
|
||||||
| **Implementation Example (Implicit Tool State):** With data layer sessions, the | ||||||
| session ID is negotiated once and passed in the protocol envelope. Because the | ||||||
| server inherently knows which session is making the request, the tool signatures | ||||||
| themselves remain clean and rely on implicit state. | ||||||
|
|
||||||
| ``` | ||||||
| # The session is explicitly negotiated at the data layer | ||||||
| session_create(context="user_123") | ||||||
|
|
||||||
| # 1. Simple Counter | ||||||
| count() # Returns 0 | ||||||
| count() # Returns 1 | ||||||
|
|
||||||
| # 2. Shopping Cart | ||||||
| add_item("shoes") # Adds to the session's cart | ||||||
| add_item("socks") | ||||||
| checkout() | ||||||
|
|
||||||
| # 3. Capability Unlocking (Database) | ||||||
| list_tools() # Returns: [connect] | ||||||
| connect($DATABASE_URI) # State mutates silently for this session | ||||||
| list_tools() # Returns: [connect, execute_query, list_tables, ...] | ||||||
|
|
||||||
| # 4. Object Creation (VM Provisioning) | ||||||
| init_vm("web-server") # Context stored in session | ||||||
| set_ram(16) | ||||||
| set_cpu(4) | ||||||
| deploy_vm() | ||||||
| ``` | ||||||
|
|
||||||
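The implicit-state flow above can be sketched in Python. This is a minimal illustration rather than the design proposed in PR #20: the `SessionServer` class, the `sess_` ID format, and the `call` dispatcher are all hypothetical names, and the session ID is passed as a separate argument to stand in for the protocol envelope.

```python
from dataclasses import dataclass, field
from uuid import uuid4


@dataclass
class SessionState:
    """Per-session application state held by the server (hypothetical shape)."""
    counter: int = 0
    cart: list[str] = field(default_factory=list)


class SessionServer:
    """Toy server that keys all tool state on a session ID from the envelope."""

    def __init__(self) -> None:
        self._sessions: dict[str, SessionState] = {}

    def session_create(self) -> str:
        # Hypothetical ID format; the real negotiation is not modeled here.
        session_id = f"sess_{uuid4().hex}"
        self._sessions[session_id] = SessionState()
        return session_id

    def call(self, session_id: str, tool: str, *args: str):
        # The session ID travels outside the tool arguments, so the tool
        # signatures themselves carry no state parameter.
        state = self._sessions[session_id]
        if tool == "count":
            value = state.counter
            state.counter += 1
            return value
        if tool == "add_item":
            state.cart.append(args[0])
            return None
        if tool == "checkout":
            order, state.cart = state.cart, []
            return order
        raise ValueError(f"unknown tool: {tool}")


server = SessionServer()
session_a = server.session_create()
session_b = server.session_create()

first = server.call(session_a, "count")    # first call in session A
second = server.call(session_a, "count")   # second call in session A
other = server.call(session_b, "count")    # session B has its own counter

server.call(session_a, "add_item", "shoes")
server.call(session_a, "add_item", "socks")
order = server.call(session_a, "checkout")
```

In this sketch the tool names never mention a cart or counter ID; the server resolves the right state purely from the session it was handed.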
| #### Advantages: | ||||||
|
|
||||||
| * **Lower implementation burden:** Removes the need for the agent to manage | ||||||
| state, so state tracking is programmatic rather than dependent on the model. | ||||||
|
|
||||||
| #### Disadvantages: | ||||||
|
|
||||||
| * **Protocol Complexity:** Retains the concept of state within the protocol | ||||||
|
Contributor
I think another important aspect of this is that this option requires us to define how each protocol mechanism interacts with sessions, and that will increase complexity in both the client and the server. For example, in the "capability unlocking" example, we'd need to define the tool list as being a function of the session, and both clients and servers would need to understand that. |
||||||
| definition, requiring servers to manage state lifecycles, TTLs (Time to Live), | ||||||
|
nit: The bigger issue with this option is the extra complexity of the session handshake and when it should happen.
Collaborator
Author
I was trying to capture the complexity that @markdroth mentioned here -- essentially protocol-level TTLs and changes between interactions of requests in a session -- but didn't quite nail it. |
||||||
| and garbage collection. | ||||||
|
|
||||||
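To make the lifecycle burden concrete, here is a minimal sketch of the kind of bookkeeping a server would take on: a session store with a sliding TTL and an explicit garbage-collection pass. The `TtlSessionStore` class and its policy (expiry refreshed on each read) are assumptions for illustration; neither proposal specifies them.

```python
import time
from typing import Callable, Optional


class TtlSessionStore:
    """Hypothetical in-memory session store with sliding expiry."""

    def __init__(
        self,
        ttl_seconds: float,
        clock: Callable[[], float] = time.monotonic,
    ) -> None:
        self._ttl = ttl_seconds
        self._clock = clock
        # session_id -> (expires_at, state)
        self._entries: dict[str, tuple[float, dict]] = {}

    def put(self, session_id: str, state: dict) -> None:
        self._entries[session_id] = (self._clock() + self._ttl, state)

    def get(self, session_id: str) -> Optional[dict]:
        entry = self._entries.get(session_id)
        if entry is None:
            return None
        expires_at, state = entry
        if self._clock() >= expires_at:
            # Expired: the caller cannot tell this apart from "never existed".
            del self._entries[session_id]
            return None
        # Sliding expiry: each successful read extends the lifetime.
        self._entries[session_id] = (self._clock() + self._ttl, state)
        return state

    def collect_garbage(self) -> int:
        """Drop every expired session and return how many were removed."""
        now = self._clock()
        expired = [sid for sid, (exp, _) in self._entries.items() if now >= exp]
        for sid in expired:
            del self._entries[sid]
        return len(expired)


# A fake clock keeps the example deterministic.
now = [0.0]
store = TtlSessionStore(ttl_seconds=60, clock=lambda: now[0])
store.put("sess_a", {"cart": ["shoes"]})
store.put("sess_b", {"cart": []})

now[0] = 30.0
alive = store.get("sess_a")          # refreshes sess_a; sess_b is untouched

now[0] = 75.0
removed = store.collect_garbage()    # only sess_b has passed its expiry
```

Even this small version has to pick a policy for expiry, refresh, and cleanup, which is the sort of behavior a protocol-level session would need to define for every server.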
| #### Open Questions | ||||||
|
|
||||||
| As we weigh the advantages and disadvantages of both proposals, multi-agent | ||||||
| orchestration presents a significant unresolved challenge: | ||||||
|
Comment on lines +196 to +197
Collaborator
nit: should this be after both proposals?
Collaborator
Author
I moved it here because it seemed like the open questions only applied to Option A. |
||||||
|
|
||||||
| * **Sub-Agent Orchestration:** It is currently unclear how sessions or state | ||||||
|
Isn't this just up to the client application to determine? It can either use the session from the parent or isolate depending on its use case. In that way it is somewhat similar to option B where option B could initialize the subagent with some context including required handles.
Collaborator
Author
I think the answer to that is yes, but in this case the session data is handled by the application and not the agent. So it's likely going to need to be standardized in some fashion -- if half the servers assume that a subagent should use a new session and half use the existing session, servers will have a hard time matching both. |
||||||
| should be handled when a primary agent delegates work to a sub-agent. Should | ||||||
| a sub-agent instantiate a completely **new session**? Should it receive a | ||||||
| **copy/fork** of the parent agent's session so it has the necessary context | ||||||
| but cannot mutate the parent's state? Or should they share the exact same | ||||||
| `session_id` (which risks the sub-agent polluting the parent's context with | ||||||
| unintended operations)? | ||||||
|
|
||||||
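The three sub-agent choices in this open question can be compared side by side. The sketch below assumes a plain dictionary as the session store and hypothetical `new_session` and `fork_session` helpers; it only illustrates the isolation semantics and is not a proposed API.

```python
import copy
from uuid import uuid4

# Hypothetical server-side store: session_id -> application state.
sessions: dict[str, dict] = {}


def new_session() -> str:
    """Create an empty session."""
    session_id = f"sess_{uuid4().hex}"
    sessions[session_id] = {"cart": []}
    return session_id


def fork_session(parent_id: str) -> str:
    """Create a session that starts as a deep copy of the parent's state."""
    session_id = f"sess_{uuid4().hex}"
    sessions[session_id] = copy.deepcopy(sessions[parent_id])
    return session_id


parent = new_session()
sessions[parent]["cart"].append("shoes")

isolated = new_session()        # choice 1: sub-agent starts from nothing
forked = fork_session(parent)   # choice 2: sub-agent sees a private copy
shared = parent                 # choice 3: sub-agent reuses the same ID

sessions[forked]["cart"].append("socks")   # does not reach the parent
sessions[shared]["cart"].append("hat")     # mutates the parent's state
```

After these calls the parent's cart contains the item added through the shared ID but not the one added through the fork, which is the pollution risk the bullet above describes.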
| ### **Option B: Sessionless MCP via Explicit State Handles** | ||||||
|
|
||||||
| **Link:** [Transports-WG PR \#25](https://github.com/modelcontextprotocol/transports-wg/pull/25) | ||||||
|
|
||||||
| **Outline:** Do not add sessions to the protocol at all. Instead, encourage a | ||||||
| completely stateless protocol where servers return "explicit state handles" | ||||||
| (e.g., tokens, cursors, or context IDs) in their tool responses. The client is | ||||||
| strictly responsible for storing this handle and passing it back as an argument | ||||||
| in subsequent related requests. | ||||||
|
|
||||||
| **Implementation Example (Explicit State Handles):** With explicit handles, | ||||||
| there are no sessions. The client explicitly requests a state handle (like a | ||||||
| basket ID or access token) and must manually inject it back into subsequent tool | ||||||
| arguments. | ||||||
|
|
||||||
| ``` | ||||||
| # 1. Simple Counter | ||||||
| counter = create_counter() # returns { "counter_id": "cnt_123" } | ||||||
| count(counter) # Returns 0 | ||||||
| count(counter) # Returns 1 | ||||||
|
|
||||||
| # 2. Shopping Cart | ||||||
| basket = create_basket() # returns { "basket_id": "bsk_a1b2c3" } | ||||||
| add_item(basket, "shoes") | ||||||
| add_item(basket, "socks") | ||||||
| checkout(basket) | ||||||
|
|
||||||
| # 3. Capability Unlocking (Database) | ||||||
| db = connect($DATABASE_URI) # returns { "connection_id": "db_prod_1" } | ||||||
| list_tables(db) | ||||||
| execute_query(db, "SELECT 1") | ||||||
|
|
||||||
| # 4. Object Creation (VM Provisioning) | ||||||
| vm = init_vm("web-server") # returns { "draft_id": "vm_draft_99" } | ||||||
| set_ram(vm, 16) | ||||||
| set_cpu(vm, 4) | ||||||
| deploy_vm(vm) | ||||||
| ``` | ||||||
|
|
||||||
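For comparison, the shopping-cart flow with explicit handles might look like the following on the server side. This is a sketch under assumptions: the `HandleServer` class is hypothetical, the handle format only mimics the `bsk_` example above, and a handle is simply invalidated at checkout.

```python
from uuid import uuid4


class HandleServer:
    """Toy server where every stateful tool takes an explicit handle."""

    def __init__(self) -> None:
        self._baskets: dict[str, list[str]] = {}

    def create_basket(self) -> dict[str, str]:
        # The handle is ordinary tool output; the protocol never sees it.
        basket_id = f"bsk_{uuid4().hex[:6]}"
        self._baskets[basket_id] = []
        return {"basket_id": basket_id}

    def add_item(self, basket_id: str, item: str) -> None:
        self._baskets[basket_id].append(item)

    def checkout(self, basket_id: str) -> list[str]:
        # Checkout consumes the basket, so the handle stops resolving.
        return self._baskets.pop(basket_id)


server = HandleServer()

# The client (or the model) must remember each handle and pass it back.
basket_a = server.create_basket()
basket_b = server.create_basket()

server.add_item(basket_a["basket_id"], "shoes")
server.add_item(basket_b["basket_id"], "socks")
server.add_item(basket_b["basket_id"], "hat")

order_a = server.checkout(basket_a["basket_id"])
order_b = server.checkout(basket_b["basket_id"])
```

Two baskets can be interleaved within one conversation because nothing is implied by the connection; the cost is that every call must carry the right handle.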
| **Advantages:** | ||||||
|
|
||||||
|
|
||||||
| * **Minimal Complexity:** Protocol complexity is minimal, requiring no | ||||||
| additional changes. | ||||||
|
Comment on lines +248 to +249
Member
Does that mean we're pushing all of the responsibility for creating and maintaining "handles" and their associated state onto the developer?
Collaborator
For the server developer, yes. If you were, say, building a session-backed shopping cart, then today you'd have tools: and in the proposal you'd have Before, the server developer would have to map

We built an MCP hosting and proxy platform and this is the exact problem we've been designing around. The stateful servers (database connections, auth flows, multi-step provisioning) are the ones that need the most infrastructure support. Option B means every server author independently implements handle management with varying conventions: inconsistent handle formats, no standard TTL behavior, and no way for infrastructure to provide session management transparently. A standardized session primitive lets the platform layer handle this once instead of every server reinventing it. |
||||||
|
|
||||||
| **Disadvantages:** | ||||||
|
The main downside I see of this is that it adds a burden to the client's orchestration to be able to pass the correct handles back and forth. I'm not saying this is overriding but just to make sure we understand this, it would mean one of two things on the client side:
The advantage of a session id is it would remove the need for (1) and (2) and would do so in a general way. The counterargument to this is that, as we expect models and architectures to get better at supplying the right context, (1) above should become less and less of an issue.
Contributor
One disadvantage that we may want to call out here is that this approach would work for tools only, not (e.g.) resources. In theory, there could be cases where reading a particular resource mutates session state, and that wouldn't be covered by this approach. That having been said, I'm not sure this is actually a real case. See my comment above about stepping back to agree on use-cases. |
||||||
|
|
||||||
| * **Breaking Change:** If applications are using sessions today for application | ||||||
|
Another thing to consider in this vein is that today, clients likely have the infrastructure to maintain session ids for the MCP servers they are interacting with in some form. If we remove sessions completely, it will be harder to add back later, because those clients likely will have removed that infrastructure.
Contributor
I think that because the semantics of sessions are changing, we'd probably need a breaking API change in the SDKs in either option, so I'm not sure this is really a disadvantage of option B specifically.
Collaborator
Author
If today there are use cases that use sessions, those tools would need to be re-written to use explicit state handles instead. I think that the use-cases are limited to use-cases where clients and servers are both under control of the developer, but I don't think that means they don't exist. The feedback from the SDK owners has been "lots of people are asking about this" including things for resuming sessions and associating with the session_id.
Contributor
In option A, I think any existing tools using the current session mechanism would need to be changed to use the new mechanism anyway, because the semantics of the current session mechanism are not well defined, so we can't guarantee that things will continue to work right for existing code anyway. I believe you were making this same argument in 1442 in the context of removing initialize and adding sessions/create: the existing mechanism isn't well defined and therefore not useful, and we don't want to surprise people with behavior changes. |
||||||
| state, it would require developers to update their tools and clients to use | ||||||
| explicit state handles instead. | ||||||
|
|
||||||
| ## **Decision** | ||||||
|
|
||||||
| This decision was discussed at the [Core Maintainer's Meeting on | ||||||
| 4/1](https://github.com/modelcontextprotocol/modelcontextprotocol/discussions/2536) | ||||||
| and the decision was made to move forward with the "no-session" proposal. | ||||||
I suspect (but correct me if I'm wrong) that sessions were originally added as a way of essentially shoehorning the existing stdio transport into HTTP. If so, it may be worth pointing out as background information that the mechanism just flat-out doesn't actually work as originally intended, because the idea was that all requests on a session would go to the same server instance, and that can't actually be guaranteed with HTTP (except perhaps in certain special cases that are far from the norm).

I'm not sure this is quite correct. It was a very intentional decision for MCP to be a stateful protocol and the session ID was supposed to be the key for that state. The deployment difficulty was accepted as a trade-off for making stateful interactions more natural. Whether or not that was the right trade-off (or necessary trade-off) is another story.
See modelcontextprotocol/modelcontextprotocol#102 for some background (note: Justin was one of the co-creators of MCP).

I get that it was designed as a stateful protocol, but wasn't it initially designed for stdio, where that statefulness is a more natural fit? Or was the HTTP transport part of the initial design?