Add element_name parameter to Click-Tool and Move-Tool #80

Open
ikoskela wants to merge 1 commit into CursorTouch:main from ikoskela:feature/element-name-click

Summary

  • Adds optional element_name and window_title parameters to Click-Tool and Move-Tool
  • When element_name is provided, uses Windows UI Automation to find the element by its accessible name and click/move to its center coordinates — no coordinate estimation needed
  • window_title optionally scopes the search to a specific window (substring match)
  • Fully backward-compatible: existing loc-based calls work unchanged

Motivation

Currently, clicking a UI element requires the caller to estimate coordinates — either from State-Tool output, screenshots, or manual calculation. This is error-prone, especially on high-DPI displays where coordinate systems can be mismatched.

UI Automation can locate elements precisely by name, returning exact bounding rectangles. This PR exposes that capability directly in Click-Tool and Move-Tool, so callers can say click(element_name="Save") instead of guessing pixel coordinates.

Implementation

  • Adds _resolve_element_location() helper that uses the existing windows_mcp.uia module (no new dependencies)
  • Uses uia.Control(Name=...) for element search with Exists(maxSearchSeconds=3) timeout
  • Uses uia.WindowControl(SubName=...) for optional window scoping
  • Returns center coordinates from BoundingRectangle — ready for pyautogui
  • Clear error messages when element or window is not found
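
The lookup described in the list above might look roughly like the sketch below. It is not the PR's actual helper: the name resolve_element_location, the injected uia argument, the searchFromControl keyword, and the left/top/right/bottom fields on BoundingRectangle are assumptions, modelled on the calls the PR names (uia.Control(Name=...), uia.WindowControl(SubName=...), Exists(maxSearchSeconds=3)). Passing the module in as a parameter keeps the sketch importable on non-Windows machines.

```python
from typing import Any, Optional, Tuple


def resolve_element_location(
    uia: Any,
    element_name: str,
    window_title: Optional[str] = None,
    timeout: float = 3,
) -> Tuple[int, int]:
    """Return the (x, y) center of the element with the given accessible name.

    `uia` is assumed to behave like windows_mcp.uia; it is injected so the
    function can be exercised with a stand-in object.
    """
    search_args = {"Name": element_name}

    if window_title is not None:
        # SubName is a substring match on the window title.
        window = uia.WindowControl(SubName=window_title)
        if not window.Exists(maxSearchSeconds=timeout):
            raise LookupError(f"Window containing {window_title!r} not found")
        # Assumption: the search can be rooted at a control via this keyword.
        search_args["searchFromControl"] = window

    element = uia.Control(**search_args)
    if not element.Exists(maxSearchSeconds=timeout):
        scope = f" in window {window_title!r}" if window_title else ""
        raise LookupError(f"Element {element_name!r} not found{scope}")

    # Assumption: the rectangle exposes integer left/top/right/bottom edges.
    rect = element.BoundingRectangle
    return ((rect.left + rect.right) // 2, (rect.top + rect.bottom) // 2)
```

Raising LookupError is one way to surface the "not found" cases; the PR itself only states that the error messages are clear, not which mechanism reports them.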

Examples

# Click a button by name
click_tool(element_name="Save")

# Click within a specific window
click_tool(element_name="Submit", window_title="Settings")

# Move to an element
move_tool(element_name="Close", window_title="Notepad")

# Existing coordinate-based calls still work
click_tool(loc=[500, 300])
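
How the two calling styles could coexist can be sketched as a small dispatch step. The function name pick_target and the resolver callback are hypothetical, and giving element_name precedence when both arguments are passed is an assumption; the PR only states that element_name is an optional alternative to loc.

```python
from typing import Callable, Optional, Sequence, Tuple

Resolver = Callable[[str, Optional[str]], Tuple[int, int]]


def pick_target(
    resolve: Resolver,
    loc: Optional[Sequence[int]] = None,
    element_name: Optional[str] = None,
    window_title: Optional[str] = None,
) -> Tuple[int, int]:
    """Choose the coordinates to click or move to.

    `resolve` stands in for the UIA lookup (element name -> center point).
    """
    if element_name is not None:
        # Assumption: a name, when given, wins over explicit coordinates.
        return resolve(element_name, window_title)
    if loc is None:
        raise ValueError("Provide either loc or element_name")
    if len(loc) != 2:
        raise ValueError("loc must be a pair of [x, y] coordinates")
    x, y = loc
    return int(x), int(y)
```

With this shape, a call that passes only loc never touches UI Automation, which is what keeps existing callers unaffected.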

Test plan

  • Verify click_tool(loc=[x, y]) still works as before
  • Verify click_tool(element_name="Start") finds and clicks the Start button
  • Verify click_tool(element_name="Nonexistent") returns a clear error
  • Verify window_title scoping works correctly
  • Verify move_tool with element_name moves cursor to the element

Enable clicking and moving to UI elements by accessible name using
Windows UI Automation, without requiring coordinate estimation.

When element_name is provided, the tool uses UIA to locate the element,
gets its bounding rectangle center, and clicks/moves there. The optional
window_title parameter scopes the search to a specific window.

This is fully backward-compatible: existing loc-based calls work
unchanged. element_name is an optional alternative to loc.

Uses the existing windows_mcp.uia module — no new dependencies.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>