Add coordinate_system parameter for DPI-aware coordinate conversion #81

Open
ikoskela wants to merge 1 commit into CursorTouch:main from ikoskela:feature/coordinate-system-dpi

Conversation

@ikoskela

Summary

  • Adds optional coordinate_system parameter to Click-Tool, Type-Tool, Scroll-Tool, and Move-Tool
  • Accepts "physical" (default, no conversion) or "logical" (auto-converts to physical using DPI scale factor)
  • Uses the existing desktop.get_dpi_scaling() method — no new dependencies

The Problem

Windows has two coordinate systems: physical (actual pixels) and logical (DPI-scaled). On systems with DPI scaling above 100% (an estimated ~300-500 million Windows machines, including most modern laptops and 1440p/4K monitors), different Windows APIs return coordinates in different systems, with no indication of which one is in use.

These tools accept physical coordinates, but many APIs (Win32 GetWindowRect, .NET Cursor.Position, and other .NET methods) return logical coordinates when the calling process is not DPI-aware. Without conversion, clicks land in the wrong place:

GetWindowRect says button is at: (1097, 617)  ← logical
Click-Tool clicks at:            (1097, 617)  ← interpreted as physical
Actual button location:          (1920, 1080) ← physical (at 1.75x scaling)
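
The numbers above follow from multiplying each logical coordinate by the scale factor and rounding to the nearest pixel. A minimal sketch of that arithmetic follows; the function name `logical_to_physical` is illustrative and not taken from this PR's diff, and rounding (rather than truncating) is an assumption:

```python
def logical_to_physical(x: int, y: int, scale: float) -> tuple[int, int]:
    """Scale a logical point to physical pixels, rounding to the nearest pixel.

    Rounding is an assumption; truncating with int() would give (1919, 1079)
    for the example below.
    """
    return round(x * scale), round(y * scale)


# The example above: 175% scaling means a scale factor of 1.75.
logical = (1097, 617)
physical = logical_to_physical(*logical, scale=1.75)

# Offset between where the click should land and where it lands
# if the logical point is passed through unconverted.
miss = (physical[0] - logical[0], physical[1] - logical[1])

print(physical)  # (1920, 1080)
print(miss)      # (823, 463)
```

At 1.75x, an unconverted click misses by several hundred pixels on each axis, which is why the error is so visible on HiDPI displays.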

Solution

A simple coordinate_system parameter that defaults to "physical" (backward-compatible) but can be set to "logical" when coordinates come from APIs that return logical values:

# Existing usage — unchanged
click_tool(loc=[1920, 1080])

# New: coordinates from GetWindowRect or other logical-space APIs
click_tool(loc=[1097, 617], coordinate_system="logical")
# → auto-converts to physical [1920, 1080] using system DPI scale factor

The conversion uses desktop.get_dpi_scaling(), which already exists in desktop/service.py and calls ctypes.windll.user32.GetDpiForSystem().
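
As a rough sketch of how the pieces could fit together (this is not the PR's actual diff): `resolve_loc` is a hypothetical helper name, and `get_dpi_scaling` below is a stand-in that assumes the existing method derives the factor as GetDpiForSystem() / 96, where 96 DPI corresponds to 100% scaling. On non-Windows platforms the stand-in returns 1.0 so the example still runs.

```python
import sys
from typing import Literal, Optional, Sequence

CoordinateSystem = Literal["physical", "logical"]


def get_dpi_scaling() -> float:
    """Stand-in for desktop.get_dpi_scaling(); assumed to be system DPI / 96."""
    if sys.platform != "win32":
        return 1.0  # no Windows DPI API available; treat as 100% scaling
    import ctypes

    return ctypes.windll.user32.GetDpiForSystem() / 96.0


def resolve_loc(
    loc: Sequence[int],
    coordinate_system: CoordinateSystem = "physical",
    scale: Optional[float] = None,
) -> list[int]:
    """Return physical [x, y] for a location given in either coordinate system.

    `scale` is a hypothetical override so the conversion can be exercised
    without depending on the machine's real DPI setting.
    """
    if coordinate_system == "physical":
        return [int(loc[0]), int(loc[1])]  # default path: no conversion
    if coordinate_system == "logical":
        factor = get_dpi_scaling() if scale is None else scale
        return [round(loc[0] * factor), round(loc[1] * factor)]
    raise ValueError(f"unknown coordinate_system: {coordinate_system!r}")


print(resolve_loc([1920, 1080]))                          # [1920, 1080]
print(resolve_loc([1097, 617], "logical", scale=1.75))    # [1920, 1080]
```

Rejecting unknown values explicitly, rather than silently falling back to "physical", is a design choice in this sketch; it keeps a typo such as "logcal" from producing a misplaced click.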

Scope

Tool         Parameter Added
Click-Tool   coordinate_system: "physical" | "logical"
Type-Tool    coordinate_system: "physical" | "logical"
Scroll-Tool  coordinate_system: "physical" | "logical"
Move-Tool    coordinate_system: "physical" | "logical"

Test plan

  • Verify all four tools work normally with default coordinate_system="physical"
  • On a system with DPI > 100%, verify coordinate_system="logical" correctly converts coordinates
  • Verify desktop.get_dpi_scaling() returns correct scale factor
  • Verify backward compatibility — no existing behavior changes when parameter is omitted
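
Most of the plan above could be automated without a HiDPI machine by mocking the scale factor. The sketch below assumes a conversion helper shaped like `to_physical` and a `FakeDesktop` class; both are hypothetical stand-ins, since the real tool signatures are not shown in this description.

```python
import unittest
from unittest import mock


class FakeDesktop:
    """Hypothetical stand-in for the desktop service; only the DPI method is modelled."""

    def get_dpi_scaling(self) -> float:
        return 1.0


def to_physical(desktop, loc, coordinate_system="physical"):
    """Hypothetical conversion helper mirroring the behavior described in this PR."""
    if coordinate_system == "physical":
        return list(loc)
    if coordinate_system == "logical":
        scale = desktop.get_dpi_scaling()
        return [round(loc[0] * scale), round(loc[1] * scale)]
    raise ValueError(f"unknown coordinate_system: {coordinate_system!r}")


class CoordinateSystemTests(unittest.TestCase):
    def setUp(self):
        self.desktop = FakeDesktop()

    def test_omitted_parameter_leaves_coordinates_unchanged(self):
        # Backward compatibility: the DPI method must not even be consulted.
        with mock.patch.object(self.desktop, "get_dpi_scaling", return_value=1.75) as dpi:
            self.assertEqual(to_physical(self.desktop, [1920, 1080]), [1920, 1080])
            dpi.assert_not_called()

    def test_logical_coordinates_are_scaled(self):
        expected = {1.0: [100, 200], 1.25: [125, 250], 1.5: [150, 300], 1.75: [175, 350]}
        for scale, physical in expected.items():
            with self.subTest(scale=scale):
                with mock.patch.object(self.desktop, "get_dpi_scaling", return_value=scale):
                    self.assertEqual(to_physical(self.desktop, [100, 200], "logical"), physical)

    def test_unknown_coordinate_system_is_rejected(self):
        with self.assertRaises(ValueError):
            to_physical(self.desktop, [0, 0], "screen")
```

Saved as a module, this runs with `python -m unittest`. Verifying that get_dpi_scaling() itself returns the right factor still needs a manual check on a real display at more than 100% scaling.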

On systems with DPI scaling > 100%, coordinates from Win32 APIs like
GetWindowRect and Cursor.Position are in logical space, but these tools
expect physical coordinates. This mismatch causes clicks to land in the
wrong place on ~300-500M Windows machines with HiDPI displays.

Adds coordinate_system parameter ("physical" default, or "logical") to
Click-Tool, Type-Tool, Scroll-Tool, and Move-Tool. When set to "logical",
coordinates are auto-converted to physical using the existing
get_dpi_scaling() method. No new dependencies — uses the DPI detection
already in desktop/service.py.

Fully backward-compatible: default is "physical" (no conversion).

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
@Jeomon
Member

Jeomon commented Feb 14, 2026

What is the benefit of this approach? We already work with logical coordinates, which solves almost everything.

I only use the DPI scaling when annotating the screenshot; other than that, everything is at the same level.

@ikoskela
Author

Great point — if windows-mcp is standardizing on logical coordinates, that covers most cases.

The remaining use case is mixed-tool workflows where windows-mcp receives coordinates from external tools — some providing physical, some logical. Having the ability to specify which coordinate system is being passed eliminates ambiguity and lets the tool handle conversion rather than requiring the caller to know and convert every time.

If the team sees that as a worthwhile addition, happy to adjust the PR to align with whatever direction you're heading. If not, no worries — just wanted to contribute.
