28 changes: 22 additions & 6 deletions CHANGELOG.md
@@ -1,3 +1,25 @@
+## Unreleased
+
+* **Potentially breaking behavior change:** native model cache defaults changed
+without breaking Dart source compatibility. `DefaultModelDownloadManager()` now
+prefers the platform shared cache on desktop/server instead of the process temp
+directory, and mobile `DefaultModelDownloadManager.auto()` without an explicit
+app-private directory now uses a best-effort temporary/cache fallback instead
+of throwing. Apps or tests that asserted the old temp path or mobile exception
+should pass an explicit cache directory or follow `MIGRATION.md`.
+
+* Added `DefaultModelDownloadManager.auto(...)` plus explicit model cache root
+constructors for shared desktop caches, app-private mobile caches,
+user-selected model libraries, and App Group containers. `auto(...)` now uses
+platform-specific or generic app-private directories on Android/iOS when
+supplied, and otherwise falls back to a best-effort temporary/cache directory
+instead of requiring application `if` branches for simple cross-platform code.
+* Updated the default native `DefaultModelDownloadManager()` constructor to use
+the per-user shared model cache on desktop/server platforms and the mobile
+app-private cache fallback, so plain `LlamaEngine(...)` remote source loads use
+a platform-appropriate default while preserving a temporary fallback for hosts
+that cannot expose a desktop cache environment.
+
 ## 0.8.9

 * Broadened the `hooks` dependency constraint to support both the existing
@@ -22,12 +44,6 @@
 the `llamadart_llama_cpp_flutter` Apple SwiftPM checksum, and
 aligned current README/website native override docs.

-* Added `DefaultModelDownloadManager.auto(...)` plus explicit model cache root
-constructors for shared desktop caches, app-private mobile caches,
-user-selected model libraries, and App Group containers. Implicit shared cache
-resolution now fails loudly on mobile and web where the OS cannot provide a
-hidden cross-developer model folder.
-
 ## 0.8.7

 * Fixed multimodal chat-template rendering so templates that force-open
77 changes: 77 additions & 0 deletions MIGRATION.md
@@ -2,6 +2,83 @@

 This document covers the major breaking upgrade paths.

+## Next release: model download/cache defaults
+
+No source migration is required for existing calls: `DefaultModelDownloadManager`
+constructors remain source-compatible, and the new mobile-specific directory
+arguments on `DefaultModelDownloadManager.auto(...)` are optional.
+
+There are two intentional runtime default changes to be aware of:
+
+1. `DefaultModelDownloadManager()` no longer defaults to the process temporary
+directory on desktop/server platforms. It now uses the same platform cache
+root as `DefaultModelDownloadManager.auto()`:
+
+| Platform | New default root |
+| --- | --- |
+| Linux | `$XDG_CACHE_HOME/llamadart/models`, or `$HOME/.cache/llamadart/models` when `XDG_CACHE_HOME` is unset |
+| macOS | `$HOME/Library/Caches/llamadart/models` |
+| Windows | `%LOCALAPPDATA%\llamadart\models`, then `%APPDATA%\llamadart\models`, then `%USERPROFILE%\AppData\Local\llamadart\models` |
+
+If a desktop/server embedder cannot expose a home/cache environment, the
+default constructor preserves compatibility by falling back to
+`Directory.systemTemp/llamadart/models`. Explicit `auto(...)` and
+`sharedCache(...)` calls still report cache-resolution errors so apps can
+choose a durable directory.
+
+2. `DefaultModelDownloadManager.auto(platform: android/ios)` without an explicit
+mobile directory no longer throws. It now uses an app-private temporary/cache
+fallback at `Directory.systemTemp/llamadart/models`. This is convenient for
+examples and rebuildable downloads, but large durable mobile model files
+should still use an app-private cache/support directory resolved by the app.
+For Flutter apps, prefer `path_provider.getApplicationCacheDirectory()` for
+re-downloadable model caches; use `getApplicationSupportDirectory()` only for
+app-owned durable support files when the app also accounts for platform
+backup/no-backup policy.
+
+Recommended cross-platform setup:
+
+```dart
+final engine = LlamaEngine(
+  LlamaBackend(),
+  modelDownloadManager: DefaultModelDownloadManager.auto(
+    // On Flutter, pass a path resolved by path_provider for the current app.
+    // For re-downloadable model caches, prefer getApplicationCacheDirectory().
+    // Desktop/server ignores this and uses the per-user shared cache.
+    appPrivateCacheDirectory: appCacheModelsDirectory,
+  ),
+);
+```
+
+If your app resolves platform-specific mobile directories ahead of time, pass
+both without adding `Platform.isAndroid` / `Platform.isIOS` branches around the
+download manager constructor:
+
+```dart
+final manager = DefaultModelDownloadManager.auto(
+  androidAppPrivateCacheDirectory: androidModelsDirectory,
+  iosAppPrivateCacheDirectory: iosModelsDirectory,
+);
+```
+
+To preserve the old temporary-cache behavior exactly on desktop/server, pass an
+explicit directory:
+
+```dart
+final manager = DefaultModelDownloadManager(
+  defaultCacheDirectory: path.join(
+    Directory.systemTemp.path,
+    'llamadart',
+    'models',
+  ),
+);
+```
+
+If your application previously called `DefaultModelDownloadManager.auto()` on
+Android/iOS and expected a `LlamaUnsupportedException`, update that test or call
+`DefaultModelDownloadManager.sharedCache()` without `cacheDirectory` when you
+specifically want to reject implicit mobile shared caches.
+
 ## `0.6.3` -> `0.6.4`

 No public API break, but Android arm64 native packaging defaults changed.
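The desktop/server lookup order in the MIGRATION.md table above can be sketched in standalone Dart. This is a hypothetical helper written for illustration, not code from this change or from `llamadart`: the name `resolveDesktopCacheRoot`, the choice to treat empty environment values as unset, and the plain string joins are all assumptions.

```dart
import 'dart:io';

/// Hypothetical helper (not part of llamadart): mirrors the documented
/// lookup order for the desktop/server default model cache root.
/// Returns null when no home/cache variable is available.
String? resolveDesktopCacheRoot(String os, Map<String, String> env) {
  // Assumption: an empty value is treated the same as an unset variable.
  String? nonEmpty(String key) {
    final value = env[key];
    return (value == null || value.isEmpty) ? null : value;
  }

  switch (os) {
    case 'linux':
      final xdg = nonEmpty('XDG_CACHE_HOME');
      if (xdg != null) return '$xdg/llamadart/models';
      final home = nonEmpty('HOME');
      return home == null ? null : '$home/.cache/llamadart/models';
    case 'macos':
      final home = nonEmpty('HOME');
      return home == null ? null : '$home/Library/Caches/llamadart/models';
    case 'windows':
      final local = nonEmpty('LOCALAPPDATA');
      if (local != null) return '$local\\llamadart\\models';
      final roaming = nonEmpty('APPDATA');
      if (roaming != null) return '$roaming\\llamadart\\models';
      final profile = nonEmpty('USERPROFILE');
      return profile == null
          ? null
          : '$profile\\AppData\\Local\\llamadart\\models';
    default:
      return null;
  }
}

void main() {
  final root = resolveDesktopCacheRoot(
    Platform.operatingSystem,
    Platform.environment,
  );
  // The documented default-constructor fallback when nothing resolves.
  // The '/' join is a simplification for this sketch.
  final fallback = '${Directory.systemTemp.path}/llamadart/models';
  print(root ?? fallback);
}
```

Returning `null` rather than throwing keeps the sketch neutral about the two documented behaviors: the default constructor falls back to the temporary directory, while explicit `auto(...)` and `sharedCache(...)` calls report a cache-resolution error.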
45 changes: 36 additions & 9 deletions README.md
@@ -280,10 +280,11 @@ import 'package:llamadart/llamadart.dart';

 Future<void> main() async {
   final String? appPrivateModelsDirectory =
-      await resolveMobileAppPrivateModelsDirectory();
+      await resolveAppPrivateModelsDirectory();

   // Desktop/server apps use a per-user shared cache. Android/iOS apps use the
-  // supplied app-private directory instead, so one code path works everywhere.
+  // supplied app-private directory when available, otherwise an app-private
+  // temporary/cache fallback. One constructor works across platforms.
   final engine = LlamaEngine(
     LlamaBackend(),
     modelDownloadManager: DefaultModelDownloadManager.auto(
@@ -309,10 +310,16 @@ Future<void> main() async {
 }
 ```

-`resolveMobileAppPrivateModelsDirectory()` represents your app storage layer, for
-example a Flutter `path_provider` application-support path on Android/iOS. On
-desktop/server, `auto(...)` ignores `appPrivateCacheDirectory` and uses the
-per-user shared model cache.
+`resolveAppPrivateModelsDirectory()` represents your app storage layer, for
+example a Flutter `path_provider` application-support path. On desktop/server,
+`auto(...)` ignores mobile app-private directory arguments and uses the per-user
+shared model cache. On Android/iOS, pass `appPrivateCacheDirectory` when a storage
+abstraction has already resolved the current platform's model directory, or pass
+`androidAppPrivateCacheDirectory` and `iosAppPrivateCacheDirectory` when you want
+one branch-free constructor call with platform-specific directories. If no mobile
+directory is supplied, `auto(...)` falls back to the app-private system
+temporary/cache directory; this is convenient for examples and rebuildable
+downloads, but app-support storage is preferable for large durable model files.

 Native/file-backed backends stream remote models into the package-managed cache,
 resume partial `.part` downloads when the server supports HTTP Range and the
@@ -326,24 +333,44 @@ and retries are rejected for local paths.

 `DefaultModelDownloadManager.auto(...)` is the recommended cross-platform
 entrypoint: desktop/server platforms use a per-user shared cache, while
-Android/iOS use the app-private directory supplied by the app storage layer.
+Android/iOS use a supplied platform-specific or generic app-private directory,
+falling back to a best-effort system temporary/cache directory when omitted.
+The default `DefaultModelDownloadManager()` constructor also uses the per-user
+shared cache on desktop/server platforms and the same mobile app-private
+temporary/cache fallback, so a plain `LlamaEngine(...)` has a platform-appropriate
+default without application `if` branches. To preserve constructor compatibility
+in unusual desktop/server embedders where no home/cache environment is available,
+the default constructor falls back to the system temporary/cache root;
+explicit desktop/server `auto(...)` and `sharedCache(...)` resolution still
+reports cache-resolution errors so apps can choose a durable directory.
 `DefaultModelDownloadManager.sharedCache()` is the explicit desktop/server shared
 cache entrypoint, so multiple `llamadart` apps that use the same stable
 `ModelSource` can reuse one downloaded file. Mobile platforms do not have a safe
 implicit cross-developer model folder: use
 `DefaultModelDownloadManager.appPrivate(cacheDirectory: ...)` for normal
-Android/iOS app storage, `userSelected(cacheDirectory: ...)` after an Android
+Android/iOS app storage resolved by the app, `userSelected(cacheDirectory: ...)` after an Android
 Storage Access Framework-style user grant, or `appGroup(cacheDirectory: ...)`
 for explicitly configured iOS/macOS App Group containers. Web backends use
 origin-scoped browser/runtime caches instead of a file-backed shared directory.

-Desktop shared-cache roots use the default `llamadart` namespace:
+For Flutter apps, prefer `path_provider` over raw path guesses on mobile:
+`getApplicationCacheDirectory()` is the closest match for re-downloadable model
+caches, while `getApplicationSupportDirectory()` is appropriate only when the
+app intentionally treats model files as durable support data and handles platform
+backup/no-backup policy as needed. `getTemporaryDirectory()` also maps to
+app-scoped cache locations such as Android `Context.getCacheDir` and Apple
+`NSCachesDirectory`, but its contents may be cleared at any time. The
+`Directory.systemTemp` fallback is therefore a compatibility fallback, not the
+recommended durable mobile model-library location.
+
+Default cache roots use the default `llamadart` namespace:

 | Platform | Default path |
 | --- | --- |
 | Linux | `$XDG_CACHE_HOME/llamadart/models`, or `$HOME/.cache/llamadart/models` when `XDG_CACHE_HOME` is unset |
 | macOS | `$HOME/Library/Caches/llamadart/models` |
 | Windows | `%LOCALAPPDATA%\llamadart\models`, then `%APPDATA%\llamadart\models`, then `%USERPROFILE%\AppData\Local\llamadart\models` |
+| Android/iOS | supplied app-private directory, preferably app cache/support resolved by the app, or `Directory.systemTemp/llamadart/models` as a best-effort cache fallback |

 Pass `namespace: 'your.namespace'` to `auto(...)` or `sharedCache(...)` to
 replace the `llamadart` path segment, or pass `cacheDirectory` to force an
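The Android/iOS row above says a supplied app-private directory wins over the temporary fallback. A minimal standalone sketch of that selection follows. It is not the package's implementation: the `MobilePlatform` enum and `pickMobileCacheDirectory` function are hypothetical, and the order "platform-specific directory first, then the generic `appPrivateCacheDirectory`, then the fallback" is an assumption, since the README text does not state which of the two supplied directories takes precedence.

```dart
/// Hypothetical stand-in for the package's platform selector.
enum MobilePlatform { android, ios }

/// Picks the mobile model cache directory.
/// Assumed precedence (not confirmed by the source): the platform-specific
/// argument, then the generic app-private argument, then the fallback.
String pickMobileCacheDirectory({
  required MobilePlatform platform,
  required String fallbackDirectory,
  String? appPrivateCacheDirectory,
  String? androidAppPrivateCacheDirectory,
  String? iosAppPrivateCacheDirectory,
}) {
  final platformSpecific = platform == MobilePlatform.android
      ? androidAppPrivateCacheDirectory
      : iosAppPrivateCacheDirectory;
  return platformSpecific ?? appPrivateCacheDirectory ?? fallbackDirectory;
}

void main() {
  // Both platform directories are passed once; no Platform.isX branch needed.
  final chosen = pickMobileCacheDirectory(
    platform: MobilePlatform.ios,
    fallbackDirectory: '/tmp/llamadart/models', // illustrative path
    androidAppPrivateCacheDirectory: '/android/cache/models',
    iosAppPrivateCacheDirectory: '/ios/Library/Caches/models',
  );
  print(chosen);
}
```

The paths in `main` are placeholders; a real app would pass directories resolved by its storage layer, such as `path_provider`.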
1 change: 1 addition & 0 deletions example/basic_app/README.md
@@ -7,6 +7,7 @@ A clean, organized CLI application demonstrating the capabilities of the `llamad
 - **Interactive Mode**: Have a back-and-forth conversation with an LLM in your terminal.
 - **Single Response Mode**: Pass a prompt as an argument for quick tasks.
 - **Automatic Model Management**: Automatically downloads models from Hugging Face if a URL is provided.
+- **Platform Cache Defaults**: Remote model URLs use `DefaultModelDownloadManager`, so desktop/server runs share the per-user `llamadart` model cache while explicit local paths are loaded directly.
 - **Backend Optimization**: Defaults to GPU acceleration (Metal/Vulkan) when available.
 - **LoRA Adapters**: Load one or more LoRA adapters with repeated `--lora` flags.
 - **Structured Output**: Pass `--grammar` for GBNF-constrained generation.
105 changes: 33 additions & 72 deletions example/basic_app/lib/services/model_service.dart
@@ -1,86 +1,47 @@
 import 'dart:io';
-import 'package:http/http.dart' as http;
-import 'package:path/path.dart' as path;

+import 'package:llamadart/llamadart.dart';
+
 /// Service for managing model downloads and local paths.
 class ModelService {
-  /// The directory where models are cached.
-  final String cacheDir;
-
   /// Creates a model service with an optional [cacheDir].
   ModelService([String? cacheDir])
-    : cacheDir = cacheDir ?? path.join(Directory.current.path, 'models');
-
-  /// Ensures the model at [urlOrPath] is available locally.
-  /// If it's a URL, it downloads it. If it's a path, it verifies existence.
-  Future<File> ensureModel(String urlOrPath) async {
-    if (urlOrPath.startsWith('http')) {
-      return await _downloadModel(urlOrPath);
-    }
-
-    final file = File(urlOrPath);
-    if (!file.existsSync()) {
-      throw Exception('Model file not found at: $urlOrPath');
-    }
-    return file;
-  }
-
-  Future<File> _downloadModel(String url) async {
-    final name = url.split('/').last.split('?').first;
-    final file = File(path.join(cacheDir, name));
+    : _downloadManager = DefaultModelDownloadManager(
+        defaultCacheDirectory: cacheDir,
+      );

-    if (file.existsSync() && file.lengthSync() > 0) {
-      return file;
-    }
-
-    if (!file.parent.existsSync()) {
-      file.parent.createSync(recursive: true);
-    }
+  final DefaultModelDownloadManager _downloadManager;

-    print('Downloading model: $name');
-    final client = http.Client();
-    try {
-      final request = http.Request('GET', Uri.parse(url));
-      final response = await client.send(request);
+  /// The directory where models are cached.
+  String get cacheDir => _downloadManager.defaultCacheDirectory;

-      if (response.statusCode != 200) {
-        throw Exception('Failed to download model: ${response.statusCode}');
+  /// Ensures the model at [urlOrPath] is available locally.
+  /// If it's a URL, it downloads it through the package-managed model cache.
+  /// If it's a path, it verifies existence.
+  Future<File> ensureModel(String urlOrPath) async {
+    final source = ModelSource.parse(urlOrPath);
+    if (source.isLocal) {
+      final file = File(source.path!);
+      if (!file.existsSync()) {
+        throw Exception('Model file not found at: ${source.path}');
       }
-
-      final contentLength = response.contentLength ?? 0;
-      var downloaded = 0;
-      final sink = file.openWrite();
-
-      await response.stream
-          .listen(
-            (chunk) {
-              sink.add(chunk);
-              downloaded += chunk.length;
-              if (contentLength > 0) {
-                final progress = (downloaded / contentLength * 100)
-                    .toStringAsFixed(1);
-                stdout.write('\rProgress: $progress%');
-              } else {
-                stdout.write(
-                  '\rDownloaded: ${(downloaded / 1024 / 1024).toStringAsFixed(1)} MB',
-                );
-              }
-            },
-            onDone: () async {
-              await sink.close();
-              print('\nDownload complete.');
-            },
-            onError: (e) {
-              sink.close();
-              if (file.existsSync()) file.deleteSync();
-              throw e;
-            },
-          )
-          .asFuture();
-
       return file;
-    } finally {
-      client.close();
     }
+
+    final entry = await _downloadManager.ensureModel(
+      source,
+      onProgress: (progress) {
+        final fraction = progress.fraction;
+        if (fraction != null) {
+          final percent = (fraction * 100).toStringAsFixed(1);
+          stdout.write('\rProgress: $percent%');
+        } else {
+          final mb = (progress.receivedBytes / 1024 / 1024).toStringAsFixed(1);
+          stdout.write('\rDownloaded: $mb MB');
+        }
+      },
+    );
+    stdout.writeln('\nDownload complete.');
+    return File(entry.filePath);
   }
 }
2 changes: 0 additions & 2 deletions example/basic_app/pubspec.yaml
@@ -9,8 +9,6 @@ environment:
 dependencies:
   llamadart:
     path: ../.. # Use local version for testing
-  http: ^1.6.0
-  path: ^1.8.3
   args: ^2.4.2
   sqlite3: ^3.1.6
   sqlite_vector: ^0.9.85
5 changes: 4 additions & 1 deletion example/chat_app/README.md
@@ -57,11 +57,14 @@ flutter test --run-skipped -t local-only \

 ### 2. Choose and Download a Model
 1. The app will open to a **Manage Models** screen.
+- Native mobile/desktop builds store downloaded model files under the app's
+application-specific cache `models` directory via `path_provider`; web builds use
+browser Cache Storage/origin-scoped runtime caches.
 2. Select one of the pre-configured models (for example: FunctionGemma 270M, Qwen3.5 0.8B/2B/4B/9B, Llama 3.2 3B, Gemma 3/3n, DeepSeek R1 distills).
 - Qwen3.5 presets now use Unsloth `Q4_K_M` GGUFs across platforms.
 - Quick picks: `0.8B` for web/older phones, `2B` for mobile + low-RAM laptops, `4B` for most native desktop/laptop runs, `9B` for desktop-class devices with more headroom.
 - Qwen3.5 small presets default to non-thinking mode for smoother latency and fewer reasoning loops; turn thinking on only when you need extra reasoning.
-3. Tap the **Download** icon. The app uses `Dio` to download the model directly to your device's documents directory.
+3. Tap the **Download** icon. The app uses `Dio` to download the model directly to your device's app-specific cache directory.
 4. Once downloaded, tap **Select** to load the model.
 - Gemma 4 E2B is included as a GGUF + `mmproj` bundle. In the current
 `llama.cpp` mtmd path used here, that projector exposes vision support but
2 changes: 1 addition & 1 deletion example/chat_app/integration_test/smoke_test.dart
@@ -37,7 +37,7 @@ void main() {

     final Directory dataDir;
     if (Platform.isAndroid || Platform.isIOS) {
-      dataDir = await getApplicationDocumentsDirectory();
+      dataDir = await getApplicationCacheDirectory();
     } else {
       dataDir = Directory(path.join(Directory.current.path, 'models'));
       if (!dataDir.existsSync()) dataDir.createSync(recursive: true);
2 changes: 1 addition & 1 deletion example/chat_app/lib/services/model_service_io.dart
@@ -32,7 +32,7 @@ class ModelServiceIO implements ModelService {

   @override
   Future<String> getModelsDirectory() async {
-    final dir = await getApplicationDocumentsDirectory();
+    final dir = await getApplicationCacheDirectory();
     final modelsDir = Directory(p.join(dir.path, 'models'));
     if (!await modelsDir.exists()) {
       await modelsDir.create(recursive: true);
2 changes: 2 additions & 0 deletions lib/src/core/models/download/model_download_manager_stub.dart
@@ -11,6 +11,8 @@ class DefaultModelDownloadManager extends ThrowingModelDownloadManager {
     String namespace = 'llamadart',
     String? cacheDirectory,
     String? appPrivateCacheDirectory,
+    String? androidAppPrivateCacheDirectory,
+    String? iosAppPrivateCacheDirectory,
     ModelCachePlatform? platform,
     Map<String, String>? environment,
     String? homeDirectory,