Experimental syntax for nearVector search, object mapping, and partial gRPC support#351
Merged
…locally Pass lombok version to lombok-maven-plugin explicitly, as the default version is not up-to-date. See: awhitford/lombok.maven#179 (comment)
Dataset size (n. vectors): 10
Vectors in range 0.0001-0.0010 with length: 5000
===========================================
GRPCBenchTest.testGRPC: [measured 10 out of 13 rounds, threads: 1 (sequential)]
round: 0.19 [+- 0.00], round.block: 0.00 [+- 0.00], round.gc: 0.00 [+- 0.00], GC.calls: 0, GC.time: 0.00, time.total: 2.77, time.warmup: 0.88, time.bench: 1.89
GRPCBenchTest.testGraphQL: [measured 10 out of 13 rounds, threads: 1 (sequential)]
round: 0.22 [+- 0.00], round.block: 0.00 [+- 0.00], round.gc: 0.00 [+- 0.00], GC.calls: 3, GC.time: 0.00, time.total: 2.89, time.warmup: 0.71, time.bench: 2.18
…g dependency
Dataset size (n. vectors): 10
Vectors with length: 5000 in range 0.0001-0.0010
===========================================
GRPC (3 warmup, 10 benchmark): 4.0ms
warmup.round: 28.0ms total: 125ms
GraphQL (3 warmup, 10 benchmark): 32.0ms
warmup.round: 49.0ms total: 470ms
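The figures above appear to follow a warmup-then-measure pattern: a few warmup rounds, then the measured rounds whose mean is reported (for gRPC, 3 × 28.0 ms + 10 × 4.0 ms ≈ 125 ms total). A minimal sketch of such a harness follows. It assumes nothing about how GRPCBenchTest is actually implemented; `BenchSketch`, `Result`, and `run` are hypothetical names used only for illustration.

```java
import java.util.concurrent.atomic.AtomicInteger;

/** Hypothetical warmup-then-measure harness; not the code used in GRPCBenchTest. */
final class BenchSketch {

    /** Mean round times and total elapsed time, all in milliseconds. */
    static final class Result {
        final double warmupRoundMs;
        final double benchRoundMs;
        final double totalMs;

        Result(double warmupRoundMs, double benchRoundMs, double totalMs) {
            this.warmupRoundMs = warmupRoundMs;
            this.benchRoundMs = benchRoundMs;
            this.totalMs = totalMs;
        }
    }

    /** Runs the task warmupRounds times, then benchRounds times, timing each phase. */
    static Result run(int warmupRounds, int benchRounds, Runnable task) {
        long warmupNanos = time(warmupRounds, task);
        long benchNanos = time(benchRounds, task);
        return new Result(
            warmupNanos / 1e6 / warmupRounds,
            benchNanos / 1e6 / benchRounds,
            (warmupNanos + benchNanos) / 1e6);
    }

    /** Returns the elapsed nanoseconds for the given number of sequential rounds. */
    private static long time(int rounds, Runnable task) {
        long start = System.nanoTime();
        for (int i = 0; i < rounds; i++) {
            task.run();
        }
        return System.nanoTime() - start;
    }

    public static void main(String[] args) {
        // The task here is a trivial counter; a real benchmark would issue a query instead.
        AtomicInteger calls = new AtomicInteger();
        Result r = run(3, 10, calls::incrementAndGet);
        System.out.printf("rounds: %d, bench.round: %.3fms, warmup.round: %.3fms, total: %.3fms%n",
            calls.get(), r.benchRoundMs, r.warmupRoundMs, r.totalMs);
    }
}
```

Reporting the warmup mean separately makes one-off costs such as connection setup and JIT compilation visible without letting them skew the per-round figure.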
Updated benchmark:
[INFO] Running io.weaviate.integration.client.grpc.GRPCBenchTest
Dataset size (n. vectors): 10
Vectors with length: 5000 in range 0.0001-0.0010
===========================================
GRPC (3 warmup, 10 benchmark): 4.0ms
warmup.round: 26.0ms total: 121ms
GRPC.new (3 warmup, 10 benchmark): 3.0ms
warmup.round: 5.0ms total: 50ms
GraphQL (3 warmup, 10 benchmark): 31.0ms
warmup.round: 49.0ms total: 459ms
1) GRPC.new doesn't add any filters (neither do the other queries, but GRPC.new perhaps saves some time by not marshalling them)
2) Experimental syntax only supports nearVector search
…/or operators
Example query:
things.query.nearVector(
    vector,
    opt -> opt
        .limit(K)
        .where(Where.and(
            Where.property("name").eq("dyma"),
            Where.reference("hasFriend", "hasAddress", "city").gt("Monaco"),
            Where.or(
                Where.property("dob").gt("1 Jan 1970"),
                Where.property("age").gt("27"))))
        .returnProperties(fields)
        .returnMetadata(MetadataField.ID, MetadataField.VECTOR, MetadataField.DISTANCE));
Not committing this, because those filters would be invalid for the collection used in the test (only has vector, no props).
Benchmarking results unchanged for the reason above, might change later.
[INFO] Running io.weaviate.integration.client.grpc.GRPCBenchTest
Dataset size (n. vectors): 10
Vectors with length: 5000 in range 0.0001-0.0010
===========================================
GRPC (3 warmup, 10 benchmark): 4.90ms
warmup.round: 21.00ms total: 112.00ms
GRPC.new (3 warmup, 10 benchmark): 4.60ms
warmup.round: 7.67ms total: 69.00ms
GraphQL (3 warmup, 10 benchmark): 41.10ms
warmup.round: 57.67ms total: 584.00ms
**GRPC.orm** (3 warmup, 10 benchmark): 4.80ms
warmup.round: 9.33ms total: 76.00ms
Where::isEmpty ensures we do not add a filter condition if no filters were passed.
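To illustrate the idea of skipping the filter when nothing was passed, here is a minimal sketch of a composable filter with an emptiness check. It is an assumption-laden stand-in, not the client's `Where` class: `FilterSketch`, its factory methods, and `describeRequest` are hypothetical, and the request is rendered as a plain string rather than a protobuf message.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.List;
import java.util.stream.Collectors;

/** Hypothetical stand-in for a composable filter; not the client's actual Where class. */
final class FilterSketch {
    private final String operator;             // "And", "Or", or a comparison such as "Equal"
    private final String property;             // set for leaf conditions, null for And/Or groups
    private final Object value;                // comparison value for leaf conditions
    private final List<FilterSketch> operands; // nested filters for And/Or groups

    private FilterSketch(String operator, String property, Object value, List<FilterSketch> operands) {
        this.operator = operator;
        this.property = property;
        this.value = value;
        this.operands = operands;
    }

    static FilterSketch eq(String property, Object value) {
        return new FilterSketch("Equal", property, value, Collections.<FilterSketch>emptyList());
    }

    static FilterSketch and(FilterSketch... operands) {
        return new FilterSketch("And", null, null, Arrays.asList(operands));
    }

    static FilterSketch or(FilterSketch... operands) {
        return new FilterSketch("Or", null, null, Arrays.asList(operands));
    }

    /** A leaf is never empty; a group is empty when it has no non-empty operands. */
    boolean isEmpty() {
        return property == null && operands.stream().allMatch(FilterSketch::isEmpty);
    }

    @Override
    public String toString() {
        if (property != null) {
            return property + " " + operator + " " + value;
        }
        // Empty nested groups are dropped so they never reach the rendered request.
        return operands.stream()
            .filter(f -> !f.isEmpty())
            .map(FilterSketch::toString)
            .collect(Collectors.joining(", ", operator + "(", ")"));
    }

    /** Renders a request description, attaching the filter only if it is non-empty. */
    static String describeRequest(int limit, FilterSketch where) {
        String request = "limit=" + limit;
        if (where != null && !where.isEmpty()) {
            request += " filters=" + where;
        }
        return request;
    }

    public static void main(String[] args) {
        System.out.println(describeRequest(10, and()));                         // limit=10
        System.out.println(describeRequest(10, and(eq("name", "dyma"), or()))); // limit=10 filters=And(name Equal dyma)
    }
}
```

Checking emptiness recursively means a caller can build `and()` or `and(or())` unconditionally and still send a request without a filter clause.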
Added protobuf.java-util dependency for logging protobuf objects as JSON (debugging).
Orca Security Scan Summary
| Status | Check | Issues by priority | |
|---|---|---|---|
| Infrastructure as Code | View in Orca | ||
| SAST | View in Orca | ||
| Secrets | View in Orca | ||
| Vulnerabilities | View in Orca |
List::getFirst is not introduced until Java 21
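For context, `List.getFirst()` arrived with the sequenced-collection methods in Java 21, so code that must compile on older JDKs needs another way to read the first element. One compatible equivalent is sketched below; the helper name `first` and the class `FirstElementSketch` are hypothetical, and this is not necessarily the replacement the commit used.

```java
import java.util.Arrays;
import java.util.List;
import java.util.NoSuchElementException;

/** Hypothetical helper showing a pre-Java-21 equivalent of List.getFirst(). */
final class FirstElementSketch {

    /** Returns the first element, or throws NoSuchElementException if the list is empty. */
    static <T> T first(List<T> items) {
        if (items.isEmpty()) {
            throw new NoSuchElementException("list is empty");
        }
        return items.get(0);
    }

    public static void main(String[] args) {
        System.out.println(first(Arrays.asList("a", "b", "c"))); // prints "a"
    }
}
```

Throwing `NoSuchElementException` on an empty list mirrors the documented behavior of `getFirst()`, whereas a bare `get(0)` would throw `IndexOutOfBoundsException`.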
Return SearchResult<Map<String, Object>> from nearVectorUntyped
110ca82 to 615ab0b
These tests depend on the CPUs and the configuration of the host, e.g. they aren't failing on an M4 Mac but are failing in the pipeline. They need additional investigation, which we will do before v5.1.0.
615ab0b to b767535
Server returns error: UNKNOWN: explorer: get class: concurrentTargetVector Search): explorer: get class: vector search: object vector search at index things: shard things_JlFtZoNmwIqT: build inverted filter allow list: nested query: nested clause at pos 1: expected value to be string, got '[]string'
Orca Security Scan Summary
| Status | Check | Issues by priority | |
|---|---|---|---|
| Infrastructure as Code | View in Orca | ||
| SAST | View in Orca | ||
| Secrets | View in Orca | ||
| Vulnerabilities | View in Orca |
🛡️ The following SAST misconfigurations have been detected
| NAME | FILE | ||
|---|---|---|---|
| Sensitive Information Logging in Android Applications | ...l/SearchOptions.java | View in code |
Somehow, CONTAINS_ANY fails with the same error as EQUAL. Needs further investigation, not critical for alpha.
antas-marcin
approved these changes
Feb 19, 2025
This PR previews the syntax planned for the new client, implemented for a small subset of features: `nearVector` search with filtering.

All classes included in the preview are part of the `client.experimental` package; examples of their usage are in `GRPCBenchTest.java`. The new search methods use gRPC to communicate with the server and show a significant speed improvement (roughly 6x to 8x) over their GraphQL counterparts.

Additionally, this PR adds a `buildSearchRequest()` method to GraphQL's GetBuilder, so that it can be combined with `client.grpc().raw().withSearchRequest()` to get the performance gain without major code changes.