feat: upstream pagination, iterators, embeds, and new API methods #12
Backport major features from upstream the-library-code/dspace-rest-python v0.1.16:

Pagination & iterators:
- Add paginated() decorator for automatic pagination handling
- Add *_iter methods: get_bundles_iter, get_bitstreams_iter, get_communities_iter, get_collections_iter, get_users_iter, search_groups_by_metadata_iter, get_resource_policies_iter

Embeds support:
- Add parse_params() helper for building query params with embeds
- Add embeds parameter to: get_dso, create_dso, update_dso, get_bundles, get_bitstreams, get_communities, get_collections, get_items, get_users, search_objects

Search enhancements:
- Add configuration parameter to search_objects
- Return SimpleDSpaceObject for search results instead of full DSpaceObject
- Add search_objects_iter with paginated iteration

New API methods:
- create_item_version: Create versioned copies of items
- resolve_identifier_to_dso: Resolve handles/DOIs/UUIDs to DSOs
- create_resource_policy: Create authorization policies on objects

New model classes:
- BitstreamFormat: Model for bitstream format metadata
- SearchResult: Model for discovery search results

Infrastructure:
- Add __version__ via importlib.metadata
- Add smart_open support (optional) for remote bitstream paths
- Add functools import for decorator support
Pull request overview
This PR backports major features from upstream dspace-rest-python v0.1.16, adding pagination support, HAL+JSON embeds, and new API methods while preserving all dataquest-dev customizations.
Changes:
- Added pagination decorator and iterator methods for resource retrieval (bundles, bitstreams, communities, collections, users, groups, search results)
- Implemented embeds support across all major retrieval/mutation methods using new parse_params() helper
- Added new API methods: create_item_version, resolve_identifier_to_dso, create_resource_policy
- Introduced BitstreamFormat and SearchResult model classes
- Integrated optional smart_open support for remote file URIs in bitstream creation
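To illustrate the embeds change, here is a minimal sketch of how a helper like parse_params() could fold requested embeds into a query-parameter dict. The signature, the copy-then-merge behaviour, and the choice to emit one embed entry per name are assumptions for illustration; the helper in client.py may build the value differently.

```python
from urllib.parse import urlencode


def parse_params(params=None, embeds=None):
    """Illustrative sketch only: return a copy of `params` with an `embed` entry added."""
    merged = dict(params or {})  # copy so the caller's dict is not mutated
    if embeds:
        merged["embed"] = list(embeds)  # assumption: one repeated query key per embed
    return merged


# Encode the result the way an HTTP client would for repeated keys.
query = urlencode(
    parse_params({"size": 20}, embeds=["bundles", "owningCollection"]),
    doseq=True,
)
```

Calling the helper with no embeds returns the parameters unchanged, so existing callers that never pass embeds would see the same request as before.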
Reviewed changes
Copilot reviewed 2 out of 2 changed files in this pull request and generated 3 comments.
| File | Description |
|---|---|
| dspace_rest_client/models.py | Added BitstreamFormat and SearchResult model classes to support new API features |
| dspace_rest_client/client.py | Added pagination decorator, iterator methods, embeds support, version management, identifier resolution, and resource policy creation |
    if smart_open is not None:
        file = (name, smart_open.open(path, 'rb'), mime)
    else:
        file = (name, open(path, 'rb'), mime)
The file handle opened with either smart_open.open() or open() is never explicitly closed. This could lead to resource leaks. Consider using a context manager (with statement) to ensure the file is properly closed after use.
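One way to address this is sketched below. It assumes the upload can happen inside the with block; open_bitstream_source, upload_bitstream, and the send callback are hypothetical names standing in for the real create_bitstream code path, which is not shown in this review.

```python
try:
    import smart_open  # optional dependency, only used when installed
except ImportError:
    smart_open = None


def open_bitstream_source(path):
    """Open `path` in binary mode, preferring smart_open when it is available."""
    opener = smart_open.open if smart_open is not None else open
    return opener(path, "rb")


def upload_bitstream(name, path, mime, send):
    """Hypothetical helper: `send` stands in for the HTTP call that consumes the file tuple."""
    with open_bitstream_source(path) as handle:
        file = (name, handle, mime)
        # The handle is closed when this block exits, even if send() raises.
        return send(file)
```

The request has to complete before the block exits, so any lazy or streaming upload would need the with statement to wrap the whole call rather than only the tuple construction.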
    mimetype = None
    supportLevel = None
    internal = False
    extensions = []
Using a mutable default value (empty list) as a class attribute can cause unintended sharing of state between instances. If one instance modifies this list, all instances will see the change. Initialize extensions to None and set it to an empty list in __init__, or use extensions = None as the default.
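The sharing behaviour can be reproduced in isolation. SharedFormat and IsolatedFormat below are hypothetical stand-ins for the model class, not the classes in models.py.

```python
class SharedFormat:
    extensions = []  # a single list object shared by every instance


class IsolatedFormat:
    extensions = None  # immutable class-level default

    def __init__(self):
        self.extensions = []  # fresh list per instance


a, b = SharedFormat(), SharedFormat()
a.extensions.append("pdf")  # mutates the class-level list, so b sees it too

c, d = IsolatedFormat(), IsolatedFormat()
c.extensions.append("pdf")  # only c's own list changes
```

The problem only appears when code mutates the list in place; an __init__ that always assigns a new list to self.extensions would hide it, but the class-level default still invites the mistake.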
    appliedFilters = []
    type = None

    def __init__(self, api_resource):
        super(SearchResult, self).__init__(api_resource)
Using a mutable default value (empty list) as a class attribute can cause unintended sharing of state between instances. Initialize appliedFilters to None and set it to an empty list in __init__, or use appliedFilters = None as the default.
-    appliedFilters = []
-    type = None
-    def __init__(self, api_resource):
-        super(SearchResult, self).__init__(api_resource)
+    appliedFilters = None
+    type = None
+    def __init__(self, api_resource):
+        super(SearchResult, self).__init__(api_resource)
+        # Ensure each instance has its own list of applied filters
+        self.appliedFilters = []
Summary
Backport of major features from upstream the-library-code/dspace-rest-python v0.1.16 into our fork. This PR builds on top of #11 (bugfixes and proxy support) and adds all significant new features.
Changes
Pagination & Iterators
- paginated() decorator: generic decorator that wraps any endpoint to automatically handle DSpace REST API pagination, yielding results one at a time without loading everything into memory.
- *_iter methods that use the decorator:
  - get_bundles_iter(parent, ...)
  - get_bitstreams_iter(bundle, ...)
  - get_communities_iter(...)
  - get_collections_iter(...)
  - get_users_iter(...)
  - search_groups_by_metadata_iter(query, ...)
  - get_resource_policies_iter(parent, ...)
Embeds Support
- parse_params() helper function: builds query parameter dicts with proper embed entries for HAL+JSON responses.
- embeds parameter on all major retrieval/mutation methods: get_dso, create_dso, update_dso, get_bundles, get_bitstreams, get_communities, get_collections, get_items, get_users, search_objects.
Search Enhancements
- search_objects now accepts a configuration parameter (e.g. 'administrativeView').
- Returns SimpleDSpaceObject for search results instead of full DSpaceObject (matches upstream behavior).
- search_objects_iter for paginated discovery search iteration.
New API Methods
- create_item_version(item_uuid, summary)
- resolve_identifier_to_dso(identifier)
- create_resource_policy(resource_policy, parent, eperson, group)
New Model Classes (models.py)
- BitstreamFormat (AddressableHALResource)
- SearchResult (HALResource)
Infrastructure
- __version__ exposed via importlib.metadata (falls back to 'unknown').
- smart_open integration: if installed, create_bitstream can read from remote URIs (S3, HTTP, etc.) in addition to local files.
- functools import for decorator support.
Preserved Custom Code
All dataquest-dev custom features are untouched:
- verify_response / last_err / _last_err error tracking
- get_resourcepolicy / get_owningCollection / remove_metadata
- ResourcePolicy with groupName / groupUUID
- api_post
- details parameter in search_objects
- _logger throughout
Testing
- client.py and models.py compile cleanly (py_compile).
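For reviewers unfamiliar with the pattern, the following is a simplified sketch of a pagination decorator in the spirit of paginated(). It is not the decorator from this PR: the (url, params) return convention, the fetch_json hook, and FakeClient are assumptions made so the example runs without a DSpace server. It relies only on the _embedded and page.totalPages fields of a paged HAL+JSON response.

```python
import functools


def paginated(embed_name, item_constructor):
    """Sketch: the wrapped method returns (url, params); the wrapper walks all pages."""
    def decorator(method):
        @functools.wraps(method)
        def wrapper(self, *args, **kwargs):
            url, params = method(self, *args, **kwargs)
            params = dict(params, page=0)  # start at the first page
            while True:
                body = self.fetch_json(url, params)  # hypothetical transport hook
                for raw in body.get("_embedded", {}).get(embed_name, []):
                    yield item_constructor(raw)  # one item at a time
                total_pages = body.get("page", {}).get("totalPages", 0)
                if params["page"] + 1 >= total_pages:
                    return  # last page consumed
                params["page"] += 1
        return wrapper
    return decorator


class FakeClient:
    """Stand-in client that serves three bundles across two pages."""
    PAGES = [
        {"_embedded": {"bundles": [{"name": "ORIGINAL"}, {"name": "LICENSE"}]},
         "page": {"totalPages": 2}},
        {"_embedded": {"bundles": [{"name": "THUMBNAIL"}]},
         "page": {"totalPages": 2}},
    ]

    def fetch_json(self, url, params):
        return self.PAGES[params["page"]]

    @paginated("bundles", lambda raw: raw["name"])
    def get_bundles_iter(self, parent_uuid):
        return f"/core/items/{parent_uuid}/bundles", {"size": 2}


names = list(FakeClient().get_bundles_iter("1234"))
```

Because the wrapper is a generator, no request is made until the caller starts iterating, and a caller that stops early never fetches the remaining pages.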