Duplicate handle#9
Conversation
There was a problem hiding this comment.
Pull Request Overview
This pull request introduces support for duplicate prompts in the OpenAI batch processing flow by deduplicating identical API requests and ensuring consistent distribution of results. Key changes include updates to the batching logic in lmwrapper/openai_wrapper/batching.py, modifications in test files to validate deduplication behavior, and revisions to dependency versions and model references in several configuration and test files.
Reviewed Changes
Copilot reviewed 12 out of 12 changed files in this pull request and generated 1 comment.
Show a summary per file
| File | Description |
|---|---|
| test/test_xbatching_actual_run.py | Updates skip conditions and adjusts prompt parameters for testing purposes. |
| test/test_params.py | Changes the default small model parameter from a custom model to "gpt2". |
| test/test_openai_batching.py | Revises the duplicate prompts test to assert 3 unique prompts are processed. |
| test/test_models_common.py, test/test_huggingface_internals.py, test/test_huggingface.py, test/test_caching.py | Updates model names and references to use "gpt2". |
| pyproject.toml | Downgrades several dependency versions and updates the project version accordingly. |
| lmwrapper/openai_wrapper/batching.py | Modifies duplicate prompt handling strategy, replacing error-raising with deduplication. |
| lmwrapper/huggingface_wrapper/predictor.py | Removes an unused parameter in call configuration to align with updated API behavior. |
| README.md | Updates demo instructions and prompt examples to reflect recent changes. |
| CHANGELOG.md | Updates changelog to document added support for duplicate prompts in OpenAI batches. |
Comments suppressed due to low confidence (2)
lmwrapper/openai_wrapper/batching.py:190
- Consider adding an inline comment explaining that duplicate prompts are intentionally skipped so that a single API call’s result is distributed to all duplicate positions; this will clarify the deduplication strategy for future maintainers.
if custom_id in custom_ids:
lmwrapper/huggingface_wrapper/predictor.py:351
- Document the reason for removing the 'return_dict_in_generate' parameter to ensure that downstream code is aligned with the new response format and to improve clarity on API changes.
return_dict_in_generate=True,
| keywords = ["large language models", "openai"] | ||
| dependencies = [ | ||
| "openai~=1.79.0", | ||
| "openai~=1.55.3", |
There was a problem hiding this comment.
[nitpick] Consider adding a comment that explains the rationale behind downgrading the dependency versions, so that maintainers understand any compatibility or legacy reasons behind these changes.
No description provided.