Add temporality awareness to openrag responses and chunk creation #130

EnjoyBacon7 · 2025-11-04T15:40:55Z

API changes:

OpenRAG now optionally accepts "datetime" as file metadata.
Providing "modified_at" and "created_at" in indexing requests overrides the file-provided creation and modification times.
"indexed_at" is ignored when provided in an indexing request (it is overridden by the current date).

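A minimal sketch of the precedence rules above, assuming timestamps travel as ISO 8601 strings. The helper name `resolve_timestamps` and the dictionary shapes are hypothetical and are not taken from the OpenRAG code base.

```python
from datetime import datetime, timezone


def resolve_timestamps(request_meta, file_meta, now=None):
    """Merge request-provided and file-provided dates (hypothetical helper)."""
    now = now or datetime.now(timezone.utc)
    resolved = {}
    for field in ("created_at", "modified_at"):
        # A value sent in the indexing request wins over the file's own value.
        value = request_meta.get(field) or file_meta.get(field)
        if value is not None:
            resolved[field] = value
    # The optional "datetime" metadata is passed through when present.
    if request_meta.get("datetime") is not None:
        resolved["datetime"] = request_meta["datetime"]
    # Any client-supplied indexed_at is discarded in favour of the current time.
    resolved["indexed_at"] = now.isoformat()
    return resolved
```

Passing `now` explicitly is only a convenience for testing; in normal use the current UTC time is taken at call time.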
v1.1

- openrag/utils/temporal.py
  - TemporalQueryNormalizer class extracts temporal filters
  - Date patterns recognition in multiple languages
  - Relative time Extraction
- openrag/components/indexer/chunker/chunker.py
  - Chunkers now add an indexed_at timestamp to documents
  - It is expected that indexed documents provide a created_at timestamp if available
- openrag/components/indexer/vectordb/vectordb.py
  - Milvus schema updated to include created_at and indexed_at fields
  - Added Temporal filtering support in vector database queries
- openrag/components/retriever.py & pipeline.py
  - Added temporal_filter parameter to all retrievers
  - Automatic temporal extraction from queries via TemporalQueryNormalizer
  - Injects current UTC datetime into system prompt
- openrag/components/reranker.py
  - Reranker now combines relevance and temporal scores using a linear decay formula
  - RERANKER_TEMPORAL_WEIGHT (default 0.3)
  - RERANKER_TEMPORAL_DECAY_DAYS (default 365)
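The reranker change can be illustrated with a short sketch. The description only says that relevance and temporal scores are combined "using a linear decay formula", so the exact formula below is an assumption: the temporal score falls linearly from 1 to 0 over the decay window, and the final score is a weighted average. Only the two setting names and their defaults come from the description; the function names are hypothetical.

```python
from datetime import datetime, timezone

RERANKER_TEMPORAL_WEIGHT = 0.3      # default from the PR description
RERANKER_TEMPORAL_DECAY_DAYS = 365  # default from the PR description


def temporal_score(doc_date, now, decay_days=RERANKER_TEMPORAL_DECAY_DAYS):
    """Linear decay: 1.0 for a document dated now, 0.0 at or beyond decay_days."""
    age_days = max((now - doc_date).total_seconds() / 86400.0, 0.0)
    return max(0.0, 1.0 - age_days / decay_days)


def combined_score(relevance, doc_date, now, weight=RERANKER_TEMPORAL_WEIGHT):
    """Weighted average of relevance and recency (assumed combination rule)."""
    return (1.0 - weight) * relevance + weight * temporal_score(doc_date, now)
```

Under these assumptions, a document older than the decay window keeps 70% of its relevance score and gains nothing from recency, so the temporal weight bounds how far recency can reorder results.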

- Added extraction of the "modified_at" field during indexing
- Added the "datetime" metadata field as the preferred field for date information

- Added formatted prompt logging in DEBUG mode
- Fixed database search with date filters to use OR logic between date fields
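The OR logic between date fields could look roughly like the sketch below, which builds a boolean filter expression as a plain string. It assumes the date fields are stored as ISO 8601 strings in a single consistent format, so that string comparison matches chronological order. The helper name and the default field list are hypothetical.

```python
def build_date_filter(start_iso, end_iso, fields=("created_at", "modified_at")):
    """Return a filter that matches when ANY date field falls in the range."""
    clauses = [
        f'({field} >= "{start_iso}" and {field} <= "{end_iso}")'
        for field in fields
    ]
    # OR between fields: a document qualifies if at least one date is in range.
    return " or ".join(clauses)
```

Joining the per-field clauses with AND instead would exclude documents where one field is missing or falls outside the range, which is presumably the behavior the fix addresses.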

paultranvan · 2025-11-05T15:46:14Z

prompts/example1/sys_prompt_tmpl.txt

+4. Temporal Awareness
+   * Pay attention to the **temporal context** of both the query and the retrieved documents.
+   * Each document includes **creation_date** and **indexed_date** metadata indicating when it was created and indexed.
+   * When the user asks about **recent events**, **latest updates**, or uses temporal references (e.g., "last week", "yesterday", "this year"), prioritize documents with **more recent dates**.


Suggested rewording with less redundancy: When the query includes temporal references (e.g., "last week", "yesterday", "this year"), prioritize documents with **more recent dates**.
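For context, a sketch of how chunk dates and the current UTC datetime might reach the LLM so that the prompt rule above has something to work with. The `{current_datetime}` placeholder, the header layout, and both function names are hypothetical. Only the `creation_date` and `indexed_date` labels come from the prompt template under review.

```python
from datetime import datetime, timezone


def format_chunk(content, creation_date=None, indexed_date=None):
    """Prefix a chunk with its dates so the LLM can compare recency."""
    header = [
        f"creation_date: {creation_date or 'unknown'}",
        f"indexed_date: {indexed_date or 'unknown'}",
    ]
    return "\n".join(header + [content])


def build_system_prompt(template, now=None):
    """Fill a hypothetical {current_datetime} placeholder with the UTC time."""
    now = now or datetime.now(timezone.utc)
    return template.replace("{current_datetime}", now.strftime("%Y-%m-%d %H:%M UTC"))
```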

paultranvan · 2025-12-01T08:56:55Z

openrag/utils/temporal.py

+        self.relative_number_pattern = r'(\d+)\s*\w+|\w+\s+(\d+)'
+
+        # English patterns for backward compatibility
+        self.english_patterns = {


The naming seems incorrect. Shouldn't it be something like common_languages_patterns?

paultranvan · 2025-12-08T08:54:51Z

openrag/utils/temporal.py

+                        return self._get_last_n_days(number)
+                    else:
+                        # Large number, likely days
+                        return self._get_last_n_days(number)


This heuristic seems a bit risky to me. What if the query is something like "5 years" or "12 months"? We'll fall back to days, right? Likewise, a query like "summarize the documents mentioning 7 eleven acquisition" would be treated as a time query for 7 days, right?
I feel like we could get a lot of false positives here.
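One possible way to reduce these false positives, sketched under assumptions: only treat a number as a relative time when it is preceded by an anchor word and followed by an explicit unit. This is not the PR's implementation. The pattern, the unit table (with months approximated as 30 days and years as 365), and the function name are all hypothetical, and the sketch is English-only.

```python
import re
from datetime import datetime, timedelta, timezone

# Approximate unit lengths in days (assumption: calendar precision is not needed).
_UNIT_DAYS = {"day": 1, "week": 7, "month": 30, "year": 365}

# Requires an anchor word AND an explicit unit, e.g. "last 7 days".
_RELATIVE_RE = re.compile(
    r"\b(?:last|past|previous)\s+(\d+)\s+(day|week|month|year)s?\b",
    re.IGNORECASE,
)


def parse_relative_range(query, now=None):
    """Return (start, end) for phrases like 'last 7 days', else None."""
    now = now or datetime.now(timezone.utc)
    match = _RELATIVE_RE.search(query)
    if match is None:
        return None  # bare numbers such as "7 eleven" are not time queries
    number = int(match.group(1))
    unit = match.group(2).lower()
    return now - timedelta(days=number * _UNIT_DAYS[unit]), now
```

With this stricter pattern, "last 5 years" maps to 1825 days rather than 5 days, and "summarize the documents mentioning 7 eleven acquisition" yields no temporal filter. The trade-off is lower recall: a bare "12 months" without an anchor word is ignored.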

EnjoyBacon7 and others added 6 commits October 22, 2025 16:51

Merge pull request #118 from linagora/dev

d37a6a5

v1.1

Update Docker images to latest versions

e184fca

Removed duplicate created_at field

b05b658

modified_at and datetime metadata handling

3ceb3f2

Added extraction for "modified_at" field in indexation
Added "datetime" metadata field as preferred field for date information

Added chunk dates to prompt for LLM

9c0a787

Added formatted prompt logging in DEBUG mode
Fixed db search with date filters to use OR logic between date fields

EnjoyBacon7 marked this pull request as draft November 4, 2025 16:00

Added documentation for temporal awareness API

103f794

dodekapod self-requested a review November 13, 2025 10:52

paultranvan reviewed Dec 8, 2025
