A small demo Streamlit application to extract simple metadata from free text and add it to a local ontology-based Knowledge Graph (KG).
This repository demonstrates using an NLP backend to extract fields such as name, organization, age, favorite reaction, and catalysis research field from user text; presenting and validating the extracted metadata in a Streamlit UI; and adding it to a local OWL Knowledge Graph.
- Streamlit front-end for interactive metadata extraction and KG management.
- Uses an ontology (OWL) as the underlying data model.
- Provides KG query UI to inspect people, organizations, reactions and research fields stored in the KG.
- `streamlitApp.py` — Main Streamlit app. Handles the UI, calls extraction functions (from `codebase.py`), performs basic validation, shows KG queries, and writes to the local ontology.
- `codebase.py` — Project logic used by the Streamlit UI (extraction calls, helper functions such as `dict2kg`, `kg_query`, `call_ollama`, etc.).
- `config.json` — Configuration used by the app (ontology paths and other settings).
- `ontologies/` — Folder containing the base ontologies and the knowledge graph files: `BaseOntology.owl` / `BaseOntology.properties` and `KnowledgeGraph.owl` / `KnowledgeGraph.properties`.
- `modelfiles/` — Folder containing the modelfiles of the LLMs used.
- `evaluation/` — Folder containing JSON files with abstracts and extracted labels.
- `LICENSE` — Project license.
- `README.md` — This file.
- Python 3.8+ (3.10/3.11 recommended)
- The following Python packages are used (install with pip):
- streamlit
- owlready2
- pandas
Note: The repository uses codebase.py for core functionality. That file may require additional packages or API keys depending on the implementation of the extraction backend (for example, a local LLM/OLLAMA client or a cloud API). Inspect codebase.py and config.json for any additional requirements.
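Whatever backend is used, the UI ultimately needs the extraction result as a structured dict. As a minimal illustration (field names and the JSON-reply assumption are hypothetical, not taken from `codebase.py`), parsing and normalizing a backend reply might look like:

```python
import json

# Hypothetical sketch: the extraction backend is assumed to return a JSON
# string containing the five metadata fields the app works with.
EXPECTED_FIELDS = ["name", "organization", "age", "favorite_reaction", "research_field"]

def parse_extraction(raw: str) -> dict:
    """Parse the backend's JSON reply; missing fields become None."""
    data = json.loads(raw)
    return {field: data.get(field) for field in EXPECTED_FIELDS}

reply = '{"name": "Ada", "organization": "ExampleLab", "age": 35}'
metadata = parse_extraction(reply)
# Fields the backend did not extract come back as None, so the UI form
# can flag them for manual completion before writing to the KG.
```

Normalizing to a fixed set of keys keeps the validation form and the KG-writing step independent of whichever LLM backend produced the reply.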
The LLMs are uploaded to Zenodo and can be found here
- Create and activate a virtual environment:

```powershell
python -m venv .venv
.\.venv\Scripts\Activate.ps1
```

- Install common dependencies:

```powershell
python -m pip install --upgrade pip
pip install streamlit owlready2 pandas
```

- Review and update `config.json` if necessary (ontology paths, URLs, or endpoints used by `codebase.py`).
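The exact keys in `config.json` depend on the implementation in `codebase.py`; a hypothetical example (key names are illustrative only, the ontology paths match the repository layout) might look like:

```json
{
  "base_ontology": "ontologies/BaseOntology.owl",
  "knowledge_graph": "ontologies/KnowledgeGraph.owl",
  "ollama_url": "http://localhost:11434",
  "model": "llama3"
}
```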
From the repository root (PowerShell):

```powershell
streamlit run .\streamlitApp.py
```

This will open the Streamlit UI in your browser. The app expects a short free-text paragraph describing a person (name, organization, age, favorite reaction, catalysis research field). Click "Digest Data" to run the extraction, validate the extracted metadata in the form on the right, and click "Add to local Knowledge Graph" to persist it to `ontologies/KnowledgeGraph_inferred.owl`.
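The actual persistence is handled by `dict2kg` in `codebase.py` via owlready2. As a simplified, library-free sketch of the mapping step that such a function conceptually performs (all predicate names below are illustrative assumptions, not the ontology's real properties):

```python
# Simplified stand-in for a dict2kg-style step: turn the validated metadata
# dict into (subject, predicate, object) triples. The real implementation
# writes owlready2 individuals into the OWL ontology instead.

def metadata_to_triples(metadata: dict) -> list:
    """Map extracted metadata to subject-predicate-object triples."""
    person = metadata["name"].replace(" ", "_")
    triples = [(person, "rdf:type", "Person")]
    # Hypothetical field-to-predicate mapping, for illustration only.
    predicate_map = {
        "organization": "worksAt",
        "age": "hasAge",
        "favorite_reaction": "hasFavoriteReaction",
        "research_field": "hasResearchField",
    }
    for field, predicate in predicate_map.items():
        value = metadata.get(field)
        if value is not None:
            triples.append((person, predicate, value))
    return triples

triples = metadata_to_triples(
    {"name": "Ada Lovelace", "organization": "ExampleLab", "age": 35}
)
```

Skipping `None`-valued fields means partially filled forms still produce a valid (if incomplete) individual in the graph.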
- The app imports everything from `codebase.py` (via `from codebase import *`). That module provides the extraction function (`call_ollama` in the current `streamlitApp.py`), KG-writing (`dict2kg`), and query functions (`kg_query`). Make sure `codebase.py` is present and configured appropriately.
- `owlready2` uses JPype under the hood; installing `owlready2` via pip should handle this, but on some systems additional Java configuration may be necessary.
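A `call_ollama`-style helper typically talks to a local Ollama server over HTTP. The sketch below follows Ollama's `/api/generate` endpoint; the model name, prompt wording, and function signatures are assumptions and will differ from the actual implementation in `codebase.py`:

```python
import json
import urllib.request

def build_payload(text: str, model: str = "llama3") -> dict:
    """Build an Ollama /api/generate request body (prompt wording is illustrative)."""
    prompt = (
        "Extract name, organization, age, favorite reaction and "
        "catalysis research field from the following text as JSON:\n" + text
    )
    # stream=False asks Ollama for a single JSON reply instead of chunks.
    return {"model": model, "prompt": prompt, "stream": False}

def call_ollama(text: str, url: str = "http://localhost:11434/api/generate") -> str:
    """Send the prompt to a locally running Ollama server and return its reply text."""
    req = urllib.request.Request(
        url,
        data=json.dumps(build_payload(text)).encode("utf-8"),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req) as resp:
        return json.loads(resp.read())["response"]
```

Requesting a non-streaming reply keeps the helper simple: the whole generation arrives in one JSON object whose `response` field holds the model output.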