artefactory/woodtapper

User-friendly Python toolbox for interpreting and manipulating decision tree ensembles from scikit-learn

🪵 Key Features

WoodTapper is a Python toolbox that provides:

  • Rule extraction from tree-based ensembles: Generates a final estimator composed of a sequence of simple rules, each based on features and thresholds.

  • Example-based explanations: Connects predictions to a small set of representative samples, returning the most similar examples along with their target values.

Detailed information about the modules can be found in the project documentation.

WoodTapper is fully compatible with scikit-learn tree ensemble models.

🛠 Installation

From PyPI:

pip install woodtapper

From source:

git clone https://github.com/artefactory/woodtapper.git
cd woodtapper
pip install -e ".[dev,docs]"

Warning: On Windows, a C/C++ compiler must be installed before you install woodtapper.

🌿 WoodTapper RulesExtraction module

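from sklearn.datasets import load_breast_cancer
from sklearn.model_selection import train_test_split

from woodtapper.extract_rules import SirusClassifier
from woodtapper.extract_rules.visualization import show_rules

# Illustrative data only: any tabular classification dataset can be used
X, y = load_breast_cancer(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

sirus = SirusClassifier(n_estimators=1000, max_depth=2,
                        quantile=10, p0=0.01, random_state=0)
sirus.fit(X_train, y_train)
y_pred_sirus = sirus.predict(X_test)
show_rules(sirus, max_rules=10)  # Show rules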
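
For intuition, the general idea of extracting frequent rules from a forest of shallow trees can be sketched with scikit-learn and NumPy alone. This is not WoodTapper's implementation: the helper `tree_rules`, the decile binning, and the 5% frequency threshold `p0` below are illustrative assumptions, loosely following the SIRUS recipe of binning features at quantiles, growing shallow trees, and keeping the paths that recur most often.

```python
from collections import Counter

import numpy as np
from sklearn.datasets import load_breast_cancer
from sklearn.ensemble import RandomForestClassifier


def tree_rules(estimator, feature_names):
    """Yield every root-to-node path of a fitted tree as a tuple of conditions."""
    tree = estimator.tree_
    stack = [(0, ())]  # (node id, conditions on the path leading to it)
    while stack:
        node, path = stack.pop()
        if path:
            yield path
        left, right = tree.children_left[node], tree.children_right[node]
        if left == right:  # leaf: both children are -1
            continue
        name = str(feature_names[tree.feature[node]])
        threshold = round(float(tree.threshold[node]), 3)
        stack.append((left, path + ((name, "<=", threshold),)))
        stack.append((right, path + ((name, ">", threshold),)))


data = load_breast_cancer()
X, y = data.data, data.target

# Replace each feature by its decile bin index (0..9) so that trees can only
# split at a small set of shared cut points and identical rules can recur.
cuts = np.quantile(X, np.linspace(0.1, 0.9, 9), axis=0)
X_binned = np.column_stack(
    [np.searchsorted(cuts[:, j], X[:, j]) for j in range(X.shape[1])]
)

forest = RandomForestClassifier(n_estimators=200, max_depth=2, random_state=0)
forest.fit(X_binned, y)

# Count in how many trees each rule appears.
counts = Counter()
for estimator in forest.estimators_:
    counts.update(tree_rules(estimator, data.feature_names))

# Keep only rules whose frequency across the forest exceeds p0 (assumed value).
p0 = 0.05
n_trees = len(forest.estimators_)
frequent = {rule: n / n_trees for rule, n in counts.items() if n / n_trees > p0}

for rule, n in counts.most_common(3):
    print(n / n_trees, rule)
```

Thresholds in the printed rules refer to bin indices, not raw feature values; mapping them back to the original scale would use the `cuts` array.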

🌱 WoodTapper ExampleExplanation module

from woodtapper.example_sampling import RandomForestClassifierExplained

# X_train, y_train and X_test are your own training and test data
rf_explained = RandomForestClassifierExplained(n_estimators=100)
rf_explained.fit(X_train, y_train)
# Get the 5 most similar training samples (and their targets) for each test sample
Xy_explain = rf_explained.explanation(X_test)
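
One common way to define similarity under a forest is to count how often two samples fall into the same leaf. The sketch below illustrates that idea with plain scikit-learn. It is an assumption about the general technique, not a description of how `RandomForestClassifierExplained` works internally, and the function `most_similar_examples` is a hypothetical helper.

```python
import numpy as np
from sklearn.datasets import load_breast_cancer
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split

X, y = load_breast_cancer(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)
forest = RandomForestClassifier(n_estimators=100, random_state=0)
forest.fit(X_train, y_train)


def most_similar_examples(forest, X_train, y_train, X_query, k=5):
    """Return the k training samples sharing the most leaves with each query."""
    train_leaves = forest.apply(X_train)  # shape (n_train, n_trees)
    query_leaves = forest.apply(X_query)  # shape (n_query, n_trees)
    # proximity[i, j]: fraction of trees in which query i and training
    # sample j end up in the same leaf.
    proximity = (query_leaves[:, None, :] == train_leaves[None, :, :]).mean(axis=2)
    nearest = np.argsort(-proximity, axis=1)[:, :k]  # most similar first
    scores = np.take_along_axis(proximity, nearest, axis=1)
    return X_train[nearest], y_train[nearest], scores


examples, targets, scores = most_similar_examples(
    forest, X_train, y_train, X_test[:3]
)
print(targets)  # targets of the 5 closest training samples per query
print(scores)   # corresponding leaf co-occurrence frequencies
```

The pairwise comparison builds an array of size n_query × n_train × n_trees, so for large datasets the queries would need to be processed in batches.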

🙏 Acknowledgements

This work was done through a partnership between the Artefact Research Center and the Laboratoire de Probabilités Statistiques et Modélisation (LPSM) of Sorbonne University.

   

📜 Citation

If you find the code useful, please consider citing us:

@misc{woodtapper,
  title        = {WoodTapper: a Python package for explaining decision tree ensembles},
  author       = {Sakho, Abdoulaye and Aouad, Jad and Gauthier, Carl-Erik and Malherbe, Emmanuel and Scornet, Erwan},
  year         = {2025},
  howpublished = {\url{https://github.com/artefactory/woodtapper}},
}

For SIRUS methodology, consider citing:

@article{benard2021sirus,
  title={Sirus: Stable and interpretable rule set for classification},
  author={Benard, Clement and Biau, Gerard and Da Veiga, Sebastien and Scornet, Erwan},
  year={2021}
}