DetectionForge

Detection-as-code pipeline with measured precision/recall against real OTRF attack captures.

by Aadarsh Kadam · github.com/aadarshkadam067



What this is

DetectionForge is a detection-as-code pipeline that treats SIEM rules like software: version-controlled, automatically tested against real OTRF attack captures, and auto-converted to three SIEM backends. Every rule ships with measured precision and recall against a 1,648-event corpus (1,164 process-creation events + 484 registry events) drawn from OTRF Security-Datasets ZIPs, with no synthetic events and no hand-crafted fixtures. The pipeline ships 20 rules covering 21 ATT&CK techniques across 8 tactics, with mean precision 0.997 and 60/60 multi-SIEM conversion success across Splunk SPL, Elastic EQL, and Microsoft Sentinel KQL. (The 21st technique is T1027 Obfuscated Files or Information: the T1059.001 PowerShell -EncodedCommand rule legitimately covers both T1059.001 and T1027.)

Each rule's precision figure is classified as earned (14 rules: the benign corpus contains same-shape events the rule correctly excludes) or structural-absence (6 rules: precision = 1.000 because the corpus contains no events the rule could match; shipped under the T1218 LOLBIN-cluster rationale). The classification is in dist/data/dashboard_meta.json per rule. Three more techniques are documented as Phase 5 deferrals rather than shipped with structurally guaranteed measurements (see Phase 5 backlog below). The same standard applies everywhere: a 1.000 number is never shown without its provenance.


Dashboard

The dashboard presents the detection corpus, the precision measurements, and the documented gaps as a single static site. It is served by any HTTP server pointed at the dist/ directory: no build step, no backend, four JSON contract files read at runtime.

Overview: headline metrics. The earned-vs-structural-absence split renders at the same prominence as the mean-precision figure; the honesty meter in the sidebar surfaces the 14/6 split, conversion tally, and FP count.

Rules: sortable, filterable, expandable table of all 20 rules. T1059.001's 0.933 precision and single false positive surface inline (highlighted), not rounded away. Filter by classification or logsource; expand any row for the Sigma source and the three converted backend queries.

Coverage: ATT&CK tactic-column grid. 26 direct technique cells and 16 parent-rollup cells (dashed). Every non-rollup cell carries its classification chip; T1059.001 and T1027 render with an FP background because the underlying rule has one.

Trends: single-snapshot state. first_snapshot_date equals built_date, so each chart renders the current value as a horizontal level with a single marker. Successive forge build runs append points and the sparklines widen automatically.

Gaps: three deferred techniques with class (Class 1 / Class 3), prerequisite type, and reason from dashboard_meta.json. The structural-absence inventory continues below this fold. Named, not hidden.


Verify in 5 minutes

git clone https://github.com/aadarshkadam067/DetectionForge.git
cd DetectionForge
python3 -m venv .venv && source .venv/bin/activate
pip install -e '.[dev,convert]'
python scripts/update_attack_data.py   # generates forge/data/attack_techniques.json
forge run                               # lint → test → convert → score → build

Or run the full pipeline in Docker:

docker compose -f docker/docker-compose.yml run --rm forge

Expected forge test summary:

20 rules — mean precision 0.997  mean recall 1.000  mean F1 0.998

Expected forge convert summary:

60/60 conversions succeeded — success rate 100.0% (threshold 95%)

The ATT&CK coverage layer is written to reports/layer.json and is loadable directly into the public Navigator at https://mitre-attack.github.io/attack-navigator/ via Open Existing Layer → Upload from Local.

View the dashboard locally

The static measurement dashboard at dist/ consumes the JSON contract files in dist/data/. forge run must complete first: the JSON files are not committed (they're build output), so a fresh clone has no data to display until the pipeline has populated dist/data/. Then:

forge run                              # populates dist/data/*.json
cd dist && python3 -m http.server 8000 # any static server works
# open http://127.0.0.1:8000/

If you open the dashboard before running the pipeline, it renders an explicit error card pointing at this same command. This is by design, so the failure mode is legible instead of a silent empty UI.

The dashboard uses in-browser Babel transformation, which produces one console warning at page load. This is intentional: it preserves the no-build-step deployment property documented in PDR §9, a production-deployable static asset with zero toolchain dependency. The dashboard ships as plain HTML/CSS/JSX read directly from disk.


Architecture

┌──────────────────┐   ┌──────────────────┐   ┌──────────────────────────┐
│  rules/*.yml     │   │  data/attack/    │   │  data/benign/...         │
│  (Sigma rules)   │   │  (TP fixtures)   │   │  data/registry_baseline  │
└────────┬─────────┘   └────────┬─────────┘   └────────────┬─────────────┘
         │                      │                          │
         └──────────┬───────────┴────────────┬─────────────┘
                    ▼                        ▼
            ┌───────────────┐        ┌───────────────┐
            │  forge lint   │        │  forge test   │
            │  (Stage 1)    │        │  (Stage 2)    │
            └───────┬───────┘        └───────┬───────┘
                    │                        │
                    ▼                        ▼
            ┌───────────────┐        ┌───────────────┐
            │ forge convert │        │ forge score   │
            │  (Stage 3)    │        │  (Stage 4)    │
            └───────┬───────┘        └───────┬───────┘
                    │                        │
                    ▼                        ▼
        ┌─────────────────────────┐  ┌────────────────────────┐
        │ reports/converted/<r>/  │  │ reports/results.json   │
        │   splunk.spl            │  │ reports/layer.json     │
        │   elastic.eql           │  │ reports/conv_matrix... │
        │   sentinel.kql          │  │                        │
        └────────────┬────────────┘  └────────────┬───────────┘
                     │                            │
                     └─────────────┬──────────────┘
                                   ▼
                          ┌────────────────┐
                          │  forge build   │
                          │  (Stage 5)     │
                          └────────┬───────┘
                                   ▼
                          ┌────────────────────────────────┐
                          │  dist/data/                    │
                          │    dashboard_meta.json         │
                          │    results.json                │
                          │    conversion_matrix.json      │
                          │    layer.json                  │
                          └────────────────────────────────┘

| Stage | Command | Inputs | Outputs |
| --- | --- | --- | --- |
| 1. Lint | forge lint | rules/**/*.yml | pass/fail with errors |
| 2. Test | forge test | rules + fixtures + baselines | reports/results.json |
| 3. Convert | forge convert | rules | reports/converted/<slug>/{splunk.spl, elastic.eql, sentinel.kql} + reports/conversion_matrix.json |
| 4. Score | forge score | reports/results.json | reports/layer.json (ATT&CK Navigator v4.5) |
| 5. Build | forge build | all reports + rule YAMLs + _meta.yml | dist/data/{dashboard_meta, results, conversion_matrix, layer}.json |

Note on Stage 5. forge build produces JSON data artifacts only, no HTML. The dashboard at dist/ is committed source (HTML/CSS/JSX, no build step) that consumes those JSON files at runtime. See PDR §9 and journey Entry 4.9 for the decision record and the rebuild rationale.


Measurements

Precision / Recall Table (all 20 rules)

The Class column distinguishes earned precision (the corpus contained same-shape events the rule correctly excluded, so discrimination is real) from struct-abs (the corpus contained no events the rule could match, so precision = 1.000 reflects absence, not discrimination; shipped under the T1218 LOLBIN-cluster rationale where applicable). See journey Entry 4.7 for the per-rule classification audit.

| Technique | Tactic | Logsource | P | R | F1 | TP | FP | Class |
| --- | --- | --- | --- | --- | --- | --- | --- | --- |
| T1003.001 | Cred Access | process_creation | 1.000 | 1.000 | 1.000 | 2 | 0 | earned |
| T1003.002 | Cred Access | process_creation | 1.000 | 1.000 | 1.000 | 1 | 0 | struct-abs |
| T1021.006 | Lateral Movement | process_creation | 1.000 | 1.000 | 1.000 | 3 | 0 | earned |
| T1033 | Discovery | process_creation | 1.000 | 1.000 | 1.000 | 2 | 0 | earned |
| T1053.005 | Persistence | process_creation | 1.000 | 1.000 | 1.000 | 2 | 0 | earned |
| T1055.001 | Defense Evasion | process_creation | 1.000 | 1.000 | 1.000 | 1 | 0 | earned |
| T1059.001 | Execution | process_creation | 0.933 | 1.000 | 0.965 | 14 | 1 | earned |
| T1059.005 | Execution | process_creation | 1.000 | 1.000 | 1.000 | 1 | 0 | struct-abs |
| T1087.001 | Discovery | process_creation | 1.000 | 1.000 | 1.000 | 8 | 0 | earned |
| T1105 | C2 | process_creation | 1.000 | 1.000 | 1.000 | 1 | 0 | earned |
| T1136.001 | Persistence | process_creation | 1.000 | 1.000 | 1.000 | 4 | 0 | earned |
| T1218.001 | Defense Evasion | process_creation | 1.000 | 1.000 | 1.000 | 1 | 0 | struct-abs |
| T1218.004 | Defense Evasion | process_creation | 1.000 | 1.000 | 1.000 | 1 | 0 | struct-abs |
| T1218.005 | Defense Evasion | process_creation | 1.000 | 1.000 | 1.000 | 2 | 0 | struct-abs |
| T1218.010 | Defense Evasion | process_creation | 1.000 | 1.000 | 1.000 | 1 | 0 | earned |
| T1218.013 | Defense Evasion | process_creation | 1.000 | 1.000 | 1.000 | 1 | 0 | struct-abs |
| T1220 | Defense Evasion | process_creation | 1.000 | 1.000 | 1.000 | 1 | 0 | earned |
| T1547.001 | Persistence | registry_event | 1.000 | 1.000 | 1.000 | 3 | 0 | earned |
| T1548.002 | Priv Esc | process_creation | 1.000 | 1.000 | 1.000 | 1 | 0 | earned |
| T1562.004 | Defense Evasion | process_creation | 1.000 | 1.000 | 1.000 | 4 | 0 | earned |

Aggregate: 20 rules · mean precision 0.997 · mean recall 1.000 · mean F1 0.998 · 14 earned / 6 structural-absence.
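
The aggregate figures can be re-derived from the TP and FP columns alone. The following is a minimal sketch, not the harness's own code: the helper name `prf` is hypothetical, and it assumes FN = 0 for every rule, which is what a recall of 1.000 across the table implies.

```python
def prf(tp: int, fp: int, fn: int = 0) -> tuple[float, float, float]:
    """Return (precision, recall, F1) from raw counts."""
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return precision, recall, f1

# (TP, FP) per rule, in the row order of the table above.
counts = [
    (2, 0), (1, 0), (3, 0), (2, 0), (2, 0), (1, 0), (14, 1), (1, 0), (8, 0), (1, 0),
    (4, 0), (1, 0), (1, 0), (2, 0), (1, 0), (1, 0), (1, 0), (3, 0), (1, 0), (4, 0),
]

scores = [prf(tp, fp) for tp, fp in counts]
mean_precision = sum(s[0] for s in scores) / len(scores)
mean_recall = sum(s[1] for s in scores) / len(scores)
mean_f1 = sum(s[2] for s in scores) / len(scores)

print(f"{mean_precision:.3f} {mean_recall:.3f} {mean_f1:.3f}")  # 0.997 1.000 0.998
```

The single false positive on T1059.001 (14 / 15 = 0.933) is the only term that pulls the mean precision below 1.000.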

Corpus: 1,648 events across two logsources:

  • 1,164-event process-creation baseline (Sysmon EID 1) drawn from 121 OTRF atomic + compound Windows ZIPs, deduplicated on (Image, CommandLine, ParentImage, Computer) and filtered via the structural attack-event exclusion (ADR-005).
  • 484-event registry baseline (Sysmon EID 12/13) drawn from 4 OTRF source ZIPs (empire_wmi_local_event_subscriptions_elevated_user, empire_schtasks_creation_execution_elevated_user, covenant_dcom_iertutil_dll_hijack, empire_dcom_shellwindows_stager), with explicit exclusion of \Microsoft\Windows\CurrentVersion\Run\ and \Microsoft\Windows\CurrentVersion\RunOnce\ paths (the labeling axiom: exclude exactly what the rule detects). Built in Phase 4 Day 3 via the ADR-006 per-logsource baseline routing.

48 structural-filter signatures total (46 process_creation + 3 registry_event, with overlap accounted for). All 20 rules clear their expected.precision threshold; the lone non-1.000 figure (T1059.001 at 0.933) is the documented cross-technique FP referenced in the T1087.001 Before/After section below, applied symmetrically. Details in journey Entries 4.3 / 4.6 / 4.7.

Multi-SIEM Conversion Matrix

All 20 rules × 3 backends = 60/60 conversions succeeded (100%). Backends: pySigma-backend-splunk 2.1.0, pySigma-backend-elasticsearch 2.0.2, pySigma-backend-kusto 1.0.1. Two logsource categories convert cleanly:

| Logsource | Rules | Conversions |
| --- | --- | --- |
| process_creation | 19 | 57/57 |
| registry_event | 1 (T1547.001) | 3/3 |
| Total | 20 | 60/60 |

Per-rule per-backend status and the actual emitted query strings are in reports/conversion_matrix.json and reports/converted/<rule>/{splunk.spl, elastic.eql, sentinel.kql}. Trailing-backslash semantics on the registry logsource were verified correct across all three backends: \CurrentVersion\Run\ does not match \RunTime or \RunOnce on the \Run\ condition. See docs/measurements/phase4-day3-registry-harness.md for the verification.
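
The trailing-backslash behaviour is easy to see with plain substring matching. This is an illustrative sketch only: it assumes the rule condition behaves like a case-insensitive substring test (Sigma's default for string matching), and the registry paths and the `contains` helper are made up for the example rather than taken from the corpus or the harness.

```python
def contains(value: str, needle: str) -> bool:
    """Case-insensitive substring test."""
    return needle.lower() in value.lower()

RUN = "\\CurrentVersion\\Run\\"
RUNONCE = "\\CurrentVersion\\RunOnce\\"

# Hypothetical TargetObject values, one per key family.
paths = {
    "run": r"HKU\S-1-5-21\Software\Microsoft\Windows\CurrentVersion\Run\Updater",
    "runonce": r"HKLM\SOFTWARE\Microsoft\Windows\CurrentVersion\RunOnce\Setup",
    "runtime": r"HKLM\SOFTWARE\Microsoft\Windows\CurrentVersion\RunTime\Cfg",
}

for name, path in paths.items():
    print(f"{name:8} Run={contains(path, RUN)} RunOnce={contains(path, RUNONCE)}")
```

Because the needle ends in a backslash, `\Run\` only matches when `Run` is a complete key name; `\RunOnce\` and `\RunTime\` need their own condition if they are to be detected.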

T1087.001: Before/After Defect Fix (methodology evidence)

The harness caught a rule defect during corpus expansion: T1087.001's CommandLine|contains: user condition fired on net user /add commands from T1136.001 (Create Account), producing two cross-technique false positives. The defect was logged at detection time (Phase 3 Day 2), deliberately retained in the measurement, and fixed in a later commit, providing a reproducible before/after delta.

| State | Precision | Recall | F1 | FPs | Defect / fix |
| --- | --- | --- | --- | --- | --- |
| Before fix (Day 2) | 0.800 | 1.000 | 0.889 | 2 | user keyword matched net user /add |
| After fix (Day 4) | 1.000 | 1.000 | 1.000 | 0 | filter_account_creation: CommandLine\|contains: ' /add ' |

The fix is a 4-line YAML change; the harness measures the improvement. This is the iterative-improvement loop the project was built to demonstrate. Full chain of evidence in docs/measurements/phase3-day4-t1087-fix.md.
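
The effect of the filter can be sketched in a few lines. This is a simplified stand-in for the rule, not the rule itself: the two predicates reduce it to the `user` keyword and the `' /add '` filter named above, and the sample command lines are hypothetical, not corpus events. The counts in the last step are the ones from the table.

```python
def fires_before(cmd: str) -> bool:
    # Original condition: CommandLine|contains: user
    return "user" in cmd.lower()

def fires_after(cmd: str) -> bool:
    # Fixed condition: selection and not filter_account_creation
    return fires_before(cmd) and " /add " not in cmd.lower()

# Hypothetical command lines labelled with the technique they represent.
samples = {
    "net user": "T1087.001",
    "net user Administrator": "T1087.001",
    "net user svc_backup Passw0rd! /add /y": "T1136.001",
}
for cmd, technique in samples.items():
    print(f"{technique}  before={fires_before(cmd)}  after={fires_after(cmd)}  {cmd}")

# Precision before and after, using the TP/FP counts reported above.
tp, fp_before, fp_after = 8, 2, 0
print(round(tp / (tp + fp_before), 3), round(tp / (tp + fp_after), 3))  # 0.8 1.0
```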


Methodology

Corpus design

All positive fixtures (TP events) come from real OTRF Security-Datasets captures, never from synthetic or hand-authored events. Dataset selection is logged with an accept/reject decision and reason in data/captures/_decisions.md, the evidentiary record for "how did you choose your test data?"

The 1,164-event process-creation baseline was assembled from 121 OTRF atomic + compound Windows ZIPs (Phase 4 Day 1 corpus build), deduplicated on (Image, CommandLine, ParentImage, Computer) using scripts/rebuild_baseline.py. A structural attack-event filter (ADR-005) then removed events whose hash(Image, CommandLine, ParentImage) matched any positive fixture. As new attack fixtures land, the filter is re-run; the baseline shrinks idempotently as cross-dataset attack-shape leaks are caught.
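
The two passes described above (deduplication, then structural attack-event exclusion) can be sketched as follows. This is an assumption-laden illustration, not the contents of scripts/rebuild_baseline.py: the function names, the SHA-256 signature, and the sample events are all hypothetical.

```python
import hashlib

DEDUP_KEYS = ("Image", "CommandLine", "ParentImage", "Computer")
ATTACK_KEYS = ("Image", "CommandLine", "ParentImage")

def signature(event: dict, keys: tuple) -> str:
    """Stable hash over the selected fields (missing fields hash as empty)."""
    joined = "\x1f".join(str(event.get(k, "")) for k in keys)
    return hashlib.sha256(joined.encode("utf-8")).hexdigest()

def build_baseline(raw_events: list, positive_fixtures: list) -> list:
    """Drop duplicates, then drop anything shaped like a positive fixture."""
    attack_sigs = {signature(e, ATTACK_KEYS) for e in positive_fixtures}
    seen = set()
    baseline = []
    for event in raw_events:
        key = signature(event, DEDUP_KEYS)
        if key in seen or signature(event, ATTACK_KEYS) in attack_sigs:
            continue
        seen.add(key)
        baseline.append(event)
    return baseline

def ev(image, cmd, parent, computer):
    return {"Image": image, "CommandLine": cmd, "ParentImage": parent, "Computer": computer}

dump_cmd = "rundll32 comsvcs.dll, MiniDump 1 x.dmp full"
raw = [
    ev("notepad.exe", "notepad a.txt", "explorer.exe", "WS01"),
    ev("notepad.exe", "notepad a.txt", "explorer.exe", "WS01"),  # exact duplicate
    ev("notepad.exe", "notepad a.txt", "explorer.exe", "WS02"),  # other host: kept
    ev("rundll32.exe", dump_cmd, "cmd.exe", "WS02"),             # attack-shaped leak
]
fixtures = [ev("rundll32.exe", dump_cmd, "cmd.exe", "WS09")]

baseline = build_baseline(raw, fixtures)
print(len(baseline))                                 # 2
print(build_baseline(baseline, fixtures) == baseline)  # True
```

The attack signature deliberately omits Computer, so an attack-shaped event is removed even when it was captured on a different host than the fixture. Re-running the filter on its own output changes nothing, which is the idempotence property the paragraph above relies on.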

The 484-event registry baseline (Sysmon EID 12/13) was built in Phase 4 Day 3 by scripts/extract_registry_baseline.py from four OTRF source ZIPs. The extraction excludes \Microsoft\Windows\CurrentVersion\Run\ and \Microsoft\Windows\CurrentVersion\RunOnce\ paths up front (the labeling axiom: the rule's detection target is exactly what is excluded; nothing more, nothing less). The generalized structural filter then re-applies post-fixture using (TargetObject, Details, Image) as the identity signature. See ADR-006 for the per-logsource routing decision and docs/measurements/phase4-day3-registry-harness.md for the build record.

Precision is only meaningful for rules whose baseline contains events of the same logsource the rule selects on. The harness enforces this via ADR-006: forge/__init__.py defines BASELINE_MAP (a category → file registry) and REGISTERED_LOGSOURCE_CATEGORIES (its keyset). At test time, rules with no explicit fixtures.negative are auto-routed by their logsource.category. At lint time, a rule with no negative fixture and no registered category is rejected before it can be measured (see tests/test_lint.py for the three guard cases). Multi-logsource expansion is now "add a baseline file + add a dispatch-table entry", with no other changes.
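
A dispatch table of this kind might look like the sketch below. Only the names BASELINE_MAP and REGISTERED_LOGSOURCE_CATEGORIES come from the text above; the function `resolve_negative_fixture`, the error message, and the registry baseline path (copied as written in the architecture diagram, so the real file name may differ) are assumptions.

```python
BASELINE_MAP = {
    "process_creation": "data/benign/workstation_baseline.json",
    "registry_event": "data/registry_baseline",  # as shown in the diagram; real name may differ
}
REGISTERED_LOGSOURCE_CATEGORIES = frozenset(BASELINE_MAP)

def resolve_negative_fixture(rule: dict) -> str:
    """Return the benign baseline a rule should be measured against.

    An explicit forge.fixtures.negative wins; otherwise route by
    logsource.category; otherwise refuse to measure the rule.
    """
    explicit = rule.get("forge", {}).get("fixtures", {}).get("negative")
    if explicit:
        return explicit
    category = rule.get("logsource", {}).get("category")
    if category not in REGISTERED_LOGSOURCE_CATEGORIES:
        raise ValueError(f"no negative fixture and no registered baseline for {category!r}")
    return BASELINE_MAP[category]

registry_rule = {"logsource": {"category": "registry_event"}, "forge": {"fixtures": {}}}
print(resolve_negative_fixture(registry_rule))  # data/registry_baseline

wmi_rule = {"logsource": {"category": "wmi_event"}}
try:
    resolve_negative_fixture(wmi_rule)
except ValueError as exc:
    print("rejected:", exc)
```

Under this shape, supporting a new logsource is one new baseline file plus one new dictionary entry, which matches the expansion cost described above.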

Note on legacy figures. The pre-Phase-4 README reported a 1,481-event baseline. That figure was a raw concatenation count that included 419 cross-dataset duplicate events; the correct unique-event count for the pre-Phase-4 corpus was approximately 1,051. The Phase 4 rebuild corrects this and adds the registry baseline. See Entry 4.3 in the project journey for the accounting.

Rule format

Rules are authored in Sigma format with a forge: extension block for fixture paths, expected metrics, and multi-SIEM flags. All rules are Pydantic v2 validated on lint.

# Example: rules/windows/credential_access/T1003_001_lsass_dump_comsvcs.yml
detection:
  selection:
    EventID: 1
    Image|endswith: '\rundll32.exe'
    CommandLine|contains|all:
      - 'comsvcs'
      - 'MiniDump'
  condition: selection

forge:
  fixtures:
    positive: data/attack/T1003.001/events.json
    negative: data/benign/workstation_baseline.json
  expected:
    precision: 0.95
    recall: 1.00
  multi_siem: true

The test harness (forge/test_harness.py) evaluates rules against fixture events using a hand-rolled field-matcher supporting four modifiers: |endswith, |contains, |contains|all, |startswith. pySigma is used only for forge convert (query-string generation), not for evaluation; see ADR-002 for the design decision and spike results.
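
A field-matcher covering those four modifiers fits in a couple of dozen lines. The sketch below is not the code in forge/test_harness.py: the function names are hypothetical, matching is assumed to be case-insensitive, a list value is assumed to mean OR unless |all is present (the usual Sigma convention), and the sample event is invented to exercise the T1003.001 selection shown above.

```python
def match_field(event: dict, key: str, expected) -> bool:
    """Evaluate one 'Field|modifier' entry of a selection against an event."""
    field, *mods = key.split("|")
    if field not in event:
        return False
    actual = str(event[field]).lower()
    raw = expected if isinstance(expected, list) else [expected]
    values = [str(v).lower() for v in raw]
    op = mods[0] if mods else "equals"
    tests = {
        "equals": lambda v: actual == v,
        "endswith": lambda v: actual.endswith(v),
        "startswith": lambda v: actual.startswith(v),
        "contains": lambda v: v in actual,
    }
    if op not in tests:
        raise ValueError(f"unsupported modifier: {op}")
    combine = all if mods[1:] == ["all"] else any
    return combine(tests[op](v) for v in values)

def match_selection(event: dict, selection: dict) -> bool:
    """All entries of a selection must match (logical AND)."""
    return all(match_field(event, k, v) for k, v in selection.items())

selection = {
    "EventID": 1,
    "Image|endswith": "\\rundll32.exe",
    "CommandLine|contains|all": ["comsvcs", "MiniDump"],
}
attack = {
    "EventID": 1,
    "Image": r"C:\Windows\System32\rundll32.exe",
    "CommandLine": r"rundll32.exe C:\Windows\System32\comsvcs.dll, MiniDump 624 C:\temp\out.dmp full",
}
benign = {"EventID": 1, "Image": r"C:\Windows\System32\svchost.exe", "CommandLine": "svchost.exe -k netsvcs"}

print(match_selection(attack, selection), match_selection(benign, selection))  # True False
```

Keeping evaluation this small is what makes it practical to audit every TP and FP by hand; filters and the condition expression would sit on top of `match_selection`.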

Multi-SIEM conversion

forge convert compiles each Sigma rule to query strings for all three backends via pySigma. The Sigma modifier set used across all current rules maps cleanly to each target language:

| Sigma modifier | SPL | EQL | KQL |
| --- | --- | --- | --- |
| `\|endswith` | `field="*value"` | `field:"*value"` | `field endswith "value"` |
| `\|contains` | `field="*value*"` | `field:"*value*"` | `field contains "value"` |
| `\|contains\|all` | Repeated field clauses (AND) | `(f:"*a*" and f:"*b*")` | `(f contains "a" and f contains "b")` |
| `\|startswith` | `field="value*"` | `field:"value*"` | `field startswith "value"` |
| not filter | `NOT (field IN (...))` | `not (field like~ (...))` | `not((field endswith "..."))` |

No translation losses detected. All 60 conversions succeeded. Example converted queries for T1003.001 are in docs/examples/T1003_001_converted_queries.md.


Known limitations & Phase 5 backlog

Documented deferrals (three techniques, three prerequisite types)

Three techniques are documented as deferrals rather than shipped with structurally guaranteed precision figures. Each has a named prerequisite for unblocking.

| Technique | Class | Prerequisite type | Reason |
| --- | --- | --- | --- |
| T1037.001 Logon Script (UserInitMprLogonScript) | Class 3 | Corpus only | 484-event registry baseline contains 0 events touching \Environment\ paths, so precision would be structurally guaranteed. Unblocking is corpus expansion, not harness work. |
| T1546.003 WMI Event Subscription | Class 1 | Harness + corpus | Empire's WMI persistence goes through the WMI namespace API (Sysmon EID 19/20/21), not registry writes. Unblocking requires a new wmi_event baseline and dispatch entry. |
| T1110.003 Password Spraying | Class 1 | Harness + corpus + aggregation | Rule selects on EID 4625 (logon failure), no baseline in current harness. Unblocking requires a logon-event baseline, dispatch entry, and aggregation logic to count failures per source within a window. |

The three classes of deferral are defined in journey Entry 4.5; the two that apply here are Class 1 (logsource-mismatch) and Class 3 (corpus-realism). T1037.001 is the most readily unblockable because the harness already handles registry_event. Phase 5 Day 1 sequencing: tackle T1037.001 as a corpus expansion, then take on the harness-side work for the other two.
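
For the T1110.003 prerequisite, "count failures per source within a window" could take roughly the following shape. Nothing like this exists in the project yet: the function name, the 5-minute window, the threshold of 3, and the `timestamp` key are all hypothetical, while EventID 4625 and the IpAddress field follow the Windows logon-failure event.

```python
from collections import defaultdict, deque
from datetime import datetime, timedelta

def flag_sources(events, window=timedelta(minutes=5), threshold=3):
    """Return source IPs with at least `threshold` logon failures inside any `window`."""
    failures = sorted(
        (e for e in events if e.get("EventID") == 4625),
        key=lambda e: e["timestamp"],
    )
    recent = defaultdict(deque)  # source IP -> failure timestamps still inside the window
    flagged = set()
    for e in failures:
        q = recent[e["IpAddress"]]
        q.append(e["timestamp"])
        # Evict failures that have aged out of the sliding window.
        while e["timestamp"] - q[0] > window:
            q.popleft()
        if len(q) >= threshold:
            flagged.add(e["IpAddress"])
    return flagged

t0 = datetime(2026, 1, 1, 9, 0, 0)

def failure(ip, minutes):
    return {"EventID": 4625, "IpAddress": ip, "timestamp": t0 + timedelta(minutes=minutes)}

events = [failure("10.0.0.5", m) for m in (0, 1, 2)]     # burst inside one window
events += [failure("10.0.0.9", m) for m in (0, 10, 20)]  # same count, spread out

print(sorted(flag_sources(events)))  # ['10.0.0.5']
```

A real spraying rule would likely also count distinct target accounts per source; the point here is only that the harness needs state across events, which the current per-event matcher does not have.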

Principle: a rule's precision number is only meaningful when the benign baseline contains events of the same logsource the rule selects on, in sufficient population to plausibly false-positive. Shipping a rule whose precision would be structurally guaranteed contradicts the standard applied to every other rule in the project.

Single-event fixture caveats

Sixteen of the 20 rules have a positive fixture of three events or fewer (TP of 3 or less in the precision/recall table above). Recall is measured as binary against those events: it is 1.000 in every case but does not mean all real-world variants are covered. See per-rule _meta.yml in data/attack/<technique>/ for the OTRF source dataset and the variant captured.

Dashboard / presentation layer

forge build (Stage 5) writes the four JSON contract files to dist/data/. A static measurement dashboard at dist/ (React via CDN, no build step) consumes those files at runtime and presents five views: Overview, Rules, Coverage, Trends, Gaps. The earned-vs-structural-absence classification ships next to every precision number: the project's honesty contract, surfaced in code. See View the dashboard locally above for the serve order.

ATT&CK coverage also ships via reports/layer.json, which loads directly into the official MITRE Navigator at https://mitre-attack.github.io/attack-navigator/ (Open Existing Layer → Upload from Local) for anyone who prefers the canonical viewer to the project dashboard.

The original UI deferral and the rebuild rationale are recorded in PDR §9 and journey Entry 4.9.


Acknowledgments

OTRF Security-Datasets: all positive fixture events and the benign baseline corpus come from OTRF's open-source attack capture library. Dataset selection, event counts, and accept/reject decisions are logged in data/captures/_decisions.md.

Open Threat Research Foundation (OTRF). Security-Datasets. https://securitydatasets.com. MIT License.

MITRE ATT&CK: technique and tactic mappings throughout this project follow the MITRE ATT&CK framework.

MITRE Corporation. ATT&CK®. https://attack.mitre.org. ATT&CK® content is licensed under CC BY 4.0.

SigmaHQ: rule format and pySigma conversion backends.

SigmaHQ. Sigma - Generic Signature Format for SIEM Systems. https://sigmahq.io. Apache License 2.0.


License

MIT. See LICENSE.

Copyright (c) 2026 Aadarsh Kadam
