Skip to content

Predictive skills#12

Open
ifountalis wants to merge 6 commits into
mainfrom
predictive_skills
Open

Predictive skills#12
ifountalis wants to merge 6 commits into
mainfrom
predictive_skills

Conversation

@ifountalis

Copy link
Copy Markdown

No description provided.

ifountalis and others added 2 commits March 30, 2026 15:51
@cafzal cafzal requested review from agarrard-rai and cafzal April 6, 2026 23:07
@cafzal

cafzal commented Apr 6, 2026

Copy link
Copy Markdown
Collaborator

Predictive Skills Review (Pass 2)

Re-reviewed after commits 8c33be2 (address PR feedback), cc21d18 (fix link prediction issue), and 283257d (merge main). Many items from the first pass are resolved. Updated findings below.

Resolved from Pass 1: Discovery routing (#1), case bug in modeling example (#2), graph-analysis disambiguation (#5), placeholder consistency (#7), pitfall consequences (#9), non-pitfall row removed (#10), register_and_load rebuild uncommented (#13), description shortened (#14). Training examples now document required variables in docstrings (partial #8).


Must Fix

1. Discovery predictive.md reference still treats GNN as "future." rai-discovery/references/predictive.md:42 says rai_predictive mode is "Future — when the RAI predictive reasoner is platform-integrated." But the predictive skills now exist and provide a full GNN pipeline. The routing table (SKILL.md:394) correctly points to rai-predictive-modeling, but the reference file contradicts it. An agent reading the reference will think GNN training isn't available yet. Lines 131, 138 repeat this. The predictive routing example (predictive_routing.md) only shows pre_computed mode — needs a GNN/rai_predictive walkthrough.

2. Case mismatch migrated to training example. The modeling fix (link_prediction_snowflake.py) now correctly uses Customer.C_customer_id and Customer.AGE. But train_link_prediction.py:38 still uses Customer.c_customer_id (lowercase) and Customer.age (lowercase, lines 39/45/46). These paired examples reference the same concepts from the same dataset — the casing must match.

3. No guidance on extending an existing ontology for GNN. (Unchanged from Pass 1.) The modeling skill treats concept definition as greenfield. Users coming from rai-build-starter-ontology will have an existing model with Concept("Customer") (no identify_by). The skill needs to address: create a new Model or extend? Redefine concepts? What happens to existing Relationships/Properties?

4. No guidance on creating train/val/test split tables. (Unchanged from Pass 1.) All three skills assume split tables exist in Snowflake but no skill explains how to create them or what schema they require.


Should Fix

5. No cross-skill pattern for graph metrics → GNN features. (Unchanged.) rai-discovery documents the Graph → Predictive chain pattern (SKILL.md:139) but the predictive modeling skill doesn't show how to consume graph-analysis outputs (centrality scores, community labels) as PropertyTransformer features.

6. Link prediction training example includes domain-specific filtering. train_link_prediction.py:45-46 adds Customer.age < 50, Customer.age > 20. This obscures the pure link-prediction pattern and uses the lowercase .age that doesn't match the modeling example's .AGE. Strip it.

7. Unused GNN import in modeling skill and examples. The Quick Reference (SKILL.md:42) and all three modeling examples import GNN but never use it — GNN belongs to the training phase.


Nice to Have

8. No ontology-type → PropertyTransformer annotation mapping. A small table ("Float → usually continuous, String enum → usually category") would help an agent translate existing ontology types to PT annotations.


Works Well

  • Internal consistency across the three skills is strong — shared terminology, parameter naming, code patterns all align
  • Boundary separation (modeling / training / management) is clean and logical
  • Reference files are well-scoped — hyperparameters, task-types, prediction-attributes are exactly the right granularity
  • Common Pitfalls tables now include consequence descriptions (RuntimeError, TypeError, etc.)
  • Examples cover all three main task types with consistent DB/SCHEMA placeholders
  • Relationship arity rules table in modeling is excellent — clear, complete, copy-paste ready
  • The "What to include vs. omit when loading" table in management prevents a common class of errors
  • Training examples now document required variables in docstrings
  • register_and_load.py Session 2 now shows the graph/PT rebuild inline
  • Discovery skill correctly routes predictive → rai-predictive-modeling
  • Link prediction guidance added: "do not choose for the user" between link_prediction and repeated_link_prediction (modeling SKILL.md:195)

@cafzal cafzal left a comment

Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

see comment

- Add predictive routing in rai-problem-discovery post-discovery section
- Fix case inconsistency (c_customer_id → C_customer_id) in link prediction example
- Add Graph constructor disambiguation in rai-graph-analysis and rai-predictive-modeling
- Standardize DB/SCHEMA placeholders across all examples and SKILL.md code blocks
- Add file pointers to paired modeling examples in training example headers
- Add failure consequences (error types) to pitfall tables across all three skills
- Remove non-pitfall row (pt=None) from training pitfalls table
- Uncomment required graph/PT rebuild in register_and_load.py Session 2
- Shorten rai-predictive-modeling frontmatter description

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
@ifountalis

Copy link
Copy Markdown
Author

@cafzal thanks for the thorough review. Here's what we addressed and what we deferred, point by point.

Addressed

Not Addressed

ifountalis and others added 3 commits April 7, 2026 09:14
Add guidance that agents must not choose between link_prediction and
repeated_link_prediction — present both options and ask the user.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
export_csv now always defaults to True internally so users don't need to set it.
CDC-related options (skip_cdc) are removed entirely.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
@cafzal

cafzal commented Apr 21, 2026

Copy link
Copy Markdown
Collaborator

Test Report — Hands-On GNN Pipeline Testing (Updated)

Ran the full predictive pipeline (all three task types) on the H&M dataset. Setup questions raised to the GNN team in #team-prod-neuralai-gnns.

What works end-to-end

  • PyRel gnn3 branch installs in isolated venv (requires make sync to vendor v0 layer)
  • GNN, PropertyTransformer import successfully
  • RAI native app connects, Predictive + Logic reasoners provision (~4s)
  • Data sync and table materialization for 31M+ row H&M dataset (~4 min)
  • GNN node/edge table creation succeeds
  • Dataset preparation (Step 1/3) completes (~8 min)
  • Trainer configuration (Step 2/3) completes (~2s)
  • Training job submission (Step 3/3) succeeds — training actually starts ("Preparation finished, starting training 1st epoch")

Issues found (in order of discovery)

1. SKILLS BUG: pt=pt is wrong, should be property_transformer=pt

All three skills and the retail_planning template use pt=pt in the GNN constructor. But the GNN class has property_transformer as the named parameter (not pt). Using pt=pt causes it to leak into **train_paramsTrainerConfig(**params)TypeError: TrainerConfig.__init__() got an unexpected keyword argument 'pt'.

Fix: Change all occurrences of pt=pt to property_transformer=pt in skills and examples. This is the single biggest blocker — without it, nothing trains.

2. has_time_column=True triggers validation error

When enabled, task_info.py hardcodes time_column = "timecolumn" (literal string). The relationalai_gnns dataset sanity check requires at least one data table to have a matching time column. But Transaction (which carries the time_col via PropertyTransformer) is an edge table, and preparation.py only sets time_col on TableType.NODE. Non-temporal mode (has_time_column=False) works fine.

Status: Raised with GNN team. Workaround: use has_time_column=False and non-temporal Relationships.

3. Experiment schema permissions needed for RAI native app

The GNN experiment tracking database/schema must be accessible to the RAI native app. Requires:

GRANT USAGE ON DATABASE <db> TO APPLICATION RELATIONALAI;
GRANT ALL ON SCHEMA <db>.<schema> TO APPLICATION RELATIONALAI;

Not documented in the skills.

4. SSL certificate error during log streaming (environment, not code)

After training starts successfully, log streaming hits an SSL error (unable to load trusted certificates). This is a pyOpenSSL/certifi version conflict in the venv, not a code issue. Training may still be running on the server despite the client-side error.

5. make sync prerequisite not documented

The gnn3 branch requires make sync (which clones and vendors the v0 compatibility layer from relationalai-python@v0.15.12) before the predictive module will import. Without it, from lqp import ir fails.

6. rai_app_name config for non-standard app names

If the RAI native app isn't called RELATIONALAI, the config needs rai_app_name: <app_name>. Not mentioned in skills.

Implications for the skills

Must fix in skills:

  • Change pt=ptproperty_transformer=pt in all SKILL.md code blocks and all example files
  • This is a bug that makes every GNN code sample non-functional

Should add to skills:

  • Document the GRANT ... TO APPLICATION RELATIONALAI step for experiment schema
  • Note that has_time_column=True requires the time_col concept to be a node (not edge) — or clarify the workaround
  • Document rai_app_name config option

Test environment

  • PyRel: gnn3 branch @ 9ed0bdafd, with make sync (v0.15.12)
  • relationalai-gnns: 0.1.5
  • Snowflake: NDSOEBE.RAI_PROD_PM (jqb21724) with H&M data replicated from NDSOEBE.RAI_GNNS_TEST
  • Dataset: HM_PYREL (1.37M customers, 105K articles, 31.5M transactions, 3 task schemas)
  • Test script: non-temporal mode (has_time_column=False, standard link_prediction not repeated_link_prediction)

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants