Medallion architecture

RebelCore™ uses a Medallion architecture: a tiered approach to refining data, popularised by data lakehouse patterns. Every piece of data in RebelCore™ moves through the same four stages, each one cleaner and more useful than the last. Between each pair of consecutive stages sits a governance gate: access controls and auditing that decide who can move data forward and who can read each tier.

       governance          governance          governance
            │                   │                   │
Bronze ─────┴───── Silver ──────┴────── Gold ───────┴───── Inference
(raw)           (structured)        (vectorized)          (prompted)

Knowing where you are in this flow — and what governance applies at each gate — makes the rest of the documentation much easier to follow. Every page in this site maps to one of these stages.

Stage 1 — Bronze (raw): Import files

You upload your source files (CSV or Excel) into an import batch. Nothing is transformed yet; the files just sit in RebelCore™ in their original shape, attached to a project.

Goal of this stage: get the data in.

Import files

Stage 2 — Silver: Create your dataset

You take a finished import batch and run it through the Module Dataset Builder. RebelCore™ parses the files, normalises columns, attaches your hierarchy labels, and produces a structured dataset module that’s now part of the project.

Goal of this stage: make the data structured and consistent.
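
To make "parses, normalises, attaches labels" concrete, here is a minimal Python sketch of the same idea. It is an illustration only: the function names `normalise_column` and `build_dataset`, the snake_case rule, and the sample data are all assumptions, not the Module Dataset Builder's actual behaviour or API.

```python
import csv
import io
import re


def normalise_column(name: str) -> str:
    """Lower-case a header and collapse runs of non-alphanumerics to '_'."""
    return re.sub(r"[^0-9a-z]+", "_", name.strip().lower()).strip("_")


def build_dataset(raw_csv: str, hierarchy: dict[str, str]) -> list[dict[str, str]]:
    """Parse raw CSV text into rows with normalised keys plus hierarchy labels."""
    reader = csv.DictReader(io.StringIO(raw_csv))
    rows = []
    for record in reader:
        # Normalise every header and trim stray whitespace from the values.
        row = {normalise_column(k): v.strip() for k, v in record.items()}
        row.update(hierarchy)  # the same labels are attached to every row
        rows.append(row)
    return rows


# Hypothetical bronze-tier content: untouched, with messy headers and padding.
raw = "Customer ID, Order Total (EUR)\n 42 , 19.90 \n"
dataset = build_dataset(raw, {"region": "emea"})
print(dataset)
# [{'customer_id': '42', 'order_total_eur': '19.90', 'region': 'emea'}]
```

The raw text is never modified; the structured rows are a new artefact built from it, which mirrors the split between the bronze and silver tiers.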

Create your dataset

Stage 3 — Gold (vectorized): The Tree

The silver dataset shows up as nodes in the project’s Tree. Here you do the gold-tier curation: select which columns to keep, weight them, drop the noise, and apply the AI Data Advisor’s suggestions. The result is a labelled vector set — the gold version of your data, ready for inference.

Goal of this stage: decide what matters and produce vectors.
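
The curation steps above (keep some columns, weight them, drop the rest) can be sketched in a few lines of Python. This is a simplified stand-in under stated assumptions: the `curate` function, the numeric-only columns, and the multiply-by-weight rule are hypothetical, and this page does not describe how RebelCore™ actually computes its vectors.

```python
def curate(rows, weights, label_key):
    """Keep only weighted columns, scale each by its weight, and label the vector."""
    columns = sorted(weights)  # fixed column order so every vector lines up
    vectors = []
    for row in rows:
        # Columns without a weight (the "noise") are simply never read.
        vector = [float(row[col]) * weights[col] for col in columns]
        vectors.append((row[label_key], vector))
    return vectors


# Hypothetical silver-tier rows.
silver = [
    {"sku": "A1", "price": "10.0", "stock": "4", "notes": "n/a"},
    {"sku": "B2", "price": "2.5", "stock": "40", "notes": "promo"},
]
gold = curate(silver, {"price": 1.0, "stock": 0.5}, label_key="sku")
print(gold)
# [('A1', [10.0, 2.0]), ('B2', [2.5, 20.0])]
```

Here the `notes` column is dropped because it has no weight, and `stock` counts for half as much as `price`.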

How the Tree works · Chat & AI suggestions

Stage 4 — Inference: RebelCore™ Agent

From any node in the Tree, you hand the gold vector set to the RebelCore™ Agent. You describe what you want in natural language, and the Agent runs the inference workflow over your curated data.

Goal of this stage: act on the data.

RebelCore™ Agent

Governance is the connective tissue

The arrows between stages aren’t free passes — every transition is a permission boundary. Roles assigned by your administrator decide who can do what at each tier:

| Tier      | What governance gates                                       |
|-----------|-------------------------------------------------------------|
| Bronze    | Who can create import batches and upload raw files          |
| Silver    | Who can build / view dataset modules in a project           |
| Gold      | Who can apply curation in the Tree (suggestions, weights)   |
| Inference | Who can prompt the Agent and see its responses              |

This is what lets a team of analysts safely use the Agent over sensitive data without seeing the raw files behind it — see Governance & access for the full pattern.
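
As a rough mental model, per-tier access can be pictured as a lookup from role to the set of tiers that role may touch. The Python sketch below is illustrative only: the role names and the `can_access` helper are hypothetical, and real roles are whatever your administrator assigns.

```python
from enum import Enum


class Tier(Enum):
    BRONZE = "bronze"
    SILVER = "silver"
    GOLD = "gold"
    INFERENCE = "inference"


# Hypothetical role names, used only for illustration.
ROLE_GRANTS = {
    "data_engineer": {Tier.BRONZE, Tier.SILVER},
    "curator": {Tier.SILVER, Tier.GOLD},
    "analyst": {Tier.INFERENCE},
}


def can_access(role: str, tier: Tier) -> bool:
    """Each tier is granted independently; unknown roles get nothing."""
    return tier in ROLE_GRANTS.get(role, set())


print(can_access("analyst", Tier.INFERENCE))  # True
print(can_access("analyst", Tier.BRONZE))     # False
```

In this model the analyst can prompt over curated data while every request for the raw tier is refused, which is the pattern described above.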

Why this matters

Each stage has a clear responsibility. If something looks wrong, you fix it at the stage where it went wrong:

  • Tree feels empty or noisy? The cause is probably in stage 2 (the dataset build) or stage 3 (curation in the Tree).
  • Agent giving weird answers? Check the gold vectors: which suggestions did you apply, and which did you leave unapplied, in the Tree?
  • A column that should exist isn’t there? Go back to stage 1 and re-import.

This is also why you can’t skip stages. The Agent sees only the gold layer; the gold layer forms only once silver is built; and silver requires raw to be in place. Access to each tier is granted independently.
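
The "no skipping" rule amounts to a simple ordering check, sketched here in Python. The `STAGES` list and `can_run` helper are hypothetical names chosen for illustration, not part of RebelCore™.

```python
STAGES = ["bronze", "silver", "gold", "inference"]


def can_run(stage: str, completed: set[str]) -> bool:
    """A stage may run only when every earlier stage has completed."""
    earlier = STAGES[: STAGES.index(stage)]
    return all(s in completed for s in earlier)


print(can_run("silver", {"bronze"}))  # True: raw data is in place
print(can_run("gold", {"bronze"}))    # False: silver has not been built yet
```

The same check explains the troubleshooting advice: a problem seen at a later stage is traced back through the earlier stages it depends on.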