Create your dataset

This is stage 2 of the RebelCore™ data flow:

Import (raw) → Create dataset (silver) → Tree (gold, vectorized) → Agent

After importing files, you’ve got raw uploads sitting in a batch. They’re not yet attached to your project and not yet structured. Creating a dataset is where you give them shape — pick a hierarchy, review the AI Data Advisor’s suggestions, and build them into a silver-tier dataset module that lives inside the project.

Silver means “structured and consistent” — the data has been parsed, normalised, and labelled, but you haven’t yet decided which columns matter most. That’s the next stage, in the Tree.

When you’re ready

You can create a dataset when an import batch has status created and its Dataset Availability column is empty. Batches that are still processing, or that are already linked to a project, won’t show the CREATE DATASET action.
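
If it helps to think of the rule above as a check, here is a minimal sketch in Python. It is illustrative only: `ImportBatch`, `can_create_dataset`, the field names, the batch names, and the "processing" status string are all hypothetical and are not part of any RebelCore™ API. Only the status created and the empty Dataset Availability column come from this page.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class ImportBatch:
    # Hypothetical model of one row on the Datasets page (not a real API).
    name: str
    status: str                          # "created" is from the docs; other values are assumed
    dataset_availability: Optional[str]  # None while the column is empty


def can_create_dataset(batch: ImportBatch) -> bool:
    """Return True when the batch would show the CREATE DATASET action."""
    return batch.status == "created" and not batch.dataset_availability


# Example rows: one still processing, one ready, one already linked to a project.
batches = [
    ImportBatch("january-upload", "processing", None),
    ImportBatch("february-upload", "created", None),
    ImportBatch("march-upload", "created", "Sales/2024"),
]

ready = [b.name for b in batches if can_create_dataset(b)]
print(ready)
```

Only the second batch passes the check: the first is still processing, and the third already has a path/label in its Dataset Availability column.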

Steps

1. Open the Datasets page

From any project, go to Imports in the side menu. The page header reads Datasets, even though the menu item is Imports. Every import batch is listed here with its current status.

2. Click “CREATE DATASET” on the batch

In the Actions column, batches that are ready show a CREATE DATASET button. Click it.

This opens the Module Dataset Builder for that batch.

3. Configure the hierarchy

The builder shows a Hierarchy panel on the left:

  • Dataset — fixed; you can’t change this directly.
  • Label 1 (required) — pick a label that describes the top of your dataset hierarchy.

Pick values that reflect how you want to slice the data later in the Tree.

4. Review the AI Data Advisor’s suggestions

As soon as you select a file, RebelCore™ runs an AI Data Advisor in the background. It analyses the columns and rows and surfaces suggestions about which columns are most likely to matter.

You’ll see:

  • An advisor bar at the bottom of the file preview while it’s working (“Loading suggestions…”).
  • A suggestions panel once it’s done, listing the columns the advisor recommends keeping or flagging.

You don’t have to act on these now — the same suggestions surface again later from inside the Tree, where you’ll do the gold-tier curation. But it’s often easier to glance at them at this stage to confirm the build looks sensible.

5. Click “BUILD MODULE”

The BUILD MODULE button is at the top right of the builder. It switches to BUILDING… while RebelCore™ turns the batch into a dataset module.

When it finishes, the batch’s Dataset Availability column updates to show the path/label you assigned, and the dataset is now part of the project.

6. Switch to the Tree

Open Tree in the side menu. The new dataset appears as a top-level node — see How the Tree works for what to do next.

Common questions

What if I picked the wrong labels?

The simplest path is to start a fresh import batch and rebuild from there with the correct labels. If you need help editing an existing build, contact support.

Can one import batch produce multiple datasets?

Each import batch produces one dataset module. To work with multiple datasets in the same project, run separate import batches.

How long does Build Module take?

It depends on the number of files and rows. Most builds finish in under a minute; very large imports may take several minutes.

The build failed — what now?

The error message in the builder will usually point at the cause, such as a label collision or a missing required field. Fix the input and try again. If the error isn’t clear, click the import batch on the Datasets page to see its full status, and contact support if you’re stuck.