Corpus Guide

A Corpus is an enriched knowledge agent built inside a Brain that merges your technical and business knowledge to generate data models and business metrics. It acts as an intelligent agent for a specific business domain, enabling AI-powered analytics and natural language queries.

What is a Corpus?

A Corpus is an enriched knowledge agent that:

  • Lives inside a Brain: Created within a workspace for specific business domains
  • Merges Knowledge: Combines technical metadata with business context
  • Generates Outputs: Produces data models and business metrics automatically
  • Acts as an Agent: Provides intelligent analytics for your business unit

Knowledge Integration

Combines two types of knowledge:

  • Business Knowledge: Documents defining business rules, terminology, and context
  • Technical Knowledge: Database connections and table selections defining data structure

AI-Powered Generation

Automatically creates:

  • Data models with entity relationships
  • Business metrics aligned with your KPIs
  • Query optimizations for performance

Information Hierarchy: A Corpus exists within a Brain (workspace), which belongs to a Tenant (organization). To learn more about this structure, see the Information Hierarchy guide.

Demo Corpus - Try It First!

Explore the Pre-configured Demo

When you first access Codd AI, you'll see an onboarding page with a "Try Demo" option. You can also reach it by clicking your profile icon in the top right corner and selecting Show Onboarding from the dropdown menu. The demo corpus is preconfigured with:

  • Retail sales data with complete business rules
  • Pre-generated data model with relationships
  • 25+ ready-to-use metrics
  • Sample questions to get you started

Try These Questions in the Demo:

  • "Show me gross margin by product"
  • "What's the average sales by product category?"
  • "Display monthly revenue trends"

Creating Your Own Corpus

Prerequisites: Create Knowledge First

Before creating a corpus, you need to set up your knowledge base:

  1. Business Knowledge (Recommended First): Upload documents with business rules, glossaries, and domain information
  2. Technical Knowledge: Connect to your database and select relevant tables

Step-by-Step Corpus Creation

1. Switch to Your Brain

Navigate to your brain (workspace) from the brain selector. By default, you have your personal brain and the demo brain.

2. Navigate to Corpus Menu

Go to the Corpus section. If no corpus exists, you'll see a message indicating you need to create knowledge first.

3. Add Business Knowledge

Click "Add Business Knowledge" and upload documents containing:

  • Business terminology and synonyms
  • KPI definitions and calculations
  • Business rules and policies
  • Domain-specific documentation

4. Add Technical Knowledge

Create technical knowledge by:

  • Selecting or creating a database integration
  • Choosing relevant tables (facts and dimensions)
  • Reviewing selected schemas
  • Saving the technical knowledge configuration

5. Create the Corpus

Now create your corpus:

  • Give it a short, descriptive name (this becomes the chat handle)
  • Choose an icon for visual identification
  • Select the knowledge cells to include
  • Review the knowledge summary
  • Save to initiate AI processing

Data Model Generation

After creating your corpus, Codd AI automatically generates a comprehensive data model by analyzing your knowledge:

AI-Driven Model Creation

The AI identifies and creates:

  • Primary Keys: Unique identifiers for each table
  • Foreign Keys: Relationships between tables
  • Cardinality: One-to-one, one-to-many, many-to-many relationships
  • Confidence Scores: AI's certainty about each relationship

Review and Validation

Once generation completes, you'll receive an email notification. Then review the model:

  1. Check each table's identified keys and relationships
  2. Verify cardinality matches your business logic
  3. Review confidence scores for accuracy
  4. Lock the model once validated to prevent changes

Note: Locking the data model is required before metric generation begins.
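
One way to check step 2 before locking is to compare the cardinality the AI proposed with what the key values in your data actually show. The following is a minimal sketch, not part of Codd AI: the function name `observed_cardinality` and the sample key pairs are hypothetical, and it assumes you have already extracted (left key, right key) pairs from the two related tables.

```python
from collections import defaultdict
from typing import Hashable, Iterable, Tuple


def observed_cardinality(pairs: Iterable[Tuple[Hashable, Hashable]]) -> str:
    """Classify the relationship implied by (left_key, right_key) pairs."""
    rights_per_left = defaultdict(set)
    lefts_per_right = defaultdict(set)
    for left, right in pairs:
        rights_per_left[left].add(right)
        lefts_per_right[right].add(left)

    # A side "fans out" when one of its keys maps to several keys on the other side.
    left_fans_out = any(len(v) > 1 for v in rights_per_left.values())
    right_fans_out = any(len(v) > 1 for v in lefts_per_right.values())

    if left_fans_out and right_fans_out:
        return "many-to-many"
    if left_fans_out:
        return "one-to-many"
    if right_fans_out:
        return "many-to-one"
    return "one-to-one"


# Hypothetical sample: (customer_id, order_id) pairs.
customer_orders = [(1, 10), (1, 11), (2, 12)]
print(observed_cardinality(customer_orders))
```

If the observed result disagrees with the cardinality shown in the data model, that relationship is worth a closer look before you lock. Note that a small sample can understate the true cardinality, so run the check on representative data.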

Metrics Generation

Automatic Metric Creation

After locking your data model, Codd AI generates business-relevant metrics:

  • AI analyzes your business knowledge to understand KPIs
  • Generates the top 25 metrics covering key business areas
  • Each metric includes dimensions, measures, and SQL
  • Metrics are pre-validated for accuracy

Metric Components

  • Name and description
  • Dimensions (grouping fields)
  • Measures (calculations)
  • Database entities used
  • Generated SQL query
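
To make the components above concrete, here is a minimal sketch of how a single metric could be represented. It is illustrative only: the `Metric` class, its field names, the table and column names, and the SQL are all assumptions, not Codd AI's internal or exported format.

```python
from dataclasses import asdict, dataclass
from typing import List


@dataclass
class Metric:
    """Hypothetical container mirroring the components listed above."""
    name: str
    description: str
    dimensions: List[str]   # grouping fields
    measures: List[str]     # calculations
    entities: List[str]     # database tables used
    sql: str                # generated query


def missing_components(metric: Metric) -> List[str]:
    """Return the names of any components that are empty."""
    return [name for name, value in asdict(metric).items() if not value]


# Hypothetical example based on the demo question "Show me gross margin by product".
gross_margin_by_product = Metric(
    name="Gross Margin by Product",
    description="Revenue minus cost, grouped by product.",
    dimensions=["product_name"],
    measures=["SUM(revenue) - SUM(cost)"],
    entities=["sales", "products"],
    sql=(
        "SELECT p.product_name, SUM(s.revenue) - SUM(s.cost) AS gross_margin "
        "FROM sales AS s JOIN products AS p ON p.product_id = s.product_id "
        "GROUP BY p.product_name"
    ),
)

print(missing_components(gross_margin_by_product))
```

A helper like `missing_components` can serve as a quick completeness check when you review metric definitions by hand.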

Metric Validation

  • Review each metric definition
  • Test execution in Canvas
  • Verify calculations match expectations
  • Modify SQL if needed
  • Save custom variations
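
Outside Canvas, one way to verify that a calculation matches expectations is to run the metric's SQL against a tiny dataset whose answer you already know. The sketch below uses Python's built-in `sqlite3` module with an in-memory database. The schema, the sample rows, and the query are hypothetical stand-ins; substitute the SQL from your own metric and a fixture that matches your tables.

```python
import sqlite3
from typing import List, Tuple

# Hypothetical metric SQL; replace with the query generated for your metric.
METRIC_SQL = """
SELECT p.product_name, SUM(s.revenue) - SUM(s.cost) AS gross_margin
FROM sales AS s
JOIN products AS p ON p.product_id = s.product_id
GROUP BY p.product_name
ORDER BY p.product_name
"""

# Hand-computed answers for the fixture below.
EXPECTED: List[Tuple[str, float]] = [("Desk", 120.0), ("Lamp", 30.0)]


def build_fixture() -> sqlite3.Connection:
    """Create an in-memory database with a few rows of known data."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        CREATE TABLE products (product_id INTEGER PRIMARY KEY, product_name TEXT);
        CREATE TABLE sales (
            sale_id INTEGER PRIMARY KEY,
            product_id INTEGER,
            revenue REAL,
            cost REAL
        );
        INSERT INTO products VALUES (1, 'Desk'), (2, 'Lamp');
        INSERT INTO sales VALUES
            (1, 1, 200.0, 120.0),
            (2, 1, 100.0, 60.0),
            (3, 2, 50.0, 20.0);
        """
    )
    return conn


def validate_metric(sql: str, conn: sqlite3.Connection,
                    expected: List[Tuple[str, float]], tol: float = 1e-9) -> bool:
    """Return True when the query result matches the expected rows."""
    actual = conn.execute(sql).fetchall()
    if len(actual) != len(expected):
        return False
    return all(
        a_key == e_key and abs(a_val - e_val) <= tol
        for (a_key, a_val), (e_key, e_val) in zip(actual, expected)
    )


print(validate_metric(METRIC_SQL, build_fixture(), EXPECTED))
```

SQLite's dialect may differ from your production database, so treat this as a check on the calculation logic rather than on database-specific syntax.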

Managing Your Corpus

Corpus Dashboard

From the corpus dashboard, you can:

  • View all knowledge cells and their summaries
  • Access the data model viewer
  • Browse and execute metrics
  • Monitor corpus health and usage
  • Update knowledge and regenerate models
  • Regenerate metrics or generate additional metrics
  • Start conversations with your corpus in Canvas
  • Lock the corpus to prevent changes to its knowledge items
  • Export the Data Model to a JSON file
  • Export the Corpus to a JSON file
  • View all corpuses in your current brain by clicking Switch Corpus, which opens the corpus list page
  • Import a Corpus from a JSON file into your current brain using the Import Corpus button on the corpus list page
  • Share your corpus with other users in your tenant. For details, see Corpus Sharing.

Update Knowledge

Add new documents or tables to expand your corpus capabilities

Refresh Data Model

Regenerate relationships when schema changes occur

Add Custom Metrics

Create additional metrics beyond the auto-generated ones

Export Corpus

Export the Corpus to a JSON file

Import Corpus

Import the Corpus from a JSON file to your current brain

Share Corpus

Share the Corpus with other users in your tenant

Best Practices

Knowledge Preparation

  • Create business knowledge before technical knowledge, as recommended in the prerequisites
  • Include comprehensive business glossaries and KPI definitions
  • Ensure documents clearly define metrics and calculations
  • Use consistent terminology across all documents

Data Model Review

  • Thoroughly review all identified relationships
  • Pay attention to confidence scores below 80%
  • Verify cardinality matches your data reality
  • Lock the model only after complete validation
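
If you keep your own notes on the relationships during review, the 80% guideline above can be applied mechanically. This is a minimal sketch under stated assumptions: the `Relationship` class, its fields, and the sample values are hypothetical, and confidence is assumed to be recorded as a fraction between 0 and 1.

```python
from dataclasses import dataclass
from typing import List

REVIEW_THRESHOLD = 0.80  # the 80% guideline from the list above


@dataclass(frozen=True)
class Relationship:
    """Hypothetical record of one AI-identified relationship."""
    from_table: str
    to_table: str
    cardinality: str
    confidence: float  # assumed scale: 0.0 to 1.0


def needs_review(relationships: List[Relationship],
                 threshold: float = REVIEW_THRESHOLD) -> List[Relationship]:
    """Return relationships below the threshold, least confident first."""
    flagged = [r for r in relationships if r.confidence < threshold]
    return sorted(flagged, key=lambda r: r.confidence)


# Hypothetical sample values for illustration only.
model = [
    Relationship("sales", "products", "many-to-one", 0.97),
    Relationship("sales", "stores", "many-to-one", 0.74),
    Relationship("products", "suppliers", "many-to-many", 0.58),
]

for rel in needs_review(model):
    print(f"{rel.from_table} -> {rel.to_table}: {rel.confidence:.0%}")
```

Sorting by ascending confidence puts the relationships most likely to be wrong at the top of the review list.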

Metric Validation

  • Test each generated metric with known data
  • Verify calculations against existing reports
  • Document any custom modifications
  • Create naming conventions for custom metrics

Ongoing Management

  • Regularly update business knowledge as rules change
  • Monitor metric usage to identify gaps
  • Gather user feedback on missing capabilities
  • Schedule periodic reviews of the data model