Creating LLM Columns - Step-by-Step Guide

LLM columns bring AI capabilities directly into your data workflows, enabling powerful analysis and transformation of unstructured data. This guide walks you through creating and configuring LLM columns to extract insights from your text data.


Create your first LLM column tutorial

Getting Started

Before diving into LLM columns, you'll need an existing dataset to work with. If you haven't created one yet, check out our dataset documentation and return here once your data is ready.

We'll use the Agent Conversations dataset to demonstrate the process.

Agent Conversations dataset view

Setting Your Primary Key

Every LLM column requires a primary key to function properly. This unique identifier tells the system how to map AI-generated outputs back to the correct rows in your table.

Open the Command Palette using Command + K (Mac) or Control + K (Windows), then select Set Primary Key.

Command Palette with Set Primary Key option

You'll see a dialog box where you can enter an EraQL expression that represents a unique row identifier. Use a single column if it contains unique values, or combine multiple columns to create uniqueness. Once you've entered your expression, click Set Primary Key to confirm.
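As a sketch, a primary key expression might look like one of the following. The column names and the CONCAT function here are hypothetical illustrations, not guaranteed EraQL syntax; consult the EraQL reference for the exact functions available in your workspace:

```
-- A single column that already contains unique values
-- (hypothetical column name):
"CONVERSATION_ID"

-- Combining two columns to create a unique identifier
-- (CONCAT is illustrative; use whatever string functions
-- EraQL provides):
CONCAT("CUSTOMER_ID", "-", "CONVERSATION_DATE")
```

The goal either way is the same: every row in your table must produce a distinct value, so outputs can be mapped back unambiguously.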

Primary Key dialog with EraQL expression

Creating Your LLM Column

There are two ways to start creating your LLM column: click the plus button in the upper right corner of your table (a tooltip appears when you hover), or open the Command Palette with Command + K and select New Column.

Command Palette with New Column option

Both methods take you to the column type selection page where you'll choose LLM Column to begin configuration.

LLM Column selection

Configuring Your LLM Column

The LLM column creation page contains everything you need to set up your AI analysis. You'll see sections for Model selection, Tools, System Message, Prompt, and a testing area.

LLM column configuration interface

Choosing Your Model and Tools

Select the AI model that best fits your use case - our model guide provides detailed information about each option's strengths. If you need additional capabilities, check our tool guide for available extensions.

Writing Your System Message

The system message gives the AI context about what you want it to accomplish. Think of this as your opportunity to explain the task, provide examples, and set clear expectations for the output format.

Configuring Your Prompt

In the prompt box, enter an EraQL expression that represents the data you want the AI to analyze. This might be customer reviews, email text, addresses, or any other text-based information from your dataset.

Remember that prompts use templating, so you'll need double curly braces around your expressions - whether you're mixing text with expressions or using expressions alone.
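For instance, either of the following shapes works. The `{{"CONVERSATION"}}` form matches the example later in this guide; the REVIEW_TEXT column name below is a hypothetical stand-in for whatever column holds your text:

```
Analyze the following customer review and summarize its sentiment:

{{"REVIEW_TEXT"}}
```

Or, with an expression alone:

```
{{"REVIEW_TEXT"}}
```

Text outside the double curly braces is sent verbatim; anything inside them is evaluated as an EraQL expression per row.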

Example Configuration

Here's a practical example that extracts dog breeds from customer service conversations:

System Message:

I'm going to send you a customer service interaction between an agent and a customer for a dog toy and treat brand called The Chew Crew.
Please evaluate the interaction and determine if the customer mentions their dog breed.

Only return the dog breed, do not return any additional attributes about the dog.

For example, if the customer says:
"My labrador really enjoys the mighty much bars"

Return:
"Labrador"

If the customer says:
"My small dog loves these treats!"

Return:
"N/A"

Return N/A for all responses that do not mention a dog breed.

Prompt:

{{"CONVERSATION"}}

Example LLM configuration with prompts

Testing Your Configuration

Before running your LLM column on your entire dataset, take advantage of the testing capabilities to ensure you're getting the results you want.

Quick Testing with Chat

The chat window lets you test specific inputs and see outputs immediately. This is perfect for experimenting with different approaches and refining your configuration.

Preview with Real Data

Switch to the Preview tab and click the Play button to see how your configuration works with actual data from your dataset. This loads a sample of your real data so you can evaluate the quality of your system message and prompt working together.

Preview tab with sample results

Evaluating Your Results

When reviewing your preview results, look for several key indicators of quality:

Check that the AI is following your instructions precisely. In our dog breed example, you should see clean breed names like "Labrador" and "Golden Retriever" rather than additional descriptive text.

Pay attention to edge cases - when customers mention "small dog" or "puppy," you should see "N/A" responses as specified in your instructions.

Look for consistency in similar inputs. References like "my lab" should consistently return the same format, whether that's "Lab" or "Labrador" based on your specifications.

Preview results for quality evaluation

If you spot issues during preview, this is the perfect time to adjust your system message or prompt. You can iterate and re-preview until every result meets your standards.


Finalizing Your Settings

Before saving your LLM column, configure a few additional settings to ensure it runs exactly how you want.

Use the curly braces icon in the upper corner of the left sidebar to set your output type, which tells Cotera what kind of response format to expect.

Output type configuration

The timer icon lets you configure your execution schedule. LLM columns are set to "never run" by default, so you'll need to choose when you want them to execute.

Schedule configuration

The third icon provides access to advanced controls like temperature (which affects response creativity), max output tokens (limiting response length), and durability settings. Most users can stick with the defaults.


Saving and Running Your Column

Once you're satisfied with your configuration, click Save and provide a descriptive name that clearly identifies what your column does.

Save dialog with column name

You can either save and let your column run according to your selected schedule, or choose Run Now to execute immediately and then continue with regular scheduling.

Run options interface

Publishing Your Changes

If Auto Publish is enabled, your LLM column becomes available to your organization immediately. If not, you'll need to manually publish your changes using Publish Graph in the Command Palette or the Publish button in the upper right corner.


Your LLM Column in Action

Your LLM column is now ready to analyze your data and provide AI-powered insights. You can monitor the results as they process and make adjustments to your configuration if needed.

LLM columns open up powerful possibilities for extracting structured insights from unstructured text data, making it easier to analyze customer feedback, categorize content, and identify patterns in your datasets.