Formula Columns
Formula columns let you perform calculations and data transformations on your dataset. They work like formulas in a spreadsheet but are designed for large-scale data processing. They use EraQL, Cotera's functional query language, to compute new values from your existing data without requiring AI.
When to Use Formula Columns
Formula columns are ideal when you need deterministic, rule-based transformations:
- Calculations: Compute totals, percentages, differences, or aggregations
- Data formatting: Standardize text, extract substrings, or combine fields
- Conditional logic: Apply if/then rules based on your data
- Date operations: Calculate durations, extract date parts, or format timestamps
- Type conversions: Convert between strings, numbers, and other data types
For tasks requiring judgment, interpretation, or natural language understanding, use LLM columns instead.
Quick Examples
Here are some common formula patterns:
Combine first and last name:
concat("first_name", ' ', "last_name")
Calculate a percentage:
div("completed_tasks", "total_tasks") |> mul(100)
Extract the month from a date:
date_part("created_at", 'month')
Conditional value:
if("amount" > 1000, {then: 'high_value', else: 'standard'})
Clean and standardize text:
lower("email") |> replace(' ', '')
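To make the intent of these patterns concrete, here is a rough Python analogue that computes the same five values for a single row. It is an illustrative sketch, not EraQL and not how Cotera evaluates formulas. The sample row, the output keys, and the helper name quick_examples are hypothetical. It also assumes that double-quoted names refer to columns, that single-quoted values are string literals, that x |> f(y) passes x as the first argument of f, and that div returns a fractional result. Check the EraQL Language Reference for the actual behavior, including how nulls and division by zero are handled.

```python
from datetime import date

# Hypothetical sample row; the column names mirror the examples above.
row = {
    "first_name": "Ada",
    "last_name": "Lovelace",
    "completed_tasks": 3,
    "total_tasks": 4,
    "created_at": date(2024, 3, 15),
    "amount": 1500,
    "email": " Ada.Lovelace@Example.com ",
}


def quick_examples(row):
    """Compute plain-Python analogues of the five formulas for one row."""
    return {
        # concat("first_name", ' ', "last_name")
        "full_name": row["first_name"] + " " + row["last_name"],
        # div("completed_tasks", "total_tasks") |> mul(100)
        # read here as mul(div(...), 100) (assumed pipe semantics)
        "pct_complete": row["completed_tasks"] / row["total_tasks"] * 100,
        # date_part("created_at", 'month')
        "created_month": row["created_at"].month,
        # if("amount" > 1000, {then: 'high_value', else: 'standard'})
        "tier": "high_value" if row["amount"] > 1000 else "standard",
        # lower("email") |> replace(' ', '')
        "clean_email": row["email"].lower().replace(" ", ""),
    }


result = quick_examples(row)
print(result)
```

Each value depends only on the columns of the row it is computed for, which is the property that makes a formula column rule-based and repeatable.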
EraQL: The Formula Language
Formula columns are powered by EraQL, a functional query language with a rich set of built-in functions for:
- Math: add, sub, mul, div, round, abs, floor, ceil
- Strings: concat, lower, upper, replace, substring, split_part
- Dates: date_add, date_diff, date_trunc, date_part, now
- Logic: if, and, or, not, coalesce, is_null
- Comparisons: eq, gt, lt, gte, lte, one_of
For the complete syntax and all available functions, see the EraQL Language Reference.
Creating a Formula Column
- Navigate to your dataset and open the column editor
- Select Formula as the column type
- Write your EraQL expression
- Use the preview feature to test on sample data
- Publish when ready
Formula columns execute instantly and deterministically: the same input always produces the same output, which makes them reliable for business-critical calculations.