As artificial intelligence continues to reshape the technology landscape, even established fields like data analytics are experiencing profound transformations. We had the pleasure of speaking with Dr. Michael Flaxman, Head of Product at Heavy AI, a company that has spent the last decade pioneering big data analytics solutions and is now at the forefront of integrating AI into their platform. Our conversation explored the challenges and opportunities of bringing conversational AI to enterprise analytics, balancing accuracy with usability, and the evolving expectations of users in an AI-powered world. Here are the key takeaways from our discussion:
Dr. Flaxman: Heavy AI has specialised in big data analytics for about a decade now, with a particular focus on geospatial and temporal data sets containing billions of records. Think of things like GPS pings from every Ford vehicle globally - that's the scale we're dealing with.
Historically, we've focused on classic predictive analytics for these massive data sets. But more recently, we've pivoted to become an AI-powered analytics platform. Our latest product, Heavy IQ, integrates large language models directly with our analytics engine. This allows users to query their data using natural language, which we then translate into SQL and visualisations.
It's been an interesting transition. In the past, users expected their analytics software to be 100% accurate but had to navigate some very clunky interfaces. Now, there's a much higher premium on ease of use, but also more tolerance for potential errors - within reason. We're navigating that shift in expectations while still maintaining the rigour necessary for enterprise analytics.
Dr. Flaxman: Certainly. At a high level, we're doing two main things: translating natural language queries into SQL, and then translating the query results into appropriate visualisations. The first part - natural language to SQL - is something others in the industry are working on as well. There are established benchmarks we can measure ourselves against.
The visualisation piece is much more uncharted territory. There's no definitive rubric for what makes a "good" map or chart in response to a given query. It's highly contextual and often domain-specific. For instance, a telecom engineer and an ecologist might ask very similar questions about location data, but expect entirely different visualisations as the output.
We're constantly trying to balance providing enough context without overwhelming the user. If someone asks for a map of buildings in New York, do we just show the buildings? Or do we include streets, water boundaries, and other geographic context? There's no universally right answer.
Another challenge is handling ambiguity and iterative refinement. Users often start with a vague query and then want to zoom in, filter, or pivot based on what they see. We're working on making that conversational flow as intuitive as possible.
Dr. Flaxman: This is a thorny issue that we're actively grappling with. On one hand, exposing the underlying SQL provides transparency and allows tech-savvy users to verify that we've correctly interpreted their intent. But for non-technical users, that doesn't help much.
We're attacking this from a few angles. First, we're putting a lot of emphasis on metadata and governance. We use AI to extract and generate metadata about tables and columns, but then allow users to edit and refine that information. This helps provide context to the model about what different fields actually represent.
We're also working on better ways to capture and apply domain-specific knowledge and terminology. Different industries and even individual companies often have their own jargon and specific ways of defining key metrics. We're exploring ways to let users essentially "teach" the model about their particular business context and KPIs.
Ultimately, I think this comes down to finding the right balance between automation and human oversight. We want to leverage AI to make analytics more accessible and efficient, but there will always be a need for human judgement and domain expertise in ensuring the results are meaningful and accurate in a business context.
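To make the metadata idea above concrete, here is a minimal editorial sketch of how user-editable table and column descriptions, plus a small glossary of business terms, might be assembled into the context a language model sees before it writes SQL. It is not Heavy AI's implementation: the `ColumnMeta`, `TableMeta`, and `build_prompt` names, the `vehicle_pings` table, and the prompt layout are all assumptions made for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class ColumnMeta:
    name: str
    sql_type: str
    description: str  # hypothetical: drafted by a model, then editable by a user


@dataclass
class TableMeta:
    name: str
    description: str
    columns: list[ColumnMeta] = field(default_factory=list)


def build_prompt(question: str, tables: list[TableMeta], glossary: dict[str, str]) -> str:
    """Assemble the text context a language model would read before writing SQL."""
    lines = ["Schema:"]
    for table in tables:
        lines.append(f"- {table.name}: {table.description}")
        for col in table.columns:
            lines.append(f"    {col.name} ({col.sql_type}): {col.description}")
    if glossary:
        lines.append("Business terms:")
        for term, definition in sorted(glossary.items()):
            lines.append(f"- {term}: {definition}")
    lines.append(f"Question: {question}")
    lines.append("Answer with a single SQL query.")
    return "\n".join(lines)


# Hypothetical table, loosely modelled on the GPS-ping example in the interview.
pings = TableMeta(
    name="vehicle_pings",
    description="One row per GPS ping from a vehicle",
    columns=[
        ColumnMeta("vin", "TEXT", "Vehicle identifier"),
        ColumnMeta("ts", "TIMESTAMP", "Time of the ping (UTC)"),
        ColumnMeta("spd", "FLOAT", "Speed"),
    ],
)

# A user refines a terse, auto-generated description to remove ambiguity.
pings.columns[2].description = "Speed in km/h at the time of the ping"

prompt = build_prompt(
    "How many active vehicles did we have yesterday?",
    [pings],
    {"active vehicle": "a vehicle with at least one ping in the last 24 hours"},
)
print(prompt)
```

The point of the sketch is that both the edited column description and the company-specific definition of "active vehicle" end up in front of the model, which is one plausible way to "teach" it a particular business context without retraining.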
Dr. Flaxman: This is something we debate internally every day. There's a constant tension between wanting to provide fast, conversational interactions and ensuring the highest possible accuracy. Larger language models tend to be more accurate but also have higher latency, which can disrupt the flow of a conversation.
Our baseline goal has been to achieve over 90% accuracy on a wide range of queries. But even at that level, there's still a real risk of eroding user trust if they encounter incorrect results, especially in high-stakes business contexts.
We're approaching this challenge from a few angles. First, we're investing heavily in domain-specific fine-tuning and prompt engineering to improve accuracy in the contexts that matter most to our users. For example, we train heavily around maps and spatial joins even though these don't appear in today's common English-to-SQL benchmarks. We've also implemented feedback mechanisms that allow users to rate responses and help the system learn over time.
Another key strategy is being very transparent about the query generation process. We always show users the SQL we've generated alongside the results, allowing those who are SQL-savvy to verify and modify if needed. For less technical users, we're exploring better ways to explain the logic and assumptions behind our answers in plain language.
Ultimately, I think the key is setting appropriate expectations and providing the right guardrails. For certain types of exploratory analysis or quick insights, users might be comfortable with a small margin of error in exchange for speed and convenience. But for mission-critical reports or financial calculations, they'll likely still want human verification. Our goal is to support both modes effectively within the same platform.
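For readers wondering what an accuracy figure such as "over 90%" can mean for natural-language-to-SQL, one common convention is execution accuracy: a generated query counts as correct if it returns the same rows as a hand-written reference query. The sketch below illustrates that idea only. It is not Heavy AI's evaluation method; SQLite stands in for the analytics engine, and the `buildings` table, the test cases, and the helper names are assumptions made for illustration.

```python
import sqlite3


def run(conn, sql):
    """Return the query's rows as a sorted list, or None if the SQL fails.

    Sorting makes the comparison order-insensitive, so this simplified
    check does not verify ORDER BY clauses.
    """
    try:
        return sorted(conn.execute(sql).fetchall())
    except sqlite3.Error:
        return None


def execution_accuracy(conn, cases):
    """Fraction of cases whose generated SQL returns the same rows as the reference SQL."""
    correct = 0
    for generated_sql, reference_sql in cases:
        expected = run(conn, reference_sql)
        actual = run(conn, generated_sql)
        if expected is not None and actual == expected:
            correct += 1
    return correct / len(cases)


# Hypothetical toy data standing in for a real data set.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE buildings (id INTEGER, borough TEXT, floors INTEGER);
    INSERT INTO buildings VALUES (1, 'Manhattan', 40), (2, 'Brooklyn', 5), (3, 'Manhattan', 12);
""")

cases = [
    # (generated SQL, reference SQL)
    ("SELECT COUNT(*) FROM buildings WHERE borough = 'Manhattan'",
     "SELECT COUNT(id) FROM buildings WHERE borough = 'Manhattan'"),
    ("SELECT id FROM buildings WHERE floors > 12",   # off-by-one on the threshold
     "SELECT id FROM buildings WHERE floors >= 12"),
    ("SELECT borough FROM building",                 # misspelled table: fails to execute
     "SELECT DISTINCT borough FROM buildings"),
    ("SELECT MAX(floors) FROM buildings",
     "SELECT floors FROM buildings ORDER BY floors DESC LIMIT 1"),
]

accuracy = execution_accuracy(conn, cases)
print(f"execution accuracy: {accuracy:.0%}")  # prints "execution accuracy: 50%"
```

Note that the second case executes without error and returns a plausible-looking answer that is simply wrong. That is the kind of failure that erodes trust, and it is why showing the generated SQL alongside the results matters.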
Our conversation with Dr. Flaxman underscored both the immense potential and the nuanced challenges of bringing conversational AI to enterprise analytics. As Heavy AI continues to push the boundaries in this space, a few key principles emerged:
As the worlds of AI and analytics continue to converge, companies like Heavy AI are charting new territory in making powerful data insights more accessible and actionable for enterprises. By thoughtfully navigating the tensions between ease of use, accuracy, and domain specificity, they are paving the way for a future where advanced analytics are as intuitive to use as they are powerful. The journey is still in its early stages, but the potential to democratise data-driven decision making across organisations is immense.