Text-to-SQL Over a 1,200-Table Data Warehouse
2-day turnaround → seconds
Client Context
The client is a financial services company with a 1,200-table data warehouse holding 8 years of transaction, customer, and operational data. Business users depended on a small analytics team for every data request.
The Challenge
Analysts had a 2-day turnaround on data requests. Business users couldn’t access their own data without SQL expertise. The analytics team was a bottleneck, and by the time answers arrived, the questions had often changed.
Our Approach
We built a Text-to-SQL pipeline around a semantic schema catalogue: every table and column description is embedded with BGE-M3 and stored in Qdrant. When a user asks a question, the system first retrieves the relevant tables from the catalogue, then generates SQL against them, validates the query, and returns the results in natural language with the underlying query visible.
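To make the retrieve-then-validate flow concrete, here is a minimal sketch, not the production implementation. It makes several assumptions: the three catalogue entries are invented, a toy bag-of-words vector stands in for BGE-M3 embeddings, an in-memory ranking stands in for a Qdrant search, and SQLite stands in for the warehouse. The SQL generation step (a language model call) is omitted; the function names `relevant_tables` and `validate_sql` are hypothetical.

```python
import math
import re
import sqlite3
from collections import Counter

# Hypothetical catalogue entries (table name -> description); the real
# catalogue covers 1,200 tables and is stored as vectors in Qdrant.
CATALOGUE = {
    "transactions": "card and account transactions with amount, currency and posting date",
    "customers": "customer profile, segment, onboarding date and country",
    "branches": "branch locations, opening hours and regional manager",
}


def embed(text):
    """Toy bag-of-words vector; a stand-in for a real embedding model."""
    return Counter(re.findall(r"[a-z]+", text.lower()))


def cosine(a, b):
    """Cosine similarity between two sparse term-count vectors."""
    dot = sum(a[t] * b[t] for t in a)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(
        sum(v * v for v in b.values())
    )
    return dot / norm if norm else 0.0


def relevant_tables(question, top_k=2):
    """Rank catalogue tables by similarity between question and description."""
    q = embed(question)
    ranked = sorted(
        CATALOGUE, key=lambda t: cosine(q, embed(CATALOGUE[t])), reverse=True
    )
    return ranked[:top_k]


def validate_sql(sql, conn):
    """Accept only a single SELECT statement that the database can plan.

    Deliberately conservative: any extra semicolon is rejected, even one
    inside a string literal.
    """
    stripped = sql.strip().rstrip(";")
    if ";" in stripped or not stripped.lower().startswith("select"):
        return False
    try:
        # EXPLAIN compiles the query without running it, so unknown tables
        # or columns raise an error here.
        conn.execute("EXPLAIN " + stripped)
        return True
    except sqlite3.Error:
        return False


# SQLite stands in for the warehouse so the sketch is self-contained.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE transactions (id INTEGER, amount REAL, currency TEXT, posted_on TEXT)"
)

print(relevant_tables("total transaction amount by currency"))
print(validate_sql("SELECT currency, SUM(amount) FROM transactions GROUP BY currency", conn))
print(validate_sql("DROP TABLE transactions", conn))
```

Narrowing the prompt to a handful of retrieved tables is what keeps generation tractable against a schema this large, and the validation step gives a cheap check that generated SQL is read-only and refers to objects that exist before anything is executed.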
Timeline: 14 weeks
The Results
- Queries that required a 2-day analyst turnaround now return in seconds
- Business users self-serve 70% of data requests
- Analytics team freed for complex analysis and model development
- 1,200 tables catalogued with semantic descriptions