Text-to-SQL Over a 1,200-Table Data Warehouse
2-day turnaround → seconds
Client Context
A financial services company with a 1,200-table data warehouse containing 8 years of transaction, customer, and operational data. Business users depend on a small analytics team for every data request.
The Challenge
Analysts have a 2-day turnaround on data requests. Business users can't access their own data without SQL expertise. The analytics team is a bottleneck, and by the time answers arrive, the questions have often changed.
Our Approach
We build a Text-to-SQL pipeline around a semantic schema catalogue: every table and column description is embedded with BGE-M3 and stored in Qdrant. When a user asks a question, the system first retrieves the relevant tables from the catalogue, then generates SQL against those tables, validates it, and returns results in natural language with the underlying query visible.
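The retrieve, generate, validate flow can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the production pipeline: a bag-of-words cosine score stands in for BGE-M3 embeddings and Qdrant search, an in-memory SQLite database stands in for the warehouse, and the catalogue entries, table names, and function names (`relevant_tables`, `build_prompt`, `validate`, `make_demo_db`) are all hypothetical. The SQL generation step is represented only by the prompt it would receive.

```python
import math
import re
import sqlite3
from collections import Counter

# Hypothetical catalogue entries; a real catalogue would describe ~1,200 tables.
CATALOGUE = {
    "transactions": "card and bank transaction records with amount, date and customer id",
    "customers": "customer profile with name, segment and onboarding date",
    "branches": "branch office locations and operating hours",
}

def embed(text):
    # Toy stand-in for an embedding model: lower-cased word counts.
    return Counter(re.findall(r"[a-z]+", text.lower()))

def cosine(a, b):
    dot = sum(a[t] * b[t] for t in a)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0

def relevant_tables(question, top_k=2):
    # Step 1: rank catalogue descriptions by similarity to the question.
    q = embed(question)
    ranked = sorted(CATALOGUE, key=lambda t: cosine(q, embed(CATALOGUE[t])), reverse=True)
    return ranked[:top_k]

def build_prompt(question, tables):
    # Step 2: give the SQL generator only the shortlisted tables, not the whole schema.
    schema = "\n".join(f"- {t}: {CATALOGUE[t]}" for t in tables)
    return f"Tables:\n{schema}\n\nWrite one SQL SELECT that answers: {question}"

def validate(conn, sql):
    # Step 3: accept only a single SELECT statement that the database can plan.
    body = sql.strip().rstrip(";")
    if not body.lower().startswith("select") or ";" in body:
        return False
    try:
        conn.execute("EXPLAIN QUERY PLAN " + body)  # compiles the query without running it
        return True
    except sqlite3.Error:
        return False

def make_demo_db():
    # In-memory stand-in for the warehouse, with two of the catalogued tables.
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE transactions (customer_id INTEGER, amount REAL, date TEXT)")
    conn.execute("CREATE TABLE customers (customer_id INTEGER, name TEXT, segment TEXT)")
    return conn
```

For the question "total transaction amount per customer", `relevant_tables` shortlists `transactions` and `customers`, and `validate` rejects generated SQL that references a table missing from the database or that is not a plain `SELECT`. Shortlisting before generation keeps the prompt small however many tables the catalogue holds, and validating before execution means a bad query fails without touching the data.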
Timeline: 14 weeks
The Results
- Queries that previously required a 2-day analyst turnaround return in seconds
- Business users projected to self-serve 70% of data requests
- Analytics team freed for complex analysis and model development
- 1,200 tables catalogued with semantic descriptions