How AI tax startup Blue J torched its entire business model for ChatGPT—and became a $300 million company

In the winter of 2022, as the tech world was becoming mesmerized by the sudden, explosive arrival of OpenAI’s ChatGPT, Benjamin Alarie faced a pivotal choice. His legal tech startup, Blue J, had a respectable business built on the AI of a bygone era, serving hundreds of accounting firms with predictive models. But it had hit a ceiling.

Alarie, a tenured tax law professor at the University of Toronto, saw the nascent, error-prone, yet powerful capabilities of large language models not as a curiosity, but as the future. He made a high-stakes decision: to pivot his entire company, which had been painstakingly built over nearly a decade, and rebuild it from the ground up on this unproven technology.

That bet has paid off handsomely. Blue J has since quietly secured a $122 million Series D funding round co-led by Oak HC/FT and Sapphire Ventures, placing the company's valuation at over $300 million. The move transformed Blue J from a niche player into one of Canada's fastest-growing legal tech firms, multiplying its revenue roughly twelve-fold and attracting 10 to 15 new customers every day.

The company now serves more than 3,500 organizations, including global accounting giant KPMG and several Fortune 500 companies. It is tackling a critical bottleneck in the professional services industry: a severe and worsening talent shortage. The U.S. has 340,000 fewer accountants than it did five years ago, and with 75% of current CPAs expected to retire in the next decade, firms are desperate for tools that can amplify the productivity of their remaining experts.

"What once took tax professionals 15 hours of manual research to do can now be completed in about 15 seconds with Blue J," Alarie, the company's CEO, said in an exclusive interview with VentureBeat. "That value proposition, we can take hours of work and turn it into seconds of work, that is driving a lot of this."

When the dean's biography was wrong: the moment that changed everything

Alarie vividly remembers January 2023, when the dean of the law school stopped by his office for New Year's greetings. He asked her about ChatGPT and prompted the AI to describe her. ChatGPT confidently generated a biography. Some details were accurate. Others were completely fabricated.

"She was like, 'Okay, this is really kind of scary. This is wrong, and this has implications,'" Alarie said. Yet that moment of obvious failure didn't deter him. Instead, it crystallized his conviction.

The company's first iteration, launched in 2015, used supervised machine learning to build predictive models that could forecast judicial outcomes on specific tax issues. While technically sophisticated, the tool had a fundamental flaw: its coverage was limited to those specific issues.

"The challenge was it couldn't answer every tax research question, which was really the holy grail," Alarie said. Customers loved the tool when it applied to their problem, but would quickly abandon it when it didn't. Revenue plateaued around $2 million annually.

Despite ChatGPT's notorious hallucinations, Alarie convinced his board to make the pivot. "I had this conviction that if we continued down that path, we weren't going to be able to address our number one limitation," he said. "Large language models seemed like a very promising direction."

He gave his team six months to deliver a working product.

From 90-second responses to 3 million queries: How Blue J tamed AI hallucinations

By August 2023, Blue J was ready to launch. What they released was, in Alarie's candid assessment, "super janky." The system took 90 seconds to respond. About half the answers had issues. The Net Promoter Score registered at just 20.

What transformed that flawed product into today's platform, with response times measured in seconds, a dissatisfaction rate of just one in 700 queries, and an NPS in the mid-80s, was a relentless focus on three strategic pillars.

First is proprietary content at massive scale. Blue J secured exclusive licensing with Tax Analysts (Tax Notes) and IBFD, the Amsterdam-based global tax authority covering 220+ jurisdictions. "We are the only platform on earth that takes in the best U.S. tax information from Tax Notes and the best global tax information from IBFD," Alarie said.

Second is deep human expertise. Blue J employs tax experts led by Susan Massey, who spent 13 years at the IRS Office of Chief Counsel as Branch Chief for Corporate Tax. Her team constantly tests the AI and refines its performance.

Third is a feedback flywheel. Blue J processed more than 3 million tax research queries in 2025, and each query generates feedback that flows back into the system.

Weekly active user rates hover between 75% and 85%, compared to 15% to 25% for traditional platforms. "A charitable ratio is like we're five times more intensively used," Alarie noted.

Inside Blue J's early access partnership with OpenAI

Blue J maintains an unusually close relationship with OpenAI that has proven crucial to its success. "We have a very good relationship with OpenAI, and we get early access to their models," Alarie said. "It's quite collaborative. We give them a lot of really high quality feedback about how well different versions of forthcoming models are performing."

This feedback proves valuable because Blue J has developed what Alarie calls "ecologically valid" test questions — drawn from actual tax professional queries, with correct answers determined by Blue J's expert team. This helps OpenAI improve performance on complex reasoning tasks.

The company tests models from all major providers — OpenAI, Anthropic, Google's Gemini, and open-source alternatives — continuously evaluating which performs best. "We're not necessarily 100% committed to any particular provider," he explained. "We're testing all the time."
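As a rough illustration of this kind of side-by-side testing, here is a minimal sketch in Python. It is not Blue J's harness: the model callables, the placeholder questions, and the exact-match grading rule are all assumptions made for the example. A real evaluation would call each provider's API and rely on expert review rather than string comparison.

```python
from typing import Callable, Dict, List, Tuple

# Hypothetical expert-labeled test set: (question, expert-approved answer).
TestSet = List[Tuple[str, str]]
Model = Callable[[str], str]


def score_model(model: Model, tests: TestSet) -> float:
    """Return the fraction of questions answered the way the experts did.

    Grading here is a case-insensitive exact match, which is an assumption
    for illustration only.
    """
    if not tests:
        raise ValueError("test set is empty")
    correct = sum(
        1
        for question, expected in tests
        if model(question).strip().lower() == expected.strip().lower()
    )
    return correct / len(tests)


def rank_models(models: Dict[str, Model], tests: TestSet) -> List[Tuple[str, float]]:
    """Score every candidate model and sort from best to worst."""
    scores = [(name, score_model(fn, tests)) for name, fn in models.items()]
    return sorted(scores, key=lambda pair: pair[1], reverse=True)


# Placeholder data and stand-in models (hypothetical, not real providers).
TESTS: TestSet = [("placeholder question 1", "yes"), ("placeholder question 2", "no")]


def model_a(question: str) -> str:
    answers = {"placeholder question 1": "Yes", "placeholder question 2": "no"}
    return answers.get(question, "")


def model_b(question: str) -> str:
    return "yes"  # always gives the same answer


RANKING = rank_models({"model_a": model_a, "model_b": model_b}, TESTS)
print(RANKING)
```

Keeping the test set fixed and swapping only the callable is what lets a team re-run the same comparison whenever a provider ships a new model.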

This approach helps Blue J navigate a challenging business model: charging approximately $1,500 per seat annually for unlimited queries while absorbing variable compute costs. "We've pre-committed to delivering them a really good user experience, unlimited tax research answers at a fixed price," Alarie said. "We're absorbing a lot of that risk."
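The risk in flat-rate pricing can be made concrete with some simple arithmetic. The sketch below uses the roughly $1,500 seat price from the article. The per-query compute cost and the usage figure are purely hypothetical, since Blue J has not disclosed either.

```python
SEAT_PRICE_USD = 1500.0  # approximate annual price per seat, from the article


def seat_margin(queries_per_year: int, cost_per_query_usd: float,
                seat_price_usd: float = SEAT_PRICE_USD) -> float:
    """Revenue left per seat after paying for that seat's compute."""
    return seat_price_usd - queries_per_year * cost_per_query_usd


def break_even_queries(cost_per_query_usd: float,
                       seat_price_usd: float = SEAT_PRICE_USD) -> float:
    """Number of queries per year at which compute cost equals the seat price."""
    if cost_per_query_usd <= 0:
        raise ValueError("cost per query must be positive")
    return seat_price_usd / cost_per_query_usd


# Hypothetical inputs: $0.25 of compute per query, 2,000 queries per seat per year.
ASSUMED_COST_PER_QUERY = 0.25
ASSUMED_QUERIES = 2000

print(seat_margin(ASSUMED_QUERIES, ASSUMED_COST_PER_QUERY))
print(break_even_queries(ASSUMED_COST_PER_QUERY))
```

Under these assumed numbers a seat stays profitable until usage reaches the break-even point. Heavier usage or higher model prices move a seat toward a loss, which is the risk Alarie says the company is absorbing.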

Competition among foundation model providers creates downward pressure on API pricing, while Blue J's conservative usage modeling has proven accurate. Gross revenue retention exceeds 99%, while net revenue retention reaches 130% — considered best-in-class for SaaS businesses.

Taking on Thomson Reuters and LexisNexis with 75% weekly engagement

Blue J faces competition from established publishers like Thomson Reuters, LexisNexis, and Bloomberg, all of which announced AI capabilities throughout 2023 and 2024. Yet Blue J's engagement metrics suggest it has captured significant momentum, growing from just 200 customers in 2021 to over 3,500 organizations today.

Daily content updates are also crucial. While the tax code itself changes only when Congress acts, the ecosystem evolves constantly through IRS regulations, new rulings, and court cases. All 50 states modify their tax codes regularly.

"Things are changing literally every day," Alarie said. "Every day we're updating the materials, and that's just the U.S. We cover Canada, we cover the UK. The aspirations are truly global for this thing."

Alarie's ambitions extend beyond building a successful startup. As author of the award-winning book "The Legal Singularity" and faculty affiliate at the Vector Institute for Artificial Intelligence, he has spent years contemplating AI's long-term impact on law.

In academic papers published in Tax Notes throughout 2023 and 2024, he chronicled generative AI's rise, predicting that "clients will become substantially more sophisticated" and that AI would push human experts toward higher-value strategic roles rather than routine research.

Blue J's $122 million plan: From tax research to 'global tax cognition'

The Series D funding, which brought total capital raised to over $133 million, will fuel aggressive geographic and product expansion. Blue J already operates in the U.S., Canada, and the U.K., with plans to eventually cover 220+ jurisdictions through its IBFD partnership.

Future capabilities could include automated memo generation, tax form completion, document drafting, and conversational history maintaining context across sessions—transforming Blue J from a research tool into what Alarie describes as "the operating layer for global tax cognition."

For all its success, Blue J operates in a domain where errors carry serious consequences. The hallucination problem hasn't been eliminated — it's been minimized through careful engineering, content curation, and human oversight. Blue J has trained its models to acknowledge when they cannot answer a question rather than fabricate information.

The business also faces economic risks if compute costs spiral or usage patterns exceed projections. And subtler questions loom about professional judgment: as AI systems become more capable, will users defer to outputs without sufficient critical evaluation?

From 15 hours to 15 seconds: What Blue J's AI pivot teaches every industry

Blue J's transformation offers lessons beyond tax software. The company's willingness to abandon eight years of proprietary technology and rebuild on an initially unreliable foundation required both courage and calculated risk-taking.

The decision paid off not because generative AI was inherently superior to supervised machine learning in all dimensions, but because it addressed the right problem: comprehensiveness rather than precision in narrow domains. Tax professionals didn't need 95% accuracy on 5% of questions. They needed good-enough accuracy on 100% of questions.
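One way to see the trade-off is to multiply coverage by accuracy, which gives the share of all incoming questions that end up with a correct answer. The sketch below uses the 95% and 5% figures from the paragraph above. The 80% figure standing in for "good-enough" accuracy is an assumption, since the article gives no number.

```python
def useful_share(coverage: float, accuracy: float) -> float:
    """Share of all questions that are both covered and answered correctly."""
    for value in (coverage, accuracy):
        if not 0.0 <= value <= 1.0:
            raise ValueError("coverage and accuracy must be between 0 and 1")
    return coverage * accuracy


# Narrow tool: 95% accuracy on 5% of questions (figures from the article).
NARROW = useful_share(coverage=0.05, accuracy=0.95)

# Broad tool: 100% coverage at an assumed 80% accuracy (hypothetical figure).
BROAD = useful_share(coverage=1.0, accuracy=0.80)

print(f"narrow: {NARROW:.2%}, broad: {BROAD:.2%}")
```

Under these assumptions the narrow tool correctly answers just under 5% of everything a professional might ask, while the broad tool answers 80%, even though its per-question accuracy is lower.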

The improvement from an NPS of 20 to 84 in just over two years reflects relentless iteration informed by massive data collection. The content partnerships created differentiation that pure technology couldn't replicate. The team of tax experts provided domain knowledge necessary to ensure reliability.

Most fundamentally, Blue J recognized that the real competition wasn't other AI startups or even established publishers. It was the old way of doing things — the 15 hours of manual research, the institutional knowledge locked in retiring professionals' heads.

"People are like, 'What does Blue J do? They provide better tax answers. Okay, I think we need that,'" Alarie reflected.

As AI transforms profession after profession, that clarity of purpose may matter more than technological sophistication. The future belongs not to those who build the most advanced AI, but to those who most effectively harness it to solve problems humans actually have.

For a tax law professor who started with frustration about inefficient research methods, building a $300 million company marks an audacious endpoint. For the thousands of professionals now answering complex questions in 15 seconds instead of 15 hours, it represents the future of their profession, arriving faster than most expected.

The bet on ChatGPT, made while it was still hallucinating biographies, has validated the idea that sometimes the riskiest move is not to move at all.
