Notes from the work.
Short, considered thoughts. Briefs, builds, things I noticed.
- NOTE
Watched a client click the "regenerate" button six times in thirty seconds today, getting six slightly different answers. None better than the first. The model wasn't the problem. The interface was teaching them that more clicks = better output. We added a one-line subhead under the button: "If the first answer feels off, tell me what's wrong — I'll do better than guessing again." Click rate dropped 70%. Quality of conversations went up. Sometimes the AI feature is the copy.
- BRIEF
What they asked for: a chatbot. What they actually needed: a better intake form.
A services company came in wanting "an AI chatbot for the website." They had three quotes already, all in the $40–80k range. They wanted to know what we'd charge.
Spent twenty minutes on the call asking what the chatbot was for. The honest answer, after a bit: "We're tired of getting demo requests from people who aren't a fit, and our sales team is wasting hours on bad-fit calls."
That's not a chatbot problem. That's a qualification problem.
We rebuilt their existing contact form into a five-question conversational intake — the same one their sales team would ask on the first call anyway. Sent each submission to Slack with a fit verdict (strong fit / maybe / pass) and a one-line reasoning, plus a draft reply they could copy. Two weeks of build time. About $6k.
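A rough sketch of what that Slack hand-off can look like, assuming the verdict and reasoning have already been decided upstream (by a model or a person). The `Submission` fields and the `build_slack_message` helper are hypothetical names for illustration, not the client's actual build. The only thing assumed about Slack is that incoming webhooks accept a JSON body with a `text` field.

```python
import json
from dataclasses import dataclass

VERDICTS = ("strong fit", "maybe", "pass")  # the three verdicts from the brief


@dataclass
class Submission:
    company: str
    answers: dict[str, str]   # question -> answer (hypothetical field names)
    verdict: str              # one of VERDICTS, decided upstream
    reasoning: str            # one-line justification
    draft_reply: str          # reply the sales team can copy


def build_slack_message(sub: Submission) -> str:
    """Return a JSON body suitable for a Slack incoming webhook."""
    if sub.verdict not in VERDICTS:
        raise ValueError(f"unknown verdict: {sub.verdict!r}")
    lines = [
        f"*{sub.company}*: {sub.verdict.upper()}",
        f"Why: {sub.reasoning}",
        "Answers:",
        *[f"- {q}: {a}" for q, a in sub.answers.items()],
        "Draft reply:",
        sub.draft_reply,
    ]
    return json.dumps({"text": "\n".join(lines)})
```

Rejecting anything outside the three verdicts keeps the channel readable: the sales team only ever sees one of three labels.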
Demo no-shows dropped from 38% to 11%. Their sales lead said it was the first time in six months he'd opened a Monday inbox without dread.
The "AI chatbot" framing wasn't wrong, exactly. It just hadn't been pressure-tested. Most of the time, when someone asks for AI, they're describing a symptom — not the thing they actually need. The work is figuring out what's underneath.
That's why every engagement here starts with a brief, not a quote.
- BUILD
Rebuilt our website audit. It now takes 47 minutes.
It used to take three to four hours. Manual crawl, spreadsheet of issues, screenshots, a write-up. Mostly busywork — the kind a senior person does badly because they've already done it ten thousand times.
Spent a weekend rebuilding it as an agent loop. Crawls the site, runs Lighthouse, reads page copy, checks the structured data, looks at the nav hierarchy, compares against three competitors. Surfaces issues with severity + a suggested fix + a screenshot. Outputs a markdown report and a Loom-ready talking-points doc.
47 minutes start to finish. The findings are deeper than the old version, not shallower — because the agent doesn't get tired on page seventeen and start glossing.
Two things I didn't expect:
- The hardest part was prompting the model to not fabricate findings when nothing was wrong. "Nothing notable" is a valid answer. Most of the prompt is anti-padding rules.
- The audit is now small enough to run as a free thing for prospects. That changed the sales conversation more than the time savings did.
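The anti-padding idea is easier to see in the report step than in the prompt. A minimal sketch, assuming a hypothetical `Finding` record and a three-level severity scale (neither is the real tool's schema): an empty list of findings renders as "Nothing notable." instead of being filled out.

```python
from dataclasses import dataclass

SEVERITY_ORDER = {"high": 0, "medium": 1, "low": 2}  # assumed severity scale


@dataclass
class Finding:
    page: str
    issue: str
    severity: str        # "high", "medium", or "low"
    suggested_fix: str
    screenshot: str      # path to the captured screenshot


def render_report(site: str, findings: list[Finding]) -> str:
    """Render findings as markdown, most severe first."""
    out = [f"# Audit: {site}", ""]
    if not findings:
        # An empty result is reported as-is rather than padded.
        out.append("Nothing notable.")
        return "\n".join(out)
    for f in sorted(findings, key=lambda f: SEVERITY_ORDER[f.severity]):
        out += [
            f"## [{f.severity}] {f.issue}",
            f"- Page: {f.page}",
            f"- Suggested fix: {f.suggested_fix}",
            f"- Screenshot: {f.screenshot}",
            "",
        ]
    return "\n".join(out).rstrip()
```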
This is the shape of most of the AI work I do. Not magic. Just removing the parts a human shouldn't have been doing in the first place.
- LINK
Henrik Karlsson — A blog post is a search query
"A blog post is a search query for the people who would understand you."
Reading this back tonight after a week of writing client briefs that all secretly want to be the same essay. The post is mostly about why writing in public works as a filter — but the line above is doing real work for me on a different question: what's the right format for a one-person consultancy's voice online? Not "thought leadership." Not "case studies." Search queries. Things specific enough that the right person nods and the wrong person bounces.
- OBSERVATION
Most AI shops are engineers who can't market, or marketers who can't build.
I've been looking at a lot of AI consultancy sites lately. There are roughly two patterns.
The engineering shops have a hero that says "We Build AI Solutions" in 96px Inter. The case studies are charts with axes that aren't labeled. The team page is six guys named Vikram and one designer they hired off Upwork. They will absolutely build you a working RAG pipeline. It will work for three weeks and then you'll quietly stop using it because no one explained what it was for.
The marketing shops have a hero with a soft gradient and a quote from McKinsey about how AI will add $13 trillion to the global economy by 2030. The case studies have words like "intelligent automation" and "human-centered AI." There is a chatbot on the corner of the site that opens with "Hi! I'm Aria, how can I help you transform your business today?" When you ask it a real question it loops back to "Would you like to schedule a discovery call?" They cannot ship.
The third pattern, which nobody is doing well, is a designer who can build. Someone who has spent ten years thinking about why software fails its users — and now has a tool that lets them ship the fix.
That's the gap I'm trying to sit in.
It's not really about AI. AI is just the latest material. The discipline is older: figure out what someone is actually trying to make happen, then build the smallest possible thing that makes it happen, then watch them use it and iterate until the friction is gone.
I keep meeting business owners who've been pitched by both shops and walked away frustrated. The engineering shop quoted them $80k for something they couldn't explain. The marketing shop quoted them $80k for a deck.
Half the time the actual thing they need costs $4k and ships in a week.
- NOTE
Why most of my clients are within an hour of Toronto, even though none of the work needs to be
Every project I run is remote-capable. The model doesn't care where the laptop is. And yet most of my work as an AI consultant in Toronto has stayed within an hour of the city.
I think it's because the first conversation has to be in person, or feel like it is. Coffee near King West, a lunch in Mississauga, a Zoom that runs forty minutes long because nobody's checking the time. That's where the real brief surfaces — the part the prospect didn't know they were going to say. Once that's done, the rest can ship from anywhere. I have US clients too. They just take an extra week to get past the polite ask and into the real one.
Geography isn't the constraint. Trust is. Toronto is just where mine is densest.
- BRIEF
An SMB owner who almost spent $90k. The right answer was $0 for now.
A founder reached out in February. Twelve-person services company, mostly residential clients across the GTA, profitable, no debt. He'd been pitched by two AI firms in the previous month and was deciding between them. The quotes were $62k and $89k. Both promised some version of "AI for small business" — a customer-facing chatbot, a lead-scoring tool, an internal knowledge assistant, and an automated proposal generator.
He sent me both proposals and asked which one was better.
Neither, was the honest answer.
What I could see from the proposals was that nobody had asked him a basic diagnostic question. His business was growing about 18% a year on word of mouth alone. His team was at full utilization. His main bottleneck, when I asked him to describe a bad week, was that his two senior techs were both retiring within eighteen months and he hadn't started writing down what they knew.
That's a problem AI can help with. It is not the problem either firm had quoted on.
We spent the first call mapping his actual constraints. He didn't need a chatbot: his customers don't shop online and his lead flow was already saturated. He didn't need lead scoring: every lead got a callback and most of them closed. The proposal generator was solving for a process that didn't have a backlog. None of those three features, in either quote, would have changed his P&L in any direction I could see.
What he needed, and what we ended up scoping for the spring, was a project that doesn't sound like AI work at all. Sit with the two senior techs for a few days. Record the conversations. Build a structured internal knowledge tool from those recordings — searchable by job type, by failure mode, by the kind of question a junior tech would ask. Cite the source recording for every answer so the institutional memory stays attributable to the people who actually had it.
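The data shape matters more than the model here. A sketch under assumptions: the `Entry` fields, the category values, and the `search` helper are all hypothetical, and the real project was still being scoped when this was written. The point it illustrates is that every answer carries its speaker and source recording.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Entry:
    answer: str
    job_type: str        # hypothetical category, e.g. "install"
    failure_mode: str    # hypothetical category, e.g. "leak"
    speaker: str         # which senior tech said it
    recording: str       # source recording file
    timestamp: str       # position in the recording, "mm:ss"


def search(entries, job_type=None, failure_mode=None):
    """Filter by job type and/or failure mode; return answers with citations."""
    hits = [
        e for e in entries
        if (job_type is None or e.job_type == job_type)
        and (failure_mode is None or e.failure_mode == failure_mode)
    ]
    return [
        f"{e.answer} (source: {e.speaker}, {e.recording} @ {e.timestamp})"
        for e in hits
    ]
```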
Estimated cost is somewhere in the $8k–$14k range depending on how much editorial cleanup the transcripts need. Estimated value is whatever you think it's worth to not lose thirty years of trade knowledge when two people retire.
The deeper point is one I keep running into. The AI shops bidding on this account weren't malicious. They were quoting their menu against his stated ask. Nobody had asked him what was actually fragile in his business. That diagnostic step is the one thing every flashy AI proposal seems to skip, and it's also the only step that determines whether the build is worth doing at all.
If you run a small business and you're getting AI proposals, the test is simple. Ask the firm to describe, in their own words, the single biggest operational risk to your business in the next three years. If they can't, they don't know enough about you to be quoting yet. The rest of how I think about engagements follows from there. For broader Canadian SMB context, Statistics Canada tracks the kinds of structural pressures actually showing up in small business data. Most of them have nothing to do with AI.
- BUILD
An internal AI tool for an ops team that didn't want one
Built a small internal tool last month for a manufacturing client in Mississauga. The team that used it didn't ask for it and was, on the first call, openly skeptical. Six weeks later they're the ones flagging when it goes down.
The job: their ops lead was spending the first ninety minutes of every morning pulling numbers from three systems (an ERP, a shipping platform, and a shared spreadsheet) into a single email summary for the floor managers. Standard workflow-automation territory. Nothing exotic.
What I built was deliberately small. A scheduled job that pulls from all three sources at 6 a.m., runs the numbers through a model with a tight prompt that knows the client's specific lexicon (their product codes, their site names, the difference between a "short" and a "miss" in their language), and drafts the morning summary in the ops lead's voice. He reviews it for ten minutes, fixes anything off, and sends it.
Two things I had to fight for during the build:
- The model is not allowed to invent product codes it doesn't see in the source data. This sounds obvious. It is not, in practice, the default behavior. About 40% of the prompt is anti-fabrication rules, which is the same shape of problem I wrote about with the audit tool.
- The tool does not auto-send. It drafts. The ops lead was the one who insisted on this and he was right. The trust gap closes faster when a human is still the last set of eyes for the first month or two. After that, you can talk about automation. Not before.
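The rules above live in the prompt. A code-side check after generation is a different, complementary technique, and this sketch is an assumption about how one could look, not a description of the client build. The code format (two capital letters, a dash, four digits) and both function names are made up for illustration.

```python
import re

# Assumed code format for illustration: two capital letters, a dash, four digits.
CODE_PATTERN = re.compile(r"\b[A-Z]{2}-\d{4}\b")


def unknown_codes(draft: str, source_codes: set[str]) -> set[str]:
    """Return product codes that appear in the draft but not in the source data."""
    return set(CODE_PATTERN.findall(draft)) - source_codes


def ready_for_review(draft: str, source_codes: set[str]) -> bool:
    """A draft goes to the human reviewer only if it cites no unseen codes.

    Nothing here sends email; the ops lead remains the last set of eyes.
    """
    return not unknown_codes(draft, source_codes)
```

A draft that fails the check would be regenerated or flagged before the ops lead ever sees it. Either way, it still only drafts.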
What surprised me was the second-order effect. The ops lead now has ninety extra minutes a morning, which he's been spending on the floor instead of at his desk. Three weeks in, the floor managers said he'd caught two recurring issues nobody had time to look at before. The tool didn't catch those. He did. The tool just gave him back the morning.
Most of the work I do ends up shaped like this. Not a flagship AI product. A piece of plumbing that deletes a meeting or saves a morning, designed by someone who's spent a long time watching how people actually use software.
- LINK
Andy Matuschak — Tools for thought
"A medium does its best work when it's invisible."
Re-reading Andy Matuschak's notes on tools for thought this week while scoping a knowledge tool for a client. The line above isn't his exact phrasing but it's the through-line. A good tool disappears into the work; a bad one demands attention for itself. Most AI products right now are the bad kind — they want to be a destination, a tab you visit, a brand. The interesting ones are the opposite. They sit inside an existing workflow, do one specific thing, and you stop noticing them within a week. That's the bar I'm trying to hit. If a client remembers the AI is there, I've usually built it wrong.
- OBSERVATION
The custom AI vs ChatGPT subscription question is almost always the wrong one
I get asked some version of this once a week. Should we build a custom AI tool, or just pay for ChatGPT Team for the staff?
The framing is wrong, but it's wrong in a specific way that's worth pulling apart.
Buying a subscription gets you a fast, capable model and a familiar interface. Twenty dollars a head, working today. The cost of being wrong is approximately zero — you cancel and move on. If your team's main AI use case is "draft emails faster" or "summarize this PDF," a subscription is almost certainly the right answer, and any consultant telling you otherwise is selling something.
Building a custom tool gets you something different: control over what the model knows, where the data goes, how the interface fits the actual job, and what happens when the answer is wrong. That last one is the part most people skip past. A general-purpose chatbot that hallucinates a tax deadline is annoying. A staff-facing tool that hallucinates the same deadline, with your firm's logo on it, is a liability. The custom build's real value is in the guardrails, not the intelligence.
The honest decision tree is something like this. If the work is generic, like writing, summarizing, brainstorming, or light research, buy the subscription. The frontier models are better at general tasks than anything you'd build, and the gap is widening. Pay the seat fee, train your team how to actually use it, move on.
If the work depends on your specific knowledge, your specific workflow, or your specific liability exposure, building something starts to make sense. Not because the AI is better. It's the same model underneath. The wrapper around it is what's doing real work. Citations from your sources. Refusal to answer outside its lane. Logging for compliance. A UI that fits the job rather than a blank text box.
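Written down as code, the tree is almost embarrassingly short, which is part of the point. A sketch only: the three flags are my shorthand for the three conditions above, and the function name is hypothetical.

```python
def buy_or_build(specific_knowledge: bool,
                 specific_workflow: bool,
                 liability_exposure: bool) -> str:
    """Suggest a starting point; any firm-specific factor tips toward building."""
    if specific_knowledge or specific_workflow or liability_exposure:
        return "consider a custom build"
    return "buy the subscription"
```

The wording of the first branch is deliberate. A firm-specific factor makes a build worth considering, not automatic.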
The mistake I see most often runs in both directions. Companies that should obviously buy the subscription get talked into a $60k custom build because it sounds more serious. Companies that genuinely need a custom tool, usually because they have proprietary data they can't paste into a public chatbot, keep paying for subscriptions and watching their team paste sensitive data into them anyway.
If you're trying to think clearly about this, Ben Thompson's writing at Stratechery on aggregation and unbundling is a useful frame, even though he's writing about a different question. The pattern holds: the value isn't in the model, it's in the layer that makes the model fit a real job. Sometimes that layer is "ChatGPT plus a thirty-minute training session." Sometimes it's a $20k build. The skill is being honest about which one you actually need.
Most of the work I take on sits at that decision point. Half of my first calls end with me telling the prospect to buy seats and call me in six months if they hit a real wall. I've written about this dynamic before — the cheaper, more boring answer is usually the one that ships.
If you can't articulate, in one sentence, why a subscription wouldn't solve your problem, you're probably not ready for a custom build yet. And that's a useful answer, not a discouraging one.
- BRIEF
An accounting firm wanted ChatGPT. What they needed was a working memory.
A mid-sized accounting firm in the GTA reached out late last year. Twelve staff, mostly bookkeeping and tax for owner-operated businesses. The partner who called had been to a conference where someone said firms that didn't adopt ChatGPT for accounting would be gone in five years, and now he was trying to figure out what that meant for him.
His specific ask: a paid ChatGPT Team plan for everyone, plus a custom GPT trained on their tax knowledge. Quoted price from another shop, somewhere north of $30k.
I asked what he wanted his staff to actually do with it. He went quiet for a minute, then said something I think about a lot: "Honestly, I want them to stop asking me the same five questions every March."
That is not a ChatGPT problem.
We spent the first call mapping the five questions. Four of them had answers that lived in the firm's internal SOP doc, which nobody read because it was 90 pages and the search was bad. The fifth was about a recurring CRA filing edge case that only the senior partner knew how to handle, and he'd been answering it by email for fifteen years.
What we built was small. A staff-facing chat tool that searched their actual SOP doc and email archive (with the senior partner's permission), returned cited answers with the source paragraph, and flagged anything it wasn't confident about for human review. We seeded it with the senior partner's old answers to that recurring question so the institutional knowledge stopped living in his head.
Build time was about three weeks. Cost was a fraction of the original quote. The "AI" part is genuinely the smallest interesting part of it — most of the work was cleaning the SOP doc, structuring the email export, and writing the system prompt that made it admit when it didn't know something.
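The "admit when it doesn't know" behavior can be sketched without any model at all. This is a toy under stated assumptions: word overlap stands in for whatever retrieval the real tool used, and `Passage`, `answer`, and the `min_overlap` threshold are hypothetical names and values. What it shows is the shape: every answer carries a source, and anything below the threshold goes to a human.

```python
import re
from dataclasses import dataclass


@dataclass
class Passage:
    source: str   # hypothetical label, e.g. "SOP section 4.2"
    text: str


def tokens(s: str) -> set[str]:
    """Lowercase word set used for the crude overlap score."""
    return set(re.findall(r"[a-z0-9]+", s.lower()))


def answer(question: str, passages: list[Passage], min_overlap: float = 0.5) -> dict:
    """Return the best-matching passage with its source, or flag for review.

    Score = fraction of question words found in the passage.
    """
    q = tokens(question)
    best, best_score = None, 0.0
    for p in passages:
        score = len(q & tokens(p.text)) / len(q) if q else 0.0
        if score > best_score:
            best, best_score = p, score
    if best is None or best_score < min_overlap:
        return {"status": "needs_review", "answer": None, "source": None}
    return {"status": "answered", "answer": best.text, "source": best.source}
```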
A month in, the partner sent a note saying his March was quieter than any in a decade. The staff weren't using it for anything fancy. They were using it as a search engine for things they should have been able to find anyway.
The lesson, again — and I keep writing variations of it — is that the AI label is a magnet for the wrong kind of project. Most professional services firms don't need ChatGPT. They need their own knowledge to be findable. AI is just the cheapest way to make that happen now.
If you're a partner at a firm thinking about this, the framing question isn't "how do we adopt AI." It's "what does my team ask me that I shouldn't have to answer twice." For more on how I tend to scope this kind of work, the work page is a decent overview. For a deeper read on a related dynamic, see Ben Thompson at Stratechery on how integration changes the value of knowledge work.
The boring answer is usually the right one.
- NOTE
Three questions that smoke out a bad AI vendor in ten minutes
If you're trying to figure out how to evaluate AI vendors and you don't have a technical background, three questions get you most of the way there. Ask them on the first call.
One: "Can you walk me through something you built and what didn't work the first time?" A real builder has scars. A reseller has a deck.
Two: "If we did the smallest possible version of this, what would it look like?" The answer tells you whether they're solving your problem or selling their package.
Three: "What would make you tell me not to do this?" Anyone who can't answer that has never said no to a client, which means they'll happily build you the wrong thing.
- OBSERVATION
What AI implementation actually costs (and why most quotes are wrong)
The most common question I get on a first call is some version of: how much does AI cost?
The honest answer is that the question is malformed. It's like asking how much a building costs. A shed and a hospital are both buildings. The number depends on what you're trying to make happen, and most people asking the question haven't yet decided.
Here's the rough shape I've seen, from actually running these projects:
A focused internal tool — something that automates a single repetitive task for one team — usually lands in the $4k–$15k range if it's scoped tightly. A lead-qualification flow, a reporting summarizer, a custom audit tool. Two to four weeks of build time. Often the highest ROI work, and the work most consultancies don't want because it's too small to mark up.
A customer-facing AI feature integrated into an existing site or app — a smart search, a docs assistant trained on real content, a configurator — typically runs $15k–$50k. The cost isn't the model. The cost is the design work, the data plumbing, and the rounds of iteration after real users get their hands on it.
A platform-level rebuild with multiple agents, evals, retrieval, and ongoing oversight is where the $80k–$250k quotes come from. Sometimes that price is honest. Often it isn't. I've watched two of those projects get scoped down to $12k after a real conversation about the underlying need, a pattern I wrote about earlier.
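For anyone who wants the three tiers as a lookup, here is a small sketch. The ranges are copied from the paragraphs above; the tier labels and the `compare_quote` helper are my own shorthand, and the ranges are rough shapes from my projects, not a price list.

```python
# Dollar ranges copied from the three tiers described above.
TIERS = {
    "internal tool": (4_000, 15_000),
    "customer-facing feature": (15_000, 50_000),
    "platform rebuild": (80_000, 250_000),
}


def compare_quote(tier: str, quote: int) -> str:
    """Say where a quote sits relative to the rough range for its tier."""
    low, high = TIERS[tier]
    if quote < low:
        return "below range"
    if quote > high:
        return "above range"
    return "within range"
```

A quote that lands "above range" isn't automatically wrong. It's a prompt to ask what the vendor thinks they're building.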
The reason quotes vary so wildly isn't that the work is mysterious. It's that most AI shops are pricing for the buyer's anxiety, not the build. If a manufacturing client believes AI is exotic, the price reflects that belief. The actual labor is closer to a well-scoped web project than to anything novel.
A few things that consistently inflate the number:
The vendor charges a "discovery" phase before they'll even quote. That phase is sometimes useful, but more often it's a way to get the client emotionally committed before the real number lands. I prefer a free conversation, a written brief, and a fixed quote — which is roughly how I structure engagements.
The build assumes a frontier model when a smaller open one would do. Or assumes a custom-trained model when prompting a hosted one would do. Both of those mistakes can 5x the cost.
There's a "platform" being built when really one workflow needs fixing. The smaller, more boring version of the project almost always ships sooner and gets used more.
If you want a benchmark on the broader Canadian SMB picture, Statistics Canada has been tracking AI adoption rates among small businesses, and the numbers are still small. Most of the cost gap I see is between what shops quote and what owners are actually willing to spend on something they don't yet trust. That trust gap is the real budget question, not the dollar figure.
The shortest version: if a quote is more than 10x what your gut says it should be, the vendor is solving a different problem than the one you described.