Case studies · first-party

Three AI products we shipped in 4–6 weeks each.

Most consultancies show you decks. We show you running products. Every system below was built and is operated by the same team that would embed in your business. Same playbook, same engagement model, same code-shipping discipline.

Case 01 · B2B SaaS

aidevboard.com

Job board + agentic API
8,400+
live AI/ML jobs indexed
489
companies tracked
6 wks
build to first paid customer
$49/mo
monetized API tier (Stripe)

The problem. Existing AI/ML job boards were either curated lists with no API or generic boards with no AI focus. Hiring agents and AI-tool builders had no programmatic way to query the market. Job boards as a category have been dominated by a few incumbents for fifteen years.

What we shipped. A full job board with a real REST API that agents wire into directly. Daily scraper across 538 sources (Ashby, Greenhouse, Lever, custom ATS endpoints). Stripe-powered API tiers (free at 100 requests/hr; Pro at $49/mo for 5,000 requests/hr). MCP server registered in the official MCP registry — agents wire it in once and search the index forever.

How we built it. Single Go binary. Postgres on Fly.io. No platform lock-in. Scraper runs on Mac Studio (zero infra cost on the heaviest workload). The first paying customer signed within six weeks of the first commit. The point isn't that we built a job board. The point is that the same operating discipline ships your AI workflow.

What this engagement would look like for you

Two weeks of mapping (interviews, current systems, target workflows). Four weeks of building (production system, paid loop, monitoring). Two weeks of handoff (runbooks, training, evaluation harness). Your team operates the system after week eight.

Most relevant if you run
A specialty marketplace, recruiting firm, lead-gen business, distributor with a public catalog, or any operation where structured data + a programmatic interface unlocks value your competitors don't have.

Case 02 · Agent infrastructure

nothumansearch.ai

Agent-first search index + MCP
8,600+
agent-ready sites indexed
494
live MCP-verified servers
4 wks
scraper + index + public UI
8 tools
MCP server (in official registry)

The problem. Agents need to discover services that have proper integration surfaces (APIs, MCP servers, llms.txt). Google ranks for humans; nothing ranks for agents. The market needed a search index built specifically for what agents look for — with verification that the claims are real, not just marketing copy on a website.

What we shipped. A live JSON-RPC verifier that probes every claim, a 100-point agentic-readiness rubric, an embeddable score badge for sites that pass, a live MCP server (agents wire it in, never have to re-index), and a public UI at /score for any site to check itself.

How we built it. Same single-Go-binary pattern, separate database, separate Fly app. Daily recrawl auto-updates scores. Listed in the official Anthropic MCP registry (cryptographically signed publish flow) within four weeks of first commit. The interesting part isn't that we built a search engine. It's the eval harness — the JSON-RPC verifier that catches false claims — which is the same shape of evaluation infrastructure your AI workflow needs.
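One piece of the verifier's shape can be sketched with stdlib Go: checking that a reply claiming to come from an MCP endpoint is actually a well-formed JSON-RPC 2.0 response carrying a result. This is an illustrative sketch, not the production verifier — the `rpcResponse` type and `verifyRPCReply` function are assumed names, and a real probe would first POST an initialize request over HTTP.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// rpcResponse is the minimal JSON-RPC 2.0 response envelope we check for.
type rpcResponse struct {
	JSONRPC string          `json:"jsonrpc"`
	ID      json.RawMessage `json:"id"`
	Result  json.RawMessage `json:"result"`
	Error   json.RawMessage `json:"error"`
}

// verifyRPCReply checks that a raw reply body is a well-formed JSON-RPC 2.0
// response carrying a result rather than an error.
func verifyRPCReply(body []byte) error {
	var r rpcResponse
	if err := json.Unmarshal(body, &r); err != nil {
		return fmt.Errorf("not JSON: %w", err)
	}
	if r.JSONRPC != "2.0" {
		return fmt.Errorf("jsonrpc field is %q, want \"2.0\"", r.JSONRPC)
	}
	if len(r.Error) > 0 {
		return fmt.Errorf("server returned a JSON-RPC error: %s", r.Error)
	}
	if len(r.Result) == 0 {
		return fmt.Errorf("reply has no result")
	}
	return nil
}

func main() {
	good := []byte(`{"jsonrpc":"2.0","id":1,"result":{"tools":[]}}`)
	fmt.Println(verifyRPCReply(good)) // <nil>: a well-formed reply passes
}
```

The point of a check like this is exactly the one made above: a site can *claim* an MCP server in its marketing copy, but only a live probe that parses the actual wire reply can score the claim.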

What this engagement would look like for you

One week mapping (what does "good output" look like for the workflow). Four to eight weeks building (the system + the eval harness that proves it's working). Two weeks handoff (runbooks, eval suite, the playbook for adding the next workflow on the same infrastructure).

Most relevant if you run
Any operation where AI needs to be trusted with output that goes to customers, vendors, or systems-of-record — healthcare intake, claims processing, contract review, customer service triage, anything where wrong answers are visible and costly.

Case 03 · Content + lead-gen

8bitconcepts research pipeline

Auto-refreshed analysis pipeline
17
research papers published
5
papers auto-refreshed weekly from live data
2 wks
pipeline stand-up to first paper
100/100
agentic-readiness score (per the nothumansearch rubric)

The problem. Most "thought leadership" pipelines are humans typing in Google Docs. The cost is high, the cadence is unreliable, and the content drifts from what the data actually says because no one re-runs the queries. Operating businesses that want a credibility surface can't justify a full-time content team.

What we shipped. A pipeline that auto-regenerates five data-grounded research papers every week from live API data (aidevboard hiring data, nothumansearch ecosystem data). Includes OG-image generation, JSON-LD schema, sitemap + IndexNow auto-submission, RSS feed, and an unsubscribe-compliant newsletter delivery via Resend. Eleven additional qualitative papers run as one-time human-written pieces using the same template + publishing infrastructure.

How we built it. Python pipeline, scheduled with launchd. ~400 lines of orchestration, ~150 lines per paper-type generator, a hardcoded brand template. The interesting part isn't the content. It's that the same infrastructure compounds — every new paper drops into the index, sitemap, llms.txt, RSS feed, and Resend digest with zero hand-wiring.
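The "zero hand-wiring" claim comes down to a registry pattern: every publishing surface is derived from one list of generators, so adding a paper type is a single registration. The production pipeline is Python; the sketch below renders the same pattern in Go for consistency with the other cases, and every name in it (`paper`, `register`, `publishAll`) is a hypothetical stand-in.

```go
package main

import "fmt"

// paper is one generator; registering it is the only hand-wiring.
// Index and sitemap entries derive from the registry.
type paper struct {
	Slug  string
	Title string
	Build func() string // renders the paper body from live data
}

var registry []paper

func register(p paper) { registry = append(registry, p) }

// publishAll walks the registry once and emits every surface from it,
// so a new paper type needs only a register() call.
func publishAll() (index, sitemap []string) {
	for _, p := range registry {
		index = append(index, p.Title)
		sitemap = append(sitemap, "/research/"+p.Slug)
		_ = p.Build() // in a real pipeline: write HTML, OG image, RSS item
	}
	return index, sitemap
}

func main() {
	register(paper{
		Slug:  "hiring-trends",
		Title: "AI Hiring Trends",
		Build: func() string { return "body rendered from live API data" },
	})
	index, sitemap := publishAll()
	fmt.Println(index, sitemap)
}
```

The design choice worth copying is that no generator knows about the sitemap, feed, or digest; those surfaces are projections of the registry, which is why new content types land everywhere at once.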

What this engagement would look like for you

Two weeks mapping (what content compounds for your business — case studies, market reports, weekly digests, customer-facing guides). Four to six weeks building the pipeline (templates, distribution, eval). Two weeks handoff (your team adds new content types without our help).

Most relevant if you run
A B2B services firm that needs ongoing credibility content, a SaaS company that needs to publish weekly product/data updates, an industry-specific media operation, or any business where content velocity matters more than content depth.

Want this for your business?

Same playbook. Your operation.

Each case above is a real shipped product running today. The engagement model is the same one we use with clients: 4–12 weeks, embedded in your team, built on your stack, handed off with runbooks. Costs less than a senior hire. Compounds forever.

Book a 30-min intro call → See engagements