Koca Ventures Ltd
71-75 Shelton Street
Covent Garden, London
WC2H 9JQ, United Kingdom
Registered in England & Wales — 16231043
Agents that do the work, built for production, owned by you.
The demo is the easy part. The moat is a system that survives real load, real data, and a real audit — running where your data lives, with the code and the keys in your hands. We build custom agent harnesses around the workflow your team already uses, not generic chatbots.
Four ways teams put agents to work
Real systems, not slideware
On-premise document intelligence for a property brokerage
A local-first brokerage CRM with a hand-built RAG layer: documents ingested, embedded, and answered with citations entirely on the firm's own hardware. Hybrid retrieval over full-text search, an embedded property-graph, and a local vector store; a local LLM (Ollama/Qwen) with multilingual embeddings; an agent browser that gathers market comparables; and an MCP server exposing the corpus as read-only tools. Offline-capable — nothing leaves the building unless explicitly enabled.
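One common way to merge results from full-text, graph, and vector retrievers is reciprocal rank fusion (RRF). The description above does not say how this system combines its retrievers, so read the following as an illustrative sketch only: the function name, the document ids, and the constant k=60 are assumptions, not details of the deployed system.

```python
from collections import defaultdict


def reciprocal_rank_fusion(rankings, k=60):
    """Fuse several ranked lists of document ids into one ranking.

    Each list contributes 1 / (k + rank) per document, so a document
    that ranks well in several retrievers rises to the top.
    """
    scores = defaultdict(float)
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] += 1.0 / (k + rank)
    # Highest fused score first; ties broken by id so output is deterministic.
    return sorted(scores, key=lambda doc_id: (-scores[doc_id], doc_id))


# Hypothetical hits from three retrievers for a single query.
full_text_hits = ["lease_12", "deed_7", "survey_3"]
graph_hits = ["deed_7", "lease_12"]
vector_hits = ["deed_7", "valuation_9", "lease_12"]

fused = reciprocal_rank_fusion([full_text_hits, graph_hits, vector_hits])
print(fused)
```

RRF needs only rank positions, not comparable scores, which is why it suits retrievers as different as keyword search and embedding similarity.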
A B2B platform we built and operate
atmosverde is a B2B carbon platform we designed, built, and run in production: a real shipped product, held to the same engineering bar we bring to client work. (Quiet by design; ask us and we'll walk you through it.)
A lot of “agentic” projects are just expensive automation wearing a new label. We'll tell you when you don't need autonomy — a bounded agent that handles the routine and hands the rest to a person is usually cheaper, more reliable, and easier to trust than “an agent for everything.”
Also: edge AI & computer vision (on-device perception) and robotics simulation. The same low-level systems depth shows up in our security research, which has been acknowledged by NVIDIA.
Straight answers
Is this just no-code automation?
No. No-code tools (Zapier, Make, n8n, Lindy) are great for simple, cloud-hosted glue. We build the part they can't reach: custom agent harnesses with real tool use, memory, approval gates, and on-premise deployment — wired into your actual systems, with the code and the runtime owned by you. Most 'pilots' fail on integration and operational fit, not model quality; that's the gap we work in.
Does our data leave our network?
Only if you decide it should. On-premise deployment is a first-class option, not a fallback: self-hosted inference (vLLM, Ollama, llama.cpp depending on hardware), local vector and graph stores, signed updates, role-based access, and audit logs. The default posture is that your documents and customer data stay inside your infrastructure; hosted models are an opt-in for workloads where the data sensitivity allows it.
What happens when the model gets something wrong?
We design for it. Agents run inside bounded harnesses with approval gates on anything consequential, structured outputs that are validated before they're acted on, and open tracing and evaluations so you can see why the agent did what it did. The honest answer is that full autonomy is rarely what you want: most of the value comes from an agent that handles the routine 80% and routes the hard 20% to a person, pre-drafted.
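To make "validated before they're acted on" and "approval gates" concrete, here is a minimal sketch using only the Python standard library. The action names, the ProposedAction fields, and the approve callback are all hypothetical; a real harness would define its own schema and route approvals through whatever review channel the team already uses.

```python
import json
from dataclasses import dataclass

# Hypothetical action vocabulary; anything outside it is rejected.
ALLOWED_ACTIONS = {"draft_reply", "send_email", "update_record"}
# Consequential actions that must be approved by a person first.
NEEDS_APPROVAL = {"send_email", "update_record"}


@dataclass(frozen=True)
class ProposedAction:
    action: str
    target: str
    body: str


def parse_action(raw: str) -> ProposedAction:
    """Validate raw model output; raise ValueError if it is malformed."""
    try:
        data = json.loads(raw)
    except json.JSONDecodeError as exc:
        raise ValueError(f"not valid JSON: {exc}") from exc
    if not isinstance(data, dict):
        raise ValueError("expected a JSON object")
    for field in ("action", "target", "body"):
        if not isinstance(data.get(field), str):
            raise ValueError(f"missing or non-string field: {field}")
    if data["action"] not in ALLOWED_ACTIONS:
        raise ValueError(f"unknown action: {data['action']}")
    return ProposedAction(data["action"], data["target"], data["body"])


def dispatch(raw: str, approve) -> str:
    """Return 'executed', 'queued_for_human', or 'rejected'."""
    try:
        proposal = parse_action(raw)
    except ValueError:
        return "rejected"  # malformed output never reaches a tool
    if proposal.action in NEEDS_APPROVAL and not approve(proposal):
        return "queued_for_human"  # pre-drafted, waiting for a person
    # The real side effect (send, write, etc.) would run here.
    return "executed"


draft = '{"action": "send_email", "target": "client@example.com", "body": "Hi"}'
print(dispatch(draft, approve=lambda p: False))
```

The point of the design is ordering: validation happens before the gate, and the gate happens before any side effect, so a bad output fails closed.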
Are we locked into you?
No. You own the runtime, the keys, the data stores, and the source. We build on open standards (the Model Context Protocol, standard SDKs, open-source inference and stores) and hand over a system your own engineers can run and extend. Discretion is part of the deal — we don't put your name on our website either.
How do you price it?
Per engagement, after we understand the workflow — there's no per-seat list price. A typical path is a short, paid discovery that scopes one real workflow and leaves you owning the spec, a fixed-scope build, and an optional monthly 'operate and improve' retainer. Tell us the pain point and we'll scope it honestly.
Which models do you use — Claude, GPT, or local?
Whichever fits the workload. Sensitive on-premise work runs on local models (Llama, Qwen, Mistral) served via vLLM or Ollama. Where data sensitivity permits and the reasoning is hard, we use Claude (via the Claude Agent SDK) or GPT (via the OpenAI Agents SDK). Many production systems are hybrid: local models for ingestion, hosted models for the hard reasoning.
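A hybrid setup like this usually comes down to a small routing rule. The sketch below is one possible shape, with assumed names throughout (Task, choose_backend, and the hosted_allowed flag are hypothetical); it defaults to local inference and only sends work to a hosted model when that has been explicitly enabled and the task carries no sensitive data.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Task:
    name: str
    contains_sensitive_data: bool
    needs_hard_reasoning: bool


def choose_backend(task: Task, hosted_allowed: bool = False) -> str:
    """Return 'local' or 'hosted' for a task.

    Local is the default. Hosted is used only when it is opted in,
    the data is not sensitive, and the reasoning is hard enough to
    justify it.
    """
    if task.contains_sensitive_data or not hosted_allowed:
        return "local"
    return "hosted" if task.needs_hard_reasoning else "local"


# Hypothetical workloads.
ingest = Task("embed_contracts", contains_sensitive_data=True, needs_hard_reasoning=False)
analysis = Task("market_summary", contains_sensitive_data=False, needs_hard_reasoning=True)

print(choose_backend(ingest, hosted_allowed=True))
print(choose_backend(analysis, hosted_allowed=True))
print(choose_backend(analysis))
```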
Start with one pain point
The strongest opening isn't a generic AI pitch. Share one workflow that hurts — document Q&A, follow-up chaos, after-hours calls, procurement — and we'll scope a small build around your real data before any larger commitment.
