Autonomous vehicle data is one of the biggest unsolved headaches in the self-driving industry. Every company developing robotaxis, autonomous trucks, or self-navigating construction rigs collects millions of hours of footage. Yet most of that footage sits untouched in archives.
NomadicML wants to change that. The startup just closed an $8.4 million seed round at a $50 million post-money valuation. Its pitch is simple: stop letting 95% of your fleet footage collect dust.
Autonomous Vehicle Data Has a Massive Waste Problem
Here is the core issue. Companies building self-driving cars, delivery robots, and autonomous equipment generate staggering volumes of video every single day. A single test vehicle running cameras and sensors around the clock can produce several terabytes of raw footage in a week. However, organising and labelling that footage still relies on humans watching it. Even at fast-forward speed, that process does not scale.
Consequently, the vast majority of autonomous vehicle data ends up in cold storage. Nobody reviews it. Nobody structures it. And as a result, the most valuable edge cases buried inside that footage never make it into training pipelines. The irony is hard to miss. Companies spend millions capturing this footage, then lack the tools to do anything useful with most of it.
This is exactly the gap that CEO Mustafa Bal and CTO Varun Krishnan spotted. Both founders met as computer science students at Harvard and later worked at companies like Lyft and Snowflake. During those stints, they noticed the same bottleneck over and over. Teams were drowning in autonomous vehicle data with no efficient way to extract value from it.
How Nomadic Turns Raw Footage Into Searchable Intelligence
Rather than simply slapping labels on video frames, Nomadic built a platform that transforms raw footage into structured, searchable datasets using a collection of vision-language models. Think of it as a search engine for your fleet’s footage. The platform ingests hours of driving video, applies multiple AI models in parallel, and produces indexed outputs that engineers can query in seconds.
For instance, imagine you need to fine-tune an AV’s understanding of police officers directing traffic through red lights. Or perhaps you need every clip where vehicles pass under a specific type of bridge. Traditionally, finding these clips meant paying a team of annotators to watch every minute. The alternative was accepting that these edge cases would stay buried forever. Nomadic’s platform lets engineers find those moments without manually scrubbing through thousands of hours of autonomous vehicle data.
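To make the "search engine for footage" idea concrete, here is a minimal sketch in Python. It is not Nomadic's API or method: the `Clip` type, the tag names, and the `build_index` and `search` helpers are all hypothetical, and the hand-written tags stand in for whatever structured output a vision-language model might produce. It only illustrates why indexed footage can be queried in seconds while raw footage has to be watched.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class Clip:
    """A hypothetical indexed clip; tags stand in for model-generated labels."""
    clip_id: str
    start_s: float
    end_s: float
    tags: frozenset


def build_index(clips):
    """Build an inverted index mapping each tag to the ids of clips that carry it."""
    index = defaultdict(set)
    for clip in clips:
        for tag in clip.tags:
            index[tag].add(clip.clip_id)
    return index


def search(index, required_tags):
    """Return the sorted ids of clips that carry every required tag."""
    tag_sets = [index.get(tag, set()) for tag in required_tags]
    if not tag_sets:
        return []
    return sorted(set.intersection(*tag_sets))


# Illustrative data only; the ids and tags are made up for this sketch.
clips = [
    Clip("veh01_0001", 0.0, 12.5,
         frozenset({"police_officer", "directing_traffic", "red_light"})),
    Clip("veh01_0002", 12.5, 30.0, frozenset({"bridge", "highway"})),
    Clip("veh02_0007", 4.0, 19.0, frozenset({"police_officer", "parked"})),
]

index = build_index(clips)
hits = search(index, ["police_officer", "red_light"])
print(hits)
```

A lookup like this touches only the index, not the video itself, which is the practical difference between structured and unstructured archives. A real system would presumably match free-text descriptions rather than exact tags.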
On top of that, the structured output feeds directly into reinforcement learning pipelines. This means faster model iteration, better fleet monitoring, and ultimately smarter machines. Teams can identify a gap in their model’s understanding and pull relevant training examples in a fraction of the time it used to take. That speed matters when iteration cycles determine who ships first. In that sense, Nomadic’s approach brings the AI-driven tooling that is already reshaping how businesses operate into the physical world.
Who Backed the $8.4 Million Round
TQ Ventures led the seed round, with additional investment from Pear VC. Notably, Jeff Dean (formerly Google’s chief scientist) also participated. That is serious credibility for a company still in its early stages. Dean’s involvement signals that the technical approach holds up under scrutiny from one of the most respected minds in machine learning.
Schuster Tanger, the TQ Ventures partner who led the deal, framed the opportunity in straightforward terms. He argued that the moment an autonomous vehicle company tries to build this infrastructure internally, it gets distracted from its core mission. That core mission is building the robot itself. He compared it to how Salesforce does not build its own cloud. Netflix does not build its own content distribution facilities either. In other words, handling autonomous vehicle data infrastructure is a specialised job, and Nomadic is positioning itself as the go-to provider.
Meanwhile, the startup also grabbed attention last month by winning first prize at the Nvidia GTC pitch contest. Nvidia is investing heavily in the autonomous driving ecosystem through its DRIVE platform and open-source Alpamayo models. So that kind of validation carries significant weight. It also puts Nomadic on the radar of every major player attending GTC.
The Competition Is Heating Up Around Autonomous Vehicle Data
Nomadic is not operating in a vacuum. Established players like Scale, Kognic, and Encord are all developing their own AI-powered annotation tools. Furthermore, Nvidia recently released its Alpamayo family of open-source models. These include vision-language-action models designed to help autonomous systems reason through complex scenarios.
Still, Krishnan draws a clear distinction. He describes Nomadic’s platform as an “agentic reasoning system” rather than a standard data labeler. You describe what you need, and the system figures out how to find it across your entire fleet’s autonomous vehicle data. That contextual understanding is what separates it from tools that simply tag objects in individual frames.
This competitive dynamic mirrors what we see across other industries. AI automation and human expertise must coexist to deliver real results. The winners are not necessarily the biggest companies. They are the ones solving specific problems better than anyone else.
Real Clients Are Already Seeing Results
Nomadic is not just pitching a vision. Several high-profile customers are already using the platform. Zoox, Mitsubishi Electric, Natix Network, and Zendar have all integrated Nomadic’s tools into their development workflows. Each of these companies operates in a different corner of the autonomy market. That spread suggests the platform applies well beyond self-driving cars.
Antonio Puglielli, VP of Engineering at Zendar, noted that Nomadic’s platform allowed his team to scale up much faster than traditional outsourcing. He also highlighted the company’s domain expertise as a key differentiator. When you are dealing with autonomous vehicle data at scale, generic solutions tend to fall short. Off-the-shelf labelling tools may handle simple object detection, but they struggle with the contextual nuance that robotics and AV developers need.
These early customer wins matter because they point to product-market fit. Startups can raise money on a compelling story. But keeping clients like Zoox and Mitsubishi engaged requires delivering measurable improvements to their autonomous vehicle data workflows. Retention at this stage is arguably a stronger signal than the fundraise itself.
Why the Timing Works for Nomadic
Several trends are converging to make this the right moment for a company like Nomadic. First, the sheer volume of autonomous vehicle data being generated is growing exponentially. Wayve recently raised $1.2 billion to scale its AI-driven driving platform across multiple markets. It plans to launch commercial robotaxi trials in London later this year. Zoox is expanding its robotaxi service to new cities. And dozens of smaller companies are building autonomous equipment for construction, logistics, and agriculture. Every one of these players generates enormous quantities of raw footage that needs processing.
Second, the models needed to process this data are maturing rapidly. Nvidia’s Alpamayo platform introduced chain-of-thought reasoning for AV systems at CES 2026, signalling a shift from simple perception to genuine reasoning. These models can now think through novel scenarios step by step. But they still need high-quality, structured training data to perform well. Nomadic’s platform sits neatly alongside these advances. It helps companies extract maximum value from their autonomous vehicle data before feeding it into increasingly capable models.
Third, investors are clearly hungry for picks-and-shovels plays in the AV space. Building the vehicles themselves requires enormous capital and carries regulatory risk. By contrast, providing the data infrastructure layer is a more capital-efficient bet with broad applicability across the entire autonomy ecosystem.
What Nomadic Is Building Next
The team is currently developing specialised tools that evaluate the physics of lane changes from camera footage and optimise robotic gripping precision. These are the kinds of granular, edge-case problems that autonomous vehicle data platforms need to solve if they want to stay relevant. Surface-level annotation is a commodity. Deep contextual analysis of physical interactions is where the real value lies.
Looking further ahead, integrating non-visual sensor data like lidar readings into the analytical framework will be critical. Most modern AV systems rely on a fusion of cameras, radar, and lidar to build a complete picture of their environment. Bal has acknowledged the complexity of managing vast video archives against increasingly large models. But that complexity is precisely what creates the moat. Companies that crack the data infrastructure layer tend to become indispensable.
There is also a talent angle worth noting. Krishnan holds the chess title of International Master, and every engineer on the team contributes to published scientific research. For a startup handling autonomous vehicle data at this level, that kind of technical depth is not a nice-to-have. It is a requirement. The problems Nomadic is tackling sit at the intersection of computer vision, large language models, and robotics. Solving them demands a team that can operate across all three domains.
The Bottom Line on Autonomous Vehicle Data
Self-driving technology is accelerating, and the companies building these systems cannot afford to let 95% of their autonomous vehicle data gather dust. Nomadic’s $8.4 million raise, its marquee clients, and its Nvidia GTC win all point to one thing. This startup is solving a genuine and growing problem.
Whether Nomadic can fend off larger competitors and open-source alternatives remains to be seen. The entry of Nvidia’s Alpamayo models into the ecosystem means that baseline annotation capabilities will become more accessible to everyone. However, Nomadic’s focus on contextual reasoning rather than basic labelling gives it a distinct position in a crowded field. If the platform keeps delivering the speed and accuracy that early customers like Zendar have reported, Nomadic has a strong shot at becoming essential AV infrastructure. For anyone watching the autonomous vehicle data space, this is a company worth tracking.
