AgentOps Investment Memo

DEAL CONTEXT

In May 2023, we wrote an initial thesis on AI Agents, tracking companies building agentic infrastructure and applications. At the time, all players focused on handling agent-related tasks end-to-end (e2e), from agent building to deployment. We concluded that the process of building agents from scratch - taking them from a conceptual stage to a fully functional state - would become a commoditized service. Eventually, it would be handled by foundational model providers or third-party agent frameworks. It seemed premature to invest heavily in this landscape, given its nascent stage and rapid evolution. However, because we believed that agents would be core to the future of software development, we decided to revisit the market once it had matured.

Six months later, OpenAI released the ability for users to create custom Generative Pre-trained Transformers (GPTs), which can perform actions like an agent. This indicated progress in the operationalization of LLMs and demonstrated the commoditization of agent-building software, reinforcing our view that value is consolidating in post-deployment iteration.

Recognizing this shift, we re-canvassed the landscape, sharing our thesis with various founders. This sparked excitement among some, particularly those whose vision aligned with the evolving market dynamics. One team, AgentOps, presented a unique opportunity. They were building an early product focused on monitoring and observability, a bottleneck preventing production-level agent deployments. Moreover, their partnership- and network-driven growth strategy indicated a proactive approach to staying ahead of the market. That strategy is exemplified by their ownership of Staf.ai (a vetted list of AI agents, tools, and copilots), their operation of Cerebral Valley (SF’s AI event page), and their management of SF’s largest AI founder Twitter group.

By positioning themselves as the default observability solution for the partners they are connected with, they will be able to maintain their market position and continuously innovate based on customer and channel partner feedback. In a dynamic market environment, it's crucial to align with founders who are attuned to industry shifts and are committed to adapting and evolving their offerings accordingly. This team's proactive stance and ability to leverage their network for insights and growth make them a compelling partner.

COMPANY OVERVIEW

The current state of agent building presents significant challenges. Models operate as black boxes, so developers have little visibility into the processes that led to a given output, and interpreting results becomes guesswork. Existing solutions for diagnosing issues are driven by trial and error, leading to overly complex procedures, low accuracy, and minimal reliability. This inefficiency drives up costs and introduces significant delays in project delivery. Overall, these limitations hinder progress in agent building and underscore the urgent need for more effective solutions.

AgentOps was founded in 2023 by Adam Silverman and Alex Reibman. It is a comprehensive platform for evaluations, observability, and benchmarking. It enables companies to debug AI agents and ensure compliance with three main products: Agent Suite, Session Drill-Down, and Session Replay.

The Agent Suite offers powerful testing capabilities, used to benchmark and evaluate session performance, enabling developers to assess their agents' effectiveness. Users can compare performance metrics across open source and proprietary datasets, gaining insights into how their agents stack up against industry standards. Additionally, developers can debug agent failure instances to extract insights, making performance issues easier to identify and resolve.

Session Drill-Down provides developers with analytics and session-level monitoring capabilities. By aggregating performance data into a single dashboard, it gives developers a high-level overview of all their projects, which streamlines performance assessment and surfaces common areas for improvement. Users can filter for specific sessions to conduct deep dives into successes, failures, costs, run time, and more, allowing for targeted analysis and optimization.

With Session Replay, developers can visualize an individual agent's logic step by step. This helps developers understand precisely how agents completed their tasks, so errors can be diagnosed from direct insight rather than guesswork. With run-level session replay, developers can effectively track agent behavior, perform root cause analysis, and correct defective behavior, resulting in more reliable agent systems.
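To make the drill-down and replay ideas concrete, here is a minimal sketch of how session-level recording could work. It is not AgentOps's implementation or API: the `Event` and `Session` classes, their fields, and the `failed_sessions` helper are all hypothetical names chosen for illustration.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Event:
    """One recorded agent step (hypothetical schema, not AgentOps's)."""
    step: int
    kind: str            # e.g. "llm_call" or "tool_call"
    detail: str
    cost_usd: float = 0.0
    error: bool = False


@dataclass
class Session:
    """An ordered log of the events in a single agent run."""
    session_id: str
    events: List[Event] = field(default_factory=list)

    def record(self, kind: str, detail: str,
               cost_usd: float = 0.0, error: bool = False) -> None:
        # Steps are numbered in the order they are recorded.
        self.events.append(
            Event(len(self.events) + 1, kind, detail, cost_usd, error))

    @property
    def total_cost(self) -> float:
        return sum(e.cost_usd for e in self.events)

    @property
    def failed(self) -> bool:
        return any(e.error for e in self.events)

    def replay(self) -> List[str]:
        """Return the run as readable lines, in execution order."""
        return [
            f"{e.step}. [{e.kind}] {e.detail}" + (" <-- ERROR" if e.error else "")
            for e in self.events
        ]

    def first_error(self) -> Optional[Event]:
        """Root-cause helper: the earliest failing step, or None."""
        return next((e for e in self.events if e.error), None)


def failed_sessions(sessions: List[Session]) -> List[Session]:
    """Drill-down filter: keep only sessions that contain an error."""
    return [s for s in sessions if s.failed]


ok = Session("s1")
ok.record("llm_call", "plan task", cost_usd=0.002)

bad = Session("s2")
bad.record("llm_call", "plan task", cost_usd=0.002)
bad.record("tool_call", "search(query)", error=True)
bad.record("llm_call", "summarize empty result", cost_usd=0.001)

for line in bad.replay():
    print(line)
```

In this sketch, filtering with `failed_sessions([ok, bad])` isolates the failing run, and `bad.first_error()` points at the step where the run first went wrong, which is the kind of question replay is meant to answer without trial and error.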

KEY HIGHLIGHTS

Founders each have an exit as founder/CEO

Alex Reibman - Founded Menubites, acquired by Snappr in March 2023
Adam Silverman - CEO of Hot Dot Media, acquired by Wondr Gaming in May 2021

Founders have shown substantial connectivity and engagement with the agent community through Staf.ai & Cerebral Valley, enabling them to stay ahead of the market
Integrations within the agent community allow AgentOps to serve as an index for the quickly growing agent market

INDUSTRY OVERVIEW

Automation is the foundation for software development and is operationalized in real-world applications. Historically, enterprises have adopted automation (software and physical) in a few main ways: automating low-skill activities, redefining roles and processes, and optimizing high-skill jobs. Software automation, commonly referred to as robotic process automation (RPA), has had a high barrier to entry due to machine learning's steep financial and educational requirements. Both of those barriers have been lowered by widely available and easily implemented LLMs.

LLMs aren’t perfect. Large foundational models (OpenAI GPT, Google PaLM/Bard, Anthropic Claude) aren’t context-specific and can’t fully leverage user knowledge or data. These models quickly became commoditized, with little differentiation in function or end-user experience. Because of this, the next generation of models focused on context-specificity for industries like law (Harvey, $20M from Sequoia) and medicine (Hippocratic AI, $50M from a16z), both of which launched out of stealth. These bring industry context to the model but still don’t allow the user to leverage their proprietary data.

Current Day

Autonomous agents are the automated application of LLMs. Using an LLM, they interpret a user prompt, self-generate instructions, and iterate through a conversational loop until they achieve their initial goal. They started out generalized (AutoGPT, AgentGPT, BabyAGI) and grew quickly, with AutoGPT becoming the fastest-growing open-source project in GitHub history.
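The loop described above can be sketched in a few lines. This is a simplified illustration, not the design of AutoGPT or any other named project: `run_agent`, `scripted_llm`, and the `"DONE"` stop signal are hypothetical, and the model call is replaced by a scripted stand-in so the example runs on its own.

```python
from typing import Callable, List, Tuple


def run_agent(goal: str,
              llm: Callable[[str, List[str]], str],
              max_steps: int = 5) -> Tuple[List[str], bool]:
    """Ask the model for its next instruction until it signals completion."""
    history: List[str] = []
    for _ in range(max_steps):
        action = llm(goal, history)   # the model proposes its own next step
        if action == "DONE":          # hypothetical stop signal
            return history, True
        history.append(action)        # prior steps are fed back next turn
    return history, False             # step budget exhausted before finishing


def scripted_llm(goal: str, history: List[str]) -> str:
    """Stand-in for a real model call: walks a fixed plan, then stops."""
    plan = [f"research: {goal}", "draft summary", "review draft"]
    return plan[len(history)] if len(history) < len(plan) else "DONE"


steps, finished = run_agent("compare agent frameworks", scripted_llm)
print(steps, finished)
```

The `max_steps` budget matters in practice: without it, an agent that never reaches its goal would loop indefinitely, which is one reason monitoring run time and cost per session is useful.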

AI agent software applications are following a path similar to that of foundational models. The category began with powerful, novel, difficult-to-replicate software from ~3 main players, but is now nearing 100 companies. However, agents and models do not share the same development process. ML models have a robust devtool ecosystem (MLOps software) that agents do not. They also differ in output: LLMs produce responses, while agents are designed to produce real-world results (complete an action, conduct research, etc.).

Building autonomous, reinforcement learning agents in the real world is difficult, and fine-tuning a one-size-fits-all foundation model will not work with agents the way it does with LLMs. Some companies provide templatized versions of their product to flatten the adoption curve, while others are dedicated to a single task.

Agents perform functions and interact with data, demanding close monitoring and high adaptability. Agent infrastructure provides necessary tooling for compliance (monitoring & observability) and iteration (benchmark definitions, progress measurement, complex workflows).

Agents will have real-world applications as internal software developers and embedded products, but enterprises will have to build them for themselves, which requires enablement infrastructure like observability.

THE NEW AGE

LLMs serve as the underlying engine, while agents provide the means for functional execution. This dynamic has led to a proliferation of context-specific agents across various verticals and horizontals. In verticals such as finance (ThinkChain), and in horizontal domains like software development (Mutable) and codebase migrations (Second), there has been a notable surge in specialized agent applications. This illustrates the increasing maturity of agent-based systems.

Moreover, the infrastructure space supporting agent development has witnessed significant advancement, marked by broader AI companies launching agent-building products. OpenAI's Assistants API and LangChain's agent infrastructure product launch reflect this trend. Simultaneously, previously end-to-end agent infrastructure software has begun to narrow its focus, with companies like E2B concentrating on sandbox runtime environments.

Because of these advancements, the largest opportunity lies in post-deployment iteration. As LLMs gain context-specificity through techniques like fine-tuning, small model distillation, and RAG, agents will follow suit. Effective debugging, testing and retraining processes are crucial for this transition, with data collection through monitoring playing a pivotal role. As the industry continues to evolve, focusing on enhancing testing and retraining capabilities will be essential for ensuring the robustness and adaptability of agent-based systems.

COMPETITIVE LANDSCAPE

LangChain

Total Fundraising and Background

Founded 2023
$35M raised from Benchmark, Sequoia, Conviction

Product

A language model framework designed to power applications that integrate with other data sources and interact with their environment.
Support for various model types, prompt management, memory, indexing, chains, and agents
Used to build personal assistants, question-answering, chatbots, and data querying

Current Leadership

Harrison Chase (CEO)
ML Engineer @ Kensho Technologies
ML Engineer @ Robust Intelligence

Overview

LangChain is the best-known agent infrastructure software. It differentiates through quick build time & pre-built infrastructure, offering a playground for testing multiple models and frameworks without configuration. Because of this, LangChain is best for prototyping. Its comprehensive functionality has resulted in a bloated codebase (1M+ lines) and a complex network of dependencies, limiting users’ ability to interoperate with existing infrastructure.

While LangChain's downmarket appeal reduces early, low-ACV land-grab opportunities for AgentOps, it does not undermine the long-term potential for capturing upmarket enterprise customers with the highest revenue potential.

OpenAI

Total Fundraising and Background

Founded 2015
$11.3B raised from YC, a16z, Sequoia, Tiger Global, Khosla, Quiet Capital, Founders Fund, Coatue, Microsoft, AWS, more

Product

AI research organization working towards AGI.
ChatGPT: LLM-powered chatbot
DALL-E: text-to-image generator

Current Leadership

Sam Altman (CEO)
Founder @ Loopt
President @ Y Combinator

MARKET SIZING

“The global Autonomous AI and Autonomous Agents Market size as per revenue was exceeded $4.8 billion in 2023 and is poised to hit around $28.5 billion by 2028, records a CAGR of 43.0% for the anticipated period, 2023-2028. The expansion of autonomous AI and agents is propelled by various factors, including the rising adoption of AI applications, the improved accessibility of parallel computational resources, and advancements in autonomous driving and healthcare.”

MarketsandMarkets

As agents are embedded in software, we predict that the agent observability market will grow in parallel with the global autonomous AI market.
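The quoted figures can be sanity-checked with the standard compound annual growth rate formula, CAGR = (end / start)^(1 / years) - 1. The sketch below uses only the numbers from the quote ($4.8 billion in 2023, $28.5 billion in 2028, 43.0% CAGR); the function names are ours.

```python
def cagr(start: float, end: float, years: int) -> float:
    """Compound annual growth rate implied by a start and end value."""
    return (end / start) ** (1 / years) - 1


def project(start: float, rate: float, years: int) -> float:
    """Value after compounding `start` at `rate` for `years` years."""
    return start * (1 + rate) ** years


# Figures from the MarketsandMarkets quote, in billions of USD.
implied_rate = cagr(4.8, 28.5, 5)
projected_2028 = project(4.8, 0.43, 5)

print(f"Implied CAGR 2023-2028: {implied_rate:.1%}")
print(f"2028 size at 43.0% CAGR: ${projected_2028:.1f}B")
```

The implied rate comes out just under 43%, and compounding $4.8 billion at 43.0% for five years lands slightly above $28.5 billion, so the quoted start value, end value, and growth rate are consistent with one another within rounding.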

PRODUCT OFFERING

AgentOps offers an easy instantiation process that integrates into existing workflows with 3 lines of code:

The product has existing functionality across evaluation & debugging, with a clear roadmap to build.

TEAM OVERVIEW

Alex Reibman (CEO)

Data Scientist @ Checkpoint
Data Scientist and Software Developer @ Axio
Data Science Team Lead @ EY
Founder @ Menubites: acquired by Snappr in March 2023

Adam Silverman (COO)

Growth Lead + Corporate Development @ KuzoClass
Director of Growth Marketing @ Bilt Rewards
CEO @ Hot Dot Media: acquired by Wondr Gaming in May 2021