Where are autonomous agents in AI capability maturity?

Does the future of AI live in autonomous agents, or is the journey more complicated than that? Ever hear the statement, “not every problem needs a hammer”? It is closely tied to “the right tool for the right job,” and the same is true for AI, including capabilities that have yet to evolve in the AI market. The revolution caused by AI will, in many ways, be accelerated by the advent of Copilots and Agents that let us offload work. That offloading, however, is closely tied to the capabilities the agent needs to possess. All of these (capabilities, accuracy, autonomy, precision) are weights that will determine where on the continuum a tool sits.

Let’s start by looking at the continuum. Moving from left to right, you can see the oldest types of AI, or at least the most reliable… then what we’re seeing in generative AI and very specific autonomous agents, and eventually general agents that have yet to truly emerge, but are coming.

  1. Predictive Large Datasets: AI in this tier analyzes large datasets to make predictions about future trends, behaviors, or events. It uses statistical algorithms and machine learning techniques to identify patterns and correlations within the data. These models tell you what will likely happen, but not what to do about it. Examples include predicting stock market movements based on historical data, supply chain forecasting, and forecasting weather patterns. We’ve been doing projects like this for a loooooonnnngggg time. That said, people who have been doing AI for a long time sometimes eschew new methods, as if technology hasn’t advanced. For example, the first couple of projects we did that involved NLP cost $1 million+, and we had to create the entire language portion from scratch. We’re now at a point where the language and communication elements are handled by the LLM and we can focus on higher-order precision.
  2. Prescriptive Large Datasets: This tier goes beyond prediction to suggest actions you can take to achieve desired outcomes. It analyzes data and provides recommendations, allowing businesses or individuals to make informed decisions based on predicted trends. Examples include recommending personalized treatment plans for patients based on medical data, or indicating inventory moves based on upcoming demand forecasts. It is harder to tell a company what to do than to tell it what is going to happen, so the choice of next-best-action is usually reserved for humans. Granted, we typically make those decisions with incomplete data, dashboards, and intuition, but we still own and make those decisions. As is the case with other AI to come, when we use AI, we need to measure, evaluate, measure again, and improve. The same discipline can be applied to human decisions that were never measured before: “how accurate was the last inventory forecast and our associated order?”… “how do we adjust it next time?”. The drive to measure alone creates value here.
  3. Storytelling on Large Datasets: AI in this tier can transform complex data into narratives or visual stories. It helps communicate insights derived from large datasets effectively. For example, it can simulate what happens if a factory is opened or closed, if I increase account executives in a certain area, or how a potential storm would affect generator sales. This is where AI can truly cross functional areas for executive teams… helping predict the right moves as sets of decisions, not just one decision in isolation. Is it more beneficial to open a new location at this corner, or another? A company I’m working with is building exactly that kind of storytelling AI solution.
  4. General – Directed Activities: AI systems in this category can perform a wide range of tasks but require specific instructions and guidance. They are versatile in handling various tasks when directed by humans. This is the emergence of GPT-based platforms that can “do a lot of things well but nothing excellent”. In a sense, they are much like humans, since we can generalize across many situations and respond to inputs. They still aren’t like humans, though, in the depth of generality and autonomy they have earned. Examples include chatbots, virtual assistants, and recommendation engines. As we all become Copilot users, we’ll become very comfortable with this style of agent vs. the ML models in the earlier examples. We give it a task, the Copilot does a portion of it, and we take it from there. The mistake is to accept its output as 100% complete… there is a reason it’s called a Copilot.
  5. Directed – Autonomous Activities: These AI systems can perform specific tasks autonomously without human intervention. However, they are limited to the particular activities they are programmed for. Even the idea of autonomy evokes the thought of Skynet becoming a reality, but autonomous systems in this case are very specific to a non-generalizable job, for instance a robot on the assembly line performing a specific task. An example that creates a ton of controversy is self-driving cars. Even if these are theoretically safer than humans (which they likely are), the idea of a human being harmed by a robot, even in smaller quantities, evokes an emotional response. What if there were 10,000 deaths in the United States as a result of self-driving cars, in a similar distribution to the tens of thousands that happen each year due to human drivers? I’m guessing that those 10,000, even if they represented a smaller share of deaths across an equal distribution of cars, would be something people feel differently about. The second you move into autonomy, the bar goes up and so do the outcomes that need to be managed.
  6. Creative – Directed Activities: AI at this level has creative capabilities but still requires direction from humans. It can be used for content creation, design, music composition, etc., under human guidance, for instance generating art based on specific input or composing music. Is this truly creative? It might not be fair to call it that, since the idea is coming from the person, not the machine. It’s like giving an idea to an intern, “paint me a picture of a dog riding a tricycle”, who then paints it and hands it back. The AI didn’t come up with the original idea, but it did create the asset based on the combined knowledge in its training data. We also know people who publish posts written by ChatGPT… they follow the same pattern and it’s really, really annoying. I go out of my way to be authentic in my writing, and although I may sometimes use Copilot for ideas, I ensure I communicate my own raw perspective vs. the somewhat stale sound of an AI model. In the same way, however, I will regularly have Copilot draft 90% of a functional document, because in that case good-enough is good-enough and the authentic nature of the text is less important than its instructive content. Make no mistake though, AI is an asset and is coming for creative activities just as much as it is coming for repeatable ones. The BS detector for creative work is more sensitive, though, and AI-generated output will be easier to sniff out there.
  7. General – Autonomous Activities: This is a more advanced form of AI that can perform a wide range of tasks autonomously. It has self-learning capabilities, allowing it to adapt and improve over time without human intervention. This by itself is a range… for instance, a robot capable of performing various tasks in an Amazon warehouse might not be allowed to continue learning. Blocks may be placed on the acquisition of new knowledge to limit the risk of the agent or to focus it on a set of tasks. In that sense it isn’t truly general, but even a general system has degrees of generality. However, let’s assume for a second that you allow the platform to continue learning… essentially that there is no cap on that capability. This increases the ability of the platform to address general tasks (such that 1 million hours of training is inferior to 2 million hours of training), but could also lead to dramatically unpredictable results. Know anyone else like this? Humans… not in a directly synonymous way, but at least in that humans can be unpredictable in the relationship between training, knowledge, personality, etc. You may know that you prefer ice cream to cheesecake, but you don’t necessarily know why you prefer it. In a similar sense, explainability of a general AI model goes out the window, while its possible capabilities grow considerably. We already see some simpler versions of this emerging in video gaming, chess, and other use cases where it isn’t a general-directed activity.
  8. Creative – Autonomous Activities: The most advanced tier, where AI possesses creative abilities and can operate without human intervention. Here the AI brings the ideas to you without you prompting it. It could potentially create original content, innovate, or even problem-solve creatively without human guidance, for instance generating novel ideas, composing original poetry, or designing new products. We are already talking about AI delegation in the context of Copilot, and people are just starting to learn how to do that. Many are not used to taking something (or a part of something) they do and giving it to someone else. Ever work with someone who isn’t a good delegator? That challenge will be even bigger as they move to AI. The best employees of the future will be those who can force-multiply with AI to be a 5x, 8x, 11x employee based on their ability to delegate to agents. Is there a downside? Absolutely… it is tied to the devaluing of the human person when work is being done by machines. Even at this level the AI is missing true consciousness and eternally-driven agency, which is different than even the minimum bar for AGI. We need to maintain, even at this stage, that AI is not equal to the unique characteristics of humanity… or as I call it… “the Human Difference”, which is a post-yet-to-come.
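
The measurement habit from tier 2 (“how accurate was the last inventory forecast?”… “how do we adjust it next time?”) can be made concrete with very little code. Below is a minimal Python sketch, not a production method: the demand numbers are made up, and the function names are my own for illustration. It scores a past forecast with two common measures, mean absolute percentage error (how far off, on average) and bias (whether we tend to over- or under-forecast).

```python
def mean_absolute_percentage_error(actuals, forecasts):
    """Average miss as a fraction of actual demand (0.10 means 10% off)."""
    if len(actuals) != len(forecasts):
        raise ValueError("actuals and forecasts must be the same length")
    # Skip periods with zero actual demand, where a percentage is undefined.
    pairs = [(a, f) for a, f in zip(actuals, forecasts) if a != 0]
    if not pairs:
        raise ValueError("need at least one period with non-zero actual demand")
    return sum(abs(a - f) / abs(a) for a, f in pairs) / len(pairs)


def forecast_bias(actuals, forecasts):
    """Average of (forecast - actual): positive means over-forecasting."""
    if len(actuals) != len(forecasts):
        raise ValueError("actuals and forecasts must be the same length")
    return sum(f - a for a, f in zip(actuals, forecasts)) / len(actuals)


# Hypothetical numbers: units actually sold vs. units we forecast, per period.
actual_demand = [100, 200, 400]
last_forecast = [110, 180, 400]

mape = mean_absolute_percentage_error(actual_demand, last_forecast)
bias = forecast_bias(actual_demand, last_forecast)

print(f"Average miss: {mape:.1%}")
print(f"Bias (units per period): {bias:+.1f}")
```

The same two numbers can be tracked whether the forecast came from a model or from a person with a dashboard, which is the point: once the miss and its direction are visible, there is something specific to adjust next time.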

What are my takeaways from this? The best solution is often the simplest solution. Don’t use an agent to build a Supply Chain forecasting model, but don’t use a typical ML model if you want an agent capable of sounding and acting human with a customer. Understand that we’ll continue to unlock more capabilities in agents, enabling work we simply can’t hand off today… such as agents that function akin to offshore developers. We’re close and we need to prepare our companies to make the transition.