
The fascinating history of agentic artificial intelligence

Anjana Vasan · Published in Wonder & Think · Feb 5, 2025

In 1996, then-World Chess Champion Garry Kasparov played his first game against a supercomputer: IBM’s Deep Blue.

At the time, Kasparov was renowned as the world’s youngest “undisputed” chess champion; he had, after all, defeated his predecessor, Anatoly Karpov, at the age of 22, and he went on to hold the title for the next 15 years.

So how could this previously untested, albeit pretty cool, technology win against Kasparov?

It didn’t. At least, not until the following year.

In 1997, Deep Blue defeated Kasparov in a widely publicized six-game match. This was the first time the world saw the potential of agentic AI—something that could think and plan strategically enough to defeat someone who was once considered…undefeatable.

But that’s not where this story begins. If you don’t want the whole timeline, the chart below is your TL;DR; if you do, it’s a pretty interesting read.

[Figure: A brief timeline of the history of agentic AI, from 1956 to 2025]

But first things first, what is agentic artificial intelligence?

In simple terms, agentic artificial intelligence (or agentic AI) is an autonomous system that can handle complex goals or tasks with little human input. It can adapt, make decisions, and operate with a relatively high(er) level of independence.

Many of these systems are trained with reinforcement learning, so they evolve based on new information, context, and changing environments.
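
If you like seeing ideas in code, here’s a minimal sketch of that observe-decide-act-learn loop, using tabular Q-learning on a made-up five-cell corridor. Everything in it (the environment, the rewards, the hyperparameters) is an illustrative assumption, not anything from a real agentic product:

```python
# A tiny sketch of the observe-decide-act-learn loop, assuming a made-up
# five-cell corridor where the agent earns a reward only at the rightmost cell.
import random

N_STATES = 5          # positions 0..4; position 4 is the goal
ACTIONS = [-1, +1]    # step left or step right
ALPHA, GAMMA, EPSILON = 0.1, 0.9, 0.2

# Q-table: the agent's running estimate of how good each (state, action) pair is
q = {(s, a): 0.0 for s in range(N_STATES) for a in ACTIONS}

def step(state, action):
    """Toy environment: move, clip to the corridor, reward 1 only at the goal."""
    nxt = max(0, min(N_STATES - 1, state + action))
    done = nxt == N_STATES - 1
    return nxt, (1.0 if done else 0.0), done

def choose(state):
    """Epsilon-greedy: mostly exploit what's been learned, occasionally explore."""
    if random.random() < EPSILON:
        return random.choice(ACTIONS)
    return max(ACTIONS, key=lambda a: q[(state, a)])

for episode in range(200):
    state, done = 0, False
    while not done:
        action = choose(state)                      # decide
        nxt, reward, done = step(state, action)     # act and observe
        best_next = max(q[(nxt, a)] for a in ACTIONS)
        # learn: nudge the estimate toward reward + discounted future value
        q[(state, action)] += ALPHA * (reward + GAMMA * best_next - q[(state, action)])
        state = nxt

# After training, the learned policy should point right (+1) in every cell
print({s: max(ACTIONS, key=lambda a: q[(s, a)]) for s in range(N_STATES - 1)})
```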

Before you ask, this doesn’t mean they can run unchecked.

There are specific frameworks, like human-in-the-loop (HITL), a concept in machine learning, that require humans to intervene to mitigate biases in the algorithms, approve specific actions, and monitor how the system works — essentially, supervise the systems.
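
To make that concrete, here’s a tiny sketch of what an HITL gate can look like in practice: the agent proposes actions, and anything flagged as risky waits for a human yes or no. The action names and risk rules are made up for illustration, not a real framework’s API:

```python
# A toy human-in-the-loop gate: the agent proposes actions, and anything
# flagged as high-risk waits for explicit human approval. The action names and
# risk rules are made up for illustration, not a real framework's API.
HIGH_RISK = {"send_email", "delete_record", "make_payment"}

def requires_approval(action):
    return action in HIGH_RISK

def run_with_oversight(proposed_actions):
    for action in proposed_actions:
        if requires_approval(action):
            answer = input(f"Agent wants to run '{action}'. Approve? [y/N] ")
            if answer.strip().lower() != "y":
                print(f"Blocked by human reviewer: {action}")
                continue
        print(f"Executing: {action}")  # in a real system, dispatch to a tool here

if __name__ == "__main__":
    run_with_oversight(["summarize_report", "send_email"])
```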

I’ll cover this in more detail in another article. For now, let’s look at how it all started.

Note: I’m trying to distill a vast, complex field into a few key milestones by scouring the internet. If I miss the mark, please bear with me.

We need to talk about Alan Mathison Turing

Before we dive in, I wouldn’t do this article — or you — justice if I didn’t discuss Alan Turing. While he might not play a huge role in this timeline, he is, in the opinion of many, the father of artificial intelligence.

There are a couple of reasons why:

In 1935, Turing described an abstract computing machine (known as the Turing Machine) with essentially limitless memory and a scanner that moves back and forth through that memory, symbol by symbol, reading what it finds and writing new symbols as needed (there’s a quick code sketch of this idea a little further down).

And in 1945, Turing even predicted the possibility of AI outsmarting humans in chess (seriously). Clearly, he was onto something.
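
Back to that 1935 machine for a second: here’s a minimal sketch of the idea, a tape of symbols, a head that reads and writes one cell at a time, and a table of rules. The toy program below just adds one to a unary number; the states and rules are my own illustration, not Turing’s original formulation:

```python
# A toy Turing-style machine: a tape of symbols, a head that reads and writes
# one cell at a time, and a rule table. The example "program" increments a
# unary number (three 1s become four 1s); states and rules are my own illustration.
from collections import defaultdict

# Rules: (state, symbol_read) -> (symbol_to_write, head_move, next_state)
RULES = {
    ("scan", "1"): ("1", +1, "scan"),   # skip over the existing 1s
    ("scan", "_"): ("1", 0, "halt"),    # write one more 1 on the blank, then stop
}

def run(tape_str, state="scan"):
    tape = defaultdict(lambda: "_", enumerate(tape_str))  # blank cells read "_"
    head = 0
    while state != "halt":
        write, move, state = RULES[(state, tape[head])]
        tape[head] = write
        head += move
    return "".join(tape[i] for i in sorted(tape)).strip("_")

print(run("111"))  # -> "1111"
```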

But his most significant contribution to the field, if you ask me, was the Turing Test. Although Turing proposed this imitation game decades ago, it still shapes how we think about AI today.

The Turing Test

In 1950, Turing dodged the traditional “what is intelligence?” debate by creating a practical test for computer intelligence. If you’ve watched Ex Machina, you’ll remember Domhnall Gleeson trying to figure out whether Alicia Vikander’s Ava really makes the cut.

In the Turing Test, you have three participants: a computer, a human foil, and a human interrogator. Everyone’s chatting via keyboard and screen, so there’s no visual giveaway. The interrogator’s job is to figure out which of the other two is the machine by asking anything and everything possible.

Meanwhile, the computer is free to lie, stall, or try any trick to blend in. For instance, it might respond “no” when asked if it’s a computer, or pretend to stumble over a math equation.

The big question: can the interrogator spot the AI, or will the computer pass itself off as human? If they can’t tell the difference, the computer is considered an intelligent, thinking being.

Short answer: We don’t have everything figured out yet.

The birth of artificial intelligence

[Photo: Marvin Minsky and others at the 1956 Dartmouth Summer Research Project; Source: IEEE Spectrum]

In 1956, a group of brilliant minds in computing and cognitive science gathered in one room with the goal of figuring out how machines could actually…think.

At the time, people weren’t totally sure what to call these “thinking machines.” So, John McCarthy, Assistant Professor of Mathematics at Dartmouth, coined the term “artificial intelligence” and decided to bring together the best in the field to tackle an audacious goal: figure out how to make machines think, reason, and learn like us (but maybe faster).

McCarthy teamed up with Marvin Minsky (Harvard), Nathaniel Rochester (IBM), and Claude Shannon (Bell Labs) for a summer workshop on AI research, known as the Dartmouth Summer Research Project on Artificial Intelligence.

They started off with one bold premise: if intelligence can be broken down into smaller components — like language, abstractions, and problem-solving — maybe a machine could be built to do all of those things.

Soon after, in 1959, attendee Arthur Samuel coined the term “machine learning,” and even created one of the earliest self-learning programs: a checkers-playing system (recognized as one of the first major advancements in evolutionary computing).

However, there were limitations.

Back then, most researchers were focused on symbolic, rule-based AI systems: machines that could only follow a strict set of preprogrammed instructions. While it was a big deal at the time, these systems couldn’t adapt or learn from new information, so they fell short when it came to true agentic potential.

Then came the rise of expert systems

By the time the 1980s rolled around, the world saw the rise of expert systems, software that mimicked human decision-making in specific domains. At the same time, machine learning techniques were also gaining ground, allowing machines to, well, learn from data rather than just follow rigid rules.

For example, MYCIN, developed at Stanford in the 1970s, is often considered the OG expert system; it helped diagnose bacterial infections and recommend treatments, an early step toward a more “agentic” model. While it wasn’t perfect, it hinted at a future where machines could adapt to new information, handle complex tasks, and make decisions that once belonged solely in the realm of human beings.
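
For a feel of how that generation of systems worked, here’s a minimal sketch of the IF/THEN, rule-based approach behind expert systems like MYCIN. The rules and certainty numbers below are toy illustrations, not MYCIN’s actual knowledge base or certainty-factor machinery:

```python
# A toy forward-chaining "expert system": hand-written IF/THEN rules plus a
# simple certainty score. The rules and numbers are illustrations, not MYCIN's
# actual knowledge base or certainty-factor machinery.
RULES = [
    # (required findings, conclusion, certainty)
    ({"fever", "stiff_neck"}, "possible bacterial meningitis", 0.7),
    ({"fever", "cough"}, "possible respiratory infection", 0.6),
]

def diagnose(findings):
    """Fire every rule whose conditions are all present in the findings."""
    return [(conclusion, certainty) for required, conclusion, certainty in RULES
            if required <= findings]

print(diagnose({"fever", "stiff_neck", "headache"}))
# -> [('possible bacterial meningitis', 0.7)]
```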

Deep Blue vs. Kasparov

I’ll skip this part — you’re already hooked.

Deep learning gets all the attention

In the early 2010s, deep learning became a huge deal.

On September 30, 2012, a convolutional neural network (CNN) called AlexNet showed up at the ImageNet challenge, an annual software contest where AI programs compete to see which one can best classify and detect objects.

AlexNet basically crushed everything else in sight, beating the runner-up’s top-5 error rate by more than 10 percentage points (15.3 percent versus 26.2 percent). If that number doesn’t mean much to you, I get it. Simply put, it’s the kind of gap that shattered any preconceived assumptions about deep learning.

Before that day, deep learning was considered too impractical or too niche. But once people saw AlexNet’s sheer dominance, the question wasn’t “is deep learning real?” anymore; it was “how fast can we adopt this?”
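
If you’re curious what the “convolutional” part actually does, here’s a minimal sketch of the core operation: sliding a small filter over an image and summing elementwise products. AlexNet stacks many learned filters with nonlinearities on top of this; the hand-picked edge filter below is just an illustration:

```python
# The core CNN operation: slide a small filter over an image and sum the
# elementwise products. Real networks like AlexNet stack many learned filters
# with nonlinearities; this hand-picked edge filter is just an illustration.
import numpy as np

def conv2d(image, kernel):
    """Valid cross-correlation of a 2D image with a 2D kernel."""
    kh, kw = kernel.shape
    out_h = image.shape[0] - kh + 1
    out_w = image.shape[1] - kw + 1
    out = np.zeros((out_h, out_w))
    for i in range(out_h):
        for j in range(out_w):
            out[i, j] = np.sum(image[i:i + kh, j:j + kw] * kernel)
    return out

image = np.array([[0, 0, 1, 1],
                  [0, 0, 1, 1],
                  [0, 0, 1, 1],
                  [0, 0, 1, 1]], dtype=float)
edge_filter = np.array([[1, -1],
                        [1, -1]], dtype=float)  # responds where brightness changes left to right
print(conv2d(image, edge_filter))  # strongest response along the dark-to-bright boundary
```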

That said, the unseen hand behind AlexNet’s brilliance (IYKYK) was the huge dataset that powered it.

And that synergy between big data and advanced neural networks set the foundation for what we see today in AI research.

LLMs rule the world (for now)

In 2022, OpenAI launched ChatGPT, which, as we already know, took the world by storm. The system showed off its ability to hold complex conversations and create content that’s eerily human-like.

Large language models (LLMs), like the ones behind ChatGPT, are a class of foundation models trained on massive datasets, which allows them to understand and generate natural language (and other content) for a wide variety of tasks.

The MIT Technology Review puts it this way:

A large language model (or LLM) is a type of neural network that has been trained on lots and lots of text. (Neural networks are software inspired by the way neurons in animal brains signal one another.) Because text is made up of sequences of letters and words of varying lengths, language models require a type of neural network that can make sense of that kind of data.
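
To make that “trained on lots and lots of text” idea a bit more concrete, here’s a deliberately tiny sketch that predicts the next word from simple bigram counts instead of a neural network. Real LLMs learn far richer patterns with billions of parameters; the toy corpus here is purely illustrative:

```python
# A deliberately tiny "language model": predict the next word from bigram
# counts over a toy corpus. Real LLMs learn far richer patterns with billions
# of parameters; this is only meant to show the next-word-prediction idea.
import random
from collections import Counter, defaultdict

corpus = "the machine learns and the machine plays chess and the machine wins".split()

# Count how often each word follows each other word
following = defaultdict(Counter)
for prev, nxt in zip(corpus, corpus[1:]):
    following[prev][nxt] += 1

def generate(start, length=6):
    words = [start]
    for _ in range(length):
        options = following[words[-1]]
        if not options:
            break
        # sample the next word in proportion to how often it followed the last one
        words.append(random.choices(list(options), weights=options.values())[0])
    return " ".join(words)

print(generate("the"))  # e.g. "the machine plays chess and the machine"
```

Scale that guessing game up by many orders of magnitude and replace the counts with a neural network, and you start to get the flavor of what an LLM does.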

LLMs have quickly become the poster child for generative AI. Following the popularity of ChatGPT, we saw a host of other LLM-powered assistants pop up, from Anthropic’s Claude and Microsoft’s Copilot to Google Gemini (previously known as Bard).

And suddenly, the lines between human and machine got a whole lot blurrier.

The not-so-subtle leap from LLMs to agentic AI

That said, LLMs aren’t the same as agentic AI at all.

Agentic AI systems, like OpenAI’s Operator, certainly tap into the creative prowess of generative models, but their main focus is making autonomous decisions and driving toward specific objectives, not just churning out content.

Again, how do we define an “agent”?

Think of it as an LLM-backed decision-maker that can take in information, reason through it, plan next steps, act on them, and — most importantly — learn from all those actions over time.

At its heart is an LLM specifically trained to coordinate the interplay among various AI tools or services, explains S&P Global’s research analyst, Melissa Incera. You can then refine that core LLM on relevant data, hand it access to specific tools (like calculators, search engines, or other AI models), and give it the resources it needs (like databases, instructions, or emails).

The result, at its most advanced stage, is basically a digital knowledge worker — one that can autonomously handle workflows, complete tasks, and even manage other agents or processes.
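
Here’s a minimal sketch of that pattern: a loop where a model proposes the next step, the code dispatches it to a tool, and the observation is fed back in. The call_llm function is a hypothetical stand-in for whatever model API you’d actually use (scripted here so the example runs), and the tools are toys, not a real framework:

```python
# A sketch of the agent loop described above: a model proposes the next step,
# the loop dispatches it to a tool, and the observation is fed back in.
# call_llm is a hypothetical stand-in for a real model API (scripted here so
# the example runs end to end), and the tools are toys, not a real framework.
import json

def calculator(expression):
    # Toy tool; a real agent would use a safe math parser instead of eval
    return str(eval(expression, {"__builtins__": {}}, {}))

def search(query):
    return f"(pretend search results for: {query})"

TOOLS = {"calculator": calculator, "search": search}

def call_llm(history):
    """Stand-in for an LLM call. Here it follows a canned script; a real agent
    would ask the model to pick a tool or produce a final answer."""
    if len(history) == 1:
        return {"tool": "calculator", "input": "21 * 2"}
    last = json.loads(history[-1]["content"])
    return {"final": f"The answer is {last['observation']}."}

def run_agent(goal, max_steps=5):
    history = [{"role": "user", "content": goal}]
    for _ in range(max_steps):
        decision = call_llm(history)                # reason / plan
        if "final" in decision:
            return decision["final"]                # done
        observation = TOOLS[decision["tool"]](decision["input"])   # act
        history.append({"role": "tool",             # remember the result
                        "content": json.dumps({"tool": decision["tool"],
                                               "observation": observation})})
    return "Stopped: step limit reached."

print(run_agent("What is 21 times 2?"))  # -> "The answer is 42."
```

Swap the scripted call_llm for a real model call, add more tools and some memory, and you have roughly the skeleton that many agent frameworks build on.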

[Figure: Agentic AI maturity curve; Source: S&P Global]

Right now, though, most “agentic” setups are still at what we’d call level 2 in maturity. In plain English, that means there’s a fair bit of manual oversight — predefined workflows and guardrails — and they’re not yet the kind of free-roaming, self-directed entities you’d truly call agents.

Getting there will require more sophisticated LLMs. And there’s a sticking point: reasoning. Until they can break down tasks, plan them out accurately, and fully own the decision-making process, agents won’t hit that higher level of autonomy we keep talking about.

In other words, the tech is on its way — but it’s not quite ready to run free just yet.

Finally, what about the ethics of it all?

AI is moving at a ridiculous speed right now. Take your pick: computer vision, natural language understanding, and even robotics.

But as we hand over more autonomy and decision-making power to these systems, we’ve got a whole bunch of ethical and regulatory questions staring us in the face.

There’s a lot to cover here, from applied AI ethics to the European Union’s AI Act, so I’ll break that down in my next article.
