Unraveling the Magic Behind AI, Machine Learning, Deep Learning, and Generative AI
Technology Public Lecture
Artificial Intelligence (AI) has become deeply embedded in every aspect of our daily lives. But what's really going on behind the curtain of magic?
The Common Mix-Up:
Most people use AI, Machine Learning, and Deep Learning as if they mean the same thing. While they're intimately related, they're actually quite different.
Let's use a simple analogy to make sense of it all!
Think of Russian Matryoshka dolls — those charming wooden dolls that nest inside one another, each smaller than the last.
They're not competing technologies — they're layers of intelligence, each building upon the last.
Artificial Intelligence (AI): This isn't a single technology — it's an umbrella term covering any technique that enables machines to mimic human intelligence.
The goal: build machines that can think, reason, understand language, and solve problems — sometimes even better than humans can.
"The Specialist" (Narrow AI)
Built to do one task well. Every AI system we interact with today is narrow AI.
"The All-Rounder" (General AI)
A machine that could match human intelligence across any task: the holy grail of AI research, and still theoretical.
| School 1: Symbolic AI | School 2: Machine Learning |
|---|---|
| "Strict Teacher Method" | "Smart Student Method" |
| Explicitly program human knowledge as rules into the system. | Skip the rules — feed it data and let it figure things out. |
| IF patient has fever THEN give Paracetamol (Expert Systems). | Show thousands of images and teach "this is a cat". |
| Provides clear structure & logic. | Provides flexibility and endless learning capability. |
In 2022, the painting that won first prize in the digital art category at the Colorado State Fair wasn't painted by a human hand. It was generated with an AI tool called Midjourney.
Questions it raised: Who is the real artist, the human who typed the prompt or the machine that produced the image? And what does "creativity" even mean now?
This isn't just a technology; it's a revolution that tore up the 'Old Testament' of computing and wrote a 'New Testament'!
Like following a strict cookbook. The computer executes instructions but can't improvise.
Example: Website tax calculation.
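A minimal sketch of that cookbook style, using hypothetical tax brackets invented for illustration: every rule is spelled out by the programmer, and the computer only ever executes them.

```python
# Traditional programming: the human encodes the rules explicitly.
# (The 5% / 10% brackets here are hypothetical, purely for illustration.)
def calculate_tax(amount: float) -> float:
    if amount <= 1000:
        return amount * 0.05   # low bracket
    return amount * 0.10       # high bracket

print(calculate_tax(500))    # low-bracket purchase
print(calculate_tax(2000))   # high-bracket purchase
```

The program can never handle a case its author didn't anticipate; change the tax law and a human must rewrite the rules.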
Here's where it gets interesting — we flip the entire equation!
The system learns to create its own rules! Example: Teaching a computer to recognize cats.
The Traditional Approach Fails: Create a rule like "block emails mentioning 'prize'" and scammers simply write "pr!ze" or "p-r-i-z-e" to bypass it.
The Machine Learning Advantage: Instead of following hand-written rules, the system studies thousands of real spam messages and learns the telltale patterns itself — odd spellings, suspicious links, urgent phrasing. It synthesizes these patterns into a decision: "This looks like spam!"
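Here is a deliberately tiny sketch of that idea, with made-up training messages: instead of a hand-written "block the word prize" rule, word scores are learned from labelled spam and non-spam examples.

```python
# Toy spam detector: score words by how often they appear in spam vs.
# legitimate ("ham") training messages. The messages are invented examples.
from collections import Counter

spam_msgs = ["win a free prize now", "free prize claim now", "win money now"]
ham_msgs  = ["meeting moved to monday", "lunch at noon", "see the report attached"]

spam_words = Counter(w for m in spam_msgs for w in m.split())
ham_words  = Counter(w for m in ham_msgs for w in m.split())

def spam_score(message: str) -> float:
    # Positive score -> spam-like. The "rules" were learned from data,
    # not written by a programmer.
    return float(sum(spam_words[w] - ham_words[w] for w in message.split()))

print(spam_score("claim your free prize"))   # positive: looks like spam
print(spam_score("see you at the meeting"))  # negative: looks legitimate
```

A real filter would use far more data and a probabilistic model, but the principle is the same: the patterns come from examples, not from rules.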
Learning through a teacher (Labelled Data).
Self-discovery method (Unlabeled Data).
Learning from trial and error.
Think of this as learning with answer keys. You provide both the data and the correct answers (labels).
Technical Note:
Garbage In = Garbage Out. If your labels are wrong, the answers will be wrong too!
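Both points — learning from labelled examples, and labels poisoning the result — fit in a few lines. This sketch uses invented data and the simplest possible "model", a 1-nearest-neighbour lookup:

```python
# Supervised learning in miniature: (feature, label) pairs, and a model
# that answers by copying the label of the closest training example.
training_data = [
    (1.0, "cat"), (1.5, "cat"), (2.0, "cat"),
    (8.0, "dog"), (8.5, "dog"), (9.0, "dog"),
]

def predict(x: float, data) -> str:
    return min(data, key=lambda pair: abs(pair[0] - x))[1]

print(predict(1.2, training_data))  # near the "cat" examples

# Garbage In = Garbage Out: corrupt one label and the answer flips.
bad_data = [(1.0, "dog")] + training_data[1:]
print(predict(0.9, bad_data))  # wrong answer, because the label was wrong
```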
Here's the dirty secret: The hardest part of supervised learning isn't fancy algorithms — it's the tedious human work of data labeling.
Example: Autonomous Vehicles
Teams of people manually draw bounding boxes around pedestrians, cars, and traffic signs in millions of video frames. It's painstaking work.
Behind every "smart" AI is an army of human labelers, often working through platforms like Amazon Mechanical Turk.
No labels, no answers — just raw data. The system must find patterns on its own.
Unlabeled customer data: items purchased, purchase frequency, amount spent.
What Happens:
Cluster 1. Identity: Frequent visitors, big spenders. Strategy: No discounts needed; give recognition and an exclusive relationship manager (VIP status).
Cluster 2. Identity: Only come when there's a sale. Strategy: VIP perks are wasted here; send flash-sale announcements like "50% off today only".
Cluster 3. Identity: First-time buyers. Strategy: A warm welcome discount and a loyalty program to convert them into loyal customers.
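The grouping itself can be sketched with a tiny 1-D k-means on yearly spend alone. The amounts and the hand-picked starting centers are invented for clarity; no labels are given, yet three groups emerge from the data:

```python
# Unsupervised clustering sketch: k-means on a single feature (yearly spend).
# Spend amounts are made up; centers are initialised by hand for clarity.
spend = [20, 25, 30, 480, 500, 520, 4800, 5000, 5200]
centers = [20.0, 500.0, 5000.0]

for _ in range(10):
    # Assignment step: each customer joins the nearest center's cluster.
    clusters = [[] for _ in centers]
    for s in spend:
        nearest = min(range(len(centers)), key=lambda i: abs(centers[i] - s))
        clusters[nearest].append(s)
    # Update step: each center moves to its cluster's average.
    centers = [sum(c) / len(c) for c in clusters]

print(clusters)  # three natural spending tiers, found without any labels
```

In practice you would cluster on several features at once (frequency, basket size, recency) with a library like scikit-learn, but the loop is the same idea.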
The AI learns through experience — like a child discovering that touching a hot stove hurts. Every action has consequences.
Example: Autonomous Taxi
March 2016. Google DeepMind's AlphaGo played its 37th move against world champion Lee Sedol. Every expert watching called it a blunder — a move that violated centuries of Go strategy.
They were wrong. Move 37 was brilliant — it secured AlphaGo's victory and changed Go forever. This was the moment we realized: AI can discover strategies beyond human imagination.
The powerhouse behind today's AI revolution. Multi-layered neural networks that loosely mimic how our brains process information.
"Deep" refers to multiple layers of processing. Like building a house: foundation first, then walls, then roof. Each layer builds on the previous one.
Detects simple features: edges, corners, basic shapes.
Combines basic features into more complex patterns: eyes, ears, whiskers.
Synthesizes everything into the high-level concept: "This is a cat."
Sunday evening. Suppose you decide to go to the beach to relax. Your brain analyzes three main factors to make this decision:
Is it nice weather outside, or raining?
x₁ = 1 (Nice weather)
x₁ = 0 (Rain)
Does your friend want to come with you?
x₂ = 1 (Will come)
x₂ = 0 (Won't come)
Is public transport (bus/train) nearby?
x₃ = 1 (Available)
x₃ = 0 (Not available)
These three factors can be given to our perceptron as inputs through binary (0 or 1) variables x₁, x₂, x₃.
Each input gets a weight that reflects how much it matters to your decision. Let's say weather is your top priority:
w₁ = 6 → Weather
w₂ = 2 → Friend
w₃ = 2 → Transport
Threshold = 5
∑ = (w₁×x₁) + (w₂×x₂) + (w₃×x₃)
∑ ≥ 5 → 1 (Go)
∑ < 5 → 0 (Don't go)
Let's run through two scenarios and watch the perceptron decide:
x₁ = 1 (Nice weather)
x₂ = 0 (No friend)
x₃ = 0 (No transport)
∑ = (6×1) + (2×0) + (2×0) = 6
6 ≥ 5 → Output: 1
✓ Go to the beach!
x₁ = 0 (Rain)
x₂ = 1 (Friend coming)
x₃ = 1 (Transport available)
∑ = (6×0) + (2×1) + (2×1) = 4
4 < 5 → Output: 0
✗ Stay home!
The Big Idea: By tuning weights and thresholds, we can completely reshape how a machine makes decisions. This is the foundation of neural networks!
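The whole worked example above translates directly into a few lines of code, with the same weights (6, 2, 2) and threshold (5):

```python
# The beach-day perceptron: weighted sum of binary inputs vs. a threshold.
def perceptron(inputs, weights=(6, 2, 2), threshold=5):
    total = sum(w * x for w, x in zip(weights, inputs))  # ∑ = w₁x₁ + w₂x₂ + w₃x₃
    return 1 if total >= threshold else 0                # fire (go) or not (stay)

print(perceptron((1, 0, 0)))  # sunny, no friend, no transport: 6 ≥ 5 → 1, go!
print(perceptron((0, 1, 1)))  # rain, friend, transport: 4 < 5 → 0, stay home
```

Change the weights — say, make the friend matter more than the weather — and the same machine makes entirely different decisions. Stacking many of these units into layers is what the rest of this chapter builds toward.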
"Computer Vision Specialist" (Convolutional Neural Networks, CNNs)
Excels at understanding images and video. Inspired by how our visual cortex processes what we see.
Applications: Face ID, medical imaging (detecting tumors), self-driving car vision.
"Sequential Data Expert" (Recurrent Neural Networks, RNNs)
Designed for data with order and context: language, speech, time series. Has "memory" of what came before.
Applications: Siri and Alexa, real-time translation, stock prediction.
Neural networks have been around since the 1950s. So why did deep learning only explode in the 2010s? Three things converged: massive datasets, cheap GPU compute, and algorithmic breakthroughs that fixed a long-standing training problem.
During training, the learning signal (gradient) must travel backward through all layers. But with each layer, it gets weaker and weaker until it vanishes into nothing.
Remember the "telephone game" where a message gets garbled as it passes through many people? Same problem here!
This roadblock stalled deep learning for decades. Modern fixes like ReLU activations and LSTM cells finally cracked it.
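The arithmetic behind the vanishing gradient fits in a few lines. The sigmoid's derivative never exceeds 0.25, so backpropagating through many sigmoid layers multiplies many small numbers; ReLU's derivative is 1 for positive inputs, so the signal survives. (This sketch shows only the best case for sigmoid — real gradients also multiply in weights.)

```python
# Why deep sigmoid networks stalled: repeated multiplication by small derivatives.
sigmoid_max_grad = 0.25   # maximum of d/dz sigmoid(z), reached at z = 0
relu_grad = 1.0           # ReLU's derivative for any positive input

signal_sigmoid, signal_relu = 1.0, 1.0
for _ in range(20):                      # 20 layers of backpropagation
    signal_sigmoid *= sigmoid_max_grad
    signal_relu *= relu_grad

print(signal_sigmoid)  # about 9e-13: effectively vanished after 20 layers
print(signal_relu)     # 1.0: the learning signal survives intact
```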
From recognition to creation. Instead of asking "What is this?", generative AI asks "What if I made something new?"
Generative Adversarial Networks (GANs) — Ian Goodfellow's 2014 breakthrough: teaching AI to create by making two neural networks compete against each other!
Round 1: Generator creates a crude blob. Discriminator easily spots it as fake.
After thousands of rounds: Generator gets better at faking. Discriminator sharpens its detection. Eventually, the fakes become indistinguishable from reality!
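The back-and-forth dynamic can be caricatured in one dimension. This is emphatically not a real GAN (those are two neural networks trained by gradient descent); it only mimics the arms race: "real" data sits at 10.0, a one-parameter generator starts far away, and each round the discriminator redraws its boundary while the generator chases it.

```python
# A drastically simplified caricature of the GAN arms race, in 1-D.
real_mean = 10.0
g = 0.0                               # the generator's lone parameter

for _ in range(100):                  # training rounds
    boundary = (real_mean + g) / 2    # discriminator: split real from fake
    g += 0.1 * (boundary - g)         # generator: step across the boundary

print(round(g, 2))  # close to 10.0: the fakes now sit on top of the real data
```

Each side's improvement forces the other to improve, and the equilibrium is a generator whose output is indistinguishable from the real data.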
The Transformer: the architecture powering ChatGPT, Gemini, and the AI writing revolution.
Emergent Abilities: More Is Different
GPT-3 has 175 billion parameters. Something magical happens at this scale: capabilities emerge that weren't explicitly programmed. The model was never explicitly trained to reason or translate, yet it can do both.
"Because it rained, ____"
Small model: "the streets got wet" (predictable)
Large model: "children ran outside with paper boats while streams began to swell" (contextual understanding + creativity)
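Under the hood, both models are doing the same job: predicting the next token. A bigram counter, built here on a made-up ten-word corpus, is the tiniest possible version of that idea; real LLMs replace the lookup table with a transformer over billions of parameters.

```python
# Next-word prediction in miniature: count which word follows which.
from collections import Counter, defaultdict

corpus = "because it rained the streets got wet because it rained we stayed home"
words = corpus.split()

next_word = defaultdict(Counter)
for a, b in zip(words, words[1:]):   # count every adjacent word pair
    next_word[a][b] += 1

def predict_next(word: str) -> str:
    # Return the most frequent continuation seen in training.
    return next_word[word].most_common(1)[0][0]

print(predict_next("it"))  # the continuation this corpus makes most likely
```

The "creativity" of a large model is this same prediction machinery operating over vastly more context and vastly more parameters.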
Language is where AI truly comes alive. All the techniques we've covered—machine learning, deep learning, transformers—reach their full potential when applied to human communication. Next chapter: we dive into the fascinating world of NLP.