MODULE 02 · FULL SESSION (3–4+ HR) · ML-FOCUSED + LIGHT PYTHON

Programming for Artificial Intelligence

Student reference: myths, rules vs ML, Python simulations, full ML story (splits, loss, metrics, bias), expandable examples, self-check activities, and quiz — no separate teacher script on the page.

ML story (core)
Python logic (brief)
10 Quiz Questions
8 Real-World Examples

Can AI build itself?

A core question for understanding how real AI systems are built — and what role people still play.

Think about it

“Can AI build itself?” Short answer for beginners: No — people still write the instructions, choose the data, and design the system. AI can help write code, but humans set goals and responsibility.

Discussion
💬

One simple line

Programming = communication with machines. We use languages like Python so the computer can follow steps reliably, millions of times faster than we can.

Definition
🍳

Recipe analogy

Recipe = code · Chef = computer · Dish = output. A wrong or vague recipe → bad dish. Same for bugs or missing steps in programs.

Analogy

Animated flow — follow left to right: idea becomes code, the machine runs it, you get output.

Bridge from Module 1

Quick verbal recap: AI is the big goal (smart behaviour). Machine learning is one way to achieve it using data. Programming is how we tell the machine what to do with that data — load it, train, show results. Today we slow down on ML + light Python, not on coding drills.

Try with a friend or write in your notes: Name one app you used recently that “feels smart.” What might it have learned from — your taps, location, voice, or something else?

Myths vs facts

Common myth | Closer to the truth
“AI can think like a human brain.” | Today’s systems match patterns in data; they don’t “understand” like people.
“AI will replace all programmers.” | Tools change; humans still define problems, data, ethics, and checks.
“More data always fixes everything.” | Bad, biased, or wrong data makes worse outcomes; quality matters.
“If the code runs, the AI is correct.” | Software can run perfectly and still give unfair or silly predictions.

Topic | Illustrative fact (industry surveys, rounded)
Developers using / learning AI-assisted coding tools | Stack Overflow Developer Survey (2025): a large majority of professional developers reported using AI tools in their workflow.
Python in data / ML teaching | Consistently among the most-taught languages in university and bootcamp data-science curricula worldwide.
ML project time (rule of thumb) | Practitioners often report that data work (collection, cleaning, labels) takes more calendar time than picking an algorithm.

Exact percentages change every year; the point is scale and workflow, not memorising numbers.

Rules vs learning

📜

Rule-based (“if–then”)

A human writes every rule: “If email contains ‘win lottery,’ mark spam.” Works until spammers change words. Does not learn from new examples by itself.

📈

Machine learning

The system adjusts from many examples. It finds patterns humans did not hand-write. Still needs good data, programming to train it, and human oversight.

What is an “algorithm”? (plain English)

An algorithm is a clear sequence of steps to solve a task — like a precise recipe. A program implements algorithms in a language the computer runs. Machine learning uses algorithms that update internal numbers (weights) when they see data; classical programs use fixed logic unless someone edits the code.
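
To make the contrast concrete, here is a minimal sketch. Everything in it is invented for illustration: the function names, the five example emails, and the "at least two learned words" threshold. Real spam filters are far more sophisticated; the point is only that the first function never changes unless a person edits it, while the second depends on what the examples contained.

```python
# Rule-based: a human wrote this rule by hand. It never changes on its own.
def rule_based_is_spam(text):
    return "win lottery" in text.lower()

# Learning-style: count which words show up more often in spam than in normal mail.
def learn_spam_words(examples):
    spam_counts, normal_counts = {}, {}
    for text, is_spam in examples:
        counts = spam_counts if is_spam else normal_counts
        for word in set(text.lower().split()):
            counts[word] = counts.get(word, 0) + 1
    # Keep words seen in spam more often than in normal mail.
    return {w for w, c in spam_counts.items() if c > normal_counts.get(w, 0)}

def learned_is_spam(text, spam_words):
    words = set(text.lower().split())
    # Assumed threshold: at least two learned spam words.
    return len(words & spam_words) >= 2

# Hypothetical labelled examples: (email text, is it spam?)
examples = [
    ("win lottery now", True),
    ("claim your prize now", True),
    ("free prize claim today", True),
    ("lunch meeting today", False),
    ("project report attached", False),
]

spam_words = learn_spam_words(examples)
new_mail = "Claim your free prize"   # spammer avoided the words "win lottery"

print(rule_based_is_spam(new_mail))           # False: the fixed rule misses it
print(learned_is_spam(new_mail, spam_words))  # True: learned words still match
```

If the spammers change their wording again, the rule needs a human to rewrite it, while the learned version needs fresh labelled examples and another training run.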

This module builds mental models; deeper math comes in later modules.


Python in one glance

You do not need to code fluently yet — only follow the logic (storage, decisions, repetition). Use the simulations below to see what the computer would print.

What is Python? A language we use to tell the computer what to do. In AI courses it is popular because it is readable and many ML tools are built around it.

You will see Python in notebooks, scripts, and courses — but the computer does the heavy math inside libraries (written in fast languages). Your job first is to read logic: what is stored, what repeats, what branch runs.

Why Python shows up in AI

Reflection: Loops written in plain Python run slowly compared with compiled code, so why do researchers still use Python for AI? (Hint: libraries do the heavy work in C/C++/CUDA; Python orchestrates the experiments.)

Example — output to the screen:

print("Hello AI")
Hello AI

The computer displays exactly what we ask — that is the idea behind all later programs.

📦

1 · Variable = storage box

age = 20

A name (age) pointing to a value we can reuse.

🔀

2 · Condition = decision

if age >= 18:
    print("Adult")

The computer chooses a path based on a true/false check.

🔁

3 · Loop = repeat

for i in range(3):
    print("Hello")

Same action multiple times — essential for processing lots of data later.

Two more tiny patterns

📝

Text in quotes = string

name = "Priya"
print(name)

Variables can hold text, not only numbers. ML pipelines often pass file paths, labels, and messages as strings.

📋

List = many items in order

scores = [72, 85, 90]
for s in scores:
    print(s)

A loop can walk through a list — same idea as walking through rows of a dataset later.

Simulate the list loop (conceptual output):

72
85
90
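
The three ideas (variable, condition, loop) can also be combined in one short program. This is only a sketch: the pass mark of 80 is an assumed number chosen for the example, not a rule from this module.

```python
scores = [72, 85, 90]

total = 0      # variable: running sum of all scores
passed = 0     # variable: how many scores reach the pass mark

for s in scores:             # loop: visit each score once
    total = total + s
    if s >= 80:              # condition: assumed pass mark of 80
        passed = passed + 1

average = total / len(scores)
print(round(average, 1))     # 82.3
print(passed)                # 2
```

Replace the list with thousands of rows and the same pattern becomes "walk through a dataset and summarise it."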

More basic simulations

Numbers and arithmetic

a = 5
b = 3
print(a + b)
8

if / else (two branches)

age = 16
if age >= 18:
    print("Adult")
else:
    print("Minor")
Minor

Loop with index i

for i in range(4):
    print(i)
0
1
2
3

Joining strings

name = "Aisha"
print("Hello, " + name)
Hello, Aisha
Logic check: If you change age = 16 to age = 20 in the if/else example and run again, the output becomes Adult. The machine follows the condition exactly — it does not guess your intention.
What is a “bug”?

A bug is when the program does what you wrote, not what you wanted — wrong variable, wrong indent, wrong condition. ML has a cousin idea: the code runs, but predictions are wrong because of data, features, or model choice. Part 5 returns to this.

Remember: You don’t need fluent coding yet — if you can read variables, branches, loops, and print, you can follow how ML code is structured.

Can machines learn like humans?

In a narrow, practical sense, yes: machine learning means learning from data, and with no data there is no useful learning. This part ties together almost every idea you need before the algorithms in Module 3.

Human learning | Machine learning
Experience | Data (examples)
Brain | Model
Practice / study | Training
Guess on exam | Prediction

With a study partner or in your notes: pick one row in the table and write an everyday analogy (e.g. “exam practice ≈ training”).

What “learn” means here

We are not claiming a laptop has feelings or consciousness. In ML, “learn” means: the system updates from data so that its predictions improve on similar future examples. That update is implemented with math and code — usually by minimising mistakes on training data while hoping it still works on new data (generalisation — a theme for later modules).

Phase | What happens (story version)
Training (learning) | Model sees many labelled or unlabelled examples and adjusts internal parameters.
Inference (prediction) | Trained model receives a new example and outputs a label, score, or action quickly, like using a finished calculator.
[Diagram: Data (watch · click · text) → Learning (training) → Model (the “brain”) → Prediction. No data = no AI (for learning systems).]

Core pipeline — animation suggests information moving stage to stage.

Illustrative “more examples → richer patterns” (not real metrics)

Labels: the “answer key”

Labelled data means each training example comes with the correct output we want the model to imitate later — spam/not spam, price sold, disease yes/no. Unlabelled data is only inputs; the algorithm must discover structure (unsupervised learning). Most beginner stories start with labels — that is supervised learning.

Self-check: Where do labels come from? Humans (annotators), sensors, historical records, or rules — if labels are wrong or noisy, the model learns that noise.

House ID | Size (100 sq.ft units) | Rooms | City zone (code) | Label: sold price (₹ lakhs)
H01 | 12 | 3 | 2 | 85
H02 | 9 | 2 | 1 | 52
H03 | 15 | 4 | 3 | 118
H04 | 11 | 3 | 2 | 79
H05 | 8 | 2 | 1 | 45

Toy numbers for learning only — not a real market dataset.

One supervised row: inputs on the left, label on the right — the model learns to predict the label from the inputs.
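
Here is a minimal sketch of "learn the label from the inputs," using only the size column and the sold price from the toy table. The straight-line model and the least-squares formula are one simple choice made for illustration; the names `rows` and `predict_price` are hypothetical, and real projects would use more features and a library.

```python
# Toy rows from the house table: (size in 100 sq.ft units, sold price in ₹ lakhs).
rows = [(12, 85), (9, 52), (15, 118), (11, 79), (8, 45)]

sizes = [x for x, y in rows]    # inputs
prices = [y for x, y in rows]   # labels (the "answer key")

# Training: fit a straight line  price = slope * size + intercept  by least squares.
mean_x = sum(sizes) / len(sizes)
mean_y = sum(prices) / len(prices)
slope = (sum((x - mean_x) * (y - mean_y) for x, y in rows)
         / sum((x - mean_x) ** 2 for x in sizes))
intercept = mean_y - slope * mean_x

# Inference: predict the label for an input the model has not seen.
def predict_price(size):
    return slope * size + intercept

print(round(predict_price(10), 1))   # about 65.2 (₹ lakhs) for a size-10 house
```

The two learned numbers, `slope` and `intercept`, are the whole "model" here. Training set them from labelled rows; inference just reuses them.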

Splitting data (train · validation · test)

Real projects usually divide examples into training (fit the model), validation (tune choices — which feature set, how complex the model), and test (one final honest check on data that did not influence those choices). If you tune on the test set, scores look artificially high — a form of cheating the metric.

Split | Role (simple)
Training | Model updates its parameters to reduce error here.
Validation | Compare variants and hyperparameters without touching the test set.
Test | Final estimate of how the chosen model behaves on new-like data.
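
A split can be sketched in a few lines of plain Python. The 70 / 15 / 15 proportions and the seed value are assumptions made for this example, and the list of numbers simply stands in for 100 real examples.

```python
import random

rows = list(range(100))      # stand-in for 100 examples (row IDs 0..99)

rng = random.Random(42)      # fixed seed so the shuffle is repeatable
rng.shuffle(rows)            # shuffle first so each split is a fair mix

n_train = len(rows) * 70 // 100   # assumed 70% for training
n_val = len(rows) * 15 // 100     # assumed 15% for validation

train = rows[:n_train]
validation = rows[n_train:n_train + n_val]
test = rows[n_train + n_val:]     # the rest is held back for the final check

print(len(train), len(validation), len(test))   # 70 15 15
```

No row appears in two splits, which is what keeps the test score honest.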

Long walkthrough · Recommendations

  1. Collect signals: what you watched, how long, likes, skips, time of day, device — all become data.
  2. Clean & organise: engineers remove broken rows, align IDs, protect privacy (high-level only today).
  3. Features: turn raw logs into numbers the model can eat — e.g. “completion rate for cooking videos this week.”
  4. Train: model sees millions of user histories and learns associations: people who liked A often liked B.
  5. Evaluate: team checks on held-out users — does watch time go up without promoting harmful content? (Ethics + metrics.)
  6. Serve: when you open the app, the inference step scores candidate videos and shows the top few — in milliseconds.
What can break this pipeline? Examples: not enough data, wrong or leaky features, biased click patterns, brand-new users with no history (cold start), safety/policy filters that block certain content, or servers failing at inference time — all real engineering issues, not only “math.”

Generalisation

Generalisation means the model performs well on new examples drawn from the same kind of reality — not only on the rows it memorised. Overfitting = fits training noise; underfitting = too simple to capture real patterns. Domain shift = training data and real-world data differ (new city, new device, new slang), so performance drops even if the code is unchanged.
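
The difference between memorising and generalising can be shown with a toy comparison. All numbers below are made up for illustration, including the two "new" houses and the coefficients of the straight line. The `memoriser` only looks up rows it has seen and falls back to the average price otherwise.

```python
# (size in 100 sq.ft units, price in ₹ lakhs); toy values only.
train = [(8, 45), (9, 52), (11, 79), (12, 85), (15, 118)]
new = [(10, 66), (13, 97)]          # hypothetical houses not seen in training

memory = dict(train)
average_price = sum(price for _, price in train) / len(train)

def memoriser(size):
    # Perfect recall of training rows, no idea about anything else.
    return memory.get(size, average_price)

def simple_line(size):
    # A straight-line rule with assumed coefficients.
    return 10.6 * size - 40.8

def mean_abs_error(model, rows):
    return sum(abs(model(x) - y) for x, y in rows) / len(rows)

print(round(mean_abs_error(memoriser, train), 2))    # 0.0 on rows it memorised
print(round(mean_abs_error(memoriser, new), 2))      # much larger on new rows
print(round(mean_abs_error(simple_line, train), 2))  # small but not zero
print(round(mean_abs_error(simple_line, new), 2))    # stays small on new rows
```

A perfect training score tells you little by itself. What matters is the gap between error on training rows and error on rows the model never saw.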

Real examples
▶️
YouTube & Instagram Reels
Watch history → recommendations
  • You watch / skip / like → data
  • System finds patterns in behaviour → learning
  • It suggests the next video → prediction
✉️
Spam filter
Emails → spam or not
  • Millions of emails = data
  • ML learns patterns (words, links, senders)
  • New mail sorted into inbox vs spam → prediction
🗺️
Google Maps
Traffic → ETA
  • GPS + speed + history = data
  • Learns typical congestion patterns
  • Predicts arrival time → prediction
🍔
Swiggy · Zomato
Delivery & discovery
  • Past orders, location, time = data
  • “Restaurants you may like” / ETA for rider
  • Same pipeline: data → learn → predict
🎬
Netflix / Hotstar-style
Thumbnails, watch history, genre
  • Same recommendation story as short video — different UI, similar ML loop.
  • Personalisation + diversity rules (not only one genre forever).
🎙️
Voice assistants
Speech → intent
  • Audio waves → features → model predicts words then intent (“set alarm”).
  • Needs huge diverse recordings; accent bias is a real issue.
🏦
UPI / banking alerts
Unusual transaction patterns
  • Your normal spend pattern = baseline; spike at odd hour/location → risk score.
  • Often mix of rules + ML; explain to users simply: “we flag unusual activity.”
🔓
Face unlock
Image → match / no match
  • Camera image → model → “same person as enrolled template?” (classification-style).
  • Security + fairness: lighting, skin tone, masks — teams must test widely.
🌾
Agriculture (India context)
Crop stress, yield hints
  • Satellite or phone images + weather data → predict stress or estimate yield.
  • Helps extension workers prioritise field visits — still needs ground truth labels.
🌐
Translate / type-ahead
Text in → text out
  • Learns from parallel sentences (label = translation) + huge web text.
  • Keyboard next-word prediction = fast inference over language model scores.
FAQ · Does the model “remember” my video?

Not like a diary. It stores patterns (weights), not your exact clips. Services still log events for product and policy reasons — privacy is a separate, important topic (Module 7 touches deployment & ethics).

FAQ · Why can two people get different recommendations?

Different histories, locations, languages, and A/B tests. The model is personalising to your signals and the app’s business rules.

FAQ · Is this the same as ChatGPT?

Same big picture (data → train → predict), different model family and data. ChatGPT predicts the next piece of text; recommenders score items for you. We keep details for later modules.


Three families

Supervised is the most important for beginners — it matches “question + answer.” Reinforcement and unsupervised add other toolkits.

Compare: supervised has explicit targets; unsupervised discovers structure without provided labels.

Learning with answers

Each example includes the right output (label): e.g. “this email is spam,” “this house sold for ₹X.” The model learns to map inputs → labels. Think: teacher shows the class the answer key while practising.

Everyday supervised tasks:

  • House / rent price: past sales with size, area → predict price for a new listing.
  • Exam score estimate: hours studied, past grades → predict outcome (ethics: use carefully in education).
  • Spam / fraud: message + metadata → spam or legitimate; transaction → risk score.
  • Medical imaging assist: scan + doctor label → help highlight suspicious regions (high stakes — regulation matters).
  • Crop / leaf image: photo + disease label → warn farmer early.

Self-check: Where does the label come from? Humans, sensors, historical records, or rules — noisy labels teach noisy behaviour.

Learning without answers

Only the inputs are given. The algorithm looks for groups, patterns, or structure — like sorting people into similar taste clusters without naming the clusters first. You might later name a cluster “budget shoppers” after you inspect it.

Typical uses:

  • Customer segments for marketing — who browses but rarely buys?
  • Shopping baskets — which products appear together (for store layout)?
  • Anomaly detection — find machines that “look different” from normal vibration (often semi-supervised in industry).
  • Topic discovery in text — group news articles without reading all of them.

Pitfall: Clusters are mathematically real but need human interpretation — “cluster 3” is not automatically “good customers.”
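
For the curious, here is a rough sketch of grouping without labels. It splits made-up monthly spend values into two groups by repeatedly assigning each value to the nearest group centre and then recomputing the centres (the idea behind k-means, simplified to one feature and two groups). The data, the starting centres, and the five rounds are all assumptions for this example.

```python
# Hypothetical monthly spend per customer (₹). No labels are given.
spend = [200, 220, 250, 900, 950, 1000]

# Start with two guesses for the group centres: the smallest and largest value.
centres = [min(spend), max(spend)]

for _ in range(5):                      # a few rounds is enough for this toy data
    groups = [[], []]
    for value in spend:
        # Assign each customer to the nearest centre.
        if abs(value - centres[0]) <= abs(value - centres[1]):
            groups[0].append(value)
        else:
            groups[1].append(value)
    # Move each centre to the average of its group.
    centres = [sum(g) / len(g) for g in groups]

print(groups)    # [[200, 220, 250], [900, 950, 1000]]
```

The code finds two groups, but it cannot tell you what they mean. Calling them "budget shoppers" and "big spenders" is a human decision made after looking at the result.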

Learning by trial & reward

An agent tries actions, gets rewards or penalties, and improves over time — like scoring points in a game. There is often no single “correct label” per step; instead, many steps build toward a goal.

Where you may have seen this story:

  • Chess / Go / video game bots that improve through self-play or simulation.
  • Robots learning to walk in physics simulators — fall = penalty, forward motion = reward.
  • Some ad or recommendation systems nudge policies with long-term clicks (simplified — don’t overclaim).

Contrast with supervised: RL needs a defined environment and reward signal; supervised needs many input–output pairs. Both still need programming to set up.

Quick “which type?” — guess, then reveal

Cover the answers, decide supervised / unsupervised / reinforcement for each, then check the answer beneath each item.

A. Predict tomorrow’s temperature from past weather readings (numbers).

Usually supervised (regression): past days had known temperatures as labels.

B. Group news articles into themes without telling the system theme names.

Unsupervised clustering / topic modelling — labels not provided up front.

C. Drone learns to hover by trying motor speeds and staying stable.

Reinforcement learning — reward for stable hover, penalty for crash.

D. Detect credit-card fraud using past transactions marked fraud / not fraud.

Supervised classification (often with heavy class imbalance — later courses).

Type | You mainly have… | Typical question
Supervised | Inputs + labels | “What category / number for this new input?”
Unsupervised | Inputs only | “What groups or structure exist?”
Reinforcement | States, actions, rewards | “What policy maximises long-term reward?”
Footnote for curious students: “Semi-supervised” uses a small labelled set + lots of unlabelled data; “self-supervised” builds labels from the data itself (common in modern language models). Names are less important today than the three main families.

Data, features, model, training, prediction

End-to-end language people use in ML teams: pipeline, splits, loss, epochs, metrics, errors, bias — still intuitive, no heavy formulas.

Extended pipeline — features refine raw data; training updates the model before prediction.

Symbol / word (common in courses) | Meaning (beginner)
Input x | One example’s features (one row of data).
Label y | Correct output for supervised learning (class or number).
Prediction ŷ (“y-hat”) | What the model outputs for that input.
Dataset | Many (x, y) pairs or many x alone (unsupervised).

Self-check: in your own words, trace the extended pipeline from “raw logs” to “prediction” in one short paragraph.

1 · Data — fuel of AI

Images, text, clicks, GPS traces, audio… quality and quantity both matter.

2 · Features (very important)

Not every byte goes into the model directly. We pick meaningful inputs — for house price: size, location, number of rooms, floor, age…

Garbage in, garbage out: If data is wrong, incomplete, or collected unfairly, the fanciest model cannot invent truth. Cleaning data — removing duplicates, fixing typos, handling missing values — is a huge part of real ML work (often more than choosing an algorithm).

Noise & outliers: Wrong labels, sensor glitches, or one-off extreme values can pull a model off course. Teams use cleaning rules, robust losses, or outlier detection — you only need the idea that not every row is equally trustworthy.

Raw data vs features — two more examples

✉️

Email → spam features

Raw: full message bytes. Features might include: count of “free/offer/click,” presence of suspicious links, sender reputation score, time since account created — not the whole email dumped naively into one number.
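
A feature extractor for this story might look like the sketch below. The function name, the trigger-word set, and the choice of three features are assumptions for illustration. Sender reputation and account age are left out because they would need data from outside the message.

```python
TRIGGER_WORDS = {"free", "offer", "click"}   # assumed word list for this example

def extract_features(message):
    """Turn raw email text into a short list of numbers a model could use."""
    lowered = message.lower()
    words = lowered.split()
    # Strip simple punctuation so "offer!" still counts as "offer".
    cleaned = [w.strip(".,!?:;") for w in words]

    trigger_count = sum(1 for w in cleaned if w in TRIGGER_WORDS)
    has_link = int("http://" in lowered or "https://" in lowered)
    word_count = len(words)
    return [trigger_count, has_link, word_count]

features = extract_features("FREE offer! Click here: http://example.com")
print(features)    # [3, 1, 5]
```

The model never sees the email itself, only these three numbers. Choosing which numbers to compute is the feature design step.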

🖼️

Image → simple view

Raw: grid of pixel brightness/colour. Deep models learn their own internal features; classical ML might use hand-crafted summaries (edges, colours). In short: pixels are data; the model gradually builds useful summaries.

Size (sq.ft)
Location (area / pincode)
Rooms
Age of building
[Diagram: Features (size, location, rooms) → Model → Predicted price (₹)]

House price story — features feed the model; output is a predicted price.

Loss, epochs, batches (how training “runs”)

Loss (error): A number that says how wrong the model is on the examples it is training on — e.g. “predicted price minus true price, squared.” Training tries to reduce loss step by step.

Epoch: One full pass through the training set (or through the sampling plan used in practice). Often you need many epochs; too many can encourage overfitting unless you regularise or stop early.

Batch / mini-batch: Training is usually done on small chunks of examples at a time (e.g. 32 or 256 rows) for speed and stable updates, rather than one row at a time or, on huge datasets, the entire dataset at once.

Term | One-line memory hook
Loss | “How bad are we?” Training minimises it.
Epoch | “Saw the whole training set once.”
Batch | “Small pack of examples per update step.”
Learning rate | “Step size” when adjusting the model; too big can diverge, too small can be slow (Module 3).
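
The four terms fit together in one small training loop. This is a sketch under simple assumptions: a one-number model `y = w * x`, squared-error loss, made-up data where the true answer is `w = 2`, and hand-picked values for learning rate, batch size, and epochs. Real training code shuffles the data and uses a library, but the shape of the loop is the same.

```python
xs = [1.0, 2.0, 3.0, 4.0]
ys = [2.0, 4.0, 6.0, 8.0]      # made-up data: the true relationship is y = 2 * x

w = 0.0                # the model's single parameter, starting from a bad guess
learning_rate = 0.01   # step size (assumed value)
batch_size = 2         # examples per update step (assumed value)
epochs = 50            # full passes over the training set (assumed value)
loss_history = []

for epoch in range(epochs):
    # One epoch = every mini-batch seen once.
    for start in range(0, len(xs), batch_size):
        batch_x = xs[start:start + batch_size]
        batch_y = ys[start:start + batch_size]
        # Gradient of the mean squared error on this batch with respect to w.
        grad = sum(2 * (w * x - y) * x
                   for x, y in zip(batch_x, batch_y)) / len(batch_x)
        w = w - learning_rate * grad      # small step that reduces the loss
    # Loss on the whole training set after this epoch.
    loss = sum((w * x - y) ** 2 for x, y in zip(xs, ys)) / len(xs)
    loss_history.append(loss)

print(round(w, 3))                              # close to 2.0
print(loss_history[0] > loss_history[-1])       # True: the loss went down
```

Try changing `learning_rate` to 0.2 in your head: each step overshoots, which is the "too big can diverge" warning in the table.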

Bias vs variance (picture in words)

Idea | Beginner explanation
High bias (underfitting) | Model too simple: misses real patterns even on training data (like using a straight line for a curved trend).
High variance (overfitting) | Model too flexible: fits training noise; great on train, worse on new data.
Goal | Balance: capture real signal, ignore noise. More data, better features, a simpler model, or regularisation help.

3 · Model

The learned “brain” that turns inputs into outputs — can be simple or complex.

4 · Training

The process of adjusting the model using data (learning).

5 · Prediction

The output on new data: class (spam/not), price, ETA, recommended dish, etc.

Training data vs “new” data

We usually split the world into examples the model studied during training and examples reserved to check if it generalises. If it only memorises training examples, it may fail on new people, new cities, or new slang — that failure mode is often called overfitting (rote learning). A model that is too simple and misses real patterns is sometimes called underfitting.

Train vs held-out / new data — if the real world drifts away from training (domain shift), error rises even with the same code.

Metrics — how we score predictions

Accuracy = fraction of examples where prediction matches the label. Easy to read but misleading when one class is rare (e.g. 99% “not fraud” — a dumb model that always says “not fraud” gets 99% accuracy and catches zero fraud).
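
The "dumb model" trap is easy to reproduce. The sketch below uses made-up labels (990 normal transactions, 10 fraud) and a hypothetical model that always answers "not fraud".

```python
# Made-up labels: 0 = not fraud, 1 = fraud. Fraud is rare (10 in 1000).
labels = [0] * 990 + [1] * 10

# A "model" that never looks at the input and always says "not fraud".
predictions = [0] * len(labels)

correct = sum(1 for p, y in zip(predictions, labels) if p == y)
accuracy = correct / len(labels)

fraud_caught = sum(1 for p, y in zip(predictions, labels) if p == 1 and y == 1)

print(accuracy)        # 0.99
print(fraud_caught)    # 0
```

A 99% score with zero fraud caught is why accuracy alone is not enough when one class is rare.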

Precision (for a class): when the model says “yes,” how often was it right? Recall: of all real “yes” cases, how many did we catch? Trade-offs matter in medicine, spam, and safety.

Confusion matrix (spam example, 100 emails)
 | Predicted: not spam | Predicted: spam
Actually not spam | 88 (good) | 2 (false alarm)
Actually spam | 5 (missed spam) | 5 (caught)

Toy counts: accuracy = 93/100 = 93%, but missed 5/10 spam → recall for spam = 50%. Numbers chosen to show accuracy alone can hide pain points.
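
The same toy counts can be turned into all three metrics with a few lines of arithmetic. The variable names are just labels chosen for this sketch; the four counts come from the confusion matrix above.

```python
# Counts from the toy spam confusion matrix (100 emails).
true_negative = 88    # actually not spam, predicted not spam
false_positive = 2    # actually not spam, predicted spam (false alarm)
false_negative = 5    # actually spam, predicted not spam (missed spam)
true_positive = 5     # actually spam, predicted spam (caught)

total = true_negative + false_positive + false_negative + true_positive

accuracy = (true_positive + true_negative) / total
precision = true_positive / (true_positive + false_positive)   # when we say spam, how often right?
recall = true_positive / (true_positive + false_negative)      # of real spam, how much caught?

print(accuracy)              # 0.93
print(round(precision, 2))   # 0.71
print(recall)                # 0.5
```

Precision here is 5 out of 7 spam predictions, and recall is 5 out of 10 real spam emails, so the 93% accuracy hides that half the spam gets through.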

Metric | When teams care a lot
Accuracy | Balanced classes and similar mistake costs.
Precision | False alarms are expensive (e.g. blocking legit transactions).
Recall | Missing a positive is dangerous (disease, fraud, safety).
RMSE / MAE | Regression: how far off predicted numbers are from true values.
Baseline: Before celebrating a model, compare to a trivial rule — e.g. “always predict the most common class” or “always predict average house price.” If ML does not beat a simple baseline, something may be wrong with features, data, or labels.
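
A baseline check for a number-prediction task might look like this sketch. All values are invented: the delivery times in minutes, the "model" predictions, and the function names `mae` and `rmse`. The baseline simply predicts the average training value every time.

```python
import math

train_times = [30, 35, 40, 25, 45]   # made-up past delivery times (minutes)
test_true = [28, 42, 50]             # made-up true times for held-out orders
model_preds = [30, 40, 47]           # hypothetical predictions from some trained model

# Trivial baseline: always predict the average of the training data.
baseline_value = sum(train_times) / len(train_times)
baseline_preds = [baseline_value] * len(test_true)

def mae(preds, truths):
    """Mean absolute error: average size of the miss."""
    return sum(abs(p - t) for p, t in zip(preds, truths)) / len(truths)

def rmse(preds, truths):
    """Root mean squared error: like MAE but punishes big misses more."""
    return math.sqrt(sum((p - t) ** 2 for p, t in zip(preds, truths)) / len(truths))

print(round(mae(model_preds, test_true), 2), round(mae(baseline_preds, test_true), 2))
print(round(rmse(model_preds, test_true), 2), round(rmse(baseline_preds, test_true), 2))
```

In this toy case the model's error is well below the baseline's on both metrics. If the two numbers were close, that would be the signal to re-check features, data, or labels before trusting the model.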

Classification (categories) | Regression (numbers)
Yes / No, Spam / Not spam | House price, exam score, delivery time
Discrete labels | Continuous values
✔️

Good model

Correct or useful on new real cases — not only on examples it memorised.

⚠️

Bad model

Often wrong predictions, or only works on a tiny test set — misleading in practice.

⚖️

Bias (simple)

If data is wrong or unfair, the AI copies that. Example: biased hiring data → unfair suggestions. Fix data and process, not only code.

Good vs bad — extend the idea

Memorisation trick: A model that remembers every training email verbatim might score 100% on those emails but fail on new spam styles — looks “good” in a demo, bad in production.
Fairness angle: If face data mostly came from one skin tone, accuracy drops for others — not always a “bug” in code; often a data coverage problem plus testing gaps.
Explainability (one line): Stakeholders ask “why this prediction?” Simple models sometimes answer more easily; complex models may need special tools — still an active area.
Deepening (optional) · Cost of wrong predictions

Not all mistakes cost the same. False alarm on spam (inbox mail lost) vs missed spam (phishing) vs wrong medical triage — different costs. Teams choose metrics and safeguards accordingly. This connects to ethics and product design (Module 7).


Try it yourself

Short exercises you can do alone or with a study group. Local apps (Swiggy, Zomato, Instagram, UPI) are good examples when you relate ideas to real life.

Activity 1 · Name the ML type

Guess, then reveal.

Spam filter — labelled emails.

Supervised learning (data + labels).

Customer grouping — no segment names given, only behaviour.

Unsupervised (find groups / patterns).

Activity 2 · Design AI for a food app

List ideas you might use in a food-delivery app:

Follow-up: For each idea, note what data you need, what prediction you output, and whether it is likely supervised, unsupervised, or RL.

Activity 3 · Pipeline order

These steps are shuffled. Arrange them mentally or on paper: Prediction, Training, Raw logs, Features, Model file. (Optional distractor: “Internet memes” — not a pipeline stage.)

Raw logs → Features → Training (updates model) → Model file → Prediction on new data. (In practice, loops and monitoring repeat — this linear order is the teaching spine.)

Activity 4 · Find the feature

For student marks prediction, which of these are features, which is the label, and which are not useful: sleep, attendance, favourite colour, final exam score?

Label: final exam score. Likely features: sleep, attendance (if measured well). Favourite colour — probably weak unless spurious correlation; good moment to discuss causation vs correlation lightly.

Activity 5 · Ethical pause

Scenario: College trains a model on past admission data that mostly accepted one gender or one region historically. Should we deploy it unchanged?

No simple “yes.” Historical bias replicates; need policy, fairness review, maybe new data collection, human oversight — tie back to Part 5 bias card.

Activity 6 · Quick review questions

  1. In one sentence: what is machine learning?
  2. Name one supervised example from today.
  3. Name one way data can make AI unfair.
After you finish: Skim one link from the module summary references. Next module (Module 3) names concrete algorithms — still mostly ideas first.
🚀 Closing line to remember:
Programming gives instructions. Machine learning gives intelligence from data. Data gives power — and responsibility.

Quick Knowledge Check

10 questions on ML basics: paradigms, features, loss, data, bias, and train/test splits. Instant feedback on every answer.


Key Takeaways

Module 2 distilled: Python logic, ML pipeline, splits, loss and epochs, metrics and baselines, three ML types, bias, and self-check activities.

📚 Further reading:
• Python tutorial (first chapters) — docs.python.org/3/tutorial/
• Google: “Introduction to Machine Learning” — developers.google.com
• scikit-learn: supervised vs unsupervised — scikit-learn.org