Cassandra — Built for When It Must Not Go Down

CS 6500 — Week 11, Session 1

CS 6500 — Big Data Analytics | Week 11

The Driving Question

"A network partition splits a bank's ATM network into two regions that can't communicate with each other for 30 seconds. If the system prioritizes consistency (CP), all ATMs stop working. If it prioritizes availability (AP), ATMs keep working but might briefly show inconsistent balances. Which failure mode would you rather design around?"


The Cassandra Answer

Cassandra chose availability over consistency by default — and gave operators a dial to tune the trade-off per query.

Session 1: Architecture and Data Model
Ring topology, consistent hashing, and gossip — how Cassandra eliminates single points of failure — then CQL: the SQL that isn't SQL

Session 2: ScyllaDB, Modeling, and Choosing
Query-first schema design, ScyllaDB's C++ rewrite for lower tail latency, and the three-way NoSQL decision framework

The ring architecture Cassandra introduced is now the baseline design for every global-scale database: DynamoDB, CosmosDB, and ScyllaDB all inherited it.


Week 10 Recap

Redis and MongoDB — two answers to the same question: "SQL leaves a gap"

Redis MongoDB
Model Key-value + structures JSON documents
CAP CP (Redis Cluster) CP (replica set primary)
When Sub-ms latency, sessions, pub/sub Flexible schema, rich queries

Today's contrast: MongoDB elects a new primary after failure (seconds of downtime). Cassandra never has a primary — every node is equal.


Ring Architecture

How Cassandra eliminates the single point of failure


The Availability Challenge

Systems with a master node have an Achilles heel

MongoDB replica set:
  [Primary] ← all writes go here
  [Secondary 1]  [Secondary 2]

Primary fails → election takes 10–30 seconds → writes blocked

Cassandra's answer: there is no master.

Cassandra ring:
  [Node A] ── [Node B] ── [Node C] ── [Node D]
     ↑                                    │
     └────────────────────────────────────┘

Any node can accept any write. No election. No downtime.

Consistent Hashing

Every partition key is hashed to a position on a ring (0 → 2¹²⁷ in the original partitioner; the default Murmur3 partitioner uses a signed 64-bit range — the principle is identical)

Token ring (0 → 2¹²⁷, wraps around):

  Node A owns tokens:  0   –  25%
  Node B owns tokens: 25%  –  50%
  Node C owns tokens: 50%  –  75%
  Node D owns tokens: 75%  – 100%

  "user_123" → hash → token 18%  → stored on Node A
  "user_456" → hash → token 61%  → stored on Node C

Why not hash(key) % N?

  • Add a 5th node: % 5 relocates ~80% of all keys
  • Consistent hashing: only ~1/N of keys move to the new node
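To make the contrast concrete, here is an illustrative Python sketch — a toy hash ring using MD5, not Cassandra's actual Murmur3 partitioner — comparing how many keys relocate under each scheme when a fifth node joins:

```python
import hashlib

def token(key: str) -> int:
    # Stable 128-bit hash standing in for a partitioner token (not Murmur3)
    return int(hashlib.md5(key.encode()).hexdigest(), 16)

RING = 2 ** 127
keys = [f"user_{i}" for i in range(10_000)]

# Naive placement: hash % N — adding a 5th node changes most assignments
moved_mod = sum(1 for k in keys if token(k) % 4 != token(k) % 5)

# Consistent hashing: 4 evenly spaced nodes, then add a 5th at the 12.5% mark
old_nodes = [i * RING // 4 for i in range(1, 5)]   # 25%, 50%, 75%, 100%
new_nodes = sorted(old_nodes + [RING // 8])

def owner(tok: int, nodes: list) -> int:
    # First node clockwise from the token owns the key
    return next(n for n in nodes if tok % RING <= n)

moved_ring = sum(
    1 for k in keys
    if owner(token(k), old_nodes) != owner(token(k), new_nodes)
)

print(f"mod-N:      {moved_mod / len(keys):.0%} of keys moved")   # ~80%
print(f"consistent: {moved_ring / len(keys):.0%} of keys moved")  # ~1/8 of keys
```

Only keys falling between the new node and its ring predecessor move — everything else stays put.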

Replication Strategy

Every row is stored on multiple nodes — controlled by replication factor

CREATE KEYSPACE ecommerce
  WITH replication = {
    'class': 'SimpleStrategy',
    'replication_factor': 3
  };
  • RF = 3: each row exists on 3 consecutive nodes on the ring
  • Coordinator receives write → forwards to RF replica nodes
  • Can survive RF − 1 node failures without data loss

Replication: Strategy Selection

Strategy When to use
SimpleStrategy Single datacenter (dev/test)
NetworkTopologyStrategy Multi-DC production (specify RF per DC)

Gossip Protocol

How does every node know who's alive without a master?

Every second, each node randomly contacts 1–3 peers and exchanges state:

Node A → Node C: "I've heard B is slow; D looks healthy"
Node C → Node B: "A says you're slow; I've seen no issues"
Node B → Node A: "I'm fine; here's my updated heartbeat"
  • Cluster state converges in O(log N) gossip rounds
  • No central coordinator → no single point of failure for cluster state
  • Failure detection: node stops responding to gossip → marked DOWN automatically

Practical impact: a Cassandra cluster can lose a node at 3 AM with no pages — the cluster routes around it until the node is replaced.
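The O(log N) convergence claim can be sanity-checked with a toy push-only epidemic simulation (illustrative Python — real gossip exchanges richer state digests, but the spreading dynamics are similar):

```python
import random

def gossip_rounds(n_nodes: int, seed: int = 1) -> int:
    # Each round, every node that already knows about a state change
    # tells one random peer. Count rounds until the whole cluster knows.
    rng = random.Random(seed)
    informed = {0}
    rounds = 0
    while len(informed) < n_nodes:
        for node in list(informed):
            informed.add(rng.randrange(n_nodes))
        rounds += 1
    return rounds

for n in (10, 100, 1000):
    print(f"{n:5d} nodes -> {gossip_rounds(n)} rounds")
```

Rounds grow roughly like log₂(N): a 10× larger cluster only needs a handful of extra gossip rounds.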


Write Path

What happens when a client writes a row?

Client → writes to any node (that node = Coordinator)
              │
              ▼
Coordinator hashes partition key → identifies 3 replica nodes
              │
    ┌─────────┼─────────┐
    ▼         ▼         ▼
 Node A    Node B    Node C
    │         │         │
 1. Commit log (WAL — durable write to disk)
 2. Memtable (in-memory sorted structure)
 3. ACK to coordinator per consistency level
              │
Coordinator → ACK to client

Background (async): memtable flushes to immutable SSTable on disk; compaction merges SSTables over time.
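The three steps above can be sketched in a few lines (a toy in-memory model — real commit logs and SSTables are on-disk structures with checksums, indexes, and bloom filters):

```python
import bisect

commit_log = []   # append-only WAL: sequential disk writes (durability)
memtable = []     # sorted in-memory structure (fast writes)
sstables = []     # immutable sorted runs "on disk"

def write(key, value):
    commit_log.append((key, value))        # 1. durable append
    bisect.insort(memtable, (key, value))  # 2. in-memory sorted insert

def flush():
    # Background: freeze the memtable into an immutable SSTable
    sstables.append(tuple(memtable))
    memtable.clear()

for k, v in [("b", 2), ("a", 1), ("c", 3)]:
    write(k, v)
flush()
print(sstables[0])  # (('a', 1), ('b', 2), ('c', 3))
```

Note that every disk write is an append — this is why Cassandra's write throughput is so high.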


Read Path

What happens when a client reads a row?

Client → Coordinator hashes partition key → contacts replica nodes
              │
    ┌─────────┼─────────┐
    ▼         ▼         ▼
 Node A    Node B    Node C
    │
Fastest replica responds to coordinator → returns to client

Background read repair: if replicas disagree (e.g., one replica missed a recent write or delete), Cassandra merges by write timestamp (last-write-wins) — the most recently written value wins.

Write-heavy implication: reads are slightly more expensive than writes because Cassandra may need to merge the memtable with multiple SSTables on disk (mitigated by choosing an appropriate compaction strategy).
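The timestamp merge is easy to sketch (illustrative Python; each column value is modeled as a (value, write_timestamp) pair):

```python
# One logical row's columns, scattered across a memtable and two SSTables.
sstable_1 = {"total": (249.99, 100), "status": ("pending", 100)}
sstable_2 = {"status": ("paid", 200)}
memtable  = {"status": ("shipped", 300)}

merged = {}
for fragment in (sstable_1, sstable_2, memtable):
    for col, (value, ts) in fragment.items():
        # Last-write-wins: keep the value with the newest timestamp
        if col not in merged or ts > merged[col][1]:
            merged[col] = (value, ts)

print({col: value for col, (value, _) in merged.items()})
# {'total': 249.99, 'status': 'shipped'}
```

The read path pays this merge cost on every request; compaction shrinks the number of fragments over time.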


Tunable Consistency

One cluster, many trade-offs — per query


Consistency Levels

Both reads and writes independently configurable — per query

Level Replicas must respond Notes
ONE 1 Fastest; stale reads possible
TWO 2 Rarely used directly
QUORUM RF/2 + 1 majority Common production choice
LOCAL_QUORUM Majority within local DC Multi-DC best practice
ALL All RF nodes Strongest; cluster degraded on any failure
-- Per-query consistency (cqlsh; application drivers set this per statement)
CONSISTENCY QUORUM;
SELECT * FROM orders_by_customer WHERE customer_id = :id;

Strong Consistency Formula

R + W > RF guarantees that the replicas consulted on a read overlap the replicas that acknowledged the write in at least one node

RF Write CL W nodes Read CL R nodes R+W Strong?
3 QUORUM 2 QUORUM 2 4 ✅ yes
3 ONE 1 ONE 1 2 ❌ no
3 ALL 3 ONE 1 4 ✅ yes (but ALL writes block on any node failure)
5 QUORUM 3 QUORUM 3 6 ✅ yes
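The table above reduces to a one-line predicate; a small checker (illustrative Python) makes it easy to try other RF/CL combinations:

```python
def quorum(rf: int) -> int:
    # Majority of replicas: floor(RF / 2) + 1
    return rf // 2 + 1

def replicas_for(level: str, rf: int) -> int:
    return {"ONE": 1, "TWO": 2, "QUORUM": quorum(rf), "ALL": rf}[level]

def strongly_consistent(rf: int, write_cl: str, read_cl: str) -> bool:
    # R + W > RF guarantees the read and write replica sets overlap
    return replicas_for(write_cl, rf) + replicas_for(read_cl, rf) > rf

print(strongly_consistent(3, "QUORUM", "QUORUM"))  # True  (2 + 2 > 3)
print(strongly_consistent(3, "ONE", "ONE"))        # False (1 + 1 = 2)
print(strongly_consistent(3, "ALL", "ONE"))        # True  (3 + 1 > 3)
print(strongly_consistent(5, "QUORUM", "QUORUM"))  # True  (3 + 3 > 5)
print(strongly_consistent(3, "ONE", "QUORUM"))     # False (1 + 2 = 3, not > RF)
```

The last line is the Write=ONE + Read=QUORUM production pattern: fast, but deliberately not strongly consistent.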

Consistency: Production Pattern

Common production pattern: Write=ONE + Read=QUORUM

  • Fast writes (ack from 1 replica)
  • Consistent reads (majority checked)
  • R + W = 1 + 2 = 3 = RF → not strongly consistent (requires R+W > RF); trades correctness for write speed

Cassandra Is AP — but Tunable

Default behavior (ONE/ONE): AP — prefers availability during partition

Network partition splits cluster:
  Side A: [Node 1, Node 2]   Side B: [Node 3]

  MongoDB CP: Node 3 refuses writes (can't reach majority)
  Cassandra AP: Both sides keep accepting writes
               → reconciled by last-write-wins on recovery

Tuning toward CP (QUORUM/QUORUM): strongly consistent in practice (R + W > RF) — but writes block if fewer than ⌊RF/2⌋ + 1 replicas are reachable.

Design rule: choose AP when downtime costs more than temporary inconsistency (IoT telemetry, clickstreams, global leaderboards). Choose CP-mode when stale reads are unacceptable (financial balances — though most fintech uses dedicated CP databases for this).


When to Use Cassandra

Use Cassandra when…

  • Write throughput is the primary constraint (millions of events/sec)
  • The system must stay up during network partitions or node failures
  • Query patterns are known and stable at design time
  • Data is time-series, event logs, or user activity (high write, low read complexity)
  • You need a multi-datacenter active-active deployment

Examples: IoT sensor ingest, clickstream logging, global leaderboards, messaging metadata, time-series metrics

Avoid Cassandra when…

  • Query patterns are exploratory or change frequently (use a data warehouse)
  • You need joins, aggregations across partitions, or ad-hoc analytics
  • Strong consistency is non-negotiable and latency budget is tight (use PostgreSQL or a CP store)
  • Data volume is modest — Cassandra's operational overhead is high for small datasets
  • You need transactions across multiple rows or tables

Examples: financial ledger, order management requiring ACID, reporting dashboards


HBase

Same data model, opposite CAP choice


What Is HBase?

A wide-column store built on top of HDFS — random read/write access into Hadoop data

Row key: "user_123"
  Column family: profile      Column family: activity
    name       → "Alice"        last_login  → "2025-03-14"
    email      → "a@co.com"     page_views  → "4821"
    city       → "Chicago"      purchases   → "17"
  • Wide-column model: rows have a row key; columns are grouped into column families; each cell is versioned by timestamp
  • Sparse by design: columns are dynamic — a row only stores the column qualifiers it actually has; no nulls, no wasted space
  • Built on HDFS: data files live in HDFS; HBase adds its own indexed file format (HFile) and a write-ahead log to support random access
  • CP database: each region is served by exactly one RegionServer, with the HMaster coordinating assignment — strong consistency, no eventual reconciliation
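The cell model can be sketched with a plain dictionary (illustrative Python — real HBase stores cells in sorted HFiles, but the addressing is the same: row key + column family + column qualifier + timestamp):

```python
cells = {}

def put(row, family, qualifier, value, ts):
    # A cell is addressed by (row key, column family, column qualifier);
    # each write adds a new timestamped version rather than overwriting.
    cells.setdefault((row, family, qualifier), []).append((ts, value))

put("user_123", "profile", "name", "Alice", 1)
put("user_123", "activity", "page_views", "4820", 1)
put("user_123", "activity", "page_views", "4821", 2)  # new version; old kept

# Latest version of a cell = highest timestamp
latest = max(cells[("user_123", "activity", "page_views")])
print(latest)  # (2, '4821')

# Sparse by design: a qualifier that was never written simply has no cell
print(("user_123", "profile", "email") in cells)  # False
```

Nothing is stored for absent columns — this is what "sparse by design" means in practice.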

What Is HBase?

HBase is Google Bigtable's open-source descendant — the same model that inspired Cassandra, but with opposite availability trade-offs.


HBase vs. Cassandra

Both are wide-column stores. The architectures couldn't be more different.

Cassandra (AP)

  • Leaderless ring — every node equal
  • Keeps accepting writes during a partition
  • Eventual consistency by default
  • Built for global-scale, high-availability workloads

HBase (CP)

  • Each region has a single owning RegionServer (the HMaster assigns regions)
  • Pauses writes to affected regions during failover — never accepts divergent writes
  • Strong consistency guaranteed
  • Built on HDFS — lives inside the Hadoop ecosystem

The rule: if your data already lives in HDFS and you need random read/write access with strong consistency — HBase. If you need always-on availability across datacenters — Cassandra.


When HBase Wins

HBase's niche: Hadoop-native storage with strong consistency

Scenario Why HBase Why Not Cassandra
Random reads into a 10 TB HDFS dataset Reads from HDFS directly; no ETL Cassandra stores its own data; can't query HDFS
Sparse rows (most columns empty per row) Column qualifiers are dynamic; only non-null values stored Cassandra tables have fixed columns
MapReduce / Spark jobs that also need point lookups Native Hadoop integration Requires separate connector
Strong consistency required (no "last-write-wins") CP: master ensures one truth AP by default; reconciles after the fact

Industry context: HBase was inspired by Google's Bigtable paper and was foundational at Facebook (Messages). Today Cassandra/ScyllaDB dominate new wide-column deployments — but HBase remains the answer when Hadoop integration is non-negotiable.


CQL — The SQL That Isn't

Tables, partition keys, and why WHERE status = 'pending' fails


CQL vs SQL

CQL looks familiar — but the constraints are fundamentally different

Feature SQL (Postgres) CQL (Cassandra)
Joins Yes ❌ None
Subqueries Yes ❌ None
Arbitrary WHERE Yes ❌ Partition key required
Aggregation across partitions Yes ❌ Very limited
Schema evolution ALTER TABLE ALTER TABLE (limited)
Updates In-place row update Upsert (INSERT = UPDATE)

The mental shift: in SQL, you design tables to model data. In CQL, you design tables to serve specific query patterns — one table per query.


Table Anatomy

Three types of columns — each with a distinct role

CREATE TABLE orders_by_customer (
    customer_id  UUID,       ← Partition key: which node?
    order_date   DATE,       ← Clustering column: sort order on disk
    order_id     UUID,       ← Clustering column: uniqueness within partition
    total        DECIMAL,    ← Regular column: data payload
    status       TEXT,       ← Regular column: data payload
    PRIMARY KEY ((customer_id), order_date, order_id)
) WITH CLUSTERING ORDER BY (order_date DESC, order_id ASC);
Column type Purpose Queryable?
Partition key (customer_id) Routes to node(s) Required in every WHERE
Clustering columns order_date, order_id Sort order within partition Range queries allowed
Regular columns total, status Data stored in row Only with ALLOW FILTERING

Demo: Connect

# Connect to the Cassandra container
docker exec -it cassandra cqlsh

# Check cluster health from outside
docker exec -it cassandra nodetool status
-- Verify you're in
DESCRIBE KEYSPACES;

-- Create our working keyspace
CREATE KEYSPACE IF NOT EXISTS ecommerce
  WITH replication = {
    'class': 'SimpleStrategy',
    'replication_factor': 1
  };

USE ecommerce;

Demo: Create Table/Insert

-- Table optimized for: "all orders for customer X, newest first"
CREATE TABLE IF NOT EXISTS orders_by_customer (
    customer_id  UUID,
    order_date   DATE,
    order_id     UUID,
    total        DECIMAL,
    status       TEXT,
    PRIMARY KEY ((customer_id), order_date, order_id)
) WITH CLUSTERING ORDER BY (order_date DESC, order_id ASC);

-- Insert rows
INSERT INTO orders_by_customer (customer_id, order_date, order_id, total, status)
VALUES (11111111-1111-1111-1111-111111111111, '2025-03-15', uuid(), 249.99, 'completed');

INSERT INTO orders_by_customer (customer_id, order_date, order_id, total, status)
VALUES (11111111-1111-1111-1111-111111111111, '2025-02-20', uuid(), 89.00, 'completed');

Demo: Query Patterns

-- ✅ Efficient: partition key in WHERE
SELECT * FROM orders_by_customer
WHERE customer_id = 11111111-1111-1111-1111-111111111111;

-- ✅ Efficient: range on clustering column
SELECT * FROM orders_by_customer
WHERE customer_id = 11111111-1111-1111-1111-111111111111
  AND order_date >= '2025-03-01';

-- ❌ Fails — no partition key
SELECT * FROM orders_by_customer WHERE status = 'completed';
-- ERROR: Cannot execute this query as it might involve data filtering

-- ⚠ ALLOW FILTERING: full cluster scan — never in production
SELECT * FROM orders_by_customer WHERE status = 'completed' ALLOW FILTERING;

ALLOW FILTERING

What ALLOW FILTERING does:

Without partition key → Cassandra doesn't know which nodes hold the data
→ Must contact EVERY node in the cluster
→ Each node scans its entire SSTable for matching rows
→ O(total rows in cluster) for every query
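A toy model (illustrative Python) shows the asymmetry: a partition-key lookup touches one partition, while a filter on a non-key column must examine every row:

```python
rows = [
    {"customer_id": f"c{i % 1000}", "status": "completed" if i % 3 else "pending"}
    for i in range(100_000)
]

# Partition-key lookup: rows are already grouped by partition key,
# so the coordinator reads exactly one partition
by_customer = {}
for r in rows:
    by_customer.setdefault(r["customer_id"], []).append(r)
hit = by_customer["c42"]

# ALLOW FILTERING: no partition key, so every row must be examined
pending = [r for r in rows if r["status"] == "pending"]

print(len(hit), len(pending))  # one small partition vs a full scan
```

The scan cost grows with total data size; the partition lookup does not — that gap only widens as the cluster grows.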

ALLOW FILTERING: The Fix

Design a table for the query instead

-- New table: optimized for "all orders with a given status"
CREATE TABLE orders_by_status (
    status       TEXT,
    order_date   DATE,
    order_id     UUID,
    customer_id  UUID,
    total        DECIMAL,
    PRIMARY KEY ((status), order_date, order_id)
) WITH CLUSTERING ORDER BY (order_date DESC);

-- Application writes to BOTH tables on every order event

Disk is cheap. Cluster-wide scans are not.
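The dual-write pattern can be sketched with in-memory stand-ins for the two tables (illustrative Python — a real application would issue two INSERT statements through the driver):

```python
from collections import defaultdict

# Toy stand-ins for the two denormalized tables
orders_by_customer = defaultdict(list)  # partition key: customer_id
orders_by_status = defaultdict(list)    # partition key: status

def place_order(customer_id, order_id, order_date, total, status):
    # One logical event, two writes — denormalization by design
    row = {"order_id": order_id, "order_date": order_date,
           "customer_id": customer_id, "total": total, "status": status}
    orders_by_customer[customer_id].append(row)
    orders_by_status[status].append(row)

place_order("cust-1", "ord-1", "2025-03-15", 249.99, "completed")
place_order("cust-1", "ord-2", "2025-03-16", 89.00, "pending")
place_order("cust-2", "ord-3", "2025-03-16", 12.50, "pending")

# Both query patterns are now single-partition lookups
print(len(orders_by_customer["cust-1"]))  # 2
print(len(orders_by_status["pending"]))   # 2
```

The application owns the consistency between the two tables — Cassandra does not keep them in sync for you.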


Activity

Partition key design — pairs


Activity: Design the Right PRIMARY KEY

Pairs | 8 minutes

A social media platform stores posts. The required query patterns are:

  1. All posts by a specific user, newest first
  2. All posts by a specific user on a specific date

Activity: Evaluate the Options

For each design below, decide: Works / Fails / Works but has a flaw

-- Option A
PRIMARY KEY (post_id)

-- Option B
PRIMARY KEY ((user_id), created_at)
-- With CLUSTERING ORDER BY (created_at DESC)

-- Option C
PRIMARY KEY ((user_id, created_date), created_at)
-- With CLUSTERING ORDER BY (created_at DESC)

-- Option D
PRIMARY KEY ((user_id), created_at, post_id)
-- With CLUSTERING ORDER BY (created_at DESC)

Activity Debrief

Option Query 1 Query 2 Notes
A (post_id) ❌ Fails — no user_id ❌ Fails Optimizes for "look up one post by ID" only
B ((user_id), created_at) ✅ Works ✅ Works But created_at alone is not unique — two posts at same second collide
C ((user_id, created_date), created_at) ❌ Fails — must supply date for Query 1 ✅ Works Bounds partition size (good!) but breaks Query 1
D ((user_id), created_at, post_id) ✅ Works ✅ Works Best choice — unique, ordered, partition bounded by user

Key insight: Option D is best — user_id partitions evenly, created_at enables range queries, post_id ensures uniqueness within the same second.
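The Option B collision can be demonstrated directly (illustrative Python, modeling a partition as a dict keyed by its clustering columns — Cassandra INSERTs are upserts, so a duplicate key silently overwrites):

```python
partition_b = {}  # Option B: clustering key = (created_at) only
partition_d = {}  # Option D: clustering key = (created_at, post_id)

posts = [
    {"post_id": "p1", "created_at": "2025-03-15T10:00:00", "text": "first"},
    {"post_id": "p2", "created_at": "2025-03-15T10:00:00", "text": "second"},  # same second
]

for p in posts:
    partition_b[p["created_at"]] = p                  # upsert: p2 silently replaces p1
    partition_d[(p["created_at"], p["post_id"])] = p  # both rows survive

print(len(partition_b), len(partition_d))  # 1 2
```

No error is raised in the collision case — the data loss is completely silent, which is why Option D's extra clustering column matters.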


Activity 2

Cassandra case study — pairs


Activity: Real-World Cassandra

Pairs | 10 minutes

Pick one of the companies below. Look up how they use Cassandra and answer the three questions on the next slide.

Company Known for
Netflix Replaced a single Oracle DB with Cassandra for streaming metadata
Apple Runs one of the largest Cassandra deployments in the world (Siri, iCloud)
Discord Stores billions of messages; migrated away from Cassandra to ScyllaDB
Uber Uses Cassandra for geospatial and trip data at global scale
Instagram Stores user feed and activity data at hundreds of millions of users

Activity: Case Study Questions

Answer these for your company:

  1. What data does Cassandra store for them? (not just "messages" — what is the schema shape: time-series? user-keyed? event log?)

  2. Which Cassandra properties made it the right choice? (high write throughput? AP availability? multi-DC? known query patterns?)

  3. What was the trade-off or pain point they hit? (modeling complexity? hotspots? eventual consistency bugs? operational cost?)

Be ready to share one sentence per question with the class.


Case Study Debrief

One pair per company — 30 seconds each

Listen for patterns across all five:

  • Did every company use Cassandra for high-write, time-series-like data?
  • Did any company hit the query-pattern rigidity problem?
  • Did any company mention partition design as a source of pain?

The through-line: Cassandra earns its place when write volume and availability requirements are non-negotiable. The companies that struggled were the ones that reached for it before exhausting simpler options.


Session 1 Key Takeaways

  • Ring architecture — every node is equal; no master means no single point of failure
  • Consistent hashing — adding nodes moves only ~1/N of data, not all of it
  • Tunable consistency — same cluster, different trade-offs per query via R + W > RF
  • Partition key is everything — it determines which node, enables all queries, and must be in every WHERE clause
  • One query, one table — ALLOW FILTERING is a full cluster scan; the fix is a new table

Next session: ScyllaDB's C++ rewrite, query-first modeling patterns, and the three-way Redis / MongoDB / Cassandra decision framework


What's Missing?

Cassandra eliminates the single point of failure — but every query still needs a table designed for it


The Gaps

  • Query patterns must be known at design time — every new access pattern may require a new table; there is no "just add a WHERE clause" escape hatch; post-launch schema pivots require full data migrations
  • No ad-hoc analytics — a product manager's question like "show me all orders over $500 in March" either needs a pre-built table or ALLOW FILTERING; Cassandra was never designed to replace a data warehouse
  • Hotspot partitions — a low-cardinality partition key (e.g., status TEXT) creates a few enormous partitions that overwhelm the nodes responsible for them; detecting and fixing hotspots after launch is painful
  • Modeling mistakes are expensive — choosing the wrong partition key requires rewriting the table definition and migrating all data; unlike MongoDB's flexible documents, CQL schema changes are costly
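The hotspot problem is easy to quantify with a sketch (illustrative Python): a low-cardinality partition key concentrates rows in a few huge partitions, while a high-cardinality key spreads them evenly:

```python
from collections import Counter

orders = [
    {"customer_id": f"c{i}", "status": "completed" if i % 20 else "pending"}
    for i in range(10_000)
]

# Low-cardinality partition key (status): a couple of enormous partitions
by_status = Counter(o["status"] for o in orders)
# High-cardinality partition key (customer_id): many small, even partitions
by_customer = Counter(o["customer_id"] for o in orders)

print(dict(by_status))            # a few partitions hold everything
print(max(by_customer.values()))  # largest partition stays tiny
```

The nodes owning the "completed" partition would absorb almost all traffic — that is the hotspot.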

What Comes Next

Gap Solution When
Rules for choosing partition keys Query-first data modeling Week 11, Session 2
JVM GC pauses causing tail latency spikes ScyllaDB — C++ rewrite, shared-nothing Week 11, Session 2
Continuous data ingestion at millions of events/sec Apache Kafka — distributed event streaming Week 12

Cassandra is the right tool for high-write, high-availability workloads where query patterns are known — but when data arrives as a continuous stream rather than discrete writes, a different architecture wins.


Speaker context: Students just finished MongoDB (Week 10 Session 2) and understand CP document stores. Today flips the CAP triangle: Cassandra defaults to AP — it keeps accepting writes during a partition, at the cost of possible momentary inconsistency. Lead with the "single master = single point of failure" problem before showing the ring. Demo-first works well: show the ALLOW FILTERING error early so the schema design motivation is visceral.

Quick recap slide — 2 minutes. The key contrast to plant in students' minds: both Redis and MongoDB chose consistency (CP); today flips to availability (AP). Ask: "What did MongoDB do when the primary failed?" Students should recall the election window. That 10–30 second window is the exact problem Cassandra's design eliminates.

Ask students: "If there's no primary, how does a client know which node to talk to?" They'll guess wrong — let them. The answer is: any node. The client connects to any node in the cluster; that node becomes the coordinator for that request and routes it to the correct replicas. This is the leaderless architecture insight.

Board demo: Draw a clock face. Place Node A at 12, B at 3, C at 6, D at 9. Scatter 8 "rows" around the clock — each goes to the nearest clockwise node. Add a node at 1:30 — only rows between 12 and 1:30 move. Then contrast: if you had 4 buckets (% 4) and add a 5th (% 5), almost every row changes bucket.

Emphasize that RF is set at the keyspace level, not per-table. RF=1 (class default) means no redundancy — fine for learning, never for production. RF=3 is the standard production choice because it survives one node failure while QUORUM reads/writes still function.

NetworkTopologyStrategy lets you set different RFs per datacenter: e.g., RF=3 in US-East, RF=2 in EU-West. This is the only correct choice in production — SimpleStrategy doesn't understand rack/DC topology, so it may place all three replicas on nodes in the same rack, defeating the purpose of RF=3.

Students often confuse gossip with a health-check service (like a load balancer ping). The key difference: gossip is decentralized — there is no "health check server" to fail. Each node independently builds a picture of cluster state. The O(log N) convergence means a 100-node cluster reaches consistent cluster state in roughly 7 rounds of gossip (~7 seconds). Mention `nodetool gossipinfo` for debugging.

Key point: the commit log ensures durability (survives node restart). The memtable enables fast writes. SSTables are immutable — no in-place updates. This is why Cassandra's write throughput is so high: sequential disk writes only.

Common misconception: students expect reads to be fast like a hash table lookup. Explain that because SSTables are immutable and writes never update in place, a single logical row may have pieces scattered across several SSTables. The read path merges them using timestamps (last-write-wins). Bloom filters eliminate most disk reads for missing keys. Compaction periodically merges SSTables to reduce read amplification — this is a background cost, not a request-path cost.

Spend a moment on why per-query configurability matters: a clickstream write (IoT sensor event) is fine at ONE — losing one event is acceptable and speed matters. An order total read for a payment confirmation might need QUORUM. Same cluster, same table, different trade-off. This tunability is what distinguishes Cassandra from databases where consistency is a cluster-wide setting.

Walk through the formula: R + W > RF means at least one node that confirmed the write must participate in any subsequent read. With RF=3, QUORUM writes go to 2 nodes and QUORUM reads check 2 nodes — the overlap guarantees one node saw both. The ALL write row is a useful trap: ALL write + ONE read is technically strongly consistent, but ALL writes block on any single node failure, which usually defeats the purpose of Cassandra.

The Write=ONE + Read=QUORUM pattern is popular in write-heavy workloads (metrics, logs) where you want fast ingest but need reliable reads. LOCAL_QUORUM is the multi-DC variant — it only requires a majority within the local datacenter, avoiding cross-DC latency on every write. This is what most production Cassandra deployments use.

This is a good place to address the "but my bank uses Cassandra" question — it often does, but for session state and transaction logs, not the authoritative balance. The canonical balance lives in a CP store (often a relational DB with serializable transactions). Cassandra's role is absorbing high-volume event writes cheaply. Push students to articulate the failure mode they're optimizing for before choosing a consistency level.

This is the "so what do I actually do?" slide students are waiting for. Reinforce the pattern: Cassandra is a write-optimized, availability-first store for high-volume workloads with known access patterns. The most common mistake is reaching for Cassandra because it's "scalable" when a Postgres instance would handle the load for years. Cassandra's operational cost (schema migrations, hotspot detection, compaction tuning) is significant — it should earn its place in the architecture.

Students need this context before the comparison slide. The key insight: HBase didn't replace HDFS — it sits *on top* of it. Your MapReduce or Spark jobs can read HDFS files directly; HBase adds the ability to do row-key lookups into that same data without a full scan. The column family concept is the source of most confusion — stress that column families are defined at schema creation time (like table columns), but the column qualifiers within a family are dynamic (any row can have any qualifier). A cell is identified by: row key + column family + column qualifier + timestamp.

Students often ask why HBase exists if Cassandra is strictly better. It isn't. HBase's superpower is sitting inside the Hadoop ecosystem — if your ETL pipeline produces HDFS files and you need point lookups into that data, HBase is far simpler than loading everything into Cassandra. HBase also handles extremely sparse data efficiently (column qualifiers are dynamic, not fixed schema). Keep this to 3–5 minutes; Assignment 3 focuses on Cassandra, not HBase.

Keep this to 5 minutes. The point is conceptual: same data model, opposite CAP position, different ecosystem home. Students don't implement HBase in this course — the assignment removed it in favor of depth in Cassandra/MongoDB.

The "CQL looks like SQL" surface similarity is a trap. Students who treat it like SQL will immediately hit errors (no joins, can't filter on non-key columns). Spend a moment on each row: "Joins — why not?" (data is distributed across nodes; a join would require shipping data between nodes, killing the performance advantage). "Arbitrary WHERE — why not?" (without a partition key, Cassandra doesn't know which nodes hold matching rows). These aren't limitations from laziness — they're intentional constraints that enable horizontal scale.

The double parentheses in `PRIMARY KEY ((customer_id), order_date, order_id)` confuse students. Outer parens = PRIMARY KEY clause. Inner parens = composite partition key boundary. Single-column partition key: `PRIMARY KEY (customer_id, order_date)` — here customer_id is partition key, order_date is clustering. Composite partition: `PRIMARY KEY ((customer_id, store_id), order_date)` — both columns together form the partition key. Draw this on the board if students look confused.

Demo tip: RF=1 for class to avoid needing multiple containers. Production would use RF=3 minimum. If nodetool status shows "UN" (Up/Normal), the node is healthy.

Have students run these commands themselves. `uuid()` generates a random UUID client-side — point out that unlike a SQL SERIAL primary key, there is no auto-increment in Cassandra. UUIDs are the standard PK choice because they distribute evenly across the ring. Add 3–4 more rows with different customer_ids before the query demo so range queries return visible results.

Let students hit the error. The error message is a feature: Cassandra refuses to let you accidentally do something O(total rows). ALLOW FILTERING exists for development/debugging only.

The error message Cassandra throws is deliberately informative: "Cannot execute this query as it might involve data filtering and thus may have unpredictable performance." It's telling you exactly what ALLOW FILTERING does. Treating the error as a bug to suppress (by adding ALLOW FILTERING) is the mistake. Treating it as a design signal is the right response: "this query pattern needs a dedicated table."

The application-level dual-write pattern is the standard solution: when an order is placed, write to both `orders_by_customer` and `orders_by_status` in the same request. This is denormalization by design. Preview that Cassandra Lightweight Transactions (LWT) exist for atomic operations, but they're slow (Paxos under the hood) and should be rare. The common pattern is application-managed consistency through dual writes.

8 minutes

Give pairs 8 minutes. Circulate and listen for the Option B collision discussion — many students won't notice that two posts at the same second collapse into one row. Also watch for confusion between the composite partition key in Option C and a single partition key. The goal is to surface the intuition that clustering columns provide both ordering and uniqueness guarantees.

Let pairs discuss before revealing the debrief slide. If most groups converge on D, great — ask them to articulate *why* C fails for Query 1. If groups are split between B and D, the collision scenario is the teaching moment: INSERT two posts for the same user at the same second with Option B and show that the second insert silently overwrites the first (Cassandra INSERTs are upserts).

Close the debrief by connecting back to the session theme: "We designed this table backwards from the query — we started with what the application needs to read, then built the schema." This is query-first design. Session 2 formalizes this into a full modeling workflow. Option C is worth dwelling on: the composite partition key `(user_id, created_date)` is actually a valid pattern for bounding partition size on high-volume users (e.g., a celebrity with millions of posts) — it just sacrifices Query 1 without also supplying the date.

10 minutes

10 minutes. Students should search "[company] Cassandra engineering blog" — most have published detailed posts. Discord's migration to ScyllaDB is especially rich: their blog post "How Discord Stores Billions of Messages" explains the hotspot problem directly. Netflix and Apple have DataStax Summit talks on YouTube. If students finish early, push them to find the actual partition key design the company uses.

Keep debrief tight — 5 minutes max. The goal is pattern recognition, not deep dives. If Discord came up, note that ScyllaDB is a drop-in Cassandra replacement (same CQL, same drivers) — that's exactly what Session 2 covers.

Assignment 3 is due at the end of Week 12. Students will design Cassandra schemas as part of it — this session is the foundation.

One-minute close. Kafka (Week 12) is the natural next step: Cassandra absorbs individual writes well, but when producers generate millions of events per second from many sources simultaneously, you need a buffer layer between producers and the database. That's Kafka's job — Cassandra is often the sink at the end of a Kafka pipeline.