Jagan Shanmugam Blog

The promise of NOT prompting but programming LMs

2025-10-28T00:00:00.000Z

After the democratization of sufficiently intelligent systems like ChatGPT/Claude/Gemini behind closed APIs, the fear of being reduced to prompting problems away is naturally quite high. I feel this fear runs deep, as previous workflows provided a sense of (more) control. At the same time, many who take pride in writing raw code get offended when their work is reduced to 'mere' prompting. Thus, when a solution that addresses the ego or the fear comes along, it will be well received among these groups first and later adopted by others.

DSPy promises to let us write programs like we used to, without dealing with prompts in code. That is just the beginning: it goes beyond that, taking inspiration from PyTorch. Introducing a structured approach to the LLM world with ML101 concepts is a hard task, and I feel DSPy is slowly achieving that over time. Instead of manually crafting and tweaking prompts, DSPy lets you declare what you want and automatically optimizes how to achieve it.

For data people, think of DSPy as the SQL for language models: just as SQL lets you declare what data you want without specifying how to retrieve it, DSPy lets you declare your LM pipeline logic without manually engineering prompts. On top of that, it is designed to be composable at all levels, making it elegant for many tasks. For people who like to over-engineer stuff, I think prompt engineering will become an endless cycle with no clear end in sight. When I think about the popularity of React (frontend framework) and the long-lasting success of SQL, I can only foresee success for declarative frameworks like DSPy.

Unlike LangChain/LlamaIndex, DSPy forces us back to the whiteboard to think from first principles for any ML project. It asks us to think in terms of evaluation, which has been questioned in the vibe era. Defining metrics, curating datasets, experimenting with a hypothesis and tracking experiments to systematically improve the end-to-end system is back in style with DSPy.

The Problem with Traditional Prompting

Traditional LLM app development is brittle:

- Manual prompt engineering for every task or model
- Hard to optimize: requires extensive trial and error
- Difficult to compose complex pipelines
- May break when you change LM providers
- No systematic way to improve performance

The DSPy way: Declarative Programming

DSPy introduces a declarative, modular approach:

import dspy

# Configure your LM (dspy.LM replaces the older dspy.OpenAI client)
lm = dspy.LM('openai/gpt-4o')
dspy.configure(lm=lm)

# Define task
class Translate(dspy.Signature):
    """Translate English to German."""
    english = dspy.InputField()
    german = dspy.OutputField()
    instructions = dspy.InputField(desc="Instructions for the translation")

# Use it
translate = dspy.Predict(Translate)
result = translate(english="Around the World in 80 Days", instructions="Keep it short and concise like movie titles.")
print(result.german)  # "Um die Welt in 80 Tagen"

Core Concepts

1. Signatures - Declare Your Task

Signatures are like type signatures for LM operations. They specify inputs, outputs, and the task description:

class Summarize(dspy.Signature):
    """Summarize a long document into key points."""
    document = dspy.InputField()
    key_points = dspy.OutputField(desc="bullet list of main ideas")

Field names matter in DSPy: they are sent to the LLM as part of the prompt for your task, so choose descriptive ones.
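To make that concrete, here is a toy sketch of how the pieces of a signature could end up in a prompt. `render_prompt` is a hypothetical helper written only for illustration; DSPy's real adapters use their own, more elaborate format.

```python
def render_prompt(instructions, inputs, output_names):
    """Toy illustration only: not DSPy's actual prompt format."""
    lines = [instructions, ""]
    # Input field names appear verbatim as labels next to their values.
    for name, value in inputs.items():
        lines.append(f"{name}: {value}")
    # Output field names tell the model what it is expected to produce.
    for name in output_names:
        lines.append(f"{name}:")
    return "\n".join(lines)


prompt = render_prompt(
    "Summarize a long document into key points.",
    {"document": "DSPy is a framework for programming LMs."},
    ["key_points"],
)
print(prompt)
```

Renaming `document` to something like `x` would still run, but the model would see a much less informative label.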

2. Modules - Build Composable Pipelines

Modules are PyTorch-style components that can be:

Composed together
Optimized end-to-end
Reused across projects

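class MultiHopQA(dspy.Module):
    def __init__(self):
        super().__init__()  # initialize the base dspy.Module
        # Assumes a retrieval model has been configured for dspy.Retrieve
        self.retrieve = dspy.Retrieve(k=3)
        self.hop1 = dspy.ChainOfThought("context, question -> search_query")
        self.hop2 = dspy.ChainOfThought("context, question -> answer")

    def forward(self, question):
        # First hop: retrieve passages and derive a follow-up search query
        ctx1 = self.retrieve(question).passages
        query = self.hop1(context=ctx1, question=question)
        # Second hop: retrieve again with the generated query, then answer
        ctx2 = self.retrieve(query.search_query).passages
        return self.hop2(context=ctx2, question=question)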

3. Optimizers - Automatic Prompt and Weight Tuning

This is where DSPy shines. Optimizers automatically tune your prompts (and, for some optimizers, model weights) against a metric you define:

- BootstrapFewShot: Generates effective few-shot examples
- MIPRO: Optimizes instructions and demonstrations jointly
- BootstrapFewShotWithRandomSearch: Explores prompt variations
- BayesianSignatureOptimizer: The earlier name for MIPRO; uses Bayesian optimization

# Before compilation: generic prompts
# After compilation: optimized prompts with examples!
optimizer = dspy.BootstrapFewShot(max_bootstrapped_demos=4)
compiled = optimizer.compile(student=my_module, trainset=examples)

There are other primitives like Adapters, Tools and Metrics, which I will not go into in detail here; instead, here are one-liners for completeness.

- Adapters are the bridge between signatures and the actual LLM calls. You can easily swap between JSON, BAML and XML adapters.
- Tools can be standard functions or tools defined in MCP servers that can be attached to make the LLM agentic.
- Metrics are the interesting part: they score a program's outputs and can be combined with optimizers to optimize the program.
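A DSPy metric is a plain Python function that takes an example, a prediction and an optional trace, and returns a score. The sketch below is a minimal exact-match metric for the `Translate` signature above; the `SimpleNamespace` objects are stand-ins for `dspy.Example` and `dspy.Prediction` so that it runs without an LM, and the sample strings are made up.

```python
from types import SimpleNamespace


def exact_match(example, pred, trace=None):
    """Return True when the predicted translation matches the reference.

    The comparison ignores case and surrounding whitespace.
    """
    return example.german.strip().lower() == pred.german.strip().lower()


# Stand-ins for dspy.Example / dspy.Prediction (illustrative data only).
gold = SimpleNamespace(english="Around the World in 80 Days",
                       german="In 80 Tagen um die Welt")
good = SimpleNamespace(german=" in 80 tagen um die welt ")
bad = SimpleNamespace(german="Um die Welt")

print(exact_match(gold, good))  # True
print(exact_match(gold, bad))   # False
```

A function like this can be passed as `metric=exact_match` when constructing an optimizer such as `BootstrapFewShot`.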

Why Declarative?

| Imperative (Old) | Declarative (DSPy) |
| --- | --- |
| Manually craft prompts | Declare signatures |
| Hard-code examples | Auto-generate examples |
| Trial and error tuning | Systematic optimization |
| Brittle prompt strings | Composable modules |
| One-off solutions | Reusable components |

Real-World Use Cases

- RAG Systems: Retrieval + reasoning pipelines for complex question answering with multiple reasoning steps
- Data Extraction: Extract structured data from unstructured text
- Agentic Systems: Build reliable LLM-based agents that can perform tasks
- Classification: Few-shot classification that improves over time
- LLM-as-a-Judge: Use LLMs to judge the quality of other LLMs' responses, aligned with human experts

DSPy Architecture

When to/not to use DSPy

Use DSPy when:
- Building complex LM pipelines
- Need systematic optimization
- Want composable, maintainable code
- Switching between LM providers
- Have training data to optimize with
Skip DSPy when:
- Simple one-off prompts
- No training/validation data
- Extremely domain-specific where manual control is critical

MCP Overview

2025-03-26T00:00:00.000Z

Hackathons are in full swing this year, and as I wanted to catch up on MCP, I decided to attend a hackathon by ToolHouse focused on it. The MCP servers and clients I created are at the end of this post.

Below I summarize my notes on MCP; as it is rather new, it might evolve over time. Creating MCP servers is just one prompt away with Claude, and I played around in Cline to create a few simple servers first to understand it. The Model Context Protocol (MCP), introduced by Anthropic, is rapidly becoming the universal standard for connecting AI models to external data, tools, and real-world actions. Think of MCP as the "USB-C port for AI applications," enabling seamless integration between AI agents and a wide variety of data sources and APIs.

Why All the Hype?

Solving the "N times M Problem": MCP flattens the complex web of integrations between many AI clients and many servers/APIs, allowing tool providers to build one MCP server and application developers to connect to any MCP server with a compatible client.
More Powerful AI Applications: Standardization means richer, more capable AI applications that can take real-world actions.
Enterprise Efficiency: Enterprises can separate concerns, letting different teams build and maintain specialized MCP servers (e.g., for Vector DBs or RAG systems) that can be reused across teams and projects.
Flexibility and Interoperability: Developers gain access to a growing list of pre-built integrations and can easily switch between LLM providers and vendors.
AI-Native Design: Unlike OpenAPI or GraphQL, MCP is designed specifically for AI agents, refining patterns for tool use, resource access, and prompt incorporation.
Strong Foundation: MCP draws inspiration from the Language Server Protocol (LSP) and comes with a comprehensive specification.

Motivating Example

AI models are only as good as the context provided to them. Historically, context was manually copy-pasted into chatbots. Now, MCP enables direct hooks into user data and context, making AI more powerful and personalized.

Why Do LLMs Need Tools and External Context?

Overcoming inherent limitations: Standalone models have functional limits that can be addressed by interacting with external systems.
Accessing real-time data: Tools allow LLMs to get up-to-date information not present in their training data (e.g., weather APIs).
Performing actions: Tools enable LLMs to take actions in external systems, such as adding tasks, managing subscriptions, or controlling smart devices.

MCP Overview

M x N Problem: Many applications, many data sources/APIs. MCP solves this with a single, standardized protocol.
Inspired by LSP: MCP borrows from the Language Server Protocol, making it familiar to developers.

What is MCP?

MCP is a client-server protocol:

MCP Hosts: User-facing applications (e.g., Claude Desktop, IDEs, custom AI tools)
MCP Clients: Manage the connection to a specific MCP Server
MCP Servers: Lightweight programs exposing capabilities per the MCP spec, bridging the MCP world and external systems
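MCP messages are JSON-RPC 2.0 requests and responses, with methods such as `tools/list` and `tools/call`. The sketch below is a deliberately tiny, in-process imitation of a server handling those two methods. It skips the initialization handshake, transports and input schemas, and the `add` tool and `handle` function are hypothetical names chosen for illustration.

```python
import json

# Hypothetical tool registry: name -> (description, handler).
TOOLS = {
    "add": ("Add two numbers", lambda args: args["a"] + args["b"]),
}


def handle(request_json):
    """Dispatch one JSON-RPC 2.0 request the way a tiny MCP-style server might."""
    req = json.loads(request_json)
    if req["method"] == "tools/list":
        result = {"tools": [{"name": n, "description": d}
                            for n, (d, _) in TOOLS.items()]}
    elif req["method"] == "tools/call":
        _, fn = TOOLS[req["params"]["name"]]
        value = fn(req["params"]["arguments"])
        result = {"content": [{"type": "text", "text": str(value)}],
                  "isError": False}
    else:
        # -32601 is the standard JSON-RPC "method not found" error code.
        return json.dumps({"jsonrpc": "2.0", "id": req["id"],
                           "error": {"code": -32601, "message": "Method not found"}})
    return json.dumps({"jsonrpc": "2.0", "id": req["id"], "result": result})


call = {"jsonrpc": "2.0", "id": 1, "method": "tools/call",
        "params": {"name": "add", "arguments": {"a": 2, "b": 3}}}
response = json.loads(handle(json.dumps(call)))
print(response["result"]["content"][0]["text"])  # 5
```

In a real setup the host's MCP client would send these messages over STDIO or HTTP rather than calling a function directly.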

Before and After MCP

Before: Each AI client needed custom integrations for every server/API.
After: One protocol, many integrations, less complexity.
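The reduction in complexity is simple arithmetic. As a rough sketch (the counts of 5 clients and 20 servers are made-up numbers):

```python
def integrations_without_protocol(num_clients, num_servers):
    # Every client needs its own adapter for every server/API.
    return num_clients * num_servers


def integrations_with_protocol(num_clients, num_servers):
    # Each client and each server implements the shared protocol once.
    return num_clients + num_servers


print(integrations_without_protocol(5, 20))  # 100
print(integrations_with_protocol(5, 20))     # 25
```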

Demo & Use Cases

Clients: IDEs (VSCode, Cursor, Windsurf), Goose, Claude Desktop, Mattermost, Persona Chat
Github Copilot: Agents with MCP servers for Figma-to-Code, Data Analysis, Github management
Goose: Agents with MCP servers using Nexus API
WHOIS MCP: Find domain owners
PowerPoint MCP: Create presentations with images

Limitations & Risks

LLMs can still hallucinate and are prone to prompt injection attacks
Risk of tool poisoning by untrusted MCP servers
Potential to execute untrusted code via STDIO MCP servers
Why not just use OpenAPI? MCP is AI-native and designed for agent workflows
Remote support and authentication (OAuth 2.1 for remote servers)

MCP Server Registry

Find open source MCP servers at:

Best Practices

Run only trusted or official MCP servers
Audit MCP servers before running locally
Use isolated environments (e.g., Docker)
Start with test data and restrict permissions
Explore what's possible!

Open source

As part of a hackathon, I created the MCP servers and host below. Check them out!

MCP Mind Map

TicTacToe RL

2020-04-20T00:00:00.000Z

This is an implementation of the classic Tic Tac Toe game, powered by Reinforcement Learning (RL)! This project demonstrates how an RL agent can learn to play Tic Tac Toe optimally through self-play and Temporal Difference (TD) learning.

Project Overview

Purpose: Train an RL agent to play Tic Tac Toe using self-play and TD(0) learning, and provide both a command-line and graphical interface for users to play against the trained agent.
Key Features:
- RL agent learns state values for all possible board configurations (over 19,000 states)
- Epsilon-greedy policy for balancing exploration and exploitation during training
- Pygame-based graphical UI for interactive play
- Command-line interface for quick testing
- Well-documented code and modular structure

How It Works

The RL agent is trained using a simple TD(0) update rule:

v(s) ← v(s) + α (v(s') - v(s))

v(s): Value of the current state
v(s'): Value of the next state
α: Learning rate

During training, the agent plays games against itself, updating state values based on the outcome and gradually improving its strategy. The agent uses an epsilon-greedy policy to occasionally explore random moves, ensuring a robust learning process.
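The update rule and the epsilon-greedy choice can be sketched in a few lines. This is not the project's actual code: the function names, the 9-character string encoding of boards, the default state value of 0.5 and the hyperparameters are all assumptions made for illustration.

```python
import random


def td0_update(values, state, next_state, alpha=0.1, default=0.5):
    """Apply v(s) <- v(s) + alpha * (v(s') - v(s)) in place and return v(s)."""
    v_s = values.get(state, default)
    v_next = values.get(next_state, default)
    values[state] = v_s + alpha * (v_next - v_s)
    return values[state]


def epsilon_greedy(values, candidate_states, epsilon=0.1, default=0.5, rng=random):
    """Pick a random successor with probability epsilon, else the highest-valued one."""
    if rng.random() < epsilon:
        return rng.choice(candidate_states)
    return max(candidate_states, key=lambda s: values.get(s, default))


# Boards as 9-character strings; a winning terminal state is worth 1.0.
values = {"XXXOO----": 1.0}
new_value = td0_update(values, "XX-OO----", "XXXOO----", alpha=0.1)
print(round(new_value, 2))  # 0.55
```

With `epsilon=0.0` the policy is purely greedy; during training a small positive epsilon keeps the agent visiting states it would otherwise never see.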

Directory Structure

game_app.py: Pygame-based UI for playing against the RL agent
test_game.py: Command-line interface for testing
training_self_play.py: RL training logic
tic_tac_toe.py: Core game logic and state management
requirements: Python dependencies

Getting Started

Install Dependencies:
- Python 3.5+
- Install required packages:
```
pip install -r requirements
```
Train the RL Agent:
- Run training_self_play.py to train the agent (optional, pre-trained values included)
Play the Game:
- Graphical UI:
```
python game_app.py
```
- Command-line:
```
python test_game.py
```

Gameplay

The RL agent can play as either X or O.
In the Pygame UI, the agent and user take turns; click on a square to make your move.
After each game, click anywhere to restart.

References

Latent Space Bayesian Optimization

2020-03-15T00:00:00.000Z

Optimization is everywhere - in tuning machine learning models, industrial processes, and even in everyday decision-making. But what happens when the problem you want to optimize is a black box, expensive to evaluate, and has way too many parameters? That's where my master's thesis comes in: Latent Space Bayesian Optimization with Transfer Learning. Here's a deep dive into what I did, why it matters, and what I learned along the way.

The Problem: Black-Box Optimization in High Dimensions

Bayesian Optimization (BO) is a popular optimization method for expensive black-box functions, used in a variety of fields including hyperparameter optimization and industrial processes, up to moderate dimensions (10-20). Black-box functions are sometimes over-parameterized, which results in modeling redundant dimensions in high-dimensional spaces. Optimization methods that focus on the most relevant dimensions find optimal solutions faster. Additionally, existing optimization data, e.g. from optimizing similar problems, can be used to further speed up the optimization process on the task of interest, i.e. to perform Transfer Learning. To warm start BO on the task at hand, it is of utmost importance to model data collected from similar tasks to transfer knowledge. In Transfer Learning BO, models which learn the underlying intrinsic function are essential. We propose a latent space model with Transfer Learning to (1) learn a transformation from input space to latent space and (2) learn a common set of features from the learned latent space across multiple tasks to perform efficient Transfer Learning. Our model is empirically evaluated against state-of-the-art methods on synthetic benchmarks.

The Core Idea: Optimize Where It Matters

Latent Spaces

The key insight is that, even though your input space might be huge (dozens or hundreds of parameters), the intrinsic dimensionality is often much lower. In other words, only a few directions in parameter space actually affect your objective. If you can learn a transformation from the high-dimensional input to a low-dimensional latent space that captures the important variation, you can optimize much more efficiently.
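A toy example makes the idea of low intrinsic dimensionality tangible. In the sketch below, the objective lives in 20 dimensions but only depends on the first 2 coordinates. This axis-aligned projection is the simplest possible case and is an assumption for illustration; the thesis learns the transformation instead of fixing it.

```python
import random

D, d = 20, 2  # ambient and intrinsic dimensions (illustrative choices)


def project(x):
    # Hypothetical fixed projection: keep only the first d coordinates.
    return x[:d]


def objective(x):
    """Quadratic bowl that depends on x only through its d-dimensional projection."""
    z = project(x)
    return sum((zi - 0.3) ** 2 for zi in z)


rng = random.Random(0)
x = [rng.uniform(-1, 1) for _ in range(D)]

# Perturb only the D - d irrelevant coordinates: the objective does not change.
x_perturbed = x[:d] + [rng.uniform(-1, 1) for _ in range(D - d)]
print(objective(x) == objective(x_perturbed))  # True
```

An optimizer that searched over `z` directly would explore a 2-dimensional space instead of a 20-dimensional one.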

Transfer Learning

If you've already solved similar optimization problems (maybe with different materials or settings), you should be able to transfer what you've learned. The challenge is to design a model that can leverage this metadata (past optimization runs) to "warm start" the new optimization, avoiding the cold start problem that plagues standard BO.

Joint Learning in Latent Space

My thesis proposes a method that jointly learns:

A transformation from input space to latent space (either linear or nonlinear)
A shared set of features (basis functions) across several tasks for transfer learning

The model is trained in two phases:

Meta-training: Learn from metadata (previous tasks) to initialize the latent space and shared features.
Target training: Adapt the model to the new task, fine-tuning both the latent space and the prediction model as new data comes in.

Model Architecture

I explored two main variants:

Projection-ABLR: Uses a learnable linear projection from input to latent space, paired with Adaptive Bayesian Linear Regression (ABLR) for prediction.
AutoEncoder-ABLR: Uses an autoencoder (neural network) to learn a nonlinear mapping to latent space, again paired with ABLR.

Both models are trained to minimize a combination of negative log-likelihood (for prediction) and mean squared error (for reconstructing the input from the latent space). This joint loss ensures the latent space is predictive and reconstructive simultaneously.
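As a rough sketch of such a joint objective, the snippet below adds a per-point Gaussian negative log-likelihood to a reconstruction MSE. The Gaussian NLL is a generic stand-in rather than ABLR's actual likelihood, and `recon_weight` is a hypothetical knob for balancing the two terms.

```python
import math


def gaussian_nll(y_true, mean, var):
    """Average negative log-likelihood of targets under N(mean, var)."""
    return sum(
        0.5 * (math.log(2 * math.pi * v) + (y - m) ** 2 / v)
        for y, m, v in zip(y_true, mean, var)
    ) / len(y_true)


def mse(x, x_reconstructed):
    """Mean squared reconstruction error over all coordinates."""
    diffs = [(a - b) ** 2
             for xa, xb in zip(x, x_reconstructed)
             for a, b in zip(xa, xb)]
    return sum(diffs) / len(diffs)


def joint_loss(y_true, mean, var, x, x_reconstructed, recon_weight=1.0):
    # Prediction term plus weighted reconstruction term.
    return gaussian_nll(y_true, mean, var) + recon_weight * mse(x, x_reconstructed)


loss = joint_loss(
    y_true=[1.0, 2.0], mean=[1.0, 2.0], var=[1.0, 1.0],
    x=[[0.0, 1.0], [1.0, 0.0]], x_reconstructed=[[0.0, 1.0], [1.0, 0.0]],
)
print(round(loss, 4))  # 0.9189
```

With perfect predictions and perfect reconstruction, only the constant 0.5 * log(2π) of the Gaussian remains.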

Why ABLR?

Adaptive Bayesian Linear Regression is computationally efficient and scales well with the number of tasks and data points for transfer learning. It allows for a separate Bayesian regressor for each task but shares the feature mapping making it well suited for multi-task scenarios.

Experiments: Synthetic Benchmarks

To test the method, I used high-dimensional synthetic functions with known low intrinsic dimensionality:

Quadratic function: Parameterized by a small set of variables, projected into higher dimensions.
Rosenbrock function: A classic optimization benchmark, adapted for multi-task and high-dimensional settings.

I compared my models against state-of-the-art baselines:

REMBO: Random Embeddings for Bayesian Optimization (no transfer learning)
Multi-Task ABLR: Directly models in the input space with transfer learning
VAE-BO: Variational Autoencoder-based Bayesian Optimization

Key Results

Transfer learning helps: Models that leverage metadata start with lower regret (closer to the optimum) and converge faster, especially in the early iterations.
Latent space optimization is efficient: By optimizing in the learned low-dimensional space, the search is much more effective than in the original high-dimensional space.
Projection-ABLR outperforms: The linear projection model consistently achieved lower regret than baselines, especially when enough metadata was available.
AutoEncoder-ABLR needs more data: Nonlinear models (autoencoders) can capture more complex relationships but require more data to avoid overfitting or saturation.

Challenges and Open Questions

Estimating intrinsic dimensionality: Knowing how many latent dimensions to use is still an open problem. I tried cross-validation and meta-loss analysis, but the results were inconclusive. This remains a key challenge for future work.
Scaling to real-world tasks: The method works well on synthetic benchmarks. Applying it to real industrial processes (like welding) is the next step.
Negative transfer: If the metadata tasks are too different from the target, transfer learning can actually hurt performance. Designing robust ways to detect and avoid negative transfer is important.

Takeaways

Joint learning works: Simultaneously learning the latent space and prediction model is more effective than sequential approaches, especially when data is scarce.
Transfer learning is powerful: Leveraging past experience can dramatically speed up optimization in new tasks.
Linear vs. nonlinear: Linear projections are surprisingly effective when the intrinsic structure is simple, but nonlinear mappings (autoencoders) are more flexible for complex tasks - if you have enough data.

Conclusion

My thesis shows that Latent Space Bayesian Optimization with Transfer Learning is a promising approach for high-dimensional, expensive black-box optimization. By learning where to search (latent space) and how to transfer knowledge from previous tasks, we can solve challenging optimization problems more efficiently.

If you're working on hyperparameter tuning, industrial process optimization, or any scenario where experiments are costly and you have some prior data, this approach could save you time, money, and frustration.

Clustering Evolving Data Streams

2019-05-28T00:00:00.000Z

Clustering evolving data streams is one of those topics that sits right at the intersection of machine learning, big data, and real-time analytics. With the explosion of data from IoT devices, social media, and continuous sensors, we're not just dealing with big data - we're dealing with fast data that never stops coming. In this post, I'll walk through the core ideas behind clustering evolving data streams, the unique challenges, and some of the leading algorithms and concepts in this space.

Why Clustering Data Streams Is Different

Traditional clustering methods (think K-means, DBSCAN, etc.) assume you have all your data up front. But in the real world, data often arrives as a stream - unbounded, high-volume, and potentially high-dimensional. Here are the main challenges:

Single-pass constraint: You can't store all the data, so you need algorithms that process each point only once (or a small number of times).
Evolving nature: The underlying patterns (clusters) can change over time - new clusters can appear, old ones disappear, and others might split or merge.
Real-time requirements: You need to update clusters quickly, often before the next data point arrives.
Memory and computation limits: You have to summarize the stream efficiently, as storing everything is not an option.

Core Concepts

Stream Data Clustering

Given a stream of data points (often high-dimensional), the goal is to maintain a set of clusters that reflect the current structure of the data at any point in time. Each data point may also have a timestamp, and the "freshness" of a point decays over time - recent data is more relevant than old data.

Cluster Evolution

Clusters in streaming data are not static. Over time, you might see:

Emergence: New clusters appear.
Disappearance: Existing clusters fade away.
Split: A cluster divides into two or more.
Merge: Two or more clusters combine.
Adjustment: The shape or position of a cluster shifts.

Decay Models

To handle the evolving nature, most algorithms use a decay model - older data points have less influence on the current clustering result, often modeled with an exponential decay function.
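A minimal sketch of such a decay model, assuming a base-2 exponential and a made-up decay rate (individual algorithms choose their own base and rate):

```python
def decayed_weight(age, decay_rate=0.1):
    """Weight of a point that arrived `age` time units ago: 2 ** (-decay_rate * age)."""
    return 2.0 ** (-decay_rate * age)


def decayed_density(arrival_times, now, decay_rate=0.1):
    """Sum of the decayed weights of all points seen so far."""
    return sum(decayed_weight(now - t, decay_rate) for t in arrival_times)


print(decayed_weight(0))   # 1.0
print(decayed_weight(10))  # 0.5
```

With `decay_rate=0.1`, a point loses half of its influence every 10 time units.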

Summarizing the Stream

Because you can't keep all the data, you need to summarize it. Three popular approaches:

Cluster-cells: Groups of nearby points summarized as a single entity (with a seed, density, and dependent distance).
Micro-clusters: Small, time-stamped summaries of data locality, often used in hierarchical or partitioning methods.
Grids: The data space is divided into grids, and densities are maintained for each grid cell - great for high-dimensional data.
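To illustrate how a summary can replace the raw points, here is a simplified micro-cluster that keeps only a count, a linear sum and a squared sum per dimension. The class and method names are assumptions, and real algorithms such as CluStream store additional temporal statistics.

```python
import math


class MicroCluster:
    """Additive summary of nearby points: count, linear sum, squared sum."""

    def __init__(self, dim):
        self.n = 0
        self.linear_sum = [0.0] * dim
        self.squared_sum = [0.0] * dim
        self.last_update = None

    def insert(self, point, timestamp):
        # Absorbing a point only touches the running sums, never stored points.
        self.n += 1
        for i, value in enumerate(point):
            self.linear_sum[i] += value
            self.squared_sum[i] += value * value
        self.last_update = timestamp

    def centroid(self):
        return [s / self.n for s in self.linear_sum]

    def radius(self):
        # Root-mean-square deviation from the centroid, computed from the sums alone.
        variance = sum(
            sq / self.n - (ls / self.n) ** 2
            for ls, sq in zip(self.linear_sum, self.squared_sum)
        )
        return math.sqrt(max(variance, 0.0))


mc = MicroCluster(dim=2)
for t, p in enumerate([(0.0, 0.0), (2.0, 0.0)]):
    mc.insert(p, timestamp=t)
print(mc.centroid())  # [1.0, 0.0]
print(mc.radius())    # 1.0
```

Because the sums are additive, two micro-clusters can be merged by adding their fields, which is what makes single-pass processing feasible.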

Leading Algorithms

Let's look at some of the state-of-the-art methods for clustering evolving data streams:

EDMStream

Approach: Density-based, inspired by Density Peaks clustering.
How it works: Summarizes nearby points into cluster-cells and organizes them in a Dependency Tree (DP-Tree). The DP-Tree is updated as new data arrives, tracking dependencies and densities.
Cluster evolution: Can detect emergence, disappearance, split, merge, and adjustment of clusters in real time.
Key advantage: Real-time updates and evolution tracking, making it suitable for applications like news recommendation or intrusion detection.

CluStream

Approach: Partitioning, two-phase (online/offline).
How it works: In the online phase, maintains micro-clusters as summaries. In the offline phase, uses these micro-clusters to perform clustering (often with K-means) and analyze evolution.
Cluster evolution: Supports analysis over different time horizons, but not truly real-time.
Key advantage: Well-suited for scenarios where you can afford to do heavier computation offline.

D-Stream

Approach: Density-based, grid-oriented.
How it works: Maps incoming data to a grid, updates densities, and periodically clusters dense grid cells. Uses decay to handle evolving data.
Cluster evolution: Can adapt to changing data, but less effective in high-dimensional spaces.
Key advantage: Efficient for outlier detection and works well when the number of dimensions is moderate.
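The grid bookkeeping can be sketched as follows. This is a simplified, D-Stream-style illustration: the class name, cell size, decay factor and threshold are assumptions, and the step that groups neighbouring dense cells into clusters is omitted.

```python
import math


class DecayingGrid:
    """Grid-cell densities with exponential decay (simplified sketch)."""

    def __init__(self, cell_size=1.0, decay=0.9):
        self.cell_size = cell_size
        self.decay = decay   # per-time-step decay factor in (0, 1)
        self.cells = {}      # cell index -> (density, last update time)

    def cell_of(self, point):
        return tuple(math.floor(v / self.cell_size) for v in point)

    def density(self, cell, now):
        d, last = self.cells.get(cell, (0.0, now))
        return d * self.decay ** (now - last)

    def insert(self, point, now):
        cell = self.cell_of(point)
        # Decay the stored density to the current time, then count the new point.
        self.cells[cell] = (self.density(cell, now) + 1.0, now)
        return cell

    def dense_cells(self, now, threshold):
        return [c for c in self.cells if self.density(c, now) >= threshold]


grid = DecayingGrid(cell_size=1.0, decay=0.5)
grid.insert((0.2, 0.7), now=0)
grid.insert((0.9, 0.1), now=1)  # same cell (0, 0): density 1 * 0.5 + 1 = 1.5
grid.insert((5.5, 5.5), now=1)
print(grid.dense_cells(now=1, threshold=1.5))  # [(0, 0)]
```

Densities are decayed lazily, only when a cell is touched or queried, so no global sweep over all cells is needed per arriving point.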

E-Stream

Approach: Evolution-based, extends CluStream.
How it works: Each cluster is represented as a Fading Cluster Structure with Histogram (FCH). Tracks evolution using histograms and supports appearance, disappearance, self-evolution, merge, and split.
Key advantage: Explicitly tracks cluster evolution using statistical summaries.

MEC Algorithm

Approach: Evolution tracking (not clustering itself).
How it works: Uses bipartite graphs and conditional probabilities to track how clusters change between time windows. Categorizes transitions as birth, death, split, merge, or survival.
Key advantage: Useful for monitoring and analyzing cluster evolution after clustering has been performed.
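A simplified sketch of this kind of transition tracking: for each old cluster, compute the fraction of its members that land in each new cluster and label the transition from those overlaps. The thresholds and the function names are assumptions, and births and merges are left out for brevity.

```python
def overlap(old, new):
    """Conditional probability that a member of `old` is also in `new`."""
    return len(old & new) / len(old)


def track(old_clusters, new_clusters, survive=0.5, split=0.25):
    """Label each old cluster as survived, split, or died (simplified)."""
    labels = {}
    for name, members in old_clusters.items():
        scores = {n: overlap(members, m) for n, m in new_clusters.items()}
        best = max(scores, key=scores.get) if scores else None
        parts = [n for n, s in scores.items() if s >= split]
        if best is not None and scores[best] > survive:
            labels[name] = ("survived as", [best])
        elif len(parts) >= 2:
            labels[name] = ("split into", sorted(parts))
        else:
            labels[name] = ("died", [])
    return labels


old = {"A": {1, 2, 3, 4}, "B": {5, 6, 7, 8}, "C": {9, 10}}
new = {"X": {1, 2, 3, 4, 11}, "Y": {5, 6}, "Z": {7, 8}}
result = track(old, new)
for name, (label, targets) in result.items():
    print(name, label, targets)
```

Here cluster A carries over into X, B is divided between Y and Z, and C has no counterpart in the new window.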

Quick Comparison

Here's a markdown table summarizing the main differences:

| Algorithm | Summary Structure | Based On | Real-time? | Cluster Evolution Tracking | Notes |
| --- | --- | --- | --- | --- | --- |
| EDMStream | Cluster-cell | Density | Yes | Yes | Tracks evolution, incremental updates |
| CluStream | Micro-clusters | Partitioning | No | Partial (offline) | Hierarchical, uses K-means offline |
| D-Stream | Grids | Density | No | Partial | Efficient for outliers, less for high-dims |
| E-Stream | FCH (histograms) | Partitioning | No | Yes | Tracks evolution using histograms |
| MEC | N/A (post-hoc) | N/A | N/A | Yes | For monitoring, not clustering itself |

Evaluation Metrics

When you cluster evolving data streams, you care about:

Cluster quality: Internal (e.g., sum of squared distances within clusters) and external (e.g., purity, entropy).
Response time: How quickly can the algorithm update clusters as new data arrives?
Adaptability: How well does it handle the appearance/disappearance of clusters?
Scalability: Can it handle high-dimensional or high-volume streams?
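As an example of an external quality measure, purity can be computed in a few lines when ground-truth labels are available. The data below is made up for illustration.

```python
from collections import Counter


def purity(cluster_ids, true_labels):
    """Fraction of points that carry the majority label of their cluster."""
    by_cluster = {}
    for c, y in zip(cluster_ids, true_labels):
        by_cluster.setdefault(c, Counter())[y] += 1
    majority_total = sum(max(counts.values()) for counts in by_cluster.values())
    return majority_total / len(true_labels)


score = purity([0, 0, 0, 1, 1, 1], ["a", "a", "b", "b", "b", "b"])
print(round(score, 3))  # 0.833
```

In a streaming setting this would typically be evaluated over a recent window of points rather than the whole stream.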

Real-World Applications

News recommendation: Grouping articles in real time as trends evolve.
Intrusion detection: Detecting new types of attacks as they emerge in network traffic.
Sensor networks: Monitoring environmental data for emerging patterns.

Final Thoughts

Clustering evolving data streams is a vibrant research area with real-world impact. The key is to balance speed, memory usage, and adaptability to change. EDMStream stands out for real-time evolution tracking, but each algorithm has its sweet spot depending on your data and requirements.

If you want to dive deeper, check out the following and all the sources cited there:

Advanced Data Engg - Clustering Evolving Data Streams
