CATO: End-to-End Optimization of ML-Based Traffic Analysis Pipelines

Abstract. Machine learning has shown tremendous potential for improving the capabilities of network traffic analysis applications, often outperforming simpler rule-based heuristics. However, ML-based solutions remain difficult to deploy in practice. Many existing approaches only optimize the predictive performance of their models, overlooking the practical challenges of running them against network traffic in real time. This is especially problematic in the domain of traffic analysis, where the efficiency of the serving pipeline is a critical factor in determining the usability of a model. In this work, we introduce CATO, a framework that addresses this problem by jointly optimizing the predictive performance and the associated systems costs of the serving pipeline. CATO leverages recent advances in multi-objective Bayesian optimization to efficiently identify Pareto-optimal configurations, and automatically compiles end-to-end optimized serving pipelines that can be deployed in real networks. Our evaluations show that compared to popular feature optimization techniques, CATO can provide up to 3600× lower inference latency and 3.7× higher zero-loss throughput while simultaneously achieving better model performance.

Making Machine Learning Network Analysis More Efficient

Machine learning has changed how we analyze network traffic, helping us classify applications, detect intrusions, and measure service quality. However, models that achieve impressive accuracy on network analysis tasks often struggle in real-time deployments. The key issue? It’s not just about having an accurate model: the entire pipeline needs to process network traffic efficiently in real time. Even small delays can cause packet loss and render a model ineffective.

Traditional approaches typically focus on one of the following:

Using lightweight models
Implementing models in specialized hardware
Making predictions with minimal data

However, these approaches often give up more accuracy than is necessary to gain speed.

How CATO Works

CATO takes a novel approach by simultaneously optimizing two critical factors:

The model’s predictive performance (accuracy)
The system costs of running the analysis pipeline

It uses a technique called multi-objective Bayesian optimization to efficiently find configurations that balance these competing goals. The system automatically compiles optimized pipelines that can be deployed in real networks.
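To make the idea of balancing two competing goals concrete, here is a minimal sketch of Pareto filtering, the notion of optimality that multi-objective optimization targets. It is not CATO's implementation and does not perform Bayesian optimization; it only shows how non-dominated configurations are picked out once each candidate has been scored. The `Candidate` type, its field names, and all numbers are hypothetical and chosen purely for illustration.

```python
from typing import NamedTuple


class Candidate(NamedTuple):
    name: str         # hypothetical configuration label
    f1_score: float   # predictive performance, higher is better
    latency_s: float  # systems cost in seconds, lower is better


def dominates(a: Candidate, b: Candidate) -> bool:
    """Return True if a is no worse than b on both objectives and strictly better on one."""
    no_worse = a.f1_score >= b.f1_score and a.latency_s <= b.latency_s
    strictly_better = a.f1_score > b.f1_score or a.latency_s < b.latency_s
    return no_worse and strictly_better


def pareto_front(candidates: list[Candidate]) -> list[Candidate]:
    """Keep only the candidates that no other candidate dominates."""
    return [c for c in candidates if not any(dominates(o, c) for o in candidates)]


# Illustrative, made-up measurements for four configurations.
candidates = [
    Candidate("A", 0.95, 2.0),
    Candidate("B", 0.93, 0.5),
    Candidate("C", 0.90, 0.8),  # worse than B on both objectives
    Candidate("D", 0.80, 0.1),
]

front = pareto_front(candidates)
print([c.name for c in front])
```

Configuration C drops out because B is both more accurate and faster. The remaining configurations each represent a different trade-off, and choosing among them depends on the deployment's requirements.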

The key innovation is that CATO searches for the optimal combination of:

Which network traffic features to analyze
How much traffic data to collect before making predictions
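The two choices above define a search space that grows quickly. The following sketch, under assumed names, represents one configuration as a feature subset plus a packet depth and enumerates a toy space exhaustively. The feature names, `MAX_PACKET_DEPTH`, and `PipelineConfig` are hypothetical placeholders, not CATO's actual feature set or API.

```python
from dataclasses import dataclass
from itertools import combinations
from typing import Iterator


@dataclass(frozen=True)
class PipelineConfig:
    features: frozenset[str]  # which traffic features to extract
    packet_depth: int         # how many packets to wait for before predicting


# Hypothetical candidate features and depth limit, for illustration only.
CANDIDATE_FEATURES = ["mean_pkt_size", "mean_iat", "tcp_flags_count", "duration"]
MAX_PACKET_DEPTH = 10


def enumerate_configs(features: list[str], max_depth: int) -> Iterator[PipelineConfig]:
    """Yield every non-empty feature subset paired with every packet depth."""
    for k in range(1, len(features) + 1):
        for subset in combinations(features, k):
            for depth in range(1, max_depth + 1):
                yield PipelineConfig(frozenset(subset), depth)


configs = list(enumerate_configs(CANDIDATE_FEATURES, MAX_PACKET_DEPTH))
print(len(configs))  # (2**4 - 1) subsets * 10 depths
```

With n candidate features and a maximum depth of d, the space holds (2^n - 1) * d configurations, so exhaustive evaluation stops being practical once n grows beyond a handful. This is the motivation for a sample-efficient search rather than brute force.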

Results

We tested CATO across several real-world scenarios with remarkable results:

Reduced inference latency by up to 3600× (from several minutes to under 0.1 seconds)
Increased zero-loss throughput by up to 3.7× without sacrificing accuracy
Improved model performance compared to traditional optimization approaches

For example, in an IoT device classification task, CATO achieved better accuracy using just 3 packets compared to baseline approaches that required 10 packets.

Why This Matters

Real-time network analysis is crucial for many applications:

Network security monitoring
Quality of service management
Traffic classification
Application performance monitoring

CATO’s ability to generate efficient pipelines without sacrificing accuracy makes ML-based analysis more practical for real-world deployments. This could enable more widespread adoption of ML techniques in network operations.

Looking Forward

This work opens up exciting possibilities for deploying ML in network environments with strict performance requirements. Future work could explore:

Broader model selection strategies
Performance profiling across different types of hardware
Automated adaptation to changing network conditions

CATO represents a significant step forward in making ML-based network analysis both accurate and efficient enough for practical deployment.

Resources

BibTeX citation

@article{wan2025cato,
  title={CATO: End-to-End Optimization of ML-Based Traffic Analysis Pipelines},
  author={Wan, Gerry and Liu, Shinan and Bronzino, Francesco and Feamster, Nick and Durumeric, Zakir},
  journal={USENIX Symposium on Networked Systems Design and Implementation},
  year={2025}
}
