Broadcast Analytics System - MLOps Platform

Status: Production-Ready (All 6 Phases Complete)
Repository: Private Company Repository
Version: 2.0

Executive Summary

A production-grade enterprise MLOps platform for broadcast analytics combining machine learning, LLM-powered chat, and automated deployment infrastructure. Features 24 trained models, intelligent caching, schedule optimization, and comprehensive monitoring.

System Architecture

graph LR
    subgraph Data["Data Layer"]
        PG["PostgreSQL 16<br/>ML Registry"]
        Mongo["MongoDB 7<br/>Chat History"]
        Redis["Redis 7<br/>LLM Cache"]
    end

    subgraph ML["ML Pipeline"]
        Loader["Data<br/>Loader"]
        Preprocess["Feature<br/>Engineering"]
        Training["Model<br/>Training"]
        Evaluation["Model<br/>Evaluation"]
        Registry["ML<br/>Registry"]
    end

    subgraph LLM["LLM Agent"]
        Agent["LangChain<br/>Agent"]
        LLMPrimary["Groq<br/>Llama 70B"]
        LLMSecondary["Gemini<br/>2.0 Flash"]
        LLMFallback["OpenRouter<br/>Fallback"]
        Cache["Response<br/>Cache"]
    end

    subgraph API["API Layer"]
        APINode["REST API<br/>OpenAPI"]
        Predictor["Prediction<br/>Service"]
        Optimizer["Schedule<br/>Optimizer"]
        Export["Data<br/>Export"]
    end

    subgraph UI["Frontend"]
        Dashboard["Monitoring<br/>Dashboard"]
        Chat["LLM Chat<br/>Interface"]
        Charts["Interactive<br/>Charts"]
    end

    subgraph Monitor["CI/CD and Monitoring"]
        GHA["GitHub<br/>Actions"]
        Prometheus["Prometheus<br/>Metrics"]
        Grafana["Grafana<br/>Dashboards"]
        Alertmanager["Alert<br/>Manager"]
    end

    Loader --> Preprocess
    Preprocess --> Training
    Training --> Evaluation
    Evaluation --> Registry
    Registry --> PG

    APINode --> Predictor
    APINode --> Optimizer
    APINode --> Export
    Predictor --> Registry

    Agent --> LLMPrimary
    LLMPrimary -.->|Fallback| LLMSecondary
    LLMSecondary -.->|Fallback| LLMFallback
    Agent --> Cache
    Cache --> Redis
    Agent --> Mongo

    Dashboard --> APINode
    Chat --> Agent
    Charts --> APINode

    APINode --> Prometheus
    Prometheus --> Grafana
    Prometheus --> Alertmanager

    classDef dataStyle fill:#FFE0B2,stroke:#E65100,stroke-width:2px
    classDef mlStyle fill:#E1BEE7,stroke:#4A148C,stroke-width:2px
    classDef llmStyle fill:#C8E6C9,stroke:#1B5E20,stroke-width:2px
    classDef apiStyle fill:#BBDEFB,stroke:#0D47A1,stroke-width:2px
    classDef frontendStyle fill:#F8BBD0,stroke:#880E4F,stroke-width:2px
    classDef cicdStyle fill:#FFF9C4,stroke:#F57F17,stroke-width:2px

    class PG,Mongo,Redis dataStyle
    class Loader,Preprocess,Training,Evaluation,Registry mlStyle
    class Agent,LLMPrimary,LLMSecondary,LLMFallback,Cache llmStyle
    class APINode,Predictor,Optimizer,Export apiStyle
    class Dashboard,Chat,Charts frontendStyle
    class GHA,Prometheus,Grafana,Alertmanager cicdStyle

System Overview

The platform provides:

  • Predictive analytics for broadcast revenue and ratings
  • Natural language query interface with LLM agents
  • Automated ML pipeline with model registry
  • Schedule optimization using genetic algorithms
  • Industrial-grade monitoring stack

Technology Stack

Backend Framework

  • API: FastAPI with async support
  • Python: 3.12+ with modern features
  • Validation: Pydantic models
  • Server: Uvicorn ASGI server
  • ORM: SQLAlchemy 2.0 for PostgreSQL

Frontend

  • Framework: React 18
  • Visualization: Chart.js for interactive charts
  • HTTP Client: Axios
  • Theme: Custom enterprise professional theme
  • Build: Create React App

Databases

PostgreSQL 16

  • Purpose: Structured data, ML model registry
  • Features:
      • Model versioning and metadata
      • Training results and metrics
      • Performance tracking
      • ACID transactions

MongoDB 7.0

  • Purpose: Unstructured data, chat sessions
  • Features:
      • Chat history storage
      • Session management
      • Flexible document schema
      • Async access with Motor

Redis 7.0

  • Purpose: High-performance caching
  • Features:
      • LLM response cache (75%+ hit rate)
      • Session data
      • Real-time metrics
      • Pub/sub messaging

Machine Learning Frameworks

Scikit-learn

  • Models: Random Forest, Gradient Boosting, ElasticNet, MLP
  • Purpose: Classical ML algorithms
  • Features: Feature engineering, preprocessing, evaluation

XGBoost

  • Purpose: Gradient boosting trees
  • Performance: Fast training, high accuracy
  • Features: GPU support, early stopping, feature importance

CatBoost

  • Purpose: Categorical feature handling
  • Performance: State-of-the-art accuracy
  • Features: Built-in categorical encoding, robust to overfitting

LLM & Agent Framework

LangChain

  • Purpose: LLM application framework
  • Features:
      • Agent orchestration
      • Tool integration
      • Memory management
      • Chain composition

LLM Providers

Primary: Groq
  • Model: Llama 3.3 70B
  • Free Tier: 14.4K TPM
  • Speed: Fastest inference
  • Use Case: Primary generation

Secondary: Google Gemini
  • Model: Gemini 2.0 Flash
  • Free Tier: 1.5M tokens/day
  • Speed: Excellent balance
  • Use Case: Fallback provider

Tertiary: OpenRouter
  • Model: Various (Llama, Mixtral)
  • Use Case: Final fallback

Optimization

Genetic Algorithm

  • Library: DEAP (Distributed Evolutionary Algorithms)
  • Purpose: Schedule optimization
  • Features:
      • Custom fitness functions
      • Multi-objective optimization
      • Constraint handling
      • Population evolution

Infrastructure & DevOps

Containerization

  • Docker: Multi-stage builds for efficiency
  • Docker Compose: Full stack orchestration
  • Images: Optimized layer caching
  • Networking: Custom bridge networks

Orchestration

  • Kubernetes: Production deployment
  • Resources:
      • Deployments for stateless services
      • StatefulSets for databases
      • Services for load balancing
      • Ingress for external access
      • ConfigMaps for configuration
      • Secrets for credentials

CI/CD

  • Platform: GitHub Actions
  • Pipelines:
      • Automated testing (pytest)
      • Code quality (black, ruff, mypy)
      • Docker image builds
      • Deployment automation
      • ML model training pipeline

Monitoring Stack

Prometheus 2.48.0
  • Time-series metrics collection
  • 90-day data retention
  • Alert rule evaluation
  • Service discovery

Grafana 10.2.2
  • Professional dashboards
  • Real-time visualization
  • Alert management
  • Data source integration

Alertmanager
  • Alert routing
  • Email and Slack notifications
  • Alert grouping and deduplication
  • Silence management

Loki + Promtail
  • Log aggregation
  • Log querying
  • Integration with Grafana
  • Distributed tracing (planned)

Core Features

Automated ML Pipeline

1. Data Loading

  • ETL from PostgreSQL
  • Feature extraction
  • Data validation
  • Missing value handling

2. Preprocessing

  • Feature engineering
  • Categorical encoding
  • Numerical scaling
  • Train/test splitting
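The real pipeline uses scikit-learn transformers for these steps; as a dependency-free sketch of the two core operations — one-hot encoding a categorical column and standard-scaling a numeric one — with illustrative column names and data:

```python
from statistics import mean, stdev

rows = [  # illustrative records
    {"genre": "news", "duration_min": 30},
    {"genre": "drama", "duration_min": 60},
    {"genre": "news", "duration_min": 45},
]

# One-hot encode the categorical column.
genres = sorted({r["genre"] for r in rows})
encoded = [[1.0 if r["genre"] == g else 0.0 for g in genres] for r in rows]

# Standard-scale the numeric column: (x - mean) / std.
values = [r["duration_min"] for r in rows]
mu, sigma = mean(values), stdev(values)
scaled = [(v - mu) / sigma for v in values]

# Assemble the feature matrix used for training.
features = [e + [s] for e, s in zip(encoded, scaled)]
```

In the actual codebase these steps live in `preprocess.py`, where scikit-learn's encoders and scalers also handle unseen categories and persistence of the fitted transformers.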

3. Model Training

  • 6 Algorithms:
      • Random Forest
      • XGBoost
      • CatBoost
      • Gradient Boosting
      • Multi-Layer Perceptron (MLP)
      • ElasticNet

  • 4 Target Variables:
      • Rating predictions
      • Revenue forecasts
      • Audience metrics
      • Engagement scores

  • Total: 24 trained models (6 × 4)
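The 6 × 4 training grid above reduces to a simple loop; the algorithm and target names mirror the lists, while `train_one` is a hypothetical stand-in for the real fit-and-evaluate step:

```python
from itertools import product

ALGORITHMS = ["random_forest", "xgboost", "catboost",
              "gradient_boosting", "mlp", "elasticnet"]
TARGETS = ["rating", "revenue", "audience", "engagement"]

def train_one(algorithm: str, target: str) -> dict:
    """Stand-in for the real training step: fit the model,
    evaluate it, and return a registry record."""
    return {"algorithm": algorithm, "target": target, "r2": 0.0}

def train_all() -> list[dict]:
    # One model per (algorithm, target) pair: 6 x 4 = 24 models.
    return [train_one(a, t) for a, t in product(ALGORITHMS, TARGETS)]

models = train_all()
```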

4. Model Evaluation

  • R² score calculation
  • MSE, RMSE, MAE metrics
  • Feature importance analysis
  • Cross-validation
  • Model comparison

5. Model Selection

  • Automatic best model selection (R² > 0.7)
  • Versioning and metadata tracking
  • A/B testing support
  • Rollback capabilities

6. Model Registry

  • Database-backed storage
  • Version control
  • Metadata tracking (hyperparameters, metrics, training date)
  • Production model tagging
  • Model provenance

LLM-Powered Analytics

7 Specialized Tools

  1. get_programs: List all available programs
  2. get_program_details: Detailed program information
  3. predict_values: Generate predictions for specific programs
  4. get_best_model: Retrieve best model for target variable
  5. analyze_trends: Time-series trend analysis
  6. export_data: Export results (JSON/CSV/Excel)
  7. optimize_schedule: Run genetic algorithm optimization
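In LangChain these tools are plain Python functions registered with the agent; a framework-free sketch of the dispatch idea, with tool names matching the list above and bodies reduced to illustrative stubs:

```python
from typing import Callable

TOOLS: dict[str, Callable] = {}

def tool(fn: Callable) -> Callable:
    """Minimal stand-in for LangChain's @tool decorator:
    register the function under its own name."""
    TOOLS[fn.__name__] = fn
    return fn

@tool
def get_programs() -> list[str]:
    """List all available programs."""
    return ["Morning Show", "Evening News"]  # illustrative data

@tool
def predict_values(program_id: str, target: str) -> dict:
    """Generate predictions for a specific program."""
    return {"program_id": program_id, "target": target, "value": 0.0}

# The agent picks a tool by name and calls it with LLM-extracted args.
result = TOOLS["predict_values"]("ABC123", "revenue")
```

The real agent additionally validates arguments against each tool's schema before calling it.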

Agent Architecture

LangChain Agent
  ├── Chat History (MongoDB)
  ├── Tool Selection (dynamic)
  ├── LLM Generation (multi-provider)
  └── Response Formatting

Natural Language Queries

Examples:

  • "What's the predicted revenue for program X?"
  • "Show me the top 5 programs by rating"
  • "Optimize the schedule for prime time"
  • "What factors affect viewership most?"
  • "Export last week's data to Excel"

Response Caching System

Architecture:
  • Redis-based caching layer
  • Key: hash(query + parameters)
  • TTL: Configurable (default 1 hour)
  • Invalidation: Smart cache invalidation

Benefits:
  • Cost Reduction: 75%+ reduction in LLM API costs
  • Latency: <10ms for cache hits vs 2-5s for misses
  • Scalability: Reduces backend load
  • Consistency: Same query returns same answer

Metrics:
  • Cache hit rate monitoring
  • Cost savings tracking
  • Performance analytics

Schedule Optimization

Genetic Algorithm Optimizer

Components:

  1. Chromosome: Schedule representation
  2. Fitness Function: Multi-objective scoring
      • Maximize total revenue
      • Maximize audience reach
      • Balance genre distribution
      • Respect time slot constraints
  3. Genetic Operators:
      • Selection (tournament, roulette)
      • Crossover (single-point, uniform)
      • Mutation (swap, shuffle)
  4. Evolution: 100+ generations
  5. Convergence: Early stopping on plateau

Constraints:
  • Time slot availability
  • Content ratings (G, PG, R)
  • Minimum/maximum program length
  • Genre diversity requirements
  • Advertiser preferences

Output:
  • Optimized schedule
  • Predicted performance metrics
  • Visualization of improvements
  • Alternative schedules (top-K)
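The production optimizer builds on DEAP; as a minimal self-contained sketch of the same loop — tournament selection plus a replace-style mutation over a toy schedule, with a single revenue-only fitness and all data illustrative:

```python
import random

random.seed(42)

PROGRAMS = list(range(12))                                # toy program ids
REVENUE = {p: random.uniform(1, 10) for p in PROGRAMS}    # toy revenue per program
SLOTS = 6                                                 # schedule length

def fitness(schedule: list[int]) -> float:
    # Toy single objective: total revenue. The real fitness also
    # scores reach, genre balance, and slot constraints.
    return sum(REVENUE[p] for p in schedule)

def mutate(schedule: list[int]) -> list[int]:
    # Replace one slot with a program not yet in the schedule.
    s = schedule[:]
    i = random.randrange(SLOTS)
    s[i] = random.choice([p for p in PROGRAMS if p not in s])
    return s

def evolve(generations: int = 100, pop_size: int = 20) -> list[int]:
    pop = [random.sample(PROGRAMS, SLOTS) for _ in range(pop_size)]
    for _ in range(generations):
        # Tournament selection of size 3, then mutate the winners.
        parents = [max(random.sample(pop, 3), key=fitness)
                   for _ in range(pop_size)]
        pop = [mutate(p) for p in parents]
    return max(pop, key=fitness)

best = evolve()
```

A real run would add crossover, elitism, and early stopping on plateau; DEAP's toolbox provides those operators off the shelf.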

API Endpoints

Chat & Session Management

POST /api/v1/chat

Send message to LLM agent.

Request:

{
  "message": "Predict revenue for program XYZ",
  "session_id": "uuid-..."
}

Response:

{
  "response": "Based on historical data...",
  "session_id": "uuid-...",
  "metadata": {
    "model_used": "groq-llama-70b",
    "cache_hit": false,
    "tokens_used": 450,
    "latency_ms": 1850
  }
}

POST /api/v1/sessions/create

Create new chat session.

POST /api/v1/chat/history

Get conversation history.

Models & Predictions

GET /api/v1/models/production

List all production models.

POST /api/v1/predict

Generate predictions.

Request:

{
  "program_id": "ABC123",
  "target": "revenue",
  "features": {...}
}

GET /api/v1/models/{target}/best

Get best model for target variable.

Data & Export

POST /api/v1/export

Export data in various formats.

Formats: JSON, CSV, Excel

GET /api/v1/programs

List all programs with metadata.

Monitoring

GET /health

Basic health check.

GET /health/deep

Detailed system health:
  • Database connections
  • Redis status
  • LLM provider status
  • Disk space
  • Memory usage

GET /api/v1/llm/cache/stats

Cache performance statistics:

{
  "hit_rate": 0.78,
  "total_requests": 10000,
  "cache_hits": 7800,
  "cache_misses": 2200,
  "cost_savings": "$375",
  "avg_latency_cached_ms": 8,
  "avg_latency_uncached_ms": 2100
}

GET /api/v1/llm/tokens/usage

Token usage tracking per model.

GET /metrics

Prometheus metrics endpoint.

Project Structure

v2/
├── src/                           # Source code
│   ├── core/                      # Core application
│   │   ├── server.py              # FastAPI server
│   │   ├── agent_service.py       # LangChain agent
│   │   ├── agent_service_cached.py # Cached wrapper
│   │   ├── llm_cache.py           # Redis caching
│   │   ├── llm_ab_testing.py      # A/B testing
│   │   └── config.py              # Configuration
│   │
│   └── ml/                        # ML Pipeline
│       ├── EDA.py                 # Exploratory analysis
│       ├── preprocess.py          # Feature engineering
│       ├── modeling.py            # Model training
│       ├── predictor.py           # Predictions
│       ├── optimizer.py           # Schedule optimization
│       └── data_Loader.py         # Data loading
│
├── db/                            # Database layer
│   ├── postgres_manager.py        # PostgreSQL
│   ├── mongo_manager.py           # MongoDB
│   ├── ml_registry.py             # Model registry
│   ├── models.py                  # SQLAlchemy models
│   ├── migrations/                # SQL migrations
│   └── schemas/                   # Pydantic schemas
│
├── dashboard/                     # Frontend
│   └── src/
│       ├── Chat.js                # Chat interface
│       ├── MonitoringDashboard.js # Monitoring UI
│       ├── ChartComponent.js      # Visualizations
│       └── App.js                 # Main app
│
├── infrastructure/                # Deployment
│   ├── docker/
│   │   ├── docker-compose.yml     # Full stack
│   │   ├── Dockerfile.api         # API image
│   │   └── .env.example           # Environment template
│   ├── k8s/                       # Kubernetes
│   │   ├── base/                  # Base manifests
│   │   └── README.md              # K8s guide
│   ├── monitoring/                # Monitoring
│   │   ├── prometheus.yml         # Prometheus config
│   │   └── alerts.yml             # Alert rules
│   └── deploy.sh                  # Deployment script
│
├── scripts/                       # Automation
│   ├── deploy_databases.sh        # DB deployment
│   ├── train_models.py            # Automated training
│   └── monitor_models.py          # Performance monitoring
│
├── tests/                         # Test suite
│   ├── unit/                      # Unit tests
│   ├── integration/               # Integration tests
│   └── conftest.py                # Pytest fixtures
│
├── .github/workflows/             # CI/CD
│   ├── ci.yml                     # Continuous Integration
│   ├── cd.yml                     # Continuous Deployment
│   └── ml-pipeline.yml            # ML automation
│
└── docs/                          # Documentation
    ├── getting-started/           # Quick start guides
    ├── development/               # Development docs
    ├── production/                # Production guides
    └── reference/                 # Technical reference

Performance Characteristics

API Performance

  • Average Latency: 200-500ms
  • P95 Latency: <2s
  • P99 Latency: <5s
  • Throughput: 1000+ requests/minute

Database Performance

  • Query Latency: <100ms average
  • Connection Pool: 10-50 connections
  • Transaction Rate: 5000+ TPS

Cache Performance

  • Hit Rate: 75-80% in production
  • Latency: <10ms for hits
  • Memory Usage: ~500MB for 10K cached responses

ML Inference

  • Single Prediction: 50-100ms
  • Batch (100): 500ms
  • Model Loading: <1s on startup

Schedule Optimization

  • Small Schedule (50 slots): 2-3 seconds
  • Large Schedule (200 slots): 5-10 seconds
  • Genetic Algorithm: 100-500 generations

Monitoring & Observability

Pre-configured Dashboards

System Metrics:
  • CPU usage per service
  • Memory utilization
  • Disk I/O and space
  • Network throughput

Application Metrics:
  • API request rate and latency
  • Error rates by endpoint
  • Cache hit/miss rates
  • LLM token usage

Database Metrics:
  • Query performance
  • Connection pool status
  • Database size and growth
  • Slow query log

ML Metrics:
  • Model accuracy drift
  • Prediction latency
  • Feature distribution
  • Training job status

Alert Rules

Critical Alerts:
  • Database down (immediate)
  • API error rate >5% (immediate)
  • Disk space <10% (15 min)
  • Memory usage >90% (5 min)

Warning Alerts:
  • High latency (P95 >3s) (30 min)
  • Cache hit rate <50% (1 hour)
  • Model accuracy drift >10% (daily)

Notification Channels:
  • Email for critical alerts
  • Slack for all alerts
  • PagerDuty integration (optional)

Security Features

API Security

  • Input Validation: Pydantic models enforce schemas
  • SQL Injection: Parameterized queries only
  • XSS Protection: Input sanitization
  • Rate Limiting: Per-IP and per-user limits
  • CORS: Configured per environment

Authentication & Authorization

  • JWT Tokens: Stateless authentication
  • Token Expiration: Configurable TTL
  • Role-Based Access: Admin, user, read-only roles
  • API Keys: For service-to-service communication

Data Security

  • Encryption at Rest: Database encryption
  • Encryption in Transit: TLS/SSL for all connections
  • Secrets Management: Kubernetes Secrets, environment variables
  • Audit Logging: All sensitive operations logged

Container Security

  • Image Scanning: Trivy for vulnerability scanning
  • Non-root User: Containers run as unprivileged user
  • Read-only Filesystem: Where possible
  • Network Policies: Kubernetes network isolation

Cost Optimization

Development (Free Tier)

  • Databases: Local Docker containers
  • LLMs: 100% free-tier APIs
  • Hosting: Local machine
  • Total: $0/month

Production (Estimated)

  • Infrastructure: $100-200/month
      • Kubernetes cluster (3 nodes)
      • Load balancer
      • Storage (100GB)
  • Databases: $100/month
      • Managed PostgreSQL
      • Managed MongoDB
      • Managed Redis
  • LLM (with caching): $50-100/month
      • Free tier covers most queries
      • Paid tier for overflow
  • Monitoring: Included (open source)
  • Total: $300-500/month

Savings from Caching

  • Without Cache: ~$500/month in LLM costs
  • With Cache (75% hit rate): ~$125/month
  • Savings: $375/month (75% reduction)
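The savings figure follows directly from the hit rate, since only cache misses reach the paid LLM APIs; a quick check of the arithmetic above, assuming the document's $500/month baseline:

```python
def llm_cost(baseline: float, hit_rate: float) -> float:
    """Monthly LLM spend: only the miss fraction is billed."""
    return baseline * (1 - hit_rate)

baseline = 500.0  # assumed monthly LLM spend without caching
cost = llm_cost(baseline, hit_rate=0.75)   # 125.0
savings = baseline - cost                  # 375.0
```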

Deployment

Docker Compose (Quick Start)

# Start all services
docker-compose up -d

# Services include:
# - PostgreSQL 16
# - MongoDB 7.0
# - Redis 7.0
# - FastAPI backend (port 5000)
# - React frontend (port 3000)

Kubernetes (Production)

# Deploy to production
bash deploy.sh kubernetes production

# Verify deployment
kubectl get pods -n bms
kubectl get services -n bms

# View logs
kubectl logs -f deployment/api -n bms

Testing

Unit Tests

# Run all unit tests
pytest tests/unit/ -v

# With coverage
pytest tests/unit/ --cov=. --cov-report=html

Integration Tests

# Run integration tests
pytest tests/integration/ -v

# Requires running services (databases, Redis)

End-to-End Tests

# Run E2E tests
pytest tests/e2e/ -v

# Tests full workflows

Load Testing

# Using locust
locust -f tests/load/locustfile.py

# Simulates 100+ concurrent users

Documentation

Comprehensive Guides

Getting Started:
  • Installation guide
  • Quick start (5-minute setup)
  • Project overview
  • Architecture introduction

Development:
  • Setup guide for developers
  • ML pipeline documentation
  • LLM agent development
  • API reference
  • Database schema

Production:
  • Database deployment and scaling
  • Application deployment
  • Infrastructure setup (Docker, K8s)
  • MLOps best practices
  • Monitoring setup

Reference:
  • Project structure
  • ML pipeline technical details
  • Further reading on algorithms

Completed Phases

Future Enhancements

Planned Features:
  • Real-time streaming predictions
  • Advanced visualization (D3.js)
  • Mobile app (React Native)
  • Multi-tenant support
  • Advanced anomaly detection
  • AutoML for hyperparameter tuning

Infrastructure:
  • Multi-region deployment
  • Advanced caching (CDN)
  • Read replicas for databases
  • Service mesh (Istio)
  • Advanced tracing (Jaeger)

  • Repository: Private Company Repository
  • Documentation: Comprehensive guides for development, production, and MLOps
