Broadcast Analytics System - MLOps Platform¶
- **Status:** Production-Ready (All 6 Phases Complete)
- **Repository:** Private Company Repository
- **Version:** 2.0
Executive Summary¶
A production-grade enterprise MLOps platform for broadcast analytics combining machine learning, LLM-powered chat, and automated deployment infrastructure. Features 24 trained models, intelligent caching, schedule optimization, and comprehensive monitoring.
System Architecture¶
```mermaid
graph LR
    subgraph Data["Data Layer"]
        PG["PostgreSQL 16<br/>ML Registry"]
        Mongo["MongoDB 7<br/>Chat History"]
        Redis["Redis 7<br/>LLM Cache"]
    end
    subgraph ML["ML Pipeline"]
        Loader["Data<br/>Loader"]
        Preprocess["Feature<br/>Engineering"]
        Training["Model<br/>Training"]
        Evaluation["Model<br/>Evaluation"]
        Registry["ML<br/>Registry"]
    end
    subgraph LLM["LLM Agent"]
        Agent["LangChain<br/>Agent"]
        LLMPrimary["Groq<br/>Llama 70B"]
        LLMSecondary["Gemini<br/>2 Flash"]
        LLMFallback["OpenRouter<br/>Fallback"]
        Cache["Response<br/>Cache"]
    end
    subgraph API["API Layer"]
        APINode["REST API<br/>OpenAPI"]
        Predictor["Prediction<br/>Service"]
        Optimizer["Schedule<br/>Optimizer"]
        Export["Data<br/>Export"]
    end
    subgraph UI["Frontend"]
        Dashboard["Monitoring<br/>Dashboard"]
        Chat["LLM Chat<br/>Interface"]
        Charts["Interactive<br/>Charts"]
    end
    subgraph Monitor["CI/CD and Monitoring"]
        GHA["GitHub<br/>Actions"]
        Prometheus["Prometheus<br/>Metrics"]
        Grafana["Grafana<br/>Dashboards"]
        Alertmanager["Alert<br/>Manager"]
    end
    Loader --> Preprocess
    Preprocess --> Training
    Training --> Evaluation
    Evaluation --> Registry
    Registry --> PG
    APINode --> Predictor
    APINode --> Optimizer
    APINode --> Export
    Predictor --> Registry
    Agent --> LLMPrimary
    LLMPrimary -.->|Fallback| LLMSecondary
    LLMSecondary -.->|Fallback| LLMFallback
    Agent --> Cache
    Cache --> Redis
    Agent --> Mongo
    Dashboard --> APINode
    Chat --> Agent
    Charts --> APINode
    APINode --> Prometheus
    Prometheus --> Grafana
    Prometheus --> Alertmanager
    classDef dataStyle fill:#FFE0B2,stroke:#E65100,stroke-width:2px
    classDef mlStyle fill:#E1BEE7,stroke:#4A148C,stroke-width:2px
    classDef llmStyle fill:#C8E6C9,stroke:#1B5E20,stroke-width:2px
    classDef apiStyle fill:#BBDEFB,stroke:#0D47A1,stroke-width:2px
    classDef frontendStyle fill:#F8BBD0,stroke:#880E4F,stroke-width:2px
    classDef cicdStyle fill:#FFF9C4,stroke:#F57F17,stroke-width:2px
    class PG,Mongo,Redis dataStyle
    class Loader,Preprocess,Training,Evaluation,Registry mlStyle
    class Agent,LLMPrimary,LLMSecondary,LLMFallback,Cache llmStyle
    class APINode,Predictor,Optimizer,Export apiStyle
    class Dashboard,Chat,Charts frontendStyle
    class GHA,Prometheus,Grafana,Alertmanager cicdStyle
```
System Overview¶
The platform provides:

- Predictive analytics for broadcast revenue and ratings
- Natural language query interface with LLM agents
- Automated ML pipeline with model registry
- Schedule optimization using genetic algorithms
- Industrial-grade monitoring stack
Technology Stack¶
Backend Framework¶
- API: FastAPI with async support
- Python: 3.12+ with modern features
- Validation: Pydantic models
- Server: Uvicorn ASGI server
- ORM: SQLAlchemy 2.0 for PostgreSQL
Frontend¶
- Framework: React 18
- Visualization: Chart.js for interactive charts
- HTTP Client: Axios
- Theme: Custom enterprise professional theme
- Build: Create React App
Databases¶
PostgreSQL 16¶
- Purpose: Structured data, ML model registry
- Features:
  - Model versioning and metadata
  - Training results and metrics
  - Performance tracking
  - ACID transactions
MongoDB 7.0¶
- Purpose: Unstructured data, chat sessions
- Features:
  - Chat history storage
  - Session management
  - Flexible document schema
  - Async access with Motor
Redis 7.0¶
- Purpose: High-performance caching
- Features:
  - LLM response cache (75%+ hit rate)
  - Session data
  - Real-time metrics
  - Pub/sub messaging
Machine Learning Frameworks¶
Scikit-learn¶
- Models: Random Forest, Gradient Boosting, ElasticNet, MLP
- Purpose: Classical ML algorithms
- Features: Feature engineering, preprocessing, evaluation
XGBoost¶
- Purpose: Gradient boosting trees
- Performance: Fast training, high accuracy
- Features: GPU support, early stopping, feature importance
CatBoost¶
- Purpose: Categorical feature handling
- Performance: State-of-the-art accuracy
- Features: Built-in categorical encoding, robust to overfitting
LLM & Agent Framework¶
LangChain¶
- Purpose: LLM application framework
- Features:
  - Agent orchestration
  - Tool integration
  - Memory management
  - Chain composition
LLM Providers¶
Primary: Groq
- Model: Llama 3.3 70B
- Free Tier: 14.4K TPM
- Speed: Fastest inference
- Use Case: Primary generation

Secondary: Google Gemini
- Model: Gemini 2.0 Flash
- Free Tier: 1.5M tokens/day
- Speed: Excellent balance
- Use Case: Fallback provider

Tertiary: OpenRouter
- Model: Various (Llama, Mixtral)
- Use Case: Final fallback
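The three-tier cascade above can be sketched as an ordered-fallback loop: try each provider in priority order and move on when one fails. The provider stubs and the `generate_with_fallback` helper below are illustrative, not the platform's actual client classes.

```python
# Illustrative sketch of the Groq -> Gemini -> OpenRouter cascade.
# Provider names and the call interface are assumptions for this example.
from typing import Callable

class ProviderError(Exception):
    """Raised when a provider fails (rate limit, timeout, outage)."""

def generate_with_fallback(
    prompt: str, providers: list[tuple[str, Callable[[str], str]]]
) -> tuple[str, str]:
    """Try each provider in priority order; return (provider_name, response)."""
    last_error: ProviderError | None = None
    for name, call in providers:
        try:
            return name, call(prompt)
        except ProviderError as exc:
            last_error = exc  # fall through to the next provider
    raise ProviderError(f"all providers failed: {last_error}")

# Stubs standing in for the real Groq, Gemini, and OpenRouter clients.
def groq_stub(prompt: str) -> str:
    raise ProviderError("rate limited")  # e.g. the 14.4K TPM cap was hit

def gemini_stub(prompt: str) -> str:
    return f"gemini answer to: {prompt}"

def openrouter_stub(prompt: str) -> str:
    return f"openrouter answer to: {prompt}"

chain = [("groq", groq_stub), ("gemini", gemini_stub), ("openrouter", openrouter_stub)]
provider, answer = generate_with_fallback("top programs?", chain)
# Groq raised, so the cascade fell through to Gemini.
```

In production the same loop would also record which tier answered, since the `model_used` field in chat responses exposes it.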
Optimization¶
Genetic Algorithm¶
- Library: DEAP (Distributed Evolutionary Algorithms)
- Purpose: Schedule optimization
- Features:
  - Custom fitness functions
  - Multi-objective optimization
  - Constraint handling
  - Population evolution
Infrastructure & DevOps¶
Containerization¶
- Docker: Multi-stage builds for efficiency
- Docker Compose: Full stack orchestration
- Images: Optimized layer caching
- Networking: Custom bridge networks
Orchestration¶
- Kubernetes: Production deployment
- Resources:
  - Deployments for stateless services
  - StatefulSets for databases
  - Services for load balancing
  - Ingress for external access
  - ConfigMaps for configuration
  - Secrets for credentials
CI/CD¶
- Platform: GitHub Actions
- Pipelines:
  - Automated testing (pytest)
  - Code quality (black, ruff, mypy)
  - Docker image builds
  - Deployment automation
  - ML model training pipeline
Monitoring Stack¶
Prometheus 2.48.0
- Time-series metrics collection
- 90-day data retention
- Alert rule evaluation
- Service discovery

Grafana 10.2.2
- Professional dashboards
- Real-time visualization
- Alert management
- Data source integration

Alertmanager
- Alert routing
- Email and Slack notifications
- Alert grouping and deduplication
- Silence management

Loki + Promtail
- Log aggregation
- Log querying
- Integration with Grafana
- Distributed tracing (planned)
Core Features¶
Automated ML Pipeline¶
1. Data Loading¶
- ETL from PostgreSQL
- Feature extraction
- Data validation
- Missing value handling
2. Preprocessing¶
- Feature engineering
- Categorical encoding
- Numerical scaling
- Train/test splitting
3. Model Training¶
- 6 Algorithms:
  - Random Forest
  - XGBoost
  - CatBoost
  - Gradient Boosting
  - Multi-Layer Perceptron (MLP)
  - ElasticNet
- 4 Target Variables:
  - Rating predictions
  - Revenue forecasts
  - Audience metrics
  - Engagement scores
- Total: 24 trained models (6 × 4)
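The 6 × 4 training grid reduces to a nested loop over algorithm and target. This is a stub sketch of that loop; the real pipeline instantiates scikit-learn, XGBoost, and CatBoost estimators, and the names below are labels, not the repository's actual identifiers.

```python
# Sketch of the 6-algorithm x 4-target training grid (24 models total).
# `train_stub` stands in for fit() + evaluation + registry insertion.
ALGORITHMS = [
    "random_forest", "xgboost", "catboost",
    "gradient_boosting", "mlp", "elasticnet",
]
TARGETS = ["rating", "revenue", "audience", "engagement"]

def train_stub(algorithm: str, target: str) -> dict:
    """Stand-in for training one model and returning its registry entry."""
    return {"algorithm": algorithm, "target": target, "r2": 0.0}

# One model per (algorithm, target) pair -> 24 registry entries.
registry = [train_stub(a, t) for a in ALGORITHMS for t in TARGETS]
assert len(registry) == 24
```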
4. Model Evaluation¶
- R² score calculation
- MSE, RMSE, MAE metrics
- Feature importance analysis
- Cross-validation
- Model comparison
5. Model Selection¶
- Automatic best model selection (R² > 0.7)
- Versioning and metadata tracking
- A/B testing support
- Rollback capabilities
6. Model Registry¶
- Database-backed storage
- Version control
- Metadata tracking (hyperparameters, metrics, training date)
- Production model tagging
- Model provenance
LLM-Powered Analytics¶
7 Specialized Tools¶
- get_programs: List all available programs
- get_program_details: Detailed program information
- predict_values: Generate predictions for specific programs
- get_best_model: Retrieve best model for target variable
- analyze_trends: Time-series trend analysis
- export_data: Export results (JSON/CSV/Excel)
- optimize_schedule: Run genetic algorithm optimization
Agent Architecture¶
```text
LangChain Agent
├── Chat History (MongoDB)
├── Tool Selection (dynamic)
├── LLM Generation (multi-provider)
└── Response Formatting
```
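At its core, the "Tool Selection" step maps a tool name chosen by the LLM onto a registered function. The real system uses LangChain's tool abstraction; this stdlib-only sketch shows the registration-and-dispatch idea with stub implementations (the return values are invented).

```python
# Minimal tool registry mirroring the 7-tool agent above (two tools shown).
TOOLS: dict[str, callable] = {}

def tool(name: str):
    """Decorator registering a function under a tool name."""
    def register(fn):
        TOOLS[name] = fn
        return fn
    return register

@tool("get_programs")
def get_programs() -> list[str]:
    return ["Morning News", "Prime Drama"]  # stand-in for a DB query

@tool("predict_values")
def predict_values(program: str, target: str) -> dict:
    return {"program": program, "target": target, "value": 42.0}  # stub prediction

def dispatch(name: str, **kwargs):
    """The agent's tool-selection step reduces to a lookup like this."""
    return TOOLS[name](**kwargs)

result = dispatch("predict_values", program="Prime Drama", target="revenue")
```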
Natural Language Queries¶
Examples:

- "What's the predicted revenue for program X?"
- "Show me the top 5 programs by rating"
- "Optimize the schedule for prime time"
- "What factors affect viewership most?"
- "Export last week's data to Excel"
Response Caching System¶
Architecture:
- Redis-based caching layer
- Key: hash(query + parameters)
- TTL: Configurable (default 1 hour)
- Invalidation: Smart cache invalidation

Benefits:
- Cost Reduction: 75%+ reduction in LLM API costs
- Latency: <10ms for cache hits vs 2-5s for misses
- Scalability: Reduces backend load
- Consistency: Same query returns same answer

Metrics:
- Cache hit rate monitoring
- Cost savings tracking
- Performance analytics
Schedule Optimization¶
Genetic Algorithm Optimizer¶
Components:

1. Chromosome: Schedule representation
2. Fitness Function: Multi-objective scoring
   - Maximize total revenue
   - Maximize audience reach
   - Balance genre distribution
   - Respect time slot constraints
3. Genetic Operators:
   - Selection (tournament, roulette)
   - Crossover (single-point, uniform)
   - Mutation (swap, shuffle)
4. Evolution: 100+ generations
5. Convergence: Early stopping on plateau

Constraints:
- Time slot availability
- Content ratings (G, PG, R)
- Minimum/maximum program length
- Genre diversity requirements
- Advertiser preferences

Output:
- Optimized schedule
- Predicted performance metrics
- Visualization of improvements
- Alternative schedules (top-K)
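A toy version of the core loop: a chromosome is a permutation of program IDs over time slots, fitness is a weighted-revenue stand-in, and evolution uses tournament selection plus swap mutation. This is a deliberately minimal sketch — the production optimizer uses DEAP with crossover operators, constraints, and multi-objective scoring, and all numbers below are invented.

```python
# Toy schedule GA: place 5 programs into 5 slots to maximize weighted revenue.
import random

random.seed(0)
REVENUE = [5.0, 3.0, 8.0, 2.0, 6.0]      # per-program revenue (toy data)
SLOT_WEIGHT = [0.5, 0.8, 1.5, 1.2, 0.6]  # prime-time slots weigh more

def fitness(schedule: list[int]) -> float:
    """Single-objective stand-in for the multi-objective scorer."""
    return sum(REVENUE[p] * SLOT_WEIGHT[slot] for slot, p in enumerate(schedule))

def mutate(schedule: list[int]) -> list[int]:
    """Swap mutation: exchange the programs in two random slots."""
    child = schedule[:]
    i, j = random.sample(range(len(child)), 2)
    child[i], child[j] = child[j], child[i]
    return child

def evolve(generations: int = 200, pop_size: int = 20) -> list[int]:
    pop = [random.sample(range(5), 5) for _ in range(pop_size)]
    for _ in range(generations):
        a, b = random.sample(pop, 2)          # tournament selection (size 2)
        pop.append(mutate(max(a, b, key=fitness)))
        pop.remove(min(pop, key=fitness))     # cull the weakest (elitist)
    return max(pop, key=fitness)

best = evolve()
# `best` tends to place high-revenue programs into high-weight slots.
```

Because replacement always culls the weakest individual, the best fitness in the population is non-decreasing, which is what makes plateau-based early stopping meaningful.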
API Endpoints¶
Chat & Session Management¶
POST /api/v1/chat¶
Send message to LLM agent.
Request:
```json
{
  "message": "Predict revenue for program XYZ",
  "session_id": "uuid-..."
}
```

Response:

```json
{
  "response": "Based on historical data...",
  "session_id": "uuid-...",
  "metadata": {
    "model_used": "groq-llama-70b",
    "cache_hit": false,
    "tokens_used": 450,
    "latency_ms": 1850
  }
}
```
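A client round-trip against this endpoint reduces to serializing the request and reading the documented fields back out. There is no live server in this sketch — the response is canned in the shape shown above, and only the field names come from the documented schema.

```python
# Build a /api/v1/chat request payload and parse a response in the documented
# shape. The canned response below is illustrative, not server output.
import json

def build_chat_request(message: str, session_id: str) -> str:
    return json.dumps({"message": message, "session_id": session_id})

raw_response = json.dumps({
    "response": "Based on historical data...",
    "session_id": "uuid-1234",
    "metadata": {
        "model_used": "groq-llama-70b",
        "cache_hit": False,
        "tokens_used": 450,
        "latency_ms": 1850,
    },
})

reply = json.loads(raw_response)
cached = reply["metadata"]["cache_hit"]  # False: this call went to the LLM
```

The `metadata` block is what makes the caching layer observable per request: clients can log `cache_hit` and `latency_ms` alongside their own timings.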
POST /api/v1/sessions/create¶
Create new chat session.
POST /api/v1/chat/history¶
Get conversation history.
Models & Predictions¶
GET /api/v1/models/production¶
List all production models.
POST /api/v1/predict¶
Generate predictions.
Request:
```json
{
  "program_id": "ABC123",
  "target": "revenue",
  "features": {...}
}
```
GET /api/v1/models/{target}/best¶
Get best model for target variable.
Data & Export¶
POST /api/v1/export¶
Export data in various formats.
Formats: JSON, CSV, Excel
GET /api/v1/programs¶
List all programs with metadata.
Monitoring¶
GET /health¶
Basic health check.
GET /health/deep¶
Detailed system health:

- Database connections
- Redis status
- LLM provider status
- Disk space
- Memory usage
GET /api/v1/llm/cache/stats¶
Cache performance statistics:
```json
{
  "hit_rate": 0.78,
  "total_requests": 10000,
  "cache_hits": 7800,
  "cache_misses": 2200,
  "cost_savings": "$375",
  "avg_latency_cached_ms": 8,
  "avg_latency_uncached_ms": 2100
}
```
GET /api/v1/llm/tokens/usage¶
Token usage tracking per model.
GET /metrics¶
Prometheus metrics endpoint.
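Prometheus discovers and pulls `/metrics` via a scrape job. The fragment below is a hypothetical `prometheus.yml` entry — the job name, target host, and interval are assumptions, not the repository's actual configuration (the port matches the Docker Compose setup):

```yaml
# Hypothetical scrape job for the API's /metrics endpoint.
scrape_configs:
  - job_name: "bms-api"
    metrics_path: /metrics
    scrape_interval: 15s
    static_configs:
      - targets: ["api:5000"]  # FastAPI backend (port 5000 in Docker Compose)
```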
Project Structure¶
```text
v2/
├── src/                              # Source code
│   ├── core/                         # Core application
│   │   ├── server.py                 # FastAPI server
│   │   ├── agent_service.py          # LangChain agent
│   │   ├── agent_service_cached.py   # Cached wrapper
│   │   ├── llm_cache.py              # Redis caching
│   │   ├── llm_ab_testing.py         # A/B testing
│   │   └── config.py                 # Configuration
│   │
│   └── ml/                           # ML Pipeline
│       ├── EDA.py                    # Exploratory analysis
│       ├── preprocess.py             # Feature engineering
│       ├── modeling.py               # Model training
│       ├── predictor.py              # Predictions
│       ├── optimizer.py              # Schedule optimization
│       └── data_Loader.py            # Data loading
│
├── db/                               # Database layer
│   ├── postgres_manager.py           # PostgreSQL
│   ├── mongo_manager.py              # MongoDB
│   ├── ml_registry.py                # Model registry
│   ├── models.py                     # SQLAlchemy models
│   ├── migrations/                   # SQL migrations
│   └── schemas/                      # Pydantic schemas
│
├── dashboard/                        # Frontend
│   └── src/
│       ├── Chat.js                   # Chat interface
│       ├── MonitoringDashboard.js    # Monitoring UI
│       ├── ChartComponent.js         # Visualizations
│       └── App.js                    # Main app
│
├── infrastructure/                   # Deployment
│   ├── docker/
│   │   ├── docker-compose.yml        # Full stack
│   │   ├── Dockerfile.api            # API image
│   │   └── .env.example              # Environment template
│   ├── k8s/                          # Kubernetes
│   │   ├── base/                     # Base manifests
│   │   └── README.md                 # K8s guide
│   ├── monitoring/                   # Monitoring
│   │   ├── prometheus.yml            # Prometheus config
│   │   └── alerts.yml                # Alert rules
│   └── deploy.sh                     # Deployment script
│
├── scripts/                          # Automation
│   ├── deploy_databases.sh           # DB deployment
│   ├── train_models.py               # Automated training
│   └── monitor_models.py             # Performance monitoring
│
├── tests/                            # Test suite
│   ├── unit/                         # Unit tests
│   ├── integration/                  # Integration tests
│   └── conftest.py                   # Pytest fixtures
│
├── .github/workflows/                # CI/CD
│   ├── ci.yml                        # Continuous Integration
│   ├── cd.yml                        # Continuous Deployment
│   └── ml-pipeline.yml               # ML automation
│
└── docs/                             # Documentation
    ├── getting-started/              # Quick start guides
    ├── development/                  # Development docs
    ├── production/                   # Production guides
    └── reference/                    # Technical reference
```
Performance Characteristics¶
API Performance¶
- Average Latency: 200-500ms
- P95 Latency: <2s
- P99 Latency: <5s
- Throughput: 1000+ requests/minute
Database Performance¶
- Query Latency: <100ms average
- Connection Pool: 10-50 connections
- Transaction Rate: 5000+ TPS
Cache Performance¶
- Hit Rate: 75-80% in production
- Latency: <10ms for hits
- Memory Usage: ~500MB for 10K cached responses
ML Inference¶
- Single Prediction: 50-100ms
- Batch (100): 500ms
- Model Loading: <1s on startup
Schedule Optimization¶
- Small Schedule (50 slots): 2-3 seconds
- Large Schedule (200 slots): 5-10 seconds
- Genetic Algorithm: 100-500 generations
Monitoring & Observability¶
Pre-configured Dashboards¶
System Metrics:
- CPU usage per service
- Memory utilization
- Disk I/O and space
- Network throughput

Application Metrics:
- API request rate and latency
- Error rates by endpoint
- Cache hit/miss rates
- LLM token usage

Database Metrics:
- Query performance
- Connection pool status
- Database size and growth
- Slow query log

ML Metrics:
- Model accuracy drift
- Prediction latency
- Feature distribution
- Training job status
Alert Rules¶
Critical Alerts:
- Database down (immediate)
- API error rate >5% (immediate)
- Disk space <10% (15 min)
- Memory usage >90% (5 min)

Warning Alerts:
- High latency (P95 >3s) (30 min)
- Cache hit rate <50% (1 hour)
- Model accuracy drift >10% (daily)

Notification Channels:
- Email for critical alerts
- Slack for all alerts
- PagerDuty integration (optional)
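As one concrete case, the "API error rate >5% (immediate)" policy maps onto a Prometheus alert rule like the hypothetical fragment below — the metric name and label scheme are assumptions about the instrumentation, not the repository's actual `alerts.yml`:

```yaml
# Hypothetical rule for the critical "API error rate >5%" alert above.
groups:
  - name: api-critical
    rules:
      - alert: HighAPIErrorRate
        expr: |
          sum(rate(http_requests_total{status=~"5.."}[5m]))
            / sum(rate(http_requests_total[5m])) > 0.05
        for: 0m                 # critical policy: fire immediately
        labels:
          severity: critical
        annotations:
          summary: "API error rate above 5%"
```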
Security Features¶
API Security¶
- Input Validation: Pydantic models enforce schemas
- SQL Injection: Parameterized queries only
- XSS Protection: Input sanitization
- Rate Limiting: Per-IP and per-user limits
- CORS: Configured per environment
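The API enforces schemas with Pydantic; the stdlib sketch below shows the same reject-at-the-boundary idea for the `/api/v1/predict` payload. Field names follow the request example earlier, but the allowed-target set and the validator itself are assumptions for illustration.

```python
# Stdlib sketch of schema enforcement (Pydantic does this in the real API).
from dataclasses import dataclass, field

ALLOWED_TARGETS = {"rating", "revenue", "audience", "engagement"}

@dataclass
class PredictRequest:
    program_id: str
    target: str
    features: dict = field(default_factory=dict)

    def __post_init__(self):
        if not self.program_id:
            raise ValueError("program_id must be non-empty")
        if self.target not in ALLOWED_TARGETS:
            raise ValueError(f"unknown target: {self.target!r}")

ok = PredictRequest(program_id="ABC123", target="revenue")
try:
    PredictRequest(program_id="ABC123", target="drop table")  # rejected
except ValueError as exc:
    rejected = str(exc)
```

Rejecting malformed input before it reaches any query builder is what makes the parameterized-queries rule above a second line of defense rather than the only one.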
Authentication & Authorization¶
- JWT Tokens: Stateless authentication
- Token Expiration: Configurable TTL
- Role-Based Access: Admin, user, read-only roles
- API Keys: For service-to-service communication
Data Security¶
- Encryption at Rest: Database encryption
- Encryption in Transit: TLS/SSL for all connections
- Secrets Management: Kubernetes Secrets, environment variables
- Audit Logging: All sensitive operations logged
Container Security¶
- Image Scanning: Trivy for vulnerability scanning
- Non-root User: Containers run as unprivileged user
- Read-only Filesystem: Where possible
- Network Policies: Kubernetes network isolation
Cost Optimization¶
Development (Free Tier)¶
- Databases: Local Docker containers
- LLMs: 100% free-tier APIs
- Hosting: Local machine
- Total: $0/month
Production (Estimated)¶
- Infrastructure: $100-200/month
  - Kubernetes cluster (3 nodes)
  - Load balancer
  - Storage (100GB)
- Databases: $100/month
  - Managed PostgreSQL
  - Managed MongoDB
  - Managed Redis
- LLM (with caching): $50-100/month
  - Free tier covers most queries
  - Paid tier for overflow
- Monitoring: Included (open source)
- Total: $300-500/month
Savings from Caching¶
- Without Cache: ~$500/month in LLM costs
- With Cache (75% hit rate): ~$125/month
- Savings: $375/month (75% reduction)
Deployment¶
Docker Compose (Quick Start)¶
```shell
# Start all services
docker-compose up -d

# Services include:
# - PostgreSQL 16
# - MongoDB 7.0
# - Redis 7.0
# - FastAPI backend (port 5000)
# - React frontend (port 3000)
```
Kubernetes (Production)¶
```shell
# Deploy to production
bash deploy.sh kubernetes production

# Verify deployment
kubectl get pods -n bms
kubectl get services -n bms

# View logs
kubectl logs -f deployment/api -n bms
```
Testing¶
Unit Tests¶
```shell
# Run all unit tests
pytest tests/unit/ -v

# With coverage
pytest tests/unit/ --cov=. --cov-report=html
```
Integration Tests¶
```shell
# Run integration tests
pytest tests/integration/ -v
# Requires running services (databases, Redis)
```
End-to-End Tests¶
```shell
# Run E2E tests
pytest tests/e2e/ -v
# Tests full workflows
```
Load Testing¶
```shell
# Using locust
locust -f tests/load/locustfile.py
# Simulates 100+ concurrent users
```
Documentation¶
Comprehensive Guides¶
Getting Started:
- Installation guide
- Quick start (5-minute setup)
- Project overview
- Architecture introduction

Development:
- Setup guide for developers
- ML pipeline documentation
- LLM agent development
- API reference
- Database schema

Production:
- Database deployment and scaling
- Application deployment
- Infrastructure setup (Docker, K8s)
- MLOps best practices
- Monitoring setup

Reference:
- Project structure
- ML pipeline technical details
- Further reading on algorithms
Completed Phases¶
All 6 project phases are complete (see the Status line at the top of this document).
Future Enhancements¶
Planned Features:
- Real-time streaming predictions
- Advanced visualization (D3.js)
- Mobile app (React Native)
- Multi-tenant support
- Advanced anomaly detection
- AutoML for hyperparameter tuning

Infrastructure:
- Multi-region deployment
- Advanced caching (CDN)
- Read replicas for databases
- Service mesh (Istio)
- Advanced tracing (Jaeger)
Links¶
- Repository: Private Company Repository
- Documentation: Comprehensive guides for development, production, and MLOps