Cloud Deployment

Turqoa runs on managed Kubernetes across AWS, Azure, and GCP. Cloud deployment is ideal for multi-terminal operations, elastic scaling, and minimal on-site infrastructure management.

Supported Cloud Providers

Provider	Compute Service	GPU Instances	Managed DB	Status
AWS	EKS	g5.xlarge (A10G)	RDS PostgreSQL	GA
Azure	AKS	NC4as_T4_v3	Azure Database for PostgreSQL	GA
GCP	GKE	n1-standard-8 + T4	Cloud SQL PostgreSQL	GA

Terraform / IaC Examples

Turqoa provides official Terraform modules for each supported provider.

AWS Example

module "turqoa_cluster" {
  source  = "turqoa/platform/aws"
  version = "3.2.0"

  cluster_name    = "turqoa-production"
  region          = "us-west-2"
  vpc_cidr        = "10.0.0.0/16"
  
  # Node groups
  inference_nodes = {
    instance_type = "g5.xlarge"
    min_size      = 2
    max_size      = 8
    gpu_count     = 1
  }

  application_nodes = {
    instance_type = "m6i.2xlarge"
    min_size      = 3
    max_size      = 6
  }

  # Database
  db_instance_class    = "db.r6g.xlarge"
  db_allocated_storage = 500
  db_multi_az          = true

  # Networking
  enable_vpn_gateway = true
  vpn_terminal_cidrs = ["203.0.113.0/24"]

  tags = {
    Environment = "production"
    Project     = "turqoa"
  }
}

Azure Example

module "turqoa_cluster" {
  source  = "turqoa/platform/azure"
  version = "3.2.0"

  cluster_name    = "turqoa-production"
  location        = "eastus2"
  resource_group  = "rg-turqoa-prod"

  gpu_node_pool = {
    vm_size    = "Standard_NC4as_T4_v3"
    min_count  = 2
    max_count  = 8
  }

  app_node_pool = {
    vm_size    = "Standard_D8s_v5"
    min_count  = 3
    max_count  = 6
  }

  database = {
    sku_name   = "GP_Gen5_4"
    storage_mb = 512000
    ha_mode    = "ZoneRedundant"
  }
}

Networking

VPC / VNet Architecture

┌───────────────── VPC 10.0.0.0/16 ─────────────────┐
│                                                     │
│  ┌─── Public Subnets ───┐  ┌── Private Subnets ──┐ │
│  │ 10.0.1.0/24 (AZ-a)  │  │ 10.0.10.0/24 (AZ-a)│ │
│  │ 10.0.2.0/24 (AZ-b)  │  │ 10.0.20.0/24 (AZ-b)│ │
│  │                       │  │                      │ │
│  │ - ALB / NLB           │  │ - EKS nodes          │ │
│  │ - NAT Gateway         │  │ - RDS                 │ │
│  │ - VPN Gateway         │  │ - Kafka (MSK)         │ │
│  └───────────────────────┘  └──────────────────────┘ │
└─────────────────────────────────────────────────────┘
         │
         │ Site-to-Site VPN / Direct Connect
         ▼
┌─── Terminal Network ───┐
│  Cameras / TOS / Edge  │
└────────────────────────┘

Required Ports

Port	Direction	Purpose
443	Inbound	Command Center HTTPS, API
6443	Internal	Kubernetes API server
9092-9094	Internal	Kafka brokers
5432	Internal	PostgreSQL
554	Inbound (VPN)	RTSP camera streams from terminal
80/8080	Inbound (VPN)	ONVIF camera management

Scaling Configuration

Turqoa uses Kubernetes Horizontal Pod Autoscaler (HPA) and cluster autoscaler for dynamic scaling.

# turqoa-hpa.yaml
apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: turqoa-inference
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: turqoa-inference
  minReplicas: 2
  maxReplicas: 8
  metrics:
    - type: Resource
      resource:
        name: nvidia.com/gpu
        target:
          type: Utilization
          averageUtilization: 70
    - type: Pods
      pods:
        metric:
          name: turqoa_inference_queue_depth
        target:
          type: AverageValue
          averageValue: "5"

Scaling Triggers

Metric	Scale-Up Threshold	Scale-Down Threshold	Cooldown
GPU utilization	> 70% for 2 min	< 30% for 10 min	5 min
Inference queue depth	> 5 pending	< 1 pending for 10 min	5 min
API request rate	> 500 req/s	< 100 req/s for 10 min	3 min

Monitoring Setup

Turqoa exports metrics in Prometheus format and integrates with cloud-native monitoring:

Provider	Monitoring	Logging	Alerting
AWS	CloudWatch + Prometheus	CloudWatch Logs	SNS + PagerDuty
Azure	Azure Monitor + Prometheus	Log Analytics	Action Groups
GCP	Cloud Monitoring + Prometheus	Cloud Logging	Alerting Policies

# Verify metrics endpoint
curl -s http://turqoa-api:9090/metrics | head -20

# Key metrics to monitor:
# turqoa_gate_transactions_total
# turqoa_inference_latency_seconds
# turqoa_decision_engine_evaluations_total
# turqoa_camera_stream_fps
# turqoa_tos_query_latency_seconds

Note: Cloud deployments require a site-to-site VPN or direct connect link between the cloud VPC and the terminal network for camera stream ingestion. Public internet routing is not supported for RTSP traffic.