feat: add helm deployment

Morten Olsen
2025-10-18 00:01:58 +02:00
parent 1f7837cabc
commit 47a8dd96c2
26 changed files with 2080 additions and 73 deletions


@@ -0,0 +1,194 @@
## Context
The Backbone Helm chart currently has minimal configuration support. Users cannot configure the broker through Helm values, making it difficult to deploy in production environments. The chart needs to support all configuration options from the README and follow Helm best practices for production deployments.
### Constraints
- Must maintain backward compatibility where possible
- Must align with environment variables documented in README
- Must follow Kubernetes and Helm best practices
- Storage backend (SQLite, PostgreSQL) may require persistent volumes
### Stakeholders
- Kubernetes operators deploying Backbone
- Users requiring production-grade deployments with HA, monitoring, and persistence
## Goals / Non-Goals
### Goals
- Expose all README environment variables through Helm values
- Support persistent storage for `/data` directory with configurable storage class
- Follow Helm best practices (resources, probes, security contexts, labels)
- Enable production-ready deployments with proper health checks
- Support ingress for HTTP API exposure
### Non-Goals
- StatefulSet conversion (deployment is sufficient for single-replica MQTT broker)
- Horizontal Pod Autoscaling (MQTT broker state management complexity)
- Built-in monitoring/metrics exporters (separate concern)
- Multi-replica support with Redis clustering (future enhancement)
## Decisions
### Decision: Use PVC for /data persistence
**Rationale**: The application may use SQLite or store session data in `/data`. A PVC ensures data survives pod restarts and enables backup/restore workflows.
**Alternatives considered**:
- emptyDir: Loses data on pod restart, unsuitable for production
- hostPath: Ties pod to specific node, reduces portability
- PVC (chosen): Standard Kubernetes pattern, supports storage classes, backup-friendly
**Implementation**:
- Optional PVC controlled by `persistence.enabled` flag
- Configurable storage class, size, and access mode
- Defaults to disabled for backward compatibility
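A minimal sketch of what the persistence section of `values.yaml` could look like under this decision. The keys `persistence.enabled`, `persistence.storageClass`, and `persistence.size` are named in the spec; `accessMode` as a key name and its default are assumptions for illustration.

```yaml
# values.yaml (sketch): persistence section
persistence:
  enabled: false             # off by default for backward compatibility
  storageClass: ""           # empty string: fall back to the cluster default StorageClass
  size: 1Gi                  # proposed default, see Open Questions
  accessMode: ReadWriteOnce  # assumed key name and default
```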
### Decision: Environment variable structure in values.yaml
**Rationale**: Flatten environment variables under logical sections (config, k8s, oidc, redis) rather than deep nesting for better readability.
**Structure**:
```yaml
config:
  adminToken: ''
  jwtSecret: ''
  httpPort: 8883
  tcpPort: 1883
k8s:
  enabled: true # default true since chart runs in K8s
ws:
  enabled: false
api:
  enabled: false
tcp:
  enabled: false
oidc:
  enabled: false
  discovery: ''
  clientId: ''
  clientSecret: ''
  groupField: 'groups'
  adminGroup: ''
  writerGroup: ''
  readerGroup: ''
redis:
  enabled: false
  host: 'localhost'
  port: 6379
  password: ''
  db: 0
```
### Decision: ServiceAccount template instead of hardcoded name
**Rationale**: Current deployment references `{{ .Release.Name }}` for ServiceAccount but doesn't create it. Extract to proper template with configurable name and annotations.
**Migration**: Existing deployments referencing release name continue working.
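One way the extracted template might look, as a sketch rather than the chart's actual file. It assumes the values keys `serviceAccount.name` and `serviceAccount.annotations` from the spec, and falls back to the release name so that existing deployments keep resolving the same ServiceAccount.

```yaml
# templates/serviceaccount.yaml (sketch)
apiVersion: v1
kind: ServiceAccount
metadata:
  # Fall back to the release name when serviceAccount.name is empty
  name: {{ .Values.serviceAccount.name | default .Release.Name }}
  {{- with .Values.serviceAccount.annotations }}
  annotations:
    {{- toYaml . | nindent 4 }}
  {{- end }}
```

The Deployment's `serviceAccountName` would need to use the same expression so that both resources agree on the name.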
### Decision: Default K8S_ENABLED to true in chart
**Rationale**: The Helm chart is always deployed to Kubernetes, so K8s integration should default to enabled. Users can disable it when running in non-operator mode.
### Decision: Security context defaults
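Apply a restricted security context by default:
```yaml
podSecurityContext:
  runAsNonRoot: true
  runAsUser: 1000
  fsGroup: 1000 # pod-level only; sets group ownership of mounted volumes
securityContext:
  readOnlyRootFilesystem: false # container-level only; /data needs write access
```
**Rationale**: Follows Kubernetes security best practices. `fsGroup` is valid only at pod level and `readOnlyRootFilesystem` only at container level, so the defaults are split across `podSecurityContext` and `securityContext`. `readOnlyRootFilesystem` stays disabled because, when no volume is mounted at `/data`, the directory lives on the container's root filesystem and SQLite must write to it.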
### Decision: Probe configuration
Add both liveness and readiness probes with sensible defaults:
- Liveness: HTTP GET `/health` on port 8883 (requires `API_ENABLED`)
- Readiness: HTTP GET `/health` on port 8883 (requires `API_ENABLED`)
- Fallback: TCP socket check on the configured ports if the API is disabled
**Rationale**: Enables Kubernetes to detect unhealthy pods and route traffic appropriately.
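The HTTP-or-TCP selection could be sketched in the deployment template as follows. This is an illustrative fragment, not the chart's actual template: it assumes the values keys `livenessProbe.*`, `api.enabled`, `config.httpPort`, and `config.tcpPort`, and it assumes the TCP fallback targets the MQTT TCP port, which the design does not pin down.

```yaml
# Fragment for the container spec in templates/deployment.yaml (sketch);
# indentation must be adjusted to the container's nesting level.
{{- if .Values.livenessProbe.enabled }}
livenessProbe:
  {{- if .Values.api.enabled }}
  httpGet:
    path: /health
    port: {{ .Values.config.httpPort }}
  {{- else }}
  # Assumption: fall back to the MQTT TCP port when the HTTP API is off
  tcpSocket:
    port: {{ .Values.config.tcpPort }}
  {{- end }}
  initialDelaySeconds: {{ .Values.livenessProbe.initialDelaySeconds }}
  periodSeconds: {{ .Values.livenessProbe.periodSeconds }}
{{- end }}
```

A readiness probe would follow the same shape under `readinessProbe`.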
## Risks / Trade-offs
### Risk: Breaking changes for existing deployments
**Mitigation**:
- Set conservative defaults matching current behavior where possible
- Document migration path in CHANGELOG or upgrade notes
- Version bump signals breaking changes (0.1.0 → 0.2.0)
### Risk: Complex values.yaml overwhelming users
**Mitigation**:
- Provide comprehensive comments
- Include examples in comments
- Keep sensible defaults for 90% use case
- Create example values files for common scenarios
### Risk: Storage class availability varies by cluster
**Mitigation**:
- Make storage class configurable (default: `""` uses cluster default)
- Document common storage classes in values comments
- Support disabling persistence entirely
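A sketch of a PVC template that respects the storage class mitigation. The resource name suffix `-data` and the `accessMode` key are assumptions. The `storageClassName` field is omitted entirely when the value is empty, because Kubernetes only applies the cluster default when the field is absent; an explicit empty string in the manifest disables dynamic provisioning.

```yaml
# templates/persistentvolumeclaim.yaml (sketch)
{{- if .Values.persistence.enabled }}
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: {{ .Release.Name }}-data  # hypothetical naming scheme
spec:
  accessModes:
    - {{ .Values.persistence.accessMode | default "ReadWriteOnce" }}
  resources:
    requests:
      storage: {{ .Values.persistence.size | default "1Gi" }}
  {{- with .Values.persistence.storageClass }}
  # Rendered only when a storage class is set; otherwise the cluster default applies
  storageClassName: {{ . | quote }}
  {{- end }}
{{- end }}
```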
## Migration Plan
### For existing deployments:
1. Review `values.yaml` changes
2. Set `persistence.enabled: false` to maintain stateless behavior (if desired)
3. Configure environment variables previously set via manual env overrides
4. Update service types if non-default required
5. Helm upgrade with new chart version
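As an illustration of steps 2 to 4, an override file for an existing release might look like the sketch below. The file name and the chosen service types are hypothetical; the keys follow the values structure and spec in this change.

```yaml
# values-upgrade.yaml (hypothetical file name)
persistence:
  enabled: false      # step 2: keep the previous stateless behavior
config:
  httpPort: 8883      # step 3: settings previously passed as manual env overrides
  tcpPort: 1883
api:
  enabled: true
tcp:
  enabled: true
service:
  http:
    type: ClusterIP   # step 4: only needed if the default type is unsuitable
  tcp:
    type: LoadBalancer
```

The file would then be passed to the upgrade in step 5, for example with `helm upgrade backbone ./charts -f values-upgrade.yaml`.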
### Rollback:
Standard Helm rollback: `helm rollback <release> <revision>`
### Validation:
```bash
# Dry-run
helm upgrade --install backbone ./charts --dry-run --debug
# Lint
helm lint ./charts
# Template verification
helm template backbone ./charts > manifests.yaml
kubectl apply --dry-run=client -f manifests.yaml
```
## Open Questions
1. **Should probes be enabled by default?**
   - Proposal: Yes, but only if `api.enabled=true`, otherwise use TCP checks
2. **Default persistence size?**
   - Proposal: 1Gi for SQLite database and session data
3. **Should we support initContainers for DB migrations?**
   - Proposal: No, out of scope for this change (future enhancement)
4. **Ingress class defaults?**
   - Proposal: Empty string, user must specify their ingress class


@@ -0,0 +1,24 @@
## Why
The current Helm chart lacks configuration flexibility and best practices support. It does not expose the environment variables documented in the README, is missing PersistentVolume support for data persistence, and lacks standard Helm values patterns (resources, nodeSelector, tolerations, etc.).
## What Changes
- Add comprehensive values structure for all configuration options documented in README
- Add PersistentVolumeClaim configuration with configurable storage class for `/data` mount
- Add standard Helm best practices: resource limits/requests, node selectors, tolerations, affinity, security contexts
- Add proper labels and annotations following Helm conventions
- Add liveness and readiness probes for container health checks
- Add ServiceAccount template (currently hardcoded in deployment)
- Make service types and configurations customizable
- Add ingress support for HTTP API endpoint
## Impact
- Affected specs: `helm-deployment` (new capability)
- Affected code:
  - `charts/values.yaml` - Complete restructure with backward compatibility
  - `charts/templates/deployment.yaml` - Add volume mounts, env vars, probes, security
  - `charts/templates/services.yaml` - Make service types configurable
  - `charts/templates/*.yaml` - Add missing templates (serviceaccount, pvc, ingress)
  - `charts/Chart.yaml` - Version bump to reflect changes


@@ -0,0 +1,278 @@
## ADDED Requirements
### Requirement: Configuration Values Structure
The Helm chart SHALL provide a comprehensive values.yaml structure that exposes all configuration options documented in the README.
#### Scenario: All environment variables configurable
- **WHEN** a user deploys the chart
- **THEN** values.yaml MUST include sections for: config (adminToken, jwtSecret, ports), k8s (enabled), ws (enabled), api (enabled), tcp (enabled), oidc (all 8 variables), and redis (all 5 variables)
#### Scenario: Default values match README defaults
- **WHEN** a user deploys without custom values
- **THEN** environment variables MUST default to values documented in README (e.g., K8S_ENABLED=true in K8s context, HTTP_PORT=8883, TCP_PORT=1883)
### Requirement: Persistent Volume Support
The chart SHALL support optional persistent storage for the `/data` directory with configurable storage class.
#### Scenario: Enable persistence with default storage class
- **WHEN** `persistence.enabled=true` is set
- **THEN** a PersistentVolumeClaim MUST be created and mounted to `/data` in the container
#### Scenario: Custom storage class
- **WHEN** `persistence.storageClass` is specified
- **THEN** the PVC MUST request that storage class
#### Scenario: Configurable volume size
- **WHEN** `persistence.size` is specified
- **THEN** the PVC MUST request that storage size (default: 1Gi)
#### Scenario: Persistence disabled by default
- **WHEN** no persistence configuration is provided
- **THEN** a PVC MUST NOT be created, and the deployment uses an emptyDir or no volume
### Requirement: Resource Management
The chart SHALL support Kubernetes resource limits and requests configuration.
#### Scenario: Resource limits configurable
- **WHEN** `resources.limits.cpu` or `resources.limits.memory` are set
- **THEN** the deployment MUST include these resource limits
#### Scenario: Resource requests configurable
- **WHEN** `resources.requests.cpu` or `resources.requests.memory` are set
- **THEN** the deployment MUST include these resource requests
#### Scenario: Default resources
- **WHEN** no resources are specified
- **THEN** the deployment MUST NOT set resource constraints (Kubernetes default behavior)
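For illustration, a values override that exercises these scenarios could look like this. The numbers are placeholders, not sizing recommendations for the broker; leaving `resources` as an empty map corresponds to the default scenario.

```yaml
# values override (sketch): illustrative numbers only
resources:
  requests:
    cpu: 100m
    memory: 128Mi
  limits:
    cpu: 500m
    memory: 256Mi
```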
### Requirement: Pod Scheduling
The chart SHALL support standard Kubernetes pod scheduling options.
#### Scenario: Node selector support
- **WHEN** `nodeSelector` values are provided
- **THEN** the deployment MUST include the node selector configuration
#### Scenario: Tolerations support
- **WHEN** `tolerations` array is provided
- **THEN** the deployment MUST include the tolerations
#### Scenario: Affinity support
- **WHEN** `affinity` configuration is provided
- **THEN** the deployment MUST include the affinity rules
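A sketch of values covering all three scheduling scenarios. The label and field names under each key are standard Kubernetes fields; the taint key `dedicated`, its value, and the zone name are hypothetical.

```yaml
# values override (sketch)
nodeSelector:
  kubernetes.io/os: linux
tolerations:
  - key: dedicated          # hypothetical taint key
    operator: Equal
    value: mqtt
    effect: NoSchedule
affinity:
  nodeAffinity:
    requiredDuringSchedulingIgnoredDuringExecution:
      nodeSelectorTerms:
        - matchExpressions:
            - key: topology.kubernetes.io/zone
              operator: In
              values:
                - zone-a    # hypothetical zone name
```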
### Requirement: Health Probes
The chart SHALL support configurable liveness and readiness probes.
#### Scenario: HTTP probes when API enabled
- **WHEN** `api.enabled=true` and probes are enabled
- **THEN** liveness and readiness probes MUST use HTTP GET on `/health` endpoint
#### Scenario: TCP probes as fallback
- **WHEN** `api.enabled=false` and probes are enabled
- **THEN** liveness and readiness probes MUST use TCP socket checks on configured ports
#### Scenario: Configurable probe parameters
- **WHEN** probe values (`initialDelaySeconds`, `periodSeconds`, `timeoutSeconds`) are set
- **THEN** the deployment MUST use these probe configurations
#### Scenario: Probes can be disabled
- **WHEN** `livenessProbe.enabled=false` or `readinessProbe.enabled=false`
- **THEN** the respective probe MUST be omitted from the deployment
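The probe parameters named in these scenarios might be exposed as in the sketch below. The timing values are illustrative assumptions, not the chart's confirmed defaults.

```yaml
# values.yaml (sketch): probe settings
livenessProbe:
  enabled: true
  initialDelaySeconds: 10
  periodSeconds: 10
  timeoutSeconds: 5
readinessProbe:
  enabled: true
  initialDelaySeconds: 5
  periodSeconds: 10
  timeoutSeconds: 5
```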
### Requirement: Security Context
The chart SHALL support security context configuration following Kubernetes security best practices.
#### Scenario: Pod security context
- **WHEN** `podSecurityContext` values are provided
- **THEN** the deployment MUST apply these security settings at pod level
#### Scenario: Container security context
- **WHEN** `securityContext` values are provided
- **THEN** the deployment MUST apply these security settings at container level
#### Scenario: Default security settings
- **WHEN** no security context is specified
- **THEN** the deployment SHOULD use secure defaults (runAsNonRoot, non-root UID)
### Requirement: Service Configuration
The chart SHALL support configurable service types and settings for both HTTP and TCP services.
#### Scenario: HTTP service type configurable
- **WHEN** `service.http.type` is set to LoadBalancer, ClusterIP, or NodePort
- **THEN** the HTTP service MUST use that service type
#### Scenario: TCP service type configurable
- **WHEN** `service.tcp.type` is set to LoadBalancer, ClusterIP, or NodePort
- **THEN** the TCP service MUST use that service type
#### Scenario: Service annotations
- **WHEN** `service.http.annotations` or `service.tcp.annotations` are provided
- **THEN** the respective services MUST include those annotations
#### Scenario: Service ports configurable
- **WHEN** `service.http.port` or `service.tcp.port` are specified
- **THEN** the services MUST expose those external ports (targeting container ports from config)
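A possible shape for the service values, assuming the `service.http.*` and `service.tcp.*` keys from the scenarios above. The port numbers, service types, and annotation are examples only.

```yaml
# values override (sketch)
service:
  http:
    type: ClusterIP
    port: 80                  # external port; targets config.httpPort in the container
    annotations: {}
  tcp:
    type: LoadBalancer
    port: 1883                # external port; targets config.tcpPort in the container
    annotations:
      example.com/note: "hypothetical annotation"
```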
### Requirement: ServiceAccount Management
The chart SHALL create and manage a ServiceAccount for the deployment with configurable name and annotations.
#### Scenario: ServiceAccount creation
- **WHEN** the chart is deployed
- **THEN** a ServiceAccount resource MUST be created
#### Scenario: ServiceAccount name configurable
- **WHEN** `serviceAccount.name` is specified
- **THEN** the ServiceAccount and deployment MUST use that name
#### Scenario: ServiceAccount annotations
- **WHEN** `serviceAccount.annotations` are provided
- **THEN** the ServiceAccount MUST include those annotations (useful for IRSA, Workload Identity)
### Requirement: Ingress Support
The chart SHALL support optional Ingress configuration for exposing the HTTP API.
#### Scenario: Ingress creation
- **WHEN** `ingress.enabled=true`
- **THEN** an Ingress resource MUST be created
#### Scenario: Ingress host configuration
- **WHEN** `ingress.hosts` array is provided
- **THEN** the Ingress MUST include rules for those hosts
#### Scenario: Ingress TLS
- **WHEN** `ingress.tls` configuration is provided
- **THEN** the Ingress MUST include TLS settings with specified secret names
#### Scenario: Ingress class
- **WHEN** `ingress.className` is specified
- **THEN** the Ingress MUST use that ingress class
#### Scenario: Ingress annotations
- **WHEN** `ingress.annotations` are provided
- **THEN** the Ingress MUST include those annotations (e.g., for cert-manager, nginx settings)
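An illustrative values block for the Ingress scenarios. The item shape under `hosts` (a `host` plus a list of `paths`) follows the common `helm create` scaffold and is an assumption about this chart; the hostname, issuer name, and secret name are placeholders.

```yaml
# values override (sketch)
ingress:
  enabled: true
  className: nginx
  annotations:
    cert-manager.io/cluster-issuer: letsencrypt-prod  # placeholder issuer name
  hosts:
    - host: backbone.example.com
      paths:
        - path: /
          pathType: Prefix
  tls:
    - secretName: backbone-tls
      hosts:
        - backbone.example.com
```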
### Requirement: Labels and Annotations
The chart SHALL apply standard Helm and Kubernetes labels following best practices.
#### Scenario: Standard labels applied
- **WHEN** resources are created
- **THEN** they MUST include labels: `app.kubernetes.io/name`, `app.kubernetes.io/instance`, `app.kubernetes.io/version`, `app.kubernetes.io/managed-by`
#### Scenario: Custom labels support
- **WHEN** `commonLabels` are defined in values
- **THEN** all resources MUST include these additional labels
#### Scenario: Custom annotations support
- **WHEN** `commonAnnotations` are defined in values
- **THEN** all resources MUST include these additional annotations
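For orientation, the rendered metadata of any chart resource might look like the following under these scenarios. The release name, version string, and custom label and annotation are hypothetical.

```yaml
# Rendered metadata (sketch) for a release named "my-release"
metadata:
  labels:
    app.kubernetes.io/name: backbone
    app.kubernetes.io/instance: my-release
    app.kubernetes.io/version: "1.0.0"   # hypothetical application version
    app.kubernetes.io/managed-by: Helm
    team: platform                       # from commonLabels (hypothetical)
  annotations:
    example.com/owner: platform-team     # from commonAnnotations (hypothetical)
```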
### Requirement: Environment Variable Mapping
The chart SHALL correctly map all values.yaml configuration to container environment variables matching README documentation.
#### Scenario: Admin and JWT configuration
- **WHEN** `config.adminToken` or `config.jwtSecret` are set
- **THEN** environment variables `ADMIN_TOKEN` and `JWT_SECRET` MUST be set in the container
#### Scenario: Feature toggles
- **WHEN** `k8s.enabled`, `ws.enabled`, `api.enabled`, or `tcp.enabled` are set
- **THEN** corresponding environment variables `K8S_ENABLED`, `WS_ENABLED`, `API_ENABLED`, `TCP_ENABLED` MUST be set as string "true" or "false"
#### Scenario: Port configuration
- **WHEN** `config.httpPort` or `config.tcpPort` are set
- **THEN** environment variables `HTTP_PORT` and `TCP_PORT` MUST be set
#### Scenario: OIDC configuration
- **WHEN** OIDC values (`oidc.enabled`, `oidc.discovery`, etc.) are provided
- **THEN** all 8 OIDC environment variables MUST be set correctly
#### Scenario: Redis configuration
- **WHEN** Redis values (`redis.enabled`, `redis.host`, etc.) are provided
- **THEN** all 5 Redis environment variables MUST be set correctly
#### Scenario: Sensitive values from secrets
- **WHEN** `config.jwtSecret`, `config.adminToken`, `oidc.clientSecret`, or `redis.password` reference existing secrets
- **THEN** the deployment MUST use valueFrom/secretKeyRef to inject these values
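A sketch of how the mapping could be written in the deployment template. The `config.existingSecret` key and the secret key name `jwtSecret` are hypothetical, since the spec does not name how an existing secret is referenced. Kubernetes requires `env` values to be strings, which is why booleans and ports are piped through `quote`.

```yaml
# Fragment of the container env in templates/deployment.yaml (sketch)
env:
  - name: K8S_ENABLED
    value: {{ .Values.k8s.enabled | quote }}      # renders "true" or "false"
  - name: HTTP_PORT
    value: {{ .Values.config.httpPort | quote }}
  - name: JWT_SECRET
    {{- if .Values.config.existingSecret }}
    # Hypothetical key: read the value from a pre-existing Secret
    valueFrom:
      secretKeyRef:
        name: {{ .Values.config.existingSecret }}
        key: jwtSecret
    {{- else }}
    value: {{ .Values.config.jwtSecret | quote }}
    {{- end }}
```

The same pattern would repeat for `ADMIN_TOKEN`, the OIDC client secret, and the Redis password.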
### Requirement: Template Syntax Correctness
The chart templates SHALL use correct Helm template syntax without errors.
#### Scenario: Valid Go template syntax
- **WHEN** templates are rendered with `helm template`
- **THEN** syntax errors MUST NOT occur
#### Scenario: No spacing in template delimiters
- **WHEN** examining template files
- **THEN** template delimiters MUST be written as `{{` and `}}` with no space between the braces (e.g., `{{ .Value }}` not `{ { .Value } }`)
### Requirement: Chart Validation
The chart SHALL pass Helm linting and validation checks.
#### Scenario: Helm lint passes
- **WHEN** `helm lint` is run on the chart
- **THEN** errors MUST NOT be reported
#### Scenario: Chart renders successfully
- **WHEN** `helm template` is run with default values
- **THEN** valid Kubernetes manifests MUST be produced
#### Scenario: Chart renders with custom values
- **WHEN** `helm template` is run with various custom values combinations
- **THEN** valid Kubernetes manifests MUST be produced without errors


@@ -0,0 +1,41 @@
## 1. Update Values Schema
- [x] 1.1 Restructure `values.yaml` with comprehensive configuration sections
- [x] 1.2 Add all environment variable mappings from README
- [x] 1.3 Add persistence configuration with storage class options
- [x] 1.4 Add standard Helm values (resources, nodeSelector, tolerations, affinity)
- [x] 1.5 Add probe configurations (liveness, readiness)
- [x] 1.6 Add service configuration options
- [x] 1.7 Add ingress configuration
## 2. Update Deployment Template
- [x] 2.1 Add all environment variables from values
- [x] 2.2 Add PVC volume mount to `/data`
- [x] 2.3 Add resource limits and requests
- [x] 2.4 Add node selector, tolerations, and affinity
- [x] 2.5 Add security context configurations
- [x] 2.6 Add liveness and readiness probes
- [x] 2.7 Add proper labels and annotations
- [x] 2.8 Fix template syntax issues (remove spaces in braces)
## 3. Create Missing Templates
- [x] 3.1 Create `serviceaccount.yaml` template
- [x] 3.2 Create `persistentvolumeclaim.yaml` template
- [x] 3.3 Create `ingress.yaml` template (optional, controlled by values)
- [x] 3.4 Update `clusterrolebinding.yaml` to reference ServiceAccount template
## 4. Update Service Templates
- [x] 4.1 Make HTTP service type configurable (ClusterIP/LoadBalancer/NodePort)
- [x] 4.2 Make TCP service type configurable
- [x] 4.3 Add service annotations support
- [x] 4.4 Add proper labels following Helm conventions
## 5. Documentation and Validation
- [x] 5.1 Update `Chart.yaml` version (bump to 0.2.0)
- [x] 5.2 Add comments to `values.yaml` explaining options
- [x] 5.3 Test chart rendering with `helm template`
- [x] 5.4 Validate against Helm best practices using `helm lint`