nuclei-operator/config/crd/bases/nuclei.homelab.mortenolsen.pro_nucleiscans.yaml
Morten Olsen 12d681ada1 feat: implement pod-based scanning architecture
This major refactor moves from synchronous subprocess-based scanning to
asynchronous pod-based scanning using Kubernetes Jobs.

## Architecture Changes
- Scanner jobs are now Kubernetes Jobs with TTLAfterFinished for automatic cleanup
- Jobs have owner references for garbage collection when NucleiScan is deleted
- Configurable concurrency limits, timeouts, and resource requirements
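
As a rough illustration of the points above, a scanner Job might look something like the manifest below. This is a sketch under assumptions, not the operator's actual output: the Job name, service account name, image, TTL, deadline, and resource values are all hypothetical. Only `ttlSecondsAfterFinished`, `ownerReferences`, and the `--mode=scanner` flag correspond to things named in this commit message.

```yaml
# Hypothetical scanner Job; every name and value here is an assumption.
apiVersion: batch/v1
kind: Job
metadata:
  name: nucleiscan-example-abc12          # hypothetical generated name
  namespace: default
  ownerReferences:                        # lets garbage collection delete the Job with its NucleiScan
  - apiVersion: nuclei.homelab.mortenolsen.pro/v1alpha1
    kind: NucleiScan
    name: example-scan
    uid: 00000000-0000-0000-0000-000000000000   # placeholder; must be the owner's real UID
    controller: true
    blockOwnerDeletion: true
spec:
  ttlSecondsAfterFinished: 3600           # assumed TTL; finished Jobs are removed after this
  activeDeadlineSeconds: 1800             # assumed scan timeout
  backoffLimit: 0                         # assumed: no pod-level retries
  template:
    spec:
      restartPolicy: Never
      serviceAccountName: nuclei-scanner  # hypothetical scanner service account
      containers:
      - name: scanner
        image: example.invalid/nuclei-operator:latest   # placeholder image
        args: ["--mode=scanner"]          # scanner mode of the dual-mode binary
        resources:
          requests:
            cpu: 100m
            memory: 256Mi
          limits:
            cpu: "1"
            memory: 1Gi
```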

## New Features
- Dual-mode binary: --mode=controller (default) or --mode=scanner
- Annotation-based configuration for Ingress/VirtualService resources
- Operator-level configuration via environment variables
- Startup recovery for orphaned scans after operator restart
- Periodic cleanup of stuck jobs

## New Files
- DESIGN.md: Comprehensive architecture design document
- internal/jobmanager/: Job Manager for creating/monitoring scanner jobs
- internal/scanner/runner.go: Scanner mode implementation
- internal/annotations/: Annotation parsing utilities
- charts/nuclei-operator/templates/scanner-rbac.yaml: Scanner RBAC

## API Changes
- Added ScannerConfig struct for per-scan scanner configuration
- Added JobReference struct for tracking scanner jobs
- Added ScannerConfig field to NucleiScanSpec
- Added JobRef and ScanStartTime fields to NucleiScanStatus
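
To make the new spec field concrete, here is a minimal NucleiScan sketch that sets `scannerConfig`. The field names follow the CRD schema in this file; the resource names, UID, target URL, image, and all values are placeholders chosen for illustration, and the `timeout` format (a Go-style duration string) is an assumption.

```yaml
# Illustrative NucleiScan; field names come from the CRD, values are assumptions.
apiVersion: nuclei.homelab.mortenolsen.pro/v1alpha1
kind: NucleiScan
metadata:
  name: example-scan
  namespace: default
spec:
  sourceRef:                       # all five fields are required by the schema
    apiVersion: networking.k8s.io/v1
    kind: Ingress                  # schema allows Ingress or VirtualService
    name: example-app
    namespace: default
    uid: 00000000-0000-0000-0000-000000000000   # placeholder; use the source's real UID
  targets:                         # at least one target is required (minItems: 1)
  - https://app.example.com
  schedule: "0 2 * * *"            # cron format; omit to scan once
  scannerConfig:
    image: example.invalid/nuclei-scanner:latest   # placeholder image override
    timeout: 30m                   # assumed duration format
    nodeSelector:
      kubernetes.io/os: linux
    resources:
      requests:
        cpu: 100m
        memory: 256Mi
      limits:
        cpu: "1"
        memory: 1Gi
```

The `jobRef` and `scanStartTime` status fields are written by the controller, so they are not set in a manifest like this.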

## Supported Annotations
- nuclei.homelab.mortenolsen.pro/enabled
- nuclei.homelab.mortenolsen.pro/templates
- nuclei.homelab.mortenolsen.pro/severity
- nuclei.homelab.mortenolsen.pro/schedule
- nuclei.homelab.mortenolsen.pro/timeout
- nuclei.homelab.mortenolsen.pro/scanner-image
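
An annotated Ingress using these keys could look roughly like the following. The annotation keys are the ones listed above; the value formats (comma-separated lists, a duration string for the timeout) are assumptions, since this commit message does not specify them, and the host, service, and image names are placeholders.

```yaml
# Sketch of an annotated Ingress; annotation value formats are assumed.
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: example-app
  namespace: default
  annotations:
    nuclei.homelab.mortenolsen.pro/enabled: "true"
    nuclei.homelab.mortenolsen.pro/templates: "cves/,exposures/"      # assumed comma-separated
    nuclei.homelab.mortenolsen.pro/severity: "medium,high,critical"   # assumed comma-separated
    nuclei.homelab.mortenolsen.pro/schedule: "0 2 * * *"              # cron expression
    nuclei.homelab.mortenolsen.pro/timeout: "30m"                     # assumed duration string
    nuclei.homelab.mortenolsen.pro/scanner-image: "example.invalid/nuclei-scanner:latest"
spec:
  rules:
  - host: app.example.com
    http:
      paths:
      - path: /
        pathType: Prefix
        backend:
          service:
            name: example-app
            port:
              number: 80
```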

## RBAC Updates
- Added Job and Pod permissions for operator
- Created separate scanner service account with minimal permissions

## Documentation
- Updated README, user-guide, api.md, and Helm chart README
- Added example annotated Ingress resources
2025-12-12 20:55:09 +01:00

463 lines
20 KiB
YAML

---
apiVersion: apiextensions.k8s.io/v1
kind: CustomResourceDefinition
metadata:
  annotations:
    controller-gen.kubebuilder.io/version: v0.19.0
  name: nucleiscans.nuclei.homelab.mortenolsen.pro
spec:
  group: nuclei.homelab.mortenolsen.pro
  names:
    kind: NucleiScan
    listKind: NucleiScanList
    plural: nucleiscans
    shortNames:
    - ns
    - nscan
    singular: nucleiscan
  scope: Namespaced
  versions:
  - additionalPrinterColumns:
    - jsonPath: .status.phase
      name: Phase
      type: string
    - jsonPath: .status.summary.totalFindings
      name: Findings
      type: integer
    - jsonPath: .spec.sourceRef.kind
      name: Source
      type: string
    - jsonPath: .metadata.creationTimestamp
      name: Age
      type: date
    name: v1alpha1
    schema:
      openAPIV3Schema:
        description: NucleiScan is the Schema for the nucleiscans API
        properties:
          apiVersion:
            description: |-
              APIVersion defines the versioned schema of this representation of an object.
              Servers should convert recognized schemas to the latest internal value, and
              may reject unrecognized values.
              More info: https://git.k8s.io/community/contributors/devel/sig-architecture/api-conventions.md#resources
            type: string
          kind:
            description: |-
              Kind is a string value representing the REST resource this object represents.
              Servers may infer this from the endpoint the client submits requests to.
              Cannot be updated.
              In CamelCase.
              More info: https://git.k8s.io/community/contributors/devel/sig-architecture/api-conventions.md#types-kinds
            type: string
          metadata:
            type: object
          spec:
            description: NucleiScanSpec defines the desired state of NucleiScan
            properties:
              scannerConfig:
                description: ScannerConfig allows overriding scanner settings for
                  this scan
                properties:
                  image:
                    description: Image overrides the default scanner image
                    type: string
                  nodeSelector:
                    additionalProperties:
                      type: string
                    description: NodeSelector for scanner pod scheduling
                    type: object
                  resources:
                    description: Resources defines resource requirements for the scanner
                      pod
                    properties:
                      claims:
                        description: |-
                          Claims lists the names of resources, defined in spec.resourceClaims,
                          that are used by this container.
                          This field depends on the
                          DynamicResourceAllocation feature gate.
                          This field is immutable. It can only be set for containers.
                        items:
                          description: ResourceClaim references one entry in PodSpec.ResourceClaims.
                          properties:
                            name:
                              description: |-
                                Name must match the name of one entry in pod.spec.resourceClaims of
                                the Pod where this field is used. It makes that resource available
                                inside a container.
                              type: string
                            request:
                              description: |-
                                Request is the name chosen for a request in the referenced claim.
                                If empty, everything from the claim is made available, otherwise
                                only the result of this request.
                              type: string
                          required:
                          - name
                          type: object
                        type: array
                        x-kubernetes-list-map-keys:
                        - name
                        x-kubernetes-list-type: map
                      limits:
                        additionalProperties:
                          anyOf:
                          - type: integer
                          - type: string
                          pattern: ^(\+|-)?(([0-9]+(\.[0-9]*)?)|(\.[0-9]+))(([KMGTPE]i)|[numkMGTPE]|([eE](\+|-)?(([0-9]+(\.[0-9]*)?)|(\.[0-9]+))))?$
                          x-kubernetes-int-or-string: true
                        description: |-
                          Limits describes the maximum amount of compute resources allowed.
                          More info: https://kubernetes.io/docs/concepts/configuration/manage-resources-containers/
                        type: object
                      requests:
                        additionalProperties:
                          anyOf:
                          - type: integer
                          - type: string
                          pattern: ^(\+|-)?(([0-9]+(\.[0-9]*)?)|(\.[0-9]+))(([KMGTPE]i)|[numkMGTPE]|([eE](\+|-)?(([0-9]+(\.[0-9]*)?)|(\.[0-9]+))))?$
                          x-kubernetes-int-or-string: true
                        description: |-
                          Requests describes the minimum amount of compute resources required.
                          If Requests is omitted for a container, it defaults to Limits if that is explicitly specified,
                          otherwise to an implementation-defined value. Requests cannot exceed Limits.
                          More info: https://kubernetes.io/docs/concepts/configuration/manage-resources-containers/
                        type: object
                    type: object
                  templateURLs:
                    description: TemplateURLs specifies additional template repositories
                      to clone
                    items:
                      type: string
                    type: array
                  timeout:
                    description: Timeout overrides the default scan timeout
                    type: string
                  tolerations:
                    description: Tolerations for scanner pod scheduling
                    items:
                      description: |-
                        The pod this Toleration is attached to tolerates any taint that matches
                        the triple <key,value,effect> using the matching operator <operator>.
                      properties:
                        effect:
                          description: |-
                            Effect indicates the taint effect to match. Empty means match all taint effects.
                            When specified, allowed values are NoSchedule, PreferNoSchedule and NoExecute.
                          type: string
                        key:
                          description: |-
                            Key is the taint key that the toleration applies to. Empty means match all taint keys.
                            If the key is empty, operator must be Exists; this combination means to match all values and all keys.
                          type: string
                        operator:
                          description: |-
                            Operator represents a key's relationship to the value.
                            Valid operators are Exists and Equal. Defaults to Equal.
                            Exists is equivalent to wildcard for value, so that a pod can
                            tolerate all taints of a particular category.
                          type: string
                        tolerationSeconds:
                          description: |-
                            TolerationSeconds represents the period of time the toleration (which must be
                            of effect NoExecute, otherwise this field is ignored) tolerates the taint. By default,
                            it is not set, which means tolerate the taint forever (do not evict). Zero and
                            negative values will be treated as 0 (evict immediately) by the system.
                          format: int64
                          type: integer
                        value:
                          description: |-
                            Value is the taint value the toleration matches to.
                            If the operator is Exists, the value should be empty, otherwise just a regular string.
                          type: string
                      type: object
                    type: array
                type: object
              schedule:
                description: |-
                  Schedule for periodic rescanning in cron format
                  If empty, scan runs once
                type: string
              severity:
                description: Severity filters scan results by severity level
                items:
                  enum:
                  - info
                  - low
                  - medium
                  - high
                  - critical
                  type: string
                type: array
              sourceRef:
                description: SourceRef references the Ingress or VirtualService being
                  scanned
                properties:
                  apiVersion:
                    description: APIVersion of the source resource
                    type: string
                  kind:
                    description: Kind of the source resource - Ingress or VirtualService
                    enum:
                    - Ingress
                    - VirtualService
                    type: string
                  name:
                    description: Name of the source resource
                    type: string
                  namespace:
                    description: Namespace of the source resource
                    type: string
                  uid:
                    description: UID of the source resource for owner reference
                    type: string
                required:
                - apiVersion
                - kind
                - name
                - namespace
                - uid
                type: object
              suspend:
                description: Suspend prevents scheduled scans from running
                type: boolean
              targets:
                description: Targets is the list of URLs to scan, extracted from the
                  source resource
                items:
                  type: string
                minItems: 1
                type: array
              templates:
                description: |-
                  Templates specifies which Nuclei templates to use
                  If empty, uses default templates
                items:
                  type: string
                type: array
            required:
            - sourceRef
            - targets
            type: object
          status:
            description: NucleiScanStatus defines the observed state of NucleiScan
            properties:
              completionTime:
                description: CompletionTime is when the last scan completed
                format: date-time
                type: string
              conditions:
                description: Conditions represent the latest available observations
                items:
                  description: Condition contains details for one aspect of the current
                    state of this API Resource.
                  properties:
                    lastTransitionTime:
                      description: |-
                        lastTransitionTime is the last time the condition transitioned from one status to another.
                        This should be when the underlying condition changed. If that is not known, then using the time when the API field changed is acceptable.
                      format: date-time
                      type: string
                    message:
                      description: |-
                        message is a human readable message indicating details about the transition.
                        This may be an empty string.
                      maxLength: 32768
                      type: string
                    observedGeneration:
                      description: |-
                        observedGeneration represents the .metadata.generation that the condition was set based upon.
                        For instance, if .metadata.generation is currently 12, but the .status.conditions[x].observedGeneration is 9, the condition is out of date
                        with respect to the current state of the instance.
                      format: int64
                      minimum: 0
                      type: integer
                    reason:
                      description: |-
                        reason contains a programmatic identifier indicating the reason for the condition's last transition.
                        Producers of specific condition types may define expected values and meanings for this field,
                        and whether the values are considered a guaranteed API.
                        The value should be a CamelCase string.
                        This field may not be empty.
                      maxLength: 1024
                      minLength: 1
                      pattern: ^[A-Za-z]([A-Za-z0-9_,:]*[A-Za-z0-9_])?$
                      type: string
                    status:
                      description: status of the condition, one of True, False, Unknown.
                      enum:
                      - "True"
                      - "False"
                      - Unknown
                      type: string
                    type:
                      description: type of condition in CamelCase or in foo.example.com/CamelCase.
                      maxLength: 316
                      pattern: ^([a-z0-9]([-a-z0-9]*[a-z0-9])?(\.[a-z0-9]([-a-z0-9]*[a-z0-9])?)*/)?(([A-Za-z0-9][-A-Za-z0-9_.]*)?[A-Za-z0-9])$
                      type: string
                  required:
                  - lastTransitionTime
                  - message
                  - reason
                  - status
                  - type
                  type: object
                type: array
                x-kubernetes-list-map-keys:
                - type
                x-kubernetes-list-type: map
              findings:
                description: |-
                  Findings contains the array of scan results from Nuclei JSONL output
                  Each element is a parsed JSON object from Nuclei output
                items:
                  description: Finding represents a single Nuclei scan finding
                  properties:
                    description:
                      description: Description provides details about the finding
                      type: string
                    extractedResults:
                      description: ExtractedResults contains any data extracted by
                        the template
                      items:
                        type: string
                      type: array
                    host:
                      description: Host that was scanned
                      type: string
                    matchedAt:
                      description: MatchedAt is the specific URL or endpoint where
                        the issue was found
                      type: string
                    metadata:
                      description: Metadata contains additional template metadata
                      type: object
                      x-kubernetes-preserve-unknown-fields: true
                    reference:
                      description: Reference contains URLs to additional information
                        about the finding
                      items:
                        type: string
                      type: array
                    severity:
                      description: Severity of the finding
                      type: string
                    tags:
                      description: Tags associated with the finding
                      items:
                        type: string
                      type: array
                    templateId:
                      description: TemplateID is the Nuclei template identifier
                      type: string
                    templateName:
                      description: TemplateName is the human-readable template name
                      type: string
                    timestamp:
                      description: Timestamp when the finding was discovered
                      format: date-time
                      type: string
                    type:
                      description: Type of the finding - http, dns, ssl, etc.
                      type: string
                  required:
                  - host
                  - severity
                  - templateId
                  - timestamp
                  type: object
                type: array
              jobRef:
                description: JobRef references the current or last scanner job
                properties:
                  name:
                    description: Name of the Job
                    type: string
                  podName:
                    description: PodName is the name of the scanner pod (for log retrieval)
                    type: string
                  startTime:
                    description: StartTime when the job was created
                    format: date-time
                    type: string
                  uid:
                    description: UID of the Job
                    type: string
                required:
                - name
                - uid
                type: object
              lastError:
                description: LastError contains the error message if the scan failed
                type: string
              lastRetryTime:
                description: LastRetryTime is when the last availability check retry
                  occurred
                format: date-time
                type: string
              lastScanTime:
                description: LastScanTime is when the last scan was initiated
                format: date-time
                type: string
              nextScheduledTime:
                description: NextScheduledTime is when the next scheduled scan will
                  run
                format: date-time
                type: string
              observedGeneration:
                description: ObservedGeneration is the generation observed by the
                  controller
                format: int64
                type: integer
              phase:
                description: Phase represents the current scan phase
                enum:
                - Pending
                - Running
                - Completed
                - Failed
                type: string
              retryCount:
                description: |-
                  RetryCount tracks the number of consecutive availability check retries
                  Used for exponential backoff when waiting for targets
                type: integer
              scanStartTime:
                description: ScanStartTime is when the scanner pod actually started
                  scanning
                format: date-time
                type: string
              summary:
                description: Summary provides aggregated scan statistics
                properties:
                  durationSeconds:
                    description: DurationSeconds is the duration of the scan in seconds
                    format: int64
                    type: integer
                  findingsBySeverity:
                    additionalProperties:
                      type: integer
                    description: FindingsBySeverity breaks down findings by severity
                      level
                    type: object
                  targetsScanned:
                    description: TargetsScanned is the number of targets that were
                      scanned
                    type: integer
                  totalFindings:
                    description: TotalFindings is the total number of findings
                    type: integer
                required:
                - targetsScanned
                - totalFindings
                type: object
            type: object
        type: object
    served: true
    storage: true
    subresources:
      status: {}