
# High Availability

To gain a higher level of availability for your Instance, you can

- create more Kubernetes Cluster Nodes
- create more replicas of the nscale and nplus components
- distribute those replicas across multiple nodes using anti-affinities (see the sketch below)
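The anti-affinities themselves are set by the chart through the values shown further down. Purely for illustration, and not necessarily in the exact form the chart renders it, a soft pod anti-affinity that spreads replicas across nodes looks roughly like this in plain Kubernetes terms (the `app: nappl` label is an assumption, the chart's actual labels may differ):

```yaml
# Sketch only: a preferred (soft) anti-affinity asking the scheduler to place
# replicas carrying the same label on different nodes where possible.
affinity:
  podAntiAffinity:
    preferredDuringSchedulingIgnoredDuringExecution:
      - weight: 100
        podAffinityTerm:
          labelSelector:
            matchLabels:
              app: nappl              # assumption: the label grouping the replicas
          topologyKey: kubernetes.io/hostname
```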

This is how you install it:

```sh
helm install \
  --values samples/ha/values.yaml \
  --values samples/environment/demo.yaml \
  sample-ha nplus/nplus-instance
```
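Note that later `--values` files override earlier ones. If you want to see what the layered values render to before installing anything, a dry run with `helm template` (same values, assuming the `nplus` repo is already added) is a quick check:

```sh
# Render the manifests locally without installing anything
helm template \
  --values samples/ha/values.yaml \
  --values samples/environment/demo.yaml \
  sample-ha nplus/nplus-instance | less
```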

The essentials of the values file are:

- We use three (3) nscale Server Application Layers: two dedicated to user access, one dedicated to jobs
- If the jobs node fails, the user nodes take over the jobs (handled by priority)
- If one of the user nodes fails, the other one handles the load
- Kubernetes takes care of restarting nodes should that happen
- All components run with two replicas
- Pod anti-affinities handle the distribution
- Any administration component only connects to the jobs nappl, leaving the user nodes to the users
- PodDisruptionBudgets are defined for the crucial components. They are set via minReplicaCount for the components that support multiple replicas, and via minReplicaCountType for the first ReplicaSet of the components that do not, in this case nstla.

The relevant part of `samples/ha/values.yaml` looks like this:
```yaml
web:
  replicaCount: 2
  minReplicaCount: 1
rs:
  replicaCount: 2
  minReplicaCount: 1
ilm:
  replicaCount: 2
  minReplicaCount: 1
cmis:
  replicaCount: 2
  minReplicaCount: 1
webdav:
  replicaCount: 2
  minReplicaCount: 1
nstla:
  minReplicaCountType: 1
administrator:
  nappl:
    host: "{{ .component.prefix }}nappljobs.{{ .Release.Namespace }}"
  waitFor:
    - "-service {{ .component.prefix }}nappljobs.{{ .Release.Namespace }}.svc.cluster.local:{{ .this.nappl.port }} -timeout 600"
pam:
  nappl:
    host: "{{ .component.prefix }}nappljobs.{{ .Release.Namespace }}"
  waitFor:
    - "-service {{ .component.prefix }}nappljobs.{{ .Release.Namespace }}.svc.cluster.local:{{ .this.nappl.port }} -timeout 600"
nappl:
  replicaCount: 2
  minReplicaCount: 1
  jobs: false
  waitFor:
    - "-service {{ .component.prefix }}nappljobs.{{ .Release.Namespace }}.svc.cluster.local:{{ .this.nappl.port }} -timeout 600"
nappljobs:
  replicaCount: 1
  jobs: true
  disableSessionReplication: true
  ingress:
    enabled: false
  snc:
    enabled: true
  waitFor:
    - "-service {{ .component.prefix }}database.{{ .Release.Namespace }}.svc.cluster.local:5432 -timeout 600"
application:
  nstl:
    host: "{{ .component.prefix }}nstl-cluster.{{ .Release.Namespace }}"
  nappl:
    host: "{{ .component.prefix }}nappljobs.{{ .Release.Namespace }}"
```
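Once the release is running, a quick way to confirm the setup behaves as described, with replicas spread across nodes and PodDisruptionBudgets in place, is plain `kubectl` (replace the namespace with your own):

```sh
# Show each pod together with the node it landed on (anti-affinity spread)
kubectl get pods -n <namespace> -o wide

# List the PodDisruptionBudgets derived from minReplicaCount / minReplicaCountType
kubectl get pdb -n <namespace>
```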