# High Availability
To gain a higher level of availability for your Instance, you can:

- create more Kubernetes cluster nodes
- create more replicas of the *nscale* and *nplus* components
- distribute those replicas across multiple nodes using anti-affinities

This is how:
```
helm install \
  --values samples/ha/values.yaml \
  --values samples/environment/demo.yaml \
  sample-ha nplus/nplus-instance
```
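
For orientation, distributing replicas across nodes with anti-affinities usually ends up as a `podAntiAffinity` rule in the pod template. The fragment below is a minimal sketch using the standard Kubernetes fields, not the output of the *nplus* chart. The label key and value are hypothetical placeholders, and whether the chart uses a preferred or a required rule is an assumption.

```yaml
# Illustrative pod template fragment (not rendered from the nplus chart).
affinity:
  podAntiAffinity:
    # "preferred" lets the scheduler co-locate replicas if no other node is free;
    # a "required" rule would leave the second replica pending instead.
    preferredDuringSchedulingIgnoredDuringExecution:
      - weight: 100
        podAffinityTerm:
          # Spread replicas across distinct nodes.
          topologyKey: kubernetes.io/hostname
          labelSelector:
            matchLabels:
              app.kubernetes.io/component: web  # hypothetical label
```

After installation, `kubectl get pods -o wide` shows the node each replica was scheduled on.
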
The essence of the values file is this:

- We use three (3) *nscale Server Application Layer* instances: two dedicated to user access, one dedicated to jobs
  - if the jobs node fails, the user nodes take over the jobs (handled by priority)
  - if one of the user nodes fails, the other one handles the load
  - Kubernetes takes care of restarting nodes should that happen
- All components run with two replicas
- Pod anti-affinities handle the distribution
- Any administration component only connects to the jobs nappl, leaving the user nodes to the users
- PodDisruptionBudgets are defined for the crucial components. These are set via `minReplicaCount` for the components that can support multiple replicas, and via `minReplicaCountType` for the **first** replicaSet of the components that do not support replicas, in this case nstla.
```
web:
  replicaCount: 2
  minReplicaCount: 1
rs:
  replicaCount: 2
  minReplicaCount: 1
ilm:
  replicaCount: 2
  minReplicaCount: 1
cmis:
  replicaCount: 2
  minReplicaCount: 1
webdav:
  replicaCount: 2
  minReplicaCount: 1
nstla:
  minReplicaCountType: 1
administrator:
  nappl:
    host: "{{ .component.prefix }}nappljobs.{{ .Release.Namespace }}"
  waitFor:
    - "-service {{ .component.prefix }}nappljobs.{{ .Release.Namespace }}.svc.cluster.local:{{ .this.nappl.port }} -timeout 600"
pam:
  nappl:
    host: "{{ .component.prefix }}nappljobs.{{ .Release.Namespace }}"
  waitFor:
    - "-service {{ .component.prefix }}nappljobs.{{ .Release.Namespace }}.svc.cluster.local:{{ .this.nappl.port }} -timeout 600"
nappl:
  replicaCount: 2
  minReplicaCount: 1
  jobs: false
  waitFor:
    - "-service {{ .component.prefix }}nappljobs.{{ .Release.Namespace }}.svc.cluster.local:{{ .this.nappl.port }} -timeout 600"
nappljobs:
  replicaCount: 1
  jobs: true
  disableSessionReplication: true
  ingress:
    enabled: false
  snc:
    enabled: true
  waitFor:
    - "-service {{ .component.prefix }}database.{{ .Release.Namespace }}.svc.cluster.local:5432 -timeout 600"
application:
  nstl:
    host: "{{ .component.prefix }}nstl-cluster.{{ .Release.Namespace }}"
  nappl:
    host: "{{ .component.prefix }}nappljobs.{{ .Release.Namespace }}"
```
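
For reference, a PodDisruptionBudget is a standard Kubernetes object that limits how many pods of a component may be evicted at once, for example while a node is drained. The manifest below is a minimal sketch of what a budget for a component with `minReplicaCount: 1` could look like. It assumes the chart maps that value to `minAvailable`, which this README does not confirm, and the object name and label are hypothetical.

```yaml
# Illustrative manifest (not rendered from the nplus chart).
apiVersion: policy/v1
kind: PodDisruptionBudget
metadata:
  name: sample-ha-web  # hypothetical name
spec:
  # Keep at least one pod running during voluntary disruptions.
  minAvailable: 1
  selector:
    matchLabels:
      app.kubernetes.io/component: web  # hypothetical label
```

`kubectl get pdb` lists the budgets that actually exist in the namespace after installation.
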