# High Availability

To gain a higher level of availability for your Instance, you can:

- create more Kubernetes cluster nodes
- create more replicas of the *nscale* and *nplus* components
- distribute those replicas across multiple nodes using anti-affinities

This is how:

```
helm install \
    --values samples/ha/values.yaml \
    --values samples/environment/demo.yaml \
    sample-ha nplus/nplus-instance
```

The essence of the values file is this:

- We use three (3) *nscale Server Application Layer* instances: two dedicated to user access, one dedicated to jobs
  - if the jobs node fails, the user nodes take over the jobs (handled by priority)
  - if one of the user nodes fails, the other one handles the load
- Kubernetes takes care of restarting nodes should that happen
- All other components that support multiple replicas run with two replicas
- Pod anti-affinities handle the distribution
- Any administration component only connects to the jobs nappl, leaving the user nodes to the users
- PodDisruptionBudgets are defined for the crucial components. These are set via `minReplicaCount` for the components that support multiple replicas, and via `minReplicaCountType` for the **first** replicaSet of the components that do not support replicas, in this case nstla.

```
web:
  replicaCount: 2
  minReplicaCount: 1
rs:
  replicaCount: 2
  minReplicaCount: 1
ilm:
  replicaCount: 2
  minReplicaCount: 1
cmis:
  replicaCount: 2
  minReplicaCount: 1
webdav:
  replicaCount: 2
  minReplicaCount: 1
nstla:
  minReplicaCountType: 1
administrator:
  nappl:
    host: "{{ .component.prefix }}nappljobs.{{ .Release.Namespace }}"
  waitFor:
    - "-service {{ .component.prefix }}nappljobs.{{ .Release.Namespace }}.svc.cluster.local:{{ .this.nappl.port }} -timeout 600"
pam:
  nappl:
    host: "{{ .component.prefix }}nappljobs.{{ .Release.Namespace }}"
  waitFor:
    - "-service {{ .component.prefix }}nappljobs.{{ .Release.Namespace }}.svc.cluster.local:{{ .this.nappl.port }} -timeout 600"
nappl:
  replicaCount: 2
  minReplicaCount: 1
  jobs: false
  waitFor:
    - "-service {{ .component.prefix }}nappljobs.{{ .Release.Namespace }}.svc.cluster.local:{{ .this.nappl.port }} -timeout 600"
nappljobs:
  replicaCount: 1
  jobs: true
  disableSessionReplication: true
  ingress:
    enabled: false
  snc:
    enabled: true
  waitFor:
    - "-service {{ .component.prefix }}database.{{ .Release.Namespace }}.svc.cluster.local:5432 -timeout 600"
application:
  nstl:
    host: "{{ .component.prefix }}nstl-cluster.{{ .Release.Namespace }}"
  nappl:
    host: "{{ .component.prefix }}nappljobs.{{ .Release.Namespace }}"
```
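
To make the `minReplicaCount: 1` entries above more concrete, here is a minimal sketch of a standard Kubernetes PodDisruptionBudget that such a setting could translate into. The chart's actual template is not shown in this document, so the resource name, the label keys, and the mapping of `minReplicaCount` to `minAvailable` are assumptions for illustration only.

```yaml
# Hypothetical rendering, not taken from the nplus chart.
apiVersion: policy/v1
kind: PodDisruptionBudget
metadata:
  name: sample-ha-web                          # assumed resource name
spec:
  # Assumed to correspond to web.minReplicaCount: 1 in the values file.
  # Voluntary disruptions (e.g. node drains) are blocked if they would
  # leave fewer than one matching pod available.
  minAvailable: 1
  selector:
    matchLabels:
      app.kubernetes.io/component: web         # assumed label
      app.kubernetes.io/instance: sample-ha    # assumed label
```

With two replicas and `minAvailable: 1`, a node drain can evict one pod at a time while the other keeps serving requests.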
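
The excerpt above does not show the anti-affinity rules themselves. For orientation, this is a generic Kubernetes pod anti-affinity stanza as it would appear under a pod template's `spec`. It is a sketch of the general mechanism, not the chart's own configuration; the label used in the selector is an assumption.

```yaml
# Generic Kubernetes pod spec fragment, not taken from the nplus chart.
affinity:
  podAntiAffinity:
    # "preferred" lets the scheduler co-locate replicas when there are
    # fewer nodes than replicas; a "required" rule would instead leave
    # the surplus pods pending.
    preferredDuringSchedulingIgnoredDuringExecution:
      - weight: 100
        podAffinityTerm:
          # Spread across nodes: pods sharing this label should not
          # land on the same hostname.
          topologyKey: kubernetes.io/hostname
          labelSelector:
            matchLabels:
              app.kubernetes.io/component: web   # assumed label
```

A soft (preferred) rule is the more forgiving choice for small clusters, at the cost of not guaranteeing that replicas end up on different nodes.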