Provisioning and HA

All Dashboard components are provisioned with some degree of redundancy and resilience.

Provisioning Overview

rectangle prod-noc-alarms01.geant.org {
   component "Collector" as COL1
   component "Classifier" as CLA1
   component "Archiver" as ARC1
   component "Correlator" as COR1
   queue "RabbitMQ node" as RMQ1
   database "MariaDB node" as SQL1
   database "Redis node" as Redis1
}

rectangle prod-noc-alarms02.geant.org {
   component "Collector" as COL2
   component "Classifier" as CLA2
   component "Archiver" as ARC2
   component "Correlator" as COR2
   queue "RabbitMQ node" as RMQ2
   database "MariaDB node" as SQL2
   database "Redis node" as Redis2
}

rectangle prod-noc-alarms03.geant.org {
   component "Collector" as COL3
   component "Classifier" as CLA3
   component "Archiver" as ARC3
   component "Correlator" as COR3
   queue "RabbitMQ node" as RMQ3
   database "MariaDB node" as SQL3
   database "Redis node" as Redis3
}

COL1 <--> COL2
COL2 <--> COL3
COR1 <--> COR2
COR2 <--> COR3
RMQ1 <--> RMQ2
RMQ2 <--> RMQ3
SQL1 <--> SQL2
SQL2 <--> SQL3
Redis1 <--> Redis2
Redis2 <--> Redis3

Data Processing Pipeline

The system is composed of several processes that communicate with each other through a RabbitMQ message broker.

The basic RabbitMQ pipeline is shown below:

TODO: remove the second diagram; it is kept for now as an example. The first diagram uses the newer syntax.

start
note right
  SNMP traps
  from NEs
end note

fork
  #lightgreen:Collector-01;
  note right: leader
fork again
  :Collector-02;
fork again
  :Collector-03;
end fork

:broker}
note right
  dashboard.collection
  exchange
end note

split
   fork
     :Classifier-01;
   fork again
     :Classifier-02;
   fork again
     :Classifier-03;
   end fork

   :broker}
   note right
     dashboard.classified
     exchange
   end note

   fork
     #lightgreen:Correlator-01;
     note right: leader
   fork again
     #lightgrey:Correlator-02;
   fork again
     #lightgrey:Correlator-03;
      note right
         followers don't
         consume messages
      end note
   end fork

   :broker}
   note right
     dashboard.external.notifications
     exchange
   end note

   fork
     :TTS Notifier-01;
   fork again
     :TTS Notifier-02;
   fork again
     :TTS Notifier-03;
   end fork

   :Email Server;
   stop

split again

   fork
     :Archiver-01;
   fork again
     :Archiver-02;
   fork again
     :Archiver-03;
   end fork

   :Elasticsearch;
   stop

end split

"NE #1" --> ===SNMP===
"NE #2" --> ===SNMP===
"NE #3" --> ===SNMP===
"NE #4" --> ===SNMP===
"..." --> ===SNMP===

===SNMP=== --> "Collector-01"
===SNMP=== --> "Collector-02"
===SNMP=== --> "Collector-03"

"Collector-01" --> ===TRAPS===
"Collector-02" --> ===TRAPS===
"Collector-03" --> ===TRAPS===

===TRAPS=== --> "Classifier-01"
===TRAPS=== --> "Classifier-02"
===TRAPS=== --> "Classifier-03"
===TRAPS=== --> "Archiver-01"
===TRAPS=== --> "Archiver-02"
===TRAPS=== --> "Archiver-03"

"Classifier-01" --> ===CLASSIFIED===
"Classifier-02" --> ===CLASSIFIED===
"Classifier-03" --> ===CLASSIFIED===

===CLASSIFIED=== --> "Correlator-01"
===CLASSIFIED=== --> "Correlator-02"
===CLASSIFIED=== --> "Correlator-03"

"Correlator-01" --> ===MESSAGES===

===MESSAGES=== --> "tts notifier-01"
===MESSAGES=== --> "tts notifier-02"
===MESSAGES=== --> "tts notifier-03"
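
The fan-out at the dashboard.collection exchange can be modelled in a few lines of Python. This is a minimal in-memory sketch, not the project's code: FanoutExchange, bind_group and make_worker are hypothetical names, and it assumes that the instances of one component share a queue, so that each trap reaches exactly one Classifier and one Archiver. The diagrams suggest this but do not state it.

```python
import itertools


class FanoutExchange:
    """In-memory stand-in for a broker exchange (not a RabbitMQ client)."""

    def __init__(self, name):
        self.name = name
        self._groups = {}  # group name -> round-robin iterator over consumers

    def bind_group(self, group, consumers):
        # Assumption: members of a group compete for messages on one queue,
        # modelled here as simple round-robin delivery.
        self._groups[group] = itertools.cycle(consumers)

    def publish(self, message):
        # Each bound group receives its own copy of the message;
        # within a group only one member handles it.
        return {
            group: next(members)(message)
            for group, members in self._groups.items()
        }


def make_worker(name, log):
    """Return a consumer callback that records what it handled."""
    def handle(message):
        log.append((name, message))
        return name
    return handle


handled = []
collection = FanoutExchange("dashboard.collection")
collection.bind_group(
    "classifier",
    [make_worker(f"Classifier-0{i}", handled) for i in (1, 2, 3)],
)
collection.bind_group(
    "archiver",
    [make_worker(f"Archiver-0{i}", handled) for i in (1, 2, 3)],
)

for trap in ("trap-a", "trap-b"):
    print(trap, "->", collection.publish(trap))
```

Each published trap is handled once per group, which mirrors the split in the first diagram: one copy continues towards the Correlator, the other is archived.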

RabbitMQ

A RabbitMQ cluster is provisioned across all three nodes, with all queues mirrored.

node "RabbitMQ 01" as RMQ1 {
  rectangle "exchanges" as EX1 #lightgreen
  rectangle "dashboard* queues" as Q1 #lightblue
}

node "RabbitMQ 02" as RMQ2 {
  rectangle "exchanges" as EX2 #lightgreen
  rectangle "dashboard* queues" as Q2 #lightblue
}

node "RabbitMQ 03" as RMQ3 {
  rectangle "exchanges" as EX3 #lightgreen
  rectangle "dashboard* queues" as Q3 #lightblue
}

EX1 <..> EX2 : "mirroring"
EX2 <..> EX3 : "mirroring"
Q1 <..> Q2 : "mirroring"
Q2 <..> Q3 : "mirroring"

note as N1
    Modern APIs support randomized
    server selection and automatic
    iteration until an available host
    is found
end note

'center
RMQ2 -[hidden]right--> N1
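
The client behaviour described in the note (randomized server selection, then iteration until an available host is found) can be sketched as follows. This is an illustration under stated assumptions rather than the code used by the Dashboard: connect_first_available and tcp_connect are hypothetical helpers, the host names are taken from the provisioning overview, and port 5672 is the AMQP default, which may not match the deployment. In practice the client library would normally provide this logic itself.

```python
import random
import socket

# Host names from the provisioning overview; assumed to run the broker nodes.
HOSTS = [
    "prod-noc-alarms01.geant.org",
    "prod-noc-alarms02.geant.org",
    "prod-noc-alarms03.geant.org",
]


def connect_first_available(hosts, connect, rng=random):
    """Try hosts in random order and return (host, connection) on first success."""
    candidates = list(hosts)
    rng.shuffle(candidates)  # spreads clients across the cluster
    errors = {}
    for host in candidates:
        try:
            return host, connect(host)
        except OSError as exc:  # covers refused connections and timeouts
            errors[host] = exc
    raise ConnectionError(f"no broker reachable: {errors}")


def tcp_connect(host, port=5672, timeout=2.0):
    # Plain TCP connection as a placeholder for a real AMQP connection.
    # Port 5672 is an assumption (AMQP default).
    return socket.create_connection((host, port), timeout=timeout)
```

A caller would use it as `host, conn = connect_first_available(HOSTS, tcp_connect)`. Passing the connect function as a parameter keeps the selection logic independent of any particular client library.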

Galera Cluster

Alarms are stored in a MariaDB Galera cluster. The database is the interface between the backend processing provided by this project and the GUI.

The Galera cluster is provisioned roughly as follows, with clients connecting through HAProxy:

cloud Galera as "Galera Cluster"  {
  database MDB1 as "MariaDB-01"
  database MDB2 as "MariaDB-02"
  database MDB3 as "MariaDB-03"
}


node HAProxy
HAProxy -up--> MDB1
HAProxy -up--> MDB2
HAProxy -up--> MDB3

node client as "mysql client"
client -up--> HAProxy

Inventory Provider and Redis

The dashboard servers are provisioned with a Redis cluster. It serves as the backend of the Inventory Provider; dashboard components do not access it directly.

collections Redis as "Redis Cluster"
collections Sentinel as "Sentinel Group"
node "Inventory Provider 01" as IV1 #lightgreen
node "Inventory Provider 02" as IV2 #lightgreen
node HAProxy
node classifier as "classifier(s)" #lightgreen

HAProxy --> IV1
HAProxy --> IV2
classifier --> HAProxy
IV1 --> Sentinel
IV2 --> Sentinel
Sentinel --> Redis