Detail of proposed monitoring system components

These pages propose a set of components that would enable an operational monitoring system, based on the local datastore/aggregator model discussed in December 2013, to meet a representative set of use cases. We focus on a small number of use cases in order to describe a coherent system; the intent is that the design should be able to address all use cases.

The next steps are to determine whether this set of components is reasonable (necessary and sufficient) to meet the representative use cases, and to determine who can implement each piece. Please discuss these questions on monitoring@geni.net, or on the call on Tuesday, January 14th. You can also reach GPO infra at gpo-infra@geni.net with any questions or comments.

Detail drawings and components of representative use cases

Each page describes an implementation strategy and set of needed components for a representative use case, based on the local datastore/aggregator model.
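
For readers new to the model, here is a minimal sketch of one possible reading of the local datastore/aggregator arrangement. Each rack or aggregate keeps its own datastore, an aggregator polls those datastores, and alerting works from what the aggregator has collected. All class, function, and metric names here (`LocalDatastore`, `Aggregator`, `find_alerts`, `cpu_load`) are hypothetical and chosen for illustration only; this is not the design agreed in December 2013.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class LocalDatastore:
    """Hypothetical stand-in for one rack's or aggregate's local datastore."""
    name: str
    metrics: Dict[str, float] = field(default_factory=dict)

    def poll(self) -> Dict[str, float]:
        # Stand-in for a datastore polling API call: return a snapshot
        # of the values this datastore currently holds.
        return dict(self.metrics)


@dataclass
class Aggregator:
    """Hypothetical aggregator that polls every configured datastore."""
    datastores: List[LocalDatastore]
    latest: Dict[str, Dict[str, float]] = field(default_factory=dict)

    def collect(self) -> None:
        # Poll each datastore once and keep the most recent snapshot.
        for datastore in self.datastores:
            self.latest[datastore.name] = datastore.poll()


def find_alerts(aggregator: Aggregator, metric: str, threshold: float) -> List[str]:
    """Return names of datastores whose latest value for `metric` exceeds `threshold`."""
    return sorted(
        name
        for name, snapshot in aggregator.latest.items()
        if snapshot.get(metric, 0.0) > threshold
    )


if __name__ == "__main__":
    racks = [
        LocalDatastore("rack-a", {"cpu_load": 0.95}),
        LocalDatastore("rack-b", {"cpu_load": 0.20}),
    ]
    aggregator = Aggregator(racks)
    aggregator.collect()
    print(find_alerts(aggregator, "cpu_load", 0.9))
```

In this sketch the aggregator only pulls data; nothing is pushed from the racks. That matches the idea of a polling API, but the real interface may differ.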

Overview of all components

| ID | Name | Use cases this supports | POC | Other Participants |
|----|------|-------------------------|-----|--------------------|
| (a) | Operational Alerting | 3, 4, 5, 7 | Tim | |
| (b) | Aggregator | 3, 4, 5, 6, 7 | Ryan | Cody, Mitch, Anirban |
| (c) | Datastore polling API | 3, 4, 5, 6, 7 | Ryan | Cody, Mitch, Anirban |
| (d) | Local datastore (ExoGENI) | 3, 5, 6, 7 | Jonathan | |
| (d) | Local datastore (FOAM) | 5, 6 | Nick | |
| (d) | Local datastore (ION AM) | 5, 6, 7 | | |
| (d) | Local datastore (InstaGENI) | 3, 5, 6, 7 | Gary | |
| (d) | Local datastore (OESS AM) | 5, 6, 7 | Luke | |
| (d) | Local datastore (clearinghouse) | 6 | GPO | |
| (d) | Local datastore (config data: operators to notify) | 3, 4, 5, 7 | | |
| (d) | Local datastore (config data: racks/AMs to query) | 3, 4, 5, 7 | Mitch | |
| (d) | Local datastore (external checks) | 4, 5, 7 | | |
| (e) | Control network traffic counters (ExoGENI) | 5 | | |
| (e) | Control network traffic counters (FOAM) | 5 | | |
| (e) | Control network traffic counters (ION AM) | 5 | | |
| (e) | Control network traffic counters (InstaGENI) | 5 | | |
| (e) | Control network traffic counters (OESS AM) | 5 | | |
| (e) | Dataplane traffic counters/VLAN data (ExoGENI) | 7 | | |
| (e) | Dataplane traffic counters/VLAN data (ION AM) | 7 | | |
| (e) | Dataplane traffic counters/VLAN data (InstaGENI) | 7 | | |
| (e) | Dataplane traffic counters/VLAN data (OESS AM) | 7 | | |
| (e) | Shared node metrics (ExoGENI) | 3 | Jonathan | |
| (e) | Shared node metrics (InstaGENI) | 3 | Gary | |
| (e) | Sliver/resource mapping data (ExoGENI) | 6 | | |
| (e) | Sliver/resource mapping data (FOAM) | 6 | | |
| (e) | Sliver/resource mapping data (ION AM) | 6 | | |
| (e) | Sliver/resource mapping data (InstaGENI) | 6 | | |
| (e) | Sliver/resource mapping data (OESS AM) | 6 | | |
| (f) | External Checks (generic web service checks) | 4 | | |
| (f) | External Checks (ping) | 5 | | |
| (f) | External Checks (using AM API) | 4, 7 | Ali | |
| (g) | Operational Reports | 6 | | |
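
To help check whether the component set is necessary and sufficient, and to see which pieces still need an implementer, the overview table can be transcribed as data and indexed by use case. The sketch below does this. The helper names (`per_aggregate`, `components_by_use_case`, `without_poc`) are hypothetical, and the POC values reflect one reading of the table in which a row with no name listed is treated as unassigned.

```python
from collections import defaultdict

ALL_AMS = ["ExoGENI", "FOAM", "ION AM", "InstaGENI", "OESS AM"]
NO_FOAM = ["ExoGENI", "ION AM", "InstaGENI", "OESS AM"]


def per_aggregate(cid, name, aggregates, use_cases, pocs=None):
    """Expand one component family into one row per aggregate type."""
    pocs = pocs or {}
    return [(cid, f"{name} ({agg})", use_cases, pocs.get(agg)) for agg in aggregates]


# Rows are (ID, name, use cases, POC or None), transcribed from the overview table.
COMPONENTS = [
    ("a", "Operational Alerting", {3, 4, 5, 7}, "Tim"),
    ("b", "Aggregator", {3, 4, 5, 6, 7}, "Ryan"),
    ("c", "Datastore polling API", {3, 4, 5, 6, 7}, "Ryan"),
    ("d", "Local datastore (ExoGENI)", {3, 5, 6, 7}, "Jonathan"),
    ("d", "Local datastore (FOAM)", {5, 6}, "Nick"),
    ("d", "Local datastore (ION AM)", {5, 6, 7}, None),
    ("d", "Local datastore (InstaGENI)", {3, 5, 6, 7}, "Gary"),
    ("d", "Local datastore (OESS AM)", {5, 6, 7}, "Luke"),
    ("d", "Local datastore (clearinghouse)", {6}, "GPO"),
    ("d", "Local datastore (config data: operators to notify)", {3, 4, 5, 7}, None),
    ("d", "Local datastore (config data: racks/AMs to query)", {3, 4, 5, 7}, "Mitch"),
    ("d", "Local datastore (external checks)", {4, 5, 7}, None),
    *per_aggregate("e", "Control network traffic counters", ALL_AMS, {5}),
    *per_aggregate("e", "Dataplane traffic counters/VLAN data", NO_FOAM, {7}),
    *per_aggregate("e", "Shared node metrics", ["ExoGENI", "InstaGENI"], {3},
                   {"ExoGENI": "Jonathan", "InstaGENI": "Gary"}),
    *per_aggregate("e", "Sliver/resource mapping data", ALL_AMS, {6}),
    ("f", "External Checks (generic web service checks)", {4}, None),
    ("f", "External Checks (ping)", {5}, None),
    ("f", "External Checks (using AM API)", {4, 7}, "Ali"),
    ("g", "Operational Reports", {6}, None),
]


def components_by_use_case(components):
    """Map each use case number to the names of the components that support it."""
    index = defaultdict(list)
    for _cid, name, use_cases, _poc in components:
        for use_case in sorted(use_cases):
            index[use_case].append(name)
    return dict(index)


def without_poc(components):
    """Return the names of components with no POC listed."""
    return [name for _cid, name, _use_cases, poc in components if poc is None]


if __name__ == "__main__":
    for use_case, names in sorted(components_by_use_case(COMPONENTS).items()):
        print(f"use case {use_case}: {len(names)} components")
    print(f"{len(without_poc(COMPONENTS))} components have no POC listed")
```

Grouping by use case makes it easier to review one use case page at a time against the components it depends on, and the unassigned list is a starting point for deciding who can implement each piece.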