Paper | Title | Page
THPD02 | What it Takes to Make a System Reliable | 139
- M.R. Clausen, M. Möller, S. Rettig-Labusga, B. Schoeneburg
DESY, Hamburg, Germany
What is a reliable system, and how is reliability defined? The answer depends on the actual situation and on the environment in which the system is operated. If you can rely on a scheduled downtime of the controlled system every week, reliability is measured in hours or weeks: the system only has to run longer than the interval between scheduled downtimes. If the system has to operate continuously for months or even years, the requirements rise. Where continuous operation must be guaranteed even during software or hardware updates, redundant systems come into play. The hardware selection process is driven by basic requirements such as 'no moving parts' or 'redundant power supplies'. The first requirement implies fan-less CPU boards with passive cooling. It also rules out hard disks, which in turn narrows the choice of operating systems. Continuous operation during updates requires redundant controllers/CPUs in addition to redundant power supplies. Controller redundancy has a large impact on the software running inside the controllers. We describe the selection process for the components we have chosen and summarize our experience from several years of operation.
Poster THPD02 [0.280 MB]