Author: Varela, F.
Paper Title Page
WEBHAUST01 LHCb Online Infrastructure Monitoring Tools 618
 
  • L.G. Cardoso, C. Gaspar, C. Haen, N. Neufeld, F. Varela
    CERN, Geneva, Switzerland
  • D. Galli
    INFN-Bologna, Bologna, Italy
 
  The Online System of the LHCb experiment at CERN is composed of a very large number of PCs: around 1500 in a CPU farm that performs the High Level Trigger; around 170 for the control system, running the PVSS SCADA system; and several others for data monitoring, reconstruction, storage, and infrastructure tasks such as databases. Some PCs run Linux and some run Windows, but all of them need to be remotely controlled and monitored to make sure they are running correctly and so that they can, for example, be rebooted whenever necessary. A set of tools was developed to centrally monitor the status of all PCs and PVSS projects needed to run the experiment: a Farm Monitoring and Control (FMC) tool, which provides the lower-level access to the PCs, and a System Overview Tool (developed within the Joint Controls Project, JCOP), which provides a centralized interface to the FMC tool and adds PVSS project monitoring and control. The implementation of these tools has provided a reliable and efficient way to manage the system, both during normal operations and during shutdowns, upgrades, or maintenance operations. This paper presents the particular implementation of this tool in the LHCb experiment and the benefits of its usage in a large-scale heterogeneous system.
Slides WEBHAUST01 [3.211 MB]
 
WEPMU033 Monitoring Control Applications at CERN 1141
 
  • F. Varela, F.B. Bernard, M. Gonzalez-Berges, H. Milcent, L.B. Petrova
    CERN, Geneva, Switzerland
 
  The Industrial Controls and Engineering (EN-ICE) group of the Engineering Department at CERN has produced, and is responsible for the operation of, around 60 applications that control critical processes in the domains of cryogenics, quench protection systems, and power interlocks for the Large Hadron Collider and other sub-systems of the accelerator complex. These applications require 24/7 operation and a quick reaction to problems. For this reason, EN-ICE is presently developing a monitoring tool to detect, anticipate, and report possible anomalies in the integrity of the applications. The tool is built on top of the Simatic WinCC Open Architecture (formerly PVSS) SCADA system and makes use of the Joint COntrols Project (JCOP) and UNICOS frameworks developed at CERN. It provides centralized monitoring of the different elements that make up the control systems, such as Windows and Linux servers, PLCs, and applications. Although the primary aim of the tool is to assist the members of the EN-ICE Standby Service, it can present different levels of detail about the systems depending on the user, which enables experts to diagnose and troubleshoot problems. In this paper, the scope, functionality, and architecture of the tool are presented, and some initial results on its performance are summarized.
Poster WEPMU033 [1.719 MB]