Paper | Title | Page |
---|---|---|
TUPPC063 | Control and Monitoring of the Online Computer Farm for Offline Processing in LHCb | 721 |
LHCb, one of the four experiments at the LHC accelerator at CERN, uses approximately 1500 PCs (averaging 12 cores each) to run the High Level Trigger (HLT) during physics data taking. During periods when data acquisition is not required, most of these PCs are idle. In these periods the unused processing capacity can be used to run offline jobs, such as Monte Carlo simulation. The LHCb offline computing environment is based on LHCbDIRAC (Distributed Infrastructure with Remote Agent Control). In LHCbDIRAC, job agents are started on Worker Nodes, pull waiting tasks from the central WMS (Workload Management System), and process them on the available resources. A Control System was developed that can launch, control and monitor the job agents for offline data processing on the HLT Farm. This Control System is based on the existing Online System Control infrastructure, the PVSS SCADA and the FSM toolkit. It has been used extensively, launching and monitoring more than 22,000 agents simultaneously, and more than 850,000 jobs have already been processed on the HLT Farm. This paper describes the deployment of the Control System in the LHCb experiment and the experience gained with it. | ||
Poster TUPPC063 [2.430 MB] | |
WECOAAB01 | An Overview of the LHC Experiments' Control Systems | 982 |
Although they are all LHC experiments, the four experiments, either by need or by choice, use different equipment, have defined different requirements and are operated differently. This led to the development of four quite different Control Systems. A joint effort was made in the area of Detector Control Systems (DCS), which allowed a common choice of components and tools and produced a common DCS Framework for the four experiments, but nothing was done in common in the areas of Data Acquisition or Trigger Control (normally called Run Control). This talk will present an overview of the design principles, architectures and technologies chosen by the four experiments to perform the Control System's tasks: Configuration, Control, Monitoring, Error Recovery, User Interfacing, Automation, etc.
Invited |
Slides WECOAAB01 [2.616 MB] | |