Author: Bonaccorsi, E.
Paper Title Page
WEMMU005 Fabric Management with Diskless Servers and Quattor on LHCb 691
  • P. Schweitzer, E. Bonaccorsi, L. Brarda, N. Neufeld
    CERN, Geneva, Switzerland
  Large scientific experiments nowadays very often are using large computer farms to process the events acquired from the detectors. In LHCb a small sysadmin team manages 1400 servers of the LHCb Event Filter Farm, but also a wide variety of control servers for the detector electronics and infrastructure computers : file servers, gateways, DNS, DHCP and others. This variety of servers could not be handled without a solid fabric management system. We choose the Quattor toolkit for this task. We will present our use of this toolkit, with an emphasis on how we handle our diskless nodes (Event filter farm nodes and computers embedded in the acquisition electronic cards). We will show our current tests to replace the standard (RedHat/Scientific Linux) way of handling diskless nodes to fusion filesystems and how it improves fabric management.  
slides icon Slides WEMMU005 [0.119 MB]  
poster icon Poster WEMMU005 [0.602 MB]  
WEPMU035 Distributed Monitoring System Based on ICINGA 1149
  • C. Haen, E. Bonaccorsi, N. Neufeld
    CERN, Geneva, Switzerland
  The basic services of the large IT infrastructure of the LHCb experiment are monitored with ICINGA, a fork of the industry standard monitoring software NAGIOS. The infrastructure includes thousands of servers and computers, storage devices, more than 200 network devices and many VLANS, databases, hundreds diskless nodes and many more. The amount of configuration files needed to control the whole installation is big, and there is a lot of duplication, when the monitoring infrastructure is distributed over several servers. In order to ease the manipulation of the configuration files, we designed a monitoring schema particularly adapted to our network and taking advantage of its specificities, and developed a tool to centralize its configuration in a database. Thanks to this tool, we could also parse all our previous configuration files, and thus fill in our Oracle database, that comes as a replacement of the previous Active Directory based solution. A web frontend allows non-expert users to easily add new entities to monitor. We present the schema of our monitoring infrastructure and the tool used to manage and automatically generate the configuration for ICINGA.  
poster icon Poster WEPMU035 [0.375 MB]  
WEPMU037 Virtualization for the LHCb Experiment 1157
  • E. Bonaccorsi, L. Brarda, M. Chebbi, N. Neufeld
    CERN, Geneva, Switzerland
  • F. Sborzacchi
    INFN/LNF, Frascati (Roma), Italy
  The LHCb Experiment, one of the four large particle physics detectors at CERN, counts in its Online System more than 2000 servers and embedded systems. As a result of ever-increasing CPU performance in modern servers, many of the applications in the controls system are excellent candidates for virtualization technologies. We see virtualization as an approach to cut down cost, optimize resource usage and manage the complexity of the IT infrastructure of LHCb. Recently we have added a Kernel Virtual Machine (KVM) cluster based on Red Hat Enterprise Virtualization for Servers (RHEV) complementary to the existing Hyper-V cluster devoted only to the virtualization of the windows guests. This paper describes the architecture of our solution based on KVM and RHEV as along with its integration with the existing Hyper-V infrastructure and the Quattor cluster management tools and in particular how we use to run controls applications on a virtualized infrastructure. We present performance results of both the KVM and Hyper-V solutions, problems encountered and a description of the management tools developed for the integration with the Online cluster and LHCb SCADA control system based on PVSS.