ICALEPCS2011 - List of Authors (Bonaccorsi, E.)

Paper

Title

Page

Fabric Management with Diskless Servers and Quattor on LHCb

691

P. Schweitzer, E. Bonaccorsi, L. Brarda, N. Neufeld
CERN, Geneva, Switzerland

Large scientific experiments nowadays very often are using large computer farms to process the events acquired from the detectors. In LHCb a small sysadmin team manages 1400 servers of the LHCb Event Filter Farm, but also a wide variety of control servers for the detector electronics and infrastructure computers : file servers, gateways, DNS, DHCP and others. This variety of servers could not be handled without a solid fabric management system. We choose the Quattor toolkit for this task. We will present our use of this toolkit, with an emphasis on how we handle our diskless nodes (Event filter farm nodes and computers embedded in the acquisition electronic cards). We will show our current tests to replace the standard (RedHat/Scientific Linux) way of handling diskless nodes to fusion filesystems and how it improves fabric management.

Slides WEMMU005 [0.119 MB]

Poster WEMMU005 [0.602 MB]

WEPMU035

Distributed Monitoring System Based on ICINGA

1149

C. Haen, E. Bonaccorsi, N. Neufeld
CERN, Geneva, Switzerland

The basic services of the large IT infrastructure of the LHCb experiment are monitored with ICINGA, a fork of the industry standard monitoring software NAGIOS. The infrastructure includes thousands of servers and computers, storage devices, more than 200 network devices and many VLANS, databases, hundreds diskless nodes and many more. The amount of configuration files needed to control the whole installation is big, and there is a lot of duplication, when the monitoring infrastructure is distributed over several servers. In order to ease the manipulation of the configuration files, we designed a monitoring schema particularly adapted to our network and taking advantage of its specificities, and developed a tool to centralize its configuration in a database. Thanks to this tool, we could also parse all our previous configuration files, and thus fill in our Oracle database, that comes as a replacement of the previous Active Directory based solution. A web frontend allows non-expert users to easily add new entities to monitor. We present the schema of our monitoring infrastructure and the tool used to manage and automatically generate the configuration for ICINGA.

Poster WEPMU035 [0.375 MB]

WEPMU037

Virtualization for the LHCb Experiment

1157

E. Bonaccorsi, L. Brarda, M. Chebbi, N. Neufeld
CERN, Geneva, Switzerland
F. Sborzacchi
INFN/LNF, Frascati (Roma), Italy

The LHCb Experiment, one of the four large particle physics detectors at CERN, counts in its Online System more than 2000 servers and embedded systems. As a result of ever-increasing CPU performance in modern servers, many of the applications in the controls system are excellent candidates for virtualization technologies. We see virtualization as an approach to cut down cost, optimize resource usage and manage the complexity of the IT infrastructure of LHCb. Recently we have added a Kernel Virtual Machine (KVM) cluster based on Red Hat Enterprise Virtualization for Servers (RHEV) complementary to the existing Hyper-V cluster devoted only to the virtualization of the windows guests. This paper describes the architecture of our solution based on KVM and RHEV as along with its integration with the existing Hyper-V infrastructure and the Quattor cluster management tools and in particular how we use to run controls applications on a virtualized infrastructure. We present performance results of both the KVM and Hyper-V solutions, problems encountered and a description of the management tools developed for the integration with the Online cluster and LHCb SCADA control system based on PVSS.

Paper	Title	Page
WEMMU005	Fabric Management with Diskless Servers and Quattor on LHCb	691
	P. Schweitzer, E. Bonaccorsi, L. Brarda, N. Neufeld CERN, Geneva, Switzerland
	Large scientific experiments nowadays very often are using large computer farms to process the events acquired from the detectors. In LHCb a small sysadmin team manages 1400 servers of the LHCb Event Filter Farm, but also a wide variety of control servers for the detector electronics and infrastructure computers : file servers, gateways, DNS, DHCP and others. This variety of servers could not be handled without a solid fabric management system. We choose the Quattor toolkit for this task. We will present our use of this toolkit, with an emphasis on how we handle our diskless nodes (Event filter farm nodes and computers embedded in the acquisition electronic cards). We will show our current tests to replace the standard (RedHat/Scientific Linux) way of handling diskless nodes to fusion filesystems and how it improves fabric management.
	Slides WEMMU005 [0.119 MB]
	Poster WEMMU005 [0.602 MB]

WEPMU035	Distributed Monitoring System Based on ICINGA	1149
	C. Haen, E. Bonaccorsi, N. Neufeld CERN, Geneva, Switzerland
	The basic services of the large IT infrastructure of the LHCb experiment are monitored with ICINGA, a fork of the industry standard monitoring software NAGIOS. The infrastructure includes thousands of servers and computers, storage devices, more than 200 network devices and many VLANS, databases, hundreds diskless nodes and many more. The amount of configuration files needed to control the whole installation is big, and there is a lot of duplication, when the monitoring infrastructure is distributed over several servers. In order to ease the manipulation of the configuration files, we designed a monitoring schema particularly adapted to our network and taking advantage of its specificities, and developed a tool to centralize its configuration in a database. Thanks to this tool, we could also parse all our previous configuration files, and thus fill in our Oracle database, that comes as a replacement of the previous Active Directory based solution. A web frontend allows non-expert users to easily add new entities to monitor. We present the schema of our monitoring infrastructure and the tool used to manage and automatically generate the configuration for ICINGA.
	Poster WEPMU035 [0.375 MB]

WEPMU037	Virtualization for the LHCb Experiment	1157
	E. Bonaccorsi, L. Brarda, M. Chebbi, N. Neufeld CERN, Geneva, Switzerland F. Sborzacchi INFN/LNF, Frascati (Roma), Italy
	The LHCb Experiment, one of the four large particle physics detectors at CERN, counts in its Online System more than 2000 servers and embedded systems. As a result of ever-increasing CPU performance in modern servers, many of the applications in the controls system are excellent candidates for virtualization technologies. We see virtualization as an approach to cut down cost, optimize resource usage and manage the complexity of the IT infrastructure of LHCb. Recently we have added a Kernel Virtual Machine (KVM) cluster based on Red Hat Enterprise Virtualization for Servers (RHEV) complementary to the existing Hyper-V cluster devoted only to the virtualization of the windows guests. This paper describes the architecture of our solution based on KVM and RHEV as along with its integration with the existing Hyper-V infrastructure and the Quattor cluster management tools and in particular how we use to run controls applications on a virtualized infrastructure. We present performance results of both the KVM and Hyper-V solutions, problems encountered and a description of the management tools developed for the integration with the Online cluster and LHCb SCADA control system based on PVSS.