Author: Sjoen, R.
Paper Title Page
WEPMU036 Efficient Network Monitoring for Large Data Acquisition Systems 1153
 
  • D.O. Savu, B. Martin
    CERN, Geneva, Switzerland
  • A. Al-Shabibi
    Heidelberg University, Heidelberg, Germany
  • S.M. Batraneanu, S.N. Stancu
    UCI, Irvine, California, USA
  • R. Sjoen
    University of Oslo, Oslo, Norway
 
  Though constantly evolving and improving, the available network monitoring solutions have limitations when applied to the infrastructure of a high-speed real-time data acquisition (DAQ) system. DAQ networks are specialized computer networks in which experts monitoring the network must pay attention to both individual subsections and system-wide traffic flows. The ATLAS network at the Large Hadron Collider (LHC) has more than 200 switches interconnecting 3500 hosts and totaling 8500 high-speed links. Using heterogeneous tools to monitor the various infrastructure parameters, in order to ensure optimal DAQ system performance, proved to be a tedious and time-consuming task for experts. To alleviate this problem, we used our networking and DAQ expertise to build a flexible and scalable monitoring system that provides an intuitive user interface with the same look and feel irrespective of the data provider used. Our system uses custom-developed components for critical performance monitoring and seamlessly integrates complementary data from auxiliary tools, such as NAGIOS, information services or custom databases. A number of techniques (e.g. normalization, aggregation and data caching) were used to improve the user interface response time. The end result is a unified monitoring interface that gives fast and uniform access to system statistics and has significantly reduced the time experts spend on ad-hoc and post-mortem analysis.
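  The abstract names normalization, aggregation and data caching without detailing them. The following is only a minimal sketch of how those three techniques might combine to shorten interface response time; it is not the authors' implementation. The sample data, link names, capacities and function names are all hypothetical.

```python
from collections import defaultdict
from functools import lru_cache

# Hypothetical raw samples: (timestamp in seconds, link id, bits per second).
SAMPLES = (
    (0, "sw1-p1", 4.0e9),
    (30, "sw1-p1", 6.0e9),
    (60, "sw1-p1", 8.0e9),
    (0, "sw1-p2", 0.2e9),
    (30, "sw1-p2", 0.4e9),
)

# Hypothetical link capacities in bits per second.
CAPACITY = {"sw1-p1": 10.0e9, "sw1-p2": 1.0e9}


def normalize(rate_bps, link_id):
    """Normalization: express a rate as a fraction of the link capacity."""
    return rate_bps / CAPACITY[link_id]


def aggregate(samples, bin_s):
    """Aggregation: average the normalized samples per (link, time bin)."""
    sums = defaultdict(lambda: [0.0, 0])
    for timestamp, link_id, rate_bps in samples:
        bin_start = timestamp // bin_s * bin_s
        entry = sums[(link_id, bin_start)]
        entry[0] += normalize(rate_bps, link_id)
        entry[1] += 1
    return {key: total / count for key, (total, count) in sums.items()}


@lru_cache(maxsize=128)
def utilization(bin_s):
    """Caching: repeated requests for the same bin width reuse the result."""
    return aggregate(SAMPLES, bin_s)
```

  In this sketch a second call such as `utilization(60)` is served from the cache instead of being recomputed. A real system would also need to invalidate cached entries as new samples arrive, which is omitted here.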
Poster WEPMU036 [5.945 MB]