Keyword: data-management
Paper Title Other Keywords Page
MOPKN009 The CERN Accelerator Measurement Database: On the Road to Federation database, controls, extraction, status 102
  • C. Roderick, R. Billen, M. Gourber-Pace, N. Hoibian, M. Peryt
    CERN, Geneva, Switzerland
  The Measurement database, acting as short-term central persistence and front-end of the CERN accelerator Logging Service, receives billions of time-series data points per day for 200,000+ signals. A variety of data acquisition systems on hundreds of front-end computers publish source data that eventually ends up being logged in the Measurement database. As part of a federated approach to data management, information about source devices is defined in a Configuration database, whilst the signals to be logged are defined in the Measurement database. A mapping, which is often complex and subject to change and extension, is therefore required in order to subscribe to the source devices and write the published data to the corresponding named signals. From 2005, this mapping was done by means of dozens of XML files, which were manually maintained by multiple people, resulting in an error-prone configuration. In 2010 this configuration was improved so that it is now fully centralized in the Measurement database, significantly reducing the complexity and the number of actors in the process. Furthermore, logging processes immediately pick up modified configurations via JMS-based notifications sent directly from the database, allowing targeted device subscription updates rather than a full process restart, as was required previously. This paper will describe the architecture and the benefits of the current implementation, as well as the next steps on the road to a fully federated solution.
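  The targeted subscription update described in this abstract can be illustrated with a minimal sketch. It assumes the mapping is a plain dictionary from a source device property to a signal name; the function names, key format and example values are hypothetical and are not taken from the CERN implementation. The idea shown is only that a changed mapping can be diffed against the active one, so that just the affected subscriptions are touched.

```python
# Hypothetical sketch: names and data shapes are assumptions,
# not the CERN Logging Service implementation.
from typing import Dict, Set, Tuple

# Maps a source device property to the named signal it is logged under.
Mapping = Dict[str, str]


def diff_mappings(old: Mapping, new: Mapping) -> Tuple[Set[str], Set[str], Set[str]]:
    """Return (to_subscribe, to_unsubscribe, to_remap) device keys."""
    to_subscribe = set(new) - set(old)
    to_unsubscribe = set(old) - set(new)
    # Same device, but now written to a different signal.
    to_remap = {key for key in set(old) & set(new) if old[key] != new[key]}
    return to_subscribe, to_unsubscribe, to_remap


def apply_notification(active: Mapping, new: Mapping) -> Mapping:
    """Update only the affected subscriptions instead of restarting everything."""
    to_subscribe, to_unsubscribe, to_remap = diff_mappings(active, new)
    updated = dict(active)
    for key in to_unsubscribe:
        del updated[key]
    for key in to_subscribe | to_remap:
        updated[key] = new[key]
    return updated


# Illustrative device and signal names only.
old = {"DEV.A/Acquisition#current": "SIG.A.CURRENT",
       "DEV.B/Acquisition#voltage": "SIG.B.VOLTAGE"}
new = {"DEV.A/Acquisition#current": "SIG.A.CURRENT_V2",
       "DEV.C/Acquisition#temp": "SIG.C.TEMP"}
changes = diff_mappings(old, new)
```

  In a real process the three sets would drive subscribe and unsubscribe calls on the acquisition layer when a notification arrives; here they are simply computed and applied to an in-memory dictionary.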
MOPKN011 CERN Alarms Data Management: State & Improvements laser, database, controls, operation 110
  • Z. Zaharieva, M. Buttner
    CERN, Geneva, Switzerland
  The CERN Alarms System, LASER, is a centralized service ensuring the capture, storage and notification of anomalies for the whole accelerator chain, including the technical infrastructure at CERN. The underlying database holds the pre-defined configuration data for the alarm definitions and for the operators' alarm consoles, as well as the time-stamped, run-time alarm events propagated through the Alarms System. The article will discuss the current state of the Alarms database and recent improvements that have been introduced. It will look into the data management challenges related to the alarms configuration data, which is taken from numerous sources. Specially developed ETL processes must be applied to this data in order to transform it into an appropriate format and load it into the Alarms database. The recorded alarm events, together with some additional data necessary for providing event statistics to users, are transferred to the long-term alarms archive. The article will also cover the data management challenges related to the recently developed suite of data management interfaces, in particular keeping the alarms configuration data coming from external data sources consistent with the data modifications introduced by end-users.
Poster MOPKN011 [4.790 MB]
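  The consistency problem named in the MOPKN011 abstract, source-driven configuration data versus end-user modifications, can be sketched as a small transform-and-load step. Everything below is an assumption made for illustration: the field names, the default priority and the rule that user edits are re-applied on top of source data are hypothetical and do not describe the LASER schema or its actual ETL processes.

```python
# Hypothetical sketch: field names and the override rule are assumptions,
# not the LASER schema or its actual ETL processes.
from typing import Dict, Iterable

Record = Dict[str, str]


def transform(raw: Record) -> Record:
    """Normalise one source row into a common alarm-definition format."""
    return {
        "alarm_id": raw["id"].strip().upper(),
        "priority": raw.get("prio", "3").strip(),  # assumed default priority
        "description": raw.get("desc", "").strip(),
    }


def load(source_rows: Iterable[Record],
         user_edits: Dict[str, Record]) -> Dict[str, Record]:
    """Build the target table, re-applying user edits on top of source data.

    Edits for alarms that no longer exist in any source are dropped, so the
    result never contains definitions the sources do not know about.
    """
    target: Dict[str, Record] = {}
    for raw in source_rows:
        row = transform(raw)
        row.update(user_edits.get(row["alarm_id"], {}))
        target[row["alarm_id"]] = row
    return target


# Illustrative rows and edits only.
rows = [{"id": " pump-1 ", "prio": "2", "desc": "Pump failure"},
        {"id": "valve-7", "desc": " Valve stuck "}]
edits = {"PUMP-1": {"priority": "1"}, "OLD-9": {"priority": "1"}}
alarms = load(rows, edits)
```

  Other policies are equally plausible, for example letting the source always win or flagging conflicts for review; the sketch picks one only to make the trade-off concrete.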
THCHAUST02 Large Scale Data Facility for Data Intensive Synchrotron Beamlines experiment, synchrotron, detector, software 1216
  • R. Stotzka, A. Garcia, V. Hartmann, T. Jejkal, H. Pasic, A. Streit, J. van Wezel
    KIT, Karlsruhe, Germany
  • D. Haas, W. Mexner, T. dos Santos Rolo
    Karlsruhe Institute of Technology (KIT), Karlsruhe, Germany
  ANKA is a large-scale facility of the Helmholtz Association of National Research Centers in Germany, located at the Karlsruhe Institute of Technology. As a synchrotron light source, it provides light from hard X-rays to the far-infrared for research and technology. It serves as a user facility for the national and international scientific community and currently produces 100 TB of data per year. Within the next two years, a couple of additional data-intensive beamlines will be operational, producing up to 1.6 PB per year. These amounts of data have to be stored and provided on demand to the users. The Large Scale Data Facility (LSDF) is located on the same campus as ANKA. It is a data service facility dedicated to data-intensive scientific experiments. Currently, 4 PB of storage for unstructured and structured data and a Hadoop cluster as a computing resource for data-intensive applications are available. Within the campus, experiments and the main large data-producing facilities are connected via 10 GE network links. An additional 10 GE link exists to the internet. Tools for easy and transparent access allow scientists to use the LSDF without having to deal with its internal structures and technologies. Open interfaces and APIs support a variety of access methods to the highly available services for high-throughput data applications. In close cooperation with ANKA, the LSDF provides assistance in efficiently organizing data and metadata structures, and develops and deploys community-specific software running on the directly connected computing infrastructure.
Slides THCHAUST02 [1.294 MB]
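  The data volumes quoted in the THCHAUST02 abstract can be put next to the 10 GE link capacity with a short back-of-the-envelope calculation. The sketch assumes decimal units (1 PB = 10^15 bytes), a 365-day year and perfectly even data production; none of these assumptions comes from the abstract, and real beamlines produce data in bursts.

```python
# Back-of-the-envelope estimate; assumes decimal units (1 PB = 10**15 bytes)
# and data produced evenly over a 365-day year, which real beamlines will not do.
SECONDS_PER_YEAR = 365 * 24 * 3600


def average_rate_bytes_per_s(bytes_per_year: float) -> float:
    """Average sustained rate needed to move a yearly volume."""
    return bytes_per_year / SECONDS_PER_YEAR


def link_utilisation(bytes_per_year: float, link_bits_per_s: float) -> float:
    """Fraction of a link's nominal capacity used by that average rate."""
    return average_rate_bytes_per_s(bytes_per_year) * 8 / link_bits_per_s


PB = 10**15
TEN_GE = 10e9  # nominal 10 Gbit/s

rate = average_rate_bytes_per_s(1.6 * PB)
share = link_utilisation(1.6 * PB, TEN_GE)
print(f"{rate / 1e6:.1f} MB/s, {share:.1%} of one 10 GE link")
```

  Under these assumptions, 1.6 PB per year averages to about 51 MB/s, roughly 4% of one 10 GE link, so the average load leaves considerable headroom for peak acquisition rates and on-demand retrieval.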