Keyword: data-analysis
Paper Title Other Keywords Page
WEPGF037 Data Lifecycle in Large Experimental Physics Facilities: The Approach of the Synchrotron ELETTRA and the Free Electron Laser FERMI operation, experiment, synchrotron, electron 777
 
  • F. Billè, R. Borghes, F. Brun, V. Chenda, A. Curri, V. Duic, D. Favretto, G. Kourousias, M. Lonza, M. Prica, R. Pugliese, M. Scarcia, M. Turcinovich
    Elettra-Sincrotrone Trieste S.C.p.A., Basovizza, Italy
 
  Producers of Big Data often face the emerging problem of the Data Deluge. Experimental facilities such as synchrotrons and free electron lasers may, however, have additional requirements, mostly related to managing access for thousands of scientists. A complete data lifecycle describes the seamless path that joins distinct IT tasks such as experiment proposal management, user accounts, data acquisition and analysis, archiving, cataloguing and remote access. This paper presents the data lifecycle of the synchrotron ELETTRA and the free electron laser FERMI. With the focus on data access, the Virtual Unified Office (VUO) is presented. It is a core element in scientific proposal management, the user information database, scientific data oversight and remote access. Finally, recent developments of the beamline software are discussed; this software plays the key role in data and metadata acquisition but also requires integration with the other system components in order to provide data cataloguing, data archiving and remote access. The aim of this paper is to disseminate the current status of a complete data lifecycle, discuss key issues and hint at future directions.
Poster WEPGF037 [1.137 MB]
 
WEPGF043 Metadatastore: A Primary Data Store for NSLS-2 Beamlines experiment, database, EPICS, GUI 794
 
  • A. Arkilic, D.B. Allan, T.A. Caswell, L.R. Dalesio, W.K. Lewis
    BNL, Upton, Long Island, New York, USA
 
  Funding: Department of Energy, Brookhaven National Lab
The beamlines at NSLS-II are among the most highly instrumented and controlled in the world. Each beamline can produce unstructured data sets in various formats. These data should be made available to beamline scientists and users for analysis and processing. Various data flow systems are in place at numerous synchrotrons; however, these are very domain-specific and cannot handle such unstructured data. We have developed a data flow service, metadatastore, that manages experimental data at NSLS-II beamlines. Data analysis and visualization clients can access this service either directly or via the databroker API in a consistent and partition-tolerant fashion, which provides a reliable and easy-to-use interface to our state-of-the-art beamlines.
 
 
WEPGF044 Filestore: A File Management Tool for NSLS-II Beamlines experiment, EPICS, data-acquisition, database 796
 
  • A. Arkilic, T.A. Caswell, D. Chabot, L.R. Dalesio, W.K. Lewis
    BNL, Upton, Long Island, New York, USA
 
  Funding: Brookhaven National Lab, Department of Energy
NSLS-II beamlines can generate 72,000 data sets per day, resulting in over 2 M data sets in one year. The large number of data files generated by our beamlines poses a massive file management challenge. In response to this challenge, we have developed filestore as a means to provide users with an interface to stored data. By leveraging features of Python and MongoDB, filestore can store information regarding the location of a file, access and open the file, retrieve a given piece of data in that file, and provide users with a token, a unique identifier that allows them to retrieve each piece of data. Filestore does not interfere with the file source or the storage method and supports any file format, making data within files available to the NSLS-II data analysis environment.
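The token-based retrieval described in this abstract can be illustrated with a minimal Python sketch. It is not the filestore API: the class `FileRegistry`, its methods, the handler name `"json"` and the helper `read_json_key` are all hypothetical, and a plain in-memory dictionary stands in for the MongoDB collection mentioned in the abstract. The sketch only shows the general idea of recording where a piece of data lives, handing back a unique token, and resolving that token later through a format-specific reader.

```python
import json
import os
import tempfile
import uuid


class FileRegistry:
    """Toy token-based file index (hypothetical; not the real filestore API)."""

    def __init__(self):
        # token -> (path, handler name, key inside the file).
        # A dict stands in for a database collection here (assumption).
        self._records = {}
        # handler name -> callable(path, key) that reads one piece of data
        self._handlers = {}

    def register_handler(self, name, reader):
        """Register a format-specific reader under a short name."""
        self._handlers[name] = reader

    def insert(self, path, handler, key):
        """Record where a piece of data lives and return a unique token."""
        token = str(uuid.uuid4())
        self._records[token] = (path, handler, key)
        return token

    def retrieve(self, token):
        """Open the recorded file with its handler and return the data."""
        path, handler, key = self._records[token]
        return self._handlers[handler](path, key)


def read_json_key(path, key):
    """Example reader: return one top-level entry of a JSON file."""
    with open(path) as fh:
        return json.load(fh)[key]


def demo():
    """Write a small file, register one piece of it, and fetch it by token."""
    registry = FileRegistry()
    registry.register_handler("json", read_json_key)
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, "scan_0001.json")
        with open(path, "w") as fh:
            json.dump({"frame_0": [1, 2, 3], "frame_1": [4, 5, 6]}, fh)
        token = registry.insert(path, "json", "frame_1")
        return registry.retrieve(token)
```

In this sketch the registry never modifies or copies the file itself; it only keeps a pointer to it, which mirrors the abstract's statement that the file source and storage method are left untouched. Supporting another file format would amount to registering one more reader function.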
 
Poster WEPGF044 [0.854 MB]
 
WEPGF046 Towards a Second Generation Data Analysis Framework for LHC Transient Data Recording framework, operation, software, hardware 802
 
  • S. Boychenko, C. Aguilera-Padilla, M. Dragu, M.A. Galilée, J.C. Garnier, M. Koza, K.H. Krol, R. Orlandi, M.C. Poeschl, T.M. Ribeiro, K.S. Stamos, M. Zerlauth
    CERN, Geneva, Switzerland
  • M. Zenha-Rela
    University of Coimbra, Coimbra, Portugal
 
  During the last two years, CERN's Large Hadron Collider (LHC) and most of its equipment systems were upgraded to collide particles at twice the energy of the first operational period between 2010 and 2013. System upgrades and the increased machine energy represent new challenges for the analysis of transient data recordings, which have to be both dependable and fast. With the LHC having operated for many years already, statistical and trend analysis across the collected data sets is a growing requirement, highlighting several constraints and limitations imposed by the current software and data storage ecosystem. Based on several analysis use cases, this paper highlights the most important aspects and ideas towards an improved, second-generation data analysis framework to serve a large variety of equipment experts and operation crews in their daily work.
Poster WEPGF046 [0.501 MB]
 
WEPGF053 Monitoring and Cataloguing the Progress of Synchrotron Experiments, Data Reduction, and Data Analysis at Diamond Light Source From a User's Perspective experiment, software, detector, radiation 822
 
  • J. Aishima
    SLSA, Clayton, Australia
  • A. Ashton, S. Fisher, K. Levik, G. Winter
    DLS, Oxfordshire, United Kingdom
 
  The high data rates produced by the latest generation of detectors, more efficient sample handling hardware and ever more remote users of the beamlines at Diamond Light Source require improved data reduction and data analysis techniques to maximize their benefit to scientists. In this paper, some of the experiment data reduction and analysis steps are described, including real-time image analysis with DIALS, our Fast DP and xia2-based data reduction pipelines, and the Fast EP phasing and Dimple difference map calculation pipelines that aim to rapidly provide feedback about the recently completed experiment. SynchWeb, an interface to an open source laboratory information management system called ISPyB (co-developed at Diamond and the ESRF), provides a modern, flexible framework for managing samples and visualizing the data from all of these experiments and analyses, including plots, images, and tables of the analysed and reduced data, as well as experimental metadata and sample information.
 
WEPGF068 Formalizing Expert Knowledge in Order to Analyse CERN's Control Systems controls, monitoring, software, operation 857
 
  • A. Voitier, M. Gonzalez-Berges, F.M. Tilaro
    CERN, Geneva, Switzerland
  • M. Roshchin
    Siemens AG, Corporate Technology, München, Germany
 
  The automation infrastructure needed to reliably run CERN's accelerator complex and its experiments produces large and diverse amounts of data in addition to physics data. Over 600 industrial control systems with about 45 million parameters store more than 100 terabytes of data per year. At the same time, a large body of technical expertise in this domain is being collected and formalized. The study is based on a set of use cases classified into three data analytics domains applicable to CERN's control systems: online monitoring, fault diagnosis and engineering support. A known root cause analysis concerning gas system alarm flooding was reproduced with Siemens' Smart Data technologies, and its results were compared with a previous analysis. The new solution has been put in place as a tool supporting operators during breakdowns in a live production system. The effectiveness of this deployment suggests that these technologies can be applied to more cases. The intended goals would be to increase the reliability of CERN's systems and to reduce analysis efforts from weeks to hours. It also ensures a more consistent approach to these analyses by harvesting a central expert knowledge base available at all times.
Poster WEPGF068 [1.500 MB]