Paper | Title | Other Keywords | Page |
---|---|---|---|
WEPGF037 | Data Lifecycle in Large Experimental Physics Facilities: The Approach of the Synchrotron ELETTRA and the Free Electron Laser FERMI | operation, experiment, synchrotron, electron | 777 |
Producers of Big Data often face the emerging problem of data deluge. Moreover, experimental facilities such as synchrotrons and free electron lasers may have additional requirements, mostly related to the need to manage access for thousands of scientists. A complete data lifecycle describes the seamless path that joins distinct IT tasks such as experiment proposal management, user accounts, data acquisition and analysis, archiving, cataloguing and remote access. This paper presents the data lifecycle of the synchrotron ELETTRA and the free electron laser FERMI. With a focus on data access, the Virtual Unified Office (VUO) is presented: a core element in scientific proposal management, the user information database, scientific data oversight and remote access. Finally, recent developments of the beamline software are discussed; it plays the key role in data and metadata acquisition, but also requires integration with the rest of the system components in order to provide data cataloguing, data archiving and remote access. The scope of this paper is to disseminate the current status of a complete data lifecycle, discuss key issues and hint at future directions.
Poster WEPGF037 [1.137 MB]
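To make the notion of a joined-up lifecycle concrete, the following minimal sketch models a single record linking the stages the abstract names: proposal, users, acquisition, archiving and cataloguing. All field names and values are illustrative assumptions, not the actual VUO or ELETTRA data model.

```python
# Hypothetical record tying the lifecycle stages together; all names are
# illustrative, not the real VUO/ELETTRA schema.
from dataclasses import dataclass, field

@dataclass
class LifecycleRecord:
    proposal_id: str                 # from experiment proposal management
    users: list                      # accounts granted access (via the VUO)
    beamline: str
    raw_data_path: str = ""          # filled in at acquisition time
    archive_uri: str = ""            # filled in after archiving
    catalogue: dict = field(default_factory=dict)  # searchable metadata

# One record follows the data from proposal to catalogued archive.
rec = LifecycleRecord(proposal_id="20159999", users=["jdoe"], beamline="TwinMic")
rec.raw_data_path = "/storage/twinmic/20159999/run_001.h5"
rec.archive_uri = "tape://archive/20159999/run_001.h5"
rec.catalogue = {"technique": "STXM", "sample": "cell culture"}
print(rec.proposal_id, rec.archive_uri)
```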
WEPGF043 | Metadatastore: A Primary Data Store for NSLS-2 Beamlines | experiment, database, EPICS, GUI | 794 |
Funding: Department of Energy, Brookhaven National Laboratory

The beamlines at NSLS-II are among the most highly instrumented and controlled in the world. Each beamline can produce unstructured data sets in various formats, which must be made available to beamline scientists and users for analysis and processing. Various data flow systems are in place at numerous synchrotrons; however, they are highly domain-specific and cannot handle such unstructured data. We have developed a data flow service, metadatastore, that manages experimental data at the NSLS-II beamlines. Data analysis and visualization clients can access the service either directly or via the databroker API in a consistent and partition-tolerant fashion, providing a reliable and easy-to-use interface to our state-of-the-art beamlines.
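The abstract gives no implementation details, but the kind of schemaless run catalogue it describes can be sketched with MongoDB documents. The collection name, field names and access pattern below are assumptions for illustration only, not the actual metadatastore schema or the databroker API.

```python
# Sketch of a schemaless run catalogue in MongoDB; collection and field
# names ("run_start", "beamline_id", ...) are assumptions, not the
# actual metadatastore schema.
import time
from uuid import uuid4

from pymongo import MongoClient

client = MongoClient("localhost", 27017)          # assumes a local MongoDB
runs = client["mds_demo"]["run_start"]

def insert_run_start(beamline_id, **metadata):
    """Record the start of a run; arbitrary extra metadata is stored as-is."""
    doc = {"uid": str(uuid4()), "time": time.time(),
           "beamline_id": beamline_id, **metadata}
    runs.insert_one(doc)
    return doc["uid"]

def find_runs(beamline_id):
    """What an analysis client might do: query runs by beamline."""
    return list(runs.find({"beamline_id": beamline_id}))

insert_run_start("CSX", sample="Ni thin film", scan_type="theta-2theta")
print(find_runs("CSX")[0]["sample"])
```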
WEPGF044 | Filestore: A File Management Tool for NSLS-II Beamlines | experiment, EPICS, data-acquisition, database | 796 |
Funding: Brookhaven National Laboratory, Department of Energy

NSLS-II beamlines can generate 72,000 data sets per day, resulting in over 2 million data sets in one year. The large number of data files generated by our beamlines poses a massive file management challenge. In response to this challenge, we have developed filestore as a means to provide users with an interface to stored data. By leveraging features of Python and MongoDB, filestore can store information regarding the location of a file, access and open the file, retrieve a given piece of data in that file, and provide users with a token, a unique identifier that allows them to retrieve each piece of data. Filestore does not interfere with the file source or the storage method, and it supports any file format, making the data within files available to the NSLS-II data analysis environment.
Poster WEPGF044 [0.854 MB]
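A minimal sketch of the token-based indirection the abstract describes: register where a piece of data lives and how to read it, receive an opaque token, and retrieve the data later without caring about the file format. The in-memory registry and function names are stand-ins invented here; filestore itself keeps this information in MongoDB.

```python
# Token-based data retrieval sketch; the in-memory dict stands in for
# filestore's MongoDB collections, and all names are invented here.
from uuid import uuid4

import numpy as np

_registry = {}   # token -> (reader callable, file path, reader kwargs)

def register_datum(path, reader, **reader_kwargs):
    """Record where a piece of data lives and how to read it; return a token."""
    token = str(uuid4())
    _registry[token] = (reader, path, reader_kwargs)
    return token

def retrieve(token):
    """Open the underlying file and return the data behind a token."""
    reader, path, kwargs = _registry[token]
    return reader(path, **kwargs)

# The registry is format-agnostic: any reader callable works.
np.save("frame.npy", np.arange(12).reshape(3, 4))
token = register_datum("frame.npy", np.load)
print(retrieve(token).shape)   # (3, 4)
```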
WEPGF046 | Towards a Second Generation Data Analysis Framework for LHC Transient Data Recording | framework, operation, software, hardware | 802 |
During the last two years, CERN's Large Hadron Collider (LHC) and most of its equipment systems were upgraded to collide particles at twice the energy of the first operational period between 2010 and 2013. System upgrades and the increased machine energy represent new challenges for the analysis of transient data recordings, which has to be both dependable and fast. With the LHC having operated for many years already, statistical and trend analysis across the collected data sets is a growing requirement that highlights several constraints and limitations imposed by the current software and data storage ecosystem. Based on several analysis use cases, this paper highlights the most important aspects of, and ideas towards, an improved second-generation data analysis framework to serve a large variety of equipment experts and operation crews in their daily work.
Poster WEPGF046 [0.501 MB]
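As a toy illustration of the "statistical and trend analysis across the collected data sets" mentioned above, the sketch below tracks how a quantity extracted from each transient recording drifts over many recordings. The signal and its parameters are invented; this is not the framework's actual API.

```python
# Toy trend analysis across many transient recordings; the signal and
# its parameters are invented for illustration.
import numpy as np

rng = np.random.default_rng(0)
# Pretend one peak amplitude was extracted per transient recording.
amplitudes = 1.0 + 0.001 * np.arange(500) + rng.normal(0.0, 0.05, 500)

# Linear trend across recordings: is the amplitude drifting over time?
slope, _ = np.polyfit(np.arange(500), amplitudes, 1)
print(f"drift per recording: {slope:.2e}")
print(f"mean={amplitudes.mean():.3f}, std={amplitudes.std():.3f}")
```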
WEPGF053 | Monitoring and Cataloguing the Progress of Synchrotron Experiments, Data Reduction, and Data Analysis at Diamond Light Source From a User's Perspective | experiment, software, detector, radiation | 822 |
The high data rates produced by the latest generation of detectors, more efficient sample-handling hardware, and ever more remote users of the beamlines at Diamond Light Source require improved data reduction and data analysis techniques to maximize their benefit to scientists. This paper describes some of the experimental data reduction and analysis steps, including real-time image analysis with DIALS; our Fast DP and xia2-based data reduction pipelines; and the Fast EP phasing and Dimple difference map calculation pipelines, all of which aim to rapidly provide feedback about the recently completed experiment. SynchWeb, an interface to the open-source laboratory information management system ISPyB (co-developed at Diamond and the ESRF), provides a modern, flexible framework for managing samples and visualizing the data from all of these experiments and analyses, including plots, images, and tables of the analysed and reduced data, as well as experimental metadata and sample information.
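The following is a minimal sketch of the fast-feedback pattern these pipelines embody, under the invented convention that a finished collection is marked by a `collection_done` file; the placeholder reduction function stands in for the real Fast DP, xia2, Fast EP or Dimple pipelines.

```python
# Fast-feedback sketch: poll for collections marked as complete and run
# a reduction step once per dataset. The "collection_done" marker and
# the reduction function are invented stand-ins, not Diamond's tooling.
from pathlib import Path

def reduce_dataset(dataset_dir: Path) -> dict:
    """Placeholder for a real pipeline (Fast DP, xia2, Fast EP, Dimple)."""
    n_images = len(list(dataset_dir.glob("*.cbf")))
    return {"dataset": dataset_dir.name, "n_images": n_images}

def poll_once(root, seen):
    """Reduce each newly completed collection exactly once."""
    for marker in Path(root).glob("*/collection_done"):
        dataset_dir = marker.parent
        if dataset_dir not in seen:
            seen.add(dataset_dir)
            summary = reduce_dataset(dataset_dir)
            print("feedback ready:", summary)   # SynchWeb would render this

seen = set()
poll_once("/data/visit", seen)   # would normally run inside a timed loop
```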
WEPGF068 | Formalizing Expert Knowledge in Order to Analyse CERN's Control Systems | controls, monitoring, software, operation | 857 |
The automation infrastructure needed to reliably run CERN's accelerator complex and its experiments produces large and diverse amounts of data besides the physics data. Over 600 industrial control systems with about 45 million parameters store more than 100 terabytes of data per year. At the same time, a large body of technical expertise in this domain has been collected and formalized. The study is based on a set of use cases classified into three data analytics domains applicable to CERN's control systems: online monitoring, fault diagnosis and engineering support. A known root-cause analysis concerning gas system alarm flooding was reproduced with Siemens' Smart Data technologies and its results were compared with the previous analysis. The new solution has been put in place as a tool supporting operators during breakdowns in a live production system. The effectiveness of this deployment suggests that these technologies can be applied to further cases, with the intended goals of increasing the reliability of CERN's systems and reducing analysis efforts from weeks to hours. Harvesting a central expert knowledge base, available at all times, also ensures a more consistent approach for these analyses.
Poster WEPGF068 [1.500 MB]
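One way to formalize the alarm-flooding use case is to flag bursts in which the alarm rate inside a sliding time window exceeds a threshold. The window length and threshold below are illustrative assumptions, not the parameters of the actual CERN or Siemens analysis.

```python
# Flag alarm "floods": spans where more than `threshold` alarms fall
# inside a sliding window of `window_s` seconds. Parameters are
# illustrative assumptions.
from collections import deque

def find_alarm_floods(timestamps, window_s=60.0, threshold=50):
    """Return (start, end) spans of alarm bursts, merged if overlapping."""
    floods, window = [], deque()
    for t in sorted(timestamps):
        window.append(t)
        while window[0] < t - window_s:            # drop alarms out of window
            window.popleft()
        if len(window) > threshold:
            start, end = window[0], t
            if floods and start <= floods[-1][1]:
                floods[-1] = (floods[-1][0], end)  # extend the current burst
            else:
                floods.append((start, end))
    return floods

# Synthetic trace: background alarms every 30 s plus one burst of 200.
alarms = [i * 30.0 for i in range(100)] + [1000.0 + 0.1 * i for i in range(200)]
print(find_alarm_floods(alarms))   # one merged flood span around t = 1000 s
```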