Christopher Allen (Osprey DCS LLC)
TUPS70
The Data Platform: an independent system for management of heterogeneous, time-series data to enable data science applications
1839
The Data Platform is a fully independent system for management and retrieval of heterogeneous, time-series data required for machine learning and general data science applications deployed at large particle accelerator facilities. It is an independent subsystem within the larger Machine Learning Data Platform (MLDP), which provides full-stack support for such facilities and applications [1]. The Data Platform maintains the heterogeneous data archive along with all associated metadata and post-acquisition user annotations. It also facilitates all interactions between data scientists and the data archive, and thus directly supports all back-end data science use cases. Accelerator facilities include thousands of data sources sampled at high frequencies, so ingestion performance is a key requirement and the current challenge. We describe the operation, architecture, performance, and development status of the Data Platform.
  • C. McChesney, C. Allen
    Osprey DCS LLC
  • L. Dalesio
    EPIC Consulting
  • M. Davidsaver
    Brookhaven National Laboratory
Paper: TUPS70
DOI: 10.18429/JACoW-IPAC2024-TUPS70
About:  Received: 15 May 2024 — Revised: 16 May 2024 — Accepted: 18 May 2024 — Issue date: 01 Jul 2024
TUPS71
A data science and machine learning platform supporting large particle accelerator control and diagnostics applications
1843
Osprey DCS is developing the Machine Learning Data Platform (MLDP) to support data science applications specific to large particle accelerator facilities and other large experimental physics facilities. It is a “data-science ready” host platform providing integrated support for advanced data science applications used for diagnosis, modeling, control, and optimization of these facilities. The platform has three primary functions: 1) high-speed data acquisition, 2) archiving and management of time-correlated, heterogeneous data, and 3) comprehensive access to and interaction with archived data. The objective is to provide full-stack support, from low-level hardware acquisition to broad data accessibility, within a portable, standardized platform offering a data-centric interface for accelerator physicists and data scientists. Osprey DCS has developed a working prototype MLDP* and is now pursuing full-scale development. We present an overview of the MLDP including use cases, architecture, and deployment, along with the current development status. The MLDP is deployable at any facility; however, the low-level acquisition component is EPICS-based.
  • C. Allen
    Osprey DCS LLC
  • C. McChesney
    Los Alamos National Laboratory
  • M. Davidsaver
    Brookhaven National Laboratory
  • L. Dalesio
    EPIC Consulting
Paper: TUPS71
DOI: 10.18429/JACoW-IPAC2024-TUPS71
About:  Received: 01 May 2024 — Revised: 17 May 2024 — Accepted: 18 May 2024 — Issue date: 01 Jul 2024