Author: Gao, Y.
Paper Title Page
THCPL03 A Success-History Based Learning Procedure to Optimize Server Throughput in Large Distributed Control Systems 1182
 
  • Y. Gao, T.G. Robertazzi
    Stony Brook University, Stony Brook, New York, USA
  • K.A. Brown
    BNL, Upton, Long Island, New York, USA
  • J. Chen
    Stony Brook University, Computer Science Department, Stony Brook, New York, USA
 
  Funding: Work supported by Brookhaven Science Associates, LLC under Contract No. DE-SC0012704 with the U.S. Department of Energy.
Large distributed control systems can typically be modeled by a hierarchical structure with two physical layers: Console Level Computers (CLCs) and Front End Computers (FECs). The controls system of the Relativistic Heavy Ion Collider (RHIC) consists of more than 500 FECs, each acting as a server providing services to a potentially unlimited number of clients. This can lead to a bottleneck in the system. Heavy traffic can slow down or even crash a system, making it momentarily unresponsive. One mechanism to circumvent this is to transfer the heavy communications traffic to more robust, higher-performance servers, keeping the load on the FEC low. In this work, we study this client-server problem from a different perspective. We introduce a novel game-theoretic model for the problem and formulate it as an integer programming problem. We point out its difficulty and propose heuristic algorithms to solve it. Simulation results show that our proposed schemes efficiently manage the client-server activities and result in a high server throughput and a low crash probability.
 
Talk as video stream: https://youtu.be/veLaGGNTs8w  
Slides THCPL03 [1.321 MB]  
DOI: https://doi.org/10.18429/JACoW-ICALEPCS2017-THCPL03  
 
THMPL03 A New Simulation Architecture for Improving Software Reliability in Collider-Accelerator Control Systems 1261
 
  • Y. Gao, T.G. Robertazzi
    Stony Brook University, Stony Brook, New York, USA
  • K.A. Brown, J. Morris, R.H. Olsen
    BNL, Upton, Long Island, New York, USA
 
  Funding: Work supported by Brookhaven Science Associates, LLC under Contract No. DE-SC0012704 with the U.S. Department of Energy.
The Relativistic Heavy Ion Collider (RHIC) complex of accelerators at Brookhaven National Laboratory (BNL) operates using a large distributed controls system, consisting of approximately 1.5 million control points, over 430 VME-based control modules, and thousands of server processes. We have developed a new testing platform that can be used to improve code reliability and help streamline the code development process by adding more automated testing. The testing platform simulates the controls system using the actual controls system code base while redirecting the I/O to simulated interfaces. In this report, we describe the design of the system and the current status of its development.
 
Slides THMPL03 [0.666 MB]  
Poster THMPL03 [0.674 MB]  
DOI: https://doi.org/10.18429/JACoW-ICALEPCS2017-THMPL03  