Paper | Title | Page |
---|---|---|
THCPL03 | A Success-History Based Learning Procedure to Optimize Server Throughput in Large Distributed Control Systems | 1182 |
Funding: Work supported by Brookhaven Science Associates, LLC under Contract No. DE-SC0012704 with the U.S. Department of Energy. Large distributed control systems can typically be modeled by a hierarchical structure with two physical layers: Console Level Computers (CLCs) and Front End Computers (FECs). The controls system of the Relativistic Heavy Ion Collider (RHIC) consists of more than 500 FECs, each acting as a server providing services to a potentially unlimited number of clients. This can lead to a bottleneck in the system. Heavy traffic can slow down or even crash a system, making it momentarily unresponsive. One mechanism to circumvent this is to transfer the heavy communications traffic to more robust, higher-performance servers, keeping the load on the FEC low. In this work, we study this client-server problem from a different perspective. We introduce a novel game theory model for the problem and formulate it as an integer programming problem. We point out its difficulty and propose heuristic algorithms to solve it. Simulation results show that our proposed schemes efficiently manage the client-server activities and result in a high server throughput and a low crash probability. |
Talk as video stream: https://youtu.be/veLaGGNTs8w | |
Slides THCPL03 [1.321 MB] | |
DOI | https://doi.org/10.18429/JACoW-ICALEPCS2017-THCPL03 | |
THMPL03 | A New Simulation Architecture for Improving Software Reliability in Collider-Accelerator Control Systems | 1261 |
Funding: Work supported by Brookhaven Science Associates, LLC under Contract No. DE-SC0012704 with the U.S. Department of Energy. The Relativistic Heavy Ion Collider (RHIC) complex of accelerators at Brookhaven National Laboratory (BNL) operates using a large distributed controls system, consisting of approximately 1.5 million control points, over 430 VME-based control modules, and thousands of server processes. We have developed a new testing platform that can be used to improve code reliability and help streamline the code development process by adding more automated testing. The testing platform simulates the controls system using the actual controls system code base, but with the I/O redirected to simulated interfaces. In this report, we describe the design of the system and the current status of its development. |
Slides THMPL03 [0.666 MB] | |
Poster THMPL03 [0.674 MB] | |
DOI | https://doi.org/10.18429/JACoW-ICALEPCS2017-THMPL03 | |