FIRST OPERATIONAL EXPERIENCE WITH THE LHC BEAM DUMP TRIGGER SYNCHRONISATION UNIT

A. Antoine, C. Boucly, N. Magnin, P. Juteau, N. Voumard, CERN, Geneva, Switzerland

Abstract

Two LHC Beam Dumping Systems (LBDS) remove the counter-rotating beams safely from the collider during setting up of the accelerator, at the end of a physics run and in case of emergencies. Dump requests can come from three different sources: the machine protection system in emergency cases, the machine timing system for scheduled dumps, or the LBDS itself in case of internal failures. These dump requests are synchronised with the 3 μs beam abort gap in a fail-safe redundant Trigger Synchronisation Unit (TSU) based on a Digital Phase Locked Loop (DPLL), locked onto the LHC beam revolution frequency with a maximum phase error of 40 ns. The synchronised trigger pulses coming out of the TSU are then distributed to the high voltage generators of the beam dump kickers through a redundant fault-tolerant trigger distribution system. This paper describes the operational experience gained with the TSU since its commissioning with beam in 2009, and highlights the improvements that have been implemented for safer operation. These include an increase of the diagnosis and monitoring functionalities, a more automated validation of the hardware and embedded firmware before deployment, and the execution of a post-operational analysis of the TSU performance after each dump action. In the light of this first experience, the outcome of the external review performed in 2010 is presented. The lessons learnt on the project life cycle for the design of mission-critical electronic modules are discussed.

INTRODUCTION

The TSU re-phases the dump requests with the circulating beam such that the 100 % rising edge of the magnetic field in the extraction kicker coincides with the end of the beam abort gap in the circulating beam [1].

Locked to the LHC beam revolution frequency, two redundant TSUs continuously produce dump trigger pulse trains synchronised with the beam abort gap. The distribution of these pulse trains is inhibited until a beam dump is requested. The first pulses that pass the inhibit stage are then sent via two redundant trigger fan-outs (TFO) to all the power trigger modules. In this way, the first pulse after the reception of a dump request synchronously triggers the system.
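The re-phasing principle described above can be illustrated with a minimal sketch. It models the trigger pulse train as strictly periodic and returns the first pulse at or after a dump request. The function name, the integer nanosecond time base and the approximate revolution period of 88 900 ns are assumptions made for illustration; the text only states that the train is locked to the beam revolution frequency.

```python
# Illustrative model only: names and numbers are assumptions, not TSU firmware.
T_REV_NS = 88_900  # assumed, approximate LHC revolution period in nanoseconds


def first_trigger_after(request_ns, period_ns=T_REV_NS, train_phase_ns=0):
    """Return the time of the first train pulse at or after the request.

    Pulses of the free-running train occur at train_phase_ns + k * period_ns.
    """
    elapsed = request_ns - train_phase_ns
    k = -(-elapsed // period_ns)  # ceiling division on integers
    return train_phase_ns + k * period_ns


request_ns = 130_000                      # hypothetical dump request time
trigger_ns = first_trigger_after(request_ns)
delay_ns = trigger_ns - request_ns        # always shorter than one period
```

In this model the delay between request and trigger is always shorter than one revolution period, which is the price paid for keeping the trigger aligned with the abort gap.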

In case the beam revolution frequency is lost for more than one turn, an internally synchronised direct digital synthesiser, based on a Numerically Controlled Oscillator (NCO) and on a Digital Phase Locked Loop (DPLL) precisely locked on the beam revolution frequency, issues an internal dump request and generates a dump trigger that is synchronous with the beam abort gap.

In addition, the two redundant TSUs continuously cross-check their DPLL synchronisation. A phase discrepancy greater than 40 ns between the two TSUs automatically issues a synchronous dump. If the synchronisation of only one of the TSUs fails, a synchronous dump trigger is forced by the redundant system.
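The cross-check between the two units can be sketched as a comparison of pulse edge times modulo the revolution period. Only the 40 ns limit comes from the text; the function names, the integer nanosecond time base and the period value are assumptions for illustration.

```python
# Illustrative model only: the real comparison is done in the TSU hardware.
PHASE_LIMIT_NS = 40   # limit stated in the text
T_REV_NS = 88_900     # assumed, approximate revolution period in nanoseconds


def phase_discrepancy_ns(edge_a_ns, edge_b_ns, period_ns=T_REV_NS):
    """Smallest absolute phase difference between two trains of equal period."""
    diff = (edge_a_ns - edge_b_ns) % period_ns
    return min(diff, period_ns - diff)


def cross_check_requests_dump(edge_a_ns, edge_b_ns, period_ns=T_REV_NS):
    """Return True when the discrepancy exceeds the limit and a dump is due."""
    return phase_discrepancy_ns(edge_a_ns, edge_b_ns, period_ns) > PHASE_LIMIT_NS
```

Taking the difference modulo the period keeps two edges that straddle a turn boundary from being reported as almost a full turn apart.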

During the LHC cold check-out period all the functionalities of the TSU modules have been extensively tested and commissioned. No major technical issues have been identified at that stage. First operational experiences with beam have demonstrated the correct operational performance of the TSU module [2].

Nevertheless, some unforeseen events, such as asynchronous synchronous dumps* induced by the TSU modules, were recorded during the LHC commissioning. Despite a high level of embedded diagnosis and monitoring functionalities, tools to capture spurious synchronisation failures such as asynchronous synchronous dumps appeared to be missing.

In addition, due to its key function within the critical LBDS system, it has also been decided to perform an external review of the TSU module and its embedded software in order to check and validate its functionality.

REVIEW

In early 2008, an internal review of the LBDS in general, and of the trigger synchronisation and distribution system (TSDS) in particular, was conducted with the goal of validating all the implemented technical solutions before the start of the commissioning of the LBDS with beam. The final recommendation of this review was to perform a more detailed review of the TSU module and its embedded software, and to build a test bench for the validation of its different functionalities.

Subsequently, an external review of the TSU module was performed in 2009/2010 by a company specialised in electronics engineering R&D, with the aim of obtaining a detailed evaluation of the functional suitability of the TSU module and of identifying design and/or implementation errors. The review objectives were:

- The validation of the correct implementation of the functional requirements,
- The identification of possible hardware and software anomalies,
- The verification of the pre-series performance,
- The recommendation of possible improvements,
- The proposal of guidelines for possible maintenance procedures for the embedded software.

* The TSU trigger outputs are asynchronous with respect to the LHC revolution frequency, but the 15 extraction magnets of the beam dump kicker are triggered in a synchronous way.

The review was divided into three phases: requirement review, design review, and hardware and software review. An executive summary was fully documented at the end of each phase.

The Requirement Review

The objective of the requirement review phase was to ascertain the adequacy of the requirements in defining the characteristics and the functionalities of the TSU module and to check that all requirements are covered by at least one module or sub-module of the TSU design, which could be in hardware, software or a mix of both.

A high level hierarchy that identifies all hardware and software modules with their corresponding functionalities has been created and cross-checked through a validation matrix with the list of all functional requirements.

The requirement review demonstrated that all requirements are indeed properly covered by at least one module and that all links between modules are coherent. However, it was noted that the architecture of some required functions, like the PLL design, is sometimes too complex, even though the hardware and software architectures are correct.
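The cross-check between the module hierarchy and the list of functional requirements can be sketched as a simple coverage test. The requirement identifiers and module names below are hypothetical placeholders, not taken from the review documents.

```python
# Illustrative sketch only: identifiers and module names are hypothetical.
def uncovered_requirements(requirements, coverage):
    """Return the requirements that no module claims to implement.

    `coverage` maps a module name to the set of requirement ids it covers.
    """
    covered = set().union(*coverage.values())
    return sorted(set(requirements) - covered)


requirements = ["REQ-001", "REQ-002", "REQ-003"]
coverage = {
    "dpll": {"REQ-001"},
    "trigger_output": {"REQ-002", "REQ-003"},
}
missing = uncovered_requirements(requirements, coverage)
```

An empty result corresponds to the outcome reported above, where every requirement is covered by at least one hardware or software module.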

The Design Review

The second part of the review consisted in an in-depth analysis of the TSU module design in order to ensure that its conceptual design meets the baseline requirements and that its performance levels, typically reliability and availability, are within specification. A low level hierarchy was created for this purpose, and a requirement coverage matrix was issued to synthesise the results, with a basic OK/NOK status for every requirement associated with a criticality level. A failure severity classification based on three levels of criticality was used (Table 1).

Table 1: Criticality Levels

<table>
<thead>
<tr>
<th>Level</th>
<th>Definition</th>
</tr>
</thead>
<tbody>
<tr>
<td>Critical</td>
<td>Failure that can induce a dysfunction considered as serious.</td>
</tr>
<tr>
<td>Major</td>
<td>Failure that can affect functionality considered as non-critical or with low probability.</td>
</tr>
<tr>
<td>Minor</td>
<td>Failure with no risk for the global functionality.</td>
</tr>
</tbody>
</table>

Of the more than 200 functional requirements analysed, the design review phase identified 11 as not properly implemented, of which only one was classified as "critical". It was demonstrated that, in case of an internal failure in one of the two redundant TSU modules, the faulty TSU correctly indicates its faulty state to the redundant module but does not generate any trigger, whereas an asynchronous dump should have been issued in any case. Four other failure cases were classified as major due to their possible impact on the module availability, and the six remaining cases were evaluated as minor.

The Hardware and Software Review

An in-depth analysis of the TSU electronic circuit and its embedded software concluded the review. Both the hardware and the embedded VHDL software were completely checked against state-of-the-art design techniques. In addition, the major embedded software modules were fully simulated and their reactions to incorrect operational conditions analysed in detail.

The analysis of the electronic circuit has highlighted the possibility of the propagation of an internal hardware failure to the redundant module, lack of protection against external environment perturbations, missing Electro-Static Discharge (ESD) and Electro-Magnetic Interference (EMI) protections and under-sized components in case of unforeseen DC mode operation. These design errors have been considered by CERN as minor taking into account the specific environment and operational conditions under which the TSU module is operated. Nevertheless, it has been decided to take these remarks into account for the development of the next hardware version of the TSU module.

The software analysis highlighted missing debouncers to prevent meta-stable behaviour on some input signals, and an incorrect behaviour in case of a large phase jump of the beam revolution frequency signal, which would result in an asynchronous synchronous dump output trigger. Both remarks were confirmed by CERN as valid, and the corresponding corrections were implemented in a new release of the embedded software.
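The effect of a debouncer on a sampled input can be illustrated with a small behavioural model. This is a Python sketch, not the VHDL of the TSU; the function name and the requirement of three stable samples are assumptions chosen for the example.

```python
# Behavioural model only: the actual filter is part of the embedded VHDL code.
def debounce(samples, stable_count):
    """Return the filtered level for each input sample.

    The output follows the input only after the input has held the same
    new value for `stable_count` consecutive samples.
    """
    output = samples[0] if samples else 0
    candidate, run = output, 0
    filtered = []
    for sample in samples:
        if sample == output:
            run = 0                      # input agrees with output, nothing pending
        elif sample == candidate and run > 0:
            run += 1                     # pending change seen once more
        else:
            candidate, run = sample, 1   # a new change starts to be counted
        if run >= stable_count:
            output, run = candidate, 0   # change accepted
        filtered.append(output)
    return filtered


raw = [0, 0, 1, 0, 0, 1, 1, 1, 0, 0]
clean = debounce(raw, stable_count=3)
```

In this example the single-sample glitch at the third position is rejected, and the output only rises once the input has been high for three consecutive samples.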

NEW TOOLS

On the basis of the first years of operation, two different tools have been developed in order to improve the maintenance, the monitoring and the post-operation diagnosis of the TSU.

An automated test bench has been developed to validate the hardware and software modifications before their operational deployment, and a trigger synchronisation unit internal post operational check (TSU IPOC) has been implemented with two functions: a monitoring function to acquire the sequence of synchronisation and triggering signals after each dump and a logic analyser of the dump events sequence for validation of its correct execution.

Automated Test Bench

When an update of the TSU module is required, the major difficulty during its re-commissioning comes from the need to have all the external conditions available in order to re-validate its correct functionality. As these conditions are only met when the machine is closed, a full re-validation of the TSU, which is located in an underground gallery directly adjacent to the machine tunnel, is impossible.

Thus, an automated test bench based on a PXI crate has been developed to emulate all TSU external signals (dump request clients, beam revolution frequency, arming command…), to check the correct functionality of the TSU by applying different stimuli and error signals to the emulated signals, and to validate the TSU response through the acquisition of the different TSU output signals, internal diagnosis registers and IPOC analysis results.

The test bench has been developed around an NI-PXI 8184 embedded controller running LabVIEW Real-Time. An NI-PXI 5412 arbitrary waveform generator is used to emulate the beam revolution frequency, while the other signals are emulated through either an NI-PXI 5402 arbitrary function generator for frequency-based signals or an NI-PXI 6115 multifunction I/O module for digital inputs.

All input functions and their actions are controlled through a LabVIEW based user interface (Figure 1). Through this interface, each individual signal can be independently tested or an automated test sequence can be launched.

When operated in automated test mode, the response to different failure cases on each input signal is tested. The test bench analyses all results and sets a flag on each test when passed. In the case of a failure, the test bench automatically records which rule has been violated.
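The automated test mode can be pictured as a loop over failure cases that flags each passed test and records the violated rule otherwise. The sketch below is written in Python rather than LabVIEW, and the case names, rule texts and the `toy_tsu` stand-in for the unit under test are all hypothetical.

```python
# Illustrative sketch only: the real bench is a LabVIEW Real-Time application.
from dataclasses import dataclass


@dataclass
class BenchCase:
    name: str          # label shown in the report
    rule: str          # rule recorded if the expectation is not met
    stimulus: dict     # emulated input conditions
    expect_dump: bool  # whether a dump trigger is expected


def toy_tsu(stimulus):
    """Hypothetical stand-in for the unit under test: True if it dumps."""
    return stimulus.get("dump_request", False) or not stimulus.get("frev_present", True)


def run_sequence(device, cases):
    """Apply every case to the device and record pass flags and violated rules."""
    report = []
    for case in cases:
        passed = device(case.stimulus) == case.expect_dump
        report.append({"test": case.name, "passed": passed,
                       "violated_rule": None if passed else case.rule})
    return report


CASES = [
    BenchCase("nominal", "no spurious dump", {}, False),
    BenchCase("client request", "dump on client request", {"dump_request": True}, True),
    BenchCase("revolution frequency lost", "dump on loss of frev", {"frev_present": False}, True),
]

report = run_sequence(toy_tsu, CASES)
```

Passing a deliberately faulty device, for example `lambda stimulus: False`, to `run_sequence` shows how the violated rule is recorded for each failed case.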

The first release of the TSU embedded software included detailed internal diagnosis functions. It rapidly became apparent that a tool to correlate, in the time domain, the response of the two redundant TSUs and the distribution of the trigger signals to the generators was missing.

A TSU IPOC with monitoring function has been developed to get a better understanding of the entire LBDS triggering process, from the capture of the dump requests up to the monitoring of the synchronisation process and the generation and distribution of trigger and re-trigger signals.

The system is based on a PCI 32-bit 125 MS/s digital I/O module from SPECTRUM. The acquisition software runs on a Linux front-end and has been implemented within the FESA framework.

Signals acquired after each dump include the dump requests, the beam revolution frequency, the re-phased beam revolution frequency, the synchronous trigger outputs from the TSU and the fan-out circuits as well as the asynchronous trigger outputs and the re-trigger pulse chain signals.

A Java GUI permits an on-line analysis of the event sequence of the last dump and the replay of previous dumps for off-line analysis. An overview of a typical dump event sequence is shown in Figure 2.

Up to now, the IPOC has no link with the LBDS interlock system and its results do not affect the operational condition of the LBDS. In the future, this system will be included within the LBDS External Post Operational Check (XPOC) system [3] and will thus force an expert to analyse and acknowledge failures before machine operation continues.
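The logic analyser function of the IPOC validates an acquired event sequence against defined rules. The rules themselves are not listed in the text, so the two rules below, the signal names and the revolution period value are purely hypothetical examples of what such a check could look like.

```python
# Illustrative sketch only: rules, signal names and numbers are hypothetical.
T_REV_NS = 88_900  # assumed, approximate revolution period in nanoseconds


def check_dump_sequence(events, period_ns=T_REV_NS):
    """Return the names of the violated rules for one acquired dump.

    `events` maps a signal name to the timestamp in ns of its first edge.
    """
    request = events.get("dump_request")
    trigger = events.get("sync_trigger")
    if request is None or trigger is None:
        return ["missing_signal"]
    violations = []
    if trigger < request:
        violations.append("trigger_before_request")
    if trigger - request > period_ns:
        violations.append("trigger_later_than_one_turn")
    return violations


good_dump = {"dump_request": 1_000, "sync_trigger": 50_000}
late_dump = {"dump_request": 0, "sync_trigger": 100_000}
```

An empty list means that the sequence satisfied every rule; a non-empty list is the kind of result an expert would have to analyse and acknowledge.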

IMPROVEMENTS

Since its commissioning, several minor updates have been applied successfully to the TSU units. Most of these updates were induced by changes of specification, mainly to implement additional diagnosis. One was made to remove a critical bug identified during the commissioning with beam. Here is a short summary of the different updates:

- Correction of a critical bug to avoid asynchronous synchronous triggers due to a hardware error in the design of the trigger output management logic. This part, initially designed with combinational discrete components, was responsible for a random phase shift of the synchronisation trigger pulse when the dump request was received;
- Implementation of the acquisition of the UTC time stamp of the dump request;
- Implementation of a fast inhibit mechanism towards the LHC injection interlocking system in order to prevent beam injection into the machine one turn after a beam dump;
- Consolidation of the TSU output signals for integration within the TSU logic analyser.

An important embedded software release has also been prepared to implement the external review recommendations. All failure cases identified as critical or major were corrected in this update, but the automated test initially failed. It was shown that the change in the VHDL code induced a new routing of the FPGA circuitry, which revealed insufficient filtering of the +1.2 V power supply of the TSU board. A larger inductance, complemented by a new decoupling capacitor, was implemented, and the subsequent automated test bench run passed more than 35 000 arming/dump sequences without any failure. This new release will be deployed at the end of 2011.
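As a rough indication of what a failure-free run of this length implies, the standard zero-failure bound can be evaluated for 35 000 sequences. This is an illustrative calculation that assumes independent, identically distributed sequences; it is not an analysis taken from the test bench itself, and the function name is hypothetical.

```python
# Illustrative calculation only, assuming independent arming/dump sequences.
def failure_rate_upper_bound(trials, confidence=0.95):
    """Upper bound on the per-sequence failure probability after `trials`
    sequences with zero failures, from (1 - p) ** trials = 1 - confidence."""
    return 1.0 - (1.0 - confidence) ** (1.0 / trials)


bound = failure_rate_upper_bound(35_000)
rule_of_three = 3 / 35_000  # common approximation of the same 95 % bound
```

Under these assumptions the 95 % upper bound is roughly 8.6e-5 per sequence, close to the familiar "rule of three" approximation 3/n.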

Finally, the review has taught us a new way of working for future projects. The project life cycle for a design, the V cycle, consists in dividing the designers into two independent teams. The first team carries out all the steps leading to the final product (requirements, high-level and low-level architecture, etc.), while the second team develops verification tools for every step of the project up to the final product, including the design of the final test bench. In this way, most common-mode errors in the design are eliminated, which improves the reliability and the robustness of the final product.

CONCLUSION

The first operational experience and the external review led to the creation of three new functions: an automated test bench that allows the validation of any hardware or software modification before deployment in operation, a TSU IPOC monitoring function with an on-line graphical interface to show a dump trigger signature, and a TSU IPOC logic analyser that applies defined rules to validate the correct operation of the TSU units once in operation. Additionally, the external review has given us a new project design methodology, the V cycle, which improves the reliability of final products.

The last release of the TSU units, taking into account all critical and major design errors highlighted by the external review, is now ready for deployment after a successful completion of the automated test process.

A new hardware design release will be started in 2012 to improve the robustness of the interfaces.

REFERENCES