### FPGA BASED BUNCH-BY-BUNCH FEEDBACK SIGNAL PROCESSOR

T. Nakamura<sup>1</sup>, K. Kobayashi<sup>2</sup> *JASRI / SPring-8 Mikazuki-cho, Hyogo 679-5198 Japan* 

#### **ABSTRACT**

A new bunch-by-bunch feedback processor has been developed at SPring-8, and in September 2005 it replaced the old processors that had operated in the transverse feedback system for the preceding two years. The old processor was composed of six boards, each carrying a 12-bit ADC, an FPGA and a DAC, together with an analog multiplexer. The new processor is composed of six ADCs, a single FPGA and DACs, with an FIR filter of more than 50 taps. Using a newly developed algorithm and this long FIR filter, the new board can handle two-dimensional transverse feedback with a single loop, from monitor to kicker. Several functions for future advanced operation of the ring are also implemented in the new processor. The processor is also planned to be installed at the Photon Factory at KEK, the Taiwan Light Source and NewSUBARU at the Univ. of Hyogo.

#### INTRODUCTION

Bunch-by-bunch feedback systems are now indispensable for rings with high impedance caused by small-gap chambers, such as light sources, and for high-current rings such as B-Factories. In the SPring-8 storage ring, an 8-GeV third-generation light source, a transverse feedback system [1] was installed in September 2003 and has operated without trouble for these two years. For future advanced operation of the ring, and for ease of installation and tuning, a new feedback processor was developed, and the old processors were replaced with it in September 2005.

#### FEEDBACK SYSTEM

The block diagram of the transverse feedback system of SPring-8 is shown in Fig. 1. The bunch rate, or RF acceleration frequency, is 508.58MHz and the harmonic number (the number of bunches in the ring) is 2436. The system consists of a bunch position monitor (BPM), an analog front-end (an analog demultiplexer in our system), a feedback processor, power amplifiers and kickers. The RF signal from the BPM is converted by the analog front-end to a baseband beam position signal and is fed to the digital feedback processor. In the feedback processor, the analog position signal of each bunch is converted to a digital one and processed into a feedback signal that drives the kicker to damp the bunch motion. The latency of the system from the BPM to the kicker should be one or two revolution periods of the ring plus the bunch propagation delay between the BPM and the kicker.



Fig. 1. Block diagram of bunch-by-bunch feedback system (with old feedback processor).

<sup>1</sup> e-mail: nakamura@spring8.or.jp, http://acc-web.spring8.or.jp/~nakamura

<sup>2</sup> e-mail: kkoba@spring8.or.jp

#### **OLD FEEDBACK PROCESSOR**

A block diagram of the old feedback processor is shown in Fig. 1. The old feedback processor is composed of six commercial ADC-FPGA-DAC boards for the FIR filter and one analog multiplexer.

The ADC-FPGA-DAC board (HUNT Eng. HERON IO2V) has two 12-bit ADCs (AD9432, 105MS/s), one FPGA (Xilinx XC2V1000fg456-4, 1M gates, 40 multipliers, $40 \times 18\,\text{kbit}$ RAM blocks) and two 14-bit DACs (AD9767, 125MS/s). Three boards would provide enough ADCs for the feedback processor, but on each board only one ADC and one DAC are used for signal processing, and the other DAC is used for monitoring and diagnostics of the operation. The bunch position signal, arriving at a rate of 508.58MHz (= $f_{RF}$), is sampled by six ADCs, each at a sampling rate of 84.76MS/s, one sixth of $f_{RF}$. Each ADC samples the position signal of every sixth bunch, so that the six ADCs together cover every bunch. The FPGA executes the FIR filter on the ADC data of the past several turns, that is, on every 406th sample in the data stream of each ADC, to produce the feedback signal, where 406 = 2436/6 and 2436 is the harmonic number. The output of the FIR filter is converted to an analog signal by the DAC, and the six analog signals are multiplexed by the analog multiplexer, which is itself built from six ADCs, one FPGA and DACs, to produce an analog feedback signal for the kickers with a bandwidth of 254MHz.
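The interleaved sampling described above can be illustrated with a short sketch. It is only a model under the assumption of ideal sampling; the function name `demultiplex` and the test pattern are hypothetical and are not taken from the processor firmware.

```python
import numpy as np

HARMONIC_NUMBER = 2436   # bunches per turn, from the text
N_ADC = 6                # six ADCs, each clocked at f_RF / 6
F_RF_MHZ = 508.58        # RF frequency, from the text

def demultiplex(samples, n_adc=N_ADC):
    """Split a bunch-rate sample stream into n_adc interleaved streams.

    Stream i holds samples i, i + n_adc, i + 2*n_adc, ... which models
    ADC i sampling every n_adc-th bunch.
    """
    samples = np.asarray(samples)
    return [samples[i::n_adc] for i in range(n_adc)]

# Label every sample of three turns with its bunch number.
turns = 3
bunch_ids = np.tile(np.arange(HARMONIC_NUMBER), turns)
streams = demultiplex(bunch_ids)

per_turn = HARMONIC_NUMBER // N_ADC   # samples per turn in each ADC stream
adc_rate_mhz = F_RF_MHZ / N_ADC       # sampling rate of each ADC in MS/s
```

In this model a given bunch reappears every `per_turn` samples in the stream of its ADC, which is why the FIR filter works on every 406th sample.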

The latency of the old feedback processor itself is $\sim$400ns and the total latency of the system from the monitor to the kicker is 1.3 $\mu$s. An additional digital delay of 3.5 $\mu$s is produced in the FPGA to bring the total to 4.8 $\mu$s, one revolution period of the ring.
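The delay budget can be checked from the ring parameters quoted earlier. The following is a minimal sketch; the helper `digital_delay` is a hypothetical name, and the bunch propagation delay between BPM and kicker is ignored for simplicity.

```python
HARMONIC_NUMBER = 2436      # from the text
F_RF_HZ = 508.58e6          # from the text
SYSTEM_LATENCY_S = 1.3e-6   # monitor-to-kicker latency, from the text

revolution_period_s = HARMONIC_NUMBER / F_RF_HZ

def digital_delay(latency_s, t_rev_s, n_turns=1):
    """Delay to add so that the total latency equals n_turns revolutions."""
    delay = n_turns * t_rev_s - latency_s
    if delay < 0:
        raise ValueError("latency exceeds the target number of turns")
    return delay

delay_s = digital_delay(SYSTEM_LATENCY_S, revolution_period_s)
print(f"T_rev = {revolution_period_s * 1e6:.2f} us")
print(f"digital delay = {delay_s * 1e6:.2f} us")
```

The computed values, about 4.79 $\mu$s and 3.49 $\mu$s, agree with the rounded 4.8 $\mu$s and 3.5 $\mu$s quoted above.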

In the SPring-8 ring, two feedback loops are installed, one horizontal and one vertical, and a 9-tap FIR filter obtained by the time domain least square method [1] is implemented in the feedback processor for each direction. The frequency (tune) response of each 9-tap FIR filter is shown in Fig. 3.
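The tune response of a turn-by-turn FIR filter can be evaluated as $H(\nu) = \sum_k a_k e^{-2\pi i \nu k}$, where $k$ counts turns of delay. The sketch below evaluates gain and phase over the tune range. The coefficients are purely illustrative (a cosine at the target tune with its mean removed); they are not the coefficients produced by the least square method of [1], and the name `fir_response` is hypothetical.

```python
import numpy as np

def fir_response(coeffs, tune):
    """Complex response H(tune) = sum_k a_k * exp(-2j*pi*tune*k).

    k counts turns of delay; tune is the fractional betatron tune.
    """
    coeffs = np.asarray(coeffs, dtype=float)
    k = np.arange(len(coeffs))
    return np.sum(coeffs * np.exp(-2j * np.pi * tune * k))

# Hypothetical 9-tap coefficients (NOT those of [1]): a cosine at the
# target tune with its mean removed so that the DC gain is zero.
target_tune = 0.15
k = np.arange(9)
coeffs = np.cos(2 * np.pi * target_tune * k)
coeffs -= coeffs.mean()

tunes = np.linspace(0.0, 0.5, 51)
gain = np.array([abs(fir_response(coeffs, nu)) for nu in tunes])
phase_deg = np.array(
    [np.degrees(np.angle(fir_response(coeffs, nu))) for nu in tunes]
)
```

Removing the mean forces zero gain at DC, so a static orbit offset does not produce a kick; a practical design would also constrain the phase at the tune.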

## Development Environment for FPGA

We use Xilinx ISE Foundation as the logic design environment. It is an integrated environment that includes timing-driven implementation tools for programmable logic design, synthesis and verification capabilities, and design entry. Design entry is possible not only in languages (VHDL, Verilog-HDL, etc.) but also by schematic, and the two kinds of entry can be mixed. The BBF 9-tap FIR filter is designed with mixed VHDL and schematic entry.

### Implemented Circuits

The scheme for implementing the FIR filter in the FPGAs is shown in Fig. 2. The main components of the implemented circuit are as follows:

- Memory: Stores the past position data of the bunches. z<sup>-1</sup> in Fig. 2 denotes a one-turn delay and is realized as a 406-word SRAM for the data stream of each ADC. "406" is one sixth of the harmonic number of SPring-8.
- Adder: A 5-port adder is used to reduce the number of stages, as shown in Fig. 2, and is a key to stable operation of the FPGA on the board. Reducing the number of stages helps to avoid errors caused by clock skew in the FPGA and reduces the power consumption and the circuit area (or number of gates) on the FPGA. The 5-port adder is implemented by schematic entry.
- Multiplier: Built-in multipliers are used to meet the requirement of high-speed operation; the number of built-in multipliers is therefore one of the constraints on the number of taps of the FIR filter.
- Shift-register: Used as an additional delay to adjust the latency to one or two revolution periods.

The Xilinx ISE Mapping Report File shows:

| Logic Utilization                      | Used    | Available | Utilization |
|----------------------------------------|---------|-----------|-------------|
| Total Number Slice Registers           | 1,948   | 10,240    | 19%         |
| **Logic Distribution**                 |         |           |             |
| Number of occupied Slices              | 2,010   | 5,120     | 39%         |
| Number of Block RAMs                   | 8       | 40        | 20%         |
| Number of GCLKs                        | 3       | 16        | 18%         |
| Number of DCMs                         | 1       | 8         | 12%         |
| Total equivalent gate count for design | 628,134 |           |             |

About 63% of the total system gates are in use for the 9-tap FIR filter operation. This moderate use of the FPGA resources is one of the reasons for the stable operation over about 18 months without trouble. Synthesis of this design takes about 10 minutes on a Windows PC (Pentium-4 3GHz, 1GB memory). Here, synthesis means generation of the logical or physical representation for the target silicon device, the FPGA.
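The role of the one-turn delay memories in the implemented circuit can be sketched in software. The following model is an illustration only: it uses floating-point arithmetic instead of the fixed-point data path, sums the products sequentially rather than with the 5-port adders, and the name `turn_by_turn_fir` is hypothetical.

```python
import numpy as np

SAMPLES_PER_TURN = 406   # 2436 / 6, the length of one ADC stream per turn

def turn_by_turn_fir(stream, coeffs, samples_per_turn=SAMPLES_PER_TURN):
    """Compute y[n] = sum_k coeffs[k] * x[n - k * samples_per_turn].

    Each tap reads the same bunch one more turn in the past, which is what
    the chain of one-turn delay memories provides. Samples before the
    start of the stream are taken as zero.
    """
    x = np.asarray(stream, dtype=float)
    y = np.zeros_like(x)
    for k, a in enumerate(coeffs):
        shift = k * samples_per_turn
        if shift >= len(x):
            break
        y[shift:] += a * x[:len(x) - shift]
    return y

# Demonstration with random data and random 9-tap coefficients.
rng = np.random.default_rng(0)
turns = 20
x = rng.standard_normal(turns * SAMPLES_PER_TURN)
coeffs = rng.standard_normal(9)
y = turn_by_turn_fir(x, coeffs)
```

Viewed bunch by bunch, the result equals an ordinary 9-tap convolution applied to the turn-by-turn position history of each bunch.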



Fig. 2. Implementation of 9-tap FIR filter with 5-port adder used in old processors.



Fig. 3. Gain(left) and phase(right) of 9-tap FIR filters used for the SPring-8 storage ring. Solid line: for horizontal, dashed line: for vertical. Horizontal and vertical tunes are 0.15 and 0.35, respectively.

#### **NEW FEEDBACK PROCESSOR**

The new processor was developed for future advanced operation of the ring and to reduce the difficulty of installation and tuning. The old processors were replaced with the new processors, which have been in operation in the SPring-8 storage ring since September 2005. The digital signal processing stages, the FIR filter and the multiplexer, are integrated into one FPGA instead of the 7 FPGAs of the old processor. This eliminates the six redundant DAC-to-ADC pairs, shortens the latency and reduces noise. The components of the new processor are listed in Table 1 and its block diagram is shown in Fig. 4.

Table 1. Components used for the new feedback processor

| Component | Part                                  | Specifications                                              |
|-----------|---------------------------------------|-------------------------------------------------------------|
| ADC       | Analog Devices AD9433                 | 12-bit, 125MS/s, 750MHz analog BW, latency 10 clock cycles  |
| FPGA      | Xilinx Virtex-II PRO XC2VP70-6FF1517C | 328 multipliers (18x18bit)                                  |
| DAC       | Rockwell Scientific RDA012            | 12-bit, 1GS/s (max.)                                        |



Fig. 4. Block diagram of the new feedback processor. The FIR filter stage can be one 50-tap FIR filter or two 20-tap FIR filters.

Based on the experience with the old processor, we built the new processor with the following features:

## Six ADC mode and Four ADC mode

To make the processor flexible, it can operate in "six ADC" mode or "four ADC" mode. Six ADC mode is currently used at SPring-8. Four ADC mode is necessary if the harmonic number of a ring has 4 as a factor but not 6.

### 12-bit Resolution ADC

Beams in recent rings have a transverse size of a few micrometers, and sub-micrometer resolution is necessary to keep the motion smaller than the beam size. On the other hand, a dynamic range of ~mm is required so that the input is not saturated by the perturbation at injection or by COD. A 12-bit ADC is required to meet both requirements without additional circuits. This is the same as in the old processors.
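As a rough check of this argument, assume for illustration a full-scale input span of $S = 2$ mm ($\pm 1$ mm); this value is an assumption, since the text only specifies a range of the order of millimeters. One least significant bit of a 12-bit ADC then corresponds to

$$\Delta x_{\mathrm{LSB}} = \frac{S}{2^{12}} = \frac{2\,\mathrm{mm}}{4096} \approx 0.49\,\mu\mathrm{m},$$

which is in the sub-micrometer range, whereas an 8-bit ADC with the same span would give $2\,\mathrm{mm}/256 \approx 7.8\,\mu\mathrm{m}$, larger than the beam size.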

## ADC Sampling Rate up to 125MS/s

In four ADC mode, the feedback processor drives the ADCs at 125MS/s if the RF frequency is 500MHz, which is widely used in recent rings.
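The choice between the two modes and the resulting ADC clock can be sketched as follows. This is a hedged illustration: the function name `select_adc_mode`, the preference for six ADC mode when both modes fit, and the example ring with harmonic number 200 are assumptions made for the sketch.

```python
ADC_MAX_RATE_HZ = 125e6   # maximum ADC sampling rate, from Table 1

def select_adc_mode(harmonic_number, f_rf_hz):
    """Return (n_adc, adc_clock_hz), trying six ADC mode first.

    A mode is usable if the harmonic number is divisible by the number
    of ADCs and the resulting ADC clock does not exceed the ADC maximum.
    """
    for n_adc in (6, 4):
        if harmonic_number % n_adc != 0:
            continue
        adc_clock_hz = f_rf_hz / n_adc
        if adc_clock_hz <= ADC_MAX_RATE_HZ:
            return n_adc, adc_clock_hz
    raise ValueError("neither six ADC nor four ADC mode fits this ring")

# SPring-8 values from the text.
spring8_mode = select_adc_mode(2436, 508.58e6)

# Hypothetical 500MHz ring whose harmonic number is divisible by 4 but not 6.
example_mode = select_adc_mode(200, 500e6)
```

For the SPring-8 parameters the sketch selects six ADCs at about 84.76MS/s; for the hypothetical ring it selects four ADCs at 125MS/s, the case described above.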

### 750MHz Input Analog Bandwidth

Our RF acceleration frequency, $f_{RF}$, is 508.58MHz and we choose it as the carrier frequency for the signal from the beam position monitor electrodes. The beam motion occupies the frequency band of this signal from 254MHz ($(1-1/2) f_{RF}$) to 763MHz ($(1+1/2) f_{RF}$). If the analog bandwidth of the ADC covers this band, the signal can be sampled directly and a down-conversion stage to baseband is not necessary, which makes it possible to build a simpler and lower-noise system.
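The reason direct sampling can replace the down-conversion stage is that the carrier frequency is an integer multiple of the ADC clock, so the carrier aliases to DC. The sketch below demonstrates this for an idealized continuous-wave carrier; the real BPM signal is a train of bunch pulses, and the name `sample_carrier` is hypothetical.

```python
import numpy as np

F_RF_HZ = 508.58e6   # carrier frequency, from the text
N_ADC = 6            # ADC clock is f_RF / 6 in six ADC mode

def sample_carrier(amplitude, phase_rad, n_samples,
                   f_rf_hz=F_RF_HZ, n_adc=N_ADC):
    """Sample A*cos(2*pi*f_RF*t + phase) at the ADC rate f_RF / n_adc."""
    n = np.arange(n_samples)
    t = n * n_adc / f_rf_hz   # ADC sampling instants
    return amplitude * np.cos(2 * np.pi * f_rf_hz * t + phase_rad)

samples = sample_carrier(amplitude=1.0, phase_rad=0.3, n_samples=1000)
```

Every sample has the same value, $A\cos\phi$, so a slow modulation of the carrier amplitude by the beam position appears directly in the sampled data.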

### 50-tap FIR filter or two 20-tap FIR filters

We developed a signal processing method that realizes two-dimensional feedback with a single loop [2], and it requires a larger number of taps. With more taps, the processor can also be used for longitudinal feedback. Instead of one 50-tap FIR filter, two 20-tap FIR filters can be implemented if necessary.

## Fast Switching of FIR Filter Coefficients

32 sets of FIR filter coefficients can be stored in the internal registers of the FPGA, and they can be switched under software control or by an external logic signal. In the latter case, the switching time is several tens of nanoseconds. This function makes the system flexible enough to follow dynamic changes of ring parameters such as the tune or the bunch current.
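A software model of such a coefficient bank might look like the sketch below. The class `CoefficientBank` and its methods are hypothetical and only illustrate the idea of preloading several sets and selecting one by index; they do not describe the actual register layout.

```python
import numpy as np

class CoefficientBank:
    """Hypothetical model of selectable FIR coefficient sets."""

    def __init__(self, n_sets=32, n_taps=50):
        self.sets = np.zeros((n_sets, n_taps))
        self.active = 0

    def _check(self, index):
        if not 0 <= index < self.sets.shape[0]:
            raise IndexError("coefficient set index out of range")

    def load(self, index, coeffs):
        """Store one coefficient set; done in advance, not time critical."""
        self._check(index)
        coeffs = np.asarray(coeffs, dtype=float)
        if coeffs.shape != (self.sets.shape[1],):
            raise ValueError("wrong number of taps")
        self.sets[index] = coeffs

    def select(self, index):
        """Switch the active set; only an index changes, no data is copied."""
        self._check(index)
        self.active = index

    def coefficients(self):
        return self.sets[self.active]

bank = CoefficientBank()
bank.load(1, np.linspace(-1.0, 1.0, 50))
bank.select(1)
```

Because switching changes only the selected index, it can be fast, which is consistent with the short switching time quoted above for the external logic signal.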

## 32Mword of Sampled Data

32 Mwords of ADC sample history are stored in the 64MB memory of the processor. For SPring-8, the bunch spacing is 2ns, so 64ms of data can be stored in the memory. This is sufficiently longer than the radiation damping time of 8ms, which is the typical time scale of the coherent motion.
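The record length can be checked with a few lines of arithmetic. The sketch assumes binary mega (2^20) words and two bytes per stored sample; both are assumptions, since the text does not state the word format.

```python
WORDS = 32 * 2**20            # 32 Mwords, ASSUMING binary mega
BYTES_PER_WORD = 2            # ASSUMPTION: one 12-bit sample per 16-bit word
F_RF_HZ = 508.58e6            # bunch rate, from the text
HARMONIC_NUMBER = 2436        # from the text
RADIATION_DAMPING_S = 8e-3    # from the text

memory_bytes = WORDS * BYTES_PER_WORD
record_length_s = WORDS / F_RF_HZ          # one word per bunch passage
turns_recorded = WORDS / HARMONIC_NUMBER
damping_times = record_length_s / RADIATION_DAMPING_S
```

With the exact bunch spacing of about 1.97ns the record is about 66ms, in line with the 64ms obtained from the rounded 2ns spacing, and it covers roughly eight radiation damping times.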

## Latency less than 400ns

A system latency equal to one or two revolution periods of the ring is preferable for a good frequency response of the FIR filter. At the Photon Factory, the revolution period is 600ns and the budget for the processor is 400ns.

## Internal frequency multiplier for DAC clock

In six ADC mode, a clock at one sixth of the RF frequency is supplied to the processor for the ADCs and the FPGA; in four ADC mode, a clock at one fourth of the RF frequency is supplied. A frequency multiplier generates the DAC clock at the RF frequency from the ADC clock, with a cycle-to-cycle jitter of 50ps. If this jitter is a problem, the DAC clock can be switched to an external clock.

## Multiple DACs

The processor has five DACs: four for the multiplexed FIR filter output and one for multiplexed raw ADC data, used for diagnostics and tuning. The latency of the multiplexed FIR filter output can be controlled with an internal delay, and each DAC has complementary outputs. When several kicker electrodes are used for feedback, the delay and polarity must be tuned for each kicker individually, and this can be done easily with these functions and outputs.

### CF card as a boot device

A Compact Flash (CF) card is used as the boot device and holds the configuration data of the feedback processor. With this, the feedback processor can be a "turnkey" system.

## USB controllable

We chose USB2.0 to control the processor. The other candidates were PCI, Ethernet and IEEE1394. The device must be independent of the controller CPU so that it is not disturbed by controller trouble; PCI was excluded for this reason, because modules on PCI have to be reset when the CPU is reset. Ethernet was excluded to keep the system simple and to allow transfer of the 32 Mword data. USB and IEEE1394 both support hot plugging and fast data transfer. IEEE1394 has the advantage of an optical link standard. However, most controllers and PCs have a USB port, and software developers have much more experience with USB than with IEEE1394.

## Linux Device Driver

A device driver for the feedback processor was developed for Linux kernel 2.4, and most functions are controllable through it.

The Xilinx ISE Mapping Report File shows:

| Logic Utilization                      | Used      | Available | Utilization |
|----------------------------------------|-----------|-----------|-------------|
| Number of Slice Flip Flops             | 35,264    | 66,176    | 53%         |
| **Logic Distribution**                 |           |           |             |
| Number of occupied Slices              | 26,101    | 33,088    | 78%         |
| Number of Block RAMs                   | 75        | 328       | 22%         |
| Total Number of 4 input LUTs           | 25,327    | 66,176    | 38%         |
| Number used for Dual Port RAMs         | 8,192     |           |             |
| Total equivalent gate count for design | 7,484,501 |           |             |

The total equivalent gate count is about 12 times that of the old processor. Xilinx does not announce the total number of gates that can be implemented in the Virtex-II Pro series, but from the number of occupied slices shown in the reports at synthesis, we may say that 80% to 90% of the device is in use. Synthesis takes more than 90 minutes on a Windows PC (Pentium-4 3GHz, 2GB memory).

Pictures of the old and new feedback processors are shown in Fig. 5. The new feedback processors and the device driver for Linux were made by Tokyo Electron Device Ltd. based on our conceptual design.



Fig. 5. New feedback processor.

#### TWO-DIMENSIONAL FEEDBACK SYSTEM WITH SINGLE LOOP

The scheme for single-loop two-dimensional feedback has been tested at several storage rings and its effectiveness has been proved. A BPM placed at a skewed position detects a beam position signal that contains both directions, and the feedback processor treats these two-dimensional position data simultaneously to produce a feedback signal that kicks in both directions. The kicker stripline is also placed at a skewed position so that it can kick in both directions. The response of an example FIR filter for two-dimensional feedback is shown in Fig. 6.



Fig. 6. Gain and phase of 24-tap FIR filter for two-dimensional feedback by single loop. The target tunes and feedback gain are (0.15, 1) and (0.35, 4) in this example.
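The need for more taps in the two-dimensional case can be illustrated by a filter that must satisfy conditions at two tunes at once. The sketch below computes the minimum-norm coefficients that reject DC and reach a prescribed gain and phase at each target tune. It is a generic constrained design, not the algorithm of [2]; the function names and the target phase of $-90°$ are assumptions made for the illustration.

```python
import numpy as np

def design_min_norm_fir(n_taps, targets):
    """Minimum-norm coefficients with H(0) = 0 and, for each
    (tune, gain, phase) in targets, H(tune) = gain * exp(1j * phase),
    where H(nu) = sum_k a_k * exp(-2j*pi*nu*k).
    """
    k = np.arange(n_taps)
    rows = [np.ones(n_taps)]   # DC rejection: sum(a) = 0
    rhs = [0.0]
    for tune, gain, phase in targets:
        theta = 2.0 * np.pi * tune * k
        rows.append(np.cos(theta))     # real part of H(tune)
        rhs.append(gain * np.cos(phase))
        rows.append(-np.sin(theta))    # imaginary part of H(tune)
        rhs.append(gain * np.sin(phase))
    A = np.vstack(rows)
    b = np.array(rhs)
    # The pseudo-inverse gives the smallest-norm solution of A @ a = b.
    return np.linalg.pinv(A) @ b

def fir_response(coeffs, tune):
    k = np.arange(len(coeffs))
    return np.sum(coeffs * np.exp(-2j * np.pi * tune * k))

# Tunes and gains from the Fig. 6 caption; the phase is an ASSUMPTION.
phase = -np.pi / 2
targets = [(0.15, 1.0, phase), (0.35, 4.0, phase)]
coeffs = design_min_norm_fir(24, targets)
```

Each additional target tune adds two real constraints, so more taps are needed to keep enough free parameters for a low-norm, and hence low-noise, solution.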

#### **SUMMARY**

The transverse bunch-by-bunch feedback system has been in operation since January 2004 without trouble and has required no extra tuning, even after long-term shutdowns. A new feedback processor was developed for planned future advanced operation, and two processors were installed in September 2005, replacing the old ones, and show good performance. It took 12 hours to install the new processors and to confirm their operation. The new processor is also planned to be installed at the Photon Factory at KEK, the Taiwan Light Source and NewSUBARU at the Univ. of Hyogo, with the single-loop two-dimensional scheme.

#### REFERENCES

- [1] T. Nakamura, et al., "Transverse Bunch-by-bunch Feedback System for the SPring-8 Storage Ring", EPAC'2004, Lucerne, Switzerland, July 2004.
- [2] T. Nakamura, "High Precision Transverse Bunch-by-Bunch Feedback System with FPGA and High Resolution ADC", J. Particle Accelerator Society of Japan, Vol.1, No.3, 2004.(in Japanese)