Development of 500 MHz Multi-Channel Readout Electronics for Fast Radiation Detectors

Wolfgang Hennig, Stephen J. Asztalos, Dimitry Breus, Konstantin Sabourov and William K. Warburton

Abstract—We describe the development of readout electronics for fast radiation detectors that digitize signals at a rate of 500 MHz, process the digital data stream to measure pulse heights, bin the results in on-board MCA spectra, and optionally capture waveforms for pulse shape analysis. The electronics are targeted for applications requiring good energy resolution and precise timing, for example lifetime measurements on exotic nuclei, timing measurements with fast scintillators such as LaBr$_3$ or BaF$_2$, or pulse shape analysis with liquid scintillators or phoswich detectors. Upgrading the existing XIA Pixie-4 spectrometer design with a 12-bit, 500 MHz analog to digital converter, we built a prototype of a 4-channel electronics module and evaluated its performance in terms of energy resolution, timing resolution, and improvements in pulse shape analysis.

Index Terms—Digital signal processing, pulse shape analysis, gamma-ray spectroscopy, timing resolution.

I. INTRODUCTION

As accelerators are upgraded to reach higher energies and better yields of highly unstable nuclei with shorter lifetimes, and as detectors are improved to generate faster signals, provide higher count rates, and achieve better position, time and/or energy resolution, there is a need for higher speed digital detector readout electronics to match these improvements. Currently available readout electronics typically digitize the detector signal at 40-100 MHz with 12-14 bit precision or 0.5-2 GHz with 8-10 bit precision. The first group provides good energy resolution and is sufficient for microsecond decay times, but is clearly inadequate for the desired nanosecond time regime. The second group provides good timing resolution, but is limited in its energy resolution, and usually does little more than capture waveforms for offline processing, which is not suitable for high count rates.

We therefore developed the prototype of a spectrometer capable of sampling detector signals at a rate of 500 MHz with 12 bit precision. This instrument, derived from the existing XIA Pixie-4 spectrometer digitizing at 75 MHz [1], can not only capture waveforms, but can also process the digital data stream to measure pulse heights, bin the result in on-board MCA spectra, detect pulse pileup, record time stamps, live time and event rates, and perform pulse shape analysis in an on-board digital signal processor (DSP). The on-board firmware can be customized for specific applications and parameter settings can be stored as files for easy switching between applications. A small number of prototype boards (named P500) were built and characterized in terms of energy resolution, timing resolution, and improvements in pulse shape analysis, as described below. The timing measurements are similar to previous measurements with the Pixie-4 [2], but since the analysis algorithms and other conditions were different, key measurements have been repeated with the Pixie-4 in the current work for a direct comparison between P500 and Pixie-4.

II. HARDWARE DEVELOPMENT AND TEST

Fig. 1 shows a block diagram of the P500, a standard 3U CompactPCI/PXI module. It consists of a high speed front end with three channels of 500 MHz analog to digital converters (ADC) and a Xilinx Virtex-4 FPGA. An additional fourth channel digitizing at 125 MHz is intended for reference purposes. The analog section of each channel includes a Nyquist filter to suppress frequencies above ½ of the sampling rate. In the FPGA input stage, the 12 bit, 500 MHz digital data stream from each ADC is “de-serialized” into a 48 bit, 125 MHz data stream for internal processing. FPGA processing currently implemented for each channel includes a) triggering on the rising edge of a detector pulse, b) capture of up to 8K 12 bit samples in a FIFO, c) accumulation of filter sums for reconstruction of pulse height, d) pileup inspection to reject pulses following each other so closely that the filter sums would overlap, and e) recording run statistics such as input counts and live time. In one board, two of the 12 bit, 500 MHz ADCs were replaced with pin compatible 14 bit, 400 MHz ADCs (connecting only the upper 12 bits of the output data) to investigate potential improvements with this higher precision part. This board was only used for hardware tests, not for the performance evaluation below.
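As a rough illustration of the de-serialization step, the sketch below packs four consecutive 12 bit samples into one 48 bit word, so that a 500 MHz sample stream becomes a 125 MHz word stream. The function names and the bit ordering (earliest sample in the lowest 12 bits) are assumptions made for illustration only; the actual FPGA logic is not described in this paper.

```python
SAMPLE_BITS = 12
SAMPLES_PER_WORD = 4  # 500 MHz sample rate / 125 MHz word rate
SAMPLE_MASK = (1 << SAMPLE_BITS) - 1


def pack_word(samples):
    """Pack four 12 bit samples into one 48 bit integer.

    Assumed ordering: the earliest sample occupies the lowest 12 bits.
    """
    if len(samples) != SAMPLES_PER_WORD:
        raise ValueError("expected exactly four samples")
    word = 0
    for i, sample in enumerate(samples):
        if not 0 <= sample <= SAMPLE_MASK:
            raise ValueError("sample outside the 12 bit range")
        word |= sample << (i * SAMPLE_BITS)
    return word


def unpack_word(word):
    """Recover the four 12 bit samples from a 48 bit word."""
    return [(word >> (i * SAMPLE_BITS)) & SAMPLE_MASK
            for i in range(SAMPLES_PER_WORD)]
```

Processing four samples per 125 MHz clock cycle in this way keeps the internal logic at a quarter of the ADC clock rate.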

The back end of the module, identical to the Pixie-4, includes a 16 bit DSP that manages the download of filter and trigger parameters to the FPGA and computes pulse heights (i.e. energies) from filter sums read from the FPGA. Optionally the DSP can perform pulse shape analysis, e.g. computing rise times and sums over characteristic regions of a pulse from the waveforms captured in the FIFO. MCA spectra and event-by-event list mode data (timestamps, energy, and waveforms) are stored in a 256K, 32 bit external memory controlled by a second FPGA and can be read out through a PCI interface. Several of the trigger lines defined by the PXI standard are used to distribute clocks and triggers between modules.

<table>
<thead>
<tr>
<th>Parameter</th>
<th>ADS5463 (P500 default)</th>
<th>ADS5474* (P500 option)</th>
<th>AD6645* (Pixie-4)</th>
</tr>
</thead>
<tbody>
<tr>
<td>Bits</td>
<td>12</td>
<td>14</td>
<td>14</td>
</tr>
<tr>
<td>Rate</td>
<td>500 MHz</td>
<td>400 MHz</td>
<td>75 MHz</td>
</tr>
<tr>
<td>INL (max)</td>
<td>±2.5 LSB</td>
<td>±0.75 LSB</td>
<td>±0.125 LSB</td>
</tr>
<tr>
<td>INL (typ)</td>
<td>±0.8/±0.3 LSB</td>
<td>±0.25 LSB</td>
<td>±0.063 LSB</td>
</tr>
<tr>
<td>DNL (max)</td>
<td>±0.95 LSB</td>
<td>±0.375/±0.25 LSB</td>
<td>±0.375/±0.25 LSB</td>
</tr>
<tr>
<td>DNL (typ)</td>
<td>±0.25 LSB</td>
<td>±0.175 LSB</td>
<td>±0.063 LSB</td>
</tr>
<tr>
<td>ENOB</td>
<td>10.4</td>
<td>10.9</td>
<td></td>
</tr>
<tr>
<td>RMS noise</td>
<td>0.7 LSB</td>
<td>0.45 LSB</td>
<td></td>
</tr>
</tbody>
</table>

*Values for 14 bit ADCs are given in units of 12 bit LSB.

Manufacturer specifications for the dynamic accuracy – such as integral nonlinearity (INL), differential nonlinearity (DNL), effective number of bits (ENOB), and RMS noise – for the ADCs used on the P500 (Texas Instruments ADS5463 and ADS5474) are given in Table I, together with values specified for the 14 bit, 75 MHz ADC used on the Pixie-4. Where appropriate, values for 14 bit ADCs are scaled to match the 12 bit ADC, i.e. 1 least significant bit (LSB) of a 12 bit ADC is equivalent to 4 LSBs of a 14 bit ADC.

The complete board circuitry, i.e. ADC plus analog front end and FPGA readout, was tested against several of these specifications. The noise was estimated from the distribution of several thousand ADC samples captured in FPGA memory with the inputs to the module left unconnected. The RMS noise of the samples was ~0.9 LSB for the P500 with the ADS5463 and ~0.2 LSB (12 bit) for the Pixie-4.
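The noise estimate described above can be sketched as follows: the RMS noise is taken as the standard deviation of the captured samples, and a value measured with a 14 bit ADC is converted to 12 bit LSB by dividing by 4. The function names are hypothetical and the sketch assumes the samples are already available as a list of integers.

```python
import math


def rms_noise(samples):
    """RMS deviation of ADC samples from their mean, in LSB of that ADC."""
    n = len(samples)
    if n == 0:
        raise ValueError("need at least one sample")
    mean = sum(samples) / n
    return math.sqrt(sum((s - mean) ** 2 for s in samples) / n)


def to_12bit_lsb(value_lsb, adc_bits):
    """Scale a value given in LSB of an adc_bits ADC to 12 bit LSB."""
    return value_lsb / 2 ** (adc_bits - 12)
```

For example, a noise of 4 LSB measured with a 14 bit ADC corresponds to `to_12bit_lsb(4, 14)`, i.e. 1 LSB of a 12 bit ADC.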

To measure the INL, the on-board 16-bit digital-to-analog converter (DAC) that normally compensates any offset in the input signal was ramped over the full range of the ADC. The ADC output as a function of DAC set value is nominally a straight line; any deviations are due to the nonlinearity of the DAC, the ADC, or some other circuit element. Results of these measurements are shown in Fig. 2. We observe that for both the Pixie-4 and the P500 the INL exceeds the ADC specifications, and that there are ~32 periodic jumps in the P500 measurements that are not present in the Pixie-4. We conclude that the on-board DAC, originally chosen only for offset compensation with a specified precision of 1%, dominates the INL, and thus the specified INL values for the ADC cannot be confirmed in this measurement. However, since the DAC and other circuit elements are essentially the same for the P500 and the Pixie-4, we conclude that the jumps are specific to the ADS5463 and ADS5474, likely due to the design of these fast ADCs as “folding ADCs”, which reuse the same $2^N$ fine comparators after a first stage with $2^5$ coarse comparators to reduce the total number of comparators required. In pulse height measurements, this may cause double peaks to occur in the spectrum, especially for small pulses where the jump may be a significant fraction of the amplitude.
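The residual shown in Fig. 2 can be computed along the following lines: fit a straight line to the ADC output as a function of the DAC set value and subtract the fit from the data. This is a minimal sketch using an ordinary least-squares fit; the function names are hypothetical and the fitting method actually used for Fig. 2 is an assumption.

```python
def linear_fit(x, y):
    """Ordinary least-squares fit y = slope * x + intercept."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def inl_residuals(dac_values, adc_values):
    """Deviation of each ADC reading from the fitted straight line, in LSB."""
    slope, intercept = linear_fit(dac_values, adc_values)
    return [yi - (slope * xi + intercept)
            for xi, yi in zip(dac_values, adc_values)]
```

A step in the ADC transfer curve, such as the periodic jumps discussed above, shows up as a discontinuity in these residuals.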

![Fig. 2. Measurement of integral nonlinearity (INL): Residual from linear fit to measured ADC output as a function of input voltage set by an on-board DAC.](image)

Further nonlinearity measurements were performed with an HPGe detector with reference samples as a source of pulses with defined height. This test characterizes the effect of the INL in spectrometer operation. Plotting measured energy vs nominal energy (Fig. 3) we see deviations from a straight line fit of up to 4 keV for the P500 with the 500 MHz ADC compared to less than 0.1 keV for the Pixie-4.

Thus overall non-linearities in the P500 are substantially worse than in the Pixie-4, which is understandable given that the former contains a 12 bit ADC designed for high speed and the latter contains a 14 bit ADC designed for precision. The non-linearities may distort spectra and worsen energy resolutions in measurements with HPGe detectors, but likely are not significant for faster, lower precision detectors such as LaBr₃, which are the primary target application. Processing in the FPGA may be used to reduce the effects of the nonlinearity, e.g. by correcting measured ADC values with calibration data from Fig. 2 contained in a lookup table. Since such a correction can only be as accurate as the DAC used to acquire the calibration data, the final spectrometer module will be equipped with a more precise DAC. We note that most other commercially available high-speed ADCs digitize only at 8-10 bits (with about 7 ENOB) and are likely to have similar or even worse effects.
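A lookup table correction of this kind might look like the sketch below, written in Python for readability rather than as FPGA logic. It assumes that calibration data are available as pairs of ADC code and measured residual (in LSB); the table layout, the averaging of repeated codes, and the function names are all illustrative assumptions.

```python
ADC_CODES = 4096  # number of output codes of a 12 bit ADC


def build_lut(adc_codes, residuals):
    """Average the residual observed at each ADC code.

    Codes without calibration data receive a correction of 0.0.
    """
    sums = [0.0] * ADC_CODES
    counts = [0] * ADC_CODES
    for code, residual in zip(adc_codes, residuals):
        sums[code] += residual
        counts[code] += 1
    return [s / c if c else 0.0 for s, c in zip(sums, counts)]


def correct(code, lut):
    """Subtract the tabulated nonlinearity from a raw ADC code."""
    return code - lut[code]
```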

III. PERFORMANCE EVALUATION

A. Energy Resolution

Even though HPGe detectors are not the primary application for the P500, they do provide a good test signal for performance evaluation. Pulse processing to measure energies was first implemented offline, then online. Energy resolutions in energy spectra computed offline are generally worse than in those computed online due to a) length limits in waveform capture and b) lack of baseline averaging using data between pulses. Offline processing is also very time consuming.

With online processing, the P500 comes close to the energy resolution of the Pixie-4 and reaches ~2 keV FWHM (0.15%) for the 1.3 MeV peak at an input count rate of ~2200 counts/s, compared to ~1.7 keV (0.13%) for the Pixie-4 (Fig. 4). The resolution varies with the peaking time of the energy filter, which is an input parameter that has to be adjusted for pulse shapes. Since longer filter times also imply larger dead times for the pulse height measurement, the peaking time also controls the tradeoff of throughput and resolution. This variation is similar for the Pixie-4 and the P500.

In contrast, when the input count rate is varied, energy resolutions hardly change for the Pixie-4 but vary strongly for the P500: P500 resolutions can achieve ~1.7 keV (0.13%) at ~1000 counts/s but at higher count rates peaks broaden and/or form double peaks. We attribute this behavior to the ADC non-linearities observed above, since at high count rates pulses begin to overlap and the input signal spans a larger fraction of the total ADC range.

In any case, the performance of the P500 is more than sufficient for fast scintillators such as LaBr₃, which have lower intrinsic resolutions than HPGe detectors (Fig. 5 and 6).

Using the same LaBr₃ crystal and PMT and similar filter settings, we even observed slightly better energy resolution with the P500 than with the Pixie-4, e.g. for the P500 the resolution at 1.3 MeV varied between 1.7% and 2.1% as the input count rate (ICR) increased to over 150,000 counts/s compared to 2.0%-2.5% for the Pixie-4. No double peaks from non-linearities were observed in LaBr₃ measurements.

B. Timing Resolution

In the Pixie-4, distributed clocks have a frequency of 37.5 MHz and are doubled to 75 MHz inside the FPGAs. In the P500, to simplify prototype development, channels 0 and 1 are clocked from a dedicated 500 MHz oscillator. Channels 2 and 3 are clocked from a programmable phase lock loop (PLL) chip that multiplies an incoming 37.5 MHz clock to create a 500 MHz clock. These channels are thus compatible with a Pixie-4 distributed clock and can be used to determine if the clock distribution and multiplication affects the timing.

We measured the timing precision in four different modes, illustrated in Fig. 7: A) a single signal source split into two branches, one of them delayed, and then merged and fed into a single ADC channel; B) a single signal source split into two branches, one of them delayed and each branch fed into a separate ADC channel in the same module; C) as in B), but using two separate ADC channels in two modules; and D) two coincident signals, one of them delayed, each fed into a separate ADC channel in the same module. The signal source was either an Agilent programmable pulse generator or a Photonis XP2020 photomultiplier tube (PMT) attached to a LaBr$_3$ or plastic scintillator. In each case, we measured the time difference $\Delta T$ between the two rising edges by applying a constant fraction algorithm offline to captured waveforms. The algorithm finds the two points closest to a user-defined threshold and computes the time of arrival by linear interpolation between these points. We note that these measurements are very sensitive to the actual cabling, threshold, pulse source, and pulse shape; in the results shown here these parameters have been optimized for each mode but kept constant within a mode. Acquiring several hundred waveforms, we can build a histogram of the measured values $\Delta T$, which forms a narrow distribution with a full width at half maximum (FWHM) called $dT$.
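The time-of-arrival calculation described above can be sketched as follows. The sketch assumes that the threshold is given as a fraction of the pulse amplitude above a baseline estimated from the first few samples; these details, the parameter names, and the default values are assumptions for illustration and not necessarily those of the analysis used here.

```python
def crossing_time(wave, fraction, dt_ns, n_baseline=4):
    """Time at which the rising edge crosses a fraction of the amplitude.

    wave       : list of ADC samples
    fraction   : threshold as a fraction of (maximum - baseline)
    dt_ns      : sampling interval in ns (2.0 for 500 MHz)
    n_baseline : number of leading samples averaged for the baseline
    Returns the interpolated crossing time in ns, or None if no crossing.
    """
    baseline = sum(wave[:n_baseline]) / n_baseline
    level = baseline + fraction * (max(wave) - baseline)
    for i in range(1, len(wave)):
        below, above = wave[i - 1], wave[i]
        if below < level <= above:
            # Linear interpolation between the two samples around the level
            return (i - 1 + (level - below) / (above - below)) * dt_ns
    return None


def time_difference(wave_a, wave_b, fraction, dt_ns):
    """Delta T between the rising edges of two synchronously captured waveforms."""
    return (crossing_time(wave_b, fraction, dt_ns)
            - crossing_time(wave_a, fraction, dt_ns))
```

Because the crossing time is interpolated between samples, the resulting time differences are not limited to multiples of the 2 ns sampling interval.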

In mode A), which is similar to the start-stop operation of a time-to-digital converter, $dT$ is very small: about 53 ps FWHM for LaBr$_3$, and 20 ps or less for the pulser (Fig. 8). Measurements with a plastic scintillator result in a nominal $dT$ of 49 ps FWHM, but the distribution is non-Gaussian and its full width at 10% of the maximum is about 400 ps.

In modes B) and C), pulser measurements with the P500 resulted in timing resolutions of ~20 ps and ~40 ps, respectively (Fig. 9). Varying the length of the delay cable did not significantly affect the timing resolution. Equivalent measurements with a Pixie-4 resulted in timing resolutions of 100-300 ps in mode B, but histograms of $\Delta T$ often showed double peaks or shoulders. Applying a rise time cut to limit the analysis to pulses with a certain rise time reduced these effects and improved the timing resolution to about 100 ps, but removed about 70% of collected events.

Measurements with the P500 and LaBr$_3$ (Fig. 10) resulted in timing resolutions of ~75 ps on average in both modes B) and C), though $dT$ was in some cases as good as 23 ps. Only events with energies $< \sim 1$ MeV were included in this analysis because larger pulses were not fully captured by the ADC in this measurement.

Fig. 11. Histograms of measured time difference $\Delta T$ between two coincident $^{60}$Co pulses acquired with a pair of LaBr$_3$ crystals and fast PMTs.

In mode D), using a pair of LaBr$_3$ crystals and PMTs, we measured ~900 ps with a Pixie-4 and ~630 ps with the P500 when including all events (Fig. 11). Limiting events to those with energies >1 MeV and applying a rise time cut to Pixie-4 data that removes approximately 50% of the >1 MeV events, we achieve ~400 ps and ~250 ps for the Pixie-4 and P500, respectively. The timing resolution attributed to each channel is then $1/\sqrt{2}$ of these values, i.e. ~283 ps and ~177 ps for the Pixie-4 and P500, respectively.
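The factor applied to obtain the per-channel value follows if, as assumed here, the two channels contribute equally and independently to the measured width, so that their contributions add in quadrature. The symbols $dT_{\mathrm{pair}}$ and $dT_{\mathrm{ch}}$ are introduced only for this illustration:

$$
dT_{\mathrm{pair}} = \sqrt{dT_{\mathrm{ch}}^2 + dT_{\mathrm{ch}}^2} = \sqrt{2}\, dT_{\mathrm{ch}}
\quad\Rightarrow\quad
dT_{\mathrm{ch}} = \frac{dT_{\mathrm{pair}}}{\sqrt{2}}
$$

$$
\frac{400\ \mathrm{ps}}{\sqrt{2}} \approx 282.8\ \mathrm{ps}, \qquad
\frac{250\ \mathrm{ps}}{\sqrt{2}} \approx 176.8\ \mathrm{ps}
$$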

In comparison, for a traditional timing measurement [3] similar to mode D) with an analog constant fraction discriminator to measure the delay between coincident pulses from two scintillators/PMTs (BaF$_2$ and LaBr$_3$), the timing resolution attributed to the LaBr$_3$ channel is reported to be ~140 ps FWHM with $^{60}$Co. These measurements used a PMT model XP20D0, a revision of the model XP2020 used in our tests, in which the number of dynodes has been reduced to 8 and a screening grid was applied to the anode to reduce timing jitter.

Several conclusions can be drawn from these measurements. First, the P500 electronics itself can resolve time of arrival differences as small as 20 ps with an ideal, repetitive source like the pulser (modes A and B), so the broadening of the timing resolution with real detector signals must be attributed to jitter and pulse shape variations from the detector, for example due to varying interaction locations in the detector or random processes in the light collection and amplification. These effects are particularly strong when independent detectors are used in mode D). Second, comparison between modes B) and C) demonstrates that the penalty for distributing triggers and clocks from module to module over the PXI backplane is very small; $dT$ in pulser measurements increases only by ~20 ps and in LaBr$_3$ measurements no increase is noticeable within the resolution due to the signal source itself. Third, as expected, comparison of the measurements with the Pixie-4 and the P500 demonstrates that the higher digitization rate improves the timing resolution. However, since the overall timing resolution contains contributions from both the electronics and the signal source (to first approximation added in quadrature), higher digitization rates will not improve the timing resolution indefinitely [4]. In fact, while $dT$ is ~15 times larger for the Pixie-4 than for the P500 in mode B) with the pulser, it is only ~50% larger in mode D), despite the 6.67 times slower digitization rate. Thus only systems with sufficiently low noise and timing jitter in the signal source will see significantly improved timing resolutions with higher digitization rates.
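The diminishing return from faster electronics can be illustrated with the quadrature sum mentioned above. The sketch below is only a first-approximation model with hypothetical function names; the numbers in the usage note are illustrative values, not measurements.

```python
import math


def total_fwhm(electronics_ps, source_ps):
    """Overall timing resolution from two independent contributions."""
    return math.hypot(electronics_ps, source_ps)


def source_fwhm(total_ps, electronics_ps):
    """Source contribution inferred from the total and the electronics term."""
    return math.sqrt(total_ps ** 2 - electronics_ps ** 2)
```

For a hypothetical source contribution of 600 ps, reducing the electronics contribution from 300 ps to 20 ps changes the total only from about 671 ps to about 600 ps, whereas for a source contribution near zero the total would improve by the full factor of 15.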

C. Interleaving ADCs for higher sampling rates

Fig. 12. Histograms of measured relative gain, offset and phase shift for several thousand waveforms from 2 ADCs clocked at 250 MHz with 1.95 ns phase shift at the source.

With multiple ADCs and a programmable PLL chip on the same board, there is the obvious possibility to interleave data streams from N different ADCs clocked at controlled phase shifts to achieve higher overall sampling rates [5]. However, if gain and offset from the ADCs do not match exactly and the phase is not shifted by exactly $\Delta T = (\text{clock period} / N)$, distortions occur that limit the precision of the interleaved data [6]. A detailed investigation of such mismatch and its effect on the performance is beyond the scope of the work reported here, but in timing tests with the pulser (mode B) we can measure mismatch and jitter and thus obtain a first indication of whether interleaving is feasible. The P500’s PLL chip was programmed to output a 250 MHz clock signal, for one channel delayed by 1.95 ns. (The PLL is limited to delays in steps of 150 ps, and we chose a base frequency of 250 MHz to be able to compare the interleaved data with single-channel full rate data.) Fig. 12 shows histograms of relative gain, offset and phase from several thousand pulses acquired synchronously in channels 2 and 3. Gain mismatch is ~0.2% with a jitter of 0.11% FWHM; offset mismatch is ~11.7 ADC steps with a jitter of 0.45 ADC steps; and phase mismatch from the ideal 2 ns is ~130 ps with a jitter of 36 ps. The offset mismatch could be further reduced by finely adjusting the offset DAC for each channel, and the phase mismatch (most likely due to clock line delays) could be further reduced by delaying the clock in the PLL by one additional 150 ps step. The P500 thus fulfills a necessary condition for interleaved operation, i.e. small and stable mismatch, and is potentially able to generate 12 bit data streams at double or even four times the ADC clock rate, though the quality of such data remains to be investigated.
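A minimal sketch of two-fold interleaving is given below. It computes the PLL delay closest to the ideal phase shift for a given step size and merges two sample streams after matching gain and offset of the second stream to the first. The function names and the form of the gain and offset correction are assumptions for illustration; interleaving was not implemented on the P500 in this work.

```python
def ideal_phase_shift_ps(clock_mhz, n_adcs):
    """Ideal delay between adjacent ADCs: clock period / N, in ps."""
    return 1.0e6 / clock_mhz / n_adcs


def nearest_pll_delay_ps(target_ps, step_ps=150):
    """Closest delay to target_ps that is a whole number of PLL steps."""
    return round(target_ps / step_ps) * step_ps


def interleave(stream_a, stream_b, gain_b=1.0, offset_b=0.0):
    """Merge two equally long streams; B is sampled half a period after A.

    Each B sample is first matched to A as (b - offset_b) / gain_b.
    """
    if len(stream_a) != len(stream_b):
        raise ValueError("streams must have equal length")
    merged = []
    for a, b in zip(stream_a, stream_b):
        merged.append(a)
        merged.append((b - offset_b) / gain_b)
    return merged
```

For a 250 MHz clock and two ADCs, the ideal shift is 2000 ps and the nearest setting in 150 ps steps is 1950 ps, matching the 1.95 ns delay used above.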

D. Pulse Shape Analysis

As a sample application for pulse shape analysis (PSA), we used the P500 to capture waveforms from a CsI(Tl)/BC-404 phoswich detector [7]. In this detector, interactions in the BC-404 create fast pulses (with nominally 2 ns decay time), interactions in the CsI create slow pulses (1 μs decay), and simultaneous interaction in both parts of the detector create characteristic fast/slow pulses. The detector is sketched as an inset in Fig. 13, and PSA sums to characterize pulse types are indicated below the waveforms. Both waveforms in Fig. 13 come from a 662 keV photon scattering from the BC-404 into the CsI, depositing ~180 keV in the BC-404 and ~480 keV in the CsI. Some shaping of the pulse is generally unavoidable due to a) the finite rise time of the PMT and b) the Nyquist filter preceding the ADC, limiting the bandwidth to half the sampling frequency to avoid aliasing of higher frequency noise. Consequently the P500 is able to resolve the fast pulse contributions from the BC-404 much better than the Pixie-4 with its slower sampling and lower bandwidth.

This in turn leads to a much better separation of events with coincident interactions in CsI and BC-404 from events interacting in only one of the scintillators. Part of the PSA calibration process includes measuring the slope of PSA sum P vs. sum C for BC-404 only and CsI only events. BC-404 only events have a large value for P and a relatively small value for C, so they fall close to the vertical axis in a scatter plot. CsI only events have smaller values for P and larger values for C. As shown in Fig. 14, the slopes for the distributions of the two event types are much more orthogonal for the P500. Since the BC-404 contribution is so much shorter in the P500 with its higher bandwidth, the length of P can be reduced from ~100 ns to ~20 ns, i.e. there is hardly any CsI contribution in the sum P.
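The two PSA sums could be computed from a captured waveform roughly as in the sketch below. The exact window definitions are indicated only in Fig. 13, so the sketch assumes that P is a baseline-subtracted sum over a short window at the start of the pulse and C a baseline-subtracted sum over a later window; the parameter names and the baseline estimate are hypothetical.

```python
def psa_sums(wave, p_start, p_len, c_start, c_len, n_baseline=8):
    """Baseline-subtracted sums over two regions of a waveform.

    p_start, p_len : start index and length of the sum P (in samples)
    c_start, c_len : start index and length of the sum C (in samples)
    n_baseline     : number of leading samples averaged for the baseline
    """
    baseline = sum(wave[:n_baseline]) / n_baseline
    p = sum(s - baseline for s in wave[p_start:p_start + p_len])
    c = sum(s - baseline for s in wave[c_start:c_start + c_len])
    return p, c
```

At 500 MHz, a P window of ~20 ns corresponds to `p_len = 10` samples.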

In summary, we updated the design of the existing Pixie-4 spectrometer with a high speed ADC and FPGA to build the prototype of a new high speed spectrometer module, named P500. The P500 obtained good energy resolution (even though the nonlinearities of the ADC are substantially worse than those of the slower 14-bit ADC used on the Pixie-4) and very good timing resolution. Overall, the P500 is well suited for applications with fast scintillators and, with limitations, may even be suited for some HPGe applications, for example gamma ray tracking with segmented detectors that often have resolutions on the order of 2 keV at 1.3 MeV.

A final Pixie-500 spectrometer will consist of the P500 front end (with minor changes such as a better DAC and more gain options), but the DSP will be upgraded to a 32 bit floating point model and the host interface will be upgraded to the PXI Express standard (PXIe). PXIe combines the high speed PCI Express data bus found in modern PCs with additional lines for clock and trigger distribution. This allows high data transfer rates from module to host PC (nominally up to 1 GB/s with a PCIe x4 link) and precise synchronization of data acquisition between modules.

**REFERENCES**


