2-2017

Design Techniques for Direct Digital Synthesis Circuits with Improved Frequency Accuracy over Wide Frequency Ranges

Stefan Leitner

Haibo Wang
zhwang@siu.edu

Spyros Tragoudas
Southern Illinois University Carbondale

This Article is brought to you for free and open access by the Department of Electrical and Computer Engineering at OpenSIUC. It has been accepted for inclusion in Articles by an authorized administrator of OpenSIUC. For more information, please contact opensiuc@lib.siu.edu.
Design Techniques for Direct Digital Synthesis Circuits with Improved Frequency Accuracy over Wide Frequency Ranges

Stefan Leitner, Haibo Wang, and Spyros Tragoudas

Department of Electrical and Computer Engineering, Southern Illinois University
Carbondale, Illinois 62901, USA

[http://www.worldscientific.com/worldscinet/jcsc]
1s3130@siu.edu

Recently, there has been increasing interest in impedance sensors for various applications. Direct digital synthesis (DDS) circuits are commonly used in such sensor circuits to generate stimulus signals because they offer accurate frequency control and drift-free performance, among other advantages. Previously reported DDS circuits for sensor applications typically maintain superb frequency accuracy only within relatively small frequency ranges. This paper investigates techniques to improve frequency accuracy over wide frequency ranges. In addition, it presents an analytical framework to estimate the signal-to-noise ratio (SNR) of the generated signal and derives guidelines for optimizing DDS circuit configurations. Both simulation and hardware measurement results are presented to validate the derived SNR estimation equation as well as the developed frequency accuracy enhancement techniques.

Keywords: Direct Digital Synthesizer, Frequency Synthesis, FPGA Design.

1. Introduction

Direct Digital Synthesis (DDS) techniques have been widely used to generate sinusoidal or arbitrary waveforms with programmable frequencies.\(^1\) Compared to analog waveform generation circuits, DDS offers accurate frequency control, fast dynamic frequency adjustment, and immunity to frequency and amplitude drifts. Its applications include signal processing, communication, control and impedance sensing circuits. Recently, there has been increasing interest in impedance sensing for various applications,\(^3\) which prompts the demand for DDS circuits that can maintain excellent frequency accuracy over wide frequency ranges. Existing DDS circuits developed for impedance sensing applications typically attain good frequency accuracy in low frequency ranges, but their maximum signal frequencies are typically limited to tens of kHz or 100 kHz.\(^6\) On the other hand, DDS circuits for radar and communication applications can achieve superb performance in high frequency ranges (e.g. tens or hundreds of MHz, or GHz), but suffer from poor frequency accuracy at low frequencies.\(^{10}\)

This paper investigates techniques to improve frequency accuracy over a wide frequency range, e.g. from 1 Hz (or below) to tens of MHz. It proposes to partition the target frequency range into sub ranges and develops techniques that enable the DDS circuit to automatically adjust the phase accumulator clock and the phase increment value according to the target signal frequency. To help designers estimate the circuit performance at an early design stage, this paper derives a framework to predict the signal-to-noise ratio (SNR) of the DDS circuit output. Unlike existing analyses, which mainly focus on the effect of the size and configuration of the lookup table (LUT) used in the DDS circuit, this work considers the key design parameters of both the digital and the analog (smoothing filter) components of the DDS circuit. To validate the proposed techniques, various DDS circuits are designed and implemented on prototype hardware consisting of an FPGA board, a 12-bit digital-to-analog converter (DAC) and a fourth-order passive smoothing filter. Hardware measurement data show good agreement with analytical estimation and Matlab simulation results.

The rest of the paper is organized as follows. Section 2 discusses the basic principles of DDS techniques and their applications in impedance sensing circuits. It also highlights the difficulty of maintaining high frequency accuracy over wide frequency ranges and briefly reviews existing design techniques for DDS circuits. Section 3 analyzes the impact of the key design parameters on the performance of DDS circuits. It presents analytical equations for SNR estimation and derives guidelines for optimal LUT configuration. Section 4 presents techniques to enhance frequency accuracy over wide frequency ranges. The design and implementation of the DDS circuits are discussed in Section 5. Measurement and simulation results are reported in Section 6 and the paper is concluded in Section 7.

2. Background

Impedance sensing has been used in many fields, including biochemistry, material science, biosensors, consumer electronics, etc. In such applications, the complex impedance is measured over a frequency range and the obtained impedance versus frequency relation is often depicted in a Nyquist or Bode plot. There are several techniques to measure the sensor impedance. Figure 1 (a) shows the auto balancing bridge measurement method, which is widely used in impedance sensor circuits. During the measurement process, a stimulus signal, generated by a signal source with programmable frequency, is applied to the device under test (Z_X) via a buffer circuit. The potential at the left terminal of Z_X is measured and denoted as V_1; the potential at the right terminal of Z_X is virtual ground. Hence the voltage across Z_X is known from V_1. Meanwhile, the current flowing through Z_X also flows through the amplifier feedback R. The current value can be determined from the voltage measurement V_2 at the amplifier output. The ratio of the voltage across Z_X over the current flowing through it reveals the impedance of Z_X.
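
The voltage-over-current ratio described above can be illustrated with a minimal Python sketch. It assumes an ideal inverting amplifier stage, so that the current through Z_X equals −V_2/R; the function name and the numeric values are hypothetical and only serve as an example.

```python
import cmath

def impedance_from_bridge(v1, v2, r_feedback):
    """Estimate Z_X from the two measured phasors.

    Assumes an ideal amplifier: the right terminal of Z_X is at virtual
    ground and the current through Z_X is I = -V_2 / R.
    """
    current = -v2 / r_feedback
    return v1 / current

# Hypothetical device under test and feedback resistor.
z_true = complex(1000.0, -500.0)   # ohms
r_feedback = 2000.0                # ohms

# Phasors that an ideal bridge would produce for a 1 V stimulus.
v1 = cmath.rect(1.0, 0.0)
v2 = -v1 * r_feedback / z_true

z_measured = impedance_from_bridge(v1, v2, r_feedback)
print(z_measured)
```

Repeating this calculation at each stimulus frequency yields the impedance versus frequency relation that is plotted in a Nyquist or Bode diagram.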

To achieve high sensitivity, many impedance sensors require the frequency of the stimulus signal to be swept over a large frequency range, with small steps and high accuracy. Such requirements make the use of digital techniques to synthesize the stimulus signals more attractive due to the large programmable frequency ranges and robust performance associated with digital frequency synthesis techniques. There are several techniques to synthesize sinusoidal signals using digital circuits, including the CORDIC algorithm, feedback-based digital oscillators, nonlinear digital-to-analog converters and the LUT-based numerically controlled oscillator approach. Among them, the LUT-based method is often the popular choice due to its superb frequency control and stability. Figure 1 (b) shows the basic structure of a LUT-based DDS circuit. It consists of a phase accumulator (PA), LUT, DAC, and analog low-pass filter (LPF). The LUT stores a set of equally spaced samples of a sinusoidal waveform. The LUT address, also referred to as the signal phase, is generated by the PA, which increases the signal phase by a constant value $\alpha$ every clock cycle. The digital output of the LUT is converted to an analog signal by the DAC and subsequently smoothed by the LPF to reduce harmonic components.

Assume the bit width of the PA is $p$ and the clock frequency is $f_{clk}$. The frequency of the synthesized signal is a function of the phase increment value $\alpha$:

$$f_{sig} = \frac{\alpha}{2^p} f_{clk}$$
(1)

In theory, $\alpha$ can be any integer value from 1 to $2^{p-1}$, resulting in a programmable frequency range from $f_{clk}/2^p$ to $f_{clk}/2$. The lower bound is due to the PA's finite resolution and the upper bound is imposed by the sampling theorem. Since the minimum change of $\alpha$ is 1, the frequency resolution is $f_{clk}/2^p$. Due to this fixed frequency resolution, a DDS circuit with a wide frequency range typically suffers from poor frequency accuracy in the low frequency range. For example, assuming $f_{clk} = 100$ MHz and $p=27$, the resulting frequency resolution is 0.745 Hz. To generate a 10 Hz signal, $\alpha = 13$ results in the best approximation, which is about 9.69 Hz. This corresponds to a relative frequency error of about 3%. The relative frequency error increases further for frequencies below 10 Hz. Such large frequency errors typically cannot be tolerated in many impedance sensing applications. The simplest approach to enhance the frequency accuracy is to improve the frequency tuning resolution by increasing $p$. This, however, is often undesirable due to increased circuit size and power consumption.
This example also reveals that the value of $\alpha$ (13) is not the proper digital representation of the desired signal frequency (10). This inevitably complicates the interface between the DDS circuit and its host control when a large number of signal frequencies have to be generated over a wide frequency range.
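
The numbers in this example can be reproduced with a short Python sketch. The function name is hypothetical; the calculation simply picks the integer $\alpha$ closest to the ideal value and evaluates the resulting frequency from the relation $f_{sig} = \alpha f_{clk} / 2^p$.

```python
def best_phase_increment(f_target_hz, f_clk_hz, p_bits):
    """Return the closest integer phase increment and the resulting error."""
    resolution = f_clk_hz / 2**p_bits            # frequency step per unit of alpha
    alpha = max(1, round(f_target_hz / resolution))
    f_actual = alpha * resolution                # frequency that is really generated
    rel_error = abs(f_actual - f_target_hz) / f_target_hz
    return alpha, f_actual, rel_error

# Values from the example: 100 MHz clock, 27-bit accumulator, 10 Hz target.
alpha, f_actual, rel_error = best_phase_increment(10.0, 100e6, 27)
print(alpha, f_actual, rel_error)
```

Calling the function with smaller targets, such as 2 Hz or 5 Hz, shows how the relative error grows toward the low end of the range.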

In the past, various efforts have been devoted towards minimizing LUT size to reduce circuit area. Many of them take advantage of the symmetry property of the sine wave and, hence, store only the first quadrant of the sine wave. Values in other quadrants can be generated by subtracting an appropriate constant according to the PA output. Thus, the LUT size can be reduced to one fourth of the original size without affecting the signal spectral purity. Other techniques exploit trigonometric identities which enable samples to be generated by addition and multiplication operations from a reduced set of data points stored in the LUT. This however requires additional hardware and computation, adversely impacting the circuit speed as well as the maximum signal frequency. Also, due to the nonlinear effects of interpolation the spectral purity of the synthesized signal is negatively affected.

Driven by communication and radar applications, techniques to improve signal spectral purity, often measured by spurious free dynamic range (SFDR), have been intensively studied. A simple approach is to avoid assigning even numbers to $\alpha$ such that $\alpha/2^p$ is irreducible, which prevents the circuit from repeatedly accessing certain data points and skipping the others. This helps spread out the noise energy associated with the quantization errors and hence improves SFDR. However, in the low frequency range the value of $\alpha$ is very small. Skipping even numbers will cause significant frequency error. To avoid this limitation, the phase dithering method adds random noise to the PA output to break the pattern of repeatedly accessing some data and skipping the other. The magnitude of the random noise is typically selected to be smaller than half of the resolution of the phase signal. In practice, the phase dithering method often adds pseudo-random noise generated by a linear feedback shift register to the PA output with the constraint that the repetition rate of the noise should be much smaller than the signal frequency. Similarly, amplitude dithering can be used to improve SFDR. These two approaches help randomize the quantization noise, but at the expense of a slightly increased noise floor. Furthermore, the noise shaping method is also used to curtail the PA quantization noise and hence improve SFDR. The PA output is typically truncated before it is fed to the LUT address input. In the simplest noise shaping approach, such truncation errors can be added to the phase increment value $\alpha$. More sophisticated noise shaping filters can be used for improved noise shaping efficiency at the cost of increased hardware overhead.

3. DDS Circuit Performance Analysis

This section studies the impact of the key design parameters on the performance of DDS circuits. The analytical results serve as a framework for designers to explore various design options and subsequently select the optimal design for given hardware resources. Complementary to many existing studies that emphasize SFDR, this work focuses on maximizing SNR, which is also crucial in many sensor applications.

The analysis assumes that the bit widths of the LUT output and address input are \( l \) and \( n \), respectively. Note that only the most significant bits (MSBs) of the PA output are fed to the LUT address input. These bits are referred to as integer bits and the remaining bits of the PA output are called fractional bits. It further assumes that only the first quadrant of the sine wave is stored in the LUT. Therefore, a complete (one period) sine wave contains \( 2^{n+2} \) data points. The time duration between two consecutive data points is:

\[
\Delta t_{\text{samples}} = \frac{T_{\text{sig}}}{2^{n+2}} = \frac{1}{2^{n+2} \cdot f_{\text{sig}}}
\]
(2)

The finite resolutions on signal phase (limited by \( n \)) and signal magnitude (limited by \( l \)) introduce distortions to the synthesized signal. The phase inaccuracy can be modeled as sampling jitter \( j(t) \), which can be expressed as a sinusoidal signal with the amplitude of \( \Delta t_{\text{samples}}/2 \) and the same frequency as the synthesized signal:

\[
j(t) = \frac{\Delta t_{\text{samples}}}{2} + \frac{\Delta t_{\text{samples}}}{2} \cdot \cos(2\pi f_{\text{sig}} \cdot t)
\]
(3)

Assuming the synthesized signal \( V(t) \) swings from 0 to 1, it can be written as:

\[
V(t) = \frac{1}{2} + \frac{1}{2} \sin\left[2\pi f_{\text{sig}} \cdot (t + j(t))\right]
\approx \frac{1}{2} + \frac{1}{2} \sin(2\pi f_{\text{sig}} \cdot t) + \frac{1}{2} \cdot \frac{\partial \sin(2\pi f_{\text{sig}} \cdot t)}{\partial t} \cdot j(t)
\]
(4)

The last term in the approximated expression represents the distortion due to sampling jitter. Its root mean square (RMS) value can be derived as:

\[
D_n = \frac{\pi}{2^{n+4}}
\]
(5)

Since the signal swings from 0 to 1 and only the first quadrant of the signal is stored in the LUT, the maximum magnitude difference among the data stored in the LUT is \( 1/2 \). Thus, the \( V_{\text{LSB}} \) of the digital data in the LUT is:

\[
V_{\text{LSB}} = \frac{1}{2^{l+1}}
\]
(6)

Then the signal distortion due to quantization noise is:

\[
D_l = \frac{V_{\text{LSB}}}{\sqrt{12}} = \frac{1}{2^{l+1} \sqrt{12}}
\]
(7)

The combined noise power is \( D^2 = D_n^2 + D_l^2 \) and hence, SNR can be derived as:

\[ \text{SNR} = -10 \cdot \log_{10}\left(8\left(\frac{\pi}{2^{n+4}}\right)^2 + \frac{2}{3}\left(\frac{1}{2^{l+1}}\right)^2\right) \]
(8)

For a given LUT size \(M\), \(n\) and \(l\) follow the relation:

\[ l = \frac{M}{2^n} \]  
(9)

Also, the optimal values for \(n\) and \(l\) that result in the minimum distortion energy can be obtained by solving:

\[ \frac{\partial D^2}{\partial n} = -2 \cdot \ln 2 \cdot D_n^2 - 2 \cdot \ln 2 \cdot D_l^2 \cdot \frac{\partial l}{\partial n} = 0 \]
(10)

The solution is:

\[ n = l - \frac{1}{2} \log_2 l - \frac{1}{2} \log_2 \left(\frac{16}{3\pi^2} \ln 2\right) \approx l - \frac{1}{2} \log_2 l + 0.7 \]  
(11)
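
As a sketch of the intermediate steps (treating \(l\) as a continuous variable, which is an assumption made here for illustration), Equation 9 gives \(\partial l / \partial n = -l \ln 2\), so the condition in Equation 10 reduces to \(D_n^2 = l \ln 2 \cdot D_l^2\), where \(D_n^2\) and \(D_l^2\) are the two noise power terms. Inserting \(D_n = \pi / 2^{n+4}\) and \(D_l = 1/(2^{l+1}\sqrt{12})\):

\[
\frac{\pi^2}{2^{2n+8}} = \frac{l \ln 2}{12 \cdot 2^{2l+2}}
\;\Longrightarrow\;
2^{2(l-n)} = \frac{16 \ln 2}{3\pi^2} \cdot l
\;\Longrightarrow\;
n = l - \frac{1}{2}\log_2 l - \frac{1}{2}\log_2\left(\frac{16 \ln 2}{3\pi^2}\right)
\]

Numerically, \(-\frac{1}{2}\log_2\left(\frac{16 \ln 2}{3\pi^2}\right) \approx 0.71\), which is the constant rounded to 0.7 in Equation 11.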

When \(l\) is not very large, the above relation indicates that \(n \approx l\) is the optimal LUT configuration for improving SNR with the given LUT size limitation. Previous works propose the relation of \(n = l + 1\) for maximizing SFDR since it leads to the largest harmonic due to phase inaccuracy being below the amplitude quantization noise floor.\(^1\)\(^2\)

When the SNR instead of SFDR is of great interest, and there are stringent LUT size constraints, the LUT configuration of \(n \approx l\) leads to a more memory efficient implementation, since increasing \(n\) by 1 doubles the LUT size.
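
The trade-off between \(n\) and \(l\) can be explored numerically with a minimal Python sketch of Equations 8, 9 and 11. The function names, the example LUT size of 10240 bits and the candidate range for \(n\) are illustrative assumptions, and \(l = M/2^n\) is allowed to take non-integer values purely for the comparison.

```python
import math

def lut_snr_db(n, l):
    """SNR at the LUT output according to Equation 8."""
    phase_term = 8 * (math.pi / 2 ** (n + 4)) ** 2
    amplitude_term = (2 / 3) * (1 / 2 ** (l + 1)) ** 2
    return -10 * math.log10(phase_term + amplitude_term)

def optimal_n(l):
    """Address width suggested by Equation 11 for a given word width l."""
    constant = 16 * math.log(2) / (3 * math.pi ** 2)
    return l - 0.5 * math.log2(l) - 0.5 * math.log2(constant)

def best_address_width(lut_bits, candidates):
    """Pick the n that maximizes Equation 8 when l = M / 2^n (Equation 9)."""
    return max(candidates, key=lambda n: lut_snr_db(n, lut_bits / 2 ** n))

lut_bits = 10 * 2 ** 10                      # hypothetical LUT size M in bits
print(best_address_width(lut_bits, range(8, 13)))
print(lut_snr_db(10, 10), optimal_n(10))
```

For this hypothetical LUT size, the search over integer address widths selects the configuration with \(n = l = 10\), in line with the \(n \approx l\) guideline.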

The signal from the LUT output is fed to the DAC and subsequently smoothed by the LPF. In typical sensor applications, the DACs used in the DDS circuits have moderate resolution and modest conversion rate.\(^5\)\(^9\) Thanks to the advanced DAC circuit techniques, the distortions caused by non-ideal DAC circuit effects are small and often can be ignored in such scenarios. The major effect of the DAC circuit on the synthesized signal is due to the zero-order hold circuit behavior, which refers to the fact that the DAC output holds the current value until the arrival of the next value. The transfer function of the zero-order hold circuit is a sinc function, which attenuates the generated signal as well as its images, as shown in Figure 2 (a). Among all the images, only the first image, whose frequency is \(f_{\text{clk}} - f_{\text{sig}}\), may have notable effect since other images appear at higher frequencies and are adequately attenuated by the sinc filter as well as the smoothing filter. Assume the order of the smoothing filter is \(F_{\text{order}}\). After passing the filter passband, the filter gain is approximately attenuated by \(-20 \cdot F_{\text{order}} dB\) per decade as shown in Figure 2 (b). Thus, the filter gain at the first image frequency is:

\[ G = 10^{-F_{\text{order}} \log_{10}\left(\frac{f_{\text{clk}} - f_{\text{sig}}}{f_{\text{BW}}}\right)} \]  
(12)
where \( f_{BW} \) is the bandwidth of the smoothing filter and the expression \( \log_{10}\left((f_{clk} - f_{sig})/f_{BW}\right) \) gives the number of frequency decades by which the first image frequency lies beyond the filter bandwidth. Subsequently, the magnitude of the first image can be approximated by:

\[
A_{\text{image}} = \frac{1}{2} \, \text{sinc} \left( \frac{\pi(f_{clk} - f_{sig})}{f_{clk}} \right) \cdot 10^{-F_{\text{order}} \cdot \log_{10}\left(\frac{f_{clk} - f_{sig}}{f_{BW}}\right)}
\]
(13)

Assume the noise due to finite LUT length and limited PA resolution is white noise. Its contribution to the noise energy at the LPF output can be estimated by multiplying its original value by \((2f_{ENB}/f_{clk})^2\), where \( f_{ENB} \) is the LPF equivalent noise bandwidth. When the filter order is 1 or 2, \( f_{ENB} \) is about 1.57 or 1.11 times the filter bandwidth. When the filter order is higher, \( f_{ENB} \) is about the same as the filter bandwidth. Finally, the total noise and distortion energy at the LPF output can be estimated by summing the filtered noise energy and the energy of the first image. That is:

\[
N_{\text{total}}^2 = \left( \left( \frac{\pi}{2^{n+4}} \right)^2 + \frac{1}{12} \left( \frac{1}{2^{l+1}} \right)^2 \right) \cdot \left( \frac{2f_{ENB}}{f_{clk}} \right)^2 + \left( \frac{1}{2 \sqrt{2}} \, \text{sinc} \left( \frac{\pi(f_{clk} - f_{sig})}{f_{clk}} \right) \cdot 10^{-F_{\text{order}} \cdot \log_{10}\left(\frac{f_{clk} - f_{sig}}{f_{BW}}\right)} \right)^2
\]
(14)

Meanwhile, the signal energy is:

\[ S^2 = \left( \frac{1}{2 \sqrt{2}} \cdot \text{sinc} \left( \frac{\pi f_{\text{sig}}}{f_{\text{clk}}} \right) \right)^2 \]
(15)

Thus, the SNR expression of the final output signal is given in Equation 16. This expression can be used to estimate the achievable SNR in the early design stage for selecting optimal DDS circuit configurations.

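\[
\text{SNR} = -10 \log_{10} \left( \frac{\left( 8 \left( \frac{\pi}{2^{n+4}} \right)^2 + \frac{2}{3} \left( \frac{1}{2^{l+1}} \right)^2 \right) \cdot \left( \frac{2f_{ENB}}{f_{\text{clk}}} \right)^2}{\text{sinc}^2 \left( \frac{\pi f_{\text{sig}}}{f_{\text{clk}}} \right)} + \left( \frac{\text{sinc} \left( \frac{\pi (f_{\text{clk}} - f_{\text{sig}})}{f_{\text{clk}}} \right) \cdot 10^{-F_{\text{order}} \cdot \log_{10} \left( \frac{f_{\text{clk}} - f_{\text{sig}}}{f_{\text{BW}}} \right)}}{\text{sinc} \left( \frac{\pi f_{\text{sig}}}{f_{\text{clk}}} \right)} \right)^2 \right)
\]
(16)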
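
The estimate can be evaluated numerically with a minimal Python sketch that forms the ratio of the signal energy (Equation 15) to the total noise and distortion energy (Equation 14). The function names are hypothetical, and the 20 MHz filter bandwidth used in the example call is an assumed value, not a parameter reported in this paper. For the fourth-order filter, \( f_{ENB} \) is set equal to the filter bandwidth, as stated above.

```python
import math

def sinc(x):
    """Unnormalized sinc, sin(x)/x, as used for the zero-order hold response."""
    return 1.0 if x == 0 else math.sin(x) / x

def output_snr_db(n, l, f_sig, f_clk, f_bw, f_enb, filter_order):
    """SNR at the LPF output: signal energy (Eq. 15) over noise energy (Eq. 14)."""
    # White noise from finite phase and amplitude resolution, scaled by the
    # noise bandwidth factor used in Equation 14.
    lut_noise = (math.pi / 2 ** (n + 4)) ** 2 + (1 / 12) * (1 / 2 ** (l + 1)) ** 2
    filtered_noise = lut_noise * (2 * f_enb / f_clk) ** 2
    # First image at f_clk - f_sig, attenuated by the sinc response and the filter.
    f_image = f_clk - f_sig
    filter_gain = 10 ** (-filter_order * math.log10(f_image / f_bw))
    image = (sinc(math.pi * f_image / f_clk) * filter_gain / (2 * math.sqrt(2))) ** 2
    signal = (sinc(math.pi * f_sig / f_clk) / (2 * math.sqrt(2))) ** 2
    return 10 * math.log10(signal / (filtered_noise + image))

# Assumed example: n = l = 10, 100 MHz clock, fourth-order filter, 20 MHz bandwidth.
snr_1mhz = output_snr_db(10, 10, 1e6, 100e6, 20e6, 20e6, 4)
snr_15mhz = output_snr_db(10, 10, 15e6, 100e6, 20e6, 20e6, 4)
print(snr_1mhz, snr_15mhz)
```

With these assumed values the estimated SNR is lower at 15 MHz than at 1 MHz, because the first image moves closer to the filter passband and is attenuated less by both the sinc response and the filter.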

4. Techniques to Improve Frequency Accuracy

In typical impedance sensing circuits, a control unit programs the DDS circuit to synthesize stimulus signals with various frequencies. As illustrated in the earlier example, the phase increment value \( \alpha \) is not exactly the digital representation of signal frequency \( f_{\text{sig}} \). Though it is possible to let the control unit calculate the proper \( \alpha \) value for each frequency, this complicates the interface between the two since the control unit has to track the clock frequency used in the DDS circuit. A more desirable approach is to let the control unit directly send the digital representation of \( f_{\text{sig}} \) to the DDS circuit and the DDS circuit compute the desirable \( \alpha \) value on its own.

From Equation 1 we have:

\[
\alpha = \frac{2^p}{f_{\text{clk}}} f_{\text{sig}} = C \cdot f_{\text{sig}}
\]

(17)

We refer to \( C = \frac{2^p}{f_{\text{clk}}} \) as the phase increment scaling factor. Once the control unit sends \( f_{\text{sig}} \) value to the DDS circuit, the corresponding \( \alpha \) value can be computed using the above equation. In practical implementations, both \( \alpha \) and \( C \) are represented by finite bit widths, which introduce errors to the realized signal frequency. We use \( \Delta \alpha \) and \( \Delta C \) to represent the inaccuracies of \( \alpha \) and \( C \) caused by their finite bit widths. Then we have:

\[
\alpha = (C + \Delta C) \cdot f_{\text{sig}} + \Delta \alpha
\]

(18)

Substituting it into Equation 1, the absolute and the relative frequency errors can be derived as:

\[
\Delta f_{\text{sig}} = \frac{\Delta \alpha}{2^p} \cdot f_{\text{clk}} + \frac{\Delta C}{2^p} \cdot f_{\text{clk}} \cdot f_{\text{sig}}
\]
(19)

\[
\frac{\Delta f_{\text{sig}}}{f_{\text{sig}}} = \frac{\Delta \alpha}{2^p} \cdot \frac{f_{\text{clk}}}{f_{\text{sig}}} + \frac{\Delta C}{2^p} \cdot f_{\text{clk}}
\]
(20)

Figure 3: Bit truncations in computing phase increment value \( \alpha \) and LUT address

Figure 3 depicts the bit width assignments for the PA output and parameter \( C \). It partitions the \( p \)-bit PA output into three sections, whose bit widths are 2, \( n \) and \( k \). Thus, \( p = 2 + n + k \). The first two MSBs indicate the quadrant of the signal; the next \( n \) bits are integer bits fed to the LUT address input; and the remaining \( k \) bits are fractional bits. The figure also shows that the bit width of \( C \) is \( 2 + n + k + m \) (or \( p + m \)). The extra \( m \) bits are to reduce the impact of \( \Delta C \).

Using the above notation, Equation 19 can be simplified to:

\[ \Delta f_{\text{sig}} = \frac{1}{2^{2+n+k}} \cdot f_{\text{clk}} \left( 1 + \frac{f_{\text{sig}}}{2^m} \right) \]

(21)

Note that both \( \Delta \alpha \) and \( \Delta C \) contribute to the frequency error \( \Delta f_{\text{sig}} \). There are two approaches to assign the \( \alpha \) and \( C \) bit widths to satisfy the frequency error requirement. In the first approach, \( m \) is selected large enough such that \( f_{\text{sig}}/2^m \) is much smaller than 1 and hence the contribution due to \( \Delta C \) can be ignored. Since the theoretical maximum signal frequency is \( f_{\text{clk}}/2 \), we can select:

\[ m \geq \log_2 f_{\text{clk}} \]

(22)

In this case, the frequency error is mainly determined by the frequency resolution \( f_{\text{clk}}/2^p \), which can be improved by increasing the PA fractional bit width \( k \). For a given frequency accuracy requirement \( \epsilon_f \), the required \( k \) value is:

\[ k \geq \log_2 \frac{f_{\text{clk}}}{\epsilon_f} - (2 + n) \]

(23)

In the second approach, \( \alpha \) and \( C \) values are selected such that \( \Delta \alpha \) has a negative value and \( \Delta C \) has a positive value. As a result, the frequency error contributions due to \( \Delta \alpha \) and \( \Delta C \) partially cancel each other. In the computation of \( \alpha \), its fractional bits are simply truncated. Thus, the digital representation of \( \alpha \) is smaller than its ideal value and \( \Delta \alpha \) is negative. However, in the computation of \( C \), the result is always rounded up to the nearest integer, resulting in a positive \( \Delta C \). Subsequently, Equation 21 can be written as:

\[ \Delta f_{\text{sig}} = \frac{1}{2^{n+2+k}} \cdot f_{\text{clk}} \left( \frac{f_{\text{sig}}}{2^m} - 1 \right) \]
(24)

When the signal frequency is small, term \( f_{\text{sig}} / 2^m \) is much smaller than 1 and hence can be ignored. Therefore, the \( k \) value given by Equation 23 is still valid. When the signal frequency is large, the frequency error is kept low by the fact that \( f_{\text{sig}} / 2^m \) and \( -1 \) partially cancel each other. To make the above two terms completely cancel each other at the maximum signal frequency \( f_{\text{sig}, \text{max}} \), we can select \( m \) as:

\[ m = \log_2 f_{\text{sig}, \text{max}} \]  
(25)

This selection not only reduces the bit width of \( C \) but also reduces the frequency error when the synthesized signal frequency is close to the upper end of the frequency range.
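
The second approach can be sketched in Python with integer arithmetic. This is an illustrative model rather than the circuit implementation: the function name, the 32-bit accumulator, the 26 extra bits for \( C \) and the integer-Hz frequency words are all assumptions. The scaling factor is rounded up (positive \( \Delta C \)) and the product is truncated (negative \( \Delta \alpha \)), so the resulting error should lie between \( -f_{\text{clk}}/2^p \) and \( (f_{\text{clk}}/2^p) \cdot f_{\text{sig}}/2^m \).

```python
def scaled_increment(f_sig_hz, f_clk_hz, p_bits, m_bits):
    """Compute alpha from an integer frequency word using a fixed-point C.

    C is stored with m extra fractional bits and rounded up; the product
    C * f_sig is truncated to an integer alpha.
    """
    c_fixed = -(-(1 << (p_bits + m_bits)) // f_clk_hz)   # ceil(2^(p+m) / f_clk)
    alpha = (c_fixed * f_sig_hz) >> m_bits               # drop fractional bits
    f_real = alpha * f_clk_hz / (1 << p_bits)
    return alpha, f_real - f_sig_hz

F_CLK = 100_000_000      # Hz
P_BITS, M_BITS = 32, 26  # hypothetical bit widths
for f_sig in (1, 10, 1_000, 1_000_000, 40_000_000):
    alpha, error = scaled_increment(f_sig, F_CLK, P_BITS, M_BITS)
    print(f_sig, alpha, error)
```

Because the two error contributions have opposite signs, the error magnitude never exceeds the larger of the two bounds, which is the cancellation effect exploited by Equation 24.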

In certain scenarios, the relative frequency error \( \Delta f_{\text{sig}} / f_{\text{sig}} \) is of interest. With the above bit width notations, Equation 20 can be re-written as:

\[ \frac{\Delta f_{\text{sig}}}{f_{\text{sig}}} = \frac{1}{2^{n+2+k}} \cdot \frac{f_{\text{clk}}}{f_{\text{sig}}} \left( 1 + \frac{f_{\text{sig}}}{2^m} \right) \approx \frac{1}{2^{n+2+k}} \cdot \frac{f_{\text{clk}}}{f_{\text{sig}}} \]  
(26)

Then for a relative frequency error requirement \( \epsilon \), the required \( k \) value is:

\[ k \geq \log_2 \left( \frac{1}{\epsilon} \cdot \frac{f_{\text{clk}}}{f_{\text{sig}}} \right) - (2 + n) \]  
(27)

It shows that the clock-to-signal frequency ratio directly affects the \( k \) value. If the design is to cover a very large frequency range \( \left[ f_{\text{sig}, \text{min}}, f_{\text{sig}, \text{max}} \right] \), the maximum clock-to-signal frequency ratio, \( R_{\text{max}} = f_{\text{clk}} / f_{\text{sig}, \text{min}} \), determines the \( k \) value. This typically leads to a very large \( k \) value and a bulky PA circuit implementation. To address this problem, we propose to partition the entire signal frequency range into sub ranges. Each sub range uses a dedicated clock frequency such that the \( R_{\text{max}} \) value is kept relatively low. For example, if the circuit is to cover the frequency range from 1 Hz to 50 MHz and the other design parameters are \( n = 10 \), \( f_{\text{clk}} = 100\,\text{MHz} \), and \( \epsilon = 0.1\% \), the straightforward implementation requires 37 bits at the PA output, 25 of them being fractional bits. With the proposed approach, the frequency range is partitioned into three sub ranges: [1 Hz, 300 Hz], [300 Hz, 100 kHz], and [100 kHz, 50 MHz]. The clock frequencies for the three sub ranges are 1 kHz, 300 kHz, and 100 MHz, respectively. This keeps \( R_{\text{max}} = 1000 \), and the design requires only 20 bits at the PA output, 8 of them being fractional bits. In general, if the frequency range \( \left[ f_{\text{sig}, \text{min}}, f_{\text{sig}, \text{max}} \right] \) of the DDS circuit is evenly partitioned into \( Q \) sub ranges on a logarithmic scale, the proposed approach can reduce the bit width of the PA circuit by:

\[ \mu = \frac{Q - 1}{Q} \cdot \log_2 \frac{f_{\text{sig}, \text{max}}}{f_{\text{sig}, \text{min}}} \]
(28)
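
The bit widths quoted in the example above can be checked with a minimal Python sketch of Equation 27. The function name is hypothetical; the sub range boundaries and clock frequencies are the ones given in the example.

```python
import math

def pa_bits(n, f_clk, f_sig_min, rel_err):
    """Return (total PA bits p, fractional bits k) required by Equation 27."""
    k = math.ceil(math.log2(f_clk / (rel_err * f_sig_min)) - (2 + n))
    k = max(k, 0)
    return 2 + n + k, k

N_BITS, REL_ERR = 10, 1e-3

# Single 100 MHz clock covering everything down to 1 Hz.
single_clock = pa_bits(N_BITS, 100e6, 1.0, REL_ERR)

# Three sub ranges as (lowest signal frequency, dedicated clock frequency).
sub_ranges = [(1.0, 1e3), (300.0, 300e3), (100e3, 100e6)]
partitioned = max(pa_bits(N_BITS, f_clk, f_min, REL_ERR) for f_min, f_clk in sub_ranges)

print(single_clock, partitioned)
```

The partitioned design is sized by its most demanding sub range; in this example all three sub ranges have the same clock-to-signal ratio and therefore the same requirement.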

5. DDS Circuit Design and Implementation

This section discusses the key circuit structures to implement the proposed DDS circuit. It partitions the target frequency range into sub ranges. The corresponding clock frequency and $C$ value for each sub range are computed offline and stored in registers. As illustrated in Figure 4, once the desired signal frequency value is fed to the DDS circuit, the sub range select block, a simple combinational circuit, identifies the sub range covering the desired frequency and accordingly guides the multiplexers to select the proper scaling factor $C$ and clock frequency $f_{clk}$. The proper scaling factor $C$ is then used to compute the desired phase increment value $\alpha$. The different clock frequencies can be generated by reprogramming the clock management circuit. Note that the computation of $\alpha$ is performed once for a given signal frequency and the clock management circuit programming is needed only when the signal frequency transits from one sub range to another.
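
The selection flow can be modeled behaviorally with a short Python sketch. It is an illustration, not the hardware description: the table reuses the three sub ranges and clock frequencies from the example in Section 4, the 20-bit PA width is taken from the same example, the names are hypothetical, and the scaling factor is applied with exact integer arithmetic so that only the truncation of \( \alpha \) is modeled.

```python
P_BITS = 20  # PA width from the partitioned example in Section 4

# (upper signal frequency in Hz, clock frequency in Hz) for each sub range.
SUB_RANGES = [
    (300, 1_000),
    (100_000, 300_000),
    (50_000_000, 100_000_000),
]

def select_sub_range(f_sig_hz):
    """Mimic the sub range select block: return (index, clock frequency)."""
    for index, (f_upper, f_clk) in enumerate(SUB_RANGES):
        if f_sig_hz <= f_upper:
            return index, f_clk
    raise ValueError("frequency above the supported range")

def phase_increment(f_sig_hz):
    """Select the clock, then compute the truncated phase increment."""
    index, f_clk = select_sub_range(f_sig_hz)
    alpha = (f_sig_hz << P_BITS) // f_clk     # floor(2^p / f_clk * f_sig)
    f_real = alpha * f_clk / (1 << P_BITS)    # frequency actually synthesized
    return index, f_clk, alpha, f_real

for f_sig in (1, 250, 5_000, 1_000_000):
    print(f_sig, phase_increment(f_sig))
```

In this model the clock only changes when the requested frequency crosses a sub range boundary, which mirrors the remark that the clock management circuit is reprogrammed only on such transitions.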

As mentioned earlier, the LUT only stores one quadrant of the sine wave. However, the PA output will enumerate all the data point positions of the entire period of the waveform. The first two bits of the PA output, denoted as $PA[n+1]$ and $PA[n]$, are used to indicate the quadrant information. The next $n$ bits of the PA output, denoted as $PA[n-1:0]$, are fed to the LUT address input. Hence, the number of data points stored in the LUT is $2^n$ and the total number of data points covered by the entire period of the waveform is $2^{n+2}$. Figure 5 illustrates how the $2^{n+2}$ data points are generated from the $2^n$ entries stored in the LUT. In the example, we assume 8 entries are stored in the LUT ($n = 3$) and there are 32 ($2^{3+2}$) data points within the entire period of the sinusoidal waveform. In the figure, we use star symbols to represent data points stored in the LUT and dot symbols to represent data points that are not directly stored in the LUT. The horizontal axis denotes the index of the 32 data points, which are also the corresponding PA outputs. When the PA output is between 0 and 7, it points to the first quadrant of the waveform, whose data points are directly stored in the LUT. Thus, PA output $PA[n-1:0]$ can be directly used as the LUT address input. When the PA output reaches 8, it points to the first data point of the second quadrant, which is $\sin(\pi/2)$. Since the digital representation of the magnitude of
$\sin(\pi/2)$ is all 1's, it does not have to be stored in the LUT. When the PA output is between 9 and 15, it points to the second quadrant, in which the sinusoidal waveform is the flipped version of the first quadrant waveform with respect to the vertical line crossing PA=8. As a result, the data points with index from 9 to 15 have the same values as the data stored in the LUT from location 7 to 1. This relation is illustrated by the first and second rows in the text box of the figure. It is equivalent to folding indices 9 to 15 back with respect to index 8. When the PA output is between 17 and 23, it points to the third quadrant, in which every data point has the same magnitude but opposite sign as the data point located in the first quadrant and shares the same $PA[n-1:0]$ value. This is illustrated by the third row in the text box. Similarly, when the PA points to the fourth quadrant, its outputs (25 to 31) are folded back to address the LUT as shown in the fourth row in the text box. In summary, the index numbers vertically aligned in the text box represent data points with the same magnitude.

Figure 5: An illustrative example of PA output folding operation

The above operations are carried out by the circuit shown in Figure 6. Sign-and-magnitude numbers are used to represent the sinusoidal signal waveform. The MSB of the PA output is the sign bit. The next bit of the PA output is used to select either $PA[n-1:0]$ or its 2's complement as the LUT address. The former case represents the first and third quadrants of the waveform and the latter corresponds to the second and the fourth quadrants. It is easy to verify that the 2's complement operation is the same as the aforementioned folding operation. When $PA[n] = 1$ and $PA[n-1:0] = 00 \ldots 00$, the corresponding data points are the positive and negative peaks of the sinusoidal waveform, whose magnitude value is not stored in the LUT. The all-zero-detection circuit and the following AND gate are used to detect such cases. The output of the AND gate guides the output multiplexer to select either the LUT output or the digital code $11 \cdots 11$ as the signal magnitude accordingly.

**Figure 6:** Digital waveform generation circuit
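
The address folding described above can be sketched in a few lines of Python. This is only an illustrative model of the logic, not the authors' circuit description: the function name `fold_pa` and its return tuple are hypothetical, and the PA value is assumed to be the $(n+2)$-bit integer formed by the sign bit, the quadrant-select bit, and $PA[n-1:0]$.

```python
def fold_pa(pa, n):
    """Map an (n+2)-bit PA value to (sign, lut_addr, is_peak).

    Hypothetical helper for illustration; bit roles follow the text:
    MSB = sign bit, next bit = address-select bit, low n bits = PA[n-1:0].
    """
    mask = (1 << n) - 1
    sign = (pa >> (n + 1)) & 1      # MSB: sign of the output sample
    select = (pa >> n) & 1          # PA[n]: 0 = direct address, 1 = folded
    low = pa & mask                 # PA[n-1:0]
    # n-bit 2's complement implements the folding about the quadrant boundary
    lut_addr = (-low) & mask if select else low
    # all-zero detection AND PA[n]: waveform peak, magnitude forced to all 1's
    is_peak = bool(select and low == 0)
    return sign, lut_addr, is_peak
```

With the 5-bit example of Figure 5 ($n=3$), `[fold_pa(i, 3)[1] for i in range(9, 16)]` yields the LUT locations 7 down to 1, and indices 8 and 24 are flagged as peaks.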

### 6. Simulations and Experimental Results

To validate the above analysis and the developed techniques, both Matlab simulations and hardware experiments are conducted. The simulation and measurement results show good agreement with the results obtained from the estimation equations. Also, the results demonstrate significant frequency accuracy improvement by the proposed techniques. The outcomes of these investigations are described as follows.

#### 6.1. SNR Simulation

Matlab programs are developed to emulate the operation of the digital portion of the DDS circuits. The LUT sizes, i.e., both the $n$ and $l$ values, are varied in simulation to emulate circuit behavior with different LUT configurations. The SNR values obtained from simulations are very close to the estimates given by Equation 8. Since the simulated LUT outputs should be identical to the actual LUT outputs in hardware, the consistency between the simulation and estimation results validates the accuracy of Equation 8. Figure 7 shows SNR data for LUT address widths ($n$) of 10 and 12 bits. For each case, the bit width ($l$) of the data samples changes from 6 to 16 bits. Both simulation and estimation results indicate that once $l$ exceeds $n$, further increasing $l$ does not significantly improve the SNR. This is consistent with the earlier conclusion that selecting $n=l$ leads to a memory-efficient implementation when SNR is of great interest.
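
A simulation of this kind can be sketched as follows. The sketch is written in Python with NumPy rather than Matlab and is not the authors' program. It assumes a full-wave LUT with signed $l$-bit samples, plain phase truncation to $n$ bits, and an SNR defined as the power of the least-squares fitted fundamental over the power of the residual. The function name `lut_snr_db`, the accumulator width `acc_bits`, and the tuning word are all hypothetical choices.

```python
import numpy as np

def lut_snr_db(n, l, acc_bits=18):
    """Estimate the SNR (dB) at the output of a sine LUT with n address bits
    and l data bits. Illustrative model only; see the stated assumptions."""
    num = 1 << acc_bits
    step = 12345  # odd tuning word (assumed): every accumulator state occurs once
    phase_int = (step * np.arange(num, dtype=np.int64)) % num
    addr = phase_int >> (acc_bits - n)            # phase truncation to n bits
    full_scale = (1 << (l - 1)) - 1               # signed l-bit sample range
    table = np.round(np.sin(2 * np.pi * np.arange(1 << n) / (1 << n)) * full_scale)
    out = table[addr] / full_scale                # quantized LUT output
    # Least-squares fit of the fundamental over one full accumulator period;
    # sin and cos are orthogonal there, so the coefficients are simple projections.
    theta = 2 * np.pi * phase_int / num
    s, c = np.sin(theta), np.cos(theta)
    fit = (2.0 * np.dot(out, s) / num) * s + (2.0 * np.dot(out, c) / num) * c
    noise = out - fit                             # everything else counts as noise
    return 10 * np.log10(np.sum(fit ** 2) / np.sum(noise ** 2))
```

Sweeping `l` for a fixed `n`, for instance `[lut_snr_db(10, l) for l in range(6, 17, 2)]`, gives a curve that can be compared qualitatively with Figure 7; under these assumptions the values should level off once `l` exceeds `n`.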

Finally, Equations 8 and 16 are used to study the impact of the analog portion of the DDS circuits. With $f_{clk}=100\,\text{MHz}$ and a fourth-order smoothing filter, the SNR values at the LUT and the filter outputs are estimated and plotted in Figure 8. Two signal frequencies, 1 MHz and 15 MHz, are examined in the study. The LUT address input width is fixed at 10 bits, while the LUT data resolution varies from 8 to 14 bits. The smoothing filter bandwidth is set to 25 MHz and 20 MHz in the two studies depicted in Figure 8. For the low-frequency signal, the SNR at the final DDS output is higher than the SNR at the LUT output. This is because the smoothing filter adequately attenuates both the noise (due to limited phase and LUT resolutions) and the undesirable image energy. For the high-frequency signal, however, the signal image is not adequately attenuated by the smoothing filter, and the SNR at the DDS output becomes smaller than that at the LUT output. Aggressively reducing the filter bandwidth helps address this problem, as illustrated in Figure 8(b).

Figure 7: Estimated vs. simulated SNR (dB)

Figure 8: Estimation results with and without considering the effect of the analog circuit
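
The dependence of the image attenuation on signal frequency can be illustrated with a rough calculation. The sketch below is not Equation 16. It assumes a zero-order-hold DAC, for which the first image at $f_{clk}-f_{sig}$ sits $f_{sig}/(f_{clk}-f_{sig})$ below the fundamental, and a fourth-order Butterworth response for the smoothing filter, which the text does not specify. The function name `first_image_level_db` is hypothetical.

```python
import math

def first_image_level_db(f_sig, f_clk, f_cut, order=4):
    """Level of the first image (at f_clk - f_sig) relative to the fundamental,
    in dB, after an assumed zero-order-hold DAC and Butterworth low-pass."""
    f_img = f_clk - f_sig

    def butterworth_db(f):
        # Magnitude response of an ideal Butterworth low-pass of the given order
        return -10 * math.log10(1 + (f / f_cut) ** (2 * order))

    # Zero-order-hold sinc envelope: image-to-fundamental ratio is f_sig / f_img
    zoh_db = 20 * math.log10(f_sig / f_img)
    return zoh_db + butterworth_db(f_img) - butterworth_db(f_sig)

for f_sig, f_cut in [(1e6, 25e6), (15e6, 25e6), (15e6, 20e6)]:
    level = first_image_level_db(f_sig, 100e6, f_cut)
    print(f"f_sig = {f_sig/1e6:g} MHz, bandwidth = {f_cut/1e6:g} MHz: {level:.1f} dB")
```

Under these assumptions the image of the 15 MHz signal is far less suppressed than that of the 1 MHz signal, and narrowing the filter bandwidth from 25 MHz to 20 MHz lowers it further, which matches the trend described above.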

#### 6.2. Frequency accuracy comparison

In this investigation, we assume the DDS circuit is to cover the frequency range from 1 Hz to 50 MHz and the required absolute frequency error is $\varepsilon_f = 0.1\,\text{Hz}$ with a LUT address width of 10 bits. In the straightforward implementation, the DDS circuit uses a single clock frequency, which is 100 MHz in theory (though in practice a higher clock frequency is preferred). The drawback of this implementation is that the relative frequency error is very high in the low-frequency range, e.g., 10% at 1 Hz. To overcome this shortcoming, the proposed method partitions the frequency range into three sub-ranges, as shown in Table 1. Their corresponding clock frequencies and scaling factors $C$ are also listed in the table. Note that this partition limits the maximum clock-to-signal frequency ratio $R_{\text{max}}$ to 1000. Thus, the number of fractional bits reduces from 25 in the conventional design to 8 in the proposed design. The proposed design also significantly improves both the absolute and relative frequency errors in the low-frequency range. In the proposed design, $C$ has a resolution of 47 bits. This still requires little extra hardware, since the constants are known at design time.

Figure 9: Absolute frequency error of the conventional and proposed DDS circuits

Figure 10: Relative frequency error of the conventional and proposed DDS circuits

Table 1: Clock frequencies and calibrating constants for different frequency sub-ranges

<table>
<thead>
<tr>
<th>Freq. Subrange</th>
<th>$f_{clk}$</th>
<th>Scaling factor $C$</th>
</tr>
</thead>
<tbody>
<tr>
<td>1Hz-300Hz</td>
<td>1kHz</td>
<td>72,057,593,379,424</td>
</tr>
<tr>
<td>300Hz-100kHz</td>
<td>300kHz</td>
<td>240,191,977,932</td>
</tr>
<tr>
<td>100kHz-50MHz</td>
<td>100MHz</td>
<td>720,575,941</td>
</tr>
</tbody>
</table>
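
The sub-range selection implied by Table 1 can be sketched as a simple lookup. The sub-range edges and clock frequencies below are taken from the table; the names `SUBRANGES`, `select_clock`, and `clock_ratio`, and the choice to assign a boundary frequency to the upper sub-range, are assumptions made for illustration.

```python
# (lower edge in Hz, upper edge in Hz, clock frequency in Hz), from Table 1
SUBRANGES = [
    (1.0, 300.0, 1e3),
    (300.0, 100e3, 300e3),
    (100e3, 50e6, 100e6),
]

def select_clock(f_sig):
    """Return the clock frequency of the sub-range containing f_sig.
    Boundary frequencies are assigned to the upper sub-range (assumption)."""
    if not SUBRANGES[0][0] <= f_sig <= SUBRANGES[-1][1]:
        raise ValueError("signal frequency outside the 1 Hz to 50 MHz range")
    for _, f_hi, f_clk in SUBRANGES:
        if f_sig < f_hi:
            return f_clk
    return SUBRANGES[-1][2]  # f_sig equals the top edge of the last sub-range

def clock_ratio(f_sig):
    """Clock-to-signal frequency ratio for the selected clock."""
    return select_clock(f_sig) / f_sig
```

Evaluating `clock_ratio` at the lower edge of each sub-range (1 Hz, 300 Hz, and 100 kHz) gives 1000 in all three cases, which is the $R_{\text{max}}$ bound quoted in the text.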

Table 2: Design parameter comparison; Conventional and Proposed LUT based DDS

<table>
<thead>
<tr>
<th></th>
<th>Conventional LUT based DDS</th>
<th>Proposed LUT based DDS</th>
</tr>
</thead>
<tbody>
<tr>
<td>Bit width of $\alpha$</td>
<td>37 Bits</td>
<td>20 Bits</td>
</tr>
<tr>
<td>Bit width of $C$ (p+m)</td>
<td>N/A</td>
<td>47 Bits</td>
</tr>
<tr>
<td>Bit width of PA</td>
<td>37 Bits</td>
<td>20 Bits</td>
</tr>
<tr>
<td>PA Fraction bit width (k)</td>
<td>25 Bits</td>
<td>8 Bits</td>
</tr>
<tr>
<td>LUT address width (n)</td>
<td>10 Bits</td>
<td>10 Bits</td>
</tr>
<tr>
<td>Maximum Absolute Frequency Error</td>
<td>0.1Hz (constant)</td>
<td>$10^{-6}$ Hz when $f_{sig}$ &lt; 300 Hz<br>$3\times10^{-4}$ Hz when 300 Hz &lt; $f_{sig}$ &lt; 100 kHz<br>0.1 Hz when $f_{sig}$ = 100 kHz</td>
</tr>
<tr>
<td>Maximum Relative Frequency Error</td>
<td>10%</td>
<td>0.0001%</td>
</tr>
<tr>
<td>Clock frequency</td>
<td>Constant, 100MHz</td>
<td>Discrete steps, 1kHz, 300kHz, 100MHz</td>
</tr>
</tbody>
</table>
The absolute and relative frequency errors of the conventional and proposed implementations are compared in Figures 9 and 10, respectively. The figures show that the proposed approach significantly reduces both the absolute and relative frequency errors. For the conventional implementation, the absolute frequency error stays at the constant level of 0.1 Hz. The absolute frequency errors of the proposed method fall into three main levels. In the first two sub-ranges, the error levels are significantly lower than 0.1 Hz. The frequency error in the last sub-range initially reaches the 0.1 Hz level and then gradually decreases as the signal frequency increases. This is due to the cancellation between the errors caused by $\Delta \alpha$ and $\Delta C$. The relative frequency errors shown in Figure 10 exhibit a similar reduction. Finally, Table 2 compares the key design parameters and the frequency errors of the conventional and proposed LUT-based DDS circuits used in the above studies.
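
As a quick consistency check on the entries of Table 2, assume that the worst-case relative error of each sub-range occurs at its lower edge, where the tabulated absolute error is divided by the smallest signal frequency. The conventional design and the three sub-ranges of the proposed design then give:

\[
\frac{0.1\,\text{Hz}}{1\,\text{Hz}} = 10\%, \qquad
\frac{10^{-6}\,\text{Hz}}{1\,\text{Hz}} = \frac{3\times10^{-4}\,\text{Hz}}{300\,\text{Hz}} = \frac{0.1\,\text{Hz}}{10^{5}\,\text{Hz}} = 10^{-6} = 0.0001\%.
\]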

Figure 11: Prototype hardware and measurement setup

#### 6.3. Measurement results

Several DDS circuits with different LUT sizes are implemented on a prototype platform consisting of an FPGA, a DAC, and a passive LPF, as shown in Figure 11. The synthesized signals are captured by a data acquisition board for SNR analysis. The SNR values obtained from the measurement data are compared with the estimates from Equation 16. Figure 12 shows the comparison for LUTs with 8-, 10-, and 12-bit address inputs. For each LUT address width, the bit width of the data samples stored in the LUT is varied from 6 to 14 bits. Hence, 15 LUT configurations are examined in this study. The estimates are very close to the measurement results when the SNR values are below 65 dB. When the estimated SNR is much higher than 65 dB, the measured SNR saturates around 65 dB, which is likely caused by the noise of the measurement setup itself.

![SNR comparison graphs](image)

### 7. Conclusions

This work develops DDS design techniques to improve frequency accuracy over wide frequency ranges. It also derives analytical equations for SNR estimation that account for the design parameters in both the analog and digital portions of the DDS circuit. The effectiveness of the developed frequency accuracy enhancement techniques and the accuracy of the derived SNR estimation equations are validated by both simulation and hardware measurement results. The developed techniques, analytical equations, and design guidelines will accelerate the development of hardware-efficient DDS circuits that can maintain excellent frequency accuracy over wide frequency ranges.

### Acknowledgments

This research has been supported in part by the grant NSF IIP 1535658 and NSF I/UCRC for Embedded Systems at SIUC under grant NSF IIP 1361847. Any opinions, findings, and conclusions or recommendations expressed in this material are those of the authors and do not necessarily reflect the views of the National Science Foundation.

### References