To appear in: IEEE Transactions on Signal Processing, 1999. Copyright (c) IEEE, 1999.
This version of the paper is for reading online. For a printed version please download the Postscript file or view the PDF file to obtain best results. If you cannot see the equations online please read these notes.

Alternative Discrete-Time Operators: An Algorithm for Optimal Selection of Parameters

Andrew D. Back, Bill G. Horne, Ah Chung Tsoi, C. Lee Giles ¹

Abstract

In this note, we consider the issue of parameter sensitivity in models based on alternative discrete time operators (ADTOs). A generic first order ADTO is proposed which encompasses all the known first order ADTOs introduced so far in the literature. New bounds on the operator parameters are derived, and a new algorithm is given for optimally selecting the parameters to give minimum parameter senstivity.

Digital filters, low sensitivity, delta operator, shift operator, gamma operator.

1 Introduction

An important aspect in the development of linear models, are practical properties such as parameter convergence, model robustness, parameter sensitivity, and parsimony in representations.

In this note we consider the problem of parameter sensitivity which occurs when using a finite wordlength representation. Suppose we have a hardware implementation using two's complement arithmetic. Then any real valued number a can be represented in B bits as a = k ( -b₀ + å_{i = 1}^B-1 b_i 2ⁱ ) + e where k is some scaling factor, b₀ is the sign bit, and e is the error in the representation [1]. The question of interest, is what effect does e play in the properties of a given model for some finite B? When fixed point arithmetic is used, the issue of quantization of the signals and of the model parameters needs to be considered.

A variety of methods have been proposed to deal with the problem of quantization in linear systems. It is well known that for any linear filter transfer function there exist many possible structural realizations, all of them related through a similarity transformation. The problem of determining an optimal filter realization which minimizes the round-off noise has been solved by Mullis and Roberts [2]. Theile solved the problem of determining an optimal similarity transformation to reduce parameter sensitivity [3,4]. It has been shown by Jackson [5] that a low noise structure can also have low parameter sensitivity properties provided certain constraints are met. Various filter realizations with low round-off noise and low parameter sensitivity have been proposed in [6,7,8].

Another widely used approach to produce models with low parameter sensitivity is the method of shift operator replacement. Such operators have become known as alternative discrete time operators. Alternative discrete time operators (ADTOs) have been proposed in various forms for improving the numerical performance of control and system identification models, and for digital filters [9,10,11,12,13]. A number of ADTOs have been proposed in the literature. Some of most common examples are:

1. Delta operator: d = [(z-1)/( D)] where D is the sampling rate. This operator was introduced for digital filters in [9] to replace the shift operator. It has since been shown to have better performance than the shift operator in direct form digital filters [14] and in other applications [15,10].

2. Gamma operator: g = [(z-(1-c))/ c], where c is an adjustable parameter. This operator was introduced in [16] as an efficient basis of convolutional memory representation.

3. Rho operator: r = [(z-(1-c₁D))/( c₂D)], where c₁,c₂ are adjustable parameters. This operator was introduced in the context of robust adaptive control [12]. It is stably invertible, thus allowing the derivation of robust estimation algorithms.

4. Other structures: Bilinear transform operators were proposed independently in [17,11], a second order g operator was introduced in [18] and a number of other possibilities are mentioned in [10]. The Laguerre filter [13] also gives low sensitivity behaviour. The Kautz filter [19] is an extension of the Laguerre filter, but allows for complex poles.

While these operators have been of particular interest in the area of high speed sampling [15,10,11] and provide low parameter sensitivity models, there appears to have been little effort given to determine the optimal selection of the operator parameters. One reason for this is that the most commonly used structure is the d operator which has no adjustable parameters. Adaptive algorithms have been proposed for the g operator in the context of minimizing mean square output error, but not for producing low sensitivity models. Other work performed in this area has been directed towards finding appropriate model parameters which are based on ADTOs [15,11].

In this paper we consider a generic first order ADTO which is described in Section II. We give bounds on the operator parameters and an algorithm for the optimal selection of parameters to obtain minimum parameter sensitivity in a linear filter. A simple numerical example is given to illustrate the effectiveness of the algorithm. Conclusions are given in Section III.

2 Analysis and Selection of Operator Parameters

2.1 A Generic First Order ADTO: the n operator

For the purposes of this paper we will consider a generic first order operator given by

n =

z-c₁

c₂

(1)

where c₁,c₂ are constants. The generic operator may represent each of the previously discussed types: d, g and r operators by appropriate choice of the pararameters. The operator is stable for 0 < |c₁| < 1.

In this section, we will present some new results on the bounds on c₁ and c₂ and the optimal selection of these parameters to obtain minimum parameter sensitivity models. The derivation of these results is not difficult, though they give properties of the n operator which have not appeared previously in the literature².

2.2 Bounds on the Selection of Operator Parameters

For a polynomial H(z) = å_{i = 0}ⁿ b_i z^-i = z^-n Õ_{i = 1}ⁿ (z - l_i ), where l_i is the ith root (ie. a zero of the filter), with parameter sensitivity S_ij = -[(¶l_i)/( ¶b_j)], we seek another rational function H(n) = å_{i
= 0}ⁿ b_i^¢ n^-i = n^-n Õ_{i = 1}ⁿ (n- l_i^¢ ), l_i^¢ = [(l_i - c₁)/( c₂)]; and S_ij^¢ = -[(¶l_i^¢)/( ¶b_j^¢)], such that S_ij^¢ £ S_ij, "i,j and H(z) = H(n).

Following a similar approach to the d operator case [10,11], the parameter sensitivity is found as

S_ij^{^¢}

(l_i^¢)^n-j

Õ
k ¹ i

(l_i^¢-l_k^¢)

c₂^j-1

(l_i-c₁)^n-j

Õ
k ¹ i

(l_i-l_k)

(2)

A lower bound on c₂ may be found as follows. For polynomials B(z) = å_{i = 0}ⁿb_izⁱ and B(n) = å_{i = 0}ⁿb_i^¢nⁱ it can be shown that the parameters b_j and b_j^¢ are related by the following formula:

b_i^¢ = c₂ⁱ

n
å
j = i

b_j

æ
ç
è

ö
÷
ø

c₁^j-i

(3)

Hence, it is clear that if c₂ < 1, then b_n^¢® 0 as n becomes large. Consider the leading coefficient, b_n^¢ = b_nc₂ⁿ. This term vanishes as n becomes large which implies that for FWL implementations of polynomials, the coefficients may end up getting truncated to zero if c₂ is too small. A lower bound on c₂ is given by

c₂

2^{-[((b-1))/ n]}

(4)

where b is the wordlength in bits used to represent c₂.

This lower bound can be used to select either a minimum c₂ value for a given number of bits representation, or a minimum word length for a given c₂ value. There is not any practical upper bound on c₂, since if max_i S^¢_ij(c₁,c₂) occurs for j = 0, then c₂ should be as large as possible. Note that from these constraints, the operator n will always be a low pass filter.

2.3 Optimal Operator Sensitivity

The next question we seek to answer, is how to select c₁ and c₂ to optimize this improvement in sensitivity for a given FWL implementation³.

From (2) we know the sensitivity of the roots of H(n), is a function of the coefficients of the corresponding polynomial in c₁, c₂. Note that the sensitivity S^¢_ij(c₁,c₂) consists of a matrix of functions, one corresponding to each root and coefficient. The approach we adopt, is to seek to find the overall operator parameters c₁, c₂ which will simultaneously minimize the largest of all of the sensitivities given by

max
i,j

| S^¢_ij(c₁,c₂)|

(5)

To minimize (5) with respect to c₁, it is sufficient to simultaneously minimize the maximum of the partial sensitivity functions f_i defined as

f_i(c₁)

| l_i-c₁| ⁿ

Õ
k ¹ i

|l_i-l_k|

i = 1,...,n

(6)

To minimize the maximum of (6) over i = 1,...,n we proceed as follows. First, each f_i is a simple function concave upwards everywhere, and has a unique minimum at c_1i^* = Â{l_i}, where Â{ l} indicates the real part of l. Since max_if_i may be formed piecewise from various f_i i = 1..n, then it is also concave upwards. It follows that the required value of c₁^* which will minimize max_if_i may occur at one of the c_1i^*, or where two of the functions, denoted by f_i^* and f_j^*, intersect. The required intersecting curves f_i^* and f_j^* are those which define, piecewise, max_i f_i(c₁) , i = 1,...,n.

The method we use to find the functions f_i^* and f_j^* is similar to that used in convex hull algorithms [20]. Firstly we find the minimums of each individual f_i i = 1,...,n, and the intersection of each possible pair of functions. We then perform a search to find the maximum value of the function or functions (intersecting case). We then choose c₁^* as the value which will minimize max_if_i.

The intersection of f_i^* and f_j^* occurs when

| l_i^*-c₁| ⁿ

Õ
k ¹ i^*

| l_i^*-l_k|

| l_j^*-c₁| ⁿ

Õ
k ¹ j^*

| l_j^*-l_k|

(7)

Defining d_j = [ Õ_{k ¹ j}| l_j-l_k| ] ^2/n, we may rewrite (7) as a quadratic equation in the unknown c₁,

a^¢ c₁² + b^¢ c₁+ c^¢ = 0 ,

(8)

where a^¢ = [1/( d_i^*)]-[1/( d_j^*)],

b^¢ = [(2Â{l_j^*})/( d_j^*)] - [(2Â{l_i^*})/( d_i^*)] and

c^¢ = [(| l_i^*| ²)/( d_i^*)] - [(| l_j^*| ²)/( d_j^*)]. The roots of this quadratic function may be real, real and repeated, or complex and give the intersection point.

If the roots of (8) are real and unique, then f_i^* and f_j^* will intersect at two points. One of those points will be the required minimum, corresponding to c₁^* (see Fig. 1(a)). If the roots of (8) are repeated, then the curves intersect at only one point, giving c₁^* directly. If the roots are complex, the curves f_i^* and f_j^* do not intersect(see Fig. 1(b)). In this case, the optimal c₁^* value can be found simply at the minimum of max(f_i^*, f_j^*). This point can be obtained by taking the derivative of max(f_i^*, f_j^*) with respect to c₁. A simple test for such complex roots is that l_i^*2 + l_j^*2 < 2 l_i^* l_j^*.

The method for determining the true values i^* and j^* is shown in shown in Table I. Lack of space does not permit a detailed description, however the procedure essentially compares successive c₁^* values until convergence to find the min-max solution on the convex hull.

Figure 1: (a) Intersection of f_i^* and f_j^* for the case where the roots of the quadratic equation are real. (b) For the case where the roots are complex, the curves do not intersect. (refer to Postscript or PDF versions of paper).

Once we have c₁^*, we then proceed to find c₂^*. For any given value of c₁, it is possible to choose c₂ such that S^¢_ij(c₁^*,c₂) is a maximum for either j = 0 or j = n. Using (2), substituting j = n, we have

max
i

| S_in^¢(c₁^*,c₂)|

c₂^n-1

max
i

g_i

(9)

where g_i = [1/( Õ_{k ¹ i}| l_i-l_k|)] and similarly substituting j = 0 in (2) we have,

max
i

| S_i0^¢(c₁^*,c₂)|

c₂^-1

max
i

f_i(c₁^* )

(10)

Note that (9) is an increasing function with c₂, while (10) is a decreasing function with c₂. Since max_i g_i and max_i f_i(c₁ ) are both positive reals, c₂^* can be found at the intersection of (9) and (10) which minimizes max_ij | S_ij^¢(c₁^*,c₂) |. Hence c₂^* can be found by solving for the equality of (9) and (10) to give

c₂

é
ê
ê
ê
ê
ê
ë

max
i

f_i(c₁)

max
i

g_i

ù
ú
ú
ú
ú
ú
û

1/n

(11)

The algorithm is summarized in Table I. To illustrate the ideas contained in the paper we give a numerical example below.

Table 1: Algorithm for determining optimal n operator parameters.


		For B(z), sort l_i i = 1...n: Â{l₁} < ¼ < Â{l_n}
		let i^* = 1, j^* = n
		// Determine true i^, j^ to find optimal c₁^*.
		do
		for i = 2,..,n-1
		function c_1i^¢ = compute c1(i,j^*)
		if f_i(c_1i^¢) > f_i^(c₁^)
		i^*: = i fi
		end
		for j = n-1,..,2
		function c_1j^¢ = compute c1(i^*,j)
		if f_j(c_1j^¢) > f_i^(c₁^)
		j^*: = j fi
		end
		until convergence of c₁^*
		if c₁^* > 1.0 then c₁^* = 1.0
		c₂^* = ( [(max_if_i(c₁))/( max_jg_j)] ) ^1/n.

		function compute c₁(i,j)
		// For any f_i and f_j, find the min-max solution point c₁:.
		if l_i²+l_j² < 2l_il_j
		// Test for nonintersecting f_i and f_j.
		// Roots of (8) are complex.
		For any c₁, find [i\tilde] = argmax_{i = i,j}f_i( c₁) .
		c₁ = Â{l_[i\tilde]}.
		else
		// Roots of (8) are real.
		For i and j, solve (8) for c₁.
		fi

Example: Selection of parameters for a linear FIR model

Consider an FIR model given by:

G(z)

1.0 - 7.58z^-1 + 24.3681z^-2 - 42.4387z^-3

- 23.1061z^-4 + 5.4960z^-5

(12)

The maximum parameter sensitivity of each of this filter is max_i,jS_ij = 1829. The model is implemented using ADTOs and applying the proposed algorithm, we obtain parameter values c₁ = 1.0 and c₂ = 0.705. The maximum parameter sensitivity now obtained is max_i,jS^¢_ij = 21, an improvement of more than 87 times over the usual shift operator case.

To see the effect this has on the actual performance of the filter, we simulate the models with 8 bit wordlength. It can be easily observed in Fig. 2 that the H(z) model is subject to significant changes in the zero positions due to the FWL implementation. The difference between the true zeros (shown as 'o') and the zeros due to coefficient rounding (shown as '*') can be clearly seen. In Fig. 2(b) the performance of the H(n) model is shown. Here there is no significant change in the zero positions due to coefficient rounding. From this example⁴, it is evident that the n operator model exhibits significantly better performance than the shift operator model.

Figure 2: Example results showing the zero positions for: (a) H(z) and (b) H(n), due to coefficient rounding using an 8 bit wordlength. In each case the true zeros are shown as circles and the perturbed zeros due to coefficient rounding are shown using the * symbol. The lower parameter sensitivity of the n operator model gives significantly reduced change in the zeros when the coefficients are rounded (refer to Postscript or PDF versions of paper)..

3 Conclusions

In this paper we briefly reviewed a number of discrete time operators which may be used in place of the usual shift operator in digital filters. These operators provide a means of giving significantly improved performance under conditions of parameter perturbation. As a means of dealing with the various ADTOs, we considered a generic first order operator, termed the n operator. We derived an algorithm for choosing optimal parameters in terms of reducing the maximum parameter sensivity. The effective performance of the algorithm has been shown in experiments, and we show a typical result in this paper.

It is apparent that alternative discrete-time operators offer a useful approach for deriving low sensitivity linear models which can give significantly improved numerical accuracy when compared with conventional shift operator structures.

References

[1]: R.A. Roberts and C.T. Mullis, Digital Signal Processing, Addison-Wesley, 1987.
[2]: C. T. Mullis and R. A. Roberts, ``Synthesis of minimum roundoff fixed point digital filters,'' IEEE Trans. Circuits, Syst., pp. 551-562, 1976.
[3]: L. Thiele, ``Design of sensitivity and roundoff noise optimal state-space discrete systems,'' Int. J. Circuit Theory Appl., vol. 12, pp. 39-46, 1984.
[4]: L. Thiele, ``On the sensitivity of linear state-space systems,'' IEEE Trans. Circuits, Syst., vol. CAS-33, pp. 502-510, 1986.
[5]: L.B. Jackson, Digital Filters and Signal Processing, Kluwer Academic Publishers, Norwell,MA, 1996.
[6]: L.B. Jackson, A.G. Lindgren, and Y. Kin, ``Optimal synthesis of second order state space structures for digital filters,'' IEEE Trans. Circuits, Syst., vol. CAS-26, no. 3, pp. 149-153, 1979.
[7]: C.W. Barnes, ``Roundoff noise and overflow in normal digital filtering,'' IEEE Trans. Circuits, Syst., vol. CAS-26, no. 3, pp. 154-159, 1979.
[8]: W. L. Mills, C. T. Mullis, and R. A. Roberts, ``Low roundoff noise and normal realizations of fixed point IIR digital filters,'' IEEE Trans. Acoustics, Speech, and Signal Processing, vol. ASSP-29, no. 4, pp. 893, 1981.
[9]: R.C. Agarwal and C.S. Burrus, ``New recursive digital filter structures having very low sensitivity and roundoff noise,'' IEEE Trans. Circuits, Syst., vol. CAS-22, no. 12, pp. 921-927, 1975.
[10]: G.C. Goodwin, R.H. Middleton, and H.V. Poor, ``High-speed digital signal processing and control,'' Proc. IEEE, vol. 80, no. 2, pp. 240-259, 1992.
[11]: M. Gevers and G. Li, Parameterizations in control, estimation and filtering problems: accuracy aspects, Springer-Verlag, London, 1993.
[12]: M. Palaniswami, ``Digital estimation and control with a new discrete time operator,'' in Proc. 30th IEEE Conf. Decision and Control, New York, NY, 1991, IEEE Press, pp. 1631-1632.
[13]: B. Wahlberg, ``System identification using laguerre models,'' IEEE Trans. Automat. Control, vol. 36, pp. 551-562, 1991.
[14]: D. Williamson, ``Delay replacement in direct form structures,'' IEEE Trans. Acoust., Speech, Signal Processing, vol. 34, no. 4, pp. 453-460, Apr. 1988.
[15]: H. Fan and Q. Li, ``A d-operator recursive gradient algorithm for adaptive signal processing,'' in Proc. IEEE Int. Conf. Acoust. Speech, Signal Proc. IEEE Press, 1993, vol. III, pp. 492-495.
[16]: B. de Vries and J.C. Principe, ``A theory for neural networks with time delays,'' in Advances in Neural Information Processing Systems, R.P. Lippman, J.E. Moody, and D. S. Touretzky, Eds., San Mateo, CA, 1991, Morgan Kaufmann, vol. 3, pp. 162-168.
[17]: A.D. Back and A.C. Tsoi, ``A comparison of discrete-time operator models for nonlinear system identification,'' in Advances in Neural Information Processing Systems, G. Tesauro, D. S. Touretzky, and T. K. Leen, Eds., Cambridge, MA, 1995, The MIT Press, vol. 7, pp. 883-890.
[18]: T. Oliveira e Silva, P.G. de Oliveira, J.C. Principe, and B. de Vries, ``Generalized feedforward filters with complex poles,'' in Proc. of the 1992 IEEE Workshop Neural Networks for Signal Processing 2 (NNSP92), S.Y. Kung, F. Fallside, J. Aa. Sorenson, and C.A. Kamm, Eds., Piscataway, NJ, 1992, IEEE Press, pp. 503-510.
[19]: P. Lindskog and B. Wahlberg, ``Application of Kautz models in system identification,'' in Proc. 12th Triennial World Congress. IFAC, 1993, pp. 41-44.
[20]: J.R. Smart, Modern Geometries, Brooks/Cole, New York, 1998.

Footnotes:

¹ Andrew Back is with the Brain Science Institute, RIKEN, 2-1 Hirosawa, Wako-shi, Saitama 351-0198 Japan. Email: back@brain.riken.go.jp. Bill Horne is with AADM Inc., 9 Pace Farm Road, Califon, NJ 07830,USA. Ah Chung Tsoi is with the Faculty of Informatics, University of Wollongong, Northfields Avenue, Wollongong, NSW 2522, Australia. C. Lee Giles is with NEC Research Institute, 4 Independence Way, Princeton, NJ 08540, USA and also with the Institute for Advanced Computer Studies, University of Maryland, College Park, MD 20742 USA. ADB acknowledges financial support from the Australian Research Council and the Brain Science Institute, RIKEN, Japan. ACT acknowledges partial support from the Australian Research Council. Copyright (c) IEEE, 1999.

² The lower bound of c₂ is a generalization of the results derived for the shift operator which appeared in [11]. The results on optimal parameter sensitivity are new, as far as we are aware.

³ If infinite precision was possible, we could improve the sensitivity to some arbitrary value by simply making c₂ as small as we like.

⁴ We have conducted a number of experiments using different models and they show similar performance improvements.

File translated from T_EX by T_TH, version 1.57.

Alternative Discrete-Time Operators: An Algorithm for Optimal Selection of Parameters

Andrew D. Back, Bill G. Horne, Ah Chung Tsoi, C. Lee Giles 1

Abstract

1 Introduction

2 Analysis and Selection of Operator Parameters

2.1 A Generic First Order ADTO: the n operator

2.2 Bounds on the Selection of Operator Parameters

2.3 Optimal Operator Sensitivity

Example: Selection of parameters for a linear FIR model

3 Conclusions

References

Footnotes:

Andrew D. Back, Bill G. Horne, Ah Chung Tsoi, C. Lee Giles ¹