To appear in: IEEE Transactions on Signal Processing,
1999. Copyright (c) IEEE, 1999.
In this note, we consider the issue of parameter sensitivity in models based on alternative discrete time operators (ADTOs). A generic first order ADTO is proposed which encompasses all the first order ADTOs introduced so far in the literature. New bounds on the operator parameters are derived, and a new algorithm is given for optimally selecting the parameters to give minimum parameter sensitivity.
Index Terms: Digital filters, low sensitivity, delta operator, shift operator, gamma operator.
Important aspects in the development of linear models are practical properties such as parameter convergence, model robustness, parameter sensitivity, and parsimony of representation.
In this note we consider the problem of parameter sensitivity which arises when a finite wordlength (FWL) representation is used. Suppose we have a hardware implementation using two's complement arithmetic. Then any real valued number a can be represented in B bits as a = k ( b_{0} + ∑_{i = 1}^{B−1} b_{i} 2^{−i} ) + e, where k is some scaling factor, b_{0} is the sign bit, and e is the error in the representation [1]. The question of interest is: what effect does e have on the properties of a given model for some finite B? When fixed point arithmetic is used, quantization of both the signals and the model parameters needs to be considered.
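As a concrete illustration, the rounding step above can be sketched as follows. This is a minimal sketch: the function name and the sign-bit-plus-fractional-bits layout are our own illustrative choices, not the paper's notation.

```python
def quantize(a, B, k=1.0):
    """Round a to the nearest value representable with a sign bit and
    B - 1 fractional bits, scaled by k.  The representation error is
    e = a - quantize(a, B, k), with |e| bounded by k * 2**-B."""
    step = k * 2.0 ** -(B - 1)      # smallest representable increment
    return round(a / step) * step

# example: an 8-bit representation of 0.7
aq = quantize(0.7, 8)               # 0.703125
err = 0.7 - aq                      # |err| <= 2**-8
```

The error e in the text is exactly this rounding residual, and it shrinks by half for each additional bit.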
A variety of methods have been proposed to deal with the problem of quantization in linear systems. It is well known that for any linear filter transfer function there exist many possible structural realizations, all of them related through a similarity transformation. The problem of determining an optimal filter realization which minimizes the roundoff noise was solved by Mullis and Roberts [2]. Thiele solved the problem of determining an optimal similarity transformation to reduce parameter sensitivity [3,4]. It has been shown by Jackson [5] that a low noise structure can also have low parameter sensitivity properties provided certain constraints are met. Various filter realizations with low roundoff noise and low parameter sensitivity have been proposed in [6,7,8].
Another widely used approach to producing models with low parameter sensitivity is the method of shift operator replacement. The replacement operators have become known as alternative discrete time operators (ADTOs), and have been proposed in various forms for improving the numerical performance of control and system identification models, and of digital filters [9,10,11,12,13]. Some of the most common examples are:
1. Delta operator: δ = (z − 1)/Δ, where Δ is the sampling period. This operator was introduced for digital filters in [9] to replace the shift operator. It has since been shown to have better performance than the shift operator in direct form digital filters [14] and in other applications [15,10].
2. Gamma operator: γ = (z − (1 − c))/c, where c is an adjustable parameter. This operator was introduced in [16] as an efficient basis for convolutional memory representation.
3. Rho operator: ρ = (z − (1 − c_{1}Δ))/(c_{2}Δ), where c_{1}, c_{2} are adjustable parameters. This operator was introduced in the context of robust adaptive control [12]. It is stably invertible, thus allowing the derivation of robust estimation algorithms.
4. Other structures: Bilinear transform operators were proposed independently in [17,11], a second order γ operator was introduced in [18], and a number of other possibilities are mentioned in [10]. The Laguerre filter [13] also gives low sensitivity behaviour. The Kautz filter [19] extends the Laguerre filter to allow complex poles.
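For intuition, the inverse of the gamma operator above acts as a leaky integrator. A minimal sketch, where the function name is ours and the recursion is simply the difference equation implied by γ = (z − (1 − c))/c:

```python
def gamma_inverse(u, c):
    """Filter the sequence u through 1/gamma = c / (z - (1 - c)),
    i.e. the recursion y[k+1] = (1 - c)*y[k] + c*u[k], with y[0] = 0."""
    y, out = 0.0, []
    for uk in u:
        out.append(y)
        y = (1.0 - c) * y + c * uk
    return out

# c = 1 recovers a pure unit delay, i.e. the shift-operator memory
```

The adjustable parameter c trades delay depth against smoothing, which is what makes the gamma basis an efficient convolutional memory.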
While these operators have been of particular interest in the area of high speed sampling [15,10,11] and provide low parameter sensitivity models, little effort appears to have been devoted to the optimal selection of the operator parameters. One reason for this is that the most commonly used structure, the δ operator, has no adjustable parameters. Adaptive algorithms have been proposed for the γ operator in the context of minimizing mean square output error, but not for producing low sensitivity models. Other work in this area has been directed towards finding appropriate model parameters for models based on ADTOs [15,11].
In this paper we consider a generic first order ADTO which is described in Section II. We give bounds on the operator parameters and an algorithm for the optimal selection of parameters to obtain minimum parameter sensitivity in a linear filter. A simple numerical example is given to illustrate the effectiveness of the algorithm. Conclusions are given in Section III.
For the purposes of this paper we will consider a generic first order operator given by
ν = (z − c_{1})/c_{2}     (1)
where c_{1}, c_{2} are constants. The generic operator may represent each of the previously discussed types (the δ, γ and ρ operators) by appropriate choice of the parameters. The operator is stable for 0 < c_{1} < 1.
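The parameter choices that recover the operators of the previous section can be sketched as follows. This is a sketch under our reading of the definitions above; the helper name is illustrative.

```python
# nu = (z - c1) / c2 specializes to the named operators:
#   delta: c1 = 1,           c2 = Delta       (sampling period Delta)
#   gamma: c1 = 1 - c,       c2 = c
#   rho:   c1 = 1 - a*Delta, c2 = b*Delta     (rho's own parameters a, b)
def to_nu_root(lam, c1, c2):
    """Map a z-plane root lam to the nu plane: lam' = (lam - c1) / c2."""
    return (lam - c1) / c2

# with the delta choice, a zero at z = 1 maps to the nu-plane origin
```

This root map is the key quantity in the sensitivity analysis that follows: the same zeros are simply re-expressed in the ν plane.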
In this section, we present some new results on the bounds on c_{1} and c_{2} and on the optimal selection of these parameters to obtain minimum parameter sensitivity models. The derivation of these results is not difficult, but they establish properties of the ν operator which have not appeared previously in the literature^{2}.
For a polynomial H(z) = ∑_{i = 0}^{n} b_{i} z^{−i} = z^{−n} ∏_{i = 1}^{n} (z − λ_{i}), where λ_{i} is the ith root (i.e. a zero of the filter), with parameter sensitivity S_{ij} = ∂λ_{i}/∂b_{j}, we seek another rational function H(ν) = ∑_{i = 0}^{n} b_{i}′ ν^{−i} = ν^{−n} ∏_{i = 1}^{n} (ν − λ_{i}′), with λ_{i}′ = (λ_{i} − c_{1})/c_{2} and S_{ij}′ = ∂λ_{i}′/∂b_{j}′, such that |S_{ij}′| ≤ |S_{ij}| for all i, j, and H(z) = H(ν).
Following a similar approach to the δ operator case [10,11], the parameter sensitivity is found to be
S_{ij}′ = ∂λ_{i}′/∂b_{j}′ = −(λ_{i}′)^{n−j} / ∏_{k ≠ i} (λ_{i}′ − λ_{k}′) = −(λ_{i} − c_{1})^{n−j} c_{2}^{j−1} / ∏_{k ≠ i} (λ_{i} − λ_{k})     (2)
A lower bound on c_{2} may be found as follows. For polynomials B(z) = ∑_{i = 0}^{n} b_{i} z^{i} and B(ν) = ∑_{i = 0}^{n} b_{i}′ ν^{i} it can be shown that the parameters b_{j} and b_{j}′ are related by the following formula:
b_{j}′ = c_{2}^{j} ∑_{i = j}^{n} C(i,j) c_{1}^{i−j} b_{i},  j = 0, …, n     (3)
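The coefficient relation is easy to check numerically: the sketch below re-expands B(z) in powers of ν via the substitution z = c_{2}ν + c_{1}. This is our own implementation of the binomial re-expansion; in particular it reproduces the leading-coefficient scaling b_{n}′ = b_{n}c_{2}^{n} discussed next.

```python
from math import comb

def transform_coeffs(b, c1, c2):
    """Given B(z) = sum_i b[i] * z**i, return b' such that
    B(c2*nu + c1) = sum_j b'[j] * nu**j  (binomial re-expansion)."""
    n = len(b) - 1
    return [c2 ** j * sum(comb(i, j) * c1 ** (i - j) * b[i]
                          for i in range(j, n + 1))
            for j in range(n + 1)]

# the leading coefficient scales as b'[n] = b[n] * c2**n
```

Evaluating both sides at any common point (z = c_{2}ν + c_{1}) confirms the expansion is exact.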
Hence, it is clear that if c_{2} < 1, the leading coefficient b_{n}′ = b_{n}c_{2}^{n} vanishes as n becomes large. This implies that in FWL implementations of polynomials, the coefficients may end up being truncated to zero if c_{2} is too small. A lower bound on c_{2} is given by
c_{2} ≥ 2^{−(b−1)/n}     (4)
where b is the wordlength in bits used to represent c_{2}.
This lower bound can be used to select either a minimum c_{2} value for a given number of bits, or a minimum wordlength for a given c_{2} value. There is no practical upper bound on c_{2}, since if max_{i} |S_{ij}′(c_{1},c_{2})| occurs for j = 0, then c_{2} should be as large as possible. Note that under these constraints, the operator ν will always be a low pass filter.
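The two uses of the bound can be sketched as below. This assumes one particular reading of (4): the smallest magnitude representable with a sign bit and b − 1 fractional bits is 2^{−(b−1)}, and the leading-coefficient scale c_{2}^{n} must not fall below it. Both helper names are ours.

```python
from math import ceil, log2

def min_wordlength(c2, n):
    """Smallest b with c2**n >= 2**-(b - 1), i.e. b >= 1 + n*log2(1/c2).
    Assumed reading of the lower bound; illustrative only."""
    return max(1, ceil(1 + n * log2(1.0 / c2)))

def min_c2(b, n):
    """Dual form: smallest c2 for a given wordlength b: c2 = 2**(-(b-1)/n)."""
    return 2.0 ** (-(b - 1) / n)
```

For example, an order-8 polynomial with c_{2} = 0.5 needs at least 9 bits before the leading coefficient survives rounding under this reading.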
The next question we seek to answer is how to select c_{1} and c_{2} to optimize this improvement in sensitivity for a given FWL implementation^{3}.
From (2) we know that the sensitivity of the roots of H(ν) is a function of the coefficients of the corresponding polynomial and of c_{1}, c_{2}. Note that the sensitivity S_{ij}′(c_{1},c_{2}) consists of a matrix of functions, one corresponding to each root and coefficient. The approach we adopt is to seek the operator parameters c_{1}, c_{2} which simultaneously minimize the largest of all of the sensitivities, given by
(c_{1}*, c_{2}*) = arg min_{c_{1},c_{2}} max_{i,j} |S_{ij}′(c_{1},c_{2})|     (5)
To minimize (5) with respect to c_{1}, it is sufficient to minimize the maximum of the partial sensitivity functions f_{i} defined as
f_{i}(c_{1}) = |λ_{i} − c_{1}|^{n} / ∏_{k ≠ i} |λ_{i} − λ_{k}|,  i = 1, …, n     (6)
To minimize the maximum of (6) over i = 1,…,n we proceed as follows. First, each f_{i} is concave upwards everywhere and has a unique minimum at c_{1i}* = Re{λ_{i}}, where Re{λ} denotes the real part of λ. Since max_{i} f_{i} is formed piecewise from the various f_{i}, i = 1,…,n, it is also concave upwards. It follows that the value c_{1}* which minimizes max_{i} f_{i} may occur either at one of the c_{1i}*, or where two of the functions, denoted f_{i*} and f_{j*}, intersect. The required intersecting curves f_{i*} and f_{j*} are those which define, piecewise, max_{i} f_{i}(c_{1}), i = 1,…,n.
The method we use to find the functions f_{i*} and f_{j*} is similar to that used in convex hull algorithms [20]. First we find the minimum of each individual f_{i}, i = 1,…,n, and the intersection of each possible pair of functions. We then search for the maximum value of the function (or functions, in the intersecting case), and choose c_{1}* as the value which minimizes max_{i} f_{i}.
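A brute-force numerical check of this minimax construction can be sketched as follows, assuming the form of f_{i} reconstructed here, f_{i}(c_{1}) = |λ_{i} − c_{1}|^{n} / ∏_{k ≠ i}|λ_{i} − λ_{k}|; all function names are illustrative.

```python
def f(i, c1, roots):
    """Partial sensitivity f_i(c1) = |lam_i - c1|**n / prod_{k != i} |lam_i - lam_k|."""
    n = len(roots)
    prod = 1.0
    for k, lam in enumerate(roots):
        if k != i:
            prod *= abs(roots[i] - lam)
    return abs(roots[i] - c1) ** n / prod

def minimax_c1(roots, grid=2001):
    """Scan c1 over the span of the root real parts and return the value
    minimizing max_i f_i(c1) -- a brute-force stand-in for the search."""
    lo = min(r.real for r in roots)
    hi = max(r.real for r in roots)
    best_c1, best_val = lo, float("inf")
    for t in range(grid):
        c1 = lo + (hi - lo) * t / (grid - 1)
        val = max(f(i, c1, roots) for i in range(len(roots)))
        if val < best_val:
            best_c1, best_val = c1, val
    return best_c1
```

For two symmetric real roots the minimax point lands midway between them, as the piecewise-parabola picture suggests.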
The intersection of f_{i*} and f_{j*} occurs when
|λ_{i*} − c_{1}|^{n} / ∏_{k ≠ i*} |λ_{i*} − λ_{k}| = |λ_{j*} − c_{1}|^{n} / ∏_{k ≠ j*} |λ_{j*} − λ_{k}|     (7)
Defining d_{j} = [ ∏_{k ≠ j} |λ_{j} − λ_{k}| ]^{2/n}, we may rewrite (7) as a quadratic equation in the unknown c_{1},
a′ c_{1}^{2} + b′ c_{1} + c′ = 0     (8)
where a′ = 1/d_{i*} − 1/d_{j*}, b′ = 2Re{λ_{j*}}/d_{j*} − 2Re{λ_{i*}}/d_{i*}, and c′ = |λ_{i*}|^{2}/d_{i*} − |λ_{j*}|^{2}/d_{j*}. The roots of this quadratic may be real, real and repeated, or complex, and give the intersection points.
If the roots of (8) are real and distinct, then f_{i*} and f_{j*} intersect at two points, one of which is the required minimum corresponding to c_{1}* (see Fig. 1(a)). If the roots of (8) are repeated, the curves intersect at only one point, giving c_{1}* directly. If the roots are complex, the curves f_{i*} and f_{j*} do not intersect (see Fig. 1(b)). In this case, the optimal value c_{1}* is found simply at the minimum of max(f_{i*}, f_{j*}), which can be obtained by setting the derivative of max(f_{i*}, f_{j*}) with respect to c_{1} to zero. A simple test for such complex roots is b′^{2} < 4a′c′.
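The intersection test and quadratic solve can be sketched directly from the coefficients a′, b′, c′ above. The helper name is ours; the complex-root case returns no intersection, matching the non-intersecting situation of Fig. 1(b).

```python
from math import sqrt

def intersect_c1(lam_i, lam_j, d_i, d_j):
    """Roots of a'*c1**2 + b'*c1 + c' = 0 with
    a' = 1/d_i - 1/d_j,
    b' = 2*Re(lam_j)/d_j - 2*Re(lam_i)/d_i,
    c' = |lam_i|**2/d_i - |lam_j|**2/d_j."""
    a = 1.0 / d_i - 1.0 / d_j
    b = 2.0 * lam_j.real / d_j - 2.0 * lam_i.real / d_i
    c = abs(lam_i) ** 2 / d_i - abs(lam_j) ** 2 / d_j
    if a == 0.0:                       # degenerate: equation is linear
        return [-c / b] if b != 0.0 else []
    disc = b * b - 4.0 * a * c
    if disc < 0.0:                     # complex roots: curves do not intersect
        return []
    r = sqrt(disc)
    return [(-b - r) / (2.0 * a), (-b + r) / (2.0 * a)]
```

When d_{i*} = d_{j*} the quadratic degenerates to a linear equation, which the sketch handles explicitly.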
The method for determining the true values i* and j* is shown in Table I. Lack of space does not permit a detailed description; the procedure essentially compares successive c_{1}* values until convergence to find the minimax solution on the convex hull.
Figure 1: (a) Intersection of f_{i*} and f_{j*} for the case where the roots of the quadratic equation are real. (b) For the case where the roots are complex, the curves do not intersect. (Refer to the Postscript or PDF versions of the paper.)
Once we have c_{1}*, we then proceed to find c_{2}*. For any given value of c_{1}, it is possible to choose c_{2} such that |S_{ij}′(c_{1}*,c_{2})| is maximized at either j = 0 or j = n. Using (2) and substituting j = n, we have
|S_{in}′(c_{1},c_{2})| = c_{2}^{n−1} g_{i}     (9)
where g_{i} = 1 / ∏_{k ≠ i} |λ_{i} − λ_{k}|, and similarly substituting j = 0 in (2) we have
|S_{i0}′(c_{1},c_{2})| = |λ_{i} − c_{1}|^{n} g_{i} / c_{2} = f_{i}(c_{1}) / c_{2}     (10)
Note that (9) is an increasing function of c_{2}, while (10) is a decreasing function of c_{2}. Since max_{i} g_{i} and max_{i} f_{i}(c_{1}) are both positive reals, c_{2}* is found at the intersection of (9) and (10), which minimizes max_{i,j} |S_{ij}′(c_{1}*,c_{2})|. Hence c_{2}* can be found by equating (9) and (10), to give
c_{2}* = ( max_{i} f_{i}(c_{1}*) / max_{j} g_{j} )^{1/n}     (11)
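This final step can be checked numerically, assuming g_{i} = 1/∏_{k ≠ i}|λ_{i} − λ_{k}| and f_{i}(c_{1}) = |λ_{i} − c_{1}|^{n} g_{i} as reconstructed above; the function name is illustrative.

```python
def optimal_c2(roots, c1):
    """c2* = (max_i f_i(c1) / max_j g_j) ** (1/n): the c2 equating the
    j = n and j = 0 worst-case sensitivities."""
    n = len(roots)
    g = []
    for i, li in enumerate(roots):
        p = 1.0
        for k, lk in enumerate(roots):
            if k != i:
                p *= abs(li - lk)
        g.append(1.0 / p)
    f_max = max(abs(li - c1) ** n * gi for li, gi in zip(roots, g))
    return (f_max / max(g)) ** (1.0 / n)

# for two roots placed symmetrically about c1, c2* equals |lam_i - c1|
```

At this c_{2} the two monotone sensitivity curves cross, so neither the j = 0 nor the j = n term dominates.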
The algorithm is summarized in Table I. To illustrate the ideas contained in the paper we give a numerical example below.
Table 1: Algorithm for determining the optimal ν operator parameters.

For B(z), sort λ_{i}, i = 1…n: Re{λ_{1}} < … < Re{λ_{n}}
let i* = 1, j* = n
// Determine the true i*, j* to find the optimal c_{1}*.
do
  for i = 2,…,n−1
    c_{1i}′ = compute_c1(i, j*)
    if f_{i}(c_{1i}′) > f_{i*}(c_{1}*)
      i* := i
    fi
  end
  for j = n−1,…,2
    c_{1j}′ = compute_c1(i*, j)
    if f_{j}(c_{1j}′) > f_{i*}(c_{1}*)
      j* := j
    fi
  end
until convergence of c_{1}*
if c_{1}* > 1.0 then c_{1}* := 1.0
c_{2}* = ( max_{i} f_{i}(c_{1}*) / max_{j} g_{j} )^{1/n}

function compute_c1(i, j)
  // For any f_{i} and f_{j}, find the minimax solution point c_{1}:
  if b′^{2} < 4a′c′
    // Test for nonintersecting f_{i} and f_{j}: roots of (8) are complex.
    For any c_{1}, find ĩ = argmax_{i ∈ {i,j}} f_{i}(c_{1}).
    c_{1} := Re{λ_{ĩ}}
  else
    // Roots of (8) are real.
    For i and j, solve (8) for c_{1}.
  fi
Consider an FIR model given by:
(12) (refer to the Postscript or PDF versions of the paper)
The maximum parameter sensitivity of this filter is max_{i,j} |S_{ij}| = 1829. When the model is implemented using ADTOs and the proposed algorithm is applied, we obtain the parameter values c_{1} = 1.0 and c_{2} = 0.705. The maximum parameter sensitivity is then max_{i,j} |S_{ij}′| = 21, an improvement of more than 87 times over the usual shift operator case.
To see the effect this has on the actual performance of the filter, we simulate the models with an 8 bit wordlength. It can easily be observed in Fig. 2 that the H(z) model is subject to significant changes in the zero positions due to the FWL implementation: the difference between the true zeros (shown as 'o') and the zeros after coefficient rounding (shown as '*') is clearly visible. Fig. 2(b) shows the performance of the H(ν) model; here there is no significant change in the zero positions due to coefficient rounding. From this example^{4}, it is evident that the ν operator model exhibits significantly better performance than the shift operator model.
Figure 2: Example results showing the zero positions for (a) H(z) and (b) H(ν), after coefficient rounding with an 8 bit wordlength. In each case the true zeros are shown as circles and the perturbed zeros due to coefficient rounding are shown using the '*' symbol. The lower parameter sensitivity of the ν operator model gives a significantly reduced change in the zeros when the coefficients are rounded (refer to the Postscript or PDF versions of the paper).
In this paper we briefly reviewed a number of discrete time operators which may be used in place of the usual shift operator in digital filters. These operators provide a means of obtaining significantly improved performance under conditions of parameter perturbation. As a means of treating the various ADTOs uniformly, we considered a generic first order operator, termed the ν operator. We derived an algorithm for choosing the operator parameters optimally, in the sense of minimizing the maximum parameter sensitivity. The effective performance of the algorithm has been shown in experiments, and we present a typical result in this paper.
It is apparent that alternative discrete time operators offer a useful approach for deriving low sensitivity linear models which can give significantly improved numerical accuracy when compared with conventional shift operator structures.
^{1} Andrew Back is with the Brain Science Institute, RIKEN, 2-1 Hirosawa, Wako-shi, Saitama 351-0198, Japan. Email: back@brain.riken.go.jp. Bill Horne is with AADM Inc., 9 Pace Farm Road, Califon, NJ 07830, USA. Ah Chung Tsoi is with the Faculty of Informatics, University of Wollongong, Northfields Avenue, Wollongong, NSW 2522, Australia. C. Lee Giles is with NEC Research Institute, 4 Independence Way, Princeton, NJ 08540, USA and also with the Institute for Advanced Computer Studies, University of Maryland, College Park, MD 20742, USA. ADB acknowledges financial support from the Australian Research Council and the Brain Science Institute, RIKEN, Japan. ACT acknowledges partial support from the Australian Research Council.
^{2} The lower bound on c_{2} is a generalization of the results derived for the shift operator which appeared in [11]. The results on optimal parameter sensitivity are new, as far as we are aware.
^{3} If infinite precision were possible, we could improve the sensitivity to an arbitrary degree simply by making c_{2} as small as we like.
^{4} We have conducted a number of experiments using different models and they show similar performance improvements.