The effect of zeros - System Analysis and Control

So far, we have analyzed the response and performance of systems based on their poles. When we looked at first and second-order systems in canonical form,

G(s) = \frac{K}{\tau s + 1}, \quad G(s) = \frac{K\omega_n^2}{s^2 + 2\zeta \omega_n s + \omega_n^2},

(1)

we saw that the poles determine the stability of the system and performance metrics such as peak time, overshoot, and settling time.

We also considered higher-order systems and approximated them by keeping only the dominant poles. In all of these cases, we assumed the system did not contain any zeros (the numerators had no $s$ terms). Now we’ll see what happens when we introduce zeros into the system.

Where do zeros come from?

The poles of a system are associated with the denominator of its transfer function, while the zeros are associated with the numerator. When we do a partial fraction expansion (PFE) of a transfer function, we break it down into a sum of terms, each of which corresponds to a pole. The zeros are not directly visible in the PFE; they are “hidden” in the residues. For example:

\frac{p}{s+a} + \frac{q}{s+b} = \frac{(p+q)s + (pa + qb)}{(s+a)(s+b)}.

(2)

The poles are at $s=-a$ and $s=-b$, which we can read directly from the denominators in the PFE. The zero is at $s = -\frac{pa + qb}{p+q}$, which changes depending on the residues $p$ and $q$. As $p+q\to 0$ (with $pa+qb \neq 0$), the zero moves off to infinity along the real axis, in a direction set by the signs of $pa+qb$ and $p+q$. When $p+q=0$, there is no finite zero at all. In this case, we sometimes say there is a “zero at infinity”.
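To make the dependence on the residues concrete, here is a minimal sketch that evaluates the zero location from Eq. (2) with exact rational arithmetic. The function name `zero_from_residues` and the sample values are illustrative assumptions, not part of the original notes.

```python
from fractions import Fraction

def zero_from_residues(p, q, a, b):
    """Return the zero of p/(s+a) + q/(s+b), or None if there is no finite zero."""
    p, q, a, b = (Fraction(v) for v in (p, q, a, b))
    if p + q == 0:
        return None  # numerator reduces to the constant pa + qb: "zero at infinity"
    return -(p * a + q * b) / (p + q)

# Poles fixed at s = -1 and s = -3; only the residues change.
print(zero_from_residues(1, 1, 1, 3))      # -2
print(zero_from_residues(-99, 100, 1, 3))  # -201
print(zero_from_residues(1, -1, 1, 3))     # None
```

The poles never move in these three cases; only the residues, and therefore the zero, change.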

Physical intuition

Where do poles and zeros come from in a physical system?

Poles are associated with the energy storage elements in the system. There are as many poles as the order of the system, and the locations of the poles determine stability and performance properties, such as settling time, percent overshoot, and peak time.
Zeros are associated with the way the system is measured or actuated. They do not affect system stability, but they can have a significant effect on the transient response, particularly percent overshoot.

For example, consider the following mass-spring-damper system.

Figure 1: Spring-mass-damper system with two masses, two springs, two dampers, and two applied forces.

Suppose we can apply a force $u_1$ to mass 1 or $u_2$ to mass 2. We also have several sensors: $y_1$ measures the position of mass 1, $y_2$ measures the velocity of mass 2, and $y_3$ measures the relative position of the masses. The equations of motion are:

\begin{gathered} m_1 \ddot x_1 + b_1 \dot x_1 + k_1 x_1 - b_2(\dot x_2-\dot x_1) - k_2 (x_2 - x_1) = u_1 \\ m_2 \ddot x_2 + b_2(\dot x_2-\dot x_1) + k_2 (x_2 - x_1) = u_2 \\ y_1 = x_1, \qquad y_2 = \dot x_2, \qquad y_3 = x_2 - x_1 \end{gathered}

(3)

If we solve for all possible transfer functions from $U_i$ to $Y_j$ we obtain:

\def\arraystretch{1.5} \begin{array}{c|cc} & U_1 & U_2 \\ \hline Y_1 & \frac{m_2 s^2 + b_2 s + k_2}{D(s)} & \frac{b_2 s + k_2}{D(s)} \\ Y_2 & \frac{b_2 s^2 + k_2 s}{D(s)} & \frac{m_1 s^3 + (b_1+b_2)s^2 + (k_1+k_2)s}{D(s)} \\ Y_3 & -\frac{m_2 s^2}{D(s)} & \frac{m_1 s^2 + b_1 s + k_1}{D(s)} \\ \end{array}

(4)

Each transfer function has the same denominator $D(s)$ , which is given by:

\begin{aligned} D(s) &= m_1 m_2 s^4 + \left(b_1 m_2+b_2 \left(m_1+m_2\right)\right) s^3 \\ &\qquad+ \left(b_1 b_2+k_1 m_2+k_2 \left(m_1+m_2\right)\right) s^2 \\ &\qquad\qquad+ \left(b_2 k_1+b_1 k_2\right) s + k_1 k_2 \end{aligned}

(5)

Derivation of transfer functions

Start by taking Laplace transforms of the equations of motion:

\begin{gathered} \bigl(m_1s^2 + (b_1+b_2) s + (k_1+k_2)\bigr)X_1 - (b_2 s + k_2)X_2 = U_1 \\ -(b_2 s + k_2)X_1 + (m_2 s^2 + b_2 s + k_2)X_2 = U_2 \\ Y_1 = X_1, \qquad Y_2 = s X_2, \qquad Y_3 = X_2 - X_1 \end{gathered}

(6)

To solve for $X_1$ , eliminate $X_2$ by multiplying the first equation by $(m_2 s^2 + b_2 s + k_2)$ and the second equation by $(b_2 s + k_2)$ , then adding the two equations. Similarly, to solve for $X_2$ , eliminate $X_1$ by multiplying the second equation by $(m_1 s^2 + (b_1+b_2)s + (k_1+k_2))$ and the first equation by $(b_2 s + k_2)$ , then adding the two equations. This gives us:

\begin{aligned} D(s) X_1 &= (m_2 s^2 + b_2 s + k_2) U_1 + (b_2 s + k_2) U_2 \\ D(s) X_2 &= (b_2 s + k_2) U_1 + \bigl(m_1 s^2 + (b_1+b_2)s + (k_1+k_2)\bigr) U_2 \end{aligned}

(7)

In both cases, $D(s)$ is the same, and it is given by:

D(s) = \bigl(m_1s^2 + (b_1+b_2) s + (k_1+k_2)\bigr)(m_2 s^2 + b_2 s + k_2) - (b_2 s + k_2)^2

(8)

This expression simplifies to Eq. (5) above. So in summary:

\begin{aligned} X_1 &= \frac{(m_2 s^2 + b_2 s + k_2)}{D(s)} U_1 + \frac{(b_2 s + k_2)}{D(s)} U_2 \\ X_2 &= \frac{(b_2 s + k_2)}{D(s)} U_1 + \frac{\bigl(m_1 s^2 + (b_1+b_2)s + (k_1+k_2)\bigr)}{D(s)} U_2 \end{aligned}

(9)

Since $Y_1 = X_1$ , the first equation in (9) gives us:

\boxed{Y_1 = \frac{(m_2 s^2 + b_2 s + k_2)}{D(s)} U_1 + \frac{(b_2 s + k_2)}{D(s)} U_2}

(10)

Since $Y_2 = s X_2$ , multiply the second equation in (9) by $s$ and obtain:

\boxed{Y_2 = \frac{b_2 s^2 + k_2 s}{D(s)} U_1 + \frac{m_1 s^3 + (b_1+b_2)s^2 + (k_1+k_2)s}{D(s)} U_2}

(11)

Finally, since $Y_3 = X_2 - X_1$ , we can subtract the first equation in (9) from the second and obtain:

\boxed{Y_3 = -\frac{m_2 s^2}{D(s)} U_1 + \frac{m_1 s^2 + b_1 s + k_1}{D(s)} U_2}

(12)

From Eqs. (10), (11), and (12) we can read off the three rows of the table (4).
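As a sanity check on the algebra, the sketch below expands Eq. (8) for one set of parameter values and compares the result with the coefficients in Eq. (5). The integer values and the helper names `poly_mul` and `poly_sub` are arbitrary choices for illustration; a single numerical check does not prove the identity, but it catches most slips.

```python
def poly_mul(p, q):
    """Multiply two polynomials given as coefficient lists, highest power first."""
    out = [0] * (len(p) + len(q) - 1)
    for i, pi in enumerate(p):
        for j, qj in enumerate(q):
            out[i + j] += pi * qj
    return out

def poly_sub(p, q):
    """Subtract q from p, padding the shorter list with leading zeros."""
    n = max(len(p), len(q))
    p = [0] * (n - len(p)) + list(p)
    q = [0] * (n - len(q)) + list(q)
    return [pi - qi for pi, qi in zip(p, q)]

# Arbitrary integer test values (assumptions, not taken from the text).
m1, m2, b1, b2, k1, k2 = 2, 3, 5, 7, 11, 13

# Eq. (8): product of the two diagonal terms minus the squared coupling term.
diag1 = [m1, b1 + b2, k1 + k2]
diag2 = [m2, b2, k2]
coupling = [b2, k2]
D_from_eq8 = poly_sub(poly_mul(diag1, diag2), poly_mul(coupling, coupling))

# Eq. (5): the expanded coefficients of s^4 down to s^0.
D_from_eq5 = [
    m1 * m2,
    b1 * m2 + b2 * (m1 + m2),
    b1 * b2 + k1 * m2 + k2 * (m1 + m2),
    b2 * k1 + b1 * k2,
    k1 * k2,
]

print(D_from_eq8 == D_from_eq5)  # True
```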

The four roots of $D(s)$ are the four poles of the system, and they correspond to the four energy storage elements: kinetic energy of each mass and potential energy of each spring. The polynomial $D(s)$ encodes the physics of the system: how the energy moves between the different storage elements and how it dissipates through the dampers. These physics are the same regardless of how we inject energy into the system (actuation) or how we measure the system (sensing).

The zeros, however, depend on both sensing and actuation. As we can see, the numerators of the transfer functions are all different.

Derivative mixing interpretation

Let’s see what adding a zero to a system $G(s)$ does to its time domain response. We’ll add a zero by multiplying the transfer function by $(\tau s + 1)$ . This puts a zero at $s=-\frac{1}{\tau}$ , but it does not change the poles or the DC gain. We obtain:

	Input	Transfer function	Output
Original	$u(t)$	$G(s)$	$y(t)$
With zero	$u(t)$	$(\tau s + 1)G(s)$	$y(t) + \tau \dot y(t)$

Adding a zero amounts to adding a scaled derivative of the original response to itself.

If $|\tau|$ is small, the zero $s=-\frac{1}{\tau}$ is far from the imaginary axis. The zero has a small effect and the response is essentially unchanged.
If $|\tau|$ is large, the zero is close to the imaginary axis. The zero has a large effect and the new response starts to look like the derivative of the original response.
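The output entry in the table above follows from the derivative property of the Laplace transform. Here is a short sketch, assuming zero initial conditions ($y(0)=0$) and writing $y_z$ for the output of the system with the added zero (a label introduced here for convenience):

```latex
\begin{aligned}
Y_z(s) &= (\tau s + 1)\,G(s)\,U(s) = Y(s) + \tau\, s\,Y(s) \\
y_z(t) &= y(t) + \tau\, \dot y(t)
\end{aligned}
```

The second line uses the fact that $sY(s)$ is the transform of $\dot y(t)$ when $y(0)=0$.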

Zeros in second-order systems

Consider a second-order system with an added zero:

G(s) = \frac{K\omega_n^2(\tau s + 1)}{s^2 + 2\zeta \omega_n s + \omega_n^2}.

(13)

We added a zero at $s = -\frac{1}{\tau}$ , without changing the DC gain.

If $\tau > 0$ , the zero is in the left-half plane (LHP).
If $\tau = 0$ , there is no zero (“zero at infinity”).
If $\tau < 0$ , the zero is in the right-half plane (RHP).

Our original step response (without the zero) is the familiar step response of a second-order system:

y(t) = K\left(1 - \frac{1}{\sqrt{1 - \zeta^2}} e^{-\zeta \omega_n t} \sin(\omega_d t + \phi)\right)

(14)

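Adding the zero adds a scaled derivative of the original response. Using $\omega_d = \omega_n\sqrt{1-\zeta^2}$, $\cos\phi = \zeta$, and $\sin\phi = \sqrt{1-\zeta^2}$, we get: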

\begin{aligned} \tau \dot y(t) &= \frac{K\tau}{\sqrt{1 - \zeta^2}} \biggl( \zeta \omega_n e^{-\zeta \omega_n t} \sin(\omega_d t + \phi) - \omega_d e^{-\zeta \omega_n t} \cos(\omega_d t + \phi) \biggr) \\ &= \frac{K\tau \omega_n}{\sqrt{1 - \zeta^2}} e^{-\zeta \omega_n t} \biggl( \zeta \sin(\omega_d t + \phi) - \sqrt{1 - \zeta^2} \cos(\omega_d t + \phi) \biggr) \\ &= \frac{K\tau \omega_n}{\sqrt{1 - \zeta^2}} e^{-\zeta \omega_n t} \biggl( \cos\phi \sin(\omega_d t + \phi) - \sin\phi \cos(\omega_d t + \phi) \biggr) \\ &= \frac{K\tau \omega_n}{\sqrt{1 - \zeta^2}} e^{-\zeta \omega_n t} \sin(\omega_d t) \end{aligned}

(15)

If this formula looks familiar, it’s because we’ve seen it before! Whenever $y(t)$ is a step response, $\dot y(t)$ is the corresponding impulse response. We calculated the impulse response for a second-order system earlier.

The term $\tau \dot y$ is zero when $y$ reaches a maximum or minimum. More interpretations:

When $\tau > 0$ (LHP zero), the zero has the effect of “pulling” the response up when it is increasing and “pushing” the response down when it is decreasing.
When $\tau < 0$ (RHP zero), the zero has the opposite effect: “pushing” down when it is increasing and “pulling” up when it is decreasing.

Here is a plot showing how the step response changes as we vary $\tau$ . For this example, we fixed $\zeta=0.5$ and $\omega_n=1$ .

Figure 2: Step response of the second-order system (13) for different values of $\tau$. We fixed $\zeta=0.5$ and $\omega_n=1$. The poles remain unchanged, but the zero location changes with $\tau$, which significantly affects the transient response.

We can observe some interesting effects as we vary $\tau$ :

For $\tau > 0$ (LHP zero), the response is faster and more aggressive. Peak time decreases and overshoot increases. Settling time increases slightly.
For $\tau = 0$ (no zero / zero at infinity), we get the standard second-order response. Overshoot and settling time are minimized.
For $\tau < 0$ (RHP zero), the response initially moves in the wrong direction! This is called inverse response or undershoot. Peak time and overshoot both increase. Settling time increases slightly.

Yet another interpretation: The slope of the response at $t=0$ is given by the formula^[1] $K\omega_n^2 \tau$ . So the initial slope is directly proportional to $\tau$ . This explains why the response is initially faster for $\tau > 0$ and initially slower (in the wrong direction) for $\tau < 0$ .
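These observations can be reproduced numerically. The sketch below evaluates the closed-form expressions (14) and (15), assuming the underdamped case $0 < \zeta < 1$ with $\phi = \arccos\zeta$, and estimates the initial slope with a finite difference. The function name `step_with_zero`, the time grid, and the three values of $\tau$ are illustrative assumptions.

```python
import math

def step_with_zero(t, tau, K=1.0, zeta=0.5, wn=1.0):
    """Step response of Eq. (13): y(t) + tau * dy/dt, valid for 0 < zeta < 1."""
    root = math.sqrt(1.0 - zeta**2)
    wd = wn * root                      # damped natural frequency
    phi = math.acos(zeta)               # phase angle, cos(phi) = zeta
    decay = math.exp(-zeta * wn * t)
    y = K * (1.0 - decay * math.sin(wd * t + phi) / root)  # Eq. (14)
    ydot = K * wn / root * decay * math.sin(wd * t)        # Eq. (15) divided by tau
    return y + tau * ydot

times = [0.01 * i for i in range(2001)]  # 0 to 20 time units, an arbitrary grid
results = {}
for tau in (-1.0, 0.0, 1.0):
    ys = [step_with_zero(t, tau) for t in times]
    slope0 = (ys[1] - ys[0]) / (times[1] - times[0])  # finite-difference slope near t = 0
    results[tau] = (slope0, min(ys), max(ys))
    print(f"tau={tau:+.1f}  slope~{slope0:+.3f}  min={min(ys):+.3f}  max={max(ys):+.3f}")
```

With $K=\omega_n=1$, the estimated slopes should come out close to $\tau$ itself, and the minimum for $\tau=-1$ should be clearly negative, which is the undershoot.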

Interactive applet

Using the applet below (click the icon in the lower-right corner to fullscreen it), you can interactively explore how the step response of a second-order system changes as you change the various parameters. This time, try adjusting the value of $\tau$ to see how the zero location affects the transient response.

RHP zeros

The inverse response phenomenon we observed for second-order systems with RHP zeros may seem strange, but it is actually common in real systems! Some examples:

Driving a car in reverse: Let $u$ be the angle of the steering wheel and $y$ be the lateral position of the front bumper. If you turn the steering wheel to the right, the front bumper initially moves left before the car eventually turns to the right.
Airplane altitude adjustment: Let $u$ be the angle of the elevator^[2] and $y$ be the vertical position of the center of mass of the plane. If you increase elevator angle (in order to pitch the airplane up and climb), there is an initial downforce due to the elevator angle, which causes the plane to initially drop in altitude before eventually climbing.

In both of these examples, the transfer function from input $u$ to output $y$ contains a RHP zero, which causes the inverse response. This can make the system difficult to control because it behaves like it has a “built-in delay”; it must initially move in the wrong direction before it can move correctly.

Zeros can only be moved by changing the way the system is measured or actuated. For example, if we put the elevator at the front of the airplane instead of the back^[3], we would get a LHP zero instead of a RHP zero, so no inverse response. Similarly, if we could steer the rear tires instead of the front tires (or just drive the car forwards instead of in reverse), there would be no inverse response.

Zeros and dominant poles

We saw in the section on higher-order systems that we can approximate a higher-order system by keeping only its dominant poles. However, this assumed there were no zeros! It turns out that zeros can disrupt dominant pole approximations because zeros lessen the effect of nearby poles. To illustrate, consider a system with two simple poles and one zero:

G(s) = \frac{10}{z} \cdot \frac{(s + z)}{(s + 1)(s + 10)},

(16)

where $z$ is a parameter that we can vary to change the zero location. The poles are fixed at $s=-1$ and $s=-10$ , so the dominant pole approximation would suggest that the transient response should be dominated by the pole at $s=-1$ regardless of the value of $z$ . However, we shall see that this is not always the case.

We included the factor of $\frac{10}{z}$ in front to ensure that $G(s)$ would always have a DC gain of 1 for any choice of $z$ . If we let $z\to\infty$ , we recover the case with no zeros.

Now perform a PFE to obtain two first-order terms and place them in canonical form so we can compare their DC gains:

\boxed{ \begin{aligned} G(s) = \frac{10}{9} \cdot \biggl(\; \underbrace{\frac{1-\frac{1}{z}}{\vphantom{\frac{1}{10}}s + 1}}_{\textsf{slow pole}} - \underbrace{\frac{\frac{1}{10}-\frac{1}{z}}{\frac{1}{10}s + 1}}_{\textsf{fast pole}} \;\biggr) \end{aligned} }

(17)

Now let’s look at what happens as we vary $z$ :

If $z\to\infty$ (no zero), the slow pole has a DC gain of $\frac{10}{9}$ and the fast pole has a DC gain of $-\frac{1}{9}$ . As expected, this is the standard dominant pole scenario; the slow pole dominates due to its much larger DC gain.
If $z=10$ (zero at the fast pole), the slow pole has a DC gain of 1 and the fast pole has a DC gain of 0. The fast pole is effectively cancelled by the zero, so the slow pole still dominates. We can see this in the original transfer function (16). When $z=10$, the numerator has a factor of $(s+10)$, which cancels the $(s+10)$ in the denominator, leaving us with a single pole at $s=-1$.
If $z=1$ (zero at the slow pole), the slow pole has a DC gain of 0 and the fast pole has a DC gain of 1. The slow pole is effectively cancelled by the zero, so the fast pole dominates instead! This is a dramatic change in behavior caused by the zero, even though the poles remain unchanged.
If $z\to 0$ (zero close to the origin), the DC gains of both poles blow up, causing large transient distortions.
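The cases above can be tabulated directly from Eq. (17). The sketch below uses exact fractions; the function name `pole_dc_gains` and the sample values of $z$ are assumptions chosen for illustration.

```python
from fractions import Fraction

def pole_dc_gains(z):
    """DC gains of the slow (s = -1) and fast (s = -10) terms in Eq. (17)."""
    z = Fraction(z)
    scale = Fraction(10, 9)
    slow = scale * (1 - 1 / z)
    fast = -scale * (Fraction(1, 10) - 1 / z)
    return slow, fast

for z in (1, 2, 10, 100):
    slow, fast = pole_dc_gains(z)
    print(f"z={z}: slow={slow}, fast={fast}, sum={slow + fast}")
```

The fast-pole gain vanishes at $z=10$ and the slow-pole gain vanishes at $z=1$. In every case the two gains add up to the overall DC gain of 1, so a zero only redistributes the gain between the poles.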

We can summarize these observations with the following principles:

Interactive applet

The following interactive applet illustrates the effect of zeros on pole dominance. You can move the poles by dragging the “ $\mathsf{x}$ ” markers and move the zero by dragging the “ $\mathsf{o}$ ” marker. The step response will update in real time to show how the transient response changes. The shaded lines emanating from the poles indicate the DC gain of each pole; the longer the line, the larger the DC gain and the more dominant that pole is.

Test your knowledge

Solution to Exercise 1

We seek a PFE of the form:

G(s) = \frac{A}{s+a} + \frac{B}{s+b}

(19)

Using the cover-up method, we have:

\begin{aligned} A &= \frac{s+z}{s+b}\bigg|_{s=-a} = \frac{z-a}{b-a} \\ B &= \frac{s+z}{s+a}\bigg|_{s=-b} = \frac{z-b}{a-b} \end{aligned}

(20)

Putting the transfer function in canonical form, we get:

\begin{aligned} G(s) &= \frac{z-a}{b-a} \cdot \frac{1}{s+a} + \frac{b-z}{b-a} \cdot \frac{1}{s+b} \\ &= \underbrace{\frac{1}{a}\cdot\frac{z-a}{b-a}}_{\substack{\textsf{DC gain of}\\\textsf{slow pole}}} \cdot \frac{1}{\tfrac{1}{a}s+1} + \underbrace{\frac{1}{b}\cdot \frac{b-z}{b-a}}_{\substack{\textsf{DC gain of}\\\textsf{fast pole}}} \cdot \frac{1}{\tfrac{1}{b}s+1} \end{aligned}

(21)

Setting the DC gains equal to each other, we get:

\frac{1}{a}\cdot\frac{z-a}{b-a} = \frac{1}{b}\cdot \frac{b-z}{b-a} \implies z = \frac{2ab}{a+b}

(22)

Or in other words, the zero must be located at the harmonic mean of the two poles in order for both poles to have equal DC gains. We can rearrange the result to get:

\frac{1}{z} = \frac{1}{2}\biggl(\frac{1}{a} + \frac{1}{b}\biggr)

(23)

So if we think of the poles and zeros in terms of their time constants rather than their actual values, the time constant of the zero must be the average of the time constants of the two poles in order for both poles to have equal DC gains.
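The result can be checked numerically. The sketch below assumes the transfer function implied by the cover-up step in Eq. (20), namely $G(s) = \frac{s+z}{(s+a)(s+b)}$, and evaluates the two DC gains from Eq. (21). The pole locations $a=1$, $b=10$ and the function name `dc_gains_general` are illustrative assumptions.

```python
from fractions import Fraction

def dc_gains_general(a, b, z):
    """DC gains of the two first-order terms in Eq. (21)."""
    a, b, z = Fraction(a), Fraction(b), Fraction(z)
    gain_a = (z - a) / (a * (b - a))   # term with pole at s = -a
    gain_b = (b - z) / (b * (b - a))   # term with pole at s = -b
    return gain_a, gain_b

a, b = 1, 10                           # example pole locations (an assumption)
z = Fraction(2 * a * b, a + b)         # harmonic mean from Eq. (22)
gain_a, gain_b = dc_gains_general(a, b, z)
print(f"z = {z}, gain_a = {gain_a}, gain_b = {gain_b}")
```

For these values the zero lands at $z = 20/11$, much closer to the slow pole than the arithmetic mean of 5.5 would be.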

Footnotes

[1] You can also calculate this directly from the equations for $y$ and $\dot y$ above, or you can apply the initial value theorem, the counterpart of the final value theorem.
[2] The elevator is a control surface on the horizontal tail of the airplane that can be angled up or down to control the pitch of the airplane.
[3] This is called a canard configuration. It is used in some fighter jets, but it is not common in commercial airplanes due to stability and control issues.