The effect of zeros

So far, we have analyzed the response and performance of systems based on their poles. When we looked at first and second-order systems in canonical form,

$$
G(s) = \frac{K}{\tau s + 1}, \quad G(s) = \frac{K\omega_n^2}{s^2 + 2\zeta \omega_n s + \omega_n^2},
$$

we saw that the poles determine the stability of the system and performance metrics such as peak time, overshoot, and settling time.

We also considered higher-order systems and approximated them by keeping only the dominant poles. In all of these cases, we assumed the system did not contain any zeros (the numerators had no $s$ terms). Now we’ll see what happens when we introduce zeros into the system.

Where do zeros come from?

The poles of a system are associated with the denominator of its transfer function, while the zeros are associated with the numerator. When we do a partial fraction expansion (PFE) of a transfer function, we break it down into a sum of terms, each of which corresponds to a pole. The zeros are not directly visible in the PFE; they are “hidden” in the residues. For example:

$$
\frac{p}{s+a} + \frac{q}{s+b} = \frac{(p+q)s + (pb + qa)}{(s+a)(s+b)}.
$$

The poles are at $s=-a$ and $s=-b$, which we can read directly from the denominators in the PFE. The zero is at $s = -\frac{pb + qa}{p+q}$, which changes depending on the residues $p$ and $q$. As $p+q\to 0$, the zero moves off toward infinity along the real axis. When $p+q=0$, there is no zero at all. In this case, we sometimes say there is a “zero at infinity”.
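To make the "hidden zero" concrete, here is a minimal numerical sketch. Combining the two terms over a common denominator gives the numerator $(p+q)s + (pb + qa)$, so the zero sits at $s = -(pb+qa)/(p+q)$. The pole locations and residue pairs below are arbitrary illustrative values, not taken from the text.

```python
def pfe_sum(s, p, q, a, b):
    """Evaluate p/(s+a) + q/(s+b) at the point s."""
    return p / (s + a) + q / (s + b)


def zero_location(p, q, a, b):
    """Root of the numerator (p+q)s + (pb+qa); requires p + q != 0."""
    return -(p * b + q * a) / (p + q)


a, b = 1.0, 4.0  # assumed pole locations: s = -1 and s = -4

# Same poles every time; only the residues (p, q) change.
for p, q in [(1.0, 1.0), (1.0, -0.5), (1.0, -0.9), (1.0, -0.99)]:
    z = zero_location(p, q, a, b)
    residual = pfe_sum(z, p, q, a, b)  # should be numerically zero
    print(f"p={p:5.2f}  q={q:5.2f}  zero at s={z:9.3f}  G(zero)={residual:.1e}")
```

As the residues approach $p+q=0$, the printed zero location grows in magnitude without bound, while the poles never move.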

Physical intuition

Where do poles and zeros come from in a physical system?

For example, consider the following mass-spring-damper system.

Figure 1: Spring-mass-damper system with two masses, two springs, two dampers, and two applied forces.

Suppose we can apply a force $u_1$ to mass 1 or $u_2$ to mass 2. We also have several sensors: $y_1$ measures the position of mass 1, $y_2$ measures the velocity of mass 2, and $y_3$ measures the relative position of the masses. The equations of motion are:

$$
\begin{gathered}
m_1 \ddot x_1 + b_1 \dot x_1 + k_1 x_1 - b_2(\dot x_2-\dot x_1) - k_2 (x_2 - x_1) = u_1 \\
m_2 \ddot x_2 + b_2(\dot x_2-\dot x_1) + k_2 (x_2 - x_1) = u_2 \\
y_1 = x_1, \qquad y_2 = \dot x_2, \qquad y_3 = x_2 - x_1
\end{gathered}
$$

If we solve for all possible transfer functions from $U_i$ to $Y_j$, we obtain:

$$
\def\arraystretch{1.5}
\begin{array}{c|cc}
 & U_1 & U_2 \\ \hline
Y_1 & \frac{m_2 s^2 + b_2 s + k_2}{D(s)} & \frac{b_2 s + k_2}{D(s)} \\
Y_2 & \frac{b_2 s^2 + k_2 s}{D(s)} & \frac{m_1 s^3 + (b_1+b_2)s^2 + (k_1+k_2)s}{D(s)} \\
Y_3 & \frac{-m_2 s^2}{D(s)} & \frac{m_1 s^2 + b_1 s + k_1}{D(s)} \\
\end{array}
$$

Each transfer function has the same denominator $D(s)$, which is given by:

$$
\begin{aligned}
D(s) &= m_1 m_2 s^4 + \left(b_1 m_2+b_2 \left(m_1+m_2\right)\right) s^3 \\
&\qquad+ \left(b_1 b_2+k_1 m_2+k_2 \left(m_1+m_2\right)\right) s^2 \\
&\qquad\qquad+ \left(b_2 k_1+b_1 k_2\right) s + k_1 k_2
\end{aligned}
$$

The four roots of $D(s)$ are the four poles of the system, and they correspond to the four energy storage elements: kinetic energy of each mass and potential energy of each spring. The polynomial $D(s)$ encodes the physics of the system: how the energy moves between the different storage elements and how it dissipates through the dampers. These physics are the same regardless of how we inject energy into the system (actuation) or how we measure the system (sensing).

The zeros, however, depend on both sensing and actuation. As we can see, the numerators of the transfer functions are all different.
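As a sanity check on the shared denominator, the following sketch solves the Laplace-domain equations of motion for $X_1(s)$ and $X_2(s)$ by Cramer's rule at a single test point and compares the result against $D(s)$ and two of the numerators listed above. The parameter values and the test point `s` are arbitrary assumptions chosen only for illustration.

```python
def denominator(s, m1, m2, b1, b2, k1, k2):
    """Common denominator D(s), written out term by term."""
    return (m1 * m2 * s**4
            + (b1 * m2 + b2 * (m1 + m2)) * s**3
            + (b1 * b2 + k1 * m2 + k2 * (m1 + m2)) * s**2
            + (b2 * k1 + b1 * k2) * s
            + k1 * k2)


def positions(s, u1, u2, m1, m2, b1, b2, k1, k2):
    """Solve the 2x2 Laplace-domain system for X1(s), X2(s) via Cramer's rule."""
    a11 = m1 * s**2 + (b1 + b2) * s + (k1 + k2)
    a12 = -(b2 * s + k2)          # coupling term; the matrix is symmetric
    a22 = m2 * s**2 + b2 * s + k2
    det = a11 * a22 - a12 * a12
    x1 = (u1 * a22 - a12 * u2) / det
    x2 = (a11 * u2 - a12 * u1) / det
    return x1, x2, det


params = dict(m1=1.0, m2=2.0, b1=0.3, b2=0.5, k1=4.0, k2=3.0)  # assumed values
s = 0.7 + 1.3j                                                  # arbitrary test point
D = denominator(s, **params)

# Actuate with u1 only and read y1 = x1.
x1, _, det = positions(s, 1.0, 0.0, **params)
err_det = abs(det - D)
err_y1u1 = abs(x1 - (params["m2"] * s**2 + params["b2"] * s + params["k2"]) / D)

# Actuate with u2 only and read y2 = velocity of mass 2 = s * X2.
_, x2, _ = positions(s, 0.0, 1.0, **params)
num_y2u2 = (params["m1"] * s**3
            + (params["b1"] + params["b2"]) * s**2
            + (params["k1"] + params["k2"]) * s)
err_y2u2 = abs(s * x2 - num_y2u2 / D)

print(err_det, err_y1u1, err_y2u2)  # all three should be at round-off level
```

The determinant of the coefficient matrix depends only on the masses, springs, and dampers, which is why it shows up unchanged in every input-output pair.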

Derivative mixing interpretation

Let’s see what adding a zero to a system $G(s)$ does to its time domain response. We’ll add a zero by multiplying the transfer function by $(\tau s + 1)$. This puts a zero at $s=-\frac{1}{\tau}$, but it does not change the poles or the DC gain. We obtain:

| | Input | Transfer function | Output |
|---|---|---|---|
| Original | $u(t)$ | $G(s)$ | $y(t)$ |
| With zero | $u(t)$ | $(\tau s + 1)G(s)$ | $y(t) + \tau \dot y(t)$ |

Adding a zero amounts to adding a scaled derivative of the original response to itself.
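The derivative-mixing view can be checked numerically on a simple case. The sketch below assumes the first-order system $G(s) = \frac{1}{s+1}$ (chosen here only for illustration), simulates its unit-step response with forward Euler, forms $y + \tau \dot y$, and compares it with the closed-form step response of $\frac{\tau s + 1}{s+1}$, which is $1 + (\tau - 1)e^{-t}$.

```python
import math


def step_with_zero_euler(tau, t_end=1.0, dt=1e-4):
    """Unit-step response of (tau*s + 1)/(s + 1) at time t_end.

    The state x obeys x' = -x + u, so the original output is y = x and
    the output with the zero is y + tau*y' = x + tau*(-x + u).
    """
    x, u = 0.0, 1.0
    for _ in range(int(round(t_end / dt))):
        x += dt * (-x + u)      # forward Euler step
    y = x
    y_dot = -x + u
    return y + tau * y_dot


def step_with_zero_exact(tau, t):
    """Closed-form step response of (tau*s + 1)/(s + 1)."""
    return 1.0 + (tau - 1.0) * math.exp(-t)


for tau in (-1.0, 0.0, 0.5, 2.0):
    approx = step_with_zero_euler(tau)
    exact = step_with_zero_exact(tau, 1.0)
    print(f"tau={tau:5.2f}  simulated={approx:.4f}  closed form={exact:.4f}")
```

The two columns should agree to within the Euler discretization error, which supports reading the zero as "the original response plus $\tau$ times its derivative".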

Zeros in second-order systems

Consider a second-order system with an added zero:

$$
G(s) = \frac{K\omega_n^2(\tau s + 1)}{s^2 + 2\zeta \omega_n s + \omega_n^2}.
$$

We added a zero at $s = -\frac{1}{\tau}$, without changing the DC gain.

Our original step response (without the zero) is the familiar step response of a second-order system:

$$
y(t) = K\left(1 - \frac{1}{\sqrt{1 - \zeta^2}} e^{-\zeta \omega_n t} \sin(\omega_d t + \phi)\right)
$$

Adding the zero adds a scaled derivative of the original response:

$$
\begin{aligned}
\tau \dot y(t) &= \frac{K\tau}{\sqrt{1 - \zeta^2}} \biggl( \zeta \omega_n e^{-\zeta \omega_n t} \sin(\omega_d t + \phi) - \omega_d e^{-\zeta \omega_n t} \cos(\omega_d t + \phi) \biggr) \\
&= \frac{K\tau \omega_n}{\sqrt{1 - \zeta^2}} e^{-\zeta \omega_n t} \biggl( \zeta \sin(\omega_d t + \phi) - \sqrt{1 - \zeta^2} \cos(\omega_d t + \phi) \biggr) \\
&= \frac{K\tau \omega_n}{\sqrt{1 - \zeta^2}} e^{-\zeta \omega_n t} \biggl( \cos\phi \sin(\omega_d t + \phi) - \sin\phi \cos(\omega_d t + \phi) \biggr) \\
&= \frac{K\tau \omega_n}{\sqrt{1 - \zeta^2}} e^{-\zeta \omega_n t} \sin(\omega_d t)
\end{aligned}
$$

If this formula looks familiar, it’s because we’ve seen it before! Whenever $y(t)$ is a step response, $\dot y(t)$ is the corresponding impulse response. We calculated the impulse response for a second-order system earlier.

The term $\tau \dot y$ is zero when $y$ reaches a maximum or minimum. More interpretations:

Here is a plot showing how the step response changes as we vary $\tau$. For this example, we fixed $\zeta=0.5$ and $\omega_n=1$.

Figure 2: Step response of the second-order system with an added zero for different values of $\tau$. We fixed $\zeta=0.5$ and $\omega_n=1$. The poles remain unchanged, but the zero location changes with $\tau$, which significantly affects the transient response.

We can observe some interesting effects as we vary $\tau$:

Yet another interpretation: The slope of the response at $t=0$ is given by the formula[1] $K\omega_n^2 \tau$. So the initial slope is directly proportional to $\tau$. This explains why the response is initially faster for $\tau > 0$ and initially slower (in the wrong direction) for $\tau < 0$.
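The closed-form expressions above are easy to evaluate directly. The following sketch, under the assumptions $0 < \zeta < 1$, $\phi = \arccos\zeta$, and $\omega_d = \omega_n\sqrt{1-\zeta^2}$, computes the step response with the zero as $y + \tau\dot y$ and estimates the initial slope with a small finite difference so it can be compared with $K\omega_n^2\tau$. The function names and the step size `h` are illustrative choices.

```python
import math


def step_response(t, K, zeta, wn):
    """Step response of the underdamped second-order system without the zero."""
    wd = wn * math.sqrt(1 - zeta**2)
    phi = math.acos(zeta)
    return K * (1 - math.exp(-zeta * wn * t) * math.sin(wd * t + phi)
                / math.sqrt(1 - zeta**2))


def step_response_rate(t, K, zeta, wn):
    """Time derivative of the step response (the impulse response)."""
    wd = wn * math.sqrt(1 - zeta**2)
    return (K * wn / math.sqrt(1 - zeta**2)
            * math.exp(-zeta * wn * t) * math.sin(wd * t))


def step_response_with_zero(t, K, zeta, wn, tau):
    """Step response after multiplying G(s) by (tau*s + 1)."""
    return step_response(t, K, zeta, wn) + tau * step_response_rate(t, K, zeta, wn)


K, zeta, wn = 1.0, 0.5, 1.0   # same zeta and wn as in the figure
h = 1e-6                      # small step for the finite-difference slope

for tau in (-1.0, 0.0, 1.0, 2.0):
    slope = (step_response_with_zero(h, K, zeta, wn, tau)
             - step_response_with_zero(0.0, K, zeta, wn, tau)) / h
    early = step_response_with_zero(0.5, K, zeta, wn, tau)
    print(f"tau={tau:5.2f}  slope(0)~{slope:7.4f}  "
          f"K*wn^2*tau={K * wn**2 * tau:7.4f}  y(0.5)={early:7.4f}")
```

For negative $\tau$ the early-time value is negative even though the response eventually settles at $K$, which is the inverse response seen in the figure.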

Interactive applet

Using the applet below (click the icon in the lower-right corner to fullscreen it), you can interactively explore how the step response of a second-order system changes as you change the various parameters. This time, try adjusting the value of $\tau$ to see how the zero location affects the transient response.

RHP zeros

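The inverse response phenomenon we observed for second-order systems with RHP zeros may seem strange, but it is actually common in real systems! Two examples are an airplane whose elevator[2] is at the back of the airplane, and a car that steers with its front tires while driving in reverse.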

In both of these examples, the transfer function from input $u$ to output $y$ contains a RHP zero, which causes the inverse response. This can make the system difficult to control because it behaves like it has a “built-in delay”; it must initially move in the wrong direction before it can move correctly.

Zeros can only be moved by changing the way the system is measured or actuated. For example, if we put the elevator at the front of the airplane instead of the back[3], we would get a LHP zero instead of a RHP zero, so no inverse response. Similarly, if we could steer the rear tires instead of the front tires (or just drive the car forwards instead of in reverse), there would be no inverse response.

Zeros and dominant poles

We saw in the section on higher-order systems that we can approximate a higher-order system by keeping only its dominant poles. However, this assumed there were no zeros! It turns out that zeros can disrupt dominant pole approximations because zeros lessen the effect of nearby poles. To illustrate, consider a system with two simple poles and one zero:

$$
G(s) = \frac{10}{z} \cdot \frac{(s + z)}{(s + 1)(s + 10)},
$$

where $z$ is a parameter that we can vary to change the zero location. The poles are fixed at $s=-1$ and $s=-10$, so the dominant pole approximation would suggest that the transient response should be dominated by the pole at $s=-1$ regardless of the value of $z$. However, we shall see that this is not always the case.

We included the factor of $\frac{10}{z}$ in front to ensure that $G(s)$ would always have a DC gain of 1 for any choice of $z$. If we let $z\to\infty$, we recover the case with no zeros.

Now perform a PFE to obtain two first-order terms and place them in canonical form so we can compare their DC gains:

$$
\boxed{
\begin{aligned}
G(s) = \frac{10}{9} \cdot \biggl(\; \underbrace{\frac{1-\frac{1}{z}}{\vphantom{\frac{1}{10}}s + 1}}_{\textsf{slow pole}} - \underbrace{\frac{\frac{1}{10}-\frac{1}{z}}{\frac{1}{10}s + 1}}_{\textsf{fast pole}} \;\biggr)
\end{aligned}
}
$$
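To get a feel for the boxed expression, the following sketch tabulates the DC gain of each first-order term for a few zero locations and confirms at one test point that the expansion matches the original transfer function. The sampled values of $z$ and the test point are arbitrary choices made for illustration.

```python
def dc_gains(z):
    """DC gains of the slow-pole and fast-pole terms in the boxed PFE."""
    slow = (10 / 9) * (1 - 1 / z)
    fast = -(10 / 9) * (1 / 10 - 1 / z)
    return slow, fast


def G_direct(s, z):
    """Original transfer function (10/z)(s+z)/((s+1)(s+10))."""
    return (10 / z) * (s + z) / ((s + 1) * (s + 10))


def G_pfe(s, z):
    """Same transfer function, evaluated from the boxed expansion."""
    slow, fast = dc_gains(z)
    return slow / (s + 1) + fast / (s / 10 + 1)


s_test = 0.3 + 2.0j  # arbitrary test point away from the poles
for z in (0.5, 1.0, 2.0, 10.0, 100.0):
    slow, fast = dc_gains(z)
    mismatch = abs(G_direct(s_test, z) - G_pfe(s_test, z))
    print(f"z={z:6.1f}  slow gain={slow:7.3f}  fast gain={fast:7.3f}  "
          f"sum={slow + fast:.3f}  mismatch={mismatch:.1e}")
```

The two gains always sum to the overall DC gain of 1, the slow-pole gain vanishes when $z=1$, and the fast-pole gain vanishes when $z=10$, so a zero close to a pole suppresses that pole's contribution.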

Now let’s look at what happens as we vary $z$:

We can summarize these observations with the following principles:

Interactive applet

The following interactive applet illustrates the effect of zeros on pole dominance. You can move the poles by dragging the “$\mathsf{x}$” markers and move the zero by dragging the “$\mathsf{o}$” marker. The step response will update in real time to show how the transient response changes. The shaded lines emanating from the poles indicate the DC gain of each pole; the longer the line, the larger the DC gain and the more dominant that pole is.

 


Test your knowledge

Solution to Exercise 1

We seek a PFE of the form:

$$
G(s) = \frac{A}{s+a} + \frac{B}{s+b}
$$

Using the cover-up method, we have:

$$
\begin{aligned}
A &= \frac{s+z}{s+b}\bigg|_{s=-a} = \frac{z-a}{b-a} \\
B &= \frac{s+z}{s+a}\bigg|_{s=-b} = \frac{z-b}{a-b}
\end{aligned}
$$

Putting the transfer function in canonical form, we get:

$$
\begin{aligned}
G(s) &= \frac{z-a}{b-a} \cdot \frac{1}{s+a} + \frac{b-z}{b-a} \cdot \frac{1}{s+b} \\
&= \underbrace{\frac{1}{a}\cdot\frac{z-a}{b-a}}_{\substack{\textsf{DC gain of}\\\textsf{slow pole}}} \cdot \frac{1}{\tfrac{1}{a}s+1} + \underbrace{\frac{1}{b}\cdot \frac{b-z}{b-a}}_{\substack{\textsf{DC gain of}\\\textsf{fast pole}}} \cdot \frac{1}{\tfrac{1}{b}s+1}
\end{aligned}
$$

Setting the DC gains equal to each other, we get:

$$
\frac{1}{a}\cdot\frac{z-a}{b-a} = \frac{1}{b}\cdot \frac{b-z}{b-a} \implies z = \frac{2ab}{a+b}
$$

Or in other words, the zero must be located at the harmonic mean of the two poles in order for both poles to have equal DC gains. We can rearrange the result to get:

$$
\frac{1}{z} = \frac{1}{2}\biggl(\frac{1}{a} + \frac{1}{b}\biggr)
$$

So if we think of the poles and zeros in terms of their time constants rather than their actual values, the time constant of the zero must be the average of the time constants of the two poles in order for both poles to have equal DC gains.
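The harmonic-mean result can be checked numerically. This minimal sketch assumes $0 < a < b$ and uses $a=1$, $b=10$ purely as an example; it evaluates the two DC gains from the canonical form above at $z = \frac{2ab}{a+b}$.

```python
def pole_dc_gains(z, a, b):
    """DC gains of the slow (s = -a) and fast (s = -b) terms of (s+z)/((s+a)(s+b))."""
    slow = (z - a) / (a * (b - a))
    fast = (b - z) / (b * (b - a))
    return slow, fast


a, b = 1.0, 10.0              # assumed pole locations s = -1 and s = -10
z = 2 * a * b / (a + b)       # harmonic mean of a and b

slow, fast = pole_dc_gains(z, a, b)
print(f"z = {z:.4f}, slow gain = {slow:.6f}, fast gain = {fast:.6f}")
print(f"1/z = {1 / z:.4f}, average of 1/a and 1/b = {0.5 * (1 / a + 1 / b):.4f}")
```

Both printed gains should coincide, and the zero's time constant $1/z$ should equal the average of the pole time constants $1/a$ and $1/b$.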

Footnotes
  1. You can also calculate this directly from the equations for $y$ and $\dot y$ above, or you can apply the initial value theorem, the counterpart of the final value theorem.

  2. The elevator is a control surface on the horizontal tail of the airplane that can be angled up or down to control the pitch of the airplane.

  3. This is called a canard configuration. It is used in some fighter jets, but it is not common in commercial airplanes due to stability and control issues.