Transformation of Roots

Last updated on September 4, 2025

Suppose we are given a univariate polynomial \(f\in \mathbb {Q}[x]\) of degree \(n\) with complex roots \(\alpha _1, \alpha _2, \ldots , \alpha _n\), and \(p\in \mathbb {Q}[z]\) is another polynomial of degree \(m\). We want to study the transformation of the roots of \(f\) under the action of \(p\). So we want to find the polynomial whose roots are \(p(\alpha _1), p(\alpha _2), \ldots , p(\alpha _n)\). We denote this polynomial by \(T(f,p)\). For the sake of clarity, we use the variable \(y\) for \(T(f,p)\). There might be many polynomials whose roots are exactly \(p(\alpha _1), p(\alpha _2), \ldots , p(\alpha _n)\), so we think of \(T(f,p)\) as the equivalence class of polynomials whose roots are exactly \(p(\alpha _1), p(\alpha _2), \ldots , p(\alpha _n)\). If we require \(T(f,p)\) to be monic, then it is uniquely defined. By the fundamental theorem of symmetric polynomials it follows that \(T(f,p)\) is also a univariate polynomial in \(y\) with coefficients in \(\mathbb {Q}\).

When \(p\) is linear

As a first example, suppose \(p(x)=z+1\). Then we have

\[ T(f,p) = f(y-1). \]

More generally for any linear polynomial \(p(x)=az+b\) with \(a\neq 0\), we have

\[ T(f,p) = f\left (\frac {y-b}{a}\right ). \]

When \(p\) is quadratic

It is easy to observe that to obtain \(T(f,p)\), we somehow need to substitute the ”inverse” function of \(p(x)\) in \(f\). What if \(p(x)\) does not have an inverse function? For example, consider \(p(x)=x^2\). In this case, we might try the naive approach of substituting \(\sqrt {y}\) in place of \(x\) in \(f\). This gives us

\[ T(f,p) = f\left (\sqrt {y}\right ). \]

However, this is not a valid substitution since \(f\left (\sqrt {y}\right )\) is not a polynomial in \(y\). This illustrates the challenge of transforming roots under non-invertible polynomials. So what can we do in this case? Let us try to work with this naive approach and see if we can find a suitable polynomial representation. For this, we first write \(f\) as:

\[ f(x) = g(x)+xh(x) \]

where \(g(x)\) both have only even degree monomials, is is easy to see that such a representation is always possible. Now let us consider \(f(\sqrt {y})\):

\[ a(y) := f(\sqrt {y}) = g(\sqrt {y})+\sqrt {y}h(\sqrt {y}). \]

Suppose \(\alpha \in \mathbb {C}\) is a root of \(f\). Then we have \(f(\alpha )=0\). And intuitively it feels that \(a(y)\) is the ”right” function to consider. How do we the make above equation a polynomial? We make it a polynomial by simply taking \(\sqrt {y}\) on the right hand side and squaring both sides. This gives us the polynomial equation

\[ \left (g(\sqrt {y})\right )^2= y \left (h(\sqrt {y})\right )^2. \]

So we now consider the polynomial \(b(y) := \left (g(\sqrt {y})\right )^2- y \left (h(\sqrt {y})\right )^2\). First notice that \(b(y)\) is indeed a polynomial in \(y\) of degree \(n\). Also \(b(y)\) vanishes at \(y=\alpha ^2\) for each root \(\alpha \) of \(f\). To see this, first see that \(f(\alpha )=0\) implies that \(g(\alpha )=-\alpha h(\alpha )\). Squaring both sides gives us \(\left (g(\alpha )\right )^2=\alpha ^2 \left (h(\alpha )\right )^2\). Hence \(\alpha ^2\) is a root of \(b(y)\). Therefore in this case we have: \(T(f,z^2) = b(y)\).

Exercise 1. Suppose \(f(x)\in \mathbb {Q}[x]\) is a polynomial computed by an arithmetic circuit of size \(s\), show that \(T(f,z^2)\) can be computed by an arithmetic circuit of size at most \(O(s)\).

Higher Powers

How do we extend the above ideas for \(p=z^3\) or more generally for \(p=z^m\)? For this, the most elegant way is to use the idea of resultants.See Resultant on wikipedia for a gentle introduction. Resultant is defined as follows: Given two polynomials \(f(x)\) and \(g(x)\), the resultant of \(f\) and \(g\), denoted \(\text {Res}(f,g)\), is a polynomial function of the coefficients of \(f\) and \(g\) such that \(\text {Res}(f,g)=0\) if and only if \(f\) and \(g\) have a common root and is usually defined as the determinant of the Sylvester matrix of \(f\) and \(g\). The property that we need of the resultant here is that

\begin{equation} \text {Res}(f,g) = \left (\mathrm {LC}(f)\right )^m\prod _{i=1}^n g(\alpha _i)=(-1)^{mn}\left (\mathrm {LC}(g)\right )^n\prod _{j=1}^m f(\beta _j)\label {eq:resultant} \end{equation}

where \(\alpha _1,\ldots ,\alpha _n\) are the roots of \(f\), \(\beta _1,\ldots ,\beta _m\) are the roots of \(g\), and \(\mathrm {LC}\) denotes the leading coefficient. Now consider the polynomial \(g(x)=x^m-y\). By using equation (1) we have:

\[ \text {Res}(f,g)=f_n^m\prod _{i=1}^n(\alpha _i^m-y)=T(f,z^m). \]

By using equation (1) it is also easy to see the argument we made for \(m=2\). To see this, notice that \(\text {Res}(f,x^2-y)=f(\sqrt y)f(-\sqrt {y})\). Now it is easy to see that \(f(\sqrt y)f(-\sqrt {y})=b(y)\) where \(b(y)\) is the polynomial we defined in the previous section. Equation (1) also shows that:

\begin{equation} T(f,z^m)=\prod _{i=1}^mf\left (\zeta _m^i{y^{\frac {1}{m}}}\right )\label {eq:TforZtothem}, \end{equation}

where \(\zeta _m\) is a primitive \(m\)-th root of unity. So how do we compute \(T(f,z^m)\) using equation (2)? For \(j\in [m]\), we want to compute \(f\left (\zeta _m^j{y^{\frac {1}{m}}}\right )\). Given an arithmetic circuit \(C\) of size \(s\) for \(f\), we split each gate of \(C\) into \(m\) gates. Suppose a gate \(G\) of \(C\) computes the polynomial \(q_G(x)\) then \(q_G(x\zeta _m^j{y^{\frac {1}{m}}})\) looks like \(\sum _{i=0}^{m-1} q_i(y)y^{\frac {i}{m}}\) for some polynomials \(q_i(y)\in \mathbb {Q}[\zeta _m][y]\). So the \(i\)-the copy of \(G\) essentially computes \(q_i(y)\). Now it is easy to see that this can be propagated through the circuit. This gives us a circuit for \(f\left (\zeta _m^j{y^{\frac {1}{m}}}\right )\) of size \(O(sm^2)\), but this circuit is over \(\mathbb {Q}[\zeta _m]\). Now we can multiply these \(m\) polynomials using a tree of multiplications to get a circuit for \(T(f,z^m)\) of size \(O(sm^3)\). To get rid of \(\zeta _m\), we can write each coefficient of the polynomial computed by the circuit as \(a_0+a_1\zeta _m+\ldots +a_{m-1}\zeta _m^{m-1}\) for some \(a_i\in \mathbb {Q}\) and then replace this by a vector of length \(m\) with \(i\)-th coordinate equal to \(a_i\). This gives us a circuit over \(\mathbb {Q}\) of size \(O(sm^5)\) computing \(T(f,z^m)\). Hence we obtain the following theorem:

Theorem 1. Let \(f(x)\in \mathbb {Q}[x]\) be a polynomial computed by an arithmetic circuit of size \(s\). Then \(T(f,z^m)\) can be computed by an arithmetic circuit of size \(O(sm^5)\).

General Polynomials \(p(z)\)

What if \(p(z)\) is not a power \(z^m\) but an arbitrary polynomial of degree \(m\)? In this case, we can use the same idea of resultants. In this case also we have:

\begin{equation} T(f,z^m)=\prod _{i=1}^mf(\theta _i)\label {eq:Tforgeneralpz}, \end{equation}

where \(\theta _1,\ldots ,\theta _m\) are the roots of \(p(x)-y\) over the algebraic closure of \(\mathbb {Q}(y)\). Notice that \(p(x)-y\) is irreducible over \(\mathbb {Q}(y)\), hence \(\theta _1,\ldots ,\theta _m\) are conjugates of each other. Choose any arbitrary root \(\theta \in \{\theta _1,\ldots ,\theta _m\}\). Then notice that

\[ \prod _{i=1}^m f(\theta _i) \]

is simply the field norm of \(f(\theta )\) from the extension \(\mathbb {Q}(y,\theta )/\mathbb {Q}(y)\):

\[ \operatorname {Norm}_{\mathbb {Q}(y,\theta )/\mathbb {Q}(y)}(f(\theta )) \;=\; \prod _{\sigma } \sigma (f(\theta )) \;=\; \prod _{i=1}^m f(\theta _i). \]

Hence \(T(f,p(z))\) is nothing but the norm of the element \(f(\theta )\) down to the base field \(\mathbb {Q}(y)\). Since norms always land in the base field, we conclude that \(T(f,p(z)) \in \mathbb {Q}(y)\). Equivalently,

\[ T(f,p(z)) \;=\; \operatorname {Norm}_{\mathbb {Q}(y,\theta )/\mathbb {Q}(y)}(f(\theta )). \]

Computational perspective

This norm can be computed in several equivalent ways:
\(\bullet \) Via conjugates: directly expand \(\prod _{i=1}^m f(\theta _i)\).
\(\bullet \) Via determinant: build the multiplication matrix \(M_{f(\theta )}\) acting on the \(\mathbb {Q}(y)\)-vector space \(\mathbb {Q}(y,\theta )\) with basis \(\{1,\theta ,\ldots ,\theta ^{m-1}\}\), and compute \(\det (M_{f(\theta )})\).

Gate-by-gate construction of \(M_{f(\theta )}\) for general \(p(z)\)

Let

\[ q(x) = \sum _{i=0}^m p_i x^i - y \in \mathbb {Q}(y)[x] \]

and let \(\theta \) be a root of \(q(x)\). We want to construct the multiplication matrix

\[ M_{f(\theta )} \in \mathbb {Q}(y)^{m \times m} \]

for a polynomial \(f(x)\in \mathbb {Q}[x]\) given as an arithmetic circuit.

Step 1: Basis and vector representation.

Use the standard basis

\[ \{1, \theta , \theta ^2, \dots , \theta ^{m-1}\}. \]

Every element \(u(\theta ) \in \mathbb {Q}(y,\theta )\) is represented by a vector

\[ \mathbf {u} = (u_0, u_1, \dots , u_{m-1}) \in \mathbb {Q}(y)^m, \qquad u(\theta ) = \sum _{i=0}^{m-1} u_i \theta ^i. \]

Step 2: Input gates.

For each input gate in the circuit of \(f\):
\(\bullet \) Constant \(c \in \mathbb {Q}\): vector \((c,0,\dots ,0)\).
\(\bullet \) Variable \(x\): vector \((0,1,0,\dots ,0)\).

Step 3: Internal gates.

Process gates in topological order:

\(\bullet \)Addition / subtraction: Componentwise addition/subtraction of the vectors.
\(\bullet \) Scalar multiplication \(c \in \mathbb {Q}\): Multiply each coordinate by \(c\).
\(\bullet \) Multiplication: Let \(\mathbf {u} = (u_0,\dots ,u_{m-1})\), \(\mathbf {v} = (v_0,\dots ,v_{m-1})\) be the vectors for \(u(\theta ), v(\theta )\).

Compute the convolution

\[ w_k = \sum _{i+j=k} u_i v_j, \quad 0 \le k \le 2m-2, \]

and then reduce modulo \(q(x)\) using

\[ \theta ^m = -\frac {1}{p_m}\Big (\sum _{i=0}^{m-1} p_i \theta ^i - y \Big ) \]

repeatedly to express all powers \(\theta ^k\) with \(k\ge m\) in terms of \(\theta ^0,\dots ,\theta ^{m-1}\). The resulting vector is the output of the multiplication gate.

Step 4: Constructing \(M_{f(\theta )}\).

Let \(\mathbf {f} = (f_0,\dots ,f_{m-1})\) be the final vector for \(f(\theta )\). The \(j\)-th column of \(M_{f(\theta )}\) is obtained as follows:
\(\bullet \) Compute the coefficient vector of \(\theta ^j f(\theta )\) by shifting \(\mathbf {f}\) by \(j\) positions.
\(\bullet \)Reduce modulo \(q(x)\) using the relation for \(\theta ^m\) as above, so all powers are \(< m\).
\(\bullet \) Place the resulting vector as column \(j\) of \(M_{f(\theta )}\).

Finally, the determinant of \(M_{f(\theta )}\) gives

\[ \operatorname {Norm}_{\mathbb {Q}(y,\theta )/\mathbb {Q}(y)}(f(\theta )) = T(f,p(z)) \in \mathbb {Q}(y). \]

Once again, we get the following theorem:

Theorem 2. Let \(f(x)\in \mathbb {Q}[x]\) be a polynomial computed by an arithmetic circuit of size \(s\). Then for any polynomial \(p(z)\) of degree \(m\), \(T(f,p(z))\) can be computed by an arithmetic circuit of size \(O(sm^5)\).

Problem 1. What if \(p(z)\) is also given by an arithmetic circuit of size \(t\)? Can we get a better bound than \(O(s \left (\mathrm {deg}(g)\right )^5)\) for the circuit size of \(T(f,p(z))\)?