Emile Timothy
Main Page

Blog

Showing that the VC-dimension is \(\Sigma_3^P\)-complete

The VC (Vapnik–Chervonenkis) dimension is a measure of complexity of a set of functions that can be learned by a simple binary classification algorithm. To define it more formally, we need the concept of shattering.

Given a collection \(S = S_1, ..., S_M\) of subsets of a finite set \(U\), the VC-dimension of \(S\) is the size of the largest set \(X \subseteq U\) such that for every \(X' \subseteq X\), there is an \(i\) for which \(S_i \cap X = X'\). For this \(i\), we can write that \(X\) is shattered by \(S\). So, a Boolean circuit \(C\) that computes a function \(f:\{0, 1\}^m \times \{0, 1\}^n \rightarrow \{0, 1\}\) succinctly represents a collection \(S\) of \(𝟐^π’Ž\) subsets of \(𝑼 = \{𝟎, 𝟏\}^𝒏\) as follows: the set \(𝑺_π’Š\) consists of exactly those elements \(x\) for which \(π‘ͺ(π’Š, 𝒙) = 𝟏\).

Using this notion, the language VC-DIMENSION is the set of pairs \((C, k)\) for which \(C\) represents a collection of subsets \(S\) whose VC dimension is at least \(k\). The purpose of this blog post is to show that this language has an intricate relationship with the polynomial hierarchy in computational complexity.

To explain this better, I'm going to introduce some notation:

Let \(\text{P}^A\) be the set of problems solvable in poly-time with a TM that uses an oracle for a problem in \(A\).
Let \(\text{NP}^A\) be the set of problems solvable in nondeterministic poly-time with a TM that uses an oracle for a problem in \(A\).
Let \(\text{coNP}^A\) be the set of problems solvable in co-nondeterministic poly-time with a TM that uses an oracle for a problem in \(A\).

Then, define \(\Delta_0^P := \Sigma_0^P := \Pi_0^P:=P\), where \(P\) is the set of problems that can be solved in poly-time. Then, for all \(i \geq 0\), define: $$\Delta_{i+1}^P:=\text{P}^{\Sigma_i^P}$$ $$\Sigma_{i+1}^P:=\text{NP}^{\Sigma_i^P}$$ $$\Pi_{i+1}^P:=\text{coNP}^{\Sigma_i^P}$$ The result that I want to show here is that the VC-dimension is in \(\Sigma_3^P\), and that is the strongest form of problem in \(\Sigma_3^P\). So, any solution that can compute the VC-dimension can be used to compute any other problem in \(\Sigma_3^P\). The complexity-theoretic word for this phenomenon is completenesss. That is to say, the V-dimension is \(\Sigma_3^P\)-complete.


By definition, \(X\) being shattered by a collection of sets \(S\) means that \(βˆ€π‘‹β€² βŠ† 𝑋, βˆƒπ‘– s.t. 𝑆𝑖 ∩ 𝑋 = 𝑋′\). This is the same thing as writing that \(X\) is shattered by a collection of sets \(C\) if and only if $$𝑃(𝐴) = \{𝑋 ∩ 𝑆_𝑖 | S_i ∈ 𝐢\}$$ However, since a set of size \(m\) has a power set of size \(2^π‘š\), no set of size \(> m\) will have \(2^π‘š\) possible subsets (a set of size \(m + 1\) will have a power set of size \(2^{π‘š+1} = 2 \cdot 2^π‘š > 2^π‘š\)). Therefore, a set \(X\) that is shattered by a collection of \(2^π‘š\) subsets will have a size atmost \(m\).

Hence, the largest possible \(X\) will have size \(m\).



A Boolean circuit \(C\) that computes a function \(f: \{0, 1\}^π‘š Γ— \{0, 1\}^𝑛 β†’ \{0, 1\}\) succinctly represents a collection \(s\) of \(2^π‘š\) subsets of \(π‘ˆ = \{0, 1\}^𝑛\). However, if the Boolean circuit represents \(2^π‘š\) possible subsets, then (by lemma 1) the VC-DIMENSION of \(C\) is \(π‘š\).

Note then that by definition: $$Ξ£_3^𝑃 = \text{NP}^{Ξ£_2^P} = \text{NP}^{\text{NP}^{\Sigma_1^P}} = \text{NP}^{\text{NP}^{\text{NP}}}$$ Therefore, we can now define instances of the 𝑉𝐢-dimension decision problem by phrasing the question as whether whether there is a β€œ\(\{(𝐢, π‘š) \text{ s.t. } VC(C) β‰₯ π‘š\}\)”?” where \(C\) is the collection of circuits which succinctly represent the collection of subsets \(S\).

The β€œyes” instance (using the \(NP^{NP^{NP}}\) notation) of the VC-dimension problem would then be: $$\text{VCβˆ’DIMENSION}=\{(𝐢,π‘š): βˆƒX,Xβ‰₯m \text{ s.t. }βˆ€X' βŠ†π‘‹,βˆƒπ‘–\text{ s.t. }Siβˆ©π‘‹=𝑋'\}$$ $$\text{VCβˆ’DIMENSION}=\{(𝐢,π‘š):βˆƒX,|X|β‰₯m \text{ s.t. }βˆ€X' βŠ†π‘‹,βˆƒπ‘–\text{ s.t. }βˆ€x∈X',C(i,x)=1\}$$ Since every certificate would also need a certificate which would in turn need a proof of membership (just by the nature of this representation), where \(i, X, X'\) are polynomially large in \(m\), the nondeterminism can just act on every subset knowing that each subset would be polynomially large since \(π‘π‘œπ‘™π‘¦ (π‘π‘œπ‘™π‘¦(π‘π‘œπ‘™π‘¦(π‘š))) ∈ π‘π‘œπ‘™π‘¦(π‘š)\).

Therefore, we can decide a language in polynomial time since atleast one of the (polynomially long) computation paths would lead to an β€œACCEPT” state if it is a β€œyes” instance and none of the (polynomially long) computation paths would lead to a β€œREJECT” state if it is a β€œno” instance.

Therefore, we have that: $$\text{VC-DIMENSION} ∈ \Sigma_3^P$$ QED.



Given an instance of VC-DIMENSION: $$\text{VCβˆ’DIMENSION}=\{(𝐢,π‘š):βˆƒX,|X|β‰₯m\text{ s.t. }βˆ€X' βŠ†π‘‹,βˆƒπ‘–\text{ s.t. }βˆ€x∈X',C(i,x)=1\}$$ We can show that it is \(Ξ£_3𝑃\)-complete by reducing from an instance of \(𝑄𝑆𝐴𝑇_3\) which is known to be \(Ξ£_3^P\)-complete. Therefore, let an arbitrary instance of \(𝑄𝑆𝐴𝑇_3\) be: $$βˆƒπ‘₯_1 βˆ€π‘₯_2 βˆƒπ‘₯_3 \text{ s.t. } \varphi(π‘₯_1,π‘₯_2,π‘₯_3)=1$$ We then define the universe \(U\) as the set \(\{0, 1\}^\ell \times \{1, 2, 3, ... \ell\}\) s.t. \(βˆ€π‘₯_1, |π‘₯_1| = \ell\), we define the following subset: $$π‘ˆ_{π‘₯_1} = \{π‘₯_1\} \times \{1,2,3,...\ell\}$$ Using this, we define the collection of sets, \(S\), succinctly represented by circuit \(C\) (note that the circuit would have to maintain this invariant \(βˆ€π‘₯_1,π‘₯_2,π‘₯_3\), as:

1) If \(πœ‘(π‘₯_1,π‘₯_2,π‘₯_3)=1\), then \(𝑆(π‘₯_1,π‘₯_2,π‘₯_3)=\{π‘₯_1\}Γ—\{1,2,3,...\ell\},|π‘₯_1|=l\)
2) If \(πœ‘(π‘₯_1, π‘₯_2, π‘₯_3) = 0\), then \(𝑆(π‘₯_1, π‘₯_2, π‘₯_3) = πœ™\), where \(πœ™\) is the empty set (note that we don’t need to address this case here since \(πœ‘(π‘₯_1,π‘₯_2,π‘₯_3) = 0\) should really just do nothing, hence the empty set will suffice), where the name of each set is \((π‘₯_1, π‘₯_2, π‘₯_3)\).

We now define the reduction function \(f\):

Inputs: A set named \((π‘₯_1, π‘₯_2, π‘₯_3)\) and an element, \(a = (π‘₯'_1,π‘₯_2)\) of a potential subset \(π‘ˆ_{x_1}\).

Function: The isomorphism between the VC-DIMENSION instance \((𝐢,\ell)\) and \(𝑄𝑆𝐴𝑇_3\) instance \(πœ‘(π‘₯_1,π‘₯_2,π‘₯_3)\).

Deciding Procedure: We return β€œACCEPT” if \(π‘Ž ∈ 𝑆(π‘₯_1, π‘₯_2, π‘₯_3)\) and β€œNO” if \(π‘Ž βˆ‰ 𝑆(π‘₯_1, π‘₯_2, π‘₯_3)\).

Therefore, we do the following:

If \((πœ‘(π‘₯_1, π‘₯_2, π‘₯_3) = 1) ∧ (a ∈ (x_1, \ell))\), return β€œACCEPT”.

Else, return β€œREJECT”.

Finally, the onus of the proof is to show that β€œyes” instances of \(𝑄𝑆𝐴𝑇_3\) maps to β€œyes” instances of the VC- DIMENSION decision problem and vice versa, and that β€œno” instances of \(𝑄𝑆𝐴𝑇_3\) maps to β€œno” instances of the VC-DIMENSION decision problem and vice versa:

To show that β€œyes” maps to β€œyes”:
To show that β€œyes” \(𝑄𝑆𝐴𝑇_3\) instances map to β€œyes” VC-DIMENSION instances:
Given a β€œyes” instance of \(𝑄𝑆𝐴𝑇_3\), \(βˆƒπ‘₯_1 βˆ€π‘₯_2 βˆƒπ‘₯_3 \text{ s.t. } πœ‘(π‘₯_1, π‘₯_2 , π‘₯_3 ) = 1\), the set \(𝑆(π‘₯_1 , π‘₯_2 , π‘₯_3 ) = \{x_1\}\times x_2\).
Recall that the initial subset was:
$$π‘ˆ_{π‘₯_1} =\{π‘₯_1\}\times\{1,2,3,...\ell\}$$
However, since \(𝑆(π‘₯_1, π‘₯_2, π‘₯_3) = \{π‘₯_1\} \times π‘₯_2\) and \(πœ‘(π‘₯_1, π‘₯_2, π‘₯_3) = 1\), the VC dimension of \(S\) would be \(β‰₯ π‘₯_2 = \ell\).
Hence, the VC dimension of \(S\) that is represented by circuit \(C\) would be \(β‰₯ \ell\), which satisfies the decision problem.
QED.

To show that β€œyes” VC-DIMENSION instances map to β€œyes” \(𝑄𝑆𝐴𝑇_3\) instances:
Given a β€œyes” instance of VC-DIMENSION \((𝐢, π‘š)\), then the set \(S\) represented by the circuit \(C\) has a size atleast \(m\).
Therefore, \(βˆƒπ‘‹, |𝑋| β‰₯ π‘š \text{ s.t. } 𝑆 \text{ shatters } 𝑋\). It follows then by definition of shattering that \(βˆƒ π‘˜ \text{ s.t. } 𝑋 ∈ π‘ˆ_π‘˜\).
However, this can only happen if \(βˆ€π‘₯_2, {π‘₯_1} \times π‘₯_2 ∈ 𝑆\). This is the same thing as writing:
$$βˆƒπ‘₯_1 βˆ€π‘₯_2 βˆƒπ‘₯_3 s.t.πœ™(π‘₯_1,π‘₯_2,π‘₯_3) = 1,$$
which is a β€œyes” instance of \(𝑄𝑆𝐴𝑇_3\).
QED.

To show that β€œno” maps to β€œno”, the contrapositive argument holds since a β€œyes” instance of the VC- DIMENSION problem being reduced to a β€œyes” instance of \(𝑄𝑆𝐴𝑇_3\) implies that β€œno” instances of \(𝑄𝑆𝐴𝑇_3\) can be reduced to β€œno” instances of VC-DIMENSION, and a β€œyes” instance of \(𝑄𝑆𝐴𝑇_3\) being reduced to a β€œyes” instance of VC-DIMENSION implies that a β€œno” instance of VC-DIMENSION can be reduced to a β€œno” instance of \(QSAT_3\).

Since we have shown that β€œyes” maps to β€œyes” and β€œno” maps to β€œno”, we have that \(βˆƒ\) a reduction function from instances of \(𝑄𝑆𝐴𝑇_3\) to instances of the VC-DIMENSION problem and that \(βˆƒ\) a reduction function from instances of the VC-DIMENSION problem to instances of \(𝑄𝑆𝐴𝑇_3\). Additionally, since \(𝑄𝑆𝐴𝑇_3\) is known to be \(\Sigma_3^P\)-complete, it follows then that 𝑉𝐢-DIMENSION is also \(\Sigma_3^𝑃\)-complete.

QED.