If Shakespeare knew only one word, then his monumental works would amount to nothing more than babbling.

Main references:

lec8.pdf (rutgers.edu)

Lecture 1 (mit.edu)

# Introduction

Suppose we have a Boolean function $f:\{0,1\}^n\times\{0,1\}^n\rightarrow\{0,1\}$. It defines a communication problem as follows: there are two players, say Alice and Bob, who get inputs $x\in\{0,1\}^n$ and $y\in\{0,1\}^n$, respectively, and their goal is to compute $f(x,y)$.

For instance, in the equality problem, Alice and Bob would like to evaluate:

$$\text{EQ}(x,y)=\begin{cases} 1 & x=y\\ 0 & x\neq y \end{cases}$$

Since neither Alice nor Bob has the entire input on their own, the players need to communicate with each other to determine the value of $f(x,y)$. The communication happens according to a protocol and proceeds as follows:

Alice sends a single bit to Bob based solely on her input; after receiving this bit, Bob responds with a single bit of his own, which is a function of his input and Alice's message; the players continue in this way throughout the protocol. We assume that the last bit communicated by either player is the output of the protocol.

The main measure of efficiency in this model is the communication cost of the protocols, defined as follows.

Definition (Communication cost). The communication cost of a protocol $\pi$, denoted by $\|\pi\|$, is the worst-case number of bits communicated between Alice and Bob in $\pi$ over any choice of inputs $x,y$.

Definition (Deterministic communication complexity). The deterministic communication complexity of a function $f$ is defined as $D(f)=\min_{\pi}\|\pi\|$, where $\pi$ ranges over all protocols that solve $f$.

Note that $D(f)\leq 2n$ for any function: Alice can send her $n$-bit input to Bob one bit at a time, Bob replies with $n-1$ dummy $0$'s in between, and once Bob knows all of $x$ he outputs the final answer with one more bit.
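
To make this trivial upper bound concrete, here is a minimal Python sketch of the protocol; the transcript bookkeeping and function names are illustrative, not part of the formal model:

```python
def trivial_protocol(x: str, y: str, f):
    """Trivial protocol with D(f) <= 2n: Alice streams her n input bits
    one at a time, Bob replies with n-1 dummy 0's, and Bob announces
    f(x, y) as the final bit."""
    transcript = []
    for i, bit in enumerate(x):
        transcript.append(int(bit))      # Alice sends x_i
        if i < len(x) - 1:
            transcript.append(0)         # Bob's dummy reply
    answer = f(x, y)                     # Bob now knows all of x
    transcript.append(answer)            # the last bit is the output
    return answer, len(transcript)       # cost: n + (n-1) + 1 = 2n

eq = lambda x, y: int(x == y)
print(trivial_protocol("1011", "1011", eq))  # (1, 8): 2n bits for n = 4
```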

# Protocol Trees

An easy way of describing a protocol between Alice and Bob is by using a protocol tree:

*[Figure 1: a protocol tree]*

At the root of the tree, the left child contains all $(x,y)$ pairs for which Alice sends bit $0$, and the right child contains the $(x,y)$ pairs for which Alice sends $1$. We continue this way so that whenever we are at a node $N$, the root-to-node path corresponds to the bits communicated so far.

Theorem (Rectangle Property). For a node $N$ in the protocol tree, the set $S\subseteq\{0,1\}^n\times\{0,1\}^n$ assigned to it forms a combinatorial rectangle, i.e., $S=X\times Y$ for some $X,Y\subseteq\{0,1\}^n$. In other words:

$$\begin{cases} (x,y)\in S\\ (a,b)\in S \end{cases} \Rightarrow \begin{cases} (x,b)\in S\\ (a,y)\in S \end{cases}$$

Proof:

Here is a video illustrating this (strongly recommended): *Protocol Trees and Combinatorial Rectangles* on YouTube. Let's assume $M$ is the matrix representing the function $f$:

*[Figure 2: the communication matrix $M$ of $f$]*

When Alice sends a bit, some rows of the matrix are erased, since the bit she sends is determined only by her own input and the messages already sent/received. The transcript is represented by the root-to-node path, which is fixed. So as we track the path from the root to a node $N$, we are erasing some rows of the matrix, then some columns, and so on, since we know the protocol and the messages sent so far.

For example, if we see Alice send $0$, then Alice's input cannot be any row on which she would have sent $1$, so we erase those rows of the matrix. Every path thus corresponds to an alternating sequence of row and column removals, leading to a sub-matrix in which all entries are equal (called a monochromatic combinatorial rectangle, where "monochromatic" means all entries are the same):

*[Figure 3: the monochromatic rectangle remaining at a leaf]*

at which point we can compute the value of $f$. The theorem is saying that the remaining entries form a sub-matrix; thus, if entries $(x,y)$ and $(a,b)$ remain, then $(x,b)$ and $(a,y)$ remain as well.

$\square$
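
To see the rectangle property in action, here is a small Python sketch that simulates a toy protocol (the message functions are made up for illustration), groups all inputs by transcript, and checks that every transcript class is a combinatorial rectangle:

```python
from collections import defaultdict

def leaf_rectangles(n, alice_msg, bob_msg, rounds=2):
    """Group all inputs (x, y) by their transcript and verify that each
    group S is a combinatorial rectangle: |S| = |X| * |Y| where X and Y
    are the projections of S onto Alice's and Bob's inputs."""
    groups = defaultdict(list)
    for x in range(2 ** n):
        for y in range(2 ** n):
            t = ""
            for _ in range(rounds):
                t += str(alice_msg(x, t))  # depends on x and transcript only
                t += str(bob_msg(y, t))    # depends on y and transcript only
            groups[t].append((x, y))
    for S in groups.values():
        X = {x for x, _ in S}
        Y = {y for _, y in S}
        assert len(S) == len(X) * len(Y), "not a rectangle!"
    return len(groups)

# Toy protocol: the players alternately reveal the bits of their inputs.
alice = lambda x, t: (x >> (len(t) // 2)) & 1
bob = lambda y, t: (y >> (len(t) // 2)) & 1
print(leaf_rectangles(2, alice, bob))  # every transcript class is X x Y
```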

Let $C(f)$ denote the minimum number of monochromatic combinatorial rectangles needed to partition $M_f$ (the matrix representation of $f$).

Theorem: The number of leaves in a protocol tree is at least $C(f)$; thus the height of the tree is at least $\lceil\log C(f)\rceil$.

Example: For the function $\text{EQ}$ (shown here for $n=2$),

$$M_{\text{EQ}}=\begin{pmatrix} 1&0&0&0\\ 0&1&0&0\\ 0&0&1&0\\ 0&0&0&1 \end{pmatrix}$$

Each of the $2^n$ diagonal $1$-entries must lie in its own monochromatic rectangle (a rectangle containing two of them would also contain off-diagonal $0$-entries), thus $C(\text{EQ})\geq 2^n$ and hence $D(\text{EQ})\geq\lceil\log C(\text{EQ})\rceil\geq n$.

# A streaming lower bound for the Distinct Elements problem

Theorem. Any deterministic streaming algorithm for computing DE exactly requires $\Omega(\min\{n,m\})$ bits, given a stream of $n$ elements from the universe $[m]$.

Proof: The proof is by a reduction from Equality.

Consider an instance $(x,y)$ of the EQ problem. Define the sub-streams:

$$\sigma_A:=\{i:x_i=1\},\qquad\sigma_B:=\{j:y_j=1\}$$

where $x_i,y_j$ denote the $i$-th and $j$-th bits of $x$ and $y$, and let $\sigma:=\sigma_A\circ\sigma_B$ be the concatenation of the two sub-streams.

An important property is that when $\text{EQ}(x,y)=1$, we have $\text{DE}(\sigma)=\text{DE}(\sigma_A)=\text{DE}(\sigma_B)$, while when $\text{EQ}(x,y)=0$, either $\text{DE}(\sigma)\neq\text{DE}(\sigma_A)$ or $\text{DE}(\sigma)\neq\text{DE}(\sigma_B)$.

Assume there is a deterministic streaming algorithm $\mathcal{A}$ for computing DE that requires $s$ bits of space, and consider the following protocol $\pi$ for EQ based on $\mathcal{A}$. Alice generates $\sigma_A$ and Bob generates $\sigma_B$ locally; then Alice runs $\mathcal{A}$ on $\sigma_A$ and sends the memory contents and $\text{DE}(\sigma_A)$ to Bob, which takes $s+\log n$ bits.

Bob continues to run $\mathcal{A}$ on $\sigma_B$ starting from Alice's memory contents, obtaining the answer $\text{DE}(\sigma)$; he also computes $\text{DE}(\sigma_B)$ locally. Thus Bob knows $\text{DE}(\sigma)$, $\text{DE}(\sigma_A)$, and $\text{DE}(\sigma_B)$, and can output $\text{EQ}(x,y)$ by the property mentioned above.

Thus $s+\log n\geq D(\text{EQ})\geq n$, which leads to $s=\Omega(n)$.

$\square$
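
The wiring of this reduction is easy to simulate. Below is a hedged Python sketch in which `DistinctElements` is a stand-in for the hypothetical $s$-bit algorithm $\mathcal{A}$ (it simply stores an exact set, so it does not really use $s$ bits; only the protocol structure matters here):

```python
class DistinctElements:
    """Stand-in for the streaming algorithm A; `seen` plays the role of
    the s-bit memory contents that Alice ships to Bob."""
    def __init__(self):
        self.seen = set()
    def process(self, stream):
        self.seen.update(stream)
        return self
    def answer(self):
        return len(self.seen)

def eq_protocol(x: str, y: str) -> int:
    sigma_A = [i for i, b in enumerate(x) if b == "1"]
    sigma_B = [j for j, b in enumerate(y) if b == "1"]
    alice = DistinctElements().process(sigma_A)
    de_A = alice.answer()                                # sent to Bob
    de_B = DistinctElements().process(sigma_B).answer()  # Bob, locally
    de_sigma = alice.process(sigma_B).answer()  # continue Alice's run
    return int(de_sigma == de_A == de_B)

print(eq_protocol("0110", "0110"), eq_protocol("0110", "0111"))  # 1 0
```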

# Randomized Communication Complexity

There are two ways of introducing random bits to the communication model:

  • the private coin/randomness model where Alice and Bob have access to separate sources of randomness on their own.
  • the public coin/randomness model where players have access to a shared source of randomness.

As in other settings, we require that a randomized protocol for a problem $f$ output the correct answer to $f(x,y)$ for any $x,y$ given to Alice and Bob, with probability at least $2/3$ (or some other constant strictly greater than half).

Newman's Theorem: Any public-coin protocol $\pi$ for a communication problem $f$ with probability of success at least $1-\epsilon$ can be simulated by a private-coin protocol $\theta$ such that $\|\theta\|\leq\|\pi\|+O(\log n+\log 1/\delta)$ and $\theta$ outputs the correct answer to $f$ with probability at least $1-\epsilon-\delta$.

Proof:

Let $\pi(x,y,r)$ denote the deterministic output of protocol $\pi$ on inputs $x,y$ and public randomness $r$.

Lemma: For some $t=O(n/\delta^2)$, there exist strings $r_1,\dots,r_t$ such that for all $(x,y)\in\{0,1\}^n\times\{0,1\}^n$, we have

$$\mathop{\text{Pr}}_{i\leftarrow[t]}\left[\pi(x,y,r_i)\neq f(x,y)\right]\leq\epsilon+\delta$$

Assuming the lemma, let us see how to design a good protocol. Given the protocol $\pi$, we can hardcode $r_1,\dots,r_t$ into it, since they are independent of the input $(x,y)$. Alice then samples an index $i\in[t]$ with her private coins and sends it to Bob, using $\log t=O(\log n+\log 1/\delta)$ bits, and both players simulate protocol $\pi$ with $r_i$ in place of the public randomness. It is easy to see that this protocol has the claimed guarantees.
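
A minimal sketch of the resulting private-coin protocol $\theta$, assuming the lemma has already handed us $r_1,\dots,r_t$ (all names here are illustrative):

```python
import random

def theta(x, y, pi, hardcoded_rs):
    """Newman simulation: Alice privately samples an index i and sends it
    to Bob (ceil(log2 t) extra bits of communication); both players then
    run the public-coin protocol pi with r_i as the shared randomness."""
    i = random.randrange(len(hardcoded_rs))  # Alice's private coins
    # ... Alice sends i to Bob over the channel ...
    return pi(x, y, hardcoded_rs[i])
```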

Proof of Lemma:

Define the indicator random variable $Z(x,y,r)\in\{0,1\}$ of the event $\pi(x,y,r)\neq f(x,y)$. By the guarantee of the protocol $\pi$, we have that:

$$\forall x,y,\quad \mathbb{E}_r[Z(x,y,r)]=\text{Pr}[\pi\text{ errs on }(x,y)]\leq\epsilon$$

Now, suppose we sample the public coins $t=\lceil 2n/\delta^2\rceil$ times. By the Chernoff bound:

$$\begin{aligned} \mathop{\text{Pr}}_{r_1,\dots,r_t}\left[\frac{\sum_{i=1}^t Z(x,y,r_i)}{t}\geq\epsilon+\delta\right] &\leq \mathop{\text{Pr}}_{r_1,\dots,r_t}\left[\frac{\sum_{i=1}^t Z(x,y,r_i)}{t}-\mathbb{E}\left[\frac{\sum_{i=1}^t Z(x,y,r_i)}{t}\right]\geq\delta\right]\\ &\leq \exp(-2\delta^2 t)\\ &\leq \exp(-4n) \end{aligned}$$

As such, by a union bound over all choices of $(x,y)\in\{0,1\}^n\times\{0,1\}^n$, we have

$$\mathop{\text{Pr}}_{r_1,\dots,r_t}\left[\exists(x,y)\text{ s.t. }\frac{\sum_{i=1}^t Z(x,y,r_i)}{t}\geq\epsilon+\delta\right]\leq 2^{2n}\exp(-4n)<2^{2n-4n}<1$$

Suppose towards a contradiction that no such $r_1,\dots,r_t$ exist, i.e., for every choice of $r_1,\dots,r_t$ there is some $(x,y)\in\{0,1\}^n\times\{0,1\}^n$ with $\mathop{\text{Pr}}_{i\leftarrow[t]}\left[\pi(x,y,r_i)\neq f(x,y)\right]>\epsilon+\delta$. Then it must hold that

$$\mathop{\text{Pr}}_{r_1,\dots,r_t}\left[\exists(x,y)\text{ s.t. }\frac{\sum_{i=1}^t Z(x,y,r_i)}{t}\geq\epsilon+\delta\right]=1$$

which is a contradiction. So such $r_1,\dots,r_t$ exist.

$\square$

Remark: Since we only care about communication complexity, we do not care about how to find $r_1,\dots,r_t$; we may assume the players have unlimited local computing resources.

Remark: Newman's theorem can be seen as a very basic pseudorandom number generator (PRNG): we were able to reduce the number of random bits needed by $\pi$ to only $O(\log n+\log(1/\delta))$ (and thus communicate them between the players) at the cost of only an additive $\delta$ in the error. Note that aside from transforming public coins into private coins, this theorem also implies that any constant-error protocol can be made to work with only $O(\log n)$ bits of randomness.

Definition (Communication cost). The communication cost of a randomized protocol $\pi$ is the worst-case number of bits communicated between Alice and Bob in $\pi$ over any choice of inputs $x,y$ and the randomness of the protocol.

We can view a public coin protocol as a distribution over deterministic protocols obtained by first using the public coins to sample the deterministic protocol and then running the deterministic protocol on the input. As such, we can alternatively define the communication cost of π\pi as the maximum communication cost of any deterministic protocol in the support of this distribution.

Definition (Randomized communication complexity). The randomized communication complexity of a function $f$ is defined as $R(f)=\min_\pi\|\pi\|$, where $\pi$ ranges over all randomized protocols that solve $f$ with probability of success at least $2/3$.

Definition. Let $\mu$ be a distribution on inputs $(x,y)$. We define the distributional communication complexity of $f$ over distribution $\mu$ as $D_\mu(f)=\min_\pi\|\pi\|$, where $\pi$ ranges over all deterministic protocols that output the correct answer on inputs sampled from $\mu$ with probability at least $2/3$.

Proposition (Yao's minimax principle). For any communication problem $f:\{0,1\}^n\times\{0,1\}^n\rightarrow\{0,1\}$:

  • $D_\mu(f)\leq R(f)$ for all input distributions $\mu$.
  • $D_\mu(f)=R(f)$ for some input distribution $\mu$.

Yao's minimax principle gives us a way of proving lower bounds for randomized protocols by instead considering deterministic ones on random inputs (we typically only use the first part, which follows from a simple averaging argument; the second part implies that this approach can always give us the tightest possible bound if we are able to find the "right" distribution $\mu$).

# An example: Randomized Communication Complexity of Equality

Algorithm:

  1. Generate a public random string $a\in_R\{0,1\}^n$ uniformly at random.
  2. Alice sends $c=\sum_{i=1}^n a_i x_i \bmod 2$ to Bob.
  3. Bob computes $c'=\sum_{i=1}^n a_i y_i \bmod 2$; with probability $1/3$ he outputs $0$, and with probability $2/3$ he outputs $1$ if $c'=c$ and $0$ if $c'\neq c$. (A Python sketch follows below.)
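
Here is a sketch of this protocol in Python; the inner product is over $\mathbb{F}_2$, and the majority-vote amplification in the usage example is not part of the one-bit protocol itself:

```python
import random

def eq_one_round(x: str, y: str) -> int:
    """One bit from Alice: c = <a, x> mod 2 for a shared random string a;
    Bob compares it with c' = <a, y> mod 2 and randomizes his output."""
    a = [random.randint(0, 1) for _ in range(len(x))]    # public coins
    c = sum(ai * int(xi) for ai, xi in zip(a, x)) % 2    # Alice's bit
    c_prime = sum(ai * int(yi) for ai, yi in zip(a, y)) % 2
    if random.random() < 1 / 3:      # Bob's extra coin flip
        return 0
    return int(c == c_prime)

# Both error probabilities are 1/3, so a majority vote over independent
# repetitions is correct with high probability:
votes = sum(eq_one_round("1010", "1011") for _ in range(1001))
print(0 if votes <= 1001 // 2 else 1)   # 0 (the strings differ)
```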

Analysis:

When $x\neq y$:

$$\mathop{\text{Pr}}_{a\sim\{0,1\}^n}\left[\sum_{i=1}^n a_i x_i=\sum_{i=1}^n a_i y_i \bmod 2\right]=\frac{1}{2}$$

Thus when $x=y$, Bob outputs $1$ with probability $2/3$, and when $x\neq y$, Bob outputs $0$ with probability $1/3+2/3\cdot 1/2=2/3$.

The randomized communication complexity is $O(1)$, since Alice just needs to send one bit to Bob. And with Newman's theorem, we can turn the above protocol into a private-coin protocol.

It is also worth mentioning that one can prove an $\Omega(\log n)$ lower bound on the communication cost of any private-coin protocol for the equality problem. This in turn implies the tightness of Newman's theorem (for constant $\delta$).

# One Way Communication Complexity: The Index Problem

Problem: In the index communication problem Ind, Alice gets a string $x\in\{0,1\}^n$ and Bob gets an index $i\in[n]$; the goal of the players is to output $x_i$, i.e., $\text{Ind}(x,i)=x_i$.

Theorem. Any randomized streaming algorithm for computing DE exactly requires $\Omega(R(\text{Ind})-\log n)$ bits of space, where Ind is on domain $\{0,1\}^n$.

Proof:

Assume there is a randomized streaming algorithm $\mathcal{A}$ for computing DE that requires $s$ bits of space, and consider an instance $(x,i)$ of the Ind problem. Define the sub-streams:

$$\sigma_A:=\{j:x_j=1\},\qquad\sigma_B:=\{i\}$$

and let $\sigma:=\sigma_A\circ\sigma_B$ be the concatenation of the two sub-streams. An important property is that when $\text{Ind}(x,i)=1$, we have $\text{DE}(\sigma)=\text{DE}(\sigma_A)$, while when $\text{Ind}(x,i)=0$, we have $\text{DE}(\sigma)=\text{DE}(\sigma_A)+1$.

Similar to the previous proof, this is a reduction from Ind to DE: Alice runs $\mathcal{A}$ on $\sigma_A$ and sends the memory contents together with $\text{DE}(\sigma_A)$, and Bob finishes the run on $\sigma_B$ and compares. This yields a protocol for Ind with communication $s+O(\log n)$, so $R(\text{Ind})\leq s+O(\log n)$, i.e., $s=\Omega(R(\text{Ind})-\log n)$.

$\square$
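
The wiring is the same as in the Equality reduction; here is a sketch reusing the `DistinctElements` stand-in from above (indices are 0-based in the code):

```python
def ind_protocol(x: str, i: int) -> int:
    """One-way protocol for Ind from a DE streaming algorithm: x_i = 1
    exactly when appending sigma_B = [i] creates no new distinct element."""
    sigma_A = [j for j, b in enumerate(x) if b == "1"]
    alice = DistinctElements().process(sigma_A)
    de_A = alice.answer()                  # shipped to Bob with the memory
    de_sigma = alice.process([i]).answer()
    return int(de_sigma == de_A)

print(ind_protocol("0110", 2), ind_protocol("0110", 3))  # 1 0
```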

Definition (One-way communication complexity). In the one-way communication model, we only allow Alice to send a single message to Bob, and Bob then needs to output the answer.

*[Figure 4: the one-way communication model]*

We then define

  • One-way deterministic communication complexity: $\overrightarrow{D}(f):=\min_\pi\|\pi\|$, where $\pi$ ranges over all deterministic one-way protocols for $f$.
  • One-way randomized communication complexity: $\overrightarrow{R}(f):=\min_\pi\|\pi\|$, where $\pi$ ranges over all randomized one-way protocols for $f$ with probability of success at least $2/3$.
  • One-way distributional communication complexity: $\overrightarrow{D}_\mu(f):=\min_\pi\|\pi\|$, where $\pi$ ranges over all deterministic one-way protocols for $f$ with probability of success at least $2/3$ over the distribution $\mu$.

$\overrightarrow{D}(\text{Ind})\geq n$ is obvious: by the pigeonhole principle, if Alice's message has fewer than $n$ bits, then at least two different strings $x\neq y\in\{0,1\}^n$ are mapped to the same message $M$. Now let $i$ be an index where $x_i\neq y_i$. The output of Bob is a deterministic function of $M$ and $i$, and is thus wrong for one of $x_i$ and $y_i$, a contradiction.
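
The pigeonhole step can be seen directly in code; `encode` below stands for Alice's hypothetical message function with fewer than $n$ output bits:

```python
from itertools import product

def find_collision(encode, n):
    """Any map from n-bit strings to shorter messages sends two distinct
    strings to the same message; Bob then answers wrongly for one of them
    at any index where they differ."""
    seen = {}
    for bits in product("01", repeat=n):
        x = "".join(bits)
        m = encode(x)
        if m in seen:
            return seen[m], x   # indistinguishable to Bob
        seen[m] = x

print(find_collision(lambda x: x[:-1], 3))  # ('000', '001')
```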

Theorem. $\overrightarrow{R}(\text{Ind})=\Omega(n)$.

Proof:

Let’s define the Hamming distance:

$$\Delta(x,y):=|\{i\mid x_i\neq y_i\}|$$

Lemma: For any parameter $\delta\in(0,\frac{1}{4})$, there exists a subset $\mathcal{F}\subseteq\{0,1\}^n$ of size $\exp(n\delta^2/4)$ such that for any two $x\neq y\in\mathcal{F}$, $\Delta(x,y)>n/2-\delta n$.

Proof:

If $x,y$ are uniformly sampled from $\{0,1\}^n$, we have:

$$\mathbb{E}[\Delta(x,y)]=\sum_{i=1}^n\text{Pr}[x_i\neq y_i]=\frac{n}{2}$$

By the Chernoff bound,

$$\text{Pr}[\Delta(x,y)\leq n/2-\delta n]\leq\text{Pr}\left[|\Delta(x,y)-\mathbb{E}[\Delta(x,y)]|\geq\delta n\right]\leq\exp\left(-\frac{\delta^2}{2}\cdot n\right)$$

Let $t=\exp(n\delta^2/4)$ and suppose we sample strings $x_1,\dots,x_t$ uniformly from $\{0,1\}^n$. By a union bound:

$$\text{Pr}\left[\exists i\neq j,\ \Delta(x_i,x_j)\leq n/2-\delta n\right]\leq\binom{t}{2}\cdot\exp(-\delta^2 n/2)<1$$

Thus there must exist $x_1,\dots,x_t$ satisfying the requirements.

$\square$

The lemma gives a construction of a "large" family of strings in $\{0,1\}^n$ that are "far from" each other.
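
The probabilistic argument translates directly into a brute-force (exponential-time) sampler; a sketch with small parameters chosen purely for illustration:

```python
import math
import random

def far_family(n: int, delta: float, retries: int = 100):
    """Sample t = exp(n * delta^2 / 4) random strings and accept if all
    pairwise Hamming distances exceed n/2 - delta*n; by the lemma this
    succeeds with positive probability."""
    t = max(2, int(math.exp(n * delta ** 2 / 4)))
    for _ in range(retries):
        F = [tuple(random.randint(0, 1) for _ in range(n)) for _ in range(t)]
        if all(sum(a != b for a, b in zip(F[i], F[j])) > n / 2 - delta * n
               for i in range(t) for j in range(i + 1, t)):
            return F
    raise RuntimeError("unlucky sampling; increase retries")

F = far_family(n=100, delta=0.3)
print(len(F))  # ~9 strings, pairwise Hamming distance > 20
```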

We will use Yao's minimax principle and exhibit a distribution $\mu$ with $\overrightarrow{D}_\mu(\text{Ind})=\Omega(n)$, where we require protocols to output the correct answer with probability $0.9$ as opposed to $2/3$ (this changes $\overrightarrow{R}(\text{Ind})$ by at most a constant factor, since a $2/3$-success protocol can be amplified to $0.9$ by a constant number of repetitions).

Using the Lemma above with $\delta=0.1$, we can construct a set $\mathcal{F}$ with $|\mathcal{F}|=2^{\Omega(n)}$ such that $\Delta(x,y)>0.4n$ for all $x\neq y\in\mathcal{F}$. In the distribution $\mu$, we sample $x\in\mathcal{F}$ uniformly at random and give it to Alice, and sample $i\in[n]$ uniformly at random and independently and give it to Bob.

Now consider any deterministic one-way protocol $\pi$ for Ind on the distribution $\mu$ with probability of success at least $0.9$. By definition:

$$\mathop{\text{Pr}}_{x\in\mathcal{F}}\mathop{\text{Pr}}_{i\in[n]}[\pi\text{ errs on input }(x,i)]\leq 0.1$$

Let $\mathcal{F}'$ be the subset of $\mathcal{F}$:

$$\mathcal{F}'=\left\{x\in\mathcal{F}\;\middle|\;\mathop{\text{Pr}}_{i\in[n]}[\pi\text{ errs on input }(x,i)]\leq 0.2\right\}$$

We have $|\mathcal{F}'|\geq|\mathcal{F}|/2$, as otherwise:

$$\begin{aligned} \mathop{\text{Pr}}_{x\in\mathcal{F}}\mathop{\text{Pr}}_{i\in[n]}[\pi\text{ errs on input }(x,i)] &\geq \frac{|\mathcal{F}-\mathcal{F}'|}{|\mathcal{F}|}\cdot\mathop{\text{Pr}}_{x\in\mathcal{F}-\mathcal{F}'}\mathop{\text{Pr}}_{i\in[n]}[\pi\text{ errs on input }(x,i)]\\ &>\frac{|\mathcal{F}-\mathcal{F}'|}{|\mathcal{F}|}\cdot 0.2\\ &>\frac{1}{2}\cdot 0.2=0.1 \end{aligned}$$

which is a contradiction.

For any $x\in\mathcal{F}'$, let $M(x)$ denote the message sent by Alice to Bob when her input is $x$ (note that $M(x)$ is a deterministic function of $x$). We claim that given only $M(x)$ for any $x\in\mathcal{F}'$, we can recover the string $x$ entirely. The proof is as follows.

Given $M(x)$, define the vector $y(x)\in\{0,1\}^n$ such that for any $i\in[n]$, $y(x)_i:=\pi(x,i)$, i.e., the output of the protocol when the input to Alice is $x$ and the input to Bob is $i$. Note that $y(x)$ is a deterministic function of $M(x)$ only. Moreover, we have

$$\Delta(y(x),x)\leq 0.2n$$

by the definition of $\mathcal{F}'$, as only a $0.2$ fraction of the entries of $y(x)$ can differ from $x$. On the other hand, for any $z\in\mathcal{F}$ with $z\neq x$, we have

$$\begin{aligned} \Delta(y(x),z) &\geq \Delta(z,x)-\Delta(y(x),x)\\ &> 0.4n-0.2n\\ &\geq \Delta(y(x),x) \end{aligned}$$

As such, given $y(x)$, we can simply find the string $w\in\mathcal{F}$ which minimizes $\Delta(y(x),w)$; by the two inequalities above, this string has to be $x$, thus allowing us to recover $x$ from $y(x)$. Since $y(x)$ itself is a deterministic function of $M(x)$, we obtain a one-to-one mapping $x\mapsto M(x)$ for all $x\in\mathcal{F}'$. This implies that there need to be $|\mathcal{F}'|=2^{\Omega(n)}$ different messages sent by Alice, which means the length of her message needs to be $\Omega(n)$ bits, concluding the proof.

$\square$
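
The decoding step is just nearest-codeword search over $\mathcal{F}$; a minimal sketch:

```python
def hamming(u, v):
    return sum(a != b for a, b in zip(u, v))

def recover(y_of_x, F):
    """Given y(x), which is within Hamming distance 0.2n of x while every
    other member of F is strictly farther away, the closest member of F
    must be x itself."""
    return min(F, key=lambda w: hamming(y_of_x, w))
```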

# A reduction from nearest neighbor search to the index problem

Problem (Nearest Neighbor Search, NNS).

Given a stream of points $a_1,a_2,\dots,a_m$ in the 2-dimensional Euclidean space $[n]^2$, followed by another point $b$ at the end of the stream, and assuming $m>n$, compute:

$$\min_{i=1,\dots,m}\|a_i-b\|$$

Intuitively, if we can solve this problem efficiently, then with the same complexity we can determine whether an element is in a set or not, which seems quite difficult.

Theorem. Any streaming algorithm solving NNS requires $\Omega(n)$ bits of storage.

Proof:

The proof is by a reduction from Ind. Assume there exists an algorithm $\mathcal{A}$ computing NNS with $s$ bits. Now consider an instance $(x,i)$ of the Ind problem, and define the points:

$$\begin{aligned} &a_j=(j,x_j)\text{ for }j=1,\dots,n\\ &b=(i,0) \end{aligned}$$

and sub-streams:

$$\sigma_A=\{a_1,a_2,\dots,a_n\},\qquad\sigma_B=\{b\}$$

Let Alice run $\mathcal{A}$ on $\sigma_A$ and send the storage to Bob, which needs $s$ bits. (The answer of NNS on $\sigma_A$ alone, which would treat $a_n$ as the query point, does not matter.)

Then Bob continues to run $\mathcal{A}$ on $\sigma_B$. If the output is $0$, this means there exists a point $a_j$ satisfying:

$$\|a_j-b\|=0\Rightarrow j=i\wedge x_i=0$$

so Bob can output $0$ as the answer to Ind; otherwise, he ought to output $1$. Thus, we have $s\geq\overrightarrow{R}(\text{Ind})=\Omega(n)$.

$\square$
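
A sketch of the point construction (1-indexed as in the proof; `math.dist` is Python's built-in Euclidean distance):

```python
import math

def ind_via_nns(x: str, i: int) -> int:
    """Reduction from Ind to NNS: with a_j = (j, x_j) and query b = (i, 0),
    the nearest distance is 0 iff x_i = 0."""
    points = [(j, int(x[j - 1])) for j in range(1, len(x) + 1)]
    b = (i, 0)
    nearest = min(math.dist(a, b) for a in points)
    return 0 if nearest == 0 else 1

print(ind_via_nns("0110", 2), ind_via_nns("0110", 4))  # 1 0
```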