Robustness of splitting a junta


We say that a Boolean function $f:\{0,1\}^n \to \{0,1\}$ is a $k$-junta if $f$ has at most $k$ influencing variables.

Let $f:\{0,1\}^n \to \{0,1\}$ be a $2k$-junta. Denote the variables of $f$ by $x_1, x_2, \ldots, x_n$. Fix
$$S_1 = \{x_1, x_2, \ldots, x_{n/2}\}, \qquad S_2 = \{x_{n/2+1}, x_{n/2+2}, \ldots, x_n\}.$$
Clearly, there exists $S \in \{S_1, S_2\}$ such that $S$ contains at most $k$ of the influencing variables of $f$.

Now let $\epsilon > 0$, and assume that $f:\{0,1\}^n \to \{0,1\}$ is $\epsilon$-far from every $2k$-junta (i.e., one has to change at least an $\epsilon$ fraction of the values of $f$ in order to make it a $2k$-junta). Can we make a "robust" version of the statement above? That is, is there a universal constant $c$ and a set $S \in \{S_1, S_2\}$ such that $f$ is $\frac{\epsilon}{c}$-far from every function that contains at most $k$ influencing variables in $S$?

Note: In the original formulation of the question, $c$ was fixed as $2$. Neal's example shows that such a value of $c$ does not suffice. However, since in property testing we are usually not too concerned with constants, I have relaxed the condition a bit.


Can you clarify your terms? Is a variable "influencing" unless the value of $f$ is always independent of the variable? Does "change a value of $f$" mean changing one of the values $f(x)$ for some particular $x$?
Neal Young
Sure. A variable $x_i$ is influencing if there exists an $n$-bit string $y$ such that $f(y) \ne f(y')$, where $y'$ is the string $y$ with its $i$-th coordinate flipped. Changing a value of $f$ means making a change in its truth table.
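The two definitions in this comment can be checked by brute force on small examples. The following is a minimal sketch, not part of the original thread: the helper names `influencing_variables` and `distance` and the function `example_f` are hypothetical, inputs are represented as tuples of bits, and coordinates are 0-indexed (so $x_1$ corresponds to index 0).

```python
from itertools import product

def influencing_variables(f, n):
    """Indices i such that flipping bit i changes f on at least one input."""
    result = []
    for i in range(n):
        for y in product((0, 1), repeat=n):
            flipped = y[:i] + (1 - y[i],) + y[i + 1:]
            if f(y) != f(flipped):
                result.append(i)
                break
    return result

def distance(f, g, n):
    """Fraction of the 2**n truth-table entries on which f and g differ."""
    points = list(product((0, 1), repeat=n))
    return sum(f(x) != g(x) for x in points) / len(points)

# Hypothetical example: f depends only on bits 0 and 3, so it is a 2-junta.
def example_f(x):
    return x[0] ^ x[3]

def zero(x):
    return 0

print(influencing_variables(example_f, 4))  # [0, 3]
print(distance(example_f, zero, 4))         # 0.5
```

Both helpers enumerate all $2^n$ inputs, so they are only practical for small $n$.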

Answers:


The answer is "yes". The proof is by contradiction.

For notational convenience, let us denote the first $n/2$ variables by $x$, and the second $n/2$ variables by $y$. Suppose that $f(x,y)$ is $\delta$-close to a function $f_1(x,y)$ that depends on only $k$ coordinates of $x$ (and possibly on any coordinates of $y$). Denote its influential coordinates among $x$ by $T_1$. Similarly, suppose that $f(x,y)$ is $\delta$-close to a function $f_2(x,y)$ that depends on only $k$ coordinates of $y$. Denote its influential coordinates among $y$ by $T_2$. We need to prove that $f$ is $4\delta$-close to a $2k$-junta $\tilde f(x,y)$.

Let us say that $(x_1,y_1) \sim (x_2,y_2)$ if $x_1$ and $x_2$ agree on all coordinates in $T_1$ and $y_1$ and $y_2$ agree on all coordinates in $T_2$. We choose uniformly at random a representative from each equivalence class. Let $(\bar x,\bar y)$ be the representative for the class of $(x,y)$. Define $\tilde f$ as follows:
$$\tilde f(x,y) = f(\bar x,\bar y).$$

It is obvious that $\tilde f$ is a $2k$-junta (it depends only on variables in $T_1 \cup T_2$). We shall prove that it is at distance at most $4\delta$ from $f$ in expectation.

We want to prove that
$$\mathbb{E}_{\tilde f}\Bigl[\Pr_{x,y}\bigl(\tilde f(x,y) \ne f(x,y)\bigr)\Bigr] = \Pr\bigl(f(\bar x,\bar y) \ne f(x,y)\bigr) \le 4\delta,$$
where $x$ and $y$ are chosen uniformly at random. Consider a random vector $\tilde x$ obtained from $x$ by keeping all bits in $T_1$ and randomly flipping (that is, independently resampling) all bits not in $T_1$, and a vector $\tilde y$ defined similarly with respect to $T_2$. Note that
$$\Pr\bigl(\tilde f(x,y) \ne f(x,y)\bigr) = \Pr\bigl(f(\bar x,\bar y) \ne f(x,y)\bigr) = \Pr\bigl(f(\tilde x,\tilde y) \ne f(x,y)\bigr).$$

We have
$$\Pr\bigl(f(x,y) \ne f(\tilde x,y)\bigr) \le \Pr\bigl(f(x,y) \ne f_1(x,y)\bigr) + \Pr\bigl(f_1(x,y) \ne f_1(\tilde x,y)\bigr) + \Pr\bigl(f_1(\tilde x,y) \ne f(\tilde x,y)\bigr) \le \delta + 0 + \delta = 2\delta.$$

Similarly, $\Pr\bigl(f(\tilde x,y) \ne f(\tilde x,\tilde y)\bigr) \le 2\delta$. We have
$$\Pr\bigl(f(\bar x,\bar y) \ne f(x,y)\bigr) \le 4\delta.$$
QED
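For readers who want the last step spelled out, here is one way to combine the two $2\delta$ bounds (a sketch, using only the quantities defined in the proof). It passes through the intermediate point $(\tilde x, y)$ and uses the fact that $(\tilde x, y)$ and $(\tilde x, \tilde y)$ are each uniformly distributed, which is what justifies each $\delta$ term:

$$
\begin{aligned}
\Pr\bigl(f(\bar x,\bar y) \ne f(x,y)\bigr)
&= \Pr\bigl(f(\tilde x,\tilde y) \ne f(x,y)\bigr) \\
&\le \Pr\bigl(f(x,y) \ne f(\tilde x,y)\bigr) + \Pr\bigl(f(\tilde x,y) \ne f(\tilde x,\tilde y)\bigr) \\
&\le 2\delta + 2\delta = 4\delta.
\end{aligned}
$$

Since the expected distance over the random choice of representatives is at most $4\delta$, at least one choice of representatives gives a $2k$-junta within distance $4\delta$ of $f$.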

It is easy to "derandomize" this proof. For every $(x,y)$, let $\tilde f(x,y) = 1$ if $f(x',y') = 1$ for most $(x',y')$ in the equivalence class of $(x,y)$, and $\tilde f(x,y) = 0$ otherwise.
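The majority-vote construction can be tried out directly on small instances. The sketch below is illustrative only: `majority_junta`, `distance`, and the toy function `noisy_f` are hypothetical names, inputs are bit tuples with 0-indexed coordinates, and `coords` plays the role of $T_1 \cup T_2$.

```python
from itertools import product

def distance(f, g, n):
    """Fraction of the 2**n inputs on which f and g differ."""
    points = list(product((0, 1), repeat=n))
    return sum(f(x) != g(x) for x in points) / len(points)

def majority_junta(f, n, coords):
    """Majority vote of f over each class of inputs that agree on `coords`.

    Returns a function depending only on the bits listed in `coords`;
    ties go to 0, matching "1 if f = 1 for most points, and 0 otherwise".
    """
    ones, size = {}, {}
    for x in product((0, 1), repeat=n):
        key = tuple(x[i] for i in coords)
        ones[key] = ones.get(key, 0) + f(x)
        size[key] = size.get(key, 0) + 1
    table = {key: int(2 * ones[key] > size[key]) for key in size}
    return lambda x: table[tuple(x[i] for i in coords)]

# Toy instance (an assumption for illustration): n = 4, halves {0, 1} and {2, 3},
# T1 = {0}, T2 = {2}. noisy_f equals x0 AND x2 except at the input (0, 1, 0, 1).
def noisy_f(x):
    return (x[0] & x[2]) ^ int(x == (0, 1, 0, 1))

f_tilde = majority_junta(noisy_f, 4, coords=[0, 2])
print(distance(noisy_f, f_tilde, 4))  # 0.0625, i.e. one of the 16 inputs
```

In this toy instance $\delta = 1/16$ (the function `x[0] & x[2]` serves as both $f_1$ and $f_2$), so the resulting distance of $1/16$ is well within the $4\delta$ bound.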

Yury

The smallest $c$ that the bound holds for is $c = \frac{1}{\sqrt{2}-1} \approx 2.41$.

Lemmas 1 and 2 show that the bound holds for this $c$. Lemma 3 shows that this bound is tight.

(In comparison, Yury's elegant probabilistic argument gives $c = 4$.)

Let $c = \frac{1}{\sqrt{2}-1}$. Lemma 1 gives the upper bound for $k = 0$.

Lemma 1: If $f$ is $\epsilon_g$-near a function $g$ that has no influencing variables in $S_2$, and $f$ is $\epsilon_h$-near a function $h$ that has no influencing variables in $S_1$, then $f$ is $\epsilon$-near a constant function, where $\epsilon \le \frac{c}{2}(\epsilon_g + \epsilon_h)$.

Proof. Let $\epsilon$ be the distance from $f$ to a constant function. Suppose for contradiction that $\epsilon$ does not satisfy the claimed inequality. Let $y = (x_1, x_2, \ldots, x_{n/2})$ and $z = (x_{n/2+1}, \ldots, x_n)$ and write $f$, $g$, and $h$ as $f(y,z)$, $g(y,z)$ and $h(y,z)$, so $g(y,z)$ is independent of $z$ and $h(y,z)$ is independent of $y$.

(I find it helpful to visualize $f$ as the edge-labeling of the complete bipartite graph with vertex sets $\{y\}$ and $\{z\}$, where $g$ gives a vertex-labeling of $\{y\}$, and $h$ gives a vertex-labeling of $\{z\}$.)

Let $g_0$ be the fraction of pairs $(y,z)$ such that $g(y,z) = 0$. Let $g_1 = 1 - g_0$ be the fraction of pairs such that $g(y,z) = 1$. Likewise let $h_0$ be the fraction of pairs such that $h(y,z) = 0$, and let $h_1$ be the fraction of pairs such that $h(y,z) = 1$.

Without loss of generality, assume that, for any pair such that $g(y,z) = h(y,z)$, it also holds that $f(y,z) = g(y,z) = h(y,z)$. (Otherwise, toggling the value of $f(y,z)$ allows us to decrease both $\epsilon_g$ and $\epsilon_h$ by $1/2^n$, while decreasing $\epsilon$ by at most $1/2^n$, so the resulting function is still a counter-example.) Say any such pair is "in agreement".

The distance from $f$ to $g$ plus the distance from $f$ to $h$ is the fraction of $(y,z)$ pairs that are not in agreement. That is, $\epsilon_g + \epsilon_h = g_0 h_1 + g_1 h_0$.

The distance from $f$ to the all-zero function is at most $1 - g_0 h_0$.

The distance from $f$ to the all-ones function is at most $1 - g_1 h_1$.

Further, the distance from $f$ to the nearest constant function is at most $1/2$.

Thus, the ratio $\epsilon/(\epsilon_g + \epsilon_h)$ is at most
$$\frac{\min(1/2,\; 1 - g_0 h_0,\; 1 - g_1 h_1)}{g_0 h_1 + g_1 h_0},$$
where $g_0, h_0 \in [0,1]$ and $g_1 = 1 - g_0$ and $h_1 = 1 - h_0$.

By calculation, this ratio is at most $\frac{1}{2(\sqrt{2}-1)} = c/2$. QED
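The "by calculation" step can be sanity-checked numerically. The sketch below is a grid search and not a proof: the names `ratio`, `STEPS`, `grid_max`, and `at_optimum` are introduced here for illustration. It evaluates the ratio from the proof on a grid of $(g_0, h_0)$ values and compares the largest value with $c/2 \approx 1.2071$; the point $g_0 = h_0 = 1/\sqrt{2}$ is where the bound is attained.

```python
import math

C = 1 / (math.sqrt(2) - 1)  # the constant c from the text, about 2.41

def ratio(g0, h0):
    """Upper bound on eps / (eps_g + eps_h) from the proof, for given g0 and h0."""
    g1, h1 = 1 - g0, 1 - h0
    disagreement = g0 * h1 + g1 * h0
    if disagreement == 0:
        # g and h agree on every pair, so (under the proof's WLOG assumption)
        # f is constant and eps = 0.
        return 0.0
    return min(0.5, 1 - g0 * h0, 1 - g1 * h1) / disagreement

STEPS = 200
grid_max = max(ratio(i / STEPS, j / STEPS)
               for i in range(STEPS + 1) for j in range(STEPS + 1))
at_optimum = ratio(1 / math.sqrt(2), 1 / math.sqrt(2))

print(grid_max <= C / 2 + 1e-9)          # the grid never exceeds c/2
print(math.isclose(at_optimum, C / 2))   # c/2 is attained at g0 = h0 = 1/sqrt(2)
```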

Lemma 2 extends Lemma 1 to general $k$ by arguing pointwise, over every possible setting of the $2k$ influencing variables. Recall that $c = \frac{1}{\sqrt{2}-1}$.

Lemma 2: Fix any $k$. If $f$ is $\epsilon_g$-near a function $g$ that has $k$ influencing variables in $S_2$, and $f$ is $\epsilon_h$-near a function $h$ that has $k$ influencing variables in $S_1$, then $f$ is $\epsilon$-near a function $\hat f$ that has at most $2k$ influencing variables, where $\epsilon \le \frac{c}{2}(\epsilon_g + \epsilon_h)$.

Proof. Express $f$ as $f(a,y,b,z)$ where $(a,y)$ contains the variables in $S_1$ with $a$ containing those that influence $h$, while $(b,z)$ contains the variables in $S_2$ with $b$ containing those influencing $g$. So $g(a,y,b,z)$ is independent of $z$, and $h(a,y,b,z)$ is independent of $y$.

For each fixed value of $a$ and $b$, define $F_{ab}(y,z) = f(a,y,b,z)$, and define $G_{ab}$ and $H_{ab}$ similarly from $g$ and $h$ respectively. Let $\epsilon^g_{ab}$ be the distance from $F_{ab}$ to $G_{ab}$ (restricted to $(y,z)$ pairs). Likewise let $\epsilon^h_{ab}$ be the distance from $F_{ab}$ to $H_{ab}$.

By Lemma 1, there exists a constant $c_{ab}$ such that the distance (call it $\epsilon_{ab}$) from $F_{ab}$ to the constant function $c_{ab}$ is at most $\frac{c}{2}(\epsilon^h_{ab} + \epsilon^g_{ab})$. Define $\hat f(a,y,b,z) = c_{ab}$.

Clearly $\hat f$ depends only on $a$ and $b$ (and thus on at most $2k$ variables).

Let $\epsilon_{\hat f}$ be the average, over the $(a,b)$ pairs, of the $\epsilon_{ab}$'s, so that the distance from $f$ to $\hat f$ is $\epsilon_{\hat f}$.

Likewise, the distances from $f$ to $g$ and from $f$ to $h$ (that is, $\epsilon_g$ and $\epsilon_h$) are the averages, over the $(a,b)$ pairs, of, respectively, $\epsilon^g_{ab}$ and $\epsilon^h_{ab}$.

Since $\epsilon_{ab} \le \frac{c}{2}(\epsilon^h_{ab} + \epsilon^g_{ab})$ for all $a, b$, it follows that $\epsilon_{\hat f} \le \frac{c}{2}(\epsilon_g + \epsilon_h)$. QED
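The averaging step can be written out in one line. This sketch reads the Lemma 1 bound as $\epsilon \le \frac{c}{2}(\epsilon_g + \epsilon_h)$ with $c = \frac{1}{\sqrt{2}-1}$, which is what the ratio computed in the proof of Lemma 1 gives, and writes $\mathbb{E}_{a,b}$ for the uniform average over the settings of $(a,b)$:

$$
\epsilon_{\hat f}
= \mathbb{E}_{a,b}\bigl[\epsilon_{ab}\bigr]
\le \mathbb{E}_{a,b}\Bigl[\tfrac{c}{2}\bigl(\epsilon^g_{ab} + \epsilon^h_{ab}\bigr)\Bigr]
= \tfrac{c}{2}\Bigl(\mathbb{E}_{a,b}\bigl[\epsilon^g_{ab}\bigr] + \mathbb{E}_{a,b}\bigl[\epsilon^h_{ab}\bigr]\Bigr)
= \tfrac{c}{2}\bigl(\epsilon_g + \epsilon_h\bigr).
$$

The only facts used are linearity of the average and that every setting of $(a,b)$ accounts for the same number of $(y,z)$ pairs, so global distances are plain averages of the restricted ones.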

Lemma 3 shows that the constant $c$ above is the best you can hope for (even for $k = 0$ and $\epsilon = 0.5$).

Lemma 3: There exists $f$ such that $f$ is $(0.5/c)$-near two functions $g$ and $h$, where $g$ has no influencing variables in $S_2$ and $h$ has no influencing variables in $S_1$, and $f$ is $0.5$-far from every constant function.

Proof. Let $y$ and $z$ be $x$ restricted to, respectively, $S_1$ and $S_2$. That is, $y = (x_1, \ldots, x_{n/2})$ and $z = (x_{n/2+1}, \ldots, x_n)$.

Identify each possible $y$ with a unique element of $[N]$, where $N = 2^{n/2}$. Likewise, identify each possible $z$ with a unique element of $[N]$. Thus, we think of $f$ as a function from $[N] \times [N]$ to $\{0,1\}$.

Define $f(y,z)$ to be $1$ iff $\max(y,z) \ge \frac{1}{\sqrt{2}} N$.

By calculation, the fraction of $f$'s values that are zero is $\bigl(\frac{1}{\sqrt{2}}\bigr)^2 = \frac{1}{2}$, so both constant functions have distance $\frac{1}{2}$ to $f$.

Define $g(y,z)$ to be $1$ iff $y \ge \frac{1}{\sqrt{2}} N$. Then $g$ has no influencing variables in $S_2$. The distance from $f$ to $g$ is the fraction of pairs $(y,z)$ such that $y < \frac{1}{\sqrt{2}} N$ and $z \ge \frac{1}{\sqrt{2}} N$. By calculation, this is at most $\frac{1}{\sqrt{2}}\bigl(1 - \frac{1}{\sqrt{2}}\bigr) = 0.5/c$.

Similarly, the distance from $f$ to $h$, where $h(y,z) = 1$ iff $z \ge \frac{1}{\sqrt{2}} N$, is at most $0.5/c$.

QED
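The construction can be checked by exact counting for a concrete $N$. The sketch below makes two assumptions: $[N]$ is taken to be $\{1, \ldots, N\}$, and the function name `lemma3_distances` is introduced here for illustration. Because $N/\sqrt{2}$ is never an integer, the fractions for finite $N$ only approximate $\frac{1}{2}$ and $0.5/c$, approaching them as $N$ grows.

```python
import math

C = 1 / (math.sqrt(2) - 1)  # the constant c from the text, about 2.41

def lemma3_distances(N):
    """Exact distances for the threshold construction on [N] x [N]."""
    threshold = N / math.sqrt(2)
    low = sum(1 for v in range(1, N + 1) if v < threshold)  # values below threshold
    high = N - low                                          # values at or above it
    zero_fraction = low * low / N**2     # f = 0 iff both y and z are below threshold
    dist_to_constant = min(zero_fraction, 1 - zero_fraction)
    dist_to_g = low * high / N**2        # f != g iff y < threshold <= z
    dist_to_h = high * low / N**2        # f != h iff z < threshold <= y
    return dist_to_constant, dist_to_g, dist_to_h

# N = 2**(n/2) with n = 20 variables.
dist_const, dist_g, dist_h = lemma3_distances(2**10)
print(dist_const, dist_g, 0.5 / C)
```

With $N = 2^{10}$ the distance to the nearest constant function is within $10^{-3}$ of $0.5$, and the distances to $g$ and $h$ are within $10^{-3}$ of $0.5/c \approx 0.2071$.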

Neal Young
First of all, thanks Neal! This indeed sums it up for $k = 0$, and sheds some light on the general problem. However, in the case of $k = 0$ the problem is a bit degenerate (as $2k = k$), so I'm more curious regarding the case of $k \ge 1$. I didn't manage to extend this claim for $k > 0$, so if you have an idea on how to do it, I'd appreciate it. If it simplifies the problem, then the exact constants are not crucial; that is, $\epsilon/2$-far can be replaced by $\epsilon/c$-far, for some universal constant $c$.
I've edited it to add the extension to general $k$. And Yury's answer gives a slightly looser factor with an elegant probabilistic argument.
Neal Young
Sincere thanks Neal! This line of reasoning is quite enlightening.