Exercise 1:

def main(){
  x := if flip(1/2) then uniform(0,1) else gauss(0,1);
  r := if flip(1/2) then x else x^2;
  return r;
}

Exercise 2:

map(λx. f(g(x)))(d)
= Λb. ∫da d[a]·δ((λx. f(g(x)))(a))[b]
= Λb. ∫da d[a]·δ(f(g(a)))[b]
= Λb. ∫da' d[a']·δ(f(g(a')))[b]
= Λb. ∫da ∫da' d[a']·δ(g(a'))[a]·δ(f(a))[b]
= Λb. ∫da (Λb'. ∫da' d[a']·δ(g(a'))[b'])[a]·δ(f(a))[b]
= map(f)(Λb'. ∫da' d[a']·δ(g(a'))[b'])
= map(f)(Λb. ∫da d[a]·δ(g(a))[b])
= map(f)(map(g)(d)).

return(f(x))
= Λb. δ(f(x))[b]
= Λb. δ(f(x))[b]·∫da δ(x)[a]
= Λb. ∫da δ(x)[a]·δ(f(a))[b]
= Λb. ∫da return(x)[a]·δ(f(a))[b]
= map(f)(return(x)).

f;return
= λa.Λc. ∫db f(a)[b]·return(b)[c]
= λa.Λc. ∫db f(a)[b]·δ(b)[c]
= λa.Λc. f(a)[c]
= f.

return;f
= λa.Λc. ∫db return(a)[b]·f(b)[c]
= λa.Λc. ∫db δ(a)[b]·f(b)[c]
= λa.Λc. f(a)[c]
= f.

(f;g);h
= λa.Λc. ∫db (f;g)(a)[b]·h(b)[c]
= λa'.Λc'. ∫db' (f;g)(a')[b']·h(b')[c']
= λa'.Λc'. ∫db' (λa.Λc. ∫db f(a)[b]·g(b)[c])(a')[b']·h(b')[c']
= λa'.Λc'. ∫db' ∫db f(a')[b]·g(b)[b']·h(b')[c']
= λa'.Λc'. ∫db ∫db' f(a')[b]·g(b)[b']·h(b')[c']
= λa'.Λc'. ∫db f(a')[b]·∫db' g(b)[b']·h(b')[c']
= λa'.Λc'. ∫db f(a')[b]·(λa.Λc. ∫db' g(a)[b']·h(b')[c])(b)[c']
= λa'.Λc'. ∫db f(a')[b]·(g;h)(b)[c']
= λa.Λc. ∫db f(a)[b]·(g;h)(b)[c]
= f;(g;h).

Exercise 3:

Λr. ∫dx ∫dy [0≤x]·[x≤1]·[0≤y]·[y≤1]·δ(x+y)[r]
= Λr. ∫dx ∫dy [0≤x]·[x≤1]·[0≤y]·[y≤1]·δ(r-x)[y]/|1|
= Λr. ∫dx [0≤x]·[x≤1]·[0≤r-x]·[r-x≤1]
= Λr. ∫dx [0≤x]·[x≤1]·[x≤r]·[r-1≤x]
= Λr. ∫dx [max(0,r-1)≤x]·[x≤min(1,r)]
= Λr. [max(0,r-1)≤min(1,r)]·(min(1,r)-max(0,r-1)).

Λr. ∫dx ∫dy [0≤x]·[x≤1]·[0≤y]·[y≤1]·δ(x·y)[r]
= Λr. ∫dx ∫dy [0≤x]·[x≤1]·[0≤y]·[y≤1]·([x=0]+[x≠0])·δ(x·y)[r]
= Λr. ∫dx [0≤x]·[x≤1]·[x=0]·∫dy [0≤y]·[y≤1]·δ(x·y)[r] + ∫dx [x≠0]·∫dy [0≤x]·[x≤1]·[0≤y]·[y≤1]·δ(x·y)[r]
= Λr.
∫dx [x=0]·∫dy [0≤y]·[y≤1]·δ(x·y)[r] + ∫dx [x≠0]·∫dy [0≤x]·[x≤1]·[0≤y]·[y≤1]·δ(x·y)[r]
= Λr. ∫dx [x≠0]·∫dy [0≤x]·[x≤1]·[0≤y]·[y≤1]·δ(x·y)[r]
= Λr. ∫dx [x≠0]·∫dy [0≤x]·[x≤1]·[0≤y]·[y≤1]·δ(r/x)[y]/|x|
= Λr. ∫dx [x≠0]·[0≤x]·[x≤1]·[0≤r/x]·[r/x≤1]/|x|
= Λr. ∫dx [x≠0]·[0≤x]·[x≤1]·[0≤r]·[r≤x]/|x|
= Λr. [0≤r]·∫dx [x≠0]·[0≤x]·[r≤x]·[x≤1]/|x|
= Λr. [0≤r]·∫dx [x≠0]·[r≤x]·[x≤1]/x
= Λr. [0≤r]·[r≤1]·(log(1)-log(r))
= Λr. -[0≤r]·[r≤1]·log(r).

Λr. ∫dx (1/2·[0≤x]·[x≤1]+1/2·δ(1/2)[x])·δ(x·(1-x))[r]
= Λr. ∫dx (1/2·[0≤x]·[x≤1]+1/2·δ(1/2)[x])·([x≠1/2]+[x=1/2])·δ(x·(1-x))[r]
= Λr. ∫dx (1/2·[0≤x]·[x≤1]+1/2·δ(1/2)[x])·([x≠1/2]·δ(x·(1-x))[r]+[x=1/2]·δ(1/4)[r])
= Λr. ∫dx (1/2·[0≤x]·[x≤1]·[x≠1/2]·δ(x·(1-x))[r]+1/2·δ(1/2)[x]·[x=1/2]·δ(1/4)[r])
= Λr. ∫dx (1/2·[0≤x]·[x≤1]·[x≠1/2]·δ(x·(1-x))[r]+1/2·δ(1/2)[x]·δ(1/4)[r])
= Λr. (∫dx 1/2·[0≤x]·[x≤1]·[x≠1/2]·δ(x·(1-x))[r]) + 1/2·δ(1/4)[r]
= Λr. (∫dx 1/2·[0≤x]·[x≤1]·[x≠1/2]·([0<1-4·r]·(δ((1+√(1-4·r))/2)[x]+δ((1-√(1-4·r))/2)[x])/|1-2·x|+[0=1-4·r]·δ(1/2)[x])) + 1/2·δ(1/4)[r]
= Λr. (∫dx 1/2·[0≤x]·[x≤1]·[x≠1/2]·[0<1-4·r]·δ((1+√(1-4·r))/2)[x]/|1-2·x|) + (∫dx 1/2·[0≤x]·[x≤1]·[x≠1/2]·[0<1-4·r]·δ((1-√(1-4·r))/2)[x]/|1-2·x|) + 1/2·δ(1/4)[r]
= Λr. (∫dx 1/2·[0≤(1+√(1-4·r))/2]·[(1+√(1-4·r))/2≤1]·[(1+√(1-4·r))/2≠1/2]·[0<1-4·r]·δ((1+√(1-4·r))/2)[x]/|1-2·(1+√(1-4·r))/2|) + (∫dx 1/2·[0≤(1-√(1-4·r))/2]·[(1-√(1-4·r))/2≤1]·[(1-√(1-4·r))/2≠1/2]·[0<1-4·r]·δ((1-√(1-4·r))/2)[x]/|1-2·(1-√(1-4·r))/2|) + 1/2·δ(1/4)[r]
= Λr.
(1/2·[0≤(1+√(1-4·r))/2]·[(1+√(1-4·r))/2≤1]·[(1+√(1-4·r))/2≠1/2]·[0<1-4·r]/|1-2·(1+√(1-4·r))/2|) + (1/2·[0≤(1-√(1-4·r))/2]·[(1-√(1-4·r))/2≤1]·[(1-√(1-4·r))/2≠1/2]·[0<1-4·r]/|1-2·(1-√(1-4·r))/2|) + 1/2·δ(1/4)[r]
= Λr. (1/2·[1+√(1-4·r)≤2]·[1+√(1-4·r)≠1]·[0<1-4·r]/|-√(1-4·r)|) + (1/2·[0≤1-√(1-4·r)]·[1-√(1-4·r)≠1]·[0<1-4·r]/|√(1-4·r)|) + 1/2·δ(1/4)[r]
= Λr. (1/2·[√(1-4·r)≤1]·[√(1-4·r)≠0]·[0<1-4·r]/√(1-4·r)) + (1/2·[√(1-4·r)≤1]·[√(1-4·r)≠0]·[0<1-4·r]/√(1-4·r)) + 1/2·δ(1/4)[r]
= Λr. [√(1-4·r)≤1]·[r<1/4]/√(1-4·r) + 1/2·δ(1/4)[r]
= Λr. [1-4·r≤1]·[r<1/4]/√(1-4·r) + 1/2·δ(1/4)[r]
= Λr. [0≤r]·[r<1/4]/√(1-4·r) + 1/2·δ(1/4)[r].

Λy. ∫dx [0≤x]·[x≤1]·δ((2+x)/(3+x))[y]
= Λy. ∫dx [0≤x]·[x≤1]·δ((3·y-2)/(1-y))[x]/|((3+x)-(2+x))/(3+x)²|
= Λy. ∫dx [0≤x]·[x≤1]·δ((3·y-2)/(1-y))[x]/|1/(3+x)²|
= Λy. [0≤(3·y-2)/(1-y)]·[(3·y-2)/(1-y)≤1]/|1/(3+(3·y-2)/(1-y))²|
= Λy. [2/3≤y]·[y≤3/4]·(3+(3·y-2)/(1-y))²
= Λy. [2/3≤y]·[y≤3/4]/(1-y)²,
where the last step uses 3+(3·y-2)/(1-y) = ((3-3·y)+(3·y-2))/(1-y) = 1/(1-y).

Exercise 4:

def main(){
  p := uniform(0,1);
  x₁ := flip(p);
  x₂ := flip(p);
  observe(x₁ && !x₂);
  return p;
}

Unnormalized posterior:
Λp. [0≤p]·[p≤1]·∫dx₁ (p·δ(1)[x₁]+(1-p)·δ(0)[x₁])·∫dx₂ (p·δ(1)[x₂]+(1-p)·δ(0)[x₂])·[x₁≠0]·[x₂=0]
= Λp. [0≤p]·[p≤1]·∫dx₁ (p·δ(1)[x₁]+(1-p)·δ(0)[x₁])·[x₁≠0]·∫dx₂ (p·δ(1)[x₂]+(1-p)·δ(0)[x₂])·[x₂=0]
= Λp. [0≤p]·[p≤1]·∫dx₁ p·δ(1)[x₁]·∫dx₂ (1-p)·δ(0)[x₂]
= Λp. [0≤p]·[p≤1]·p·(1-p).

Normalization constant:
Z = ∫dp [0≤p]·[p≤1]·p·(1-p) = F(1)-F(0) where F(p)=p²/2-p³/3, i.e. Z = (1/2-1/3)-(0-0) = 1/6.

Normalized posterior:
Λp. [0≤p]·[p≤1]·p·(1-p)/Z = Λp. 6·[0≤p]·[p≤1]·p·(1-p).
(This is the Beta(2,2) distribution.)

Posterior of first program: Λx.
[0≤x]·[x≤1]·∫dy [0≤y]·[y≤1]·δ(0)[x-y]
= Λx. [0≤x]·[x≤1]·∫dy [0≤y]·[y≤1]·δ(x)[y]/|-1|
= Λx. [0≤x]·[x≤1]·[0≤x]·[x≤1]
= Λx. [0≤x]·[x≤1].

Coincidentally, this posterior is already normalized.

Expected result of first program: ∫dx x·[0≤x]·[x≤1] = 1/2.

Unnormalized posterior of second program:
Λx. [0≤x]·[x≤1]·∫dy [0≤y]·[y≤1]·δ(1)[x/y]
= Λx. [0≤x]·[x≤1]·∫dy [0≤y]·[y≤1]·δ(x)[y]/|-x/y²|
= Λx. [0≤x]·[x≤1]/|-x/x²|
= Λx. [0≤x]·[x≤1]·x.

Normalization constant:
Z = ∫dx [0≤x]·[x≤1]·x = F(1)-F(0) where F(x)=x²/2, i.e. Z = 1/2.

Normalized posterior of second program:
Λx. [0≤x]·[x≤1]·x/Z = Λx. 2·[0≤x]·[x≤1]·x.

Expected result of second program:
∫dx x·2·[0≤x]·[x≤1]·x = ∫dx 2·[0≤x]·[x≤1]·x² = 2·(F(1)-F(0)) where F(x)=x³/3, i.e. 2·(1/3-0) = 2/3.

We now introduce a parameter ε. The first program becomes:

def main(ε){
  x := uniform(0,1);
  y := uniform(0,1);
  observe(abs(x-y) <= ε);
  return x;
}

The second program becomes:

def main(ε){
  x := uniform(0,1);
  y := uniform(0,1);
  observe(abs(x/y-1) <= ε);
  return x;
}

The expected values of the original programs are the expected values of these programs in the limit ε → 0. For visualization, you can write

def main(δ){
  ε := 1/δ;
  x := uniform(0,1);
  y := uniform(0,1);
  observe(abs(x/y-1) <= ε);
  return x;
}

into a file "limit.psi" and call

$ psi --plot=[0:40] --expectation limit.psi

This will create a plot with gnuplot that shows the expected value depending on δ=1/ε. As 1/ε approaches ∞, you should see the expected value approach 2/3.

Intuitive explanation: The following equivalent versions are a bit more interpretable:

def main(ε){
  x := uniform(0,1);
  y := uniform(0,1);
  observe(x-ε <= y && y <= x+ε);
  return x;
}

def main(ε){
  x := uniform(0,1);
  y := uniform(0,1);
  observe((1-ε)*x <= y && y <= (1+ε)*x);
  return x;
}

The first program conditions on the event that x is close to y in an absolute sense, while the second program conditions on the event that x is close to y in a relative sense.
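The two ε-programs can also be checked by plain rejection sampling outside of PSI. The following Python sketch (the sample size and the choice ε = 0.02 are arbitrary, not part of the exercise) draws (x,y) uniformly from the unit square, keeps x whenever the observation holds, and averages: the absolute observation yields an estimate near 1/2, the relative one an estimate near 2/3.

```python
import random

def expected_value(accept, n=400_000, seed=0):
    """Rejection sampling: draw (x, y) uniformly from the unit square,
    keep x whenever the observation accept(x, y) holds, return the mean."""
    rng = random.Random(seed)
    kept = []
    for _ in range(n):
        x, y = rng.random(), rng.random()
        if accept(x, y):
            kept.append(x)
    return sum(kept) / len(kept)

eps = 0.02  # arbitrary; the analytic results hold in the limit eps -> 0

# observe(abs(x-y) <= eps): symmetric around 1/2, estimate stays near 1/2
m_abs = expected_value(lambda x, y: abs(x - y) <= eps)

# observe(abs(x/y-1) <= eps): favors large x, estimate lands near 2/3
# (guard y > 0 since random() may, in principle, return exactly 0.0)
m_rel = expected_value(lambda x, y: y > 0 and abs(x / y - 1) <= eps)

print(m_abs, m_rel)
```

For finite ε the second estimate is biased slightly below 2/3 (for ε = 0.02 the exact conditional expectation is about 0.663), which is consistent with the plot produced by `psi --plot` above.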
In the first program, conditioned on the value of x, the observation succeeds with the same probability for the values x and 1-x; the posterior density is therefore symmetric around 1/2, and the expected value is 1/2, independently of ε. In the second program, the observation succeeds iff (x,y) satisfies (1-ε)·x ≤ y ≤ (1+ε)·x, which is more likely to happen the larger x is. Therefore, in the posterior, larger values of x carry more weight, and the expected value exceeds 1/2, even as ε approaches 0. Hence, when we want to condition on the event x == y for continuous random variables x and y, it matters not only that we observe x == y, but also "how" we observe it (x-y is 0 vs. x/y is 1). One therefore needs to choose carefully how to condition on events involving continuous random variables. Also see: https://en.wikipedia.org/wiki/Borel%E2%80%93Kolmogorov_paradox
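Independently of the paradox above, closed-form posteriors like the one in Exercise 4 can be sanity-checked the same way. A minimal rejection-sampling sketch in Python (the sample size and seed are arbitrary choices): conditioning on x₁ && !x₂ should reproduce the Beta(2,2) posterior, which has mean 1/2 and variance 2·2/((2+2)²·(2+2+1)) = 1/20.

```python
import random

# Rejection sampling for the model of Exercise 4:
#   p := uniform(0,1); x1 := flip(p); x2 := flip(p);
#   observe(x1 && !x2); return p
rng = random.Random(1)
kept = []
for _ in range(200_000):
    p = rng.random()
    x1 = rng.random() < p  # flip(p)
    x2 = rng.random() < p  # flip(p)
    if x1 and not x2:      # the observation; acceptance probability is p*(1-p)
        kept.append(p)

mean = sum(kept) / len(kept)
var = sum((p - mean) ** 2 for p in kept) / len(kept)
# Beta(2,2): mean = 1/2, variance = 1/20 = 0.05
print(mean, var)
```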