
Recursive Compressed Sensing

Pantelis Sopasakis

Presentation at ICTEAM, UC Louvain, Belgium. Joint work with N. Freris and P. Patrinos.

IMT Institute for Advanced Studies, Lucca, Italy; NYU Abu Dhabi, United Arab Emirates; ESAT, KU Leuven, Belgium

April 7, 2016

Motivation

[Word cloud of application domains of compressed sensing: MRI, medical imaging, radio-astronomy, holography, seismology, photography, radar, facial recognition, speech recognition, fault detection, particle physics, video processing, ECG, encryption, communication networks, system identification.]

Spoiler alert!

The proposed method is an order of magnitude faster than other reported methods for recursive compressed sensing.

Outline

1. Forward-Backward Splitting

2. The Forward-Backward envelope function

3. The Forward-Backward Newton method

4. Recursive compressed sensing

5. Simulations


I. Forward-Backward Splitting

Forward-Backward Splitting

Problem structure

$$\text{minimize } \varphi(x) = f(x) + g(x)$$

where

1. $f, g : \mathbb{R}^n \to \overline{\mathbb{R}}$ are proper, closed, convex,
2. $f$ has $L$-Lipschitz gradient,
3. $g$ is prox-friendly, i.e., its proximal operator

$$\operatorname{prox}_{\gamma g}(v) := \arg\min_z \big\{ g(z) + \tfrac{1}{2\gamma}\|v - z\|^2 \big\}$$

is easily computable [1].

[1] Parikh & Boyd, 2014; Combettes & Pesquet, 2010.
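As a concrete illustration of prox-friendliness (not from the slides), here is a minimal numpy sketch of the proximal operator of $g = \lambda\|\cdot\|_1$, which reduces to coordinatewise soft thresholding; the function name and test vector are illustrative.

```python
import numpy as np

def prox_l1(v, gamma, lam):
    """Proximal operator of g = lam*||.||_1 with step gamma:
    argmin_z { lam*||z||_1 + (1/(2*gamma))*||v - z||^2 },
    computed coordinatewise by soft thresholding at gamma*lam."""
    return np.sign(v) * np.maximum(np.abs(v) - gamma * lam, 0.0)

v = np.array([3.0, -0.2, 0.5, -1.5])
print(prox_l1(v, gamma=1.0, lam=1.0))  # [ 2.  -0.   0.  -0.5]
```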

Example #1

Constrained QPs

$$\text{minimize } \underbrace{\tfrac{1}{2}x^\top Q x + q^\top x}_{f} + \underbrace{\delta(x \mid B)}_{g}$$

where $B$ is a set onto which projections are easy to compute and

$$\delta(x \mid B) = \begin{cases} 0, & \text{if } x \in B,\\ +\infty, & \text{otherwise.} \end{cases}$$

Then $\operatorname{prox}_{\gamma g}(x) = \operatorname{proj}(x \mid B)$.
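For instance, when $B$ is a box $[l, u]$, the projection is a componentwise clip. A minimal sketch, with made-up bounds:

```python
import numpy as np

def proj_box(x, l, u):
    """Projection onto B = {x : l <= x <= u}; this is the prox of the
    indicator delta(. | B) for any step size gamma."""
    return np.clip(x, l, u)

x = np.array([2.0, -3.0, 0.5])
print(proj_box(x, l=-1.0, u=1.0))  # [ 1.  -1.   0.5]
```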

Example #2

LASSO problems

$$\text{minimize } \underbrace{\tfrac{1}{2}\|Ax - b\|^2}_{f} + \underbrace{\lambda\|x\|_1}_{g}$$

Indeed,

1. $f$ is continuously differentiable with $\nabla f(x) = A^\top(Ax - b)$,
2. $g$ is prox-friendly.

Other examples

✓ Constrained optimal control
✓ Elastic net
✓ Sparse log-logistic regression
✓ Matrix completion
✓ Subspace identification
✓ Support vector machines

Forward-Backward Splitting

FBS offers a generic framework for solving such problems using the iteration

$$x^{k+1} = \operatorname{prox}_{\gamma g}(x^k - \gamma\nabla f(x^k)) =: T_\gamma(x^k),$$

for $\gamma < 2/L$.

Features:

1. $\varphi(x^k) - \varphi^\star = O(1/k)$
2. with Nesterov's extrapolation, $\varphi(x^k) - \varphi^\star = O(1/k^2)$
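A hedged numpy sketch of this iteration on the LASSO example from before (this is plain ISTA; the problem data are random and purely illustrative):

```python
import numpy as np

rng = np.random.default_rng(0)
A = rng.standard_normal((40, 100))   # made-up problem data
b = rng.standard_normal(40)
lam = 1.0

L = np.linalg.norm(A, 2) ** 2        # Lipschitz constant of grad f
gamma = 1.0 / L                      # step size, gamma < 2/L

x = np.zeros(100)
for k in range(500):
    v = x - gamma * A.T @ (A @ x - b)                          # forward (gradient) step
    x = np.sign(v) * np.maximum(np.abs(v) - gamma * lam, 0.0)  # backward (prox) step

print(0.5 * np.linalg.norm(A @ x - b) ** 2 + lam * np.linalg.norm(x, 1))
```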

Forward-Backward Splitting

The iteration

$$x^{k+1} = \operatorname{prox}_{\gamma g}(x^k - \gamma\nabla f(x^k))$$

can be written as [2]

$$x^{k+1} = \arg\min_z \big\{ \underbrace{f(x^k) + \langle\nabla f(x^k), z - x^k\rangle + \tfrac{1}{2\gamma}\|z - x^k\|^2}_{Q_{\gamma f}(z, x^k)} + g(z) \big\},$$

where $Q_{\gamma f}(z, x^k)$ serves as a quadratic model for $f$ [3].

[2] Beck and Teboulle, 2010.
[3] $Q_{\gamma f}(\cdot, x^k)$ is the linearization of $f$ at $x^k$ plus a quadratic term; moreover, $Q_{\gamma f}(z, x^k) \geq f(z)$ and $Q_{\gamma f}(z, z) = f(z)$.
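To see why minimizing $Q_{\gamma f}(\cdot, x^k) + g$ recovers the prox-gradient step, complete the square (terms independent of $z$ do not affect the minimizer):

$$\arg\min_z \big\{ f(x^k) + \langle\nabla f(x^k), z - x^k\rangle + \tfrac{1}{2\gamma}\|z - x^k\|^2 + g(z) \big\} = \arg\min_z \big\{ g(z) + \tfrac{1}{2\gamma}\|z - (x^k - \gamma\nabla f(x^k))\|^2 \big\} = \operatorname{prox}_{\gamma g}(x^k - \gamma\nabla f(x^k)).$$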

Forward-Backward Splitting

[Figure sequence: starting from $x^0$, each iteration minimizes the model $Q_{\gamma f}(z; x^k) + g(z)$ to obtain $x^{k+1}$; the objective $\varphi = f + g$ decreases along the iterates $x^0, x^1, x^2, x^3$.]

Overview

Generic convex optimization problem:

$$\text{minimize } f(x) + g(x).$$

The generic iteration

$$x^{k+1} = \operatorname{prox}_{\gamma g}(x^k - \gamma\nabla f(x^k))$$

is a fixed-point iteration for the optimality condition

$$x^\star = \operatorname{prox}_{\gamma g}(x^\star - \gamma\nabla f(x^\star)).$$

Overview

It generalizes several other methods:

$$x^{k+1} = \begin{cases} x^k - \gamma\nabla f(x^k), & \text{gradient method } (g = 0),\\ \Pi_C(x^k - \gamma\nabla f(x^k)), & \text{gradient projection } (g = \delta(\cdot \mid C)),\\ \operatorname{prox}_{\gamma g}(x^k), & \text{proximal point algorithm } (f = 0). \end{cases}$$

There are several flavors of proximal gradient algorithms [4].

[4] Nesterov's accelerated method, FISTA (Beck & Teboulle), etc.

Shortcomings

FBS is a first-order method; therefore, it can be slow!

Overhaul. Use a better quadratic model for $f$ [5]:

$$Q_{f,B}(z, x^k) = f(x^k) + \langle\nabla f(x^k), z - x^k\rangle + \tfrac{1}{2}\|z - x^k\|^2_{B_k},$$

where $B_k$ is (an approximation of) $\nabla^2 f(x^k)$.

Drawback. No closed-form solution of the inner problem.

[5] As in Becker & Fadili, 2012; Lee et al., 2012; Tran-Dinh et al., 2013.

II. Forward-Backward Envelope

Forward-Backward Envelope

The Forward-Backward envelope of $\varphi$ is defined as

$$\varphi_\gamma(x) = \min_z \big\{ f(x) + \langle\nabla f(x), z - x\rangle + g(z) + \tfrac{1}{2\gamma}\|z - x\|^2 \big\},$$

with $\gamma \leq 1/L$. Let's see how it looks...

Forward-Backward Envelope

[Figure sequence: the FBE $\varphi_\gamma$ plotted against $\varphi$ at a point $x$; $\varphi_\gamma$ is a smooth function lying below $\varphi$ and agreeing with it at the minimizers.]

Properties of FBE

Define

$$T_\gamma(x) = \operatorname{prox}_{\gamma g}(x - \gamma\nabla f(x)), \qquad R_\gamma(x) = \tfrac{1}{\gamma}(x - T_\gamma(x)).$$

FBE upper bound:

$$\varphi_\gamma(x) \leq \varphi(x) - \tfrac{\gamma}{2}\|R_\gamma(x)\|^2$$

FBE lower bound:

$$\varphi_\gamma(x) \geq \varphi(T_\gamma(x)) + \tfrac{\gamma(1 - \gamma L_f)}{2}\|R_\gamma(x)\|^2$$

[Figure: $\varphi_\gamma(x)$ is sandwiched between $\varphi(T_\gamma(x))$ and $\varphi(x)$; at a minimizer $x^\star = T_\gamma(x^\star)$ all three coincide.]

Properties of FBE

Ergo: minimizing $\varphi$ is equivalent to minimizing its FBE $\varphi_\gamma$:

$$\inf \varphi = \inf \varphi_\gamma, \qquad \arg\min \varphi = \arg\min \varphi_\gamma.$$

However, $\varphi_\gamma$ is continuously differentiable [6] whenever $f \in C^2$.

[6] More about the FBE: P. Patrinos, L. Stella and A. Bemporad, 2014.

FBE is C¹

The FBE can be written as

$$\varphi_\gamma(x) = f(x) - \tfrac{\gamma}{2}\|\nabla f(x)\|^2 + g^\gamma(x - \gamma\nabla f(x)),$$

where $g^\gamma$ is the Moreau envelope of $g$,

$$g^\gamma(v) = \min_z \big\{ g(z) + \tfrac{1}{2\gamma}\|z - v\|^2 \big\}.$$

$g^\gamma$ is a smooth approximation of $g$ with $\nabla g^\gamma(x) = \tfrac{1}{\gamma}(x - \operatorname{prox}_{\gamma g}(x))$. If $f \in C^2$, then

$$\nabla\varphi_\gamma(x) = (I - \gamma\nabla^2 f(x))R_\gamma(x).$$

Therefore, $\arg\min \varphi = \arg\min \varphi_\gamma = \operatorname{zer} \nabla\varphi_\gamma$.
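A hedged numpy sketch of this closed form for the LASSO case, evaluating the Moreau envelope through the prox as $g^\gamma(v) = g(\operatorname{prox}_{\gamma g}(v)) + \tfrac{1}{2\gamma}\|\operatorname{prox}_{\gamma g}(v) - v\|^2$ (all names are illustrative):

```python
import numpy as np

def prox_l1(v, gamma, lam):
    return np.sign(v) * np.maximum(np.abs(v) - gamma * lam, 0.0)

def fbe_lasso(x, A, b, lam, gamma):
    """FBE of 0.5*||Ax - b||^2 + lam*||x||_1:
    phi_gamma(x) = f(x) - (gamma/2)*||grad f(x)||^2 + g^gamma(x - gamma*grad f(x))."""
    grad = A.T @ (A @ x - b)
    v = x - gamma * grad
    z = prox_l1(v, gamma, lam)
    g_env = lam * np.linalg.norm(z, 1) + np.linalg.norm(z - v) ** 2 / (2 * gamma)
    return (0.5 * np.linalg.norm(A @ x - b) ** 2
            - 0.5 * gamma * np.linalg.norm(grad) ** 2 + g_env)

# sanity check of the upper bound: phi_gamma(x) <= phi(x)
rng = np.random.default_rng(1)
A = rng.standard_normal((20, 50)); b = rng.standard_normal(20)
x = rng.standard_normal(50)
lam, gamma = 1.0, 1.0 / np.linalg.norm(A, 2) ** 2
phi = 0.5 * np.linalg.norm(A @ x - b) ** 2 + lam * np.linalg.norm(x, 1)
assert fbe_lasso(x, A, b, lam, gamma) <= phi
```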

The Moreau envelope

[Figure: the Moreau envelopes $g^{0.1}$ and $g^{10}$ of $g(x) = |x|$; larger $\gamma$ gives a smoother, flatter approximation.]

Forward-Backward Newton

✓ Since $\varphi_\gamma$ is $C^1$ but not $C^2$, we may not apply a Newton method.
✓ The FB Newton method is a semismooth method for minimizing $\varphi_\gamma$ using a notion of generalized differentiability.
✓ The FBN iterations are

$$x^{k+1} = x^k + \tau_k d^k,$$

where $d^k$ is a Newton direction given by

$$H_k d^k = -\nabla\varphi_\gamma(x^k), \qquad H_k \in \partial_B(\nabla\varphi_\gamma)(x^k).$$

✓ $\partial_B$ is the so-called B-subdifferential (we'll define it later).

III. Forward-Backward Newton

Optimality conditions

LASSO problem:

$$\text{minimize } \underbrace{\tfrac{1}{2}\|Ax - b\|^2}_{f} + \underbrace{\lambda\|x\|_1}_{g}.$$

Optimality conditions:

$$-\nabla f(x^\star) \in \partial g(x^\star),$$

where $\nabla f(x) = A^\top(Ax - b)$ and $\partial g(x)_i = \{\lambda\operatorname{sign}(x_i)\}$ for $x_i \neq 0$ and $\partial g(x)_i = [-\lambda, \lambda]$ otherwise, so

$$\begin{cases} \nabla_i f(x^\star) = -\lambda\operatorname{sign}(x_i^\star), & \text{if } x_i^\star \neq 0,\\ |\nabla_i f(x^\star)| \leq \lambda, & \text{otherwise.} \end{cases}$$
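These conditions can be checked numerically. A hedged sketch (the function name and the handling of empty index sets are illustrative):

```python
import numpy as np

def lasso_optimality_residual(x, A, b, lam):
    """Largest violation of the LASSO optimality conditions:
    grad_i f(x) = -lam*sign(x_i) on the support, |grad_i f(x)| <= lam off it."""
    grad = A.T @ (A @ x - b)
    on = x != 0
    r_support = np.abs(grad[on] + lam * np.sign(x[on]))
    r_zero = np.maximum(np.abs(grad[~on]) - lam, 0.0)
    return max(r_support.max(initial=0.0), r_zero.max(initial=0.0))
```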

Optimality conditions

If we knew the sets

$$\alpha = \{i : x_i^\star = 0\}, \qquad \beta = \{j : x_j^\star \neq 0\},$$

we would be able to write down the optimality conditions as

$$A_\beta^\top A_\beta x_\beta^\star = A_\beta^\top b - \lambda\operatorname{sign}(x_\beta^\star).$$

Goal. Devise a method to determine $\alpha$ and $\beta$ efficiently.

Optimality conditions

We may write the optimality conditions as follows:

$$x^\star = \operatorname{prox}_{\gamma g}(x^\star - \gamma\nabla f(x^\star)),$$

where

$$\operatorname{prox}_{\gamma g}(z)_i = \operatorname{sign}(z_i)(|z_i| - \gamma\lambda)_+.$$

ISTA and FISTA are methods for the iterative solution of these conditions. Instead, we are looking for a zero of the fixed-point residual operator

$$R_\gamma(x) = x - \operatorname{prox}_{\gamma g}(x - \gamma\nabla f(x)).$$

B-subdifferential

For a function $F : \mathbb{R}^n \to \mathbb{R}^n$ which is almost everywhere differentiable, we define its B-subdifferential to be [7]

$$\partial_B F(x) := \big\{ B \in \mathbb{R}^{n\times n} : \exists\{x_n\}_n : x_n \to x,\ \nabla F(x_n) \text{ exists and } \nabla F(x_n) \to B \big\}.$$

[7] See Facchinei & Pang, 2004.
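For example (not on the slide), take the scalar map $F(x) = \max(x, 0)$, which is differentiable everywhere except at the origin: sequences $x_n \uparrow 0$ give $\nabla F(x_n) = 0$, while $x_n \downarrow 0$ give $\nabla F(x_n) = 1$, so

$$\partial_B F(0) = \{0, 1\},$$

whereas the Clarke generalized Jacobian is its convex hull, $[0, 1]$.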

Forward-Backward Newton

$R_\gamma(x)$ is nonexpansive ⇒ Lipschitz ⇒ differentiable a.e. ⇒ B-subdifferentiable ($\partial_B R_\gamma(x)$). The proposed algorithm takes the form

$$x^{k+1} = x^k - \tau_k H_k^{-1} R_\gamma(x^k), \quad \text{with } H_k \in \partial_B R_\gamma(x^k).$$

When close to the solution, all $H_k$ are nonsingular. Take

$$H_k = I - P_k(I - \gamma A^\top A),$$

where $P_k$ is diagonal with $(P_k)_{ii} = 1$ iff $i \in \beta_k$, where

$$\beta_k = \{i : |x_i^k - \gamma\nabla_i f(x^k)| > \gamma\lambda\}.$$

The scalar $\tau_k$ is computed by a simple line search method to ensure global convergence of the algorithm.

Forward-Backward Newton

The Forward-Backward Newton method can be concisely written as

$$x^{k+1} = x^k + \tau_k d^k.$$

The Newton direction $d^k$ is determined as follows, without the need to formulate $H_k$:

$$d^k_{\alpha_k} = -(R_\gamma(x^k))_{\alpha_k},$$

$$A_{\beta_k}^\top A_{\beta_k} d^k_{\beta_k} = -(R_\gamma(x^k))_{\beta_k} - A_{\beta_k}^\top A_{\alpha_k} d^k_{\alpha_k}.$$

For the method to converge globally, we compute $\tau_k$ so that the Armijo condition is satisfied:

$$\varphi_\gamma(x^k + \tau_k d^k) \leq \varphi_\gamma(x^k) + \sigma\tau_k \nabla\varphi_\gamma(x^k)^\top d^k.$$

Forward-Backward Newton

Require: $A$, $y$, $\lambda$, $x^0$; $\gamma \leftarrow 0.95/\|A\|^2$
$x \leftarrow x^0$
while $\|R_\gamma(x)\| > \epsilon$ do
    $\beta \leftarrow \{i : |x_i - \gamma\nabla_i f(x)| > \gamma\lambda\}$
    $\alpha \leftarrow \{i : |x_i - \gamma\nabla_i f(x)| \leq \gamma\lambda\}$
    $d_\alpha \leftarrow -x_\alpha$
    $s \leftarrow \operatorname{sign}(x - \gamma\nabla f(x))$
    Solve $A_\beta^\top A_\beta(x_\beta + d_\beta) = A_\beta^\top y - \lambda s_\beta$
    $\tau \leftarrow 1$
    while $\varphi_\gamma(x + \tau d) > \varphi_\gamma(x) + \sigma\tau\nabla\varphi_\gamma(x)^\top d$ do
        $\tau \leftarrow \tau/2$
    end while
    $x \leftarrow x + \tau d$
end while
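A hedged Python transcription of the pseudocode above (a minimal sketch, not the authors' implementation; it assumes $A_\beta^\top A_\beta$ is nonsingular at every iteration, which the continuation scheme on the next slide is designed to guarantee):

```python
import numpy as np

def fbn_lasso(A, y, lam, x0, eps=1e-8, sigma=1e-4, max_iter=100):
    """Forward-Backward Newton sketch for 0.5*||Ax - y||^2 + lam*||x||_1."""
    gamma = 0.95 / np.linalg.norm(A, 2) ** 2

    def prox(v):  # soft thresholding
        return np.sign(v) * np.maximum(np.abs(v) - gamma * lam, 0.0)

    def fbe(x):   # closed-form FBE for LASSO (see earlier slide)
        grad = A.T @ (A @ x - y)
        v = x - gamma * grad
        z = prox(v)
        return (0.5 * np.linalg.norm(A @ x - y) ** 2
                - 0.5 * gamma * np.linalg.norm(grad) ** 2
                + lam * np.linalg.norm(z, 1)
                + np.linalg.norm(z - v) ** 2 / (2 * gamma))

    x = np.asarray(x0, dtype=float).copy()
    for _ in range(max_iter):
        grad = A.T @ (A @ x - y)
        v = x - gamma * grad
        R = x - prox(v)                    # fixed-point residual
        if np.linalg.norm(R) <= eps:
            break
        beta = np.abs(v) > gamma * lam     # tentative support
        alpha = ~beta
        d = np.empty_like(x)
        d[alpha] = -x[alpha]
        s = np.sign(v)
        Ab = A[:, beta]
        # solve A_beta' A_beta (x_beta + d_beta) = A_beta' y - lam * s_beta
        d[beta] = np.linalg.solve(Ab.T @ Ab, Ab.T @ y - lam * s[beta]) - x[beta]
        gphi = R / gamma - A.T @ (A @ R)   # grad phi_gamma = (I - gamma*A'A) R/gamma
        tau, phi0 = 1.0, fbe(x)
        while fbe(x + tau * d) > phi0 + sigma * tau * (gphi @ d) and tau > 1e-12:
            tau *= 0.5                     # backtracking Armijo line search
        x = x + tau * d
    return x
```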

Speeding up FBN by Continuation

1. In applications of LASSO we have $\|x^\star\|_0 \leq m \ll n$ [8].
2. If $\bar\lambda \geq \lambda_0 := \|\nabla f(x^0)\|_\infty$, then $\operatorname{supp}(x^\star) = \emptyset$.
3. We relax the optimization problem, solving

$$P(\bar\lambda) : \text{minimize } \tfrac{1}{2}\|Ax - y\|^2 + \bar\lambda\|x\|_1.$$

4. Once we have approximately solved $P(\bar\lambda)$, we update $\bar\lambda$ as

$$\bar\lambda \leftarrow \max\{\lambda, \eta\bar\lambda\}, \quad \eta \in (0, 1),$$

until eventually $\bar\lambda = \lambda$ (a sketch of this loop follows below).

5. This way we enforce that (i) $|\beta_k|$ increases smoothly, (ii) $|\beta_k| < m$, and (iii) $A_{\beta_k}^\top A_{\beta_k}$ remains always positive definite.

[8] The zero-norm of $x$, $\|x\|_0$, is the number of its nonzero entries.
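A hedged sketch of this continuation loop, wrapping the `fbn_lasso` sketch from the previous slide; the decrease factor `eta = 0.5` and the warm-starting pattern are illustrative assumptions:

```python
import numpy as np

def fbn_continuation(A, y, lam, x0, eta=0.5):
    """Solve P(lam_bar) for a decreasing sequence lam_bar, warm-starting
    each problem from the previous solution, until lam_bar = lam."""
    x = np.asarray(x0, dtype=float).copy()
    lam_bar = np.linalg.norm(A.T @ (A @ x - y), np.inf)  # lam_0 = ||grad f(x0)||_inf
    while True:
        lam_bar = max(lam, eta * lam_bar)   # lam_bar <- max{lam, eta*lam_bar}
        x = fbn_lasso(A, y, lam_bar, x)     # warm start from previous solution
        if lam_bar == lam:
            return x
```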
