## JOINT VIDEO CODING AND TRACKING APPROACH: INTRODUCTION

When acquiring a digital video stream, the goal is not necessarily to display it to the end-user with the best possible visual quality. In many cases, such as video surveillance applications, the acquired video is automatically processed in order to perform further analysis tasks and extract relevant information. Most of these high-level tasks entail the extraction of concise information, such as the dimension and/or the speed of the objects moving in the scene. Once this aggregated information is computed, the analysis task can carry on, while all the additional low-level information contained in the raw video stream (i.e., most of the acquired signal) is discarded. Moreover, for some acquisition devices, such as medical scanners or imaging systems working at wavelengths where cheap CMOS or CCD sensors are ineffective, this approach may be unfavorable, especially in terms of the cost of the acquisition devices.

## OBJECT TRACKING IN COMPRESSED VIDEO CODING

A. Haar Wavelet Transform

To calculate the Haar transform of an array of n samples:
1. Find the average of each pair of samples (n/2 averages).
2. Find the difference between each average and the samples it was calculated from (n/2 differences).
3. Fill the first half of the array with averages.
4. Fill the second half of the array with differences.
5. Repeat the process on the first half of the array.
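The five steps above can be sketched as follows, a minimal NumPy implementation assuming the input length is a power of two (the function name and the choice of storing the single signed difference per pair are illustrative, not from the source):

```python
import numpy as np

def haar_transform(x):
    """One-dimensional Haar transform via repeated averaging and differencing.

    Assumes len(x) is a power of two. Since avg - a = -(avg - b) for a pair
    (a, b), one signed difference per pair carries all the detail information.
    """
    x = np.asarray(x, dtype=float).copy()
    n = len(x)
    while n > 1:
        half = n // 2
        pairs = x[:n].reshape(half, 2)
        avg = pairs.mean(axis=1)        # step 1: average of each pair
        diff = avg - pairs[:, 1]        # step 2: average minus second sample
        x[:half] = avg                  # step 3: averages in the first half
        x[half:n] = diff                # step 4: differences in the second half
        n = half                        # step 5: repeat on the first half
    return x
```

For example, `haar_transform([4, 2, 5, 5])` yields `[4, -1, 1, 0]`: one overall average followed by the detail coefficients of each level.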

B. Background Subtraction

Background subtraction is a widely used approach for detecting moving objects in videos from static cameras. Let {x_1, x_2, …, x_t, …, x_N} be the acquired frame sequence and let b be an estimate of the slowly varying background of the scene. Then, at each time instant t, the foreground image can be computed as f_t = x_t − b. In practice, the background model cannot be fixed: it must adapt to illumination and motion changes, and thus it must be continuously updated as new frames are acquired. A very simple and computationally efficient way to do this is the running average method:

b_t = a·x_t + (1 − a)·b_{t−1}    (1)

where a ∈ [0, 1] is a parameter that defines the background adaptation rate.
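The running average update and the foreground computation amount to two element-wise operations per frame. A minimal sketch (the function names and the sample value of the adaptation rate `a` are hypothetical; the source leaves `a` as a free parameter in [0, 1]):

```python
import numpy as np

def update_background(b_prev, x_t, a=0.05):
    """Running-average background update: b_t = a*x_t + (1 - a)*b_{t-1}."""
    b_prev = np.asarray(b_prev, dtype=float)
    x_t = np.asarray(x_t, dtype=float)
    return a * x_t + (1.0 - a) * b_prev

def foreground(x_t, b):
    """Foreground image f_t = x_t - b."""
    return np.asarray(x_t, dtype=float) - np.asarray(b, dtype=float)
```

A small `a` makes the background adapt slowly (robust to transient motion); a large `a` lets it track fast illumination changes at the risk of absorbing slow-moving objects.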

This method can also be implemented directly in the projections domain. Let x_t^p = A·x_t be the current frame projections and let b_t^p = A·b_t be the background projections. It follows that the foreground projections are easily computed as

f_t^p = A·f_t = A·(x_t − b_t) = A·x_t − A·b_t = x_t^p − b_t^p    (2)

while the background projections can still be updated with the running average method, without the need to recover the pixel-domain representation of the background itself:

b_t^p = a·x_t^p + (1 − a)·b_{t−1}^p    (3)
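Equation (2) is just the linearity of the projection operator A, and equation (3) repeats the running average of (1) on the projected quantities. A minimal numerical sketch (the matrix A, the frame size, and the rate `a` are hypothetical placeholders, not values from the source):

```python
import numpy as np

rng = np.random.default_rng(0)
A = rng.standard_normal((8, 64))    # hypothetical projection matrix, m << n
x_t = rng.standard_normal(64)       # current frame (flattened), illustrative data
b_t = rng.standard_normal(64)       # current background estimate

# Projecting the pixel-domain foreground ...
f_p_direct = A @ (x_t - b_t)

# ... equals the difference of projections, eq. (2), by linearity of A.
x_p = A @ x_t
b_p = A @ b_t
f_p = x_p - b_p

# Background projections updated directly in the measurement domain, eq. (3),
# so the pixel-domain background never needs to be reconstructed.
a = 0.05
b_p_next = a * x_p + (1.0 - a) * b_p
```

Working with the m-dimensional projections instead of the n-dimensional frames (m << n) is what makes tracking feasible without first decoding the video.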

Background prediction is denoted by the recursive filter B(z), whose transfer function is