When designing and developing scale selection mechanisms for generating hypotheses about characteristic scales in signals, it is essential that the selected scale levels reflect the extent of the underlying structures in the signal.
This paper presents a theory and in-depth theoretical analysis about the scale selection properties of methods for automatically selecting local temporal scales in time-dependent signals based on local extrema over temporal scales of scale-normalized temporal derivative responses. Specifically, this paper develops a novel theoretical framework for performing such temporal scale selection over a time-causal and time-recursive temporal domain as is necessary when processing continuous video or audio streams in real time or when modelling biological perception.
For a recently developed time-causal and time-recursive scale-space concept defined by convolution with a scale-invariant limit kernel, we show that it is possible to transfer a large number of the desirable scale selection properties that hold for the Gaussian scale-space concept over a non-causal temporal domain to this temporal scale-space concept over a truly time-causal domain. Specifically, we show that for this temporal scale-space concept, it is possible to achieve true temporal scale invariance although the temporal scale levels have to be discrete, which is a novel theoretical construction.
The analysis starts from a detailed comparison of different temporal scale-space concepts and their relative advantages and disadvantages, leading the focus to a class of recently extended time-causal and time-recursive temporal scale-space concepts based on first-order integrators or equivalently truncated exponential kernels coupled in cascade. Specifically, by the discrete nature of the temporal scale levels in this class of time-causal scale-space concepts, we study two special cases of distributing the intermediate temporal scale levels, by using either a uniform distribution in terms of the variance of the composed temporal scale-space kernel or a logarithmic distribution.
In the case of a uniform distribution of the temporal scale levels, we show that scale selection based on local extrema of scale-normalized derivatives over temporal scales makes it possible to estimate the temporal duration of sparse local features defined in terms of temporal extrema of first- or second-order temporal derivative responses. For dense features modelled as a sine wave, the lack of temporal scale invariance does, however, constitute a major limitation for handling dense temporal structures of different temporal duration in a uniform manner.
In the case of a logarithmic distribution of the temporal scale levels, specifically taken to the limit of a time-causal limit kernel with an infinitely dense distribution of the temporal scale levels towards zero temporal scale, we show that it is possible to achieve true temporal scale invariance to handle dense features modelled as a sine wave in a uniform manner over different temporal durations of the temporal structures as well to achieve more general temporal scale invariance for any signal over any temporal scaling transformation with a temporal scaling factor that is an integer power of the distribution parameter of the time-causal limit kernel.
It is shown how these temporal scale selection properties developed for a pure temporal domain carry over to feature detectors defined over time-causal spatio-temporal and spectro-temporal domains.
Scale space, Scale, Scale selection, Temporal, Spatio-temporal, Scale invariance, Differential invariant, Feature detection, Video analysis, Computer vision