Human Activity Analysis

dc.contributor.advisor

Tomasi, Carlo

dc.contributor.author

Carley, Cassandra Mariette

dc.date.accessioned

2019-04-02T16:26:50Z

dc.date.available

2019-04-02T16:26:50Z

dc.date.issued

2018

dc.department

Computer Science

dc.description.abstract

Video cameras have become increasingly prevalent, with higher resolutions and frame rates. Humans are often the focus of these videos, making human motion analysis an important field. This thesis explores the level of detail necessary to distinguish human activities for regression tasks, such as body tracking, and for activity classification.

We first consider activities that can be distinguished by their appearance at a single moment in time. Specifically, we use a database-retrieval approach both to approximate the full 3D pose of the hand from a single frame and to classify the hand's configuration. To index the database, we present a novel silhouette signature and signature distance that capture differences in both the extension and abduction of the fingers.
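As an illustration of this retrieval pipeline, here is a minimal sketch in Python. The silhouette_signature function below is a hypothetical stand-in (a radial profile of the silhouette around its centroid), and the L2 comparison is a placeholder; the actual signature and signature distance are the ones defined in the thesis.

```python
import numpy as np

def silhouette_signature(mask, n_bins=64):
    """Hypothetical stand-in for the thesis's silhouette signature:
    the maximum boundary radius in each of n_bins angular bins around
    the silhouette centroid (the actual descriptor differs)."""
    ys, xs = np.nonzero(mask)
    cy, cx = ys.mean(), xs.mean()
    angles = np.arctan2(ys - cy, xs - cx)
    radii = np.hypot(ys - cy, xs - cx)
    bins = ((angles + np.pi) / (2 * np.pi) * n_bins).astype(int) % n_bins
    sig = np.zeros(n_bins)
    np.maximum.at(sig, bins, radii)      # max radius per angular bin
    return sig / (sig.max() + 1e-8)      # scale-normalized

def retrieve(query_mask, db_signatures, db_poses):
    """Nearest-neighbor lookup: return the stored 3D pose whose signature
    is closest to the query (L2 here; the thesis uses its own distance)."""
    q = silhouette_signature(query_mask)
    dists = np.linalg.norm(db_signatures - q, axis=1)
    return db_poses[np.argmin(dists)]
```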

Next, we consider more complex activities, such as typing, that are characterized by a motion texture: statistical regularities in space and time. A single frame is inadequate to distinguish such activities, and it may be difficult to track the detailed sequence of body and object parts because of occlusions or temporal aliasing. Moreover, these activities are characterized not by a detailed sequence of 3D poses but by the motion texture they produce in space and time. We propose a new motion-texture activity representation for computer vision tasks that require this kind of spatiotemporal reasoning. Autocorrelation captures the temporal aspects of an activity signal that may be unbounded in time, and we show how it can be computed efficiently with an exponential-moving-average formulation. An optional space-time aggregation handles a potentially variable number of motion signals. The resulting motion-texture representation transforms any input activity signal into a fixed-size descriptor, even when the activity itself has varying extent in space and time, so any off-the-shelf classifier can be applied to detect the activity.
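The exponential-moving-average idea can be sketched as follows. This is a minimal illustration, not the thesis implementation: the smoothing factor alpha, the per-lag ring buffer, and the 1-D input signal are all assumptions made here for clarity.

```python
import numpy as np

def ema_autocorrelation(signal, max_lag, alpha=0.05):
    """Running autocorrelation estimate for a streaming 1-D signal.

    For each lag tau in [0, max_lag], maintain an exponential moving
    average of the product x[t] * x[t - tau], so the estimate keeps
    adapting even when the activity stream is unbounded in time."""
    r = np.zeros(max_lag + 1)        # autocorrelation estimate per lag
    history = np.zeros(max_lag + 1)  # ring buffer of recent samples
    for t, x in enumerate(signal):
        history = np.roll(history, 1)
        history[0] = x               # history[tau] is the sample tau steps ago
        for tau in range(min(t, max_lag) + 1):   # only lags with valid history
            r[tau] = (1.0 - alpha) * r[tau] + alpha * x * history[tau]
    return r

# Example: a noisy periodic "activity" signal peaks near its period lag.
t = np.arange(500)
sig = np.sin(2 * np.pi * t / 25) + 0.1 * np.random.randn(500)
print(ema_autocorrelation(sig, max_lag=50).round(2))
```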

For evaluation, we show how our representation can be used as a motion-texture "layer" within a convolutional neural network. We first study typing detection, using trajectories of corner points as input to our method. The resulting motion-texture descriptor captures hand-object motion patterns, which we use within a privacy-filter pipeline to obscure potentially sensitive content, such as passcodes. We also study the more abstract challenge of identity recognition by gait and demonstrate significant improvements over the state of the art using silhouette sequences as input to our autocorrelation network. Further, we show that placing a shallow network before the autocorrelation computation and training end-to-end learns a more robust activity feature.
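A minimal sketch of what such a motion-texture layer might look like, here in PyTorch: the AutocorrelationLayer module, the shallow convolutional front end, and the two-way classifier head are hypothetical stand-ins for illustration, not the architecture from the thesis.

```python
import torch
import torch.nn as nn

class AutocorrelationLayer(nn.Module):
    """Hypothetical motion-texture layer: maps a variable-length feature
    sequence (batch, channels, time) to a fixed-size autocorrelation
    descriptor (batch, channels, max_lag + 1), so a standard classifier
    head can follow regardless of clip length."""

    def __init__(self, max_lag):
        super().__init__()
        self.max_lag = max_lag

    def forward(self, x):                      # x: (B, C, T), T > max_lag
        B, C, T = x.shape
        x = x - x.mean(dim=2, keepdim=True)    # zero-mean per channel
        lags = []
        for tau in range(self.max_lag + 1):
            a, b = x[:, :, : T - tau], x[:, :, tau:]
            lags.append((a * b).mean(dim=2))   # mean product at this lag
        return torch.stack(lags, dim=2)        # (B, C, max_lag + 1)

# Tiny end-to-end sketch: shallow conv features -> autocorrelation -> classifier.
net = nn.Sequential(
    nn.Conv1d(1, 8, kernel_size=5, padding=2), nn.ReLU(),
    AutocorrelationLayer(max_lag=16),
    nn.Flatten(),
    nn.Linear(8 * 17, 2),                      # e.g. typing vs. not typing
)
logits = net(torch.randn(4, 1, 120))           # 4 clips, 120 frames each
print(logits.shape)                            # torch.Size([4, 2])
```

Because every operation in the layer is differentiable, gradients flow through the autocorrelation into the shallow front end, which is what makes the end-to-end training described above possible.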

dc.identifier.uri

https://hdl.handle.net/10161/18208

dc.subject

Computer science

dc.subject

activity classification

dc.subject

database retrieval

dc.subject

hand motion regression and classification

dc.subject

motion texture

dc.subject

privacy filter

dc.subject

re-ID by gait

dc.title

Human Activity Analysis

dc.type

Dissertation

Files

Original bundle (2 files)

Name: Carley_duke_0066D_14843.pdf
Size: 13.23 MB
Format: Adobe Portable Document Format

Name: Carley_duke_0066D_17/Typing_Detection_Privacy_Filter_Demonstration.mp4
Size: 126.84 MB