Learning descriptive models of objects and activities from egocentric video