Welcome to RobotFlow

Multi-modal Perception

Perception has long been tied to “semantics”. When researchers say “multi-modal”, they usually mean “vision-language”. The integration of vision and language is indeed exciting, but we pay more attention to modalities that are less explored yet equally, if not more, important, such as force, temperature, and position. After all, if perception means understanding the world through sensory signals, then any sensor that can measure the world can provide useful information.

Interactive Perception

Interactive perception places conventional perception algorithms in an “interactive” setting. Do not confuse the term with applications such as interactive image editing or “human-in-the-loop” active learning. Here it specifically means that the interaction with the world is carried out by an intelligent agent. Interactive perception can be regarded as an interactive version of multi-modal perception.