Faculty Directory

Aloimonos, Yiannis

Professor
Computer Science
Electrical and Computer Engineering
UMIACS
The Institute for Systems Research
Maryland Robotics Center
Brain and Behavior Institute
4214 Iribe Center

Professor Aloimonos holds a Ph.D. in Computer Science from the University of Rochester.

His research is devoted to the principles governing the design and analysis of real-time systems that possess perceptual capabilities, for the purpose of both explaining animal vision and designing seeing machines. Such capabilities have to do with the ability of the system to control its motion and the motion of its parts using visual input (navigation and manipulation) and the ability of the system to break up its environment into a set of categories relevant to its tasks and recognize these categories (categorization and recognition).

The work is being done in the framework of Active and Purposive Vision, a paradigm also known as Animate or Behavioral Vision. In simple terms, this approach suggests that Vision has a purpose, a goal. This goal is action; it can be theoretical, practical or aesthetic. When Vision is considered in conjunction with action, it becomes easier. The reason is that the descriptions of space-time that the system needs to derive are not general purpose, but are purposive. This means that these descriptions are good for restricted sets of tasks, such as tasks related to navigation, manipulation and recognition.

If Vision is the process of deriving purposive space-time descriptions as opposed to general ones, one is faced with the difficult question of where to start (with which descriptions). Understanding moving images is a capability shared by all "seeing" biological systems. It was therefore decided to start with descriptions that involve time. Another reason for this is that motion problems are purely geometric, and understanding the geometry amounts to solving the problems. This led to a consideration of the problems of navigation. Within navigation, once again, one faces the same question: in which order should navigational capabilities be developed? This led to the development of a synthetic approach, according to which the order of development is related to the complexity of the underlying model. The appropriate starting point is the capability of understanding self-motion. By performing a geometric analysis of motion fields, global patterns of partial aspects of motion fields were found to be associated with particular 3D motions. This gave rise to a series of algorithms for recovering egomotion through pattern matching. The qualitative nature of the algorithms, in conjunction with the well-defined nature of the input (the input is the normal flow, i.e. the component of the flow along the gradient of the image), makes the solution stable against noise.
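
As a rough illustration of the normal flow mentioned above, the sketch below estimates it from two grayscale frames. It assumes brightness constancy and simple finite-difference derivatives, neither of which is specified in the text. The function name normal_flow and the eps threshold are hypothetical; this is not the egomotion algorithm itself, only one common way to compute its input.

```python
def normal_flow(frame0, frame1, eps=1e-6):
    """Estimate normal flow between two equally sized grayscale frames.

    Frames are lists of rows of floats. Under the brightness constancy
    assumption, Ix*u + Iy*v + It = 0, only the flow component along the
    image gradient is determined:
        n = -It * (Ix, Iy) / (Ix**2 + Iy**2)
    Border pixels and pixels with near-zero gradient are left at (0, 0).
    """
    h, w = len(frame0), len(frame0[0])
    flow = [[(0.0, 0.0)] * w for _ in range(h)]
    for y in range(1, h - 1):
        for x in range(1, w - 1):
            # Central differences for the spatial gradient (assumed scheme).
            ix = (frame0[y][x + 1] - frame0[y][x - 1]) / 2.0
            iy = (frame0[y + 1][x] - frame0[y - 1][x]) / 2.0
            # Forward difference in time.
            it = frame1[y][x] - frame0[y][x]
            g2 = ix * ix + iy * iy
            if g2 > eps:
                s = -it / g2
                flow[y][x] = (s * ix, s * iy)
    return flow


# Example: a horizontal intensity ramp shifted one pixel to the right.
ramp0 = [[2.0 * x for x in range(5)] for _ in range(5)]
ramp1 = [[2.0 * (x - 1) for x in range(5)] for _ in range(5)]
print(normal_flow(ramp0, ramp1)[2][2])  # (1.0, 0.0)
```

For the ramp, the gradient points along the x axis, so the normal flow coincides with the full one-pixel shift. In general, only the component along the gradient is recovered.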

Other problems, higher in the hierarchy of navigation, are independent motion detection, estimation of ordinal depth, and learning of space. To illustrate these topics, consider the case of ordinal depth. Traditionally, systems were supposed to estimate depth. Such metric information is too much to expect from systems that are supposed to just navigate successfully. Many tasks can be achieved by using an ordinal depth representation. Such a representation can be extracted without knowledge of the exact image motion or displacement. Recent studies on visual space distortion have triggered a new framework for understanding visual shape. A study of a spectrum of shape representations lying between the projective and Euclidean layers is currently underway.

The learning of space can be based on the principle of learning routes. A system knows the space around it if it can successfully visit a set of locations. With more memory available, relationships between the representations of different routes give rise to partial geocentric maps.

In hand-eye coordination, the concept of a perceptual kinematic map has been introduced. This is a map from the robot's joints to image features. Currently under investigation is the problem of creating a classification of the singularities of this map.

The work on active, anthropomorphic vision led to the study of fixation and the development of TALOS, a system that implements dynamic fixation. Since fixation is a principle of Active Vision and fixating observers build representations relative to fixations, it is important to solve fixation in real time and demonstrate it in hardware. TALOS consists of a binocular head/eye system augmented with additional sensors. It is designed to perform fixation as it is moving, in real time.

The ideas of Purposive Vision have led to the study of Intelligence as a purposive activity. A four-valued logic is being developed for handling reasoning in a system of interacting purposive agents.

 

Since the early 2000s he has been working on the integration of sensorimotor information with the conceptual system, bridging the gap between signals and symbols. This led to the introduction of language tools into the Robotics community. During the past five years his research has been supported by the European Union under the cognitive systems program in the projects POETICON and POETICON++, by the National Science Foundation under the Cyber-Physical Systems program in the project Robots with Vision that find objects, and by the National Institutes of Health in the project Human Activity Languages.

Here is an example of going from language to action (asking a robot to do something). Note how the robot announces that it has to think for a moment before performing the action, but does not reveal its thinking. Here, some of the thinking is revealed.

For the dual problem of going from action to language (observing an activity and describing in natural language what is going on), see our demos in the Telluride Neuromorphic Cognition Engineering workshops.


University of Maryland Has Strong Presence at ICRA 2024

Researchers detail advancements in navigation, trajectory planning.

Modi Briefed on UMD-led Aquaculture Research

Doctoral student Kaustubh Joshi delivered a presentation on modernizing the industry with advanced tech.

UMD’s SeaDroneSim can generate simulated images and videos to help UAV systems recognize ‘objects of interest’ in the water

The suite’s simulated objects could fill a large dataset gap for the UAV-based computer vision systems used in searching for maritime objects.

The Modern Battle for Maryland’s Oysters

UMD researchers use AI and robotics to help revive a struggling industry.

Three ECE Professors Ranked Top Scientists in the World by Guide2Research

They join seven other UMD faculty members breaking into the top 1000 scientist rankings based on their prolific research output.

UMD Researchers to Have a Strong Showing at ICRA 2021

The 2021 International Conference on Robotics and Automation (ICRA 2021) will be held both in person and online from May 30 to June 5, 2021.

New undergraduate minor in robotics and autonomous systems

The Maryland Robotics Center will administer the program, which begins in Fall 2021.

Maryland engineers receive $10M to transform shellfish farming

The team will help farmers tap the economic potential and environmental benefits of shellfish aquaculture.

Microrobots soon could be seeing better, say UMD faculty in Science Robotics

Commentary by Yiannis Aloimonos and Cornelia Fermüller notes this alternative to SLAM-reliant computer vision could one day permeate all robotics.

Helping robots remember

Hyperdimensional computing theory could change the way AI works

New research will help cyber-physical systems understand human activities

NSF grant funds Fermüller, Baras and Aloimonos to develop a three-layer architecture.