Deep reinforcement learning (DRL) has been successfully used to solve various robotic manipulation tasks. However, most existing works do not address the issue of control stability. This is in sharp contrast to the control theory community, where the well-established norm is to prove stability whenever a control law is synthesized. What makes traditional stability analysis difficult for DRL is the uninterpretable nature of neural network policies and the unknown system dynamics. In this work, unconditional stability is obtained by deriving an interpretable deep policy structure based on the energy shaping control of Lagrangian systems. Then, stability during physical interaction with an unknown environment is established based on passivity. The result is stability-guaranteed DRL in a model-free framework that is general enough for contact-rich manipulation tasks. With an experiment on a peg-in-hole task, we demonstrate, to the best of our knowledge, the first DRL with a stability guarantee on a real robotic manipulator.
Reinforcement Learning (RL) of robotic manipulation skills, despite its impressive successes, stands to benefit from incorporating domain knowledge from control theory. One of the most important properties of interest is control stability. Ideally, one would like to achieve stability guarantees while staying within the framework of state-of-the-art deep RL algorithms. Such a solution does not exist in general, especially one that scales to complex manipulation tasks. We contribute towards closing this gap by introducing a normalizing-flow control structure that can be deployed in any of the latest deep RL algorithms. While stable exploration is not guaranteed, our method is designed to ultimately produce deterministic controllers with provable stability. In addition to demonstrating our method on challenging contact-rich manipulation tasks, we also show that it is possible to achieve considerable exploration efficiency, in terms of reduced state space coverage and actuation effort, without losing learning efficiency.
In this paper we consider a mobile platform controlled by two entities: an autonomous agent and a human user. The human aims for the mobile platform to complete a task, which we denote as the human task, and imposes a control input accordingly, while not being aware of any other tasks the system should or must execute. The autonomous agent in turn plans its control input taking into consideration all safety requirements that must be met, some task that should be completed as much as possible (denoted as the robot task), as well as what it believes the human task is, based on previous human control input. A framework for the autonomous agent and a mixed-initiative controller are designed to guarantee the satisfaction of the safety requirements while violating both the human and robot tasks as little as possible. The framework includes an estimation algorithm for the human task which improves with each cycle, eventually converging to a task similar to the actual human task. Hence, the autonomous agent will eventually be able to find the optimal plan considering all tasks, and the human will have no need to interfere again. The process is illustrated with a simulated example.
Consider challenging sim-to-real cases lacking high-fidelity simulators and allowing only 10-20 hardware trials. This work shows that even imprecise simulation can be beneficial if used to build transfer-aware representations.
First, the thesis introduces an informed kernel that embeds the space of simulated trajectories into a lower-dimensional space of latent paths. It uses a sequential variational autoencoder (sVAE) to handle large-scale training from simulated data. Its modular design enables quick adaptation when used for Bayesian optimization (BO) on hardware. The thesis and the included publications demonstrate that this approach works for different areas of robotics: locomotion and manipulation. Furthermore, a variant of BO that ensures recovery from negative transfer when using corrupted kernels is introduced. An application to task-oriented grasping validates its performance on hardware.
For the case of parametric learning, simulators can serve as priors or regularizers. This work describes how to use simulation to regularize a VAE's decoder, binding the VAE's latent space to the simulator parameter posterior. With that, training on a small number of real trajectories can quickly shift the posterior to reflect reality. The included publication demonstrates that this approach can also help reinforcement learning (RL) quickly overcome the sim-to-real gap on a manipulation task on hardware.
A longer-term vision is to shape latent spaces without needing to mandate a particular simulation scenario. A first step is to learn general relations that hold on sequences of states from a set of related domains. This work introduces a unifying mathematical formulation for learning independent analytic relations. Relations are learned from source domains, then used to help structure the latent space when learning on target domains. This formulation enables a more general, flexible and principled way of shaping the latent space. It formalizes the notion of learning independent relations, without imposing restrictive simplifying assumptions or requiring domain-specific information. This work presents mathematical properties, concrete algorithms and experimental validation of successful learning and transfer of latent relations.
We develop an approach that benefits from large simulated datasets and takes full advantage of the limited online data that is most relevant. We propose a variant of Bayesian optimization that alternates between using informed and uninformed kernels. With this Bernoulli Alternation Kernel we ensure that discrepancies between simulation and reality do not hinder adapting robot control policies online. The proposed approach is applied to a challenging real-world problem of task-oriented grasping with novel objects. Our further contribution is a neural network architecture and training pipeline that use experience from grasping objects in simulation to learn grasp stability scores. We learn task scores from a labeled dataset with a convolutional network, which is used to construct an informed kernel for our variant of Bayesian optimization. Experiments on an ABB Yumi robot with real sensor data demonstrate success of our approach, despite the challenge of fulfilling task requirements and high uncertainty over physical properties of objects.
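To make the alternation mechanism concrete, here is a minimal, hypothetical sketch of the idea: at each BO iteration a Bernoulli draw decides whether the GP uses an informed kernel built on simulation-derived features or a plain uninformed kernel. The feature map `phi`, the probability `p_informed`, and the GP hyperparameters are illustrative placeholders, not the paper's implementation.

```python
import numpy as np

def rbf(X1, X2, ls=1.0):
    # Uninformed kernel: standard RBF on raw controller parameters.
    d = np.sum(X1**2, 1)[:, None] + np.sum(X2**2, 1)[None, :] - 2 * X1 @ X2.T
    return np.exp(-0.5 * d / ls**2)

def informed(X1, X2, phi, ls=1.0):
    # Informed kernel: RBF on simulation-derived features phi(x),
    # e.g. grasp/task scores predicted by a learned network.
    return rbf(phi(X1), phi(X2), ls)

def gp_posterior(Xq, X, y, kernel, noise=1e-3):
    # Standard GP posterior mean/variance at query points Xq.
    K = kernel(X, X) + noise * np.eye(len(X))
    Kq = kernel(Xq, X)
    L = np.linalg.cholesky(K)
    alpha = np.linalg.solve(L.T, np.linalg.solve(L, y))
    v = np.linalg.solve(L, Kq.T)
    return Kq @ alpha, np.diag(kernel(Xq, Xq)) - np.sum(v**2, axis=0)

def choose_kernel(p_informed, phi, rng):
    # Bernoulli Alternation: at each BO iteration, draw which kernel to trust.
    if rng.random() < p_informed:
        return lambda A, B: informed(A, B, phi)
    return rbf
```

The chosen kernel is then used in the GP posterior, over which a standard acquisition function (e.g. expected improvement) is maximized, exactly as in ordinary BO.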
Gaussian Processes (GPs) have been widely used in robotics as models, and more recently as key structures in active learning algorithms, such as Bayesian optimization. GPs consist of two main components: the mean function and the kernel. Specifying a prior mean function has been a common way to incorporate prior knowledge. When a prior mean function cannot be constructed manually, the next default has been to incorporate prior (simulated) observations into a GP as 'fake' data. Then, this GP would be used to further learn from true data on the target (real) domain. We argue that embedding prior knowledge into GP kernels instead provides a more flexible way to capture simulation-based information. We give examples of recent works that demonstrate the wide applicability of such kernel-centric treatment when using GPs as part of Bayesian optimization. We also provide a discussion that helps build intuition for why such a 'kernels as priors' view is beneficial.
Data-efficiency is crucial for autonomous robots to adapt to new tasks and environments. In this work, we focus on robotics problems with a budget of only 10-20 trials. This is a very challenging setting even for data-efficient approaches like Bayesian optimization (BO), especially when optimizing higher-dimensional controllers. Previous work extracted expert-designed low-dimensional features from simulation trajectories to construct informed kernels and run ultra sample-efficient BO on hardware. We remove the need for expert-designed features by proposing a model and architecture for a sequential variational autoencoder that embeds the space of simulated trajectories into a lower-dimensional space of latent paths in an unsupervised way. We further compress the search space for BO by reducing exploration in parts of the state space that are undesirable, without requiring explicit constraints on controller parameters. We validate our approach with hardware experiments on a Daisy hexapod robot and an ABB Yumi manipulator. We also present simulation experiments with further comparisons to several baselines on Daisy and two manipulators. Our experiments indicate the proposed trajectory-based kernel with dynamic compression can offer ultra data-efficient optimization.
Manipulation of deformable objects has given rise to an important set of open problems in the field of robotics. Application areas include robotic surgery, household robotics, manufacturing, logistics, and agriculture, to name a few. Related research problems span modeling and estimation of an object's shape, estimation of an object's material properties, such as elasticity and plasticity, object tracking and state estimation during manipulation, and manipulation planning and control. In this survey article, we start by providing a tutorial on foundational aspects of models of shape and shape dynamics. We then use this as the basis for a review of existing work on learning and estimation of these models and on motion planning and control to achieve desired deformations. We also discuss potential future lines of work.
Unmanned Aerial Vehicles (UAVs) are a potential solution for fast and cost-efficient package delivery services. There are two types of UAVs, namely fixed wing (UAV-FW) and rotor wing (UAV-RW), each with its own advantages and drawbacks. In this paper we aim at providing different solutions to a collaborative multi-agent scenario combining both UAV types. We show the problem can be reduced to the facility location problem (FLP) and propose two local search algorithms to solve it: tabu search and simulated annealing.
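As a rough illustration of the reduction, the following sketch runs simulated annealing on an uncapacitated facility location instance (facilities could, e.g., correspond to candidate UAV-FW drop locations serving UAV-RW delivery targets). All names, the neighborhood move, and the cooling schedule are illustrative assumptions rather than the algorithm used in the paper.

```python
import numpy as np

def flp_cost(open_mask, open_cost, dist):
    # Uncapacitated FLP: opening costs plus each client's distance to its nearest open facility.
    if not open_mask.any():
        return np.inf
    return open_cost[open_mask].sum() + dist[:, open_mask].min(axis=1).sum()

def simulated_annealing_flp(open_cost, dist, iters=5000, T0=1.0, cooling=0.999, seed=0):
    rng = np.random.default_rng(seed)
    sol = rng.random(len(open_cost)) < 0.5          # random initial open/closed assignment
    cost = flp_cost(sol, open_cost, dist)
    best, best_cost, T = sol.copy(), cost, T0
    for _ in range(iters):
        cand = sol.copy()
        cand[rng.integers(len(open_cost))] ^= True  # neighborhood move: toggle one facility
        c = flp_cost(cand, open_cost, dist)
        if c < cost or rng.random() < np.exp(-(c - cost) / T):  # Metropolis acceptance
            sol, cost = cand, c
            if c < best_cost:
                best, best_cost = cand.copy(), c
        T *= cooling                                # geometric cooling schedule
    return best, best_cost
```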
Since the 1950s, robotics research has sought to build a general-purpose agent capable of autonomous, open-ended interaction with realistic, unconstrained environments. Cognition is perceived to be at the core of this process, yet understanding has been challenged because cognition is referred to differently within and across research areas, and is not clearly defined. The classic robotics approach is decomposition into functional modules which perform planning, reasoning, and problem solving or provide input to these mechanisms. Although advancements have been made and numerous success stories reported in specific niches, this systems-engineering approach has not succeeded in building such a cognitive agent. The emergence of an action-oriented paradigm offers a new approach: action and perception are no longer separable into functional modules but must be considered in a complete loop. This chapter reviews work on different mechanisms for action-perception learning and discusses the role of embodiment in the design of the underlying representations and learning. It discusses the evaluation of agents and suggests the development of a new embodied Turing test. Appropriate scenarios need to be devised in addition to current competitions, so that abilities can be tested over long time periods.
The task of exploration does not end when the robot has covered the entire environment. The world is dynamic and to model this property and to keep the map up to date the robot needs to re-explore. In this work, we present an approach to long-term exploration that builds on prior work on dynamic mapping, volumetric representations of space, and exploration planning. The main contribution of our work is a novel formulation of the information gain function that controls the exploration so that it trades off revisiting highly dynamic areas where changes are very likely with covering the rest of the environment to ensure both coverage and up-to-date estimates of the dynamics. We provide experimental validation of our approach in three different simulated environments.
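The following is a deliberately simplified sketch of the kind of gain function described above: it scores a candidate view by combining never-observed voxels with voxels that are both likely to change and have not been seen recently. The weighting, the per-voxel quantities, and the function names are assumptions for illustration, not the paper's exact formulation.

```python
import numpy as np

def exploration_gain(unknown_mask, change_prob, last_seen, t_now, w_cov=1.0, w_dyn=1.0):
    """Gain of a candidate view over the voxels inside its frustum:
    - coverage term: voxels never observed before,
    - dynamics term: voxels likely to have changed, weighted by how long ago they were seen."""
    coverage = unknown_mask.sum()
    staleness = t_now - last_seen
    dynamics = (change_prob * staleness).sum()
    return w_cov * coverage + w_dyn * dynamics
```

Planning then proceeds as usual, e.g. by selecting the candidate view or path with the highest accumulated gain per travel cost.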
We present an approach to automatically assign semantic labels to rooms reconstructed from 3D RGB maps of apartments. Evidence for the room types is generated using state-of-the-art deep-learning techniques for scene classification and object detection based on automatically generated virtual RGB views, as well as from a geometric analysis of the map's 3D structure. The evidence is merged in a conditional random field, using statistics mined from different datasets of indoor environments. We evaluate our approach qualitatively and quantitatively and compare it to related methods.
In this paper, we propose a new way of doing formation obstacle avoidance using a combination of Constraint Based Programming (CBP) and Rapidly Exploring Random Trees (RRTs). RRT is used to select waypoint nodes, and CBP is used to move the formation between those nodes, reactively rotating and translating the formation to pass the obstacles on the way. Thus, the CBP includes constraints for both formation keeping and obstacle avoidance, while striving to move the formation towards the next waypoint. The proposed approach is compared to a pure RRT approach where the motion between the RRT waypoints is done following linear interpolation trajectories, which are less computationally expensive than the CBP ones. The results of a number of challenging simulations show that the proposed approach is more efficient for scenarios with high obstacle densities.
Learning state representations enables robotic planning directly from raw observations such as images. Several methods learn state representations by utilizing losses based on the reconstruction of the raw observations from a lower-dimensional latent space. The similarity between observations in the space of images is often assumed and used as a proxy for estimating similarity between the underlying states of the system. However, observations commonly contain task-irrelevant factors of variation which are nonetheless important for reconstruction, such as varying lighting and different camera viewpoints. In this work, we define relevant evaluation metrics and perform a thorough study of different loss functions for state representation learning. We show that models exploiting task priors, such as Siamese networks with a simple contrastive loss, outperform reconstruction-based representations in visual task planning in the presence of task-irrelevant factors of variation.
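For reference, a minimal PyTorch sketch of the kind of simple pairwise contrastive loss mentioned above, applied to the embeddings produced by a Siamese encoder; the margin value and the way same-state pairs are labelled are assumptions, not the paper's exact training setup.

```python
import torch
import torch.nn.functional as F

def contrastive_loss(z_a, z_b, same_state, margin=1.0):
    """Pairwise contrastive loss on Siamese embeddings.
    z_a, z_b: (B, D) latent codes of two observations; same_state: (B,) with 1.0
    if the two observations correspond to the same underlying system state."""
    d = F.pairwise_distance(z_a, z_b)
    pos = same_state * d.pow(2)                            # pull same-state pairs together
    neg = (1.0 - same_state) * F.relu(margin - d).pow(2)   # push different-state pairs apart
    return (pos + neg).mean()
```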
In this work we consider the problem of control under Signal Temporal Logic (STL) specifications that depend on relative position information among neighboring agents. In particular, we consider STL tasks for given pairs of agents whose satisfaction is translated into a set of setpoint output tracking problems with transient and steady-state constraints. Contrary to existing work, the proposed framework does not require initial satisfaction of the funnel constraints but can ensure their satisfaction within a pre-specified finite time. Given a tree topology in which agents sharing an STL task form an edge, we show that the resulting control laws ensure the satisfaction of the STL task as well as boundedness of all closed-loop signals using only local information.
We study the problem of motion feasibility for multiagent control systems on Lie groups with collision-avoidance constraints. We first consider the problem for kinematic left-invariant control systems and next, for dynamical control systems given by a left-trivialized Lagrangian function. Solutions of the kinematic problem give rise to linear combinations of the control inputs in a linear subspace, annihilating the collision-avoidance constraints. In the dynamical problem, motion feasibility conditions are obtained by using techniques from variational calculus on manifolds, given by a set of equations in a vector space, and Lagrange multipliers annihilating the constraint force that prevents the deviation of solutions from a constraint submanifold.
This paper presents a collision avoidance method for tele-operated unmanned aerial vehicles (UAVs). The method is designed to assist the operator at all times, such that the operator can focus solely on the main objectives instead of avoiding obstacles. We restrict the altitude to be fixed in a three dimensional environment to simplify the control and operation of the UAV. The method contributes a number of desired properties not found in other collision avoidance systems for tele-operated UAVs. Our method i) can handle situations where there is no input from the user by actively stopping and proceeding to avoid obstacles, ii) allows the operator to slide between prioritizing staying away from objects and getting close to them in a safe way when so required, and iii) provides for intuitive control by not deviating too far from the control input of the operator. We demonstrate the effectiveness of the method in real world experiments with a physical hexacopter in different indoor scenarios. We also present simulation results where we compare controlling the UAV with and without our method activated.
Cognitive science is witnessing a pragmatic turn away from the traditional representation-centered framework of cognition toward one that focuses on understanding cognition as being "enactive." The enactive view holds that cognition does not produce models of the world but rather subserves action, as it is grounded in sensorimotor skills. The conclusions of this Ernst Strungmann Forum suggest that strong conceptual advances are possible when cognition is framed by an action-oriented paradigm. Experimental evidence from cognitive science, neuroscience, psychology, robotics, and philosophy of mind supports this position. This chapter provides an overview of the discourse surrounding this collaborative effort. Core topics which guided this multidisciplinary perusal are identified and challenges that emerged are highlighted. Action-oriented views from a variety of disciplines have started to cross-fertilize, thus promoting an integration of concepts and creating fertile ground for a novel theory of cognition to emerge.
This paper addresses the navigation problem of a team of agents in a bounded workspace with static obstacles. In particular, we propose a decentralised control protocol such that each agent reaches a predefined position in the workspace, while using local information based on a limited sensing radius. The proposed scheme guarantees that the initially connected agents remain connected. In addition, by introducing certain distance constraints, we guarantee inter-agent collision avoidance as well as collision avoidance with the obstacles and the boundary of the workspace. The proposed controllers employ a class of Decentralized Nonlinear Model Predictive Controllers (DNMPC) in the presence of disturbances and uncertainties. Finally, simulation results verify the validity of the proposed framework.
Safe driving requires autonomous vehicles to anticipate potential hidden traffic participants and other unseen objects, such as a cyclist hidden behind a large vehicle, or an object on the road hidden behind a building. Existing methods are usually unable to consider all possible shapes and orientations of such obstacles. They also typically do not reason about observations of hidden obstacles over time, leading to conservative anticipations. We overcome these limitations by (1) modeling possible hidden obstacles as a set of states of a point mass model and (2) sequential reasoning based on reachability analysis and previous observations. Based on (1), our method is safer, since we anticipate obstacles of arbitrary unknown shapes and orientations. In addition, (2) increases the available drivable space when planning trajectories for autonomous vehicles. In our experiments, we demonstrate that our method, at no expense of safety, gives rise to significant reductions in time to traverse various intersection scenarios from the CommonRoad Benchmark Suite.
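A toy, grid-based sketch of the two ingredients: the set of cells that could contain a hidden obstacle (modelled as a point mass with bounded speed) is dilated by one step of reachability and then intersected with the currently occluded cells. The grid dilation is only a coarse stand-in for the reachability analysis in the paper, and all names are illustrative.

```python
import numpy as np
from scipy.ndimage import binary_dilation

def update_hidden_obstacle_set(possible_mask, occluded_mask, v_max, dt, cell_size):
    """One time step of the two ideas above, on an occupancy grid:
    (1) grow the set of cells that could contain a hidden point-mass obstacle by
        its maximum displacement v_max*dt (forward reachable set, grid approximation);
    (2) intersect with the cells that are still occluded, since cells observed
        free cannot contain a hidden obstacle."""
    r = int(np.ceil(v_max * dt / cell_size))
    reachable = binary_dilation(possible_mask, iterations=r)
    return np.logical_and(reachable, occluded_mask)
```

Iterating this update over consecutive observations shrinks the hypothesized obstacle set over time, which is what frees up additional drivable space for the planner.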
Reinforcement learning methods can achieve significant performance but require a large amount of training data collected on the same robotic platform. A policy trained with expensive data is rendered useless after making even a minor change to the robot hardware. In this paper, we address the challenging problem of adapting a policy, trained to perform a task, to a novel robotic hardware platform given only few demonstrations of robot motion trajectories on the target robot. We formulate it as a few-shot meta-learning problem where the goal is to find a meta-model that captures the common structure shared across different robotic platforms such that data-efficient adaptation can be performed. We achieve such adaptation by introducing a learning framework consisting of a probabilistic gradient-based meta-learning algorithm that models the uncertainty arising from the few-shot setting with a low-dimensional latent variable. We experimentally evaluate our framework on a simulated reaching and a real-robot picking task using 400 simulated robots generated by varying the physical parameters of an existing set of robotic platforms. Our results show that the proposed method can successfully adapt a trained policy to different robotic platforms with novel physical parameters, and that our meta-learning algorithm outperforms state-of-the-art methods on the introduced few-shot policy adaptation problem.
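To illustrate the gradient-based meta-learning loop (here in its plain first-order form, on a linear model, and without the probabilistic latent variable used in the paper), a minimal numpy sketch with hypothetical names:

```python
import numpy as np

def loss_and_grad(theta, X, y):
    # Mean-squared error and its gradient for a linear model y ≈ X @ theta.
    err = X @ theta - y
    return (err ** 2).mean(), 2.0 * X.T @ err / len(y)

def meta_update(theta, tasks, inner_lr=0.01, outer_lr=0.001, inner_steps=1):
    """One first-order MAML meta-update. Each task is (X_support, y_support,
    X_query, y_query), e.g. a few trajectories from one robot variant."""
    meta_grad = np.zeros_like(theta)
    for Xs, ys, Xq, yq in tasks:
        phi = theta.copy()
        for _ in range(inner_steps):        # fast adaptation on the few-shot support set
            _, g = loss_and_grad(phi, Xs, ys)
            phi -= inner_lr * g
        _, gq = loss_and_grad(phi, Xq, yq)  # evaluate adapted parameters on the query set
        meta_grad += gq                     # first-order approximation of the meta-gradient
    return theta - outer_lr * meta_grad / len(tasks)
```

At deployment, only the inner loop is run on the few demonstrations from the new robot, which is what makes the adaptation data-efficient.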
We present a tracking controller for mobile multi-robot systems based on dual quaternion pose representations applied to formations of robots in a leader-follower configuration, by using a cluster-space state approach. The proposed controller improves system performance with respect to previous works by reducing steady-state tracking errors. The performance is evaluated through experimental field tests with a formation of an unmanned ground vehicle (UGV) and an unmanned aerial vehicle (UAV), as well as a formation of two UAVs.
This paper presents a distributed control strategy for a multi-agent system commanded by a set of leaders that has to accomplish a high-level plan consisting of a sequence of tasks, each specified by a state-space region and a timed constraint. The agents are also subject to relative-distance constraints with their neighbors. The solution consists of an adaptive distributed mechanism to update the feedback gains for the leader agents, executed following a self-triggered algorithm. The results show how the proposed approach provides less conservative results than if the feedback gains are held constant, and are illustrated with a simulation example.
In this paper, we present a distributed hybrid control strategy for multiagent systems with contingent temporal tasks and prescribed formation constraints. Each agent is assigned a local task given as a linear temporal logic formula. In addition, two commonly seen kinds of cooperative robotic tasks, namely, service and formation, are requested and exchanged among the agents in real time. The service request is a short-term task provided by one agent to another. On the other hand, the formation request is a relative deployment requirement with predefined transient response imposed by an associated performance function. The proposed hybrid control strategy consists of four major components: 1) the contingent requests handling module; 2) the real-time events monitoring module; 3) the local discrete plan synthesis module; and 4) the continuous control switching module, and it is shown that all local tasks and contingent service/formation requests are fulfilled. Finally, a simulated paradigm demonstrates the proposed control strategy.
Current control applications necessitate in many cases the consideration of systems with multiple interconnected components. These components/agents may need to fulfill high-level tasks at a discrete planning layer and also coupled constraints at the continuous control layer. Toward this end, the need for combined decentralized control at the continuous layer and planning at the discrete layer becomes apparent. While there are approaches that handle the problem in a top-down centralized manner, decentralized bottom-up approaches have not been pursued to the same extent. We present here some of our results for the problem of combined, hybrid control and task planning from high-level specifications for multi-agent systems in a bottom-up manner. In the first part, we present some initial results on extending the necessary notion of abstractions to multi-agent systems in a distributed fashion. We then consider a setup where agents are assigned individual tasks in the form of linear temporal logic (LTL) formulas and derive local task planning strategies for each agent. In the last part, the problem of combined distributed task planning and control under coupled continuous constraints is further considered.
Cloth manipulation remains a challenging problem for the robotic community. Recently, there has been an increased interest in applying deep learning techniques to problems in the fashion industry. As a result, large annotated data sets for cloth category classification and landmark detection were created. In this work, we leverage these advances in deep learning to perform cloth manipulation. We propose a full cloth manipulation framework that performs category classification and landmark detection based on an image of a garment, followed by a manipulation strategy. The process is performed iteratively to achieve a stretching task where the goal is to bring a crumpled cloth into a stretched-out position. We extensively evaluate our learning pipeline and show a detailed evaluation of our framework on different types of garments in a total of 140 recorded and available experiments. Finally, we demonstrate the benefits of training a network on augmented fashion data over using a small robotic-specific data set.
In this paper, we introduce a hybrid zonotope-based approach for formally verifying the behavior of autonomous systems operating under Linear Temporal Logic (LTL) specifications. In particular, we formally verify the LTL formula by constructing temporal logic trees (TLTs) via backward reachability analysis (BRA). In previous works, TLTs are predominantly constructed with either highly general and computationally intensive level set-based BRA or simplistic and computationally efficient polytope-based BRA. In this work, we instead propose the construction of TLTs using hybrid zonotope-based BRA. By using hybrid zonotopes, we show that we are able to formally verify LTL specifications in a computationally efficient manner while still being able to represent complex geometries that are often present when deploying autonomous systems, such as non-convex, disjoint sets. Moreover, we evaluate our approach on a parking example, providing preliminary indications of how hybrid zonotopes facilitate computationally efficient formal verification of LTL specifications in environments that naturally lead to non-convex, disjoint geometries.
In this paper we present the system we developed for the Amazon Picking Challenge 2015, and discuss some of the lessons learned that may prove useful to researchers and future teams developing autonomous robot picking systems. For the competition we used a PR2 robot, which is a dual arm robot research platform equipped with a mobile base and a variety of 2D and 3D sensors. We adopted a behavior tree to model the overall task execution, where we coordinate the different perception, localization, navigation, and manipulation activities of the system in a modular fashion. Our perception system detects and localizes the target objects in the shelf and consists of two components: one for detecting textured rigid objects using the SimTrack vision system, and one for detecting non-textured or non-rigid objects using RGB-D features. In addition, we designed a set of grasping strategies to enable the robot to reach and grasp objects inside the confined volume of shelf bins. The competition was a unique opportunity to integrate the work of various researchers at the Robotics, Perception and Learning laboratory (formerly the Computer Vision and Active Perception Laboratory, CVAP) of KTH, and it tested the performance of our robotic system and defined the future direction of our research.
In underwater robotic interaction tasks (e.g., sampling of sea organisms, underwater welding, panel handling, etc.), various issues regarding the uncertainties and complexity of the robot dynamic model, the external disturbances (e.g., sea currents), the steady state performance as well as the overshooting/undershooting of the interaction force error should be addressed during the control design. Motivated by the aforementioned considerations, this paper presents a force/position tracking control protocol for an Underwater Vehicle Manipulator System (UVMS) in compliant contact with a planar surface, without incorporating any knowledge of the UVMS dynamic model, the exogenous disturbances or the contact stiffness model. Moreover, the proposed control framework guarantees: (i) certain predefined minimum speed of response, maximum steady state error as well as overshoot/undershoot concerning the force/position tracking errors, (ii) contact maintenance and (iii) bounded closed loop signals. Additionally, the achieved transient and steady state performance is solely determined by certain designer-specified performance functions/parameters and is fully decoupled from the control gain selection and the initial conditions. Finally, both simulation and experimental studies clarify the proposed method and verify its efficiency.
With the increasing penetration of distributed energy resources into islanded microgrids, a grid-forming inverter (GFI) has become the key element interfacing renewable energy sources. Usually, the GFI is employed with an output filter to minimize the harmonic content achieving high-quality output voltage regulation. To this end, model-predictive control (MPC) has been widely proposed to control the output voltage of GFI systems due to the high-quality performance, fast control response, and straightforward handling of constraints. However, an accurate model of the system is required for the conventional MPC to avoid a suboptimal performance under uncertainties. To overcome this known drawback, a novel model-free predictive control is proposed in this article. Consequently, the output voltage of an LCL-filtered GFI is regulated without the knowledge of the physical model.
We introduce the concept of structured synthesis for Markov decision processes. A structure is induced from finitely many pre-specified options for a system configuration. We define the structured synthesis problem as a nonlinear programming problem (NLP) with integer variables. As solving NLPs is not feasible in general, we present an alternative approach. A transformation of models specified in the PRISM probabilistic programming language creates models that account for all possible system configurations by nondeterministic choices. Together with a control module that ensures consistent configurations throughout a run of the system, this transformation enables the use of optimized tools for model checking in a black-box fashion. While this transformation increases the size of a model, experiments with standard benchmarks show that the method provides a feasible approach for structured synthesis. We motivate and demonstrate the usefulness of the approach along a realistic case study involving surveillance by unmanned aerial vehicles in a shipping facility.
In this paper, we explore communication protocols between two or more agents in an initially partially known environment. We assume two types of agents (A and B), where an agent of Type A constitutes an information source (e.g., a mobile sensor) with its own local objective expressed in temporal logic, and an agent of Type B constitutes an agent that accomplishes its own mission (e.g., search and rescue mission) also expressed in temporal logic. An agent of Type B requests information from an agent of Type A to update its knowledge about the environment. In this work, we develop an algorithm that is able to verify if a communication protocol exists, for any possible initial plan executed by an agent of Type B.
Task-oriented grasping refers to the problem of computing stable grasps on objects that allow for a subsequent execution of a task. Although grasping objects in a task-oriented manner comes naturally to humans, it is still very challenging for robots. Take for example a service robot deployed in a household. Such a robot should be able to execute complex tasks that might include cutting a banana or flipping a pancake. To do this, the robot needs to know what and how to grasp such that the task can be executed. There are several challenges when it comes to this. First, the robot needs to be able to select an appropriate object for the task. This pertains to the theory of affordances. Second, it needs to know how to place the hand such that the task can be executed, for example, grasping a knife on the handle when performing cutting. Finally, algorithms for task-oriented grasping should be scalable and have high generalizability over many object classes and tasks. This is challenging since there are no available datasets that contain information about mutual relations between objects, tasks and grasps. In this thesis, we present methods and algorithms for task-oriented grasping that rely on deep learning. We use deep learning to detect object affordances, predict task-oriented grasps on novel objects, and parse human activity datasets for the purpose of transferring this knowledge to a robot. For learning affordances, we present a method for detecting functional parts given a visual observation of an object and a task. We utilize the detected affordances together with other object properties to plan for stable, task-oriented grasps on novel objects. For task-oriented grasping, we present a system for predicting grasp scores that take into account both the task and the stability. The grasps are then executed on a real robot and refined via Bayesian optimization. Finally, for parsing human activity datasets, we present an algorithm for estimating 3D hand and object poses and shapes from 2D images so that information about the contacts and relative hand placement can be extracted. We demonstrate that we can use the information obtained in this manner to teach a robot task-oriented grasps by performing experiments with a real robot on a set of novel objects.
We propose to leverage a real-world, human activity RGB dataset to teach a robot Task-Oriented Grasping (TOG). We develop a model that takes as input an RGB image and outputs a hand pose and configuration as well as an object pose and shape. We follow the insight that jointly estimating hand and object poses increases accuracy compared to estimating these quantities independently of each other. Given the trained model, we process an RGB dataset to automatically obtain the data to train a TOG model. This model takes as input an object point cloud and outputs a suitable region for task-specific grasping. Our ablation study shows that training an object pose predictor with the hand pose information (and vice versa) is better than training without this information. Furthermore, our results on a real-world dataset show the applicability and competitiveness of our method over the state of the art. Experiments with a robot demonstrate that our method can allow a robot to perform TOG on novel objects.
We develop a system for modeling hand-object interactions in 3D from RGB images that show a hand which is holding a novel object from a known category. We design a Convolutional Neural Network (CNN) for Hand-held Object Pose and Shape estimation called HOPS-Net and utilize prior work to estimate the hand pose and configuration. We leverage the insight that information about the hand facilitates object pose and shape estimation by incorporating the hand into both training and inference of the object pose and shape as well as the refinement of the estimated pose. The network is trained on a large synthetic dataset of objects in interaction with a human hand. To bridge the gap between real and synthetic images, we employ an image-to-image translation model (Augmented CycleGAN) that generates realistically textured objects given a synthetic rendering. This provides a scalable way of generating annotated data for training HOPS-Net. Our quantitative experiments show that even noisy hand parameters significantly help object pose and shape estimation. The qualitative experiments show results of pose and shape estimation of objects held by a hand 'in the wild'.
The current trend in computer vision is the development of data-driven approaches where the use of large amounts of data tries to compensate for the complexity of the world captured by cameras. Are these approaches also viable solutions in robotics? Apart from 'seeing', a robot is capable of acting, thus purposively changing what and how it sees the world around it. There is a need for an interplay between processes such as attention, segmentation, object detection, recognition and categorization in order to interact with the environment. In addition, the parameterization of these is inevitably guided by the task or the goal a robot is supposed to achieve. In this talk, I will present the current state of the art in the area of robot vision and discuss open problems in the area. I will also show how visual input can be integrated with proprioception, tactile and force-torque feedback in order to plan, guide and assess a robot's action and interaction with the environment. Interaction between two agents builds on the ability to engage in mutual prediction and signaling. Thus, human-robot interaction requires a system that can interpret and make use of human signaling strategies in a social context. Our work in this area focuses on developing a framework for human motion prediction in the context of joint action in HRI. We base this framework on the idea that social interaction is highly influenced by sensorimotor contingencies (SMCs). Instead of constructing explicit cognitive models, we rely on the interaction between actions and the perceptual change that they induce in both the human and the robot. This approach allows us to employ a single model for motion prediction and goal inference and to seamlessly integrate the human actions into the environment and task context. We employ a deep generative model that makes inferences over future human motion trajectories given the intention of the human and the history as well as the task setting of the interaction. With the help of predictions drawn from the model, we can determine the most likely future motion trajectory and make inferences over intentions and objects of interest.
Partial observability and uncertainty are common problems in sequential decision-making that particularly impede the use of formal models such as Markov decision processes (MDPs). However, in practice, agents may be able to employ costly sensors to measure their environment and resolve partial observability by gathering information. Moreover, imprecise transition functions can capture model uncertainty. We combine these concepts and extend MDPs to robust active-measuring MDPs (RAM-MDPs). We present an active-measure heuristic to solve RAM-MDPs efficiently and show that model uncertainty can, counterintuitively, let agents take fewer measurements. We propose a method to counteract this behavior while only incurring a bounded additional cost. We empirically compare our methods to several baselines and show their superior scalability and performance.
This paper investigates the rendezvous problem for the autonomous cooperative landing of an unmanned aerial vehicle (UAV) on an unmanned surface vehicle (USV). Such heterogeneous agents, with nonlinear dynamics, are dynamically decoupled but share a common cooperative rendezvous task. The underlying control scheme is based on distributed Model Predictive Control (MPC). The main contribution is a rendezvous algorithm with an online update rule of the rendezvous location. The algorithm only requires the agents to exchange information when they cannot guarantee rendezvous. Hence, the exchange of information occurs aperiodically, which reduces the necessary communication between the agents. Furthermore, we prove that the algorithm guarantees recursive feasibility. The simulation results illustrate the effectiveness of the proposed algorithm applied to the problem of autonomous cooperative landing.
We propose a control protocol based on the prescribed performance control (PPC) methodology for a quadrotor unmanned aerial vehicle (UAV). Quadrotor systems belong to the class of underactuated systems, for which the original PPC methodology cannot be directly applied. We introduce the necessary design modifications to stabilize the considered system with prescribed performance. The proposed control protocol does not use any information about the dynamic model parameters or exogenous disturbances. Furthermore, the stability analysis guarantees that the tracking errors remain inside designer-specified time-varying functions, achieving prescribed performance independently of the control gain selection. Finally, simulation results verify the theoretical results.
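For intuition, here is a minimal sketch of the two generic prescribed-performance ingredients, a decaying performance function and the associated error transformation; the parameter values are placeholders and the underactuation-specific modifications introduced in the paper are not shown.

```python
import numpy as np

def performance_funnel(t, rho0=1.0, rho_inf=0.05, decay=1.0):
    # Exponentially decaying performance function rho(t); the tracking error
    # must remain inside the time-varying funnel (-rho(t), rho(t)).
    return (rho0 - rho_inf) * np.exp(-decay * t) + rho_inf

def transformed_error(e, rho):
    # Standard PPC transformation of the normalized error e/rho in (-1, 1) into
    # an unconstrained variable that grows unbounded near the funnel boundary.
    xi = e / rho
    return np.log((1.0 + xi) / (1.0 - xi))
```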
Signal Temporal Logic (STL) is a rigorous specification language that allows one to express various spatiotemporal requirements and preferences. Its semantics (called robustness) allows quantifying to what extent the STL specifications are met. In this work, we focus on enabling STL constraints and preferences in the Real-Time Rapidly Exploring Random Tree (RT-RRT*) motion planning algorithm in an environment with dynamic obstacles. We propose a cost function that guides the algorithm towards the asymptotically most robust solution, i.e., a plan that maximally adheres to the STL specification. In experiments, we applied our method to a social navigation case, where the STL specification captures spatiotemporal preferences on how a mobile robot should avoid an incoming human in a shared space. Our results show that our approach leads to plans adhering to the STL specification, while ensuring efficient cost computation.
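As a small illustration of how robustness can be turned into a planning cost, the sketch below computes the quantitative semantics of two simple STL fragments over a finite trace and uses the "always keep a safe distance to the human" margin as a cost term; the predicate, threshold, and weighting are illustrative assumptions, not the paper's exact cost function.

```python
import numpy as np

def rob_always_ge(signal, threshold):
    # Robustness of "always (signal >= threshold)" over a finite trace:
    # worst-case margin, positive iff the specification is satisfied.
    return np.min(signal - threshold)

def rob_eventually_ge(signal, threshold):
    # Robustness of "eventually (signal >= threshold)": best-case margin.
    return np.max(signal - threshold)

def stl_plan_cost(dist_to_human, d_safe=0.5):
    # Example cost term: plans with a larger worst-case margin to the human
    # (i.e. more robust w.r.t. the STL avoidance requirement) get lower cost.
    return -rob_always_ge(dist_to_human, d_safe)
```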
In this paper, we propose a method to infer temporal logic behaviour models of an a priori unknown system. We use the formalism of Signal Temporal Logic (STL), which can express various robot motion planning and control specifications, including spatial preferences. In our setting, data is collected through a series of queries the learning algorithm poses to the system under test. This active learning approach incrementally builds a hypothesis solution which, over time, converges to the actual behaviour of the system. Active learning presents several benefits compared to supervised learning: when prior labelling of data is costly and the system under test is accessible, the learning algorithm can interact with the system to refine its guess of the system's specification. Inspired by mobile robot navigation tasks, we present experimental case studies to demonstrate the relevance of our method.
Motivated by the recent interest in cyber-physical and autonomous robotic systems, we study the problem of dynamically coupled multi-agent systems under a set of signal temporal logic tasks. In particular, the satisfaction of each of these signal temporal logic tasks depends on the behavior of a distinct set of agents. Instead of abstracting the agent dynamics and the temporal logic tasks into a discrete domain and solving the problem therein, or using optimization-based methods, we derive collaborative feedback control laws. These control laws are based on a decentralized control barrier function condition that results in discontinuous control laws, as opposed to a centralized condition resembling the single-agent case. The benefits of our approach are inherent robustness properties typically present in feedback control as well as satisfaction guarantees for continuous-time multi-agent systems. More specifically, time-varying control barrier functions are used that account for the semantics of the signal temporal logic tasks at hand. For a certain fragment of signal temporal logic tasks, we further propose a systematic way to construct such control barrier functions. Finally, we show the efficacy and robustness of our framework in an experiment including a group of three omnidirectional robots.
Motivated by the recent interest in cyber-physical and interconnected autonomous systems, we study the problem of dynamically coupled multi-agent systems under conflicting local signal temporal logic tasks. Each agent is assigned a local signal temporal logic task regardless of the tasks that the other agents are assigned to. Such a task may be dependent, i.e., the satisfaction of the task may depend on the behavior of more than one agent, so that the satisfaction of the conjunction of all local tasks may be conflicting. We propose a hybrid feedback control strategy using time-varying control barrier functions. Our control strategy finds least violating solutions in the aforementioned conflicting situations based on a suitable robustness notion and by initiating collaboration among agents.
The need for computationally efficient control methods for dynamical systems under temporal logic tasks has recently become more apparent. Existing methods are computationally demanding and hence often not applicable in practice. Especially with respect to multi-robot systems, these methods do not scale computationally. In this letter, we propose a framework that is based on control barrier functions and signal temporal logic. In particular, time-varying control barrier functions are considered, where the temporal properties are used to satisfy signal temporal logic tasks. The resulting controller is given by a switching strategy between a computationally efficient convex quadratic program and a local feedback control law.
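A minimal sketch of the kind of computationally efficient convex quadratic program mentioned above, written with cvxpy for single-integrator dynamics; the callables h, grad_h, and dh_dt (which would encode the time-varying, STL-derived barrier) as well as the bounds are placeholders, not the paper's construction.

```python
import numpy as np
import cvxpy as cp

def cbf_qp(x, u_nom, h, grad_h, dh_dt, alpha=1.0, u_max=1.0):
    """Time-varying CBF quadratic program for single-integrator dynamics x_dot = u:
    track a nominal input while enforcing grad_h(x) . u + dh/dt + alpha * h(x) >= 0,
    which renders the set {h >= 0} (encoding the STL task) forward invariant."""
    u = cp.Variable(len(u_nom))
    constraints = [grad_h(x) @ u + dh_dt(x) + alpha * h(x) >= 0,
                   cp.norm(u, "inf") <= u_max]
    cp.Problem(cp.Minimize(cp.sum_squares(u - u_nom)), constraints).solve()
    return u.value
```

In the actual framework this QP is only one branch of a switching strategy; the other branch is a local feedback control law.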
Control systems that satisfy temporal logic specifications have become increasingly popular due to their applicability to robotic systems. Existing control methods, however, are computationally demanding, especially when the problem size becomes too large. In this paper, a robust and computationally efficient model predictive control framework for signal temporal logic specifications is proposed. We introduce discrete average space robustness, a novel quantitative semantics for signal temporal logic, that is directly incorporated into the cost function of the model predictive controller. The optimization problem entailed in this framework can be written as a convex quadratic program when no disjunctions are considered and results in a robust satisfaction of the specification. Furthermore, we define the predicate robustness degree as a new robustness notion. Simulations of a multi-agent system subject to complex specifications demonstrate the efficacy of the proposed method.
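As a toy illustration of the "average" robustness idea (in contrast to the worst-case minimum of classical robustness), the sketch below averages predicate margins over the prediction horizon and adds a control-effort term; the exact definition of discrete average space robustness in the paper is richer, and the names here are illustrative.

```python
import numpy as np

def average_space_robustness(margins):
    # Average of the predicate margins along the predicted trajectory, instead of
    # the worst-case minimum used in classical STL robustness; smooth and MPC-friendly.
    return np.mean(margins)

def mpc_cost(margins, u_seq, lam=1.0, R=0.1):
    # Reward average robustness (maximize it by minimizing its negative)
    # plus a quadratic control-effort penalty.
    return -lam * average_space_robustness(margins) + R * np.sum(np.square(u_seq))
```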
This brief presents the implementation and experimental results of two frameworks for multi-agent systems under temporal logic tasks, which we have recently proposed. Each agent is subject to either a local linear temporal logic (LTL) or a local signal temporal logic (STL) task where each task may further be coupled, i.e., the satisfaction of a task may depend on more than one agent. The agents are represented by mobile robots with different sensing and actuation capabilities. We propose to combine the two aforementioned frameworks to use the strengths of both LTL and STL. For the implementation, we take into account practical issues, such as collision avoidance, and, in particular, for the STL framework, input saturation, the digital implementation of continuous-time feedback control laws, and a controllability assumption that was made in the original work. The experimental results contain three scenarios that show a wide variety of tasks.