2020, ArXiv
Open-ended learning is a core research field of machine learning and robotics aiming to build learning machines and robots able to autonomously acquire knowledge and skills and to reuse them to solve novel tasks. The multiple challenges posed by open-ended learning have been operationalized in the robotic competition REAL 2020. The competition requires a simulated camera-arm-gripper robot to (a) autonomously learn to interact with objects during an intrinsic phase, where it can learn how to move objects, and then (b) during an extrinsic phase, reuse the acquired knowledge to accomplish externally given goals that require moving objects to specific locations unknown during the intrinsic phase. Here we present a 'baseline architecture' for solving the challenge, provided as the baseline model for REAL 2020. Few models have all the functionalities needed to solve the REAL 2020 benchmark, and none has been tested with it yet. The architecture we propose is formed by three components: ...
arXiv (Cornell University), 2020
Open-ended learning is a core research field of developmental robotics and AI aiming to build learning machines and robots that can autonomously acquire knowledge and skills incrementally, as infants and children do. The first contribution of this work is to study the challenges posed by the previously proposed benchmark 'REAL competition', which aims to foster the development of truly open-ended learning robot architectures. The competition involves a simulated camera-arm robot that: (a) in a first 'intrinsic phase' acquires sensorimotor competence by autonomously interacting with objects; (b) in a second 'extrinsic phase' is tested on tasks unknown in the intrinsic phase to measure the quality of the knowledge previously acquired. This benchmark requires the solution of multiple challenges usually tackled in isolation, in particular exploration, sparse rewards, object learning, generalisation, task/goal self-generation, and autonomous skill learning. As a second contribution, we present a set of 'REAL-X' robot architectures that are able to solve different versions of the benchmark, where we progressively release initial simplifications. The architectures are based on a planning approach that dynamically increases abstraction, and on intrinsic motivations to foster exploration. REAL-X achieves a good performance level in very demanding conditions. We argue that the REAL benchmark represents a valuable tool for studying open-ended learning in its hardest form.
IEEE Transactions on Cognitive and Developmental Systems, 2023
Open-ended learning is a core research field of developmental robotics and AI aiming to build learning machines and robots that can autonomously acquire knowledge and skills incrementally, as infants do. The first contribution of this work is to highlight the challenges posed by the previously proposed benchmark 'REAL competition', which fosters the development of truly open-ended learning robots. The benchmark involves a simulated camera-arm robot that: (a) in a first 'intrinsic phase' acquires sensorimotor competence by autonomously interacting with objects; (b) in a second 'extrinsic phase' is tested on tasks, unknown in the intrinsic phase, to measure the quality of previously acquired knowledge. The benchmark requires the solution of multiple challenges usually tackled in isolation, in particular exploration, sparse rewards, object learning, generalisation, task/goal self-generation, and autonomous skill learning. As a second contribution, the work presents a 'REAL-X' architecture. Different systems implementing the architecture can solve different versions of the benchmark, progressively releasing initial simplifications. The REAL-X systems are based on a planning approach that dynamically increases abstraction and on intrinsic motivations to foster exploration. Some systems achieve a good performance level in very demanding conditions. Overall, the REAL benchmark represents a valuable tool for studying open-ended learning in its hardest form.
2019
Success stories of applied machine learning can be traced back to the datasets and environments that were put forward as challenges for the community. The challenge that the community sets as a benchmark is usually the challenge that the community eventually solves. The ultimate challenge of reinforcement learning research is to train real agents to operate in the real environment, but until now there has not been a common real-world RL benchmark. In this work, we present a prototype real-world environment from OffWorld Gym -- a collection of real-world environments for reinforcement learning in robotics with free public remote access. Close integration into the existing ecosystem allows the community to start using OffWorld Gym without any prior experience in robotics and takes away the burden of managing a physical robotics system, abstracting it under a familiar API. We introduce a navigation task, where a robot has to reach a visual beacon on an uneven terrain using only the camera ...
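The "familiar API" referred to here is the OpenAI Gym interface, so driving a remote physical robot looks like any other Gym environment. A minimal sketch of such a control loop, using a hypothetical environment id (the actual OffWorld Gym registry names may differ):

    import gym

    # Hypothetical id; check the OffWorld Gym registry for the real names.
    env = gym.make("OffWorldMonolithDiscreteReal-v0")

    obs = env.reset()
    done = False
    while not done:
        action = env.action_space.sample()          # stand-in policy: random exploration
        obs, reward, done, info = env.step(action)  # classic Gym 4-tuple step
    env.close()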
2018
Reproducibility has been a significant challenge in deep reinforcement learning and robotics research. Open-source frameworks and standardized benchmarks can serve an integral role in rigorous evaluation and reproducible research. We introduce SURREAL, an open-source scalable framework that supports state-of-the-art distributed reinforcement learning algorithms. We design a principled distributed learning formulation that accommodates both on-policy and off-policy learning. We demonstrate that SURREAL algorithms outperform existing open-source implementations in both agent performance and learning efficiency. We also introduce the SURREAL Robotics Suite, an accessible set of benchmarking tasks in physical simulation for reproducible robot manipulation research. We provide extensive evaluations of SURREAL algorithms and establish strong baseline results.
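As a rough illustration of the distributed formulation described above (our naming, not SURREAL's actual API): actors generate trajectories while a learner consumes them, either immediately (on-policy, e.g. PPO) or by sampling a replay memory (off-policy, e.g. DDPG). A single-process stand-in keeps the sketch runnable:

    import random
    from collections import deque

    replay = deque(maxlen=100_000)          # off-policy replay memory

    def actor_rollout():
        """Stand-in for an actor process: returns one toy trajectory."""
        return [("obs", "action", 0.0)]

    def learner_step(on_policy: bool):
        traj = actor_rollout()
        replay.append(traj)
        if on_policy:
            batch = [traj]                  # freshest actor data only
        else:
            batch = random.sample(list(replay), min(32, len(replay)))
        return batch                        # -> feed into a gradient update

    learner_step(on_policy=False)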
2020 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS), 2020
Recent advances in deep reinforcement learning (RL) have demonstrated its potential to learn complex robotic manipulation tasks. However, RL still requires the robot to collect a large amount of real-world experience. To address this problem, recent works have proposed learning from expert demonstrations (LfD), particularly via inverse reinforcement learning (IRL), given its ability to achieve robust performance with only a small number of expert demonstrations. Nevertheless, deploying IRL on real robots is still challenging due to the large number of robot experiences it requires. This paper aims to address this scalability challenge with a robust, sample-efficient, and general meta-IRL algorithm, SQUIRL, that performs a new but related long-horizon task robustly given only a single video demonstration. First, this algorithm bootstraps the learning of a task encoder and a task-conditioned policy using behavioral cloning (BC). It then collects real-robot experiences and bypasses reward learning by directly recovering a Q-function from the combined robot and expert trajectories. Next, this algorithm uses the Q-function to reevaluate all cumulative experiences collected by the robot to improve the policy quickly. In the end, the policy performs more robustly (90%+ success) than BC on new tasks while requiring no trial and error at test time. Finally, our real-robot and simulated experiments demonstrate our algorithm's generality across different state spaces, action spaces, and vision-based manipulation tasks, e.g., pick-pour-place and pick-carry-drop.
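The stages the abstract lists can be summarised as a skeleton; every function body below is a toy stub with our own naming, not SQUIRL's actual code:

    def behavioral_cloning(demos):
        """Stage 1: bootstrap a task encoder and task-conditioned policy via BC."""
        encoder = lambda demo: sum(demo)       # toy task embedding
        policy = lambda obs, task_z: 0         # toy task-conditioned policy
        return encoder, policy

    def recover_q(robot_trajs, expert_trajs):
        """Stage 2: bypass reward learning; fit Q on combined robot+expert data."""
        return lambda state, action: 0.0       # toy Q-function

    def improve_policy(policy, q_fn, experiences):
        """Stage 3: re-evaluate all cumulative experience with Q to update policy."""
        return policy

    encoder, policy = behavioral_cloning(demos=[[1, 2, 3]])
    q_fn = recover_q(robot_trajs=[], expert_trajs=[[1, 2, 3]])
    policy = improve_policy(policy, q_fn, experiences=[])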
IEEE Robotics and Automation Letters, 2020
In order to effectively learn multi-step tasks, robots must be able to understand the context by which task progress is defined. In reinforcement learning, much of this information is provided to the learner by the reward function. However, comparatively little work has examined how the reward function captures, or fails to capture, task context in robotics, particularly in long-horizon tasks where failure is highly consequential. To address this issue, we describe the Schedule for Positive Task (SPOT) Reward and the SPOT-Q reinforcement learning algorithm, which efficiently learn multi-step block manipulation tasks in both simulation and real-world environments. SPOT-Q is remarkably effective compared to past benchmarks. It successfully completes simulated trials of a variety of tasks, including stacking cubes (98%), clearing toys arranged in random (100%) and adversarial (95%) patterns by pushing and grasping, and creating rows of cubes (93%). Furthermore, we demonstrate direct sim-to-real transfer. By directly loading the simulation-trained model on the real robot, we are able to create real stacks in 90% of trials and rows in 80% of trials with no additional real-world fine-tuning. Our system is also quite efficient: models train within 1-10k actions, depending on the task. As a result, our algorithm makes learning complex, multi-step tasks both efficient and practical for real-world manipulation. Code is available at https://github.com/jhu-lcsr/good_robot.
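A minimal sketch of a progress-weighted reward in the spirit of SPOT, as we read the abstract; this is our simplification, not the paper's exact schedule:

    def spot_style_reward(progress: int, horizon: int,
                          base_reward: float, reversed_progress: bool) -> float:
        """Actions that undo task progress earn nothing; successful sub-steps
        are weighted by how far along the multi-step task they occur."""
        if reversed_progress:          # e.g. knocking over a partially built stack
            return 0.0
        return base_reward * progress / horizon

    # Stacking 4 cubes: placing the 3rd cube earns more than the 1st,
    # while toppling the stack earns nothing.
    print(spot_style_reward(progress=3, horizon=4, base_reward=1.0,
                            reversed_progress=False))   # 0.75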
Australasian Conference on Robotics and Automation, 2023
Applications using Model-Free Reinforcement Learning (MFRL) have grown exponentially and have shown remarkable results in the last decade. The application of MFRL to robots shows significant promise for its capability to solve complex control problems, at least virtually or in simulation. Due to the practical challenges of training in a real-world environment, there is limited work bridging the gap to real physical robots. This article benchmarks state-of-the-art MFRL algorithms training on an open-source robotic manipulation testbed consisting of a fully actuated, 4-Degrees-of-Freedom (DoF), two-fingered robot gripper to understand the limitations and challenges involved in real-world applications. Experimental analysis using two different state-space representations is presented to understand their impact on executing a dexterous manipulation task. The source code, the CAD files of the robotic manipulation testbed, and a handy guide on how to approach MFRL's application to the real world are provided to facilitate replication of the results and further experimentation by other research groups.
2018
Through many recent successes in simulation, model-free reinforcement learning has emerged as a promising approach to solving continuous control robotic tasks. The research community is now able to reproduce, analyze, and quickly build on these results due to open-source implementations of learning algorithms and simulated benchmark tasks. To carry these successes forward to real-world applications, it is crucial to refrain from exploiting the unique advantages of simulation that do not transfer to the real world, and to experiment directly with physical robots. However, reinforcement learning research with physical robots faces substantial resistance due to the lack of benchmark tasks and supporting source code. In this work, we introduce several reinforcement learning tasks with multiple commercially available robots that present varying levels of learning difficulty, setup, and repeatability. On these tasks, we test the learning performance of off-the-shelf implementations of four reinfor...
2009
Software tools for programming autonomous systems that are embedded in unstructured environments are increasingly important in robotics. We introduce a layered software architecture designed to facilitate the construction of hierarchical models for adaptive control programs that are learned and that can be transferred to related contexts and new robots. We focus on the interface between a robot's sensory and motor resources and processes that learn autonomously by exploring effects of the robot's actions. We provide an implementation of this interface called the Control Basis Application Programming Interface (CBAPI) that is designed to create hierarchical behavior and implicit knowledge out of closed-loop control primitives. The CBAPI provides a natural combinatorial means of building closed-loop controllers by combining sensory and motor resources. By so doing, it supports a variety of techniques for structuring stochastic exploration and interactive machine learning. Moreover, it provides for a natural implicit knowledge representation. We believe that the CBAPI represents a programming interface for adaptive control programs that advances the state-of-the-art in robotic software environments.
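A hedged sketch of the control-basis idea described here, with our own class names rather than the CBAPI's actual interface: a closed-loop primitive pairs a potential function over a sensory signal with motor resources, and primitives compose combinatorially to form higher-level behavior:

    from dataclasses import dataclass
    from typing import Callable

    @dataclass
    class Primitive:
        potential: Callable[[float], float]   # error/potential over a sensor reading
        sensor: str                           # sensory resource, e.g. "hand_position_error"
        effector: str                         # motor resource, e.g. "arm_joints"

        def step(self, reading: float) -> float:
            return -self.potential(reading)   # toy gradient descent on the potential

    reach = Primitive(lambda x: x ** 2, "hand_position_error", "arm_joints")
    grasp = Primitive(lambda x: abs(x), "grip_force_error", "gripper")

    # Composition stand-in: the CBAPI composes controllers so subordinate ones
    # act only in the freedom the dominant one leaves; here simply sequenced.
    for primitive in (reach, grasp):
        command = primitive.step(reading=0.4)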
Hand-eye coordination is a requirement for many manipulation tasks, including grasping and reaching. However, accurate hand-eye coordination has proven especially difficult to achieve in complex robots such as the iCub humanoid. In this work, we solve the hand-eye coordination task using a visuomotor deep neural network predictor that estimates the arm's joint configuration given a stereo image pair of the arm and the underlying head configuration. As there are various unavoidable sources of sensing error on the physical robot, we train the predictor on images obtained from simulation. The images from simulation were modified to look realistic using an image-to-image translation approach. In various experiments, we first show that the visuomotor predictor provides accurate joint estimates of the iCub's hand in simulation. We then show that the predictor can be used to obtain the systematic error of the robot's joint measurements on the physical iCub robot. We demonstrate that a calibrator can be designed to automatically compensate for this error. Finally, we validate that this enables accurate reaching of objects while circumventing manual fine-calibration of the robot.
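A minimal sketch of the compensation step, under the assumption (ours) that the predictor's joint estimates can be averaged against raw encoder readings to recover a constant per-joint offset:

    import numpy as np

    predicted = np.array([[0.51, 1.02, -0.29],   # visuomotor predictor output
                          [0.48, 0.99, -0.31]])
    measured = np.array([[0.45, 0.95, -0.35],    # raw joint encoder readings
                         [0.42, 0.92, -0.37]])

    systematic_error = (predicted - measured).mean(axis=0)  # per-joint offset

    def calibrated(raw_joints: np.ndarray) -> np.ndarray:
        """Compensate the systematic encoder error before using the readings."""
        return raw_joints + systematic_error

    print(calibrated(np.array([0.40, 0.90, -0.40])))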
arXiv (Cornell University), 2023
Fig. 1: We propose an open, large-scale dataset for robot learning curated from 21 institutions across the globe. The dataset represents diverse behaviors, robot embodiments and environments, and enables learning generalized robotic policies.
arXiv (Cornell University), 2020
The success of reinforcement learning for real-world robotics has been, in many cases, limited to instrumented laboratory scenarios, often requiring arduous human effort and oversight to enable continuous learning. In this work, we discuss the elements that are needed for a robotic learning system that can continually and autonomously improve with data collected in the real world. We propose a particular instantiation of such a system, using dexterous manipulation as our case study. Subsequently, we investigate a number of challenges that come up when learning without instrumentation. In such settings, learning must be feasible without manually designed resets, using only on-board perception, and without hand-engineered reward functions. We propose simple and scalable solutions to these challenges, and then demonstrate the efficacy of our proposed system on a set of dexterous robotic manipulation tasks, providing an in-depth analysis of the challenges associated with this learning paradigm. We demonstrate that our complete system can learn without any human intervention, acquiring a variety of vision-based skills with a real-world three-fingered hand. Results and videos can be found at https://sites.google.com/view/realworld-rl/.
arXiv (Cornell University), 2019
Autonomy is fundamental for artificial agents acting in complex real-world scenarios. The acquisition of many different skills is pivotal to foster versatile autonomous behaviour and is thus a main objective for robotics and machine learning. Intrinsic motivations have proven able to generate a task-agnostic signal to drive the autonomous acquisition of multiple policies in settings requiring the learning of multiple tasks. However, in real-world scenarios tasks may be interdependent, so that some of them may constitute the precondition for learning others. Although different strategies have been used to tackle the acquisition of interdependent/hierarchical tasks, fully autonomous open-ended learning in these scenarios is still an open question. Building on previous research within the framework of intrinsically-motivated open-ended learning, we propose an architecture for robot control that tackles this problem from the point of view of decision making, i.e., treating the selection of tasks as a Markov Decision Process where the system selects the policies to be trained in order to maximise its competence over all the tasks. The system is then tested with a humanoid robot solving interdependent multiple reaching tasks.
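A hedged sketch of competence-based task selection, our simplification of the decision process described above: the system keeps training the task whose recent competence gain is highest, so tasks that are preconditions for others tend to get mastered first:

    import random

    competence = {"reach_A": 0.0, "reach_B": 0.0, "reach_B_after_A": 0.0}
    recent_gain = {task: 1.0 for task in competence}   # optimistic initialisation

    def select_task(eps: float = 0.1) -> str:
        if random.random() < eps:                      # keep exploring all tasks
            return random.choice(list(competence))
        return max(recent_gain, key=recent_gain.get)   # maximise competence gain

    def train(task: str) -> None:
        old = competence[task]
        competence[task] = min(1.0, old + random.uniform(0.0, 0.1))  # toy update
        recent_gain[task] = competence[task] - old

    for _ in range(100):
        train(select_task())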
Artificial Neural Networks and Machine Learning – ICANN 2022, 2022
Collecting large amounts of training data with a real robot to learn visuomotor abilities is time-consuming and limited by expensive robotic hardware. Simulators provide a safe, distributable way to collect data, but due to discrepancies between simulation and reality, learned strategies often do not transfer to the real world. This paper examines whether domain randomisation can increase the real-world performance of a model trained entirely in simulation without additional fine-tuning. We replicate a reach-to-grasp experiment with the NICO humanoid robot in simulation and develop a method to autonomously create training data for a supervised learning approach with an end-to-end convolutional neural architecture. We compare model performance and real-world transferability for different amounts of data and randomisation conditions. Our results show that domain randomisation improves the transferability of a model and can mitigate negative effects of overfitting.
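A minimal sketch of domain randomisation applied during data collection; the parameter names and ranges below are illustrative assumptions, not the paper's configuration:

    import random

    def randomised_scene_config() -> dict:
        """One randomly perturbed rendering configuration per training sample,
        so the learned model cannot overfit to a single look of the world."""
        return {
            "light_intensity": random.uniform(0.5, 1.5),
            "light_direction": [random.uniform(-1, 1) for _ in range(3)],
            "table_texture": random.choice(["wood", "plain", "noise"]),
            "object_hue_shift": random.uniform(-0.2, 0.2),
            "camera_jitter_deg": random.uniform(-3.0, 3.0),
        }

    for _ in range(4):
        config = randomised_scene_config()  # -> apply to simulator, render, label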
Procedia Computer Science, 2019
The application of deep learning in robotics leads to very specific problems and research questions that are typically not addressed by the computer vision and machine learning communities. In this paper we discuss a number of robotics-specific learning, reasoning, and embodiment challenges for deep learning. We explain the need for better evaluation metrics, highlight the importance and unique challenges for deep robotic learning in simulation, and explore the spectrum between purely data-driven and model-driven approaches. We hope this paper provides a motivating overview of important research directions to overcome the current limitations, and helps to fulfill the promising potentials of deep learning in robotics.
IEEE Robotics and Automation Letters, 2020
Intelligent manipulation benefits from the capacity to flexibly control an end-effector with high degrees of freedom (DoF) and dynamically react to the environment. However, due to the challenges of collecting effective training data and learning efficiently, most grasping algorithms today are limited to top-down movements and open-loop execution. In this work, we propose a new low-cost hardware interface for collecting grasping demonstrations by people in diverse environments. This data makes it possible to train a robust end-to-end 6DoF closed-loop grasping model with reinforcement learning that transfers to real robots. A key aspect of our grasping model is that it uses "action-view" based rendering to simulate future states with respect to different possible actions. By evaluating these states using a learned value function (e.g., Q-function), our method is able to better select corresponding actions that maximize total rewards (i.e., grasping success). Our final grasping system is able to achieve reliable 6DoF closed-loop grasping of novel objects across various scene configurations, as well as in dynamic scenes with moving objects.
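A hedged sketch of the "action-view" selection loop as we read the abstract (stand-in functions, not the paper's code): render the state each candidate 6DoF action would produce, score every rendered view with the learned Q-function, and execute the highest-scoring action:

    import random

    def render_future_view(state, action):
        """Stand-in renderer: would return the simulated post-action observation."""
        return (state, action)

    def q_value(view) -> float:
        """Stand-in learned Q-function scoring grasp success from a view."""
        return random.random()

    def select_grasp(state, candidate_actions):
        views = {a: render_future_view(state, a) for a in candidate_actions}
        return max(views, key=lambda a: q_value(views[a]))

    best = select_grasp(state="rgbd_obs", candidate_actions=["a1", "a2", "a3"])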
2022
In this paper, we study the problem of enabling a vision-based robotic manipulation system to generalize to novel tasks, a long-standing challenge in robot learning. We approach the challenge from an imitation learning perspective, aiming to study how scaling and broadening the data collected can facilitate such generalization. To that end, we develop an interactive and flexible imitation learning system that can learn from both demonstrations and interventions and can be conditioned on different forms of information that convey the task, including pre-trained embeddings of natural language or videos of humans performing the task. When scaling data collection on a real robot to more than 100 distinct tasks, we find that this system can perform 24 unseen manipulation tasks with an average success rate of 44%, without any robot demonstrations for those tasks.
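A minimal sketch of task conditioning as described above; the shapes, encoders, and names are illustrative assumptions, not the system's actual components:

    import numpy as np

    def embed_language(instruction: str) -> np.ndarray:
        return np.random.randn(64)     # stand-in for a pre-trained text encoder

    def embed_video(frames) -> np.ndarray:
        return np.random.randn(64)     # stand-in for a pre-trained video encoder

    def policy(observation: np.ndarray, task_embedding: np.ndarray) -> np.ndarray:
        x = np.concatenate([observation, task_embedding])  # condition on the task
        return np.tanh(x[:7])          # toy 7-DoF action head

    # The same policy can be conditioned on language or on a human video:
    action = policy(np.random.randn(128),
                    embed_language("put the carrot in the bowl"))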
European Conference on Artificial Intelligence, 2020
This paper presents the achievements obtained from a study performed within the IMPACT (Intrinsically Motivated Planning Architecture for Curiosity-driven roboTs) Project funded by the European Space Agency (ESA). The main contribution of the work is the realization of an innovative robotic architecture in which the well-known three-layered architectural paradigm (decisional, executive, and functional) for controlling robotic systems is enhanced with autonomous learning capabilities. The architecture is the outcome of the application of an interdisciplinary approach integrating Artificial Intelligence (AI), Autonomous Robotics, and Machine Learning (ML) techniques. In particular, state-of-the-art AI planning systems and algorithms were integrated with Reinforcement Learning (RL) algorithms guided by intrinsic motivations (curiosity, exploration, novelty, and surprise). The aim of this integration was to: (i) develop a software system that allows a robotic platform to autonomously represent in symbolic form the skills autonomously learned through intrinsic motivations; (ii) show that the symbolic representation can be profitably used for automated planning purposes, thus improving the robot's exploration and knowledge acquisition capabilities. The proposed solution is validated in a test scenario inspired by a typical space exploration mission involving a rover.
Applied Sciences
Models trained with Deep Reinforcement Learning (DRL) have been deployed in various areas of robotics with varying degrees of success. To overcome the limitations of data gathering in the real world, DRL training utilizes simulated environments and transfers the learned policy to real-world scenarios, i.e., sim–real transfer. Simulators fail to accurately capture the entire dynamics of the real world, so simulation-trained policies often fail when applied to reality, a problem termed the reality gap (RG). In this paper, we propose a search (mapping) algorithm that takes in real-world observations (images) and maps them to the policy-equivalent images in the simulated environment using a convolutional neural network (CNN) model. The two-step training process, a DRL policy plus a mapping model, overcomes the RG problem with simulated data only. We evaluated the proposed system with a gripping task of a custom-made robot arm in the real world and compared the performance against a conventional DRL sim–re...
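A hedged sketch of the two-step deployment described above, with stand-in functions of our own naming: a mapping CNN converts each real camera image into its policy-equivalent simulated image, and the unchanged simulation-trained policy then acts on the result:

    def mapping_cnn(real_image):
        """Stand-in for the trained real-to-sim image mapping model."""
        return real_image                 # identity placeholder

    def sim_trained_policy(sim_like_image):
        """Stand-in for the DRL policy trained purely in simulation."""
        return "grip_action"

    def act(real_image):
        sim_like = mapping_cnn(real_image)      # step 1: close the reality gap
        return sim_trained_policy(sim_like)     # step 2: reuse the sim policy

    action = act(real_image="camera_frame")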