Robotics

New submissions

Submissions received from Wed 1 May 24 to Thu 2 May 24, announced Fri, 3 May 24

[ total of 62 entries: 1-62 ]

New submissions for Fri, 3 May 24

[1] arXiv:2405.00685 [pdf, ps, other]: Title: The active visual sensing methods for robotic welding: review, tutorial and prospect

Authors: ZhenZhou Wang

Subjects: Robotics (cs.RO); Computer Vision and Pattern Recognition (cs.CV)

The visual sensing system is one of the most important parts of a welding robot for realizing intelligent and autonomous welding. Active visual sensing methods have been widely adopted in robotic welding because of their higher accuracy compared to passive visual sensing methods. In this paper, we give a comprehensive review of the active visual sensing methods for robotic welding. According to their uses, we divide the state-of-the-art active visual sensing methods into four categories: seam tracking, weld bead defect detection, 3D weld pool geometry measurement and welding path planning. First, we review the principles of these active visual sensing methods. Then, we give a tutorial on the 3D calibration methods for the active visual sensing systems used in intelligent welding robots to fill the gaps in the related fields. Finally, we compare the reviewed active visual sensing methods and discuss their prospects based on their advantages and disadvantages.
[2] arXiv:2405.00687 [pdf, other]: Title: Optimal Planning for Timed Partial Order Specifications

Authors: Kandai Watanabe, Georgios Fainekos, Bardh Hoxha, Morteza Lahijanian, Hideki Okamoto, Sriram Sankaranarayanan

Comments: 2024 IEEE International Conference on Robotics and Automation

Subjects: Robotics (cs.RO); Logic in Computer Science (cs.LO)

This paper addresses the challenge of planning a sequence of tasks to be performed by multiple robots while minimizing the overall completion time subject to timing and precedence constraints. Our approach uses the Timed Partial Orders (TPO) model to specify these constraints. We translate this problem into a Traveling Salesman Problem (TSP) variant with timing and precedence constraints, and we solve it as a Mixed Integer Linear Programming (MILP) problem. Our contributions include a general planning framework for TPO specifications, a MILP formulation accommodating time windows and precedence constraints, its extension to multi-robot scenarios, and a method to quantify plan robustness. We demonstrate our framework on several case studies, including an aircraft turnaround task involving three Jackal robots, highlighting the approach's potential applicability to important real-world problems. Our benchmark results show that our MILP method outperforms the state-of-the-art open-source TSP solver OR-Tools.
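To make the problem class in this abstract concrete, the following is a minimal sketch of single-robot task sequencing under precedence constraints and time windows. It is not the paper's MILP formulation: it simply enumerates all task orders, which only works for a handful of tasks. The function name `plan_tasks`, the location names, the travel times, and the windows are all hypothetical, and service times are assumed to be zero.

```python
from itertools import permutations


def plan_tasks(tasks, travel, start, precedence, windows):
    """Brute-force search for the task order with the smallest completion time.

    tasks      : list of task names (each task is also a location)
    travel     : travel[a][b] is the travel time from a to b
    start      : name of the starting location
    precedence : list of (a, b) pairs meaning "a must be done before b"
    windows    : windows[t] = (earliest, latest) allowed arrival time at t
    Returns (completion_time, order), or None if no order is feasible.
    """
    best = None
    for order in permutations(tasks):
        position = {task: i for i, task in enumerate(order)}
        # Skip orders that violate any precedence constraint.
        if any(position[a] > position[b] for a, b in precedence):
            continue
        time, location, feasible = 0, start, True
        for task in order:
            time += travel[location][task]
            earliest, latest = windows[task]
            time = max(time, earliest)  # wait if we arrive too early
            if time > latest:           # arrived after the window closed
                feasible = False
                break
            location = task
        if feasible and (best is None or time < best[0]):
            best = (time, order)
    return best


# Hypothetical instance: three tasks, symmetric travel times.
TRAVEL = {
    "depot": {"A": 2, "B": 4, "C": 3},
    "A": {"depot": 2, "B": 3, "C": 5},
    "B": {"depot": 4, "A": 3, "C": 2},
    "C": {"depot": 3, "A": 5, "B": 2},
}
WINDOWS = {"A": (0, 10), "B": (0, 20), "C": (8, 20)}
PRECEDENCE = [("A", "C")]  # A must be completed before C

if __name__ == "__main__":
    print(plan_tasks(["A", "B", "C"], TRAVEL, "depot", PRECEDENCE, WINDOWS))
```

Enumeration grows factorially with the number of tasks, which is one reason formulations such as the MILP described in the abstract are used for larger instances.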
[3] arXiv:2405.00688 [pdf, ps, other]: Title: Understanding Social Perception, Interactions, and Safety Aspects of Sidewalk Delivery Robots Using Sentiment Analysis

Authors: Yuchen Du, Tho V. Le

Comments: 34 pages, 7 figures, 2 tables

Subjects: Robotics (cs.RO); Artificial Intelligence (cs.AI); Computation and Language (cs.CL); Human-Computer Interaction (cs.HC); Machine Learning (cs.LG)

This article presents a comprehensive sentiment analysis (SA) of comments on YouTube videos related to Sidewalk Delivery Robots (SDRs). We manually annotated the collected YouTube comments with three sentiment labels: negative (0), positive (1), and neutral (2). We then constructed models for text sentiment classification and tested the models' performance on both binary and ternary classification tasks in terms of accuracy, precision, recall, and F1 score. Our results indicate that, in binary classification tasks, the Support Vector Machine (SVM) model using Term Frequency-Inverse Document Frequency (TF-IDF) and N-grams achieves the highest accuracy. In ternary classification tasks, the model using Bidirectional Encoder Representations from Transformers (BERT), Long Short-Term Memory Networks (LSTM) and Gated Recurrent Unit (GRU) significantly outperforms other machine learning models, achieving an accuracy, precision, recall, and F1 score of 0.78. Additionally, we employ the Latent Dirichlet Allocation model to generate 10 topics from the comments to explore the public's underlying views on SDRs. Drawing from these findings, we propose targeted recommendations for shaping future policies concerning SDRs. This work provides valuable insights for stakeholders in the SDR sector regarding social perception, interaction, and safety.
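As a reminder of how the four reported metrics relate to one another, here is a small sketch that computes accuracy, precision, recall, and F1 for a binary task using the abstract's label convention (negative = 0, positive = 1). The function name `binary_metrics` and the example labels are hypothetical and are not taken from the paper's data.

```python
def binary_metrics(y_true, y_pred, positive=1):
    """Compute accuracy, precision, recall and F1 for binary labels."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    tn = sum(1 for t, p in pairs if t != positive and p != positive)

    accuracy = (tp + tn) / len(pairs)
    # Guard against division by zero when a class is never predicted or present.
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if (precision + recall) else 0.0)
    return {"accuracy": accuracy, "precision": precision,
            "recall": recall, "f1": f1}


if __name__ == "__main__":
    # Hypothetical annotations: 0 = negative, 1 = positive.
    y_true = [1, 1, 1, 0, 0, 0, 1, 0]
    y_pred = [1, 1, 0, 0, 0, 1, 1, 0]
    print(binary_metrics(y_true, y_pred))
```

For the ternary task, per-class values of these metrics are usually averaged across the three labels; the abstract does not state which averaging scheme was used.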
[4] arXiv:2405.00689 [pdf, ps, other]: Title: Anti-Jamming Path Planning Using GCN for Multi-UAV

Authors: Haechan Jeong

Subjects: Robotics (cs.RO); Artificial Intelligence (cs.AI)

This paper addresses the increasing significance of UAVs (Unmanned Aerial Vehicles) and the emergence of UAV swarms for collaborative operations in various domains. However, the effectiveness of UAV swarms can be severely compromised by jamming technology, necessitating robust anti-jamming strategies. While existing methods such as frequency hopping and physical path planning have been explored, there remains a gap in research on path planning for UAV swarms when the jammer's location is unknown. To address this, a novel approach, where UAV swarms leverage collective intelligence to predict jamming areas, evade them, and efficiently reach target destinations, is proposed. This approach utilizes Graph Convolutional Networks (GCN) to predict the location and intensity of jamming areas based on information gathered from each UAV. A multi-agent control algorithm is then employed to disperse the UAV swarm, avoid jamming, and regroup upon reaching the target. Through simulations, the effectiveness of the proposed method is demonstrated, showcasing accurate prediction of jamming areas and successful evasion through obstacle avoidance algorithms, ultimately achieving the mission objective. The proposed method offers robustness, scalability, and computational efficiency, making it applicable across various scenarios where UAV swarms operate in potentially hostile environments.
[5] arXiv:2405.00690 [pdf, other]: Title: Scenarios Engineering driven Autonomous Transportation in Open-Pit Mines

Authors: Siyu Teng, Xuan Li, Yuchen Li, Lingxi Li, Yunfeng Ai, Long Chen

Subjects: Robotics (cs.RO); Artificial Intelligence (cs.AI)

One critical bottleneck that impedes the development and deployment of autonomous transportation in open-pit mines is guaranteed robustness and trustworthiness in prohibitively extreme scenarios. In this research, a novel scenarios engineering (SE) methodology for the autonomous mining truck is proposed for open-pit mines. SE increases the trustworthiness and robustness of autonomous trucks through four key components: Scenario Feature Extractor, Intelligence & Index (I&I), Calibration & Certification (C&C), and Verification & Validation (V&V). The Scenario Feature Extractor is a comprehensive pipeline approach that captures complex interactions and latent dependencies in complex mining scenarios. I&I effectively enhances the quality of the training dataset, thereby establishing a solid foundation for autonomous transportation in mining areas. C&C is grounded in the intrinsic regulation, capabilities, and contributions of the intelligent systems employed in autonomous transportation to align with traffic participants in the real world and ensure their performance through certification. In V&V, verification ensures that the autonomous transportation system is correctly implemented, while validation focuses on evaluating the ability of the well-trained model to operate efficiently in the complex and dynamic conditions of open-pit mines. This methodology addresses the unique challenges of autonomous transportation in open-pit mining, promoting productivity, safety, and performance in mining operations.
[6] arXiv:2405.00691 [pdf, other]: Title: Proactive Route Planning for Electric Vehicles

Authors: Saeed Nasehi, Farhana Choudhury, Egemen Tanin

Subjects: Robotics (cs.RO)

Due to the limited driving range, inadequate charging facilities, and time-consuming recharging, the process of finding an optimal charging route for electric vehicles (EVs) differs from that of other vehicle types. The time and location of EV charging during a trip impact not only the individual EV's travel time but also the travel time of other EVs, due to the queuing that may arise at the charging station(s). This issue is widely seen as a significant constraint on increasing EV sales in many countries. In this study, we present a novel Electric Vehicle Route Planning problem, which involves finding the fastest route with recharging for an EV routing request. We model the problem as a new graph problem and show that the problem is NP-hard. We propose a novel two-phase algorithm to traverse the graph to find the best possible charging route for each EV. We also introduce the notion of an 'influence factor' and use it to propose heuristics that find the minimum-travel-time route for an EV while avoiding charging stations, and recharging times at those stations, in a way that can lead to better travel time for other EVs. The results show that our method can decrease the total travel time of the EVs by 50% in comparison with the state-of-the-art on a real dataset, where the benefit of our approach is more significant as the number of EVs on the road increases.
[7] arXiv:2405.00693 [pdf, other]: Title: Large Language Models for Human-Robot Interaction: Opportunities and Risks

Authors: Jesse Atuhurra

Subjects: Robotics (cs.RO); Computation and Language (cs.CL)

The tremendous development in large language models (LLMs) has led to a new wave of innovations and applications and yielded research results that were initially forecast to take longer. In this work, we tap into these recent developments and present a meta-study about the potential of large language models if deployed in social robots. We place particular emphasis on the applications of social robots: education, healthcare, and entertainment. We also study how these language models could be safely trained, before being deployed in social robots, to "understand" societal norms and issues, such as trust, bias, ethics, cognition, and teamwork. We hope this study provides a resourceful guide to other robotics researchers interested in incorporating language models in their robots.
[8] arXiv:2405.00694 [pdf, ps, other]: Title: Analysis of the Efficacy of the Use of Inertial Measurement and Global Positioning System Data to Reverse Engineer Automotive CAN Bus Steering Signals

Authors: Kevin Setterstrom, Jeremy Straub

Subjects: Robotics (cs.RO); Systems and Control (eess.SY)

Autonomous vehicle control is growing in availability for new vehicles and there is a potential need to retrofit older vehicles with this capability. Additionally, automotive cybersecurity has become a significant concern in recent years due to documented attacks on vehicles. As a result, researchers have been exploring reverse engineering techniques to automate vehicle control and improve vehicle security and threat analysis. In prior work, a vehicle's accelerator and brake pedal controller area network (CAN) channels were identified using reverse engineering techniques without prior knowledge of the vehicle. However, the correlation results for deceleration were lower than those for acceleration, which may be able to be improved by incorporating data from an additional telemetry device. In this paper, a method that uses IMU and GPS data to reverse-engineer a vehicle's steering wheel position CAN channels, without prior knowledge of the vehicle, is presented. Using GPS data is shown to greatly improve correlation values for deceleration, particularly for the brake pedal CAN channels. This work demonstrates the efficacy of using these data sources for automotive CAN reverse engineering. This has potential uses in automotive vehicle control and for improving vehicle security and threat analysis.
[9] arXiv:2405.00695 [pdf, other]: Title: Joint torques prediction of a robotic arm using neural networks

Authors: Giulia d'Addato, Ruggero Carli, Eurico Pedrosa, Artur Pereira, Luigi Palopoli, Daniele Fontanelli

Comments: 6 pages, 5 figures, submitted to CASE 2024

Subjects: Robotics (cs.RO); Machine Learning (cs.LG)

Accurate dynamic models are crucial for many robotic applications. Traditional approaches to deriving these models are based on the application of Lagrangian or Newtonian mechanics. Although these methods provide a good insight into the physical behaviour of the system, they rely on the exact knowledge of parameters such as inertia, friction and joint flexibility. In addition, the system is often affected by uncertain and nonlinear effects, such as saturation and dead zones, which can be difficult to model. A popular alternative is the application of Machine Learning (ML) techniques - e.g., Neural Networks (NNs) - in the context of a "black-box" methodology. This paper reports on our experience with this approach for a real-life 6 degrees of freedom (DoF) manipulator. Specifically, we considered several NN architectures: single NN, multiple NNs, and cascade NN. We compared the performance of the system by using different policies for selecting the NN hyperparameters. Our experiments reveal that the best accuracy and performance are obtained by a cascade NN, in which we encode our prior physical knowledge about the dependencies between joints, complemented by an appropriate optimisation of the hyperparameters.
[10] arXiv:2405.00696 [pdf, other]: Title: Life-long Learning and Testing for Automated Vehicles via Adaptive Scenario Sampling as A Continuous Optimization Process

Authors: Jingwei Ge, Pengbo Wang, Cheng Chang, Yi Zhang, Danya Yao, Li Li

Subjects: Robotics (cs.RO)

Sampling critical testing scenarios is an essential step in intelligence testing for Automated Vehicles (AVs). However, due to the lack of prior knowledge on the distribution of critical scenarios in the sampling space, it is hard to find the critical scenarios efficiently or to evaluate the intelligence of AVs accurately. To solve this problem, we formulate the testing as a continuous optimization process which iteratively generates potential critical scenarios while evaluating these scenarios. A bi-level loop is proposed for such life-long learning and testing. In the outer loop, we iteratively learn space knowledge by evaluating the AV in the already sampled scenarios and then sample new scenarios based on the retained knowledge. The outer loop stops when all generated samples cover the whole space. To maximize the coverage of the space in each outer-loop iteration, we set up an inner loop which receives the newly generated samples from the outer loop and outputs the updated positions of these samples. We assume that points in a small sphere-like subspace can be covered (or represented) by the point in the center of this sphere. Therefore, we can apply a multi-round heuristic strategy to move and pack these spheres in space to find the best covering solution. The simulation results show that faster and more accurate evaluation of AVs can be achieved with more critical scenarios.
[11] arXiv:2405.00797 [pdf, other]: Title: ADM: Accelerated Diffusion Model via Estimated Priors for Robust Motion Prediction under Uncertainties

Authors: Jiahui Li, Tianle Shen, Zekai Gu, Jiawei Sun, Chengran Yuan, Yuhang Han, Shuo Sun, Marcelo H. Ang Jr

Comments: 7 pages, 4 figures

Subjects: Robotics (cs.RO); Computer Vision and Pattern Recognition (cs.CV)

Motion prediction is a challenging problem in autonomous driving as it demands that the system comprehend stochastic dynamics and the multi-modal nature of real-world agent interactions. Diffusion models have recently risen to prominence, and have proven particularly effective in pedestrian motion prediction tasks. However, the significant time consumption and sensitivity to noise have limited the real-time predictive capability of diffusion models. In response to these impediments, we propose a novel diffusion-based, acceleratable framework that adeptly predicts future trajectories of agents with enhanced resistance to noise. The core idea of our model is to learn a coarse-grained prior distribution of trajectories, which allows a large number of denoising steps to be skipped. This advancement not only boosts sampling efficiency but also preserves prediction accuracy. Our method meets the rigorous real-time operational standards essential for autonomous vehicles, enabling prompt trajectory generation that is vital for secure and efficient navigation. In extensive experiments, our method reduces the inference time to 136 ms compared to the standard diffusion model, and achieves significant improvement in multi-agent motion prediction on the Argoverse 1 motion forecasting dataset.
[12] arXiv:2405.00841 [pdf, other]: Title: Sim-Grasp: Learning 6-DOF Grasp Policies for Cluttered Environments Using a Synthetic Benchmark

Authors: Juncheng Li, David J. Cappelleri

Subjects: Robotics (cs.RO); Artificial Intelligence (cs.AI)

In this paper, we present Sim-Grasp, a robust 6-DOF two-finger grasping system that integrates advanced language models for enhanced object manipulation in cluttered environments. We introduce the Sim-Grasp-Dataset, which includes 1,550 objects across 500 scenarios with 7.9 million annotated labels, and develop Sim-GraspNet to generate grasp poses from point clouds. The Sim-Grasp-Policies achieve grasping success rates of 97.14% for single objects and 87.43% and 83.33% for mixed clutter scenarios of Levels 1-2 and Levels 3-4 objects, respectively. By incorporating language models for target identification through text and box prompts, Sim-Grasp enables both object-agnostic and target picking, pushing the boundaries of intelligent robotic systems.
[13] arXiv:2405.00846 [pdf, other]: Title: Gameplay Filters: Safe Robot Walking through Adversarial Imagination

Authors: Duy P. Nguyen, Kai-Chieh Hsu, Wenhao Yu, Jie Tan, Jaime F. Fisac

Subjects: Robotics (cs.RO); Machine Learning (cs.LG)

Ensuring the safe operation of legged robots in uncertain, novel environments is crucial to their widespread adoption. Despite recent advances in safety filters that can keep arbitrary task-driven policies from incurring safety failures, existing solutions for legged robot locomotion still rely on simplified dynamics and may fail when the robot is perturbed away from predefined stable gaits. This paper presents a general approach that leverages offline game-theoretic reinforcement learning to synthesize a highly robust safety filter for high-order nonlinear dynamics. This gameplay filter then maintains runtime safety by continually simulating adversarial futures and precluding task-driven actions that would cause it to lose future games (and thereby violate safety). Validated on a 36-dimensional quadruped robot locomotion task, the gameplay safety filter exhibits inherent robustness to the sim-to-real gap without manual tuning or heuristic designs. Physical experiments demonstrate the effectiveness of the gameplay safety filter under perturbations, such as tugging and unmodeled irregular terrains, while simulation studies shed light on how to trade off computation and conservativeness without compromising safety.
[14] arXiv:2405.00867 [pdf, other]: Title: A Convex Formulation of the Soft-Capture Problem

Authors: Ibrahima Sory Sow, Geordan Gutow, Howie Choset, Zachary Manchester

Comments: Accepted to ISpaRo24

Subjects: Robotics (cs.RO); Systems and Control (eess.SY); Optimization and Control (math.OC)

We present a fast trajectory optimization algorithm for the soft capture of uncooperative tumbling space objects. Our algorithm generates safe, dynamically feasible, and minimum-fuel trajectories for a six-degree-of-freedom servicing spacecraft to achieve soft capture (near-zero relative velocity at contact) between predefined locations on the servicer spacecraft and target body. We solve a convex problem by enforcing a convex relaxation of the field-of-view constraint, followed by a sequential convex program correcting the trajectory for collision avoidance. The optimization problems can be solved with a standard second-order cone programming solver, making the algorithm both fast and practical for implementation in flight software. We demonstrate the performance and robustness of our algorithm in simulation over a range of object tumble rates up to 10°/s.
[15] arXiv:2405.00882 [pdf, other]: Title: A Differentiable Dynamic Modeling Approach to Integrated Motion Planning and Actuator Physical Design for Mobile Manipulators

Authors: Zehui Lu, Yebin Wang

Subjects: Robotics (cs.RO); Systems and Control (eess.SY)

This paper investigates the differentiable dynamic modeling of mobile manipulators to facilitate efficient motion planning and physical design of actuators, where the actuator design is parameterized by physically meaningful motor geometry parameters. These parameters impact the manipulator's link mass, inertia, center-of-mass, torque constraints, and angular velocity constraints, influencing control authority in motion planning and trajectory tracking control. A motor's maximum torque/speed and how the design parameters affect the dynamics are modeled analytically, facilitating differentiable and analytical dynamic modeling. Additionally, an integrated locomotion and manipulation planning problem is formulated with direct collocation discretization, using the proposed differentiable dynamics and motor parameterization. Such dynamics are required to capture the dynamic coupling between the base and the manipulator. Numerical experiments demonstrate the effectiveness of differentiable dynamics in speeding up optimization and advantages in task completion time and energy consumption over an established sequential motion planning approach. Finally, this paper introduces a simultaneous actuator design and motion planning framework, providing numerical results to validate the proposed differentiable modeling approach for co-design problems.
[16] arXiv:2405.00943 [pdf, other]: Title: Space Debris Reliable Capturing by a Dual-Arm Orbital Robot: Detumbling and Caging

Authors: Akiyoshi Uchida, Kentaro Uno, Kazuya Yoshida

Comments: 8 pages, 14 figures. Manuscript accepted at the IEEE International Conference on Space Robotics (iSpaRo) 2024

Subjects: Robotics (cs.RO)

A chaser satellite equipped with robotic arms can capture space debris and manipulate it for use in more advanced missions such as refueling and deorbiting. To facilitate capturing, a caging-based strategy has been proposed to simplify the control system. Caging involves geometrically constraining the motion of the target debris, and is achieved via position control. However, if the target is spinning at a high speed, direct caging may result in unsuccessful constraints or hardware destruction; therefore, the target should be detumbled before capture. To address this problem, this study proposes a repeated contact-based method that uses impedance control to mitigate the momentum of the target. In this study, we analyzed the proposed detumbling technique from the perspective of impedance parameters. We investigated their effects through a parametric analysis and demonstrated the successful detumbling and caging sequence of a microsatellite as representative of space debris. The contact forces decreased during the detumbling sequence compared with direct caging. Further, the proposed detumbling and caging sequence was validated through simulations and experiments using a dual-arm air-floating robot in a two-dimensional microgravity-emulating testbed.
[17] arXiv:2405.00956 [pdf, other]: Title: Efficient Data-driven Scene Simulation using Robotic Surgery Videos via Physics-embedded 3D Gaussians

Authors: Zhenya Yang, Kai Chen, Yonghao Long, Qi Dou

Subjects: Robotics (cs.RO); Computer Vision and Pattern Recognition (cs.CV); Graphics (cs.GR)

Surgical scene simulation plays a crucial role in surgical education and simulator-based robot learning. Traditional approaches for creating these environments involve a labor-intensive process where designers hand-craft tissue models with textures and geometries for soft body simulations. This manual approach is not only time-consuming but also limited in scalability and realism. In contrast, data-driven simulation offers a compelling alternative. It has the potential to automatically reconstruct 3D surgical scenes from real-world surgical video data, followed by the application of soft body physics. This area, however, is relatively uncharted. In our research, we introduce 3D Gaussians as a learnable representation for surgical scenes, which is learned from stereo endoscopic video. To prevent over-fitting and ensure the geometrical correctness of these scenes, we incorporate depth supervision and anisotropy regularization into the Gaussian learning process. Furthermore, we apply the Material Point Method, which is integrated with physical properties, to the 3D Gaussians to achieve realistic scene deformations. Our method was evaluated on our collected in-house and public surgical video datasets. Results show that it can reconstruct and simulate surgical scenes from endoscopic videos efficiently (taking only a few minutes to reconstruct the surgical scene) and produce both visually and physically plausible deformations at a speed approaching real-time. The results demonstrate the great potential of our proposed method to enhance the efficiency and variety of simulations available for surgical education and robot learning.
[18] arXiv:2405.01019 [pdf, ps, other]: Title: Investigating the relationship between empathy and attribution of mental states to robots

Authors: Alberto Lillo, Alessandro Saracco, Elena Siletto, Claudio Mattutino, Cristina Gena

Subjects: Robotics (cs.RO); Human-Computer Interaction (cs.HC)

This paper describes an experimental evaluation aimed at detecting the users' perception of the robot's empathic abilities during a conversation. The results were then analyzed to search for a possible relationship between the perceived empathy and the attribution of mental states to the robot, namely the user's perception of the robot's mental qualities as compared to humans. The sample consisted of 68 subjects, including 34 adults and 34 teenagers and children. Conducting the experiment with both adult and child participants made it possible to compare the results obtained from each group and identify any differences in perception between the various age groups.
[19] arXiv:2405.01044 [pdf, other]: Title: Differentiable Particles for General-Purpose Deformable Object Manipulation

Authors: Siwei Chen, Yiqing Xu, Cunjun Yu, Linfeng Li, David Hsu

Subjects: Robotics (cs.RO)

Deformable object manipulation is a long-standing challenge in robotics. While existing approaches often focus narrowly on a specific type of object, we seek a general-purpose algorithm, capable of manipulating many different types of objects: beans, rope, cloth, liquid, and so on. One key difficulty is finding a suitable representation, rich enough to capture object shape and dynamics for manipulation, yet simple enough to be acquired effectively from sensor data. Specifically, we propose Differentiable Particles (DiPac), a new algorithm for deformable object manipulation. DiPac represents a deformable object as a set of particles and uses a differentiable particle dynamics simulator to reason about robot manipulation. To find the best manipulation action, DiPac combines learning, planning, and trajectory optimization through differentiable trajectory tree optimization. Differentiable dynamics provide significant benefits and enable DiPac to (i) estimate the dynamics parameters efficiently, thereby narrowing the sim-to-real gap, and (ii) choose the best action by backpropagating the gradient along sampled trajectories. Both simulation and real-robot experiments show promising results. DiPac handles a variety of object types. By combining planning and learning, DiPac outperforms both pure model-based planning methods and pure data-driven learning methods. In addition, DiPac is robust and adapts to changes in dynamics, thereby enabling the transfer of an expert policy from one object to another with different physical properties, e.g., from a rigid rod to a deformable rope.
[20] arXiv:2405.01054 [pdf, other]: Title: Continual Learning for Robust Gate Detection under Dynamic Lighting in Autonomous Drone Racing

Authors: Zhongzheng Qiao, Xuan Huy Pham, Savitha Ramasamy, Xudong Jiang, Erdal Kayacan, Andriy Sarabakha

Comments: 8 pages, 6 figures, in 2024 International Joint Conference on Neural Networks (IJCNN)

Subjects: Robotics (cs.RO); Computer Vision and Pattern Recognition (cs.CV); Machine Learning (cs.LG)

In autonomous and mobile robotics, a principal challenge is resilient real-time environmental perception, particularly in situations characterized by unknown and dynamic elements, as exemplified in the context of autonomous drone racing. This study introduces a perception technique for detecting drone racing gates under illumination variations, which are common during high-speed drone flights. The proposed technique relies upon a lightweight neural network backbone augmented with capabilities for continual learning. The envisaged approach amalgamates predictions of the gates' positional coordinates, distance, and orientation, encapsulating them into a cohesive pose tuple. A comprehensive set of tests serves to underscore the efficacy of this approach in confronting diverse and challenging scenarios, specifically those involving variable lighting conditions. The proposed methodology exhibits notable robustness in the face of illumination variations, thereby substantiating its effectiveness.
[21] arXiv:2405.01107 [pdf, other]: Title: CoViS-Net: A Cooperative Visual Spatial Foundation Model for Multi-Robot Applications

Authors: Jan Blumenkamp, Steven Morad, Jennifer Gielis, Amanda Prorok

Subjects: Robotics (cs.RO); Multiagent Systems (cs.MA); Systems and Control (eess.SY)

Spatial understanding from vision is crucial for robots operating in unstructured environments. In the real world, spatial understanding is often an ill-posed problem. There are a number of powerful classical methods that accurately regress relative pose; however, these approaches often lack the ability to leverage data-derived priors to resolve ambiguities. In multi-robot systems, these challenges are exacerbated by the need for accurate and frequent position estimates of cooperating agents. To this end, we propose CoViS-Net, a cooperative, multi-robot, visual spatial foundation model that learns spatial priors from data. Unlike prior work evaluated primarily on offline datasets, we design our model specifically for online evaluation and real-world deployment on cooperative robots. Our model is completely decentralized, platform agnostic, executable in real-time using onboard compute, and does not require existing network infrastructure. In this work, we focus on relative pose estimation and local Bird's Eye View (BEV) prediction tasks. Unlike classical approaches, we show that our model can accurately predict relative poses without requiring camera overlap, and predict BEVs of regions not visible to the ego-agent. We demonstrate our model on a multi-robot formation control task outside the confines of the laboratory.
[22] arXiv:2405.01115 [pdf, ps, other]: Title: A New Self-Alignment Method without Solving Wahba Problem for SINS in Autonomous Vehicles

Authors: Hongliang Zhang, Yilan Zhou, Lei Wang, Tengchao Huang

Subjects: Robotics (cs.RO); Systems and Control (eess.SY)

Initial alignment is one of the key technologies in strapdown inertial navigation systems (SINS), providing initial state information for vehicle attitude and navigation. In some situations, such as in an attitude heading reference system, the position is not necessarily required or even available, so self-alignment that does not rely on any external aid becomes necessary. This study presents a new self-alignment method under swaying conditions which, unlike existing methods, can determine the latitude and attitude simultaneously by utilizing all observation vectors without solving the Wahba problem. By constructing the dyadic tensor of each observation and reference vector itself, all equations related to observation and reference vectors are accumulated into one equation, where the latitude variable is extracted and solved according to the same eigenvalues of similar matrices on both sides of the equation, while the attitude is obtained by eigenvalue decomposition. Simulation and experimental tests verify the effectiveness of the proposed method, and the alignment result is better than TRIAD in convergence speed and stability and comparable with the OBA method in alignment accuracy with or without latitude. It is useful for guiding the design of initial alignment in autonomous vehicle applications.
[23] arXiv:2405.01134 [pdf, other]: Title: Leveraging Procedural Generation for Learning Autonomous Peg-in-Hole Assembly in Space

Authors: Andrej Orsula, Matthieu Geist, Miguel Olivares-Mendez, Carol Martinez

Comments: Accepted for publication at the 2024 International Conference on Space Robotics (iSpaRo) | The source code is available at this https URL

Subjects: Robotics (cs.RO); Artificial Intelligence (cs.AI); Machine Learning (cs.LG)

The ability to autonomously assemble structures is crucial for the development of future space infrastructure. However, the unpredictable conditions of space pose significant challenges for robotic systems, necessitating the development of advanced learning techniques to enable autonomous assembly. In this study, we present a novel approach for learning autonomous peg-in-hole assembly in the context of space robotics. Our focus is on enhancing the generalization and adaptability of autonomous systems through deep reinforcement learning. By integrating procedural generation and domain randomization, we train agents in a highly parallelized simulation environment across a spectrum of diverse scenarios with the aim of acquiring a robust policy. The proposed approach is evaluated using three distinct reinforcement learning algorithms to investigate the trade-offs among various paradigms. We demonstrate the adaptability of our agents to novel scenarios and assembly sequences while emphasizing the potential of leveraging advanced simulation techniques for robot learning in space. Our findings set the stage for future advancements in intelligent robotic systems capable of supporting ambitious space missions and infrastructure development beyond Earth.
[24] arXiv:2405.01192 [pdf, other]: Title: Imagine2touch: Predictive Tactile Sensing for Robotic Manipulation using Efficient Low-Dimensional Signals

Authors: Abdallah Ayad, Adrian Röfer, Nick Heppert, Abhinav Valada

Comments: 3 pages, 3 figures, 2 tables, accepted at ViTac2024 ICRA2024 Workshop. arXiv admin note: substantial text overlap with arXiv:2403.15107

Subjects: Robotics (cs.RO); Computer Vision and Pattern Recognition (cs.CV)

Humans seemingly incorporate potential touch signals in their perception. Our goal is to equip robots with a similar capability, which we term Imagine2touch. Imagine2touch aims to predict the expected touch signal based on a visual patch representing the area to be touched. We use ReSkin, an inexpensive and compact touch sensor, to collect the required dataset through random touching of five basic geometric shapes and one tool. We train Imagine2touch on two out of those shapes and validate it on the out-of-distribution (OOD) tool. We demonstrate the efficacy of Imagine2touch through its application to the downstream task of object recognition. In this task, we evaluate Imagine2touch performance in two experiments, together comprising five out-of-training-distribution objects. Imagine2touch achieves an object recognition accuracy of 58% after ten touches per object, surpassing a proprioception baseline.
[25] arXiv:2405.01266 [pdf, other]: Title: MFTraj: Map-Free, Behavior-Driven Trajectory Prediction for Autonomous Driving

Authors: Haicheng Liao, Zhenning Li, Chengyue Wang, Huanming Shen, Bonan Wang, Dongping Liao, Guofa Li, Chengzhong Xu

Comments: Accepted by IJCAI 2024

Subjects: Robotics (cs.RO); Artificial Intelligence (cs.AI)

This paper introduces a trajectory prediction model tailored for autonomous driving, focusing on capturing complex interactions in dynamic traffic scenarios without reliance on high-definition maps. The model, termed MFTraj, harnesses historical trajectory data combined with a novel dynamic geometric graph-based behavior-aware module. At its core, an adaptive structure-aware interactive graph convolutional network captures both positional and behavioral features of road users, preserving spatial-temporal intricacies. Enhanced by a linear attention mechanism, the model achieves computational efficiency and reduced parameter overhead. Evaluations on the Argoverse, NGSIM, HighD, and MoCAD datasets underscore MFTraj's robustness and adaptability, outperforming numerous benchmarks even in data-challenged scenarios without the need for additional information such as HD maps or vectorized maps. Importantly, it maintains competitive performance even in scenarios with substantial missing data, on par with most existing state-of-the-art models. The results and methodology suggest a significant advancement in autonomous driving trajectory prediction, paving the way for safer and more efficient autonomous systems.
[26] arXiv:2405.01284 [pdf, other]: Title: Behavior Imitation for Manipulator Control and Grasping with Deep Reinforcement Learning

Authors: Liu Qiyuan

Comments: 50 pages, 30 figures, Final Year Project Report at Nanyang Technological University, Singapore. This article is an NTU FYP report. The formal paper is still in preparation

Subjects: Robotics (cs.RO); Machine Learning (cs.LG)

The existing Motion Imitation models typically require expert data obtained through MoCap devices, but the vast amount of training data needed is difficult to acquire, necessitating substantial investments of financial resources, manpower, and time. This project combines 3D human pose estimation with reinforcement learning, proposing a novel model that simplifies Motion Imitation into a prediction problem of joint angle values in reinforcement learning. This significantly reduces the reliance on vast amounts of training data, enabling the agent to learn an imitation policy from just a few seconds of video and exhibit strong generalization capabilities. It can quickly apply the learned policy to imitate human arm motions in unfamiliar videos. The model first extracts skeletal motions of human arms from a given video using 3D human pose estimation. These extracted arm motions are then morphologically retargeted onto a robotic manipulator. Subsequently, the retargeted motions are used to generate reference motions. Finally, these reference motions are used to formulate a reinforcement learning problem, enabling the agent to learn a policy for imitating human arm motions. This project excels at imitation tasks and demonstrates robust transferability, accurately imitating human arm motions from other unfamiliar videos. This project provides a lightweight, convenient, efficient, and accurate Motion Imitation model. While simplifying the complex process of Motion Imitation, it achieves notably outstanding performance.
[27] arXiv:2405.01316 [pdf, other]: Title: LOG-LIO2: A LiDAR-Inertial Odometry with Efficient Uncertainty Analysis

Authors: Kai Huang, Junqiao Zhao, Jiaye Lin, Zhongyang Zhu, Shuangfu Song, Chen Ye, Tiantian Feng

Subjects: Robotics (cs.RO)

Uncertainty in LiDAR measurements, stemming from factors such as range sensing, is crucial for LIO (LiDAR-Inertial Odometry) systems as it affects the accurate weighting in the loss function. While recent LIO systems address uncertainty related to range sensing, the impact of incident angle on uncertainty is often overlooked by the community. Moreover, the existing uncertainty propagation methods suffer from computational inefficiency. This paper proposes a comprehensive point uncertainty model that accounts for both the uncertainties from LiDAR measurements and surface characteristics, along with an efficient local uncertainty analytical method for LiDAR-based state estimation problem. We employ a projection operator that separates the uncertainty into the ray direction and its orthogonal plane. Then, we derive incremental Jacobian matrices of eigenvalues and eigenvectors w.r.t. points, which enables a fast approximation of uncertainty propagation. This approach eliminates the requirement for redundant traversal of points, significantly reducing the time complexity of uncertainty propagation from $\mathcal{O} (n)$ to $\mathcal{O} (1)$ when a new point is added. Simulations and experiments on public datasets are conducted to validate the accuracy and efficiency of our formulations. The proposed methods have been integrated into a LIO system, which is available at https://github.com/tiev-tongji/LOG-LIO2.
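To illustrate why an incremental formulation avoids re-traversing all points, here is a minimal sketch that keeps running sums so the mean and covariance of a point set can be refreshed with constant work per new point. It is a generic running-statistics example, not the paper's Jacobian-based uncertainty propagation; the class and method names are hypothetical.

```python
class RunningCovariance:
    """Mean and (population) covariance with O(1) work per added point."""

    def __init__(self, dim=3):
        self.dim = dim
        self.n = 0
        self.sum = [0.0] * dim                           # sum of points
        self.outer = [[0.0] * dim for _ in range(dim)]   # sum of p p^T

    def add(self, p):
        # Cost depends only on dim, not on how many points were added before.
        self.n += 1
        for i in range(self.dim):
            self.sum[i] += p[i]
            for j in range(self.dim):
                self.outer[i][j] += p[i] * p[j]

    def mean(self):
        return [s / self.n for s in self.sum]

    def covariance(self):
        m = self.mean()
        return [[self.outer[i][j] / self.n - m[i] * m[j]
                 for j in range(self.dim)]
                for i in range(self.dim)]
```

A batch recomputation would instead loop over all n stored points each time a point arrives, which is the O(n) cost the abstract refers to.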
[28] arXiv:2405.01328 [pdf, ps, other]: Title: An Advanced Framework for Ultra-Realistic Simulation and Digital Twinning for Autonomous Vehicles

Authors: Yuankai He, Hanlin Chen, Weisong Shi

Comments: 6 pages, 5 figures, 1 table

Subjects: Robotics (cs.RO)

Simulation is a fundamental tool in developing autonomous vehicles, enabling rigorous testing without the logistical and safety challenges associated with real-world trials. As autonomous vehicle technologies evolve and public safety demands increase, advanced, realistic simulation frameworks are critical. Current testing paradigms employ a mix of general-purpose and specialized simulators, such as CARLA and IVRESS, to achieve high-fidelity results. However, these tools often struggle with compatibility due to differing platform, hardware, and software requirements, severely hampering their combined effectiveness. This paper introduces BlueICE, an advanced framework for ultra-realistic simulation and digital twinning, to address these challenges. BlueICE's innovative architecture allows for the decoupling of computing platforms, hardware, and software dependencies while offering researchers customizable testing environments to meet diverse fidelity needs. Key features include containerization to ensure compatibility across different systems, a unified communication bridge for seamless integration of various simulation tools, and synchronized orchestration of input and output across simulators. This framework facilitates the development of sophisticated digital twins for autonomous vehicle testing and sets a new standard in simulation accuracy and flexibility. The paper further explores the application of BlueICE in two distinct case studies: the ICAT indoor testbed and the STAR campus outdoor testbed at the University of Delaware. These case studies demonstrate BlueICE's capability to create sophisticated digital twins for autonomous vehicle testing and underline its potential as a standardized testbed for future autonomous driving technologies.
[29] arXiv:2405.01333 [pdf, other]: Title: NeRF in Robotics: A Survey

Authors: Guangming Wang, Lei Pan, Songyou Peng, Shaohui Liu, Chenfeng Xu, Yanzi Miao, Wei Zhan, Masayoshi Tomizuka, Marc Pollefeys, Hesheng Wang

Comments: 21 pages, 19 figures

Subjects: Robotics (cs.RO); Computer Vision and Pattern Recognition (cs.CV)

Meticulous 3D environment representations have been a longstanding goal in the computer vision and robotics fields. The recent emergence of neural implicit representations has introduced radical innovation to this field as implicit representations enable numerous capabilities. Among these, the Neural Radiance Field (NeRF) has sparked a trend because of the huge representational advantages, such as simplified mathematical models, compact environment storage, and continuous scene representations. Apart from computer vision, NeRF has also shown tremendous potential in the field of robotics. Thus, we create this survey to provide a comprehensive understanding of NeRF in the field of robotics. By exploring the advantages and limitations of NeRF, as well as its current applications and future potential, we hope to shed light on this promising area of research. Our survey is divided into two main sections: "The Application of NeRF in Robotics" and "The Advance of NeRF in Robotics", from the perspective of how NeRF enters the field of robotics. In the first section, we introduce and analyze some works that have been or could be used in the field of robotics from the perception and interaction perspectives. In the second section, we show some works related to improving NeRF's own properties, which are essential for deploying NeRF in the field of robotics. In the discussion section of the review, we summarize the existing challenges and provide some valuable future research directions for reference.
[30] arXiv:2405.01354 [pdf, other]: Title: Human-Robot Interaction Conversational User Enjoyment Scale (HRI CUES)

Authors: Bahar Irfan, Jura Miniota, Sofia Thunberg, Erik Lagerstedt, Sanna Kuoppamäki, Gabriel Skantze, André Pereira

Subjects: Robotics (cs.RO); Human-Computer Interaction (cs.HC)

Understanding user enjoyment is crucial in human-robot interaction (HRI), as it can impact interaction quality and influence user acceptance and long-term engagement with robots, particularly in the context of conversations with social robots. However, current assessment methods rely solely on self-reported questionnaires, failing to capture interaction dynamics. This work introduces the Human-Robot Interaction Conversational User Enjoyment Scale (HRI CUES), a novel scale for assessing user enjoyment from an external perspective during conversations with a robot. Developed through rigorous evaluations and discussions of three annotators with relevant expertise, the scale provides a structured framework for assessing enjoyment in each conversation exchange (turn) alongside overall interaction levels. It aims to complement self-reported enjoyment from users and holds the potential for autonomously identifying user enjoyment in real-time HRI. The scale was validated on 25 older adults' open-domain dialogue with a companion robot that was powered by a large language model for conversations, corresponding to 174 minutes of data, showing moderate to good alignment. Additionally, the study offers insights into understanding the nuances and challenges of assessing user enjoyment in robot interactions, and provides guidelines on applying the scale to other domains.
[31] arXiv:2405.01361 [pdf, other]: Title: Haptic-Based Bilateral Teleoperation of Aerial Manipulator for Extracting Wedged Object with Compensation of Human Reaction Time

Authors: Jeonghyun Byun, Dohyun Eom, H. Jin Kim

Comments: to be presented in 2024 IEEE International Conference on Unmanned Aircraft Systems (ICUAS), Chania, Crete, Greece, 2024

Subjects: Robotics (cs.RO)

Bilateral teleoperation of an aerial manipulator facilitates the execution of industrial missions thanks to the combination of the aerial platform's maneuverability and the ability to conduct complex tasks with human supervision. Heretofore, research on such operations has focused on flying without any physical interaction or exerting a pushing force on a contact surface that does not involve abrupt changes in the interaction force. In this paper, we propose a human reaction time compensating haptic-based bilateral teleoperation strategy for an aerial manipulator extracting a wedged object from a static structure (i.e., plug-pulling), which incurs an abrupt decrease in the interaction force and causes additional difficulty for an aerial platform. A haptic device composed of a 4-degree-of-freedom robotic arm and a gripper is made for the teleoperation of aerial wedged object-extracting tasks, and a haptic-based teleoperation method to execute the aerial manipulator by the haptic device is introduced. We detect the extraction of the object by the estimation of the external force exerted on the aerial manipulator and generate reference trajectories for both the aerial manipulator and the haptic device after the extraction. As an example of the extraction of a wedged object, we conduct comparative plug-pulling experiments with a quadrotor-based aerial manipulator. The results validate that the proposed bilateral teleoperation method reduces the overshoot in the aerial manipulator's position and ensures fast recovery to its initial position after extracting the wedged object.
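The extraction event described in this abstract shows up as an abrupt drop in the estimated external force. A minimal sketch of such a detector follows, assuming a sampled sequence of force magnitudes and a hand-picked threshold; the function name, threshold, and sample values are hypothetical and not taken from the paper.

```python
def detect_extraction(forces, drop_threshold):
    """Return the first sample index at which the force falls by more than
    drop_threshold relative to the previous sample, or None if it never does."""
    for k in range(1, len(forces)):
        if forces[k - 1] - forces[k] > drop_threshold:
            return k
    return None


# Pulling force ramps up while the object is wedged, then collapses on release.
samples = [0.0, 2.0, 5.0, 8.0, 8.5, 1.0, 0.5]
release_index = detect_extraction(samples, drop_threshold=4.0)
```

In a real system the force estimate would be noisy, so a filtered signal or a multi-sample window would likely be needed before thresholding.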
[32] arXiv:2405.01392 [pdf, other]: Title: LLMSat: A Large Language Model-Based Goal-Oriented Agent for Autonomous Space Exploration

Authors: David Maranto

Comments: B.A.Sc thesis

Subjects: Robotics (cs.RO); Artificial Intelligence (cs.AI); Machine Learning (cs.LG); Multiagent Systems (cs.MA); Space Physics (physics.space-ph)

As spacecraft journey further from Earth with more complex missions, systems of greater autonomy and onboard intelligence are called for. Reducing reliance on human-based mission control becomes increasingly critical if we are to increase our rate of solar-system-wide exploration. Recent work has explored AI-based goal-oriented systems to increase the level of autonomy in mission execution. These systems make use of symbolic reasoning managers to make inferences from the state of a spacecraft and a handcrafted knowledge base, enabling autonomous generation of tasks and re-planning. Such systems have proven to be successful in controlled cases, but they are difficult to implement as they require human-crafted ontological models to allow the spacecraft to understand the world. Reinforcement learning has been applied to train robotic agents to pursue a goal. A new architecture for autonomy is called for. This work explores the application of Large Language Models (LLMs) as the high-level control system of a spacecraft. Using a systems engineering approach, this work presents the design and development of an agentic spacecraft controller by leveraging an LLM as a reasoning engine, to evaluate the utility of such an architecture in achieving higher levels of spacecraft autonomy. A series of deep space mission scenarios simulated within the popular game engine Kerbal Space Program (KSP) are used as case studies to evaluate the implementation against the requirements. It is shown the reasoning and planning abilities of present-day LLMs do not scale well as the complexity of a mission increases, but this can be alleviated with adequate prompting frameworks and strategic selection of the agent's level of authority over the host spacecraft. This research evaluates the potential of LLMs in augmenting autonomous decision-making systems for future robotic space applications.
[33] arXiv:2405.01402 [pdf, other]: Title: Learning Force Control for Legged Manipulation

Authors: Tifanny Portela, Gabriel B. Margolis, Yandong Ji, Pulkit Agrawal

Comments: This work has been accepted to ICRA24, as well as the Loco-manipulation workshop at ICRA24

Subjects: Robotics (cs.RO); Artificial Intelligence (cs.AI); Machine Learning (cs.LG); Systems and Control (eess.SY)

Controlling contact forces during interactions is critical for locomotion and manipulation tasks. While sim-to-real reinforcement learning (RL) has succeeded in many contact-rich problems, current RL methods achieve forceful interactions implicitly without explicitly regulating forces. We propose a method for training RL policies for direct force control without requiring access to force sensing. We showcase our method on a whole-body control platform of a quadruped robot with an arm. Such force control enables us to perform gravity compensation and impedance control, unlocking compliant whole-body manipulation. The learned whole-body controller with variable compliance makes it intuitive for humans to teleoperate the robot by only commanding the manipulator, and the robot's body adjusts automatically to achieve the desired position and force. Consequently, a human teleoperator can easily demonstrate a wide variety of loco-manipulation tasks. To the best of our knowledge, we provide the first deployment of learned whole-body force control in legged manipulators, paving the way for more versatile and adaptable legged robots.
[34] arXiv:2405.01440 [pdf, other]: Title: A Review of Reward Functions for Reinforcement Learning in the context of Autonomous Driving

Authors: Ahmed Abouelazm, Jonas Michel, J. Marius Zoellner

Comments: Accepted at "Interaction-driven Behavior Prediction and Planning for Autonomous Vehicles" workshop in 35th IEEE Intelligent Vehicles Symposium (IV 2024)

Subjects: Robotics (cs.RO); Artificial Intelligence (cs.AI); Machine Learning (cs.LG)

Reinforcement learning has emerged as an important approach for autonomous driving. A reward function is used in reinforcement learning to establish the learned skill objectives and guide the agent toward the optimal policy. Since autonomous driving is a complex domain with partly conflicting objectives with varying degrees of priority, developing a suitable reward function represents a fundamental challenge. This paper aims to highlight the gap in such function design by assessing different proposed formulations in the literature and dividing individual objectives into Safety, Comfort, Progress, and Traffic Rules compliance categories. Additionally, the limitations of the reviewed reward functions are discussed, such as objectives aggregation and indifference to driving context. Furthermore, the reward categories are frequently inadequately formulated and lack standardization. This paper concludes by proposing future research that potentially addresses the observed shortcomings in rewards, including a reward validation framework and structured rewards that are context-aware and able to resolve conflicts.
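As a concrete picture of the objective aggregation this abstract criticizes, the sketch below combines the four categories with a fixed weighted sum. The weights, field names, and sign conventions are illustrative assumptions, not values from the reviewed literature.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class RewardTerms:
    safety: float         # e.g. negative when too close to an obstacle
    comfort: float        # e.g. negative for high jerk
    progress: float       # e.g. positive for distance travelled along the route
    traffic_rules: float  # e.g. negative for speeding


# Hypothetical fixed weights; note they ignore the driving context entirely.
DEFAULT_WEIGHTS = {"safety": 1.0, "comfort": 0.1,
                   "progress": 0.5, "traffic_rules": 0.3}


def aggregate_reward(terms, weights=DEFAULT_WEIGHTS):
    """Collapse the four categories into one scalar by a weighted sum."""
    return (weights["safety"] * terms.safety
            + weights["comfort"] * terms.comfort
            + weights["progress"] * terms.progress
            + weights["traffic_rules"] * terms.traffic_rules)
```

Because the weights are constant, a large progress term can offset a safety penalty in any situation, which is the kind of conflict a context-aware, structured reward would aim to resolve.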
[35] arXiv:2405.01472 [pdf, other]: Title: IntervenGen: Interventional Data Generation for Robust and Data-Efficient Robot Imitation Learning

Authors: Ryan Hoque, Ajay Mandlekar, Caelan Garrett, Ken Goldberg, Dieter Fox

Subjects: Robotics (cs.RO); Artificial Intelligence (cs.AI)

Imitation learning is a promising paradigm for training robot control policies, but these policies can suffer from distribution shift, where the conditions at evaluation time differ from those in the training data. A popular approach for increasing policy robustness to distribution shift is interactive imitation learning (i.e., DAgger and variants), where a human operator provides corrective interventions during policy rollouts. However, collecting a sufficient amount of interventions to cover the distribution of policy mistakes can be burdensome for human operators. We propose IntervenGen (I-Gen), a novel data generation system that can autonomously produce a large set of corrective interventions with rich coverage of the state space from a small number of human interventions. We apply I-Gen to 4 simulated environments and 1 physical environment with object pose estimation error and show that it can increase policy robustness by up to 39x with only 10 human interventions. Videos and more results are available at https://sites.google.com/view/intervengen2024.
[36] arXiv:2405.01504 [pdf, ps, other]: Title: Evaluation and Optimization of Adaptive Cruise Control in Autonomous Vehicles using the CARLA Simulator: A Study on Performance under Wet and Dry Weather Conditions

Authors: Roza Al-Hindaw, Taqwa I. Alhadidi, Mohammad Adas

Subjects: Robotics (cs.RO); Systems and Control (eess.SY)

Adaptive Cruise Control (ACC) can automatically change the speed of the ego vehicle to maintain a safe distance from the leading vehicle. The primary purpose of this research is to use cutting-edge computing approaches to locate and track vehicles in real time under various conditions to achieve a safe ACC. The paper examines the extension of ACC employing depth cameras and radar sensors within Autonomous Vehicles (AVs) to respond in real time to changing weather conditions using the Car Learning to Act (CARLA) simulation platform at noon. The ego vehicle controller's decision to accelerate or decelerate depends on the speed of the leading vehicle and the safe distance from that vehicle. Simulation results show that Proportional-Integral-Derivative (PID) control of autonomous vehicles using a depth camera and radar sensors reduces the speed of the leading vehicle and the ego vehicle when it rains. In addition, longer travel time was observed for both vehicles in rainy conditions than in dry conditions. Also, PID control prevents rear-end collisions with the leading vehicle.
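A minimal sketch of a PID gap controller of the kind this abstract describes is given below. The gains, the constant-time-headway spacing policy, and all names are assumptions made for illustration; the paper's actual controller and CARLA integration are not reproduced here.

```python
class PID:
    """Textbook discrete PID controller."""

    def __init__(self, kp, ki, kd):
        self.kp, self.ki, self.kd = kp, ki, kd
        self.integral = 0.0
        self.prev_error = None

    def step(self, error, dt):
        self.integral += error * dt
        # No derivative contribution on the very first sample.
        derivative = 0.0 if self.prev_error is None else (error - self.prev_error) / dt
        self.prev_error = error
        return self.kp * error + self.ki * self.integral + self.kd * derivative


def acc_command(pid, gap, ego_speed, dt, time_headway=1.5, standstill_gap=5.0):
    """Positive output means accelerate, negative means decelerate.

    gap is the measured distance to the leading vehicle in metres;
    the desired gap grows with ego speed (hypothetical spacing policy).
    """
    desired_gap = standstill_gap + time_headway * ego_speed
    return pid.step(gap - desired_gap, dt)
```

With a nonzero derivative gain, the derivative of the gap error reflects the relative speed between the two vehicles, which is how the leading vehicle's speed enters this simplified loop.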
[37] arXiv:2405.01527 [pdf, other]: Title: Track2Act: Predicting Point Tracks from Internet Videos enables Diverse Zero-shot Robot Manipulation

Authors: Homanga Bharadhwaj, Roozbeh Mottaghi, Abhinav Gupta, Shubham Tulsiani

Comments: preprint

Subjects: Robotics (cs.RO); Computer Vision and Pattern Recognition (cs.CV)

We seek to learn a generalizable goal-conditioned policy that enables zero-shot robot manipulation: interacting with unseen objects in novel scenes without test-time adaptation. While typical approaches rely on a large amount of demonstration data for such generalization, we propose an approach that leverages web videos to predict plausible interaction plans and learns a task-agnostic transformation to obtain robot actions in the real world. Our framework, Track2Act, predicts tracks of how points in an image should move in future time-steps based on a goal, and can be trained with diverse videos on the web including those of humans and robots manipulating everyday objects. We use these 2D track predictions to infer a sequence of rigid transforms of the object to be manipulated, and obtain robot end-effector poses that can be executed in an open-loop manner. We then refine this open-loop plan by predicting residual actions through a closed-loop policy trained with a few embodiment-specific demonstrations. We show that this approach of combining scalably learned track prediction with a residual policy requiring minimal in-domain robot-specific data enables zero-shot robot manipulation, and present a wide array of real-world robot manipulation results across unseen tasks, objects, and scenes. https://homangab.github.io/track2act/
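The step of inferring a rigid transform from point tracks can be pictured with a simplified planar case: given where a set of points starts and where the tracks say they end up, fit the least-squares rotation and translation. This 2D closed-form fit is only an illustration under that simplifying assumption; the abstract does not specify how the paper recovers its object transforms, and the function name is hypothetical.

```python
import math


def fit_rigid_2d(src, dst):
    """Least-squares rotation angle and translation mapping src points to dst."""
    n = len(src)
    cx_s = sum(p[0] for p in src) / n
    cy_s = sum(p[1] for p in src) / n
    cx_d = sum(p[0] for p in dst) / n
    cy_d = sum(p[1] for p in dst) / n

    s_dot = 0.0
    s_cross = 0.0
    for (sx, sy), (dx, dy) in zip(src, dst):
        ax, ay = sx - cx_s, sy - cy_s   # centred source point
        bx, by = dx - cx_d, dy - cy_d   # centred destination point
        s_dot += ax * bx + ay * by
        s_cross += ax * by - ay * bx

    theta = math.atan2(s_cross, s_dot)
    c, s = math.cos(theta), math.sin(theta)
    # Translation that maps the rotated source centroid onto the destination centroid.
    tx = cx_d - (c * cx_s - s * cy_s)
    ty = cy_d - (s * cx_s + c * cy_s)
    return theta, (tx, ty)
```

Applying this fit between consecutive time-steps of a track would yield a sequence of transforms, one per step.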

Cross-lists for Fri, 3 May 24

[38] arXiv:2405.00698 (cross-list from cs.NE) [pdf, other]: Title: CUDA-Accelerated Soft Robot Neural Evolution with Large Language Model Supervision

Authors: Lechen Zhang

Comments: 3 pages, 5 figures

Subjects: Neural and Evolutionary Computing (cs.NE); Robotics (cs.RO)

This paper addresses the challenge of co-designing morphology and control in soft robots via a novel neural network evolution approach. We propose an innovative method to implicitly dual-encode soft robots, thus facilitating the simultaneous design of morphology and control. Additionally, we introduce the large language model to serve as the control center during the evolutionary process. This advancement considerably optimizes the evolution speed compared to traditional soft-bodied robot co-design methods. Further complementing our work is the implementation of Gaussian positional encoding - an approach that augments the neural network's comprehension of robot morphology. Our paper offers a new perspective on soft robot design, illustrating substantial improvements in efficiency and comprehension during the design and evolutionary process.
[39] arXiv:2405.00746 (cross-list from cs.LG) [pdf, other]: Title: Leveraging Sub-Optimal Data for Human-in-the-Loop Reinforcement Learning

Authors: Calarina Muslimani, Matthew E. Taylor

Subjects: Machine Learning (cs.LG); Artificial Intelligence (cs.AI); Robotics (cs.RO)

To create useful reinforcement learning (RL) agents, step zero is to design a suitable reward function that captures the nuances of the task. However, reward engineering can be a difficult and time-consuming process. Instead, human-in-the-loop (HitL) RL allows agents to learn reward functions from human feedback. Despite recent successes, many of the HitL RL methods still require numerous human interactions to learn successful reward functions. To improve the feedback efficiency of HitL RL methods (i.e., require less feedback), this paper introduces Sub-optimal Data Pre-training, SDP, an approach that leverages reward-free, sub-optimal data to improve scalar- and preference-based HitL RL algorithms. In SDP, we start by pseudo-labeling all low-quality data with rewards of zero. Through this process, we obtain free reward labels to pre-train our reward model. This pre-training phase provides the reward model a head start in learning, whereby it can identify that low-quality transitions should have a low reward, all without any actual feedback. Through extensive experiments with a simulated teacher, we demonstrate that SDP can significantly improve or achieve competitive performance with state-of-the-art (SOTA) HitL RL algorithms across nine robotic manipulation and locomotion tasks.
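The pseudo-labeling and pre-training steps described in this abstract can be sketched as follows, assuming a tiny linear reward model trained by stochastic gradient descent on squared error. The model, feature vectors, and hyperparameters are hypothetical stand-ins; the paper's reward model and training setup are not specified in the abstract.

```python
def pseudo_label(features_list):
    """Attach a reward of zero to every reward-free, sub-optimal sample."""
    return [(features, 0.0) for features in features_list]


def predict(weights, bias, features):
    return sum(w * x for w, x in zip(weights, features)) + bias


def pretrain(weights, bias, labeled, lr=0.1, epochs=1000):
    """Fit the linear reward model to the pseudo-labels with plain SGD."""
    for _ in range(epochs):
        for features, target in labeled:
            err = predict(weights, bias, features) - target
            weights = [w - lr * err * x for w, x in zip(weights, features)]
            bias -= lr * err
    return weights, bias


# Hypothetical sub-optimal transitions, each encoded as a feature vector.
sub_optimal = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
labeled = pseudo_label(sub_optimal)
w, b = pretrain([1.0, -1.0], 0.5, labeled)
```

After this phase the model already assigns near-zero reward to the low-quality samples, so subsequent human feedback can be spent on distinguishing better behavior.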
[40] arXiv:2405.00924 (cross-list from eess.SY) [pdf, ps, other]: Title: Zonotope-based Symbolic Controller Synthesis for Linear Temporal Logic Specifications

Authors: Wei Ren, Raphael M. Jungers, Dimos V. Dimarogonas

Comments: 16 pages, 11 figures

Subjects: Systems and Control (eess.SY); Robotics (cs.RO)

This paper studies the controller synthesis problem for nonlinear control systems under linear temporal logic (LTL) specifications using zonotope techniques. A local-to-global control strategy is proposed for the desired specification expressed as an LTL formula. First, a novel approach is developed to divide the state space into finite zonotopes and constrained zonotopes, which are called cells and allowed to intersect with the neighbor cells. Second, from the intersection relation, a graph among all cells is generated to verify the realization of the accepting path for the LTL formula. The realization verification determines if there is a need for the control design, and also results in finite local LTL formulas. Third, once the accepting path is realized, a novel abstraction-based method is derived for the controller design. In particular, we only focus on the cells from the realization verification and approximate each cell thanks to properties of zonotopes. Based on local symbolic models and local LTL formulas, an iterative synthesis algorithm is proposed to design all local abstract controllers, whose existence and combination establish the global controller for the LTL formula. Finally, the proposed framework is illustrated via a path planning problem of mobile robots.
[41] arXiv:2405.01114 (cross-list from cs.LG) [pdf, other]: Title: Continual Imitation Learning for Prosthetic Limbs

Authors: Sharmita Dey, Benjamin Paassen, Sarath Ravindran Nair, Sabri Boughorbel, Arndt F. Schilling

Subjects: Machine Learning (cs.LG); Robotics (cs.RO)

Lower limb amputations and neuromuscular impairments severely restrict mobility, necessitating advancements beyond conventional prosthetics. Motorized bionic limbs offer promise, but their utility depends on mimicking the evolving synergy of human movement in various settings. In this context, we present a novel model for bionic prostheses' application that leverages camera-based motion capture and wearable sensor data, to learn the synergistic coupling of the lower limbs during human locomotion, empowering it to infer the kinematic behavior of a missing lower limb across varied tasks, such as climbing inclines and stairs. We propose a model that can multitask, adapt continually, anticipate movements, and refine. The core of our method lies in an approach which we call -- multitask prospective rehearsal -- that anticipates and synthesizes future movements based on the previous prediction and employs a corrective mechanism for subsequent predictions. We design an evolving architecture that merges lightweight, task-specific modules on a shared backbone, ensuring both specificity and scalability. We empirically validate our model against various baselines using real-world human gait datasets, including experiments with transtibial amputees, which encompass a broad spectrum of locomotion tasks. The results show that our approach consistently outperforms baseline models, particularly under scenarios affected by distributional shifts, adversarial perturbations, and noise.
[42] arXiv:2405.01258 (cross-list from cs.CV) [pdf, other]: Title: Towards Consistent Object Detection via LiDAR-Camera Synergy

Authors: Kai Luo, Hao Wu, Kefu Yi, Kailun Yang, Wei Hao, Rongdong Hu

Comments: The source code will be made publicly available at this https URL

Subjects: Computer Vision and Pattern Recognition (cs.CV); Robotics (cs.RO); Image and Video Processing (eess.IV)

As human-machine interaction continues to evolve, the capacity for environmental perception is becoming increasingly crucial. Integrating the two most common types of sensory data, images and point clouds, can enhance detection accuracy. However, no existing model can simultaneously detect an object's position in both point clouds and images and ascertain the correspondence between the two. This information is invaluable for human-machine interaction and opens new possibilities for enhancing it. In light of this, this paper introduces an end-to-end Consistency Object Detection (COD) algorithm framework that requires only a single forward inference to simultaneously obtain an object's position in both point clouds and images and establish their correlation. Furthermore, to assess the accuracy of the object correlation between point clouds and images, this paper proposes a new evaluation metric, Consistency Precision (CP). To verify the effectiveness of the proposed framework, an extensive set of experiments has been conducted on the KITTI and DAIR-V2X datasets. The study also explores how the proposed consistency detection method performs on images when the calibration parameters between images and point clouds are disturbed, compared to existing post-processing methods. The experimental results demonstrate that the proposed method exhibits excellent detection performance and robustness, achieving end-to-end consistency detection. The source code will be made publicly available at https://github.com/xifen523/COD.
[43] arXiv:2405.01534 (cross-list from cs.LG) [pdf, other]: Title: Plan-Seq-Learn: Language Model Guided RL for Solving Long Horizon Robotics Tasks

Authors: Murtaza Dalal, Tarun Chiruvolu, Devendra Chaplot, Ruslan Salakhutdinov

Comments: Published at ICLR 2024. Website at https://mihdalal.github.io/planseqlearn/. 9 pages, 3 figures, 3 tables; 14 pages appendix (7 additional figures)

Subjects: Machine Learning (cs.LG); Artificial Intelligence (cs.AI); Computer Vision and Pattern Recognition (cs.CV); Robotics (cs.RO)

Large Language Models (LLMs) have been shown to be capable of performing high-level planning for long-horizon robotics tasks, yet existing methods require access to a pre-defined skill library (e.g. picking, placing, pulling, pushing, navigating). However, LLM planning does not address how to design or learn those behaviors, which remains challenging, particularly in long-horizon settings. Furthermore, for many tasks of interest, the robot needs to be able to adjust its behavior in a fine-grained manner, requiring the agent to be capable of modifying low-level control actions. Can we instead use the internet-scale knowledge from LLMs for high-level policies, guiding reinforcement learning (RL) policies to efficiently solve robotic control tasks online without requiring a pre-determined set of skills? In this paper, we propose Plan-Seq-Learn (PSL): a modular approach that uses motion planning to bridge the gap between abstract language and learned low-level control for solving long-horizon robotics tasks from scratch. We demonstrate that PSL achieves state-of-the-art results on over 25 challenging robotics tasks with up to 10 stages. PSL solves long-horizon tasks from raw visual input spanning four benchmarks at success rates of over 85%, outperforming language-based, classical, and end-to-end approaches. Video results and code at https://mihdalal.github.io/planseqlearn/
[44] arXiv:2405.01538 (cross-list from cs.CV) [pdf, other]: Title: Multi-Space Alignments Towards Universal LiDAR Segmentation

Authors: Youquan Liu, Lingdong Kong, Xiaoyang Wu, Runnan Chen, Xin Li, Liang Pan, Ziwei Liu, Yuexin Ma

Comments: CVPR 2024; 33 pages, 14 figures, 14 tables; Code at this https URL

Subjects: Computer Vision and Pattern Recognition (cs.CV); Machine Learning (cs.LG); Robotics (cs.RO)

A unified and versatile LiDAR segmentation model with strong robustness and generalizability is desirable for safe autonomous driving perception. This work presents M3Net, a one-of-a-kind framework for fulfilling multi-task, multi-dataset, multi-modality LiDAR segmentation in a universal manner using just a single set of parameters. To better exploit data volume and diversity, we first combine large-scale driving datasets acquired by different types of sensors from diverse scenes and then conduct alignments in three spaces, namely the data, feature, and label spaces, during training. As a result, M3Net is capable of taming heterogeneous data for training state-of-the-art LiDAR segmentation models. Extensive experiments on twelve LiDAR segmentation datasets verify the effectiveness of our approach. Notably, using a shared set of parameters, M3Net achieves 75.1%, 83.1%, and 72.4% mIoU scores, respectively, on the official benchmarks of SemanticKITTI, nuScenes, and Waymo Open.

Replacements for Fri, 3 May 24

[45] arXiv:2011.04840 (replaced) [pdf, other]: Title: Robots of the Lost Arc: Self-Supervised Learning to Dynamically Manipulate Fixed-Endpoint Cables

Authors: Harry Zhang, Jeffrey Ichnowski, Daniel Seita, Jonathan Wang, Huang Huang, Ken Goldberg

Subjects: Robotics (cs.RO); Artificial Intelligence (cs.AI)
[46] arXiv:2209.15566 (replaced) [pdf, other]: Title: ContactNet: Online Multi-Contact Planning for Acyclic Legged Robot Locomotion

Authors: Angelo Bratta, Avadesh Meduri, Michele Focchi, Ludovic Righetti, Claudio Semini

Subjects: Robotics (cs.RO)
[47] arXiv:2211.09325 (replaced) [pdf, other]: Title: TAX-Pose: Task-Specific Cross-Pose Estimation for Robot Manipulation

Authors: Chuer Pan, Brian Okorn, Harry Zhang, Ben Eisner, David Held

Comments: Conference on Robot Learning (CoRL), 2022. Supplementary material is available at this https URL

Subjects: Robotics (cs.RO); Artificial Intelligence (cs.AI); Computer Vision and Pattern Recognition (cs.CV); Machine Learning (cs.LG)
[48] arXiv:2302.13192 (replaced) [pdf, other]: Title: Reinforcement Learning based Autonomous Multi-Rotor Landing on Moving Platforms

Authors: Pascal Goldschmid, Aamir Ahmad

Comments: 24 pages, 13 figures, 13 tables

Subjects: Robotics (cs.RO); Systems and Control (eess.SY)
[49] arXiv:2306.12893 (replaced) [pdf, other]: Title: FlowBot++: Learning Generalized Articulated Objects Manipulation via Articulation Projection

Authors: Harry Zhang, Ben Eisner, David Held

Comments: arXiv admin note: text overlap with arXiv:2205.04382

Subjects: Robotics (cs.RO)
[50] arXiv:2308.07275 (replaced) [pdf, other]: Title: On Semidefinite Relaxations for Matrix-Weighted State-Estimation Problems in Robotics

Authors: Connor Holmes, Frederike Dümbgen, Timothy D Barfoot

Subjects: Robotics (cs.RO); Optimization and Control (math.OC)
[51] arXiv:2308.16471 (replaced) [pdf, other]: Title: Foundational Policy Acquisition via Multitask Learning for Motor Skill Generation

Authors: Satoshi Yamamori, Jun Morimoto

Comments: 11 pages, 6 figures

Subjects: Robotics (cs.RO); Machine Learning (cs.LG)
[52] arXiv:2311.18044 (replaced) [pdf, other]: Title: Transfer Learning in Robotics: An Upcoming Breakthrough? A Review of Promises and Challenges

Authors: Noémie Jaquier, Michael C. Welle, Andrej Gams, Kunpeng Yao, Bernardo Fichera, Aude Billard, Aleš Ude, Tamim Asfour, Danica Kragic

Comments: 21 pages, 7 figures

Subjects: Robotics (cs.RO); Machine Learning (cs.LG)
[53] arXiv:2403.15813 (replaced) [pdf, other]: Title: Learning Early Social Maneuvers for Enhanced Social Navigation

Authors: Yigit Yildirim, Mehmet Suzer, Emre Ugur

Comments: Accepted for presentation in the workshop of Robot Trust for Symbiotic Societies (RTSS) at ICRA 2024

Subjects: Robotics (cs.RO)
[54] arXiv:2404.00354 (replaced) [pdf, other]: Title: Follow me: an architecture for user identification and social navigation with a mobile robot

Authors: Andrea Ruo, Lorenzo Sabattini, Valeria Villani

Journal-ref: Proceedings of the European Robotics Forum 2024

Subjects: Robotics (cs.RO)
[55] arXiv:2404.00691 (replaced) [pdf, other]: Title: Graph-Based vs. Error State Kalman Filter-Based Fusion Of 5G And Inertial Data For MAV Indoor Pose Estimation

Authors: Meisam Kabiri, Claudio Cimarelli, Hriday Bavle, Jose Luis Sanchez-Lopez, Holger Voos

Subjects: Robotics (cs.RO)
[56] arXiv:2404.12079 (replaced) [pdf, other]: Title: Trajectory Planning for Autonomous Vehicle Using Iterative Reward Prediction in Reinforcement Learning

Authors: Hyunwoo Park

Comments: 8 pages, 6 figures

Subjects: Robotics (cs.RO)
[57] arXiv:2009.08618 (replaced) [pdf, other]: Title: 6-DoF Grasp Planning using Fast 3D Reconstruction and Grasp Quality CNN

Authors: Yahav Avigal, Samuel Paradis, Harry Zhang

Subjects: Computer Vision and Pattern Recognition (cs.CV); Robotics (cs.RO)
[58] arXiv:2303.00638 (replaced) [pdf, other]: Title: MEGA-DAgger: Imitation Learning with Multiple Imperfect Experts

Authors: Xiatao Sun, Shuo Yang, Mingyan Zhou, Kunpeng Liu, Rahul Mangharam

Subjects: Machine Learning (cs.LG); Robotics (cs.RO)
[59] arXiv:2309.08152 (replaced) [pdf, other]: Title: DA-RAW: Domain Adaptive Object Detection for Real-World Adverse Weather Conditions

Authors: Minsik Jeon, Junwon Seo, Jihong Min

Comments: Accepted to ICRA 2024. Our project website can be found at this https URL

Subjects: Computer Vision and Pattern Recognition (cs.CV); Robotics (cs.RO)
[60] arXiv:2310.05921 (replaced) [pdf, other]: Title: Conformal Decision Theory: Safe Autonomous Decisions from Imperfect Predictions

Authors: Jordan Lekeufack, Anastasios N. Angelopoulos, Andrea Bajcsy, Michael I. Jordan, Jitendra Malik

Comments: 8 pages, 5 figures

Subjects: Machine Learning (stat.ML); Machine Learning (cs.LG); Robotics (cs.RO); Methodology (stat.ME)
[61] arXiv:2402.16075 (replaced) [pdf, other]: Title: Don't Start from Scratch: Behavioral Refinement via Interpolant-based Policy Diffusion

Authors: Kaiqi Chen, Eugene Lim, Kelvin Lin, Yiyang Chen, Harold Soh

Subjects: Machine Learning (cs.LG); Artificial Intelligence (cs.AI); Robotics (cs.RO)
[62] arXiv:2404.04197 (replaced) [pdf, other]: Title: Convex MPC and Thrust Allocation with Deadband for Spacecraft Rendezvous

Authors: Pedro Taborda, Hugo Matias, Daniel Silvestre, Pedro Lourenço

Comments: Extended version

Subjects: Systems and Control (eess.SY); Robotics (cs.RO)
