Legged Autonomous Inspection

Technologies: C++, ROS 2, Nav2, Machine Learning, Unitree Go1

GitHub Repository

Since they can traverse environments designed for humans better than their wheeled counterparts, legged robots are well suited to keeping tabs on industrial settings. This comes in especially handy for areas that are inconvenient or hazardous for humans to visit.

I was interested in this autonomous inspection application, so my winter project in Northwestern University’s MSR program focused on exploring this with the quadruped Unitree Go1.


The Application

Legged robots can be provided with a map of their industrial setting that marks specific points of interest. Once the robot reaches each point of interest, it can use an array of sensors - visual, thermal, and more - to collect data and decide whether that data requires further investigation.

Industrial Inspection Flowchart

A high-level flowchart for how a legged robot could perform industrial inspections.

I was particularly interested in exploring the navigation and sensing aspect of this application. I wanted to create a robot that could sense some information, make a decision as to where to navigate next based on that information, and then autonomously navigate there.

I chose to present the data as text located somewhere at the inspection point, which the Go1 could read with its cameras and use to decide its next destination.

A real-world analogy to this capability would be visually reading data from display readouts throughout an industrial environment. Many display readouts, especially in older facilities, provide sensor data that cannot be networked to a plant SCADA system. The ability to detect and extract text from those sensor readouts would therefore be a versatile capability.

This Project's Flowchart

A high-level flowchart for my legged autonomous inspection demo.


The System

Accomplishing this autonomous inspection task required integrating several subsystems, which generally fell into two categories:

  • Navigation (3D LiDAR, RTAB-Map, Nav2)
  • Optical Character Recognition (Onboard Cameras, EAST Text Detection, CRNN Text Recognition)

with some control and high-level logic packages to tie the subsystems together.

Block Diagram of the System Software Stack

A block diagram of the system software stack.


Navigation

A critical task for this project is autonomously navigating a mapped environment while avoiding obstacles that may not have been present in the original map. Luckily, ROS 2 has Nav2, a software stack designed specifically for this task. Integrating Nav2 with a 3D LiDAR and custom C++ control packages allowed the Go1 to navigate through its environment autonomously.

Here’s a video of the Nav2 stack working with the Unitree Go1 for autonomous navigation. The point cloud from the 3D LiDAR is displayed in the top left and the map being built in the bottom right.


Expand for more technical details on the navigation subsystem.

The first part of integrating Nav2 with the Unitree Go1 was sensing. The Unitree Go1 EDU Plus model can come equipped with a RoboSense RS-Helios-16P 3D LiDAR, so I worked with Marno Nel to set up this sensor for mapping the environment. We chose to process the point cloud data from this sensor with RTAB-Map for its 3D 6DoF mapping capabilities. RTAB-Map provides the Nav2 stack with ICP odometry and SLAM updates for the position of the robot and objects in the environment. Nav2 then uses this information to create local and global costmaps for planning paths that avoid collisions with obstacles in the environment.

The second part of integrating Nav2 with the Unitree Go1 was control. Nav2 can already plan paths and command velocities, but it needs some way to pass these commands through to the robot. I wrote a cmd_processor ROS 2 C++ node in the unitree_nav repository that takes in commanded velocities and translates them to control messages that can be interpreted by my communication node in the unitree_ros2 repository and sent to the Go1 over UDP. This node also handles other control capabilities, like commanding the Go1 to stand up and lay down. The other half of control was commanding Nav2 to plan and execute paths, which I wrote example code for in the nav_to_pose node of the unitree_nav repository.
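
For a concrete picture of the control side, here is a minimal sketch of commanding Nav2 through its NavigateToPose action interface, which is the same interface the nav_to_pose node uses. This is a simplified stand-in for the actual code in unitree_nav, and the node name and goal pose below are placeholders.

```cpp
// Minimal sketch of sending a Nav2 goal from C++ (ROS 2 Humble).
// Node name and goal pose are placeholders, not values from unitree_nav.
#include <memory>

#include "rclcpp/rclcpp.hpp"
#include "rclcpp_action/rclcpp_action.hpp"
#include "nav2_msgs/action/navigate_to_pose.hpp"

using NavigateToPose = nav2_msgs::action::NavigateToPose;

int main(int argc, char ** argv)
{
  rclcpp::init(argc, argv);
  auto node = rclcpp::Node::make_shared("nav_to_pose_sketch");

  // Action client for Nav2's NavigateToPose server
  auto client = rclcpp_action::create_client<NavigateToPose>(node, "navigate_to_pose");
  client->wait_for_action_server();

  // Goal pose expressed in the map frame
  NavigateToPose::Goal goal;
  goal.pose.header.frame_id = "map";
  goal.pose.header.stamp = node->now();
  goal.pose.pose.position.x = 1.0;
  goal.pose.pose.position.y = 0.5;
  goal.pose.pose.orientation.w = 1.0;

  // Send the goal, then block until Nav2 finishes executing the path
  auto goal_handle_future = client->async_send_goal(goal);
  rclcpp::spin_until_future_complete(node, goal_handle_future);
  auto goal_handle = goal_handle_future.get();
  if (!goal_handle) {
    RCLCPP_ERROR(node->get_logger(), "Goal was rejected by the Nav2 server");
    rclcpp::shutdown();
    return 1;
  }
  auto result_future = client->async_get_result(goal_handle);
  rclcpp::spin_until_future_complete(node, result_future);

  rclcpp::shutdown();
  return 0;
}
```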


Optical Character Recognition

The second required subsystem is visual text detection and recognition, referred to here as Optical Character Recognition (OCR). This project uses the Go1’s onboard stereo cameras and two pre-trained machine learning models to accomplish this task. I wrote a ROS 2 C++ wrapper for the Unitree Camera SDK to publish camera data. This data then feeds into an EAST Text Detection model to locate text in the images. Finally, a convolutional recurrent neural network (CRNN) is used to recognize and interpret the text data.

Stereo Rectified, Depth, and Point Cloud

Stereo rectified images, depth images, and point cloud data from the Go1’s head front camera.

Woof! Bark! Arf! Ruff!

Making the dog read dog words.

Expand for more technical details on the visual text detection subsystem.

My Unitree Camera SDK C++ wrapper uses image_transport for image compression. It publishes raw, rectified, and depth images, as well as point clouds, from any of the Go1’s five onboard cameras.
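
For context, here is roughly what the publishing side of such a wrapper looks like with image_transport and cv_bridge. This is a minimal sketch rather than the actual wrapper: the topic and frame names are placeholders, and the real node pulls frames from the Unitree Camera SDK instead of the dummy image used here.

```cpp
// Minimal sketch of publishing camera frames with image_transport and cv_bridge.
// Topic and frame names are placeholders; the real wrapper fills frames from the
// Unitree Camera SDK rather than a blank cv::Mat.
#include <opencv2/core.hpp>

#include "rclcpp/rclcpp.hpp"
#include "image_transport/image_transport.hpp"
#include "cv_bridge/cv_bridge.h"
#include "std_msgs/msg/header.hpp"

int main(int argc, char ** argv)
{
  rclcpp::init(argc, argv);
  auto node = rclcpp::Node::make_shared("camera_publisher_sketch");

  // image_transport advertises compressed variants alongside the raw image topic
  image_transport::ImageTransport it(node);
  image_transport::Publisher pub = it.advertise("camera/head_front/image_rect", 1);

  rclcpp::Rate rate(30.0);
  while (rclcpp::ok()) {
    // Placeholder frame; the real wrapper grabs this from the camera SDK
    cv::Mat frame(480, 640, CV_8UC3, cv::Scalar(0, 0, 0));

    std_msgs::msg::Header header;
    header.stamp = node->now();
    header.frame_id = "head_front_camera";
    pub.publish(cv_bridge::CvImage(header, "bgr8", frame).toImageMsg());

    rclcpp::spin_some(node);
    rate.sleep();
  }

  rclcpp::shutdown();
  return 0;
}
```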

The first machine learning model is based on a TensorFlow re-implementation of the Efficient and Accurate Scene Text Detector (EAST). This pipeline is designed to detect where text is located in a natural scene so that a text recognition model can parse it into characters. The EAST model provides bounding vertices for lines of text in arbitrary orientations, as shown by the green bounding rectangles in the image above.

The second machine learning model uses a convolutional recurrent neural network (CRNN) trained on the MJSynth and SynthText datasets. It accepts the bounding vertices from the EAST model and parses the image cropped at those vertices into actual text, shown in red above.
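
The same detect-then-recognize idea can be sketched with OpenCV’s DNN text APIs. My pipeline wires these models into ROS 2 nodes rather than a standalone program, and the model files, vocabulary file, and thresholds below are assumptions for illustration, not the exact ones I used.

```cpp
// Sketch of a two-stage EAST + CRNN OCR pipeline using OpenCV's DNN module.
// Model paths, vocabulary file, thresholds, and the input image are placeholders.
#include <opencv2/dnn.hpp>
#include <opencv2/imgcodecs.hpp>
#include <opencv2/imgproc.hpp>

#include <fstream>
#include <iostream>
#include <string>
#include <vector>

int main()
{
  cv::Mat frame = cv::imread("inspection_point.png");

  // Stage 1: EAST text detection - returns quadrilaterals around text regions
  cv::dnn::TextDetectionModel_EAST detector("frozen_east_text_detection.pb");
  detector.setConfidenceThreshold(0.5f);
  detector.setNMSThreshold(0.4f);
  detector.setInputParams(1.0, cv::Size(320, 320),
                          cv::Scalar(123.68, 116.78, 103.94), true);
  std::vector<std::vector<cv::Point>> detections;
  detector.detect(frame, detections);

  // Stage 2: CRNN text recognition on each detected region
  cv::dnn::TextRecognitionModel recognizer("crnn_cs.onnx");
  recognizer.setDecodeType("CTC-greedy");
  std::vector<std::string> vocabulary;
  std::ifstream vocab_file("alphabet.txt");  // one character per line
  for (std::string line; std::getline(vocab_file, line);) {
    vocabulary.push_back(line);
  }
  recognizer.setVocabulary(vocabulary);
  recognizer.setInputParams(1.0 / 127.5, cv::Size(100, 32), cv::Scalar(127.5));

  for (const auto & quad : detections) {
    // Axis-aligned crop for simplicity; arbitrarily oriented text needs a perspective warp
    cv::Rect roi = cv::boundingRect(quad) & cv::Rect(0, 0, frame.cols, frame.rows);
    std::cout << "Detected text: " << recognizer.recognize(frame(roi)) << std::endl;
  }
  return 0;
}
```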


Finally, I wrote a simple pitch sweeping sequence that commands the Go1 to sweep its head front camera up and down continuously until it reliably detects a text command. A rough sketch of the idea follows, and below that is a demonstration of this behavior.
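
The real sweep runs through my control packages and the unitree_nav command interface; this sketch just illustrates the idea, with a placeholder pitch topic and detection flag standing in for the actual messages.

```cpp
// Sketch of the pitch-sweep idea: oscillate the body pitch until text is detected.
// The pitch topic, detection flag topic, and pitch limits are placeholders for the
// actual unitree_nav control interface.
#include <cmath>
#include <chrono>
#include <memory>

#include "rclcpp/rclcpp.hpp"
#include "std_msgs/msg/bool.hpp"
#include "std_msgs/msg/float64.hpp"

using namespace std::chrono_literals;

class PitchSweep : public rclcpp::Node
{
public:
  PitchSweep() : Node("pitch_sweep_sketch")
  {
    pitch_pub_ = create_publisher<std_msgs::msg::Float64>("body_pitch_cmd", 10);
    text_sub_ = create_subscription<std_msgs::msg::Bool>(
      "text_detected", 10,
      [this](std_msgs::msg::Bool::SharedPtr msg) {text_found_ = msg->data;});
    timer_ = create_wall_timer(50ms, [this]() {sweep();});
  }

private:
  void sweep()
  {
    if (text_found_) {return;}  // hold the current pitch once text is reliably detected
    // Slow sinusoidal sweep between roughly -0.3 and +0.3 rad of body pitch
    t_ += 0.05;
    std_msgs::msg::Float64 cmd;
    cmd.data = 0.3 * std::sin(2.0 * M_PI * 0.2 * t_);
    pitch_pub_->publish(cmd);
  }

  rclcpp::Publisher<std_msgs::msg::Float64>::SharedPtr pitch_pub_;
  rclcpp::Subscription<std_msgs::msg::Bool>::SharedPtr text_sub_;
  rclcpp::TimerBase::SharedPtr timer_;
  bool text_found_{false};
  double t_{0.0};
};

int main(int argc, char ** argv)
{
  rclcpp::init(argc, argv);
  rclcpp::spin(std::make_shared<PitchSweep>());
  rclcpp::shutdown();
  return 0;
}
```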



Future Work

Putting the navigation and OCR subsystems together creates a system that can pretty reliably navigate between inspection points in an arbitrary order and avoid obstacles along the way. But as with any project, there are always improvements that can be made.

1. Letting the dog off the leash.

As can be seen in the demo video, an Ethernet cable connects my laptop to the Go1’s network of internal computers. This is partially so I can run visualization of the map and point cloud data in RViz, but it’s also because our Go1’s onboard computers are not yet capable of running all the required nodes for this project.

One of the most challenging parts of this project was upgrading the Go1’s onboard NVIDIA Jetson Nanos to Ubuntu 22.04. The Nanos come with Ubuntu 18.04 - two LTS releases behind Ubuntu 22.04, the only Tier 1 supported Linux OS for ROS 2 Humble. Katie Hughes and I worked together, with help from this awesome blog from Q-engineering, to upgrade the Go1 to what we believe is, at the time of writing this post, the only Go1 running ROS 2 Humble onboard.

Unfortunately, we did not have time to also upgrade the Jetson Xavier NX on the Go1, which has more computing power. Getting the Xavier on 22.04 would allow us to run point cloud processing nodes at reasonable speeds on the robot. We also currently have some limited wireless control of the Go1, which could be improved by bridging the onboard network through the Go1 Raspberry Pi’s WiFi adapter. These two improvements would allow us to let the dog off the leash and create a truly mobile autonomous inspection bot.

2. Integrating IMU odometry.

The Go1 provides odometry calculated from its onboard IMU data, though with limited accuracy. Marno Nel and I experimented with integrating this with the Nav2 stack for faster odometry updates, but we weren’t able to get it working in time to compare with the ICP odometry. Onboard IMU odometry may help improve localization by providing faster odometry updates in between the LiDAR-based SLAM updates. Our progress can be found in the onboard-odometry branch of the unitree_nav repo.

3. Improving holonomic capabilities.

The Go1 can move forward, backward, and to each side. Currently, however, Nav2 only plans paths where the robot moves forward or backward and rotates. With a little more investigation, path planning could be improved to take advantage of the Go1’s side-to-side motion too!

4. Improving data search capabilities.

Currently, when searching for data, the Go1 only sweeps up and down from its current position. If there is some error in the navigation, this can cause it to miss the data at the inspection point. Improved algorithms that move the Go1 around the inspection point might help find the data more reliably.

5. Adding other sensors.

Visual inspection is great and could be extended to further use cases, but adding other sensors, such as IR cameras, would make the dog’s sensing capabilities even more versatile.


A big thanks to the other engineers who collaborated with me in getting the Go1 up and running. The autonomous inspection project was my own, but Marno Nel, Katie Hughes, Ava Zahedi, and I worked together on upgrading and integrating the Go1 with ROS 2 Humble.