I was born and raised in Johannesburg, South Africa.
2014-2017
University of Cape Town - BSc(Eng) Mechatronics Hons.
After completing high school, I moved from Johannesburg to Cape Town, where I attended the University of Cape Town. At UCT I studied a BSc(Eng) in Mechatronics. I was on the Dean's merit list for my first three years and graduated my fourth year with honours with distinction.
April 2018-September 2018
Explore Data Science
After graduating from UCT, I worked at Explore Data Science in Woodstock, Cape Town. As a part of the Explore team I created content to teach students at the data science academy various digital skills, including Python, machine learning and Java. My other main responsibility was to create a web app to process written documentation submitted by students.
September 2018
Northwestern University - Masters in Robotics
I then started an M.S. in Robotics at Northwestern University in Chicago. As part of the master's I learnt various skills related to robotics, including ROS. I am also involved in various projects, both as part of courses and pursued in my own time. These projects can be found in my portfolio above.
More coming soon
Contact info
Feel free to contact me in any of the ways given here.
This project aims to use reinforcement learning on a Gazebo simulation of the Jackal, with the end goal of transferring the learned policy to the real robot. A large part of this project was writing a library to interface with Gazebo in a clean and easy manner through ROS. Once this framework is set up, there are multiple autonomous navigation and exploration goals that can be tackled with various algorithms. One of these goals is to investigate how policies change based on changing the sight range of the Jackal; this learning aim is in line with the interests of a sponsoring lab at Northwestern University.
Gazebo and ROS
ROS and Gazebo have been used to allow the Jackal to learn in an environment that provides easy restarts and no damage to the physical robot. ClearPath's simulation repositories for the Jackal were cloned and built from source. Some of the repositories had to be forked so that small adjustments could be made to better fit the Northwestern-specific model of the real-world Jackal.
State observation forms
Single environments
In order to explore a single environment, (x, y) co-ordinates can be used as state information, with "new map information" learned at each step as a reward. This allows a policy specific to that one environment to be learned.
General environments
Using RL to learn policies for general environments is not trivial and is not common practice; there are better algorithms and heuristics for exploring general environments. However, reinforcement learning was used for this project in the hope of exploring the difference that range of sight makes and of gaining experience with reinforcement learning.
In order to learn "general" policies, any state information that is specific to a single environment (e.g. (x, y) co-ordinates) should not be used. For this reason, environment-agnostic states have been chosen. The environment-agnostic states that can be used are:
• The full 3D point cloud
• Condensing the point cloud into a flat laser scan
• Segmenting the laser scan and using the mean/min/max or some other metric of each segment
• Segmenting the laser scan and binning the mean/min/max/other metric of each segment into discrete values
• Using a map of the robot's current understanding of its surroundings
Examples of the last three options can be seen in the figures below.
Figure: examples of segmented and discretized LIDAR models.
Figure: part of the environment mapped by the robot (right), and the image representation to be fed to a CNN (left).
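As a rough illustration, the segment-and-bin option above might look like the following sketch (segment count, bin count and maximum range are illustrative values, not the ones used in the project):

```python
def discretize_scan(ranges, n_segments=4, n_bins=3, max_range=10.0):
    """Condense a flat laser scan into a small discrete state.

    Splits the scan into n_segments equal slices, takes the minimum range
    in each slice, and bins that value into one of n_bins discrete levels.
    """
    seg_len = len(ranges) // n_segments
    state = []
    for i in range(n_segments):
        segment = ranges[i * seg_len:(i + 1) * seg_len]
        nearest = min(min(segment), max_range)   # closest obstacle in slice
        state.append(min(int(nearest / max_range * n_bins), n_bins - 1))
    return tuple(state)

# A 360-reading scan with an obstacle close in the first quarter:
scan = [0.8] * 90 + [9.5] * 270
print(discretize_scan(scan))  # → (0, 2, 2, 2)
```

The resulting tuple is small and discrete, so it can even be used directly as a table key for tabular methods.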
Algorithms
SARSA
The first algorithm implemented was SARSA, since the problem of learning to autonomously navigate an arbitrary environment is a model-free problem. The SARSA implementation requires the number of states to be relatively small and discrete. This is possible when learning on one environment only, using (x, y) co-ordinates as the state. However, using the LIDAR readings this is infeasible, as there are too many states and they are possibly continuous in value. To accommodate this, deep reinforcement learning was implemented.
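For reference, the tabular SARSA update on a small discrete state space can be sketched as follows (the transition, reward and learning-rate values here are hypothetical):

```python
from collections import defaultdict

def sarsa_update(Q, s, a, r, s_next, a_next, alpha=0.1, gamma=0.9):
    """One on-policy update: Q(s,a) += alpha * (r + gamma * Q(s',a') - Q(s,a))."""
    td_target = r + gamma * Q[(s_next, a_next)]
    Q[(s, a)] += alpha * (td_target - Q[(s, a)])
    return Q

Q = defaultdict(float)  # Q-value table, defaulting to 0
# one hypothetical transition between (x, y) grid states
Q = sarsa_update(Q, s=(0, 0), a="forward", r=5.0, s_next=(0, 1), a_next="forward")
print(Q[((0, 0), "forward")])  # → 0.5
```

The table grows with the number of distinct (state, action) pairs, which is exactly why this approach breaks down for continuous LIDAR readings.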
Deep RL
Deep reinforcement learning can solve the problem of very large or continuous state spaces. Deep RL allows the Q-values of state-action pairs to be approximated rather than explicitly calculated and stored in a table. This makes the algorithm feasible in terms of memory, and it also makes the process of reinforcement learning more sample efficient (the agent no longer has to visit every state and perform each action multiple times).
Target networks and memory replay
DeepMind has produced much stronger results for deep reinforcement learning by using stabilization strategies during learning. Two of these strategies have been implemented.
Target network
This method makes use of a second neural network with exactly the same structure as the original one. The original network is updated at every step, but its weights are only copied to the second network periodically. The second network is used to calculate the Q-values of the next state in the update step. This provides more stable learning, as the target Q-values are not continuously changing while learning attempts to match them.
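The periodic sync can be sketched as follows, with the network weights represented as plain lists for illustration (in the Keras version this corresponds to copying `model.get_weights()` into the target model with `set_weights()`; the class and sync interval here are hypothetical):

```python
import copy

class TargetNetworkSketch:
    """Online network updated every step; target copied only periodically."""

    def __init__(self, weights, sync_every=100):
        self.online = weights                  # updated at every training step
        self.target = copy.deepcopy(weights)   # frozen between syncs
        self.steps = 0
        self.sync_every = sync_every

    def train_step(self, gradient_step):
        gradient_step(self.online)             # caller mutates online weights
        self.steps += 1
        if self.steps % self.sync_every == 0:
            # periodic copy: targets stay fixed while learning chases them
            self.target = copy.deepcopy(self.online)
```

Between syncs, Q-targets are computed from `self.target`, so the regression target stays fixed for `sync_every` steps at a time.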
Memory replay
Memory replay is a strategy that stores the robot's previous experiences in a memory buffer. The neural network is then periodically trained on random batches from this buffer. This allows the robot to form good policies for the general task rather than only learning a policy that fits what it is currently experiencing. This is helpful here since the robot may encounter different structure in different parts of the environment.
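A minimal replay buffer along these lines might look like this (the class name and capacity are illustrative):

```python
import random
from collections import deque

class ReplayBuffer:
    """Fixed-size experience buffer; training samples random minibatches."""

    def __init__(self, capacity=10000):
        self.buffer = deque(maxlen=capacity)  # oldest experiences drop off

    def store(self, state, action, reward, next_state, done):
        self.buffer.append((state, action, reward, next_state, done))

    def sample(self, batch_size):
        # a random draw breaks the temporal correlation of consecutive steps
        return random.sample(self.buffer, min(batch_size, len(self.buffer)))
```

Sampling uniformly at random is what decorrelates the training batches from whatever the robot happens to be seeing right now.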
Learning goals
The learning goals of this project can be easily adjusted within the reinforcement learning framework that has been set up by changing:
• the reward function
The reward function can easily be tailored to include goals of the algorithm that are measurable as a scalar value. The policy the algorithm develops should then evolve to maximize this scalar value. If multiple different rewards are combined into a single scalar, each contributor can be weighted by importance. An example of a reward structure could be:
• A negative scalar value (e.g. -1) can be given for each time step where no new information is learned, to encourage the robot to explore.
• A negative scalar value (e.g. -10) can be given each time the robot makes a move that would result in it crashing.
• A positive scalar reward (e.g. +5) can be given for each time step where new map information is learned. This can be a constant value, or it can be proportional to the amount of new information learned.
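A reward structure like the one above can be sketched as a single function (the weights and function name are illustrative, not the exact values used in the project):

```python
def exploration_reward(new_cells, crashed,
                       step_penalty=-1.0, crash_penalty=-10.0, info_gain=0.5):
    """Combine the contributors above into one scalar.

    new_cells: number of newly mapped cells this step
    crashed:   whether the chosen move would cause a collision
    """
    reward = 0.0
    if crashed:
        reward += crash_penalty           # discourage collisions
    if new_cells > 0:
        reward += info_gain * new_cells   # proportional to information gained
    else:
        reward += step_penalty            # idle step: encourage exploration
    return reward
```

Adjusting the three weights changes how strongly the policy trades off safety against exploration speed.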
• the state representation
The state representation can change the learning goal of the algorithm, as discussed in the General environments section above. (x, y) co-ordinates can be used to create a policy based on the structure of a single environment, while robot observations such as LIDAR scans or a map of the robot's current understanding of the world can be used to create policies based on what the robot can observe. This can be further explored by limiting the range of what the robot can observe.
• the neural network structure
This project makes use of Keras to implement the neural networks, which makes it simple to adjust not only the size and shape of the network but also the types of layers used. Recurrent layers can be added to include the temporal aspects of the states and actions in the model. Convolutional layers can be added so that the network can be fed spatial/image data, such as the map images created by the SLAM algorithm running on the robot.
Results
Various learning aims were tried while testing the framework. These included:
• Learning a simple obstacle-avoidance policy based on segmented LIDAR data, with a reward of -1 for bumping an object and +1 for not bumping an object.
• Learning a policy for a single environment using the (x, y) values as state and new map information learned as a reward metric, with a negative reward per time step and a negative reward for collisions.
• Using a cost map as state input to a CNN.
Future learning goals
Owing to the speed at which the Jackal can be simulated in Gazebo, trials were limited in the time provided. While the results are positive, better results could definitely be achieved by tuning the hyperparameters of the algorithm. Given more time, the project could also be developed to meet the goals of the following very interesting applications:
• Using camera frames from the front of the robot as input to a CNN, with a reward structure that accounts for collisions and new map information gained.
• Artificially limiting the LIDAR range to see the difference in optimal policies based on range of sight.
Problems encountered and takeaways
Stabilization of the learning process was a hard problem to identify and solve. There are no telltale signs of this instability, and I only found that the process was unstable by trying my algorithm on a much simpler problem (the OpenAI Gym CartPole problem). The stabilization techniques used don't seem as though they can hurt the learning process, so they might as well be included.
Finding an effective state representation and algorithm structure for learning policies to navigate general environments is not trivial, and I would like to put more work into it. The use of the map produced by the SLAM algorithm with a CNN seems promising.
This project aimed to bring the ClearPath Jackal up to a running state, along with a Velodyne LIDAR, on ROS Melodic (which at the time of writing is not supported by ClearPath). ClearPath has a release image with all the necessary setup for ROS Kinetic but not yet for ROS Melodic, so some packages had to be built from source and certain issues (such as networking and udev rules) had to be troubleshot. For a comprehensive walk-through, visit jackal_melodic_bringup.
Brief process
Wireless Network set up
In order to allow easy and reliable connection to the Jackal, a dedicated wireless router was set up. The router binds the MAC addresses of my laptop and the Jackal to static IP addresses. This allows the static IPs to be matched to hostnames in the /etc/hosts files on both the laptop and the Jackal for easy hostname resolution.
ROS install and packages
Since the Jackal is running Ubuntu 18.04, the installation of ROS Melodic is very straightforward using apt-get. After ROS Melodic is installed, most of the packages needed for the Jackal are available as Melodic packages through apt-get, with the exception of the jackal and jackal_robot packages, which need to be built from source. Building them from source is also relatively straightforward: they can be cloned into a catkin workspace on the Jackal, and since they have all their dependencies properly listed, rosdep can be run first to install the dependencies of both packages. Once all the dependencies have been installed, simply running catkin_make should build the two packages without any problems. Gazebo simulations of the Jackal can be run on your own computer; the ClearPath tutorials are very helpful in setting these up.
ROS over the network
A single ROS master can be shared between the Jackal and your local computer if they are on the same network. This is helpful for visualizing data while running the Jackal in headless mode. To achieve this, the ROS_MASTER_URI and ROS_HOSTNAME environment variables need to be set properly on both the Jackal and your local computer.
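As a sketch, the variables can be exported in each machine's ~/.bashrc; the hostnames here are placeholders that should match the entries in the /etc/hosts files, and 11311 is the default ROS master port:

```shell
# On the Jackal (it runs the ROS master):
export ROS_MASTER_URI=http://jackal:11311
export ROS_HOSTNAME=jackal

# On the laptop, point at the same master but advertise the laptop's hostname:
export ROS_MASTER_URI=http://jackal:11311
export ROS_HOSTNAME=laptop
```

With both set, nodes on the laptop (e.g. rviz) register with the master on the Jackal and can subscribe to its topics directly.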
Communication between Jackal and motor board
Some system debugging was needed to get the Jackal communicating with the motor driver board. This is owing to a udev rule that creates a device in /dev called jackal. Once this udev rule is implemented, the ClearPath nodes for the Jackal will be able to send and receive data from the motor board.
Interfacing with the LIDAR
In order to interface with the VLP-16 LIDAR, the Jackal needs to be set up to communicate over the correct network interface. The Velodyne documentation provides an IP address for the LIDAR. By setting the Jackal's static IP address to be in the same range as the LIDAR's, it will be able to read information from the LIDAR on the LIDAR's IP. This is achieved by setting the static IP for the relevant network interface in the /etc/network/interfaces file. Once the Jackal is set up correctly, you can ping the IP address of the LIDAR to check that they can connect, and you can view the Velodyne LIDAR portal by visiting the LIDAR's IP address in a browser if you still have a screen connected to the Jackal computer. The point cloud data can then be published to ROS topics by installing the ros-melodic-velodyne-pointcloud package from apt-get and using the VLP16_points.launch launch file. In addition, a static tf transform needs to be established from the "base_link" of the Jackal to the "velodyne" link.
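For illustration, a static address entry might look like the following (the interface name and addresses are placeholders; check your sensor's documentation for the LIDAR's actual factory address):

```
# /etc/network/interfaces (illustrative entry)
auto eth0
iface eth0 inet static
    address 192.168.1.100
    netmask 255.255.255.0
```

After bringing the interface up, a successful ping of the LIDAR's IP confirms the subnet is configured correctly.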
Booting into a working state
In order for the Jackal to boot into a working state, systemd services can be implemented to launch the relevant ROS nodes for moving the Jackal and to get Bluetooth into a state to connect to the PS3 remote. These services can also be used to set up the relevant static tf transforms and bring up the Velodyne nodes.
Useful packages and set up
Once the Jackal has been set up fully, the 3D LIDAR data can be compressed to a laser scan using the ros-melodic-pointcloud-to-laserscan package from apt-get. This allows all the examples included in the Jackal simulation tutorials to be run on the real Jackal. Along with this, full 3D mapping can be done with the 3D LIDAR using the octomap package. Some examples of these capabilities can be seen here.
This project aimed to develop a gait for a quadrupedal robot using machine learning. In particular, a robot model was created in Gazebo and evolutionary algorithms were used to develop a gait. The developed gait is then intended to be transferred to a 3D-printed robot of the same shape and size. The motivation for this project was to see if a gait could be developed for a robot without delving into the inverse kinematics and other mathematical models of the system; as such, the algorithm only has control over evolving the positions of the servo motors in the robot's joints.
Gazebo and ROS
This project makes use of Gazebo for the simulation of the robot and its gait evolution. Gazebo is used in conjunction with ROS to spawn a robot, control its joints and receive information about how the model performs. This involves using plugins to control the model and reading topics to determine how the model is performing in the simulation.
Hardware
The hardware part of this project involves implementing the gait on a 3D-printed version of the robot. The robot runs off a Raspberry Pi Zero with a servo driver board. The 3D-printed model is actuated by 12 servo motors.
Algorithms
This project makes use of hand-written evolutionary algorithms to develop the robot's gait. Multiple variations of 'genetic encodings' and fitness functions have been explored to determine which combination produces the best gait in terms of speed and stability. A further goal of this project would be to try reinforcement learning algorithms on the framework that has been set up for simulation and machine learning.
Two different genetic encodings were used:
• The first encoding was a single list of joint angles for all the joints combined. This encoding took the form [hip1 angle at t0, hip2 angle at t0, ..., ankle3 angle at t0, ankle4 angle at t0, hip1 angle at t1, ...]
• The second encoding was a list of joint angles for a single leg plus a list of phase offsets for the other legs. This encoding took the form [hip angle at t0, knee angle at t0, ankle angle at t0, hip angle at t1, knee angle at t1, ...] together with [offset for leg 2, offset for leg 3, offset for leg 4].
As with other evolutionary algorithms, this algorithm makes use of mutations, mating between the surviving members of the population, and culling of the weak portion of the population at each generation.
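The mutation and mating operators on such joint-angle encodings can be sketched as follows (the rates, scales and function names are illustrative, not the project's exact operators):

```python
import random

def mutate(genes, rate=0.1, scale=0.2):
    """Perturb a joint-angle encoding: each gene mutates with probability `rate`."""
    return [g + random.uniform(-scale, scale) if random.random() < rate else g
            for g in genes]

def mate(parent_a, parent_b):
    """Single-point crossover between two equal-length encodings."""
    cut = random.randrange(1, len(parent_a))
    return parent_a[:cut] + parent_b[cut:]
```

Culling then amounts to sorting the population by fitness, keeping the top fraction, and refilling the rest with mutated offspring of the survivors.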
This group project aimed to segment light poles out of a point cloud taken from a car. My part in this project was the planar filtering and the automatic detection of the number of poles. The data was provided by HERE Technologies. The challenges in this process involved not only the logic of finding and separating the light poles from the cloud but also doing so in a computationally efficient manner.
Process
The project involved a multi-stage pipeline with the following steps:
Pre-process the data
Pre-processing the data involved converting it from LLA (latitude, longitude, altitude) to Cartesian (x, y, z) co-ordinates, as well as down-sampling. This was done because the data was very dense (1,292,208 points); by down-sampling, very little useful information was lost while the number of points was reduced enough to make computations feasible. Down-sampling was performed in three different ways:
• Voxel down-sampling
• Uniform down-sampling
• Statistical outlier removal
Voxel down-sampling was the method used in the final implementation; it reduced the point cloud to around 39,000 points.
Planar filtering
This filtering method was chosen owing to its computational efficiency in dealing with point clouds. It works by creating bins in two of the dimensions and then filtering based on the variation of the points in the third dimension. The method can be used to filter for objects of a specific shape within a box. In the case of light poles, the grid was created in the x and y axes and the filtering was based on the z axis. By setting the grid size to around the size of the base of a light pole and filtering out any cells whose z variation is smaller than the approximate height of a pole, a very robust segmentation of light poles results. By changing the axes and the grid size, this method can be used to segment out many distinctly shaped objects in a point cloud, such as roads, rails and pavements, in a reasonably robust manner.
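The grid-and-variation idea can be sketched as follows (the grid size and height threshold are illustrative, not the tuned values):

```python
from collections import defaultdict

def planar_filter(points, grid_size=0.5, min_z_extent=3.0):
    """Keep points in (x, y) grid cells whose z spread exceeds min_z_extent.

    points: iterable of (x, y, z) tuples. grid_size ~ pole base;
    min_z_extent ~ approximate pole height.
    """
    cells = defaultdict(list)
    for x, y, z in points:
        cells[(int(x // grid_size), int(y // grid_size))].append((x, y, z))
    kept = []
    for cell_points in cells.values():
        zs = [p[2] for p in cell_points]
        if max(zs) - min(zs) >= min_z_extent:   # tall, thin cell -> pole-like
            kept.extend(cell_points)
    return kept
```

A single pass over the points plus a pass over the cells keeps the cost linear in the cloud size, which is what makes the method efficient.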
Clustering to detect number of poles
The planar filtering segments all the poles from everything else, but it would be nice for each pole to be individually segmented. This could be done with set metrics, such as binning on the x and y grid again, but that poses issues: the spacing between poles would have to be known and even. Instead, clustering was used, with a variation on the "elbow method" to determine the optimal number of clusters and segment the point cloud into that number of poles. This method worked very well and provides a quick and easy way of segmenting data that is spatially separated. To make the clustering more effective for poles, it was based on only the x and y co-ordinates of the points in the cloud.
Overview
This project was completed as part of an embedded systems course at Northwestern University. In this project, computer vision and machine learning were used with a Baxter robot in order to get Baxter to pick up a desired box with a handwritten number on it.
Computer vision
In order to have Baxter recognize the numbers, some computer vision was required. This involved using a USB camera and OpenCV to isolate the numbers from the frame and then convert them to the MNIST number format (black background, white number, 28x28 pixels) that the machine learning algorithms were trained on.
Machine learning
Multiple machine learning techniques were used for the classification of the numbers. These included convolutional neural networks, k-nearest neighbors and a combination of one-vs-all linear classifiers. These algorithms were written from scratch, with the exception of the convolutional neural network, which was created using Keras.
The final algorithm implemented was the written-from-scratch multiple one-vs-all classifier; it was the quickest, and it inherently calculated confidence values for each prediction. To improve accuracy, the highest-confidence predictions were selected first, and each predicted number was removed from the "prediction pool" (that number could not be predicted again). This meant that each number could only come up once (as each number was only present on the table once) and that the numbers were predicted in order of confidence. This resulted in a very robust prediction strategy that in practice classified all ten numbers on the table correctly every time.
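The prediction-pool strategy is essentially a greedy assignment by confidence, which can be sketched as follows (the data structure and function name are illustrative):

```python
def assign_by_confidence(confidences):
    """Greedy assignment: take the highest-confidence (box, digit) pair first,
    and remove each digit from the prediction pool once it has been used.

    confidences: {box_id: {digit: confidence}}
    """
    assignment = {}
    used_digits = set()
    # flatten to (confidence, box, digit) and visit highest confidence first
    pairs = sorted(
        ((conf, box, digit)
         for box, scores in confidences.items()
         for digit, conf in scores.items()),
        reverse=True)
    for conf, box, digit in pairs:
        if box not in assignment and digit not in used_digits:
            assignment[box] = digit
            used_digits.add(digit)
    return assignment
```

With this scheme, one confident prediction can rescue an ambiguous one: once the confident box claims a digit, that digit is no longer a candidate for the ambiguous box.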
Baxter and ROS
This project involved using a Baxter robot made by Rethink Robotics. Interfacing with Baxter, as well as with the USB camera, was done using ROS. ROS was also used for all of the data transfer between the various subsystems required to get this project to work.
As part of my individual undergraduate project at the University of Cape Town, I undertook to design, build and test an urban search and rescue robot. The resulting robot is Theseus. Theseus aims to satisfy certain goals:
• Cost effective
• Tele-operated
• Agile
Cost Effective
Theseus is designed to be cost effective. This is a broad term that could mean many things; in the context of this project, cost effective means that Theseus is replaceable. This is desirable for the field of urban search and rescue, as it is unpredictable and human life is in danger. Having a search and rescue robot that is cost effective enough to be replaceable means that a team can have many robots, allowing another robot to be sent into the field if one becomes immobilized by unforeseen circumstances. Being cost effective enough to be considered replaceable also means that the robots can be operated without concern about losing a large investment.
Tele-operated
Theseus is designed to be tele-operated. This is owing to the fact that search and rescue environments are unpredictable. Unpredictable environments such as these require tele-operation in order to keep the people operating the robot safe. Tele-operation was chosen over autonomy for Theseus, as autonomy is difficult to achieve in such unpredictable environments, and even with current AI, getting enough training data on which to train an urban search and rescue robot would be difficult.
Agile
Theseus is designed to fit into small spaces while still being able to climb over large objects all while being energy efficient. Theseus makes use of a gear system in the wheels that causes the rear wheel to flip over the front wheel when the front wheel becomes stuck. This allows for Theseus to have a low profile while still being able to overcome obstacles and climb steps. Theseus is also more energy efficient than winged or legged robots while achieving this agility.
This project was completed as a project requirement during third year at the University of Cape Town, by a team of 5 students. My main responsibilities were the circuits and microcontroller portions, which I collaborated with one other student to complete. The final robot was a line-following legged robot that could detect whether a start light was green or red and could detect objects in front of it with millimeter precision.
Circuits
All the circuits were designed and built from discrete analogue components and assembled on vero-boards. The only exception was the H-bridge, for which we purchased a packaged chip on a PCB for size reasons. The circuits included an H-bridge circuit, a color light detection circuit (using filtered photo-resistors), a line-sensing circuit (using an array of 5 IR sensors) and a distance sensor (using an IR sensor).
Mechanics
The system had to be legged; the solution implemented was a Jansen linkage. This allowed the robot to move each set of legs with only one DC motor.
Microcontroller
The microcontroller used was an STM32F051C6. Prototyping was done on a PCB development board created as part of another course at the University of Cape Town. The final microcontroller circuitry was also soldered onto a vero-board. All the microcontroller code was developed in C and compiled using Atollic rather than gcc.
This project was developed as a final deliverable for a course to build and program a quadrotor at Northwestern University.
Low level controller
A PID controller was implemented using IMU data in order to stabilize the quadrotor for human-controlled flight. The IMU data was processed to provide pitch, roll and yaw information. Both the accelerometer and gyroscope data were used with an exponential filter in order to get low-noise, low-drift data.
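One step of an exponential (complementary-style) blend of the two sensors can be sketched as follows (the blend factor and function name are illustrative, not the tuned values):

```python
def fuse_angle(angle_prev, gyro_rate, accel_angle, dt, alpha=0.98):
    """Blend gyro integration with the accelerometer angle estimate.

    The gyro integral is low-noise but drifts over time; the accelerometer
    angle is noisy but drift-free, so a weighted mix keeps the best of both.
    """
    return alpha * (angle_prev + gyro_rate * dt) + (1.0 - alpha) * accel_angle
```

Run once per IMU sample for each of pitch and roll, the small accelerometer weight continuously pulls the integrated gyro estimate back toward the drift-free reference.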
High level controller
A high-level PID controller was then implemented using VIVE sensor data; this controller was used to control the position of the quadrotor for autonomous flight.
Future application
As a future proposed project, I would like to make a few of these Raspberry Pi-powered quadcopters for use with ROS.