Bogdan Vlahov -
One common Cost Function component is having a map-based cost. In our library, we have included a helper class to allow for fast querying on the GPU and equivalent querying on the CPU. In this post, I will dicuss how to use our TextureHelper API to add maps to your own custom Cost Function. This post is based on this following Github Issue we received and should hopefully go into more depth for curious readers.
Let us first set up a minimal example.
Say we have an existing map of safe regions that we then want to use as our cost for a driving robot.
At every cell of the map, the value is the distance (in meters) to the nearest unsafe cell; cells containing obstacles will have a value of zero.
We want our cost to be zero if the robot is more than 2 meters away from an obstacle and an linearly increasing cost as
we get closer to an obstacle.
For the purposes of this example, I will use the 2D Texture Helper but there is also a 3D Texture Helper for 3D maps.
The TextureHelper class defines the map location, scale, orientation, and can hold multiple maps of different scales.
It computes bilinear interpolation between grid cells by default but can be configured to return the nearest grid cell if desred.
The Texture Helper API is also templated on the data type of the map.
We will use float
in our example to take advantage of linear interpolation but there is also the option of
char
, int
, short
, long
.
The map queries work on both the CPU and GPU with the GPU queries using the CUDA Texture API
for faster reads.
These classes are defined here
in the MPPI-Generic source code for those wanting to see the implementation details.
The first step is adding the 2D Texture Helper to your custom Cost Function class in the header file.
For this example, I will call my final cost function MapCost
.
I create a MapCostImpl
class so that this class can be extended in the future with CRTP compliance.
In the header file shown in Lst. 1, we set up inheriting from the base Cost class, declare the necessary methods,
and add a TwoDTextureHelper
class variable and a method to access it (getTextureHelper()
).
We do not use std::shared_ptr
for the TwoDTextureHelper
as the CPU and GPU have different memory locations and allocation methods
which makes std::shared_ptr
incompatible with CUDA.
It is possible to use a std::shared_ptr
on the CPU side only and a raw pointer on the GPU side which I shall leave for a future post.
It is important to note that we are overriding bindToStream()
, GPUSetup()
,freeCudaMem()
, and
paramsToDevice()
as well as the regular overwritten methods.
Towards the bottom, the class MapCost
is defined as an instantiated form of
MapCostImpl
that is used as the cost function in MPPI but can’t be inherited from due to the lack of
CLASS_T
templating.
Finally, we have the inclusion of the map_cost.cu
file where the definitions of the methods are actually stored.
We wrap it in a check to see if the compiler is nvcc
as our current attempt to allow compilation of a MPPI controller
into a shared library that can then be linked to by a pure C++ object without knowledge of CUDA methods.
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
#pragma once
#include <mppi/cost_functions/cost.cuh>
#include <mppi/utils/texture_helpers/two_d_texture_helper.cuh>
template <class CLASS_T, class DYN_T, class PARAMS_T>
class MapCostImpl : public Cost<CLASS_T, PARAMS_T, typename DYN_T::DYN_PARAMS_T>
{
public:
using PARENT_CLASS = Cost<CLASS_T, PARAMS_T, typename DYN_T::DYN_PARAMS_T>;
using output_array = typename PARENT_CLASS::output_array;
using DYN_P = typename PARENT_CLASS::TEMPLATED_DYN_PARAMS;
MapCostImpl(cudaStream_t stream = 0);
~MapCostImpl();
std::string getCostFunctionName() const override
{
return std::string("Map Cost");
}
void bindToStream(cudaStream_t stream);
void GPUSetup();
void freeCudaMem();
void paramsToDevice();
TwoDTextureHelper<float>* getTexHelper()
{
return tex_helper_;
}
float computeStateCost(const Eigen::Ref<const output_array> s, int timestep, int* crash_status);
float terminalCost(const Eigen::Ref<const output_array> s);
__device__ float computeStateCost(const float* __restrict__ s, int timestep, float* __restrict__ theta_c,
int* __restrict__ crash_status);
__device__ float terminalCost(const float* __restrict__ s, float* __restrict__ theta_c);
protected:
TwoDTextureHelper<float>* tex_helper_ = nullptr;
};
template <class DYN_T>
class MapCost : public MapCostImpl<MapCost<DYN_T>, DYN_T, CostParams<DYN_T::CONTROL_DIM>>
{
public:
using PARENT_CLASS = MapCostImpl<MapCost, DYN_T, CostParams<DYN_T::CONTROL_DIM>>;
using PARENT_CLASS::MapCostImpl; // Use MapCostImpl constructors
};
#ifdef __CUDACC__
#include "map_cost.cu"
#endif
Listing 1: Header file for the Custom Cost Function using the TextureHelper API
Next, we go to the definitions of the various methods, starting with the constructor and destructor.
We first note that in the constructor, we create a new TwoDTextureHelper
variable.
The number passed in indicates how many maps we want to store in the helper (more can be added later using the addNewTexture()
method).
For this example, we only need a single map but set this higher in cases where you have more maps you want to query.
We set up the destructor to delete the texture helper to prevent CPU memory leaks.
template <class CLASS_T, class DYN_T, class PARAMS_T>
MapCostImpl<CLASS_T, DYN_T, PARAMS_T>::MapCostImpl(cudaStream_t stream)
{
tex_helper_ = new TwoDTextureHelper<float>(1, stream);
}
template <class CLASS_T, class DYN_T, class PARAMS_T>
MapCostImpl<CLASS_T, DYN_T, PARAMS_T>::~MapCostImpl()
{
delete tex_helper_;
}
From there, we go to setting up the methods that interact with the GPU.
The first method is GPUSetup()
which allocates the GPU memory for the cost function and the TwoDTextureHelper
.
The final line also ensures that the GPU version of the cost function points to the GPU version of the TwoDTextureHelper
.
The next three methods, freeCudaMem()
, paramsToDevice()
and bindToStream()
, ensure that the texture helper’s equivalent methods are also
called alongside the cost function’s.
With this, the texture helper will now update its GPU component whenever the Cost Function updates its GPU component and
does not require separate management.
template <class CLASS_T, class DYN_T, class PARAMS_T>
void MapCostImpl<CLASS_T, DYN_T, PARAMS_T>::GPUSetup()
{
PARENT_CLASS::GPUSetup();
tex_helper_->GPUSetup();
HANDLE_ERROR(cudaMemcpyAsync(&(this->cost_d_->tex_helper_), &(tex_helper_->ptr_d_), sizeof(TwoDTextureHelper<float>*),
cudaMemcpyHostToDevice, this->stream_));
}
template <class CLASS_T, class DYN_T, class PARAMS_T>
void MapCostImpl<CLASS_T, DYN_T, PARAMS_T>::freeCudaMem()
{
if (this->GPUMemStatus_)
{
tex_helper_->freeCudaMem();
}
PARENT_CLASS::freeCudaMem();
}
template <class CLASS_T, class DYN_T, class PARAMS_T>
void MapCostImpl<CLASS_T, DYN_T, PARAMS_T>::paramsToDevice()
{
if (this->GPUMemStatus_)
{
tex_helper_->copyToDevice();
}
PARENT_CLASS::paramsToDevice();
}
template <class CLASS_T, class DYN_T, class PARAMS_T>
void MapCostImpl<CLASS_T, DYN_T, PARAMS_T>::bindToStream(cudaStream_t stream)
{
PARENT_CLASS::bindToStream(stream);
tex_helper_->bindToStream(stream);
}
We now move on to actually using the map to compute costs.
The CPU and GPU versions of computeStateCost()
look the same other than the use of s()
to index into the state on
the CPU and s[]
on the GPU.
We create a variable map_index
for clarity on the Texture Helper method inputs.
As there is only have a single map from our constructor in this example, map_index
is just set to 0.
The first step is checking if the map is ready to be used with checkTextureUse(map_index)
.
This prevents us from reading potentially uninitialized memory.
Next, we create the query point that defines where we want to read the map at.
For our simple case, we only need to query using the x and y positions of the driving robot.
We make use of our macro O_IND_CLASS()
to allow us to query the state array at the
POS_X
and POS_Y
locations as defined in the Dynamics’ Parameter structure.
This allows this cost function to be compatible with any Dynamics that defines POS_X
and POS_Y
as part of their
output.
We then query the map using queryTextureAtWorldPose()
.
This method first converts our x and y position in meters
into the equivalent pixel coordinate of the map and returns the value at that map location, interpolating if
the pixel coordinates are in between grid cells.
With this, we have the distance of our robot from the nearest obstacle and we can set up our cost.
The cost is zero when we are more than 2 meters away from an obstacle and then linearly increases up to a max
cost of two when we are inside an obstacle.
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
template <class CLASS_T, class DYN_T, class PARAMS_T>
float MapCostImpl<CLASS_T, DYN_T, PARAMS_T>::computeStateCost(const Eigen::Ref<const output_array> s, int timestep,
int* crash_status)
{
float cost = 0.0;
int map_index = 0;
if (this->tex_helper_->checkTextureUse(map_index))
{
float3 query_point = make_float3(s(O_IND_CLASS(DYN_P, POS_X)), s(O_IND_CLASS(DYN_P, POS_Y)), 0.0);
float distance = tex_helper_->queryTextureAtWorldPose(map_index, query_point);
cost = fmaxf(2 - distance, 0.0);
}
return cost;
}
template <class CLASS_T, class DYN_T, class PARAMS_T>
__device__ float MapCostImpl<CLASS_T, DYN_T, PARAMS_T>::computeStateCost(const float* __restrict__ s, int timestep,
float* __restrict__ theta_c,
int* __restrict__ crash_status)
{
float cost = 0.0f;
int map_index = 0;
if (this->tex_helper_->checkTextureUse(map_index))
{
float3 query_point = make_float3(s[O_IND_CLASS(DYN_P, POS_X)], s[O_IND_CLASS(DYN_P, POS_Y)], 0.0);
float distance = tex_helper_->queryTextureAtWorldPose(map_index, query_point);
cost = fmaxf(2 - distance, 0.0);
}
return cost;
}
template <class CLASS_T, class DYN_T, class PARAMS_T>
float MapCostImpl<CLASS_T, DYN_T, PARAMS_T>::terminalCost(const Eigen::Ref<const output_array> s)
{
return 0.0;
}
The full definition file map_cost.cu
is shown below in Lst. 2. For our simple example, we set the terminal cost to be zero.
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
template <class CLASS_T, class DYN_T, class PARAMS_T>
MapCostImpl<CLASS_T, DYN_T, PARAMS_T>::MapCostImpl(cudaStream_t stream)
{
tex_helper_ = new TwoDTextureHelper<float>(1, stream);
}
template <class CLASS_T, class DYN_T, class PARAMS_T>
MapCostImpl<CLASS_T, DYN_T, PARAMS_T>::~MapCostImpl()
{
delete tex_helper_;
}
template <class CLASS_T, class DYN_T, class PARAMS_T>
void MapCostImpl<CLASS_T, DYN_T, PARAMS_T>::GPUSetup()
{
PARENT_CLASS::GPUSetup();
tex_helper_->GPUSetup();
HANDLE_ERROR(cudaMemcpyAsync(&(this->cost_d_->tex_helper_), &(tex_helper_->ptr_d_), sizeof(TwoDTextureHelper<float>*),
cudaMemcpyHostToDevice, this->stream_));
}
template <class CLASS_T, class DYN_T, class PARAMS_T>
void MapCostImpl<CLASS_T, DYN_T, PARAMS_T>::freeCudaMem()
{
if (this->GPUMemStatus_)
{
tex_helper_->freeCudaMem();
}
PARENT_CLASS::freeCudaMem();
}
template <class CLASS_T, class DYN_T, class PARAMS_T>
void MapCostImpl<CLASS_T, DYN_T, PARAMS_T>::paramsToDevice()
{
if (this->GPUMemStatus_)
{
tex_helper_->copyToDevice();
}
PARENT_CLASS::paramsToDevice();
}
template <class CLASS_T, class DYN_T, class PARAMS_T>
void MapCostImpl<CLASS_T, DYN_T, PARAMS_T>::bindToStream(cudaStream_t stream)
{
PARENT_CLASS::bindToStream(stream);
tex_helper_->bindToStream(stream);
}
template <class CLASS_T, class DYN_T, class PARAMS_T>
float MapCostImpl<CLASS_T, DYN_T, PARAMS_T>::computeStateCost(const Eigen::Ref<const output_array> s, int timestep,
int* crash_status)
{
float cost = 0.0;
int map_index = 0;
if (this->tex_helper_->checkTextureUse(map_index))
{
float3 query_point = make_float3(s(O_IND_CLASS(DYN_P, POS_X)), s(O_IND_CLASS(DYN_P, POS_Y)), 0.0);
float distance = tex_helper_->queryTextureAtWorldPose(map_index, query_point);
cost = fmaxf(2 - distance, 0.0);
}
return cost;
}
template <class CLASS_T, class DYN_T, class PARAMS_T>
__device__ float MapCostImpl<CLASS_T, DYN_T, PARAMS_T>::computeStateCost(const float* __restrict__ s, int timestep,
float* __restrict__ theta_c,
int* __restrict__ crash_status)
{
float cost = 0.0f;
int map_index = 0;
if (this->tex_helper_->checkTextureUse(map_index))
{
float3 query_point = make_float3(s[O_IND_CLASS(DYN_P, POS_X)], s[O_IND_CLASS(DYN_P, POS_Y)], 0.0);
float distance = tex_helper_->queryTextureAtWorldPose(map_index, query_point);
cost = fmaxf(2 - distance, 0.0);
}
return cost;
}
template <class CLASS_T, class DYN_T, class PARAMS_T>
float MapCostImpl<CLASS_T, DYN_T, PARAMS_T>::terminalCost(const Eigen::Ref<const output_array> s)
{
return 0.0;
}
template <class CLASS_T, class DYN_T, class PARAMS_T>
__device__ float MapCostImpl<CLASS_T, DYN_T, PARAMS_T>::terminalCost(const float* __restrict__ s,
float* __restrict__ theta_c)
{
return 0.0;
}
Listing 2: Source file for the Custom Cost Function using the TextureHelper API
Our cost function is now fully defined and can query a map on both the CPU and GPU. We shall now go on to discuss the Texture Helper API for filling in the map.
In order to fill in the map n the texture helper, we need to have a map. we will assume it is stored in a std::vector<float>
in row-major order and have some number of rows and columns. We also need to have access to the texture helper inside the cost function.
using COST_T = MapCost<DubinsDynamics>;
COST_T cost; // Create an instantiation of the cost function
auto tex_helper_ptr = cost.getTexHelper();
std::vector<float> map_data;
int num_rows = 12;
int num_cols = 15;
We shall assume that map_data
is filled in from some outside source like an image or maybe from a ROS topic. Other required information is the resolution of the map. We shall assume that $1$ pixel length is $1\text{m}$ for simplicity and fill it in as follows:
float resolution_meters_per_pixel = 1.0;
tex_helper_ptr->updateResolution(map_index, resolution_meters_per_pixel);
where map_index = 0
for our example of only having a single map.
Next, we shall update the texture helper to have the map dimensions. We use the cudaExtent
structure to store the dimensions as it explicitly labels the dimensions as width
, height
and depth
. We can fill the structure in the following ways.
// Method 1: Invididually set dimensions
cudaExtent map_dim;
map_dim.width = num_cols;
map_dim.height = num_rows;
map_dim.depth = 0; // 2D maps have 0 depth
// Method 2: set multiple dimensions at once
map_dim = make_cudaExtent(num_cols, num_rows, 0);
Once filled, we pass that to the texture helper using the setExtent()
method.
tex_helper_ptr->setExtent(map_index, map_dim);
The map origin and rotation matrix can be tricky to get right as there is the chance to have coordinate frame confusion. The map origin vector and rotation matrix are used to convert the robot state coordinate frame into the map data coordinate frame. The map is queried in image space coordinates where the origin starts in the top left and positve x goes right, positive y goes down the page, and positive z goes out of the page towards the reader (z is ignored in the case of a 2D map). This coordinate system stems from the data being stored in a 2D row-major structure.
| 0 1 2 3 4 5 6 7 8 9 10 11 12 13 14
---+---------------------------------------------
0 | 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
1 | 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
2 | 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
3 | 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
4 | 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0
5 | 0 1 1 1 0 0 0 0 0 0 0 0 0 0 0
6 | 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0
7 | 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
8 | 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
9 | 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
10 | 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
11 | 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
Map showing an obstacle located at $x=5,y=2$ in the image coordinate frame
This can cause problems depending on how the map was constructed. Let us say that there is an obstacle at $x=5,y=2$. If the map is constructed in image space coordinates, then the obstacle location in the map would be as shown in Fig. 1 and queryable at m[2][5]
. However, most robots operate in North-East-Down (NED) or North-West-Up (NWU) coordinate frames. If the map was constructed in the NWU coordinate frame, an obstacle at $x=5, y=2$ looks like Fig. 2.
0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 | 11
0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 | 10
0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 | 9
0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 | 8
0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 | 7
0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 | 6
0 0 0 0 0 0 0 0 0 0 0 1 1 1 0 | 5
0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 | 4
0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 | 3
0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 | 2
0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 | 1
0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 | 0
---------------------------------------------+---
14 13 12 11 10 9 8 7 6 5 4 3 2 1 0 |
Map showing an obstacle located at $x=5,y=2$ in the NWU coordinate frame. Notice that array location of the obstacle is no longer at m[2][5]
as before but rather m[6][12]
(it is in the 13th column, and the 7th row)
While the location of the obstacle is the same, the location of where it is stored in the map data has changed. This difference is addressed by adjusting the origin and rotation of the texture helper map. The coordinate frame transformation equation is
\[\begin{align} \begin{bmatrix} x^{image} \\ y^{image} \\ z^{image} \\ \end{bmatrix} = R_{coord} \PP{ \begin{bmatrix} x^{robot} \\ y^{robot} \\ z^{robot} \end{bmatrix} - \begin{bmatrix} x^{o} \\ y^{o} \\ z^{o} \end{bmatrix}}. \label{eq:coord_transform} \end{align}\]where $[x^{o}, y^{o}, z^{o}]^\top$ the origin vector and $R_{coord}$ is the rotation matrix. The texture helper origin is the location of the top left corner of the map - m[0][0]
- in world frame. In the case of a NWU coordinate frame and our current map dimensions, the texture helper origin is going to be at $x_o = 14, y_o = 24$, one off of the width and height of the map.
float3 map_top_left_pos = make_float3(14, 24, 0);
tex_helper_ptr->updateOrigin(map_index, map_top_left_pos);
Next is the rotation matrix, $R_{coord}$. We need to go from NWU to image space coordinates. Using the two figures above for guidance, we can see that the vertical component is $x$ in NWU and becomes $y$ in the image coordinate frame. The positive direction in both frames is also reversed so we also need a negation. We can also see that $y$ in NWU is equivalent to $-x$ in the image coordinate frame. With this information, we can build our rotation matrix which translates from NWU to image coordinate space. Currently, the rotation matrix can only be provided as a std::array<float3, 3>
to avoid confusion on storage order.
std::array<float3, 3> rot;
rot[0] = make_float3( 0, -1, 0);
rot[1] = make_float3(-1, 0, 0);
rot[2] = make_float3( 0, 0, 1);
tex_helper_ptr->updateRotation(map_index, rot);
Note: We have been assuming that the map is aligned with the world frame and thus only the axes needed to be swapped/inverted. If the map is rotated with respect to the world frame, we will need to do a bit more work. First, we shall notate the rotation from world frame to rotated map frame as $R_W^{map}$ . We want to combine it with the coordinate transform rotation $R_{coord}$ which we do in the following order:
\[\hat{R} = R_{coord} R^{map}_W\]However, putting $\hat{R}$ into \eqref{eq:coord_transform} would cause issues as our original origin is in map frame (we defined it by looking at where the top right corner was on the map). Thus, we need to update the origin to be in world frame to negate the effects of $R_W^{map}$:
\[\begin{align} \begin{bmatrix} x_{W}^o \\ y_W^o \\ z_{W}^o \end{bmatrix} &= \PP{R^{map}_W}^\top\begin{bmatrix} x_{map}^o \\ y_{map}^o \\ z_{map}^o \end{bmatrix} \\ \begin{bmatrix} x^{image} \\ y^{image} \\ z^{image} \\ \end{bmatrix} &= R_{coord}R_W^{map} \PP{ \begin{bmatrix} x^{robot}_W \\ y^{robot}_W \\ z^{robot}_W \end{bmatrix} - \begin{bmatrix} x_{W}^o \\ y_W^o \\ z_{W}^o \end{bmatrix}} \end{align}\]The origin and rotation matrix to be passed to the texture helper would thus be $[x_{W}^o, y_W^o, z_{W}^o]^\top$ and $\hat{R}$ respectively.
The final step is to actually provide the texture helper with the actual map data and then turn the texture on for querying purposes.
bool column_major = false; // map_data was assumed row-major earlier
tex_helper_ptr->updateTexture(map_index, map_data, column_major);
tex_helper-ptr->enableTexture(map_index);
An Eigen::MatrixXf
can also be used to pass in the map data. However, as Eigen uses column-order by default, the updateTexture()
call might take longer as it converts the map to row-major order in preparation of use on the GPU. From here, all of the relevant map information will be passed to the GPU automatically when cost.GPUSetup()
is called or if you have updated the map after allocating the GPU memory, calling cost.paramsToDevice()
will update the GPU version of the map. When using the cost function with one of our MPPI controllers, cost.GPUSetup()
is called in the Controller’s constructor already so no explicit call is needed.
This should provide all the necessary information for someone to get started using the Texture Helper API to incorporate maps into your own Cost Function.