My first paper - Exactly solving a routing problem in humanitarian logistics

Update (15. Oct 2024): the paper was accepted and has been published!

Here is the link to the paper: https://link.springer.com/article/10.1007/s10100-024-00943-y

What am I talking about?

As part of the Data Science Master at TU Wien, we are required to work on the Interdisciplinary Project in Data Science. Students are expected to work (for free) on a research project supervised by another research group or institution that focuses on non-DS topics. The idea is to apply Data Science to a non-Data Science domain.

I just wanted to talk about how great this project turned out, and how good results we got. 👍🏻

Who supervised me?

I worked with the Institute for Transport and Logistics Management of the WU Vienna. And:

Vera Hemmelmayr (Assoz.Prof PD Dr.)
Günther Raidl (Ao.Univ.Prof. Dipl.-Ing. Dr.techn.)

supervised me. They provided excellent guidance, and they are very nice. You should work with them if you have a chance. Moreover, they gave me the opportunity to publish our work.

The optimization problem that we solved

We solved the Selective Assessment Routing Problem (SARP), which is the following problem:

Imagine some natural disaster or emergency (say an earthquake) that affects some region.
We are a humanitarian organization that will provide aid for the affected population
The very first step: Rapid Needs Assessment
- Evaluate and prioritize the immediate needs of affected populations
- The next steps (providing the needs) is out of the scope of the problem
We have some teams that will take cars and visit some affected regions (not all of them, time is limited)
What sites do we visit and in which order? This is our problem
In order to select the sites, we assume a set of characteristics of the sites (e.g., geographical, demographic, or socio-economical factors) that should be covered in a balanced way
- This means we wouldn’t visit all the sites on the coast and forget about the sites on the mountains
- Or, for instance, we want to cover the poor or rich areas equally
In essence, we select the sites fairly in order to get the best picture of the situation

Our Mixed Integer Programming Model

I don’t know if the reader is familiar with Mathematical Programming. I will just assume you are. The SARP was first defined in this paper by Balcik (2017), and the author proposed a MIP model using MTZ constraints. But they mostly focused on some Tabu Search Heuristic.

In the project, we come up with different alternative formulations (Cutting Sets, 2-indexed MTZ, and Single Commodity Flow), we study them theoretically, and computationally test them.

Single Commodity Flow formulation for the SARP

This model uses binary variables to select the edges and nodes part of the optimal routes. The formulation is based on flow conservation constraints and includes fairness objectives to ensure balanced coverage of different site characteristics.

Notation for the problem data

We use the following notation. $N$ is the set of affected sites. $K$ is the set of teams (or vehicles), and each team $k \in K$ departs from the origin $\{0\}$ and returns back to it after completing the route, which takes at most $T_{\text{max}}$ time units. The travel time is represented by $t_{ij}, \forall i, j \in N_0 = N \cup \{0\}$. Let $C$ denote the set of critical characteristics. A binary coverage parameter $\alpha_{ic}$ is 1 if node $i \in N$ carries characteristic $c \in C$. The total number of sites that carry characteristic $c \in C$ is represented by $\tau_c$.

Notation for our formulation

We propose a two-index Single Commodity Flow formulation using the following variables: $f_{ij}$ is the flow circulating from $i \in N_0$ to $j \in N_0$ and is bounded by 0 and $T_{\text{max}}$. Variables $x_{ij}$ takes value 1 if some vehicle visits site $j \in N_0$ right after $i \in N_0$, and 0 otherwise. $y_i$ takes value 1 if some vehicle visits site $i \in N$, and 0 otherwise. $z$ is a continuous variable modeling the coverage rate, bounded by 0 and 1.

Mathematical Model

\begin{equation} \max\ z \end{equation}

\begin{equation} \text{s.t.} \quad z \cdot \tau_c \leq \sum_{i \in N} \alpha_{ic} \cdot y_i \quad \forall c \in C \end{equation}

\begin{equation} \sum_{i \in N_0} x_{0i} = \sum_{i \in N_0} x_{i0} = |K| \end{equation}

\begin{equation} f_{ij} \leq T_{\max} \cdot x_{ij} \quad \forall i \in N_0,\, j \in N_0 \end{equation}

\begin{equation} f_{i0} = x_{i0} \cdot t_{i0} \quad \forall i \in N \end{equation}

\begin{equation} \sum_{j \in N_0} f_{ji} - \sum_{j \in N_0} f_{ij} = \sum_{j \in N_0} x_{ji} \cdot t_{ji} \quad \forall i \in N \end{equation}

\begin{equation} \sum_{j \in N_0} x_{ji} = \sum_{j \in N_0} x_{ij} = y_i \quad \forall i \in N \end{equation}

Our results and findings

We focused on exact methods, which have advantages and disadvantages compared to heuristics. From the exact perspective, one can study different formulations and try to establish whether one is better than another (which corresponds to proving the inclusion of the polyhedral projections of one into the other).

We did this, and it’s very nice because we could prove the Single Commodity Formulation (SCF) that I presented above is strictly stronger than the MTZ formulation in Balcik (2017).

The proof I came up with is quite simple. It's just checking that it is true. Click to see the proof

Proof of the strictness of the SCF formulation

Our computational results are also quite good. Our exact method is comparable in terms of solutions to the Tabu Search heuristic of Balcik.

You can find our implementation and the project report here: Exact Sarp - GitHub - jsalvasoler

What next?

Since the results are very decent, my supervisors suggested writing a paper about it, which has now been published in the Central European Journal of Operations Research! 🎉

And I wanted to end this blog post by just reflecting on my learnings:

MIP modeling is powerful and nice, and proving stuff regarding formulations is not that hard (proof is that I could do it).
Gurobi (the solver used) is very strong.
Doing research in optimization topics is great because the problems are so well-defined. Given a problem, we just want to get the best solution given a time budget. But this is actually one of the problems or optimization research: problems in the real world are not well-defined.
Doing good work (I worked hard) leads to nice opportunities (publishing a paper).