Abstract Path Model in DAGs
A general approach in developing a model to decompose a weighted graph into weighted paths is to:
- Fix \(k\), the number of paths.
- Formulate in ILP the \(k\) paths; that is adding a set of \(k\) variables and suitable constraints constraints such that the \(i\)-th set of variables encodes the \(i\)-th path.
- Add additional variables, constraints, or set the objective function, that these paths must satisfy.
- Iterate the above process for different values of \(k\), until the “best” one is found (“best” depends on the problem). See our implementation of a basic routine for this step.
For step 2. above we provide the abstract class AbstractPathModelDAG
which models a given number \(k\) of paths in a given acyclic graph \(G = (V,E)\). For simplicity, \(G\) must have a single source \(s\) and a single sink \(t\), see our class stDiGraph. (The stDiGraph
class adds these automatically for any NetworkX DiGraph, and keeps track of their incident edges.) This approach appeared in this paper.
More in detail, for every edge \((u,v) \in E\), and for every path index \(i \in \{0,\dots,k-1\}\), we add a binary variable \(x_{u,v,i} \in \{0,1\}\). We add constraints on these variables to ensure that for every \(i\) the variables \(x_{u,v,i}\) that equal 1 induce an \(s\)-\(t\) path (i.e., a path from \(s\) to \(t\)). In other words \(x_{u,v,i} = 1\) if edge \((u,v)\) belongs to solution path \(i\), and 0 otherwise. See the paper on the specific constraints that are added to enforce that they induce an \(s\)-\(t\) path.
For example, the edges in brown below induce an \(s\)-\(t\) path (for say \(i = 3\)), and notice that the \(x_{u,v,3}\) variables equal 1 only on the edges \((u,v)\) on the path.
%%{init: {'themeVariables': { 'edgeLabelBackground': 'white'}}}%%
flowchart LR
s((s))
a((a))
b((b))
c((c))
d((d))
t((t))
s -->|"$$x_{s,a,3} = 1$$"| a
a -->|"$$x_{a,b,3} = 1$$"| b
s -->|"$$x_{s,b,3} = 0$$"| b
a -->|"$$x_{a,c,3} = 0$$"| c
b -->|"$$x_{b,c,3} = 1$$"| c
c -->|"$$x_{c,d,3} = 1$$"| d
d -->|"$$x_{d,t,3} = 1$$"| t
c -->|"$$x_{c,t,3} = 0$$"| t
linkStyle 0,1,4,5,6 stroke:brown,stroke-width:3;
The search for paths
- Note that we do not have the paths beforehand, and the ILP will search for paths (i.e. assignment of values to the \(x_{u,v,i}\) variables, under the constraints that they induce a path).
- Once a class inherits from
AbstractPathModelDAG
, it will add other variables and constraints (as in point 3. above). The ILP solver will then search for the \(k\) paths (i.e. find the values to the \(x_{u,v,i}\) variables) to satisfy all constraints.
Example: Modelling \(k\)-Flow Decomposition
Consider the problem of decomposing a network flow \(f : E \rightarrow \mathbb{N}\) over a DAG \(G = (V,E)\) into a given number \(k\) of \(s\)-\(t\) paths (k-Flow Decomposition). Assume we created the \(x_{u,v,i}\) variables as above. Thus, we just need to implement point 3. above.
- We introduce a variable \(w_i\) (integer or continuous) modeling the weight of path \(i\), for every \(i \in \{0,\dots,k-1\}\).
- We need to enforce the “flow explanation” constraint:
$$
f(u,v) = \sum_{i=0}^{k-1} x_i \cdot w_i, ~~\forall (u,v) \in E.
$$
Note that in the above \(f(u,v)\) is a constant. Moreover, \(x_i \cdot w_i\) is not a linear term (as required by an Integer Linear Program), but it can be easily linearized via additional variables and constraints. However, our
SolverWrapper
class provides the methodadd_binary_continuous_product_constraint()
to directly encode such a non-linear constraint, without bothering to manually set up these additional variables and constraints.
The \(x_{u,v,i}\) variables are implemented as edge_vars[(u, v, i)]
, see the class documentation below.
AbstractPathModelDAG
AbstractPathModelDAG(
G: stDiGraph,
k: int,
subpath_constraints: list = [],
subpath_constraints_coverage: float = 1,
subpath_constraints_coverage_length: float = None,
encode_edge_position: bool = False,
encode_path_length: bool = False,
edge_length_attr: str = None,
optimization_options: dict = None,
solver_options: dict = {},
solve_statistics: dict = {},
)
Bases: ABC
This is an abstract class modelling a path-finding (M)ILP in a DAG. The design of this package is based on the following principles:
- The class is designed to be extended by other classes that model specific path-finding problems in DAGs. In this way, they all benefit from the variables it encodes, and the safety optimizations it provides.
- The class uses our custom SolverWrapper class, which is a wrapper around the solvers HiGHS (open source) and Gurobi (free with academic license). In this way, both solvers can be used interchangeably.
More in detail, this class encodes k
s-t paths in the DAG G, where s is the global source of the stDiGraph
and t is the global sink. It also allows for subpath constraints that must appear in at least one of the s-t paths.
The class creates the following variables:
- edge_vars:
edge_vars[(u, v, i)]
= 1 if pathi
goes through edge(u, v)
,0
otherwise -
edge_position_vars:
edge_position_vars[(u, v, i)]
= position of edge(u, v)
in pathi
, starting from position 0- These variables are created only if
encode_edge_position
is set toTrue
- Note that positions are relative to the globals source
s
of the stDiGraph, thus the first edge in a path is the edge froms
to the first vertex in the original graph, and this first edge has position 0 - If you set
edge_length_attr
, then the positions are relative to the edge lengths, and not the number of edges The first edge still gets position 0, and other edges get positions equal to the sum of the lengths of the edges before them in the path - If you set
edge_length_attr
, and an edge has missing edge length, then it gets length 1
- These variables are created only if
-
path_length_vars:
path_length_vars[(i)]
= length of pathi
- These variables are created only if
encode_path_length
is set toTrue
- Note that the length of a path is the sum of the lengths of the edges in the path
- If you set
edge_length_attr
, then the lengths are the sum of the lengths of the edges in the path - If you set
edge_length_attr
, and an edge has missing edge length, then it gets length 1 - NOTE: the length also includes the edges from global source to the first vertex, and from the last vertex to the global sink. By default, these do not have a length attached, so each gets length 1.
- These variables are created only if
-
solver: a solver object to solve the (M)ILP, implemented using our SolverWrapper class.
Safety optimizations
This class uses the “safety information” (see https://doi.org/10.48550/arXiv.2411.03871) in the graph to fix some
edge_vars
to 1 or 0. The safety information consists of safe paths, or safe sequences, that are guaranteed to appear in at least
one cover (made up of any number of s-t paths) of the edges in trusted_edges_for_safety
. That is, when implementing a new
path-finding (M)ILP, you can guarantee that
- The solution is made up of s-t paths
- Any solution covers all edges in
trusted_edges_for_safety
, then safety optimizations can be used to fix someedge_vars
to 1, which can speed up the solving process, while guaranteeing global optimality.
Parameters
-
G: stDiGraph.stDiGraph
The directed acyclic graph (DAG) to be used.
-
k: int
The number of paths to be modeled.
-
subpath_constraints: list
, optionalA list of lists, where each list is a sequence of edges (not necessarily contiguous, i.e. path). Defaults to an empty list.
Each sequence of edges must appear in at least one solution path; if you also pass subpath_constraints_coverage, then each sequence of edges must appear in at least subpath_constraints_coverage fraction of some solution path, see below.
-
subpath_constraints_coverage: float
, optionalCoverage fraction of the subpath constraints that must be covered by some solution paths, in terms of number of edges. - Defaults to 1 (meaning that 100% of the edges of the constraint need to be covered by some solution path).
-
subpath_constraints_coverage_length: float
, optionalCoverage fraction of the subpath constraints that must be covered by some solution paths, in terms of length of the subpath. Defaults to
None
, meaning that this is not imposed. - If you set this constraint, you cannot setsubpath_constraints_coverage
(and its default value of 1 will be ignored). - If you set this constraint, you also need to setedge_length_attr
. If an edge has missing edge length, it gets length 1. -
encode_edge_position: bool
, optionalWhether to encode the position of the edges in the paths. Defaults to
False
. -
encode_path_length: bool
, optionalWhether to encode the length of the paths (in terms of number of edges, or sum of lengths of edges, if set via
edge_length_attr
). Defaults toFalse
. -
edge_length_attr: str
, optionalThe attribute name from where to get the edge lengths. Defaults to
None
.- If set, then the edge positions, or path lengths (above) are in terms of the edge lengths specified in the
edge_length_attr
field of each edge - If set, and an edge has a missing edge length, then it gets length 1.
- If set, then the edge positions, or path lengths (above) are in terms of the edge lengths specified in the
-
optimization_options: dict
, optionalDictionary of optimization options. Defaults to
None
, in which case the default values are used. See the available optimizations. If you pass any safety optimizations, you must also pass the dict entry"trusted_edges_for_safety"
(see below). If a child class has already solved the problem and has the solution paths, it can pass them via the dict entry"external_solution_paths"
to skip the solver creation and encoding of paths (see below).-
"trusted_edges_for_safety": set
Set of trusted edges for safety. Defaults to
None
.Global optimality
In order for the optimizations to still guarantee a global optimum, you must guarantee that:
- The solution is made up of source-to-sink paths, and
- Every edge in
trusted_edges_for_safety
appears in some solution path, for all solutions. This naturally holds for several problems, for example Minimum Flow Decomposition or [k-Minimum Path Error] where in fact, under default settings, all edges appear in all solutions.
-
"external_solution_paths" : list
External solution paths, as a list of paths, where every path is a list of nodes. Defaults to
None
. If you provide this, this class skip the solver creation and encoding of paths, and just return these paths. This is useful when the child class managed to solver the problem in a different way, and needs to let this class know them, in order to have a consistent API.
-
-
solver_options: dict
, optionalDictionary of solver options. Defaults to
{}
, in which case the default values are used. See the available solver options. -
solve_statistics: dict
, optionalDictionary to store solve statistics. Defaults to
{}
.
Raises
- ValueError: If
trusted_edges_for_safety
is not provided when optimizing withoptimize_with_safe_paths
oroptimize_with_safe_sequences
. - ValueError: If both
optimize_with_safe_paths
andoptimize_with_safe_sequences
are set toTrue
.
Source code in flowpaths/abstractpathmodeldag.py
61 62 63 64 65 66 67 68 69 70 71 72 73 74 75 76 77 78 79 80 81 82 83 84 85 86 87 88 89 90 91 92 93 94 95 96 97 98 99 100 101 102 103 104 105 106 107 108 109 110 111 112 113 114 115 116 117 118 119 120 121 122 123 124 125 126 127 128 129 130 131 132 133 134 135 136 137 138 139 140 141 142 143 144 145 146 147 148 149 150 151 152 153 154 155 156 157 158 159 160 161 162 163 164 165 166 167 168 169 170 171 172 173 174 175 176 177 178 179 180 181 182 183 184 185 186 187 188 189 190 191 192 193 194 195 196 197 198 199 200 201 202 203 204 205 206 207 208 209 210 211 212 213 214 215 216 217 218 219 220 221 222 223 224 225 226 227 228 229 230 231 232 233 234 235 236 237 238 239 240 241 242 243 244 245 246 247 248 249 250 251 252 253 254 255 256 257 258 259 260 261 262 263 264 |
|
create_solver_and_paths
Creates a solver instance and encodes the paths in the graph.
This method initializes the solver with the specified parameters and encodes the paths by creating variables for edges and subpaths.
If external solution paths are provided, it skips the solver creation.
Call this method before encoding other variables and constraints.
Always call this method before encoding other variables and constraints on the paths.
Source code in flowpaths/abstractpathmodeldag.py
get_lowerbound_k
abstractmethod
Implement this class in the child class to return a lower bound on the number of solution paths to the model. If you have no lower bound, you should implement this method to return 1.
Source code in flowpaths/abstractpathmodeldag.py
get_objective_value
abstractmethod
Implement this class in the child class to return the objective value of the model. This is needed to be able to compute the safe paths (i.e. those appearing any optimum solution) for any child class.
A basic objective value is k
(when we’re trying to minimize the number of paths). If your model has a different
objective, you should implement this method to return the objective value of the model. If your model has no objective value,
you should implement this method to return None.
Source code in flowpaths/abstractpathmodeldag.py
get_solution
abstractmethod
Implement this class in the child class to return the full solution of the model. The solution paths are obtained with the get_solution_paths method.
get_solution_paths
Retrieves the solution paths from the graph.
This method returns the solution paths either from the external solution paths if they are provided at initialization time, or by calculating them based on the edge variable solutions.
Returns
- A list of paths, where each path is represented as a list of vertices.
Source code in flowpaths/abstractpathmodeldag.py
is_valid_solution
abstractmethod
Implement this class in the child class to perform a basic check whether the solution is valid.
If you cannot perform such a check, provide an implementation that always returns True.
Source code in flowpaths/abstractpathmodeldag.py
solve
Solves the optimization model for the current instance.
Returns
- True if the model is solved successfully, False otherwise.
The method first checks if an external solution is already provided. If so, it sets the solved attribute to True and returns True.
If not, it optimizes the model using the solver, and records the solve time and solver status in the solve_statistics dictionary. If the solver status indicates an optimal solution (either ‘kOptimal’ (highs) or status code 2 (gurobi)), it sets the solved attribute to True and returns True. Otherwise, it sets the solved attribute to False and returns False.