Minimum Flow Decomposition
1. Definition
The Minimum Flow Decomposition problem on a directed acyclic graph (DAG) is defined as follows:
-
INPUT: A directed graph \(G = (V,E)\), and a flow on \(G\), namely weights \(f(u,v)\) for every edge \((u,v)\) of \(G\), such that for every node \(v\) that is not a source or sink of \(G\), it holds that the sum of the flow values entering \(v\) equals the sum of the flow values exiting \(v\). This property is called flow conservation.
-
OUTPUT: A minimum number \(k\) of source-to-sink paths, \(P_1,\dots,P_k\), with a weight \(w_i\) associated to each \(P_i\), such that for every edge it holds that its flow value equals the sum of the weights of the paths going through the edge. Formally, $$ f(u,v) = \sum_{i \in \{1,\dots,k\} : (u,v) \in P_i }w_i, ~~\forall (u,v) \in E. $$
Note
The graph may have more than one source or sink nodes. The solution paths are required to start in some source node, and end in some sink node.
See also
For example, the directed graph below satisfies the flow conservation property:
flowchart LR
s((s))
a((a))
b((b))
c((c))
d((d))
t((t))
s -->|6| a
a -->|2| b
s -->|7| b
a -->|4| c
b -->|9| c
c -->|6| d
d -->|6| t
c -->|7| t
A decomposition into 3 paths, in red, orange and blue, of weights 4, 2 and 7, respectively is shown below. There is no decomposition into a smaller number of paths, and thus this decomposition is also a minimum flow decomposition.
flowchart LR
s((s))
a((a))
b((b))
c((c))
d((d))
t((t))
s -->|4| a
a -->|4| c
c -->|4| d
d -->|4| t
linkStyle 0,1,2,3 stroke:red,stroke-width:3;
s -->|2| a
a -->|2| b
b -->|2| c
c -->|2| d
d -->|2| t
linkStyle 4,5,6,7,8 stroke:orange,stroke-width:3;
s -->|7| b
b -->|7| c
c -->|7| t
linkStyle 9,10,11 stroke:blue,stroke-width:3;
2. Solving the problem
We create the graph as a networkx DiGraph. In real project, you will likely have a method that transforms your graph to a DiGraph. We also give an attribute flow
for every edge storing its flow value.
import flowpaths as fp
import networkx as nx
graph = nx.DiGraph()
graph.add_edge("s", "a", flow=6)
graph.add_edge("s", "b", flow=7)
graph.add_edge("a", "b", flow=2)
graph.add_edge("a", "c", flow=4)
graph.add_edge("b", "c", flow=9)
graph.add_edge("c", "d", flow=6)
graph.add_edge("c", "t", flow=7)
graph.add_edge("d", "t", flow=6)
flow
of the edges. Note that MinFlowDecomp
just creates the model. You need to call solve()
to solve it.
The model might not be solved because the MILP solver couldn’t do it in the time it had allocated, or other problems. Thus, you need to check if it was solved, and then get its solution. The solutoion of MinFlowDecomp
is a tuple, where the first component is the list of paths (as lists of nodes), and the second component is a list of their corresponding weights.
if mfd_model.is_solved():
solution = mfd_model.get_solution()
print("Paths, weights", solution["paths"], solution["weights"])
# [['s', 'b', 'c', 't'], ['s', 'a', 'c', 'd', 't'], ['s', 'a', 'b', 'c', 'd', 't']] [7, 4, 2]
3. References
There are several works on this problem, for example.
-
Vatinlen, Benedicte, et al. Simple bounds and greedy algorithms for decomposing a flow into a minimal set of paths. European Journal of Operational Research 185.3 (2008): 1390-1401.
-
Shao, Mingfu, and Carl Kingsford. Theory and a heuristic for the minimum path flow decomposition problem. IEEE/ACM Transactions on Computational Biology and Bioinformatics 16.2 (2017): 658-670.
-
Kloster, Kyle, et al. A practical fpt algorithm for flow decomposition and transcript assembly 2018 Proceedings of the Twentieth Workshop on Algorithm Engineering and Experiments (ALENEX). Society for Industrial and Applied Mathematics, 2018.
-
See also flowpaths References, and the other papers cited by these works.
MinFlowDecomp
MinFlowDecomp(G: DiGraph, flow_attr: str, weight_type: type = float, subpath_constraints: list = [], subpath_constraints_coverage: float = 1.0, subpath_constraints_coverage_length: float = None, edge_length_attr: str = None, optimization_options: dict = None, solver_options: dict = None)
Bases: AbstractPathModelDAG
Class to decompose a flow into a minimum number of weighted paths.
Initialize the Minimum Flow Decomposition model, minimizing the number of paths.
Parameters
-
G : nx.DiGraph
The input directed acyclic graph, as networkx DiGraph.
-
flow_attr : str
The attribute name from where to get the flow values on the edges.
-
weight_type : type
, optionalThe type of weights (
int
orfloat
). Default isfloat
. -
subpath_constraints : list
, optionalList of subpath constraints. Default is an empty list. Each subpath constraint is a list of edges that must be covered by some solution path, according to the
subpath_constraints_coverage
orsubpath_constraints_coverage_length
parameters (see below). -
subpath_constraints_coverage : float
, optionalCoverage fraction of the subpath constraints that must be covered by some solution paths.
Defaults to
1.0
(meaning that 100% of the edges of the constraint need to be covered by some solution path). See subpath constraints documentation -
subpath_constraints_coverage_length : float
, optionalCoverage length of the subpath constraints. Default is
None
. If set, this overridessubpath_constraints_coverage
, and the coverage constraint is expressed in terms of the subpath constraint length.subpath_constraints_coverage_length
is then the fraction of the total length of the constraint (specified viaedge_length_attr
) needs to appear in some solution path. See subpath constraints documentation -
edge_length_attr : str
, optionalAttribute name for edge lengths. Default is
None
. -
optimization_options : dict
, optionalDictionary with the optimization options. Default is
None
. See optimization options documentation. This class also supports the optimization"optimize_with_greedy": True
(this is the default value). This will use a greedy algorithm to solve the problem, and if the number of paths returned by it equals a lowerbound on the solution size, then we know the greedy solution is optimum, and it will use that. The lowerbound used currently is the edge-width of the graph, meaning the minimum number of paths needed to cover all edges. This is a correct lowerbound because any flow decomposition must cover all edges, as they have non-zero flow. -
solver_options : dict
, optionalDictionary with the solver options. Default is
None
. See solver options documentation.
Raises
ValueError
- If
weight_type
is notint
orfloat
. - If some edge does not have the flow attribute specified as
flow_attr
. - If the graph does not satisfy flow conservation on nodes different from source or sink.
- If the graph contains edges with negative (<0) flow values.
- If the graph is not acyclic.
Source code in flowpaths/minflowdecomp.py
get_solution
Retrieves the solution for the flow decomposition problem.
Returns
-
solution: dict
A dictionary containing the solution paths (key
"paths"
) and their corresponding weights (key"weights"
).
Raises
exception
If model is not solved.
Source code in flowpaths/minflowdecomp.py
solve
Attempts to solve the flow distribution problem using a model with varying number of paths.
This method iterates over a range of possible path counts, creating and solving a flow decompostion model for each count. If a solution is found, it stores the solution and relevant statistics, and returns True. If no solution is found after iterating through all possible path counts, it returns False.
Returns:
Name | Type | Description |
---|---|---|
bool |
bool
|
True if a solution is found, False otherwise. |
Note
This overloads the solve()
method from AbstractPathModelDAG
class.