Main API
DataFlowContext
This is the shared store for the entire fixpoint loop. All loaded analyses read and write facts through the same context.
structure DataFlowContext where
analyses : HashMap Lean.Name DataFlowAnalysis
lattice : HashMap LatticeAnchor (HashMap Lean.Name ErasedAnalysisState)
workList : WorkList
DataFlowAnalysis
This is the abstraction every analysis plugs into. The solver itself
only knows how to initialize analyses, pop worklist items in the fixpoint loop, and call
visit until a fixpoint is reached.
structure DataFlowAnalysis where
id : Lean.Name
init : OperationPtr -> ...
visit : ProgramPoint -> ...
SparseForwardDataFlowAnalysis
This extension requires that the lattice facts are attached to SSA values. It does use-def propagation and block-argument propagation of those lattice facts.
def new
(Domain : Type)
(analysisName : Lean.Name)
(visitOperationImpl : VisitOperationFn)
(setToEntryState : SetToEntryStateFn) :
DataFlowAnalysis :=
DataFlowAnalysis.new
analysisName
(init (Domain := Domain) analysisName visitOperationImpl setToEntryState)
(visit (Domain := Domain) analysisName visitOperationImpl setToEntryState)
setToEntryState explanation.
fixpointSolve
This is the top-level function that computes the joint fixpoint for all analyses loaded into the solver.
def fixpointSolve (top : OperationPtr)
(analyses : Array DataFlowAnalysis)
(irCtx : IRContext OpCode) : DataFlowContext := Id.run do
let mut dfCtx := DataFlowContext.empty
for analysis in analyses do
dfCtx := dfCtx.insertAnalysis analysis
for analysis in analyses do
dfCtx := analysis.init top dfCtx irCtx
run dfCtx irCtx
Lattice Elements/States
In lattice theory, there are lattice elements that can be top, bottom, etc.
In this dataflow framework, that is encoded with LatticeElement.
But more information is often needed in regards to a specific lattice element.
The *State types contain that information, such as what depends
on this LatticeElement.
LatticeElement
Gives a domain its algebra: bottom,
top, join, and meet.
class LatticeElement (Domain : Type) extends BEq Domain where
typeNameInst : TypeName Domain
bottom : Domain
top : Domain
join : Domain -> Domain -> Domain
meet : Domain -> Domain -> Domain
AnalysisStateHeader
Stores the anchor a fact refers to plus explicit
dependents that should be re-enqueued when the fact
changes.
structure AnalysisStateHeader where
anchor : LatticeAnchor
dependents : Array WorkItem
AnalysisState
Adds onUpdate, which is a hook that is run when the lattice
state changes. Typically enqueues dependents and notifies subscribers.
Users making their own analysis state types extend AnalysisStateHeader and create
an instance of AnalysisState.
class AnalysisState (State : Type u) where
typeNameInst : TypeName State
mkState : LatticeAnchor -> State
onUpdate : State -> DataFlowContext -> IRContext OpCode -> DataFlowContext
toHeader : State -> AnalysisStateHeader
SparseLatticeState
Holds a lattice element of the specified lattice Domain plus useDefSubscribers: analyses
that want uses of that SSA value revisited when the lattice element changes.
structure SparseLatticeState (Domain : Type)
extends AnalysisStateHeader where
useDefSubscribers : Array Lean.Name
latticeElement : Domain
SparseConstantPropagation
How it plugs into the framework
def SparseConstantPropagation : DataFlowAnalysis :=
SparseForwardDataFlowAnalysis.new
(Domain := ConstantDomain)
"SparseConstantPropagation"
SparseConstantPropagation.visitOperation
SparseConstantPropagation.setToEntryState
ConstantDomain
Variables of ConstantDomain are lattice elements of the constant domain lattice with the following possible values:
bottom, top, or a concrete integer constant with bitwidth information.
Sparse constant propagation stores its facts in
SparseLatticeState ConstantDomain, meaning each SSA
value carries a constant-domain lattice element together with
the sparse dependency metadata (useDefSubscribers)
needed for use-def driven re-visitation when that SSA value's
lattice element changes.
structure KnownConstant where
bitwidth : Nat
value : Data.LLVM.Int bitwidth
deriving BEq
inductive ConstantDomain where
| top
| bottom
| constant (value : KnownConstant)
deriving BEq, TypeName
instance : LatticeElement ConstantDomain where
typeNameInst := inferInstance
bottom := .bottom
top := .top
join := .join
meet := .meet
Block-argument propagation
- If a block start is visited, first check whether the corresponding block is executable. If not, return.
- For a non-entry block, iterate over predecessor blocks.
- Skip any predecessor whose CFG edge to the current block is not live.
- For each block argument, if the predecessor forwards an SSA value into that argument, add the current block-start point as a dependent of that predecessor value.
- Join the predecessor value's lattice fact into the block argument's lattice state.
- If incoming flow cannot be modeled precisely, set the block argument to the entry state.
Operation transfer
- If the operation has regions, conservatively set its results to the entry state.
- If any operand is still
bottom, do nothing yet. - If all required operands are known constants, fold.
- If folding succeeds, join the constant into the SSA result lattice state.
- If folding fails, set the SSA result lattice state to the entry state:
unknown/top.
DeadCodeAnalysis
How it plugs into the framework
def DeadCodeAnalysis :=
DataFlowAnalysis.new
"DeadCodeAnalysis"
DeadCodeAnalysis.init
DeadCodeAnalysis.visit
ExecutableState
Marks a block entry or CFG edge live and tracks which analyses should revisit the relevant block contents if executability changes.
structure ExecutableState extends AnalysisStateHeader where
live : Bool := false
subscribers : Array Lean.Name := #[]
deriving TypeName
Initialization
- Mark top-level region entry blocks live during initialization.
- Recursively walk nested operations with control-flow semantics.
- Subscribe those operations to parent-block executability.
- Invoke
visiton those operations during initialization.
Control-flow propagation
- Skip the operation if its parent block is not executable.
- If the operation has regions, mark their entry blocks live.
- If the operation has successors and branch behavior can be resolved from sparse constant facts, mark the chosen edge live.
- Otherwise, conservatively mark all successor edges live.
Where the framework is currently restrictive
User defined analyses based off of SparseForwardDataFlowAnalysis
assume code reachability has been computed. DeadCodeAnalysis, in
turn, currently consults sparse constant propagation facts to reason precisely
about branches. This means that SparseConstantPropagationAnalysis
and DeadCodeAnalysis are effectively worthless unless run together,
combining into SCCP.
More detrimentally, this means that user defined sparse analyses will not work unless
both SparseConstantPropagationAnalysis and DeadCodeAnalysis
are registered to the same fixpointSolve run.
Loop Test From Combining Analyses, Combining Optimizations
int x₀ ← 1;
do { x₁ ← φ(x₀, x₃);
b ← (x₁ ≠ 1);
if (b)
x₂ ← 2;
x₃ ← φ(x₁, x₂);
} while (pred());
return(x₃);
"builtin.module"() ({
^bb0:
%x0 = "arith.constant"() <{ value = 1 : i32 }> : () -> i32
"test.test"(%x0, %x0)[^bb1] : (i32, i32) -> ()
^bb1(%dead1 : i32, %x1 : i32):
%one = "arith.constant"() <{ value = 1 : i32 }> : () -> i32
%b = "arith.subi"(%x1, %one) : (i32, i32) -> i32
"test.test"(%b, %x1)[^bb2, ^bb3] : (i32, i32) -> ()
^bb2(%dead_then : i32, %x1_then : i32):
%x2 = "arith.constant"() <{ value = 2 : i32 }> : () -> i32
"test.test"(%x2, %x2)[^bb3] : (i32, i32) -> ()
^bb3(%dead2 : i32, %x3 : i32):
%pred = "test.test"() : () -> i32
"test.test"(%pred, %x3)[^bb1, ^bb4] : (i32, i32) -> ()
^bb4(%dead3 : i32, %retv : i32):
}) : () -> ()
Reachability
- Line 0 is reachable.
- Line 1 is reachable.
- Line 2 is reachable.
- Line 3 is reachable.
- Line 4 is unreachable.
- Line 5 is reachable.
- Line 6 is reachable.
- Line 7 is reachable.
Value facts
x₀=1x₁=1b=0(false)x₂=bottomx₃=1pred=top
Block liveness
bb0=truebb1=truebb2=falsebb3=truebb4=true
Edge liveness
bb0→bb1=truebb1→bb2=falsebb1→bb3=truebb2→bb3=falsebb3→bb1=truebb3→bb4=true
Constant facts
x0=1x1=1one=1b=0x2=bottomx3=1pred=unknownretv=1
Implementation PRs
Dataflow Analysis Framework
The APIs referenced in this talk (such as the solver, analysis interfaces, lattice element inferace, etc.) is in this PR.
Dominance Tree
A dominance tree implementation based off the Cooper-Harvey-Kennedy algorithm is in this PR.