Generalizing Graph Search to State Space Search
Info
The inspiration for this article comes from my recent work on FIT3080 (Introduction to AI) at Monash University, where we show how state space search can be derived from graph search algorithms through an elegant generalization.
Introduction
Goal: To demonstrate how Dijkstra's classical single-source shortest path algorithm can be abstracted into a "Best-First Search" framework, and then evolved, through different choices of the evaluation function f(n), into a family of state space search algorithms including UCS, A*, Weighted A*, Greedy Search, and more.
Background Scenario: Graph shortest path → Cost minimization problems over arbitrary finite/countable state spaces S.
The journey from Dijkstra's algorithm to general state space search represents one of the most elegant generalizations in computer science and artificial intelligence. What begins as a concrete algorithm for finding shortest paths in graphs with non-negative edge weights transforms into a unified framework capable of solving a vast class of optimization problems.
At its core, this generalization teaches us a fundamental lesson: by abstracting away the specific structure of graphs and focusing on the essential components of search—states, actions, costs, and goals—we can develop algorithms that work not just for road networks or computer networks, but for planning, scheduling, game playing, automated reasoning, and countless other domains.
The key insight that enables this generalization is the recognition that Dijkstra's algorithm is fundamentally an instance of Best-First Search, where the "best" node to expand next is determined solely by the cost incurred so far. By modifying this evaluation function to incorporate estimates of future costs (heuristics), we unlock a spectrum of algorithms that trade off between optimality, completeness, and computational efficiency.
Problem Modeling
State Space Search 6-Tuple
To generalize beyond graphs, we need a formal framework for describing search problems. A state space search problem is defined by the 6-tuple:
P = ⟨S, A, s_0, T, c, S_G⟩
Where:
- S is a finite or countably infinite set of states
- A(s) ⊆ A is the set of actions available in state s
- s_0 ∈ S is the initial state
- T: S × A → S is the transition function, where T(s, a) gives the state resulting from applying action a in state s
- c: S × A → ℝ≥0 is the cost function, giving the non-negative cost of applying action a in state s
- S_G ⊆ S is the set of goal states
A solution is a sequence of actions a_1, a_2, ..., a_k that transforms s_0 into some s_g ∈ S_G, with total cost ∑_{i=1}^{k} c(s_{i−1}, a_i), where s_i = T(s_{i−1}, a_i).
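This 6-tuple translates directly into a small programming interface. Below is a minimal Python sketch; the attribute and method names (initial_state, goal_states, actions, transition, cost) are my own choice of convention, and the later code listings in this article follow the same one:

```python
from typing import Hashable, Iterable, Protocol

class SearchProblem(Protocol):
    """A state space search problem P = ⟨S, A, s_0, T, c, S_G⟩."""
    initial_state: Hashable          # s_0
    goal_states: set                 # S_G (could also be an is_goal predicate)

    def actions(self, state) -> Iterable:    # A(s)
        ...

    def transition(self, state, action):     # T(s, a)
        ...

    def cost(self, state, action) -> float:  # c(s, a), non-negative
        ...
```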
Graph Shortest Path as a Special Case
The classical graph shortest path problem is a special case where:
- S = V (the vertices of the graph)
- A(s) = {(s, v) | (s, v) ∈ E} (the edges leaving s)
- s_0 is the source vertex
- T(s, a) = v, where a = (s, v)
- c(s, a) = w(s, v) (the edge weight)
- S_G is the target vertex (or set of targets)
This shows that Dijkstra's algorithm is just Best-First Search with f(n)=g(n), where g(n) is the cost from the start to node n.
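To make the reduction concrete, here is a minimal sketch wrapping an explicit weighted digraph (stored as an adjacency dict) in the SearchProblem interface sketched earlier; GraphProblem and its fields are illustrative names, and for simplicity an action is identified with the neighbor it leads to:

```python
class GraphProblem:
    """Wraps an explicit weighted digraph as a state space search problem.
    adjacency maps each vertex to a dict {neighbor: edge_weight}."""
    def __init__(self, adjacency, source, targets):
        self.adjacency = adjacency
        self.initial_state = source          # s_0: the source vertex
        self.goal_states = set(targets)      # S_G: the target vertex (or vertices)

    def actions(self, state):
        # An action "move to neighbor v" stands for the edge (state, v)
        return list(self.adjacency.get(state, {}))

    def transition(self, state, action):
        return action                        # T(s, (s, v)) = v

    def cost(self, state, action):
        return self.adjacency[state][action] # c(s, a) = w(s, v)
```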
Beyond Graphs: Examples of State Spaces
The power of this generalization becomes apparent when we consider non-graph domains:
Puzzle Solving: For the 8-puzzle
- S: All possible configurations of the 8 tiles
- A(s): Moves available in current configuration
- c(s,a): 1 for each move
- SG: The solved configuration
Path Planning: For a robot navigating
- S: Possible robot configurations (x,y,θ)
- A(s): Motion primitives
- c(s,a): Time/energy/distance
- SG: Target configuration
Resource Allocation:
- S: Allocations of resources to tasks
- A(s): Reallocation decisions
- c(s,a): Cost of reallocation
- SG: Complete, valid allocations
Nodes and Three Cost Functions
In state space search, we work with nodes rather than just states. A node represents a path to a state and contains additional information needed for the search. Each node n has three associated cost functions:
| Notation | Meaning | Mathematical Definition |
|---|---|---|
| g(n) | Cumulative actual cost | Cost of the path from the start state to the state of node n |
| h(n) | Heuristic estimate | Estimated minimum remaining cost from the state of node n to a goal |
| f(n) | Evaluation/sorting key | Algorithm-selected function: g, h, g + h, g + w·h, etc. |
Understanding g(n): The Path So Far
g(n) represents the actual cost incurred to reach the state represented by node n. It is computed by summing the costs along the path from the initial state:
g(n) = ∑_{i=1}^{k} c(s_{i−1}, a_i)
where s_0, s_1, ..., s_k = s(n) is the path of states from the start to node n.
Key properties:
- g(n) is exact and known
- g(n0)=0 for the initial node
- For non-negative costs, g(n) never decreases along a path
- In Dijkstra/UCS, f(n)=g(n)—we expand based purely on actual cost
Understanding h(n): The Heuristic Estimate
h(n) is a heuristic function that estimates the cost from the state of node n to the nearest goal state. It represents domain-specific knowledge that helps guide the search.
h(n) ≈ min_{s_g ∈ S_G} cost(s(n) → s_g)
Designing good heuristics:
- Admissible: h(n)≤h∗(n) where h∗(n) is the true optimal cost
- Consistent: h(n) ≤ c(s(n), a) + h(T(s(n), a)) for all actions a ∈ A(s(n))
- Informative: The closer h(n) is to h∗(n), the better the guidance
Common heuristic patterns:
- Distance metrics: Euclidean, Manhattan, Chebyshev distance for spatial problems
- Relaxation: Solve a simplified version of the problem (e.g., ignore obstacles)
- Pattern databases: Pre-computed costs for subproblems
- Landmark heuristics: Cost via intermediate "landmark" states
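To make the relaxation pattern concrete, here is a small sketch of two classic admissible 8-puzzle heuristics, assuming states are represented as 9-tuples in row-major order with 0 for the blank (Manhattan distance dominates misplaced tiles, so it gives better guidance):

```python
GOAL = (1, 2, 3, 4, 5, 6, 7, 8, 0)  # assumed goal layout, 0 = blank

def misplaced_tiles(state):
    """Relaxation: any tile may teleport to its goal cell in one move."""
    return sum(1 for tile, goal in zip(state, GOAL)
               if tile != 0 and tile != goal)

def manhattan_distance(state):
    """Relaxation: tiles may slide through each other."""
    total = 0
    for index, tile in enumerate(state):
        if tile == 0:
            continue
        goal_index = GOAL.index(tile)
        total += (abs(index // 3 - goal_index // 3)
                  + abs(index % 3 - goal_index % 3))
    return total
```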
Understanding f(n): The Evaluation Function
f(n) determines the order of node expansion in Best-First Search. Different choices of f(n) yield different algorithms with different properties:
f(n)=combination of g(n) and h(n)
The role of f(n):
- Balances exploitation (using known costs g(n)) vs exploration (using estimates h(n))
- Determines the search strategy: greedy, optimal, or trade-offs
- Controls which parts of the search space are explored first
Common f(n) definitions:
- f(n)=g(n): Dijkstra/UCS—always optimal
- f(n)=h(n): Greedy best-first—fast but not optimal
- f(n)=g(n)+h(n): A*—optimal with admissible heuristic
- f(n)=g(n)+w⋅h(n): Weighted A*—suboptimal but faster
Best-First Search: The Unified Algorithm
At its heart, every algorithm we discuss is an instance of Best-First Search (abbreviated BFS in this article, as in "Greedy BFS"; not to be confused with breadth-first search), which expands nodes in order of their f(n) value. Here's the unified algorithm:
```python
import heapq
from dataclasses import dataclass
from typing import Any, Optional

@dataclass
class Node:
    """A node is a path to a state, not the state itself."""
    state: Any
    g: float                        # actual cost from the start
    parent: Optional["Node"] = None

def reconstruct_path(node):
    """Follow parent pointers back to the start, then reverse."""
    path = []
    while node is not None:
        path.append(node.state)
        node = node.parent
    return list(reversed(path))

def best_first_search(problem, f):
    """Best-First Search with evaluation function f."""
    start_node = Node(state=problem.initial_state, g=0, parent=None)
    # Frontier: a min-heap ordered by f(n); the counter breaks ties
    counter = 0
    frontier = [(f(start_node), counter, start_node)]
    # Track reached states and their best g-values
    reached = {problem.initial_state: 0}
    while frontier:
        # Extract the node with minimum f-value
        _, _, node = heapq.heappop(frontier)
        # Skip stale entries whose path has since been improved
        if node.g > reached.get(node.state, float("inf")):
            continue
        # Goal test on expansion (not on generation)
        if node.state in problem.goal_states:
            return reconstruct_path(node)
        # Expand the node
        for action in problem.actions(node.state):
            child_state = problem.transition(node.state, action)
            child_g = node.g + problem.cost(node.state, action)
            # Keep the child only if it reaches a new state
            # or improves on the best known path to that state
            if child_state not in reached or child_g < reached[child_state]:
                reached[child_state] = child_g
                child = Node(state=child_state, g=child_g, parent=node)
                counter += 1
                heapq.heappush(frontier, (f(child), counter, child))
    return None  # No solution found
```
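Every algorithm discussed in this article is then a one-line specialization of best_first_search. A sketch (h is any heuristic defined on states; the default weight is an arbitrary illustrative value):

```python
# Each classical algorithm is best_first_search with a different f(n):
def ucs(problem):
    return best_first_search(problem, lambda n: n.g)

def astar(problem, h):
    return best_first_search(problem, lambda n: n.g + h(n.state))

def weighted_astar(problem, h, w=1.5):
    return best_first_search(problem, lambda n: n.g + w * h(n.state))

def greedy_bfs(problem, h):
    return best_first_search(problem, lambda n: h(n.state))
```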
Key Components Explained
1. Priority Queue (Frontier)
- Stores nodes to be explored, ordered by f(n)
- Different data structures: binary heap (O(log n) per operation), Fibonacci heap (O(1) amortized decrease-key)
- The choice of f(n) determines the search strategy
2. Reached/Closed Set
- Tracks states we've already visited
- Prevents infinite loops and redundant work
- For optimal algorithms, we store the best g-value for each state
3. Node Structure
Each node contains:
- `state`: The state in the state space
- `g`: Actual cost from the start
- `parent`: Pointer to the parent node (for path reconstruction)
- Additional fields: action, depth, etc.
The Monotonicity Condition
A crucial property for optimality is monotonicity (also called consistency):
f(parent)≤f(child)
When this holds, the first time we expand a node, we have found the optimal path to it. This is why:
- Dijkstra's algorithm (with f=g) is optimal for non-negative weights
- A* with a consistent heuristic expands each state at most once
- We can safely prune previously expanded states
Search Tree vs State Graph
It's important to understand that Best-First Search builds a search tree where:
- Each node represents a path to a state
- Multiple nodes can represent the same state (with different g-values)
- The tree explores the state space without building it explicitly
This differs from graph algorithms that work on an explicitly represented graph. In state space search, we typically:
- Generate neighbors on-the-fly
- Never build the complete state space (often infinite)
- Use the heuristic to guide exploration toward promising regions
Algorithm Instances: Different Evaluation Functions
By varying the evaluation function f(n), we obtain different search algorithms with distinct properties. Here's a comprehensive comparison:
Dijkstra / UCS
- f(n) definition: g(n)
- Properties: Complete + Optimal (for edge weights ≥ 0)
- Complexity: O((|V| + |E|) log |V|) with binary heap
- Key characteristics: Expands nodes in order of actual cost from start
A*
- f(n) definition: g(n)+h(n)
- Properties: Complete + Optimal (if h is admissible)
- Complexity: O((|V| + |E|) log |V|) with binary heap
- Key characteristics: Uses heuristic to guide toward goal; optimal with admissible heuristic
Weighted A*
- f(n) definition: g(n)+w⋅h(n),w>1
- Properties: Complete, suboptimal (≤ w times optimal)
- Complexity: O((|V| + |E|) log |V|) with binary heap
- Key characteristics: Speeds up search by sacrificing optimality; common in real-time planning
Greedy BFS
- f(n) definition: h(n)
- Properties: Complete (finite state space), not optimal
- Complexity: O(b^d) in the worst case, where b is the branching factor and d the search depth
- Key characteristics: Expands fewest nodes; fast but no optimality guarantees
ε-consistent A*
- f(n) definition: max{g(n)+h(n),(1+ε)g(n)}
- Properties: Complete, (1+ε)-optimal
- Complexity: O((|V| + |E|) log |V|) with binary heap
- Key characteristics: Theoretical-practical compromise; bounds suboptimality
Dijkstra / Uniform Cost Search (UCS)
When to use: When you need guaranteed optimality and have no domain knowledge for heuristics.
Behavior:
- Expands nodes in order of increasing path cost from start
- Like water filling a basin—explores uniformly in all directions
- Optimal for non-negative edge weights
- Can be slow for large state spaces
```python
# Dijkstra/UCS is just Best-First Search with:
f = lambda n: n.g
```
A* Search
When to use: When you have a good admissible heuristic and need optimality.
Behavior:
- Balances actual cost (g) with estimated remaining cost (h)
- Focuses search toward promising regions
- With admissible heuristic: expands fewer nodes than UCS while remaining optimal
- The "gold standard" for optimal heuristic search
Key insight: f(n)=g(n)+h(n) estimates the total cost of a solution going through node n. By expanding nodes with minimum f(n), A* focuses on the most promising complete paths.
Weighted A*
When to use: When optimality can be sacrificed for speed.
Behavior:
- For w>1, overweights the heuristic, making the search more "greedy"
- Finds solutions faster but with suboptimality bound of w
- As w→∞, behavior approaches Greedy BFS
- Common in real-time applications where quick decisions are needed
Practical note: Weighted A* is widely used in robotics and game AI where:
- Planning time is limited
- Near-optimal solutions are acceptable
- The environment is dynamic
Greedy Best-First Search
When to use: When any solution is acceptable and speed is critical.
Behavior:
- Ignores path cost so far; focuses purely on heuristic
- Can find solutions very quickly in some cases
- May find very long, expensive paths
- No optimality guarantees
Example: In route planning with straight-line distance heuristic:
- Greedy BFS might take a tiny road directly toward the goal
- Missing a nearby highway that would be much faster
Epsilon-Consistent A*
When to use: When you want bounded suboptimality with better performance than weighted A*.
Behavior:
- Guarantees solutions within (1+ε) of optimal
- Often performs better than weighted A* for the same suboptimality bound
- Uses a more sophisticated evaluation function
Visualizing Search Behavior
Consider searching in a 2D grid with the goal in the northeast corner:
```
Start -> . . . . . . . . . . . . . . .
         . # # # # . . . . . . . . . .
         . . . . # . . . . . . . . . .
         . . . . # . . . . . . . . . .
         . . . . # . . . . . . . . . .
         . . . . # # # # # . . . . . .
         . . . . . . . . . . . . . . .
         . . . . . . . . . . . . . . .
         . . . . . . . . . . . . . . .
         . . . . . . . . . . . . . . G
```
- UCS: Expands in a diamond pattern centered at start
- Greedy BFS: Expands in a line directly toward goal
- A*: Expands in an ellipse focused on the optimal path
- Weighted A*: More focused ellipse toward goal
This visualization helps understand how different f(n) functions shape the search frontier.
Key Abstractions: From Graphs to General Search
The generalization from graph algorithms to state space search involves several crucial conceptual shifts. Understanding these abstractions is key to mastering heuristic search algorithms.
Node ≠ State: A Critical Distinction
In graph algorithms, we typically work with vertices. In state space search, we work with nodes, where:
- A state is a configuration of the world
- A node represents a path to a state, including:
- The state itself
- The cost to reach it (g-value)
- Parent pointer for path reconstruction
- Other metadata (depth, action taken, etc.)
Why this matters: Multiple nodes can represent the same state with different costs:
```
Path 1: S → A → B → C   (cost: 10)
Path 2: S → D → E → C   (cost: 8)
```
Both end at state C, but represent different paths.
This distinction allows us to:
- Find optimal paths (by keeping the best g-value)
- Handle state spaces with cycles
- Support anytime algorithms that improve solutions
The Power of Heuristics
Heuristics are what make state space search practical for large problems. They provide domain-specific knowledge to guide the search.
Where do heuristics come from?
Problem Relaxation: Remove constraints to get an easier problem
- 8-puzzle: Allow tiles to move through each other
- Route planning: Ignore obstacles
- The relaxed problem's optimal cost is a valid heuristic
Pattern Databases: Pre-compute costs for subproblems
- Store costs for solving subsets of tiles in 8-puzzle
- Combine multiple pattern databases for better estimates
Landmarks: Identify intermediate states that solutions must visit
- "Must pass through checkpoint A"
- Sum costs to and from landmarks
Distance Metrics: For spatial problems
- Euclidean distance: √((x_2 − x_1)² + (y_2 − y_1)²)
- Manhattan distance: |x_2 − x_1| + |y_2 − y_1|
- Chebyshev distance: max(|x_2 − x_1|, |y_2 − y_1|)
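A direct sketch of the three metrics for 2D points, as used in grid and map heuristics:

```python
import math

def euclidean(p, q):
    """Straight-line distance; admissible for free movement in the plane."""
    return math.hypot(q[0] - p[0], q[1] - p[1])

def manhattan(p, q):
    """Admissible for 4-connected grids (no diagonal moves)."""
    return abs(q[0] - p[0]) + abs(q[1] - p[1])

def chebyshev(p, q):
    """Admissible for 8-connected grids (diagonal moves cost 1)."""
    return max(abs(q[0] - p[0]), abs(q[1] - p[1]))
```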
The heuristic design trade-off:
- Too loose: Poor guidance, search behaves like UCS
- Too tight (overestimates): May miss optimal solutions
- Just right: Tight admissible heuristic → optimal with few expansions
The Reached/Closed Set: Managing Visited States
The `reached` set (or `closed` list in some implementations) is crucial for efficiency and correctness.
What it stores:
- For each state: the best g-value found so far
- Allows us to detect when we've found a better path to a state
When to update:
- When we find a path with lower cost to an already-visited state
- This happens naturally in Best-First Search with the check:
```python
if child_state not in reached or child_g < reached[child_state]:
    ...  # Found a better path!
```
For non-optimal algorithms:
- Greedy search might still want to avoid revisiting states
- But could miss shorter paths by doing so
- Trade-off between completeness and efficiency
Search Space Generation: On-the-Fly Expansion
Unlike graph algorithms that work on explicit graphs, state space search typically:
- Generates neighbors on-demand
- Never builds the complete state space
- Can handle infinite or extremely large state spaces
Example: Chess has about 10^46 legal positions
- Impossible to store explicitly
- But A* can search toward checkmate with good heuristics
This lazy approach makes many complex problems tractable that would be impossible with explicit representation.
The Frontier as a "Wave" of Exploration
Think of the frontier as a wave propagating through the state space:
- The shape of the wave depends on f(n)
- UCS: Circular wave (uniform in all directions)
- A*: Elliptical wave stretched toward goal
- Greedy: Narrow beam focused on goal
This wave metaphor helps understand:
- Why some algorithms find solutions faster
- How heuristics focus the search
- The relationship between exploration and exploitation
Theoretical Properties: Understanding the Guarantees
To use these algorithms effectively, we need to understand their theoretical guarantees. When can we trust that an algorithm will find a solution? When will it be optimal?
Admissible vs Consistent Heuristics
Two key properties determine the behavior of heuristic search algorithms:
Admissible Heuristic: Never overestimates the true cost to goal
0 ≤ h(n) ≤ h*(n)  for all n
Where h∗(n) is the true optimal cost from state n to goal.
Consistent Heuristic (also called monotonic): Satisfies the triangle inequality
h(n) ≤ c(s(n), a) + h(T(s(n), a))  for all n, a
Key relationships:
- Consistent ⇒ Admissible (but not vice versa)
- Consistent heuristics are "locally admissible"
- For consistent heuristics: f(n) is non-decreasing along any path
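For small problems, both properties can be verified by brute force. A minimal sketch, assuming the problem additionally exposes an enumerable problem.states collection (not part of the 6-tuple interface above) and a true_cost oracle, e.g. computed by running UCS backwards from the goals:

```python
def check_heuristic(problem, h, true_cost):
    """Brute-force admissibility and consistency check on a small state space."""
    admissible = all(h(s) <= true_cost(s) for s in problem.states)
    consistent = all(
        h(s) <= problem.cost(s, a) + h(problem.transition(s, a))
        for s in problem.states
        for a in problem.actions(s)
    )
    return admissible, consistent
```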
Why Consistency Matters
Consistent heuristics give us important properties:
- Optimality of A*: With a consistent heuristic, A* is optimal
- Efficiency: Each state is expanded at most once
- Monotonicity: The f-values never decrease along a path
- Pruning: We can safely ignore previously expanded states
Example: Manhattan distance for grid navigation is consistent:
- Moving one step changes Manhattan distance by exactly 1
- So h(n)≤1+h(neighbor) holds
Non-consistent but admissible example: Straight-line distance to goal
- Admissible (never overestimates)
- But may violate consistency if obstacles exist
Completeness and Optimality
| Algorithm | Complete? | Optimal? | Conditions |
|---|---|---|---|
| UCS/Dijkstra | Yes (finite S, c ≥ 0) | Yes (c ≥ 0) | Non-negative costs |
| A* | Yes (finite S, c ≥ 0) | Yes | h admissible |
| A* | Yes (finite S, c ≥ 0) | Yes | h consistent |
| Weighted A* | Yes | No | Within w of optimal |
| Greedy BFS | Yes (finite S) | No | - |
Completeness: Will the algorithm find a solution if one exists?
- For finite state spaces with non-negative costs: all our algorithms are complete
- For infinite spaces: depends on the heuristic and branching factor
Optimality: Will the algorithm find the lowest-cost solution?
- Only UCS and A* (with admissible heuristic) guarantee optimality
- Others sacrifice optimality for speed
Complexity Analysis
The time complexity of Best-First Search depends on:
- The number of nodes expanded
- The cost of priority queue operations
Worst-case complexity (binary heap):
- O((|S| + |E|) log |S|) for UCS/A*
- Same as Dijkstra's algorithm
But—this worst-case rarely matters in practice!
- Good heuristics can reduce expanded nodes exponentially
- In the best case, A* expands only nodes along the optimal path
Space complexity:
- O(∣S∣) for the priority queue
- Can be prohibitive for large problems
- This led to development of iterative-deepening variants
The A* Optimality Proof
Why is A* optimal with an admissible heuristic? Here's the intuition:
- Suppose A* returns a suboptimal solution with cost C > C*, where C* is the optimal cost
- At that moment, some node n on an optimal path must still sit unexpanded on the frontier
- Since n lies on an optimal path, g(n) equals the optimal cost to its state, and since h is admissible: f(n) = g(n) + h(n) ≤ g(n) + h*(n) = C*
- The returned goal node G has f(G) = g(G) = C > C*
- So f(n) ≤ C* < f(G)
- Contradiction: A* expands nodes in order of f, so it would have expanded n before G
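The same argument as a chain of inequalities, with G the returned goal node and n the unexpanded node on an optimal path:

```latex
% A* optimality: n unexpanded on an optimal path, G a suboptimal goal
\begin{aligned}
f(n) &= g(n) + h(n) \\
     &\le g(n) + h^{*}(n) && \text{($h$ admissible)} \\
     &= C^{*} && \text{($n$ lies on an optimal path)} \\
     &< C = g(G) = f(G) && \text{($G$ suboptimal, $h(G) = 0$)}
\end{aligned}
```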
Practical Implications
For algorithm selection:
- Need guaranteed optimality? Use A* with admissible heuristic
- No good heuristic? Use UCS/Dijkstra
- Speed more important than optimality? Use weighted A* or greedy
- Have consistent heuristic? Can use efficient implementations
For heuristic design:
- Start with admissible heuristics (guarantees optimality)
- If possible, make them consistent (improves efficiency)
- Consider multiple admissible heuristics and take their maximum
- Remember: tighter bounds → fewer expansions
Understanding these theoretical foundations helps:
- Choose the right algorithm for your problem
- Design effective heuristics
- Debug when algorithms don't behave as expected
- Prove properties of new algorithms you develop
Further Generalizations: Beyond Basic A*
The Best-First Search framework can be extended in many directions to handle more complex scenarios. Here are some important generalizations:
Weighted / Anytime A*
Problem: Basic A* finds optimal solutions but might be slow
Solution: Dynamically adjust the weight w to find good solutions quickly, then improve them
```python
import math

def anytime_astar(problem, initial_weight=5.0, decay_rate=0.95):
    """Anytime A* with a decreasing weight."""
    weight = initial_weight
    best_solution = None
    best_cost = math.inf
    while True:
        # Run weighted A* with the current weight
        # (weighted_astar is assumed to return a solution with a .cost field)
        solution = weighted_astar(problem, weight)
        if solution and solution.cost < best_cost:
            best_solution = solution
            best_cost = solution.cost
            yield best_solution  # hand back each improved solution
        # Decrease the weight for the next iteration
        weight *= decay_rate
        if weight <= 1.0:
            break  # weight has reached 1: plain A* territory
    # Final run with optimal (unweighted) A*
    yield astar(problem)
```
Applications:
- Real-time systems that need quick initial solutions
- Problems where solution quality can improve over time
- Situations with uncertain time constraints
Multi-Objective Search
Problem: Real problems often have multiple competing objectives
- Fastest route vs safest route vs most scenic route
- Cost vs time vs environmental impact
Solution: Find Pareto-optimal solutions—solutions where you can't improve one objective without worsening another
Approaches:
- Scalarization: Combine objectives: f(n) = w_1·g_1(n) + w_2·g_2(n) + ...
- Pareto A*: Maintain multiple frontiers, one for each Pareto point
- Lexicographic ordering: Optimize objectives in priority order
```python
def multi_objective_astar(problem, objectives):
    """Multi-objective A* over a Pareto frontier (sketch; PriorityQueue and
    the helper functions are assumed, not defined here)."""
    frontier = PriorityQueue()  # may need one queue per objective
    frontier.add(make_start_node(problem))  # assumed helper
    pareto_frontier = {}        # state -> set of non-dominated cost vectors
    pareto_solutions = []       # non-dominated goal nodes found so far
    while frontier:
        node = frontier.pop()
        if is_goal(node.state):
            add_to_pareto_set(pareto_solutions, node)
            continue            # keep searching for other Pareto points
        for child in expand(node):
            if not is_dominated(child.costs, pareto_frontier):
                frontier.add(child)
                update_pareto_frontier(pareto_frontier, child)
    return pareto_solutions
```
Probabilistic and Risk-Sensitive Search
Problem: Costs may be uncertain or stochastic
- Travel times with traffic
- Action success probabilities
- Risk-averse vs risk-seeking behavior
Solution: Modify evaluation function to account for uncertainty
Approaches:
- Expected utility: f(n)=E[g(n)]+E[h(n)]
- Risk-sensitive: f(n)=E[g(n)]+λ⋅Var[g(n)]
- Chance-constrained: Find solutions that satisfy constraints with probability p
- Robust optimization: Minimize worst-case cost
Example: In route planning with uncertain travel times:
- Risk-averse: Prefer routes with predictable times
- Risk-neutral: Minimize expected time
- Risk-seeking: Take gambles on fast routes
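As a minimal sketch of the risk-sensitive variant, suppose the cost-to-come is represented by Monte Carlo samples rather than a single number (risk_sensitive_f and its parameters are illustrative, not a standard API):

```python
import statistics

def risk_sensitive_f(g_samples, h_estimate, lam=0.5):
    """f(n) = E[g(n)] + λ·Var[g(n)] + h(n); λ > 0 penalizes unpredictable paths."""
    mean_g = statistics.mean(g_samples)
    var_g = statistics.variance(g_samples) if len(g_samples) > 1 else 0.0
    return mean_g + lam * var_g + h_estimate
```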
Hierarchical A* (HA*)
Problem: Large state spaces with structure at multiple levels
- Navigation across cities vs streets
- Game playing with high-level and low-level moves
Solution: Search at multiple levels of abstraction
Process:
- Abstract level: Plan using coarse-grained representation
- Refinement: Refine abstract plan at detailed level
- Backtracking: If refinement fails, backtrack to abstract level
```python
def hierarchical_astar(problem, abstraction_map):
    """Hierarchical A* with multiple abstraction levels (sketch; the
    abstraction helpers are assumed, not defined here)."""
    # Plan at the highest level of abstraction
    abstract_plan = astar(abstract_problem)
    detailed_plans = []
    for abstract_step in abstract_plan:
        # Refine each abstract step at the detailed level
        detailed_plan = astar(refined_problem(abstract_step))
        if not detailed_plan:
            # Failed to refine: backtrack and replan at a coarser level
            return hierarchical_astar(problem, next_abstraction())
        detailed_plans.append(detailed_plan)
    return combine_plans(detailed_plans)
```
Bidirectional Search
Problem: Searching from start to goal might explore many irrelevant nodes
Solution: Search simultaneously from both start and goal
Challenge: How to meet in the middle?
- Need consistent heuristics in both directions
- Must handle when frontiers intersect
Best for:
- Undirected graphs
- Problems with symmetric structure
- When good heuristics exist in both directions
Real-Time Search
Problem: Must make decisions within time limits
- Games with time controls
- Robotics with sensor updates
- Interactive applications
Solutions:
- Time-bounded A*: Return best solution found when time expires
- RTA* (Real-Time A*): Commit to first move, then replan
- LRTA* (Learning Real-Time A*): Learn heuristics while exploring
```python
import math

def real_time_astar(problem, time_limit):
    """Real-time A*: fixed time budget per decision (sketch;
    lookahead_search is an assumed bounded-lookahead helper)."""
    current_state = problem.initial_state
    while current_state not in problem.goal_states:
        actions = list(problem.actions(current_state))
        budget = time_limit / len(actions)  # split the budget across actions
        best_action = None
        best_value = math.inf
        for action in actions:
            # Limited lookahead from this action's successor
            value = lookahead_search(current_state, action, budget)
            if value < best_value:
                best_value = value
                best_action = action
        # Commit to the best action found, then replan from the new state
        current_state = problem.transition(current_state, best_action)
    return current_state
```
Memory-Bounded Search
Problem: A* can use excessive memory for large problems
Solution: Algorithms that limit memory usage
Approaches:
- SMA* (Simplified Memory-Bounded A*): Drop worst node when memory full
- IDA* (Iterative-Deepening A*): Use depth-first with increasing f-cost limits
- RBFS (Recursive Best-First Search): Recursive search with backtracking
Trade-off: Less memory usage, but may re-expand nodes
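Of the three, IDA* is compact enough to sketch in full against the SearchProblem interface from earlier (h is a heuristic on states); it keeps only the current path in memory, at the price of re-expanding nodes across iterations:

```python
def ida_star(problem, h):
    """IDA*: iterative deepening on the f = g + h cost bound (sketch)."""
    def dfs(state, g, bound, path):
        f = g + h(state)
        if f > bound:
            return f, None               # report the smallest overshoot
        if state in problem.goal_states:
            return f, list(path)
        minimum = float("inf")
        for action in problem.actions(state):
            child = problem.transition(state, action)
            if child in path:            # avoid cycles along the current path
                continue
            path.append(child)
            t, solution = dfs(child, g + problem.cost(state, action), bound, path)
            path.pop()
            if solution is not None:
                return t, solution
            minimum = min(minimum, t)
        return minimum, None

    bound = h(problem.initial_state)
    path = [problem.initial_state]
    while True:
        bound, solution = dfs(problem.initial_state, 0, bound, path)
        if solution is not None:
            return solution
        if bound == float("inf"):
            return None                  # no solution exists
```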
These generalizations show the flexibility of the Best-First Search framework. By modifying the evaluation function, adding constraints, or changing the search strategy, we can adapt to a wide variety of real-world problems.
Experimental Evaluation: Metrics and Methods
How do we know which algorithm performs best for our problem? Experimental evaluation helps us understand the practical performance of search algorithms beyond theoretical guarantees.
Key Performance Metrics
1. Nodes Expanded
- The most common metric in academic papers
- Directly relates to computational complexity
- Counts how many states were actually processed
- Independent of implementation details
2. Runtime
- What users actually care about
- Depends on:
- Implementation efficiency
- Data structure choices
- Hardware
- Constant factors
3. Solution Quality
- For suboptimal algorithms: how close to optimal?
- Suboptimality ratio: found cost / optimal cost
- Important for weighted A*, greedy search, etc.
4. Memory Usage
- Maximum size of frontier/closed set
- Critical for large problems
- Varies significantly between algorithms
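A minimal measurement harness covering runtime, memory, and solution quality might look like the following sketch; it assumes search_fn(problem) returns a path (a list of states) or None, as in the best_first_search listing earlier, and counting expanded nodes would additionally require a counter inside the search loop itself:

```python
import time
import tracemalloc

def evaluate(search_fn, problem):
    """One measured run: wall-clock time, peak Python heap, solution length."""
    tracemalloc.start()
    start = time.perf_counter()
    solution = search_fn(problem)
    runtime = time.perf_counter() - start
    _, peak = tracemalloc.get_traced_memory()
    tracemalloc.stop()
    return {
        "runtime_s": runtime,
        "peak_memory_bytes": peak,
        "solved": solution is not None,
        "solution_length": len(solution) if solution else None,
    }
```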
Generating Problem Instances
Random Problems:
- Random graphs with various topologies
- Random instances of puzzles (8-puzzle, Rubik's cube)
- Random scheduling/planning problems
Real-World Benchmarks:
- Road networks from OpenStreetMap
- Game AI scenarios
- Industrial planning problems
- Standardized test sets
Structured Problems:
- Problems with known optimal solutions
- Problems that stress specific aspects:
- Dead ends
- Multiple paths
- Symmetry
- Bottlenecks
Data Structure Impact
The choice of priority queue implementation significantly affects performance:
| Data Structure | Insert | Extract-Min | Decrease-Key | Best For |
|---|---|---|---|---|
| Binary heap | O(log n) | O(log n) | O(log n) | General use |
| Fibonacci heap | O(1) amortized | O(log n) amortized | O(1) amortized | Large graphs |
| Bucket queue | O(1) | O(1) | O(1) | Small integer costs |
| Pairing heap | O(1) | O(log n) | O(log log n) amortized | Good in practice |
Practical advice:
- Binary heaps are simple and often good enough
- Fibonacci heaps have better asymptotics but large constants
- For special cases (small integer costs), bucket queues excel
Heuristic Quality Analysis
Heuristic Error Distribution:
- Mean error: (1/|S|) ∑_{s∈S} |h(s) − h*(s)|
- Maximum error: max_{s∈S} |h(s) − h*(s)|
- Correlation with actual optimal cost
Effective Branching Factor:
b* = (nodes expanded / solution depth)^(1 / solution depth)
- Measures how focused the search is
- Closer to 1 is better
- UCS: b∗ = actual branching factor
- A*: b∗ depends on heuristic quality
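This version of b* is straightforward to compute directly (a sketch of the ratio formula above; note that some textbooks instead define b* implicitly as the solution of N + 1 = 1 + b* + (b*)² + ... + (b*)^d):

```python
def effective_branching_factor(nodes_expanded, solution_depth):
    """b* = (nodes expanded / solution depth)^(1 / solution depth)."""
    return (nodes_expanded / solution_depth) ** (1.0 / solution_depth)
```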
Visualization Techniques
Search Space Visualization:
- 2D/3D plots of expanded nodes
- Color-coding by g, h, or f values
- Animation showing search progression
Performance Profiles:
- Runtime vs problem size
- Nodes expanded vs solution quality
- Memory usage over time
Comparative Analysis:
- Scatter plots comparing two algorithms
- Speedup ratios
- Pareto fronts for multi-objective cases
Statistical Analysis
Multiple Runs:
- Average performance across problem instances
- Standard deviations and confidence intervals
- Statistical significance testing
Scaling Analysis:
- How does performance change with problem size?
- Polynomial vs exponential scaling
- Identify phase transitions
Benchmark Suites
Standard Collections:
- DIMACS: Graph algorithms challenges
- IPC: International Planning Competition
- TPTP: Automated theorem proving
- SSSP: Single-source shortest path instances
Creating Your Own Benchmarks:
- Ensure variety in difficulty
- Include both easy and hard instances
- Document properties and known solutions
- Make them publicly available
Common Pitfalls
- Overfitting: Optimizing for specific test cases
- Implementation bias: Comparing optimized vs naive implementations
- Hardware effects: Not accounting for cache/memory hierarchy
- Insufficient sampling: Testing on too few instances
- Ignoring constants: Asymptotic analysis doesn't tell the whole story
Tools and Frameworks
Search Algorithm Libraries:
- SearchLib (C++)
- AI4R (Ruby)
- SimpleAI (Python)
- Various university course projects
Visualization Tools:
- Graphviz for search trees
- Matplotlib/Seaborn for performance plots
- Custom web-based visualizers
Benchmarking Frameworks:
- Custom scripts with timing utilities
- Statistical analysis packages (R, pandas)
- Continuous integration for regression testing
Good experimental methodology helps:
- Choose the right algorithm for your problem
- Understand why algorithms behave as they do
- Identify opportunities for improvement
- Compare your work fairly with others
Summary
We have explored the elegant generalization from Dijkstra's algorithm to a unified framework of state space search algorithms. The key insights are:
Dijkstra = UCS = Best-First with f = g
At its core, Dijkstra's algorithm is simply Best-First Search where the evaluation function f(n) is just the cumulative cost g(n) from the start.
The Power of Abstraction
By moving from explicit graphs to state spaces defined by the 6-tuple ⟨S, A, s_0, T, c, S_G⟩, we can apply these algorithms to vastly different problem domains:
- Path finding in road networks
- Puzzle solving (8-puzzle, Rubik's cube)
- Planning and scheduling
- Game playing and automated reasoning
Heuristics: Domain Knowledge in Action
The introduction of heuristic functions h(n) transforms uniform search into guided exploration:
- f(n)=g(n): Dijkstra/UCS—optimal but unfocused
- f(n)=g(n)+h(n): A*—optimal with focused search
- f(n)=g(n)+w⋅h(n): Weighted A*—faster, bounded suboptimality
- f(n)=h(n): Greedy search—fast but no optimality guarantees
Theoretical Foundations Matter
Understanding properties like admissibility and consistency helps us:
- Guarantee optimality when needed
- Design effective heuristics
- Choose appropriate algorithms for specific problems
- Prove correctness of new algorithms
Beyond the Basics
The framework extends to handle:
- Multiple objectives (Pareto optimality)
- Uncertainty and risk sensitivity
- Real-time constraints
- Memory limitations
- Hierarchical decomposition
This unified view shows that seemingly different algorithms are all instances of the same fundamental idea—exploring states in order of some evaluation function. By choosing different evaluation functions, we trade off between optimality, completeness, speed, and memory usage to match our specific needs.