This notebook compares the runtime of causal identification using the vanilla backdoor search against the optimized backdoor search and demonstrates the performance gain from the latter.
import time
import random
import numpy as np
import pandas as pd
import networkx as nx
import dowhy
from dowhy import CausalModel
from dowhy.utils import graph_operations
import dowhy.datasets
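In this section, we create a random directed graph with n = 10 nodes, where each possible edge is included with probability p = 0.5, along with an empty DataFrame whose columns are the node names.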
n = 10
p = 0.5
graph = nx.generators.random_graphs.fast_gnp_random_graph(n, p, directed=True)
# Node labels must be strings to match the DataFrame columns and the DOT graph.
nodes = [str(i) for i in graph.nodes]
# nx.to_numpy_matrix was removed in NetworkX 3.0; to_numpy_array returns an ndarray directly.
adjacency_matrix = nx.to_numpy_array(graph)
graph_dot = graph_operations.adjacency_matrix_to_graph(adjacency_matrix, nodes)
graph_dot = graph_operations.str_to_dot(graph_dot.source)
print("Graph Generated.")
df = pd.DataFrame(columns=nodes)
print("Dataframe Generated.")
Graph Generated.
Dataframe Generated.
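To make the role of `n` and `p` concrete, here is a minimal pure-Python sketch of the directed G(n, p) model: every ordered pair of distinct nodes gets an edge independently with probability `p`. The helper name `random_directed_adjacency` and its `seed` parameter are hypothetical and used only for illustration; this shows the model itself, not how networkx implements `fast_gnp_random_graph`, and it assumes self-loops are excluded.

```python
import random


def random_directed_adjacency(n, p, seed=None):
    """Return an n x n 0/1 adjacency matrix for a directed G(n, p) graph.

    Each ordered pair (i, j) with i != j gets an edge with probability p.
    Hypothetical helper for illustration only; self-loops are excluded.
    """
    rng = random.Random(seed)
    return [
        [1 if i != j and rng.random() < p else 0 for j in range(n)]
        for i in range(n)
    ]


matrix = random_directed_adjacency(n=10, p=0.5, seed=0)
num_edges = sum(map(sum, matrix))
print(f"{num_edges} edges out of {10 * 9} possible")
```

With p = 0.5 and n = 10, about half of the 90 possible directed edges are expected, so the resulting graphs are fairly dense.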
In this section, we compare the runtimes for causal identification using vanilla backdoor search and the optimized backdoor search.
start = time.time()
# I. Create a causal model from the data and given graph.
treatment, outcome = (str(i) for i in random.sample(range(n), 2))  # two distinct random nodes
model = CausalModel(data=df, treatment=treatment, outcome=outcome, graph=graph_dot)
time1 = time.time()
print("Time taken for initializing model =", time1-start)
# II. Identify causal effect and return target estimands
identified_estimand = model.identify_effect()
time2 = time.time()
print("Time taken for vanilla identification =", time2-time1)
# III. Identify causal effect using the optimized backdoor implementation
identified_estimand = model.identify_effect(optimize_backdoor=True)
end = time.time()
print("Time taken for optimized backdoor identification =", end-time2)
Time taken for initializing model = 0.07566142082214355
Time taken for vanilla identification = 6.404623508453369
Time taken for optimized backdoor identification = 1.3513822555541992
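In this run, the optimized backdoor search completed identification in about 1.35 s versus about 6.40 s for the vanilla implementation, roughly 4.7 times faster. Exact timings vary between runs because the graph, treatment, and outcome are chosen at random.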
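Because a single measurement can be noisy, one option is to repeat each call and compare median timings. The sketch below is a minimal, hedged example: `time_call` is a hypothetical helper (not part of DoWhy), and the two lambdas are stand-in workloads so that the block runs on its own. It also assumes that repeated calls do equivalent work each time.

```python
import statistics
import time


def time_call(func, repeats=5):
    """Return the median wall-clock time in seconds of `repeats` calls to func()."""
    durations = []
    for _ in range(repeats):
        start = time.perf_counter()
        func()
        durations.append(time.perf_counter() - start)
    return statistics.median(durations)


# Stand-in workloads (hypothetical). In the notebook these could be, e.g.:
#   lambda: model.identify_effect()
#   lambda: model.identify_effect(optimize_backdoor=True)
slow = time_call(lambda: sum(i * i for i in range(200_000)))
fast = time_call(lambda: sum(i * i for i in range(20_000)))
print(f"median slow = {slow:.4f} s, median fast = {fast:.4f} s")
```

The median is used rather than the mean so that a single unusually slow call (for example, due to a background process) does not dominate the comparison.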