Exploratory Data Analysis

Date published: 26/09/23

bin.eda.calculate_metrics(G: Graph, output_dir: str) → dict[float]

Calculate graph metrics.

Args:
G (nx.Graph):

The graph.

output_dir (str):

The output directory for the visualisation.

Returns:
dict[float]:

The dictionary of metrics.
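The reference above does not list which metrics are computed. As a rough illustration only, the sketch below gathers a few common NetworkX summary statistics into a dictionary. The function name `calculate_metrics_sketch`, the choice of metrics, and the dictionary keys are all assumptions, and the sketch leaves out `output_dir` because its use is not described here.

```python
import networkx as nx


def calculate_metrics_sketch(G: nx.Graph) -> dict[str, float]:
    """Hypothetical stand-in; the metrics chosen here are assumptions."""
    return {
        "n_nodes": float(G.number_of_nodes()),
        "n_edges": float(G.number_of_edges()),
        # Fraction of possible edges that are present.
        "density": nx.density(G),
        # Mean local clustering coefficient over all nodes.
        "average_clustering": nx.average_clustering(G),
    }


# A 5-node path graph: 4 edges, no triangles.
G = nx.path_graph(5)
metrics = calculate_metrics_sketch(G)
print(metrics)
```

A flat dictionary of floats like this can be passed on to an experiment tracker without further conversion.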

bin.eda.construct_network(edge_list: DataFrame, from_col: str, to_col: str, len_component: int = 5) → Graph

Construct a graph from edge list data.

Args:
edge_list (pd.DataFrame):

The edge list.

from_col (str):

The “from” column name.

to_col (str):

The “to” column name.

len_component (int, optional):

The minimum size of a connected subgraph to keep; smaller subgraphs are filtered out. Defaults to 5.

Returns:
nx.Graph:

The constructed graph.
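One plausible way to build such a graph is sketched below with `nx.from_pandas_edgelist` and `nx.connected_components`. This is not the module's actual code: the name `construct_network_sketch` is hypothetical, the graph is assumed to be undirected, and `len_component` is assumed to mean that connected components with fewer nodes than this value are dropped. The column names in the example data are also made up.

```python
import networkx as nx
import pandas as pd


def construct_network_sketch(
    edge_list: pd.DataFrame,
    from_col: str,
    to_col: str,
    len_component: int = 5,
) -> nx.Graph:
    """Hypothetical stand-in for bin.eda.construct_network."""
    # Build an undirected graph with one edge per row of the edge list.
    G = nx.from_pandas_edgelist(edge_list, source=from_col, target=to_col)

    # Assumption: components with fewer than `len_component` nodes are dropped.
    keep = [c for c in nx.connected_components(G) if len(c) >= len_component]
    nodes = set().union(*keep) if keep else set()
    return G.subgraph(nodes).copy()


# Example data: a 5-node chain (A..E) plus an isolated pair (X, Y).
edges = pd.DataFrame(
    {
        "regulator": ["A", "B", "C", "D", "X"],
        "target": ["B", "C", "D", "E", "Y"],
    }
)
G = construct_network_sketch(edges, from_col="regulator", to_col="target")
print(sorted(G.nodes))  # ['A', 'B', 'C', 'D', 'E']
```

With the default threshold of 5, the two-node component is removed and only the chain remains.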

bin.eda.log_results(tracking_uri: str, experiment_prefix: str, grn_name: str, edge_list_file: str, network_plot: str, metrics: dict[float]) → None

Log experiment results to the experiment tracker.

Args:
tracking_uri (str):

The tracking URI.

experiment_prefix (str):

The experiment name prefix.

grn_name (str):

The name of the GRN.

edge_list_file (str):

The name of the edge list file.

network_plot (str):

The path to the network plot to add as an artifact.

metrics (dict[float]):

The dictionary of metrics.

bin.eda.main(config: DictConfig) → None

The main entry point for the plotting pipeline.

Args:
config (DictConfig):

The pipeline configuration.

bin.eda.visualize_graph(G: Graph, output_dir: str) → str

Visualise the graph.

Args:
G (nx.Graph):

The graph.

output_dir (str):

The output directory for the visualisation.

Returns:
str:

The output directory for the visualisation.
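A minimal sketch of what a function with this signature might do, assuming Matplotlib is used for drawing. The function name `visualize_graph_sketch`, the spring layout, the styling, and the file name `network.png` are all assumptions; only the arguments and the returned output directory follow the reference above.

```python
import os
import tempfile

import matplotlib

matplotlib.use("Agg")  # headless backend, no display needed
import matplotlib.pyplot as plt
import networkx as nx


def visualize_graph_sketch(G: nx.Graph, output_dir: str) -> str:
    """Hypothetical stand-in for bin.eda.visualize_graph."""
    os.makedirs(output_dir, exist_ok=True)
    fig, ax = plt.subplots(figsize=(6, 6))
    pos = nx.spring_layout(G, seed=0)  # fixed seed for a reproducible layout
    nx.draw_networkx(G, pos=pos, ax=ax, node_size=300, font_size=8)
    ax.set_axis_off()
    # "network.png" is an assumed file name, not taken from the module.
    fig.savefig(os.path.join(output_dir, "network.png"), dpi=150)
    plt.close(fig)
    return output_dir


# Draw a small path graph into a temporary directory.
out_dir = visualize_graph_sketch(nx.path_graph(5), tempfile.mkdtemp())
print(os.listdir(out_dir))
```

Closing the figure after saving avoids accumulating open figures when the function is called repeatedly.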