Relationship Collector and Processor

This module collects and processes relationship data for different types of relationships using the RelationshipPropertiesExtractor and RelationshipDataProcessor classes.

The RelationshipsCollectorProcessor class is documented below.

Classes:

RelationshipsCollectorProcessor: Class to collect and process relationship data for different types of relationships.

Functions:

main: Main function to parse command-line arguments and collect relationship data for the specified type.

RelationshipsCollectorProcessor

A class to collect and process relationship data for different types of relationships using RelationshipPropertiesExtractor and RelationshipDataProcessor.

Attributes:

relationship_type (str): The type of relationship to collect data for.
data_file (str): The path to the data file containing relationship data.
gene_data (str): The path to the processed gene properties file.
start_chunk (int): The starting chunk index for processing data.
extractor (RelationshipPropertiesExtractor): An instance of RelationshipPropertiesExtractor.
processor (RelationshipDataProcessor): An instance of RelationshipDataProcessor.

Source code in chemgraphbuilder/relationship_collector_processor.py
class RelationshipsCollectorProcessor:
    """
    A class to collect and process relationship data for different types of relationships using RelationshipPropertiesExtractor and RelationshipDataProcessor.

    Attributes:
        relationship_type (str): The type of relationship to collect data for.
        data_file (str): The path to the data file containing relationship data.
        gene_data (str): The path to the processed gene properties file.
        start_chunk (int): The starting chunk index for processing data.
        extractor (RelationshipPropertiesExtractor): An instance of RelationshipPropertiesExtractor.
        processor (RelationshipDataProcessor): An instance of RelationshipDataProcessor.
    """

    def __init__(self, relationship_type, start_chunk=0):
        """
        Initializes the RelationshipsCollectorProcessor with the relationship type and start chunk index.
        The data file paths are fixed and resolved relative to the working directory.

        Args:
            relationship_type (str): The type of relationship to collect data for. One of 'Assay_Compound', 'Assay_Gene', 'Gene_Protein', 'Compound_Gene', 'Compound_Similarity', 'Compound_Compound_Cooccurrence', 'Compound_Gene_Cooccurrence', 'Compound_Gene_Interaction', 'Compound_Transformation'.
            start_chunk (int): The starting chunk index for processing data.
        """
        self.relationship_type = relationship_type
        self.data_file = "Data/AllDataConnected.csv"
        self.gene_data = "Data/Nodes/Gene_Properties_Processed.csv"
        self.start_chunk = start_chunk
        self.extractor = RelationshipPropertiesExtractor()
        self.processor = RelationshipDataProcessor(path="Data/Relationships/Assay_Compound_Relationship")


    def collect_relationship_data(self):
        """
        Collects and processes relationship data based on the relationship type and saves it to the appropriate file.
        """
        if self.relationship_type == 'Assay_Compound':
            self.extractor.assay_compound_relationship(self.data_file, start_chunk=self.start_chunk)
            self.processor.process_files()
        elif self.relationship_type == 'Assay_Gene':
            self.extractor.assay_enzyme_relationship(self.data_file)
        elif self.relationship_type == 'Gene_Protein':
            self.extractor.gene_protein_relationship(self.data_file)
        elif self.relationship_type == 'Compound_Gene':
            self.extractor.compound_gene_relationship(self.data_file)
        elif self.relationship_type == 'Compound_Similarity':
            self.extractor.compound_similarity_relationship(self.data_file, start_chunk=self.start_chunk)
        elif self.relationship_type == 'Compound_Compound_Cooccurrence':
            self.extractor.compound_compound_cooccurrence(self.data_file)
        elif self.relationship_type == 'Compound_Gene_Cooccurrence':
            self.extractor.compound_gene_cooccurrence(self.gene_data)
        elif self.relationship_type == 'Compound_Gene_Interaction':
            self.extractor.compound_gene_interaction(self.gene_data)
        elif self.relationship_type == 'Compound_Transformation':
            self.extractor.compound_transformation(self.data_file)
        else:
            logging.error(f"Unsupported relationship type: {self.relationship_type}")

__init__(relationship_type, start_chunk=0)

Initializes the RelationshipsCollectorProcessor with the relationship type and start chunk index. The data file paths are fixed and resolved relative to the working directory.

Parameters:

relationship_type (str, required): The type of relationship to collect data for. One of 'Assay_Compound', 'Assay_Gene', 'Gene_Protein', 'Compound_Gene', 'Compound_Similarity', 'Compound_Compound_Cooccurrence', 'Compound_Gene_Cooccurrence', 'Compound_Gene_Interaction', 'Compound_Transformation'.
start_chunk (int, default 0): The starting chunk index for processing data.

Source code in chemgraphbuilder/relationship_collector_processor.py
def __init__(self, relationship_type, start_chunk=0):
    """
    Initializes the RelationshipsCollectorProcessor with the relationship type and start chunk index.
    The data file paths are fixed and resolved relative to the working directory.

    Args:
        relationship_type (str): The type of relationship to collect data for. One of 'Assay_Compound', 'Assay_Gene', 'Gene_Protein', 'Compound_Gene', 'Compound_Similarity', 'Compound_Compound_Cooccurrence', 'Compound_Gene_Cooccurrence', 'Compound_Gene_Interaction', 'Compound_Transformation'.
        start_chunk (int): The starting chunk index for processing data.
    """
    self.relationship_type = relationship_type
    self.data_file = "Data/AllDataConnected.csv"
    self.gene_data = "Data/Nodes/Gene_Properties_Processed.csv"
    self.start_chunk = start_chunk
    self.extractor = RelationshipPropertiesExtractor()
    self.processor = RelationshipDataProcessor(path="Data/Relationships/Assay_Compound_Relationship")
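
The start_chunk parameter suggests that long extractions can be resumed partway through the input. How the extractor actually uses it is not shown on this page, so the following is only a sketch of one plausible reading: the input is split into fixed-size chunks and chunks with an index below start_chunk are skipped. The helper names iter_chunks and process_from, the chunk size, and the AID column are all made up for illustration and are not part of chemgraphbuilder.

```python
import csv
import io
from itertools import islice


def iter_chunks(rows, chunk_size):
    """Yield (chunk_index, rows_in_chunk) pairs of at most chunk_size rows each."""
    rows = iter(rows)
    index = 0
    while True:
        chunk = list(islice(rows, chunk_size))
        if not chunk:
            return
        yield index, chunk
        index += 1


def process_from(rows, chunk_size, start_chunk=0):
    """Hypothetical resume logic: handle only chunks with index >= start_chunk."""
    processed = []
    for index, chunk in iter_chunks(rows, chunk_size):
        if index < start_chunk:
            continue  # assumed to have been handled by an earlier run
        processed.append((index, [row["AID"] for row in chunk]))
    return processed


# Tiny in-memory stand-in for a CSV file; the AID column is an assumption.
sample = io.StringIO("AID\n1\n2\n3\n4\n5\n")
result = process_from(csv.DictReader(sample), chunk_size=2, start_chunk=1)
print(result)
```

Under this reading, restarting an interrupted run with the index of the first unfinished chunk avoids repeating earlier work.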

collect_relationship_data()

Collects and processes relationship data based on the relationship type and saves it to the appropriate file.

Source code in chemgraphbuilder/relationship_collector_processor.py
def collect_relationship_data(self):
    """
    Collects and processes relationship data based on the relationship type and saves it to the appropriate file.
    """
    if self.relationship_type == 'Assay_Compound':
        self.extractor.assay_compound_relationship(self.data_file, start_chunk=self.start_chunk)
        self.processor.process_files()
    elif self.relationship_type == 'Assay_Gene':
        self.extractor.assay_enzyme_relationship(self.data_file)
    elif self.relationship_type == 'Gene_Protein':
        self.extractor.gene_protein_relationship(self.data_file)
    elif self.relationship_type == 'Compound_Gene':
        self.extractor.compound_gene_relationship(self.data_file)
    elif self.relationship_type == 'Compound_Similarity':
        self.extractor.compound_similarity_relationship(self.data_file, start_chunk=self.start_chunk)
    elif self.relationship_type == 'Compound_Compound_Cooccurrence':
        self.extractor.compound_compound_cooccurrence(self.data_file)
    elif self.relationship_type == 'Compound_Gene_Cooccurrence':
        self.extractor.compound_gene_cooccurrence(self.gene_data)
    elif self.relationship_type == 'Compound_Gene_Interaction':
        self.extractor.compound_gene_interaction(self.gene_data)
    elif self.relationship_type == 'Compound_Transformation':
        self.extractor.compound_transformation(self.data_file)
    else:
        logging.error(f"Unsupported relationship type: {self.relationship_type}")
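
The if/elif chain above pairs each relationship type with one extractor method and one input file. As an illustration only, the same mapping could be written as a lookup table, which makes the supported types easy to scan. The sketch below is not the library's implementation: StubExtractor, DISPATCH, and collect are hypothetical names, the stub merely records calls instead of extracting anything, and the extra processor.process_files() step of the 'Assay_Compound' branch is left out.

```python
import logging


class StubExtractor:
    """Stand-in for RelationshipPropertiesExtractor; it only records calls."""

    def __init__(self):
        self.calls = []

    def __getattr__(self, name):
        # Reached only for attributes that do not exist: accept any method
        # name and record the call instead of doing real work.
        def record(*args, **kwargs):
            self.calls.append((name, args, kwargs))
        return record


DATA_FILE = "Data/AllDataConnected.csv"
GENE_DATA = "Data/Nodes/Gene_Properties_Processed.csv"

# relationship type -> (extractor method name, input file, passes start_chunk)
DISPATCH = {
    "Assay_Compound": ("assay_compound_relationship", DATA_FILE, True),
    "Assay_Gene": ("assay_enzyme_relationship", DATA_FILE, False),
    "Gene_Protein": ("gene_protein_relationship", DATA_FILE, False),
    "Compound_Gene": ("compound_gene_relationship", DATA_FILE, False),
    "Compound_Similarity": ("compound_similarity_relationship", DATA_FILE, True),
    "Compound_Compound_Cooccurrence": ("compound_compound_cooccurrence", DATA_FILE, False),
    "Compound_Gene_Cooccurrence": ("compound_gene_cooccurrence", GENE_DATA, False),
    "Compound_Gene_Interaction": ("compound_gene_interaction", GENE_DATA, False),
    "Compound_Transformation": ("compound_transformation", DATA_FILE, False),
}


def collect(extractor, relationship_type, start_chunk=0):
    """Call the handler registered for relationship_type; return True if one exists."""
    entry = DISPATCH.get(relationship_type)
    if entry is None:
        logging.error("Unsupported relationship type: %s", relationship_type)
        return False
    method_name, path, takes_chunk = entry
    kwargs = {"start_chunk": start_chunk} if takes_chunk else {}
    getattr(extractor, method_name)(path, **kwargs)
    return True


extractor = StubExtractor()
ok = collect(extractor, "Compound_Similarity", start_chunk=4)
bad = collect(extractor, "Assay_Enzyme")  # not a supported type, so it is only logged
```

Note that an unsupported type is logged rather than raised, matching the else branch above, so a caller that needs to detect failure has to check for it explicitly.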

main()

Main function to parse command-line arguments and collect relationship data for the specified type.

Source code in chemgraphbuilder/relationship_collector_processor.py
def main():
    """
    Main function to parse command-line arguments and collect relationship data for the specified type.
    """
    parser = argparse.ArgumentParser(description="Collect relationship data for different types of relationships.")
    parser.add_argument('--relationship_type',
                        type=str,
                        required=True,
                        choices=['Assay_Compound', 'Assay_Gene', 'Gene_Protein',
                                 'Compound_Gene', 'Compound_Similarity',
                                 'Compound_Compound_Cooccurrence',
                                 'Compound_Gene_Cooccurrence',
                                 'Compound_Gene_Interaction',
                                 'Compound_Transformation'],
                        help='The type of relationship to collect data for')
    parser.add_argument('--start_chunk',
                        type=int,
                        default=0,
                        help='The starting chunk index for processing data')

    args = parser.parse_args()

    collector = RelationshipsCollectorProcessor(relationship_type=args.relationship_type, start_chunk=args.start_chunk)
    collector.collect_relationship_data()
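
The argument handling in main() can be tried without a shell by passing an explicit list to parse_args. The sketch below rebuilds the same parser in a helper named build_parser (a hypothetical name, not part of the module) and stops before constructing a collector, so it does not need chemgraphbuilder or any data files.

```python
import argparse

RELATIONSHIP_TYPES = [
    "Assay_Compound", "Assay_Gene", "Gene_Protein",
    "Compound_Gene", "Compound_Similarity",
    "Compound_Compound_Cooccurrence",
    "Compound_Gene_Cooccurrence",
    "Compound_Gene_Interaction",
    "Compound_Transformation",
]


def build_parser():
    """Recreate the parser from main() so it can be exercised in isolation."""
    parser = argparse.ArgumentParser(
        description="Collect relationship data for different types of relationships.")
    parser.add_argument("--relationship_type", type=str, required=True,
                        choices=RELATIONSHIP_TYPES,
                        help="The type of relationship to collect data for")
    parser.add_argument("--start_chunk", type=int, default=0,
                        help="The starting chunk index for processing data")
    return parser


parser = build_parser()

# Equivalent to: --relationship_type Compound_Similarity --start_chunk 3
args = parser.parse_args(["--relationship_type", "Compound_Similarity",
                          "--start_chunk", "3"])

# start_chunk falls back to its default when omitted.
default_args = parser.parse_args(["--relationship_type", "Assay_Gene"])

# A value outside `choices` makes argparse print an error and exit.
try:
    parser.parse_args(["--relationship_type", "Assay_Enzyme"])
    rejected = False
except SystemExit:
    rejected = True
```

Because choices is enforced by the parser, an unsupported type is rejected on the command line before collect_relationship_data() is ever called.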