Relationship Collector and Processor

This module collects and processes relationship data for different types of relationships using the RelationshipPropertiesExtractor and RelationshipDataProcessor classes.

The `RelationshipsCollectorProcessor` class is documented below.

Classes:

Name	Description
`RelationshipsCollectorProcessor`	Class to collect and process relationship data for different types of relationships.

Functions:

Name	Description
`main`	Main function to parse command-line arguments and collect relationship data for the specified type.

`RelationshipsCollectorProcessor`

A class to collect and process relationship data for different types of relationships using RelationshipPropertiesExtractor and RelationshipDataProcessor.

Attributes:

Name	Type	Description
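`relationship_type`	`str`	The type of relationship to collect data for.
`data_file`	`str`	The path to the data file containing relationship data (fixed to `Data/AllDataConnected.csv`).
`gene_data`	`str`	The path to the processed gene properties file (fixed to `Data/Nodes/Gene_Properties_Processed.csv`).
`start_chunk`	`int`	The starting chunk index for processing data.
`extractor`	`RelationshipPropertiesExtractor`	An instance of RelationshipPropertiesExtractor.
`processor`	`RelationshipDataProcessor`	An instance of RelationshipDataProcessor.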

Source code in chemgraphbuilder/relationship_collector_processor.py

class RelationshipsCollectorProcessor:
    """
    A class to collect and process relationship data for different types of relationships using RelationshipPropertiesExtractor and RelationshipDataProcessor.

    Attributes:
        relationship_type (str): The type of relationship to collect data for.
        data_file (str): The path to the data file containing relationship data.
        gene_data (str): The path to the processed gene properties file.
        start_chunk (int): The starting chunk index for processing data.
        extractor (RelationshipPropertiesExtractor): An instance of RelationshipPropertiesExtractor.
        processor (RelationshipDataProcessor): An instance of RelationshipDataProcessor.
    """

    def __init__(self, relationship_type, start_chunk=0):
        """
        Initializes the RelationshipsCollectorProcessor with the relationship type and start chunk index. The data file paths are fixed.

        Args:
            relationship_type (str): The type of relationship to collect data for. One of 'Assay_Compound', 'Assay_Gene', 'Gene_Protein', 'Compound_Gene', 'Compound_Similarity', 'Compound_Compound_Cooccurrence', 'Compound_Gene_Cooccurrence', 'Compound_Gene_Interaction', 'Compound_Transformation'.
            start_chunk (int): The starting chunk index for processing data.
        """
        self.relationship_type = relationship_type
        self.data_file = "Data/AllDataConnected.csv"
        self.gene_data = "Data/Nodes/Gene_Properties_Processed.csv"
        self.start_chunk = start_chunk
        self.extractor = RelationshipPropertiesExtractor()
        self.processor = RelationshipDataProcessor(path="Data/Relationships/Assay_Compound_Relationship")


    def collect_relationship_data(self):
        """
        Collects and processes relationship data based on the relationship type and saves it to the appropriate file.
        """
        if self.relationship_type == 'Assay_Compound':
            self.extractor.assay_compound_relationship(self.data_file, start_chunk=self.start_chunk)
            self.processor.process_files()
        elif self.relationship_type == 'Assay_Gene':
            self.extractor.assay_enzyme_relationship(self.data_file)
        elif self.relationship_type == 'Gene_Protein':
            self.extractor.gene_protein_relationship(self.data_file)
        elif self.relationship_type == 'Compound_Gene':
            self.extractor.compound_gene_relationship(self.data_file)
        elif self.relationship_type == 'Compound_Similarity':
            self.extractor.compound_similarity_relationship(self.data_file, start_chunk=self.start_chunk)
        elif self.relationship_type == 'Compound_Compound_Cooccurrence':
            self.extractor.compound_compound_cooccurrence(self.data_file)
        elif self.relationship_type == 'Compound_Gene_Cooccurrence':
            self.extractor.compound_gene_cooccurrence(self.gene_data)
        elif self.relationship_type == 'Compound_Gene_Interaction':
            self.extractor.compound_gene_interaction(self.gene_data)
        elif self.relationship_type == 'Compound_Transformation':
            self.extractor.compound_transformation(self.data_file)
        else:
            logging.error(f"Unsupported relationship type: {self.relationship_type}")

`__init__(relationship_type, start_chunk=0)`

Initializes the RelationshipsCollectorProcessor with the relationship type and start chunk index. The data file paths are fixed.

Parameters:

Name	Type	Description	Default
`relationship_type`	`str`	The type of relationship to collect data for. One of 'Assay_Compound', 'Assay_Gene', 'Gene_Protein', 'Compound_Gene', 'Compound_Similarity', 'Compound_Compound_Cooccurrence', 'Compound_Gene_Cooccurrence', 'Compound_Gene_Interaction', 'Compound_Transformation'.	required
`start_chunk`	`int`	The starting chunk index for processing data.	`0`

Source code in chemgraphbuilder/relationship_collector_processor.py

def __init__(self, relationship_type, start_chunk=0):
    """
    Initializes the RelationshipsCollectorProcessor with the relationship type and start chunk index. The data file paths are fixed.

    Args:
        relationship_type (str): The type of relationship to collect data for. One of 'Assay_Compound', 'Assay_Gene', 'Gene_Protein', 'Compound_Gene', 'Compound_Similarity', 'Compound_Compound_Cooccurrence', 'Compound_Gene_Cooccurrence', 'Compound_Gene_Interaction', 'Compound_Transformation'.
        start_chunk (int): The starting chunk index for processing data.
    """
    self.relationship_type = relationship_type
    self.data_file = "Data/AllDataConnected.csv"
    self.gene_data = "Data/Nodes/Gene_Properties_Processed.csv"
    self.start_chunk = start_chunk
    self.extractor = RelationshipPropertiesExtractor()
    self.processor = RelationshipDataProcessor(path="Data/Relationships/Assay_Compound_Relationship")

`collect_relationship_data()`

Collects and processes relationship data based on the relationship type and saves it to the appropriate file.

Source code in chemgraphbuilder/relationship_collector_processor.py

def collect_relationship_data(self):
    """
    Collects and processes relationship data based on the relationship type and saves it to the appropriate file.
    """
    if self.relationship_type == 'Assay_Compound':
        self.extractor.assay_compound_relationship(self.data_file, start_chunk=self.start_chunk)
        self.processor.process_files()
    elif self.relationship_type == 'Assay_Gene':
        self.extractor.assay_enzyme_relationship(self.data_file)
    elif self.relationship_type == 'Gene_Protein':
        self.extractor.gene_protein_relationship(self.data_file)
    elif self.relationship_type == 'Compound_Gene':
        self.extractor.compound_gene_relationship(self.data_file)
    elif self.relationship_type == 'Compound_Similarity':
        self.extractor.compound_similarity_relationship(self.data_file, start_chunk=self.start_chunk)
    elif self.relationship_type == 'Compound_Compound_Cooccurrence':
        self.extractor.compound_compound_cooccurrence(self.data_file)
    elif self.relationship_type == 'Compound_Gene_Cooccurrence':
        self.extractor.compound_gene_cooccurrence(self.gene_data)
    elif self.relationship_type == 'Compound_Gene_Interaction':
        self.extractor.compound_gene_interaction(self.gene_data)
    elif self.relationship_type == 'Compound_Transformation':
        self.extractor.compound_transformation(self.data_file)
    else:
        logging.error(f"Unsupported relationship type: {self.relationship_type}")
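
The branching above maps each relationship type to one extractor call. As an illustration only, the same dispatch can be sketched as a dictionary lookup. This is not the library's implementation: `StubExtractor` and `collect` are hypothetical names, the stub only records calls so the example runs without chemgraphbuilder installed, only three of the nine types are shown, and the `process_files()` step that follows the 'Assay_Compound' extraction is omitted.

```python
import logging


class StubExtractor:
    """Hypothetical stand-in for RelationshipPropertiesExtractor; it only records calls."""

    def __init__(self):
        self.calls = []

    def assay_compound_relationship(self, path, start_chunk=0):
        self.calls.append(("assay_compound_relationship", path, start_chunk))

    def compound_similarity_relationship(self, path, start_chunk=0):
        self.calls.append(("compound_similarity_relationship", path, start_chunk))

    def compound_gene_cooccurrence(self, path):
        self.calls.append(("compound_gene_cooccurrence", path))


def collect(extractor, relationship_type, data_file, gene_data, start_chunk=0):
    """Run the extractor call registered for relationship_type.

    Returns True if a handler ran, False if the type is unsupported.
    """
    # Each entry maps a relationship type to a zero-argument callable,
    # so the input file and start_chunk are bound per type.
    handlers = {
        "Assay_Compound": lambda: extractor.assay_compound_relationship(
            data_file, start_chunk=start_chunk),
        "Compound_Similarity": lambda: extractor.compound_similarity_relationship(
            data_file, start_chunk=start_chunk),
        "Compound_Gene_Cooccurrence": lambda: extractor.compound_gene_cooccurrence(
            gene_data),
    }
    handler = handlers.get(relationship_type)
    if handler is None:
        # Same behaviour as the else branch above: log and do nothing.
        logging.error("Unsupported relationship type: %s", relationship_type)
        return False
    handler()
    return True
```

With a table like this, supporting a new relationship type means adding one entry instead of another `elif` branch, and the return value lets a caller tell an unsupported type apart from a completed run.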

`main()`

Main function to parse command-line arguments and collect relationship data for the specified type.

Source code in chemgraphbuilder/relationship_collector_processor.py

def main():
    """
    Main function to parse command-line arguments and collect relationship data for the specified type.
    """
    parser = argparse.ArgumentParser(description="Collect relationship data for different types of relationships.")
    parser.add_argument('--relationship_type',
                        type=str,
                        required=True,
                        choices=['Assay_Compound', 'Assay_Gene', 'Gene_Protein',
                                 'Compound_Gene', 'Compound_Similarity',
                                 'Compound_Compound_Cooccurrence',
                                 'Compound_Gene_Cooccurrence',
                                 'Compound_Gene_Interaction',
                                 'Compound_Transformation'],
                        help='The type of relationship to collect data for')
    parser.add_argument('--start_chunk',
                        type=int,
                        default=0,
                        help='The starting chunk index for processing data')

    args = parser.parse_args()

    collector = RelationshipsCollectorProcessor(relationship_type=args.relationship_type, start_chunk=args.start_chunk)
    collector.collect_relationship_data()
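
To see how the two command-line options behave without running a collection, the parser from `main` can be rebuilt on its own and fed an explicit argument list. This is a minimal sketch: `build_parser` and `RELATIONSHIP_TYPES` are hypothetical names introduced here, and nothing from chemgraphbuilder is imported or executed.

```python
import argparse

# The nine values accepted by --relationship_type, copied from main() above.
RELATIONSHIP_TYPES = [
    "Assay_Compound", "Assay_Gene", "Gene_Protein",
    "Compound_Gene", "Compound_Similarity",
    "Compound_Compound_Cooccurrence",
    "Compound_Gene_Cooccurrence",
    "Compound_Gene_Interaction",
    "Compound_Transformation",
]


def build_parser():
    """Return a parser with the same two options that main() defines."""
    parser = argparse.ArgumentParser(
        description="Collect relationship data for different types of relationships.")
    parser.add_argument("--relationship_type", type=str, required=True,
                        choices=RELATIONSHIP_TYPES,
                        help="The type of relationship to collect data for")
    parser.add_argument("--start_chunk", type=int, default=0,
                        help="The starting chunk index for processing data")
    return parser


# Passing a list to parse_args() avoids reading sys.argv.
args = build_parser().parse_args(
    ["--relationship_type", "Compound_Similarity", "--start_chunk", "5"])
print(args.relationship_type, args.start_chunk)  # prints: Compound_Similarity 5
```

Because `choices` is set, argparse rejects any value outside the list (for example 'Assay_Enzyme') and exits with an error before `RelationshipsCollectorProcessor` is constructed. `--start_chunk` falls back to 0 when omitted.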
