eds_scikit.phenotype.cancer.cancer

CancerFromICD10

CancerFromICD10(data: BaseData, cancer_types: Optional[List[str]] = None, level: str = 'patient', subphenotype: bool = True, threshold: int = 1)

Bases: Phenotype

Phenotyping visits or patients using ICD10 cancer codes

PARAMETER DESCRIPTION
data

A BaseData object

TYPE: BaseData

cancer_types

Optional list of cancer types to use for phenotyping

TYPE: Optional[List[str]] DEFAULT: None

level

On which level to do the aggregation, either "patient" or "visit"

TYPE: str DEFAULT: 'patient'

subphenotype

Whether the threshold should apply to the phenotype ("phenotype" column) or the subphenotype ("subphenotype" column)

TYPE: bool DEFAULT: True

threshold

Minimum number of events (whose definition depends on the level value)

TYPE: int DEFAULT: 1
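
To make the interaction between subphenotype and threshold concrete, here is a minimal pure-Python sketch. It is not the library's implementation: the event tuples, the labels, and the helper patients_meeting_threshold are all hypothetical, and it assumes that an "event" is simply one row per patient. The library's own definition of an event depends on level.

```python
from collections import Counter

# Toy events: (person_id, phenotype, subphenotype). All values are illustrative.
events = [
    (1, "cancer", "type_a"),
    (1, "cancer", "type_b"),
    (2, "cancer", "type_a"),
    (2, "cancer", "type_a"),
]


def patients_meeting_threshold(events, subphenotype, threshold):
    """Hypothetical helper: keep (person_id, label) pairs with enough events.

    When `subphenotype` is True, events are counted per subphenotype label;
    otherwise they are counted per phenotype label.
    """
    counts = Counter(
        (person_id, sub if subphenotype else pheno)
        for person_id, pheno, sub in events
    )
    return sorted(key for key, n in counts.items() if n >= threshold)


per_subphenotype = patients_meeting_threshold(events, subphenotype=True, threshold=2)
per_phenotype = patients_meeting_threshold(events, subphenotype=False, threshold=2)
```

In this toy data, patient 1 has two events of different subphenotypes, so they pass a threshold of 2 only when counting at the phenotype level.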

Source code in eds_scikit/phenotype/cancer/cancer.py
def __init__(
    self,
    data: BaseData,
    cancer_types: Optional[List[str]] = None,
    level: str = "patient",
    subphenotype: bool = True,
    threshold: int = 1,
):
    """
    Parameters
    ----------
    data : BaseData
        A BaseData object
    cancer_types :  Optional[List[str]]
        Optional list of cancer types to use for phenotyping
    level : str
        On which level to do the aggregation,
        either "patient" or "visit"
    subphenotype : bool
        Whether the threshold should apply to the phenotype
        ("phenotype" column) or the subphenotype ("subphenotype" column)
    threshold : int
        Minimum number of *events* (whose definition depends on the `level` value)
    """
    super().__init__(data)

    if cancer_types is None:
        cancer_types = self.ALL_CANCER_TYPES

    incorrect_cancer_types = set(cancer_types) - set(self.ALL_CANCER_TYPES)

    if incorrect_cancer_types:
        raise ValueError(
            f"Incorrect cancer types ({incorrect_cancer_types}). "
            f"Available cancer types are {self.ALL_CANCER_TYPES}"
        )

    self.icd10_codes = {
        k: v for k, v in self.ICD10_CODES.items() if k in cancer_types
    }

    self.level = level
    self.subphenotype = subphenotype
    self.threshold = threshold

ICD10_CODES class-attribute

ICD10_CODES = {cancer_type: {'prefix': df.code.to_list()} for (cancer_type, df) in ICD10_CODES_DF.groupby('Cancer type')}

For each cancer type, contains the list of corresponding ICD10 codes, stored under the 'prefix' key.
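
The shape of this mapping can be reproduced with a small pure-Python sketch. The rows below are a toy stand-in for ICD10_CODES_DF, whose real content is not shown here; the cancer type labels and codes are illustrative only, and the grouping is done with a plain dictionary rather than a pandas groupby.

```python
from collections import defaultdict

# Toy reference rows: (cancer type, ICD10 code). Illustrative, not the library's table.
rows = [
    ("lung", "C340"),
    ("lung", "C341"),
    ("breast", "C50"),
]

# Group codes by cancer type, as the groupby in the class attribute does.
grouped = defaultdict(list)
for cancer_type, code in rows:
    grouped[cancer_type].append(code)

# Same structure as the class attribute: {cancer_type: {"prefix": [codes...]}}
icd10_codes = {
    cancer_type: {"prefix": grouped[cancer_type]}
    for cancer_type in sorted(grouped)
}
all_cancer_types = list(icd10_codes.keys())
```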

ALL_CANCER_TYPES class-attribute

ALL_CANCER_TYPES = list(ICD10_CODES.keys())

Available cancer types.
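
The constructor checks the requested cancer_types against this list and raises a ValueError for unknown entries. The standalone sketch below mirrors that check outside the class; resolve_cancer_types is a hypothetical helper, and the ALL_CANCER_TYPES values are placeholders rather than the library's actual list.

```python
from typing import List, Optional

# Placeholder labels; the real list comes from the library's ICD10 table.
ALL_CANCER_TYPES = ["breast", "lung"]


def resolve_cancer_types(cancer_types: Optional[List[str]] = None) -> List[str]:
    """Hypothetical helper mirroring the constructor's validation step."""
    if cancer_types is None:
        # No selection means "use every available cancer type".
        return list(ALL_CANCER_TYPES)

    incorrect_cancer_types = set(cancer_types) - set(ALL_CANCER_TYPES)
    if incorrect_cancer_types:
        raise ValueError(
            f"Incorrect cancer types ({incorrect_cancer_types}). "
            f"Available cancer types are {ALL_CANCER_TYPES}"
        )
    return list(cancer_types)
```

Using a set difference reports every unknown entry at once, rather than failing on the first one.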

compute

compute()

Fetch all necessary features and perform aggregation

Source code in eds_scikit/phenotype/cancer/cancer.py
def compute(self):
    """
    Fetch all necessary features and perform aggregation
    """
    self.add_code_feature(
        output_feature="icd10",
        source="icd10",
        codes=self.icd10_codes,
        additional_filtering=dict(condition_status_source_value={"DP", "DR"}),
    )

    self.agg_single_feature(
        input_feature="icd10",
        level=self.level,
        subphenotype=self.subphenotype,
        threshold=self.threshold,
    )
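
As a rough illustration of the first step in compute, the sketch below filters toy diagnosis rows by status and tags them with a cancer type. It is not the library's add_code_feature: the function tag_conditions, the row layout, and the sample codes are hypothetical, and it assumes that the 'prefix' key means "the code starts with one of these strings".

```python
# Toy diagnosis rows: (visit id, ICD10 code, condition_status_source_value).
conditions = [
    (10, "C341", "DP"),
    (11, "C509", "DAS"),  # status outside the allowed set, so dropped
    (12, "I10", "DP"),    # allowed status but not a listed code
    (13, "C509", "DR"),
]

# Illustrative code mapping in the same shape as ICD10_CODES.
codes = {"lung": {"prefix": ["C34"]}, "breast": {"prefix": ["C50"]}}
allowed_status = {"DP", "DR"}


def tag_conditions(conditions, codes, allowed_status):
    """Hypothetical helper: keep allowed-status rows whose code matches a prefix."""
    tagged = []
    for visit_id, code, status in conditions:
        if status not in allowed_status:
            continue
        for cancer_type, spec in codes.items():
            # str.startswith accepts a tuple of candidate prefixes.
            if code.startswith(tuple(spec["prefix"])):
                tagged.append((visit_id, cancer_type))
    return tagged


tagged = tag_conditions(conditions, codes, allowed_status)
```

The tagged rows would then be aggregated per patient or per visit, depending on level, and compared against threshold.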