
breast_cancer

load_data

Load and return the UCI ML Breast Cancer Wisconsin (Diagnostic) dataset.

For more information about this dataset and the meaning of its features, see the scikit-learn documentation for load_breast_cancer.

Returns:

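Type: Tuple[NumpyDataset, NumpyDataset]

Description: (train_data, eval_data), an 80/20 train/evaluation split made with a fixed random seed (42). Each dataset stores the features under the key "x" (cast to float32) and the labels under the key "y".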

Source code in fastestimator/fastestimator/dataset/data/breast_cancer.py
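from typing import Tuple

import numpy as np
from sklearn.datasets import load_breast_cancer
from sklearn.model_selection import train_test_split

from fastestimator.dataset.numpy_dataset import NumpyDataset


def load_data() -> Tuple[NumpyDataset, NumpyDataset]: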
    """Load and return the UCI ML Breast Cancer Wisconsin (Diagnostic) dataset.

    For more information about this dataset and the meaning of the features it contains, see the sklearn documentation.

    Returns:
        (train_data, eval_data)
    """
    (x, y) = load_breast_cancer(return_X_y=True)
    x_train, x_eval, y_train, y_eval = train_test_split(x, y, test_size=0.2, random_state=42)
    x_train, x_eval = np.float32(x_train), np.float32(x_eval)
    train_data = NumpyDataset({"x": x_train, "y": y_train})
    eval_data = NumpyDataset({"x": x_eval, "y": y_eval})
    return train_data, eval_data
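
To see what the two returned datasets contain, the split can be reproduced with scikit-learn and NumPy alone. The sketch below makes the same calls with the same arguments as the source above, but uses plain dictionaries in place of NumpyDataset. That substitution is an assumption made for illustration so the snippet runs without FastEstimator installed; it is not how the library wraps the data.

```python
import numpy as np
from sklearn.datasets import load_breast_cancer
from sklearn.model_selection import train_test_split

# Same calls and arguments as load_data(), minus the NumpyDataset wrapping.
x, y = load_breast_cancer(return_X_y=True)
x_train, x_eval, y_train, y_eval = train_test_split(x, y, test_size=0.2, random_state=42)
x_train, x_eval = np.float32(x_train), np.float32(x_eval)

# Plain dicts stand in for NumpyDataset here (illustrative assumption only).
train = {"x": x_train, "y": y_train}
evaluation = {"x": x_eval, "y": y_eval}

# The dataset has 569 samples with 30 features each, so the 80/20 split
# yields 455 training rows and 114 evaluation rows.
print(train["x"].shape, evaluation["x"].shape)  # (455, 30) (114, 30)
print(train["x"].dtype)  # float32
```

Because random_state is fixed at 42, the same rows land in the training and evaluation sets on every call, which keeps results comparable across runs.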