SCIKIT_LEARN_DATASET
Retrieve a pandas DataFrame from the scikit-learn sample datasets.
Params:
dataset_name : str
Returns:
out : DataFrame
A DataContainer object containing the retrieved pandas DataFrame.
Python Code
from typing import Literal

from flojoy import DataFrame, flojoy


@flojoy()
def SCIKIT_LEARN_DATASET(
    dataset_name: Literal[
        "iris", "diabetes", "digits", "linnerud", "wine", "breast_cancer"
    ] = "iris",
) -> DataFrame:
    """Retrieve a pandas DataFrame from the scikit-learn sample datasets.

    Parameters
    ----------
    dataset_name : str
        Name of the scikit-learn sample dataset to load.

    Returns
    -------
    DataFrame
        A DataContainer object containing the retrieved pandas DataFrame.
    """
    # Each loader returns an (X, y) tuple; only the feature frame X is passed on.
    if dataset_name == "iris":
        from sklearn.datasets import load_iris

        data = load_iris(as_frame=True, return_X_y=True)
        return DataFrame(df=data[0])  # type: ignore
    elif dataset_name == "diabetes":
        from sklearn.datasets import load_diabetes

        data = load_diabetes(as_frame=True, return_X_y=True)
        return DataFrame(df=data[0])  # type: ignore
    elif dataset_name == "digits":
        from sklearn.datasets import load_digits

        data = load_digits(as_frame=True, return_X_y=True)
        return DataFrame(df=data[0])  # type: ignore
    elif dataset_name == "linnerud":
        from sklearn.datasets import load_linnerud

        data = load_linnerud(as_frame=True, return_X_y=True)
        return DataFrame(df=data[0])  # type: ignore
    elif dataset_name == "wine":
        from sklearn.datasets import load_wine

        data = load_wine(as_frame=True, return_X_y=True)
        return DataFrame(df=data[0])  # type: ignore
    elif dataset_name == "breast_cancer":
        from sklearn.datasets import load_breast_cancer

        data = load_breast_cancer(as_frame=True, return_X_y=True)
        return DataFrame(df=data[0])  # type: ignore
    else:
        raise ValueError(
            f"Failed to retrieve '{dataset_name}' from the scikit-learn sample datasets!"
        )
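To see what the node's loading step produces without a Flojoy installation, the same scikit-learn calls can be tried on their own. The following is a minimal sketch, not the node's implementation: `LOADERS` and `load_features` are hypothetical names introduced here, and the loaders are imported eagerly instead of inside each branch.

```python
import pandas as pd
from sklearn import datasets

# Hypothetical lookup table (not part of Flojoy): dataset name -> sklearn loader.
LOADERS = {
    "iris": datasets.load_iris,
    "diabetes": datasets.load_diabetes,
    "digits": datasets.load_digits,
    "linnerud": datasets.load_linnerud,
    "wine": datasets.load_wine,
    "breast_cancer": datasets.load_breast_cancer,
}


def load_features(dataset_name: str = "iris") -> pd.DataFrame:
    """Return only the feature DataFrame (X) of a scikit-learn sample dataset."""
    if dataset_name not in LOADERS:
        raise ValueError(f"Unknown scikit-learn sample dataset: '{dataset_name}'")
    # return_X_y=True yields an (X, y) tuple; as_frame=True makes X a DataFrame.
    features, _target = LOADERS[dataset_name](as_frame=True, return_X_y=True)
    return features


iris_features = load_features("iris")
print(iris_features.shape)  # (150, 4)
print(list(iris_features.columns))
```

As in the node, the target values are discarded, so the resulting frame holds the feature columns only.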
Example
The SCIKIT_LEARN_DATASET app
The workflow of this app is described below:
SCIKIT_LEARN_DATASET: This node takes one parameter, dataset_name, the name of the dataset to load from the sklearn.datasets package. In this case it is 'iris', which is the default value of this parameter. The node passes a DataFrame object of the DataContainer class to the next node, TABLE.
TABLE: This node creates a Plotly table visualization for a given input DataFrame object of the DataContainer class.