data

datafactory

Contains functions to prepare the data.

configuration.data.datafactory.read_anthropometric_data(sex, data_dir_path)[source]

Read and process anthropometric data from a sex-specific CSV file.

Parameters:
  • sex (Sex) -- The sex of the individuals whose data is to be read ("male" or "female").

  • data_dir_path (Path) -- Path to the root data directory containing the "csv" subdirectory.

Returns:

Processed DataFrame containing:

  • Original data with standardized units (converted to cm/kg)

  • Renamed columns with units in brackets

  • Added "sex" column indicating the subject's sex

Return type:

pd.DataFrame

Raises:
  • ValueError -- If the provided sex is not "male" or "female".

  • FileNotFoundError -- If the specified CSV file does not exist in the data directory.
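The two error conditions above can be illustrated with a minimal sketch. It is not the module's implementation: the helper name resolve_csv_path and the file name pattern <sex>.csv are assumptions made for illustration, and plain strings stand in for the Sex type.

```python
import tempfile
from pathlib import Path

VALID_SEXES = ("male", "female")


def resolve_csv_path(sex, data_dir_path):
    """Hypothetical helper: validate `sex`, then locate its CSV file."""
    if sex not in VALID_SEXES:
        raise ValueError(f"sex must be one of {VALID_SEXES}, got {sex!r}")
    # Assumption: one file per sex inside the "csv" subdirectory.
    csv_path = Path(data_dir_path) / "csv" / f"{sex}.csv"
    if not csv_path.is_file():
        raise FileNotFoundError(f"No such file: {csv_path}")
    return csv_path


def error_name(sex, root):
    """Return the name of the exception raised for `sex`, or None."""
    try:
        resolve_csv_path(sex, root)
    except (ValueError, FileNotFoundError) as exc:
        return type(exc).__name__
    return None


with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)
    (root / "csv").mkdir()
    (root / "csv" / "male.csv").write_text("placeholder\n")

    found = resolve_csv_path("male", root).name
    missing_error = error_name("female", root)   # valid sex, file absent
    invalid_error = error_name("unknown", root)  # invalid sex
```

Validating the argument before touching the file system keeps the two failure modes distinct: a bad argument is reported as ValueError even when the data directory itself is missing.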

Notes

  • Performs the following unit conversions:
    • Height: inches → centimeters

    • Weight: pounds → kilograms

    • Chest depth: millimeters → centimeters

    • Bideltoid breadth: millimeters → centimeters

  • Original column names are renamed to include units in brackets
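As a rough illustration of the conversions listed above, the sketch below applies them to a single record in plain Python. The function itself works on a DataFrame; the source column names, the renamed column names, and the helper standardize_record are assumptions made for this example only.

```python
# Standard conversion factors.
INCH_TO_CM = 2.54
POUND_TO_KG = 0.45359237
MM_TO_CM = 0.1

# Hypothetical mapping: source column -> (renamed column, factor).
CONVERSIONS = {
    "height_in": ("height [cm]", INCH_TO_CM),
    "weight_lb": ("weight [kg]", POUND_TO_KG),
    "chest_depth_mm": ("chest depth [cm]", MM_TO_CM),
    "bideltoid_breadth_mm": ("bideltoid breadth [cm]", MM_TO_CM),
}


def standardize_record(record, sex):
    """Convert one record to cm/kg, rename its keys, and add "sex"."""
    converted = {
        new_name: record[old_name] * factor
        for old_name, (new_name, factor) in CONVERSIONS.items()
    }
    converted["sex"] = sex
    return converted


sample = {
    "height_in": 70.0,
    "weight_lb": 180.0,
    "chest_depth_mm": 250.0,
    "bideltoid_breadth_mm": 480.0,
}
result = standardize_record(sample, "male")
```

Keeping the rename and the conversion factor in one table makes it harder for a column label to drift out of sync with the unit it claims.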

configuration.data.datafactory.prepare_anthropometric_data(data_dir_path)[source]

Prepare and save anthropometric data as a pickle file.

This function reads anthropometric data for both males and females, combines them into a single DataFrame, and saves the result as a pickle file for efficient future access.

Parameters:

data_dir_path (Path) -- The path to the root data directory containing input data and where the output pickle file will be saved.

Return type:

None

configuration.data.datafactory.prepare_bike_data(data_dir_path)[source]

Prepare bike data by reading a CSV file, processing it, and saving the result as a pickle file.

Parameters:

data_dir_path (Path) -- The path to the root data directory containing "csv" and "pkl" subdirectories.

Return type:

None

configuration.data.datafactory.prepare_data()[source]

Prepare the data for the application by processing anthropometric, bike, and 3D body data.

This function checks whether the preprocessed data files exist and, if they are not found, starts the data preparation process. It performs the following steps:

  1. Prepares anthropometric data by calling prepare_anthropometric_data().

  2. Prepares bike data by calling prepare_bike_data().

  3. Prepares 3D body data by calling prepare_3D_body_data().
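The "check, then prepare" pattern described above can be sketched as follows. This is an illustration under stated assumptions, not the module's code: ensure_pickle, the build callback, and the file name example.pkl are hypothetical.

```python
import pickle
import tempfile
from pathlib import Path


def ensure_pickle(pkl_path, build):
    """Hypothetical helper: build and pickle the data only if missing.

    Returns True when the file had to be created, False when it existed.
    """
    if pkl_path.exists():
        return False
    pkl_path.parent.mkdir(parents=True, exist_ok=True)
    with pkl_path.open("wb") as fh:
        pickle.dump(build(), fh)
    return True


with tempfile.TemporaryDirectory() as tmp:
    target = Path(tmp) / "pkl" / "example.pkl"  # hypothetical file name
    calls = []

    def build():
        # Stand-in for an expensive preparation step.
        calls.append(1)
        return {"rows": 3}

    first_run = ensure_pickle(target, build)   # file absent: builds it
    second_run = ensure_pickle(target, build)  # file present: skipped

    with target.open("rb") as fh:
        stored = pickle.load(fh)
```

The second call returns without invoking build, which is what makes repeated application start-ups cheap once the pickle files exist.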

Return type:

None

configuration.data.datafactory.prepare_3D_body_data(data_dir_path)[source]

Process 3D body data by keeping one MultiPolygon per height bin of a given size and reducing the precision of each retained MultiPolygon.

For each sex (male/female):

  1. Loads original 3D body shape data from <sex>_3dBody.pkl

  2. Creates target bins at 3 cm intervals (controlled by DISTANCE_BTW_TARGET_KEYS_ALTITUDES)

  3. Selects the nearest available height to each bin's boundary values

  4. Simplifies each Polygon in each MultiPolygon using the Douglas-Peucker algorithm with a specified tolerance

  5. Saves optimized data to <sex>_3dBody_light.pkl
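Steps 2 to 4 can be sketched in plain Python. The sketch assumes the available heights are numbers and that a polygon outline is a list of (x, y) tuples. The helpers select_nearest_heights and douglas_peucker are written from scratch for illustration; the module may well rely on a geometry library for the simplification.

```python
import math


def select_nearest_heights(available, step):
    """For targets spaced `step` apart, keep the nearest available height."""
    heights = sorted(available)
    lowest, highest = heights[0], heights[-1]
    n_targets = int(math.floor((highest - lowest) / step)) + 1
    selected = []
    for i in range(n_targets):
        target = lowest + i * step
        nearest = min(heights, key=lambda h: abs(h - target))
        if nearest not in selected:  # two targets may share a neighbour
            selected.append(nearest)
    return selected


def perpendicular_distance(point, start, end):
    """Distance from `point` to the line through `start` and `end`."""
    (x, y), (x1, y1), (x2, y2) = point, start, end
    dx, dy = x2 - x1, y2 - y1
    length = math.hypot(dx, dy)
    if length == 0.0:  # degenerate segment: fall back to point distance
        return math.hypot(x - x1, y - y1)
    return abs(dy * (x - x1) - dx * (y - y1)) / length


def douglas_peucker(points, tolerance):
    """Simplify a polyline, dropping points closer than `tolerance`."""
    if len(points) < 3:
        return list(points)
    first, last = points[0], points[-1]
    index, max_dist = 0, 0.0
    for i in range(1, len(points) - 1):
        dist = perpendicular_distance(points[i], first, last)
        if dist > max_dist:
            index, max_dist = i, dist
    if max_dist <= tolerance:
        return [first, last]
    # Keep the farthest point and simplify both halves recursively.
    left = douglas_peucker(points[: index + 1], tolerance)
    right = douglas_peucker(points[index:], tolerance)
    return left[:-1] + right


kept_heights = select_nearest_heights(
    [0.0, 1.2, 2.9, 4.0, 6.3, 8.8, 9.5], step=3.0
)
outline = [(0.0, 0.0), (1.0, 0.01), (2.0, 0.0), (3.0, 5.0), (4.0, 0.0)]
simplified = douglas_peucker(outline, tolerance=0.1)
```

In this example the near-collinear point (1.0, 0.01) is dropped while the sharp peak at (3.0, 5.0) survives, which is the behaviour that lets the "light" pickle files shrink without visibly changing the body outline.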

Parameters:

data_dir_path (Path) -- Path to the root directory containing the input/output subdirectories. Requires a "pkl" subdirectory with the original pickle files.

Raises:

FileNotFoundError -- If the input directory structure is invalid or the source pickle file for either sex is missing.

Return type:

None