Loading Part 4: Settings Files#

In this part of the tutorial, we explore how to save and load stream and lake settings dictionaries. Saving settings to files lets you preserve them, share them with others, and avoid retyping everything.

Working Path#

In your working path, you will find several .json settings files. The lake_settings.json file stores the settings for the lake, while each stream_settings_*.json file stores the settings for one stream. These are the same settings you created in the previous example, now saved to files for easy access and sharing.

data
├── CPC_3010_data
│   ├── CPC_3010_data_20220709_Jul.csv
│   ├── CPC_3010_data_20220710_Jul.csv
│   ├── stream_settings_cpc.json
├── SMPS_data
│   ├── 2022-07-07_095151_SMPS.csv
│   ├── 2022-07-10_094659_SMPS.csv
│   ├── stream_settings_smps_1d.json
│   ├── stream_settings_smps_2d.json
├── lake_settings.json
# Import the necessary libraries and modules
from particula.data import loader_interface, settings_generator
from particula.data.tests.example_data.get_example_data import get_data_folder

# Set the parent directory where the data folders are located
path = get_data_folder()
print('Path to data folder:')
print(path.rsplit('particula')[-1])
Path to data folder:
/data/tests/example_data
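
To check which settings files are present in a data folder, a minimal standard-library sketch is shown below. It does not use particula; the helper name `find_settings_files` is hypothetical, and the temporary folder only mimics part of the tree shown above.

```python
import tempfile
from pathlib import Path


def find_settings_files(data_folder):
    """Return sorted relative paths of all .json files under data_folder."""
    root = Path(data_folder)
    return sorted(p.relative_to(root).as_posix() for p in root.rglob("*.json"))


# Build a throwaway folder that mirrors part of the tree shown above.
with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)
    (root / "CPC_3010_data").mkdir()
    (root / "CPC_3010_data" / "stream_settings_cpc.json").write_text("{}")
    (root / "CPC_3010_data" / "CPC_3010_data_20220709_Jul.csv").write_text("")
    (root / "lake_settings.json").write_text("{}")
    found = find_settings_files(root)

print(found)
# ['CPC_3010_data/stream_settings_cpc.json', 'lake_settings.json']
```

Pointing `find_settings_files` at the real data folder (the `path` variable above) should list whichever settings files have been saved there.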

Generate and Save Settings#

First, we generate the settings for the CPC data using the settings_generator.for_general_1d_load function. These settings include details such as the data file location, file format, column names, and more.

# settings for the CPC data
cpc_settings = settings_generator.for_general_1d_load(
    relative_data_folder='CPC_3010_data',
    filename_regex='*.csv',
    file_min_size_bytes=10,
    data_checks={
        "characters": [10, 100],
        "char_counts": {",": 4},
        "skip_rows": 0,
        "skip_end": 0,
    },
    data_column=[1, 2],
    data_header=['CPC_count[#/sec]', 'Temperature[degC]'],
    time_column=[0],
    time_format='epoch',
    delimiter=',',
    time_shift_seconds=0,
    timezone_identifier='UTC',
)

# save the settings to a file
settings_generator.save_settings_for_stream(
    settings=cpc_settings,
    path=path,
    subfolder='CPC_3010_data',
    settings_suffix='_cpc',
)
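
Because the settings are stored as plain JSON, the saved file can be inspected with any JSON reader. The sketch below uses only the standard library and a hypothetical, trimmed-down settings dictionary; the real dictionary returned by settings_generator.for_general_1d_load contains more entries, and its exact keys are not assumed here. The 4-space indentation follows the save_settings_for_stream docstring.

```python
import json
import tempfile
from pathlib import Path

# Hypothetical, trimmed-down settings; the real dictionary has more keys.
example_settings = {
    "relative_data_folder": "CPC_3010_data",
    "filename_regex": "*.csv",
    "data_checks": {"characters": [10, 100], "char_counts": {",": 4}},
    "time_format": "epoch",
}

with tempfile.TemporaryDirectory() as tmp:
    folder = Path(tmp) / "CPC_3010_data"
    folder.mkdir()
    settings_file = folder / "stream_settings_cpc.json"

    # Write with 4-space indentation, as the docstring describes.
    settings_file.write_text(json.dumps(example_settings, indent=4))

    # Read the file back the way any JSON consumer would.
    reloaded = json.loads(settings_file.read_text())

print(reloaded == example_settings)  # True
```

The round trip works cleanly here because the dictionary holds only JSON-friendly types (strings, numbers, lists, and nested dictionaries with string keys).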

Next save the SMPS settings#

# settings for the SMPS data
smps_1d_settings, smps_2d_settings = settings_generator.for_general_sizer_1d_2d_load(
    relative_data_folder='SMPS_data',
    filename_regex='*.csv',
    file_min_size_bytes=10,
    header_row=24,
    data_checks={
        "characters": [250],
        "skip_rows": 25,
        "skip_end": 0,
        "char_counts": {"/": 2, ":": 2}
    },
    data_1d_column=[
        "Lower Size (nm)",
        "Upper Size (nm)",
        "Sample Temp (C)",
        "Sample Pressure (kPa)",
        "Relative Humidity (%)",
        "Median (nm)",
        "Mean (nm)",
        "Geo. Mean (nm)",
        "Mode (nm)",
        "Geo. Std. Dev.",
        "Total Conc. (#/cm³)"],
    data_1d_header=[
        "Lower_Size_(nm)",
        "Upper_Size_(nm)",
        "Sample_Temp_(C)",
        "Sample_Pressure_(kPa)",
        "Relative_Humidity_(%)",
        "Median_(nm)",
        "Mean_(nm)",
        "Geo_Mean_(nm)",
        "Mode_(nm)",
        "Geo_Std_Dev.",
        "Total_Conc_(#/cc)"],
    data_2d_dp_start_keyword="20.72",
    data_2d_dp_end_keyword="784.39",
    data_2d_convert_concentration_from="dw/dlogdp",
    time_column=[1, 2],
    time_format="%m/%d/%Y %H:%M:%S",
    delimiter=",",
    time_shift_seconds=0,
    timezone_identifier="UTC",
)

# save the settings to a file
settings_generator.save_settings_for_stream(
    settings=smps_1d_settings,
    path=path,
    subfolder='SMPS_data',
    settings_suffix='_smps_1d',
)
settings_generator.save_settings_for_stream(
    settings=smps_2d_settings,
    path=path,
    subfolder='SMPS_data',
    settings_suffix='_smps_2d',
)

Loading Stream Settings#

If you are still exploring your analysis pipeline, you may want to load settings for individual streams. To do so, use the settings_generator.load_settings_for_stream function. It takes the path, the subfolder, and the settings suffix as arguments, locates the matching settings file, and returns a dictionary containing the stream settings.

smps_1d_stream_settings = settings_generator.load_settings_for_stream(
    path=path,
    subfolder='SMPS_data',
    settings_suffix='_smps_1d',
)

stream_smps_1d = loader_interface.load_files_interface(
    path=path,
    settings=smps_1d_stream_settings
)

print(stream_smps_1d.header)
  Loading file: 2022-07-10_094659_SMPS.csv
  Loading file: 2022-07-07_095151_SMPS.csv
['Lower_Size_(nm)', 'Upper_Size_(nm)', 'Sample_Temp_(C)', 'Sample_Pressure_(kPa)', 'Relative_Humidity_(%)', 'Median_(nm)', 'Mean_(nm)', 'Geo_Mean_(nm)', 'Mode_(nm)', 'Geo_Std_Dev.', 'Total_Conc_(#/cc)']
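
The docstring for load_settings_for_stream describes a file search that raises FileNotFoundError when nothing matches and warns when several files match. A rough stand-in for that behavior, written with the standard library only, might look like the sketch below. The function name `load_stream_settings_sketch` and the glob pattern are assumptions for illustration, not the particula implementation.

```python
import json
import tempfile
import warnings
from pathlib import Path


def load_stream_settings_sketch(path, subfolder, settings_suffix=""):
    """Illustrative stand-in, not the particula implementation."""
    folder = Path(path) / subfolder
    # Assumed pattern: 'stream_settings' + suffix, then any trailing text.
    matches = sorted(folder.glob(f"stream_settings{settings_suffix}*.json"))
    if not matches:
        raise FileNotFoundError(f"No settings file found in {folder}")
    if len(matches) > 1:
        warnings.warn(f"Multiple settings files found, using {matches[0].name}")
    return json.loads(matches[0].read_text())


with tempfile.TemporaryDirectory() as tmp:
    folder = Path(tmp) / "SMPS_data"
    folder.mkdir()
    (folder / "stream_settings_smps_1d.json").write_text(
        json.dumps({"delimiter": ","}, indent=4)
    )
    loaded = load_stream_settings_sketch(tmp, "SMPS_data", "_smps_1d")
    try:
        load_stream_settings_sketch(tmp, "SMPS_data", "_missing")
        missing_raised = False
    except FileNotFoundError:
        missing_raised = True

print(loaded)          # {'delimiter': ','}
print(missing_raised)  # True
```

Giving each stream a distinct suffix, as done above with '_smps_1d' and '_smps_2d', keeps the lookup unambiguous when one folder holds several settings files.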

Lake settings#

If you want to load everything for a reanalysis instead of calling each stream individually, you can first save a lake settings file. This is done with settings_generator.save_settings_for_lake, which takes a dictionary of stream settings dictionaries together with the path, subfolder, and settings suffix, and writes them to a JSON file. It does not return anything.

# collect settings into a dictionary
combined_settings = {
    'cpc': cpc_settings,
    'smps_1d': smps_1d_settings,
    'smps_2d': smps_2d_settings,
}

# save the lake settings to a file
settings_generator.save_settings_for_lake(
    settings=combined_settings,
    path=path,
    subfolder='',
    settings_suffix='_cpc_smps',
)

Load the Lake#

To load the lake settings, use settings_generator.load_settings_for_lake. It takes the path, subfolder, and settings suffix of the lake settings file and returns a dictionary of stream settings dictionaries.

lake_settings = settings_generator.load_settings_for_lake(
    path=path,
    subfolder='',
    settings_suffix='_cpc_smps',
)

# now call the loader interface for folders, using the loaded settings
lake = loader_interface.load_folders_interface(
    path=path,
    folder_settings=lake_settings,
)

print(' ')
print(lake)
Folder Settings: cpc
  Loading file: CPC_3010_data_20220710_Jul.csv
  Loading file: CPC_3010_data_20220709_Jul.csv
Folder Settings: smps_1d
  Loading file: 2022-07-10_094659_SMPS.csv
  Loading file: 2022-07-07_095151_SMPS.csv
Folder Settings: smps_2d
  Loading file: 2022-07-10_094659_SMPS.csv
  Loading file: 2022-07-07_095151_SMPS.csv
 
Lake with streams: ['cpc', 'smps_1d', 'smps_2d']
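
Since the lake settings file is a dictionary of stream settings dictionaries, the settings for a single stream can be pulled out by name after loading. The sketch below illustrates that structure with the standard library only. The per-stream contents are hypothetical and heavily trimmed, and the file name `lake_settings_cpc_smps.json` is an assumption based on the 'lake_settings' plus suffix naming described in the docstring.

```python
import json
import tempfile
from pathlib import Path

# Hypothetical, heavily trimmed stream settings for illustration only.
combined = {
    "cpc": {"relative_data_folder": "CPC_3010_data"},
    "smps_1d": {"relative_data_folder": "SMPS_data"},
    "smps_2d": {"relative_data_folder": "SMPS_data"},
}

with tempfile.TemporaryDirectory() as tmp:
    # Assumed file name: 'lake_settings' followed by the suffix.
    lake_file = Path(tmp) / "lake_settings_cpc_smps.json"
    lake_file.write_text(json.dumps(combined, indent=4))
    lake_dict = json.loads(lake_file.read_text())

# Top-level keys are the stream names; each value is one stream's settings.
stream_names = list(lake_dict)
folders = {name: s["relative_data_folder"] for name, s in lake_dict.items()}

print(stream_names)        # ['cpc', 'smps_1d', 'smps_2d']
print(folders["smps_2d"])  # SMPS_data
```

The top-level keys here match the stream names reported in the lake printout above.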

Summary#

This example showed how to save and load the settings for a stream and a lake. Saved settings can be reloaded later, shared with someone else, and reused so that you do not have to retype everything.

help(settings_generator.save_settings_for_stream)
Help on function save_settings_for_stream in module particula.data.settings_generator:

save_settings_for_stream(settings: dict, path: str, subfolder: str, settings_suffix: str = '') -> None
    Save settings for lake data to a JSON file.
    
    Given a dictionary of settings, this function saves it to a JSON file
    named 'stream_settings' with an optional suffix in the specified filename.
    The JSON file is formatted with a 4-space indentation.
    
    Args:
    - settings: The settings dictionary to be saved.
    - path: The path where the subfolder is located.
    - subfolder: The subfolder where the settings file will be saved.
    - settings_suffix: An optional suffix for the settings
        file name. Default is an empty string.
    
    Returns:
    - None
help(settings_generator.load_settings_for_stream)
Help on function load_settings_for_stream in module particula.data.settings_generator:

load_settings_for_stream(path: str, subfolder: str, settings_suffix: str = '') -> dict
    Load settings for Stream data from a JSON file.
    
    Given a path and subfolder, this function searches for a JSON file
    named 'stream_settings' with an optional suffix. It returns the settings
    as a dictionary. If no file is found, or multiple files are found,
    appropriate errors or warnings are raised.
    
    Args:
    - path: The path where the subfolder is located.
    - subfolder: The subfolder where the settings file is expected.
    - settings_suffix: An optional suffix for the settings
        file name. Default is an empty string.
    
    Returns:
    - dict: A dictionary of settings loaded from the file.
    
    Raises:
    - FileNotFoundError: If no settings file is found.
    - Warning: If more than one settings file is found.
help(settings_generator.save_settings_for_lake)
Help on function save_settings_for_lake in module particula.data.settings_generator:

save_settings_for_lake(settings: dict, path: str, subfolder: str = '', settings_suffix: str = '') -> None
    Save settings for lake data to a JSON file.
    
    Given a dictionary of settings, this function saves it to a JSON file
    named 'lake_settings' with an optional suffix in the specified filename.
    The JSON file is formatted with a 4-space indentation.
    
    Args:
    - settings: The settings dictionary to be saved.
    - path: The path where the subfolder is located.
    - subfolder: The subfolder where the settings file will be saved.
    - settings_suffix: An optional suffix for the settings
        file name. Default is an empty string.
    
    Returns:
    - None
help(settings_generator.load_settings_for_lake)
Help on function load_settings_for_lake in module particula.data.settings_generator:

load_settings_for_lake(path: str, subfolder: str = '', settings_suffix: str = '') -> dict
    Load settings for Lake data from a JSON file. The settings file is
    a dictionary of stream settings dictionaries.
    
    Given a path and subfolder, this function searches for a JSON file
    named 'lake_settings' with an optional suffix. It returns the settings
    as a dictionary. If no file is found, or multiple files are found,
    appropriate errors or warnings are raised.
    
    Args:
    - path: The path where the subfolder is located.
    - subfolder: The subfolder where the settings file is expected.
    - settings_suffix: An optional suffix for the settings
        file name. Default is an empty string.
    
    Returns:
    - dict: A dictionary of settings loaded from the file.
    
    Raises:
    - FileNotFoundError: If no settings file is found.
    - Warning: If more than one settings file is found.