Loading Part 3: Lake#

In this example, we explore the process of working with data from multiple instruments and consolidating them into a single Lake object. A Lake object serves as a convenient container for aggregating multiple Streams, each representing data from individual instruments.

Setting the Working Path#

To begin, set the working path to the folder where your data is stored. This demonstration uses the provided example data, but the path can point anywhere on your computer. For instance, if your data sits in a campaign folder on a Windows drive, you can set the path as follows:

path = "U:\\data\\processing\\Campaign2023_of_awesome\\data"
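If the data instead sits in a folder named "data" in your home directory, the path can be built without hard-coding drive letters or separators. The following is a minimal sketch using the standard-library pathlib module; the folder location is hypothetical, and if the loader expects a plain string you may need to pass str(path).

```python
from pathlib import Path

# Hypothetical location: a folder named "data" in the user's home directory.
path = Path.home() / "data"

# Subfolders named in the settings are resolved relative to this path.
cpc_folder = path / "CPC_3010_data"
print(cpc_folder)
```

Building paths this way keeps the same script usable on Windows, macOS, and Linux.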

Your folder structure should resemble the following:

data
├── CPC_3010_data
│   ├── CPC_3010_data_20220709_Jul.csv
│   ├── CPC_3010_data_20220710_Jul.csv
├── SMPS_data
│   ├── 2022-07-07_095151_SMPS.csv
│   ├── 2022-07-10_094659_SMPS.csv

Here, the path points to the “data” folder. Within this folder, you’ll find two subfolders: one for CPC data and another for SMPS data. These subfolders correspond to the relative_data_folder keywords used in the settings dictionary. The data within these subfolders will be loaded as Stream objects.

Inside each of these subfolders are the data files that match the specified filename_regex pattern, which selects files by name. Here the pattern '*.csv' matches every file ending in “.csv”, and each matching file is loaded into the corresponding Stream object. This keeps data from different instruments organized for further analysis and visualization.
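The pattern '*.csv' reads as a shell-style wildcard. As a rough illustration of which files such a pattern selects, here is a sketch using the standard-library fnmatch module. How particula performs the matching internally is not shown in this tutorial, so treat this as an assumption; the two non-CSV filenames are made up for the example.

```python
import fnmatch

# Filenames as they might appear in the CPC_3010_data folder.
filenames = [
    "CPC_3010_data_20220709_Jul.csv",
    "CPC_3010_data_20220710_Jul.csv",
    "notes.txt",   # hypothetical extra file
    "README.md",   # hypothetical extra file
]

# Keep only the names that match the wildcard pattern.
matched = fnmatch.filter(filenames, "*.csv")
print(matched)
```

A narrower pattern such as 'CPC_*.csv' would be one way to exclude unrelated CSV files that share the folder.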

# Import necessary modules
import os  # Provides functions for interacting with the operating system.
# Matplotlib is a library for creating visualizations and plots.
import matplotlib.pyplot as plt
from particula.data import (
    loader_interface,  # This module allows you to load data from files.
    settings_generator,  # It helps generate settings for data loading.
    # This module provides statistics for a collection of data streams.
    lake_stats
)
from particula.data.tests.example_data.get_example_data import get_data_folder
# Lake is a container for multiple data streams.
from particula.data.lake import Lake

# Set the parent directory of the data folder
path = get_data_folder()

Load the Data#

In this example, we’ll work with provided example data. However, you have the flexibility to change the path to any folder on your computer. We will use the settings generator to efficiently load the data.

# settings for the CPC data
cpc_settings = settings_generator.for_general_1d_load(
    relative_data_folder='CPC_3010_data',
    filename_regex='*.csv',
    file_min_size_bytes=10,
    data_checks={
        "characters": [10, 100],
        "char_counts": {",": 4},
        "skip_rows": 0,
        "skip_end": 0,
    },
    data_column=[1, 2],
    data_header=['CPC_count[#/sec]', 'Temperature[degC]'],
    time_column=[0],
    time_format='epoch',
    delimiter=',',
    time_shift_seconds=0,
    timezone_identifier='UTC',
)

# settings for the SMPS data
smps_1d_settings, smps_2d_settings = settings_generator.for_general_sizer_1d_2d_load(
    relative_data_folder='SMPS_data',
    filename_regex='*.csv',
    file_min_size_bytes=10,
    header_row=24,
    data_checks={
        "characters": [250],
        "skip_rows": 25,
        "skip_end": 0,
        "char_counts": {"/": 2, ":": 2}
    },
    data_1d_column=[
        "Lower Size (nm)",
        "Upper Size (nm)",
        "Sample Temp (C)",
        "Sample Pressure (kPa)",
        "Relative Humidity (%)",
        "Median (nm)",
        "Mean (nm)",
        "Geo. Mean (nm)",
        "Mode (nm)",
        "Geo. Std. Dev.",
        "Total Conc. (#/cm³)"],
    data_1d_header=[
        "Lower_Size_(nm)",
        "Upper_Size_(nm)",
        "Sample_Temp_(C)",
        "Sample_Pressure_(kPa)",
        "Relative_Humidity_(%)",
        "Median_(nm)",
        "Mean_(nm)",
        "Geo_Mean_(nm)",
        "Mode_(nm)",
        "Geo_Std_Dev.",
        "Total_Conc_(#/cc)"],
    data_2d_dp_start_keyword="20.72",
    data_2d_dp_end_keyword="784.39",
    data_2d_convert_concentration_from="dw/dlogdp",
    time_column=[1, 2],
    time_format="%m/%d/%Y %H:%M:%S",
    delimiter=",",
    time_shift_seconds=0,
    timezone_identifier="UTC",
)

# collect settings into a dictionary
combined_settings = {
    'cpc': cpc_settings,
    'smps_1d': smps_1d_settings,
    'smps_2d': smps_2d_settings,
}

# now call the loader interface for files
lake = loader_interface.load_folders_interface(
    path=path,
    folder_settings=combined_settings,
)

print(' ')
print(lake)

Folder Settings: cpc
  Loading file: CPC_3010_data_20220710_Jul.csv

  Loading file: CPC_3010_data_20220709_Jul.csv

Folder Settings: smps_1d
  Loading file: 2022-07-10_094659_SMPS.csv
  Loading file: 2022-07-07_095151_SMPS.csv

Folder Settings: smps_2d
  Loading file: 2022-07-10_094659_SMPS.csv
  Loading file: 2022-07-07_095151_SMPS.csv

 
Lake with streams: ['cpc', 'smps_1d', 'smps_2d']
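The SMPS settings pass time_column=[1, 2] together with time_format="%m/%d/%Y %H:%M:%S", which suggests that a date column and a time column are combined before parsing. The sketch below shows that idea with the standard library only; the sample values are hypothetical, and the actual parsing inside particula may differ.

```python
from datetime import datetime, timezone

time_format = "%m/%d/%Y %H:%M:%S"

# Hypothetical contents of the two time columns in one SMPS row.
date_text, time_text = "07/07/2022", "09:51:51"

# Join the columns, parse them, and interpret the result as UTC.
parsed = datetime.strptime(f"{date_text} {time_text}", time_format)
parsed_utc = parsed.replace(tzinfo=timezone.utc)

# Seconds since 1970-01-01 UTC, the form used for the stream time axis.
epoch_seconds = parsed_utc.timestamp()
print(epoch_seconds)
```

If the instrument clock was not set to UTC, a nonzero time_shift_seconds is the setting that would compensate for the offset.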

Lake Class Overview#

The Lake is a collection of Stream objects stored as a dictionary. The keys represent the names of the streams, and the values are the stream objects themselves. It provides a convenient way to organize and manage multiple datasets. Let’s explore its key attributes and methods:

Attributes:#

streams (Dict[str, Stream]): A dictionary where keys are stream names, and values are the corresponding Stream objects.

Methods:#

__getitem__(self, key: str) -> Any: Retrieve a specific Stream by its name.
- Example: To access the CPC stream, you can use lake['cpc'].
__delitem__(self, key: str) -> None: Remove a Stream from the Lake using its name.
- Example: To remove a stream named ‘cpc’, you can use del lake['cpc'].
__getattr__(self, name: str) -> Any: Access streams as attributes for easier navigation.
- Example: You can directly access the ‘cpc’ stream with lake.cpc.
add_stream(self, stream: particula.data.stream.Stream, name: str) -> None: Add a new Stream to the Lake.
- Example: To add a new stream, you can use lake.add_stream(new_stream, 'stream_name').
__len__(self) -> int: Determine the number of streams in the Lake.
- Example: To find out how many streams are in the Lake, use len(lake).
__iter__(self) -> Iterator[Any]: Iterate over the streams in the Lake.
- Example: To loop through all streams, use for item in lake. In the output shown under Usage, each item is a (name, Stream) tuple, so unpack it before accessing stream.header.
summary (Readonly property): Generate a summary by iterating through each stream and printing their headers.
- Example: To get a summary of all streams, use lake.summary.
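To make the dictionary-backed design concrete, here is a simplified stand-in written for this tutorial. ToyLake is a hypothetical class, not particula's implementation, and it stores arbitrary objects instead of Stream instances; it only mirrors the access patterns listed above.

```python
from dataclasses import dataclass, field
from typing import Any, Dict, Iterator, Tuple


@dataclass
class ToyLake:
    """Simplified stand-in for Lake; values can be any object."""
    streams: Dict[str, Any] = field(default_factory=dict)

    def add_stream(self, stream: Any, name: str) -> None:
        # Reject duplicates and names that cannot be used as attributes.
        if name in self.streams or not name.isidentifier():
            raise ValueError(f"Invalid or duplicate stream name: {name!r}")
        self.streams[name] = stream

    def __getitem__(self, key: str) -> Any:
        return self.streams[key]

    def __delitem__(self, key: str) -> None:
        del self.streams[key]

    def __getattr__(self, name: str) -> Any:
        # Called only when normal attribute lookup fails.
        try:
            return self.__dict__["streams"][name]
        except KeyError as error:
            raise AttributeError(name) from error

    def __len__(self) -> int:
        return len(self.streams)

    def items(self) -> Iterator[Tuple[str, Any]]:
        return iter(self.streams.items())


toy = ToyLake()
toy.add_stream([1.0, 2.0], "cpc")
toy.add_stream([3.0], "smps_1d")
names = [name for name, _ in toy.items()]
print(len(toy), names, toy.cpc, toy["smps_1d"])
```

Requiring names to be valid identifiers is what makes attribute-style access such as toy.cpc possible for every stored stream.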

Usage:#

The Lake class simplifies the management of multiple datasets. You can access individual streams by name, add new streams, and iterate through them efficiently. This class is particularly helpful when dealing with various data sources within your analysis.

# get the names of the streams
print(' ')
print('Names of the streams:')
print(dir(lake))

 
Names of the streams:
['cpc', 'smps_1d', 'smps_2d']

# get the streams
print(' ')
print('The streams:')
for stream in lake:
    print(stream)

 
The streams:
('cpc', Stream(header=['CPC_count[#/sec]', 'Temperature[degC]'], data=array([[3.3510e+04, 1.7000e+01],
       [3.3465e+04, 1.7100e+01],
       [3.2171e+04, 1.7000e+01],
       ...,
       [1.9403e+04, 1.6900e+01],
       [2.0230e+04, 1.7000e+01],
       [1.9521e+04, 1.6800e+01]]), time=array([1.65734280e+09, 1.65734281e+09, 1.65734281e+09, ...,
       1.65751559e+09, 1.65751560e+09, 1.65751560e+09]), files=[['CPC_3010_data_20220710_Jul.csv', 1078191], ['CPC_3010_data_20220709_Jul.csv', 1011254]]))
('smps_1d', Stream(header=['Lower_Size_(nm)', 'Upper_Size_(nm)', 'Sample_Temp_(C)', 'Sample_Pressure_(kPa)', 'Relative_Humidity_(%)', 'Median_(nm)', 'Mean_(nm)', 'Geo_Mean_(nm)', 'Mode_(nm)', 'Geo_Std_Dev.', 'Total_Conc_(#/cc)'], data=array([[2.05000e+01, 7.91500e+02, 2.37000e+01, ..., 2.07210e+01,
        2.17900e+00, 2.16900e+03],
       [2.05000e+01, 7.91500e+02, 2.36000e+01, ..., 2.52550e+01,
        2.10100e+00, 2.39408e+03],
       [2.05000e+01, 7.91500e+02, 2.37000e+01, ..., 2.18700e+01,
        2.13600e+00, 2.27861e+03],
       ...,
       [2.05000e+01, 7.91500e+02, 2.35000e+01, ..., 2.07210e+01,
        2.31800e+00, 2.08056e+03],
       [2.05000e+01, 7.91500e+02, 2.33000e+01, ..., 2.10970e+01,
        2.31800e+00, 2.10616e+03],
       [2.05000e+01, 7.91500e+02, 2.35000e+01, ..., 2.07210e+01,
        2.24800e+00, 2.45781e+03]]), time=array([1.65718376e+09, 1.65718385e+09, 1.65718394e+09, ...,
       1.65753440e+09, 1.65753450e+09, 1.65753459e+09]), files=[['2022-07-10_094659_SMPS.csv', 2003798], ['2022-07-07_095151_SMPS.csv', 5617925]]))
('smps_2d', Stream(header=['20.72', '21.10', '21.48', '21.87', '22.27', '22.67', '23.08', '23.50', '23.93', '24.36', '24.80', '25.25', '25.71', '26.18', '26.66', '27.14', '27.63', '28.13', '28.64', '29.16', '29.69', '30.23', '30.78', '31.34', '31.91', '32.49', '33.08', '33.68', '34.29', '34.91', '35.55', '36.19', '36.85', '37.52', '38.20', '38.89', '39.60', '40.32', '41.05', '41.79', '42.55', '43.32', '44.11', '44.91', '45.73', '46.56', '47.40', '48.26', '49.14', '50.03', '50.94', '51.86', '52.80', '53.76', '54.74', '55.73', '56.74', '57.77', '58.82', '59.89', '60.98', '62.08', '63.21', '64.36', '65.52', '66.71', '67.93', '69.16', '70.41', '71.69', '72.99', '74.32', '75.67', '77.04', '78.44', '79.86', '81.31', '82.79', '84.29', '85.82', '87.38', '88.96', '90.58', '92.22', '93.90', '95.60', '97.34', '99.10', '100.90', '102.74', '104.60', '106.50', '108.43', '110.40', '112.40', '114.44', '116.52', '118.64', '120.79', '122.98', '125.21', '127.49', '129.80', '132.16', '134.56', '137.00', '139.49', '142.02', '144.60', '147.22', '149.89', '152.61', '155.38', '158.20', '161.08', '164.00', '166.98', '170.01', '173.09', '176.24', '179.43', '182.69', '186.01', '189.38', '192.82', '196.32', '199.89', '203.51', '207.21', '210.97', '214.80', '218.70', '222.67', '226.71', '230.82', '235.01', '239.28', '243.62', '248.05', '252.55', '257.13', '261.80', '266.55', '271.39', '276.32', '281.33', '286.44', '291.64', '296.93', '302.32', '307.81', '313.40', '319.08', '324.88', '330.77', '336.78', '342.89', '349.12', '355.45', '361.90', '368.47', '375.16', '381.97', '388.91', '395.96', '403.15', '410.47', '417.92', '425.51', '433.23', '441.09', '449.10', '457.25', '465.55', '474.00', '482.61', '491.37', '500.29', '509.37', '518.61', '528.03', '537.61', '547.37', '557.31', '567.42', '577.72', '588.21', '598.89', '609.76', '620.82', '632.09', '643.57', '655.25', '667.14', '679.25', '691.58', '704.14', '716.92', '729.93', '743.18', '756.67', '770.40', '784.39'], data=array([[ 6103.186,  2832.655,  4733.553, ...,    93.413,   122.992,
            0.   ],
       [ 5621.118,  5867.747,  6233.403, ...,     0.   ,     0.   ,
           75.377],
       [ 5165.139,  4969.987,  4312.386, ...,     0.   ,   122.992,
          124.085],
       ...,
       [ 9962.036,  7986.823,  8682.258, ...,     0.   ,     0.   ,
          124.153],
       [ 8765.782, 11175.603,  8148.945, ...,     0.   ,     0.   ,
          372.433],
       [14380.528, 11524.35 , 13632.727, ...,     0.   ,     0.   ,
            0.   ]]), time=array([1.65718376e+09, 1.65718385e+09, 1.65718394e+09, ...,
       1.65753440e+09, 1.65753450e+09, 1.65753459e+09]), files=[['2022-07-10_094659_SMPS.csv', 2003798], ['2022-07-07_095151_SMPS.csv', 5617925]]))

# get just the keys
print(' ')
print('The keys:')
for key in lake.keys():
    print(key)

 
The keys:
cpc
smps_1d
smps_2d

# get just the values
print(' ')
print('The values:')
for value in lake.values():
    print(value)

 
The values:
Stream(header=['CPC_count[#/sec]', 'Temperature[degC]'], data=array([[3.3510e+04, 1.7000e+01],
       [3.3465e+04, 1.7100e+01],
       [3.2171e+04, 1.7000e+01],
       ...,
       [1.9403e+04, 1.6900e+01],
       [2.0230e+04, 1.7000e+01],
       [1.9521e+04, 1.6800e+01]]), time=array([1.65734280e+09, 1.65734281e+09, 1.65734281e+09, ...,
       1.65751559e+09, 1.65751560e+09, 1.65751560e+09]), files=[['CPC_3010_data_20220710_Jul.csv', 1078191], ['CPC_3010_data_20220709_Jul.csv', 1011254]])
Stream(header=['Lower_Size_(nm)', 'Upper_Size_(nm)', 'Sample_Temp_(C)', 'Sample_Pressure_(kPa)', 'Relative_Humidity_(%)', 'Median_(nm)', 'Mean_(nm)', 'Geo_Mean_(nm)', 'Mode_(nm)', 'Geo_Std_Dev.', 'Total_Conc_(#/cc)'], data=array([[2.05000e+01, 7.91500e+02, 2.37000e+01, ..., 2.07210e+01,
        2.17900e+00, 2.16900e+03],
       [2.05000e+01, 7.91500e+02, 2.36000e+01, ..., 2.52550e+01,
        2.10100e+00, 2.39408e+03],
       [2.05000e+01, 7.91500e+02, 2.37000e+01, ..., 2.18700e+01,
        2.13600e+00, 2.27861e+03],
       ...,
       [2.05000e+01, 7.91500e+02, 2.35000e+01, ..., 2.07210e+01,
        2.31800e+00, 2.08056e+03],
       [2.05000e+01, 7.91500e+02, 2.33000e+01, ..., 2.10970e+01,
        2.31800e+00, 2.10616e+03],
       [2.05000e+01, 7.91500e+02, 2.35000e+01, ..., 2.07210e+01,
        2.24800e+00, 2.45781e+03]]), time=array([1.65718376e+09, 1.65718385e+09, 1.65718394e+09, ...,
       1.65753440e+09, 1.65753450e+09, 1.65753459e+09]), files=[['2022-07-10_094659_SMPS.csv', 2003798], ['2022-07-07_095151_SMPS.csv', 5617925]])
Stream(header=['20.72', '21.10', '21.48', '21.87', '22.27', '22.67', '23.08', '23.50', '23.93', '24.36', '24.80', '25.25', '25.71', '26.18', '26.66', '27.14', '27.63', '28.13', '28.64', '29.16', '29.69', '30.23', '30.78', '31.34', '31.91', '32.49', '33.08', '33.68', '34.29', '34.91', '35.55', '36.19', '36.85', '37.52', '38.20', '38.89', '39.60', '40.32', '41.05', '41.79', '42.55', '43.32', '44.11', '44.91', '45.73', '46.56', '47.40', '48.26', '49.14', '50.03', '50.94', '51.86', '52.80', '53.76', '54.74', '55.73', '56.74', '57.77', '58.82', '59.89', '60.98', '62.08', '63.21', '64.36', '65.52', '66.71', '67.93', '69.16', '70.41', '71.69', '72.99', '74.32', '75.67', '77.04', '78.44', '79.86', '81.31', '82.79', '84.29', '85.82', '87.38', '88.96', '90.58', '92.22', '93.90', '95.60', '97.34', '99.10', '100.90', '102.74', '104.60', '106.50', '108.43', '110.40', '112.40', '114.44', '116.52', '118.64', '120.79', '122.98', '125.21', '127.49', '129.80', '132.16', '134.56', '137.00', '139.49', '142.02', '144.60', '147.22', '149.89', '152.61', '155.38', '158.20', '161.08', '164.00', '166.98', '170.01', '173.09', '176.24', '179.43', '182.69', '186.01', '189.38', '192.82', '196.32', '199.89', '203.51', '207.21', '210.97', '214.80', '218.70', '222.67', '226.71', '230.82', '235.01', '239.28', '243.62', '248.05', '252.55', '257.13', '261.80', '266.55', '271.39', '276.32', '281.33', '286.44', '291.64', '296.93', '302.32', '307.81', '313.40', '319.08', '324.88', '330.77', '336.78', '342.89', '349.12', '355.45', '361.90', '368.47', '375.16', '381.97', '388.91', '395.96', '403.15', '410.47', '417.92', '425.51', '433.23', '441.09', '449.10', '457.25', '465.55', '474.00', '482.61', '491.37', '500.29', '509.37', '518.61', '528.03', '537.61', '547.37', '557.31', '567.42', '577.72', '588.21', '598.89', '609.76', '620.82', '632.09', '643.57', '655.25', '667.14', '679.25', '691.58', '704.14', '716.92', '729.93', '743.18', '756.67', '770.40', '784.39'], data=array([[ 6103.186,  2832.655,  4733.553, ...,    93.413,   122.992,
            0.   ],
       [ 5621.118,  5867.747,  6233.403, ...,     0.   ,     0.   ,
           75.377],
       [ 5165.139,  4969.987,  4312.386, ...,     0.   ,   122.992,
          124.085],
       ...,
       [ 9962.036,  7986.823,  8682.258, ...,     0.   ,     0.   ,
          124.153],
       [ 8765.782, 11175.603,  8148.945, ...,     0.   ,     0.   ,
          372.433],
       [14380.528, 11524.35 , 13632.727, ...,     0.   ,     0.   ,
            0.   ]]), time=array([1.65718376e+09, 1.65718385e+09, 1.65718394e+09, ...,
       1.65753440e+09, 1.65753450e+09, 1.65753459e+09]), files=[['2022-07-10_094659_SMPS.csv', 2003798], ['2022-07-07_095151_SMPS.csv', 5617925]])

Pause to Plot the data#

In this code snippet, we retrieve data from the Lake object and create a dual-axis plot to visualize both CPC and SMPS data over time.

We access the CPC data from the Lake using lake['cpc'] and retrieve the datetime and CPC count data.
Similarly, we access the SMPS data using lake['smps_1d'] and retrieve the datetime and Mode data.
We plot the CPC data as a blue line using ax.plot(), and the SMPS data as an orange line on a twin y-axis, axb.
To improve readability, we rotate the x-axis labels using plt.xticks(rotation=45).
We limit the y-axis for the SMPS data to the range [0, 200] using axb.set_ylim(0, 200).
Axis labels are added for the x-axis and both y-axes.
Finally, we adjust the layout with fig.tight_layout() and display the plot with plt.show().
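The time arrays printed earlier hold epoch seconds, while the plots use datetimes on the x-axis. As a small sketch of that conversion with the standard library (the stream's own datetime64 attribute is what the tutorial code actually uses), the first CPC time value can be converted as follows. The value is copied from the rounded printout, so it is approximate.

```python
from datetime import datetime, timezone

# First CPC time value as printed above (rounded in the printout).
epoch_seconds = 1.65734280e9

# Interpret the value as seconds since 1970-01-01 in UTC.
timestamp = datetime.fromtimestamp(epoch_seconds, tz=timezone.utc)
print(timestamp.isoformat())
```

The resulting date falls on 9 July 2022, which agrees with the date in the CPC file names.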

# Load and Plot Data from Lake

# Access CPC data from the Lake
cpc_time = lake['cpc'].datetime64
cpc_data = lake['cpc']['CPC_count[#/sec]']

# Access SMPS data from the Lake
smps_time = lake['smps_1d'].datetime64
smps_data = lake['smps_1d']['Mode_(nm)']

# Plot the Data on Twinx Axis
fig, ax = plt.subplots()

# Plot CPC data
ax.plot(cpc_time,
        cpc_data,
        label='CPC',
        color='blue')

# Rotate x-axis labels for better readability
plt.xticks(rotation=45)

# Create a twin y-axis for SMPS data
axb = ax.twinx()

# Plot SMPS data
axb.plot(smps_time,
         smps_data,
         label='SMPS',
         color='orange')

# Set y-axis limits for SMPS data
axb.set_ylim(0, 200)

# Set axis labels
ax.set_xlabel("Time (UTC)")
ax.set_ylabel('CPC Counts [#/sec]')
axb.set_ylabel('SMPS Mode [nm]')

# Adjust layout for better visualization, then show the plot
fig.tight_layout()
plt.show()

[Figure: CPC counts (blue, left axis) and SMPS mode (orange, right axis) versus time (UTC)]

Data Averaging#

Now that we have loaded the data, we can average it over time. To do this, we use the particula.data.lake_stats module, which provides a function called average_std. This function takes a Lake as input and returns a Lake whose streams contain both the averaged data and the standard deviation of the data.

This function shares its name with stream_stats.average_std, which operates on an individual Stream object.
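Conceptually, interval averaging groups samples by time bin and reduces each bin to a mean and a standard deviation. The sketch below illustrates that idea with the standard library only. The function average_in_intervals is hypothetical, and details such as how particula aligns the bins or which standard deviation estimator it uses are assumptions here.

```python
from statistics import mean, pstdev
from typing import Dict, List, Tuple


def average_in_intervals(
    times: List[float], values: List[float], interval: float
) -> List[Tuple[float, float, float]]:
    """Return (interval_start, mean, std) for each non-empty interval."""
    bins: Dict[int, List[float]] = {}
    start = times[0]
    for t, v in zip(times, values):
        # Index of the interval this sample falls into.
        bins.setdefault(int((t - start) // interval), []).append(v)
    return [
        (start + index * interval, mean(group), pstdev(group))
        for index, group in sorted(bins.items())
    ]


# Five samples spread over two 600-second intervals.
times = [0.0, 200.0, 400.0, 600.0, 900.0]
values = [10.0, 20.0, 30.0, 40.0, 60.0]
averaged = average_in_intervals(times, values, interval=600)
for row in averaged:
    print(row)
```

Averaging reduces noise at the cost of time resolution, so the interval should be long enough to smooth the signal but shorter than the features you want to resolve.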

# Compute the average and standard deviation of data within a time interval
# of 600 seconds for each stream in the lake, and create a new lake
# containing the averaged data.
lake_averaged = lake_stats.average_std(
    lake=lake,
    average_interval=600,
    clone=True  # Create a new lake instead of modifying the original
)

# Print the resulting lake with averaged data and standard deviation.
print(lake_averaged)

Lake with streams: ['cpc', 'smps_1d', 'smps_2d']

Plot the Averaged Data#

Let’s plot the averaged data to see how it compares to the raw data. We will use the same approach as before, but this time we will use the averaged data from the Lake object.

# Extract datetime and CPC data from the averaged lake
cpc_time = lake_averaged['cpc'].datetime64
cpc_data = lake_averaged['cpc']['CPC_count[#/sec]']

# Extract datetime and SMPS data from the averaged lake
smps_time = lake_averaged['smps_1d'].datetime64
smps_data = lake_averaged['smps_1d']['Mode_(nm)']

# Create a plot with two y-axes (twinx) for CPC and SMPS data
fig, ax = plt.subplots()
ax.plot(cpc_time,
        cpc_data,
        label='CPC',
        color='blue')
plt.xticks(rotation=45)

# Create a twinx axis for SMPS data
axb = ax.twinx()
axb.plot(smps_time,
         smps_data,
         label='SMPS',
         color='orange',)
axb.set_ylim(0, 200)

# Set labels for both y-axes and the x-axis
ax.set_xlabel("Time (UTC)")
ax.set_ylabel('CPC_counts[#/sec]')
axb.set_ylabel('SMPS_Mode[nm]')

# Adjust layout for better presentation, then show the plot
fig.tight_layout()
plt.show()

[Figure: averaged CPC counts (blue, left axis) and averaged SMPS mode (orange, right axis) versus time (UTC)]

Summary#

In this part of the tutorial, we learned how to work with multiple streams of data and load them into a Lake object, which is a collection of streams. We explored operations on the data, including averaging it over time using the particula.data.lake_stats module. This allowed us to create more meaningful visualizations by comparing the averaged and non-averaged data. The example demonstrated the power of the particula.data package in handling and analyzing scientific data efficiently.

help(Lake)

Help on class Lake in module particula.data.lake:

class Lake(builtins.object)
 |  Lake(streams: Dict[str, particula.data.stream.Stream] = <factory>) -> None
 |  
 |  A class representing a lake which is a collection of streams.
 |  
 |  Attributes:
 |      streams (Dict[str, Stream]): A dictionary to hold streams with their
 |      names as keys.
 |  
 |  Methods defined here:
 |  
 |  __delitem__(self, key: str) -> None
 |      Remove a stream by name.
 |      Example: del lake['stream_name']
 |  
 |  __dir__(self) -> list
 |      List available streams.
 |      Example: dir(lake)
 |  
 |  __eq__(self, other)
 |      Return self==value.
 |  
 |  __getattr__(self, name: str) -> Any
 |      Allow accessing streams as an attributes.
 |      Raises:
 |          AttributeError: If the stream name is not in the lake.
 |      Example: lake.stream_name
 |  
 |  __getitem__(self, key: str) -> Any
 |      Get a stream by name.
 |      Example: lake['stream_name']
 |  
 |  __init__(self, streams: Dict[str, particula.data.stream.Stream] = <factory>) -> None
 |      Initialize self.  See help(type(self)) for accurate signature.
 |  
 |  __iter__(self) -> Iterator[Any]
 |      Iterate over the streams in the lake.
 |      Example: [stream.header for stream in lake]""
 |  
 |  __len__(self) -> int
 |      Return the number of streams in the lake.
 |      Example: len(lake)
 |  
 |  __repr__(self) -> str
 |      Return a string representation of the lake.
 |      Example: print(lake)
 |  
 |  __setitem__(self, key: str, value: particula.data.stream.Stream) -> None
 |      Set a stream by name.
 |      Example: lake['stream_name'] = new_stream
 |  
 |  add_stream(self, stream: particula.data.stream.Stream, name: str) -> None
 |      Add a stream to the lake.
 |      
 |      Args:
 |      -----------
 |          stream (Stream): The stream object to be added.
 |          name (str): The name of the stream.
 |      
 |      Raises:
 |      -------
 |          ValueError: If the stream name is already in use or not a valid
 |          identifier.
 |  
 |  items(self) -> Iterator[Tuple[Any, Any]]
 |      Return an iterator over the key-value pairs.
 |  
 |  keys(self) -> Iterator[Any]
 |      Return an iterator over the keys.
 |  
 |  values(self) -> Iterator[Any]
 |      Return an iterator over the values.
 |  
 |  ----------------------------------------------------------------------
 |  Readonly properties defined here:
 |  
 |  summary
 |      Return a string summary iterating over each stream
 |          and print Stream.header.
 |      Example: lake.summary
 |  
 |  ----------------------------------------------------------------------
 |  Data descriptors defined here:
 |  
 |  __dict__
 |      dictionary for instance variables
 |  
 |  __weakref__
 |      list of weak references to the object
 |  
 |  ----------------------------------------------------------------------
 |  Data and other attributes defined here:
 |  
 |  __annotations__ = {'streams': typing.Dict[str, particula.data.stream.S...
 |  
 |  __dataclass_fields__ = {'streams': Field(name='streams',type=typing.Di...
 |  
 |  __dataclass_params__ = _DataclassParams(init=True,repr=True,eq=True,or...
 |  
 |  __hash__ = None
 |  
 |  __match_args__ = ('streams',)