
    Size Distribution Stats

    This example shows how to process size distribution data from an SMPS. The processing returns mean properties of the size distribution, such as the mean diameter, median diameter, and total PM2.5 mass.

    In [1]:
    # all the imports
    import matplotlib.pyplot as plt
    from particula_beta.data import loader_interface, settings_generator
    from particula_beta.data.tests.example_data.get_example_data import get_data_folder
    from particula_beta.data.loader_setting_builders import (
        # These functions create settings for loading data from files.
        DataChecksBuilder,
        SizerDataReaderBuilder,
        LoaderSizerSettingsBuilder,
    )
    
    # the new step
    from particula_beta.data.process import size_distribution
    
    # set the parent directory of the data folder
    path = get_data_folder()
    

    Load the data

    For this example we'll use the provided example data, but you can change the path to any folder on your computer. We then use the settings builders to describe how the SMPS files should be parsed and loaded.

    If you think this settings setup is getting tedious, we hear you. We'll show the full fix for that soon; a quick sketch of saving settings to a file follows the loading cell below.

    In [2]:
    # settings for the SMPS data
    data_checks_sizer = (
        DataChecksBuilder()
        .set_characters([250])
        .set_skip_rows(25)
        .set_char_counts({"/": 2, ":": 2})
        .build()
    )
    data_sizer_reader = (
        SizerDataReaderBuilder()
        .set_sizer_start_keyword("20.72")
        .set_sizer_end_keyword("784.39")
        .set_sizer_concentration_convert_from("dw/dlogdp")
        .build()
    )
    smps_1d_settings, smps_2d_settings = (
        LoaderSizerSettingsBuilder()
        .set_relative_data_folder("SMPS_data")
        .set_filename_regex("*.csv")
        .set_header_row(24)
        .set_data_checks(data_checks_sizer)
        .set_data_column(
            [
                "Lower Size (nm)",
                "Upper Size (nm)",
                "Sample Temp (C)",
                "Sample Pressure (kPa)",
                "Relative Humidity (%)",
                "Median (nm)",
                "Mean (nm)",
                "Geo. Mean (nm)",
                "Mode (nm)",
                "Geo. Std. Dev.",
                "Total Conc. (#/cm³)",
            ]
        )
        .set_data_header(
            [
                "Lower_Size_(nm)",
                "Upper_Size_(nm)",
                "Sample_Temp_(C)",
                "Sample_Pressure_(kPa)",
                "Relative_Humidity_(%)",
                "Median_(nm)",
                "Mean_(nm)",
                "Geo_Mean_(nm)",
                "Mode_(nm)",
                "Geo_Std_Dev.",
                "Total_Conc_(#/cc)",
            ]
        )
        .set_data_sizer_reader(data_sizer_reader)
        .set_time_column([1, 2])
        .set_time_format("%m/%d/%Y %H:%M:%S")
        .set_delimiter(",")
        .set_timezone_identifier("UTC")
        .build()
    )
    
    # collect settings into a dictionary
    combined_settings = {
        'smps_1d': smps_1d_settings,
        'smps_2d': smps_2d_settings,
    }
    
    # now call the loader interface for files
    lake = loader_interface.load_folders_interface(
        path=path,
        folder_settings=combined_settings,
    )
    
    print(' ')
    print(lake)
    
    Folder Settings: smps_1d
      Loading file: 2022-07-07_095151_SMPS.csv
      Loading file: 2022-07-10_094659_SMPS.csv
    Folder Settings: smps_2d
      Loading file: 2022-07-07_095151_SMPS.csv
      Loading file: 2022-07-10_094659_SMPS.csv
     
    Lake with streams: ['smps_1d', 'smps_2d']
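
    If building these settings by hand each time feels repetitive, one fix (covered in a later guide on settings files) is to save them to disk and reload them. Here is a minimal sketch, assuming the built settings are plain JSON-serializable dictionaries; the filename is hypothetical:

    import json

    # hypothetical: persist the combined settings so later runs can
    # reload them instead of rebuilding the builder chains
    with open("smps_settings.json", "w") as f:
        json.dump(combined_settings, f, indent=2)

    # reload instead of rebuilding
    with open("smps_settings.json", "r") as f:
        combined_settings = json.load(f)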
    

    Processing the Stream

    The lake is a collection of streams, stored as a dictionary. The next step might be to average the data, or you may have a processing step you want to apply to every stream. For example, you may want to calculate the total PM2.5 mass from the SMPS data. You could do this by looping through the streams and applying a custom processing function to each one (see the sketch below), or you could use a standard process already built into particula_beta.data.process. In this example we'll use process.size_distribution to calculate the PM2.5 mass from the SMPS size distribution.
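
    As a quick illustration of the looping approach, here is a minimal sketch; the helper function is hypothetical, and treating the stream names as dictionary keys follows the lake printout above:

    # hypothetical custom step applied to each stream in the lake
    def print_stream_summary(stream, name):
        # report how many data columns the stream holds
        print(f"{name}: {len(stream.header)} columns")

    for name in ['smps_1d', 'smps_2d']:
        print_stream_summary(lake[name], name)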

    In [3]:
    lake['mean_properties'] = size_distribution.sizer_mean_properties(
        stream=lake['smps_2d'],
        diameter_units='nm',
    )
    
    # list out the header
    for header in lake['mean_properties'].header:
        print(header)
    
    Total_Conc_(#/cc)
    Mean_Diameter_(nm)
    Geometric_Mean_Diameter_(nm)
    Mode_Diameter_(nm)
    Mean_Diameter_Vol_(nm)
    Mode_Diameter_Vol_(nm)
    Unit_Mass_(ug/m3)
    Mass_(ug/m3)
    Total_Conc_(#/cc)_N100
    Unit_Mass_(ug/m3)_N100
    Mass_(ug/m3)_N100
    Total_Conc_(#/cc)_PM1
    Unit_Mass_(ug/m3)_PM1
    Mass_(ug/m3)_PM1
    Total_Conc_(#/cc)_PM2.5
    Unit_Mass_(ug/m3)_PM2.5
    Mass_(ug/m3)_PM2.5
    Total_Conc_(#/cc)_PM10
    Unit_Mass_(ug/m3)_PM10
    Mass_(ug/m3)_PM10
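
    The Unit_Mass columns suggest the masses assume spherical particles at unit density (1 g/cm³). For intuition, here is a standalone sketch of that integration; the function and the unit-density default are our assumptions, not necessarily the library's:

    import numpy as np

    def mass_concentration(diameters_nm, number_conc_per_cc, density_g_cm3=1.0):
        # per-particle volume for spheres: (4/3) * pi * r^3, with r in cm
        radii_cm = diameters_nm * 1e-7 / 2
        volume_cm3 = (4.0 / 3.0) * np.pi * radii_cm**3
        # sum over size bins, then convert g/cm^3 to ug/m^3 (factor 1e12)
        return np.sum(number_conc_per_cc * volume_cm3) * density_g_cm3 * 1e12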
    

    Plot the Data

    With that processing done, we can make some useful summary plots. For example, we can plot the total PM2.5 mass as a function of time and add the N100 mass on the same axes.

    Tip: Calling the header directly

    Note below how we can index the stream with a header name directly. This works because the __getitem__ method defined on the Stream class accepts either an integer index or a header name, so indexing with a header name returns that specific time series.

    This is in contrast to calling stream.data['header_name'], which would raise an error: that expression first evaluates stream.data, returning the underlying np.ndarray, and then indexes it with the header name, which is not a valid index for a np.ndarray.
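
    To make that dispatch concrete, here is a sketch of how such a __getitem__ could work; this is illustrative only, not the actual particula_beta implementation:

    import numpy as np

    class StreamSketch:
        def __init__(self, data: np.ndarray, header: list):
            self.data = data      # shape: (n_times, n_columns)
            self.header = header  # one name per column

        def __getitem__(self, key):
            # a string is resolved to a column position via the header
            # list; an integer is used as the column position directly
            index = self.header.index(key) if isinstance(key, str) else key
            return self.data[:, index]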

    In [4]:
    mean_prop_stream = lake['mean_properties']
    
    # plot both mass series on a single axis
    fig, ax = plt.subplots()
    ax.plot(mean_prop_stream.datetime64,
            mean_prop_stream['Mass_(ug/m3)_PM2.5'],
            label='PM 2.5',
            color='blue')
    ax.plot(mean_prop_stream.datetime64,
            mean_prop_stream['Mass_(ug/m3)_N100'],
            label='N100 mass',
            color='red')
    ax.set_ylim(0, 50)
    plt.xticks(rotation=45)
    ax.set_xlabel("Time (UTC)")
    ax.set_ylabel('PM mass (ug/m3)')
    ax.legend()
    fig.tight_layout()
    plt.show()
    
    [Figure: time series of PM2.5 mass and N100 mass (ug/m3) versus time (UTC)]

    Summary

    This example showed how to process size distribution data from an SMPS. The processing returns mean properties of the size distribution, such as the mean diameter, median diameter, and total PM2.5 mass.
