Skip to content

Mixin

Particula-beta Index / Particula Beta / Data / Mixin

Auto-generated documentation for particula_beta.data.mixin module.

ChecksCharCountsMixin

Show source in mixin.py:430

Mixin class for setting the character counts for data checks.

Signature

class ChecksCharCountsMixin:
    def __init__(self): ...

ChecksCharCountsMixin().set_char_counts

Show source in mixin.py:436

Set the required character counts for the data checks. This is the number of times a character should appear in a line of the data file, for it to be considered valid, and proceed with data parsing.

Arguments

  • char_counts - Dictionary of characters and their required counts for the data checks. The keys are the characters, and the values are the required counts. e.g. {",": 4, ":": 0}.

Examples

Set number of commas
char_counts = {",": 4}
# valid line: '1,2,3,4'
# invalid line removed: '1,2,3'
Filter out specific words
char_counts = {"Temp1 Error": 0}
# valid line: '23.4, 0.1, 0.2, no error'
# invalid line removed: '23.4, 0.1, 0.2, Temp1 Error'

Signature

def set_char_counts(self, char_counts: dict[str, int]): ...

ChecksCharactersMixin

Show source in mixin.py:395

Mixin class for setting the character length range for data checks.

Signature

class ChecksCharactersMixin:
    def __init__(self): ...

ChecksCharactersMixin().set_characters

Show source in mixin.py:401

Set the character length range for the data checks. This is how many characters are expected a line of the data file, for it to be considered valid, and proceed with data parsing.

Arguments

  • characters - List of one (or two) integers for the minimum (and maximum) number of characters expected in a line of the data file. e.g. [10, 100] for 10 to 100 characters. or [10] for 10 or more characters.

Examples

Set minimum characters
characters = [5]
# valid line: '1,2,3,4,5'
# invalid line: '1,2'
Set range of characters
characters = [5, 10]
# valid line: '1,2,3,4,5'
# invalid line: '1,2,3,4,5,6,7,8,9,10,11'
# invalid line: '1,2'

Signature

def set_characters(self, characters: list[int]): ...

ChecksReplaceCharsMixin

Show source in mixin.py:508

Mixin class for setting the characters to replace in the data lines.

Signature

class ChecksReplaceCharsMixin:
    def __init__(self): ...

ChecksReplaceCharsMixin().set_replace_chars

Show source in mixin.py:514

Set the characters to replace in the data lines.

This is useful to replace unwanted characters from the data lines before converting the data to the required format. Each key in the replace_dict represents the character to replace, and the corresponding value is the replacement target.

Arguments

  • replace_dict dict - Dictionary with keys as characters to replace and values as the replacement targets.

Examples

Replace brackets with empty string
replace_dict = {"[": "", "]": ""}
# data: '[1], [2], [3]' -> '1, 2, 3'
Replace spaces with underscores
replace_dict = {" ": "_"}
# data: '1, 2, 3' -> '1,_2,_3'
Replace multiple characters
replace_dict = {"[": "", "]": "", "
": " "}
# data: '[1]
[2]
[3]' -> '1 2 3'

Returns

  • self - The instance of the class to allow for method chaining.

References

Python str.replace

Signature

def set_replace_chars(self, replace_chars: dict[str, str]): ...

ChecksSkipEndMixin

Show source in mixin.py:486

Mixin class for setting the number of rows to skip at the end.

Signature

class ChecksSkipEndMixin:
    def __init__(self): ...

ChecksSkipEndMixin().set_skip_end

Show source in mixin.py:492

Set the number of rows to skip at the end of the file.

Arguments

  • skip_end int - Number of rows to skip at the end of the file.

Examples

Skip last row
skip_end = 10
# Skip the last 10 row of the file.

Signature

def set_skip_end(self, skip_end: int = 0): ...

ChecksSkipRowsMixin

Show source in mixin.py:463

Mixin class for setting the number of rows to skip at the beginning.

Signature

class ChecksSkipRowsMixin:
    def __init__(self): ...

ChecksSkipRowsMixin().set_skip_rows

Show source in mixin.py:469

Set the number of rows to skip at the beginning of the file.

Arguments

  • skip_rows int - Number of rows to skip at the beginning of the file.

Examples

Skip the first 2 rows
skip_rows = 2
# Skip the first 2 rows of the file.

Signature

def set_skip_rows(self, skip_rows: int = 0): ...

DataChecksMixin

Show source in mixin.py:120

Mixin class for setting the data checks.

Signature

class DataChecksMixin:
    def __init__(self): ...

DataChecksMixin().set_data_checks

Show source in mixin.py:126

Dictionary of data checks to perform on the loaded data.

Arguments

  • checks dict - Dictionary of data checks to perform on the loaded data. The keys are the names of the checks, and the values are the parameters for the checks.

Signature

def set_data_checks(self, data_checks: Dict[str, Any]): ...

DataColumnMixin

Show source in mixin.py:138

Mixin class for setting the data column.

Signature

class DataColumnMixin:
    def __init__(self): ...

DataColumnMixin().set_data_column

Show source in mixin.py:144

The data columns for the data files to load. Build with DataChecksBuilder.

Arguments

  • data_columns - List of column numbers or names for the data columns to load from the data files. The columns are indexed from 0. e.g. [3, 5] or ['data 1', 'data 3'].

Examples

Single data column, index
data_columns = [3]
# header: 'Time, Temp, data 1, data 2, data 3'
# line: '2021-01-01T12:00:00, 25.8, 1.2, 3.4' # load 1.2
Single data column, name
data_columns = ['data 1']
# header: 'Time, Temp, data 1, data 3, data 5'
# line: '2021-01-01T12:00:00, 25.8, 1.2, 3.4' # load 25.8
Multiple data columns, index
data_columns = [1, 3]
# header: 'Time, Temp, data 1, data 3, data 5'
# line: '2021-01-01T12:00:00, 25.8, 1.2, 3.4' # load 25.8, 3.4
Multiple data columns, name
data_columns = ['Temp', 'data 3']
# header: 'Time, Temp, data 1, data 3, data 5'
# line: '2021-01-01T12:00:00, 25.8, 1.2, 3.4' # load 25.8, 3.4

Signature

def set_data_column(self, data_columns: Union[List[str], List[int]]): ...

DataHeaderMixin

Show source in mixin.py:182

Mixin class for setting the data header for the Stream.

Signature

class DataHeaderMixin:
    def __init__(self): ...

DataHeaderMixin().set_data_header

Show source in mixin.py:188

Set the Stream headers corresponding to the data columns. This is to improve the readability of the Stream data. The headers should be in the same order as the data columns. These are also the same headers that will be written to the output file or csv.

Arguments

  • headers - List of headers corresponding to the data columns to load. e.g. ['data-1[m/s]', 'data_3[L]'].

Examples

Single header
headers = ['data-1[m/s]']
# Name the only data column as 'data-1[m/s]'.
Multiple headers
headers = ['data-1[m/s]', 'data-3[L]']
# Name the data columns as 'data-1[m/s]' and 'data-3[L]'.

Signature

def set_data_header(self, headers: List[str]): ...

DelimiterMixin

Show source in mixin.py:296

Mixin class for setting the delimiter.

Signature

class DelimiterMixin:
    def __init__(self): ...

DelimiterMixin().set_delimiter

Show source in mixin.py:302

Set the delimiter for the data files to load.

Arguments

  • delimiter str - Delimiter for the data columns in the data files. e.g. ',' for CSV files or ' ' for tab-separated files.

Examples

CSV delimiter
delimiter = ","
# CSV file with columns separated by commas.
Tab delimiter
delimiter = "   "
# Tab-separated file with columns separated by tabs.
Space delimiter
delimiter = " "
# Space-separated file with columns separated by spaces.

Signature

def set_delimiter(self, delimiter: str): ...

FileMinSizeBytesMixin

Show source in mixin.py:74

Mixin class for setting the minimum file size in bytes.

Signature

class FileMinSizeBytesMixin:
    def __init__(self): ...

FileMinSizeBytesMixin().set_file_min_size_bytes

Show source in mixin.py:80

Set the minimum file size in bytes for the data files to load.

Arguments

  • size int - Minimum file size in bytes. Default is 10000 bytes.

Signature

def set_file_min_size_bytes(self, size: int = 10000): ...

FilenameRegexMixin

Show source in mixin.py:37

Mixin class for setting the filename regex.

Signature

class FilenameRegexMixin:
    def __init__(self): ...

FilenameRegexMixin().set_filename_regex

Show source in mixin.py:43

Set the filename regex for the data files to load.

Arguments

  • regex str - Regular expression for the filenames, e.g. 'data_*.csv'.

Examples

Match all files
regex = ".*"
# Match all files in the folder.
Match CSV files
regex = ".*.csv"
# Match all CSV files in the folder.
Match specific files
regex = "data_*.csv"
# Match files starting with 'data_' and ending with '.csv'.

References

Explore Regex Python Regex Doc

Signature

def set_filename_regex(self, regex: str): ...

HeaderRowMixin

Show source in mixin.py:90

Mixin class for setting the header row.

Signature

class HeaderRowMixin:
    def __init__(self): ...

HeaderRowMixin().set_header_row

Show source in mixin.py:96

Set the header row for the data files to load.

Arguments

  • row int - Row number for the header row in the data file, indexed from 0.

Examples

Header row at the top
row = 0
# line 0: 'Time, Temp, data 1, data 2, data 3'
Header is third row
row = 2
# line 0: "Experiment 1"
# line 1: "Date: 2021-01-01"
# line 2: 'Time, Temp, data 1, data 2, data 3'

Signature

def set_header_row(self, row: int): ...

RelativeFolderMixin

Show source in mixin.py:8

Mixin class for setting the relative data folder.

Signature

class RelativeFolderMixin:
    def __init__(self): ...

RelativeFolderMixin().set_relative_data_folder

Show source in mixin.py:14

Set the relative data folder for the folder with the data loading.

Arguments

  • folder str - Relative path to the data folder. e.g. 'data_folder'. Where the data folder is located in project_path/data_folder.

Examples

Set data folder
folder = "data_folder"
# Set the data folder to 'data_folder'.
Set a subfolder
folder = "subfolder/data_folder"
# Set the data folder to 'subfolder/data_folder'.

Signature

def set_relative_data_folder(self, folder: str): ...

SizerConcentrationConvertFromMixin

Show source in mixin.py:617

Mixin class for setting to convert the sizer concentration to a different scale.

Signature

class SizerConcentrationConvertFromMixin:
    def __init__(self): ...

SizerConcentrationConvertFromMixin().set_sizer_concentration_convert_from

Show source in mixin.py:624

Set to convert the sizer concentration from dw or (pmf) scale to dN/dlogDp scale.

Arguments

  • convert_from - Conversion flag to convert the sizer concentration from dw or (pmf) scale to dN/dlogDp scale. The option is only "dw" all other values are ignored.

Examples

Convert from dw scale
convert_from = "dw"
# Convert the sizer concentration from dw scale to dN/dlogDp scale.
Convert Ignored
convert_from = "pmf"
# Ignored, no conversion is performed, when loading the sizer data.

Signature

def set_sizer_concentration_convert_from(self, convert_from: Optional[str] = None): ...

SizerDataReaderMixin

Show source in mixin.py:651

Mixin class for the dictionary of the sizer data reader settings.

Signature

class SizerDataReaderMixin:
    def __init__(self): ...

SizerDataReaderMixin().set_data_sizer_reader

Show source in mixin.py:657

Dictionary of the sizer data reader settings for the data files. Build with SizerDataReaderBuilder.

Arguments

  • data_sizer_reader - Dictionary of the sizer data reader settings for the data files. The keys are the names of the settings, and the values are the parameters for the settings.

Signature

def set_data_sizer_reader(self, data_sizer_reader: Dict[str, Any]): ...

SizerEndKeywordMixin

Show source in mixin.py:586

Mixin class for setting the end key for the sizer data.

Signature

class SizerEndKeywordMixin:
    def __init__(self): ...

SizerEndKeywordMixin().set_sizer_end_keyword

Show source in mixin.py:592

Set the end keyword for the sizer data, to identify the end of the sizer data block in the data files. This can be a string or an integer (column index) to identify the end of the sizer data block.

Arguments

  • end_keyword - End key for the sizer data in the data files. e.g. '789.3' or -3 for the 3rd column from the end.

Examples

End key as a string
end_key = "789.3"
# header: '... 689.1, 750.2, 789.3, Total Conc, Comments'
End key as a column index
end_key = -3
# header: '... 689.1, 750.2, 789.3, Total Conc, Comments'

Signature

def set_sizer_end_keyword(self, end_key: Union[str, int]): ...

SizerStartKeywordMixin

Show source in mixin.py:555

Mixin class for setting the start key for the sizer data.

Signature

class SizerStartKeywordMixin:
    def __init__(self): ...

SizerStartKeywordMixin().set_sizer_start_keyword

Show source in mixin.py:561

Set the start keyword for the sizer data, to identify the start of the sizer data block in the data files. This can be a string or an integer (column index) to identify the start of the sizer data block.

Arguments

  • start_keyword - Start key for the sizer data in the data files. e.g. '25.8' or 3 for the 4th column

Examples

Start key as a string
start_key = "35.8"
# header: 'Time, Temp, 35.8, 36.0, 36.2, ...'
Start key as a column index
start_key = 2
# header: 'Time, Temp, 35.8, 36.0, 36.2, ...'

Signature

def set_sizer_start_keyword(self, start_key: Union[str, int]): ...

TimeColumnMixin

Show source in mixin.py:213

Mixin class for setting the time column.

Signature

class TimeColumnMixin:
    def __init__(self): ...

TimeColumnMixin().set_time_column

Show source in mixin.py:219

The time column for the data files to load. The time column is used to convert the time data to an Unix-Epoch timestamp.

Arguments

  • columns - List of column indexes for the time columns to load from the data files. The columns are indexed from 0. e.g. [0] or [1, 2] to combine 1 and 2 columns.

Examples

Single time column
columns = [0]
# Load the time data from the first column.
# line: '2021-01-01T12:00:00, 1.2, 3.4'
Multiple time columns
columns = [1, 2]
# Load the time data from the second and third columns.
# line: '1.2, 2021-01-01, 12:00:00'

Signature

def set_time_column(self, columns: List[int]): ...

TimeFormatMixin

Show source in mixin.py:245

Mixin class for setting the time format.

Signature

class TimeFormatMixin:
    def __init__(self): ...

TimeFormatMixin().set_time_format

Show source in mixin.py:251

Set the time format for the time data in the data files.

Arguments

  • time_format_str str - Time format string for the time data in the data files. Default is ISO "%Y-%m-%dT%H:%M:%S", list "epoch" if the time data is in Unix-Epoch format. Use the Python time format codes otherwise, e.g. "%Y-%m-%dT%H:%M:%S" for '2021-01-01T12:00:00'.

Examples

USA date format
time_format_str = "%m/%d/%Y %H:%M:%S"
# e.g. '01/01/2021 12:00:00'
European date format
time_format_str = "%d/%m/%Y %H:%M:%S"
# e.g. '01/01/2021 12:00:00'
ISO date format
time_format_str = "%Y-%m-%dT%H:%M:%S"
# e.g. '2021-01-01T12:00:00'
AM/PM time format
time_format_str = "%Y-%m-%d %I:%M:%S %p"
# e.g. '2021-01-01 12:00:00 PM'
Fractional seconds
time_format_str = "%Y-%m-%dT%H:%M:%S.%f"
# e.g. '2021-01-01T12:00:00.123456'

References

Signature

def set_time_format(self, time_format_str: str = "%Y-%m-%dT%H:%M:%S"): ...

TimeShiftSecondsMixin

Show source in mixin.py:329

Mixin class for setting the time shift in seconds.

Signature

class TimeShiftSecondsMixin:
    def __init__(self): ...

TimeShiftSecondsMixin().set_time_shift_seconds

Show source in mixin.py:335

Set the time shift in seconds for the time data in the data files. This is helpful to match the time stamps of two data folders. This shift is applied to all files loaded with this builder.

Arguments

  • shift int - Time shift in seconds for the time data in the data files. Default is 0 seconds.

Examples

Shift by 1 hour
shift = 3600
# Shift the time data by 1 hour (3600 seconds).
Shift by 1 day
shift = 86400
# Shift the time data by 1 day (86400 seconds).

Signature

def set_time_shift_seconds(self, shift: int = 0): ...

TimezoneIdentifierMixin

Show source in mixin.py:359

Mixin class for setting the timezone identifier.

Signature

class TimezoneIdentifierMixin:
    def __init__(self): ...

TimezoneIdentifierMixin().set_timezone_identifier

Show source in mixin.py:365

Set the timezone identifier for the time data in the data files. The timezone shift is handled by the pytz library.

Arguments

  • timezone str - Timezone identifier for the time data in the data files. Default is 'UTC'.

Examples

List of Timezones
timezone = "Europe/London"  # or "GMT"
Mountain Timezone
timezone = "America/Denver"  # or "MST7MDT"
ETH Zurich Timezone
timezone = "Europe/Zurich"  # or "CET"

References

List of Timezones

Signature

def set_timezone_identifier(self, timezone: str = "UTC"): ...