Loader Interface

Auto-generated documentation for particula_beta.data.loader_interface module.

get_1d_stream

Show source in loader_interface.py:223

Loads and formats a 1D data stream from a file and initializes or updates a Stream object.

Arguments


- `file_path` *str* - The path of the file to load data from.
- `settings` *dict* - A dictionary containing data formatting settings such as data checks, column names, time format, delimiter, and timezone information.
- `first_pass` *bool* - Whether this is the first time data is being loaded. If True, the stream is initialized. If False, an error is raised, because only one file can be loaded.
- `stream` *Stream, optional* - An instance of the Stream class to be updated with loaded data. Defaults to a new Stream object.

Returns


- `Stream` - The Stream object updated with the loaded data and corresponding time information.

Raises


- `ValueError` - If first_pass is False, indicating data has already been loaded.
- `TypeError` - If settings is not a dictionary.
- `FileNotFoundError` - If the file specified by file_path does not exist.
- `KeyError` - If any required keys are missing in the settings dictionary.

Signature

def get_1d_stream(
    file_path: str,
    settings: dict,
    first_pass: bool = True,
    stream: Optional[Stream] = None,
) -> Stream: ...
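
The error conditions listed under Raises can be illustrated with a stand-alone check. This is only a sketch, not the library's code: `check_1d_inputs`, `error_name`, and the key names in `REQUIRED_KEYS` are hypothetical, and the order in which the checks run is an assumption.

```python
import os

# Hypothetical key names, for illustration only; the real required keys
# are defined by particula_beta and are not listed on this page.
REQUIRED_KEYS = ("time_format", "delimiter")


def check_1d_inputs(file_path, settings, first_pass=True):
    """Raise the exception types listed under Raises (check order is assumed)."""
    if not first_pass:
        raise ValueError("Data already loaded; only one file can be loaded.")
    if not isinstance(settings, dict):
        raise TypeError("settings must be a dictionary.")
    if not os.path.isfile(file_path):
        raise FileNotFoundError(f"No such file: {file_path}")
    missing = [key for key in REQUIRED_KEYS if key not in settings]
    if missing:
        raise KeyError(f"Missing settings keys: {missing}")


def error_name(*args, **kwargs):
    """Return the name of the exception raised, or None if the check passes."""
    try:
        check_1d_inputs(*args, **kwargs)
    except (ValueError, TypeError, FileNotFoundError, KeyError) as err:
        return type(err).__name__
    return None
```

Calling `error_name` with `first_pass=False`, with a non-dict `settings`, or with a path that does not exist reports the corresponding exception type without interrupting a script.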

get_2d_stream

Show source in loader_interface.py:342

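Loads a 2D data stream from a file, using the supplied settings, and initializes or updates a Stream object.

Arguments


- `file_path` *str* - The path of the file to load data from.
- `settings` *dict* - The data formatting settings for the stream.
- `first_pass` *bool* - Whether this is the first time loading data. Defaults to True.
- `stream` *Stream, optional* - The Stream object to update. Defaults to None.

Returns


- `Stream` - The Stream object with the loaded data.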

Signature

def get_2d_stream(
    file_path: str,
    settings: dict,
    first_pass: bool = True,
    stream: Optional[Stream] = None,
) -> Stream: ...

get_new_files

Show source in loader_interface.py:12

Scan a directory for new files based on import settings and stream status.

This function looks for files in a specified path using import settings. It compares the new list of files with a pre-loaded list in the stream object to determine which files are new. The comparison is made based on file names and sizes. It returns a tuple with the paths of new files, a boolean indicating if this was the first pass, and a list of file information for new files.

Arguments


- `path` *str* - The top-level directory path to scan for files.
- `import_settings` *dict* - A dictionary with the keys 'relative_data_folder', 'filename_regex', and 'MIN_SIZE_BYTES', which specify the subfolder path, the regex pattern for filtering file names, and the minimum size of the files to be considered.
- `loaded_list` *list of lists, optional* - A list of lists with the names and sizes of files that have already been loaded. Defaults to None, in which case it is assumed that no files have been loaded.

Returns


- `tuple of (list, bool, list)` - A tuple containing a list of full paths of new files, a boolean indicating whether no previous files were loaded (True if it is the first pass), and a list of lists with new file names and sizes.

Signature

def get_new_files(
    path: str, import_settings: dict, loaded_list: Optional[list] = None
) -> tuple: ...
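
The comparison by file name and size described above can be sketched with the standard library alone. This is a minimal illustration under stated assumptions, not the library's implementation: `find_new_files` is a hypothetical stand-in, the regex is applied with `re.search`, the size threshold is treated as inclusive, and the first pass is taken to mean that `loaded_list` is None.

```python
import os
import re
import tempfile


def find_new_files(path, import_settings, loaded_list=None):
    """Illustrative stand-in for the described behaviour (not the library code)."""
    folder = os.path.join(path, import_settings["relative_data_folder"])
    pattern = re.compile(import_settings["filename_regex"])
    min_size = import_settings["MIN_SIZE_BYTES"]

    # Collect [name, size] pairs for files that pass the name and size filters.
    found = []
    for name in sorted(os.listdir(folder)):
        full_path = os.path.join(folder, name)
        if os.path.isfile(full_path) and pattern.search(name):
            size = os.path.getsize(full_path)
            if size >= min_size:  # assumed: threshold is inclusive
                found.append([name, size])

    # Assumed: first pass means no list of loaded files was supplied.
    first_pass = loaded_list is None
    seen = {tuple(entry) for entry in (loaded_list or [])}

    # A file counts as new if its (name, size) pair has not been seen.
    new_info = [entry for entry in found if tuple(entry) not in seen]
    new_paths = [os.path.join(folder, name) for name, _ in new_info]
    return new_paths, first_pass, new_info


# Usage on a throwaway directory with two CSV files and one ignored text file.
with tempfile.TemporaryDirectory() as root:
    os.makedirs(os.path.join(root, "raw"))
    for name, data in [("a.csv", b"1,2,3\n"), ("b.csv", b"4,5,6\n"), ("notes.txt", b"x")]:
        with open(os.path.join(root, "raw", name), "wb") as handle:
            handle.write(data)

    demo_settings = {
        "relative_data_folder": "raw",
        "filename_regex": r"\.csv$",
        "MIN_SIZE_BYTES": 1,
    }
    _, first_pass, first_info = find_new_files(root, demo_settings)
    # Pretend a.csv was loaded already; only b.csv should be reported as new.
    _, second_pass, second_info = find_new_files(
        root, demo_settings, loaded_list=[["a.csv", 6]]
    )
```

Because the comparison uses both name and size, a file whose size has changed since it was recorded is reported as new again in this sketch.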

load_files_interface

Show source in loader_interface.py:110

Load files into a stream object based on settings.

Arguments


- `path` *str* - The top-level directory path to scan for folders of data.
- `settings` *dict* - A dictionary of settings for the stream. The settings can be generated using the settings_generator function.
- `stream` *Stream, optional* - An instance of the Stream class to be updated with loaded data. Defaults to a new Stream object.
- `sub_sample` *int, optional* - If given, load only the first n files. Defaults to None.

Returns


- `Stream` - The Stream object updated with the loaded data.

Signature

def load_files_interface(
    path: str,
    settings: dict,
    stream: Optional[Stream] = None,
    sub_sample: Optional[int] = None,
) -> Stream: ...

load_folders_interface

Show source in loader_interface.py:184

Load files into a lake object based on settings.

Arguments


- `path` *str* - The top-level directory path to scan for folders of data.
- `folder_settings` *dict* - A dictionary with keys corresponding to the stream names and values corresponding to the settings for each stream. The settings can be generated using the settings_generator function.
- `lake` *Lake, optional* - An instance of the Lake class to be updated with loaded data. Defaults to a new Lake object.

Returns


- `Lake` - The Lake object updated with the loaded data streams.

Signature

def load_folders_interface(
    path: str, folder_settings: dict, lake: Optional[Lake] = None
) -> Lake: ...
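
The shape of `folder_settings`, one settings dictionary per stream name, suggests a simple loop over streams. The sketch below illustrates that idea only and is not the library's implementation: a plain dict stands in for the Lake object, `load_folders_sketch` and `fake_load_stream` are hypothetical, and passing an existing stream back to the loader so that it is updated is an assumption.

```python
def load_folders_sketch(path, folder_settings, load_stream, lake=None):
    """Fill a dict (standing in for Lake) with one entry per stream name."""
    lake = {} if lake is None else lake
    for stream_name, stream_settings in folder_settings.items():
        # Assumed: an existing stream is passed back in so it is updated,
        # not replaced.
        lake[stream_name] = load_stream(
            path, stream_settings, lake.get(stream_name)
        )
    return lake


def fake_load_stream(path, settings, stream=None):
    """Hypothetical loader: records which folder would have been read."""
    stream = {"folders": []} if stream is None else stream
    stream["folders"].append(f"{path}/{settings['relative_data_folder']}")
    return stream


# Stream names and folder names here are made up for the example.
demo_folder_settings = {
    "stream_a": {"relative_data_folder": "a_data"},
    "stream_b": {"relative_data_folder": "b_data"},
}
demo_lake = load_folders_sketch("data", demo_folder_settings, fake_load_stream)
```

Taking the loader as a parameter keeps the sketch independent of the real file readers, so the loop structure can be checked on its own.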
