cfl_data_utils.utils package
Submodules
cfl_data_utils.utils.dict_to_object module
A small class that converts a JSON object or dictionary into a Python object.
This module “formalises” JSON objects into Python objects so that attributes can be referenced with dot notation rather than index lookups.
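The idea of dot-notation access can be illustrated with a minimal sketch. The class name `DictToObject` and the helper `_wrap` are hypothetical and are not taken from the module itself; the real class may behave differently, for example in how it treats nested lists.

```python
class DictToObject:
    """Hypothetical stand-in: exposes dictionary keys as attributes."""

    def __init__(self, data):
        for key, value in data.items():
            setattr(self, key, self._wrap(value))

    @classmethod
    def _wrap(cls, value):
        # Recurse into nested dictionaries and lists so that
        # dot notation works at every level of the structure
        if isinstance(value, dict):
            return cls(value)
        if isinstance(value, list):
            return [cls._wrap(item) for item in value]
        return value


record = DictToObject({"user": {"name": "Ada", "tags": ["a", "b"]}, "count": 3})
print(record.user.name)  # dot notation instead of record["user"]["name"]
```

Keys that are not valid Python identifiers (for example ones containing spaces) would still be stored, but could only be reached with `getattr`.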
cfl_data_utils.utils.utils module
This module holds smaller utility functions that are useful across many projects.
Todo
- better type testing using isinstance or assert
cfl_data_utils.utils.utils.assert_date_is_yesterday(date)
Checks that a date is yesterday, for data validation.
Parameters: date (datetime) – The date to be validated
Raises: AssertionError – if the date isn’t yesterday
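A check of this kind might look like the sketch below. The function name `assert_date_is_yesterday_sketch` and the extra `today` parameter are hypothetical, added only so the behaviour can be demonstrated with fixed dates; the library function takes a single `date` argument.

```python
from datetime import datetime, timedelta


def assert_date_is_yesterday_sketch(date, today=None):
    """Raise AssertionError unless `date` falls on the day before `today`."""
    # `today` is a hypothetical parameter; it defaults to the current time
    today = today or datetime.now()
    yesterday = (today - timedelta(days=1)).date()
    # Compare calendar dates only, so the time of day is ignored
    assert date.date() == yesterday, f"{date.date()} is not yesterday ({yesterday})"


# Passes silently: 1 January is the day before 2 January
assert_date_is_yesterday_sketch(datetime(2020, 1, 1, 9, 30), today=datetime(2020, 1, 2))
```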
cfl_data_utils.utils.utils.get_col_types(data, sql=False)
Returns column headers and their types from a CSV or JSON file.
Parameters:
- data (Union[BufferedReader, BufferedWriter, TextIOWrapper, str, dict]) – the file to be parsed
- sql (bool) – flag to say whether the types should be in SQL dialect or not
Returns: A list of two-element dictionaries (column name and value type). For example:
[{name: 'col1', type: 'typ1'}, {name: 'col2', type: 'typ2'}, {name: 'col3', type: 'typ3'}]
The types can be either Python types or their SQL dialect counterparts (str vs 'TEXT').
Return type: List
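The shape of the return value can be illustrated with a small standalone sketch that infers column types from CSV text. The names `col_types_sketch`, `infer_type` and `SQL_TYPES` are hypothetical, the `FLOAT` mapping is an assumption, and inferring from the first data row only is a simplification; the library may inspect the data differently.

```python
import csv
import io

# Assumed mapping from Python types to SQL type names
SQL_TYPES = {int: "INT", float: "FLOAT", str: "TEXT"}


def infer_type(text):
    """Return int, float or str depending on what `text` parses as."""
    for candidate in (int, float):
        try:
            candidate(text)
            return candidate
        except ValueError:
            continue
    return str


def col_types_sketch(csv_text, sql=False):
    """Infer one type per column from the first data row of CSV text."""
    reader = csv.DictReader(io.StringIO(csv_text))
    first_row = next(reader)
    result = []
    for name, value in first_row.items():
        py_type = infer_type(value)
        result.append({"name": name, "type": SQL_TYPES[py_type] if sql else py_type})
    return result


sample = "id,price,label\n1,9.99,widget\n"
print(col_types_sketch(sample, sql=True))
```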
cfl_data_utils.utils.utils.get_var_type(value, sql=False)
Gets a Python type from a string variable
Parameters:
- value (Union[int, float, str, datetime]) – The variable to be type-checked
- sql (bool) – Flag to decide if return type should be SQL-ized (e.g. int vs INT)
Returns: Type of value passed in, either in SQL format ready for a query or as a Python type
Examples
>>> get_var_type(123, sql=True)
'INT'
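One way to map a value onto either a Python type or an SQL type name is an ordered `isinstance` check. The sketch below is an assumption about the approach, not the library's code: `var_type_sketch` and `SQL_NAMES` are hypothetical names, and only the `INT` and `TEXT` names appear in this documentation, so `FLOAT` and `DATETIME` are guesses.

```python
from datetime import datetime

# Assumed SQL names; only INT and TEXT are confirmed by the documentation
SQL_NAMES = {int: "INT", float: "FLOAT", datetime: "DATETIME", str: "TEXT"}


def var_type_sketch(value, sql=False):
    """Return the type of `value`, optionally as an SQL type name."""
    for py_type in (int, float, datetime, str):
        if isinstance(value, py_type):
            return SQL_NAMES[py_type] if sql else py_type
    raise TypeError(f"unsupported type: {type(value).__name__}")


print(var_type_sketch(123, sql=True))
print(var_type_sketch("abc"))
```

Note that `bool` is a subclass of `int` in Python, so this sketch would report `True` as an integer.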
cfl_data_utils.utils.utils.increment_progress_display(processed=None, goal=100, start_time=None, downloaded=None, print_line=None, terminal_width=None)
Displays a progress bar to track data processing progress.
Parameters:
- processed (int, optional) – the amount of processing done so far (e.g. number of iterations)
- goal (int) – the total amount of processing to be done
- start_time (float, optional) – the time at which the processing was started
- downloaded (float, optional) – amount of data downloaded
- print_line (int, optional) – the line of the terminal to print the progress bar on
- terminal_width (int, optional) – width of terminal used for sizing progress bar
Returns: If the processed arg is passed in, it is incremented by 1 and returned, so that a loop counter (in while loops etc.) can be advanced as part of this call. Otherwise, None is returned.
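The increment-and-return pattern described for the return value can be sketched as follows. `progress_sketch` is a hypothetical, much-reduced stand-in: it has a `width` parameter in place of `terminal_width` and ignores `start_time`, `downloaded` and `print_line`.

```python
import sys


def progress_sketch(processed=None, goal=100, width=40):
    """Draw a one-line progress bar; return processed + 1, or None if no counter was given."""
    done = processed or 0
    filled = int(width * min(done, goal) / goal)
    bar = "#" * filled + "-" * (width - filled)
    # The carriage return redraws the bar on the same terminal line
    sys.stdout.write(f"\r[{bar}] {done}/{goal}")
    sys.stdout.flush()
    return None if processed is None else processed + 1


count = 0
while count < 5:
    # ... one unit of work would go here ...
    count = progress_sketch(count, goal=5, width=10)
print()
```

Because the function returns the incremented counter, the loop needs no separate `count += 1` statement.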
cfl_data_utils.utils.utils.sqlize(string)
SQL-izes a string to ensure the characters are all legal.
Parameters: string (str) – The string to be processed
Returns: the processed SQL-friendly string
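The documentation does not say which characters count as legal, so the following is only one plausible interpretation: keep ASCII letters, digits and underscores, replace everything else with an underscore, and avoid a leading digit. The name `sqlize_sketch` and these rules are assumptions.

```python
import re


def sqlize_sketch(string):
    """Replace characters that are unsafe in an unquoted SQL identifier."""
    # \W with re.ASCII matches anything outside [a-zA-Z0-9_]
    cleaned = re.sub(r"\W", "_", string, flags=re.ASCII)
    # Unquoted identifiers generally cannot start with a digit
    if cleaned and cleaned[0].isdigit():
        cleaned = "_" + cleaned
    return cleaned


print(sqlize_sketch("order total (GBP)"))
```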
cfl_data_utils.utils.utils.time_to_epoch(human_time=None, year=None, month=None, day=None, hour=None, minute=None, second=None)
Converts a time to an epoch timestamp.
It can take arguments in several different formats:
- nothing can be passed, and the current epoch will be returned
- each component part of the timestamp can be passed (e.g. year, month, day)
- human_time is for more easily typeable time formats, e.g. 19700101120000 or 1970-01-01 12:00:00
Parameters:
- human_time (str) – a more human-readable time to allow easier entry
- year (int) – year to be converted
- month (int) – month to be converted
- day (int) – day to be converted
- hour (int) – hour to be converted
- minute (int) – minute to be converted
- second (int) – second to be converted
Returns: The time passed in (or the current time otherwise) as time since epoch in seconds
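The three calling styles can be sketched with the standard library alone. Several things here are assumptions: the name `time_to_epoch_sketch`, the treatment of all inputs as UTC, the default values for omitted components, and the two accepted string formats (taken from the two examples in the description). The library function may handle time zones and missing components differently.

```python
import time
from datetime import datetime, timezone

# Assumed input formats, matching the two examples in the description
HUMAN_FORMATS = ("%Y%m%d%H%M%S", "%Y-%m-%d %H:%M:%S")


def time_to_epoch_sketch(human_time=None, year=None, month=1, day=1,
                         hour=0, minute=0, second=0):
    """Return seconds since the epoch, treating all inputs as UTC (an assumption)."""
    if human_time is not None:
        # Try each known format until one parses
        for fmt in HUMAN_FORMATS:
            try:
                parsed = datetime.strptime(human_time, fmt)
                break
            except ValueError:
                continue
        else:
            raise ValueError(f"unrecognised time format: {human_time!r}")
    elif year is not None:
        parsed = datetime(year, month, day, hour, minute, second)
    else:
        # Nothing passed: return the current epoch
        return int(time.time())
    return int(parsed.replace(tzinfo=timezone.utc).timestamp())


print(time_to_epoch_sketch("1970-01-01 12:00:00"))
print(time_to_epoch_sketch(year=1970, month=1, day=2))
```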