time_series_transform.io package
Submodules
time_series_transform.io.arrow module
-
time_series_transform.io.arrow.from_arrow_record_batch(time_series, timeSeriesCol, mainCategoryCol)
Transform an arrow record batch into Time_Series_Data or Time_Series_Data_Collection.
- Parameters
time_series (arrow record batch) – input data
mainCategoryCol (str or int) – index of category column
- Returns
- Return type
Time_Series_Data or Time_Series_Data_Collection
-
time_series_transform.io.arrow.from_arrow_table(time_series, timeSeriesCol, mainCategoryCol)
Transform an arrow table into Time_Series_Data or Time_Series_Data_Collection.
- Parameters
time_series (arrow table) – input data
mainCategoryCol (str or int) – index of category column
- Returns
- Return type
Time_Series_Data or Time_Series_Data_Collection
-
time_series_transform.io.arrow.to_arrow_record_batch(time_series, max_chunksize, expandCategory, expandTime, preprocessType, seperateLabels=False)
Transform Time_Series_Data or Time_Series_Data_Collection into an arrow record batch.
- Parameters
time_series (Time_Series_Data or Time_Series_Data_Collection) – input data
max_chunksize (int) – maximum size of each record batch
expandCategory (bool) – whether to expand category
expandTime (bool) – whether to expand time
preprocessType (['ignore','pad','remove']) – how to preprocess time data across categories
seperateLabels (bool) – whether to separate labels and data
- Returns
- Return type
arrow record batch
-
time_series_transform.io.arrow.to_arrow_table(time_series, expandCategory, expandTime, preprocessType, seperateLabels=False)
Transform Time_Series_Data or Time_Series_Data_Collection into an arrow table.
- Parameters
time_series (Time_Series_Data or Time_Series_Data_Collection) – input data
expandCategory (bool) – whether to expand category
expandTime (bool) – whether to expand time
preprocessType (['ignore','pad','remove']) – how to preprocess time data across categories
seperateLabels (bool) – whether to separate labels and data
- Returns
- Return type
arrow table
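The `preprocessType` options describe how time indices are handled across categories. The sketch below is a pure-Python illustration of one plausible reading, not the library's implementation: it assumes `'ignore'` leaves each category untouched, `'pad'` fills timestamps missing from a category with `None`, and `'remove'` keeps only timestamps present in every category. The function name `align_categories` and the nested-dict layout are hypothetical.

```python
def align_categories(collection, preprocess_type="ignore"):
    """Reconcile time indices across categories.

    collection: dict mapping category -> dict mapping timestamp -> value.
    This layout is a hypothetical stand-in, not the library's internal structure.
    """
    if preprocess_type == "ignore":
        # Assumption: leave every category exactly as it is.
        return {cat: dict(series) for cat, series in collection.items()}

    # Union of every timestamp seen in any category.
    all_times = set()
    for series in collection.values():
        all_times.update(series)

    if preprocess_type == "pad":
        # Assumption: every category gets every timestamp; gaps become None.
        return {
            cat: {t: series.get(t) for t in sorted(all_times)}
            for cat, series in collection.items()
        }

    if preprocess_type == "remove":
        # Assumption: keep only timestamps shared by all categories.
        shared = set(all_times)
        for series in collection.values():
            shared &= set(series)
        return {
            cat: {t: series[t] for t in sorted(shared)}
            for cat, series in collection.items()
        }

    raise ValueError(f"unknown preprocess_type: {preprocess_type!r}")


prices = {
    "A": {1: 10.0, 2: 11.0, 3: 12.0},
    "B": {2: 20.0, 3: 21.0, 4: 22.0},
}
padded = align_categories(prices, "pad")
trimmed = align_categories(prices, "remove")
```

Under these assumptions, `padded` gives both categories the timestamps 1 through 4, while `trimmed` keeps only timestamps 2 and 3.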
time_series_transform.io.base module
-
class time_series_transform.io.base.io_base(time_series, timeSeriesCol, mainCategoryCol)
Bases: object
-
from_collection(expandCategory, expandTimeIx, preprocessType='ignore')
Prepare a Time_Series_Data_Collection as a dict of list.
- Parameters
- Returns
- Return type
dict of list
- Raises
ValueError – invalid data
KeyError – invalid key
-
from_single(expandTime)
Transform Time_Series_Data into a dict of list.
- Parameters
expandTime (bool) – whether to expand Time
- Returns
- Return type
-
to_collection()
Transform data into a Time_Series_Data_Collection.
- Returns
- Return type
- Raises
KeyError – invalid input
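`from_single` and `from_collection` are documented as producing a dict of list, and `to_collection` as turning data into a collection. The sketch below shows that column-oriented shape and a split by a category column in plain Python. It is an illustration under assumptions, not the class's code: the helper name `group_by_category` and the sample columns are hypothetical, with `time` and `category` standing in for whatever `timeSeriesCol` and `mainCategoryCol` name.

```python
from collections import defaultdict


def group_by_category(columns, time_col, category_col):
    """Split a dict of lists into one dict of lists per category."""
    if time_col not in columns or category_col not in columns:
        raise KeyError("time or category column is missing")
    lengths = {len(values) for values in columns.values()}
    if len(lengths) > 1:
        raise ValueError("all columns must have the same length")

    grouped = defaultdict(lambda: defaultdict(list))
    for row_ix, category in enumerate(columns[category_col]):
        for name, values in columns.items():
            # The category column becomes the grouping key, not a data column.
            if name != category_col:
                grouped[category][name].append(values[row_ix])
    # Convert nested defaultdicts to plain dicts for a predictable result.
    return {cat: dict(cols) for cat, cols in grouped.items()}


flat = {
    "time": [1, 2, 1, 2],
    "category": ["A", "A", "B", "B"],
    "value": [10, 11, 20, 21],
}
per_category = group_by_category(flat, "time", "category")
```

The `KeyError` and `ValueError` raised here mirror the exception types listed for the methods above, although the exact conditions that trigger them in the library are not stated in this reference.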
-
time_series_transform.io.feather module
-
time_series_transform.io.feather.from_feather(dirPath, timeSeriesCol, mainCategoryCol, columns=None)
Read a feather file into Time_Series_Data or Time_Series_Data_Collection.
- Parameters
- Returns
- Return type
Time_Series_Data or Time_Series_Data_Collection
-
time_series_transform.io.feather.to_feather(dirPaths, time_series_data, expandCategory, expandTime, preprocessType, seperateLabels=False, version=1, chunksize=None)
Transform Time_Series_Data or Time_Series_Data_Collection into a feather file.
- Parameters
dirPaths (str) – directory to feather file
time_series_data (Time_Series_Data or Time_Series_Data_Collection) – input data
expandCategory (bool) – whether to expand category
expandTime (bool) – whether to expand time
preprocessType (['ignore','pad','remove']) – how to preprocess time data across categories
seperateLabels (bool) – whether to separate labels and data
version (int, optional) – feather version, by default 1
chunksize (int, optional) – size of feather file, by default None
time_series_transform.io.generator module
time_series_transform.io.numpy module
-
time_series_transform.io.numpy.from_numpy(numpyArray, timeSeriesCol, mainCategoryCol=None)
Transform a numpy ndarray into Time_Series_Data or Time_Series_Data_Collection.
- Parameters
- Returns
- Return type
- Raises
ValueError – invalid input data
-
time_series_transform.io.numpy.to_numpy(time_series_data, expandCategory, expandTime, preprocessType, seperateLabels=False)
Transform Time_Series_Data or Time_Series_Data_Collection into a numpy ndarray.
- Parameters
time_series_data (Time_Series_Data or Time_Series_Data_Collection) – input data
expandCategory (bool) – whether to expand category
expandTime (bool) – whether to expand time
preprocessType (['ignore','pad','remove']) – how to preprocess time data across categories
seperateLabels (bool) – whether to separate labels and data
- Returns
- Return type
numpy ndarray
- Raises
ValueError
time_series_transform.io.pandas module
-
time_series_transform.io.pandas.from_pandas(pandasFrame, timeSeriesCol, mainCategoryCol=None)
Transform a pandas DataFrame into Time_Series_Data or Time_Series_Data_Collection.
- Parameters
- Returns
- Return type
-
time_series_transform.io.pandas.to_pandas(time_series_data, expandCategory, expandTime, preprocessType, seperateLabels=False)
Transform Time_Series_Data or Time_Series_Data_Collection into a pandas DataFrame.
- Parameters
time_series_data (Time_Series_Data or Time_Series_Data_Collection) – input data
expandCategory (bool) – whether to expand category
expandTime (bool) – whether to expand time
preprocessType (['ignore','pad','remove']) – how to preprocess time data across categories
seperateLabels (bool) – whether to separate labels and data
- Returns
- Return type
pandas DataFrame
- Raises
ValueError – invalid data input
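`expandCategory` is described only as "whether to expand category". One common meaning is turning per-category series into side-by-side columns, which the sketch below illustrates in plain Python. This is an assumption about the intent rather than a description of what `to_pandas` does; the function `expand_category` and the `<column>_<category>` naming scheme are hypothetical.

```python
def expand_category(per_category, time_col):
    """Pivot {category: {column: [values]}} into a single wide dict of lists.

    Assumes every category shares the same time index, for example after
    a 'pad' or 'remove' style alignment.
    """
    categories = list(per_category)
    if not categories:
        return {}
    time_index = per_category[categories[0]][time_col]
    wide = {time_col: list(time_index)}
    for category in categories:
        columns = per_category[category]
        if columns[time_col] != time_index:
            raise ValueError("time index differs across categories")
        for name, values in columns.items():
            # Every non-time column gets one output column per category.
            if name != time_col:
                wide[f"{name}_{category}"] = list(values)
    return wide


wide = expand_category(
    {
        "A": {"time": [1, 2], "value": [10, 11]},
        "B": {"time": [1, 2], "value": [20, 21]},
    },
    "time",
)
```

In this sketch the two categories end up as the columns `value_A` and `value_B` next to a single shared `time` column.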
time_series_transform.io.parquet module
-
time_series_transform.io.parquet.from_parquet(dirPath, timeSeriesCol, mainCategoryCol, columns=None, partitioning='hive', filters=None, filesystem=None)
Transform parquet into Time_Series_Data or Time_Series_Data_Collection.
- Parameters
dirPath (str) – directory to parquet file
timeSeriesCol (str or int) – index of time series column
mainCategoryCol (str or int) – index of category column
columns (list, optional) – columns to fetch, by default None
partitioning (str, optional) – partition type, by default ‘hive’
filters (str, optional) – parquet filter, by default None
filesystem (str, optional) – filesystem, by default None
- Returns
- Return type
-
time_series_transform.io.parquet.to_parquet(dirPaths, time_series_data, expandCategory, expandTime, preprocessType, seperateLabels=False, version='1.0', isDataset=False, partition_cols=None)
Transform Time_Series_Data or Time_Series_Data_Collection into parquet.
- Parameters
dirPaths (str) – directory to parquet file
time_series_data (Time_Series_Data or Time_Series_Data_Collection) – input data
expandCategory (bool) – whether to expand category
expandTime (bool) – whether to expand time
preprocessType (['ignore','pad','remove']) – how to preprocess time data across categories
seperateLabels (bool) – whether to separate labels and data
version (str, optional) – parquet version, by default ‘1.0’
isDataset (bool, optional) – whether to output as dataset, by default False
partition_cols (list, optional) – partition columns, by default None
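The `partitioning='hive'` default of `from_parquet` and the `partition_cols` argument of `to_parquet` refer to Hive-style datasets, where each partition column becomes a `key=value` directory level. The sketch below only builds such a path with the standard library to show the layout. It writes no files and does not call the library; the helper `hive_partition_path`, the `dataset` root, and the `part-0.parquet` file name are hypothetical.

```python
from pathlib import PurePosixPath


def hive_partition_path(root, row, partition_cols, file_name="part-0.parquet"):
    """Return the Hive-style path a row would be stored under."""
    path = PurePosixPath(root)
    for col in partition_cols:
        # Each partition column contributes one "key=value" directory.
        path = path / f"{col}={row[col]}"
    return str(path / file_name)


row = {"category": "A", "year": 2020, "value": 10.5}
path = hive_partition_path("dataset", row, ["category", "year"])
```

The order of `partition_cols` determines the nesting order of the directories, so `["category", "year"]` places the `year=...` level inside the `category=...` level.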
Module contents
-
time_series_transform.io.from_arrow_table(time_series, timeSeriesCol, mainCategoryCol)
Transform an arrow table into Time_Series_Data or Time_Series_Data_Collection.
- Parameters
time_series (arrow table) – input data
mainCategoryCol (str or int) – index of category column
- Returns
- Return type
Time_Series_Data or Time_Series_Data_Collection
-
time_series_transform.io.from_feather(dirPath, timeSeriesCol, mainCategoryCol, columns=None)
Read a feather file into Time_Series_Data or Time_Series_Data_Collection.
- Parameters
- Returns
- Return type
Time_Series_Data or Time_Series_Data_Collection
-
time_series_transform.io.from_numpy(numpyArray, timeSeriesCol, mainCategoryCol=None)
Transform a numpy ndarray into Time_Series_Data or Time_Series_Data_Collection.
- Parameters
- Returns
- Return type
- Raises
ValueError – invalid input data
-
time_series_transform.io.from_pandas(pandasFrame, timeSeriesCol, mainCategoryCol=None)
Transform a pandas DataFrame into Time_Series_Data or Time_Series_Data_Collection.
- Parameters
- Returns
- Return type
-
time_series_transform.io.from_parquet(dirPath, timeSeriesCol, mainCategoryCol, columns=None, partitioning='hive', filters=None, filesystem=None)
Transform parquet into Time_Series_Data or Time_Series_Data_Collection.
- Parameters
dirPath (str) – directory to parquet file
timeSeriesCol (str or int) – index of time series column
mainCategoryCol (str or int) – index of category column
columns (list, optional) – columns to fetch, by default None
partitioning (str, optional) – partition type, by default ‘hive’
filters (str, optional) – parquet filter, by default None
filesystem (str, optional) – filesystem, by default None
- Returns
- Return type
-
time_series_transform.io.to_arrow_table(time_series, expandCategory, expandTime, preprocessType, seperateLabels=False)
Transform Time_Series_Data or Time_Series_Data_Collection into an arrow table.
- Parameters
time_series (Time_Series_Data or Time_Series_Data_Collection) – input data
expandCategory (bool) – whether to expand category
expandTime (bool) – whether to expand time
preprocessType (['ignore','pad','remove']) – how to preprocess time data across categories
seperateLabels (bool) – whether to separate labels and data
- Returns
- Return type
arrow table
-
time_series_transform.io.to_feather(dirPaths, time_series_data, expandCategory, expandTime, preprocessType, seperateLabels=False, version=1, chunksize=None)
Transform Time_Series_Data or Time_Series_Data_Collection into a feather file.
- Parameters
dirPaths (str) – directory to feather file
time_series_data (Time_Series_Data or Time_Series_Data_Collection) – input data
expandCategory (bool) – whether to expand category
expandTime (bool) – whether to expand time
preprocessType (['ignore','pad','remove']) – how to preprocess time data across categories
seperateLabels (bool) – whether to separate labels and data
version (int, optional) – feather version, by default 1
chunksize (int, optional) – size of feather file, by default None
-
time_series_transform.io.to_numpy(time_series_data, expandCategory, expandTime, preprocessType, seperateLabels=False)
Transform Time_Series_Data or Time_Series_Data_Collection into a numpy ndarray.
- Parameters
time_series_data (Time_Series_Data or Time_Series_Data_Collection) – input data
expandCategory (bool) – whether to expand category
expandTime (bool) – whether to expand time
preprocessType (['ignore','pad','remove']) – how to preprocess time data across categories
seperateLabels (bool) – whether to separate labels and data
- Returns
- Return type
numpy ndarray
- Raises
ValueError
-
time_series_transform.io.to_pandas(time_series_data, expandCategory, expandTime, preprocessType, seperateLabels=False)
Transform Time_Series_Data or Time_Series_Data_Collection into a pandas DataFrame.
- Parameters
time_series_data (Time_Series_Data or Time_Series_Data_Collection) – input data
expandCategory (bool) – whether to expand category
expandTime (bool) – whether to expand time
preprocessType (['ignore','pad','remove']) – how to preprocess time data across categories
seperateLabels (bool) – whether to separate labels and data
- Returns
- Return type
pandas DataFrame
- Raises
ValueError – invalid data input
-
time_series_transform.io.to_parquet(dirPaths, time_series_data, expandCategory, expandTime, preprocessType, seperateLabels=False, version='1.0', isDataset=False, partition_cols=None)
Transform Time_Series_Data or Time_Series_Data_Collection into parquet.
- Parameters
dirPaths (str) – directory to parquet file
time_series_data (Time_Series_Data or Time_Series_Data_Collection) – input data
expandCategory (bool) – whether to expand category
expandTime (bool) – whether to expand time
preprocessType (['ignore','pad','remove']) – how to preprocess time data across categories
seperateLabels (bool) – whether to separate labels and data
version (str, optional) – parquet version, by default ‘1.0’
isDataset (bool, optional) – whether to output as dataset, by default False
partition_cols (list, optional) – partition columns, by default None