time_series_transform.io package
Submodules
time_series_transform.io.arrow module
time_series_transform.io.arrow.from_arrow_record_batch(time_series, timeSeriesCol, mainCategoryCol)
Transform an arrow record batch into Time_Series_Data or Time_Series_Data_Collection.
- Parameters
time_series (arrow record batch) – input data
timeSeriesCol (str or int) – index of time series column
mainCategoryCol (str or int) – index of category column
- Return type
Time_Series_Data or Time_Series_Data_Collection
time_series_transform.io.arrow.from_arrow_table(time_series, timeSeriesCol, mainCategoryCol)
Transform an arrow table into Time_Series_Data or Time_Series_Data_Collection.
- Parameters
time_series (arrow table) – input data
timeSeriesCol (str or int) – index of time series column
mainCategoryCol (str or int) – index of category column
- Return type
Time_Series_Data or Time_Series_Data_Collection
time_series_transform.io.arrow.to_arrow_record_batch(time_series, max_chunksize, expandCategory, expandTime, preprocessType, seperateLabels=False)
Transform Time_Series_Data or Time_Series_Data_Collection into an arrow record batch.
- Parameters
time_series (Time_Series_Data or Time_Series_Data_Collection) – input data
max_chunksize (int) – max size of record batch
expandCategory (bool) – whether to expand category
expandTime (bool) – whether to expand time
preprocessType (['ignore','pad','remove']) – how to preprocess the time index across categories
seperateLabels (bool, optional) – whether to separate labels and data, by default False
- Return type
arrow record batch
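The three preprocessType options recur in every to_* function in this package. The sketch below illustrates one plausible reading of them in plain Python. The semantics are an assumption inferred from the option names ('ignore' leaves the data alone, 'pad' fills missing timestamps, 'remove' drops timestamps not shared by all categories), and align_time_index is a hypothetical helper, not part of the library.

```python
from math import nan


def align_time_index(series_by_category, preprocess_type="ignore"):
    """Align per-category {time: value} mappings.

    Assumed semantics (inferred from the option names, not from the library):
      'ignore' - return every category unchanged
      'pad'    - add missing timestamps, filled with NaN
      'remove' - keep only timestamps present in every category
    """
    if preprocess_type == "ignore" or not series_by_category:
        return {cat: dict(s) for cat, s in series_by_category.items()}

    time_sets = [set(s) for s in series_by_category.values()]
    if preprocess_type == "pad":
        all_times = sorted(set().union(*time_sets))
        return {
            cat: {t: s.get(t, nan) for t in all_times}
            for cat, s in series_by_category.items()
        }
    if preprocess_type == "remove":
        shared = sorted(set.intersection(*time_sets))
        return {
            cat: {t: s[t] for t in shared}
            for cat, s in series_by_category.items()
        }
    raise ValueError(f"unknown preprocess_type: {preprocess_type!r}")


# Two categories whose time indices only partly overlap.
data = {"a": {1: 10, 2: 20, 3: 30}, "b": {2: 200, 3: 300, 4: 400}}
padded = align_time_index(data, "pad")
removed = align_time_index(data, "remove")
```

Under this reading, 'pad' and 'remove' both produce categories with an identical time index, which is what a wide (expanded) layout needs.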
time_series_transform.io.arrow.to_arrow_table(time_series, expandCategory, expandTime, preprocessType, seperateLabels=False)
Transform Time_Series_Data or Time_Series_Data_Collection into an arrow table.
- Parameters
time_series (Time_Series_Data or Time_Series_Data_Collection) – input data
expandCategory (bool) – whether to expand category
expandTime (bool) – whether to expand time
preprocessType (['ignore','pad','remove']) – how to preprocess the time index across categories
seperateLabels (bool, optional) – whether to separate labels and data, by default False
- Return type
arrow table
time_series_transform.io.base module
class time_series_transform.io.base.io_base(time_series, timeSeriesCol, mainCategoryCol)
Bases: object
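from_collection(expandCategory, expandTimeIx, preprocessType='ignore')
Prepare a Time_Series_Data_Collection as a dict of list.
- Parameters
expandCategory (bool) – whether to expand category
expandTimeIx (bool) – whether to expand time
preprocessType (['ignore','pad','remove'], optional) – how to preprocess the time index across categories, by default 'ignore'
- Return type
dict of list
- Raises
ValueError – invalid data
KeyError – invalid key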
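To make the "dict of list" output and the expandCategory flag more concrete, here is a minimal plain-Python sketch. It is an illustration only: collection_to_dict_of_list, the `{column}_{category}` naming, and the nested-dict input are assumptions, and the library's real column names and checks may differ.

```python
def collection_to_dict_of_list(collection, time_col, category_col,
                               expand_category=False):
    """Flatten {category: {column: list}} into a single dict of list.

    Hypothetical helper. Long form stacks the categories and adds a
    category column; wide form keeps one shared time column and one
    '<column>_<category>' column per category.
    """
    if not expand_category:
        out = {}
        for category, columns in collection.items():
            n_rows = len(columns[time_col])
            out.setdefault(category_col, []).extend([category] * n_rows)
            for name, values in columns.items():
                out.setdefault(name, []).extend(values)
        return out

    # The wide form only makes sense when every category has the same times.
    time_indices = [columns[time_col] for columns in collection.values()]
    if any(t != time_indices[0] for t in time_indices):
        raise ValueError("categories must share one time index before expanding")
    out = {time_col: list(time_indices[0])} if time_indices else {}
    for category, columns in collection.items():
        for name, values in columns.items():
            if name != time_col:
                out[f"{name}_{category}"] = list(values)
    return out


collection = {
    "A": {"time": [1, 2], "x": [10, 20]},
    "B": {"time": [1, 2], "x": [30, 40]},
}
long_form = collection_to_dict_of_list(collection, "time", "category")
wide_form = collection_to_dict_of_list(collection, "time", "category",
                                       expand_category=True)
```

In this sketch the wide form raises ValueError on mismatched time indices, which is one reason a preprocessing step such as 'pad' or 'remove' would run first.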
from_single(expandTime)
Transform Time_Series_Data into a dict of list.
- Parameters
expandTime (bool) – whether to expand time
- Return type
dict of list
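The effect of expandTime can be pictured as turning one row per timestamp into one column per timestamp. The following sketch shows that idea on plain lists. The helper name single_to_dict_of_list and the `{column}_{time}` naming are assumptions made for illustration, not the library's documented output.

```python
def single_to_dict_of_list(time_col, time_index, columns, expand_time=False):
    """Turn one time series into a dict of list.

    Hypothetical helper: with expand_time=False the time values stay in
    their own column; with expand_time=True each (column, time) pair
    becomes a single-row column named '<column>_<time>'.
    """
    if not expand_time:
        out = {time_col: list(time_index)}
        for name, values in columns.items():
            out[name] = list(values)
        return out

    out = {}
    for name, values in columns.items():
        for t, value in zip(time_index, values):
            out[f"{name}_{t}"] = [value]
    return out


long_form = single_to_dict_of_list("time", [1, 2], {"x": [10, 20]})
wide_form = single_to_dict_of_list("time", [1, 2], {"x": [10, 20]},
                                   expand_time=True)
```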
to_collection()
Transform data into a Time_Series_Data_Collection.
- Return type
Time_Series_Data_Collection
- Raises
KeyError – invalid input
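time_series_transform.io.feather module
time_series_transform.io.feather.from_feather(dirPath, timeSeriesCol, mainCategoryCol, columns=None)
Read a feather file into Time_Series_Data or Time_Series_Data_Collection.
- Parameters
dirPath (str) – directory to feather file
timeSeriesCol (str or int) – index of time series column
mainCategoryCol (str or int) – index of category column
columns (list, optional) – columns to fetch, by default None
- Return type
Time_Series_Data or Time_Series_Data_Collection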
time_series_transform.io.feather.to_feather(dirPaths, time_series_data, expandCategory, expandTime, preprocessType, seperateLabels=False, version=1, chunksize=None)
Write Time_Series_Data or Time_Series_Data_Collection to a feather file.
- Parameters
dirPaths (str) – directory to feather file
time_series_data (Time_Series_Data or Time_Series_Data_Collection) – input data
expandCategory (bool) – whether to expand category
expandTime (bool) – whether to expand time
preprocessType (['ignore','pad','remove']) – how to preprocess the time index across categories
seperateLabels (bool, optional) – whether to separate labels and data, by default False
version (int, optional) – feather version, by default 1
chunksize (int, optional) – size of feather file, by default None
time_series_transform.io.generator module
time_series_transform.io.numpy module
time_series_transform.io.numpy.from_numpy(numpyArray, timeSeriesCol, mainCategoryCol=None)
Transform a numpy ndArray into Time_Series_Data or Time_Series_Data_Collection.
- Parameters
numpyArray (numpy ndArray) – input data
timeSeriesCol – index of time series column
mainCategoryCol (optional) – index of category column, by default None
- Return type
Time_Series_Data or Time_Series_Data_Collection
- Raises
ValueError – invalid input data
time_series_transform.io.numpy.to_numpy(time_series_data, expandCategory, expandTime, preprocessType, seperateLabels=False)
Transform Time_Series_Data or Time_Series_Data_Collection into a numpy ndArray.
- Parameters
time_series_data (Time_Series_Data or Time_Series_Data_Collection) – input data
expandCategory (bool) – whether to expand category
expandTime (bool) – whether to expand time
preprocessType (['ignore','pad','remove']) – how to preprocess the time index across categories
seperateLabels (bool, optional) – whether to separate labels and data, by default False
- Return type
numpy ndArray
- Raises
ValueError
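The seperateLabels flag suggests that label columns can be returned apart from the feature columns. The sketch below shows that split on a list of rows. It is a guess at the idea only: split_label_columns is a hypothetical helper, and the shape of what the library actually returns when seperateLabels=True is not stated in this reference.

```python
def split_label_columns(rows, header, label_names):
    """Split row-oriented data into (data, labels) by column name.

    Hypothetical helper: columns whose name is in label_names go to
    labels, all other columns go to data; row order is preserved.
    """
    label_idx = [i for i, name in enumerate(header) if name in label_names]
    data_idx = [i for i, name in enumerate(header) if name not in label_names]
    data = [[row[i] for i in data_idx] for row in rows]
    labels = [[row[i] for i in label_idx] for row in rows]
    return data, labels


header = ["time", "x", "y"]
rows = [[1, 10.0, 0], [2, 11.0, 1]]
data, labels = split_label_columns(rows, header, {"y"})
```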
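time_series_transform.io.pandas module
time_series_transform.io.pandas.from_pandas(pandasFrame, timeSeriesCol, mainCategoryCol=None)
Transform a pandas dataFrame into Time_Series_Data or Time_Series_Data_Collection.
- Parameters
pandasFrame (pandas dataFrame) – input data
timeSeriesCol (str or int) – index of time series column
mainCategoryCol (str or int, optional) – index of category column, by default None
- Return type
Time_Series_Data or Time_Series_Data_Collection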
time_series_transform.io.pandas.to_pandas(time_series_data, expandCategory, expandTime, preprocessType, seperateLabels=False)
Transform Time_Series_Data or Time_Series_Data_Collection into a pandas dataFrame.
- Parameters
time_series_data (Time_Series_Data or Time_Series_Data_Collection) – input data
expandCategory (bool) – whether to expand category
expandTime (bool) – whether to expand time
preprocessType (['ignore','pad','remove']) – how to preprocess the time index across categories
seperateLabels (bool, optional) – whether to separate labels and data, by default False
- Return type
pandas dataFrame
- Raises
ValueError – invalid data input
time_series_transform.io.parquet module
time_series_transform.io.parquet.from_parquet(dirPath, timeSeriesCol, mainCategoryCol, columns=None, partitioning='hive', filters=None, filesystem=None)
Read parquet into Time_Series_Data or Time_Series_Data_Collection.
- Parameters
dirPath (str) – directory to parquet file
timeSeriesCol (str or int) – index of time series column
mainCategoryCol (str or int) – index of category column
columns (list, optional) – columns to fetch, by default None
partitioning (str, optional) – partition type, by default 'hive'
filters (str, optional) – parquet filter, by default None
filesystem (str, optional) – filesystem, by default None
- Return type
Time_Series_Data or Time_Series_Data_Collection
time_series_transform.io.parquet.to_parquet(dirPaths, time_series_data, expandCategory, expandTime, preprocessType, seperateLabels=False, version='1.0', isDataset=False, partition_cols=None)
Write Time_Series_Data or Time_Series_Data_Collection to parquet.
- Parameters
dirPaths (str) – directory to parquet file
time_series_data (Time_Series_Data or Time_Series_Data_Collection) – input data
expandCategory (bool) – whether to expand category
expandTime (bool) – whether to expand time
preprocessType (['ignore','pad','remove']) – how to preprocess the time index across categories
seperateLabels (bool, optional) – whether to separate labels and data, by default False
version (str, optional) – parquet version, by default '1.0'
isDataset (bool, optional) – whether to output as dataset, by default False
partition_cols (list, optional) – partition columns, by default None
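Hive-style partitioning, the default partitioning value in from_parquet, encodes each partition column as a `column=value` directory level. The sketch below builds and parses such paths with the standard library only, as a way to picture what partition_cols would produce on disk. The helper names and the file name part-0.parquet are hypothetical, and the sketch does not call the library or pyarrow.

```python
from pathlib import PurePosixPath


def hive_partition_path(root, row, partition_cols, filename="part-0.parquet"):
    """Build a Hive-style path: root/col1=val1/col2=val2/filename."""
    parts = [f"{col}={row[col]}" for col in partition_cols]
    return str(PurePosixPath(root, *parts, filename))


def parse_hive_partition(path):
    """Recover {column: value} from the 'column=value' path segments.

    Values come back as strings, since the path carries no type information.
    """
    return dict(
        segment.split("=", 1)
        for segment in PurePosixPath(path).parts
        if "=" in segment
    )


path = hive_partition_path("out", {"category": "A", "year": 2020},
                           ["category", "year"])
keys = parse_hive_partition(path)
```

Because the partition values live in the directory names, a reader can skip whole directories when a filter excludes them.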
Module contents
time_series_transform.io.from_arrow_table(time_series, timeSeriesCol, mainCategoryCol)
Transform an arrow table into Time_Series_Data or Time_Series_Data_Collection.
- Parameters
time_series (arrow table) – input data
timeSeriesCol (str or int) – index of time series column
mainCategoryCol (str or int) – index of category column
- Return type
Time_Series_Data or Time_Series_Data_Collection
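time_series_transform.io.from_feather(dirPath, timeSeriesCol, mainCategoryCol, columns=None)
Read a feather file into Time_Series_Data or Time_Series_Data_Collection.
- Parameters
dirPath (str) – directory to feather file
timeSeriesCol (str or int) – index of time series column
mainCategoryCol (str or int) – index of category column
columns (list, optional) – columns to fetch, by default None
- Return type
Time_Series_Data or Time_Series_Data_Collection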
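time_series_transform.io.from_numpy(numpyArray, timeSeriesCol, mainCategoryCol=None)
Transform a numpy ndArray into Time_Series_Data or Time_Series_Data_Collection.
- Parameters
numpyArray (numpy ndArray) – input data
timeSeriesCol – index of time series column
mainCategoryCol (optional) – index of category column, by default None
- Return type
Time_Series_Data or Time_Series_Data_Collection
- Raises
ValueError – invalid input data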
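time_series_transform.io.from_pandas(pandasFrame, timeSeriesCol, mainCategoryCol=None)
Transform a pandas dataFrame into Time_Series_Data or Time_Series_Data_Collection.
- Parameters
pandasFrame (pandas dataFrame) – input data
timeSeriesCol (str or int) – index of time series column
mainCategoryCol (str or int, optional) – index of category column, by default None
- Return type
Time_Series_Data or Time_Series_Data_Collection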
time_series_transform.io.from_parquet(dirPath, timeSeriesCol, mainCategoryCol, columns=None, partitioning='hive', filters=None, filesystem=None)
Read parquet into Time_Series_Data or Time_Series_Data_Collection.
- Parameters
dirPath (str) – directory to parquet file
timeSeriesCol (str or int) – index of time series column
mainCategoryCol (str or int) – index of category column
columns (list, optional) – columns to fetch, by default None
partitioning (str, optional) – partition type, by default 'hive'
filters (str, optional) – parquet filter, by default None
filesystem (str, optional) – filesystem, by default None
- Return type
Time_Series_Data or Time_Series_Data_Collection
time_series_transform.io.to_arrow_table(time_series, expandCategory, expandTime, preprocessType, seperateLabels=False)
Transform Time_Series_Data or Time_Series_Data_Collection into an arrow table.
- Parameters
time_series (Time_Series_Data or Time_Series_Data_Collection) – input data
expandCategory (bool) – whether to expand category
expandTime (bool) – whether to expand time
preprocessType (['ignore','pad','remove']) – how to preprocess the time index across categories
seperateLabels (bool, optional) – whether to separate labels and data, by default False
- Return type
arrow table
time_series_transform.io.to_feather(dirPaths, time_series_data, expandCategory, expandTime, preprocessType, seperateLabels=False, version=1, chunksize=None)
Write Time_Series_Data or Time_Series_Data_Collection to a feather file.
- Parameters
dirPaths (str) – directory to feather file
time_series_data (Time_Series_Data or Time_Series_Data_Collection) – input data
expandCategory (bool) – whether to expand category
expandTime (bool) – whether to expand time
preprocessType (['ignore','pad','remove']) – how to preprocess the time index across categories
seperateLabels (bool, optional) – whether to separate labels and data, by default False
version (int, optional) – feather version, by default 1
chunksize (int, optional) – size of feather file, by default None
time_series_transform.io.to_numpy(time_series_data, expandCategory, expandTime, preprocessType, seperateLabels=False)
Transform Time_Series_Data or Time_Series_Data_Collection into a numpy ndArray.
- Parameters
time_series_data (Time_Series_Data or Time_Series_Data_Collection) – input data
expandCategory (bool) – whether to expand category
expandTime (bool) – whether to expand time
preprocessType (['ignore','pad','remove']) – how to preprocess the time index across categories
seperateLabels (bool, optional) – whether to separate labels and data, by default False
- Return type
numpy ndArray
- Raises
ValueError
time_series_transform.io.to_pandas(time_series_data, expandCategory, expandTime, preprocessType, seperateLabels=False)
Transform Time_Series_Data or Time_Series_Data_Collection into a pandas dataFrame.
- Parameters
time_series_data (Time_Series_Data or Time_Series_Data_Collection) – input data
expandCategory (bool) – whether to expand category
expandTime (bool) – whether to expand time
preprocessType (['ignore','pad','remove']) – how to preprocess the time index across categories
seperateLabels (bool, optional) – whether to separate labels and data, by default False
- Return type
pandas dataFrame
- Raises
ValueError – invalid data input
time_series_transform.io.to_parquet(dirPaths, time_series_data, expandCategory, expandTime, preprocessType, seperateLabels=False, version='1.0', isDataset=False, partition_cols=None)
Write Time_Series_Data or Time_Series_Data_Collection to parquet.
- Parameters
dirPaths (str) – directory to parquet file
time_series_data (Time_Series_Data or Time_Series_Data_Collection) – input data
expandCategory (bool) – whether to expand category
expandTime (bool) – whether to expand time
preprocessType (['ignore','pad','remove']) – how to preprocess the time index across categories
seperateLabels (bool, optional) – whether to separate labels and data, by default False
version (str, optional) – parquet version, by default '1.0'
isDataset (bool, optional) – whether to output as dataset, by default False
partition_cols (list, optional) – partition columns, by default None