
Commit 45267111 authored by Matthew Krafczyk

Change group_specs to a keyword argument; add None and [] behavior.

parent afa9600e
@@ -105,7 +105,7 @@ def contiguous_group_indices(df_or_series, sequence_index_level=None, sequence_c
return grp_ids
-def sequence_df(df, lags, group_specs):
+def sequence_df(df, lags, group_specs=None):
"""
Sequence feature data into multi-component rows.
@@ -132,7 +132,7 @@ def sequence_df(df, lags, group_specs):
Named Arguments
df: A Pandas dataframe containing a set of features for each day
lags: A list of lags to include
-group_specs: A list of tuples defining how groups are discovered.
+group_specs: A list of tuples defining how groups are discovered. (Can also be None or empty list)
'group' type specs - Group type specs specify columns, or index levels where the groups are already defined.
'sequence' type specs - Sequence type specs specify 'sequencable' columns. These columns have a 'by-one' well ordering defined.
This well ordering can either be implicit if you use integers, or you can pass a function which defines it.
@@ -148,10 +148,15 @@ def sequence_df(df, lags, group_specs):
('group', 'column', groups) - Use the groups series to define the groups to use. This is a column passed in.
('group', 'column', 'Group') - Use the 'Group' column of the dataframe to define the groups to use. This is a column passed in.
+If group_specs is None or an empty list, no grouping is done, and the lags are computed using shift directly on the input DataFrame.
returns
A pandas dataframe containing rows of prediction and/or label data.
"""
+if group_specs is None:
+    group_specs = []
by = []
level = []
remove_columns = []
......
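The ungrouped path described in the new docstring text can be sketched as follows. This is a minimal illustration, not the body of `sequence_df`: the helper name `sequence_without_groups` and the `(lag, column)` output layout are assumptions made for the example, and the grouped path is deliberately left out.

```python
import pandas as pd


def sequence_without_groups(df, lags, group_specs=None):
    """Hypothetical stand-in for the ungrouped branch of sequence_df."""
    # Normalize None to an empty list, as the commit does, so later code
    # only has to handle the list case.
    if group_specs is None:
        group_specs = []
    if group_specs:
        # Grouped sequencing is outside the scope of this sketch.
        raise NotImplementedError("only the ungrouped case is sketched here")
    # With no groups, each lag is a plain shift of the whole DataFrame.
    shifted = {lag: df.shift(lag) for lag in lags}
    # Assumed layout: one column block per lag, keyed by the lag value.
    return pd.concat(shifted, axis=1)
```

Calling `sequence_without_groups(df, [0, 1])` would then return the original columns next to a copy shifted down by one row, with missing values in the first row of the shifted block. Using `None` as the default rather than `[]` avoids sharing one mutable default list across calls.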