f_arrange() can now sort data frames in-place via the .in_place argument.
Fixed rchk issues flagged on CRAN.
Simple functions such as the base R math operators +, /, abs, etc., are
now internally marked as group-unaware. This yields a significant speed
improvement for large grouped data frames.
This means that expressions containing only group-unaware functions, e.g.
(x + y) / abs(z), are evaluated on the entire data frame instead
of on a by-group basis.
If the expression contains any functions not marked as group-unaware, e.g.
x + cumsum(y) (as cumsum() is not flagged as group-unaware),
then usual evaluation applies except in the case of other statistical functions
which are optimised in a separate way.
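As a rough illustration of the distinction above, the sketch below runs one group-unaware expression and one that includes cumsum() on a small grouped data frame. The data and column names are made up for the example. The results are the same whichever evaluation path is taken; only the speed differs.

```r
library(fastplyr)

# Hypothetical example data
df <- data.frame(
  g = c("a", "a", "b", "b"),
  x = c(1, 2, 3, 4),
  y = c(4, 5, 6, 7),
  z = c(-1, 2, -3, 4)
)
grouped <- f_group_by(df, g)

# Only +, / and abs(): all group-unaware, so per the entry above this
# can be evaluated once on the entire data frame
out_unaware <- f_mutate(grouped, ratio = (x + y) / abs(z))

# cumsum() is not flagged as group-unaware, so the usual by-group
# evaluation applies and the running total restarts in each group
out_grouped <- f_mutate(grouped, running = x + cumsum(y))
```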
across, pick, etc.
Accessing columns through .data should work correctly now.
f_reframe would not recycle correctly in some cases and has now been fixed.
An issue where f_arrange would add variables has been fixed.
An issue where across was selecting grouped variables has been fixed.
Fixed an issue where in some cases lists were not being handled correctly in
calls to across().
Many common expressions, such as sum(), mean() and many others have been
optimised in functions like f_summarise(). For a current list of
optimised functions, see ?f_summarise.
f_mutate as an alternative to mutate
f_reframe as an alternative to reframe
Fast group metadata helper functions f_group_data, f_group_indices,
f_group_keys, f_group_rows, f_group_size and f_n_groups.
Small bug fix when f_summarise calculates means and medians
for zero-row data frames with integer variables.
R 4.0.0 now required.
f_summarise returning results in the incorrect order.
New function list_tidy as an alternative to list that evaluates
arguments dynamically with a focus on setting precedence for objects created
in the list over environment objects.
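A minimal sketch of the precedence rule described for list_tidy, assuming default arguments. The variable names are illustrative only.

```r
library(fastplyr)

# An environment object that shares its name with a list element
x <- 100

# y refers to the x created inside the list, not the environment's x
res <- list_tidy(x = 1:3, y = x * 2)
```

With base list(), the same call would compute y from the environment object instead.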
new_tbl now evaluates its arguments dynamically. f_expand also
evaluates its arguments dynamically unless the data is grouped and the
expressions supplied aren't simply column selections.
New function f_pull as a fast convenience function for extracting
vectors from columns.
New functions remove_rows_if_any_na and remove_rows_if_all_na.
f_arrange gains the .descending argument to efficiently
return data frames in descending order.
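For instance, sorting by a single column in descending order might look like the sketch below. The data frame is hypothetical.

```r
library(fastplyr)

# Hypothetical example data
df <- data.frame(id = c("a", "b", "c"), score = c(2, 9, 5))

# Sort rows by score, largest first
sorted <- f_arrange(df, score, .descending = TRUE)
```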
f_fill to fill NA values forwards and backwards by group.
f_bind_rows sees a noticeable speed improvement.
f_summarise now returns results in the correct order when both
multiple cols and multiple optimised functions were specified.
Joins were returning an error when x and y are grouped_df objects.
The join by argument now accepts a partially named character vector without throwing an error.
tidy_quantiles would return an error when probabilities were not sorted and
has now been fixed.
The seed argument of f_slice_sample is soft-deprecated. To sample, or to
run any other RNG-based function, with a local seed,
use cheapr::with_local_seed().
tidy_quantiles gains dramatic speed and efficiency improvements.
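As a sketch of the replacement for the soft-deprecated seed argument of f_slice_sample, the example below wraps the sampling call in cheapr::with_local_seed(). It assumes the seed is passed via an argument named .seed, so check ?cheapr::with_local_seed before relying on it. The data frame is hypothetical.

```r
library(fastplyr)

# Hypothetical example data
df <- data.frame(id = 1:10)

# Assumption: the local seed is supplied through `.seed`
first_draw <- cheapr::with_local_seed(f_slice_sample(df, n = 3), .seed = 42)
second_draw <- cheapr::with_local_seed(f_slice_sample(df, n = 3), .seed = 42)

# Both draws use the same local seed, so they should select the same rows
same_rows <- identical(first_draw$id, second_draw$id)
```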
The order and sort arguments for data frame functions have been
superseded in favour of .order and .sort.
New argument .order added to both f_summarise and tidy_quantiles
to allow for controlling the order of groups.
rowwise_df is now explicitly unsupported. To group by row, use f_rowwise.
New functions f_nest_by, f_rowwise and add_consecutive_id.
A few bug fixes including:
f_bind_rows was not working when supplied with more than 2 data frames in
some cases.
f_summarise was not working when supplied with non-function expressions.
f_bind_cols now recycles its arguments and converts non-data frames
to data frames to allow for joining variables as if they were columns.
Fixed a bug in the f_join functions where incorrect matches were
occurring when the columns being joined on are 'exotic' variables, e.g.
lists, lubridate 'Intervals', etc. Currently fastplyr uses a proxy method to
join these kinds of variables through the use of group_id. This was not being
applied correctly for joined exotic variables and should now be fixed.
New function f_consecutive_id as an alternative to dplyr::consecutive_id.
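A small sketch of what a consecutive ID looks like, assuming f_consecutive_id takes a single vector. The input is made up.

```r
library(fastplyr)

# The ID increases each time the value changes from the previous element,
# so the final "a" starts a new run rather than rejoining the first one
x <- c("a", "a", "b", "b", "a")
ids <- f_consecutive_id(x)
```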