Novo pacote do R – forecastHybrid

Direto do Peter Stats Blog

The new forecastHybrid package for R by David Shaub and myself provides convenient access to ensemble time series forecasting methods, in any combination of up to five of the modelling approaches from Hyndman’s forecast package. These are auto-selected autoregressive integrated moving average; exponential smoothing state space (both ets and tbats); feed forward neural network with a single hidden layer and lagged inputs; and forecasts of loess-based seasonal decomposition.

Background and motivation
In an earlier post I explored ways that might improve on standard methods for prediction intervals from univariate time series forecasting. One of the tools I used was a convenience function to combine forecasts from Rob Hyndman’s ets and auto.arima functions. David Shaub (with a small contribution from myself) has now built and published an R package forecastHybrid that expands on this idea to create ensembles from other forecasting methods from Hyndman’s forecast package.

The motivation is to make it easy to improve forecasts, both their point estimates and their prediction intervals. It has been well known for many years that taking the average of rival forecast methods improves the performance of forecasts. This new R package aims to make it as easy for people to do this as to fit the individual models in the first place.

Anúncios
Novo pacote do R – forecastHybrid

Gradient Boosted Trees Model no Microsoft R Server

Direto do Revolutions Blog

R is an open source, statistical programming language with millions of users in its community. However, a well-known weakness of R is that it is both single threaded and memory bound, which limits its ability to process big data. With Microsoft R Server (MRS), the enterprise grade distribution of R for advanced analytics, users can continue to work in their preferred R environment with following benefits: the ability to scale to data of any size, potential speed increases of up to one hundred times faster than open source R.

In this article, we give a walk-through on how to build a gradient boosted tree using MRS. We use a simple fraud data data set having approximately 1 million records and 9 columns. The last column “fraudRisk” is the tag: 0 stands for non-fraud and 1 stands for fraud. The following is a snapshot of the data.

Gradient Boosted Trees Model no Microsoft R Server