Machine Learning Principles Can Improve Hip Fracture Prediction

Abstract: We apply machine learning principles to predict
hip fractures and estimate predictor importance in
dual-energy X-ray absorptiometry (DXA)-scanned men
and women. DXA data from
two Danish regions between 1996 and 2006 were combined
with national Danish patient data to comprise 4722
women and 717 men with 5 years of follow-up time (original
cohort n=6606 men and women). Twenty-four statistical
models were built on 75% of data points through 5-fold (k = 5),
5-repeat cross-validation, and then validated on the remaining
25% of data points to calculate area under the curve
(AUC) and calibrate probability estimates. The best models
were retrained with restricted predictor subsets to estimate
the best subsets. For women, bootstrap aggregated flexible
discriminant analysis (“bagFDA”) performed best with
a test AUC of 0.92 [0.89; 0.94] and well-calibrated probabilities
following Naïve Bayes adjustments. A “bagFDA”
model limited to 11 predictors (among them bone mineral
densities (BMD), biochemical glucose measurements,
general practitioner and dentist use) achieved a test AUC
of 0.91 [0.88; 0.93]. For men, eXtreme Gradient Boosting
(“xgbTree”) performed best with a test AUC of 0.89 [0.82;
0.95], but with poor calibration in higher probabilities. A
ten predictor subset (BMD, biochemical cholesterol and
liver function tests, penicillin use and osteoarthritis diagnoses)
achieved a test AUC of 0.86 [0.78; 0.94] using an
“xgbTree” model. Machine learning can improve hip fracture
prediction beyond logistic regression using ensemble
models. Compiling data from international cohorts of
longer follow-up and performing similar machine learning
procedures has the potential to further improve discrimination
and calibration.
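
The resampling scheme described in the abstract (a 75%/25% split, then 5-fold cross-validation repeated 5 times on the 75% portion) can be sketched as follows. This is a minimal illustration using only the Python standard library; the function names `holdout_split` and `repeated_kfold` are hypothetical, and the plain random shuffling is an assumption, since the abstract does not say whether the splits were stratified by outcome.

```python
import random

def holdout_split(n, test_fraction=0.25, seed=0):
    """Shuffle indices 0..n-1 and split them into (train, test) lists."""
    rng = random.Random(seed)
    idx = list(range(n))
    rng.shuffle(idx)
    n_test = int(n * test_fraction)
    return idx[n_test:], idx[:n_test]

def repeated_kfold(indices, k=5, repeats=5, seed=0):
    """Yield (fit, validate) index lists for k-fold CV repeated `repeats` times."""
    rng = random.Random(seed)
    for _ in range(repeats):
        shuffled = indices[:]
        rng.shuffle(shuffled)
        # Fold i takes every k-th element starting at position i.
        folds = [shuffled[i::k] for i in range(k)]
        for i in range(k):
            validate = folds[i]
            fit = [j for m, fold in enumerate(folds) if m != i for j in fold]
            yield fit, validate

# Cohort size for women taken from the abstract; everything else is illustrative.
train, test = holdout_split(4722)
splits = list(repeated_kfold(train))
print(len(train), len(test), len(splits))
```

Each candidate model would be fitted on `fit` and scored on `validate` for all 25 splits, and only the selected model would be evaluated once on `test`.
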
Conclusion: We conclude that hip fracture risk can be modelled with
high discriminative performance for men (test AUC of
0.89 [0.82; 0.95], sensitivity 100%, specificity 69% at the
Youden probability cut-off) and particularly for women
(test AUC of 0.91 [0.88; 0.94], sensitivity 88%, specificity
81% at the Youden probability cut-off) using advanced predictive
models. Ensemble models using bootstrap aggregation
and boosting performed best in both cohorts, and
probabilities can generally be calibrated well with a Naïve
Bayes approach, although calibration is poor for high probability
estimates in men. Models of 11 predictors for women and 9 for
men with combinations of DXA BMD measurements and
primary sector use achieved the highest numerical AUC
values. Further improvements in predictive capability are
likely possible with compilations of more data points and
longer observation periods. We strongly suggest the use of
machine learning principles to model hip fracture risk, and
we welcome an effort to compile existing datasets and perform
advanced predictive modelling.
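
The conclusion reports AUC together with sensitivity and specificity at the Youden probability cut-off, that is, the threshold that maximises J = sensitivity + specificity - 1. The sketch below shows how these quantities relate, using only the Python standard library. The helper names and the toy labels and probabilities are assumptions made for illustration; they are not data from the study.

```python
def roc_points(y_true, scores):
    """Return (threshold, sensitivity, specificity) for each distinct score."""
    pos = sum(y_true)
    neg = len(y_true) - pos
    points = []
    for t in sorted(set(scores)):
        # Predict "fracture" when the score is at or above the threshold.
        tp = sum(1 for y, s in zip(y_true, scores) if y == 1 and s >= t)
        tn = sum(1 for y, s in zip(y_true, scores) if y == 0 and s < t)
        points.append((t, tp / pos, tn / neg))
    return points

def auc(y_true, scores):
    """AUC as the probability that a random case outranks a random non-case."""
    pos = [s for y, s in zip(y_true, scores) if y == 1]
    neg = [s for y, s in zip(y_true, scores) if y == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

def youden_cutoff(y_true, scores):
    """Return the ROC point maximising J = sensitivity + specificity - 1."""
    return max(roc_points(y_true, scores), key=lambda pt: pt[1] + pt[2] - 1)

# Toy example (invented values): 3 cases, 5 non-cases.
y = [0, 0, 0, 0, 1, 0, 1, 1]
p = [0.05, 0.10, 0.20, 0.30, 0.35, 0.60, 0.70, 0.90]
print(auc(y, p))            # 14 of 15 case/non-case pairs are ranked correctly
print(youden_cutoff(y, p))  # (0.35, 1.0, 0.8)
```

A cut-off chosen this way weighs false negatives and false positives equally, which helps explain how a model can reach 100% sensitivity at the cost of lower specificity, as reported for men.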

Accelerating the XGBoost algorithm using GPU computing

The final frontier in GPU use for one of the most powerful algorithms of all time is here.

Abstract: We present a CUDA-based implementation of a decision tree construction algorithm within the gradient boosting library XGBoost. The tree construction algorithm is executed entirely on the GPU and shows high performance with a variety of datasets and settings, including sparse input matrices. Individual boosting iterations are parallelized, combining two approaches. An interleaved approach is used for shallow trees, switching to a more conventional radix sort-based approach for larger depths. We show speedups of 3-6x using a Titan X compared to a 4 core i7 CPU, and 1.2x using a Titan X compared to 2x Xeon CPUs (24 cores). We show that it is possible to process the Higgs dataset (10 million instances, 28 features) entirely within GPU memory. The algorithm is made available as a plug-in within the XGBoost library and fully supports all XGBoost features including classification, regression and ranking tasks.

Why XGBoost Wins Every Machine Learning Competition

A (long and) good answer is in this thesis by Didrik Nielsen.


Abstract: Tree boosting has empirically proven to be a highly effective approach to predictive modeling.
It has shown remarkable results for a vast array of problems.
For many years, MART has been the tree boosting method of choice.
More recently, a tree boosting method known as XGBoost has gained popularity by winning numerous machine learning competitions.
In this thesis, we will investigate how XGBoost differs from the more traditional MART.
We will show that XGBoost employs a boosting algorithm which we will term Newton boosting. This boosting algorithm will further be compared with the gradient boosting algorithm that MART employs.
Moreover, we will discuss the regularization techniques that these methods offer and the effect these have on the models.
In addition to this, we will attempt to answer the question of why XGBoost seems to win so many competitions.
To do this, we will provide some arguments for why tree boosting, and in particular XGBoost, seems to be such a highly effective and versatile approach to predictive modeling.
The core argument is that tree boosting can be seen to adaptively determine the local neighbourhoods of the model. Tree boosting can thus be seen to take the bias-variance tradeoff into consideration during model fitting. XGBoost further introduces some subtle improvements which allow it to deal with the bias-variance tradeoff even more carefully.

Conclusion: After determining the different boosting algorithms and regularization techniques these methods utilize and exploring the effects of these, we turned to providing arguments for why XGBoost seems to win “every” competition. To provide possible answers to this question, we first gave reasons for why tree boosting in general can be an effective approach. We provided two main arguments for this. First off, additive tree models can be seen to have rich representational abilities. Provided that enough trees of sufficient depth are combined, they are capable of closely approximating complex functional relationships, including high-order interactions. The most important argument provided for the versatility of tree boosting, however, was that tree boosting methods are adaptive. Determining neighbourhoods adaptively allows tree boosting methods to use varying degrees of flexibility in different parts of the input space. They will consequently also automatically perform feature selection. This also makes tree boosting methods robust to the curse of dimensionality. Tree boosting can thus be seen to actively take the bias-variance tradeoff into account when fitting models. They start out with a low variance, high bias model and gradually reduce bias by decreasing the size of neighbourhoods where it seems most necessary. Both MART and XGBoost have these properties in common. However, compared to MART, XGBoost uses a higher-order approximation at each iteration, and can thus be expected to learn “better” tree structures. Moreover, it provides clever penalization of individual trees. As discussed earlier, this can be seen to make the method even more adaptive. It will allow the method to adaptively determine the appropriate number of terminal nodes, which might vary among trees. It will further alter the learnt tree structures and leaf weights in order to reduce variance in estimation of the individual trees.
Ultimately, this makes XGBoost a highly adaptive method which carefully takes the bias-variance tradeoff into account in nearly every aspect of the learning process.
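
To make the "higher-order approximation" and the "penalization of individual trees" concrete, here is a minimal sketch of a single leaf update under log loss. It contrasts a first-order (gradient) step with a regularised second-order (Newton) step, and includes the split gain with a per-leaf penalty as given in the XGBoost paper. The function names, the `lam` and `gamma` argument names, and the toy values are assumptions for illustration. The sketch does not reproduce how MART or XGBoost are actually implemented (MART, for instance, follows its gradient-based tree fit with a line search).

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def grad_hess(y, f):
    """First and second derivatives of log loss with respect to the raw score f."""
    p = sigmoid(f)
    return p - y, p * (1.0 - p)

def newton_leaf_weight(ys, fs, lam=1.0):
    """Regularised Newton step for one leaf: w = -G / (H + lam)."""
    pairs = [grad_hess(y, f) for y, f in zip(ys, fs)]
    G = sum(g for g, _ in pairs)
    H = sum(h for _, h in pairs)
    return -G / (H + lam)

def gradient_leaf_weight(ys, fs):
    """First-order step for one leaf: the mean of the negative gradients."""
    gs = [grad_hess(y, f)[0] for y, f in zip(ys, fs)]
    return -sum(gs) / len(gs)

def split_gain(GL, HL, GR, HR, lam=1.0, gamma=0.0):
    """Loss reduction from splitting a leaf, minus a penalty gamma for the extra leaf."""
    def score(G, H):
        return G * G / (H + lam)
    return 0.5 * (score(GL, HL) + score(GR, HR) - score(GL + GR, HL + HR)) - gamma

# Toy leaf: two positives and one negative, all currently scored at f = 0.
ys, fs = [1, 1, 0], [0.0, 0.0, 0.0]
print(gradient_leaf_weight(ys, fs))         # 1/6: ignores curvature
print(newton_leaf_weight(ys, fs, lam=0.0))  # 2/3: rescaled by the Hessian
print(newton_leaf_weight(ys, fs, lam=1.0))  # smaller: shrunk by the L2 penalty
```

A split is kept only when its gain is positive, so a larger `gamma` prunes more aggressively. This is one way the number of terminal nodes can vary from tree to tree.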

Time Series Forecasting Using XGBoost – The forecastxgb Package

Anyone who has had the chance to work on predicting categorical variables in Machine Learning knows that XGBoost is one of the best packages available and is widely used in countless Kaggle competitions.

Peter Ellis's key contribution was to adapt it so that independent variables can be incorporated into the time series forecasting model through the xreg parameter.
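
The general idea of applying a tree booster to a time series, with external regressors supplied alongside, can be sketched as turning the series into a regression table of lagged values. The snippet below is only an illustration of that idea in plain Python. The function name `lagged_design` is hypothetical, and the feature set forecastxgb actually builds internally is not described here, so it should not be read as the package's implementation.

```python
def lagged_design(y, max_lag, xreg=None):
    """Build (rows, target): each row holds y[t-1], ..., y[t-max_lag] plus xreg[t]."""
    rows, target = [], []
    for t in range(max_lag, len(y)):
        lags = [y[t - k] for k in range(1, max_lag + 1)]
        # External regressors for period t are appended after the lags.
        extra = list(xreg[t]) if xreg is not None else []
        rows.append(lags + extra)
        target.append(y[t])
    return rows, target

# Toy series with one external regressor per period (invented values).
series = [1, 2, 3, 4, 5]
external = [[10], [20], [30], [40], [50]]
X, target = lagged_design(series, max_lag=2, xreg=external)
print(X)       # [[2, 1, 30], [3, 2, 40], [4, 3, 50]]
print(target)  # [3, 4, 5]
```

Any regression learner, XGBoost included, can then be fitted on `X` and `target`. Forecasting several periods ahead requires future values of the external regressors.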

For anyone working with time series analysis, this work is very important, not least because forecastxgb offers a way out of the Moving Average/ARIMA (ARMA)/(S)ARIMA triad in which statisticians, Data Miners and Data Scientists alike get stuck, out of convenience or for lack of other means.

An example of using the package is below:

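# Install devtools to install packages that aren't on CRAN
install.packages("devtools")

# Install the package from GitHub
# (repository path is an assumption: Peter Ellis's forecastxgb-r-package)
devtools::install_github("ellisp/forecastxgb-r-package/pkg")

# Load the library
library(forecastxgb)

# Time series example
# (assumes the `gas` series is made available by the forecast package)
gas

# Model (newer releases of the package may name this function xgbar)
model <- xgbts(gas)

# Summary of the model
summary(model)

# Forecast 12 periods ahead
fc <- forecast(model, h = 12)

# Plot
plot(fc)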