Abstract: Deep Learning is a new machine learning field that gained a lot of interest over the past few years. It was widely applied to several applications and proven to be a powerful machine learning tool for many of the complex problems. In this paper we used Deep Neural Network classifier which is one of the DL architectures for classifying a dataset of 66 brain MRIs into 4 classes e.g. normal, glioblastoma, sarcoma and metastatic bronchogenic carcinoma tumors. The classifier was combined with the discrete wavelet transform (DWT) the powerful feature extraction tool and principal components analysis (PCA) and the evaluation of the performance was quite good over all the performance measures.
Conclusion and future work: In this paper we proposed an efficient methodology which combines the discrete wavelet transform (DWT) with the Deep Neural Network (DNN) to classify the brain MRIs into Normal and 3 types of malignant brain tumors: glioblastoma, sarcoma and metastatic bronchogenic carcinoma. The new methodology architecture resemble the convolutional neural networks (CNN) architecture but requires less hardware specifications and takes a convenient time of processing for large size images (256 × 256). In addition using the DNN classifier shows high accuracy compared to traditional classifiers. The good results achieved using the DWT could be employed with the CNN in the future and compare the results.
Abstract: With the intervention of social media and internet technology, people are getting more and more careless and distracted while driving which is having a severe detrimental effect on the safety of the driver and his fellow passengers. To provide an effective solution, this paper put forwards a Machine Learning model using Convolutional Neural Networks to not only detect the distracted driver but also identify the cause of his distraction by analyzing the images obtained using the camera module installed inside the vehicle. Convolutional neural networks are known to learn spatial features from images, which can be further examined by fully connected neural networks. The experimental results show a 99% average accuracy in distraction recognition and hence strongly support that our Convolutional neural networks model can be used to identify distraction among the drivers.
Conclusion: Deep learning using Convolutional Neural Networks  has become a hot area in Machine Learning research, and it has been extensively used in image classification, voice recognition, etc. In this paper, we use Deep Convolutional Networks for detecting distracted drivers and also identifying the cause of their distraction using the VGG16  and VGG19  model. The above results suggest that the methods discussed in this work can be used to develop a system using which distraction while driving can be detected among drivers. The model proposed can automatically identify any of the mentioned 10 classes of distraction and identify not only basic distraction but also their cause of distraction. With an accuracy of more than 99%, the mentioned system was shown to be efficient and workable. The proposed system can be a part of some Driver State Monitoring System which will effectively monitor the state of the driver while he is driving. Driver state monitoring has been becoming increasingly popular these days and many automobile giants have started adopting such systems as a methodology to prevent accidents. These systems, when installed inside vehicles will raise warnings whenever the driver gets distracted, thus trying to prevent any accidents due to distraction from the driver. Also in this work a significant amount of training time has been shown to be reduced. When pre-trained weights from ImageNet  were not used, the training time increased by around 50 times for both VGG16  and VGG19 . A graphical representation of time elapsed is depicted in Fig. 15. This drastic reduction in training time was achieved without diminishing the accuracy of our classification models. In future work as an extension to this work, more categories of distraction can be brought in. Even considering certain specific scenarios, which were not targeted in the present work, such as detecting drowsiness among drivers may also provide an opportunity to widen the scale of the work and build a more efficient system.
Abstract: Currently, oxygen uptake () is the most precise means of investigating aerobic fitness and level of physical activity; however, can only be directly measured in supervised conditions. With the advancement of new wearable sensor technologies and data processing approaches, it is possible to accurately infer work rate and predict during activities of daily living (ADL). The main objective of this study was to develop and verify the methods required to predict and investigate the dynamics during ADL. The variables derived from the wearable sensors were used to create a predictor based on a random forest method. The temporal dynamics were assessed by the mean normalized gain amplitude (MNG) obtained from frequency domain analysis. The MNG provides a means to assess aerobic fitness. The predicted during ADL was strongly correlated (r = 0.87, P < 0.001) with the measured and the prediction bias was 0.2 ml·min−1·kg−1. The MNG calculated based on predicted was strongly correlated (r = 0.71, P < 0.001) with MNG calculated based on measured data. This new technology provides an important advance in ambulatory and continuous assessment of aerobic fitness with potential for future applications such as the early detection of deterioration of physical health.
Abstract: Cancers that appear pathologically similar often respond differently to the same drug regimens. Methods to better match patients to drugs are in high demand. We demonstrate a promising approach to identify robust molecular markers for targeted treatment of acute myeloid leukemia (AML) by introducing: data from 30 AML patients including genome-wide gene expression profiles and in vitro sensitivity to 160 chemotherapy drugs, a computational method to identify reliable gene expression markers for drug sensitivity by incorporating multi-omic prior information relevant to each gene’s potential to drive cancer. We show that our method outperforms several state-of-the-art approaches in identifying molecular markers replicated in validation data and predicting drug sensitivity accurately. Finally, we identify SMARCA4 as a marker and driver of sensitivity to topoisomerase II inhibitors, mitoxantrone, and etoposide, in AML by showing that cell lines transduced to have high SMARCA4 expression reveal dramatically increased sensitivity to these agents.
Discussion: Due to the small sample size and the potential confounding factors in the gene expression and the drug sensitivity data, standard methods to discover gene-drug associations usually fail to identify replicable signals. We present a new way to identify robust gene-drug associations by prioritizing genes based on the multi-dimensional information on each gene’s potential to drive cancer. We demonstrate that our method increases the chance that the identified gene-drug associations are replicated in validation data. This leads us to a short list of genes which are all attractive biomarkers for different classes of drugs. Our results—including the expression, drug sensitivity data, and association statistics from patient samples—have been made freely available to academic communities.
Our results suggest that high SMARCA4 expression could be a molecular marker for sensitivity to topoisomerase II inhibitors in AML cells. These results offer a potentially enormous impact to improve patient response. Mitoxantrone is an anthracycline, like daunorubicin or idarubicin, and one of the two component classes of drugs included in nearly all upfront AML treatment regimens. It is also included (the “M”) in the CLAG-M regimen55, a triple-drug component upfront regimen now being studied as GCLAM56. Mitoxantrone and etoposide (also a topoisomerase II inhibitor) are two of the three drugs in the MEC regimen57, used together with cytarabine, as a common regimen for relapsed/refractory AML. Many modern regimens are in clinical trials that add an investigational drug to the MEC backbone, for example, an antibody to CXCR4 (NCT01120457) or an E selectin inhibitor (NCT02306291) in combination, or decitabine priming preceding the MEC regimen58. Identifying a predictor of response to mitoxantrone based on clinically available biospecimens, such as leukemic blast gene expression measured prior to treatment, could potentially increase median survival rates for patients with high expression of SMARCA4 and indicate alternative therapies for patients with low SMARCA4 expression.
The AML patients used in our study were consecutively enrolled on a protocol to obtain laboratory samples for research. They were selected solely based on sufficient leukemia cell numbers. As the patient samples were consecutively obtained and not selected for any specific attribute, we postulated that they were representative of patients seen at a tertiary referral center and that the results would be relevant to a larger, more general clinical population. Moreover, since each of the data sets from which we collected prior information (driver features) contained many more than 30 samples (e.g., TCGA AML data), it would be highly likely that MERGE results would be more generalizable to larger clinical populations than the methods that retrieve results specifically based on the 30 AML samples. In fact, Fig. 2a, b implies higher generalizability of MERGE compared to alternative methods.
While we have genotype information on FLT3 and NPM1 and the cytogenetic risk category for most of the 30 patients, the current version of the MERGE framework did not take these features into account: our main focus sought to build a general framework that could address the high-dimensionality challenge (i.e., the number of samples being much smaller than the number of genes) and make efficient use of expression data to identify robust associations. However, to consolidate our findings, we performed a covariate analysis to confirm that the top-ranked gene-drug associations discovered by MERGE remained significant when the risk group/cytogenetic features were considered in the association analysis. We checked whether the gene-drug associations shown in the heat map in Fig. 6b (highlighted as red or green) were conserved when we added each of the following as an additional covariate to the linear model: (1) cytogenetic risk, (2) FLT3 mutation status, and (3) NPM1 mutation status. In Supplementary Fig. 9, each dot corresponds to a gene-drug pair, and each color to a different covariate. Most of the dots being closer to the diagonal indicates that the associations did not decrease significantly after adding the covariates. Moreover, of 357 dots, only eight were below the horizontal red line; this indicates that 98% of the gene-drug associations MERGE uncovered were still significant (p ≤ 0.05) after modeling the covariate.
Abstract: Time-to-event outcomes are common in medical research as they offer more information than simply whether or not an event occurred. To handle these outcomes, as well as censored observations where the event was not observed during follow-up, survival analysis methods should be used. Kaplan-Meier estimation can be used to create graphs of the observed survival curves, while the log-rank test can be used to compare curves from different groups. If it is desired to test continuous predictors or to test multiple covariates at once, survival regression models such as the Cox model or the accelerated failure time model (AFT) should be used. The choice of model should depend on whether or not the assumption of the model (proportional hazards for the Cox model, a parametric distribution of the event times for the AFT model) is met. The goal of this paper is to review basic concepts of survival analysis. Discussions relating the Cox model and the AFT model will be provided. The use and interpretation of the survival methods model are illustrated using an artificially simulated dataset.
SUMMARY AND CONCLUSIONS
This paper reviews some basic concepts of survival analyses including discussions and comparisons between the semiparametric Cox proportional hazards model and the parametric AFT model. The appeal of the AFT model lies in the ease of interpreting the results, because the AFT models the effect of predictors and covariates directly on the survival time instead of through the hazard function. If the assumption of proportional hazards of the Cox model is met, the AFT model can be used with the Weibull distribution, while if proportional hazard is violated, the AFT model can be used with distributions other than Weibull.
It is essential to consider the model assumptions and recognize that if the assumptions are not met, the results may be erroneous or misleading. The AFT model assumes a certain parametric distribution for the failure times and that the effect of the covariates on the failure time is multiplicative. Several different distributions should be considered before choosing one. The Cox model assumes proportional hazards of the predictors over time. Model diagnostic tools and goodness of fit tests should be utilized to assess the model assumptions before statistical inferences are made.
In conclusion, although the Cox proportional hazards model tends to be more popular in the literature, the AFT model should also be considered when planning a survival analysis. It should go without saying that the choice should be driven by the desired outcome or the fit to the data, and never by which gives a significant P value for the predictor of interest. The choice should be dictated only by the research hypothesis and by which assumptions of the model are valid for the data being analyzed.
Abstract: Traditionally, medical discoveries are made by observing associations, making hypotheses from them and then designing and running experiments to test the hypotheses. However, with medical images, observing and quantifying associations can often be difficult because of the wide variety of features, patterns, colours, values and shapes that are present in real data. Here, we show that deep learning can extract new knowledge from retinal fundus images. Using deep-learning models trained on data from 284,335 patients and validated on two independent datasets of 12,026 and 999 patients, we predicted cardiovascular risk factors not previously thought to be present or quantifiable in retinal images, such as age (mean absolute error within 3.26 years), gender (area under the receiver operating characteristic curve (AUC) = 0.97), smoking status (AUC = 0.71), systolic blood pressure (mean absolute error within 11.23 mmHg) and major adverse cardiac events (AUC = 0.70). We also show that the trained deep-learning models used anatomical features, such as the optic disc or blood vessels, to generate each prediction.
That’s why I like more “tear down” projects and papers (that kind of paper helps to open our eyes in the right criticism) than “look-my-shiny-not-reproducible-paper-project” .
Using machine learning approaches as non-invasive methods have been used recently as an alternative method in staging chronic liver diseases for avoiding the drawbacks of biopsy. This study aims to evaluate different machine learning techniques in prediction of advanced fibrosis by combining the serum bio-markers and clinical information to develop the classification models.
A prospective cohort of 39,567 patients with chronic hepatitis C was divided into two sets – one categorized as mild to moderate fibrosis (F0-F2), and the other categorized as advanced fibrosis (F3-F4) according to METAVIR score. Decision tree, genetic algorithm, particle swarm optimization, and multilinear regression models for advanced fibrosis risk prediction were developed. Receiver operating characteristic curve analysis was performed to evaluate the performance of the proposed models.
Age, platelet count, AST, and albumin were found to be statistically significant to advanced fibrosis. The machine learning algorithms under study were able to predict advanced fibrosis in patients with HCC with AUROC ranging between 0.73 and 0.76 and accuracy between 66.3% and 84.4%.
A good case to replicate in Brazil.
BACKGROUND: In China, dengue remains an important public health issue with expanded areas and increased incidence recently. Accurate and timely forecasts of dengue incidence in China are still lacking. We aimed to use the state-of-the-art machine learning algorithms to develop an accurate predictive model of dengue.
METHODOLOGY/PRINCIPAL FINDINGS: Weekly dengue cases, Baidu search queries and climate factors (mean temperature, relative humidity and rainfall) during 2011-2014 in Guangdong were gathered. A dengue search index was constructed for developing the predictive models in combination with climate factors. The observed year and week were also included in the models to control for the long-term trend and seasonality. Several machine learning algorithms, including the support vector regression (SVR) algorithm, step-down linear regression model, gradient boosted regression tree algorithm (GBM), negative binomial regression model (NBM), least absolute shrinkage and selection operator (LASSO) linear regression model and generalized additive model (GAM), were used as candidate models to predict dengue incidence. Performance and goodness of fit of the models were assessed using the root-mean-square error (RMSE) and R-squared measures. The residuals of the models were examined using the autocorrelation and partial autocorrelation function analyses to check the validity of the models. The models were further validated using dengue surveillance data from five other provinces. The epidemics during the last 12 weeks and the peak of the 2014 large outbreak were accurately forecasted by the SVR model selected by a cross-validation technique. Moreover, the SVR model had the consistently smallest prediction error rates for tracking the dynamics of dengue and forecasting the outbreaks in other areas in China.
CONCLUSION AND SIGNIFICANCE: The proposed SVR model achieved a superior performance in comparison with other forecasting techniques assessed in this study. The findings can help the government and community respond early to dengue epidemics.