Abstract: Time-to-event outcomes are common in medical research as they offer more information than simply whether or not an event occurred. To handle these outcomes, as well as censored observations where the event was not observed during follow-up, survival analysis methods should be used. Kaplan-Meier estimation can be used to create graphs of the observed survival curves, while the log-rank test can be used to compare curves from different groups. If it is desired to test continuous predictors or to test multiple covariates at once, survival regression models such as the Cox model or the accelerated failure time (AFT) model should be used. The choice of model should depend on whether or not the assumption of the model (proportional hazards for the Cox model, a parametric distribution of the event times for the AFT model) is met. The goal of this paper is to review basic concepts of survival analysis. Discussions comparing the Cox model and the AFT model are provided. The use and interpretation of these survival methods are illustrated using an artificially simulated dataset.
SUMMARY AND CONCLUSIONS
This paper reviews some basic concepts of survival analysis, including discussions and comparisons between the semiparametric Cox proportional hazards model and the parametric AFT model. The appeal of the AFT model lies in the ease of interpreting the results, because the AFT model describes the effect of predictors and covariates directly on the survival time instead of through the hazard function. If the proportional hazards assumption of the Cox model is met, the AFT model can be used with the Weibull distribution, while if proportional hazards is violated, the AFT model can be used with distributions other than the Weibull.
It is essential to consider the model assumptions and recognize that if the assumptions are not met, the results may be erroneous or misleading. The AFT model assumes a certain parametric distribution for the failure times and that the effect of the covariates on the failure time is multiplicative. Several different distributions should be considered before choosing one. The Cox model assumes proportional hazards of the predictors over time. Model diagnostic tools and goodness of fit tests should be utilized to assess the model assumptions before statistical inferences are made.
In conclusion, although the Cox proportional hazards model tends to be more popular in the literature, the AFT model should also be considered when planning a survival analysis. It should go without saying that the choice should be driven by the desired outcome or the fit to the data, and never by which gives a significant P value for the predictor of interest. The choice should be dictated only by the research hypothesis and by which assumptions of the model are valid for the data being analyzed.
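The Kaplan-Meier estimator mentioned in the abstract is simple enough to sketch in a few lines of numpy (a hypothetical helper of my own, not code from the paper): at each distinct event time, the survival probability is multiplied by (1 − events / at-risk), and censored subjects only shrink the risk set without triggering a drop.

```python
import numpy as np

def kaplan_meier(times, events):
    """Kaplan-Meier (product-limit) survival curve.

    times  : follow-up time for each subject
    events : 1 if the event was observed, 0 if the subject was censored
    Returns (distinct event times, estimated S(t) at those times).
    """
    times = np.asarray(times, dtype=float)
    events = np.asarray(events, dtype=int)

    surv = 1.0
    out_t, out_s = [], []
    for t in np.unique(times[events == 1]):
        at_risk = np.sum(times >= t)                # still under observation at t
        d = np.sum((times == t) & (events == 1))    # events occurring at t
        surv *= 1.0 - d / at_risk                   # product-limit update
        out_t.append(t)
        out_s.append(surv)
    return np.array(out_t), np.array(out_s)
```

For `times = [1, 2, 3, 4, 5]` and `events = [1, 1, 0, 1, 0]`, the curve drops to 0.8, 0.6, and 0.3 at times 1, 2, and 4; the censored subjects at times 3 and 5 never cause a drop, but they do leave the risk set.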
Abstract: Traditionally, medical discoveries are made by observing associations, making hypotheses from them and then designing and running experiments to test the hypotheses. However, with medical images, observing and quantifying associations can often be difficult because of the wide variety of features, patterns, colours, values and shapes that are present in real data. Here, we show that deep learning can extract new knowledge from retinal fundus images. Using deep-learning models trained on data from 284,335 patients and validated on two independent datasets of 12,026 and 999 patients, we predicted cardiovascular risk factors not previously thought to be present or quantifiable in retinal images, such as age (mean absolute error within 3.26 years), gender (area under the receiver operating characteristic curve (AUC) = 0.97), smoking status (AUC = 0.71), systolic blood pressure (mean absolute error within 11.23 mmHg) and major adverse cardiac events (AUC = 0.70). We also show that the trained deep-learning models used anatomical features, such as the optic disc or blood vessels, to generate each prediction.
That’s why I prefer “tear down” projects and papers (that kind of paper opens our eyes to the right criticism) over “look-at-my-shiny-non-reproducible-paper” projects.
Machine learning approaches have recently been used as non-invasive methods for staging chronic liver diseases, avoiding the drawbacks of biopsy. This study aims to evaluate different machine learning techniques for the prediction of advanced fibrosis by combining serum biomarkers and clinical information to develop classification models.
A prospective cohort of 39,567 patients with chronic hepatitis C was divided into two sets – one categorized as mild to moderate fibrosis (F0-F2), and the other categorized as advanced fibrosis (F3-F4) according to METAVIR score. Decision tree, genetic algorithm, particle swarm optimization, and multilinear regression models for advanced fibrosis risk prediction were developed. Receiver operating characteristic curve analysis was performed to evaluate the performance of the proposed models.
Age, platelet count, AST, and albumin were found to be statistically significant predictors of advanced fibrosis. The machine learning algorithms under study were able to predict advanced fibrosis in patients with chronic hepatitis C with AUROC ranging between 0.73 and 0.76 and accuracy between 66.3% and 84.4%.
A good case to replicate in Brazil.
BACKGROUND: In China, dengue remains an important public health issue with expanded areas and increased incidence recently. Accurate and timely forecasts of dengue incidence in China are still lacking. We aimed to use the state-of-the-art machine learning algorithms to develop an accurate predictive model of dengue.
METHODOLOGY/PRINCIPAL FINDINGS: Weekly dengue cases, Baidu search queries and climate factors (mean temperature, relative humidity and rainfall) during 2011-2014 in Guangdong were gathered. A dengue search index was constructed for developing the predictive models in combination with climate factors. The observed year and week were also included in the models to control for the long-term trend and seasonality. Several machine learning algorithms, including the support vector regression (SVR) algorithm, step-down linear regression model, gradient boosted regression tree algorithm (GBM), negative binomial regression model (NBM), least absolute shrinkage and selection operator (LASSO) linear regression model and generalized additive model (GAM), were used as candidate models to predict dengue incidence. Performance and goodness of fit of the models were assessed using the root-mean-square error (RMSE) and R-squared measures. The residuals of the models were examined using the autocorrelation and partial autocorrelation function analyses to check the validity of the models. The models were further validated using dengue surveillance data from five other provinces. The epidemics during the last 12 weeks and the peak of the 2014 large outbreak were accurately forecasted by the SVR model selected by a cross-validation technique. Moreover, the SVR model had the consistently smallest prediction error rates for tracking the dynamics of dengue and forecasting the outbreaks in other areas in China.
CONCLUSION AND SIGNIFICANCE: The proposed SVR model achieved a superior performance in comparison with other forecasting techniques assessed in this study. The findings can help the government and community respond early to dengue epidemics.
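As a rough sketch of the common plumbing behind all of the candidate models above (helper names are mine, not from the paper): each regressor, whether SVR, GBM, or LASSO, sees a design matrix of lagged weekly counts plus covariates, and the comparison criterion is RMSE.

```python
import numpy as np

def lagged_matrix(series, n_lags):
    """Design matrix of lagged values: row t holds y[t], ..., y[t+n_lags-1],
    the n_lags observations immediately preceding the target y[t+n_lags]."""
    y = np.asarray(series, dtype=float)
    X = np.column_stack([y[i:len(y) - n_lags + i] for i in range(n_lags)])
    target = y[n_lags:]
    return X, target

def rmse(y_true, y_pred):
    """Root-mean-square error, the goodness-of-fit measure used in the study."""
    y_true, y_pred = np.asarray(y_true, float), np.asarray(y_pred, float)
    return float(np.sqrt(np.mean((y_true - y_pred) ** 2)))
```

With `series = [1, 2, 3, 4, 5]` and two lags, the matrix pairs `[1, 2] → 3`, `[2, 3] → 4`, and `[3, 4] → 5`; any of the listed models would then be fit on `X` and scored with `rmse`.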
Great talk for anyone looking for benchmarks of productionized machine learning.
Summary: Davis Shepherd and Eugen Cepoi discuss the evolution of ML automation at Netflix and how that led them to build Meson, an orchestration system used for many of the personalization / recommendation algorithms. They talk about the challenges they faced and what they learned automating thousands of ML pipelines with Meson.
Fast and simple.
What is an Adversarial Attack?
Machine learning algorithms accept inputs as numeric vectors. Designing an input in a specific way to get the wrong result from the model is called an adversarial attack.
How is this possible? No machine learning algorithm is perfect, and every model makes mistakes — albeit very rarely. However, machine learning models consist of a series of specific transformations, and most of these transformations turn out to be very sensitive to slight changes in input. Harnessing this sensitivity and exploiting it to modify an algorithm’s behavior is an important problem in AI security.
In this article we will show practical examples of the main types of attacks, explain why it is so easy to perform them, and discuss the security implications that stem from this technology.
Types of Adversarial Attacks
Here are the main types of hacks we will focus on:
- Non-targeted adversarial attack: the most general type of attack, where all you want is to make the classifier give an incorrect result.
- Targeted adversarial attack: a slightly more difficult attack, which aims to make the model output a particular class for your input.
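A minimal sketch of how such an attack works in practice, using plain numpy and a toy logistic classifier (the model, weights, and function names are illustrative assumptions of mine, not from the article): the fast gradient sign method nudges the input in the direction that increases the loss, which is often enough to flip the predicted class.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def fgsm_perturb(x, w, b, y_true, eps):
    """Non-targeted fast-gradient-sign perturbation for a logistic
    classifier p = sigmoid(w . x + b): step in the direction that
    increases the cross-entropy loss for the true label."""
    p = sigmoid(np.dot(w, x) + b)
    grad_x = (p - y_true) * w          # gradient of the loss w.r.t. the input
    return x + eps * np.sign(grad_x)   # tiny, uniform-magnitude nudge
```

For example, with `w = [2, -3]`, `b = 0`, the input `x = [1, 1]` is classified as class 0 (p ≈ 0.27); after `fgsm_perturb(x, w, b, y_true=0, eps=1.0)` the perturbed input lands on the other side of the decision boundary.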
If the answer is “no”, please check out this tutorial from Algobeans.
The term ‘self-organizing map’ might conjure up a militaristic image of data points marching towards their contingents on a map, which is a rather apt analogy of how the algorithm actually works.
A self-organizing map (SOM) is a clustering technique that helps you uncover categories in large datasets, such as to find customer profiles based on a list of past purchases. It is a special breed of unsupervised neural networks, where neurons (also called nodes or reference vectors) are arranged in a single, 2-dimensional grid, which can take the shape of either rectangles or hexagons.
HOW DOES SOM WORK?
In a nutshell, an SOM comprises neurons in the grid, which gradually adapt to the intrinsic shape of our data. The final result allows us to visualize data points and identify clusters in a lower dimension.
So how does the SOM grid learn the shape of our data? Well, this is done in an iterative process, which is summarized in the following steps, and visualized in the animated GIF below:
Step 0: Randomly position the grid’s neurons in the data space.
Step 1: Select one data point, either randomly or by systematically cycling through the dataset in order.
Step 2: Find the neuron that is closest to the chosen data point. This neuron is called the Best Matching Unit (BMU).
Step 3: Move the BMU closer to that data point. The distance moved by the BMU is determined by a learning rate, which decreases after each iteration.
Step 4: Move the BMU’s neighbors closer to that data point as well, with farther away neighbors moving less. Neighbors are identified using a radius around the BMU, and the value for this radius decreases after each iteration.
Step 5: Update the learning rate and BMU radius, before repeating Steps 1 to 4. Iterate these steps until positions of neurons have been stabilized.
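The steps above can be sketched directly in numpy. This is a toy implementation under my own assumptions (linear decay schedules, a Gaussian neighborhood on a rectangular grid); the tutorial’s animation may use different details:

```python
import numpy as np

def train_som(data, grid_w, grid_h, n_iters=500, lr0=0.5, radius0=None, seed=0):
    """Minimal SOM following Steps 0-5 (rectangular grid, Euclidean distance)."""
    rng = np.random.default_rng(seed)
    if radius0 is None:
        radius0 = max(grid_w, grid_h) / 2.0
    # Step 0: randomly position the grid's neurons in the data space
    lo, hi = data.min(axis=0), data.max(axis=0)
    weights = rng.uniform(lo, hi, size=(grid_w * grid_h, data.shape[1]))
    # fixed grid coordinates of each neuron, used for neighborhood distances
    gx, gy = np.meshgrid(np.arange(grid_w), np.arange(grid_h))
    grid = np.column_stack([gx.ravel(), gy.ravel()]).astype(float)

    for it in range(n_iters):
        # Step 5: learning rate and radius decay after each iteration
        frac = it / n_iters
        lr = lr0 * (1.0 - frac)
        radius = max(radius0 * (1.0 - frac), 0.5)
        # Step 1: pick one data point at random
        x = data[rng.integers(len(data))]
        # Step 2: best matching unit = neuron nearest to x in data space
        bmu = np.argmin(np.sum((weights - x) ** 2, axis=1))
        # Steps 3-4: move the BMU and its grid neighbors toward x,
        # with farther-away neighbors moving less (Gaussian falloff)
        d2 = np.sum((grid - grid[bmu]) ** 2, axis=1)
        influence = np.exp(-d2 / (2.0 * radius ** 2))
        weights += lr * influence[:, None] * (x - weights)
    return weights
```

Because every update moves a neuron a fraction of the way toward an observed data point, the trained neurons always end up inside the data’s bounding box, tracing its intrinsic shape.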
This is the best non-technical explanation of the concept available on the internet.
Imagine you are standing in front of a row of slot machines, and wish to gamble. You have a bag full of coins. Your goal is to maximize the return on your investment. The problem is that you don’t know the payout percentages of any of the machines. Each has a, potentially, different expected return.
What is your strategy?
You could select one machine at random, and invest all your coins there, but what happens if you selected a poor payout machine? You could have done better.
You could spread your money out and divide it equally (or randomly) between all the different machines. However, if you did this, you’d spend some time investing in poorer payout machines and ‘wasting’ coins that could have been inserted into better machines. The benefit of this strategy, however, is diversification, and you’d be spreading your risk over many machines; you’re never going to be playing the best machine all the time, but you’re never going to be playing the worst all the time either!
Maybe a hybrid strategy is better? In a hybrid solution you could initially spend some time experimenting to estimate the payouts of the machines then, in an exploitation phase, you could put all your future investment into the best paying machine you’d discovered. The more you research, the more you learn about the machines (getting feedback on their individual payout percentages).
However, what is the optimal hybrid strategy? You could spend a long time researching the machines (increasing your confidence), and the longer you spend, certainly, the more accurate your prediction of the best machine would become. However, if you spend too long on research, you might not have many coins left to properly leverage this knowledge (and you’d have wasted many coins on lots of machines that are poor payers). Conversely, if you spend too short a time on research, your estimate for which is the best machine could be bogus (and if you are unlucky, you could fall victim to a streak of ‘good luck’ from a poor paying machine that tricks you into thinking it’s the best machine).
If you are playing a machine that is “good enough”, is it worth the risk of trying to find out whether another machine is “better”? (The experiments needed to determine this might not be worth the effort.)
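The hybrid strategy described above is essentially the classic epsilon-greedy algorithm for the multi-armed bandit problem. A toy sketch (the function name and defaults are my own): with probability epsilon we “research” a random machine; otherwise we “exploit” the machine with the best estimated payout so far.

```python
import random

def epsilon_greedy(payouts, n_coins=1000, epsilon=0.1, seed=42):
    """Play the slot machines with an explore/exploit mix.

    payouts : true win probability of each machine (unknown to the player)
    Returns (total winnings, per-machine estimated payout).
    """
    rng = random.Random(seed)
    n = len(payouts)
    pulls = [0] * n    # how many coins went into each machine
    wins = [0] * n     # how many of those coins paid out
    total = 0
    for _ in range(n_coins):
        if rng.random() < epsilon or all(p == 0 for p in pulls):
            arm = rng.randrange(n)    # explore: try a random machine
        else:                         # exploit: best observed payout rate
            arm = max(range(n),
                      key=lambda i: wins[i] / pulls[i] if pulls[i] else 0.0)
        win = 1 if rng.random() < payouts[arm] else 0
        pulls[arm] += 1
        wins[arm] += win
        total += win
    estimates = [wins[i] / pulls[i] if pulls[i] else 0.0 for i in range(n)]
    return total, estimates
```

With payouts of 0.2, 0.5, and 0.8, spreading coins uniformly would return about 500 wins per 1000 coins; the epsilon-greedy player does noticeably better because most coins end up in the best machine, while the ~10% exploration budget keeps the estimates honest.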
From the series “Beyond the K-Means clustering“…
By Abhijit Annaldas, Microsoft.
DBSCAN is a different type of clustering algorithm with some unique advantages. As the name indicates, this method focuses more on the proximity and density of observations to form clusters. This is very different from KMeans, where an observation becomes part of the cluster represented by the nearest centroid. DBSCAN can also identify outliers: observations that do not belong to any cluster. Since DBSCAN identifies the number of clusters as well, it is very useful for unsupervised learning when we don’t know how many clusters may exist in the data.
K-Means clustering may cluster loosely related observations together. Every observation eventually becomes part of some cluster, even if the observations are scattered far away in the vector space. Since clusters depend on the mean value of the cluster elements, each data point plays a role in forming them, and a slight change in the data points might affect the clustering outcome. This problem is greatly reduced in DBSCAN due to the way clusters are formed.
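A minimal sketch of the DBSCAN idea in plain Python (illustrative only; real implementations such as scikit-learn’s use spatial indexes to avoid the O(n²) neighbor scans): points with at least `min_pts` neighbors within `eps` are core points, clusters grow outward from them, and anything unreachable is labeled an outlier (-1).

```python
import math

def dbscan(points, eps, min_pts):
    """Return a cluster label per point: 0, 1, ... for clusters, -1 for outliers."""
    def neighbors(i):
        # all points within eps of point i (including i itself)
        return [j for j in range(len(points))
                if math.dist(points[i], points[j]) <= eps]

    labels = [None] * len(points)       # None = not yet visited
    cluster = -1
    for i in range(len(points)):
        if labels[i] is not None:
            continue
        nbrs = neighbors(i)
        if len(nbrs) < min_pts:         # not dense enough: tentatively noise
            labels[i] = -1
            continue
        cluster += 1                    # start a new cluster from this core point
        labels[i] = cluster
        queue = [j for j in nbrs if j != i]
        while queue:
            j = queue.pop()
            if labels[j] == -1:         # border point previously marked noise
                labels[j] = cluster
            if labels[j] is not None:
                continue
            labels[j] = cluster
            j_nbrs = neighbors(j)
            if len(j_nbrs) >= min_pts:  # j is also a core point: keep expanding
                queue.extend(j_nbrs)
    return labels
```

On two tight groups of points plus one far-away point, the two groups come back as clusters 0 and 1 and the stray point as -1, with no need to specify the number of clusters in advance.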
Abstract: Cancer is the most dangerous and stubborn disease known to mankind. It accounts for the most deaths caused by any disease. However, if detected early this medical condition is not very difficult to defeat. Tumors which are cancerous grow very rapidly and spread into different parts of the body, and this process continues until the tumor spreads through the entire body and ultimately our organs stop functioning. If a tumor develops in any part of our body it requires immediate medical attention to verify whether the tumor is malignant (cancerous) or benign (non-cancerous). Until now, to test a tumor for malignancy a sample of the tumor had to be extracted and then tested in a laboratory. But using the computational logic of deep neural networks we can predict whether a tumor is malignant or benign from only a photograph of that tumor. If cancer is detected at an early stage, chances are very high that it can be cured completely. In this work, we detect melanoma (skin cancer) in tumors by processing images of those tumors.
Conclusion: We have trained our model using the VGG16, Inception, and ResNet50 neural network architectures. In training, we provided two categories of images: one with malignant (melanoma skin cancer) tumors and the other with benign tumors. After training, we tested our model with random images of tumors, and an accuracy of 83.86%-86.02% was recorded in classifying whether a tumor is malignant or benign; using the best neural network, our model can classify malignant (cancerous) and benign (non-cancerous) tumors with an accuracy of 86.02%. Since cancer, if detected early, can be cured completely, this technology can be used to detect cancer when a tumor develops at an early stage, and precautions can be taken accordingly.
Abstract: The standard architecture of synthetic aperture radar (SAR) automatic target recognition (ATR) consists of three stages: detection, discrimination, and classification. In recent years, convolutional neural networks (CNNs) for SAR ATR have been proposed, but most of them classify target classes from a target chip extracted from SAR imagery, as a classification for the third stage of SAR ATR. In this report, we propose a novel CNN for end-to-end ATR from SAR imagery. The CNN, named verification support network (VersNet), performs all three stages of SAR ATR end-to-end. VersNet inputs a SAR image of arbitrary size with multiple classes and multiple targets, and outputs a SAR ATR image representing the position, class, and pose of each detected target. This report describes the evaluation results of VersNet, which was trained to output scores for all 12 classes — 10 target classes, a target front class, and a background class — for each pixel, using the moving and stationary target acquisition and recognition (MSTAR) public dataset.
Conclusion: By applying CNNs to the third-stage classification in the standard architecture of SAR ATR, performance has been improved. In order to improve the overall performance of SAR ATR, it is important not only to improve the performance of the third-stage classification but also to improve the performance of the first-stage detection and the second-stage discrimination. In this report, we proposed a CNN based on a new architecture of SAR ATR that consists of a single stage, i.e. end-to-end, rather than the standard architecture of SAR ATR. Unlike conventional CNNs for target classification, the CNN named VersNet inputs a SAR image of arbitrary size with multiple classes and multiple targets, and outputs a SAR ATR image representing the position, class, and pose of each detected target. We trained VersNet to output scores including ten target classes on the MSTAR dataset and evaluated its performance. The average IoU for all the pixels of testing (2,420 target chips) is over 0.9. Also, the classification accuracy is about 99.5% if we select the majority class of maximum probability for each pixel as the predicted class.
By Joel Ruben Antony Moniz, David Krueger
We propose Nested LSTMs (NLSTM), a novel RNN architecture with multiple levels of memory. Nested LSTMs add depth to LSTMs via nesting as opposed to stacking. The value of a memory cell in an NLSTM is computed by an LSTM cell, which has its own inner memory cell. Specifically, instead of computing the value of the (outer) memory cell as c_t^outer = f_t ⊙ c_{t−1} + i_t ⊙ g_t, NLSTM memory cells use the concatenation (f_t ⊙ c_{t−1}, i_t ⊙ g_t) as input to an inner LSTM (or NLSTM) memory cell, and set c_t^outer = h_t^inner. Nested LSTMs outperform both stacked and single-layer LSTMs with similar numbers of parameters in our experiments on various character-level language modeling tasks, and the inner memories of an LSTM learn longer-term dependencies compared with the higher-level units of a stacked LSTM.
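The cell update is concrete enough to sketch in numpy. Everything below is my reading of the abstract, not the authors’ code: the weight shapes and the wiring — f_t ⊙ c_{t−1} as the inner hidden state and i_t ⊙ g_t as the inner input — are assumptions; see the paper for the exact formulation.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def lstm_gates(x, h, W, b):
    """Standard LSTM gate activations from input x and hidden state h."""
    z = W @ np.concatenate([x, h]) + b
    f, i, g, o = np.split(z, 4)
    return sigmoid(f), sigmoid(i), np.tanh(g), sigmoid(o)

def nlstm_step(x, state, outer, inner):
    """One step of a Nested LSTM cell (sketch of the idea in the abstract).

    state = (h_outer, c_outer, c_inner); outer / inner are (W, b) pairs.
    The usual update c_t = f * c_{t-1} + i * g is replaced by an inner LSTM
    that receives f * c_{t-1} as its hidden state and i * g as its input;
    the outer memory is then set to the inner hidden state.
    """
    h_out, c_out, c_in = state
    f, i, g, o = lstm_gates(x, h_out, *outer)
    inner_x, inner_h = i * g, f * c_out     # the pair fed to the inner cell
    fi, ii, gi, oi = lstm_gates(inner_x, inner_h, *inner)
    c_in = fi * c_in + ii * gi              # plain LSTM update inside
    h_in = oi * np.tanh(c_in)
    c_out = h_in                            # c_t^outer = h_t^inner
    h_out = o * np.tanh(c_out)
    return h_out, (h_out, c_out, c_in)
```

With input dimension d and hidden size n, the outer weight matrix is (4n, d+n) and the inner one is (4n, 2n), since the inner cell consumes two n-dimensional vectors.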
When smart people like Elon Musk and Stephen Hawking talk about something totally outside their fields, at first we just need to be careful about their opinions; but when the same smart guys use alarmist speech about what they do not understand, we really need to take a more pragmatic view of their argument and look for the flaws, especially when those guys talk about regulation over these “black boxes”.
This article by Vijay Pande in The New York Times, “Artificial Intelligence’s ‘Black Box’ Is Nothing to Fear”, gives us a sober view of what the real black box is:
Given these types of concerns, the unseeable space between where data goes in and answers come out is often referred to as a “black box” — seemingly a reference to the hardy (and in fact orange, not black) data recorders mandated on aircraft and often examined after accidents. In the context of A.I., the term more broadly suggests an image of being in the “dark” about how the technology works: We put in and provide the data and models and architectures, and then computers provide us answers while continuing to learn on their own, in a way that’s seemingly impossible — and certainly too complicated — for us to understand.
There’s particular concern about this in health care, where A.I. is used to classify which skin lesions are cancerous, to identify very early-stage cancer from blood, to predict heart disease, to determine what compounds in people and animals could extend healthy life spans and more. But these fears about the implications of black box are misplaced. A.I. is no less transparent than the way in which doctors have always worked — and in many cases it represents an improvement, augmenting what hospitals can do for patients and the entire health care system. After all, the black box in A.I. isn’t a new problem due to new tech: Human intelligence itself is — and always has been — a black box.
Let’s take the example of a human doctor making a diagnosis. Afterward, a patient might ask that doctor how she made that diagnosis, and she would probably share some of the data she used to draw her conclusion. But could she really explain how and why she made that decision, what specific data from what studies she drew on, what observations from her training or mentors influenced her, what tacit knowledge she gleaned from her own and her colleagues’ shared experiences and how all of this combined into that precise insight? Sure, she’d probably give a few indicators about what pointed her in a certain direction — but there would also be an element of guessing, of following hunches. And even if there weren’t, we still wouldn’t know that there weren’t other factors involved of which she wasn’t even consciously aware.
If the same diagnosis had been made with A.I., we could draw from all available information on that particular patient — as well as data anonymously aggregated across time and from countless other relevant patients everywhere, to make the strongest evidence-based decision possible. It would be a diagnosis with a direct connection to the data, rather than human intuition based on limited data and derivative summaries of anecdotal experiences with a relatively small number of local patients.
But we make decisions in areas that we don’t fully understand every day — often very successfully — from the predicted economic impacts of policies to weather forecasts to the ways in which we approach much of science in the first place. We either oversimplify things or accept that they’re too complex for us to break down linearly, let alone explain fully. It’s just like the black box of A.I.: Human intelligence can reason and make arguments for a given conclusion, but it can’t explain the complex, underlying basis for how we arrived at a particular conclusion. Think of what happens when a couple get divorced because of one stated cause — say, infidelity — when in reality there’s an entire unseen universe of intertwined causes, forces and events that contributed to that outcome. Why did they choose to split up when another couple in a similar situation didn’t? Even those in the relationship can’t fully explain it. It’s a black box.
A good point about this argument: when we get a diagnosis from our doctors, how much of their knowledge (e.g. field experience, practical skills, etc.) is totally interpretable, consistent with the literature, and transparent to the patient at the moment of the examination?
And the article’s final argument touches on a common aspect of computation: debugging.
The irony is that compared with human intelligence, A.I. is actually the more transparent of intelligences. Unlike the human mind, A.I. can — and should — be interrogated and interpreted. Like the ability to audit and refine models and expose knowledge gaps in deep neural nets and the debugging tools that will inevitably be built and the potential ability to augment human intelligence via brain-computer interfaces, there are many technologies that could help interpret artificial intelligence in a way we can’t interpret the human brain. In the process, we may even learn more about how human intelligence itself works.
The final conclusion of all this: if you see a smart, well-known person talking about something totally outside of their original field, be skeptical.