Deep learning sharpens views of cells and genes

The research relied on a convolutional neural network, a type of deep-learning algorithm that is transforming how biologists analyse images. Scientists are using the approach to find mutations in genomes and predict variations in the layout of single cells. Google’s method, described in a preprint in August (R. Poplin et al. Preprint at https://arxiv.org/abs/1708.09843; 2017), is part of a wave of new deep-learning applications that are making image processing easier and more versatile — and could even identify overlooked biological phenomena.
Cell biologists at the Allen Institute for Cell Science in Seattle, Washington, are using convolutional neural networks to convert flat, grey images of cells captured with light microscopes into 3D images in which some of a cell’s organelles are labelled in colour. The approach eliminates the need to stain cells — a process that requires more time and a sophisticated lab, and can damage the cell. Last month, the group published details of an advanced technique that can predict the shape and location of even more cell parts using just a few pieces of data — such as the cell’s outline (G. R. Johnson et al. Preprint at bioRxiv http://doi.org/chwv; 2017).
Other machine-learning connoisseurs in biology have set their sights on new frontiers, now that convolutional neural networks are taking flight for image processing. “Imaging is important, but so is chemistry and molecular data,” says Alex Wolf, a computational biologist at the German Research Center for Environmental Health in Neuherberg. Wolf hopes to tweak neural networks so that they can analyse gene expression. “I think there will be a very big breakthrough in the next few years,” he says, “that allows biologists to apply neural networks much more broadly.”
Deep learning sharpens views of cells and genes

Survival analysis and regression models

Abstract: Time-to-event outcomes are common in medical research because they offer more information than simply whether or not an event occurred. To handle these outcomes, as well as censored observations in which the event was not observed during follow-up, survival analysis methods should be used. Kaplan-Meier estimation can be used to plot the observed survival curves, while the log-rank test can be used to compare curves from different groups. If it is desired to test continuous predictors or to test multiple covariates at once, survival regression models such as the Cox model or the accelerated failure time (AFT) model should be used. The choice of model should depend on whether the assumption of the model (proportional hazards for the Cox model, a parametric distribution of the event times for the AFT model) is met. The goal of this paper is to review basic concepts of survival analysis. Discussion comparing the Cox model and the AFT model is provided. The use and interpretation of these survival methods are illustrated using an artificially simulated dataset.
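
As a minimal, hypothetical sketch of the non-parametric part of this workflow (Kaplan-Meier curves and the log-rank test), here is how it might look with the Python lifelines library; the toy data frame and its columns are assumptions for illustration, not the paper's simulated dataset:

```python
import pandas as pd
from lifelines import KaplanMeierFitter
from lifelines.statistics import logrank_test

# Hypothetical data: follow-up time in months, event indicator (1 = event
# observed, 0 = censored) and a binary group variable.
df = pd.DataFrame({
    "time":  [5, 8, 12, 12, 15, 20, 22, 30, 34, 40],
    "event": [1, 1, 0, 1, 1, 0, 1, 0, 1, 0],
    "group": ["A", "A", "A", "A", "A", "B", "B", "B", "B", "B"],
})

# Kaplan-Meier estimate of the observed survival curve for each group.
for name, sub in df.groupby("group"):
    kmf = KaplanMeierFitter()
    kmf.fit(sub["time"], event_observed=sub["event"], label=name)
    print(kmf.survival_function_)

# Log-rank test comparing the two observed survival curves.
a, b = df[df.group == "A"], df[df.group == "B"]
result = logrank_test(a["time"], b["time"], a["event"], b["event"])
print("log-rank p-value:", result.p_value)
```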

SUMMARY AND CONCLUSIONS
This paper reviews some basic concepts of survival analyses including discussions and comparisons between the semiparametric Cox proportional hazards model and the parametric AFT model. The appeal of the AFT model lies in the ease of interpreting the results, because the AFT models the effect of predictors and covariates directly on the survival time instead of through the hazard function. If the assumption of proportional hazards of the Cox model is met, the AFT model can be used with the Weibull distribution, while if proportional hazard is violated, the AFT model can be used with distributions other than Weibull.

It is essential to consider the model assumptions and recognize that if the assumptions are not met, the results may be erroneous or misleading. The AFT model assumes a certain parametric distribution for the failure times and that the effect of the covariates on the failure time is multiplicative. Several different distributions should be considered before choosing one. The Cox model assumes proportional hazards of the predictors over time. Model diagnostic tools and goodness of fit tests should be utilized to assess the model assumptions before statistical inferences are made.
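
For the regression side, here is a hedged sketch of fitting a Cox model, checking the proportional-hazards assumption, and fitting a Weibull AFT model with lifelines; the load_rossi example dataset shipped with the library stands in for the paper's simulated data:

```python
from lifelines import CoxPHFitter, WeibullAFTFitter
from lifelines.datasets import load_rossi

# Example dataset bundled with lifelines: duration column 'week',
# event indicator 'arrest', plus several covariates.
df = load_rossi()

# Semi-parametric Cox proportional hazards model.
cph = CoxPHFitter()
cph.fit(df, duration_col="week", event_col="arrest")
cph.print_summary()                      # hazard ratios for each covariate

# Diagnostics for the proportional-hazards assumption.
cph.check_assumptions(df, p_value_threshold=0.05)

# Parametric accelerated failure time model with a Weibull distribution.
aft = WeibullAFTFitter()
aft.fit(df, duration_col="week", event_col="arrest")
aft.print_summary()                      # coefficients act multiplicatively on time
```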

In conclusion, although the Cox proportional hazards model tends to be more popular in the literature, the AFT model should also be considered when planning a survival analysis. It should go without saying that the choice should be driven by the desired outcome or the fit to the data, and never by which gives a significant P value for the predictor of interest. The choice should be dictated only by the research hypothesis and by which assumptions of the model are valid for the data being analyzed.

Survival analysis and regression models

Prediction of cardiovascular risk factors from retinal fundus photographs via deep learning

Prediction of cardiovascular risk factors from retinal fundus photographs via deep learning

Abstract: Traditionally, medical discoveries are made by observing associations, making hypotheses from them and then designing and running experiments to test the hypotheses. However, with medical images, observing and quantifying associations can often be difficult because of the wide variety of features, patterns, colours, values and shapes that are present in real data. Here, we show that deep learning can extract new knowledge from retinal fundus images. Using deep-learning models trained on data from 284,335 patients and validated on two independent datasets of 12,026 and 999 patients, we predicted cardiovascular risk factors not previously thought to be present or quantifiable in retinal images, such as age (mean absolute error within 3.26 years), gender (area under the receiver operating characteristic curve (AUC) = 0.97), smoking status (AUC = 0.71), systolic blood pressure (mean absolute error within 11.23 mmHg) and major adverse cardiac events (AUC = 0.70). We also show that the trained deep-learning models used anatomical features, such as the optic disc or blood vessels, to generate each prediction.
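
As a small aside on how the reported numbers are typically computed, here is a hypothetical sketch using scikit-learn's standard metrics; the arrays are made-up placeholders, not the study's predictions:

```python
import numpy as np
from sklearn.metrics import roc_auc_score, mean_absolute_error

# Made-up ground truth and model outputs for a handful of patients.
true_age = np.array([54, 61, 47, 70, 58])
pred_age = np.array([51, 64, 49, 66, 61])
true_smoker = np.array([0, 1, 0, 1, 1])            # binary label
smoker_prob = np.array([0.2, 0.8, 0.3, 0.6, 0.9])  # predicted probability

print("age MAE (years):", mean_absolute_error(true_age, pred_age))
print("smoking AUC    :", roc_auc_score(true_smoker, smoker_prob))
```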

Prediction of cardiovascular risk factors from retinal fundus photographs via deep learning

Deep Reinforcement Learning Doesn’t Work Yet

That's why I prefer "tear down" projects and papers (the kind that opens our eyes with the right criticism) over "look-at-my-shiny-non-reproducible-paper" projects.

By Sorta Insightful

Deep Reinforcement Learning Doesn’t Work Yet

Optimization for Deep Learning Algorithms: A Review

ABSTRACT: In the past few years, deep learning has received considerable attention in the field of artificial intelligence. This paper reviews three focus areas of learning methods in deep learning, namely supervised, unsupervised and reinforcement learning. These methods are used in implementing deep and convolutional neural networks and offer a unified computational approach together with flexibility and scalability. The computational models used in deep learning learn data representations with multiple levels of abstraction. Furthermore, deep learning has advanced the state of the art in domains such as genomics, where it can be applied in pathway analysis for modelling biological networks, and can thereby improve the analysis of biochemical production. In addition, this review covers optimization by means of meta-heuristic methods, which are used in machine learning as part of the modelling process.
CONCLUSION
This review discussed deep learning techniques that implement multiple levels of abstraction in feature representation. Deep learning can be characterized as a rebranding of artificial neural networks. These methods have attracted wide interest among researchers because they produce better representations and make tasks easier to learn. Even so, some issues remain: deep networks easily get stuck in local optima and are computationally expensive. The DeepBind algorithm shows that deep learning can contribute to genomics by achieving a high level of accuracy in predicting protein-binding affinity. The review also discussed optimization methods consisting of several meta-heuristics that can be categorized as evolutionary algorithms. The application of techniques such as CRO illustrates the diversity of optimization algorithms available to improve modelling, and these methods can address the problems of conventional neural networks because they find high-quality solutions in a given search space. The application of optimization methods also enables the analysis of biochemical production in metabolic pathways, where deep learning offers a clear advantage by allowing high-level abstraction of cellular biological networks. Because CRO performs a global search of the search space to locate the global minimum, it can mitigate the local-optima and computational-cost problems of deep learning and improve the training process by refining the network weights to minimize error.
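
To make the "global search escapes local optima" argument concrete, here is a toy simulated-annealing sketch on a multimodal function. It only illustrates the general metaheuristic idea; it is not CRO and not the review's method:

```python
import numpy as np

def loss(x):
    # A multimodal 1-D "landscape" with many local minima.
    return np.sin(3 * x) + 0.1 * x ** 2

def simulated_annealing(n_iter=5000, temp0=2.0, seed=0):
    rng = np.random.default_rng(seed)
    x = rng.uniform(-5, 5)              # random starting point
    best_x, best_f = x, loss(x)
    for t in range(n_iter):
        temp = temp0 * (1 - t / n_iter) + 1e-9
        cand = x + rng.normal(scale=0.5)
        delta = loss(cand) - loss(x)
        # Accept worse moves with a temperature-dependent probability,
        # which is what lets the search escape local minima.
        if delta < 0 or rng.random() < np.exp(-delta / temp):
            x = cand
        if loss(x) < best_f:
            best_x, best_f = x, loss(x)
    return best_x, best_f

print(simulated_annealing())
```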
Optimization for Deep Learning Algorithms: A Review

Driver behavior profiling: An investigation with different smartphone sensors and machine learning

Abstract: Driver behavior impacts traffic safety, fuel/energy consumption and gas emissions. Driver behavior profiling tries to understand and positively impact driver behavior. Usually driver behavior profiling tasks involve automated collection of driving data and application of computer models to generate a classification that characterizes the driver aggressiveness profile. Different sensors and classification methods have been employed in this task, however, low-cost solutions and high performance are still research targets. This paper presents an investigation with different Android smartphone sensors, and classification algorithms in order to assess which sensor/method assembly enables classification with higher performance. The results show that specific combinations of sensors and intelligent methods allow classification performance improvement.
Results: We executed all combinations of the 4 MLAs and their configurations described on Table 1 over the 15 data sets described in Section 4.3 using 5 different nf values. We trained, tested, and assessed every evaluation assembly with 15 different random seeds. Finally, we calculated the mean AUC for these executions, grouped them by driving event type, and ranked the 5 best performing assemblies in the boxplot displayed in Fig 6. This figure shows the driving events on the left-hand side and the 5 best evaluation assemblies for each event on the right-hand side, with the best ones at the bottom. The assembly text identification in Fig 6 encodes, in this order: (i) the nf value; (ii) the sensor and its axis (if there is no axis indication, then all sensor axes are used); and (iii) the MLA and its configuration identifier.
Conclusions and future work: In this work we presented a quantitative evaluation of the performance of 4 MLAs (BN, MLP, RF, and SVM) with different configurations applied to the detection of 7 driving event types, using data collected from 4 Android smartphone sensors (accelerometer, linear acceleration, magnetometer, and gyroscope). We collected 69 samples of these event types in a real-world experiment with 2 drivers. The start and end times of these events were recorded to serve as the experiment ground truth. We also compared the performance obtained with different sliding time window sizes.
We performed 15 executions with different random seeds of 3865 evaluation assemblies of the form EA = {1:sensor, 2:sensor axis(es), 3:MLA, 4:MLA configuration, 5:number of frames in sliding window}. As a result, we found the top 5 performing assemblies for each driving event type. In the context of our experiment, these results show that (i) bigger window sizes perform better; (ii) the gyroscope and the accelerometer are the best sensors to detect our driving events; (iii) as a general rule, using all sensor axes performs better than using a single one, except for aggressive left turn events; (iv) RF is by far the best performing MLA, followed by MLP; and (v) the performance of the top 35 combinations is both satisfactory and equivalent, varying from 0.980 to 0.999 in mean AUC.
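
A rough, hypothetical sketch of this kind of evaluation assembly: window a synthetic accelerometer stream, extract simple per-window statistics, and compare RF and SVM by mean AUC. The feature set, window size and data are illustrative assumptions, not the paper's pipeline:

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.svm import SVC
from sklearn.model_selection import cross_val_score

rng = np.random.default_rng(0)

# Synthetic 3-axis accelerometer signal: 200 windows of 64 samples each,
# labelled 1 for an "aggressive" event and 0 otherwise.
windows = rng.normal(size=(200, 64, 3))
labels = rng.integers(0, 2, size=200)
windows[labels == 1] += 0.5          # make the positive windows separable

# Simple per-axis statistics as features (mean, std, min, max per axis).
feats = np.concatenate(
    [windows.mean(1), windows.std(1), windows.min(1), windows.max(1)], axis=1)

for name, clf in [("RF", RandomForestClassifier(n_estimators=200, random_state=0)),
                  ("SVM", SVC(probability=True, random_state=0))]:
    auc = cross_val_score(clf, feats, labels, cv=5, scoring="roc_auc").mean()
    print(name, "mean AUC:", round(auc, 3))
```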
As future work, we expect to collect a greater number of driving event samples using different vehicles, Android smartphone models, road conditions, weather, and temperature. We also expect to add more MLAs to our evaluation, including those based on fuzzy logic and DTW. Finally, we intend to use the best evaluation assemblies observed in this work to develop an Android smartphone application which can detect driving events in real time and calculate the driver behavior profile.
Driver behavior profiling: An investigation with different smartphone sensors and machine learning

Comparison of Machine Learning Approaches for Prediction of Advanced Liver Fibrosis in Chronic Hepatitis C Patients.

BACKGROUND/AIM:
Machine learning approaches have recently been used as non-invasive alternatives for staging chronic liver diseases, avoiding the drawbacks of biopsy. This study aims to evaluate different machine learning techniques for the prediction of advanced fibrosis by combining serum biomarkers and clinical information to develop classification models.

METHODS:
A prospective cohort of 39,567 patients with chronic hepatitis C was divided into two sets – one categorized as mild to moderate fibrosis (F0-F2), and the other categorized as advanced fibrosis (F3-F4) according to METAVIR score. Decision tree, genetic algorithm, particle swarm optimization, and multilinear regression models for advanced fibrosis risk prediction were developed. Receiver operating characteristic curve analysis was performed to evaluate the performance of the proposed models.
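
As a hedged illustration of one of the listed models, the sketch below fits a decision tree on synthetic stand-ins for the serum markers and evaluates it with ROC analysis; nothing here is the study's data or its tuned models:

```python
import numpy as np
from sklearn.tree import DecisionTreeClassifier
from sklearn.model_selection import train_test_split
from sklearn.metrics import roc_auc_score, accuracy_score

rng = np.random.default_rng(42)

# Synthetic stand-ins for age, platelet count, AST and albumin, plus a binary
# label: 1 = advanced fibrosis (F3-F4), 0 = mild/moderate fibrosis (F0-F2).
n = 1000
X = np.column_stack([
    rng.normal(50, 12, n),     # age (years)
    rng.normal(220, 60, n),    # platelets (10^3/uL)
    rng.normal(45, 20, n),     # AST (IU/L)
    rng.normal(4.0, 0.5, n),   # albumin (g/dL)
])
y = ((0.04 * X[:, 0] - 0.01 * X[:, 1] + 0.03 * X[:, 2] - X[:, 3]
      + rng.normal(0, 1, n)) > -2.5).astype(int)

X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.3, random_state=0)
tree = DecisionTreeClassifier(max_depth=4, random_state=0).fit(X_tr, y_tr)

proba = tree.predict_proba(X_te)[:, 1]
print("AUROC   :", round(roc_auc_score(y_te, proba), 3))
print("Accuracy:", round(accuracy_score(y_te, tree.predict(X_te)), 3))
```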

RESULTS:
Age, platelet count, AST, and albumin were found to be statistically significant predictors of advanced fibrosis. The machine learning algorithms under study were able to predict advanced fibrosis in these patients with AUROC ranging between 0.73 and 0.76 and accuracy between 66.3% and 84.4%.

CONCLUSIONS:
Machine-learning approaches could be used as alternative methods in prediction of the risk of advanced liver fibrosis due to chronic hepatitis C.

Comparison of Machine Learning Approaches for Prediction of Advanced Liver Fibrosis in Chronic Hepatitis C Patients.

A Machine Learning Approach Using Survival Statistics to Predict Graft Survival in Kidney Transplant Recipients: A Multicenter Cohort Study.

Abstract: Accurate prediction of graft survival after kidney transplant is limited by the complexity and heterogeneity of risk factors influencing allograft survival. In this study, we applied machine learning methods, in combination with survival statistics, to build new prediction models of graft survival that included immunological factors, as well as known recipient and donor variables. Graft survival was estimated from a retrospective analysis of the data from a multicenter cohort of 3,117 kidney transplant recipients. We evaluated the predictive power of ensemble learning algorithms (survival decision tree, bagging, random forest, and ridge and lasso) and compared outcomes to those of conventional models (decision tree and Cox regression). Using a conventional decision tree model, the 3-month serum creatinine level post-transplant (cut-off, 1.65 mg/dl) predicted a graft failure rate of 77.8% (index of concordance, 0.71). Using a survival decision tree model increased the index of concordance to 0.80, with the episode of acute rejection during the first year post-transplant being associated with a 4.27-fold increase in the risk of graft failure. Our study revealed that early acute rejection in the first year is associated with a substantially increased risk of graft failure. Machine learning methods may provide versatile and feasible tools for forecasting graft survival.
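
A minimal sketch of combining an ensemble survival model with the index of concordance, here using the scikit-survival library on synthetic data; the library choice, features and numbers are assumptions for illustration, not the study's methods or cohort:

```python
import numpy as np
from sksurv.ensemble import RandomSurvivalForest
from sksurv.metrics import concordance_index_censored
from sksurv.util import Surv

rng = np.random.default_rng(0)

# Synthetic recipients: 3-month creatinine, acute-rejection flag, donor age.
n = 500
creatinine = np.clip(rng.normal(1.4, 0.5, n), 0.4, None)   # mg/dl
rejection = rng.integers(0, 2, n)                           # first-year rejection
donor_age = rng.normal(45, 12, n)                           # years
X = np.column_stack([creatinine, rejection, donor_age])

# Months to graft failure, shorter on average with rejection/high creatinine.
time = rng.exponential(100 / (1 + 2 * rejection + creatinine), n)
event = rng.random(n) < 0.6                                  # observed vs censored
y = Surv.from_arrays(event=event, time=time)

rsf = RandomSurvivalForest(n_estimators=100, random_state=0).fit(X, y)
risk = rsf.predict(X)                        # higher = higher predicted risk
cindex = concordance_index_censored(event, time, risk)[0]
print("index of concordance:", round(cindex, 3))
```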
A Machine Learning Approach Using Survival Statistics to Predict Graft Survival in Kidney Transplant Recipients: A Multicenter Cohort Study.

Study of Engineered Features and Learning Features in Machine Learning – A Case Study in Document Classification

Abstract. Document classification is challenging due to the handling of voluminous and highly non-linear data, generated exponentially in the era of digitization. Proper representation of documents increases the efficiency and performance of classification, the ultimate goal being the retrieval of information from a large corpus. Deep neural network models learn features for document classification, unlike engineered-feature-based approaches where features are extracted or selected from the data. In this paper we investigate the performance of different classifiers based on the features obtained using the two approaches. We apply a deep autoencoder for learning features, while engineered features are extracted by exploiting semantic association within the terms of the documents. Experimentally, it has been observed that learning-feature-based classification always performs better than the proposed engineered-feature-based classifiers.
Conclusion and Future Work: In this paper we emphasize the importance of feature representation for classification. The potential of deep learning in the feature extraction process for efficient compression and representation of raw features is explored. By conducting multiple experiments we deduce that a DBN/Deep AE feature extractor combined with a DNNC outperforms most other techniques, providing a trade-off between accuracy and execution time. In this paper we have dealt with the most significant feature extraction and classification techniques for text documents where each text document belongs to a single class label. With the explosion of digital information, a large number of documents may belong to multiple class labels, the handling of which is a new challenge and a scope for future work. Word2vec models [18] in association with Recurrent Neural Networks (RNN) [4,14] have recently started gaining popularity in the feature representation domain. We would like to compare their performance with our deep learning method in the future. Similar feature extraction techniques can also be applied to image data to generate compressed features that can facilitate efficient classification. We would also like to explore such possibilities in our future work.
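
A small, hypothetical sketch of the learned-feature pipeline: a plain dense autoencoder (not the paper's DBN/Deep AE or DNNC) compresses synthetic document vectors, and a simple classifier is then trained on the bottleneck codes:

```python
import numpy as np
from tensorflow.keras import layers, models
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)

# Synthetic "bag-of-words" document vectors (2000 docs x 500 terms) and labels.
X = rng.random((2000, 500)).astype("float32")
y = rng.integers(0, 4, 2000)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.25, random_state=0)

# Dense autoencoder: 500 -> 64 -> 500; the 64-unit bottleneck is the learned feature.
inp = layers.Input(shape=(500,))
code = layers.Dense(64, activation="relu")(inp)
out = layers.Dense(500, activation="sigmoid")(code)
autoencoder = models.Model(inp, out)
encoder = models.Model(inp, code)
autoencoder.compile(optimizer="adam", loss="mse")
autoencoder.fit(X_tr, X_tr, epochs=5, batch_size=64, verbose=0)

# Classify documents using the learned (compressed) features.
clf = LogisticRegression(max_iter=1000).fit(encoder.predict(X_tr), y_tr)
print("accuracy on learned features:", clf.score(encoder.predict(X_te), y_te))
```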
Study of Engineered Features and Learning Features in Machine Learning – A Case Study in Document Classification

Machine Learning for forecasting Dengue

A good case to replicate in Brazil.

Developing a dengue forecast model using machine learning: A case study in China.

BACKGROUND: In China, dengue remains an important public health issue with expanded areas and increased incidence recently. Accurate and timely forecasts of dengue incidence in China are still lacking. We aimed to use the state-of-the-art machine learning algorithms to develop an accurate predictive model of dengue.

METHODOLOGY/PRINCIPAL FINDINGS: Weekly dengue cases, Baidu search queries and climate factors (mean temperature, relative humidity and rainfall) during 2011-2014 in Guangdong were gathered. A dengue search index was constructed for developing the predictive models in combination with climate factors. The observed year and week were also included in the models to control for the long-term trend and seasonality. Several machine learning algorithms, including the support vector regression (SVR) algorithm, step-down linear regression model, gradient boosted regression tree algorithm (GBM), negative binomial regression model (NBM), least absolute shrinkage and selection operator (LASSO) linear regression model and generalized additive model (GAM), were used as candidate models to predict dengue incidence. Performance and goodness of fit of the models were assessed using the root-mean-square error (RMSE) and R-squared measures. The residuals of the models were examined using the autocorrelation and partial autocorrelation function analyses to check the validity of the models. The models were further validated using dengue surveillance data from five other provinces. The epidemics during the last 12 weeks and the peak of the 2014 large outbreak were accurately forecasted by the SVR model selected by a cross-validation technique. Moreover, the SVR model had the consistently smallest prediction error rates for tracking the dynamics of dengue and forecasting the outbreaks in other areas in China.
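
As a hedged sketch of the winning approach, the snippet below fits a support vector regression on synthetic weekly features (search index, climate, seasonality) and scores it with RMSE; the data and hyperparameters are illustrative assumptions, not the study's:

```python
import numpy as np
from sklearn.svm import SVR
from sklearn.preprocessing import StandardScaler
from sklearn.pipeline import make_pipeline
from sklearn.metrics import mean_squared_error

rng = np.random.default_rng(0)

# Synthetic weekly series: search index, mean temperature, humidity, rainfall,
# plus week-of-year to capture seasonality; target is weekly dengue cases.
weeks = np.arange(208)                         # four years of weekly data
X = np.column_stack([
    rng.random(208) * 100,                     # dengue search index
    25 + 5 * np.sin(2 * np.pi * weeks / 52),   # mean temperature
    70 + 10 * rng.random(208),                 # relative humidity
    rng.random(208) * 50,                      # rainfall
    weeks % 52,                                # week of year
])
cases = 20 + 0.5 * X[:, 0] + 2 * np.maximum(X[:, 1] - 27, 0) + rng.normal(0, 5, 208)

# Train on the first three years, forecast the last 52 weeks.
tr, te = slice(0, 156), slice(156, 208)
model = make_pipeline(StandardScaler(), SVR(kernel="rbf", C=10.0))
model.fit(X[tr], cases[tr])
pred = model.predict(X[te])
print("RMSE:", round(np.sqrt(mean_squared_error(cases[te], pred)), 2))
```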

CONCLUSION AND SIGNIFICANCE: The proposed SVR model achieved a superior performance in comparison with other forecasting techniques assessed in this study. The findings can help the government and community respond early to dengue epidemics.

Machine Learning for forecasting Dengue

Productionized pipelines in ML

A great talk for those looking for benchmarks on productionized machine learning.

Automating Netflix ML Pipelines with Meson

Summary: Davis Shepherd and Eugen Cepoi discuss the evolution of ML automation at Netflix and how that led them to build Meson, an orchestration system used for many of the personalization/recommendation algorithms. They talk about the challenges they faced and what they learned automating thousands of ML pipelines with Meson.

Productionized pipelines in ML

A concise guide to adversarial attacks

Fast and simple.

How Adversarial Attacks Work

What is an Adversarial Attack?
Machine learning algorithms accept inputs as numeric vectors. Designing an input in a specific way to get the wrong result from the model is called an adversarial attack.

How is this possible? No machine learning algorithm is perfect and they make mistakes — albeit very rarely. However, machine learning models consist of a series of specific transformations, and most of these transformations turn out to be very sensitive to slight changes in input. Harnessing this sensitivity and exploiting it to modify an algorithm’s behavior is an important problem in AI security.

In this article we will show practical examples of the main types of attacks, explain why it is so easy to perform them, and discuss the security implications that stem from this technology.

Types of Adversarial Attacks
Here are the main types of hacks we will focus on:

  1. Non-targeted adversarial attack: the most general type of attack, in which all you want to do is make the classifier give an incorrect result.
  2. Targeted adversarial attack: a slightly more difficult attack, which aims to make the model output a particular class for your input (a minimal FGSM-style sketch follows below).
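
As a minimal sketch of the non-targeted case, the snippet below applies a single FGSM step to a pretrained torchvision classifier; the random input tensor stands in for a real preprocessed photo, and the perturbation budget is an arbitrary choice:

```python
import torch
import torch.nn.functional as F
from torchvision import models

# Pretrained classifier to attack (any differentiable image classifier works).
model = models.resnet18(weights="IMAGENET1K_V1").eval()

# Stand-in input image (random noise here; in practice a real, preprocessed photo).
x = torch.rand(1, 3, 224, 224, requires_grad=True)

# Non-targeted FGSM: take one gradient step that *increases* the loss of the
# class the model currently predicts, so that the prediction flips.
logits = model(x)
pred = logits.argmax(dim=1)
loss = F.cross_entropy(logits, pred)
loss.backward()

epsilon = 0.03                       # perturbation budget per pixel
x_adv = (x + epsilon * x.grad.sign()).clamp(0, 1).detach()
print("before:", pred.item(), "after:", model(x_adv).argmax(dim=1).item())
```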
A concise guide to adversarial attacks

Do you know about Self-Organizing Maps?

If the answer is "no", please read this tutorial from Algobeans.

The term ‘self-organizing map’ might conjure up a militaristic image of data points marching towards their contingents on a map, which is a rather apt analogy of how the algorithm actually works.

A self-organizing map (SOM) is a clustering technique that helps you uncover categories in large datasets, such as to find customer profiles based on a list of past purchases. It is a special breed of unsupervised neural networks, where neurons (also called nodes or reference vectors) are arranged in a single, 2-dimensional grid, which can take the shape of either rectangles or hexagons.

HOW DOES SOM WORK?
In a nutshell, an SOM comprises neurons in the grid, which gradually adapt to the intrinsic shape of our data. The final result allows us to visualize data points and identify clusters in a lower dimension.

So how does the SOM grid learn the shape of our data? Well, this is done in an iterative process, which is summarized in the following steps (a minimal NumPy sketch follows the list):

Step 0: Randomly position the grid’s neurons in the data space.

Step 1: Select one data point, either randomly or by systematically cycling through the dataset in order.

Step 2: Find the neuron that is closest to the chosen data point. This neuron is called the Best Matching Unit (BMU).

Step 3: Move the BMU closer to that data point. The distance moved by the BMU is determined by a learning rate, which decreases after each iteration.

Step 4: Move the BMU’s neighbors closer to that data point as well, with farther away neighbors moving less. Neighbors are identified using a radius around the BMU, and the value for this radius decreases after each iteration.

Step 5: Update the learning rate and BMU radius, before repeating Steps 1 to 4. Iterate these steps until positions of neurons have been stabilized.
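
Here is the promised minimal NumPy sketch of those steps; the grid size and the learning-rate and radius schedules are arbitrary choices for illustration:

```python
import numpy as np

def train_som(data, grid_h=10, grid_w=10, n_iter=2000,
              lr0=0.5, radius0=None, seed=0):
    """Train a rectangular SOM on `data` (n_samples x n_features)."""
    rng = np.random.default_rng(seed)
    n_features = data.shape[1]
    if radius0 is None:
        radius0 = max(grid_h, grid_w) / 2.0
    # Step 0: randomly position the grid's neurons in the data space.
    weights = rng.uniform(data.min(0), data.max(0),
                          size=(grid_h, grid_w, n_features))
    grid_y, grid_x = np.indices((grid_h, grid_w))   # neuron grid coordinates
    for t in range(n_iter):
        # Step 1: pick one data point at random.
        x = data[rng.integers(len(data))]
        # Step 2: find the Best Matching Unit (closest neuron).
        dists = np.linalg.norm(weights - x, axis=2)
        bmu = np.unravel_index(dists.argmin(), dists.shape)
        # Step 5 (decay): shrink the learning rate and radius over time.
        frac = t / n_iter
        lr = lr0 * (1.0 - frac)
        radius = radius0 * (1.0 - frac) + 1e-9
        # Steps 3-4: move the BMU and its neighbours toward the data point,
        # with farther neighbours moving less (Gaussian neighbourhood).
        grid_dist2 = (grid_y - bmu[0]) ** 2 + (grid_x - bmu[1]) ** 2
        influence = np.exp(-grid_dist2 / (2.0 * radius ** 2))
        weights += lr * influence[..., None] * (x - weights)
    return weights

# Usage: map 500 random 3-D points onto a 10x10 grid.
som = train_som(np.random.rand(500, 3))
```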

Do you know about Self-Organizing Maps?

Multi Armed Bandit concept

This is the best non-technical explanation of the concept available on the internet.

By Datagenetics

Imagine you are standing in front of a row of slot machines, and wish to gamble. You have a bag full of coins. Your goal is to maximize the return on your investment. The problem is that you don’t know the payout percentages of any of the machines. Each has a, potentially, different expected return.

What is your strategy?

You could select one machine at random, and invest all your coins there, but what happens if you selected a poor payout machine? You could have done better.

You could spread your money out and divide it equally (or randomly) between all the different machines. However, if you did this, you'd spend some time investing in poorer payout machines and 'wasting' coins that could have been inserted into better machines. The benefit of this strategy, however, is diversification: you'd be spreading your risk over many machines. You're never going to be playing the best machine all the time, but you're never going to be playing the worst all the time either!

Maybe a hybrid strategy is better? In a hybrid solution you could initially spend some time experimenting to estimate the payouts of the machines then, in an exploitation phase, you could put all your future investment into the best paying machine you’d discovered. The more you research, the more you learn about the machines (getting feedback on their individual payout percentages).

However, what is the optimal hybrid strategy? You could spend a long time researching the machines (increasing your confidence), and the longer you spend, certainly, the more accurate your prediction of the best machine would become. However, if you spend too long on research, you might not have many coins left to properly leverage this knowledge (and you’d have wasted many coins on lots of machines that are poor payers). Conversely, if you spend too short a time on research, your estimate for which is the best machine could be bogus (and if you are unlucky, you could become victim to a streak of ‘good-luck’ from a poor paying machine that tricks you into thinking it’s the best machine).

If you are playing a machine that is "good enough", is it worth the risk of trying to see whether another machine is "better"? (The experiments needed to determine this might not be worth the effort.)
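
The hybrid explore-then-exploit strategy described above is essentially the classic epsilon-greedy algorithm; here is a toy simulation with made-up payout probabilities:

```python
import numpy as np

rng = np.random.default_rng(0)

# Five slot machines with payout probabilities unknown to the player.
true_payouts = np.array([0.15, 0.30, 0.45, 0.60, 0.25])
n_coins, epsilon = 1000, 0.1

estimates = np.zeros(5)   # running estimate of each machine's payout
pulls = np.zeros(5)       # how many coins each machine has received
reward = 0

for _ in range(n_coins):
    # Explore with probability epsilon, otherwise exploit the current best.
    arm = rng.integers(5) if rng.random() < epsilon else int(estimates.argmax())
    payout = float(rng.random() < true_payouts[arm])
    pulls[arm] += 1
    estimates[arm] += (payout - estimates[arm]) / pulls[arm]  # incremental mean
    reward += payout

print("coins won:", int(reward), "| pulls per machine:", pulls.astype(int))
```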

Multi Armed Bandit concept

A gentle introduction to DBSCAN

From the series “Beyond the K-Means clustering“…

Density Based Spatial Clustering of Applications with Noise (DBSCAN)

By Abhijit Annaldas, Microsoft.

DBSCAN is a different type of clustering algorithm with some unique advantages. As the name indicates, this method focuses more on the proximity and density of observations to form clusters. This is very different from KMeans, where an observation becomes part of the cluster represented by the nearest centroid. DBSCAN clustering can identify outliers: observations that won't belong to any cluster. Since DBSCAN clustering also identifies the number of clusters, it is very useful for unsupervised learning when we don't know how many clusters there could be in the data.

K-Means clustering may cluster loosely related observations together. Every observation becomes a part of some cluster eventually, even if the observations are scattered far away in the vector space. Since clusters depend on the mean value of cluster elements, each data point plays a role in forming the clusters. Slight change in data points might affect the clustering outcome. This problem is greatly reduced in DBSCAN due to the way clusters are formed.
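
A minimal scikit-learn sketch of the contrast described above: DBSCAN finds two interleaved half-moon clusters and marks noise points, without being told how many clusters to look for (the dataset and the eps/min_samples values are illustrative choices):

```python
from sklearn.cluster import DBSCAN
from sklearn.datasets import make_moons
from sklearn.preprocessing import StandardScaler

# Two interleaved half-moons: a shape K-Means handles poorly but DBSCAN handles well.
X, _ = make_moons(n_samples=400, noise=0.07, random_state=0)
X = StandardScaler().fit_transform(X)

db = DBSCAN(eps=0.3, min_samples=5).fit(X)
labels = db.labels_                      # -1 marks outliers (noise points)

n_clusters = len(set(labels)) - (1 if -1 in labels else 0)
print("clusters found:", n_clusters, "| outliers:", (labels == -1).sum())
```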

A gentle introduction to DBSCAN

Skin Cancer Detection using Deep Neural Networks

Skin Cancer Detection using Deep Neural Networks

Abstract: Cancer is the most dangerous and stubborn disease known to mankind. It accounts for more deaths than any other disease. However, if detected early, this medical condition is not very difficult to defeat. Tumors that are cancerous grow very rapidly and spread into different parts of the body, and this process continues until the tumor has spread through the entire body and our organs ultimately stop functioning. If a tumor develops in any part of the body, it requires immediate medical attention to verify whether the tumor is malignant (cancerous) or benign (non-cancerous). Until now, to test a tumor for malignancy, a sample of the tumor had to be extracted and then tested in a laboratory. But using the computational logic of deep neural networks we can predict whether a tumor is malignant or benign from only a photograph of that tumor. If cancer is detected at an early stage, the chances are very high that it can be cured completely. In this work, we detect melanoma (skin cancer) in tumors by processing images of those tumors.

Conclusion: We trained our model using the VGG16, Inception and ResNet50 neural network architectures. In training, we provided two categories of images: one with malignant (melanoma, skin cancer) tumors and the other with benign tumors. After training, we tested our model with random images of tumors, and an accuracy of 83.86%-86.02% was recorded in classifying whether a tumor is malignant or benign. Using the neural network, our model can classify malignant (cancerous) and benign (non-cancerous) tumors with an accuracy of 86.02%. Since cancer, if detected early, can be cured completely, this technology can be used to detect cancer when a tumor develops at an early stage, so that precautions can be taken accordingly.
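
For flavour, here is a hedged transfer-learning sketch with one of the named backbones (ResNet50) and a binary malignant-vs-benign head in Keras; the directory layout and training settings are hypothetical, and this is not the authors' trained model:

```python
import tensorflow as tf
from tensorflow.keras import layers, models

# ImageNet-pretrained ResNet50 backbone with a small binary head.
base = tf.keras.applications.ResNet50(
    weights="imagenet", include_top=False, input_shape=(224, 224, 3))
base.trainable = False                      # train only the new head at first

model = models.Sequential([
    layers.Input(shape=(224, 224, 3)),
    layers.Lambda(tf.keras.applications.resnet50.preprocess_input),
    base,
    layers.GlobalAveragePooling2D(),
    layers.Dropout(0.3),
    layers.Dense(1, activation="sigmoid"),  # P(malignant)
])
model.compile(optimizer="adam", loss="binary_crossentropy", metrics=["accuracy"])

# Hypothetical directory layout: images/{malignant,benign}/*.jpg
train_ds = tf.keras.utils.image_dataset_from_directory(
    "images", image_size=(224, 224), batch_size=32, label_mode="binary")
model.fit(train_ds, epochs=5)
```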

Skin Cancer Detection using Deep Neural Networks

Deep Learning for End-to-End Automatic Target Recognition from Synthetic Aperture Radar Imagery

Deep Learning for End-to-End Automatic Target Recognition from Synthetic Aperture Radar Imagery

Abstract: The standard architecture of synthetic aperture radar (SAR) automatic target recognition (ATR) consists of three stages: detection, discrimination, and classification. In recent years, convolutional neural networks (CNNs) for SAR ATR have been proposed, but most of them classify target classes from a target chip extracted from SAR imagery, i.e. they perform only the classification of the third stage of SAR ATR. In this report, we propose a novel CNN for end-to-end ATR from SAR imagery. The CNN, named verification support network (VersNet), performs all three stages of SAR ATR end-to-end. VersNet takes as input a SAR image of arbitrary size with multiple classes and multiple targets, and outputs a SAR ATR image representing the position, class, and pose of each detected target. This report describes the evaluation results of VersNet, which was trained to output scores for all 12 classes (10 target classes, a target front class, and a background class) for each pixel, using the moving and stationary target acquisition and recognition (MSTAR) public dataset.

Conclusion: Applying a CNN to the third-stage classification in the standard architecture of SAR ATR has improved performance. To improve the overall performance of SAR ATR, it is important not only to improve the performance of the third-stage classification but also to improve the performance of the first-stage detection and the second-stage discrimination. In this report, we proposed a CNN based on a new architecture for SAR ATR that consists of a single stage, i.e. end-to-end, rather than the standard architecture of SAR ATR. Unlike conventional CNNs for target classification, the CNN named VersNet takes as input a SAR image of arbitrary size with multiple classes and multiple targets, and outputs a SAR ATR image representing the position, class, and pose of each detected target. We trained VersNet to output scores including ten target classes on the MSTAR dataset and evaluated its performance. The average IoU over all the pixels of the test set (2,420 target chips) is over 0.9, and the classification accuracy is about 99.5% if we select the majority class of maximum probability for each pixel as the predicted class.
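
A small sketch of how per-pixel predictions like VersNet's outputs can be scored with mean IoU and pixel accuracy; the toy arrays below are illustrative, not MSTAR results:

```python
import numpy as np

def mean_iou(y_true, y_pred, n_classes):
    """Mean intersection-over-union across classes for per-pixel predictions."""
    ious = []
    for c in range(n_classes):
        inter = np.logical_and(y_true == c, y_pred == c).sum()
        union = np.logical_or(y_true == c, y_pred == c).sum()
        if union > 0:
            ious.append(inter / union)
    return float(np.mean(ious))

# Toy 8x8 "SAR ATR image": class 0 = background, 1 = target, 2 = target front.
truth = np.zeros((8, 8), dtype=int)
truth[2:6, 2:6] = 1
truth[2:6, 2] = 2
pred = truth.copy()
pred[5, 5] = 0                      # one mis-labelled pixel

print("mean IoU      :", round(mean_iou(truth, pred, 3), 3))
print("pixel accuracy:", round((truth == pred).mean(), 3))
```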

Deep Learning for End-to-End Automatic Target Recognition from Synthetic Aperture Radar Imagery

Nested LSTMs

By Joel Ruben Antony Moniz, David Krueger

We propose Nested LSTMs (NLSTM), a novel RNN architecture with multiple levels of memory. Nested LSTMs add depth to LSTMs via nesting as opposed to stacking. The value of a memory cell in an NLSTM is computed by an LSTM cell, which has its own inner memory cell. Specifically, instead of computing the value of the (outer) memory cell as $c^{\text{outer}}_t = f_t \odot c_{t-1} + i_t \odot g_t$, NLSTM memory cells use the concatenation $(f_t \odot c_{t-1},\; i_t \odot g_t)$ as input to an inner LSTM (or NLSTM) memory cell, and set $c^{\text{outer}}_t = h^{\text{inner}}_t$. Nested LSTMs outperform both stacked and single-layer LSTMs with similar numbers of parameters in our experiments on various character-level language modeling tasks, and the inner memories of an LSTM learn longer-term dependencies compared with the higher-level units of a stacked LSTM.
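
A hedged PyTorch sketch of what such a cell update could look like, with the inner LSTM replacing the usual additive cell update; the output activation and layer sizes are assumptions, and this is not the authors' released implementation:

```python
import torch
import torch.nn as nn

class NestedLSTMCell(nn.Module):
    """Minimal sketch: the outer cell state is produced by an inner LSTMCell
    instead of the usual additive update c_t = f*c_{t-1} + i*g."""
    def __init__(self, input_size, hidden_size):
        super().__init__()
        self.gates = nn.Linear(input_size + hidden_size, 4 * hidden_size)
        self.inner = nn.LSTMCell(hidden_size, hidden_size)

    def forward(self, x, state):
        h_prev, c_prev, inner_c_prev = state
        i, f, o, g = self.gates(torch.cat([x, h_prev], dim=1)).chunk(4, dim=1)
        i, f, o = torch.sigmoid(i), torch.sigmoid(f), torch.sigmoid(o)
        g = torch.tanh(g)
        # Inner LSTM: input is i*g, hidden state is f*c_prev (the pair described
        # in the abstract), and it keeps its own inner cell state.
        inner_h, inner_c = self.inner(i * g, (f * c_prev, inner_c_prev))
        c = inner_h                   # outer memory cell = inner hidden state
        h = o * torch.tanh(c)         # assuming a standard tanh output (sketch)
        return h, (h, c, inner_c)

# Usage on a dummy batch of 8 input vectors of size 32 with hidden size 64.
cell = NestedLSTMCell(32, 64)
x = torch.randn(8, 32)
state = tuple(torch.zeros(8, 64) for _ in range(3))
h, state = cell(x, state)
```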

A full implementation is available on GitHub.

Nested LSTMs

Who is the true black box?

When smart people like Elon Musk and Stephen Hawking talk about something entirely outside their fields, our first reaction should be to treat their opinions with caution; but when the same smart people use alarmist rhetoric about things they do not understand, we really need to take a more pragmatic view of their argument and look for its flaws, especially when they call for regulation of these "black boxes".

This article by Vijay Pande in The New York Times, "Artificial Intelligence's 'Black Box' Is Nothing to Fear", gives us a sober view of what the real black box is:

Given these types of concerns, the unseeable space between where data goes in and answers come out is often referred to as a “black box” — seemingly a reference to the hardy (and in fact orange, not black) data recorders mandated on aircraft and often examined after accidents. In the context of A.I., the term more broadly suggests an image of being in the “dark” about how the technology works: We put in and provide the data and models and architectures, and then computers provide us answers while continuing to learn on their own, in a way that’s seemingly impossible — and certainly too complicated — for us to understand.

There’s particular concern about this in health care, where A.I. is used to classify which skin lesions are cancerous, to identify very early-stage cancer from blood, to predict heart disease, to determine what compounds in people and animals could extend healthy life spans and more. But these fears about the implications of black box are misplaced. A.I. is no less transparent than the way in which doctors have always worked — and in many cases it represents an improvement, augmenting what hospitals can do for patients and the entire health care system. After all, the black box in A.I. isn’t a new problem due to new tech: Human intelligence itself is — and always has been — a black box.

Let’s take the example of a human doctor making a diagnosis. Afterward, a patient might ask that doctor how she made that diagnosis, and she would probably share some of the data she used to draw her conclusion. But could she really explain how and why she made that decision, what specific data from what studies she drew on, what observations from her training or mentors influenced her, what tacit knowledge she gleaned from her own and her colleagues’ shared experiences and how all of this combined into that precise insight? Sure, she’d probably give a few indicators about what pointed her in a certain direction — but there would also be an element of guessing, of following hunches. And even if there weren’t, we still wouldn’t know that there weren’t other factors involved of which she wasn’t even consciously aware.

If the same diagnosis had been made with A.I., we could draw from all available information on that particular patient — as well as data anonymously aggregated across time and from countless other relevant patients everywhere, to make the strongest evidence-based decision possible. It would be a diagnosis with a direct connection to the data, rather than human intuition based on limited data and derivative summaries of anecdotal experiences with a relatively small number of local patients.

But we make decisions in areas that we don’t fully understand every day — often very successfully — from the predicted economic impacts of policies to weather forecasts to the ways in which we approach much of science in the first place. We either oversimplify things or accept that they’re too complex for us to break down linearly, let alone explain fully. It’s just like the black box of A.I.: Human intelligence can reason and make arguments for a given conclusion, but it can’t explain the complex, underlying basis for how we arrived at a particular conclusion. Think of what happens when a couple get divorced because of one stated cause — say, infidelity — when in reality there’s an entire unseen universe of intertwined causes, forces and events that contributed to that outcome. Why did they choose to split up when another couple in a similar situation didn’t? Even those in the relationship can’t fully explain it. It’s a black box.

A good point about this argument: when we get a diagnosis from our doctors, how much of their knowledge (e.g. field experience, practical skills, etc.) is fully interpretable, consistent with the literature and transparent to the patient at the moment of the examination?

And the final argument of the article turns on a familiar aspect of computation: debugging.

The irony is that compared with human intelligence, A.I. is actually the more transparent of intelligences. Unlike the human mind, A.I. can — and should — be interrogated and interpreted. Like the ability to audit and refine models and expose knowledge gaps in deep neural nets and the debugging tools that will inevitably be built and the potential ability to augment human intelligence via brain-computer interfaces, there are many technologies that could help interpret artificial intelligence in a way we can’t interpret the human brain. In the process, we may even learn more about how human intelligence itself works. 

The final conclusion is this: if you see a smart, well-known person talking about something totally outside of their original field, be skeptical.

Who is the true black box?