What is Deep Learning?

Straight from O’Reilly

So, what is deep learning?

It’s a term that covers a particular approach to building and training neural networks. Neural networks have been around since the 1950s, and like nuclear fusion, they’ve been an incredibly promising laboratory idea whose practical deployment has been beset by constant delays. I’ll go into the details of how neural networks work a bit later, but for now you can think of them as decision-making black boxes. They take an array of numbers (that can represent pixels, audio waveforms, or words), run a series of functions on that array, and output one or more numbers as outputs. The outputs are usually a prediction of some properties you’re trying to guess from the input, for example whether or not an image is a picture of a cat.

The functions that are run inside the black box are controlled by the memory of the neural network, arrays of numbers known as weights that define how the inputs are combined and recombined to produce the results. Dealing with real-world problems like cat-detection requires very complex functions, which mean these arrays are very large, containing around 60 million numbers in the case of one of the recent computer vision networks. The biggest obstacle to using neural networks has been figuring out how to set all these massive arrays to values that will do a good job transforming the input signals into output predictions.
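To make the "black box" description concrete, here is a minimal sketch of such a network in plain Python. It is only an illustration under stated assumptions: the layer sizes, the hand-picked weight values, the bias terms, and the sigmoid function are all hypothetical choices, not details taken from the quoted text.

```python
import math

def forward(inputs, layers):
    """Run an input vector through a list of (weights, biases) layers."""
    activations = list(inputs)
    for weights, biases in layers:
        # Each unit computes a weighted sum of the previous activations plus
        # a bias, then squashes it into (0, 1) with a sigmoid (an assumption).
        activations = [
            1.0 / (1.0 + math.exp(-(sum(w * a for w, a in zip(row, activations)) + b)))
            for row, b in zip(weights, biases)
        ]
    return activations

# Hypothetical hand-picked parameters: 3 inputs -> 2 hidden units -> 1 output.
layers = [
    ([[0.5, -0.2, 0.1], [0.3, 0.8, -0.5]], [0.0, 0.1]),
    ([[1.2, -0.7]], [0.05]),
]

pixels = [0.9, 0.1, 0.4]            # stand-in for an array of input numbers
score = forward(pixels, layers)[0]  # a single output, e.g. a "cat" score

# Count every number the network stores (weights plus biases).
n_params = sum(len(row) for weights, _ in layers for row in weights)
n_params += sum(len(biases) for _, biases in layers)
```

This toy network stores only a handful of numbers, while the vision network mentioned above holds around 60 million. Choosing good values for them is the hard part, and the sketch skips it entirely by hard-coding them.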

 

O que é Deep Learning?

Paper: Ensemble methods for uplift modeling

This paper, on applying ensemble methods specifically to uplift modeling, is a great guide to how techniques are not canonical when it comes to solving problems.

Abstract: Uplift modeling is a branch of machine learning which aims at predicting the causal effect of an action such as a marketing campaign or a medical treatment on a given individual by taking into account responses in a treatment group, containing individuals subject to the action, and a control group serving as a background. The resulting model can then be used to select individuals for whom the action will be most profitable. This paper analyzes the use of ensemble methods: bagging and random forests in uplift modeling. We perform an extensive experimental evaluation to demonstrate that the application of those methods often results in spectacular gains in model performance, turning almost useless single models into highly capable uplift ensembles. The gains are much larger than those achieved in case of standard classification. We show that those gains are a result of high ensemble diversity, which in turn is a result of the differences between class probabilities in the treatment and control groups being harder to model than the class probabilities themselves. The feature of uplift modeling which makes it difficult thus also makes it amenable to the application of ensemble methods. As a result, bagging and random forests emerge from our evaluation as key tools in the uplift modeling toolbox.
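The idea in the abstract can be sketched in a few lines of Python. This is a rough illustration, not the paper's algorithm: the one-split "stump" base learner, its splitting rule, the separate bootstrapping of each group, and the synthetic data are all assumptions made for the example.

```python
import random

def rate(rows):
    """Mean outcome of a list of (x, y) rows; 0.0 for an empty list."""
    return sum(y for _, y in rows) / len(rows) if rows else 0.0

def uplift(treat, ctrl):
    """Estimated uplift: treatment response rate minus control response rate."""
    return rate(treat) - rate(ctrl)

def fit_stump(treat, ctrl):
    """Hypothetical base learner: one split on x that maximizes the gap
    between the uplift estimated on each side of the threshold."""
    best = None
    for t in sorted({x for x, _ in treat + ctrl}):
        left = uplift([r for r in treat if r[0] <= t], [r for r in ctrl if r[0] <= t])
        right = uplift([r for r in treat if r[0] > t], [r for r in ctrl if r[0] > t])
        gap = abs(left - right)
        if best is None or gap > best[0]:
            best = (gap, t, left, right)
    _, t, left, right = best
    return lambda x: left if x <= t else right

def bagged_uplift(treat, ctrl, n_models=25, seed=0):
    """Bagging: fit one stump per bootstrap sample and average the predictions."""
    rng = random.Random(seed)
    models = []
    for _ in range(n_models):
        # Assumption: treatment and control groups are resampled separately.
        boot_t = [rng.choice(treat) for _ in treat]
        boot_c = [rng.choice(ctrl) for _ in ctrl]
        models.append(fit_stump(boot_t, boot_c))
    return lambda x: sum(m(x) for m in models) / len(models)

# Synthetic data (made up): the action only helps individuals with x >= 5.
data_rng = random.Random(42)

def simulate(n, treated):
    rows = []
    for _ in range(n):
        x = data_rng.randrange(10)
        p = 0.2 + (0.3 if treated and x >= 5 else 0.0)
        rows.append((x, 1 if data_rng.random() < p else 0))
    return rows

treat, ctrl = simulate(2000, True), simulate(2000, False)
model = bagged_uplift(treat, ctrl)
high, low = model(8), model(1)  # predicted uplift for two individuals
```

On this synthetic data the ensemble should score individuals with a large x higher than those with a small x, which is the ranking one would use to decide who receives the action.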


Comparison of Machine Learning APIs

Machine Learning services are increasingly popular: there is strong demand today for ever more accurate prediction models, and many companies are unable to take on Data Scientists.

However, much of what is sold as Machine Learning is, more often than not, a fork of some open source project marketed as if it were the eighth wonder of the world.

The results?

Of the APIs compared, Google performed the worst, being the slowest and least accurate of the four. Amazon ML was the most accurate, but this came at the expense of time to train and make predictions. BigML proved to be the fastest in both training and predictions, but compromised on accuracy.
