
Posted by Martin Zapletal
Sun, Nov 20, 2016

Introduction

In this series of posts I will discuss the evolution of machine learning algorithms with regard to scaling and performance. We will start with a naive implementation and progress to more advanced solutions, finally reaching state-of-the-art implementations similar to what companies like Google, Netflix and others use for their data pipelines, recommendation systems or machine learning. A variety of topics will be discussed, from the basics of ML, different programming models and the impact of a distributed environment to the specifics of machine learning algorithms as compared to common business applications, and much more. For those not particularly interested in machine learning, the concepts discussed are chosen carefully to apply to a wide range of applications, with ML itself serving as a good example.

In my previous blog post we looked into neural networks and their training, and investigated a trivial single-threaded, object-oriented implementation. The result was a working example that was, however, not useful in many real-world scenarios because of its poor performance. With large amounts of data such an approach is extremely wasteful, and we can achieve vastly better performance through parallelization.

Posted by Martin Zapletal
Sat, Oct 1, 2016

Introduction

In this series of posts I will discuss the evolution of machine learning algorithms with regard to scaling and performance. We will start with a naive implementation and progress to more advanced solutions, finally reaching state-of-the-art implementations similar to what companies like Google, Netflix and others use for their data pipelines, recommendation systems or machine learning. A variety of topics will be discussed, from the basics of ML, different programming models and the impact of a distributed environment to the specifics of machine learning algorithms as compared to common business applications, and much more. For those not particularly interested in machine learning, the concepts discussed are chosen carefully to apply to a wide range of applications, with ML itself serving as a good example.

Although big data analytics and machine learning are very old concepts, their importance is steadily increasing. One reason is the improving accessibility of tools and decreasing prices, and therefore the growing ability to access, store, process and use large amounts of data. And data are key for many use cases, from optimizing standard business use cases to finding and opening new business opportunities to completely transforming businesses.

Throughout this series of blog posts we will touch on many topics, from machine learning, functional programming and parallel programming to distributed systems theory. I will start with a brief introduction to the different programming models, followed by an abstract description of single-machine, parallel and distributed computation, common data processing architectures, pipelines and technology stacks, before getting to the actual focus of the blog post. Feel free to skip to the Perceptron chapter if you want.

Posted by Carl Pulley
Mon, Jan 26, 2015

In this post we demonstrate how machine learning (specifically SVMs) may be used to identify gesture events, such as taps, in data streams produced by accelerometers in devices such as Pebble watches.

We start by developing prototype classification models in R and then port those models into Scala.

Applying the trained SVM models to unseen data, we successfully demonstrate an ability to punctuate exercise sessions by identifying taps, which tokenise those exercise streams into separate activity periods.

Posted by Martin Zapletal
Sun, Nov 9, 2014

Apache Spark has been receiving a lot of deserved attention lately [1]. This is very understandable given the huge importance of distributed data processing for many companies and the pursuit of faster, cheaper and easier-to-use technologies aiming to replace or complement the widely adopted Hadoop ecosystem and its MapReduce paradigm.
