# My Production Data Science Workflow

Hello World,

I’ve spent a while now looking at three competing languages, and I did my best to give each one a fair shake. Those three languages were F#, Python and R. It was really close for a while, because each language has its strengths and weaknesses. That said, I am moving forward with two languages and a very specific way of using each one. I wanted to outline this because it took me a very long time to learn all of these languages well enough to discover it, and I would hate for others to go through the same exercise.

# Machine Learning Study Group Recap – Week 4

Hello World,

So here we go with another recap. This week we did a deep dive into binary classification using logistic regression. Logistic regression and binary classification are the underpinnings of modern neural networks, so a deep and complete understanding of them is necessary to be proficient in machine learning.

# Sigmoid for Classifiers Decoded

Hello World,

Sigmoid really isn’t that complicated (once you understand it, of course). Some background in case you are coming at this totally fresh: the sigmoid function is used in machine learning primarily as a hypothesis function for classifiers. What is interesting is that this same function is used for binary classifiers and multi-class classifiers, and is the backbone of modern neural networks.

Here is the sigmoid function:    $\frac{ 1 }{ 1 + e^{-z}}$
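As a rough illustration of the formula above, here is a minimal NumPy sketch. The names `sigmoid` and `predict`, and the 0.5 threshold, are assumptions made for this example rather than anything taken from the original post.

```python
import numpy as np

def sigmoid(z):
    """Logistic sigmoid 1 / (1 + e^(-z)), applied element-wise."""
    z = np.asarray(z, dtype=float)
    return 1.0 / (1.0 + np.exp(-z))

def predict(z, threshold=0.5):
    """Hypothetical helper: turn sigmoid outputs into 0/1 class labels."""
    return (sigmoid(z) >= threshold).astype(int)

# Made-up scores: strongly negative, neutral, strongly positive.
scores = np.array([-4.0, 0.0, 4.0])
probabilities = sigmoid(scores)  # values squashed into the open interval (0, 1)
labels = predict(scores)         # class labels after thresholding
```

The function maps any real number into the range between 0 and 1, which is why its output can be read as the probability of the positive class.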

# Categories of Analytics

Hello World,

This is a high-level article geared toward a general audience. I’ve been thinking about the types of customer engagements I have been doing lately and decided to break them down into categories. There are 4 categories of engagements: Descriptive, Predictive, Prescriptive and Actuated Analytics engagements.

# Merging Data Sets in Python

Hello World,

This article is inspired by a customer doing financial analysis who can only grab a certain amount of data at a time from the data steward’s stores, in chunks based on time windows. Because time is constantly moving, you occasionally get duplicate data across requests. If you attempt to grab exactly on the edges, you have a chance of missing something, so it’s best to have a bit of an overlap and just deal with that overlap.
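One possible way to deal with that overlap is to concatenate the chunks and drop the repeated rows. The sketch below uses pandas and assumes each row can be identified by a `timestamp` column. The `merge_chunks` helper, the column names and the sample values are all made up for illustration and are not the customer's actual data or the approach from the full article.

```python
import pandas as pd

def merge_chunks(chunks, key_columns):
    """Hypothetical helper: stack overlapping chunks and keep one row per key."""
    combined = pd.concat(chunks, ignore_index=True)
    deduplicated = combined.drop_duplicates(subset=key_columns, keep="first")
    return deduplicated.sort_values(key_columns).reset_index(drop=True)

# Two made-up time-window requests that overlap on the 09:10 row.
first = pd.DataFrame({
    "timestamp": pd.to_datetime(
        ["2024-01-01 09:00", "2024-01-01 09:05", "2024-01-01 09:10"]
    ),
    "price": [10.0, 10.5, 11.0],
})
second = pd.DataFrame({
    "timestamp": pd.to_datetime(["2024-01-01 09:10", "2024-01-01 09:15"]),
    "price": [11.0, 11.5],
})

merged = merge_chunks([first, second], key_columns=["timestamp"])
```

Deduplicating on a key column rather than on whole rows is a design choice: it assumes the key is unique in the source system, and it keeps the first copy seen if the overlapping rows happen to differ.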

# Getting Started with Linear Algebra in Python

Hello World!

So here I am, after trying for a long time not to learn Python, learning Python. It just seems like I might get a hit or two more on my blog with some Python content. Well, what’s the first thing I need to figure out, aside from getting it up and running in my environment and installing some libraries… That’s right: find a numerical computing library and see what makes it tick.

Hello World!

In this article we are going to cover a simple version of gradient descent. It is important to note that this version of gradient descent uses Sum of Squares as the cost function it reduces. This implementation utilizes vectorized algorithms. Let’s start off with…

# Machine Learning Study Group – Week 2 & 3 Recap

Hello World,

So this is another recap from our study group covering the Andrew Ng course on Coursera. Let’s start with a quick summary of the two weeks. Week 1 was all about introduction to linear regression and gradient descent. There were no assignments due. Week 3 was all about multi-variate linear regression, normalization and a few other topics. There was a coding assignment as well as a quiz due for week 2.

$\theta_j = \theta_j - \alpha \cdot \frac{ \sum_{i=1}^{m} \left(H_{\theta}\left(x^{(i)}\right) - y^{(i)}\right) \cdot x_j^{(i)} } { m }$
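To make the update rule above concrete, here is a minimal vectorized sketch in NumPy. It assumes a linear hypothesis $H_{\theta}(x) = \theta^T x$ and a design matrix `X` whose first column is all ones. The function name, the learning rate, the iteration count and the toy data are assumptions chosen for this example, not the course's reference implementation.

```python
import numpy as np

def gradient_descent(X, y, alpha=0.1, iterations=1000):
    """Batch gradient descent for linear regression (illustrative sketch).

    X: (m, n) design matrix, y: (m,) targets, alpha: learning rate.
    """
    m, n = X.shape
    theta = np.zeros(n)
    for _ in range(iterations):
        errors = X @ theta - y            # H_theta(x) - y for every example
        gradient = X.T @ errors / m       # sum over examples of error * x_j, divided by m
        theta = theta - alpha * gradient  # update every theta_j simultaneously
    return theta

# Made-up data generated from y = 1 + 2x, so theta should approach [1, 2].
x = np.array([0.0, 1.0, 2.0, 3.0])
X = np.column_stack([np.ones_like(x), x])  # prepend the intercept column
y = 1.0 + 2.0 * x
theta = gradient_descent(X, y, alpha=0.1, iterations=5000)
```

The matrix product `X.T @ errors` computes the sum in the numerator for every $j$ at once, which is what makes the implementation vectorized rather than a loop over parameters.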