My Production Data Science Workflow

Hello World,

So I’ve spent a while now looking at three competing languages, and I did my best to give each one a fair shake. Those three languages were F#, Python and R. It was really close for a while, because each language has its strengths and weaknesses. That said, I am moving forward with two languages and a very specific way of using each one. I wanted to outline this because it took me a very long time to learn each language well enough to reach this conclusion, and I would hate for others to have to go through the same exercise.

Continue reading

Linear Regression from Scratch using Linear Algebra

Hello World!

So I wrote an earlier article, “Linear Regression From Scratch”. Many folks have pointed out that its approach is in fact not optimal. Being the perfectionist, I decided to re-implement it. Not to mention it works great in my own libraries. The following article discusses converting the original code into code that uses linear algebra. Beyond this, it still works in a PCL for Xamarin. Hoo-Rah Xamarin!
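To give a flavor of what "using linear algebra" can mean here, below is a minimal sketch in Python. It is not the code from the article. It assumes the common normal-equation approach, where the coefficients solve (XᵀX)θ = Xᵀy, and every function name in it is hypothetical.

```python
# A sketch only: assumes the normal equation (X^T X) theta = X^T y.
# All names are hypothetical and are not taken from the original article.

def transpose(m):
    """Return the transpose of a matrix stored as a list of rows."""
    return [list(col) for col in zip(*m)]

def matmul(a, b):
    """Multiply two matrices stored as lists of rows."""
    bt = transpose(b)
    return [[sum(x * y for x, y in zip(row, col)) for col in bt] for row in a]

def solve(a, b):
    """Solve a @ x = b by Gauss-Jordan elimination with partial pivoting."""
    n = len(a)
    aug = [list(map(float, a[i])) + [float(b[i])] for i in range(n)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        if abs(aug[pivot][col]) < 1e-12:
            raise ValueError("matrix is singular")
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(n):
            if r != col:
                factor = aug[r][col] / aug[col][col]
                aug[r] = [v - factor * p for v, p in zip(aug[r], aug[col])]
    return [aug[i][n] / aug[i][i] for i in range(n)]

def fit_normal_equation(xs, ys):
    """Fit y = theta0 + theta1 * x and return [theta0, theta1]."""
    X = [[1.0, x] for x in xs]          # a column of ones models the intercept
    Xt = transpose(X)
    XtX = matmul(Xt, X)
    Xty = [sum(x * y for x, y in zip(row, ys)) for row in Xt]
    return solve(XtX, Xty)

# Points that lie exactly on y = 2x + 1
theta = fit_normal_equation([1, 2, 3, 4], [3, 5, 7, 9])
```

For data that sits exactly on a line, the fitted intercept and slope come back as that line's coefficients, up to floating-point error.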

Continue reading

Battle of the Programming Languages

Hello World!

So this article is meant to provide some guidance on which programming language to use. Note that this article is specifically geared towards delivering code in which intelligence and information are the soul of the product. In this day and age, that should be every product.

I want to preface this article with a few things:

  1. This is an excerpt from a paper I wrote for internal use of my own volition.  As this is the case, I was able to remove all confidential information and publish my findings.
  2. I only analyzed F#, C#, R and Python. I know there are more, but I picked the top dogs. F# had some special circumstances that made me feel it belonged.

Continue reading

Machine Learning Study Group Recap – Week 1

Hello World!

So, many of you who are here are probably part of the study group. For those who are not, or who are perhaps referencing this at a later time, this is in regard to the following course on Coursera. If you would like to join our study group, please see one of the following meetup pages: Fort Lauderdale Machine Learning or Florida Dot Net.

Here in South Florida we have a strong Machine Learning and Data Science community, and therefore it is easy to get a study group together. This article is a recap of the first meeting of our study group. Note that this first meeting took place the week before the class started. Therefore this article is a great introduction to machine learning, languages, commitments and more generally applicable questions and concerns.

Continue reading

Linear Regression from Scratch

Hello World,

So today we will do a quick conversion from the mathematical notation of algebra into a real algorithm that can be executed. Note that we will not be covering gradient descent, but rather only cost functions, errors and the execution of these, to provide the framework for gradient descent. Gradient descent has so many flavors that it deserves its own article.

So, on to the mathematical representation.

[Image: LinearRegression]
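As a rough illustration of the kind of conversion described above, here is a minimal Python sketch of an error and cost calculation. The exact formula from the article is not reproduced in this excerpt, so the sketch assumes the common half mean squared error cost for a single-feature model, and all names in it are hypothetical.

```python
# A sketch only: assumes the hypothesis h(x) = theta0 + theta1 * x and the
# cost J = (1 / (2m)) * sum((h(x_i) - y_i)^2). The article's own formula
# may differ, and every name here is hypothetical.

def predict(theta0, theta1, x):
    """Evaluate the linear hypothesis for one input."""
    return theta0 + theta1 * x

def errors(theta0, theta1, xs, ys):
    """Return prediction minus target for every sample."""
    return [predict(theta0, theta1, x) - y for x, y in zip(xs, ys)]

def cost(theta0, theta1, xs, ys):
    """Half of the mean squared error over all samples."""
    errs = errors(theta0, theta1, xs, ys)
    return sum(e * e for e in errs) / (2 * len(errs))

perfect_fit = cost(1.0, 2.0, [1, 2, 3], [3, 5, 7])   # parameters match the data
poor_fit = cost(0.0, 0.0, [1, 2], [1, 1])            # always predicts zero
```

A gradient descent routine would call a function like `cost` repeatedly while adjusting `theta0` and `theta1`, which is why the cost and error pieces are worth getting right first.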

Continue reading

Convert Latitude Longitude to Degree Minutes Seconds for Maps

Hello World!

If you have ever built mapping applications, you may have needed to do this. It takes a lot of looking around the internet to finally find the right equation. For our application, we need to do this for Google Maps, as it does not take a latitude/longitude combination like Bing Maps does. If you choose to support ONLY Bing Maps, your job is easy, as here is the format: http://www.bing.com/maps/default.aspx?q=LATITUDE%2c+LONGITUDE (include negative signs if necessary). However, Google Maps requires more work (UGH!): https://www.google.com/maps/place/LATITUDE <directional> LONGITUDE <directional>, ZOOM (with various encoded separators).

This article goes through the code that converts a latitude/longitude pair, such as you would pull from a phone’s GPS, into the DMS format needed by Google Maps. Again, you don’t even need to bother with this conversion if you choose to use Bing Maps; it is simply as stated above.
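For reference, the arithmetic behind a decimal-degrees to DMS conversion can be sketched in a few lines of Python. This is not the code from the article, the function names are hypothetical, and the exact separator encoding that Google Maps expects is not shown in this excerpt, so the sketch stops at a plain DMS string. The Bing Maps helper simply fills in the URL format quoted above.

```python
# A sketch only: hypothetical helper names, not the article's code.

def to_dms(value, is_latitude):
    """Split decimal degrees into (degrees, minutes, seconds, direction)."""
    if is_latitude:
        direction = "N" if value >= 0 else "S"
    else:
        direction = "E" if value >= 0 else "W"
    # Work in total seconds, rounded to one decimal place, so the seconds
    # field never shows up as 60.0 after formatting.
    total_seconds = round(abs(value) * 3600, 1)
    degrees = int(total_seconds // 3600)
    minutes = int((total_seconds % 3600) // 60)
    seconds = total_seconds % 60
    return degrees, minutes, seconds, direction

def format_dms(value, is_latitude):
    """Render one coordinate as a plain DMS string, e.g. 26°30'0.0"N."""
    d, m, s, direction = to_dms(value, is_latitude)
    return f"{d}°{m}'{s:.1f}\"{direction}"

def bing_maps_url(latitude, longitude):
    """Fill in the Bing Maps format quoted above (negative signs are kept)."""
    return f"http://www.bing.com/maps/default.aspx?q={latitude}%2c+{longitude}"

lat_text = format_dms(26.5, True)
lon_text = format_dms(-80.25, False)
bing_url = bing_maps_url(26.5, -80.25)
```

The sign of the decimal value turns into the directional letter, which is why the DMS form always uses the absolute value for degrees, minutes and seconds.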

Continue reading

Using Census Data to Help Pick your Child’s Name

Hello World!

So I had a life-changing event this past Sunday, 5/24/2015, at 8:55am. My first child was born! Both child and wife are healthy and happy. Everything is good in life. Like many couples though, my wife and I struggled to find the right name for our child. We didn’t want something too common, something that sounded like an old person’s name, or something so rare and funky that nobody could spell it. We also realized we just didn’t know what names were out there. So after much debate and discussion over what to name her, I started doing a bit of analysis using some census data. I want to thank Jamie Dixon for providing the data that he found for use in his Dinner Nerds article. The data itself can be found here. This article will discuss the code used to go through all of the data and provide insights into child names.

Continue reading

How to Datamine Zillow

Hello World,

As many of you may know at this point, I am relocating to South Florida. The final location is to be determined, but I will probably be renting around Pompano Beach or Fort Lauderdale while working out of Venture Hive and the Microsoft Fort Lauderdale offices. So what does this have to do with Zillow? Well, it has EVERYTHING to do with Zillow. What I’ve found while searching for homes is that between Realtors, Zillow and Trulia, they really just don’t have a predictive analytics solution that works for me. So I decided to give AzureML a shot and mash together a few datasets to send me notifications more to my liking than the ones currently being sent. So step 1 in this plan is to data mine Zillow. Luckily, Zillow has an API for that. Or if you are feeling particularly frisky, Zillow gets their data from ArcGIS (example for Raleigh). So let’s get cracking…

Continue reading