Wednesday, February 29, 2012

Machine Learning

Topics from Stanford's Machine Learning class:

Supervised Learning

In supervised learning, one has a set of data with features and labels. The goal is to learn from these examples how to predict the label for new data.

Linear Regression – predicts a continuous value from one or multiple variables
Gradient Descent – a general algorithm for minimizing a function
Logistic Regression – This is useful for predicting classification-type results. For example, are you looking for a yes or no result? Does the patient have cancer? Will the customer buy my new product? It can also be used when there are more than 2 possible results. What color will a person choose (red, blue, green, silver)?
Neural Networks – A learning algorithm that is loosely modeled after the brain. Think of layers of connected neurons.
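
To make the first two topics concrete, here is a minimal sketch of gradient descent fitting a one-variable linear regression. The function name, the learning rate alpha, the iteration count, and the toy data are all assumptions picked for illustration; they are not taken from the class material.

```python
def gradient_descent(xs, ys, alpha=0.05, iterations=5000):
    """Fit y = theta0 + theta1 * x by batch gradient descent on squared error."""
    theta0, theta1 = 0.0, 0.0
    m = len(xs)
    for _ in range(iterations):
        # Prediction error for every training example with the current parameters.
        errors = [theta0 + theta1 * x - y for x, y in zip(xs, ys)]
        # Partial derivatives of the cost (1 / 2m) * sum(error^2).
        grad0 = sum(errors) / m
        grad1 = sum(e * x for e, x in zip(errors, xs)) / m
        # Update both parameters simultaneously, stepping against the gradient.
        theta0 -= alpha * grad0
        theta1 -= alpha * grad1
    return theta0, theta1


# Toy data generated from y = 1 + 2x (illustrative only).
xs = [0.0, 1.0, 2.0, 3.0, 4.0]
ys = [1.0, 3.0, 5.0, 7.0, 9.0]
theta0, theta1 = gradient_descent(xs, ys)
print(theta0, theta1)
```

On this toy data the parameters should settle close to 1 and 2. A learning rate that is too large would make the updates diverge, so alpha usually needs some tuning for real data.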

Unsupervised Learning

In unsupervised learning, one has a set of data with features but no labels. Can some structure be found in the data?

Clustering – The most popular technique is K-means.
PCA (Principal Components Analysis) – reduces the number of dimensions in the data, which can speed up a learning algorithm
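
As a rough illustration of clustering, here is a small K-means sketch in plain Python. It assumes the starting centroids are passed in by hand (real implementations typically pick them at random), and the function name and sample points are made up for this example.

```python
def kmeans(points, centroids, iterations=10):
    """Alternate between assigning points to the nearest centroid and moving centroids."""
    clusters = [[] for _ in centroids]
    for _ in range(iterations):
        # Assignment step: each point joins the cluster of its closest centroid.
        clusters = [[] for _ in centroids]
        for p in points:
            distances = [sum((a - b) ** 2 for a, b in zip(p, c)) for c in centroids]
            clusters[distances.index(min(distances))].append(p)
        # Update step: each centroid moves to the mean of its assigned points.
        # An empty cluster keeps its previous centroid.
        centroids = [
            tuple(sum(coord) / len(cluster) for coord in zip(*cluster)) if cluster else c
            for cluster, c in zip(clusters, centroids)
        ]
    return centroids, clusters


# Two visually obvious groups of 2-D points (illustrative only).
points = [(1.0, 1.0), (1.5, 2.0), (1.0, 1.5), (8.0, 8.0), (9.0, 8.5), (8.5, 9.0)]
centroids, clusters = kmeans(points, centroids=[points[0], points[3]])
print(centroids)
```

Note that no labels are used anywhere: the algorithm only looks at distances between points.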

Anomaly Detection

This section covers methods for determining whether a data point is bad, meaning it looks unusual compared to the rest of the data. Such a data point is considered an anomaly.
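
One common way to do this, sketched below, is to fit a Gaussian (normal) distribution to data that is known to be fine and flag any new value whose probability density falls below a threshold. The single-feature setup, the threshold epsilon, and the sample readings are assumptions for illustration only.

```python
import math


def fit_gaussian(values):
    """Estimate the mean and variance of a single feature."""
    mu = sum(values) / len(values)
    var = sum((v - mu) ** 2 for v in values) / len(values)
    return mu, var


def density(x, mu, var):
    """Gaussian probability density of x under the fitted mean and variance."""
    return math.exp(-((x - mu) ** 2) / (2 * var)) / math.sqrt(2 * math.pi * var)


def is_anomaly(x, mu, var, epsilon=0.01):
    """Flag x when its density is below the (hypothetical) threshold epsilon."""
    return density(x, mu, var) < epsilon


# Readings assumed to be normal, all close to 10 (illustrative only).
normal_readings = [9.8, 10.1, 10.0, 9.9, 10.2, 10.0]
mu, var = fit_gaussian(normal_readings)

print(is_anomaly(10.1, mu, var))  # a typical reading
print(is_anomaly(12.0, mu, var))  # a reading far from the rest
```

In practice the threshold would be chosen using a small set of examples that are already known to be anomalies.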

Recommender Systems

Like the name says, recommender systems are used to make recommendations. Companies like Netflix use recommender systems to recommend new movies to customers. LinkedIn similarly recommends people to connect with. This is a fairly hot topic in the tech world right now.

Content Based (Features)
  Modified Linear Regression
Non-content Based (No Features)
  Collaborative Filtering
  Matrix Factorization
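
To give a feel for the non-content based approach, here is a minimal sketch of matrix factorization trained with stochastic gradient descent. Each user and each item gets a small vector of learned factors, and a rating is predicted as their dot product. The function names, the hyperparameters (k, lr, reg, epochs), and the tiny ratings table are all hypothetical and chosen only for illustration.

```python
import random


def factorize(ratings, n_users, n_items, k=2, lr=0.01, reg=0.01, epochs=2000, seed=0):
    """Learn user factors P and item factors Q from the known ratings only."""
    rng = random.Random(seed)
    P = [[rng.random() for _ in range(k)] for _ in range(n_users)]
    Q = [[rng.random() for _ in range(k)] for _ in range(n_items)]
    for _ in range(epochs):
        for (u, i), r in ratings.items():
            err = r - sum(P[u][f] * Q[i][f] for f in range(k))
            for f in range(k):
                pu, qi = P[u][f], Q[i][f]
                # Gradient step on squared error with a small regularization penalty.
                P[u][f] += lr * (err * qi - reg * pu)
                Q[i][f] += lr * (err * pu - reg * qi)
    return P, Q


def predict(P, Q, u, i):
    """Predicted rating of item i by user u."""
    return sum(a * b for a, b in zip(P[u], Q[i]))


def rmse(ratings, P, Q):
    """Root mean squared error over the known ratings."""
    total = sum((r - predict(P, Q, u, i)) ** 2 for (u, i), r in ratings.items())
    return (total / len(ratings)) ** 0.5


# (user, item) -> rating; the pairs (1, 1) and (2, 0) are unknown (illustrative only).
ratings = {(0, 0): 5, (0, 1): 4, (0, 2): 1, (1, 0): 4, (1, 2): 1, (2, 1): 2, (2, 2): 5}

P0, Q0 = factorize(ratings, 3, 3, epochs=0)  # untrained starting point
P, Q = factorize(ratings, 3, 3)
error_before = rmse(ratings, P0, Q0)
error_after = rmse(ratings, P, Q)
print(error_before, error_after)
print(predict(P, Q, 1, 1))  # a guess for a rating that was never observed
```

No item features are needed here: the factors are inferred purely from which users rated which items and how.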

/via Data Science
