Knowledge Base

Big Data Analytics, Machine Learning, Optimization, Collaborative Filtering, Data Mining...

Data Science Community on G+

Please join us at the Data Science Community on Google+. If you are a Google+ user, see you there!


A New Big Data Analytics Engine: Shark

A newly published research paper from UC Berkeley's AMPLab, “Shark: SQL and Rich Analytics at Scale”, looks pretty promising. The authors claim that “Shark is 100X faster than Hive for SQL, and 100X faster than Hadoop for machine-learning”. It builds on a newly proposed distributed shared memory abstraction called Resilient Distributed Datasets (RDDs), which can...


Introduction to Matrix Factorization in Collaborative Filtering – Tutorial

This is something I put together as internal training material and thought it might be worth sharing…

Presentation Outline
1. Introduction to Matrix Factorization Collaborative Filtering – Alex Lin, Senior Architect, Intelligent Mining
2. Outline – Factor analysis – Matrix decomposition – Matrix Factorization Model – Minimizing Cost...
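For readers who want a concrete picture of the "Matrix Factorization Model" and "Minimizing Cost" items in the outline, here is a minimal sketch in plain Python. It is not taken from the slides: it assumes the common formulation in which each rating is approximated by the dot product of a user factor vector and an item factor vector, and the regularized squared error is reduced by stochastic gradient descent. The function names, the toy ratings, and the hyperparameters (`k`, `epochs`, `lr`, `reg`) are all illustrative assumptions.

```python
import random

def predict(P, Q, u, i):
    """Predicted rating: dot product of user u's and item i's factor vectors."""
    return sum(pu * qi for pu, qi in zip(P[u], Q[i]))

def squared_error(ratings, P, Q):
    """Sum of squared errors over the observed (user, item, rating) triples."""
    return sum((r - predict(P, Q, u, i)) ** 2 for u, i, r in ratings)

def factorize(ratings, n_users, n_items, k=2, epochs=2000, lr=0.01, reg=0.02, seed=0):
    """Learn user factors P and item factors Q by SGD on observed ratings only.

    All hyperparameter defaults are assumptions chosen for this toy example.
    """
    rng = random.Random(seed)
    P = [[rng.uniform(0.1, 0.5) for _ in range(k)] for _ in range(n_users)]
    Q = [[rng.uniform(0.1, 0.5) for _ in range(k)] for _ in range(n_items)]
    for _ in range(epochs):
        for u, i, r in ratings:
            err = r - predict(P, Q, u, i)
            for f in range(k):
                pu, qi = P[u][f], Q[i][f]
                # Gradient step on the regularized squared error
                P[u][f] += lr * (err * qi - reg * pu)
                Q[i][f] += lr * (err * pu - reg * qi)
    return P, Q

# Toy data (hypothetical): 4 users, 3 items, ratings on a 1-5 scale
ratings = [(0, 0, 5), (0, 1, 3), (1, 0, 4), (1, 2, 1),
           (2, 1, 1), (2, 2, 5), (3, 0, 1), (3, 2, 4)]
P, Q = factorize(ratings, n_users=4, n_items=3)

# Error of predicting 0 everywhere, used as a naive reference point
baseline = squared_error(ratings, [[0.0, 0.0]] * 4, [[0.0, 0.0]] * 3)
final = squared_error(ratings, P, Q)
```

Once trained, `predict(P, Q, u, i)` can be called for a user and item pair that has no observed rating, which is how such a model is typically used to make recommendations.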


An Approach to R Package Recommendation Engine

Slides presented by Alex to the NYC Predictive Analytics group on March 10, 2011, describing his approach to the R Package Recommendation Engine competition on Kaggle, where he placed 4th…

Presentation Outline
1. An Approach to R Package Recommendation Engine by Alex Lin of Intelligent Mining
2. ...


Naive Bayesian Text Classifier Event Models

Most people know the naive Bayes classifier, but few of them actually remember that there are different event models that can be used with it.
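To make the distinction concrete, here is a minimal sketch in plain Python. It assumes the event models in question are the two most commonly discussed for text, the multivariate Bernoulli model (word present or absent per document) and the multinomial model (word occurrence counts); the slides may cover others. The function names, the toy documents, and the use of Laplace smoothing are illustrative assumptions.

```python
import math
from collections import Counter

def train(docs, vocab):
    """Estimate per-word parameters for ONE class under both event models.

    docs is a list of token lists that all belong to the same class.
    """
    n_docs = len(docs)
    # Bernoulli: probability that a document of this class contains the word
    doc_freq = Counter(w for d in docs for w in set(d))
    bern = {w: (doc_freq[w] + 1) / (n_docs + 2) for w in vocab}
    # Multinomial: probability that a token drawn from this class is the word
    tok = Counter(w for d in docs for w in d)
    total = sum(tok[w] for w in vocab)
    multi = {w: (tok[w] + 1) / (total + len(vocab)) for w in vocab}
    return bern, multi

def bernoulli_loglik(doc, bern):
    """Every vocabulary word contributes, whether present or absent."""
    present = set(doc)
    return sum(math.log(p if w in present else 1 - p) for w, p in bern.items())

def multinomial_loglik(doc, multi):
    """Each token occurrence contributes; the multinomial coefficient is
    dropped because it is the same for every class."""
    return sum(math.log(multi[w]) for w in doc if w in multi)

# Toy class (hypothetical): two short "sports" documents
vocab = ["ball", "goal", "vote"]
sports_docs = [["ball", "ball", "goal"], ["goal", "ball"]]
bern, multi = train(sports_docs, vocab)

once = ["ball"]
twice = ["ball", "ball"]
```

The practical difference shows up with repeated words: `bernoulli_loglik` gives `once` and `twice` the same score because it only records presence, while `multinomial_loglik` scores them differently because every occurrence counts. The Bernoulli model also penalizes a document for the vocabulary words it does not contain, which the multinomial model ignores.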
