-
The universal workflow of machine learning
Presented here is a universal blueprint you can use to attack and solve any machine learning problem, tying together the concepts you learned about in this chapter: problem definition, evaluation, feature engineering, and fighting o...…
-
Deploying Django to production
Overview: Once your site is finished (or finished “enough” to start public testing) you’re going to need to host it somewhere more public and accessible than your personal development computer. Up to now you’ve been working in a development environme...…
-
Deep Learning - Learning Resources
Courses: CS231n: Convolutional Neural Networks for Visual Recognition, Fei-Fei Li (Stanford); CS224d: Deep Learning for Natural Language Processing, Richard Socher (Stanford). Books: Michael Nielsen, Neural Networks and Deep Learning; Goodfellow,...…
-
Words as Vectors
Word2Vec (Part 1): NLP With Deep Learning with Tensorflow (Skip-gram). Word2Vec (Part 2): NLP With Deep Learning with Tensorflow (CBOW). GloVe: Global Vectors for Word Representation + Implementation…
-
Django with Jupyter
Install django-extensions: `pip install django-extensions`. Change your settings file to include 'django_extensions': `INSTALLED_APPS += ['django_extensions']`. Run your Django server like this: `python manage.py shell_plus --notebook`. Remote Access: On the local ...…
-
Topic Models - Latent Dirichlet Allocation
Graphical Models: Nodes are random variables. Edges denote possible dependence. Observed variables are shaded. Plates denote replicated structure. The structure of the graph defines the pattern of conditional dependence between the ensemble of random...…
-
Get Started with PySpark and Jupyter Notebook
Apache Spark is a must for big data lovers. In a few words, Spark is a fast and powerful framework that provides an API to perform massive distributed processing over resilient sets of data. Jupyter Notebook is a popular application that enables ...…
-
Error of Python virtualenv
$ /usr/local/python-2.7/bin/virtualenv xp_mypy-2.7
New python executable in /scratch/xp_mypy-2.7/bin/python2.7
Also creating executable in /scratch/xp_mypy-2.7/bin/python
Installing setuptools, pip, wheel... Complete output...…
-
Remote Access to IPython Notebooks via SSH
Install IPython. Quickstart: `pip install ipython[all]`. To run IPython’s test suite, use the `iptest` command. Remote Access. Scenario: On your local computer, you want to open and manipulate an IPython notebook running on a remote c...…
-
Work remotely with PyCharm, TensorFlow and SSH
Install PyCharm: Download the PyCharm Community version from [here](https://www.jetbrains.com/pycharm/download). echo 'export LD_LIBRARY_PATH="$LD_LIBRARY_PA...…
-
Useful Links
Detection: Insider Threat Detection: Detecting Variance in User Behavior using an Ensemble Approach; Work remotely with PyCharm, TensorFlow and SSH. Kaggle Experts: Approaching (Almost) Any Machine Learn...…
-
Latent Dirichlet Allocation (LDA)
from sklearn.decomposition import LatentDirichletAllocation
import pandas as pd
import numpy as np
df = pd.read_csv("~/data/hab_test.csv", dtype={'hab_test.tot_visit': np.float32, 'hab_test.tot_visit_ed': np.float32, 'hab_test.tot_visit_acute': np.floa...…
-
Machine Learning Clustering Problems Workflow
Business Task. Data Pre-processing: 1. Feature Selection. Data Visualization: 1. PCA; 2. Plot Data. Distance Computation: 1. One-hot encoding; 2. Metric Learning; 3. Cosine; 4. Euclidean Distance. Model Selection: 1. Centroid-based clustering (K-Means, K-medoids); 2. Conn...…
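The distance computation step listed above can be sketched in plain Python. This is only an illustration, not the post's own code: the helper names (`one_hot`, `euclidean_distance`, `cosine_distance`), the `colors` category list, and the sample values are all assumptions made up for the example.

```python
import math


def one_hot(value, categories):
    """Encode a categorical value as a 0/1 vector over a fixed category list."""
    return [1.0 if value == c else 0.0 for c in categories]


def euclidean_distance(a, b):
    """Straight-line distance between two equal-length numeric vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def cosine_distance(a, b):
    """1 minus the cosine similarity; 0 means the vectors point the same way."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        raise ValueError("cosine distance is undefined for a zero vector")
    return 1.0 - dot / (norm_a * norm_b)


# Hypothetical records: one categorical field plus two numeric fields.
colors = ["red", "green", "blue"]
a = one_hot("red", colors) + [1.0, 2.0]
b = one_hot("blue", colors) + [2.0, 4.0]

print(euclidean_distance(a, b))
print(cosine_distance(a, b))
```

Euclidean distance is sensitive to the scale of each feature, while cosine distance only compares direction, so which one suits a clustering task depends on whether magnitude carries meaning in the data.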
-
Machine Learning Prediction Problems Workflow
Business Question. Data Pre-processing: 1. Label Data; 2. Feature Selection. Model Selection: 1. Logistic Regression; 2. Linear Regression; 3. Decision Tree; 4. Random Forest; 5. SVM (a. Linear, b. Kernel); 6. K-NN; 7. Naïve Bayes Classifier; 8. Neural Networks Classifier. M...…
-
Feature Selection
Data Type: Time Sequence Data. The candidate features should be of the same type as the target feature. For example, given temperature at time 1, temperature at time 2, and temperature at time 3, the target feature should be temperature at next time p...…
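The temperature example above amounts to building lagged features from a sequence. A minimal sketch follows, assuming a fixed window of three past readings; the function name `make_lag_features` and the temperature values are hypothetical and not taken from the post.

```python
def make_lag_features(series, n_lags):
    """Pair each window of n_lags consecutive values with the value that follows it."""
    if n_lags < 1:
        raise ValueError("n_lags must be at least 1")
    rows = []
    for i in range(n_lags, len(series)):
        features = series[i - n_lags:i]  # the previous n_lags readings
        target = series[i]               # the reading to predict
        rows.append((features, target))
    return rows


# Made-up temperature readings at consecutive time steps.
temperatures = [20.0, 21.5, 23.0, 22.0, 21.0]
pairs = make_lag_features(temperatures, n_lags=3)

for features, target in pairs:
    print(features, "->", target)
```

Both the features and the target here are temperatures, which is the same-type property the excerpt describes.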
-
How can we use Kaggle?
Begin learning machine learning: If you go to the competition page on Kaggle, you can find a number of open competitions. If you scroll down to the bottom, you can find ~4 competitions with a light blue (or green) tab, and a “101”. Those are tutoria...…
-
Common Git Commands
Add files: `git add -A` stages all; `git add .` stages new and modified, without deleted (Git 1.x behavior; since Git 2.0 it also stages deletions under the current directory); `git add -u` stages modified and deleted, without new. Remove directory from git and local: `git rm -r one-of-the-directories`, `git commit -m "Remove duplicated directory"`, git pu...…
-
Jekyll Template and Setup
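leopard: leopard is a clean, minimal blog template. If you like it too, please Star it; your Star is what motivates me to keep updating it. Thanks 😄. Requirements: Jekyll supports the Mac, Windows, Ubuntu, and Linux operating systems. Jekyll depends on Ruby and bundler. Get the blog template: $ git clone https://github.com/MengZheK/kangblog.github.io.git or download the blog directly into kangblog.github....…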