Machine Learning

Let’s look at some examples of Machine Learning.

Try to answer some questions

What language is this?

LanguageIdentify takes pieces of text and identifies what human language they’re in.
Identify the language each phrase is in:
In[]:=
LanguageIdentify["thank you","merci","dar las gracias","感謝","благодарить"]

What’s in this image?

Identify what an image is of:
In[]:=
ImageIdentify[image]  (* image: a photo you supply; the original inline picture was lost in export *)


What sort of sentiment does this text express?

Classifying the “sentiment” of text:
In[]:=
Classify["Sentiment","I'm so excited to be programming"]
In[]:=
Classify["Sentiment","math can be really hard"]

Reframe the questions: Is this A or B (or C or D or E)

◼ Is this English or French (or Arabic or Hindi)?
◼ Is this a cheetah or a tiger or an owl?
◼ Is this an example of positive or negative or neutral sentiment?

You can train a classifier yourself. Here’s a simple example of classifying handwritten digits as 0 or 1. You give the classifier a collection of training examples, followed by a particular handwritten digit. Then it’ll tell you whether the digit you gave is a 0 or a 1.
    With training examples, Classify correctly identifies a handwritten 0:
    In[]:=
c = Classify[{image1 -> 0, image2 -> 1, image3 -> 0, image4 -> 1, ...}]
(* the 19 handwritten-digit images were lost in export; only their 0/1 labels survive *)
In[]:=
c[sampleDigit]  (* classify a new handwritten digit image *)
In[]:=
c[sampleDigit, "Probabilities"]  (* class probabilities for the same input *)

    Day or Night

    Feed a list of images, each with a label “Day” or “Night”, to Classify:
    In[]:=
daynight = Classify[{photo1 -> "Night", photo2 -> "Day", photo3 -> "Night", ...}]
(* the 30 day/night photos were lost in export; only their "Day"/"Night" labels survive *)
The result is a classifier function that takes new examples as input and returns the class or label it believes fits each input best:
    In[]:=
daynight[{newPhoto1, newPhoto2, newPhoto3, newPhoto4}]  (* the new sample photos were lost in export *)

    Try on your own

Try using Classify to identify images of any of the following famous pairs (a sketch for the first pair follows the list):
◼ Dogs and cats
◼ Tom and Jerry
◼ Chalk and cheese
◼ Hammer and nails
◼ Fish and chips
◼ Ant and Dec
◼ Batman and Robin
◼ Asterix and Obelix
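For the first pair, a minimal sketch, assuming catImages and dogImages are lists of photos you supply (all three names here are hypothetical):
In[]:=
pets = Classify[Join[Thread[catImages -> "cat"], Thread[dogImages -> "dog"]]]
In[]:=
pets[newPetPhoto]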
Remember Vectors and Vector Spaces

    In one dimension:
    In[]:=
    peopleAges={7,11,18,20,21,50,67};
    In[]:=
    NumberLinePlot[peopleAges]
    In two dimensions:
    In[]:=
    places={{41.8375511`,-87.6818441`},{39.7639077`,-89.6708323`},{40.115057`,-88.2736523`},{40.7523087`,-89.6170968`}};
    In[]:=
    GeoListPlot[GeoPosition[places]]
    In three dimensions:
    In[]:=
    colorRGBs={{0.28,0.62,0.43},{0.54,0.078,0.15},{0.56,0.24,0.006},{0.84,0.42,0.19},{0.92,0.45,0.16},{0.3,1.,0.17},{0.46,0.64,0.078},{0.96,0.79,0.56},{0.11,0.1,0.65},{0.29,0.37,0.9},{0.8,0.69,0.13},{0.67,0.83,0.18},{0.18,0.1,1.},{0.98,0.76,0.92},{0.42,0.33,0.82},{0.056,0.94,0.83},{0.6,0.83,0.79},{0.37,0.95,0.16},{0.39,0.15,0.88},{0.86,0.2,0.35},{0.71,0.16,0.68},{0.65,0.044,0.81},{0.98,0.71,0.3},{0.36,0.43,0.84},{0.82,0.94,0.026},{0.87,0.7,0.78},{0.74,0.2,0.97},{0.092,0.84,0.23},{0.62,0.64,0.73},{0.59,0.42,0.52}};
    In[]:=
ListPointPlot3D[colorRGBs, AxesLabel -> {"Red", "Green", "Blue"}]  (* refer to colorRGBs directly: % would be Null after the semicolon above *)

    Any data sample can be represented as a vector of numbers

    In[]:=
FeatureExtract[{image1, image2, image3, image4, image5}]  (* the five example images were lost in export *)
    In[]:=
    FeatureExtract[{{"Yes","A"},{"No","A"},{"No","B"},{"Maybe","B"},{"No","C"}}]
    In[]:=
    FeatureExtract[{"the cat is grey","my cat is fast","this dog is scary","the big dog"}]

    Calculate nearness based on numbers

    Find what element in a list is nearest to what you supply.
    Find what number in the list is nearest to 22:
    In[]:=
    Nearest[{10,20,30,40,50,60,70,80},22]
    Find the nearest three numbers:
    In[]:=
    Nearest[{10,20,30,40,50,60,70,80},22,3]
    In[]:=
40 - 22  (* 40 is 18 away from 22, farther than 10 is, so it doesn't make the top three *)
    Nearest can find nearest colors as well.
    Find the 3 colors in the list that are nearest to the color you give:
    In[]:=
Nearest[{color1, color2, ...}, targetColor, 3]  (* the 20 color swatches and the target color were lost in export *)
    Find the 10 words nearest to “good” in the list of words:
    In[]:=
wordsStartingWithG = Select[WordList[], StringStartsQ[#, "g"] &];
(* reconstructed from the variable name; the cell's original contents were lost *)
    In[]:=
    Nearest[wordsStartingWithG,"good",10]
There’s a notion of nearness for images too.
Construct a dataset of dog images:
    In[]:=
dataset = {dogImage1, dogImage2, ...};  (* the 20 dog photos were lost in export *)
    Train a nearest function:
    In[]:=
    nf=FeatureNearest[dataset]
Find the image from the dataset that is nearest to each of these new sample images:
    In[]:=
nf[{newDogImage1, newDogImage2, newDogImage3}]  (* the new sample photos were lost in export *)

Nearness as a step to identifying

When we compare things, whether they’re colors or pictures of animals, we can think of identifying certain features that allow us to distinguish them.

For colors, a feature might be how light the color is, or how much red it contains.

For pictures of animals, a feature might be how furry the animal looks, or how pointy its ears are.
    The machine learning function is able to identify an image because it has previously seen similar images and decides that this image is closest to the examples of “cheetah” images it has seen before:
    What if we try to provide the same image to the machine learning function but blur it a bit every time, i.e. we “muddy the input”?
    Progressively blur a picture of a cheetah:
When the picture gets too blurred, ImageIdentify no longer thinks it’s a cheetah:
    Take one of the blurred images and look at all possible answers ImageIdentify came up with. ImageIdentify thinks this might be a cheetah, but it’s more likely to be a lion, or it could be a dog:
    When the image is sufficiently blurred, ImageIdentify can have wild ideas about what it might be:
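The blurred-image cells did not survive this export; a sketch of the experiment, assuming cheetah is a photo you supply:
In[]:=
Table[Blur[cheetah, r], {r, 0, 5}]  (* progressively stronger blur *)
In[]:=
ImageIdentify[Blur[cheetah, 4]]
In[]:=
ImageIdentify[Blur[cheetah, 4], All, 5, "Probability"]  (* the top five candidate answers with their probabilities *)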

    Supervised vs. Unsupervised Machine Learning

In machine learning, one often provides training data that explicitly says, for example, “this is a cheetah”, “this is a lion”. This is known as “Supervised Learning”. You provide labeled examples that were created by some expert.
But often one just wants to automatically pick out categories of things without providing any specific labels. This is “Unsupervised Learning”.

    Supervised Learning

    This is used to answer questions of the type:
◼ Is this A or B (or C or D or E)? (Classification)
◼ How much or how many? (Regression)

The Task of Classification

    Predict a label for the sample:
    Training data is usually a list of labeled samples:
◼ Infer a function from the data, mapping from feature values to a label.
◼ Given a new data point, use this function to return a label based on its feature values.

Example of a Classifier to Recognize a Dog or a Cat


    Well-known algorithms available to perform Classification

    Set up some example data:
    Classify automatically picks a method most suitable for the input data:
    It is also possible to specifically set the method to be used:
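A sketch of these steps with made-up one-dimensional data (the original cells were lost):
In[]:=
data = {1.0 -> "A", 1.2 -> "A", 0.9 -> "A", 2.9 -> "B", 3.1 -> "B", 3.5 -> "B"};
In[]:=
c1 = Classify[data]  (* method chosen automatically *)
In[]:=
c2 = Classify[data, Method -> "NearestNeighbors"]  (* method set explicitly *)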
NearestNeighbors
Find the known data points nearest to the input sample in feature space and use only those to infer the class or value.

LogisticRegression
Fit the best linear combination of logistic sigmoid functions.

SupportVectorMachine
Find the hyperplane that best partitions the data (the maximum-margin hyperplane).

RandomForest
Construct an ensemble of decision trees, each repeatedly partitioning the data, and combine their predictions by voting or averaging.

NaiveBayes
Determine the class using Bayes’s theorem, assuming that features are independent given the class.

NeuralNetwork
Model class probabilities or predict the value distribution using a neural network.

    The Task of Regression

    Compute a target value for a sample:
    The training data contains samples with recorded values:
◼ Infer a function from the data, mapping from the feature values to the numeric target.
◼ Given a new data point, use the regression function to compute the target value that best fits the given features.

Example of a wine score predictor

    Load a dataset of wine quality as a function of the wines’ physical properties:
    Look at the various features being considered:
    Look at some of the data points:
    Train a predictor on the training set:
    Predict the quality of an unknown wine:
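The original cells were lost; a sketch using the built-in wine-quality example data (the element names "TrainingData" and "VariableDescriptions" are assumed from the ExampleData MachineLearning collection):
In[]:=
wine = ExampleData[{"MachineLearning", "WineQuality"}, "TrainingData"];
In[]:=
ExampleData[{"MachineLearning", "WineQuality"}, "VariableDescriptions"]
In[]:=
RandomSample[wine, 3]  (* a few data points *)
In[]:=
p = Predict[wine]
In[]:=
p[wine[[1, 1]]]  (* predicted quality for the first wine's feature values *)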

Well-known algorithms available for regression
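Predict offers methods analogous to Classify’s; for example, using the wine data from above:
In[]:=
Predict[wine, Method -> "RandomForest"]
(* other documented methods include "LinearRegression", "NearestNeighbors", "GaussianProcess" and "NeuralNetwork" *)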

    Unsupervised Learning

    The Task of Clustering

    This is used to answer questions like:
◼ How is the data organized?
◼ Do the samples separate into groups of some kind?
◼ Are there samples that are very different from most of the group (outliers)?

The goal is to partition a dataset into clusters of similar elements. This works for all sorts of data: numerical, textual and image data, as well as dates and times.
    Collect “clusters” of similar colors into separate lists:
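The original cell was lost; a sketch with random colors, relying on FindClusters’s automatic feature extraction for non-numeric data:
In[]:=
FindClusters[RandomColor[20]]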
    A dendrogram is a tree-like plot that lets you see a whole hierarchy of what’s near what.
    Show nearby colors successively grouped together:
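A sketch with random colors (the original cell was lost):
In[]:=
Dendrogram[RandomColor[12]]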
There are algorithms that take collections of objects, try to find what they consider the “best” distinguishing features of them, and then use the values of those features to position the objects in a plot.
FeatureSpacePlot places similar colors near one another:
    It doesn’t explicitly say what features it’s using—and actually they’re usually quite hard to describe. But what happens in the end is that FeatureSpacePlot arranges things so that objects that have similar features are drawn nearby.
    100 random colors laid out by FeatureSpacePlot:
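A sketch (the original output was lost):
In[]:=
FeatureSpacePlot[RandomColor[100]]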
FeatureSpacePlot places photographs of different kinds of things quite far apart:
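A sketch, assuming photos is a list of photographs of different kinds of things you supply (hypothetical):
In[]:=
FeatureSpacePlot[photos]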

    Classification based on Clustering

    Often labeled examples are hard to come by. It can be expensive to hire an expert who will identify all the examples you need for training.
Instead, use clustering to discover groups, and then use group membership as a label.
Some human supervision may still be required, but at least you have a start.
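A minimal sketch of this idea with made-up two-dimensional data:
In[]:=
data = RandomReal[1, {100, 2}];
In[]:=
clusters = FindClusters[data];  (* discover groups without labels *)
In[]:=
labeled = Join @@ MapIndexed[Thread[#1 -> First[#2]] &, clusters];  (* use the cluster index as a label *)
In[]:=
c = Classify[labeled];
In[]:=
c[{0.5, 0.5}]  (* classify a new point by learned cluster membership *)
The Wolfram Language also has ClusterClassify, which performs this cluster-then-classify step in one function.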