Examples showcasing some of the topics that will be discussed at the webinars.

## Tweet Analysis

Create an infographic that shows the analytics from tweets containing a certain keyword. In particular, visualize the following:

- Tweet Timeline: A timeline showing the dates and times the tweets were posted.
- Tweet Sentiment Analysis: Classify the sentiment of the tweets (in English) as positive, negative or neutral.
- Favorite Counts: Number of times the tweets have been liked.
- Retweet Counts: Number of times the tweets have been retweeted.

### Get the Data

Connect to Twitter:

In[1]:=

twitter=ServiceConnect["Twitter"]

Out[1]=

ServiceObject

Note: To edit or evaluate the code in this notebook, click Open In at the bottom of the browser window to select a Wolfram Cloud product. Place your cursor within the code and press the Shift and Enter keys together.

Set the keyword to search for in tweets:

In[2]:=

keyword="WolframAlpha";

Search for the tweets (in English) containing the keyword and download the data:

In[3]:=

data=twitter["TweetSearch","Query"->keyword,"Language"->Entity["Language","English"],MaxItems->200];

Look at the text of 5 random tweets from the data:

In[4]:=

RandomSample[data[All,"Text"],5]

Out[4]=

### Visualize the Information

Plot the tweet timeline:

In[5]:=

tweetTimeline=DateHistogram[Normal[data[All,"CreationDate"]],
PlotTheme->{"Web","Square"},PlotLabel->"Tweet Timeline",ImageSize->Small];

Classify the sentiment in each tweet and visualize the number of tweets in each class:

In[6]:=

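tweetSentiments=PieChart[KeySort[Counts[Classify["Sentiment"][data[All,#Text&]]]],
PlotTheme->{"Web","Square"},
ChartStyle->{Red,Gray,Green},(* the original color swatches were not preserved; these are placeholder colors for Negative, Neutral, Positive *)
ChartLegends->Automatic,PlotLabel->"Tweet Sentiment Analysis",ImageSize->Small];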
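The `KeySort[Counts[...]]` step is what turns a list of per-tweet labels into the input for the pie chart. The following is a minimal sketch of that step on its own, assuming a hand-written list of labels (`labels` and `sentimentCounts` are hypothetical names, and the data is made up) in place of real classifier output:

```wolfram
(* Hypothetical sentiment labels standing in for the classifier output *)
labels = {"Positive", "Neutral", "Negative", "Neutral", "Positive", "Neutral"};

(* Counts tallies each distinct label; KeySort orders the keys alphabetically *)
sentimentCounts = KeySort[Counts[labels]]
(* <|"Negative" -> 1, "Neutral" -> 3, "Positive" -> 2|> *)
```

Sorting the keys keeps the slices in a fixed order (Negative, Neutral, Positive), so a list of colors passed to `ChartStyle` maps to the same classes on every run.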

Create a histogram of the number of times each tweet was liked:

In[7]:=

favouriteCount=Histogram[Normal[data[All,"FavoriteCount"]],
PlotTheme->{"Web","Square"},PlotLabel->"Favorite Count",ImageSize->Small];

Create a histogram of the number of times each tweet was retweeted:

In[8]:=

retweetCount=Histogram[Normal[data[All,"RetweetCount"]],
PlotTheme->{"Web","Square"},PlotLabel->"Retweet Count",ImageSize->Small];

### Create the Infographic

Putting it all together:

In[9]:=

Panel[Grid[{{tweetTimeline,tweetSentiments},{favouriteCount,retweetCount}}],
Style["Tweets about \""~~keyword~~"\"","Subsubtitle"],Bottom]

Out[9]=

[Infographic: the four charts in a 2×2 grid, captioned: Tweets about "WolframAlpha"]

## Predicting Home Prices

Use automated machine learning to predict the median value of owner-occupied homes in 506 Boston suburbs from potentially influential factors such as the crime rate, the number of rooms and the distance to employment centers.

### Get the Data

Load the training and test data:

In[10]:=

trainingset=ExampleData[{"MachineLearning","BostonHomes"},"TrainingData"];

testset=ExampleData[{"MachineLearning","BostonHomes"},"TestData"];

The input features used to train the predictive model:

In[12]:=

ExampleData[{"MachineLearning","BostonHomes"},"VariableDescriptions"][[1]]

Out[12]=

{Per capita crime rate by town,Proportion of residential land zoned for lots over 25000 square feet,

Proportion of non-retail business acres per town,Charles River dummy variable (1 if tract bounds river, 0

otherwise),Nitrogen oxide concentration (parts per 10 million),Average number of rooms per dwelling,

Proportion of owner-occupied units built prior to 1940,Weighted mean of distances to five Boston employment

centers,Index of accessibility to radial highways,Full-value property-tax rater per $10000,

Pupil-teacher ratio by town,1000(Bk-0.63)^2 where Bk is the proportion of blacks by town,

Lower status of the population (percent),Median value of owner-occupied homes in $1000s}

The output target variable to be predicted:

In[13]:=

ExampleData[{"MachineLearning","BostonHomes"},"VariableDescriptions"][[2]]

Out[13]=

Median value of owner-occupied homes in $1000s

### Build a Predictive Model

Use the Predict function to train the best predictive algorithm for this training set:

In[14]:=

p=Predict[trainingset]

Out[14]=

PredictorFunction

Pick a random sample from the test data (which is in the format {inputFeatures} -> actualHomePrice):

In[15]:=

testData=RandomSample[testset,1]

Out[15]=

{{0.77299,0,8.14,0,0.538,6.495,94.4,4.4547,4,307,21,387.94,12.8}->18.4}

In[16]:=

inputFeatures=First@Keys@testData

Out[16]=

{0.77299,0,8.14,0,0.538,6.495,94.4,4.4547,4,307,21,387.94,12.8}

In[17]:=

actualHomePrice=First@Values@testData (* in $1000s *)

Out[17]=

18.4

Use the trained model to predict the price of this home (in thousands of dollars):

In[18]:=

predictedHomePrice=p[inputFeatures]

Out[18]=

19.7235
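To judge how close this single prediction is, compare it with the actual price. The sketch below copies the two values from the outputs shown here; the names `actual`, `predicted`, `absoluteError` and `relativeError` are introduced only for illustration, and a fresh run of `RandomSample` will pick a different home and give different numbers:

```wolfram
(* Values copied from Out[17] and Out[18], in thousands of dollars *)
actual = 18.4;
predicted = 19.7235;

(* Absolute error, in thousands of dollars *)
absoluteError = Abs[predicted - actual]

(* Error as a fraction of the actual price *)
relativeError = absoluteError/actual
```

For this sample the prediction is off by roughly 1.32 (about $1,300), or around 7% of the actual price.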

### Test the Model

Various evaluation metrics can be used to see how well the model is performing on the test data.

Create a PredictorMeasurements object and extract evaluation metrics and other information from it:

In[19]:=

pm=PredictorMeasurements[p,testset]

Out[19]=

PredictorMeasurementsObject

Get the standard deviation of the residuals, that is, their root mean square:

In[20]:=

pm["StandardDeviation"]

Out[20]=

4.75794
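As a rough illustration of what a root mean square of residuals measures, the sketch below computes it by hand for three made-up homes. The lists `actualPrices` and `predictedPrices` are hypothetical, and the code is not claimed to reproduce how `PredictorMeasurements` computes its value internally:

```wolfram
(* Hypothetical actual and predicted prices, in thousands of dollars *)
actualPrices = {20., 30., 25.};
predictedPrices = {22., 27., 25.};

(* Residuals: predicted minus actual *)
residuals = predictedPrices - actualPrices;

(* Root mean square: square root of the mean squared residual *)
rms = Sqrt[Mean[residuals^2]]

(* The built-in RootMeanSquare gives the same value *)
RootMeanSquare[residuals]
```

Here the residuals are 2, -3 and 0, so the result is the square root of 13/3, about 2.08. Read the same way, the value above means that predictions on the test set typically miss by a few thousand dollars.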

Plot the predicted values against the actual test values:

In[21]:=

pm["ComparisonPlot"]

Out[21]=