Examples showcasing some of the topics that will be discussed at the webinars.
Tweet Analysis
Create an infographic that shows the analytics from tweets containing a certain keyword. In particular, visualize the following:
◼ Tweet Timeline: A timeline showing the dates and times the tweets were posted.
◼ Tweet Sentiment Analysis: A classification of the sentiment of each (English) tweet as positive, negative or neutral.
◼ Favorite Counts: The number of times the tweets have been liked.
◼ Retweet Counts: The number of times the tweets have been retweeted.
Get the Data
Connect to Twitter:
In[]:= twitter = ServiceConnect["Twitter"]
Out[]= ServiceObject
Note: To edit or evaluate the code in this notebook, click Open In at the bottom of the browser window to select a Wolfram Cloud product. Place your cursor within the code and press the Shift and Enter keys together.
Set the keyword to search for in tweets:
In[]:= keyword = "WolframAlpha";
Search for the tweets (in English) containing the keyword and download the data:
In[]:= data = twitter["TweetSearch", "Query" -> keyword, "Language" -> Entity["Language", "English"], MaxItems -> 200];
Look at the text of 5 random tweets from the data:
In[]:= RandomSample[data[All, "Text"], 5]
Visualize the Information
Plot the tweet timeline:
In[]:= tweetTimeline = DateHistogram[Normal[data[All, "CreationDate"]], PlotTheme -> {"Web", "Square"}, PlotLabel -> "Tweet Timeline", ImageSize -> Small];
Classify the sentiment in each tweet and visualize the number of tweets in each class:
In[]:= tweetSentiments = PieChart[KeySort[Counts[Classify["Sentiment"][data[All, #Text &]]]], PlotTheme -> {"Web", "Square"}, ChartStyle -> {…, …, …}, ChartLegends -> Automatic, PlotLabel -> "Tweet Sentiment Analysis", ImageSize -> Small];
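The classify, count and chart steps can be tried without a Twitter connection. The following is a minimal sketch, not the notebook's own code: the three sample strings are hypothetical stand-ins for tweet text, and the chart uses default colors.

```wolfram
(* Hypothetical sample texts standing in for downloaded tweets *)
samples = {
   "I love how fast this works!",
   "This is the worst update ever.",
   "The meeting starts at 3 pm."
   };

(* Label each text with the built-in sentiment classifier *)
labels = Classify["Sentiment", samples];

(* Tally the labels and sort the classes alphabetically *)
counts = KeySort[Counts[labels]];

(* One pie sector per sentiment class, with the class names as legend *)
PieChart[counts, ChartLegends -> Automatic, PlotLabel -> "Tweet Sentiment Analysis"]
```

Sorting the keys keeps the sector order stable between runs, so a fixed list of colors passed to ChartStyle would always map to the same classes.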
Create a histogram of the number of times each tweet was liked:
In[]:= favouriteCount = Histogram[Normal[data[All, "FavoriteCount"]], PlotTheme -> {"Web", "Square"}, PlotLabel -> "Favorite Count", ImageSize -> Small];
Create a histogram of the number of times each tweet was retweeted:
In[]:= retweetCount = Histogram[Normal[data[All, "RetweetCount"]], PlotTheme -> {"Web", "Square"}, PlotLabel -> "Retweet Count", ImageSize -> Small];
Create the Infographic
Putting it all together:
In[]:= Panel[Grid[{{tweetTimeline, tweetSentiments}, {favouriteCount, retweetCount}}], Style["Tweets about \"" ~~ keyword ~~ "\"", "Subsubtitle"], Bottom]
Out[]= [Infographic panel: the four charts in a 2×2 grid, labeled: Tweets about "WolframAlpha"]
Predicting Home Prices
Use automated machine learning to predict the median value of owner-occupied homes across 506 census tracts in the Boston area, based on potentially influential factors such as the crime rate, the number of rooms and the distance to employment centers.
Get the Data
Load the training and test data:
In[]:= trainingset = ExampleData[{"MachineLearning", "BostonHomes"}, "TrainingData"];
       testset = ExampleData[{"MachineLearning", "BostonHomes"}, "TestData"];
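Before training, it can help to check how the examples are laid out. The following is a minimal sketch that assumes only the two ExampleData calls above; the names sizes and firstExample are introduced here for illustration.

```wolfram
(* Reload the data so the block runs on its own *)
trainingset = ExampleData[{"MachineLearning", "BostonHomes"}, "TrainingData"];
testset = ExampleData[{"MachineLearning", "BostonHomes"}, "TestData"];

(* Number of examples in the training and test splits *)
sizes = Length /@ {trainingset, testset}

(* Each example is a rule of the form {features...} -> medianValue *)
firstExample = First[trainingset];

(* Number of input features in one example *)
Length[First[firstExample]]

(* Target value of that example, in $1000s *)
Last[firstExample]
```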
The input features used to train the predictive model:
In[]:= ExampleData[{"MachineLearning", "BostonHomes"}, "VariableDescriptions"][[1]]
Out[]= {Per capita crime rate by town,Proportion of residential land zoned for lots over 25000 square feet,Proportion of non-retail business acres per town,Charles River dummy variable (1 if tract bounds river, 0 otherwise),Nitrogen oxide concentration (parts per 10 million),Average number of rooms per dwelling,Proportion of owner-occupied units built prior to 1940,Weighted mean of distances to five Boston employment centers,Index of accessibility to radial highways,Full-value property-tax rater per $10000,Pupil-teacher ratio by town,1000(Bk-0.63)^2 where Bk is the proportion of blacks by town,Lower status of the population (percent),Median value of owner-occupied homes in $1000s}
The output target variable to be predicted:
In[]:= ExampleData[{"MachineLearning", "BostonHomes"}, "VariableDescriptions"][[2]]
Out[]= Median value of owner-occupied homes in $1000s
Build a Predictive Model
Train a predictor on the training data:
In[]:= p = Predict[trainingset]
Out[]= PredictorFunction
Pick a random sample from the test data (which is in the format {inputFeatures} -> actualHomePrice):
In[]:= testData = RandomSample[testset, 1]
Out[]= {{0.77299, 0, 8.14, 0, 0.538, 6.495, 94.4, 4.4547, 4, 307, 21, 387.94, 12.8} -> 18.4}
In[]:= inputFeatures = First@Keys@testData
Out[]= {0.77299, 0, 8.14, 0, 0.538, 6.495, 94.4, 4.4547, 4, 307, 21, 387.94, 12.8}
In[]:= actualHomePrice = First@Values@testData (* in $1000s *)
Out[]= 18.4
Use the trained model to predict the price of this home (in thousands of dollars):
In[]:= predictedHomePrice = p[inputFeatures]
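To judge a single prediction, compare it with the known price. The following is a minimal sketch under the assumption that the first test example is used instead of a random one, so that the result is repeatable; the names example, absoluteError and relativeError are introduced here for illustration.

```wolfram
(* Reload the data and retrain so the block runs on its own *)
trainingset = ExampleData[{"MachineLearning", "BostonHomes"}, "TrainingData"];
testset = ExampleData[{"MachineLearning", "BostonHomes"}, "TestData"];
p = Predict[trainingset];

(* Take the first test example, a rule {features...} -> price *)
example = First[testset];
predicted = p[First[example]];  (* model estimate, in $1000s *)
actual = Last[example];         (* recorded value, in $1000s *)

(* Size of the miss in $1000s and as a fraction of the actual price *)
absoluteError = Abs[predicted - actual]
relativeError = absoluteError/actual
```

One example says little about overall quality, which is why the next section measures the model on the whole test set.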
Test the Model
Various evaluation metrics can be used to see how well the model is performing on the test data.
Create a PredictorMeasurements object and extract various metrics of evaluation and other information from it:
Standard deviation that represents the root mean square of residuals:
Plot of predicted values versus test values:
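The three steps above can be sketched as follows. This is a minimal sketch rather than the notebook's own cells: it assumes the predictor and test set defined earlier, uses the PredictorMeasurements properties "StandardDeviation" and "ComparisonPlot", and introduces the names pm and rmse for illustration.

```wolfram
(* Reload the data and retrain so the block runs on its own *)
trainingset = ExampleData[{"MachineLearning", "BostonHomes"}, "TrainingData"];
testset = ExampleData[{"MachineLearning", "BostonHomes"}, "TestData"];
p = Predict[trainingset];

(* Evaluate the predictor on the held-out test set *)
pm = PredictorMeasurements[p, testset];

(* Root mean square of the residuals, in $1000s *)
rmse = pm["StandardDeviation"]

(* Scatter plot of predicted values against the actual test values *)
pm["ComparisonPlot"]
```

Points close to the diagonal of the comparison plot are homes whose predicted price nearly matches the recorded one.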
Where to Go Next
◼ Visit http://mpdatascience.com to learn more about the integrated multiparadigm approach to data science offered by Wolfram technology and to explore further examples.
◼ Learn how to use the Wolfram neural network framework to employ machine learning in your data science project. Visit the Wolfram U Machine Learning page to watch video courses and to register for our Machine Learning Webinar Series.
Happy exploring!