Level 2: Applied Expertise
Grading Rubric
To earn Wolfram Level 2 certification in Multiparadigm Data Science (MPDS), the applicant must successfully complete an independent project that demonstrates expertise in applying the MPDS approach to a complex problem. The project report must be submitted as a Wolfram Notebook. Submissions are graded according to the following rubric. Level 2 certification requires a project score of 75 or greater out of 100.
Mastery of the MPDS Workflow (75 points)
Question (5 points)
Define specific questions that will be answered by the end of the project.
List questions that are well motivated and interesting.
Include why it is important to answer these questions.
If the list of questions evolved through multiple iterations of the workflow, explain how and why.
Wrangle (15 points)
Clearly identify the data source. Either provide the link to a publicly accessible source or the data file itself. If we are unable to run the code in the notebook for lack of access to the data, we cannot proceed with the grading for certification. (We recommend not using proprietary data.)
Provide code to download the data and restructure it into appropriate Wolfram Language expressions (List, Association, Dataset, ResourceData, etc.).
Show how data has been cleaned, and ensure errors and missing data have been dealt with.
If you have obtained clean and computable data from a resource like the Wolfram Data Repository, include adequate explanations about how the data was obtained, cleaned and curated (refer to the data curation hierarchy).
Identify additional data that had to be wrangled during later iterations of the workflow.
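As an illustration of the restructuring and cleaning items above, here is a minimal sketch. The records, the key names ("city", "temperature") and the variable names are hypothetical stand-ins; a real submission would obtain its records from the data source it identifies rather than typing them in.

```wolfram
(* Hypothetical raw records standing in for imported data *)
rawRecords = {
  <|"city" -> "Lima", "temperature" -> 19.5|>,
  <|"city" -> "Oslo", "temperature" -> Missing["NotAvailable"]|>,
  <|"city" -> "Cairo", "temperature" -> 28.1|>
};

(* Restructure the list of associations into a Dataset *)
weatherData = Dataset[rawRecords];

(* Keep only the rows whose temperature is not missing *)
cleanWeatherData = weatherData[Select[! MissingQ[#temperature] &]]
```

Dropping incomplete rows is only one way to deal with missing data; whichever approach is used, the report should say why it was chosen.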
Explore (15 points)
Perform visual exploration of the data, including but not limited to scatter plots, bar charts, pie charts, histograms, geographics and word clouds.
Include summary statistics of the data (e.g. Tukey’s five-number summary), as well as insights offered by such statistics.
Clearly explain conclusions from the exploratory analysis.
Provide the justification for choosing algorithms and techniques for further analysis, based on these initial explorations.
If you had to revisit the Exploratory Data Analysis stage during later iterations of the workflow, indicate why and how.
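The summary statistics mentioned above could be computed along the following lines. This is a sketch on hypothetical values, and it uses the built-in Quartiles function; quartile conventions vary, so the results may differ slightly from Tukey's hinges as defined elsewhere.

```wolfram
(* Hypothetical sample values *)
sampleValues = {2, 4, 4, 5, 7, 9, 10, 12, 15};

(* Quartiles returns the list {first quartile, median, third quartile} *)
{lowerQuartile, medianValue, upperQuartile} = Quartiles[sampleValues];

(* Collect the five-number summary in an Association *)
fiveNumberSummary = <|
  "Min" -> Min[sampleValues],
  "Q1" -> lowerQuartile,
  "Median" -> medianValue,
  "Q3" -> upperQuartile,
  "Max" -> Max[sampleValues]
|>

(* The same information shown visually *)
BoxWhiskerChart[sampleValues]
```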
Analyze (20 points)
State steps for computational analyses of the data, including text explanations and Wolfram Language code.
Explain why certain analytical techniques/methods were chosen.
Validate the final models where machine learning is used.
Evaluate the performance of the model or algorithm using relevant performance metrics like accuracy, false positive rate or precision.
Ensure you have tried more than one analytical technique/method. The highlight of the MPDS workflow is not restricting analyses to a specific discipline, but rather trying out different techniques to elicit useful information from the data.
Include at least one example of using a nontraditional technique to analyze the data. (It is OK to not have great results from this part of the analyses. We would like to evaluate your ability to apply MPDS techniques across disciplines.)
Explain how the analysis evolved during multiple iterations of the workflow.
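Where machine learning is used, validation and performance metrics might look like the sketch below. The toy training and test sets are hypothetical, and the two methods are arbitrary choices made only to show a comparison between more than one technique.

```wolfram
(* Hypothetical toy data: numeric feature -> class label *)
trainingSet = {1.0 -> "low", 1.5 -> "low", 2.0 -> "low", 2.4 -> "low",
   7.5 -> "high", 8.0 -> "high", 8.6 -> "high", 9.1 -> "high"};
testSet = {1.2 -> "low", 2.2 -> "low", 7.9 -> "high", 8.8 -> "high"};

(* Train one classifier and measure it on held-out data *)
classifier = Classify[trainingSet, Method -> "LogisticRegression"];
measurements = ClassifierMeasurements[classifier, testSet];

(* Relevant performance metrics *)
measurements["Accuracy"]
measurements["Precision"]
measurements["FalsePositiveRate"]
measurements["ConfusionMatrixPlot"]

(* Compare accuracy across more than one method *)
AssociationMap[
 ClassifierMeasurements[Classify[trainingSet, Method -> #], testSet, "Accuracy"] &,
 {"LogisticRegression", "NearestNeighbors"}]
```

A real project would use a much larger held-out test set, and the report should explain why the chosen metrics suit the questions being asked.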
Communicate (20 points)
Convey the final results of the analysis clearly (we recommend using one or more of the following):
An infographic (either static or interactive) such as a poster that conveys the results succinctly, without need for accompanying verbal explanation.
A computational essay in a Wolfram Notebook with relevant text explanations, Wolfram Language code and visualizations. Note that the computational essay is different from the project report notebook. The computational essay should contain less detail about the project workflow and focus on conveying the results of the analysis in an interesting and concise way.
A microsite published online through the Wolfram Cloud that allows static or interactive exploration of the results by the readers.
A data product or app run locally on the desktop in a Wolfram Notebook or online through the Wolfram Cloud that performs data analysis according to the model developed in this project.
This section of the project will be graded on:
Visual appeal
Informativeness
Quality of design
Clarity of answers to the questions set up at the Question stage
Reproducibility (25 points)
Code (10 points)
Make sure the notebook is self-contained. (It should be possible for us to run the code without any external dependency.)
Organize and indent code for readability.
Remove incomplete/irrelevant code snippets (used to try out ideas while developing the workflow) from the final notebook.
Revise and rewrite code to make it simple and straightforward. As a rule of thumb, keep each cell to no more than three lines of code; break cells with complicated, lengthy code into multiple cells.
Name variables and functions according to their purpose in the code. Avoid generic names like “var123” or “myFunction”.
Include comments to explain exactly what is being done by the code.
Either use comments such as:
(* this is a comment *)
Or use a cell in CodeText style preceding the cell with the code:
Create a scatter plot of feature 1 vs. feature 2 using ListPlot:
In[]:= ListPlot[RandomReal[10, {10, 2}]]
Out[]= [scatter plot of ten random points]
Explanations and Comments (10 points)
Include relevant, concise text explanations to describe every stage of the project workflow.
Include explanations regarding specific decisions on the choice of algorithms, techniques, values of parameters and hyperparameters.
Whenever possible, use simple visualizations to illustrate the topic and add visual interest.
References (5 points)
Provide links to existing published research as references.
If this project builds on someone else’s work or attempts to provide comparative analysis, highlight the work in a reference.