Scholarly Insight through the ArXivExplore Paclet

Daniele Gregori, Soochow University
Institute for Advanced Study
#WolframTechConf

ABSTRACT

In[]:=
abstract="ArXivExplore supports deep data analysis of all 2.6M articles on ArXiv (physics, math, computer science, etc.), providing functionality for, e.g., title/abstract word statistics, dissection of TeX sources, formulae and citations, neural networks for classification or recommendation, and LLM-automated concept explanations and author reports.";

Paclet loading

Introduction

On ArXiv

A “deep data” problem

Academic “chat” vs scientific “insight”

Some ArXiv questions

ArXiv data mining

ArXiv main data

Categories

Full TeX scraping

Citations API

Real insights from “just counting”

Submission trends

Counting total words

Word popularity trends

Word logic combinations

Some Machine Learning

Basic classification

Advanced training

Feature extraction

Clustering

LLMs to enhance understanding

The importance of introductions

Explaining specific concepts

Creating author reports

Conclusions

Unlimited explorations

Planned improvements

New horizons in “deep data”

Thank you!
