An extension for extracting and downloading toots (posts on Mastodon) for text mining and analysis.
If you use this extension for your research, please reference it as follows:
Moncomble, F. (2024). MastoScraper (Version 0.8) [JavaScript]. Arras, France: Université d’Artois. Available at: https://fmoncomble.github.io/mastoscraper/
Mastodon, which forms part of the fediverse, is unlike most other social networks. Please reflect on your planned use of the content you are scraping, and consider studying the terms and conditions of the various instances you are drawing data from.
Click here for some relevant reading material.
Remember to pin the add-on to the toolbar.
Click Search.
Output formats:
XML/XTZ for an XML file to import into TXM using the XML/TEI-Zero module (when importing, enter ref in the field labelled “Out of text to edit”)
TXT for plain text
CSV
XLSX
JSON
Click Abort to stop.
Click Download to collect the output or Reset to start afresh.
Searching by instance and language is not built into the Mastodon API, so results are filtered from the whole query response, which may take some time depending on the chosen criteria.
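To illustrate why this filtering happens after the query, here is a minimal sketch of client-side filtering over status objects already returned by a search. It is not the extension's actual code: the helper name `filterStatuses`, the sample data, and the choice to read the instance from the host of `account.url` are all assumptions made for illustration. Only the `language` and `account.url` fields are taken from the shape of Mastodon status objects.

```javascript
// Hypothetical helper, not MastoScraper's actual implementation.
// Keeps only the statuses that match the requested instance and/or language.
function filterStatuses(statuses, { instance = null, language = null } = {}) {
  return statuses.filter((status) => {
    // A status's `language` is an ISO 639-1 code, or null when unknown.
    if (language && status.language !== language) return false;
    if (instance) {
      // Assumption: the author's instance is the host of their profile URL.
      let host;
      try {
        host = new URL(status.account.url).hostname;
      } catch {
        return false; // missing or malformed URL: drop the status
      }
      if (host !== instance) return false;
    }
    return true;
  });
}

// Made-up sample data shaped like a small subset of a status object.
const sample = [
  { id: '1', language: 'fr', account: { url: 'https://social.example.org/@alice' } },
  { id: '2', language: 'en', account: { url: 'https://masto.example.com/@bob' } },
  { id: '3', language: 'fr', account: { url: 'https://masto.example.com/@carol' } },
];

const kept = filterStatuses(sample, { instance: 'masto.example.com', language: 'fr' });
console.log(kept.map((s) => s.id)); // [ '3' ]
```

Because every status in the response has to be inspected, narrow criteria over a large result set mean many statuses are fetched only to be discarded, which is where the extra time goes.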