Zimmerman en Space go Wiki
Webscrape of the Zimmerman en Space podcast, and (re)publication on Wikimedia Commons (and Zenodo in the future). High 5 for CC0 licenses, space, astronomy and nerds!
Latest update: 17 September 2024
Main result
Episodes 1 - 92 are now available on Wikimedia Commons:
Step by step process
Make initial scrape map
Excel with scraped data, post-processing
Output of the webscrape, post-processed to make the data suitable as input for Wikimedia Commons, OpenRefine and the Python modules used below: https://ookgezellig.github.io/Zimmerman-en-Space-podcast/ZimmermanEnSpacePodcast_episodes1-92.xlsx
Download mp3s from URL
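As an illustration of this step, here is a minimal sketch of a bulk download using only the Python standard library. It is not the script used for this project. The folder name `mp3-files` and the function names are assumptions made for the example, and the list of URLs is expected to come from the mp3 URL column of the Excel file.

```python
from pathlib import Path, PurePosixPath
from urllib.parse import urlparse
from urllib.request import urlretrieve


def filename_from_url(url: str) -> str:
    """Return the last path segment of a URL, e.g. '11845039-tsunami-s-op-mars.mp3'."""
    return PurePosixPath(urlparse(url).path).name


def download_mp3s(urls: list[str], dest_dir: str = "mp3-files") -> list[Path]:
    """Download every URL into dest_dir (hypothetical folder name).

    Files that already exist locally are skipped, so the function can be
    re-run after an interrupted download.
    """
    target = Path(dest_dir)
    target.mkdir(parents=True, exist_ok=True)
    saved = []
    for url in urls:
        out_path = target / filename_from_url(url)
        if not out_path.exists():
            urlretrieve(url, str(out_path))
        saved.append(out_path)
    return saved
```

Calling `download_mp3s(list_of_urls)` returns the local paths, which can then be fed into the conversion step.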
Converting from mp3 to ogg/oga:
- Python script: convert_mp3s_to_oga.py - Make sure ffmpeg is installed on your machine and has been added to your system’s PATH.
- Alternatively, use an online .mp3 to .ogg bulk converter, such as online-audio-converter.com. This was the actual tool used for converting the first batch of episodes (1-92). The file extension can be changed from .ogg to .oga without penalty.
- Folder: ogg-files
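The conversion via ffmpeg can be sketched roughly as follows. This is not the content of convert_mp3s_to_oga.py, only a minimal illustration of the idea. The input folder `mp3-files`, the function names and the quality setting `-q:a 4` are assumptions, and the `libvorbis` encoder has to be available in your ffmpeg build.

```python
import shutil
import subprocess
from pathlib import Path


def build_ffmpeg_command(mp3_path: Path, out_dir: Path) -> list[str]:
    """Build the ffmpeg call that re-encodes one mp3 as Ogg Vorbis."""
    out_path = out_dir / (mp3_path.stem + ".ogg")
    return [
        "ffmpeg",
        "-i", str(mp3_path),   # input file
        "-vn",                 # ignore embedded cover art (video stream)
        "-c:a", "libvorbis",   # Vorbis audio encoder
        "-q:a", "4",           # assumed quality level, adjust as needed
        str(out_path),
    ]


def convert_folder(mp3_dir: str = "mp3-files", out_dir: str = "ogg-files") -> None:
    """Convert all mp3 files in mp3_dir, skipping outputs that already exist."""
    if shutil.which("ffmpeg") is None:
        raise RuntimeError("ffmpeg was not found on the PATH")
    out = Path(out_dir)
    out.mkdir(parents=True, exist_ok=True)
    for mp3 in sorted(Path(mp3_dir).glob("*.mp3")):
        if (out / (mp3.stem + ".ogg")).exists():
            continue
        subprocess.run(build_ffmpeg_command(mp3, out), check=True)
```

Keeping the command construction in its own function makes it easy to inspect or adjust the ffmpeg options without running a conversion.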
Wikimedia Commons:
- Files must be copied and renamed from the Buzzsprout file names to Wikimedia Commons-style titles, e.g. from ogg-files/11845039-tsunami-s-op-mars.ogg to oga-files/Tsunami’s_op_Mars_-_Zimmerman_en_Space_-_S01E01_-_2022-12-09_-_11845039.oga
- Folder with Commons compatible files: oga-files
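The copy-and-rename step could look roughly like the sketch below. It assumes that a Commons title is built from the episode title, the series name, an SxxEyy code, the publication date and the Buzzsprout ID, joined by " - " with spaces replaced by underscores; this pattern is inferred from the example above and is not a documented rule. The function names are hypothetical, and the title, season, episode and date are expected to come from the Excel file.

```python
import shutil
from pathlib import Path

SERIES = "Zimmerman en Space"


def commons_filename(title: str, season: int, episode: int,
                     date: str, buzzsprout_id: str) -> str:
    """Build a Commons-style file name (assumed pattern, see text above)."""
    parts = [title, SERIES, f"S{season:02d}E{episode:02d}", date, buzzsprout_id]
    return " - ".join(parts).replace(" ", "_") + ".oga"


def copy_for_commons(src: Path, dest_dir: Path, title: str,
                     season: int, episode: int, date: str) -> Path:
    """Copy one converted file to dest_dir under its Commons-style name.

    The Buzzsprout ID is taken from the part of the source file name
    before the first hyphen, e.g. '11845039' in '11845039-tsunami-s-op-mars.ogg'.
    """
    buzzsprout_id = src.name.split("-", 1)[0]
    dest_dir.mkdir(parents=True, exist_ok=True)
    dest = dest_dir / commons_filename(title, season, episode, date, buzzsprout_id)
    shutil.copy2(src, dest)
    return dest
```

For the first episode, `copy_for_commons(Path("ogg-files/11845039-tsunami-s-op-mars.ogg"), Path("oga-files"), "Tsunami’s op Mars", 1, 1, "2022-12-09")` would write the renamed copy into oga-files and leave the original untouched.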
Category & gallery
API
Request info about episode 14, AI en Chat GPT in de sterrenkunde
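A request like this can be sketched with the standard MediaWiki Action API on Wikimedia Commons. The example below is a minimal illustration, not necessarily the request used for this project. The file title passed in is a placeholder that must be replaced by the actual Commons title of the episode, and the User-Agent string is hypothetical.

```python
import json
from urllib.parse import urlencode
from urllib.request import Request, urlopen

API_URL = "https://commons.wikimedia.org/w/api.php"


def build_info_url(file_title: str) -> str:
    """Build an Action API URL asking for basic file info (URL, size, MIME type)."""
    params = {
        "action": "query",
        "format": "json",
        "prop": "imageinfo",
        "iiprop": "url|size|mime",
        "titles": file_title,
    }
    return f"{API_URL}?{urlencode(params)}"


def fetch_info(file_title: str) -> dict:
    """Send the request and return the decoded JSON response."""
    request = Request(
        build_info_url(file_title),
        # Hypothetical User-Agent; use one that identifies your own tool.
        headers={"User-Agent": "zimmerman-en-space-example/0.1"},
    )
    with urlopen(request) as response:
        return json.load(response)
```

Usage would be along the lines of `fetch_info("File:<Commons title of episode 14>.oga")`, after which the file details can be found under `query` and then `pages` in the returned dictionary.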
SPARQL
Structured data has been added to all files, so we can do some (basic) semantic searching via SPARQL queries.
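As a rough illustration, the helper below builds a simple query for the Wikimedia Commons Query Service that lists files linked to a given Wikidata item. It assumes the files carry a "part of the series" (P179) statement pointing at the podcast's Wikidata item; both that modelling choice and the item ID are assumptions, so check the structured data of one of the files and adjust the property and QID accordingly.

```python
def build_series_query(series_qid: str, limit: int = 10) -> str:
    """Return a SPARQL query listing Commons files that are part of a series.

    series_qid is the Wikidata ID of the podcast (to be filled in by the
    caller); P179 ("part of the series") is an assumed property choice.
    """
    return (
        "SELECT ?file WHERE {\n"
        f"  ?file wdt:P179 wd:{series_qid} .\n"
        "}\n"
        f"LIMIT {limit}"
    )


if __name__ == "__main__":
    # "Q123" is a placeholder, not the real item of the podcast.
    print(build_series_query("Q123", limit=5))
```

The printed query text can be pasted into the query service's web interface.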
Wikidata
Copyright
Episodes 1-92 of the Zimmerman en Space podcast have all been licensed under the Creative Commons CC0 1.0 license, as stated in the show notes of each episode.