R-Shief Data Stories

#machinelearning, #nextjs, #strapi

R-Shief Data Stories is a microblogging platform for journalists, academics and activists. It augments article authoring with beautiful data visualizations, powered by integrations with data sources such as social media APIs and dataset repositories.

R-Shief is cyberslang for the Arabic word أرشيف, which means archive. It's an organization dedicated to empowering the growing community of interdisciplinarians interested in data, people whose identity doesn't fit neatly into a single category of scholar, artist or activist. Since 2009, R-Shief has been creating web-based software for this community, and during the Arab Spring revolutions R-Shief was the major player in data-based sense-making about the zeitgeist in that region.

I came on the scene in 2018 as R-Shief's senior full-stack developer. Between 2018 and 2020 we built the now-deprecated R-Shief Dashboard and various other bespoke data visualizations. Since then, the emergence of unreasonably effective large language models and the narrowing of Twitter/X's API have deeply affected how we designed the newest and biggest R-Shief service to date, which we call Data Stories.

Its main front-facing structure is a NextJS/Strapi/MongoDB stack, but behind that sits a constellation of services, such as the R-Shief Library and the R-Shief API. The following diagram roughly outlines the structure of the constellation:

The core of the R-Shief Data Stories service lies in the R-Shief Worker, the Kubeflow models, and the data sinks/sources in Kafka. These are the components that manage the inflow and analysis of vast quantities of data in real time. For the technically inclined with goals other than writing articles and blog posts, the R-Shief API gives a more direct handle on the headless capabilities of the Worker/Kubeflow/Kafka stack. Below is a somewhat out-of-date but still clarifying diagram of the API's machinery: