JournalismAI Collab Challenges 2021 / EMEA
An algebra for news modules to support a new kind of storytelling that is more focused on information needs
The brief for this project was:
"How might we use modular journalism and AI to assemble new storytelling formats and reach currently underserved audiences?"
For us, this indicated that we needed to find ways to provide different kinds of content, in different ways, to directly address the information needs of a variety of users, particularly those that have been badly served by traditional media.
As a starting point, we broadly defined "modules" as discrete elements of a story that can be created independently and then combined and recombined with other modules in order to create a variety of storytelling formats.
A modular approach was clearly valuable and interesting, but early on we understood that we couldn't jump straight to creating new kinds of stories. We would first have to build a robust and flexible theoretical framework to give us a solid base to work from.
We therefore set out to establish a methodology around modular journalism and support that process with automation tools. The outcome of that work, illustrated here at modularjournalism.com, is a basic prototype to demonstrate that thinking by showing the creation of modules and their repurposing in different formats.
For this, we have focused on modular-first news artefacts that are specifically created for the purpose of modularisation. That's because our experiments have shown that modular-first artefacts create more coherent and successful stories. This is a new approach to journalism, which requires some adjustments to workflows, but offers the prize of producing more useful, engaging, and effective journalism.
Using the modules in these news artefacts, we have successfully shown that it's possible to automatically create a range of stories based on predefined templates and patterns to directly meet user information needs. The detailed theoretical work has given rise to a robust but flexible algebra which gives any newsroom the opportunity to create formats that meet its own specific workflow and information needs.
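As a rough illustration of what template-based assembly could look like, here is a minimal sketch. It assumes each module is stored as the text answering one information need and each template is an ordered list of module keys. The module keys, template names, and the `assemble` helper are hypothetical and are not taken from the prototype.

```python
from typing import Dict, List

# Hypothetical modules: each key names an information need,
# each value is placeholder text standing in for the written module.
modules: Dict[str, str] = {
    "what_happened": "Text answering: What has happened today?",
    "why_important": "Text answering: Why is this important?",
    "what_next": "Text answering: What happens next?",
}

# Hypothetical templates: an ordered list of module keys per story format.
templates: Dict[str, List[str]] = {
    "update": ["what_happened"],
    "explainer": ["what_happened", "why_important", "what_next"],
}


def assemble(template_name: str) -> str:
    """Join the modules named by a template, in template order."""
    keys = templates[template_name]
    # Skip modules that have not been written yet instead of failing.
    parts = [modules[key] for key in keys if key in modules]
    return "\n\n".join(parts)


print(assemble("explainer"))
```

In a sketch like this, editing a single module's text would change every story format whose template includes it, which is the reuse that the modular approach aims for.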
Learning more about the inner functional structures of the atoms in a news artefact has further given us valuable and transferable insights into devising new best practices to better serve our public.
We asked ourselves whether we can transform a news artefact into something new
(And if we had anything to gain in doing so)
At the core of our work is an assumption that paying greater attention to the structure of a news artefact can help us reach a larger portion of the audience. Long-form articles are hardly ever consumed in full, and different formats appeal to different users in different ways, depending on a range of factors. If we can respond to those factors, then we have a route to boosting the engagement of the users we do have and also attracting those that we aren't currently reaching. By doing things differently, we can also find ways of capturing the attention (and the trust) of younger or more diverse audiences.
Quality was important to us, but modular journalism also opens up ways to boost quantity. By working in a modular way, we are not restricted to creating completely new articles every time there is an update. The work of creating new content might be restricted to writing or editing just one or two modules, which would themselves update or create multiple story versions. Modules might also be "evergreen" and capable of being part of many different stories over time, thus reducing research and writing demands on journalists, enabling them to focus on creating new and distinctive content.
Setting out on this process, we first investigated whether we needed to start with a blank slate, by creating "modular first" artefacts, or whether we could easily extract modules of journalism from existing content. We concluded that the modular-first approach was clearly the most successful and that existing content is not structured in a way that enables discrete elements to be easily isolated and categorised. Whilst the modular-first approach was clearly preferred and is what we have focused on, there might be some scope for taking the extraction route to repurpose some existing content or archive material.

We have a taxonomy on which to base our algebra
Defining modules, information needs and automation opportunities
We've asked ourselves a number of questions as part of this research, including:
- What are the constituent parts of a "story"?
- What do we mean by modules?
- What are the key user needs of our audience(s), particularly those that are currently ill-served?
- What kinds of stories do we need to tell to create more inclusive and effective journalism?
- What modules would serve those stories best?
We've found answers, or at least partial solutions, to many of those questions. That work has resulted in the proof of concept prototype stories showcased on the "Modular Articles" page.

Our initial entities and a placeholder for a 'storytelling' algorithm

We started with a very big cloud of information needs
We collected questions based on the user personas we'd identified as part of our ill-served audiences. Many of these questions are seldom answered in traditional long-form articles.
In defining user information needs, we started from the proposition that they could be defined loosely as "what do our readers, viewers, or listeners want to know about this story?" For example, those would include: What has happened today? Who is involved? Why is this important? What happens next?
Our information needs are embodied in modules that are defined by a series of those questions. User information needs can be very different from case to case and have been defined and prioritised by different newsrooms in different ways, depending on each outlet's editorial purpose and the nature of their audience.
For the purpose of this work, we started with a list of around 60 potential modules (user information needs), but then narrowed this down to a subset of 10 key modules in order to test our algebra. Clearly, the more user information needs we include in our storytelling, the more possible combinations of modules and potential stories we have. Individual newsrooms taking on this work may wish to create their own subsets of user information needs, as having a potentially infinite number of possible stories may not be manageable or desirable, either in terms of newsroom workflow or user experience.
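To give a sense of how quickly the number of possible stories grows with the number of modules, the sketch below counts selections under two simplifying assumptions of ours (not the project's): each module appears at most once per story, and a story contains at least one module. The function names are hypothetical.

```python
import math


def unordered_selections(n: int) -> int:
    """Number of non-empty subsets of n modules (order ignored)."""
    return 2 ** n - 1


def ordered_selections(n: int) -> int:
    """Number of non-empty arrangements of n modules without repeats (order matters)."""
    return sum(math.perm(n, k) for k in range(1, n + 1))


print(unordered_selections(10))  # 1023
print(ordered_selections(10))    # 9864100
print(unordered_selections(60))  # 1152921504606846975
```

Even ignoring order, 60 modules give more than a quintillion possible selections, which illustrates why a newsroom might prefer to work with a small, curated subset.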
Defining this subset of 10 key questions, or user information needs, is a fascinating process in itself, as it forces us to think deeply both about the structure of existing journalism and about different (and potentially more effective) ways to build stories in the future. Starting from our initial set of 60 modules, it soon became clear that "essential" modules to reach audiences who are currently badly served by traditional journalism would include questions that don't currently feature in much "traditional" journalism, for example:
- Why is this important?
- What has got us here?
- What is the impact on my community?
- Are any people particularly or disproportionately affected?
- What don't we know?
- How can we fix it?
These questions were all responses to issues highlighted in user research done by the JournalismAI Collab teams and are a reflection of the "user first" (as opposed to "newsroom first") approach we took to the project brief.
For the core of our algebra, we're using van Dijk's news schema
We needed a solid ground of linguistic research on which to base our castle of modules
We identified early on that one potential risk of taking a user information needs approach could be that we would lose linguistic grounding. The central question was whether this user information needs approach, based on discrete modules of journalism answering individual questions, would ultimately lead to coherent and comprehensible stories.
We started by investigating a number of possible approaches, including Rhetorical Structure Theory and van Dijk's News Schemata. We found the van Dijk approach most useful to us, both as a formal analysis of the linguistic organisation of news reports within a "textual superstructure" and as guidance on that organisation. This gave us a starting point to help us understand how conventional reporting is typically organised. Van Dijk's schemata pointed us to a set of building blocks that we could potentially "mirror" in our storytelling and that, at the very least, would provide guidance to make it feel part of a textual tradition that most people would recognise as "news".

Can we slide our way from an update to the full context?
What kinds of modular stories might we create, and what modules would they need?
After selecting a set of ten core modules for our proof of concept, we next investigated what sorts of stories we could create using those modules. This was to meet our audiences' broader user information needs, such as the need for greater context, to understand a story's direct impact on their communities, to have access to all the data and facts, or to see possible solutions or constructive approaches to problems.
We also considered how these stories might be presented to users and landed on the idea of a slider. This would enable users to quickly choose between different kinds of articles based on various selections and configurations of our set of core modules.
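One way to picture the slider is as a lookup from slider position to a module configuration. The sketch below is a hypothetical illustration only: the stops, the module keys, and the `modules_for_position` helper are our assumptions and do not describe how the prototype is built.

```python
from typing import List

# Hypothetical slider stops, ordered from a short update to the fullest context.
# Each stop lists the module keys that the corresponding article would include.
SLIDER_STOPS: List[List[str]] = [
    ["what_happened"],
    ["what_happened", "why_important"],
    ["what_happened", "why_important", "how_we_got_here", "what_next"],
]


def modules_for_position(position: int) -> List[str]:
    """Return the module keys for a slider position, clamping out-of-range values."""
    clamped = max(0, min(position, len(SLIDER_STOPS) - 1))
    return SLIDER_STOPS[clamped]


print(modules_for_position(1))
```

Clamping keeps the lookup safe if the interface ever reports a position outside the defined stops.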
We should be clear that, for our proof of concept at least, we are not using AI or machine learning. However, the detailed linguistic and definitional work we have done here is a fundamental and necessary step to enable this kind of storytelling to be scaled using AI/ML in the future.
An example of a modular-first article in JSON can be seen here.

The JournalismAI Collab Challenges
This project is part of the 2021 JournalismAI Collab Challenges, a global initiative that brings together media organisations to explore innovative solutions to improve journalism via the use of AI technologies. It was developed as part of the EMEA cohort of the Collab Challenges that focused on modular journalism with the support of BBC News Labs and Clwstwr.
JournalismAI is a project of Polis – the journalism think-tank at the London School of Economics and Political Science – and it’s sponsored by the Google News Initiative. If you want to know more about the Collab Challenges and other JournalismAI activities, sign up for the newsletter or get in touch with the team via hello@journalismai.info
