Newsroom Robots

Nick Diakopoulos: The Potential of Fine-Tuning Large Language Models on Newsroom Data

Ep. 26

Nick Diakopoulos, Professor of Communication Studies and Computer Science at Northwestern University, joins Nikita Roy to discuss the opportunities that fine-tuning large language models offers news organizations and the impact of generative AI on news production and the broader information ecosystem.


Nick directs the Computational Journalism Lab and is Director of Graduate Studies for the Technology and Social Behavior doctoral program. He's also the author of the award-winning book "Automating the News: How Algorithms are Rewriting the Media," published by Harvard University Press. His research focuses on computational journalism, including automation and algorithms in news production and algorithmic accountability and transparency. 


More episodes

  • 49. Mattia Peretti: Balancing AI Innovation with Journalistic Integrity

    37:51
    Mattia Peretti, former manager of Journalism AI at the London School of Economics and current Knight Fellow at the International Center for Journalists, joins host Nikita Roy to share insights on balancing AI innovation with journalistic integrity. The episode explores an AI literacy initiative at Internews, which created a platform for knowledge exchange and significantly improved the organization's understanding and application of AI technologies. The discussion also delves into the development of generative AI guidelines for newsrooms, using the example of The Guardian. The focus is on creating adaptable, value-driven principles rather than strict prescriptions. This approach allows for flexibility in the face of rapid technological changes while ensuring that the organization's foundational values remain intact. The Guardian's experience serves as a valuable case study for other newsrooms looking to navigate the integration of AI technologies. Sign up for the Newsroom Robots newsletter for episode summaries and insights from host Nikita Roy.
  • 48. Florent Daudens: Building an AI-Literate Newsroom at Radio Canada

    01:08:28
    Florent Daudens, the outgoing Director of Newsgathering and Deployment at Canada's National Public Broadcaster, Radio-Canada, joins Nikita Roy to share how he led AI literacy initiatives in their newsroom. In his role, Florent focused on enhancing the news department with AI as well as managing operations across national, parliamentary, and foreign bureaus. With a passion for AI and technology trends, Florent has contributed to the digital evolution of major Canadian media outlets for over 15 years. Previously, he worked as the News Director at Le Devoir, where his tenure was marked by a digital transformation. This included the creation of specialized video and data visualization units and the introduction of innovative journalistic products. Florent also dedicates time to teaching digital journalism at the University of Montreal. Sign up for the Newsroom Robots newsletter for episode summaries and insights from host Nikita Roy.
  • 47. Ezra Eeman: Navigating the AI Frontier at the Dutch Public Broadcaster NPO

    39:05
    Ezra Eeman, the Director of Strategy & Innovation at the Dutch Public Broadcaster NPO, joins Nikita Roy to discuss NPO's AI strategy, revealing the complexities of navigating this frontier within a decentralized network of 13 broadcasters. From leveraging AI for accessibility and efficiency to cautious experiments with synthetic voices and avatars, NPO's approach offers a fascinating case study in balancing innovation with public trust. With almost 20 years of experience in media, innovation, and journalism, Ezra has been at the forefront of digital transformation. Previously, he was the Change Director at international media company Mediahuis, where he was responsible for coordinating newsroom transformation and digital acceleration. He also served as the Head of Digital, Transformation and Platforms at the European Broadcasting Union (EBU), and prior to that, he was head of an innovation lab and journalist at VRT, the Flemish public broadcaster. Sign up for the Newsroom Robots newsletter for episode summaries and insights from host, Nikita Roy.
  • 46. Craig Newmark: Philanthropy's Role in Supporting Journalism in the AI Era

    29:11
    Craig Newmark, internet pioneer and founder of Craigslist, joins Nikita Roy to talk about the past, present, and future of AI. Craig is a visionary whose profound contributions have shaped the landscape of digital platforms and supported the pillars of journalism. As the founder of Craigslist, he revolutionized the classified ads sector and transformed how people buy, sell, and connect within their local communities. Beyond his impact on the internet's landscape, Craig is a dedicated philanthropist, notably through Craig Newmark Philanthropies, where he has become a beacon of support for the work of journalists. His philanthropic journey is marked by significant contributions to some of the leading journalism schools, including the City University of New York's Graduate School of Journalism, aiming for a future where education in journalism is accessible to all, free of tuition. Craig's generosity has been instrumental in establishing the Center for Journalism Ethics and Security at Columbia University. His vision for a well-informed public has also led to supporting the University of Washington's Center for an Informed Public, which addresses the critical issues of mis- and disinformation. Craig Newmark Philanthropies has contributed to Harvard University's Berkman Klein Center for Internet & Society, which supported the launch of a three-year initiative called the Institute for Rebooting Social Media. Craig has contributed to several other universities, focusing on initiatives that support journalism, cybersecurity, public service for veterans, and the digital information ecosystem. In this episode, Craig shares his thoughts on the challenges posed by large language models and how philanthropy plays a vital role in supporting the integration of AI into journalism. 📢 Announcing the launch of the Newsroom Robots Academy. The Academy will offer short online courses designed to introduce you to generative AI, complete with industry-specific insights. Join Nikita Roy, who will co-teach these courses alongside Jeremy Caplan, writer of the Wonder Tools newsletter and Director of Teaching and Learning at the Craig Newmark Graduate School of Journalism at the City University of New York. Upskilling has become more crucial than ever. Through the courses offered at the Newsroom Robots Academy, you'll be able to leverage the capabilities of generative AI in your work as a media professional. Sign up now to be among the first to know when course registration opens.
  • 45. How Germany’s Ippen Digital is Fine-Tuning Large Language Models for Their Newsroom

    45:52
    From fine-tuning large language models, to discussing modular journalism, to developing an AI tool to help track misinformation, there’s a lot to unpack from this week’s conversation with Alessandro Alviani, the product lead for AI at Germany’s Ippen Digital. We build upon the first part of our conversation from last week, where Alessandro shared his editor-centric approach toward building AI products. A core takeaway from this week's episode is the value of fine-tuning large language models on a newsroom’s content. Fine-tuning is the process of taking a pre-trained language model that understands general textual patterns and customizing it by training it further on writing from a specific domain, in this case Ippen Digital's own journalistic content. By fine-tuning models on Ippen Digital's extensive corpus of local German reporting rather than just using out-of-the-box models like GPT-4, they are working on enhancing accuracy for tasks like headline writing, lead paragraph generation, and article summarization. Their editors and developers work side-by-side to ensure the AI's outputs match the desired quality standards and editorial voice. Additionally, Alessandro spotlighted their work in building personalized news experiences enhanced by modular journalism or “intelligent content.” Modular journalism involves breaking down articles into discrete, interchangeable components centered on key semantic themes, such as historical context, opposing views, and critical data. These content blocks can then be dynamically mixed and matched by an algorithm to generate personalized news experiences for different reader interests and preferences. We also discussed how developing AI assistants to break down a human-written news story into modules can enable the creation of customized article versions matching different reader interests or news products. Such repackaging of information to cater to diverse audiences is one of the potentials of AI in the newsroom. Thoughtful implementation of augmented writing tools could catalyze more engaging, personalized news without compromising editorial integrity. Of course, prudent precautions are necessary when developing algorithms in the newsroom. While AI has much potential for accelerating and enhancing reporting, we must understand its limitations in fully automating high-caliber journalism. The heart of quality storytelling, weaving together evidence and narratives to reveal truth and empower civil discourse, remains an irreplicable, fundamentally human endeavor. Ippen Digital’s stance of developing AI solutions that empower rather than replace reporters seems wise. By bonding human creativity and AI productivity with an ethical approach to automation, journalism may structurally shift yet hold fast to its sacred commitments to transparency, accuracy, and public enlightenment. 🎧 Listen to the full conversation available now on Apple, Spotify, Google, and other major podcast platforms.
  • 44. Alessandro Alviani (Part One): Building AI Products with an Editor-Centric Approach

    27:59
    Rather than AI replacing journalists, Alessandro Alviani believes editorial teams can leverage AI to enhance and augment their work. As the former Editorial Director at the Microsoft News Hub, Alessandro experienced firsthand the consequences of replacing human editors with automated systems. Drawing from his experience, he says that the key is to empower journalists with AI tools rather than displace them. "It's our responsibility to help editors develop a more realistic approach to AI," he says. Now, as the Product Lead on AI at the German newsroom Ippen Digital, Alessandro has led the creation of a range of innovative AI products - from interview transcription tools to illustration generators - with transparency, responsibility, and human oversight as key principles. What I found particularly interesting was his three-pronged strategy for an editorial-first approach to building AI products: internships with his product team, having two editors embedded within his 10-person team, and deep-dive discovery sessions across their newsrooms to understand editorial needs. This approach, which emphasizes collaboration and hands-on involvement, led to innovations such as an editorial assistant that was developed with input from human editors. With transparency and human oversight as guiding principles, Ippen's AI team built a self-evaluation system on top of their generative AI tools to automatically evaluate the quality of their output. Through their internal AI training programs, Ippen Digital strives to give every employee - not just technologists - a solid understanding of how AI models function, where they fall short, and why human judgment is irreplaceable. My biggest takeaway from Alessandro was this: by proactively shaping how AI gets built and deployed, journalists have an opportunity to set their direction. The future of news isn't human versus AI - it's human augmented by AI. And for the survival of quality journalism, getting that balance right is imperative. In the second part of our conversation, out next week, Alessandro discusses how Ippen Digital is working on fine-tuning large language models for specific newsroom tasks. He also discusses his collaboration with colleagues at The Times of London as a 2022 JournalismAI fellow, where he developed a tool and methodology for journalists to track manipulated narratives, especially those from state-run media.
  • 43. Jeff Jarvis (Part Two): Rethinking the journalism business model in the age of AI

    27:52
    Jeff Jarvis joins Nikita Roy in the second part of his conversation to discuss how journalism business models will be affected by the rise of generative AI. In part one, Jarvis shared his thoughts on whether generative AI companies should be allowed to use news media's copyrighted content to train their AI models. Jarvis has been the director of the Tow-Knight Center for Entrepreneurial Journalism at the Craig Newmark Graduate School of Journalism at the City University of New York and the author of "The Gutenberg Parenthesis: The Age of Print and its Lessons for the Age of the Internet." He also co-hosts the podcasts "This Week in Google" and "AI Inside". Sign up for the Newsroom Robots newsletter for episode summaries and insights from host, Nikita Roy.
  • 42. Jeff Jarvis (Part One): Should AI have the ‘right to read’ news like humans do?

    33:55
    Jeff Jarvis joins Nikita Roy to discuss whether AI companies should be allowed to use news media's copyrighted content to train their models. Jarvis is a veteran journalist and professor who recently testified to the US Senate Judiciary Subcommittee on Privacy, Technology, and Law on AI and the Future of Journalism. He's been the director of the Tow-Knight Center for Entrepreneurial Journalism at the Craig Newmark Graduate School of Journalism at the City University of New York. He is the author of six books, most recently "The Gutenberg Parenthesis: The Age of Print and its Lessons for the Age of the Internet." He co-hosts the podcasts "This Week in Google" and "AI Inside". Sign up for the Newsroom Robots newsletter for episode summaries and insights from host, Nikita Roy.
  • 41. Aliya Itzkowitz and Sam Gould (Part Two): Potential of Multimodal AI and Autonomous AI Agents in Publishing

    30:02
    Aliya Itzkowitz and Sam Gould from FT Strategies join Nikita Roy to discuss the capabilities of multimodal AI and AI agents within the publishing industry. Discover further insights and practical examples of these technologies in the Newsroom Robots newsletter, featuring insights from host, Nikita Roy. Aliya is a Manager at FT Strategies where she has consulted over 30 publishers across Europe, Asia, Africa and North America. Her work focuses on the critical shifts facing publishers today, including rethinking revenue models and understanding how to leverage AI. Before the FT, she worked at Dataminr, bringing AI technology to newsrooms, and at Bloomberg as a journalist. Aliya has a BA from Harvard University and an MBA from the University of Oxford. Sam is a data scientist at FT Strategies and has worked in consulting, helping clients to solve strategic business challenges using data. He has helped organizations in both the public and private sectors, from tech to healthcare to consumer products, define their AI roadmaps and strategies. He has also worked as a data scientist, designing and building data and AI systems. Sam designed the FT Strategies AI Design Sprint methodology working in partnership with the Google News Initiative. Sign up for the Newsroom Robots newsletter here.