SEO for Google News - Structured Data for News Publishers
Structured Data for News Publishers
I explain the types of structured data a news publishing site should have, and how these can be implemented for optimal results.
For this newsletter I’m digging into one of my favourite topics: structured data. I’m not going to explain in detail what structured data is - Google does a pretty good job of that in their official documentation.
Basically, structured data is extra markup you put in your HTML code that tells machine systems - like Google - exactly what type of content is on the page. It makes Google’s life easier. And much of SEO is making Google’s life easier.
Structured Data Formats
The structured data that Google supports is based on the schema.org vocabulary. This is not a perfect marriage: in their documentation, Google states that you shouldn’t rely on the schema.org website but on Google’s own documentation instead.
In the context of SEO, we care about structured data primarily for its impact on Google. So when Google’s requirements differ from what’s stated on the schema.org website, you’ll want to follow Google’s rules.
You can implement structured data in different ways. The two most common approaches are in-line with the page content’s HTML (microdata and RDFa), or in a separate snippet using JavaScript Object Notation for Linked Data (JSON-LD). Google explicitly prefers the latter.
I also prefer JSON-LD, because it significantly eases any troubleshooting you may need to do. By keeping the structured data in a separate snippet in JSON-LD, you make it much simpler to test, implement, and fix.
A JSON-LD structured data snippet can theoretically sit anywhere in a webpage’s HTML code. In my experience, it’s best to have it as part of the <head>, and fairly high up in the <head> as well. This seems to reduce the chance of the snippet not being picked up by Google.
Most importantly, the structured data should be present in the raw (unrendered) HTML code. It should not rely on client-side JavaScript to be injected into the webpage.
The reason for this is speed. While Google does render webpages as part of its indexing process, that rendering is relatively slow: it can take minutes, but it can also take hours or even days. News has to start ranking in Google’s results straight away.
Google cannot wait for its own datacentres to complete a full render of a news article, as this could mean the article isn’t shown until it’s already out of date. So the initial indexing of a news article is based on its HTML code only. And that means all the SEO-critical components of an article (headline, full content, <title>, canonical tag, Open Graph, etc. - and structured data) need to be part of the HTML code before (and after) any client-side JavaScript is loaded.
I still see some implementations where structured data is loaded with JavaScript, for example with Google Tag Manager. This is fine for non-news webpages, but for news it simply doesn’t work.
Required Structured Data
So which structured data does a news publishing site actually need? Well, none of it is mandatory. A site can rank just fine in Google’s results, both news-specific and general, without any structured data at all.
But I believe certain structured data snippets are highly advisable, as they help Google understand the context and purpose of your content better. This can help with your content’s appearance in Top Stories, Google Discover, and other news-specific areas.
These are the structured data snippets I would strongly recommend a publisher implement:
That’s it. These two are the structured data snippets I think every publisher should definitely have. Everything else is optional - more on that below. Before we move on to other structured data snippets, let’s answer a few common questions about these two.
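One way to check the earlier point about raw HTML is to look at the page source before any JavaScript runs. The sketch below is only an illustration: it uses Python’s standard-library HTML parser to pull every JSON-LD snippet out of a raw HTML string and report its @type. The function name and the sample page are made up for this example; in practice you would feed it the unrendered response body of one of your own article URLs.

```python
import json
from html.parser import HTMLParser


class JsonLdExtractor(HTMLParser):
    """Collects the contents of <script type="application/ld+json"> blocks."""

    def __init__(self):
        super().__init__()
        self._in_jsonld = False
        self._buffer = []
        self.snippets = []

    def handle_starttag(self, tag, attrs):
        script_type = (dict(attrs).get("type") or "").lower()
        if tag == "script" and script_type == "application/ld+json":
            self._in_jsonld = True
            self._buffer = []

    def handle_data(self, data):
        if self._in_jsonld:
            self._buffer.append(data)

    def handle_endtag(self, tag):
        if tag == "script" and self._in_jsonld:
            self._in_jsonld = False
            # Raises json.JSONDecodeError if the snippet is not valid JSON.
            self.snippets.append(json.loads("".join(self._buffer)))


def jsonld_types_in_raw_html(raw_html):
    """Return the @type of every top-level JSON-LD object in the raw HTML."""
    parser = JsonLdExtractor()
    parser.feed(raw_html)
    parser.close()
    types = []
    for snippet in parser.snippets:
        items = snippet if isinstance(snippet, list) else [snippet]
        types.extend(item.get("@type") for item in items if isinstance(item, dict))
    return types


# Hypothetical page source, as it would arrive before any JavaScript runs.
sample_html = """<html><head>
<title>Example story</title>
<script type="application/ld+json">{"@context": "https://schema.org", "@type": "NewsArticle", "headline": "Example story"}</script>
</head><body><p>Body text</p></body></html>"""

print(jsonld_types_in_raw_html(sample_html))  # ['NewsArticle']
```

If this returns an empty list for a page whose rendered version does contain structured data, the snippet is most likely being injected client-side.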
In its Article structured data documentation, Google says it supports the BlogPosting, Article, and NewsArticle structured data types. In my view, it doesn’t make any difference which one you use. They’re all valid. I generally recommend NewsArticle for stories aimed at the news cycle, and Article for evergreen content. I think BlogPosting is still supported by Google because many news sites started out as blogs, but personally I wouldn’t use BlogPosting anymore.
There are more granular subtypes within NewsArticle, such as ReportageNewsArticle, OpinionNewsArticle, ReviewNewsArticle, etc. I don’t think there’s an SEO benefit to using these. But they seem to work fine for Google, and validate as articles in Google’s Rich Results Test.
In their documentation, Google is fairly clear about which attributes are required and recommended. None are required, but the more recommended attributes you provide, the easier it is for Google to understand the article.
These are the attributes Google recommends for every Article structured data snippet: headline, image (ideally three, one for each of the preferred aspect ratios - 1x1, 4x3, and 16x9), author (with name and URL), datePublished, and dateModified.
That’s it. Most implementations will have more attributes defined, such as description, publisher, url (for the article itself), mainEntityOfPage, and complementary attributes such as keywords, articleSection, and sometimes even the full articleBody. But none of those are actually needed.
When you look at how Google displays news stories in its search results, it makes sense why so few attributes are recommended. All that Google shows is the publisher, the headline, the image, and the timestamp.
I find it interesting that there is no recommendation to include publisher attributes in your article structured data, despite it being such an obvious part of the article’s visual presence. I think this is because Google establishes the publisher details on the hostname level, and doesn’t extract them from an article’s structured data.
A notable example is how content from The Athletic is shown on Google with The New York Times branding. The article structured data on The Athletic clearly defines it as part of The Athletic, but because the content is now published on www.nytimes.com it gets that hostname’s branding instead.
The additional recommended attributes for authors are, I believe, a way for Google to establish authorship and expertise. Beyond that, everything else is just noise.
Because every publisher has their own implementation, with some defining a shedload of attributes and others defining very few, there’s no way for Google to attach strong value judgments to the presence or absence of additional attributes.
Rewarding more structured data would create a two-tier web space where sites with more development resources or better CMSs win over those less fortunate, regardless of the quality of the journalism. Which is also, of course, the main reason why there is no hard requirement for any structured data. It’s all optional.
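To make the recommended attributes concrete, here is a minimal sketch of how a CMS template could build a NewsArticle snippet server-side and emit it as a JSON-LD script tag for the <head>. The function names, URLs, author, and timestamps are placeholders invented for this example, not taken from any real site.

```python
import json


def build_news_article_jsonld(headline, image_urls, author_name, author_url,
                              date_published, date_modified):
    """Build a NewsArticle dict with only the recommended attributes."""
    return {
        "@context": "https://schema.org",
        "@type": "NewsArticle",
        "headline": headline,
        # Ideally three images: 1x1, 4x3, and 16x9.
        "image": list(image_urls),
        "author": [{"@type": "Person", "name": author_name, "url": author_url}],
        "datePublished": date_published,
        "dateModified": date_modified,
    }


def render_jsonld_script(data):
    """Serialise the dict into a script tag to print in the server-side HTML."""
    # Escape "</" so a stray "</script>" in a headline cannot close the tag early.
    payload = json.dumps(data, indent=2).replace("</", "<\\/")
    return '<script type="application/ld+json">\n' + payload + "\n</script>"


# Hypothetical article values.
article = build_news_article_jsonld(
    headline="Example headline for a news story",
    image_urls=[
        "https://www.example.com/images/story-1x1.jpg",
        "https://www.example.com/images/story-4x3.jpg",
        "https://www.example.com/images/story-16x9.jpg",
    ],
    author_name="Jane Reporter",
    author_url="https://www.example.com/authors/jane-reporter",
    date_published="2024-10-01T08:00:00+00:00",
    date_modified="2024-10-01T09:30:00+00:00",
)
print(render_jsonld_script(article))
```

Because the tag is produced as part of the page template, it is present in the raw HTML and does not depend on client-side JavaScript.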
Yes - more on that here.
Yes. But there’s no guarantee that you get the Live badge, even with a full implementation of LiveBlogPosting markup. It seems Google needs to somehow ‘approve’ the site for red Live badges. I’m honestly not sure what it takes to get approved for a Live badge, but LiveBlogPosting structured data appears to be a hard requirement. Regularly publishing live articles with LiveBlogPosting seems to be part of the approval process.
Theoretically, yes, you can have both LiveBlogPosting and NewsArticle structured data on one article page. You won’t get penalised for it. However, LiveBlogPosting covers everything Google needs to interpret a live article as such, so the presence of NewsArticle structured data is unnecessary.
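As a rough illustration of what a LiveBlogPosting snippet can look like, the sketch below uses only properties listed on the schema.org LiveBlogPosting page (coverageStartTime, coverageEndTime, liveBlogUpdate). Since Google publishes no documentation for this type, treat the choice of properties as an assumption, not a known-good recipe. The function name, URL, and updates are placeholders.

```python
import json


def build_live_blog_jsonld(headline, url, coverage_start, updates, coverage_end=None):
    """Build a LiveBlogPosting dict.

    `updates` is a list of (headline, published, body) tuples. Timestamps are
    ISO 8601 strings and are assumed to share the same format and UTC offset,
    so that comparing them as plain strings gives the right order.
    """
    snippet = {
        "@context": "https://schema.org",
        "@type": "LiveBlogPosting",
        "headline": headline,
        "url": url,
        "datePublished": coverage_start,
        "coverageStartTime": coverage_start,
        # The newest update doubles as the modification time.
        "dateModified": max((published for _, published, _ in updates),
                            default=coverage_start),
        "liveBlogUpdate": [
            {
                "@type": "BlogPosting",
                "headline": update_headline,
                "datePublished": published,
                "articleBody": body,
            }
            for update_headline, published, body in updates
        ],
    }
    if coverage_end is not None:
        snippet["coverageEndTime"] = coverage_end
    return snippet


# Hypothetical live article with two updates.
live_blog = build_live_blog_jsonld(
    headline="Election night: live updates",
    url="https://www.example.com/live/election-night",
    coverage_start="2024-11-05T20:00:00+00:00",
    updates=[
        ("Polls close", "2024-11-05T22:00:00+00:00", "Voting has ended."),
        ("First results", "2024-11-05T23:15:00+00:00", "Early counts are in."),
    ],
)
print(json.dumps(live_blog, indent=2))
```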
Yes, welcome to the club. I have yet to see a LiveBlogPosting implementation that doesn’t show at least one error in Google’s Rich Results Test. And yet, it still seems to work fine, and the article gets the coveted live badge in Top Stories.
Despite the fact Google obviously supports live articles, there is no official Google documentation on LiveBlogPosting markup available to the public. There is some documentation available to those who were given access as part of the original Google pilot program for live coverage, but this seems to be dormant.
Generally I recommend copying the LiveBlogPosting snippet from a competing website that gets the red Live badge, and referring to the schema.org page on LiveBlogPosting to enhance your snippet. Use the Rich Results Test and the Schema Validator to try and get your snippet as close to perfection as possible.
Recommended Structured Data
Of course there’s more to structured data than just article markup. There are several more snippets that I would recommend every publisher implement. These are:
Organization structured data should be implemented on the site’s homepage (and only there, ideally). In this snippet you define your business’s name, website, logo, associated social media presence (in the sameAs attribute), and any contact points (such as the customer service desk) that you want the public to be aware of. If you have a physical address for your headquarters, I’d recommend providing those attributes too. An actual office address is an extra level of credibility. For news publishers, you can use NewsMediaOrganization instead of just Organization. This is a more granular subtype that makes it explicitly clear what your business’s purpose is.
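A homepage snippet along these lines could be sketched as follows. The property names (sameAs, contactPoint, address) come from schema.org; the function name, publisher name, URLs, email, and address are all placeholders for illustration.

```python
import json


def build_publisher_jsonld(name, url, logo_url, same_as,
                           contact_email=None, address=None):
    """Build a NewsMediaOrganization dict for the homepage."""
    snippet = {
        "@context": "https://schema.org",
        "@type": "NewsMediaOrganization",
        "name": name,
        "url": url,
        "logo": logo_url,
        # Social media profiles and other official presences.
        "sameAs": list(same_as),
    }
    if contact_email:
        snippet["contactPoint"] = [{
            "@type": "ContactPoint",
            "contactType": "customer service",
            "email": contact_email,
        }]
    if address:
        # `address` holds PostalAddress properties such as streetAddress.
        snippet["address"] = {"@type": "PostalAddress", **address}
    return snippet


# Hypothetical publisher details.
publisher = build_publisher_jsonld(
    name="Example Daily News",
    url="https://www.example.com/",
    logo_url="https://www.example.com/static/logo.png",
    same_as=[
        "https://www.facebook.com/exampledailynews",
        "https://www.linkedin.com/company/exampledailynews",
    ],
    contact_email="support@example.com",
    address={
        "streetAddress": "1 Example Street",
        "addressLocality": "London",
        "postalCode": "EC1A 1AA",
        "addressCountry": "GB",
    },
)
print(json.dumps(publisher, indent=2))
```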
For pages that have a video embedded, you’ll want to implement appropriate VideoObject structured data. Note that if the video is not the main content of the page - so if it’s just the ‘banner’ above the article, or embedded in the article content - Google will not accept the page as a video page. It’ll be shown in Google Search Console as ‘No video indexed’ even when your VideoObject markup is fully valid. This is because Google only wants to show a video result if the URL is a dedicated video page. I think it’s still worthwhile to include VideoObject structured data on articles where the video is not the primary content, even though you may not be rewarded by Google in any way.
I’m a fan of implementing Breadcrumb structured data (the BreadcrumbList type). The direct SEO benefits are marginal; you basically get a somewhat nicer-looking breadcrumb when the article is shown in regular search results.
However, I think there’s more to breadcrumbs. I’m in favour of visual breadcrumbs on articles, as these show the content hierarchy and send strong signals about the position the article occupies in the site’s overall structure. Breadcrumbs also help with sending link value to the article’s parent categories, and serve as an additional navigation method. When you implement visual breadcrumbs, you should also implement Breadcrumb structured data.
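A breadcrumb snippet that mirrors the visual breadcrumb could be generated from the same trail of sections the template already renders. This is a minimal sketch using schema.org’s BreadcrumbList and ListItem types; the function name and the section names and URLs are made up.

```python
import json


def build_breadcrumb_jsonld(trail):
    """Build a BreadcrumbList dict.

    `trail` is an ordered list of (name, url) pairs, from the homepage down to
    the article's own section.
    """
    return {
        "@context": "https://schema.org",
        "@type": "BreadcrumbList",
        "itemListElement": [
            {"@type": "ListItem", "position": position, "name": name, "item": url}
            for position, (name, url) in enumerate(trail, start=1)
        ],
    }


# Hypothetical section hierarchy for an article.
breadcrumb = build_breadcrumb_jsonld([
    ("Home", "https://www.example.com/"),
    ("Politics", "https://www.example.com/politics/"),
    ("Elections", "https://www.example.com/politics/elections/"),
])
print(json.dumps(breadcrumb, indent=2))
```

Generating the markup from the same data as the visible breadcrumb keeps the two from drifting apart.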
Having author pages for your regular writers has been a best practice for a while. On these author pages, you should also consider implementing Person or ProfilePage structured data. This structured data makes the content on the author page more easily digestible for Google, allowing the search engine to establish the author’s entity in the Knowledge Graph and identify expertise and authority. Make sure you implement the ‘sameAs’ attribute on your author pages, with URLs listing the author’s social media profiles and other publications they write for. This all helps with the author’s entity in Google’s Knowledge Graph.
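For an author page, one possible shape is a ProfilePage whose mainEntity is the Person, with sameAs pointing at the author’s other profiles. The sketch below assumes that structure; the function name, the author, and every URL are placeholders.

```python
import json


def build_author_page_jsonld(name, profile_url, same_as, job_title=None):
    """Build a ProfilePage dict with the author as its main entity."""
    person = {
        "@type": "Person",
        "name": name,
        "url": profile_url,
        # Social media profiles and author pages on other publications.
        "sameAs": list(same_as),
    }
    if job_title:
        person["jobTitle"] = job_title
    return {
        "@context": "https://schema.org",
        "@type": "ProfilePage",
        "mainEntity": person,
    }


# Hypothetical author.
author_page = build_author_page_jsonld(
    name="Jane Reporter",
    profile_url="https://www.example.com/authors/jane-reporter",
    same_as=[
        "https://www.linkedin.com/in/jane-reporter-example",
        "https://www.example.org/contributors/jane-reporter",
    ],
    job_title="Political Correspondent",
)
print(json.dumps(author_page, indent=2))
```

The url here should match the author URL used in the article markup, so both snippets point at the same entity.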
Lastly, I recommend implementing SearchAction structured data on the site’s homepage. With SearchAction you let Google know that your site has an internal search function, and how that search function can be accessed.
For large-ish news brands, Google will often show a search box as part of the brand’s search result - a so-called Sitelinks Search box. If you do not provide SearchAction structured data, anyone using this search feature on Google will get another Google search result, just limited to the site in question.
With SearchAction, Google should theoretically trigger your own internal site search when someone uses the search box on your branded Google result. I say ‘theoretically’ because The Telegraph, for example, does actually have SearchAction structured data, but Google still generates its own SERP instead of using The Telegraph’s search function. I suspect it could be related to the fact The Telegraph’s search function is JavaScript-powered. (Yes, it’s always JavaScript’s fault.)
Wrapping Up
There’s a lot more to say about structured data, as there are plenty more opportunities for implementation. Job listings should have JobPosting markup, a frequently asked questions page should have FAQPage markup, etc. Google’s own extensive documentation on structured data is a great resource, so please dig into that to see all the types that are officially supported and how to best implement them.
I may do a follow-up newsletter at some stage on additional structured data snippets. Let me know in the comments which structured data types you’d like me to dig into.
Miscellanea
It’s been a fun time in the world of SEO and publishing! Here’s a roundup of recent articles and stories I found worthwhile:
Official Google Docs:
Interesting Articles:
Latest in SEO:
Lastly, some usual self-promotion. I was a panellist at the recent Future of Media Technology conference, where my panel spoke about the ‘interesting’ relationship between publishers and tech platforms in the age of LLMs. Here are two recaps from that session:
Recently I was interviewed on the topic of search and publishers, looking at underlying reasons for news sites’ decline in Google traffic. Here is the resulting article:
If you like to watch hour-long videos of SEOs blabbering about SEO for news and publishers, I got just the thing for you. Sean Bianco from The SEO Club invited me onto his podcast and we had a blast:
News and Editorial SEO Summit 2024
We’re just over a month away from our fourth annual News and Editorial SEO Summit. This year has a truly packed lineup across the two days of our online livestreamed event. If you haven’t yet secured your ticket, giving you access to the live event as well as recordings of all the sessions, better get a move on!
That’s it for this edition of SEO for Google News. Thanks as always for reading and subscribing. See you at the next one!
If you liked this edition of SEO for Google News, please share it with anyone you think may find it useful.