January 22, 2009

Creating a Google News sitemap feed in Drupal 6

Filed under: Tips | Webopius @ 10:22 pm

A news site such as Google News gets millions of visitors each day. If you manage to get your site approved for Google News it can give you a great traffic boost.

Once you’re in, you need to tell Google about new articles as quickly as possible. You do this with a Google News sitemap feed, which you submit through Google Webmaster Tools.

Pre-written Drupal Module
A Google News feed module for Drupal 6.
Note: This module is provided ‘as is’ without warranty or support.

We built a news site just over a year ago and were fortunate to get it approved by Google News. With a Google News sitemap in place, new articles usually appear in Google News within about half an hour of being posted on the site.

Google’s rules for building a news sitemap…

Before we get into building the news feed, let’s look at Google’s rules for the sitemap:

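  • A news sitemap must declare Google’s news namespace (xmlns:news="http://www.google.com/schemas/sitemap-news/0.9") alongside the standard sitemap namespace.
  • A news sitemap should only contain URLs for articles published in the last three days.
  • Each submitted URL must include the article’s publication date in W3C format (yyyy-mm-ddThh:mm:ssZ, or with a numeric offset such as +00:00 in place of Z).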
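
The date and age rules can be sketched in a few lines of plain PHP. This is only an illustration: the function names newsdemo_w3c_date() and newsdemo_is_recent() are made up for this example and are not part of Drupal or of the module described below.

```php
<?php
// Hypothetical helpers illustrating two of the rules above.

// Format a Unix timestamp as a W3C date in UTC: yyyy-mm-ddThh:mm:ssZ.
function newsdemo_w3c_date($timestamp) {
  return gmdate('Y-m-d\TH:i:s\Z', $timestamp);
}

// TRUE if the article was published within the last $days days.
function newsdemo_is_recent($created, $now, $days = 3) {
  return $created <= $now && $created >= $now - $days * 86400;
}

$created = 1232474817; // 2009-01-20 18:06:57 UTC
print newsdemo_w3c_date($created) . "\n";                     // 2009-01-20T18:06:57Z
var_dump(newsdemo_is_recent($created, $created + 2 * 86400)); // bool(true)
var_dump(newsdemo_is_recent($created, $created + 4 * 86400)); // bool(false)
```

Passing the current time in as $now, rather than calling time() inside the function, keeps the check easy to test.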

Here’s an example of a typical Google News-compliant sitemap:

<?xml version="1.0" encoding="UTF-8"?>
<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9"
xmlns:news="http://www.google.com/schemas/sitemap-news/0.9">
<url>
  <loc>http://www.cotswoldnews.com/news/957/cotswold-council-tax-set-to-rise-by-29</loc>
  <news:news>
    <news:publication_date>2009-01-20T18:06:57+00:00</news:publication_date>
  </news:news>
</url>
</urlset>

Building a Google News sitemap in Drupal 6

Drupal’s excellent module system makes building a Google News sitemap pretty straightforward: all you need to do is create your own module. In this example, let’s call the module “newsmodule”.

First, let’s create the module’s .info file (newsmodule.info) and save it in sites/all/modules/newsmodule:


name = newsmodule
description = Example Google news sitemap generator
core = 6.x
version = "6.x-1.0"

Next, create the module file itself (newsmodule.module) and save it in the same directory as the .info file.

Within the module file, create a function to retrieve the nodes published in the past 3 days. Note: our site uses a content type called ‘news’; change this to match your own site’s content types.

Having found the right nodes, the function simply generates the Google News sitemap format and outputs the result as an XML feed:

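function _newsmodule_getgooglenews() {
  // Serve the output as XML rather than as a themed HTML page.
  drupal_set_header('Content-Type: text/xml');

  $content  = '<?xml version="1.0" encoding="UTF-8"?>';
  $content .= '<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9"';
  $content .= ' xmlns:news="http://www.google.com/schemas/sitemap-news/0.9">';

  // Published 'news' nodes created in the last three days, newest first.
  // Change 'news' to suit your own content type. Comparing the raw timestamp
  // gives a rolling 72-hour window and avoids MySQL-specific date functions.
  $sql  = "SELECT nid, created FROM {node} WHERE status = 1 AND type = '%s' ";
  $sql .= "AND created >= %d ORDER BY created DESC";
  $res = db_query($sql, 'news', time() - 3 * 24 * 60 * 60);

  while ($data = db_fetch_object($res)) {
    // Absolute URL built from the site's own base URL, escaped for XML.
    $node_url = check_plain(url('node/' . $data->nid, array('absolute' => TRUE)));
    // Publication date in W3C format, e.g. 2009-01-20T18:06:57+00:00.
    $node_date = date(DATE_W3C, $data->created);

    $content .= '<url>';
    $content .= '<loc>' . $node_url . '</loc>';
    $content .= '<news:news>';
    $content .= '<news:publication_date>' . $node_date . '</news:publication_date>';
    $content .= '</news:news>';
    $content .= '</url>';
  }
  $content .= '</urlset>';

  // Print directly and return nothing, so Drupal does not wrap the output in a page.
  print $content;
}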
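
The generator above mixes the database query with the string building. As a rough sketch of how the XML step alone might be separated out and tried without a Drupal installation, here is a plain PHP version. The function name newsdemo_build_sitemap() and the shape of the $items array are assumptions made for this example only; the escaping uses PHP’s htmlspecialchars() so that characters such as & in a URL do not break the XML.

```php
<?php
// Hypothetical helper: builds the sitemap XML from plain data, with no Drupal calls.
// Each item is array('url' => absolute URL, 'created' => Unix timestamp).
function newsdemo_build_sitemap(array $items) {
  $xml  = '<?xml version="1.0" encoding="UTF-8"?>' . "\n";
  $xml .= '<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9"';
  $xml .= ' xmlns:news="http://www.google.com/schemas/sitemap-news/0.9">' . "\n";
  foreach ($items as $item) {
    // Escape &, <, > and quotes so the URL is valid inside an XML element.
    $loc = htmlspecialchars($item['url'], ENT_QUOTES, 'UTF-8');
    // W3C date in UTC, e.g. 2009-01-20T18:06:57Z.
    $date = gmdate('Y-m-d\TH:i:s\Z', $item['created']);
    $xml .= "<url>\n";
    $xml .= "  <loc>$loc</loc>\n";
    $xml .= "  <news:news>\n";
    $xml .= "    <news:publication_date>$date</news:publication_date>\n";
    $xml .= "  </news:news>\n";
    $xml .= "</url>\n";
  }
  $xml .= "</urlset>\n";
  return $xml;
}

// Example with one made-up article URL.
print newsdemo_build_sitemap(array(
  array('url' => 'http://www.example.com/node/1?a=1&b=2', 'created' => 1232474817),
));
```

Returning the string instead of printing it means the same function could feed both the page callback and a quick test script.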

Finally, create a hook_menu function within the same module file to tell Drupal that you have a custom URL which then calls the generator function:

function newsmodule_menu() {
  $items['googlenews.xml'] = array(
    'title' => 'Google News feed',
    'page callback' => '_newsmodule_getgooglenews',
    'access arguments' => array('access content'),
    'type' => MENU_CALLBACK,
  );
  return $items;
}

Now you should be able to enable the module in Drupal and browse to www.YOURSITENAME.com/googlenews.xml to see the generated sitemap. If all went well, you can submit the Google News sitemap URL to Google Webmaster Tools.
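
Before submitting, it may be worth checking that the feed is well-formed and contains no stale entries. The following is a minimal sketch of such a check using PHP’s SimpleXML extension; the function name newsdemo_stale_entries() is made up for this example, and the sample feed is hard-coded here rather than fetched from your site.

```php
<?php
// Hypothetical checker: returns the <loc> values of entries older than $days days
// (or with an unreadable date), or FALSE if the feed is not well-formed XML.
function newsdemo_stale_entries($xml, $now, $days = 3) {
  $sitemap_ns = 'http://www.sitemaps.org/schemas/sitemap/0.9';
  $news_ns = 'http://www.google.com/schemas/sitemap-news/0.9';

  $doc = @simplexml_load_string($xml);
  if ($doc === FALSE) {
    return FALSE;
  }

  $stale = array();
  foreach ($doc->children($sitemap_ns)->url as $url) {
    // <news:publication_date> lives in the news namespace.
    $date = (string) $url->children($news_ns)->news->publication_date;
    $published = strtotime($date);
    if ($published === FALSE || $published < $now - $days * 86400) {
      $stale[] = (string) $url->children($sitemap_ns)->loc;
    }
  }
  return $stale;
}

// Sample feed with a single made-up entry dated 2009-01-20.
$feed = '<?xml version="1.0" encoding="UTF-8"?>'
  . '<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9"'
  . ' xmlns:news="http://www.google.com/schemas/sitemap-news/0.9">'
  . '<url><loc>http://www.example.com/node/1</loc><news:news>'
  . '<news:publication_date>2009-01-20T18:06:57+00:00</news:publication_date>'
  . '</news:news></url></urlset>';

// One day after publication nothing is stale; five days after, the entry is.
print_r(newsdemo_stale_entries($feed, 1232474817 + 86400));
print_r(newsdemo_stale_entries($feed, 1232474817 + 5 * 86400));
```

To check a live site, the $feed string could instead be loaded from the googlenews.xml URL, with time() passed as $now.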

A Google News sitemap module

For those of you who prefer it, I’ve written a slightly enhanced version of the module described above: it has a configuration screen that lets you specify which content types go into the feed. You can find the zip file of the Google News feed module here. Note: This module is provided ‘as is’ without warranty or support.
