Popular Science Magazine (PopSci.com) Case Study

Until its relaunch, Popular Science's online presence was built on proprietary web content management systems. With the relaunch, the Popular Science team wanted to move the magazine's online presence into the open source world.

Prior to its relaunch, the Popular Science website used several different systems to deliver content. One of the goals for the new site was to bring these disparate sites together under a unified user interface while improving usability and functionality. Drupal's flexibility and extensibility made it possible to meet Popular Science's usability and functional requirements. One of the big challenges, however, was converting and importing several years' worth of content from a Vignette 7 CMS and several TypePad blogs.

Another challenge was the integration of several third-party services, including a fantasy stock trading system, video conversion and hosting services, and advertising.

In approaching the development of the new PopSci.com, we took advantage of various contributed modules, and created a number of custom modules, including the Drupal Markup Engine for content placement within nodes and Node Carousel for displaying content.

Finally, scalability was a primary concern, as PopSci already had a large and active user base. By specifying a load-balanced multi-server cluster to serve up the site, combined with the use of Memcache, PopSci.com post-relaunch was able to weather an average load of 60 pages per second with a spike of over 1.1 million page views in 24 hours -- a new record for Popular Science.

It was important to the PopSci.com editors that they have complete control over the placement of media and supporting content, not only in full node view but also in teaser view. They wanted to be able to paginate long articles and place any number of images, or even related blocks, into the content of a node. The media placement also needed to be intelligent enough to work with legacy content imported from Vignette and TypePad. Most of this was accomplished with a new module called the Drupal Markup Engine, or DME. The DME provides a custom, extensible input filter that works in conjunction with the content types created for this project with the Content Construction Kit (CCK).

Articles are the main content-type on the site. All blog posts from TypePad and articles from Vignette were consolidated as articles in Drupal.

The article content-type uses the DME extensively. Referenced images can be placed anywhere in an article using the DME. If a referenced image node isn't specifically placed within the content body by the DME, it is automatically displayed at the top of the article and in the article's teaser view.

Images may also be placed directly in the teaser using the DME. This approach provides maximum flexibility both for images entered through Drupal and for images from legacy content; the latter required no human intervention to work.

The DME is also used to place a related content block (containing links to nodes in Node Reference fields or nodes with similar taxonomy terms) into the content and to set pagination for the article.
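
The image placement rule described above can be sketched in a few lines. This is only an illustration in Python, not the DME's code (the DME is a Drupal module): the `[image:N]` tag syntax and the names `render_body` and `image_urls` are hypothetical, since the DME's actual markup is not shown here.

```python
import re

# Hypothetical tag syntax: [image:N] places the N-th referenced image (1-based).
IMAGE_TAG = re.compile(r"\[image:(\d+)\]")


def render_body(body, image_urls):
    """Replace placement tags with image markup; unplaced images go on top."""
    placed = set()

    def substitute(match):
        index = int(match.group(1)) - 1
        if 0 <= index < len(image_urls):
            placed.add(index)
            return '<img src="%s">' % image_urls[index]
        return ""  # drop tags that point at a missing image

    rendered = IMAGE_TAG.sub(substitute, body)
    # Fallback rule from the text: images that were not explicitly placed
    # are shown at the top of the article.
    unplaced = ['<img src="%s">' % url
                for i, url in enumerate(image_urls) if i not in placed]
    return "".join(unplaced) + rendered
```

With this rule, legacy content that contains no placement tags still shows its images, which is consistent with imported articles needing no manual editing.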

Article Structure

Article Images -- Node Reference to images used in the article.
Associated Photo Gallery -- Node Reference to a Photo Gallery.
Body -- The article's body.
Category Badge -- A taxonomy image that will apply a graphical badge to the article.
Credit -- The credit is the contributor of the article.
DEK -- A brief description of the article.
Primary Category -- The primary taxonomy for the site represented by the main navigation areas.
Related Articles -- Node Reference field to relate other articles.
Tags -- An auto-fill taxonomy field.
Title -- Core title field.
V7id -- The Vignette 7 ID of the original article so that it can be cross-referenced. This was useful for redirecting old URLs to new Drupal content. (See the discussion of imports below.)
Video Link -- Node Reference to related videos.

The "current issue" node type represents an issue of the magazine. It stores an image of the magazine's cover together with the issue date. This node type is used in various promotional content throughout the site.

Current Issue Structure

Cover -- An image representing the magazine cover.
Issue Date -- Publication date of the issue.
Title -- Core title field.

The Featured tout is a node type created to be used solely in a Node Carousel driven by a Node Queue. The featured touts simply require the Popular Science editors to create graphics that are of the appropriate dimensions. These can be seen on the front page of http://popsci.com/.

Featured Tout Structure

Associated Article -- Node Reference to the article being touted.
DEK -- A brief description of the article being touted.
Index Display Link -- The word used as the link in the tout.
Title -- Core title field.

Images are used extensively on the site and needed to be invoked in a number of ways. They appear in different forms in articles, teaser widgets, and photo galleries. If an image has related content, links to that content are shown in all but teaser views. Images are not served as standalone pages on the site; they are invoked only within Articles and Photo Galleries.

Image Structure

Credit -- The contributor of the image.
DEK -- A brief description of the image.
Photo Gallery Link -- Node Reference to Photo Galleries. If an image references a gallery, it shows up in that Photo Gallery.
Photo Gallery Weights -- This field contains a series of number pairs, with each pair representing the photo gallery and the image's weight in that photo gallery.
Primary Category -- The primary taxonomy for the site represented by the main navigation areas.
Title -- Core title field.
V7id -- The Vignette 7 ID of the original image so that it can be cross-referenced. This was useful for redirecting old URLs to new Drupal content.
Video Link -- Node Reference to related videos.
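
To illustrate how a Photo Gallery Weights field could drive ordering, here is a minimal Python sketch. The storage format is an assumption: the text only says the field holds number pairs, so the `gallery_id:weight` comma-separated encoding and both function names are hypothetical.

```python
def parse_gallery_weights(field_value):
    """Parse 'gallery_id:weight' pairs into a {gallery_id: weight} dict."""
    weights = {}
    for pair in field_value.split(","):
        pair = pair.strip()
        if not pair:
            continue
        gallery_id, weight = pair.split(":")
        weights[int(gallery_id)] = int(weight)
    return weights


def order_gallery(images, gallery_id):
    """Return the titles of a gallery's images, lowest weight first.

    `images` is a list of (title, weights_field) tuples.
    """
    members = []
    for title, field_value in images:
        weights = parse_gallery_weights(field_value)
        if gallery_id in weights:
            members.append((weights[gallery_id], title))
    return [title for _, title in sorted(members)]
```

Keeping one weight per gallery on the image lets the same image sit at different positions in different galleries.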

A Photo Gallery is a node type serving to collect image nodes and content to be displayed to the end user as a photo gallery. The images are designated for a photo gallery by editing the image and entering the gallery title in the appropriate Node Reference field. Galleries are presented as Node Carousels to give them a slick, interactive feel.

Photo Gallery Structure

Category Badge -- A taxonomy image that will apply a graphical badge to the image.
Credit -- The contributor of the image.
DEK -- A brief description of the image.
Icon -- A Node Reference field to the image to use when viewing the gallery in teaser view.
Primary Category -- The primary taxonomy for the site represented by the main navigation areas.
Tags -- An auto-fill taxonomy field.
Title -- Core title field.
V7id -- The Vignette 7 ID of the original image so that it can be cross-referenced. This was useful for redirecting old URLs to new Drupal content.

The Video node enables posting of video to either YouTube or OnStream. We developed a custom media module, which creates a custom Media Profile CCK field that can be attached to any node, allowing editors and admins to restrict the services used on a per-content-type basis.

The custom media module differs from the existing emfield module by offering greater flexibility -- such as allowing users to upload videos to the services straight from Drupal.

Video Structure

Category Badge -- A taxonomy image that will apply a graphical badge to the video.
Credit -- The contributor of the video.
DEK -- A brief description of the video.
Primary Category -- The primary taxonomy for the site represented by the main navigation areas.
Tags -- An auto-fill taxonomy field.
Title -- Core title field.
Video Link -- A hosted video handled by an extension to the media module.

Part of the motivation to move the existing content over to Drupal was to escape the rigid complexity and cost associated with the Vignette CMS. The Vignette dataset was a 1.66GB Oracle database -- and that didn't include the more than 15,000 images referenced in the Vignette data which also had to be imported into the new site.

The first step in the migration process was to use the MySQL Migration Toolkit to transfer the data to MySQL. Next, we wrote a custom module that used cron to feed the Oracle data through Drupal's APIs in manageable chunks. Finally, we imported the images by extracting their locations from the Oracle data and running a shell script that executed a series of wget commands to download them.
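
The chunked, cron-driven approach can be sketched roughly as follows. This is a simplified Python illustration rather than the import module itself: `import_chunk`, `state`, `create_node`, and the chunk size are hypothetical names and values, and an in-memory list stands in for the migrated table.

```python
def import_chunk(rows, state, create_node, chunk_size=100):
    """Import the next chunk of rows; return True while work remains.

    `state` persists between runs (for example, between cron invocations)
    and records how far the import has progressed.
    """
    start = state.get("offset", 0)
    for row in rows[start:start + chunk_size]:
        create_node(row)  # in the real site this went through Drupal's APIs
    state["offset"] = min(start + chunk_size, len(rows))
    return state["offset"] < len(rows)
```

Each run does a bounded amount of work and records its position, so a long import can proceed across many runs without exhausting time or memory limits in any single one.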

As each piece of content was created in Drupal, it was tagged using the Yahoo Terms module, which, despite some odd results, provided a good start on tagging the immense amount of untagged Vignette data.

Once the preparations were in place, the entire import process took approximately two solid days of execution time to complete.

Part of the import process dealt with the URLs that Vignette had generated, so that an article requested by its old Vignette address could be found in the new Drupal architecture. To accomplish this, during the import we took the Vignette ID of each unit of information and placed it in a CCK field on its destination node in Drupal. To actually find those articles in Drupal, we wrote a hook that works with the Custom Error module: when a 404 occurs, it looks for the old Vignette ID in the URL and issues the correct redirect code. Not only were we able to handle redirects while historic links were still in use, but within a very short time Google had updated its search results to show the new paths.
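
A rough Python sketch of that lookup is shown below. Several details are assumptions made for illustration: the pattern used to spot a Vignette ID in a URL, the 301 status code (the text says only "the correct redirect code"), and the names `resolve_legacy_url` and `v7id_to_path`.

```python
import re

# Assumption: a Vignette ID is a run of 20 or more hex characters in the URL.
VIGNETTE_ID = re.compile(r"[0-9a-f]{20,}", re.IGNORECASE)


def resolve_legacy_url(path, v7id_to_path):
    """Return (status, location) for a request that would otherwise 404."""
    match = VIGNETTE_ID.search(path)
    if match:
        # In Drupal this lookup would query the V7id CCK field.
        target = v7id_to_path.get(match.group(0).lower())
        if target is not None:
            return 301, target  # permanent redirect to the new path
    return 404, None
```

Running the lookup only on requests that already failed keeps the cost off normal page loads.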

The PopSci search design required results to be grouped by content type, with tabs allowing the results to be re-sorted by Most Relevant, Most Recent, Most Viewed, Top Rated, and Most Commented. On top of that, users needed to be able to subscribe to RSS feeds of the results.

We achieved this functionality by developing an extended version of Drupal's core search, displaying the various results in blocks of paginated content, with AJAX tabsets to access other sortings of the results.
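
The grouping and sorting behaviour can be sketched as below. This Python example is illustrative only and is not the extended search module: the result fields (`type`, `score`, `created`, `views`, `rating`, `comments`) and the function name are hypothetical.

```python
from collections import defaultdict

# Map each tab to the (hypothetical) result field it sorts on.
SORTS = {
    "most_relevant": "score",
    "most_recent": "created",
    "most_viewed": "views",
    "top_rated": "rating",
    "most_commented": "comments",
}


def group_results(results, sort="most_relevant", page=0, per_page=10):
    """Group results by content type, sort each group, return one page of each."""
    field = SORTS[sort]
    groups = defaultdict(list)
    for result in results:
        groups[result["type"]].append(result)
    pages = {}
    for content_type, items in groups.items():
        ordered = sorted(items, key=lambda r: r[field], reverse=True)
        pages[content_type] = ordered[page * per_page:(page + 1) * per_page]
    return pages
```

Switching tabs then amounts to calling the same function with a different `sort` value, which suits an AJAX request per tab.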

Each search is also cached, given a hashed ID, and associated with the user who performed it, so that searches can be saved for future reference.
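
A minimal sketch of this caching scheme, in Python, might look like the following. The choice of SHA-1, the query normalization, and the names `search_id` and `SearchCache` are assumptions for illustration; the text does not say how the hash is computed or where results are stored.

```python
import hashlib


def search_id(keywords, sort="relevant"):
    """Derive a stable hashed ID from the normalized query and sort order."""
    normalized = " ".join(keywords.lower().split())
    raw = "%s|%s" % (normalized, sort)
    return hashlib.sha1(raw.encode("utf-8")).hexdigest()


class SearchCache:
    def __init__(self):
        self.results = {}  # search ID -> cached results
        self.saved = {}    # user ID -> list of search IDs

    def search(self, user_id, keywords, run_query, sort="relevant"):
        """Return (search ID, results), running the query only on a cache miss."""
        sid = search_id(keywords, sort)
        if sid not in self.results:
            self.results[sid] = run_query(keywords, sort)
        history = self.saved.setdefault(user_id, [])
        if sid not in history:
            history.append(sid)
        return sid, self.results[sid]
```

Because the ID depends only on the query and the sort order, two users running the same search share one cached result set while each keeps a separate list of saved searches.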

In many instances the design comps we received called for a nested set of tabs that could filter the content displayed on a particular page. This was largely handled by the Tabs component of the Javascript Tools module. However, the large tabbed datasets displayed on each of the main category pages and in searches needed a custom-coded solution in order to remain responsive with larger amounts of data.

Naturally, there is a hefty selection of hardware powering the Popular Science website, but the true performance winner of this project was the Memcache module which integrates Drupal with Memcached and the PECL Memcache library. Out of the box, this module worked extremely well for us, with the exception of path aliases: A full page load was generating as many as 700 queries to determine path aliases. Pulling these queries through Memcache gave us the speed we needed to maintain an initial average load of approximately 60-70 page views per second.
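
The effect of caching alias lookups can be illustrated with a small cache-aside sketch in Python. It is not the Memcache module's code: a plain dictionary stands in for Memcached, and `AliasLookup`, `query_database`, and the `alias:` key prefix are hypothetical.

```python
class AliasLookup:
    """Cache-aside lookup of path aliases; a dict stands in for Memcached."""

    def __init__(self, query_database, cache=None):
        self.query_database = query_database  # slow path: one query per call
        self.cache = {} if cache is None else cache
        self.queries = 0  # counts how often the database was actually hit

    def alias_for(self, system_path):
        key = "alias:" + system_path
        if key in self.cache:
            return self.cache[key]
        self.queries += 1
        alias = self.query_database(system_path)
        self.cache[key] = alias
        return alias
```

In this sketch, repeated lookups of the same path cost one database query in total; with a shared cache such as Memcached, that saving also carries across page loads and web servers.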

