
Drupal: Using QueueWorkers to Sync Multiple Data Sources

Profiles on our site can get their data from two external data sources: a directory connected via LDAP to ActiveDirectory for handling contact information, and a course catalog connected to Banner for lists of courses taught by instructors. We want to keep this data fresh, but don’t want to load it on demand since we’re caching the page content for performance. But there are ~2,500 profiles throughout our sites, so hitting the APIs for the external services to fetch this data on a single cron run would cause them to crash.

Fortunately, Drupal 8 added a QueueWorker class that allows us to easily batch the updates and, using a custom hook, ensure that each entity in the batch gets updates from all the external sources, in case we ever add more to handle event listings or other data.

As a note, our Profile content type in Drupal contains two text fields that aren’t shown on the rendered page. The first, field_external_id, contains a 32 character hexadecimal string that is a unique, persistent identifier for the individual. Everyone at Middlebury has one of these which allows for quick records access across platforms. The second, field_profile_sync, contains a timestamp we can use to ensure we’re not hammering the remote services to fetch the same data over and over again.

/**
 * Implements hook_cron().
 */
function middlebury_profile_sync_cron() {
  $queueFactory = \Drupal::service('queue');
  $queue = $queueFactory->get('middlebury_profile_update');

  $now = new \DateTime();
  $ttl = new \DateInterval('P1D');
  $cutOff = $now->sub($ttl);

  $storage = \Drupal::service('entity_type.manager')->getStorage('node');
  $query = $storage->getQuery()
    ->condition('type', 'profile')
    ->exists('field_external_id.value')
    ->condition('field_profile_sync', $cutOff->getTimestamp(), '<');
  $nids = $query->execute();
  if (!empty($nids)) {
    foreach ($nids as $nid) {
      $item = new \stdClass();
      $item->nid = $nid;
      $queue->createItem($item);
    }
  }
}

Our main module’s hook_cron() sets up the QueueWorker, fetches all of the profiles which have an external id and have not been updated in the last 24 hours, and adds them to the queue. This just puts their node ids into a database table. The cron run will batch through these nodes during its execution for a pre-determined number of seconds and then leave whatever is left in the batch until its next run.
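To make the time-boxed behaviour concrete, here is a minimal sketch in plain PHP of how a queue can be drained until a time budget runs out. It is not Drupal's implementation: the function name `process_queue_with_budget()` and the array-backed queue are assumptions made for illustration, and the real queue lives in a database table.

```php
<?php

/**
 * Processes queued items until the queue is empty or the time budget runs out.
 *
 * Hypothetical helper for illustration only; an array stands in for the queue.
 *
 * @return int
 *   Number of items processed in this run.
 */
function process_queue_with_budget(array &$queue, callable $worker, float $budgetSeconds): int {
  $deadline = microtime(TRUE) + $budgetSeconds;
  $processed = 0;
  // Check the clock before taking each item, so no item is taken and dropped.
  while ($queue && microtime(TRUE) < $deadline) {
    $item = array_shift($queue);
    $worker($item);
    $processed++;
  }
  return $processed;
}

// Three queued node ids; the worker here only records what it was given.
$queue = [101, 102, 103];
$seen = [];
$count = process_queue_with_budget($queue, function ($nid) use (&$seen) {
  $seen[] = $nid;
}, 15.0);
```

Anything still in `$queue` when the budget expires simply waits for the next run, which is the property we rely on to avoid hitting the external services all at once.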

<?php

namespace Drupal\middlebury_profile_sync\Plugin\QueueWorker;

use Drupal\Core\Cache\CacheTagsInvalidatorInterface;
use Drupal\Core\Entity\EntityStorageInterface;
use Drupal\Core\Extension\ModuleHandler;
use Drupal\Core\Plugin\ContainerFactoryPluginInterface;
use Drupal\Core\Queue\QueueWorkerBase;
use Symfony\Component\DependencyInjection\ContainerInterface;

/**
 * Provides base functionality for the Profile Queue Workers.
 *
 * @QueueWorker(
 *   id = "middlebury_profile_update",
 *   title = @Translation("Middlebury Profile Update"),
 *   cron = {"time" = 15}
 * )
 */
class ProfileUpdate extends QueueWorkerBase implements ContainerFactoryPluginInterface {

  /**
   * The node storage.
   *
   * @var \Drupal\Core\Entity\EntityStorageInterface
   */
  protected $nodeStorage;

  /**
   * Cache invalidator service.
   *
   * @var \Drupal\Core\Cache\CacheTagsInvalidatorInterface
   */
  protected $cacheInvalidator;

  /**
   * The module handler.
   *
   * @var \Drupal\Core\Extension\ModuleHandler
   */
  protected $moduleHandler;

  /**
   * Creates a new ProfileUpdate object.
   *
   * @param \Drupal\Core\Entity\EntityStorageInterface $node_storage
   *   The node storage.
   * @param \Drupal\Core\Cache\CacheTagsInvalidatorInterface $cache_invalidator
   *   The cache invalidator service for marking cache invalid so that pages
   *   are notified when profiles are updated.
   * @param \Drupal\Core\Extension\ModuleHandler $module_handler
   *   The module handler.
   */
  public function __construct(EntityStorageInterface $node_storage, CacheTagsInvalidatorInterface $cache_invalidator, ModuleHandler $module_handler) {
    $this->nodeStorage = $node_storage;
    $this->cacheInvalidator = $cache_invalidator;
    $this->moduleHandler = $module_handler;
  }

  /**
   * {@inheritdoc}
   */
  public static function create(ContainerInterface $container, array $configuration, $plugin_id, $plugin_definition) {
    return new static(
      $container->get('entity_type.manager')->getStorage('node'),
      $container->get('cache_tags.invalidator'),
      $container->get('module_handler')
    );
  }

  /**
   * {@inheritdoc}
   */
  public function processItem($data) {
    $node = $this->nodeStorage->load($data->nid);

    // Drupal core tags cached node output as "node:<nid>".
    $this->cacheInvalidator->invalidateTags(['node:' . $data->nid]);

    if ($node) {
      $changes = [];
      $this->moduleHandler->invokeAll('midd_update_profile_external', [
        $node,
        &$changes,
      ]);
      if (count($changes) > 0) {
        $node->field_profile_sync = strtotime('now');
        $node->isSavedByCron = TRUE;
        return $node->save();
      }
    }
  }

}

The cron = {"time" = 15} entry in the plugin annotation is where we tell cron not to spend more than 15 seconds processing the queue. Further down, in processItem(), the Profile node is loaded based on the id from the database table. We then invalidate its cache to ensure that people will see the latest version of the profile data.

Inside the if ($node) check, beginning with $changes = [], we track whether any changes were returned from the external data sources. Since we need to save the node to make changes, and that save action will create a new revision row in the database, we only want to perform this action if actual changes have been made to course lists or contact information so we don’t bloat the database. The $changes[] array is therefore passed by reference through the hook so that any external source can add its information to the variable.
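The by-reference pattern can be sketched without Drupal at all. In this hypothetical example, `invoke_all_sources()` stands in for the module handler and the two closures stand in for hook implementations; none of these names come from our modules.

```php
<?php

/**
 * Calls every implementation with the same $changes array, by reference.
 *
 * Hypothetical stand-in for a module handler, for illustration only.
 */
function invoke_all_sources(array $implementations, array $node, array &$changes): void {
  foreach ($implementations as $implementation) {
    // Each closure declares $changes by reference, so its appends persist.
    $implementation($node, $changes);
  }
}

$implementations = [
  // Pretend directory source: reports a changed email.
  function (array $node, array &$changes): void {
    $changes[] = 'email';
  },
  // Pretend course source: nothing changed, so it adds nothing.
  function (array $node, array &$changes): void {
  },
];

$changes = [];
invoke_all_sources($implementations, ['nid' => 42], $changes);

// Only save (and create a revision) when at least one source reported a change.
$needsSave = count($changes) > 0;
```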

If changes have been made, we update the timestamp the profile was last synced and note that we’re using cron to save it, then go ahead and perform the save action. The $node->isSavedByCron flag is important so we can differentiate this action from a normal save done by an editor in the Drupal UI.

/**
 * Implements hook_node_presave().
 */
function middlebury_profile_sync_node_presave($node) {
  if (empty($node->isSavedByCron) && $node->getType() == 'profile' && !empty($node->field_external_id->value)) {
    $changes = [];
    $node->field_profile_sync = strtotime('now');
    \Drupal::moduleHandler()->invokeAll('midd_update_profile_external', [
      $node,
      &$changes,
    ]);
  }
}

Since we know whether a save is being triggered by cron, we can safely add an invocation of our custom hook within hook_node_presave(), as without that knowledge we’d end up in a loop.

/**
 * Implements hook_midd_update_profile_external().
 */
function middlebury_profile_sync_midd_update_profile_external($node, &$changes) {
  $storage = \Drupal::service('entity_type.manager')->getStorage('profile_source');
  $sources = $storage->loadMultiple($storage->getQuery()->execute());
  foreach ($sources as $source) {
    $changed = middlebury_profile_sync_sync_profile($source, $node);
    foreach ($changed as $change) {
      $changes[] = $change;
    }
  }
}

/**
 * Implements hook_midd_update_profile_external().
 */
function middlebury_courselist_midd_update_profile_external($node, &$changes) {
  if ($node->getType() == 'profile' && $node->field_show_courses->value) {
    $changed = \Drupal::service('middlebury_courselist.profile_sync')->syncNode($node);
    foreach ($changed as $change) {
      $changes[] = $change;
    }
  }
}

The implementations of our custom hook in their respective modules are fairly straightforward. They load the connection information of the external API(s) and then call their local function or service to sync the data. Changes are safely merged into the $changes[] array so we can know whether a save action needs to be performed on the node.

/**
 * Update a profile.
 *
 * @param \Drupal\middlebury_profile_sync\ProfileInterface $profile
 *   The Profile data.
 * @param \Drupal\node\Entity\Node $node
 *   The profile node.
 */
function middlebury_profile_sync_update_profile(ProfileInterface $profile, Node $node) {
  $changed_fields = [];

  // Email.
  if (empty($node->field_override_email->value) && $profile->hasEmail()) {
    if ($node->field_email->value != $profile->getEmail()) {
      $changed_fields[] = t('email');
    }
    $node->field_email = $profile->getEmail();
  }

  // ... snip ...

  // Job title.
  if (empty($node->field_override_job_title->value) && $profile->hasJobTitle()) {
    if ($node->field_job_title->value != $profile->getJobTitle()) {
      $changed_fields[] = t('job_title');
    }
    $node->field_job_title = $profile->getJobTitle();
  }

  return $changed_fields;
}

The actual implementations of fetching and comparing the field data are bespoke to our environment, but you can see here we have a function that goes through each field returned by the external source, compares it to the data currently stored in the node, and notes a change if necessary.
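The repeated compare-then-assign blocks could also be driven by a loop over a field list. The sketch below is one way to do that, assuming plain arrays in place of the node and the remote profile; `sync_fields()` and the `$overrides` map are hypothetical names, not part of our module.

```php
<?php

/**
 * Copies remote values onto a local record and lists the fields that changed.
 *
 * Fields flagged in $overrides are left alone, mirroring an editor override.
 * Hypothetical helper; arrays stand in for the node and the remote profile.
 */
function sync_fields(array &$local, array $remote, array $overrides = []): array {
  $changed = [];
  foreach ($remote as $field => $value) {
    // Skip overridden fields and fields the remote source did not supply.
    if (!empty($overrides[$field]) || $value === NULL) {
      continue;
    }
    if (($local[$field] ?? NULL) !== $value) {
      $changed[] = $field;
      $local[$field] = $value;
    }
  }
  return $changed;
}

$local = ['email' => 'old@example.com', 'job_title' => 'Professor'];
$remote = ['email' => 'new@example.com', 'job_title' => 'Professor'];
$changed = sync_fields($local, $remote);
```

Unlike the function above, this version only assigns a value when it differs, which makes no difference to the result but keeps the change list and the writes in one place.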

Drupal: Sharing Inline Block Markup Between Custom Themes and Layout Builder Editing

Drupal 9’s layout builder system is a great improvement on how we’ve long been building content in Drupal and, combined with custom block types, is a sufficient replacement for modules like Paragraphs. But one of the frustrations I’ve run into while using it is that my custom theme styles get in the way of the editing UI, preventing editors from being able to access or read some of the controls they need to do their work.

Fortunately, the layout_builder_admin_theme module solves this problem and lets us use an admin theme like seven or claro when editing with layout builder, so all the editing controls appear where and as they should. However, my custom templates for inline blocks weren’t rendering with this enabled since I kept them in my custom theme directory, and my custom theme was no longer being used when editing.

Here’s how to move those templates into a custom module so that they’re loaded in both the site your visitors see as well as when editors are working with them.

First, create a custom module and add hook_theme to let Drupal know that you’ll be supplying templates for these blocks.

<?php

/**
 * Implements hook_theme().
 */
function MY_MODULE_theme() {
  return [
    'block__inline_block__MACHINE_NAME' => [
      'render element' => 'elements',
      'base hook' => 'block',
    ],
  ];
}

Now you can create a templates directory within your module and place a block--inline-block--MACHINE-NAME.html.twig template file there which will be picked up when the block is shown with your custom theme or the admin theme enabled.

{% if content %}
  {% set fields = content.content ? content.content : content %}
  <div{{ attributes.addClass(["block--MACHINE-NAME"]) }}>
    {{ title_prefix }}
    {{ title_suffix }}
    {% if content.actions %}
      {{ content.actions }}
    {% endif %}
    {% if fields.field_body %}
      <div class="block--MACHINE-NAME--body">{{ fields.field_body }}</div>
    {% endif %}
  </div>
{% endif %}

The content object will work as normal when viewed in your custom theme, but when the admin theme is shown it will instead be an array with two items: content.content is the same object as content in the custom theme, while content.actions contains some editing controls for positioning the block. To get around this inconsistency, we create a new variable, fields, which will have all the normal field content in it. We then ensure that content.actions is printed at the start of the block, right after title_prefix and title_suffix, which are also needed (along with attributes on a wrapper element) for the layout builder editing UI to work.

Importing Files into Drupal 8 media_entity Bundles

The media_entity module in Drupal 8 (and its eventual inclusion in core) is great for managing file assets so that they can be reused, have fields, and be included in nodes with the paragraphs module. For a new project, we’re importing faculty profile photos from a remote database into Drupal and want to be able to track the files locally so that we can take advantage of all these features, plus image styles and responsive images.

I found several examples that covered parts of this, but no single example that did everything I needed it to do, so this is a write-up of what I ended up using to solve this problem.

First, I wanted to create a separate directory inside the files folder to hold the images that this module would manage. I would like to be able to use file_prepare_directory() for this task, but in testing I wasn’t able to get it to create a directory with sufficient permissions. Calling file_prepare_directory() later in my procedural code would return false. So instead I’m using FileSystem::mkdir() which lets me pass the mode. This is based on the install process from the Libraries API module.


<?php

/**
 * @file
 * Contains install, uninstall and update functions for Middlebury Profile Sync.
 */

/**
 * Implements hook_install().
 */
function middlebury_profile_sync_install() {
  $directory = file_default_scheme() . '://middlebury_profile_sync';
  // This would ideally use file_prepare_directory(), but that doesn't create
  // the directory with sufficient write permissions and a check against the
  // function in the module code later fails.
  if (!is_dir($directory)) {
    $file_system = \Drupal::service('file_system');
    $file_system->mkdir($directory, 0777);
  }
}
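For comparison, here is a rough sketch of the same create-with-an-explicit-mode idea using only native PHP functions, outside Drupal. The helper name `ensure_writable_directory()`, the 0775 default and the temp-directory path are all assumptions for the example, not what the module uses.

```php
<?php

/**
 * Creates a directory if needed and reports whether it is writable.
 *
 * Hypothetical helper using only native PHP functions.
 */
function ensure_writable_directory(string $directory, int $mode = 0775): bool {
  if (!is_dir($directory) && !mkdir($directory, $mode, TRUE) && !is_dir($directory)) {
    return FALSE;
  }
  // mkdir() applies the process umask, so set the mode explicitly afterwards.
  @chmod($directory, $mode);
  return is_writable($directory);
}

// Example path under the system temp directory, unique per run.
$directory = sys_get_temp_dir() . '/profile_sync_example_' . uniqid();
$ready = ensure_writable_directory($directory);
```

Checking is_writable() right after creation gives an early signal of the kind of permission problem described above, rather than discovering it later when a file write fails.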

Next, I use system_retrieve_file() to import the file from the remote URL, create a File object for it and track it in Drupal’s managed files. There are a lot of examples out there that would have you use the httpClient class manually to fetch the file, or use standard PHP file functions and create the File object by hand, but I found using this function to be much simpler for what I was trying to do.

I then created the Media bundle based on the File I had just created. This code is based on this gist example. Lastly, I assign the Media object’s id to the Featured Image field of my node and then save it. I found this example of assigning media ids to fields to be helpful.


<?php

// Assumes the contrib media_entity module's Media entity class.
use Drupal\media_entity\Entity\Media;

$photo_dir = file_default_scheme() . '://middlebury_profile_sync';
if ($profile->hasPhoto() && file_prepare_directory($photo_dir)) {
  $photo = $profile->getPhoto();
  $id = $profile->getId();
  $destination = $photo_dir . '/' . $id . '.' . pathinfo($photo, PATHINFO_EXTENSION);
  // Download the remote photo and register it as a managed file.
  $file = system_retrieve_file($photo, $destination, TRUE, FILE_EXISTS_REPLACE);
  // system_retrieve_file() returns FALSE when the download fails.
  if ($file) {
    // Use a placeholder so the translatable string stays constant.
    $label = t('Photo of @name', ['@name' => $node->title->value]);
    $media = Media::create([
      'bundle' => 'image',
      'uid' => \Drupal::currentUser()->id(),
      'langcode' => \Drupal::languageManager()->getDefaultLanguage()->getId(),
      'status' => Media::PUBLISHED,
      'field_image' => [
        'target_id' => $file->id(),
        'alt' => $label,
        'title' => $label,
      ],
    ]);
    $media->save();
    $node->field_featured_image = ['target_id' => $media->id()];
  }
}
$node->save();

Creating a RESTful JSON API in a legacy .NET 3.5 application

Our online Directory has been around in essentially the same form since the early 2000s, though I’ve rewritten it as a PHP application, then as part of Microsoft Content Management Server 2003 as a .NET 1.0 application, then as a PHP application again, and then as a .NET 1.1 application, and then upgraded it in .NET a few times. Along the way, it acquired a SOAP web service, which is accessible at https://directory.middlebury.edu/WebDirectory.asmx and is the primary interface for Directory information displayed in our Drupal sites, including profiles.

This fall, we needed to add a RESTful web service that returned some public Directory information as a JSON string for use in a mobile app. If I were doing the Directory all over from scratch it would probably be some MVC application with some nice endpoints that returned this data already, if it was in .NET, but that’s not the situation we’re in right now, so I had to find a way to hack a REST API on top of this thing.

Apparently, that’s possible using a Web Service ASMX file, which looks like a SOAP service, but with a few tweaks can act like a REST service. The gist below ends in .cs, but that’s just so the syntax highlighting looks nice. It’s really a .asmx file.


using Newtonsoft.Json;
using System;
using System.Collections.Generic;
using System.Data;
using System.DirectoryServices;
using System.Linq;
using System.Web;
using System.Web.Services;
using System.Web.Script.Services;

/// <summary>
/// Summary description for RESTfulDirectory
/// </summary>
[WebService(Namespace = "http://tempuri.org/")]
[WebServiceBinding(ConformsTo = WsiProfiles.BasicProfile1_1)]
[ScriptService]
public class RESTfulDirectory : System.Web.Services.WebService {

    public RESTfulDirectory () {
    }

    [WebMethod]
    [ScriptMethod(UseHttpGet = true, ResponseFormat = ResponseFormat.Json)]
    public void Search(string q) {
        Results results = new Results();
        results.directory = new List<Result>();

        if (q != null && q != "")
        {
            // get the results from the directory
            string search = "the LDAP search query";
            DataTable dt = Directory.query(search, "none");

            foreach (DataRow dr in dt.Rows)
            {
                Result result = new Result();
                // populate the property fields
                results.directory.Add(result);
            }
        }

        string json = Newtonsoft.Json.JsonConvert.SerializeObject(results, Newtonsoft.Json.Formatting.None, new JsonSerializerSettings { NullValueHandling = NullValueHandling.Ignore });
        HttpContext.Current.Response.Write(json);
    }
}

public class Results
{
    public List<Result> directory;
}

public class Result
{
    // properties
}

The first key is to uncomment the “[ScriptService]” attribute above the class so that it can act as an AJAX response service, if needed. Next, above the method for the service response, add the attribute “[ScriptMethod(UseHttpGet = true, ResponseFormat = ResponseFormat.Json)]”, which allows the service to accept GET requests and return a JSON response. Change the return type of the method to “void”, since we’ll just be printing our response directly, rather than returning a value that gets sent in an XML wrapper.

The inner part of my search method is mostly program logic that creates an object and populates it with information from the Directory application based on the input query and business logic about what data to return. I decided to use the JSON.NET NuGet package after trying .NET’s native JavaScriptSerializer class, which kept pegging the CPU at 100% trying to serialize the object.

The JSON.NET object serializer offers a few more advantages. You can tell it to indent the response, if you want (I chose not to), and you can tell it not to include any property in the response where the value is null. For instance, if a person doesn’t have a telephone number, rather than returning “{phone: null}”, that portion of their record will just be omitted.

Lastly, I write the JSON string directly to the HTTP response, which is why I earlier changed the method return type to void. With this in place, the service is now available at http://directory.middlebury.edu/RESTfulDirectory.asmx/Search?q=

Displaying a date field in two formats in Drupal 8

Working on the conversion of our Museum of Art site, I ran into an issue with our “event” content type where we store the date and time of the event in a single datetime field, but want to display this information in two formats in different parts of the markup. In Drupal 7, we handled this using a preprocess function to access the data and set variables, but I wanted to find a more elegant solution in Drupal 8.

One option for this would be to have a second, computed field that stores the value again so it can be output twice in the node template, allowing Drupal’s field API to format the output differently each time. I decided against this as it would require duplicating the data in the database.

Instead, I decided to format the date in the Twig template. Complicating this is that the date is stored as a string, rather than a timestamp in the database, so even if I tell the field API to format the output as a timestamp, there’s no way to directly access just that value without any of the markup.

To get around this, I set a variable in the Twig template to the value of the field piped to Twig’s date filter with the ‘U’ format, which returns a Unix timestamp. This can then be piped through the format_date filter to have the date show up as desired.


{% set timestamp = node.field_event_time.value|date('U') %}
<article{{ attributes }}>
  {% if label %}
    <header>
      <h1>{{ label }}</h1>
      <h2>{{ timestamp|format_date('museum_date') }}</h2>
    </header>
  {% endif %}
  <section class="contents">
    <p>
      {{ content.field_event_type }}<br />
      <span class="label">Time</span>: {{ timestamp|format_date('museum_time') }}
      <span class="label">Location</span>: {{ content.field_event_place }}<br />
    </p>
    {{ content }}
  </section>
</article>
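The two steps in the template (string to timestamp, then timestamp to two display formats) look roughly like this in plain PHP. The stored value, the UTC assumption and the two format patterns are made up for the example; the real museum_date and museum_time formats are configured in Drupal, which also applies the site's time zone.

```php
<?php

// Assumed stored value: an ISO 8601 string in UTC, as a datetime field might hold.
$stored = '2017-03-15T19:30:00';

// Step 1: string to integer Unix timestamp (what the 'U' format yields).
$timestamp = (int) (new DateTimeImmutable($stored, new DateTimeZone('UTC')))->format('U');

// Step 2: render the same timestamp twice with different patterns.
$moment = (new DateTimeImmutable('@' . $timestamp))->setTimezone(new DateTimeZone('UTC'));
$dateLine = $moment->format('F j, Y');
$timeLine = $moment->format('g:i a');
```

The field value is read once and only the presentation differs, so nothing needs to be stored twice.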

Always showing a teaser for lists of content in Drupal

For the main Middlebury and MIIS websites, we run a module named Monster Menus that adds hierarchical pages and permissions on top of Drupal. One of the effects of this is that we have many pages that are lists of nodes, some of which we want to display in a full view and others as teasers. A common example is a list of news articles with a simple node stickied at the top. We want the news items to show up as teasers, allowing you to click through to see one of them in full view mode, but we also want that basic node at the top of the page to show up fully.

In Drupal 6 we achieved this by creating a custom template file outside of Drupal’s normal theme hooks and adding it to the template files array in the node preprocess function[ref]In order to make our preprocess functions a bit easier to read, we separate out the functionality for each content type into its own function using this method: [/ref] within template.php [ref]arg(1) is normally the node id when viewing a Drupal URL ending in /node/123. In Monster Menus, the node id is actually arg(3) since its URLs are of the form /mm/123/node/456 where the first number is the page id. I’ve used arg(1) in the examples to keep them familiar to most Drupal developers.[/ref].

<?php
function midd_preprocess_node_news(&$vars) {
  if (arg(1) != $vars['nid']) {
    $vars['template_files'][] = 'node-newslist';
  }
}

Then we could have node-newslist.tpl.php along with node.tpl.php and show different markup. This all worked quite well in Drupal 6 because the CCK module exposed all the field data to the preprocess functions and you could manipulate it however you liked before outputting it. But it runs into trouble with Drupal 7’s field API and render arrays. You can still do it in Drupal 7, and just need to change “template_files” to “theme_hook_suggestions” in the example shown above, but I decided to go a different direction.

With Drupal 7’s new entity API, you can update how the render array is built based on the node’s metadata in hook_entity_view_mode_alter(). This has the advantage that the node now is rendered as a teaser and the fields set for that display mode in the admin interface show up as you’d normally expect them to, in the order defined, with the wrapper elements specified, and using the standard theme hook suggestion of node__TYPE__teaser.tpl.php.

<?php
/**
 * Implements hook_entity_view_mode_alter().
 */
function midd_entity_view_mode_alter(&$view_mode, $context) {
  if ($context['entity_type'] == 'node' &&
    $context['entity']->nid != arg(1) &&
    $context['entity']->type == 'news') {
    $view_mode = 'teaser';
  }
}