Using the Google Natural Language API with Drupal

If you haven’t heard the phrase “natural language processing” by now, you soon will. Natural language processing is an expanding and innovative use of technology to analyze large amounts of data or content and derive meaning from it, short-cutting the tremendous manual effort it would take to do that ourselves. It’s been around in some form for quite a while, but it was often relegated to complex enterprise systems or large corporations with a vested interest in mining huge amounts of data for patterns in, for example, consumer purchasing trends or social media behavior. It’s a cool idea (it’s a form of artificial intelligence, after all) and it fuels a lot of our online experience now, whether it’s product recommendations, content recommendations, targeted ads, or interactive listening services like Siri or Alexa. What’s even better is that this technology is becoming more and more accessible for our own software solutions, because many of the companies behind it now offer it as a service with an API. That lets us provide a more personalized or meaningful experience for site visitors on web projects that don’t have the budget or requirements to justify building natural language processing from scratch.

One such use case, and the one we’re talking about today, is using the Google Natural Language API on our own Drupal sites. We can use it to analyze our own site content and even autotag that content based on a common taxonomy. We really dig integrations here at Ashday, so we’ve just released two new Drupal modules to help you get hooked up with Google’s service: the Google NL API module and the Google NL Autotag module.

The Google NL API Module

This module is intended to be your starter module to get things going. It provides functionality to connect to Google's Natural Language API and run analysis on text, including sentiment, entities, syntax, entity sentiment and content classification. It exposes a service with a method for each type of analysis and leaves it up to you to decide what to do with the results. All you need to get going is a Google NL API account. Full details of installation and usage can be found here.
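
For orientation, here is a minimal Python sketch of what a request to the underlying REST API looks like, independent of the Drupal module. The helper name build_request and the API key value are hypothetical, and the endpoint path and request body follow Google's public v1 REST interface as we understand it, so check the current documentation before relying on them. The sketch only builds the request and does not send anything.

```python
import json

# Base URL of the v1 REST interface (assumption: confirm against Google's docs).
BASE_URL = "https://language.googleapis.com/v1/documents"

# The five analyses listed above, mapped to their REST method names.
METHODS = {
    "sentiment": "analyzeSentiment",
    "entities": "analyzeEntities",
    "syntax": "analyzeSyntax",
    "entity_sentiment": "analyzeEntitySentiment",
    "classification": "classifyText",
}


def build_request(analysis, text, api_key):
    """Return (url, json_body) for one analysis. Hypothetical helper; sends nothing."""
    method = METHODS[analysis]
    url = f"{BASE_URL}:{method}?key={api_key}"
    body = {"document": {"type": "PLAIN_TEXT", "content": text}}
    # classifyText takes only a document; the other methods also accept an encoding.
    if analysis != "classification":
        body["encodingType"] = "UTF8"
    return url, json.dumps(body)
```

The returned URL and body could then be sent as a POST with any HTTP client. The Drupal module wraps this kind of call behind its service, so you would not normally write it yourself.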

Here is a brief outline of what each method does, as described in Google’s documentation.

Sentiment Analysis

“Sentiment Analysis inspects the given text and identifies the prevailing emotional opinion within the text, especially to determine a writer's attitude as positive, negative, or neutral. Sentiment analysis is performed through the analyzeSentiment method.” (from Analyzing Sentiment)
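
The API reports sentiment as a score (from -1.0 for negative to 1.0 for positive) plus a non-negative magnitude that reflects the overall strength of emotion in the text. A rough Python sketch of turning those two numbers into a label might look like the following. The function name and both cutoff values are illustrative assumptions, not values from Google or from the module, and you would want to tune them for your own content.

```python
def label_sentiment(score, magnitude, score_cutoff=0.25, mixed_magnitude=2.0):
    """Map a sentiment score and magnitude to a coarse label.

    The cutoffs are arbitrary example values (assumptions), not official ones.
    """
    if score >= score_cutoff:
        return "positive"
    if score <= -score_cutoff:
        return "negative"
    # A near-zero score with high magnitude suggests strong feelings in both
    # directions that cancel out; with low magnitude the text is simply neutral.
    return "mixed" if magnitude >= mixed_magnitude else "neutral"
```

For example, a score of 0.0 with a magnitude of 4.0 would be labeled "mixed" under these cutoffs, while the same score with a magnitude of 0.1 would be labeled "neutral".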

Entity Analysis

“Entity Analysis inspects the given text for known entities (proper nouns such as public figures, landmarks, etc.), and returns information about those entities. Entity analysis is performed with the analyzeEntities method.” (from Analyzing Entities)

Syntax Analysis

“While most Natural Language API methods analyze what a given text is about, the analyzeSyntax method inspects the structure of the language itself. Syntactic Analysis breaks up the given text into a series of sentences and tokens (generally, words) and provides linguistic information about those tokens.” (from Analyzing Syntax)

Entity Sentiment Analysis 

“Entity Sentiment Analysis combines both entity analysis and sentiment analysis and attempts to determine the sentiment (positive or negative) expressed about entities within the text. Entity sentiment is represented by numerical score and magnitude values and is determined for each mention of an entity. Those scores are then aggregated into an overall sentiment score and magnitude for an entity.” (from Analyzing Entity Sentiment)

Content Classification 

“Content Classification analyzes a document and returns a list of content categories that apply to the text found in the document.” (from Classifying Content)

The Google NL Autotag Module

This module is the first step in actually doing something with the natural language analysis results provided by the API module. It provides a Google NL Autotag taxonomy that you can attach to whichever content types you choose. Whenever that content is saved, the module automatically creates the relevant taxonomy terms and relates the content to them. A few clicks are all you need to have a nice auto-classification system in use on your site. You can configure which text-based fields on your content are used for the analysis, and you can set the confidence threshold, which is the minimum confidence at which you accept a Google classification as valid. For example, Google may say that the content matches the category /Home & Garden/Bed & Bath/Bathroom with a confidence of 0.4 (on a scale of 0 to 1). You decide what confidence level is good enough to categorize your content, since different use cases may justify different approaches. A full list of Google’s content categories can be found here.
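
To make the threshold idea concrete, here is a small Python sketch that filters classification results by confidence and splits each category path into term names. It is an illustration of the concept only: the function name, the default threshold, and the sample confidence values are assumptions, and this is not how the module itself is implemented.

```python
def terms_for_content(categories, threshold=0.5):
    """Keep categories at or above the threshold and split their paths into terms.

    categories: list of {"name": str, "confidence": float} dicts, the shape we
    assume the classification result takes.
    """
    terms = []
    for category in categories:
        if category["confidence"] < threshold:
            continue  # Below the cutoff: treat the classification as not valid.
        # "/Home & Garden/Bed & Bath/Bathroom" -> ["Home & Garden", "Bed & Bath", "Bathroom"]
        path = [part for part in category["name"].split("/") if part]
        terms.append(path)
    return terms


# Illustrative sample values only, not real API output.
response_categories = [
    {"name": "/Home & Garden/Bed & Bath/Bathroom", "confidence": 0.4},
    {"name": "/Home & Garden", "confidence": 0.9},
]
```

With a threshold of 0.5, only the second sample category survives; lowering the threshold to 0.4 would also accept the Bathroom category from the example above.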

That’s all this module essentially does, but it’s meant to be a simple solution for sites to easily start benefiting from Google’s natural language services. You can, of course, extend the service or add your own functionality to use these APIs however you find beneficial because it’s Drupal 8, and Drupal 8 rocks when it comes to flexibility. So install these modules and start tinkering and don’t hesitate to ask if you have any questions or suggestions for future functionality.
