Aug 01 2018

This is part two of a two-part series.

In part one, we cogently convinced you that regardless of what your organization does and what functionality your website has, it is in your best interest to serve your website securely over HTTPS exclusively. Here we provide some guidance as to how to make the move.

How to transition to HTTPS

To fully transition to HTTPS from HTTP means to serve your website HTML and assets exclusively over HTTPS. This requires a few basic things:

  • A digital certificate from a certificate authority (CA)
  • Proper installation of the certificate on your website’s server
  • Ensuring all assets served from your website are served over HTTPS

Let’s break these down.

Acquire a digital certificate

As briefly discussed in part one, to implement HTTPS for your website you must procure a digital certificate from a certificate authority. Just as domain name registrars lease domain names, CAs lease digital certificates for a set time period. Each certificate has a public and a private component. The public component is freely shared and allows browsers to recognize that a “trusted” CA distributed the certificate; it is also used to encrypt data transmitted from the browser. The private component is shared only with the purchaser of the certificate, and can uniquely decrypt data encrypted by the public key. CAs use various methods, through email or DNS, to “prove” that the person who purchased the certificate is a rightful administrator of the domain for which they purchased it. Once you’ve received the private key associated with the certificate, you’ll need to install it on your server. Annual certificate costs can be as much as $1,000 or as little as nothing. More on that in a moment.
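
If you go the manual route, the first artifact most CAs ask for is a certificate signing request (CSR) generated against a private key that only you hold. Here is a minimal sketch with OpenSSL; the domain and file names are placeholders:

# Generate a 2048-bit private key and a CSR to submit to the CA.
openssl req -new -newkey rsa:2048 -nodes \
  -keyout example.com.key \
  -out example.com.csr \
  -subj "/CN=example.com"
# Keep example.com.key secret; the certificate the CA issues pairs with it.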

Install the certificate


Installing an HTTPS certificate manually on a server is not a trivial engineering task (we explain this at a high level in part one); it requires someone who is experienced and comfortable administering servers. There are many different installation methods unique to each permutation of server software and hosting platform, so I won’t expend any real estate here attempting to lay them all out. If you have to do a manual installation, it’s best to search your hosting provider’s documentation. However, depending on the complexity of your website architecture, there are ways to ease this process. Some hosting platforms have tools that substantially simplify the installation process. More on that in a moment as well.
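
Just to give a flavor of what is involved, on a typical nginx setup the final steps after placing the certificate and key on the server might look something like this (the exact commands depend entirely on your stack, and the domain is a placeholder):

# Validate the configuration that references the new certificate, then reload.
sudo nginx -t && sudo systemctl reload nginx
# Confirm the site now answers over HTTPS.
curl -sI https://www.example.com/ | head -n 1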

Serve all resources over HTTPS: avoid mixed content

Once you’ve installed your certificate, it’s time to ensure all assets served from your pages are served over HTTPS. Naturally, this entire process should be completed in a staging environment before making the switch in production. Completing a full transition to HTTPS requires attention to detail and diligence. “Mixed content”, serving assets from a page over both HTTP and HTTPS, is insidious and can be tedious to rectify. The longer your site has been around, the more content there is, and the more hands have historically been in the pot of content creation, the more work it will take to make the switch. Depending on your platform (CMS or otherwise) and how it was designed, there may have been many avenues for different stakeholders to include assets within a page over time. Developers, site admins, and content editors usually all have the ability to add assets to a page. If any assets are referenced starting with http://, they’ll need to be updated to https:// to prevent mixed content warnings.
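
A couple of quick, low-tech checks can help you start that audit. The commands below are only a sketch: the page URL is a placeholder, and the Drupal table and column names are assumptions that vary with your field configuration:

# Spot-check a rendered page for hard-coded http:// assets.
curl -s https://www.example.com/ | grep -Eo '(src|href)="http://[^"]+"' | sort -u

# On a Drupal 7 site, search stored body text for insecure embeds as well.
drush sql-query "SELECT entity_id FROM field_data_body WHERE body_value LIKE '%src=\"http://%'"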

We recently helped a client who had been publishing articles at a high cadence for over 10 years, with many different stakeholders involved over that period. Practices weren’t consistent, and uncovering all the ways in which HTTP resources were served from the page was a substantial undertaking. Be prepared for a time investment here: there may be many areas to audit to ensure all assets on all your pages are served over HTTPS. Some common ways mixed-content HTTP assets end up on a site whose HTML is served over HTTPS are:

  • Hard-coding a resource: e.g. http://www.example.com/img/insecure-image.jpg
  • Referencing a 3rd-party library or ad network over HTTP: e.g. http://ads.example-network.com/js/analytics.js
    • This is common for libraries that haven’t been updated in a while. Almost all of them now serve the same assets over the same path securely via HTTPS.

Even practices that were previously encouraged by Paul Irish, a leading web architect for the Google Chrome team, may have contributed to your mixed content problem, so don’t feel bad. Just know, there will likely be work to be done.

The risk of mixed content

These “not secure” bits of mixed-content expose the same risk that your HTML does when served over HTTP, so browsers rightfully show the user that the experience on your site is “not secure”.

Mixed content is categorized in two ways: active and passive. A typical passive mixed-content asset is an image; it doesn’t interact with the page but is merely presented on it. Typical active mixed-content assets are JavaScript files or stylesheets, since their purpose is to manipulate the page. Passive mixed content, albeit categorically less severe than active, still gives an attacker opportunities to learn a lot about an individual’s browsing patterns and even trick them into taking actions they didn’t intend to take on the page. Therefore passive mixed content still constitutes enough of a threat for the browser to display a warning.

mixed content browser errors

If an active asset such as a JavaScript file is compromised, an attacker can take full control of the page and any interaction you have with it, like entering passwords or credit card information. The mechanism behind this is a somewhat sophisticated man-in-the-middle attack, but suffice it to say that if the browser recognizes the vulnerability, the best case is the poor user experience we discussed in part one and the worst is total data compromise. Your audience, and by association your organization, will be seeing red.
Chrome misconfigured HTTPS

The good news about moving to HTTPS

Ensuring your website serves assets exclusively over HTTPS is not as hard as it used to be, and is getting easier by the day.

There are free digital certificates

There’s no such thing as a free lunch, but a free certificate from a reputable CA? It would seem so. People are just giving them away these days… Seriously: many years in the making since its founding by two Mozilla employees in 2012, the Let’s Encrypt project has vowed to make the web a secure space and has become a trusted CA that charges nothing for the certificates it provides. Its certificates have shorter lease cycles (they’re valid for 90 days, with renewal recommended around 60), but it also offers tooling that automates renewing and reinstalling them.
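
For example, on a server you administer yourself, a Let’s Encrypt certificate can be issued and renewed with Certbot. This is a sketch that assumes the nginx plugin is installed and uses a placeholder domain:

# Obtain and install a certificate for the listed domains.
sudo certbot --nginx -d example.com -d www.example.com
# Renewal is typically automated; verify it will work with a dry run.
sudo certbot renew --dry-run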

There are easier and cheaper ways to install certificates

With the advent of the aforementioned free certificates, many platform-as-a-service (PaaS) hosting options have incorporated low-cost or free installation of certificates into their platforms, sometimes as easily as a few clicks. Let’s Encrypt has been adopted across a broad range of website hosting providers like Squarespace, GitHub Pages, and Dreamhost, all of which we use alongside many others.

For many of our Drupal partners, we prefer to use a PaaS hosting option like Pantheon, Acquia, or Platform.sh. Both Pantheon and Platform.sh now provide a free HTTPS upgrade for all hosting plans; Acquia Cloud, another popular Drupal PaaS, is still a bit behind in this regard. We have found that the efficiency gained by spending less time on server administration translates to more value for our clients, freeing additional effort for the strategy, design, and development for which they hired us. In addition to efficiency, the reliability and consistency provided by finely tuned PaaS offerings are, in most cases, superior to a manual installation.

A good example of hosting platforms maturing into the HTTPS-everywhere world is our own Jekyll-based site, which we’ve written about and presented on before. We first set up HTTPS for GitHub Pages using CloudFlare, guided by this tutorial, since we found it necessary to serve our site over HTTPS. About a year later, GitHub announced they would provide HTTPS support for GitHub Pages directly.

Similarly, we had previously implemented Pantheon’s workaround to make HTTPS on all of their tiers accessible to our clients on their hosting platform. Then they announced HTTPS for all sites. We’re thankful both have gotten easier.

There are tools to help with the transition to HTTPS

Through its powerful Lighthouse suite, Google has a tool to help audit and fix mixed content issues. Given the aforementioned tedium and potential difficulty of tracking down all the ways in which people have historically added content to your site, this can be an invaluable time saver.
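
Lighthouse also runs from the command line, which is handy for auditing a staging site before the cutover. A rough sketch with a placeholder URL; the report’s HTTPS/best-practices audits should list any requests still made over http://:

npm install -g lighthouse
lighthouse https://staging.example.com --output html --output-path ./lighthouse-report.html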

You can also use tools like Qualys SSL Labs to verify the quality of your HTTPS installation. See how our site stacks up.

Wrap-up

Given the ease with which many modern hosting platforms now enable HTTPS, the biggest remaining barrier is primarily a matter of effort: cleaning up your content and making sure all assets are served over HTTPS from every page of your website. So, if you haven’t already, start the transition now! Contact us by phone or email if you need assistance, and feel free to comment below.

Jun 26 2018

This is part one of a two-part series on transitioning to HTTPS

For some time, major internet players have advocated for a ubiquitous, secure internet, touting the myriad benefits for all users and service providers of “HTTPS everywhere”. The most prominent and steadfast among them is Google. In the next week, continuing a multi-year effort to shepherd more traffic to the secure web, Google will make perhaps its boldest move to date which will negatively impact all organizations not securely serving their website over HTTPS.

To quote the official Google Security Blog:

Beginning in July 2018 with the release of Chrome 68, Chrome will mark all HTTP sites as “not secure”

Chrome insecure message for HTTP
Google blog

Given the ambiguous “in July 2018”, with no clearly communicated release date for Chrome 68, it’s wise to err on the side of caution and assume it will roll out on the 1st. We have readied our partners with this expectation.

So what does this mean for your organization if your site is not served over HTTPS? In short, it’s time to make the move. Let’s dig in.

What is HTTPS?

HTTP, or HyperText Transfer Protocol, is the internet technology used to communicate between your web browser and the servers hosting the websites you visit. HTTPS is the secure version (the “S” is for secure), served over TLS: Transport Layer Security. What these technical acronyms amount to are tools for internet communication that verify you’re communicating with who you think you are, in the way you intended to, in a format that only the intended recipient can understand. We’ll touch on the specifics in a moment and why they’re important. Put simply, HTTPS enables secure internet communication.

Why secure browsing matters

Setting aside the technical details for a moment, taking a broader view than communication protocols reveals more nuanced benefits your organization receives by communicating securely with its audience.

HTTPS improves SEO

Since Google accounts for 75-90% of global search queries (depending on the source), SEO is understandably often synonymous with optimizing for Google. Given Google’s market dominance, competitors take their cues from Google, and in most cases it’s safe to assume that what’s good for SEO on Google is good for SEO on competing search engines.

In the summer of 2014, Google announced on their blog that they would begin to rank sites that used HTTPS more favorably than those on HTTP. We have therefore known HTTPS to be advantageous for SEO for nearly four years. Since then, Google has consistently advocated HTTPS ubiquity, frequently writing about it in blog posts and speaking about it at conferences. The extent to which serving your site over HTTPS improves your SEO is not cut and dried and can vary slightly by industry. However, the trend toward favoring HTTPS is well under way, and the scales are tipped irreversibly at this point.

HTTPS improves credibility and UX

Once a user has arrived at your site, their perception is largely shaped by whether the site is served over HTTP or HTTPS. The user experience when interacting with a site served over HTTPS is demonstrably better. SEMrush summarizes well what the data clearly indicate: people care a great deal about security on the web.

You never get a second chance to make a first impression.

When engaging a member of your target audience, you have precious few moments to establish credibility. This is certainly true the first time a user interacts with your site, but it’s also true for returning users: you have to earn your reputation every day, and it can be lost quickly. We know credibility judgments are heavily influenced by design choices and are made in well under one second. Combine those two insights with the visual updates Chrome is making to highlight the security of a user’s connection, and drawing the user’s attention to a warning in the URL bar translates to a potentially costly loss in credibility. Unfortunately it’s the sort of thing users won’t notice unless there’s a problem, and per the referenced cliché, at that point it may be too late.

Browsers drawing attention to insecure HTTP

Much like search, browser usage patterns have evolved over the last five years to heavily favor Google Chrome. Therefore, what Google does carries tremendous weight internet-wide. Current estimations of browser usage put Chrome between 55% and 60% of the market (again, depending on sources). Firefox has followed suit with Chrome as far as HTTP security alerts go, and there’s no indication we should expect this to change. So it’s safe to assume a combined 60-75% of the market is represented by Chrome’s updates.

Google Chrome HTTP warning roll out

Google (with Firefox closely mirroring it) has been getting more stringent in how it displays the security implications of a site served over HTTP (in addition to sites misconfigured over HTTPS). They’ve shared details of the six-step rollout on their general blog as well as at a more technical, granular level on the Chrome browser blog.

In January 2017, they began marking any site served over HTTP that collects a password or credit card information with a subtle (grey text) “not secure” label.

Chrome insecure message for HTTP
Laravel News

Then, in October 2017, they tightened things up so that a site collecting any form information over HTTP would get the same “not secure” messaging. They also added a more action-based aspect: the warning appears in the URL bar as soon as a user enters data into a form. This is an especially obtrusive experience on mobile due to space constraints, which engages the user more deeply as to exactly what is unsafe about how they’re interacting with the site.

Chrome insecure message for HTTP
Google blog

Next, in July 2018, all HTTP sites will be marked as not secure.

In September 2018, secure sites will be marked more neutrally, removing the green “secure” indicator by default, connoting a continuing expectation that HTTPS is the norm and no longer special.

Chrome insecure message for HTTP
Google blog

In October 2018, any HTTP site that accepts any form fields will show affirmatively not secure with a bold red label, much like a misconfigured HTTPS site does now.

Chrome insecure message for HTTP
Google blog

Though they haven’t yet announced a date, Google intends to show an affirmative “not secure” for all HTTP sites. The drive is clearly to establish the norm that all web traffic should be served over HTTPS and that outdated HTTP is not to be trusted. This is a pretty strong message that if Google has their way (which they usually do), HTTPS will be virtually mandatory. And “inevitably”, in internet years, may be right around the corner.

HTTPS vastly improves security for you and your users

Returning to the technical, as mentioned previously, HTTPS helps secure communication in three basic ways.

  • Authentication: “you’re communicating with who you think you are”
  • Data integrity: “in the way you intended to”
  • Encryption: “in a format that only the intended recipient can understand”

What authentication does for you

In order for the browser to recognize and evaluate an HTTPS certificate, it must be verified by a trusted certificate authority (CA). There is a limited number of CAs entrusted to distribute HTTPS certificates. Through public-key cryptography (a fairly complex but interesting topic) and the browser’s inherent trust in the CA that provided the HTTPS certificate for a given site, the browser can verify that a visitor is positively communicating with the expected entity, with no way for anyone else to pose as that entity. No such verification is possible over HTTP, and it’s fairly simple to imagine what identity theft would be possible if you were communicating with a different website than the one you appeared to be on. In the event any of the major browsers cannot validate the expected certificate, they will show a strong, usually red, warning that you may not be communicating with the expected website and strongly encourage you to reconsider interacting at all.
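
You can inspect this verification yourself. As a sketch (the hostname is a placeholder), the following prints the subject, the issuing CA, and the validity window of the certificate a server actually presents:

echo | openssl s_client -connect www.example.com:443 -servername www.example.com 2>/dev/null \
  | openssl x509 -noout -subject -issuer -dates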

Chrome misconfigured HTTPS

The authentication therefore gives your users confidence that you are who you say you are, which is important whenever you’re engaging with them, whether they’re providing an email address, entering a credit card number, or simply reading articles.

How data integrity helps you

Ensuring perfect preservation of communication over the internet is another guarantee HTTPS provides. When a user communicates with a website over HTTPS, the browser takes the input of that communication and using a one-way hashing function creates a unique “message digest”: a concise, alphanumeric string. The digest may only be reliably recreated by running the exact same input through the same hash algorithm irrespective of where and when this is done. For each request the user makes to the website, the browser passes a message digest alongside it and the server then runs the input it receives from the request through the hash algorithm to verify it matches the browser-sent digest. Since it is nearly computationally impossible to reverse engineer these hash functions, if the digests match, it proves the message was not altered in transit. Again, no such data integrity preservation is possible over HTTP, and there is therefore no way to tell if a message has been altered en route to the server from the browser.
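
As a conceptual illustration with a standalone hashing tool (the integrity checks inside TLS are keyed, but the digest principle is the same), identical input always yields an identical digest, while a one-character change yields a completely different one. On Linux:

echo -n 'transfer $100 to account 42' | sha256sum
echo -n 'transfer $100 to account 42' | sha256sum   # same input, same digest
echo -n 'transfer $900 to account 42' | sha256sum   # tiny change, different digest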

What encryption does for you

Communicating over an unencrypted HTTP connection allows for some easily exploitable security risks when authenticating to a site. To demonstrate how easy it can be to take over someone’s account over an HTTP connection, a tool called Firesheep was developed and openly released in mid-2010. Major social media platforms Facebook and Twitter were both susceptible to this exploit for some time after Firesheep was released. The identity theft is carried out through a technique called session hijacking. With Firesheep installed, a few clicks could log you in as another user who was browsing over WiFi nearby, on any HTTP website. This form of session hijacking is possible because the authentication cookies, small identifying pieces of information that live in your browser while you’re logged into a site, are transmitted to the server on each request over HTTP. Over WiFi these messages are broadcast into the air in plain text, and can be picked up by anyone listening. HTTPS prevents this, since the communication is encrypted and unintelligible to eavesdroppers.
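
You can see the underlying problem for yourself on a machine and network you control. A hedged sketch (the interface name is an assumption) that watches your own plain-HTTP traffic for cookie headers:

sudo tcpdump -l -i wlan0 -A -s 0 'tcp port 80' | grep -i 'cookie:'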

In the example of a CMS like Drupal, or any other system with a login, if an administrator with elevated site permissions is logged in over HTTP, they’re subject to the same risk whenever that traffic is monitored or “sniffed” at any point along its path from the browser to the server. This is especially easy over WiFi but is not limited to WiFi. The cookies are sent to the server on every request, regardless of whether the user entered their password during the active session. Depending on the admin’s privileges, this access can easily be escalated to complete control of the website. Encryption is a big deal.

HTTPS is required for the modern web

One of the more promising developments of the last few years is the pervasiveness and effectiveness of Progressive Web Apps (PWAs). PWA is the name coined for a set of technologies that provides a feature set for mobile browsing akin to native applications, yet is delivered entirely through the web browser. PWAs require all communication to be done over HTTPS. Some of the capabilities PWAs offer that were previously the exclusive domain of native applications are:

  • Providing content and services based on the user’s location data
  • Providing interaction with the user’s camera and microphone within the browsing experience
  • Sending push notifications
  • Serving off-line content

If you aren’t taking advantage of the features PWAs make possible, your organization should strongly consider them to further engage users. Even before their ambitions of feature parity with native applications are fully borne out, PWAs will continue to offer deeper engagement with users on top of your existing mobile experience with minimal effort. PWAs simply do not work over HTTP; HTTPS is required to open the door to their possibilities.

Barriers to HTTPS have been lifted

Historically, considering a move to HTTPS has been held back by some valid concerns for webmasters whose job it was to select where and how their websites were hosted. A few of the fundamental apprehensions could be categorized as:

  • No perceived benefit. People often assumed if they weren’t collecting financial or personal information, it wasn’t necessary. We’ve covered why holding this belief in 2018 is a misconception. Savas Labs made the move in July 2017 to serve exclusively over HTTPS for our statically-generated Jekyll website even though at the time we had no forms or logins.
  • Performance costs. We know reducing latency is crucial for optimizing conversions, and HTTPS does require additional communication and computation. However, with the broad adoption of the HTTP/2 protocol over the last few years, HTTPS now usually outperforms HTTP (a quick way to check whether your host negotiates HTTP/2 is sketched after this list).
  • Financial costs. HTTPS was too complex and costly to implement for some. Large strides have been made across many hosting providers who now bundle HTTPS into their hosting offerings by default, often at no additional cost. Let’s Encrypt, a relatively new and novel certificate authority, first began offering free certificates (which they still do) and then made it easy to automatically renew those certificates, helping to ease the burden and cost of implementation.
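
On the performance point above, once HTTPS is in place you can confirm whether your host negotiates HTTP/2. A quick sketch with a reasonably recent curl and a placeholder URL; an output of 2 indicates HTTP/2:

curl -sI --http2 -o /dev/null -w '%{http_version}\n' https://www.example.com/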

We cover each of these in more detail in part two, which will help guide you through making the move to HTTPS.

Conclusion

To revisit Google’s announcement:

Beginning in July 2018 with the release of Chrome 68, Chrome will mark all HTTP sites as “not secure”.

Interpreting that and providing our perspective:

You’re not part of the modern web unless you’re exclusively using HTTPS.

A bold, if slightly controversial, statement, but for ambitious organizations like the folks we’re fortunate enough to work with each day, HTTPS-only is the standard in mid-2018 and beyond. Given the benefits, the barriers that have been lifted, and the opportunity for the future, very few organizations have a good reason not to serve their sites exclusively over HTTPS.

Have we convinced you yet? Great! Read part two for some assistance on how to make the move.

Additional resources

May 08 2018

Over the past few months, Four Kitchens has worked together with the Public Radio International (PRI) team to build a robust API in PRI’s Drupal 7 site, and a modern, fresh frontend that consumes that API. This project’s goal was to launch a new homepage in the new frontend. PRI intends to re-build their entire frontend in this new structure and Four Kitchens has laid the groundwork for this endeavor. The site went live successfully, with a noticeable improvement in load time and performance. Overall load time performance increased by 40% with first-byte time down to less than 0.5 seconds. The results of the PRI team’s efforts can be viewed at PRI.org.

PRI is a global non-profit media company focused on the intersection of journalism and engagement to effect positive change in people’s lives. PRI’s mission is to serve audiences as a distinctive content source for information, insights and cultural experiences essential to living in our diverse, interconnected world.

Overall load time performance increased by 40% with first-byte time down to less than 0.5 seconds.

Four Kitchens and PRI approached this project with two technical goals. The first was to design and build a full-featured REST API in PRI’s existing Drupal 7 application. We used RESTful, a Drupal module for building APIs, to create a JSON-API compliant API.

Our second technical goal was to create a robust frontend backed by the new API. To achieve that goal, we used React to create component-based user interfaces and styled them using the CSS Modules pattern. This work was done in a library of components in which we used Storybook to demonstrate and test the components. We then pulled these components into a Next.js-based application, which communicates with the API, parses incoming data, and uses that data to populate component properties and generate full pages. Both the component library and the Next.js application used Jest and Enzyme heavily to create thorough, robust tests.

A round of well-deserved kudos to the PRI team: Technical Project Manager Suzie Nieman managed this project from start to finish, facilitating estimations that led the team to success. Senior JavaScript Engineer Patrick Coffey provided keen technical leadership as well as deep architectural knowledge across all facets of the project, keeping the team unblocked and motivated. Engineer James Todd brought his Drupal and JavaScript expertise to the table, architecting and building major portions of PRI’s new API. Senior Frontend Engineer Evan Willhite brought his wealth of frontend knowledge to build a robust collection of elegant components in React and JavaScript. Architect David Diers created the mechanisms that will manage PRI’s API documentation, which can be reused in future projects.

Special thanks to Patrick Coffey and Suzie Nieman for their contributions to this launch announcement. 


Jan 18 2018

What are Spectre and Meltdown?

Have you noticed your servers or desktops running slower than usual? Spectre and Meltdown affect most of the devices we use daily: cloud servers, desktops, laptops, and mobile devices. For more details go to: https://meltdownattack.com/

How does this affect performance?

We finally have some answers about how this is going to affect us. After Pantheon patched their servers, they released an article showing the 10-30% negative performance impact servers can expect. For the whole article visit: https://status.pantheon.io/incidents/x9dmhz368xfz

I can say that I personally have noticed my laptop’s CPU is running at much higher percentages than before the update for similar tasks.

Security patches are still being released for many operating systems, but traditional desktop OSs appear to have been covered now. If you haven’t already, make sure your OS is up to date. Don’t forget to update the OS on your phone.
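
On Linux servers, a quick way to confirm the kernel-level mitigations are in place (assuming a kernel new enough, roughly 4.15+, to expose these sysfs entries):

grep . /sys/devices/system/cpu/vulnerabilities/*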

Next Steps?

So what can we do in the Drupal world? First, you should follow up with your hosting provider and verify they have patched your servers. Then you need to find ways to counteract the performance loss. If you are interested in performance recommendations, Four Kitchens offers both frontend and backend performance audits.

As a quick win, if you haven’t already, upgrade to PHP 7, which should give you a performance boost of around 30-50% on PHP processes. Now that you are more informed about what Spectre and Meltdown are, help with the performance effort by volunteering or sponsoring a developer on January 27-28, 2018 for the Drupal Global Sprint Weekend 2018, specifically on performance-related issues: https://groups.drupal.org/node/517797

Chris Martin

Chris Martin is a support engineer at Four Kitchens. When not maintaining websites he can be found building drones, computers, robots, and occasionally traveling to China.

Feb 15 2017

We use Docker for our development environments because it helps us adhere to our commitment to excellence. It ensures an identical development platform across the team while also achieving parity with the production environment. These efficiency gains (among others we’ll share in an ongoing Docker series) over traditional development methods enable us to spend less time on setup and more time building amazing things.

Part of our workflow includes a mechanism to establish and update the seed database which we use to load near-real-time production content to our development environments as well as our automated testing infrastructure. We’ve found it’s best to have real data throughout the development process, rather than using stale or dummy data which runs the risk of encountering unexpected issues toward the end of a project. One efficiency boon we’ve recently implemented and are excited to share is a technique that dramatically speeds up database imports, especially large ones. This is a big win for us since we’re often importing large databases multiple times a day on a project. In this post we’ll look at:

  • how much faster data volume imports are compared to traditional database dumps piped to mysql
  • how to set up a data volume import with your Drupal Docker stack
  • how to tie in this process with your local and continuous integration environments

The old way

The way we historically imported a database was to pipe a SQL database dump file into the MySQL command-line client:

mysql -u{some_user} -p{some_pass} {database_name} < /path/to/database.sql

An improvement upon the default method above which we’ve been using for some time allows us to monitor import progress utilizing the pv command. Large imports can take many minutes, so having insight into how much time remains is helpful to our workflow:

pv /path/to/database.sql | mysql -u{some_user} -p{some_pass} {database_name}

On large databases, though, MySQL imports can be slow. If we look at a database dump SQL file, we can see why. For example, a 19 MB database dump file we use in one of our test cases further on in this post contains instructions like these:

--
-- Table structure for table `block_content`
--

DROP TABLE IF EXISTS `block_content`;
/*!40101 SET @saved_cs_client     = @@character_set_client */;
/*!40101 SET character_set_client = utf8 */;
CREATE TABLE `block_content` (
  `id` int(10) unsigned NOT NULL AUTO_INCREMENT,
  `revision_id` int(10) unsigned DEFAULT NULL,
  `type` varchar(32) CHARACTER SET ascii NOT NULL COMMENT 'The ID of the target entity.',
  `uuid` varchar(128) CHARACTER SET ascii NOT NULL,
  `langcode` varchar(12) CHARACTER SET ascii NOT NULL,
  PRIMARY KEY (`id`),
  UNIQUE KEY `block_content_field__uuid__value` (`uuid`),
  UNIQUE KEY `block_content__revision_id` (`revision_id`),
  KEY `block_content_field__type__target_id` (`type`)
) ENGINE=InnoDB AUTO_INCREMENT=12 DEFAULT CHARSET=utf8mb4 COMMENT='The base table for block_content entities.';
/*!40101 SET character_set_client = @saved_cs_client */;

--
-- Dumping data for table `block_content`
--

LOCK TABLES `block_content` WRITE;
/*!40000 ALTER TABLE `block_content` DISABLE KEYS */;
set autocommit=0;
INSERT INTO `block_content` VALUES (1,1,'basic','a9167ea6-c6b7-48a1-ac06-6d04a67a5d54','en'),(2,2,'basic','2114eee9-1674-4873-8800-aaf06aaf9773','en'),(3,3,'basic','855c13ba-689e-40fd-9b00-d7e3dd7998ae','en'),(4,4,'basic','8c68671b-715e-457d-a497-2d38c1562f67','en'),(5,5,'basic','bc7701dd-b31c-45a6-9f96-48b0b91c7fa2','en'),(6,6,'basic','d8e23385-5bda-41da-8e1f-ba60fc25c1dc','en'),(7,7,'basic','ea6a93eb-b0c3-4d1c-8690-c16b3c52b3f1','en'),(8,8,'basic','3d314051-567f-4e74-aae4-a8b076603e44','en'),(9,9,'basic','2ef5ae05-6819-4571-8872-4d994ae793ef','en'),(10,10,'basic','3deaa1a9-4144-43cc-9a3d-aeb635dfc2ca','en'),(11,11,'basic','d57e81e8-c613-45be-b1d5-5844ba15413c','en');
/*!40000 ALTER TABLE `block_content` ENABLE KEYS */;
UNLOCK TABLES;
commit;

When we pipe the contents of the MySQL database dump to the mysql command, the client processes each of these instructions sequentially in order to (1) create the structure for each table defined in the file, (2) populate the database with data from the SQL dump and (3) do post-processing work like create indices to ensure the database performs well. The example here processes pretty quickly, but if your site has a lot of historic content, as many of our clients do, then the import process can take enough time that it throws a wrench in our rapid workflow!

What happens when mysql finishes importing the SQL dump file? The database contents (often) live in /var/lib/mysql/{database}, so for example for the block_content table mentioned above, assuming you’re using the typically preferred InnoDB storage engine, there are two files called block_content.frm and block_content.ibd in /var/lib/mysql/{database}/. The /var/lib/mysql directory will also contain a number of other directories and files related to the configuration of the MySQL server.

Now, suppose that instead of sequentially processing the SQL instructions contained in a database dump file, we provided developers with a snapshot of the /var/lib/mysql directory for a given Drupal site. Could restoring that snapshot be faster than the traditional database import method? Let’s have a look at two test cases to find out!

MySQL import test cases

The table below shows the results of two test cases, one using a 19 MB database and the other using a 4.7 GB database.

Method                       Database size   Time to drop tables and restore (seconds)
Traditional mysql            19 MB           128
Docker data volume restore   19 MB           11
Traditional mysql            4.7 GB          606
Docker data volume restore   4.7 GB          85

In other words, the MySQL data volume import completes, on average, in about 11% of the time, or 9 times faster, than a traditional MySQL dump import would take!

Since a GIF is worth a thousand words, compare these two processes side-by-side (both are using the same 19 MB source database; the first is using a data volume restore process while the second is using the traditional MySQL import process). You can see that the second process takes considerably longer!

Docker data volume restore

Traditional MySQL database dump import

Use MySQL volume for database imports with Docker

Here’s how the process works. Suppose you have a Docker stack with a web container and a database container, and that the database container has data in it already (your site is up and running locally). Assuming a database container name of drupal_database, to generate a volume for the MySQL /var/lib/mysql contents of the database container, you’d run these commands:

# Stop the database container to prevent read/writes to it during the database
# export process.
docker stop drupal_database
# Now use the carinamarina/backup image with the `backup` command to generate a
# tar.gz file based on the `/var/lib/mysql` directory in the `drupal_database`
# container.
docker run --rm --volumes-from drupal_database carinamarina/backup backup \
--source /var/lib/mysql/ --stdout --zip > db-data-volume.tar.gz

With the 4.7 GB sample database above, this process takes 239 seconds and results in a 702 MB compressed file.

We’re making use of the carinamarina/backup image produced by Rackspace to create an archive of the database files.

You can then distribute this file to your colleagues (at Savas Labs, we use Amazon S3), or make use of it in continuous integration builds (more on that below), using these commands:

# Copy the data volume tar.gz file from your team's AWS S3 bucket.
if [ ! -f db/db-data-volume.tar.gz ]; then aws s3 cp \
s3://{your-bucket}/mysql-data-volume/db-data-volume.tar.gz db-data-volume.tar.gz; fi
# Stop the database container to prevent read/writes during the database
# restore process.
docker stop drupal_database
# Remove the /var/lib/mysql contents from the database container.
docker run --rm --volumes-from drupal_database alpine:3.3 rm -rf /var/lib/mysql/*
# Use the carinamarina/backup image with the `restore` command to extract
# the tar.gz file contents into /var/lib/mysql in the database container.
docker run --rm --interactive --volumes-from drupal_database \
carinamarina/backup restore --destination /var/lib/mysql/ --stdin \
--zip < db-data-volume.tar.gz
# Start the database container again.
docker start drupal_database

So, not too complicated, but it will require a change in your processes for generating seed databases to distribute to your team for local development, or for CI builds. Instead of using mysqldump to create the seed database file, you’ll need to use the carinamarina/backup image to create the .tar.gz file for distribution. And instead of mysql {database} < database.sql you’ll use carinamarina/backup to restore the data volume.

In our team’s view this is a small cost for the enormous gains in database import time, which in turn boosts productivity to the tune of faster CI builds and refreshes of local development environments.

Further efficiency gains: integrate this process with your continuous integration workflow

The above steps can be manually performed by a technical lead responsible for generating and distributing the MySQL data volume to team members and your testing infrastructure. But we can get further productivity gains by automating this process completely with Travis CI and GitHub hooks. In outline, here’s what this process looks like:

1. Generate a new seed database SQL dump after production deployments

At Savas Labs, we use Fabric to automate our deployment process. When we deploy to production (not on a Docker stack), our post-deployment tasks generate a traditional MySQL database dump and copy it to Amazon S3:

def update_seed_db():
    run('drush -r %s/www/web sql-dump \
    --result-file=/tmp/$(date +%%Y-%%m-%%d)--post-deployment.sql --gzip \
    --structure-tables-list=cache,cache_*,history,search_*,sessions,watchdog' \
    % env.code_dir)
    run('/usr/local/bin/aws s3 cp /tmp/$(date +%Y-%m-%d)--post-deployment.sql.gz \
    s3://{bucket-name}/seed-database/database.sql.gz --sse')
    run('rm /tmp/$(date +%Y-%m-%d)--post-deployment.sql.gz')

2. When work is merged into develop, create a new MySQL data volume archive

We use git flow as our collaboration and documentation standard for source code management on our Drupal projects. Whenever a developer merges a feature branch into develop, we update the MySQL data volume archive dump for use in Travis CI tasks and local development. First, there is a specification in our .travis.yml file that calls a deployment script:

deploy:
  provider: script
  script:
    - resources/scripts/travis-deploy.sh
  skip_cleanup: true
  on:
    branch: develop

And the travis-deploy.sh script:

#!/usr/bin/env bash

set -e

make import-seed-db
make export-mysql-data
aws s3 cp db-data-volume.tar.gz \
s3://{bucket-name}/mysql-data-volume/db-data-volume.tar.gz --sse

This script: (1) imports the traditional MySQL seed database file from production, and then (2) creates a MySQL data volume archive. We use a Makefile to standardize common site provisioning tasks for developers and our CI systems.

3. Pull requests and local development make use of the MySQL data volume archive

Now, whenever developers want to refresh their local environment by wiping the existing database and re-importing the seed database, or, when a Travis CI build is triggered by a GitHub pull request, these processes can make use of an up-to-date MySQL data volume archive file which is super fast to restore! This way, we ensure we’re always testing against the latest content and configuration, and avoid running into costly issues having to troubleshoot inconsistencies with production.

Conclusion

We’ve invested heavily in Docker for our development stack, and this workflow update is a compelling addition to that toolkit since it has substantially sped up MySQL imports and boosted productivity. Try it out in your Docker workflow; we welcome comments below to field any questions and to hear about your successes. Stay tuned for further Docker updates!

Jan 21 2017

Everyone loves a fast website. It’s one of the critical goals for every web developer to build a site that’s twice as fast as the previous one, and “BigPiping” a website makes it super fast. BigPipe is a fundamental redesign of the dynamic web page serving system.

Typically, 80-90% of loading time is spent on the front end, which is huge.

A few important metrics to observe are:

  • Time To First Byte (TTFB): the time between requesting the HTML page and receiving the first byte of the response. During this time the browser can’t do anything (a quick way to measure it with curl is sketched after this list).
  • Time To Interact (TTI): completely dependent on the use case, but it’s what really matters.
  • Page load time: total load time until loading is complete.
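
Here is the quick TTFB check referenced above, a sketch using curl’s built-in timing variables against a placeholder URL:

curl -s -o /dev/null \
  -w 'TTFB: %{time_starttransfer}s  total: %{time_total}s\n' \
  https://www.example.com/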

Facebook’s story on BigPipe

The basic idea is to break the web page into small chunks called pagelets and pipeline them through several execution stages. You know why Facebook loads so fast? Because Facebook uses BigPipe to load content. Facebook loads the structure of the page in chunks, and the elements that are expensive to load come afterwards.

It seems like Facebook loads very fast, but it actually takes 5-6 seconds for everything to load. What happens is that it loads the unchanging parts first and the personalised parts, like the friends list, groups, and pages, later.

Facebook’s homepage performance after using Bigpipe caching

Bigpipe rating chart

Source: Facebook

The graph shows that BigPipe reduces user perceived latency by half in most browsers.

BigPipe in Drupal 8

In Drupal 7, or any other CMS, the web page gets comparatively slower once we add customisation or personalisation to it. After using BigPipe in Drupal 8, this is no longer an issue.

This technology used to be available only to the likes of LinkedIn and Facebook, but now it’s available as a module in Drupal 8.
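
Since BigPipe ships with Drupal core as of 8.1, enabling it is a one-liner if you use Drush (a sketch; big_pipe is the core module’s machine name):

drush pm-enable big_pipe -y
drush cache-rebuild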

Facebook streams its content in so-called pagelets. Wim and Fabian (the authors of the Drupal 8 BigPipe module) call the Drupal 8 equivalent “auto-placeholdering”, which differentiates the static sections of the page from the dynamic ones.

Here is an example comparing standard Drupal caching and BigPipe: here is the link to the video.

Source: Dries Buytaert / YouTube

If we provide the correct cacheability metadata, Drupal will be able to automatically deliver personalised parts of the page later, without you having to write a single line of code.

Wim Leers, the author of this module explained it as follows:

“The problem we were solving: Drupal 7 can’t really cache its output because it lacks metadata for caching. It generates just about everything you see on the page on every page load, even things that don’t change. Drupal 8 does far less work when serving a page, because we brought cacheability metadata to everything. BigPipe is the most powerful way to put that cacheability metadata to use and it delivers the most significant performance improvements.

BigPipe was not possible before Drupal 8. So, no, it’s the other way around: BigPipe has changed Drupal 8 and made it the fastest Drupal yet.” - Wim Leers.

How BigPipe works

To exploit the parallelism between web server and browser, BigPipe first breaks web pages into multiple chunks called pagelets. Just as a pipelining microprocessor divides an instruction’s life cycle into multiple stages (such as “instruction fetch”, “instruction decode”, “execution”, “register write back” etc.), BigPipe breaks the page generation process into several stages:

  • Request parsing: Web server parses and sanity checks the HTTP request.
  • Data fetching: Web server fetches data from storage tier.
  • Markup generation: Web server generates HTML markup for the response.
  • Network transport: The response is transferred from web server to browser.
  • CSS downloading: Browser downloads CSS required by the page.
  • DOM tree construction and CSS styling: Browser constructs DOM tree of the document, and then applies CSS rules on it.
  • JavaScript downloading: Browser downloads JavaScript resources referenced by the page.
  • JavaScript execution: Browser executes JavaScript code of the page.

(From the Facebook Engineering blog)

I’d really like to thank Wim Leers, Fabian, and the others who worked hard on bringing this caching strategy to Drupal 8 core with the 8.1 release.

Jul 16 2016

Sounds like a bad design? The first time I encountered this, I thought we should have avoided it in the design. But that is not what we are talking about today. Once we figured out a way to fix the performance, it turned out to be quite a powerful way to handle the business logic.

How do you accommodate the requirement that a node hold thousands of multiple-value items in one field? When it comes to the editor experience, we have something to share. A multiple-value field of field collection items is a common setup for a content type. When there are only a couple dozen values, everything is fine; the default field collection embed widget with a multi-value field works well.

As the number of items goes up, the editing page becomes heavier. In our case, we have a field collection containing five subfields: one entity reference field pointing to nodes, two text fields, one taxonomy entity reference field, and a number field. Some nodes have over 300 such field collection items. The editing pages for those nodes take forever to load, and updating the node becomes more and more difficult.

For such a node, the edit form has thousands of form elements. It is like hauling an adult elephant with a small pickup truck. Anything can slow down the page, from web server performance to network bandwidth to the capabilities of the local browser. So we needed to find a sound way to handle it. We wanted the multiple-value field to be truly unlimited: capable of holding thousands of field collection item values in a single node.

After doing some research, we came up with a pretty good solution. Here is what we did.

We used the Embedded Views Field module to build a block for the field collection items, paginating it so that 300 items break down into 12 pages, and then inserted the views block into the node editing page. By not loading all the elements into the node edit form, this speeds up page loading immediately. Displaying the field collection items in a views block is not enough, though; we also need to edit them. I tried using VBO to handle editing and deleting, but it did not work, so we built some custom Ajax functions for editing and deleting instead. We use a ctools modal window as the front-end interface to edit, delete, and add new items, which works well. With the modal window and Ajax, we can keep the main node edit page untouched; there is no need to refresh the page every time an editor changes the field collection items. Thanks to the pagination of the views block, we can now add as many items as we want to the field collection multi-value field. We also added views sorting to the embedded views field.

Sounds pretty robust, but wait, there is something missing. We ran into a problem soon after implementing it: what about the form for creating a new node? On the new node page, the embedded views field block does not work, because a new node does not yet have a node ID. We fixed this by falling back to the default widget just for the new node page, using the following function to switch the field widget.

/**
 * Implements hook_field_widget_properties_alter().
 *
 * Fall back to the default embed widget on node add forms, where the node has
 * no ID yet and the embedded views field block cannot be used.
 */
function MODULENAME_field_widget_properties_alter(&$widget, $context) {
  if ($context['entity_type'] == 'node') {
    // Only act on new nodes (no node ID yet).
    if (!isset($context['entity']->nid)) {
      if ($context['field']['field_name'] == 'FIELD_MACHINE_NAME') {
        if ($widget['type'] == 'field_collection_hidden') {
          $widget['type'] = 'field_collection_embed';
        }
      }
    }
  }
}

Jun 30 2016

These are the slides of the presentation I gave last week at DrupalDevDays Milan about the ways using the Queue API can help you speed up Drupal sites, especially in connected mode.

This is a version of the presentation we gave at DrupalCon Barcelona with Yuriy Gerasimov, updated for Drupal 8.
Jun 16 2016

This is part 2 of a series on using XHProf for profiling Drupal modules.

After you’ve installed XHProf, what’s next? How can you make use of its recommendations to tune a module or a website? Unfortunately there are no hard-and-fast rules for optimizing code to run more efficiently. What I can offer here is my own experience trying to optimize a D8 module using XHProf.

Understanding an XHProf run report

The XHProf GUI displays the result of a given profiler run, or a group of runs. It can even compare 2 runs, but we’ll get to that in a minute. If you followed my previous post, you should have the xhprof_html directory symlinked into the root web directory for your Drupal site; so visiting <my-local-site>/xhprof/ should give you a list of all available stored run IDs, and you can click through one of those to view a specific run report.

You can also go directly to a specific run report via the URL <my-local-site>/xhprof/index.php?run=<run-id>&source=<source-id> (which you should have been logging already via an echo statement or dblog if you followed the last post).

Header of an XHProf run report

The core of the run report is a table of each function or method which your code called while it was being profiled, along with a set of statistics about that function. This allows you to understand which parts of your code are most resource-intensive, and which are being called frequently in the use case that you’re profiling. Clicking on any one of the column headers will sort the list by that metric. To understand this report, it’s helpful to have some terminology:

  • Incl. Wall Time - The total clock time elapsed between when a function call started and when the function exited. Note that this number is not a great basis for comparisons, since it can include other processes which were taking up CPU time on your machine while the PHP code was running, from streaming music in the background to PHPStorm’s code indexing, to random web browsing.
  • Incl. CPU Time - In contrast to wall time, CPU time tracks only the time which the CPU actually spent executing your code (or related system calls). This is a more reliable metric for comparing different runs.
  • Excl. Wall/CPU Time - Exclusive time measurements only count time actually spent within the given method itself. They exclude the time spent in any method/function called from the given function (since that time will be tracked separately).

In general, the inclusive metrics (for CPU time and memory usage) will give you a sense of what your expensive methods/functions are – these are the methods or functions that you should avoid calling if possible. In contrast, the exclusive metrics will tell you where you can potentially improve the way a given method/function is implemented. For methods which belong to Drupal Core or other contrib modules, inclusive and exclusive metrics are basically equivalent, since you don’t usually have the option of impacting the implementation details of a function unless you’re working on its code directly. Note also that because your overall parent method and any other high-level methods in your code will always show up at the top of the inclusive time chart, you may have better luck really understanding where your performance hits come from by sorting by exclusive CPU time.

Take a step back and plan your test scenarios

Before digging in to optimizing your module code, you need to take a step back and think about the big picture. First, what are you optimizing for? Many optimizations involve a tradeoff between time and memory usage. Are you trying to reduce overall run-time at the expense of more memory? Is keeping the memory footprint of a given function down more important? In order to answer these questions you need to think about the overall context in which your code is running. In my case, I was optimizing a background import module which was run via cron, so the top priority was that the memory usage and number of database optimizations were low enough not to impact the user-facing site performance.

Second, what use case for your code are you profiling? If this is a single method call, what arguments will be passed? If you’re profiling page loads on a website, which pages are you profiling? In order to successfully track whether the changes you’re making are having an impact on the metrics you’re concerned about, you need to be able to narrow down the possible use cases for your code into a handful of most-likely real world scenarios which you’ll actually choose to track via the profiler.

Keep things organized

Now it’s time to get organized. Write a simple test script so that you can easily run through all your use cases in an automated way – this is not strictly necessary, but it will save you a lot of work and potential error as you move through the process. In my case, I was testing a drush command hook, so I just wrote a bash shell script which executed the command three times in each of two different ways. For profiling page loads, I would recommend using Apache JMeter - and you’ll need to consider whether you want to force an uncached page load by passing a random dummy query parameter. Ideally, you should be running each scenario a few times so that you can then average the results to account for any small variations in run-time.
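
For reference, here is a minimal sketch of the kind of wrapper script I mean; the drush command and its options are hypothetical stand-ins for whatever you are profiling:

#!/usr/bin/env bash
set -e
# Run each scenario several times so the XHProf results can be averaged.
for i in 1 2 3; do
  drush my-import-command --batch-size=50    # hypothetical scenario 1
done
for i in 1 2 3; do
  drush my-import-command --batch-size=500   # hypothetical scenario 2
done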

Keeping your different runs organized is probably the most important part of successfully profiling module code using XHProf! Each run has a unique run ID, but you are solely responsible for knowing which use case scenario and which version of the codebase that run ID corresponds to. I set up a basic spreadsheet in OpenOffice where I could paste in run numbers and basic run stats to compare (but there’s almost certainly a nicer automated way to do this than what I used).

Screenshot of an OpenOffice spreadsheet summarizing XHProf results for various runs

Once you have a set of run IDs for a given use case + codebase version, you can generate a composite xhprof report using the query string syntax http://<your-local-site>/xhprof/index.php?run=<first-run-id>,<second-run-id>,<third-run-id>&source=<source-string> Averaging out across a few runs should give you more precise measurements for CPU time and memory usage, but beware that if parts of your code involve caching you may want to either throw out the first run’s results in each version of the code base, since that’s where the cache will be generated, or clear the cache between runs.

Go ahead and test your run scripts to make sure that you can get a consistent baseline result at this point – if you’re seeing large differences in average total CPU times or average memory usage across different runs of the same codebase, you likely won’t be able to compare run times across different versions of the code.

Actually getting to work!

After all this setup, you should be ready to experiment and see the impact of changes in your code base on the metrics you want to shift. In my case, the code I was working on used a streaming JSON parser class, and I noticed that one of the top function calls in the initial profiler report was the consumeChar method of the parser.

Image of XHProf profiler report with the method consumeChar highlighted in yellow

It turns out that the JSON files I was importing were pretty-printed, thus containing more whitespace than they needed. Since the consumeChar method gets called on each incoming character of the input stream, that added up to a lot of unnecessary method calls in the original code. By tweaking the JSON file export code to remove the pretty-print flag, I cut the number of times this method was called from 729,099 to 499,809, saving 0.2 seconds of run time right off the bat.
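
For illustration, a minimal sketch, assuming the export side also uses PHP's json_encode() (the $data variable is hypothetical): dropping the pretty-print flag removes the whitespace that consumeChar() would otherwise have to walk through one character at a time.

<?php
$data = array('title' => 'Example item', 'status' => 1);

// Before: pretty-printed output, full of spaces and newlines.
$json = json_encode($data, JSON_PRETTY_PRINT);

// After: compact output, far fewer characters for the parser to consume.
$json = json_encode($data);
?>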

That was the major place where the XHProf profiler report gave me insights I would not have had otherwise. The rest of my optimizing experience was mostly testing out some of the common-sense optimizations I had already thought of while looking at the code: caching a table of known entity IDs rather than querying the DB to check whether an entity existed each time, using an associative array with isset() to replace in_array() calls, and cutting down on unnecessary $entity->save() operations where possible.
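
As a quick illustration of the lookup swap, here is a minimal sketch with hypothetical data: in_array() scans the whole list on every call, while flipping the list once makes each subsequent check a constant-time hash lookup.

<?php
$nid = 205;

// Linear scan: in_array() walks the whole list on every membership check.
$known_ids = array(101, 205, 307);
$exists = in_array($nid, $known_ids);

// Hash lookup: flip the list once, then each isset() check is O(1).
$known_ids = array_flip(array(101, 205, 307));
$exists = isset($known_ids[$nid]);
?>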

It’s worth mentioning that, across the board, the biggest performance hit in your Drupal code will probably be database calls, so cutting down on those wherever possible will save run-time (sometimes at the expense of memory, if you’re caching large amounts of data). Remember also that if the Database Logging (dblog) module is enabled, each logging call is a separate database write, so use the log sparingly – or just log to syslog and use a service like Papertrail or Loggly on production sites.

The final results

As the results below show, using XHProf and some thoughtful optimizations I was able to cut total run time significantly in one use case (Case 2) and slightly in another use case (Case 1). Case 1 was already running in a reasonable amount of time, so here I was mostly interested in the Case 2 numbers (assuming I didn’t make anything too much worse).

Bar chart comparing the run time of various runs

Think of the big picture, part 2

Remember that controlled experimental metrics are just a means to understanding and improving real-world performance (which you can also measure directly using tools like Blackfire, but that’s another blog post). In my case, at the end of the day we decided that the most important thing was to ensure that there wasn’t a performance impact on the production site while this background import code was running. So one of the optimizations we actually ended up implementing was to force this code to run slower, by throttling $entity->save() operations to at most one every half second or so, as a way to minimize the number of requests MySQL had to handle from the importer. XHProf is a powerful tool, but don’t lose the forest for the trees when it comes to optimization.

May 26 2016
May 26

XHProf is a profiling tool for PHP code – it tracks the amount of time and memory your code spends in each function call, allowing you to spot bottlenecks in the code and identify where it’s worth spending resources on optimization. There have been a number of PHP profilers over the years, and XDebug has a profiler as well, but XHProf is the first one I’ve successfully managed to configure correctly and interpret the output of.

I had run across a number of blog posts about using XHProf + Drupal, but never actually got it to work successfully for a project. Because so much of the documentation online is incomplete or out-of-date, I thought it would be useful to document my process of using XHProf to profile a Drupal 8 custom module here. YMMV, but please post your thoughts/experiences in the comments!

How to find documentation

I find the php.net XHProf manual entry super-confusing and circular. Part of the problem is that Facebook’s original documentation for the library has since been removed from the internet and is only accessible via the WayBack Machine.

If there’s only one thing you take away from this blog post, let it be: read and bookmark the WayBack machine view of the original XHProf documentation, which is at http://web.archive.org/web/20110514095512/http://mirror.facebook.net/facebook/xhprof/doc.html.

Install XHProf in a VM

On DrupalVM, XHProf is already installed and you can skip to the next step. Otherwise, you’ll need to install XHProf manually via PECL:

sudo pecl install xhprof-beta

Note that all these commands are for Ubuntu flavors of Linux; if you’re on Red Hat / CentOS you’ll want to use the yum equivalents. I had to first install the php5-dev package to get PECL working properly:

sudo apt-get update
sudo apt-get install php5-dev

And if you want to view nice callgraph trees like the one below, you’ll need to install the graphviz package:

sudo apt-get install graphviz

Image of a sample XHProf callgraph

Configure PHP to run XHProf

You need to tell PHP to enable the xhprof extension via your php.ini files. Usually these are in /etc/php5/apache2/php.ini and /etc/php5/cli/php.ini. Add the following lines to the bottom of each file if they’re not there already. You will also need to create the /var/tmp/xhprof directory if it doesn’t already exist.

[xhprof]
extension=xhprof.so
;
; directory used by default implementation of the iXHProfRuns
; interface (namely, the XHProfRuns_Default class) for storing
; XHProf runs.
;
xhprof.output_dir="/var/tmp/xhprof"

Lastly, restart Apache so that the PHP config changes take effect.

Set up a path to view the XHProf GUI

The XHProf GUI runs off a set of HTML files in the xhprof_html directory. If you’ve been following the install steps above, you should be able to find that directory at /usr/share/php/xhprof_html. Now you need to set up your virtual host configuration to serve the files in the xhprof_html directory.

I find the easiest way to do this is just to symlink the xhprof_html directory into the existing webroot of whatever site you’re working on locally, for example:

ln -s /usr/share/php/xhprof_html /var/www/my-website-dir/xhprof

If you’re using DrupalVM, a separate vhost configuration will already be set up for XHProf, and the default URL is http://xhprof.drupalvm.dev/, although it can be changed in your config.yml file.

Hooking XHProf into your module code

Generally, the process of profiling a chunk of code using XHProf goes as follows:

  1. Call xhprof_enable()
  2. Run the code you want profiled
  3. Once the code has finished running, call xhprof_disable(). That function will return the profiler data, which you can either display to the screen (not recommended), or…
  4. Store the profiler data to a file by creating a new XHProfRuns_Default object and calling its save_run() method.

In the case below, I’m profiling a module that implements a few Drush commands which I’d like to optimize. So I created _mymodule_enable_xhprof() and _mymodule_disable_xhprof() functions – the names don’t matter here – and then added a --profile flag to my Drush command options which, when set, calls my custom enable/disable functions before and after the Drush command runs (a sketch of that wiring follows the listing below).

Here’s what those look like in full:

<?php
/**
 * Helper function to enable xhprof.
 */
function _mymodule_enable_xhprof() {
  if (function_exists('xhprof_enable')) {
    // Tell XHProf to track both CPU time and memory usage
    xhprof_enable(XHPROF_FLAGS_CPU + XHPROF_FLAGS_MEMORY,
      array(
        // Don't treat these functions as separate function calls
        // in the results.
        'ignored_functions' => array('call_user_func',
          'call_user_func_array',
        ),
      ));
  }
}

/**
 * Helper function to disable xhprof and save logs.
 */
function _mymodule_disable_xhprof() {
  if (function_exists('xhprof_enable')) {
    $xhprof_data = xhprof_disable();

    //
    // Saving the XHProf run
    // using the default implementation of iXHProfRuns.
    //
    include_once "/usr/share/php/xhprof_lib/utils/xhprof_lib.php";
    include_once "/usr/share/php/xhprof_lib/utils/xhprof_runs.php";

    $xhprof_runs = new XHProfRuns_Default();

    // Save the run under a namespace "xhprof_foo".
    //
    // **NOTE**:
    // By default save_run() will automatically generate a unique
    // run id for you. [You can override that behavior by passing
    // a run id (optional arg) to the save_run() method instead.]
    // .
    $run_id = $xhprof_runs->save_run($xhprof_data, 'xhprof_mymodule');

    echo "---------------\nAssuming you have set up the http based UI for \nXHProf at some address, you can view run at \nhttp://mywebsiteurl.dev/xhprof/index.php?run=$run_id&source=xhprof_mymodule\n---------------\n";
  }
}
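
A sketch of how the --profile flag might be wired into a Drush 8 command callback (the command and helper names here are hypothetical):

<?php
/**
 * Drush command callback for a hypothetical "my-import" command.
 */
function drush_mymodule_my_import() {
  $profile = drush_get_option('profile', FALSE);
  if ($profile) {
    _mymodule_enable_xhprof();
  }

  // The actual work being profiled (hypothetical helper).
  _mymodule_do_import();

  if ($profile) {
    _mymodule_disable_xhprof();
  }
}
?>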

The echo at the end of _mymodule_disable_xhprof() works fine for a Drush command, but in other contexts you could log the run URL through the logger service instead.
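
A minimal sketch of that logging in Drupal 8 (the channel name is arbitrary):

<?php
\Drupal::logger('mymodule')->notice('XHProf run saved: @url', array(
  '@url' => "http://mywebsiteurl.dev/xhprof/index.php?run=$run_id&source=xhprof_mymodule",
));
?>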

Note: Another way to run XHProf on a Drupal site is using the XHProf module, but I haven’t had great luck with that.

Viewing profiler results

If everything is configured correctly, when you run your module you should get a run ID output either to the screen (via echo, as above) or wherever you’ve configured it to be logged. Visit the URL you configured above for XHProf, and you should see a list of all the stored runs. Clicking on a run will bring up the full profiler report.

Sample screenshot of an XHProf profiler report

Now what?

Now you’ve got all this data – how to make sense of it? What to do with it? Stay tuned for more discussion of how to interpret XHProf results and a real-world example of profiling a D8 module, next week!

Apr 03 2015
Apr 03

Fields have been part of Drupal core since version 7. They extend Drupal’s ability to build many different kinds of systems, and since fields are the basic units of each entity, they are one of the most important parts of the software. But when it comes to the SQL storage engine, field storage could be more efficient, and I believe we cannot afford to ignore that. Let’s put field SQL storage under a microscope and have a close look.

Case study:

I built a patient scheduling system for a couple of clinic offices. The project itself is not complicated. I have attached the patient profile form to this article [4]. We built a patient profile node type; the form is not complicated, but it has over 40 fields, and it was not difficult to set up a nice patient profile node form. I also created an appointment node type that connects the patient profile and doctor profile with entity reference fields, and used views with exposed filters for the various reports.

It was on this project that I found the issue. I became a little uncomfortable after taking a close look at the database. Each field has two almost identical tables, and I think the fields take up too much unnecessary database space. I have dumped one field’s database information below to explain my concern.

1) Base table: field_data_field_initial

+----------------------+------------------+------+-----+---------+-------+
| Field                | Type             | Null | Key | Default | Extra |
+----------------------+------------------+------+-----+---------+-------+
| entity_type          | varchar(128)     | NO   | PRI |         |       |
| bundle               | varchar(128)     | NO   | MUL |         |       |
| deleted              | tinyint(4)       | NO   | PRI | 0       |       |
| entity_id            | int(10) unsigned | NO   | PRI | NULL    |       |
| revision_id          | int(10) unsigned | YES  | MUL | NULL    |       |
| language             | varchar(32)      | NO   | PRI |         |       |
| delta                | int(10) unsigned | NO   | PRI | NULL    |       |
| field_initial_value  | varchar(255)     | YES  |     | NULL    |       |
| field_initial_format | varchar(255)     | YES  | MUL | NULL    |       |
+----------------------+------------------+------+-----+---------+-------+

Base table SQL script:

CREATE TABLE `field_data_field_initial` (
`entity_type` varchar(128) NOT NULL DEFAULT '',
`bundle` varchar(128) NOT NULL DEFAULT '',
`deleted` tinyint(4) NOT NULL DEFAULT '0',
`entity_id` int(10) unsigned NOT NULL,
`revision_id` int(10) unsigned DEFAULT NULL,
`language` varchar(32) NOT NULL DEFAULT '',
`delta` int(10) unsigned NOT NULL,
`field_initial_value` varchar(255) DEFAULT NULL,
`field_initial_format` varchar(255) DEFAULT NULL,
PRIMARY KEY (`entity_type`,`entity_id`,`deleted`,`delta`,`language`),
KEY `entity_type` (`entity_type`),
KEY `bundle` (`bundle`),
KEY `deleted` (`deleted`),
KEY `entity_id` (`entity_id`),
KEY `revision_id` (`revision_id`),
KEY `language` (`language`),
KEY `field_initial_format` (`field_initial_format`)
);

2) Revision table: field_revision_field_initial

+----------------------+------------------+------+-----+---------+-------+
| Field                | Type             | Null | Key | Default | Extra |
+----------------------+------------------+------+-----+---------+-------+
| entity_type          | varchar(128)     | NO   | PRI |         |       |
| bundle               | varchar(128)     | NO   | MUL |         |       |
| deleted              | tinyint(4)       | NO   | PRI | 0       |       |
| entity_id            | int(10) unsigned | NO   | PRI | NULL    |       |
| revision_id          | int(10) unsigned | NO   | PRI | NULL    |       |
| language             | varchar(32)      | NO   | PRI |         |       |
| delta                | int(10) unsigned | NO   | PRI | NULL    |       |
| field_initial_value  | varchar(255)     | YES  |     | NULL    |       |
| field_initial_format | varchar(255)     | YES  | MUL | NULL    |       |
+----------------------+------------------+------+-----+---------+-------+

Revision table SQL script:

CREATE TABLE `field_revision_field_initial` (
  `entity_type` varchar(128) NOT NULL DEFAULT '',
  `bundle` varchar(128) NOT NULL DEFAULT '',
  `deleted` tinyint(4) NOT NULL DEFAULT '0',
  `entity_id` int(10) unsigned NOT NULL,
  `revision_id` int(10) unsigned NOT NULL,
  `language` varchar(32) NOT NULL DEFAULT '',
  `delta` int(10) unsigned NOT NULL,
  `field_initial_value` varchar(255) DEFAULT NULL,
  `field_initial_format` varchar(255) DEFAULT NULL,
  PRIMARY KEY (`entity_type`,`entity_id`,`revision_id`,`deleted`,`delta`,`language`),
  KEY `entity_type` (`entity_type`),
  KEY `bundle` (`bundle`),
  KEY `deleted` (`deleted`),
  KEY `entity_id` (`entity_id`),
  KEY `revision_id` (`revision_id`),
  KEY `language` (`language`),
  KEY `field_initial_format` (`field_initial_format`)
);

Here are my concerns.

1) Normalization.

Here is one of the field's data records.

+-------------+------------------+---------+-----------+-------------+----------+-------+---------------------+----------------------+
| entity_type | bundle           | deleted | entity_id | revision_id | language | delta | field_initial_value | field_initial_format |
+-------------+------------------+---------+-----------+-------------+----------+-------+---------------------+----------------------+
| node        | patient_profile  |       0 |      1497 |        1497 | und      |     0 | w                   | plain_text           |
+-------------+------------------+---------+-----------+-------------+----------+-------+---------------------+----------------------+

We have the value "w" in the Initial field. That one character takes 51 bytes of storage in the base table, not counting indexes. It takes another 51 bytes in the revision table, plus more for indexes. In this case, less than two percent of the space is used for the real data (the initial "w"), and over 98% is used for other purposes.

For the sake of space, I think we should not use varchar for the entity_type, bundle, language, and format columns. Using smallint, tinyint, or int [1] would take only one to four bytes. The field is a basic unit of a Drupal website, and a medium website can hold millions of field records; one byte saved per record adds up to many megabytes in a precious MySQL database.

2) Overly complicated primary key

Each field table has a complicated primary key. The base table uses `entity_type`, `entity_id`, `deleted`, `delta`, and `language` as its primary key. The revision table uses `entity_type`, `entity_id`, `revision_id`, `deleted`, `delta`, and `language`. "In InnoDB, having a long PRIMARY KEY wastes a lot of disk space because its value must be stored with every secondary index record." [2] It may be worth adding an auto-incrementing int as the primary key instead.

3) Unnecessary bundle column

I found that the bundle column is not necessary; the system could run well without it. In my clinic project, I named the node type "patient profile". The machine name patient_profile appears in each field record's bundle column. Stored as a varchar, it uses 16 bytes per record (15 characters plus a length byte). Let's do a quick calculation: if there are 100,000 nodes and each node has 40 fields, 100,000 x 40 x 2 x 16 = 122 MB is taken up by this column alone. At the very least we could use a 2-byte smallint, which would take only one-eighth of the space.

4) Just use the revision table.

Remove one of the field's two data tables. It may take a little more query work to get field data, but it saves time when we insert, update, and delete field data. By doing so, we maintain one less table per field and edit content faster, which makes for a better editor experience and saves database storage space.

A contributed module, field_sql_lean [3], addresses some of the concerns raised here. It still needs a lot of work on its own, and more if we want other contributed modules to be compatible with it, since it changes the field table structure.

Reference:

1: http://dev.mysql.com/doc/refman/5.1/en/integer-types.html
2: http://dev.mysql.com/doc/refman/5.0/en/innodb-tuning.html
3: Field SQL storage lean solution (the field_sql_lean module)
4: Patient profile form: medical form

Feb 27 2015
Feb 27

Begin designing a Drupal website with a million nodes in mind. We build a Drupal website and it runs well at the beginning. Then one day the system has hundreds of thousands of nodes, and we find the site has become slow: we have to wait many seconds before a new page opens. Not only is it slow, but sometimes we also get errors such as memory exhaustion.

Most of the time, the problem already existed at the beginning stage of the system. When designing a site, there are things we as developers have to take care of. We need to bear in mind that the site will grow and that more and more nodes will be created. Every time we write a function, we need to make sure it will still work fine when there are hundreds of thousands of nodes in the system. Otherwise, those functions may time out or exhaust all the memory as the number of nodes keeps increasing.

PHP has a maximum memory limit for each request. Sometimes it is 128 MB, sometimes 256 MB; whatever the number, it is certainly not infinite. There is, however, no limit on how many nodes can exist on our website. As the system grows larger with more nodes created, we will hit the memory limit sooner or later if we did not take it into consideration at the beginning.

Here is a quick example. Drupal has a function node_load_multiple(); called as node_load_multiple(FALSE), it loads every node in the database into memory. Here is some code from a contributed module.

foreach (node_load_multiple(FALSE) as $node) {
  // Modify node objects to be consistent with Revisioning being
  // uninstalled, before updating the {taxonomy_index} table accordingly.
  unset($node->revision_moderation);
  revisioning_update_taxonomy_index($node, FALSE);
}

This code is in an implementation of hook_uninstall. It will run into a problem if there are over 10,000 nodes in the system, and as a result we cannot uninstall this module. Here is the error message:

Fatal error: Allowed memory size of 268435456 bytes exhausted (tried to allocate 36 bytes) in ...

It used up all 256 MB of memory before it could load all the nodes. As a result, the module can never be uninstalled from the site.
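
A minimal sketch (Drupal 7) of the same loop rewritten to process nodes in batches, so memory use stays bounded no matter how many nodes exist:

<?php
// Fetch only the node IDs, then load and process them in chunks of 100.
$nids = db_query('SELECT nid FROM {node}')->fetchCol();
foreach (array_chunk($nids, 100) as $batch) {
  foreach (node_load_multiple($batch) as $node) {
    unset($node->revision_moderation);
    revisioning_update_taxonomy_index($node, FALSE);
  }
  // Drop the statically cached node objects before the next batch.
  entity_get_controller('node')->resetCache($batch);
}
?>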

That is an extreme case, but as we troubleshoot an existing site we may notice similar cases here and there. I also noticed that we could do something about the field_sql_storage module to make Drupal run faster and keep the SQL database smaller.

Jan 28 2015
Jan 28

As the largest bicycling club in the country with more than 16,000 active members and a substantially larger community across the Puget Sound, Cascade Bicycle Club requires serious performance from its website. For most of the year, Cascade.org serves a modest number of web users as it furthers the organization’s mission of “improving lives through bicycling.”

But a few days each year, Cascade opens registration for its major sponsored rides, which results in a series of massive spikes in traffic. Cascade.org has in the past struggled to keep up with demand during these spikes. During the 2014 registration period for example, site traffic peaked at 1,022 concurrent users and >1,000 transactions processed within an hour. The site stayed up, but the single web server seriously struggled to stay on its feet.

In preparation for this year’s event registrations, we implemented horizontal scaling at the web server level as the next logical step forward in keeping pace with Cascade’s members. What is horizontal scaling, you might ask? Let me explain.

[Ed Note: This post gets very technical, very quickly.]

Overview

We had already set up hosting for the site in the Amazon cloud, so our job was to build out the new architecture there, including new Amazon Machine Images (AMIs) along with an Autoscale Group and Scaling Policies.

Here is a diagram of the architecture we ended up with. I’ll touch on most of these pieces below.

[Diagram: the new AWS architecture; the numbered components are referenced below]

Web Servers as Cattle, Not Pets

I’m not the biggest fan of this metaphor, but it’s catchy: The fundamental mental shift when moving to automatic scaling is to stop thinking of the servers as named and coddled pets, but rather as identical and ephemeral cogs–a herd of cattle, if you will.

In our case, multiple web server instances are running at a given time, and more may be added or taken away automatically at any given time. We don’t know their IP addresses or hostnames without looking them up (which we can do either via the AWS console, or via AWS CLI — a very handy tool for managing AWS services from the command line).

The load balancer is configured to enable connection draining. When the autoscaling group triggers an instance removal, the load balancer will stop sending new traffic, but will finish serving any requests in progress before the instance is destroyed. This, coupled with sticky sessions, helps alleviate concerns about disrupting transactions in progress.

The AMI for the “cattle” web servers (3) is similar to our old single-server configuration, running Nginx and PHP tuned for Drupal. It’s actually a bit smaller of an instance size than the old server, though — since additional servers are automatically thrown into the application as needed based on load on the existing servers — and has some additional configuration that I’ll discuss below.

As you can see in the diagram, we still have many “pets” too. In addition to the surrounding infrastructure like our code repository (8) and continuous integration (7) servers, at AWS we have a “utility” server (9) used for hosting our development environment and some of our supporting scripts, as well as a single RDS instance (4) and a single EC2 instance used as a Memcache and Solr server (6). We also have an S3 instance for managing our static files (5) — more on that later.

Handling Mail

One potential whammy we caught late in the process was handling mail sent from the application. Since the IP of the given web server instance from which mail is sent will not match the SPF record for the domain (IP addresses authorized to send mail), the mail could be flagged as spam or mail from the domain could be blacklisted.

We were already running Mandrill for Drupal’s transactional mail, so to avoid this problem, we configured our web server AMI to have Postfix route all mail through the Mandrill service. Amazon Simple Email Service could also have been used for this purpose.

Static File Management

With our infrastructure in place, the main change at the application level is the way Drupal interacts with the file system. With multiple web servers, we can no longer read and write from the local file system for managing static files like images and other assets uploaded by site editors. A content delivery network or networked file system share lets us offload static files from the local file system to a centralized resource.

In our case, we used Drupal’s S3 File System module to manage our static files in an Amazon S3 bucket. S3FS adds a new “Amazon Simple Storage Service” file system option and stream wrapper. Core and contributed modules, as well as file fields, are configured to use this file system. The AWS CLI provided an easy way to initially transfer static files to the S3 bucket, and iteratively synch new files to the bucket as we tested and proceeded towards launch of the new system.

In addition to static files, special care has to be taken with aggregated CSS and Javascript files. Drupal’s core aggregation can’t be used, as it will write the aggregated files to the local file system. Options (which we’re still investigating) include a combination of contributed modules (Advanced CSS/JS Aggregation + CDN seems like it might do the trick), or Grunt tasks to do the aggregation outside of Drupal during application build (as described in Justin Slattery’s excellent write-up).

In the case of Cascade, we also had to deal with complications from CiviCRM, which stubbornly wants to write to the local file system. Thankfully, these are primarily cache files that Civi doesn’t mind duplicating across webservers.

Drush & Cron

We want a stable, centralized host from which to run cron jobs (which we obviously don’t want to execute on each server) and Drush commands, so one of our “pets” is a small EC2 instance that we maintain for this purpose, along with a few other administrative tasks.

Drush commands can be run against the application from anywhere via Drush aliases, which requires knowing the hostname of one of the running server instances. This can be achieved most easily by using AWS CLI. Something like the bash command below will return the running instances (where ‘webpool’ is an arbitrary tag assigned to our autoscaling group):

$ aws ec2 describe-instances --filters "Name=tag-key, Values=webpool" | grep ^INSTANCE | awk '{print $14}' | grep 'compute.amazonaws.com'

We wrote a simple bash script, update-alias.sh, to update the ‘remote-host’ value in our Drush alias file with the hostname of the last running server instance.
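
For context, here is a sketch of the kind of alias file that update-alias.sh rewrites (the paths and values are hypothetical); it typically lives somewhere like ~/.drush/cascade.aliases.drushrc.php:

<?php
$aliases['prod'] = array(
  'uri' => 'cascade.org',
  'root' => '/var/www/cascade/docroot',
  // This is the value update-alias.sh replaces with a running instance's hostname.
  'remote-host' => 'ec2-203-0-113-10.compute.amazonaws.com',
  'remote-user' => 'deploy',
);
?>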

Our cron jobs execute update-alias.sh, and then the application (both Drupal and CiviCRM) cron jobs.

Deployment and Scaling Workflows

Our webserver AMI includes a script, bootstrap.sh, that either builds the application from scratch — cloning the code repository, creating placeholder directories, symlinking to environment-specific settings files — or updates the application if it already exists — updating the code repository and doing some cleanup.

A separate script, deploy-to-autoscale.sh, collects all of the running instances similar to update-alias.sh as described above, and executes bootstrap.sh on each instance.

With those two utilities, our continuous integration/deployment process is straightforward. When code changes are pushed to our Git repository, we trigger a job on our Jenkins server that essentially just executes deploy-to-autoscale.sh. We run update-alias.sh to update our Drush alias, clear the application cache via Drush, tag our repository with the Jenkins build ID, and we’re done.

For the autoscaling itself, our current policy is to spin up two new server instances when CPU utilization across the pool of instances reaches 75% for 90 seconds or more. New server instances simply run bootstrap.sh to provision the application before they’re added to the webserver pool.

There’s a 300-second grace period between autoscale operations to prevent a stampede of new cattle. Machines are destroyed when CPU usage falls beneath 20% across the pool, and they’re removed one at a time, so capacity ramps down more gradually than it ramps up, which fits the traffic profile.

More Butts on Bikes

With this new architecture, we’ve taken a huge step toward one of Cascade’s overarching goals: getting “more butts on bikes”! We’re still tuning and tweaking a bit, but the application has handled this year’s registration period flawlessly so far, and Cascade is confident in its ability to handle the expected — and unexpected — traffic spikes in the future.

Our performant web application for Cascade Bicycle Club means an easier registration process, leaving them to focus on what really matters: improving lives through bicycling.


Jan 08 2015
Jan 08

When we talk about Drupal performance, the first thing that comes to my mind is caching. But today I found another way to make Drupal run a little faster. It is not a profound change, but it is one that many may overlook. At work, I need to process 56,916 records constantly with an automated cron process. It took 13 minutes 30 seconds to process all of those records. By adding a new database index, I reduced the processing time to just one minute 33 seconds. That is more than eight times faster.

Here are the details. I have about fifty thousand records that are updated daily. For each record, a hash is created and stored in a field. Whenever I insert or update a record, I check whether that hash already exists in the database. The project requires searching on the field revision table. Here is the code in my custom module.

$exist = db_query("SELECT EXISTS(Select entity_id from {field_revision_field_version_hash} where field_version_hash_value = :hash)", array(':hash' => $hash))->fetchField();
// Return if we have imported this schedule item before.
if ($exist) {
  return;
}

Checking the hash code in the database became one of the heaviest operations, consuming a lot of system resources. Adding a single index to the field revision table made the process eight times faster. Here is the code I put in the module's install file.

// Add version-hash indexes.
if (!db_index_exists('field_revision_field_version_hash', 'version_hash')) {
  db_add_index('field_revision_field_version_hash', 'version_hash', array('field_version_hash_value'));
}

When we build a Drupal website, we usually are not dealing with the database directly. But even though Drupal creates the tables for us, we can still alter them to make them work better.

Nov 05 2014
Nov 05

In the spirit of the video game Doom and its skill levels, we’ll review a few ways you can improve your Drupal site’s speed and optimize for better results and server response time. These tips may at times be specific to Drupal 6, but you can always take the underlying best practices from these examples and apply them to your own code base.

[Image: Doom]

Using indexes and proper SQL queries can boost performance by a huge factor, especially if the affected tables are very big (millions of rows). Take a look at the diff below, which shows a fix to an improper and ill-advised database query:

[Screenshot: diff of the query fix]

The badly performing query took anywhere between 6 and 60 seconds to run, depending on the data, the database load, and the database’s current cache state. The new query takes milliseconds.
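
The diff itself is only shown as an image, but the general shape of this kind of fix is adding an index on the column the slow query filters on. A hypothetical Drupal 6 sketch (table and column names invented):

<?php
/**
 * Add an index on the column used in the slow query's WHERE clause.
 */
function mymodule_update_6001() {
  $ret = array();
  db_add_index($ret, 'mymodule_data', 'created_idx', array('created'));
  return $ret;
}
?>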

Jun 17 2014
Jun 17
  • Drupal Performance Tuning for Better Database Utilization – Introduction

Drupal is a great CMS or CMF, whichever your take on it, but it can definitely grow up to be a resource hog, with all of those contributed modules implementing hooks to no avail. It is even worse when developers aren’t performance oriented (or security oriented, god save us all), and this can unknowingly take its toll on your web application’s performance.

Drupal performance tuning has seen its share of presentation decks, tutorials, and even dedicated books such as PacktPub’s Drupal 6 Performance Tips, but getting great performance seems to be a never-ending task, so here are some thoughts on where you should start looking.

[Image: Drupal database performance meme]

Checklist for glancing further into Drupal’s rabbit hole and getting insights on tuning your web application for better performance:

  1. Enable the MySQL slow query log to trace all the queries which take a long time (usually >1 second is enough; with later versions of MySQL, or compliant databases like Percona Server or MariaDB, you can also specify the slow query threshold in milliseconds)
  2. Enable MySQL slow query log to also log any queries without indexes
  3. Make sure to review all of those query logs with EXPLAIN to figure out which queries can be better constructed to employ good use of indexes. Where indexes are missing it’s worth reviewing if the database would benefit from modifying existing indexes (and not breaking older queries)
  4. Use percona-toolkit to review outstanding queries
  5. Use New Relic’s PHP server side engine which can tune into your web application and provide great analysis on function call time, wall time, and overall execution pipelines. While it’s not a must, I’ve personally experienced it and it’s a great SaaS offering for an immediate solution without having to need to install alternatives like XHProf or Webgrind.
Apr 25 2013
Apr 25

I recently began working on some PHP code for resolving HTML5 entities into their Unicode codepoints. According to the code, it had been optimized for performance. The code was moderately complex, and the authors appeared to have gone through great pains to build a specialized lookup algorithm. But when I took a closer look, I doubted. I decided to compare the "optimized" version with what I would call a naive version -- the simplest solution to the problem.

Here I show the two solutions, and then benchmark them for both memory and speed.

Up front, I want to state that I am not poking fun at or deriding the original authors. Their solution has merits, and in a compiled language it might actually be a faster implementation. But to optimize for PHP requires both an understanding of the language and an appreciation for opcode-based interpretation.

UPDATE: Jeff Graham made a very astute observation on Twitter:

@technosophos Nice writeup looks like the diff is on the order O(log n) vs. O(1) for ~2000 entries. Knuth put it best c2.com/cgi/wiki?Prema…— Jeff Graham (@jgraham909) April 25, 2013

UPDATE: In the initial version of this article, I claimed that the tree was a b*tree. It's actually not. It's just a standard node tree. As such, there is no way to make it outperform a hash table. Based on this, the conclusion of the article is patently obvious. However, it is good to see how damaging mis-application of an algorithm can be to overall performance.

The Code: HTML5 Character Reference Resolving

HTML5 defines over 2,000 character references (expressed as entities). There is no computable way to match the string-based entity names to their Unicode code points. So to solve the problem, one must build a lookup mechanism that maps the character reference's string name to a Unicode character.

For example, the character reference name of ampersand (&) is amp. Likewise, Á is Aacute. A lookup tool should be able to take that string name (amp, Aacute) and return the right unicode character. PHP has an older function that does this, but it only supports a subset of the characters in HTML5, so full standards support requires work.

The Optimized Code

The original code solved the problem as follows:

  • The string names were broken down into characters and then packed into a datastructure designed to work like a node tree.
  • The node tree was serialized to a nice compact 178k file (see the end of the article for a sample).
  • At runtime, the serialized file is read into memory once.
  • To lookup a character, a function walks the node tree. When it finds the full reference, it returns a codepoint.

In theory, this sounds very good. But does it perform? I took the code, optimized it as much as possible without changing the underlying algorithm, and wrote this simple test harness:

<?php
 
class Data {
  protected static $refs;
  public static function getRefs() {
    // A serialized node tree of entities is stored on disk,
    // It takes 178k of disk space. It is only loaded once.
    if (empty(self::$refs)) {
      self::$refs = unserialize(file_get_contents('./named-character-references.ser'));
    }
    return self::$refs;
  }
}
 
 
function test($lookup)  {
  // Load the entities tree.
  $refs = Data::getRefs();
 
  // Split the string into an array.
  $stream = str_split($lookup);
  $chars = array_shift($stream);
 
  // Dive into the tree and look for a match.
  // The tree is structured like a tree:
  // array (a => array(m => array(p => 38)))
  $codepoint = false;
  $char = $chars;
  while ($char !== false && isset($refs[$char])) {
    $refs = $refs[$char];
    if (isset($refs['codepoint'])) {
      $id = $chars;
      $codepoint = $refs['codepoint'];
    }
    $chars .= $char = array_shift($stream);
  }
 
  // Return the codepoint.
  return chr($codepoint);
};
 
$r = test('amp');
 
printf("Lookup %s >Current mem: %d Peak mem: %d\n", $r, memory_get_usage(), memory_get_peak_usage());

(Note that using static classes seems to improve memory usage in PHP5. It saves around 0.5M of runtime memory compared to a global or local variable.)

Several things stood out to me, though:

  1. While a serialized file might seem logical, it's actually a burden on PHP. Unserializing is not fast.
  2. The node tree isn't really a node tree. It's actually an n-depth hash table. Tree traversal is not going to be very fast.
  3. Not only will tree traversal be slow, but the node tree-like table is going to require a lot of memory.

The Naive Code

Based on my observations, I decided to compare it to a naive implementation: I stored all of the entities in a single hashtable (in PHP parlance, this is an array). Rather than serialize, I just put the entire thing into a PHP file.

So my code looked like this:

<?php
 
class Entities {
public static $byName = array (
  'Aacute' => 'Á',
  'Aacut' => 'Á',
  'aacute' => 'á',
  'aacut' => 'á',
  // Snip a few thousand lines
  'zwj' => '',
  'zwnj' => '',
);
}
 
function test($lookup)  {
  if (!isset(Entities::$byName[$lookup])) return FALSE;
 
  return Entities::$byName[$lookup];
};
 
$r = test('amp');
 
printf("Lookup %s >Current mem: %d Peak mem: %d\n", $r, memory_get_usage(), memory_get_peak_usage());

Compared to the previous code, this should be self-explanatory. We take a string to look up, and we look it up. (We could actually remove the isset() check by adding @ to the second call to Entities.) The total file size for this file is 47k, so it's still smaller than the serialized node tree.

My hypothesis coming out of this was that the lookup would be substantially faster, and the memory usage would be a little lower. The real results surprised me, though.

Comparing

Memory Usage

The above examples are both tooled already to report memory usage. Let's compare:

$ php test-btree.php
Lookup & >Current mem: 5289928 Peak mem: 5575336

To do a single entity lookup, this code took around 5MB of memory. How does that compare to the naive version?

$ php test-hash.php
Lookup >Current mem: 1263200 Peak mem: 1276768

The naive version used a little over a fifth of the memory that the node tree version used.

Speed Test

So the node tree version takes a little extra memory... that's not a deal breaker, is it? Probably not. In fact, if they perform about the same, then it's probably inconsequential.

To test performance, I did the following:

  • Put the test script on a local webserver
  • Warmed the server by requesting each script ten times
  • Ran ab -n 1000 http://localhost/Perf/test-SCRIPT.php

The output of ab (The Apache Benchmarking Tool) is verbose, but here's the pertinent bit for the node tree version:

Concurrency Level:      1
Time taken for tests:   10.564 seconds
Complete requests:      1000
Failed requests:        0
Write errors:           0
Total transferred:      261000 bytes
HTML transferred:       49000 bytes
Requests per second:    94.66 [#/sec] (mean)
Time per request:       10.564 [ms] (mean)
Time per request:       10.564 [ms] (mean, across all concurrent requests)
Transfer rate:          24.13 [Kbytes/sec] received

The easiest number to zoom in on is Time taken for tests: 10.564 seconds, which conveniently averages to about 10.6 msec per request.

Let's compare with the hashtable version:

Concurrency Level:      1
Time taken for tests:   2.541 seconds
Complete requests:      1000
Failed requests:        0
Write errors:           0
Total transferred:      261000 bytes
HTML transferred:       49000 bytes
Requests per second:    393.61 [#/sec] (mean)
Time per request:       2.541 [ms] (mean)
Time per request:       2.541 [ms] (mean, across all concurrent requests)
Transfer rate:          100.33 [Kbytes/sec] received

Again, the important number: 2.541 seconds, or about 2.5 msec per request.

The naive version took only one quarter of the time that the optimized version took.

Now here's an interesting additional piece of data: This was without opcode caching. Would opcode caching make a difference?

Speed Test with Opcode Caching

To test opcode caching, I installed apc, restarted Apache, and then re-ran the battery of tests.

Here's how the optimized version fared:

Concurrency Level:      1
Time taken for tests:   10.636 seconds
Complete requests:      1000
Failed requests:        0
Write errors:           0
Total transferred:      261000 bytes
HTML transferred:       49000 bytes
Requests per second:    94.02 [#/sec] (mean)
Time per request:       10.636 [ms] (mean)
Time per request:       10.636 [ms] (mean, across all concurrent requests)
Transfer rate:          23.96 [Kbytes/sec] received

For all intents and purposes, there was no real change.

Compare that to the naive version:

Concurrency Level:      1
Time taken for tests:   2.025 seconds
Complete requests:      1000
Failed requests:        0
Write errors:           0
Total transferred:      261000 bytes
HTML transferred:       49000 bytes
Requests per second:    493.81 [#/sec] (mean)
Time per request:       2.025 [ms] (mean)
Time per request:       2.025 [ms] (mean, across all concurrent requests)
Transfer rate:          125.86 [Kbytes/sec] received

The opcode cache has trimmed a little over half a millisecond off of the request time. That makes the naive version more than 5x faster than the optimized version.

Why did it make a difference for the naive version, but not the node tree-based version? The reason is that the .ser file, introduced ostensibly to speed things up, cannot be cached, as it's not code. So on each request, it must be re-loaded into memory.

Meanwhile, all 2,000 entities in the hashed version are conveniently stored in-memory. Assuming the server has sufficient cache space, that data will not need to be re-read and re-interpreted on each subsequent request.

One Additional Strength of the Optimized Code

While I opted to take the naive version of the code, there is one additional strength of the optimized code: Under certain conditions, this sort of algorithm can become more fault-tolerant. The optimized version can sometimes find a codepoint for a reference that was not well formed, because it traverses until it finds a match, and then it stops. The problem with this algorithm, though, is that given the input string 'foobar' and an entity map that contains 'foo' and 'foobar', the matched candidate will be 'foo'.

The naive version of the code does not correct for encoding errors. If the entity name isn't an exact match, it is not resolved.

Appendix: XHProf Stack

Want to know where all of that time is spent? Here's an xhprof dump of the two call stacks.

Using xhprof_enable(XHPROF_FLAGS_CPU + XHPROF_FLAGS_MEMORY);, I gathered the following stats:

node tree Version

Array
(
    [Data::getRefs==>file_get_contents] => Array
        (
            [ct] => 1
            [wt] => 219
            [cpu] => 0
            [mu] => 183888
            [pmu] => 192392
        )
 
    [Data::getRefs==>unserialize] => Array
        (
            [ct] => 1
            [wt] => 9083
            [cpu] => 8001
            [mu] => 4644728
            [pmu] => 4729336
        )
 
    [test==>Data::getRefs] => Array
        (
            [ct] => 1
            [wt] => 9346
            [cpu] => 8001
            [mu] => 4648312
            [pmu] => 4921728
        )
 
    [test==>str_split] => Array
        (
            [ct] => 1
            [wt] => 4
            [cpu] => 0
            [mu] => 2072
            [pmu] => 0
        )
 
    [test==>array_shift] => Array
        (
            [ct] => 4
            [wt] => 10
            [cpu] => 0
            [mu] => 480
            [pmu] => 0
        )
 
    [test==>chr] => Array
        (
            [ct] => 1
            [wt] => 3
            [cpu] => 0
            [mu] => 1128
            [pmu] => 0
        )
 
    [main()==>test] => Array
        (
            [ct] => 1
            [wt] => 9435
            [cpu] => 8001
            [mu] => 4654408
            [pmu] => 4921728
        )
 
    [main()==>xhprof_disable] => Array
        (
            [ct] => 1
            [wt] => 0
            [cpu] => 0
            [mu] => 1080
            [pmu] => 0
        )
 
    [main()] => Array
        (
            [ct] => 1
            [wt] => 9454
            [cpu] => 8001
            [mu] => 4657560
            [pmu] => 4921728
        )
 
)

Hash Version

Array
(
    [main()==>test] => Array
        (
            [ct] => 1
            [wt] => 139
            [cpu] => 0
            [mu] => 1072
            [pmu] => 0
        )
 
    [main()==>xhprof_disable] => Array
        (
            [ct] => 1
            [wt] => 1
            [cpu] => 0
            [mu] => 1080
            [pmu] => 0
        )
 
    [main()] => Array
        (
            [ct] => 1
            [wt] => 178
            [cpu] => 0
            [mu] => 4160
            [pmu] => 0
        )
 
)

Appendix 2: Sample SER data

Array
(
    [A] => Array
        (
            [E] => Array
                (
                    [l] => Array
                        (
                            [i] => Array
                                (
                                    [g] => Array
                                        (
                                            [;] => Array
                                                (
                                                    [codepoint] => 198
                                                )
 
                                            [codepoint] => 198
                                        )
 
                                )
 
                        )
 
                )
 
            [M] => Array
                (
                    [P] => Array
                        (
                            [;] => Array
                                (
                                    [codepoint] => 38
                                )
 
                            [codepoint] => 38
                        )
 
                )
Oct 08 2012
Oct 08


In the Drupal community, you see caching discussions related to pages, blocks, reverse-proxies, opcodes, and everything in between. These are often tied to render- and database-intensive optimizations to decrease the load on a server and increase throughput. However, there is another form of caching that can have a huge impact on your site’s performance – module level data caching. This article explores Drupal 7 core caching mechanisms that modules can take advantage of.

When?

Not all modules require data caching, and in some cases due to “real-time” requirements it might not be an option. However, here are some questions to ask yourself to determine if module-level data caching can help you out:

  • Does the module make queries to an external data provider (e.g. web service API) that returns large datasets?
  • If the module pulls data from an external source, is it a slow or unreliable connection?
  • If calling a web service, are there limits to the number of calls the module can make (hourly, daily, monthly, etc.)? Also, if it is a pay service, is it a variable cost based on number of calls?
  • Does the hosting provider have penalties for large amounts of inbound data?
  • Does the data my module handles require significant processing (e.g. heavy XML parsing)?
  • Is the data the module loads from an external source relatively stable and not change rapidly?

If you answered, “yes,” to more than a third of the questions above, module-level data caching can probably help your module’s performance by providing the following features:

  • Decrease external bandwidth
  • Decrease page load times
  • Reduce load on the site’s server
  • Provide reliable data services

Where?

OK, so you’ve decided your module could probably benefit from some form of module-level data caching. The next thing to determine is where to store it. You can always use some form of file-based caching, but to implement that with the proper abstractions to run on a variety of servers requires calls through the Drupal core File APIs, which can be a bit convoluted at times. File-based caching mechanisms also cannot take advantage of scalable performance solutions like memcache or multiple database server configurations that might be changed at any time.

Luckily, Drupal core provides a cache mechanism available to any module using the cache_get and cache_set functions, fully documented on http://api.drupal.org:

<?php
cache_get($cid, $bin = 'cache')
cache_set($cid, $data, $bin = 'cache', $expire = CACHE_PERMANENT)
?>

By default, these functions work with the core cache bin called simply “cache.” This is the main dumping ground for Drupal core: data that can persist in the system beyond a single page request and is not tied to a session. However, many modules define their own cache bins so they can provide their own cache management processes. A few core module ones are:

  • cache_block
  • cache_field
  • cache_filter
  • cache_form
  • cache_menu
  • cache_page

Seeing as how several core Drupal modules implement their own cache bins, the next questions for your new module are:

  • Does the module need to manage its cache in a manner that is not consistent with the main cache bin?
  • Will its cache need to be flushed independently of the main cache at any time, or have some other expiration logic assigned to it that falls outside of the core cron cache clear calls?

If the answer to either of these questions is, “yes,” then a dedicated cache bin is probably a wise idea.

Cache bin management is abstracted in the Drupal system via classes implementing DrupalCacheInterface. The core codebase provides a default database-driven cache mechanism via DrupalDatabaseCache that is used for any cache bin type that has not been overridden with a custom class (see the documentation on DrupalCacheInterface for details on how to do that) and has a table in the database named the same as the bin. This table conforms to the same schema as the core cache tables. For reference, this is the core cache table schema in MySQL that we will use as the base for our module’s cache bin:

+------------+--------------+------+-----+---------+-------+
| Field      | Type         | Null | Key | Default | Extra |
+------------+--------------+------+-----+---------+-------+
| cid        | varchar(255) | NO   | PRI |         |       |
| data       | longblob     | YES  |     | NULL    |       |
| expire     | int(11)      | NO   | MUL | 0       |       |
| created    | int(11)      | NO   |     | 0       |       |
| serialized | smallint(6)  | NO   |     | 0       |       |
+------------+--------------+------+-----+---------+-------+
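
As an aside, a bin can be pointed at a non-database backend from settings.php. A minimal sketch, assuming the contributed memcache module is installed at the path shown:

<?php
// settings.php: route the default bins and our custom bin through memcache.
$conf['cache_backends'][] = 'sites/all/modules/memcache/memcache.inc';
$conf['cache_default_class'] = 'MemCacheDrupal';
$conf['cache_class_cache_cachemod'] = 'MemCacheDrupal';
?>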

How?

For the sake of simplicity, we will assume that our module is fine with using the default cache mechanism and database schema. As an exercise, we will also assume that we meet the criteria for defining our own cache bin so we can explore all the hooks required to implement a complete custom bin leveraging the default cache implementation. The sample module is called cachemod, and the cache bin name is cache_cachemod.

Define the cache bin schema

In order to add a table with the correct schema to the system, we borrow from some code found in the block module that copies the schema from the core cache table and add this to our install hooks in cachemod.install:

<?php
/**
 * Implements hook_schema().
 */
function cachemod_schema() {
  // Create new cache table using the core cache schema.
  $schema['cache_cachemod'] = drupal_get_schema_unprocessed('system', 'cache');
  $schema['cache_cachemod']['description'] = 'Cache bin for the cachemod module';
  return $schema;
}
?>

Now that we have defined a table for our cache bin that replicates the schema of the core cache table, we can make basic set and get calls using the following:

<?php
cache_get($cid, 'cache_cachemod');
cache_set($cid, $data, 'cache_cachemod');
?>

Using our new cache bin

Notice the CID (cache ID) parameter. This will need to be unique to the data being stored, so in the case of something like a web service, the CID might be built from the arguments being passed to the service and the data will be the returned data. One way to abstract this so you get consistent CID values for calls to cache_get and cache_set is to build a helper function. This sample assumes our service call takes an array of key-value pairs:

<?php
/**
 * Util function to generate cid from service call args.
 */
function _cachemod_cid($args) {
  // Make sure we have a valid set of args.
  if (empty($args)) {
    return NULL;
  }
  // Make sure we are consistently operating on an array.
  if (!is_array($args)) {
    $args = array($args);
  }
  // Sort the array by key, serialize it, and calc the hash.
  ksort($args);
  $cid = md5(serialize($args));
  return $cid;
}
?>

Now we can implement a basic public web service function leveraging our cache like this:

<?php
/**
 * Public function to execute web service call.
 */
function cachemod_call($args) {
  // Create our cid from args.
  $cid = _cachemod_cid($args);

  // See if we have cached data already.
  $cache = cache_get($cid, 'cache_cachemod');
  if ($cache) {
    return $cache->data;
  }

  // No such luck, go try to pull it from the web service.
  $data = _cachemod_call_service($args);
  if ($data) {
    // Great, we have data! Store it off in the cache.
    cache_set($cid, $data, 'cache_cachemod');
  }

  return $data;
}
?>

Note that there are several values for the optional expire parameter to the cache_set call that are fully documented in the API docs.
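
For instance, a time-limited entry can be written like this, reusing $cid and $data from cachemod_call() above (one hour is an arbitrary choice):

<?php
// Cache the result for one hour instead of permanently.
cache_set($cid, $data, 'cache_cachemod', REQUEST_TIME + 3600);
?>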

Hooking into the core cache management system

If you want your module’s cache bin to clear out when Drupal executes a cache wipe during cron runs or a general cache_clear_all, set the expire parameter in your cache_set call above to either CACHE_TEMPORARY or a Unix timestamp to expire after, and add the following hook to your module:

<?php
/**
 * Implements hook_flush_caches().
 */
function cachemod_flush_caches() {
  $bins = array('cache_cachemod');
  return $bins;
}
?>

This will add your cache bin to the list of bins that Drupal’s cron task will empty.

Additionally, if you would like to add your cache bin to the list of caches that drush can selectively clear, add the following to your module in a file named cachemod.drush.inc:

<?php
// Implements hook_drush_cache_clear().
function cachemod_drush_cache_clear(&$types) {
  $types['cachemod'] = '_cachemod_cache_clear';
}

// Util function to clear the cachemod bin.
function _cachemod_cache_clear() {
  cache_clear_all('*', 'cache_cachemod', TRUE);
}
?>

Note that if you set the expiration of the cache item to CACHE_PERMANENT (the default), only an explicit call to cache_clear_all with the item’s CID will remove it from the cache.
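
For example, reusing $cid from the earlier snippets:

<?php
// Explicitly drop a single permanent item from our custom bin.
cache_clear_all($cid, 'cache_cachemod');
?>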

Conclusion

Sometimes it makes sense to have a module cache data for its own use, and even possibly in its own cache bin to maintain a finer-grained control of the data and cache management if something beyond the core cache management is required. Utilizing the cache abstraction built into Drupal 7 core and some custom classes, hooks, and drush callbacks can give your module a range of options for reducing data calls, processing overhead, and bandwidth consumption. For more detailed info, check out the API pages at http://api.drupal.org for the functions, classes and hooks mentioned above.

As a Senior Developer at Phase2, Robert Bates is able to pursue his interests in solving complex multi-tier integration challenges with elegant solutions. He has experience not only in traditional web programming languages such as PHP and ...

Aug 01 2012
Aug 01

In the last 24 hours, I have had three glimpses into emergency plans. First, my local coffee shop -- the true source of my productivity -- experienced a water main break. Second, a site I manage experienced a server failure. Third, I came across Netflix's recently open sourced Chaos Monkey tool.

The emergencies I am talking about don't involve physical risk and are certainly not life-or-death circumstances. But they deal with a threat to a business: the loss of customers and revenue. Reflecting on these provides some insight into prevention and practice.

The Coffee Shop

When I walked into my corner coffee shop for the morning Joe, the barista informed me that due to a water main break, they were unable to make most of their drinks. I got lucky -- they still had some drip coffee. I ordered and sat down to drink it.

I watched the team deal with customers who came in, deal with the equipment, and engage in problem solving. After an hour, the talk turned from carrying on to closing shop. My curiosity was piqued, so I asked about the protocol for handling situations like this. They explained that they have a procedure to follow for emergencies, and that they had stepped through it to the best of their abilities. But there was a problem: The water main break was sudden, and the protocol didn't adapt well to cases where there was an unexpected and sudden loss of water.

The team did an admirable job, and took measures to encourage customers to come back. But the emergency plan had a flaw. (With a little inventiveness, I think they managed to avoid closing for the day.)

The Website

I admit that in most cases I am not good at planning for emergencies. But in the case of the website that failed, we had a plan. We had thought through enough of the possibilities that even the case that occurred was one we had anticipated. When the site failed, we implemented the plan, and it worked.

We could pat ourselves on the back, but here's the thing: Until our outage, we didn't know whether the emergency plan would work. We got lucky -- and even in our luck, there was a fair amount of flapping as we tried to implement our untested recovery plan.

If only we were more proactive in testing our emergency plan.

Chaos Monkey

Netflix takes a different approach to their emergency plan: they simulate emergencies. In fact, they built a tool to simulate emergencies for them, the aptly named Chaos Monkey.

In a nutshell, Chaos Monkey causes servers to break. Yes, they intentionally break their servers. Then the emergency recovery process kicks in. If some part of the emergency process is broken, they'll know and they will be prepared to react (I presume that it's not 2 AM on a Sunday when they run this thing).

Because of this testing -- somewhat chaotic, but in a controlled environment -- the Netflix engineers can test and improve emergency plans.

An emergency plan is a must-have. But to play its role, it's got to work.

Jun 18 2012
Jun 18

The PHP CURL library is the most robust library for working with HTTP (and other protocols) from within PHP. As a maintainer of the HPCloud-PHP library, which makes extensive use of REST services, I've been tinkering around with ways of speeding up REST interactions.

What I've found is a way to cut off nearly 70% of the processing time for a typical usage scenario. For example, our unit tests used to take four minutes to run, and we're now down to just over a minute, while our Drupal module's network time has been cut by over 75%.

This article explains how we accomplished this with a surprisingly simple (and counter-intuitive) modification.

The typical way to work with CURL is to use the curl_init() call to create a new CURL handle. After suitable configuration has been done, that CURL handle is typically executed with curl_exec(), and then closed with curl_close(). As might be expected, this builds a new connection for each CURL handle. If you create two handles (calling curl_init() twice, and executing each handle with curl_exec()), the unsurprising result is that two connections to the remote server are created.

But for our library, a pattern emerges quickly: many requests are sent to the same servers. In some cases, several hundred requests may go to the same server, and even the same URL (though with different headers and bodies). If we use the method above of creating one CURL handle per request, the network overhead for each request can really slow things down. This is compounded by the fact that almost all requests are done over SSL, which means each request has not only network overhead, but also SSL negotiation overhead.

This is hardly a new problem, and HTTP has several methods for dealing with this. Unfortunately, CURL, as I've used it above, cannot make use of any of these. Why not? Because each CURL handle does its own connection management. But there were hints in the PHP manual that there may be ways to share connections. And when looking at CURL's raw output, I could see it leaving connections open for re-use. But how could I make use of those?

Reuse a CURL handle

The first method was to call curl_init() once, and then call curl_exec() multiple times before calling curl_close(). This method is described (sparsely) in a Stack Overflow discussion.

I gave this method a try, but immediately ran up against issues. While I suspect that this method works for simple configurations, our library is not simple. It makes deep use of CURL's configuration API, passing input and output streams around, and conditionally setting many options depending on the type of operation being performed. We use GET, HEAD, POST, PUT, and COPY requests, sometimes in rapid succession. Sometimes we provide only scant data to the server, while other times we are working with large objects. Re-using the same CURL handle did not work well in this situation. While it is easy to set an option, it is not possible to unset or reset an option.

After trying several methods of resetting options, I abandoned this approach and began digging again.

CURL Multi is not just for parallel processing

The hint that changed everything came from this entry in the CURL FAQ:


"curl and libcurl have excellent support for persistent connections when transferring several files from the same server. Curl will attempt to reuse connections for all URLs specified on the same command line/config file, and libcurl will reuse connections for all transfers that are made using the same libcurl handle.
When you use the easy interface, the connection cache is kept within the easy handle. If you instead use the multi interface, the connection cache will be kept within the multi handle and will be shared among all the easy handles that are used within the same multi handle. "

It took me a moment to realize that the easy interface was curl_exec, but once I caught on, I knew what I needed to do.

The CURL multi library is typically used for running several requests in parallel. But as you can see from the FAQ entry above, it has another virtue: It caches connections. As long as the CURL multi handle is re-used, CURL connections will automatically be re-used as long as possible.

This method provides the ability to set different options on each CURL handle, but then to run each CURL handle through the CURL multi handler, which provides the connection caching. While this particular chunk of code never executes requests in parallel, CURL multi still provides a huge performance boost.

A quick test of this revealed instant results. Running a series of requests that took 14 seconds on the original configuration took only five seconds with CURL multi. (How does all of this compare to the built-in PHP HTTP Stream? It took 22 seconds to run the same tests, and it takes over seven minutes to run the same batch of tests that takes CURL multi 1.5 minutes.)

An Example in Code

While the HP Cloud library is object oriented, here is a simple procedural example that shows (basically) what my starting code looked like and what the finished code looked like.

Initially, we were using a simple method of executing CURL like this:

<?php
function get($url) {
  // Create a handle.
  $handle = curl_init($url);

  // Set options...

  // Do the request.
  $ret = curl_exec($handle);

  // Do stuff with the results...

  // Destroy the handle.
  curl_close($handle);
}
?>

While our actual code does a lot of options configuring and then does a substantial amount with $handle after the curl_exec() call, this code illustrates the basic idea.

Refactoring to make use of CURL multi, the final code looked more like this:

<?php
function get2($url) {
  // Create a handle.
  $handle = curl_init($url);

  // Set options...

  // Do the request.
  $ret = curlExecWithMulti($handle);

  // Do stuff with the results...

  // Destroy the handle.
  curl_close($handle);
}

function curlExecWithMulti($handle) {
  // In real life this is a class variable.
  static $multi = NULL;

  // Create a multi handle if necessary.
  if (empty($multi)) {
    $multi = curl_multi_init();
  }

  // Add the handle to be processed.
  curl_multi_add_handle($multi, $handle);

  // Do all the processing.
  $active = NULL;
  do {
    $ret = curl_multi_exec($multi, $active);
  } while ($ret == CURLM_CALL_MULTI_PERFORM);

  while ($active && $ret == CURLM_OK) {
    if (curl_multi_select($multi) != -1) {
      do {
        $mrc = curl_multi_exec($multi, $active);
      } while ($mrc == CURLM_CALL_MULTI_PERFORM);
    }
  }

  // Remove the handle from the multi processor.
  curl_multi_remove_handle($multi, $handle);

  return TRUE;
}
?>

Now, instead of using curl_exec(), we supply a method called curlExecWithMulti(). This function keeps a single static $multi instance (again, our actual implementation is more nuanced and less... Singleton-ish). This $multi instance is shared for all requests, and doing this allows us to make use of CURL multi's connection caching.

In each call to curlExecWithMulti(), we add $handle to the $multi request handler, execute it using CURL multi's execution style, and then remove the handle once we are done.

There is nothing particularly fancy about this implementation. It is actually even more complicated than it needs to be (I eventually want to make curlExecWithMulti() be able to take an array of handles for parallel processing). But it certainly does the trick.

Using that pattern for the HPCloud PHP library, I re-ran our unit tests. The unit test run typically takes between four and five minutes to handle several hundred REST requests. But with this pattern, the same tests took under a minute and a half -- and made over 300 requests over the same connection.

We will continue to evolve the HPCloud PHP library to improve performance even more. Parallel and asynchronous processing is one performance item on the roadmap. And we have others as well. If you've got some tricks you'd like to share, feel free to drop them in the issue
queue at GitHub and let us know.

Apr 22 2012
Apr 22

We had a site for a client that was stable for close to two years, then suddenly started to experience switches from the master to the geographically separate slave server as frequently as twice a week.

The site is an entertainment news site, and its articles get to Google News on occasions.

The symptoms were increased load on the server and a sudden influx of traffic causing over 800 simultaneous connections, all in the ESTABLISHED state.

Normally, a well tuned Drupal site can withstand this influx, with server optimization and proper caching. But for this previously stable site, we found that a combination of factors, some internal to the site and others external, combined to cause the site to switch.

The internal factor was the way the site was set up using purl and other custom code. Links were rewritten to add a top-level section, which then redirected to the real URL. As a result, around 30% of URL accesses produced a 302 redirect. Since redirects are not cached, they incurred more overhead than regularly served pages.

Investigating the root cause

We started checking if there is a pattern, and went back to analyse the server logs as far back as a year.

We used the ever-helpful GoAccess tool to do most of the investigative work.

A week in April 2011 had 28% redirects, but we found an anomaly in the browser share over the months. For that same April week, the browser breakdown was 34% MSIE, 21% Safari and 21% Firefox.

For a week in Sep 2011, redirects were 30%, and the browsers were 26% Safari, 25% MSIE and 20% Firefox. These figures make sense as Safari gains market share and Microsoft loses it.

But when checking a week in Feb 2012, redirects were 32%, and look at the browsers: 46% Firefox, 16% Safari, 14% Others and 12% MSIE.

It does not make sense for Firefox to jump by that much and gain market share from thin air.

A partial week in March 2012 shows that redirects were 32%, and again the browsers were 52% Firefox, 14% Others, 13% Safari and 10% MSIE.

That MSIE dropped is something one can understand. But the jump in Firefox from Sep to Feb/March is unjustified, and tells us that perhaps there are crawlers, scrapers, leechers or something else masquerading as Firefox and hitting our content.

Digging deeper, we find that the top 2 Firefox versions are:

27,092 Firefox/10.0.2
180,420 Firefox/3.0.10

The first one is understandable, a current version of Firefox. The second one is a very old version from 2009, and has 6.6X the traffic of the current version!

The user agent signature is the same in every case, with a 2009 build:

Mozilla/5.0 (Windows; U; Windows NT 5.1; en-US; rv:1.9.0.10) Gecko/2009042316 Firefox/3.0.10 (.NET CLR 3.5.30729)

We went back and looked at a week in September (all hours of the day), with that browser signature, and lo and behold:

Unique visitors that suck lots of bandwidth:

  Visitors  Share   Date         Bandwidth
  88        10.49%  24/Sep/2011  207.76 MB
  113       13.47%  23/Sep/2011  994.44 MB
  109       12.99%  22/Sep/2011    1.44 GB
  133       15.85%  21/Sep/2011    1.70 GB
  134       15.97%  20/Sep/2011    1.68 GB

There were only 335 different IP addresses!

But look at the same user agent in March for a week:

  Visitors  Share   Date         Bandwidth
  94479     38.36%  15/Mar/2012  16.38 GB
  102037    41.43%  14/Mar/2012  17.13 GB
  38795     15.75%  13/Mar/2012  12.48 GB
  11003      4.47%  12/Mar/2012  10.90 GB

See the number of unique visitors compared to September?
And now there are 206,225 different IP addresses!

For a few days in March, Monday to Thursday, here are the figures for this user agent.

Total requests to pages (excluding static files): 1,122,229
Total requests that have an empty referer: 1,120,843
That is, 99.88% are from those botnets!

Verifying the hypothesis

Looking at the web server logs through awstats, we found that a year ago, in Feb 2011, the overall market share for Firefox was 24.7% with 16,559,999 hits. At that time, Firefox 3.0.10 had only 44,436 hits.

That is 0.002% of the total.

In Sep 2011 it had 0.2% with 241,869 hits.

Then in Feb 2012, that old version from 2009 had a 2.2% share of hits, with 4,409,396 hits.

So, from 0.002% to 2.2% of the total, for an obsolete version of Firefox. This means growth by a factor of 1,100X in one year.

Does not make sense.

Botnet hammering the site

So, what does this tell us?

Looking at a sample of the IP addresses, we found that they all belong to cable or DSL providers, mainly in the USA.

This tells us that there is a massive botnet that infects lots of PCs.

They were piloting the botnet in September and went full speed after that, and they are hitting the server hard.

The programs of the botnet seem to have a bug that prevents them from coordinating with each other, so they all try to grab new content at the same time. This poor coding, combined with the non-caching of 302 redirects, causes the sudden influx of traffic that brings the server to its knees.

Just to make sure, we quickly checked two other sites that we manage for the same symptoms. One, an entertainment site, is showing similar signs; the other, a financial site, is not. Both have good caching because they serve no redirects (97% to 98% of responses return code 200), and that is why the entertainment site can withstand the onslaught.

Solution: block the botnet's user agent

Since the botnet is coming from hundreds of thousands of IP addresses, it is not possible to block based on the IP address alone.

Therefore, the solution was to block requests coming with that browser signature from 2009 only, and only when there is no referer.

This solution, which goes into settings.php, prevents Drupal from fully booting when the bad browser signature is encountered and the referer is empty.

We intentionally sent the humorous, but still legitimate, 418 HTTP return code so we can filter by that when analysing logs.

$botnet = 'Gecko/2009042316 Firefox/3.0.10';
if ($_SERVER['HTTP_REFERER'] == '') {
  if (FALSE !== strpos($_SERVER['HTTP_USER_AGENT'], $botnet)) {
    header("HTTP/1.0 418 I'm a teapot");
    exit();
  }
}

The above should work in most cases.

However, a better solution is to keep the changes at the Apache level and never bother with executing any PHP code if the conditions are met.

# Fix for botnet crawlers, by 2bits.com, Inc.
#
# Referer is empty
RewriteCond  %{HTTP_REFERER}    ^$
# User agent is bogus old browser
RewriteCond  %{HTTP_USER_AGENT} "Gecko/2009042316 Firefox/3.0.10"
# Forbid the request
RewriteRule  ^(.*)$ - [F,L]

The drawback is that we are using a 403 (access denied) instead of the 418 (I am a teapot), which can skew the statistics a bit in the web server logs.

Further reading

After investigating and solving this problem, I discussed the issue with a friend who manages several high traffic sites that are non-Drupal, and at the time, he did not see the same symptoms. However, a few weeks later he started seeing the same symptoms, and sent me the first two articles. Months later, I saw the third:

Apr 17 2012
Apr 17

A lot of very interesting things are happening to make Drupal's caching system a bit smarter. One of my favorite recent (albeit smaller) developments is a patch (http://drupal.org/node/1471200) for the Views module that allows for cached views to have no expiration date. This means that the view will remain in the cache until it is explicitly removed.

Before this patch landed, developers were forced to set an arbitrary time limit for how long Views would store the cached content. So even if your view's content only changed every six months, you had to choose a time limit from a list of those predefined by Views, the maximum of which was 6 days. Every six days, the view content would be flushed and regenerated, regardless of whether its contents had actually changed or not.

The functionality provided by this patch opens the door for some really powerful behavior. Say, for instance, that I have a fairly standard blog view. Since I publish blog posts somewhat infrequently, I would only like to clear this view's cache when a new blog post is created, updated, or deleted.

To set up the view to cache indefinitely, click on the "Caching" settings in your view and select "Time-based" from the pop-up.

Then, in the Caching settings form that follows, set the length of time to "Custom" and enter "0" in the "Seconds" field. You can do the same for the "Rendered output" settings if you'd like to also cache the rendered output of the view.

Once you save your view, you should be all set.

Next, we need to manually invalidate the cached view whenever its content changes. There are a couple different ways to do this depending on what sort of content is included in the view (including both of the modules linked to above). In this case, I'll keep it lightweight and act on hooks in a custom module:

/**
 * Implements hook_node_insert().
 */
function MY_MODULE_node_insert($node) {
  if ($node->type == 'blog') {
    cache_clear_all('blog:', 'cache_views', TRUE);
  }
}

Do the same in hook_node_update() and hook_node_delete(), sketched below.
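
A minimal sketch of those two hooks, following the same pattern (MY_MODULE is the same placeholder module name as above):

/**
 * Implements hook_node_update().
 */
function MY_MODULE_node_update($node) {
  if ($node->type == 'blog') {
    cache_clear_all('blog:', 'cache_views', TRUE);
  }
}

/**
 * Implements hook_node_delete().
 */
function MY_MODULE_node_delete($node) {
  if ($node->type == 'blog') {
    cache_clear_all('blog:', 'cache_views', TRUE);
  }
}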

And just like that, my view is only regenerated when it needs to be, and should be blazing fast in between.

The patch was committed to the 7.x-3.x branch of Views on March 31, 2012, so for now you will have to apply the patch manually until it is included in the next point release.

Happy caching!

Mar 09 2012
Mar 09

The Gateway to 21st Century Skills (www.thegateway.org) is a semantic web enabled digital library that contains thousands of educational resources and as one of the oldest digital libraries on the web, it serves educators in 178 countries. Since 1996, educational activities, lesson plans, online projects, and assessment items have been contributed and vetted by over 700 quality organizations.

Given their rich pedigree, the site serves over 100,000 resources each month to educators worldwide. Since 2005, the Gateway has been managed by JES & Co., a 501(c)(3) non-profit educational organization. The original site was built on Plone several years ago. In recent years the constraints of the old site proved too great for the quality and quantity of content, and the needs of its increasingly engaged readership. It was becoming difficult and expensive to manage and update in its current configuration.


JES & Co., as an organization with a history of embracing innovation, decided to move the Gateway onto Drupal and looked to 10jumps to make the transition happen. The site had to be reliable with very high uptime. Moreover, the site would have to handle millions of hits without batting an eyelid. And most importantly, the faceted search would have to work well with the semantically described records. Based on the requirements, Acquia's managed cloud seemed like the best approach. It can help a site scale across multiple servers, and Acquia provides high availability with full fail-over support.
“If something does go down, we know that Acquia 24x7 support has our back” - 10jumps


How they did it


There were several hosting options, but very few that met the requirements for the Gateway. And definitely none that made the development-testing-production migration seamless and easy. Usually there are too many manual steps raising the chances of error.

After a few rounds of technology and support evaluation calls, Acquia was retained to provide hosting and site support. A good support package, combined with the expertise of Acquia's support team, was a compelling reason to make the move. The technical team at 10jumps was also fairly confident that the move would be a good choice for their customer, the Gateway, freeing them to focus on site development. With Acquia's Managed Cloud and the self-service model, code, local files and the database can be migrated between development, testing and production systems literally with mouse clicks. With the seamless migration, the development cycle became shorter, and with Git in place, collaboration between developers became easier. Moreover, caching for anonymous content was provided out of the box, so the 10jumps developers did not have to navigate tricky cache settings. Moving the developers to the new platform was the first step, and soon the team was on an agile development track, able to develop and roll out features quickly.

The result

After the new site went live, we were certain that TheGateway.org would not be affected by traffic spikes, nor would the site go down because of a data center outage. More importantly, the semantically described data could be searched more efficiently because of the integration with Apache's Solr search that comes from being in the Acquia cloud.
The development life cycle had gone from being clunky and broken to being smooth and agile. The redesigned site makes it simpler for end users to navigate through large amounts of data, and the powerful search is returning better results, improving the overall user experience.

Feb 23 2012
Feb 23

One of our clients came to us with a performance issue on their Drupal 6 multi-site installation. Views were taking ages to save, the admin pages seemed unnecessarily sluggish, and clearing the cache put the site in danger of going down. They reported that the issue was most noticeable in their state-of-the-art hosting environment, yet was not reproducible on a local copy of the site — a baffling scenario as their 8 web heads and 2 database servers were mostly idle while the site struggled along.

Our performance analysis revealed two major issues. After implementing fixes, we saw the average time to save a Drupal view drop from 2 minutes 20 seconds to 4.6 seconds — a massive improvement. Likewise, the time to load the homepage on a warm cache improved from 2.3 seconds to 621 milliseconds. The two bugs that accounted for these huge gains turned out to be very interesting:

1. Intermediary system causes MySQL queries to slow down

Simple queries that are well indexed and cached can see significant lag when delivering packets through an intermediary. This actually has nothing to do with Drupal, as it is reproducible from the MySQL command line utility. (It's probably a bug in the MySQL libraries, but we're not entirely sure.) It could also be a problem with the intermediary, but we've reproduced it in two fairly different systems: F5's load balancer proxy and VMWare Fusion's network drivers/stack.

For example:

SELECT cid, data, created, expire, serialized FROM cache_menu WHERE cid IN (x)

A query like this one should execute in a millisecond or less. In our client’s case, however, we found that 40ms was being added to the query time. The strange part is that this extra delay only occurred when the size of the data payload returned was above a certain threshold, so most of the time, similar queries returned quickly, but around 10–20 of these simple queries had 40ms or more added to their execution time, resulting in significant slowdowns.

We briefly debugged the MySQL process and found it to be waiting for data. Unfortunately, we didn’t pursue this much further as the simple workaround was apparent: reroute the MySQL traffic directly to the database instead of through the F5 load balancer. The net change from applying this simple modification is that the time to save a view was reduced to 25.3 seconds.

2. Database prefixing isn’t designed to scale as the number of prefixes increases

Drupal can be configured to share a database with another system or Drupal install. To do this, it uses a function called db_prefix_tables() to add prefixes to table names so they don’t collide with other applications’ table names. Our client was using the table prefixing system to allow various sub-sites to share data such as views and nodes, and thus they had 151 entries in the db_prefixes list.

The problem is that db_prefix_tables() is not well optimized for this implementation edge case. It will run an internal PHP function called strtr() (string translate) for each prefix, on every database query string. In our case, saving a view executed over 9200 queries, meaning strtr() was called more than 1.4 million times!

We created a fix using preg_replace_callback() which resulted in both the number of calls and execution time dropping dramatically. Our view could now be saved in a mere 10.3 seconds. The patch is awaiting review in the Drupal issue queue, and there’s a patch for Pressflow 6, too, in case someone needs it before it lands in core.
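
To illustrate the idea, here is a conceptual sketch (not the actual core patch; the function name and signature are invented for the example, and it assumes PHP 5.3 closures): instead of one strtr() call per prefix, every {table} token is rewritten in a single regex pass.

<?php
/**
 * Conceptual sketch only: prefix every {table} token in one pass instead of
 * calling strtr() once per configured prefix.
 */
function example_prefix_tables($sql, array $prefixes) {
  $default = isset($prefixes['default']) ? $prefixes['default'] : '';
  unset($prefixes['default']);
  $callback = function ($matches) use ($prefixes, $default) {
    $table = $matches[1];
    $prefix = isset($prefixes[$table]) ? $prefixes[$table] : $default;
    return $prefix . $table;
  };
  return preg_replace_callback('/\{([A-Za-z0-9_]+)\}/', $callback, $sql);
}

// Example: one scan of the query string, no matter how many prefixes are configured.
// example_prefix_tables('SELECT nid FROM {node} n INNER JOIN {users} u ON ...',
//   array('default' => 'site1_', 'users' => 'shared_'));
?>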

The final tweaks included disabling the administration menu and the query logger. At that point, we finally reached a much more palatable 4.6 seconds for saving a view — still not as fast as it could be, but given the large number of views in the shared system, a respectable figure.

Feb 11 2012
Feb 11
Beer and developer conferences go hand in hand.

A few weeks ago I presented “CDNs made simple fast and cheap” at the Drupal Downunder conference in Melbourne, Australia.

The talk covered:

  • the importance of good client side performance,
  • how a CDN works,
  • recommended CDN providers (from an Australian’s perspective),
  • a demonstration of how to set up a CDN and
  • a summary of the results (better YSlow score and page download times).


Setting up a CDN is very easy to do and cost effective. If you want your users to have the best online experience, there is nothing stopping you!

The CDN presentation is available as PDF slides and a video.

Thanks to my employer PreviousNext who kindly sponsored my trip to Melbourne. Hats off to Wim Leers for contributing the CDN module.


Feb 11 2012
Feb 11

Acquia Insight

Posted on: Saturday, February 11th 2012 by Brandon Tate

Site performance is usually addressed on a per-issue basis: your site was popular, but your code couldn't handle the popularity! Thankfully, Acquia Dev Cloud and Managed hosting options provide a pretty neat little tool that can help you out with site performance and SEO. It's called Acquia Insight. The information provided is broken down into two areas: SEO Grader and Insight.

SEO Grader

An area that is commonly overlooked when creating a site is how well your site performs in the SEO world. Once you log into the Acquia Network, there will be a link on the left-hand side called “SEO Grader”. Clicking this will take you to the overview page that gives you a brief look at how your site is doing. If you have any critical issues with your site, you'll see them here with an exclamation mark beside them. Clicking the “Analysis” link under the SEO Grader menu will give you a more in-depth look. Here it is broken down into sections called Page Structure, Crawlability, Findability, User Experiences and Best Practices. Each section will list issues with your site, and each issue has a problem explanation and solution provided, so not only do you understand why it's affecting your site, you also understand how to fix it. As well, each issue can be ignored, since not everything listed will be relevant to your site.

Insight

As a developer, the Insight section is very helpful since it gives me an overall look at my site's performance in terms of code, server configuration and MySQL statistics. Once logged into the Acquia Network there is a link for “Insight”; clicking it gives you an overview page with your overall score as well as access to the Code, Server and Statistics information pages.

The Code section allows you to see every module you have installed on your site and the files it contains. If any files within a module have been modified, you can click into that module and see a code-level diff view of the file, which allows you to identify any possible problems. The Server section gives you an overview of the PHP and server information. The Statistics section provides you with MySQL statistics that allow you to see things like your cache hit ratio, queries in cache, slow queries, etc. Each statistic here has information as to what the stat relates to and a possible solution if the stat is on the wrong side of the fence. For instance, if your slow queries number is high, you can view the MySQL slow queries log on the server and fix the queries that were slowing your site down.

If you click the “Analysis” link under the Insight menu, you can access the Performance, Security and Best Practices sections. As with the SEO Grader, you can access a list of issues with your site and each issue has a problem / solution provided to help you out.

Email Alerts

Acquia Insight allows you to receive emails concerning Performance, Security or SEO issues as they arise. This is helpful since you can't always be checking in on your site as it lives out its life cycle. Acquia provides a slider tool to configure which emails you receive (Critical, Warning, Notice or All).

Lifecycle

Over time the performance of a site can degrade and issues can arise from use. Normally, if you're not keeping watch, these degradation issues will occur at the worst times and usually wreak havoc for a live site. Insight provides a graphical view of your score, which allows you to foresee issues as they develop over time.

Best Practices

I really enjoy the fact that Insight allows you to become a better developer, since it provides best practices and highlights issues with problems and solutions. So over time you'll notice your sites perform better out of the box because you've been learning from this helpful tool and implementing these solutions beforehand.

Conclusion

Overall I enjoy Acquia Insight and all its capabilities. It provides helpful information and allows you to monitor, troubleshoot and tune for optimal performance. If that isn't enough for you and you want more information about your site's performance, Acquia has another service available with New Relic that provides an extensive web monitoring and management tool that can be easily set up through the Acquia Network interface as well.

Jan 28 2012
Jan 28

Evan wrote a good article about the various high-level options you have when looking at performing some lightweight Drupal performance tuning. It's good if you're just looking to optimize page load times and resource utilization on a server you may not have command-line access to. What I'm going to share is what we've learned over the course of many Drupal deployments in VPS and dedicated hosting environments where we DO have access to some of the more low-level options and settings made available with root or administrator access.

For this post, I'm just going to cover Apache and APC. I'll cover MySQL and PHP in a follow up post.

Typically we deploy sites on the Debian based Ubuntu so all config options and file locations are specific to that, but you should have no trouble translating the info here to other Linux distributions.

First at bat: Apache.

I won't go into "non-standard" web server configurations like running nginx as a proxy (though it's a great option and I highly suggest it if you can), CDNs, or running anything other than mod_php. Just standard Apache with mod_php enabled.

First off, take a look at /etc/apache2/mods-enabled

For every thread that Apache spins up to handle a request, all those modules get loaded and they all take up your precious RAM. Take a look, I bet you don't recognize half of those. If you do, good for you, but then you probably also know that we definitely don't need all of them enabled to run a Drupal site.

From a standard Apache package install on Ubuntu, I've removed the following:

  • sudo a2dismod cgi
    when was the last time you ever put anything in cgi-bin?
  • sudo a2dismod negotiation
    we'll generally handle our content negotiations within php
  • sudo a2dismod autoindex
    prettifies a directory without a directory index file (index.html, index.php) present
  • sudo a2dismod env
    provides environment variables to cgi scripts and ssi pages. won't need those for Drupal
  • sudo a2dismod reqtimeout
    not 100% sure, but I turned it off and everything seems fine
  • sudo a2dismod setenvif
    again, not something we really need for PHP and Drupal

The following modules are useful for restricting access at the web server level. If you don't need any access restrictions beyond what Drupal already provides, you can turn these off without issue as well.

  • sudo a2dismod auth_basic
    provides authentication based on the following modules
  • sudo a2dismod authn_file
    provides a plaintext user/pass lookup for authenticating users
  • sudo a2dismod authz_default
    fallback module to deny all access if groupfile/host and user are not enabled
  • sudo a2dismod authz_groupfile
    provides authentication based on groups defined on the server itself
  • sudo a2dismod authz_host
    provides ip or hostname authentication
  • sudo a2dismod authz_user
    similar to the group authentication, but for users

Disabling modules won't make the biggest impact on speed, but it will free up a bit of memory that you can use elsewhere: extra threads for Apache or more RAM allocated to APC.

Next up is the Apache config file: /etc/apache2/apache2.conf.

I'll just list a few key variables that should be changed from their defaults; a consolidated snippet follows the list.

  • Timeout 300
    Timeout tells Apache how long it can wait, process and return a single response from a client. A timeout of 300 is way too high. You'll want to drop this to something like 30.
  • MaxKeepAliveRequests 100
    With KeepAlive on by default, we're telling Apache to serve files using persistent connections. So one client will have one thread serving multiple resources needed to display a page load. With Drupal, generally we see pages that can easily load a hundred files or more. CSS, JS, images and fonts. Setting the MaxKeepAliveRequests to something higher will allow Apache to serve more content within a single persistent connection. 
  • KeepAliveTimeout 5
    KeepAliveTimeout tells your persistent connection how long it should sit around and wait for the next request from a client. I would set this a little lower, but not too low. Something around 2 or 3 seconds should be sufficient.
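
Putting those together, the relevant lines in apache2.conf would look something like this (the MaxKeepAliveRequests value of 200 is just an illustrative "something higher"; tune it to how many assets your pages load):

Timeout 30
KeepAlive On
MaxKeepAliveRequests 200
KeepAliveTimeout 3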

Your mpm.

Out of the box, you're most likely running the prefork mpm. The defaults are pretty good for your standard Drupal install, but we've found bumping up the min and max spare servers can help when dealing with a sudden small spike in traffic. Changes to this usually require some research into how your server is already performing, but the thing to keep in mind is that the more "servers" you have running, the more RAM you'll be using. Then again, what is the RAM on a dedicated web server for if not serving web pages?

<IfModule mpm_prefork_module>
StartServers 5
MinSpareServers 5
MaxSpareServers 10
MaxClients 150
MaxRequestsPerChild 0
</IfModule>

Here's what we generally use for a moderate site that handles a few thousand hits per day:

<IfModule mpm_prefork_module>
StartServers 5
MinSpareServers 10
MaxSpareServers 20
MaxClients 200
MaxRequestsPerChild 0
</IfModule>

Next at bat: APC

Like Evan mentioned in his article, APC is a must. Not a "would be nice", but an absolute must. The thing about APC that we learned a while ago is that installing it is not enough. In fact, it can actually make your site run A LOT slower in its default state. The key is getting in and configuring it properly. This can take a little trial and error, but it's definitely worth it. Below is our APC configuration; on our systems it's located at /etc/php5/conf.d/apc.ini.

extension=apc.so
apc.enabled=1
apc.shm_segments=1
apc.shm_size=64M
apc.optimization=0
apc.num_files_hint=512
apc.user_entries_hint=1024
apc.ttl=0
apc.user_ttl=0
apc.gc_ttl=600
apc.cache_by_default=1
apc.filters="apc\.php$"
apc.slam_defense=0
apc.use_request_time=1
apc.mmap_file_mask=/dev/zero
apc.file_update_protection=2
apc.enable_cli=0
apc.max_file_size=2M
apc.stat=1
apc.write_lock=1
apc.report_autofilter=0
apc.include_once_override=0
apc.rfc1867=0
apc.rfc1867_prefix="upload_"
apc.rfc1867_name="APC_UPLOAD_PROGRESS"
apc.rfc1867_freq=0
apc.localcache=1
apc.localcache.size=512
apc.coredump_unmap=0
apc.stat_ctime=0

This configuration is for a moderately sized Drupal install. What you'll want to do is find the apc.php file at /usr/share/php/apc.php and copy it into your web server root. This file will help you determine how effectively APC is working on your system and whether you need to make adjustments to the amount of RAM you make available to APC.

Good APC

The graph above is from our development server, so we've set the RAM a little higher, but the pertinent info is the same. What you want to pay attention to is your hits/misses and your fragmentation. For hits and misses, you want to make sure your hits are quite a bit higher than your misses. You'll get a miss the first time you hit a page, but beyond that you should be getting hits. Fragmentation is more important. Fragmentation happens when you have a new item that needs to be stored in cache and you have no available room in your cache to store it. APC will discard a cached item to make room for the new item. Fragmentation occurs when the item being discarded is larger than the new item coming in. That leftover space can only be used if a file small enough can fill the spot; otherwise it sits unused. That's fragmentation. To prevent it, you'll want to monitor APC's memory usage as page loads come in, and once you start to see fragmentation, increase the RAM allocated to APC by a few MBs and try again. The graph below is something you never want to see.

Bad APC

These tweaks should get you quicker page loads with less RAM being taken up by Apache. One of the big slowdowns in Drupal can be MySQL so if you haven't got the performance bump you hoped for here, then wait for the MySQL/PHP post coming soon.

If there's anything I missed or anything that I inevitably got completely wrong, feel free to post a comment and publicly scold me.

Jan 06 2012
Jan 06

Performance problems are a bit of an embarrassment of riches. It's great that your site is getting a lot of visitors, but are all those visitors slowing down your site or potentially causing crashes? How do you go about resolving this? Unfortunately there's no hard-and-fast answer to performance problems, and examining the sheer breadth and depth of the topic would result in the longest and most boring blog post ever. However, I will attempt to outline the general strategies you might follow to diagnose and solve problems.

The first step is to consider what part of your site is experiencing problems. A particular page or View being slow, for example, generally indicates a different set of problems than the entire site feeling sluggish. Another issue to consider is whether it's the backend that's slow, or the frontend. A tool such as Chrome's web inspector can tell you what assets (CSS files, JS files, images) are being loaded, how big they are and how long they take, among other useful metrics.

Secondly, it's a good idea to determine what the bottleneck is, if possible. Take a look at the load on your CPU when you notice poor performance as serving complex PHP applications can be quite taxing. You might also want to turn on the Devel module's query log (or the MySQL slow query log) to determine if any queries are taking an exceptional amount of time.

When tuning the performance of a site and its server, I'll typically follow a strategy whereby I implement a number of steps, then re-examine performance before gradually moving on to more complex solutions if the need arises. On every production Drupal site I'll take the following steps:

  • Enable CSS/JS compression
  • Make sure APC is enabled and properly configured
  • Disable unnecessary modules
  • Clean up as many PHP errors as possible, especially if we're logging with the dblog or syslog modules
  • Make sure the MySQL query cache is enabled and properly configured
  • Sprite images where possible for frontend performance benefits

The aforementioned steps will go a long way to ensuring that your site is performant while having very few, if any, detrimental effects. With that said, sometimes this isn't enough. Fear not, we can go further to attempt to address outstanding issues:

  • Cache expensive or time-consuming operations, for example, requesting data from a web service. Drupal makes this simple with cache_set() / cache_get() (see the sketch after this list).
  • Turn on additional caching, such as Views or Block, if possible
  • Use the MySQL slow query log and run EXPLAIN on the logged queries. Look for queries which have joins using filesort or temporary tables and determine whether these could be rewritten or sped up with an index.
  • Analyze the traffic pattern. A site with a largely anonymous user base can see a significant benefits from implementing the Boost module or Varnish in to the server stack.
  • Change cache backends. Drupal reads and writes a lot of data from its caches (as it should) so you can see some great speed improvements storing this data in memory (fast!) rather than the database (on the disk, slow!).
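
A minimal sketch of that caching pattern using Drupal 7's cache API (the function name and endpoint handling here are hypothetical):

<?php
/**
 * Hypothetical example: cache the result of an expensive web service call.
 */
function example_fetch_remote_data($endpoint) {
  $cid = 'example:remote:' . md5($endpoint);
  // Serve from the cache when we can; cache_get() returns FALSE on a miss.
  if ($cache = cache_get($cid)) {
    return $cache->data;
  }
  // Otherwise do the slow work and keep the result for an hour.
  $response = drupal_http_request($endpoint);
  $data = ($response->code == 200) ? drupal_json_decode($response->data) : array();
  cache_set($cid, $data, 'cache', REQUEST_TIME + 3600);
  return $data;
}
?>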

We're still just scratching the surface of optimizations that can be made to a website or the server stack, or even the servers themselves. Scaling vertically (using bigger servers) or horizontally (using more servers) are solutions that are often utilized when a site has outgrown its existing server (and sometimes before that point).

If your interest in performance and tuning has been piqued there's a couple of great resources for Drupal. Swing by the High Performance group (http://groups.drupal.org/high-performance) or http://2bits.com/ for a number of helpful tips on how you can improve the performance of your Drupal website.

We'd also be happy to discuss any questions or observations you might have, so feel free to post a comment!

Dec 23 2011
Dec 23

About thegateway.org:

The Gateway has been serving teachers continuously since 1996, which makes it one of the oldest publicly accessible U.S. repositories of education resources on the Web. The Gateway contains a variety of educational resource types, from activities and lesson plans to online projects to assessment items.

The older version of the website was on Plone. The team hired us to migrate it to Drupal, which was absolutely the right choice to make, given the many benefits that come with Drupal.

We redesigned the existing website, giving it a new look on Drupal. Then we hosted it on Acquia Managed Cloud to boost its performance and scalability. The new look is more compact, organized and easier to use.

It was a very interesting project for us and our team is proud to be a part of such a great educational organization serving the nation.

Looking forward to a grand success of the new launch!

thegateway.org BEFORE:

 

thegateway.org NOW:

Dec 09 2011
Dec 09

Drupal can power any site from the lowliest blog to the highest-traffic corporate dot-com. Come learn about the high end of the spectrum with this comparison of techniques for scaling your site to hundreds of thousands or millions of page views an hour. This Do it with Drupal session with Nate Haug will cover software that you need to make Drupal run at its best, as well as software that acts as a front-end cache (a.k.a. reverse-proxy cache) that you can put in front of your site to offload the majority of the processing work. This talk will cover the following software and architectural concepts:

  • Configuring Apache and PHP
  • MySQL Configuration (with Master/Slave setups)
  • Using Memcache to reduce database load and speed up the site
  • Using Varnish to serve up anonymous content lightning fast
  • Hardware overview for high-availability setups
  • Considering nginx (instead of Apache) for high amounts of authenticated traffic
Sep 22 2011
Sep 22

Tomorrow is the last day of Summer but the Drupal training scene is as hot as ever. We’ve scheduled a number of trainings in Los Angeles this Fall that we’re excited to tell you about, and we’re happy to publicly announce our training assistance program.

First, though, we’re sending out discount codes on Twitter and Facebook. Follow @LarksLA on Twitter, like Exaltation of Larks on Facebook or sign up to our training newsletter at http://www.larks.la/training to get a 15% early bird discount* toward all our trainings!

Los Angeles Drupal trainings in October and November, 2011

Here are the trainings we’ve lined up. If you have any questions, visit us at http://www.larks.la/training or contact us at trainings [at] larks [dot] la and we’ll be happy to talk with you. You can also call us at 888-LARKS-LA (888-527-5752) with any questions.

Beginner trainings:

Intermediate training:

Advanced trainings:

All our trainings are $400 a day (1-day trainings are $400, 2-day trainings are $800, etc.). We’re excited about these trainings and hope you are, too. Here are some more details and descriptions.

Training details and descriptions

   Drupal Fundamentals
   October 31, 2011
   http://ex.tl/df7

Drupal Fundamentals is our introductory training that touches on nearly every aspect of the core Drupal framework and covers many must-have modules. By the end of the day, you’ll have created a Drupal site that looks and functions much like any you’ll see on the web today.

This training is for Drupal 7. For more information, visit http://ex.tl/sbd7

   Drupal Scalability and Performance
   October 31, 2011
   http://ex.tl/dsp1

In this advanced Drupal Scalability and Performance training, we’ll show you the best practices for running fast sites for a large volume of users. Starting with a blank Linux virtual server, we’ll work together through the setup, configuration and tuning of Drupal using Varnish, Pressflow, Apache, MySQL, Memcache and Apache Solr.

This training is for both Drupal 6 and Drupal 7. For more information, visit http://ex.tl/dsp1

   Drupal Architecture (Custom Content, Fields and Lists)
   November 1 & 2, 2011
   http://ex.tl/ccfl1

Drupal Architecture (Custom Content, Fields and Lists) is our intermediate training where we explore modules and configurations you can combine to build more customized systems using Drupal. You’ll create many examples of more advanced configurations and content displays using the popular Content Construction Kit (CCK) and Views modules.

This training is for Drupal 6. For more information, visit http://ex.tl/ccfl1

   Developing RESTful Web Services and APIs
   November 3, 4 & 5, 2011
   http://ex.tl/dwsa1

Offered for the first time in Southern California, Developing RESTful Web Services and APIs is an advanced 2-day training (with an optional third day of additional hands-on support) for those developers seeking accelerated understanding of exploiting Services 3.0 to its fullest. This is THE training you need if you’re using Drupal to create a backend for iPad, iPhone or Android applications.

This training covers both Drupal 6 and Drupal 7. For more information, visit
http://ex.tl/dwsa1

Training assistance program

In closing, we’d like to tell you about our training assistance program. For each class, we’re setting aside a limited number of seats for students, unemployed job seekers and people in need.

For more details about the program, contact us at trainings [at] larks [dot] la and we’ll be happy to talk with you. You can also call us at 888-LARKS-LA (888-527-5752) with any questions.

* Our early bird discount is not valid toward the Red Cross First Aid, CPR & AED training and 2-year certification that we’re organizing. It’s already being offered at nearly 33% off, so sign up today. You won’t regret it and you might even save someone’s life. ^

Sep 20 2011
Sep 20

Drupal has a presence problem when it comes to front end performance; for the most part, it has been ignored. According to a study by Strangeloop, 97% of the time it takes a mobile page to render is spent in the front end. For desktop browsers, the front end makes up 85% of the time. These numbers may feel high, but when pages take 500ms to render in Drupal yet 6 seconds to display in an end user's browser, you can see where they come from.

The presence problem for Drupal can be seen in several places:

  1. At the past few DrupalCons, how many sessions have touched on front end performance? I can only recall one, while there have been many covering memcache, APC, and other server side technologies.
  2. Take a look at the documentation pages on profiling Drupal. Or search for documentation pages on performance. You'll find discussions about Apache Benchmark, learn about Varnish, etc. You won't learn about front end performance.
  3. Drupal doesn't provide minified JavaScript, which is considered a standard practice for production environments.
  4. The Drupal 8 development "gates" RFC gives 1 of 6 performance items to front end performance. The other 5 are detailed tips/gates for back end issues we've commonly run into. The front end one is a basic one-liner.

Front end performance is a big deal. This is more so true as we enter into the dominance of mobile where mobile devices are low powered and on high latency networks.

Pointing out problems is no good without solutions. The problem is in the amount of face time front end performance gets. So, let's get it some face time.

  • At DrupalCamps let's start presenting on it.
  • Drupal companies could benefit from having someone knowledgeable in house. Come up with ways to add it to your expertise. Maybe hold a book club and discuss a Steve Souders book.
  • When we learn about useful tools like ImageOptim or Sprite Cow, let's share them.
  • If you see a contrib module serving up JavaScript that has not been minified, file a patch. You can use UglifyJS easily through the web. UglifyJS is what jQuery uses.

Front end performance is a big deal. It's the largest part of the performance equation an end user experiences. Companies have done studies showing the financial and usage impact of end user performance. Let's elevate front end performance to the place it needs to be in the Drupal community.


Jul 25 2011
Jul 25

This past weekend was CapitalCamp, Washington DC's first DrupalCamp, and Treehouse presented three exciting sessions.

Treehouse CEO, Michael Caccavano, was on a panel with fellow CEOs of web technology companies in Growing a Drupal Agency Beyond the Garage.

Steven Merrill and I talked about guidelines for web performance optimization, techniques for improving front-end performance, and ways to automate the testing and monitoring of front-end functionality in Pages on a Silver Platter: Automated Front-End Testing and Tuning. If you saw this talk, we'd love your feedback! Please leave your quick review here: http://tha.cm/capitalcamp-front-end-testing

Then, Patrick Macom and I presented on Raphaël JS, an interactive drawing library for vector graphics on the web. The session covered the basics of the Raphaël JS API and then moved into integration strategies with Drupal in the talk, "Outside the Garden: Intro to Raphaël JS." If you saw this talk, we'd love your feedback! Please leave your quick review here: http://tha.cm/capitalcamp-raphael

Below are our presentations from the event; unfortunately, the event was not video recorded, so there won't be videos of the talks.

If you happened to be at CapitalCamp, we'd love your feedback on our sessions! Please fill out a quick review on SpeakerRate here: http://tha.cm/rate-drupal-and-raphael

If you missed us at CapitalCamp, don't worry! Treehouse Agency will be making appearances again this Friday, July 29th, 2011 at DrupalCamp Philadelphia. Sessions haven't been announced yet but our own JR Wilson has proposed, Node.js Integration with Drupal 7 and Steven Merrill has proposed Coat Your Website with Varnish.

Pages on a Silver Platter: Automated Front-End Testing and Tuning:

Outside the Garden: Intro to Raphaël JS:

Don't forget! Treehouse Agency will be at DrupalCamp Philadelphia this Friday, July 29th, 2011. We'll see you there!

Dec 20 2010
Dec 20

Here's a quick tip that will improve your performance and SEO. It's just a matter of uncommenting 2 lines of code in the .htaccess file that ships with Drupal (and replacing some text), but I've seen a lot of sites that forget this.

Suppose your domain is something like yourdomain.com. Check now if you can access your site by prefixing your domain with www. and by not doing it. So check if http://www.yourdomain.com is accessible and if http://yourdomain.com is (without redirection to one of the two). If so, this article is for you.

The problem

If you have Drupal's page cache enabled, 2 cache entries will be generated for your page if you visit the same page on http://www.yourdomain.com and http://yourdomain.com. This is because the cache key Drupal generates is based on the full URL (well, actually the cache key is the full URL). This means your cache hit rate will be lower and the performance gain you'd get from caching will not be as high.
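
To illustrate, the lookup is essentially the following (simplified from core's page cache code; $base_root is the scheme plus hostname):

<?php
// The page cache CID is the full URL, so the same page reached via two
// hostnames produces two separate entries in the cache_page bin.
$cid = $base_root . request_uri();
// e.g. "http://yourdomain.com/node/1" vs. "http://www.yourdomain.com/node/1"
$cache = cache_get($cid, 'cache_page');
?>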

Another issue is that this will have some impact on your search engine results ranking. If half of the people use the www-less domain and the other half use the www-having domain, the score of your page will be split between the two URLs.

The solution

How can you solve this? Easy. Open your Drupal installation's .htaccess file and look for the following part.

  # If your site can be accessed both with and without the 'www.' prefix, you
  # can use one of the following settings to redirect users to your preferred
  # URL, either WITH or WITHOUT the 'www.' prefix. Choose ONLY one option:
  #
  # To redirect all users to access the site WITH the 'www.' prefix,
  # (http://example.com/... will be redirected to http://www.example.com/...)
  # adapt and uncomment the following:
  # RewriteCond %{HTTP_HOST} ^example\.com$ [NC]
  # RewriteRule ^(.*)$ http://www.example.com/$1 [L,R=301]
  #
  # To redirect all users to access the site WITHOUT the 'www.' prefix,
  # (http://www.example.com/... will be redirected to http://example.com/...)
  # uncomment and adapt the following:
  # RewriteCond %{HTTP_HOST} ^www\.example\.com$ [NC]
  # RewriteRule ^(.*)$ http://example.com/$1 [L,R=301]

If you read the comments in this part, you'll know what to do: just uncomment the two lines that apply to your case (would you like to use the www or the www-less URLs?).

For example, suppose my domain is mydomain.com and I want to redirect all URLs to the www domain. I would uncomment and adapt the lines as follows:


RewriteCond %{HTTP_HOST} ^mydomain\.com$ [NC]
RewriteRule ^(.*)$ http://www.mydomain.com/$1 [L,R=301]

Caveat

One little caveat: if you update your Drupal installation, the .htaccess file will be overwritten (unless you explicitly avoid overwriting it). If you do overwrite your .htaccess file when updating, make sure you redo these modifications.

Dec 13 2010
Dec 13

While Drupal's performance is always of interest, it has a hard time defending itself against the features people want to add.

There are different ways to address this, but the "fewer features" approach is usually not defensible.

To defend itself from the feature onslaught, Drupal tries to load as few lines of PHP code as possible, which helps to increase performance. A PHP opcode cache (such as APC) helps even more and points the way to where further improvements can be made: outside of conventional PHP.

One idea that has come up several times is to create a Drupal-specific PHP extension that reimplements some of Drupal's often-used functions in C. Such an extension exists: it implements Drupal's check_plain and Drupal 7's drupal_static. One of our goals is to examine the gains brought by this approach.
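
For a sense of scale, the functions being replaced are tiny in PHP. Drupal 7's check_plain(), for example, is essentially a one-line wrapper around htmlspecialchars() (Drupal 6's version adds a UTF-8 validation check first):

// Drupal 7's check_plain(), essentially; the C extension replaces this
// (and drupal_static()) with a native implementation.
function check_plain($text) {
  return htmlspecialchars($text, ENT_QUOTES, 'UTF-8');
}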

Another very interesting approach is to remove PHP entirely from the code that is actually executed by the webserver. There have been a number of attempts over the years, but one can't say that any of them has found wide adoption yet.

Hiphop is a recent addition to this group of compilers. It was released by Facebook, which uses it in production to reduce load across its infrastructure. This latter aspect makes it interesting to look into and see whether it can be used with Drupal. Essentially, Hiphop translates your PHP code to C++, which is then compiled to a binary.

Besides the actual performance improvements, one should not disregard the total effort required to add and keep hiphop or a PHP extension in the toolchain:

To use the Drupal PHP extension, you only need a minimal patch to Drupal core. You then need to compile the extension and install it, which is also rather easy, but you will need to find a way to keep track of the extension and make sure that it will be available e.g. on new webservers you set up. However, in order to increase the use of the extension, you'll need to recreate all the Drupal functions that you have identified as being costly. Should there be a change in the core Drupal function, you will need to update your C reimplementation. On the other hand, some internal Drupal functions don't change all that much and the C code can be re-used across several Drupal versions.

To use hiphop, you need to first install a whole lot of libraries that aren't usually installed on a webserver. To make this easier, Tag1 consulting has developed HipHop RPMs for CentOS. There are also Debian packages maintained elsewhere. After you have successfully installed hiphop, you need to compile your PHP application. This is rather easy if you assemble a list of files that make up your PHP application and tell hiphop to compile all of them. The compilation will take quite a bit of time, but will succeed in the end if it is compatible. Drupal isn't completely compatible with hiphop. For Drupal 6 there is a small patchset on drupal.org which replaces calls to preg_replace(). Drupal 7 currently does compile with hiphop, but there is a bug in hiphop's implementation of the MySQL PDO layer that will not allow you to actually run it.

Since the resulting hiphop binary is not a PHP application anymore, there will be subtle differences in behavior. One example is Drupal's way of enabling and disabling modules: if you disable a module, its file will not be read in by PHP and thus its functions will not be available. If you disable a module in hiphop-compiled Drupal, you can still switch the status entry in the system table, but the module's functions will still be available. You therefore need to make sure you only compile in the modules you actually want to run, to exclude unforeseen consequences.

We have compared Drupal's performance for a total of ten setups:

A) Drupal 6

1) Drupal running under PHP as an Apache module

2) Same as 1), with the Drupal PHP extension

3) Same as 1) with APC opcode cache

4) Same as 2) with APC opcode cache

5) Drupal compiled with Hiphop

6) Drupal compiled with Hiphop, only required files

B) Drupal 7

1) - 4) as in A)

All installations use the same database, with the D7 version upgraded from the D6 version. The database has a few thousand nodes, users, and comments, as well as taxonomy terms. The actual content composition isn't very important since we are interested in PHP and not MySQL performance.

All measurements are done with a modified version of siege. The modification increases the precision of the measurements (or rather, the number of digits written to the results file).

All the final measurements were done on a Rackspace virtual machine with 2 GB of RAM and four cores.

We ran tests against two pages: a single node with comments, and the forum overview page. All requests were made as an anonymous user with the page cache disabled, which is appropriate since we are only interested in raw PHP performance. We requested several thousand pages for each test, using single concurrency.

Evaluating the tests proved much more difficult than anticipated. As a standard procedure, most people are happy to report the average of the measured quantity. More diligent people also report the standard deviation as a measure of the statistical significance of the result, as is appropriate for a normal distribution. On closer examination, however, the resulting distribution of measurements resembles a mixture of skewed distributions, so this procedure isn't appropriate.

Ideally, one would try to find out how to do one or more of the following:

  • Find reasons for the appearance of a mixture of distributions
  • Suppress some of these reasons for the existence of this mixture
  • Find a mathematical description for the mixture
  • Find a suitable model to express the statistical error for this mixture

While interesting, all of the above is difficult and exceeds the experimenter's experience with hard-core statistics. I can make the raw data available to anybody who has more of it.

We therefore do the following: the one-sigma boundary of a normal distribution marks the region containing 68% of all results of the experiment. We find a similar boundary by computing the same 68% threshold from our data and report the resulting difference from the mean as an error estimate. Since we have taken a significant number of measurements, this should not be too far off; regardless, it needs to be treated with some caution.
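
As a rough illustration (this is not code from the original experiment), here is one way to compute such a 68% interval around the mean from a list of raw per-request timings:

// Sketch: percentile-based "error bars" from raw per-request timings (seconds).
function percentile(array $sorted, $p) {
  // $sorted must be sorted ascending; $p is between 0 and 1.
  $index = (int) round($p * (count($sorted) - 1));
  return $sorted[$index];
}

function error_estimate(array $timings) {
  sort($timings);
  $mean = array_sum($timings) / count($timings);
  return array(
    'mean'  => $mean,
    'minus' => $mean - percentile($timings, 0.16),  // lower edge of central 68%
    'plus'  => percentile($timings, 0.84) - $mean,  // upper edge of central 68%
  );
}

// Example with a handful of made-up timings:
print_r(error_estimate(array(0.41, 0.39, 0.40, 0.45, 0.52, 0.38, 0.44, 0.47, 0.40, 0.43)));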

So, here are the graphs from the evaluation of the main forum page.

There's also a PDF of the graphs.

The obvious result is that Hiphop has by far the largest advantage over a "normal" (i.e. PHP + APC) Drupal install: a whopping 30%. The gains from the PHP extension, by contrast, are rather minor in every case (2-3%).

An additional, unfortunate result is that Drupal 7 is much slower (by about 60%), at least for this page.

Now, what's the conclusion? The conclusion is that Hiphop can give you gains not easily possible with other methods. Does that mean everybody should run it? That depends on the effort you are willing to put into it. For Drupal 6, you would need to check if Drupal behaves as Drupal should. For this it would be nice to have unit tests. Unfortunately, there aren't many. Drupal 7 has good coverage with unit tests, but as explained Hiphop needs to be fixed to run it properly.

The big advantage of Hiphop over the PHP extension is easily explained by the fact that the extension only translates a small part of Drupal to a high-performance language, whereas Hiphop does so for the complete code base. You can turn this argument around and say that a 2-3% improvement is a lot considering it was achieved by reimplementing one (D6) or two (D7) functions. This makes sense too, but to cover a larger share of the code you'd have to reimplement a lot of functions by hand. The hiphop approach has a lot of appeal here, since you can continue to write all code in PHP.

The next steps should be:

1) Evaluate other types of pages (single node view)

2) Look into general system performance under load.

This project was sponsored by examiner.com.

Oct 11 2010
Oct 11

On websites with many users, and especially many nodes, slow queries can kill your site's performance.

One central problem is the use of node access. It's recommended to avoid it as long as you can, but there may be other ways to get around some problems. Here is one specific example which is quite interesting.

The goal of the view was to show the X latest users together with their profile pictures and some additional information. The site used Content Profile, so a view with the base table "node" was used. Sadly, this query was horribly slow, so Views caching was set to a long lifetime (using Views caching is an easy, effective improvement and is highly recommended).

When the cache had to be rebuilt, the query was still slow. So instead of using node as the base table, switching to user as the base table did the trick. This improved the query time by a factor of 10, for several reasons:

  • no node access check is applied (which wasn't needed on this view)
  • fewer rows have to be joined

In Views 3, a patch recently landed that allows you to disable node access checks on a particular view: http://drupal.org/node/621142

Sep 06 2010
Sep 06

For one of our clients, we are running a Drupal site with about a million nodes. Before launch, those nodes are imported from another database and then indexed into Apache SOLR. The total time to index all of these nodes into an empty SOLR instance is measured in days rather than hours or minutes.

That's a bit too long to do this import regularly. So my (XDebug) profiler and I delved into the Apache SOLR module code to see where we could shave a few hours or days off the execution time.

It turned out that, in our case, three components were responsible for a large share of the execution time. Let's have a look.

BTW. We are using the latest dev build of version 2 of the Apache SOLR module.

Tip 1: Not indexing $document->body

When indexing nodes, the SOLR module needs to construct an Apache_Solr_Document object for each node. It passes all fields and metadata of the node in that document. The heaviest part of constructing this document is the assembling of the $document->body field. The module uses the node_build_content and drupal_render($node->content) functions to generate the body of the node.

In our case, we didn't really use the body since we were indexing companies with fields like name, address, manager, ... So we decided to remove the code from apachesolr_node_to_document that calculates the body. Although this one gave us a major performance boost, it might not be applicable in your case. We could use this because we didn't need the body of a node.

Also keep in mind that all other fields and metadata are assembled into the body too (depending on your search build mode configuration).

Tip 2: Add static caching to apachesolr_add_taxonomy_to_document

Another heavy thing that is going on while generating the Apache_Solr_Document object is fetching the taxonomy terms in apachesolr_add_taxonomy_to_document. For each term, the ancestors are calculated. In some cases you don't have a hierarchical vocabulary, so you could remove that code, but in case you have a hierarchical vocabulary, you could benefit a lot from static caching. You might have millions of nodes, but you probably have only a handful of terms (hundreds). So the ancestors of some term will be calculated multiple times.
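
The change itself is the standard Drupal static-caching pattern. Here's a minimal sketch with a hypothetical helper name; the real apachesolr_add_taxonomy_to_document() code is more involved:

// Hypothetical helper illustrating the static-caching pattern: the expensive
// ancestor lookup runs only once per term for the lifetime of the request.
function mymodule_term_ancestors($tid) {
  static $ancestors = array();

  if (!isset($ancestors[$tid])) {
    $ancestors[$tid] = taxonomy_get_parents_all($tid);
  }
  return $ancestors[$tid];
}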

Keep in mind though that you won't benefit too much from static caching if you're using batch processing for indexing with small batches, since the static cache is rebuilt for each batch step. So we wrote a Drush command to do the indexing. This way we're keeping the static cache for the full batch.

function drush_slimkopen_solr_index() {
  $cron_limit = variable_get('apachesolr_cron_limit', 50);

  while ($rows = apachesolr_get_nodes_to_index('apachesolr_search', $cron_limit)) {
    apachesolr_index_nodes($rows, 'apachesolr_search');
  }
}
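
For completeness, a callback like the one above also needs a drush command definition. Here's a hypothetical sketch following drush's naming conventions; the author's actual command file may look different:

// Hypothetical slimkopen.drush.inc: with a command file named "slimkopen",
// drush maps the "solr-index" command to the drush_slimkopen_solr_index()
// callback above by convention.
function slimkopen_drush_command() {
  return array(
    'solr-index' => array(
      'description' => 'Index all nodes pending for Apache SOLR in one long-running process.',
    ),
  );
}

Running the command then does the whole indexing job in a single PHP process, so the static caches stay warm for the entire run.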

Tip 3: Don't check excluded content types

The SOLR module has a nice feature that allows you to exclude certain content types from being indexed. It turns out the check for excluded content types is pretty expensive. It happens in the apachesolr_get_nodes_to_index('apachesolr_search', $limit) call, where the apachesolr_search_node table is joined with the node table. For the initial import, we removed the check for excluded types (the join with node) and indexed all nodes; the excluded ones we removed from the index afterwards.

This was possible in our case since the bulk of nodes (99.9% of them) needed to be indexed.

Conclusion

Drupal and its modules are developed to work in a lot of environments and situations. So, in addition to implementing what they're designed to do, they also contain a lot of code that checks whether a certain condition or context applies. But when you are deploying a module, you know what the context is, so you may be able to remove some of that code. Keep in mind, though, that tampering with core and module code is bad practice in general, but there are a few practices (such as keeping your changes as patches) that can help here!

For those curious about what kind of performance gain you might see from these tricks: in our case it was about 50%, but it depends heavily on your site's implementation.

Aug 29 2010
Aug 29

DrupalCon Copenhagen comes to an end, as does my blogging hiatus.

Two of my primary learning objectives here in Copenhagen were configuration management and deployment process. Historically, working with Drupal in these areas has been unpleasant, and I think that's why there's a ton of innovation going on in that space right now. It needs to be fixed, and new companies are springing up to say "hey, we fixed it." Often, the people running the companies are the same people running the project that encapsulates the underlying technologies. I'm referring to:

  • The hyper-performant core distro, Pressflow
  • Distros with sophisticated install profiles, like OpenAtrium, ManagingNews and OpenPublish
  • Configuration externalization with Features
  • Development Seed's "for every site, a makefile" workflow using drush make
  • The different-yet-overlapping hosting platforms Pantheon and Aegir

Dries commented in his keynote that as Drupal continues to grow, it also needs to grow up. I think advances like these are part of the community's answer to that. I want to wrap my head around some of these tools, while continuing to watch how they progress. Others, I want to implement right now. What's perfectly clear though is that I have a lot of work to do to keep up with the innovation going on in this hugely powerful community. Which is actually nothing new, but reading a blog post about these technologies doesn't make my jaw drop the way that it does when I'm in the room watching Drupal advance.

Aug 20 2010
Aug 20


In this article we will talk through setting up a simple load testing scenario for Drupal applications using Amazon's Elastic Compute Cloud (EC2). Amazon EC2 enables you to set up testing scenarios easily and at relatively low cost; for example, you can find out what effect adding an additional database server will have without actually buying one. JMeter allows us to create complex test plans to measure the effect of our optimisations; we'll set up a remote JMeter load generator on EC2 that we'll control from our desktop.

Improving Drupal's performance is beyond the scope of this article, but we'll talk more about that in future. If you need some suggestions now then check out the resources section for links to good Drupal optimisation articles.

Setting up your test site on EC2

If you don’t already have an account then you’ll need to sign up for Amazon Web Services. It’s all rather space-age and if you haven’t been through the process before then it can be a bit confusing. We want to set up a working copy of our site to test on EC2, so once you have your AWS account, the process goes something like this:

  • Select an AMI (Amazon Machine Image) that matches your production environment - we use alestic.com as a good source of Debian and Ubuntu AMIs.

  • Create a high-CPU instance running your AMI. Small instances only have one virtual CPU, which can be stolen by other VMs running on the same physical hardware and can seriously skew your results when running a test. There is always going to be some variance in the actual CPU time available to your instance, since it is always sharing physical hardware, but we find that high-CPU instances tend to reduce CPU contention to a reasonable level.

  • Give your instance an elastic IP, which is Amazon's term for a static IP that you can use to connect to it.

  • SSH into the machine. You'll need to make sure that ports 80 and 22 are open in the security group, and set up a keypair. Download the private key and use it when connecting; the simplest way is:

ssh -i /path/to/your/private/key.pem user@your-elastic-ip
  • Install the LAMP server packages you require, trying to mirror the production environment as closely as possible. A typical LAMP server can be installed on Debian/Ubuntu by running:
apt-get install apache2 php5 php5-mysql php5-gd mysql-server php5-curl
  • Now you need to set up a copy of the site you want to test on your new server. EC2 instances give you a certain amount of ephemeral storage, found at /mnt, which persists between reboots but is destroyed when the instance is terminated. If you want to terminate your instance but may need the test sites you are about to create again, it's a good idea to back up /mnt to Amazon S3.

  • We will create two copies of the site, one called “control” and another called “optimised”. Give them each their own virtual host definition and make sure that they each point to their own copy of the database. “Control” should be left alone, we’ll use this version to get baseline statistics for each test plan. We’ll tweak and tune “optimised” to improve the performance, and compare our results with “control”. Give each of the sites an obvious subdomain so that we can connect to them easily without getting confused. You should end up with two copies of your site set up on /mnt, with separate domains and dbs, something like this:

http://foo-control.bar.com   -> /mnt/sites/foo/control/   -> DB = foo_control
http://foo-optimised.bar.com -> /mnt/sites/foo/optimised/ -> DB = foo_optimised

Setting up JMeter to generate load

We don't want fluctuating network bandwidth to affect our results, so it's best to run a JMeterEngine on a separate EC2 instance and control that from JMeter running on our local machine. First we'll get JMeter generating load from our local machine, then we'll set up a remote JMeterEngine on EC2.

  • First download JMeter; you'll also need a recent Java runtime (JVM) installed. On OS X, I moved the downloaded JMeter package to Applications and then ran it by executing bin/jmeter.

  • If you're new to JMeter, you can download some sample JMeter test plans for stress testing a Drupal site from the nice guys at Pantheon, or just create your own simple plan and point it at your test server on EC2.

  • Now that we have a basic test plan in place, we should spin up another EC2 instance to generate the load on our test server. This provides more reliable results because it removes our local network bandwidth from the equation; we'll still use our local JMeter to control the remote load generator. We used a prebuilt AMI that comes with Ubuntu and JMeter already installed. JMeter has good documentation on how to tell your local JMeter to connect to the remote machine; in essence, you need to add the remote machine's IP address to your local jmeter.properties file.

  • You'll need to open a port on EC2 for JMeter to talk to the remote engine: add 9090 TCP to the security group that the instance is running under.

  • We found that JMeter started to hang when we increased the amount of data being transferred in our test plans. Tailing jmeter.log told us that we were running out of memory; increasing the available heap size solved this.

  • Test, test, and test again. It's important to repeat your tests to make sure you're getting reliable results. It's also important to know that your tests are representative of average user behaviour; it's possible to set up JMeter as a proxy that captures your browsing activity and replays it as a test, and it's also possible to replay Apache logs as test plans.

Resources

Jul 04 2010
Jul 04
[Screenshots: APC graphs and APC cache entries]

There are several different PHP accelerators to choose from, but according to Wikipedia, "APC is quickly becoming the de-facto standard PHP caching mechanism as it will be included built-in to the core of PHP starting with PHP 6".

I recently put together a new development webserver in a virtualbox virtual machine, and as I was setting it up I thought I'd take the opportunity to test how much difference APC actually makes to a simple Drupal site.

Installation

I was using Ubuntu server. On newer releases APC is available from the package manager...

$ apt-cache search php-apc
php-apc - APC (Alternative PHP Cache) module for PHP 5
$ sudo apt-get install php-apc

...however, I'm using Ubuntu 8.04 LTS (Hardy Heron), and there's no php-apc package there. It's not hard to install via PECL/PEAR, though. First some dependencies need to be installed, then PECL can be used to install APC.

$ sudo apt-get install php-pear php5-dev apache2-threaded-dev build-essential
$ sudo pecl install apc

This last command will produce a ton of output, but one of the last lines will tell you to add this to your php.ini file (which you'll find in /etc/php5/apache2/php.ini) - you'll probably have to do so manually.

extension=apc.so

Restart Apache, and you should see a new APC section in phpinfo(), which confirms it's enabled. There's also a small PHP script that gives you some useful information about APC, which you'll find at /usr/share/php/apc.php; you can symlink it into your web root so you can open it in your browser and see the stats and graphs it produces, showing which files have been cached along with cache hits and misses.
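
If you prefer to check from code rather than the bundled apc.php page, here's a quick sanity-check sketch (not from the original post) using APC's standard functions:

// Quick check that APC is loaded, plus a few opcode-cache statistics.
if (!extension_loaded('apc')) {
  die("APC is not loaded.\n");
}

$cache = apc_cache_info();   // opcode cache: cached files, hits, misses
$mem   = apc_sma_info();     // shared memory segments and free space

printf("Cached files: %d\n", count($cache['cache_list']));
printf("Hits: %d, misses: %d\n", $cache['num_hits'], $cache['num_misses']);
printf("Free cache memory: %d bytes\n", $mem['avail_mem']);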

What difference does it make?

I left the APC settings at their defaults, which in my case meant only 30 MB of memory used for the cache, and ran some basic tests using Apache Bench on a simple Drupal 6 site. The actual performance figures are not that important (this is a virtual machine on my laptop, not a production server), but it's interesting to see how much difference turning APC on makes.

I tested two pages: the very simple homepage, and another page which displays a relatively long webform. The ab command I used made 100 requests with 10 concurrent requests, e.g.

$ ab -n 100 -c 10 http://mytestsite.example/webform/

Test of APC on a Drupal 6 site:

Test                 Requests / Second   Milliseconds / Request
homepage (APC off)   2.78                359.68
webform (APC off)    1.72                582.94
homepage (APC on)    8.36                119.61
webform (APC on)     3.77                265.41

You can see that the effect of APC on the simple homepage is more dramatic than on the webform page. This is almost certainly because the database has to do a lot more work to build the latter, and APC isn't going to help on that front. Still, on the simple page APC makes Drupal perform almost 3 times faster. With the more database-heavy webform page, the improvement is somewhat smaller, but we're still looking at a doubling in performance.

This is obviously not a hugely detailed test, but it certainly leaves me in no doubt that installing APC represents a quick and easy way to achieve a huge improvement in performance for Drupal sites.


About Drupal Sun

Drupal Sun is an Evolving Web project. It allows you to:

  • Do full-text search on all the articles in Drupal Planet (thanks to Apache Solr)
  • Facet based on tags, author, or feed
  • Flip through articles quickly (with j/k or arrow keys) to find what you're interested in
  • View the entire article text inline, or in the context of the site where it was created

See the blog post at Evolving Web
