Sep 23 2016

"Someone is looking for Drupal developers. Are you interested?", I asked one of my friends. "I will never touch Drupal again.", That is the most negative statement I have heard so far. Yes, Drupal has an interesting learning curve. It is not easy to master quickly. When we talk about learning Drupal, There are some of us left Drupal after many years as a Drupal developer, disappointed, frustrated. It is a sad truth some of us have worked with Drupal for many years, and still not find the beauty of her. I want to discuss how we can avoid it and how to obtain the power of Drupal quickly in an exciting way.

Here is something interesting about Drupal. Seven years into it, I find myself still learning new things and gaining a slightly deeper understanding of it. When I look into the code of core and contributed modules, I can see what the other developers were thinking and how they tackled their problems, and I feel a great sense of connection with them. It is not merely the code itself; it is a collaboration among many developers, past, present and future. So keeping on learning is the key to getting deeper inside it.

Drupal is a tool, and we are learning the skill of using it. That can be boring if the purpose is only to learn and understand it. Since it is a tool, the best way to learn it is to use it, in the best possible way we can find. Always look for multiple solutions, and choose the best one to implement.

Most requests from our customers and managers are reasonable. Much of the time, they are asking for a better user experience, which is a critical element in the success of a product. Enhancing the user experience is a big thing; never push it back quickly. There is almost nothing Drupal cannot do. Yes, we can do it. We have a lot of contributed modules for it, and if we cannot find one, we build one and contribute it back to the community so that others do not need to reinvent the wheel. We take on all kinds of challenges and find the best possible way to face them. By doing that, we dive deeper into the water.

When we work on a ticket and solve a problem, creating a patch that throws no error is not sufficient. We can ask ourselves a couple more questions. Are there other ways to do it? Will what I just did affect other functions or limit future expansion? Does it introduce anything bad down the road? Is there anything that might sabotage the whole system because of this change? We are looking for the best possible way, a native way. Always looking for the best and being willing to take on challenges helps us learn and go deeper. That is how we learn things others will not get from tutorial videos.

Unless your manager is a taker who wants to burn you out, always take the challenge. When facing a problem, do not look for a way to avoid it. Find as many ways as possible to address it, and pick the best one; find the natural way that might have come with the original system design.

Solving a problem or fixing a bug is not a big deal; solving it in the right way is. There are always many methods for solving a single problem, so choose the right one. Do not be satisfied with fixing something in a single shot. Ask yourself to look for more ways to do it, and use the best one. It is a way of giving your talent to the problem, which reminds me of a commercial on a Canadian TV channel: "Greatness is not what you have; it is what you give." Give your greatness to Drupal. The more you give, the more skill you get.

Jul 16 2016

Sounds like a bad design? The first time I found this out, I thought we should have avoided it in the design. But that is not what we are talking about today. After we figured out a way to fix the performance, it turned out to be quite a powerful way to handle the business logic.

How do we accommodate the request that a node hold thousands of multiple-value items in one field? When it comes to the editor experience, we have something to share. A multiple-value field of field-collection items is a usual setup for a content type. When there are only a couple dozen values, everything is good: the default field-collection embed widget with a multi-value field works well.

As the number of items goes up, the editing page becomes heavier. In our case, we have a field collection containing five subfields: an entity reference field pointing to nodes, two text fields, one taxonomy entity reference field and a number field. Some nodes have over 300 such field-collection items. The editing pages for those nodes take forever to load, and updating the node becomes more and more difficult.

For such a node, the edit form has thousands of form elements. It is like hauling an adult elephant with a small pickup truck. Anything can slow the page down, from web server performance to network bandwidth to the capability of the local browser. So we needed to find a valid way to handle it. We wanted the multiple-value field to be truly unlimited: capable of holding thousands of field-collection items in a single node.

After doing some research, we came up with a pretty good solution. Here is what we did to deal with it.

We used Embedded Views Field to build a block for the field-collection items. We paginated it, breaking 300 items into 12 pages, and then inserted the views block into the node editing page. By not loading all the elements into the node edit form, it speeds up page loading immediately. Displaying the field-collection items in a views block is not enough, though; we need to edit them. I tried VBO to handle editing and deleting, but it did not work. So we built some custom Ajax functions for editing and deleting, using the ctools modal window as the front-end interface to edit, delete and add new items. That works well. With the modal window and Ajax, we keep the main node edit page untouched; there is no need to refresh the page every time an editor changes the field-collection items. Thanks to the pagination of the views block, we can now add as many items as we want to the field-collection multi-value field. We also added a views sorting function to the embedded views field. A sketch of the modal wiring is below.
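For illustration, here is a minimal sketch of how such a modal edit link can be wired up with ctools in Drupal 7. The path, form ID and access check are placeholders of mine, not our production code; the page rendering the links also needs ctools_include('modal') and ctools_modal_add_js(), and the links need the 'ctools-use-modal' class.

/**
 * Implements hook_menu().
 */
function MODULENAME_menu() {
  // %ctools_js resolves to TRUE when the request arrives via 'ajax'.
  $items['fc-item/%ctools_js/%/edit'] = array(
    'page callback' => 'MODULENAME_fc_item_edit',
    'page arguments' => array(1, 2),
    'access arguments' => array('access content'), // Tighten in real code.
    'type' => MENU_CALLBACK,
  );
  return $items;
}

/**
 * Page callback: edits one field-collection item in a ctools modal.
 */
function MODULENAME_fc_item_edit($js, $item_id) {
  if (!$js) {
    // Degraded, non-JavaScript mode: a normal page with the form.
    return drupal_get_form('MODULENAME_fc_item_form', $item_id);
  }
  ctools_include('modal');
  ctools_include('ajax');
  $form_state = array(
    'ajax' => TRUE,
    'build_info' => array('args' => array($item_id)),
  );
  $commands = ctools_modal_form_wrapper('MODULENAME_fc_item_form', $form_state);
  if (!empty($form_state['executed'])) {
    // On save, just dismiss the modal; the paginated views block is
    // refreshed by our custom Ajax commands.
    $commands = array(ctools_modal_command_dismiss());
  }
  print ajax_render($commands);
  exit;
}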

Sounds pretty robust, but wait, something is missing. We ran into a problem as soon as we implemented it: what about the form for creating a new node? On the new-node page, the embedded views field block does not work, because a new node does not have a node ID yet. We fixed it by falling back to the default widget, just for the new-node page, using the following function to switch the field widget.

/**
 * Implements hook_field_widget_properties_alter().
 *
 * Falls back to the default embed widget on the node creation form,
 * where the embedded-views-based editing cannot work yet.
 */
function MODULENAME_field_widget_properties_alter(&$widget, $context) {
  if ($context['entity_type'] == 'node') {
    // A node without a nid is being created, not edited.
    if (!isset($context['entity']->nid)) {
      if ($context['field']['field_name'] == 'FIELD_MACHINE_NAME') {
        if ($widget['type'] == 'field_collection_hidden') {
          $widget['type'] = 'field_collection_embed';
        }
      }
    }
  }
}

Jul 10 2016

We make our own lives easier by making our colleagues' lives easier. If our manager is in a stressful state, we had better find a way to do something about it. It is such a fulfilling feeling when we can do something together and contribute as members of a productive team.

I was once on a financial system project for a company in NYC. We needed to migrate a huge amount of content from an old system to a new one (Drupal 5 to Drupal 7). The business makes money by selling data, so data accuracy was a big thing. Due to the complexity of the data, a couple of months into the project we still had not migrated the content; some technical problems prevented us from going forward. Everyone on the team was under pressure and working very hard. Company executives were starting to lose patience and doubt our ability to get the job done, and our boss was under a lot of stress. In a weekly meeting a few weeks after I joined the project, our manager told us that he was not sure he would still be working for the company the next day; he was afraid he might be fired by his boss. It was like a stone in my stomach. What would happen to us if he lost his job? The whole team was quickly motivated. We all liked him and did not want that to become reality. We all knew it was time to go all out for the greater good.

In the interest of the group, we did not care about small personal losses. When migrating location nodes, we needed the Google Maps API to translate hundreds of thousands of postal addresses, and it is not free. To save the time of asking permission to buy the service, one of our colleagues just went ahead and created an account with his personal credit card. It cost him some money but saved time for the whole team. We all worked late and collaborated closely and efficiently. We did not mind sacrificing for the interest of the project.

My job was to assist the other backend developer in migrating the content. It was such a compelling feeling to be part of it and to want the project to succeed. It occupied my mind. While eating, walking, taking a shower, sleeping and even dreaming, I was thinking of ways to solve the technical problems and contemplating the best possible solutions. Many of us, myself included, got a little bit sick. But physical health, with good rest and diet, is the key to a clear and sharp mind. Even though we had a lot of stress, I believe that was one of the key elements in the success of the project. I had learned some time management skills before; without wasting a minute, I allocated enough time to eat and sleep, and that helped me keep my mind fresh and calm the whole time. Two weeks after the meeting, we successfully overcame all the major technical difficulties and got the content migrated.

Everyone seemed to relax right away, and the conversation during the following meeting was a pleasure. Like fighting a battle shoulder to shoulder against a ferocious enemy and winning it, everybody on the team felt closer and more connected to each other. We helped each other, collaborated closely and made our lives easier. That is the real joy of collaboration.


Mar 25 2016

I am not against design patterns; I am just not a devoted follower. I think design patterns are to a programming language what grammar is to the English language, and I believe they are not the right tool to begin software design with.

It is said that a design pattern is a general, repeatable solution to a commonly occurring problem in software design. That is not accurate. Why not just make it a standard API function? A pattern should come with our solution in a natural way. To cope with real-life problems, we need to be agile and flexible, and design patterns may not be the best way. We simply cannot generalize solutions for all the various problems.

Some say that we need design patterns because issues may not become visible until later in the implementation. That is a big problem for me. As a system architect, I design software to deal with issues. If we cannot see all of them before the implementation, it is a failure, or at least a certain discount on the perfection of the software quality.

It is said that patterns allow developers to communicate using well-known, well-understood names for software interactions, which might be true. We might discover patterns in a well-written system, but those patterns should not be created intentionally; otherwise, they can make the code difficult to understand. Mostly, that happens when design-pattern ideology dominates the process of business requirement analysis and technical design.

How do design patterns work? Much like grammar. We do not check grammar every time we speak. So forget about patterns. As with learning our mother tongue, just go practice and use the language, and forget about the grammar. Let your innovation and creativity find their way into your code. Just design the system to handle the issues, and do not worry about patterns.

Design patterns should come to our code naturally. After finishing a program design, revisit it a few weeks later; we may have "invented" a new one. As a matter of fact, I might get one in my next article.

Jan 30 2016

Spam is a big headache for many website owners. Using the Drupal Impression module, I saw the relentlessness of the spammer bots: every day, on a single site, I got thousands of hits on URLs like "/?q=user/register" and "/?q=node/add". Someone commented on my LinkedIn update of my blog post "Is there more computer bots than us?" that she was "on the verge of giving up on Drupal after being unable to solve this problem". How do we address this issue and solve the problem? I know this is not an issue for one CMS like Drupal alone, but it gives us a mandate to do something: build something for Drupal that is also usable by other CMSs like WordPress and Joomla.

I have a bold idea for blocking spam efficiently without taking a toll on the performance of every website: let us set up a website spam defence network, a network based on a global spam-IP database. Each website is a node of the defence network, and the network provides a spamming-IP query as a web service.

The idea is to have distributed but well-controlled spam-IP servers. Every participating website acts as a node in the network, captures spamming IPs and reports them. Websites are connected, talk to each other and form a line of defence in front of the spammers. The network quarantines a spammer IP for 45 minutes or more, depending on how active the spamming activity is, and the IP comes off the list after the quarantine ends.

Websites that join the network will respond faster by freeing up the resources taken by spamming activity. We will have a cleaner internet by eliminating fake users, spam comments and spam content.

Technically, we use open source solutions. We can build the distributed spam-IP database like a Git repository, and distribute the client through a Composer repository so that all PHP-based CMS websites can easily join the network. A rough sketch of what a site-side client could look like is below.
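Everything in this sketch is hypothetical, the service URL, endpoints and JSON shape included, since the network does not exist yet:

/**
 * Asks the defence network whether an IP is currently quarantined.
 */
function spam_network_ip_is_blocked($ip) {
  $response = drupal_http_request('https://spam-defence.example.com/api/ip/' . $ip);
  if ($response->code == 200) {
    $data = drupal_json_decode($response->data);
    // A quarantined IP carries an expiry timestamp (45 minutes or more).
    return !empty($data['quarantined']) && $data['expires'] > REQUEST_TIME;
  }
  // Fail open: if the network is unreachable, never block visitors.
  return FALSE;
}

/**
 * Reports a captured spamming IP back to the network.
 */
function spam_network_report($ip) {
  drupal_http_request('https://spam-defence.example.com/api/report', array(
    'method' => 'POST',
    'headers' => array('Content-Type' => 'application/json'),
    'data' => drupal_json_encode(array('ip' => $ip, 'site' => $GLOBALS['base_url'])),
  ));
}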

Dec 29 2015

I know it is a dumb question. But I was shocked to find out that my site is quite busy while none of the visitors are real human beings.

I installed the Impression module, which gives me a detailed report of all the activities on the site, and used the Views module to create a report of those activities. I was quite surprised to find out how big the problem is. I have attached a report covering about 40 minutes of site activity.

As you can see from the report, in that 40-minute period there was no human activity on the website, but there were 35 bot visits and 32 registration form requests.

A little explanation of the report: all values in the action column are 0, which means all the activities are from bots. If it were a person visiting a web page, we would detect other movements, such as screen touches, mouse movements or key presses.

Action ID  Action Name
0          page load
1          Screen touch
2          Mouse move
3          Keyboard key down

Here is the report:

User ID  Action ID  IP  URI  Created Date
0  0  xx.94.236.195  /?q=user/register  2015-12-28 18:59
0  0  xx.108.70.111  /user/register  2015-12-28 18:59
0  0  xxx.144.230.34  /?q=node/add  2015-12-28 18:59
0  0  xxx.144.230.34  /content/spinach-spinach-and-spinach  2015-12-28 18:59
0  0  xxx.100.55.148  /content/spinach-spinach-and-spinach  2015-12-28 18:54
0  0  xxx.82.169.157  /content/spinach-spinach-and-spinach  2015-12-28 18:53
0  0  xx.176.232.61  /user/register  2015-12-28 18:52
0  0  xx.176.232.61  /user/register  2015-12-28 18:52
0  0  xx.176.232.61  /node/add  2015-12-28 18:52
0  0  xx.176.232.61  /?q=node/add  2015-12-28 18:52
0  0  xx.176.232.61  /user/register  2015-12-28 18:52
0  0  xx.89.52.194  /?q=user  2015-12-28 18:51
0  0  xx.89.52.194  /user/register  2015-12-28 18:51
0  0  xx.89.52.194  /user/register  2015-12-28 18:51
0  0  xx.89.16.190  /?q=user/register  2015-12-28 18:51
0  0  xxx.82.170.131  /?q=node/add  2015-12-28 18:51
0  0  xxx.82.170.131  /content/spinach-spinach-and-spinach  2015-12-28 18:51
0  0  xx.72.71.69  /?q=user  2015-12-28 18:50
0  0  xx.72.71.69  /user/register  2015-12-28 18:50
0  0  xxx.222.15.14  /content/spinach-spinach-and-spinach  2015-12-28 18:50
0  0  xxx.217.171.178  /content/spinach-spinach-and-spinach  2015-12-28 18:49
0  0  xx.95.105.65  /?q=user  2015-12-28 18:49
0  0  xx.95.105.65  /user/register  2015-12-28 18:49
0  0  xxx.94.168.69  /user/register  2015-12-28 18:49
0  0  xxx.94.168.69  /user/register  2015-12-28 18:49
0  0  xxx.77.254.224  /?q=user/register  2015-12-28 18:49
0  0  xx.95.105.7  /?q=node/add  2015-12-28 18:49
0  0  xx.95.105.7  /content/spinach-spinach-and-spinach  2015-12-28 18:49
0  0  xxx.222.15.14  /content/spinach-spinach-and-spinach  2015-12-28 18:48
0  0  xxx.222.15.14  /comment/reply/53/20  2015-12-28 18:48
0  0  xxx.222.15.14  /comment/reply/53/20  2015-12-28 18:48
0  0  xx.210.181.162  /content/spinach-spinach-and-spinach  2015-12-28 18:48
0  0  xxx.69.209.169  /content/spinach-spinach-and-spinach  2015-12-28 18:47
0  0  xx.94.234.174  /  2015-12-28 18:45
0  0  xxx.227.92.115  /user/register  2015-12-28 18:44
0  0  xxx.227.92.115  /user/register  2015-12-28 18:44
0  0  xxx.3.83.153  /?q=user/register  2015-12-28 18:44
0  0  xx.95.94.79  /?q=node/add  2015-12-28 18:44
0  0  xx.95.94.79  /content/spinach-spinach-and-spinach  2015-12-28 18:43
0  0  xxx.227.92.78  /user/register  2015-12-28 18:43
0  0  xxx.227.92.78  /user/register  2015-12-28 18:43
0  0  xx.211.55.9  /user  2015-12-28 18:43
0  0  xxx.227.92.78  /node/add  2015-12-28 18:43
0  0  xxx.227.92.78  /?q=node/add  2015-12-28 18:43
0  0  xx.211.55.9  /user  2015-12-28 18:43
0  0  xx.211.55.9  /content/spinach-spinach-and-spinach  2015-12-28 18:43
0  0  xx.211.55.9  /?q=user  2015-12-28 18:43
0  0  xx.211.55.9  /user/register  2015-12-28 18:43
0  0  xxx.217.171.178  /content/spinach-spinach-and-spinach  2015-12-28 18:42
0  0  xxx.69.209.169  /content/spinach-spinach-and-spinach  2015-12-28 18:42
0  0  xxx.69.209.169  /comment/reply/53/6  2015-12-28 18:42
0  0  xxx.69.209.169  /content/spinach-spinach-and-spinach  2015-12-28 18:42
0  0  xxx.69.209.169  /comment/reply/53/6  2015-12-28 18:42
0  0  xx.35.104.150  /user  2015-12-28 18:36
0  0  xx.35.104.150  /user  2015-12-28 18:36
0  0  xx.35.104.150  /?q=user  2015-12-28 18:36
0  0  xx.35.104.150  /user/register  2015-12-28 18:36
0  0  xxx.162.199.47  /user/register  2015-12-28 18:36
0  0  xxx.162.199.47  /?q=user  2015-12-28 18:36
0  0  xx.59.49.218  /  2015-12-28 18:36
0  0  xx.59.49.218  /  2015-12-28 18:35
0  0  xxx.94.168.69  /user/register  2015-12-28 18:33
0  0  xxx.94.168.69  /user/register  2015-12-28 18:33
0  0  xxx.77.254.224  /?q=user/register  2015-12-28 18:33
0  0  xxx.94.168.133  /?q=node/add  2015-12-28 18:33
0  0  xxx.94.168.133  /content/spinach-spinach-and-spinach  2015-12-28 18:33
0  0  xxx.144.227.34  /?q=user  2015-12-28 18:32
0  0  xxx.144.227.34  /user/register  2015-12-28 18:32
0  0  xxx.144.227.34  /user/register  2015-12-28 18:32
0  0  xx.94.237.96  /?q=user/register  2015-12-28 18:32
0  0  xxx.52.208.224  /?q=node/add  2015-12-28 18:32
0  0  xxx.52.208.224  /content/spinach-spinach-and-spinach  2015-12-28 18:32
0  0  xx.94.10.97  /user  2015-12-28 18:28
0  0  xx.94.10.97  /?q=user  2015-12-28 18:28
0  0  xx.94.10.97  /user/register  2015-12-28 18:28
0  0  xx.94.10.97  /user/register  2015-12-28 18:28
0  0  xx.94.10.97  /?q=node/add  2015-12-28 18:28
0  0  xx.94.10.97  /node/add  2015-12-28 18:28
0  0  xx.94.10.97  /content/spinach-spinach-and-spinach  2015-12-28 18:28
0  0  xxx.144.227.34  /user  2015-12-28 18:28
0  0  xxx.144.227.34  /user  2015-12-28 18:28
0  0  xxx.144.227.34  /?q=user  2015-12-28 18:28
0  0  xxx.144.227.34  /user/register  2015-12-28 18:28
0  0  xxx.217.171.178  /content/spinach-spinach-and-spinach  2015-12-28 18:28
0  0  xxx.3.242.250  /user  2015-12-28 18:27
0  0  xxx.3.242.250  /user  2015-12-28 18:27
0  0  xxx.3.242.250  /?q=user  2015-12-28 18:27
0  0  xxx.3.242.250  /user/register  2015-12-28 18:27
0  0  xxx.3.83.153  /?q=user  2015-12-28 18:27
0  0  xxx.3.83.153  /user/register  2015-12-28 18:27
0  0  xxx.217.171.178  /content/spinach-spinach-and-spinach  2015-12-28 18:25
0  0  xxx.214.9.179  /  2015-12-28 18:25
0  0  xxx.214.9.179  /  2015-12-28 18:25
0  0  xxx.214.9.179  /  2015-12-28 18:25
0  0  xxx.217.171.178  /content/spinach-spinach-and-spinach  2015-12-28 18:24
0  0  xxx.217.171.178  /content/spinach-spinach-and-spinach  2015-12-28 18:23
0  0  xxx.217.171.178  /content/spinach-spinach-and-spinach  2015-12-28 18:22
0  0  xxx.77.245.222  /user  2015-12-28 18:20
0  0  xxx.77.245.222  /?q=user  2015-12-28 18:20
0  0  xxx.77.245.222  /user/register  2015-12-28 18:20
Dec 12 2015

We built a startup website on top of Drupal. The beta release of https://www.dinnerlife.com is one of the latest Drupal 7 websites we have built.

The startup is trying to promote a new lifestyle: people can dine around if they do not want to cook, or, if they like cooking, they can host dinners for others and make a living from it.

The mission of the startup is to help people live in a healthy and smart way. It will focus on building community and providing valuable information on how to prepare a nutritious, tasty and balanced dinner. Dinnerlife is a platform for people who do not have time to cook but still want a well-prepared home meal. Its goal is to help form regular dinner groups, so people can meet and eat together every week or every day.

The social benefit of this initiative is huge. Getting people connected over dinner is a way to fight depression, improve health and, in turn, improve our quality of life. It also provides job opportunities to those who love to stay at home. Please join me in wishing it every success, and, most importantly, it is on Drupal.

Nov 01 2015

Look at the snapshot from the user table. I was having a big problem with spamming bots: the user table was full of fake users, with a fake user registered almost every minute.
[Screenshot: spammers' accounts flooding the user table]
I tried a lot of ways to stop bots from creating user accounts on my website. I tried different captchas; they are not convenient for real users, and some of them have problems with responsive design. I started looking for another, innovative way to stop those bots, and came to this 'crazy' idea: since bots do not use a mouse or touch the screen of the opened web page, why not use that behavior to identify them?

Following this idea, I quickly came up with a human behavior module. It uses the Impression module to catch the human actions on a web page; if no such action is detected, it blocks the submission of the form. It records an action for each web page, and within 3 hours, form submissions from that page are considered valid. The website has been free of spammers since I downloaded and installed the modules. The sketch below shows the idea in a simplified form.
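This is a minimal sketch of the idea, not the released module's code; the table and column names ({impression_action} and friends) are assumptions for illustration:

/**
 * Implements hook_form_alter(): guard the forms that bots target.
 */
function human_behavior_form_alter(&$form, &$form_state, $form_id) {
  if (in_array($form_id, array('user_register_form', 'contact_site_form'))) {
    $form['#validate'][] = 'human_behavior_validate';
  }
}

/**
 * Blocks the submission when no human action was recorded for this client.
 */
function human_behavior_validate($form, &$form_state) {
  // Action ID 0 is a bare page load; touches, mouse moves and key presses
  // are recorded with higher IDs. Look back 3 hours for this IP.
  $human = db_query_range(
    'SELECT 1 FROM {impression_action} WHERE ip = :ip AND action_id > 0 AND created > :since',
    0, 1,
    array(':ip' => ip_address(), ':since' => REQUEST_TIME - 3 * 3600)
  )->fetchField();
  if (!$human) {
    form_set_error('', t('Your submission was blocked because no human interaction was detected.'));
  }
}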

The module is in a beta release; it is a new weapon in the fight against bots, and fighting spam bots is an ongoing fight. The module is not perfect: bots can still try to mimic human behavior, and new bots might be created to fake some mouse moves, etc. We cannot prevent that from happening, but we can surely come up with new ideas to detect them and stop them from messing up our websites.

Sep 01 2015

On a large website with many nodes, stop using the node_load_multiple function carelessly; it potentially limits the site's growth.

According to the documentation, "This function should be used whenever you need to load more than one node from the database." But I want to say that we should avoid using this function blindly, as it opens the door to system crashes in the future.

I had written a blog post before, "Design a Drupal website with a million nodes in mind", where I used this function as an example. It is true that node_load_multiple enhances performance, but it comes at a price. When we load thousands of nodes into memory, it exhausts the web server's memory instantly, and we get the infamous message: "Fatal error: Allowed memory size of 268435456 bytes exhausted (tried to allocate XYZ bytes) in ...".

This issue came to my attention when I was reading code done by one of the Drupal shops. In a recently launched high-profile project, the function was used to load all the show nodes in the system. It sounded troublesome to me from the beginning: how can we load all show nodes if we are not sure how many there will be? When there is a small number of nodes in the system, that is fine. But in a system with over a hundred thousand nodes, it is a big problem. As time goes by, we add more shows into the system, and the utility function needs more memory to handle them. Physical memory limits the ability to add show nodes indefinitely.

What is the best way to handle this situation? Avoid loading everything with node_load_multiple; if we have to use it, limit the number of nodes it loads at a time, as in the sketch below.
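Here is a minimal sketch of the bounded approach, assuming a "show" node type: collect only the IDs, then load a fixed-size batch at a time instead of every node at once.

function MODULENAME_process_all_shows($batch_size = 50) {
  // Collecting IDs is cheap compared to fully loading the nodes.
  $query = new EntityFieldQuery();
  $result = $query->entityCondition('entity_type', 'node')
    ->entityCondition('bundle', 'show')
    ->execute();
  if (empty($result['node'])) {
    return;
  }
  foreach (array_chunk(array_keys($result['node']), $batch_size) as $nids) {
    foreach (node_load_multiple($nids) as $node) {
      // ... do the per-node work here ...
    }
    // Release the static entity cache so memory use stays flat no matter
    // how many show nodes the site accumulates.
    entity_get_controller('node')->resetCache($nids);
  }
}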

Aug 28 2015

The success of a project depends on a good development team. How do we build and maintain such a team?

As a software developer of many years, I believe a good dev team is one of the pillars of a successful business. Here I want to discuss how to build a dream developer team. Building a highly productive, super innovative and proactive team is like cooking a meal: it needs good ingredients, the right sauce, and good timing in each step of the cooking.

A good developer has good academic scores in math. Software development needs the strongest logical thinking and self-validation skills. Building a software project is like taking a mathematics test: a higher score on the test means fewer bugs in the code. A person who is capable of getting full marks on math tests is likely to build a project with the fewest bugs. Finding the right people is the first step toward a great team.

A good academic score will not automatically make a good developer. Building software is teamwork, and developing software is very detail-oriented; we may not be able to avoid nitpicking. Soft skills are important too. A good developer is willing to learn, easy to collaborate with and detail-oriented; focuses on the matter at hand but never escalates it to a personal level; and can accept criticism and change for the greater good of the team.

After we have gathered a group of talented developers, it is time to "cook". Every person can be in a different state: software developers can be at the peak of their productivity or at the bottom. An encouraging and rewarding environment with strong leadership is the key to motivating developers to reach their peak. Reward developers with self-fulfillment and let them achieve something with their work. A lead developer with extraordinary fellowship helps a lot.

We might ignore the physical environment. Nice, clean, quiet offices help developers focus on their jobs. Some start-up companies put a lot of effort into finding talent but do not let them work comfortably; sometimes the offices are crowded and stuffy. What they could do is simply stop looking for a smarter developer and put a little more effort into improving the current working conditions. In such a company, even the best developer is not able to concentrate. Software development is a mind activity: the brain requires a lot of blood circulation, with plenty of oxygen and energy. The importance of a clean, quiet, natural and toxin-free environment can never be overestimated. A healthy environment is a basic requirement for a strong software product.

The next thing is a study-oriented and encouraging setting: a company with a respectful culture and a group of open-minded developers, where developers collaborate very closely, are not afraid to make mistakes, and are willing to share their latest tricks and newly mastered programming tactics.

Aug 21 2015

For a small Drupal shop or an individual Drupal consultant, how do we grow? It seems that small Drupal shops face a glass ceiling when they want to move upward. They are not able to win larger projects because they are not big enough; without a team of developers, they do not look trustworthy or give stakeholders confidence. Should we solve this problem by working together in partnerships? Drupal development is very technically intensive, so let us follow the way lawyers run their practices: get together and build a strong team.


What are the benefits of running a Drupal shop as a partnership?
1) It is easy to set up, unless we want to form an LLP. As professional Drupal freelancers, we may each have some clients already; the initial partners sign an agreement and start the partnership with some existing customers from day one.

2) A good-sized team gives customers confidence, making it easier to win bigger projects. According to #3 and #6 of the seven common myths from Pantheon, big firms get big jobs; if you want big jobs, you have to get big first.

3) Once the partnership is formed, we can recruit more junior developers and train them.

The challenge is that we have never done this before, and there may be no established path to follow. A comprehensive partnership agreement is needed. Here are some important things we need to think through before forming a partnership:
1) Types of Partnerships (General Partnership or Limited Liability Partnership)
2) Governance and Decision-Making
3) Partner Compensation
4) Capital Contribution
5) Overhead and Liabilities
6) Parental Leaves and Sabbaticals
7) Retirement and Termination.

Professional Drupal developers will benefit from practicing partnership in professional services. A reputable, good-sized team is capable of catching and delivering bigger and more profitable projects.

Drupal developers provide a highly skilled professional service; lawyers provide a professional service related to law, and they have law offices to deliver it in a decent way. Why not copy the way they do it to provide our Drupal services?

Referenced document: http://www.cba.org/cba/PracticeLink/WWP/agreement.aspx

Aug 19 2015

Small and medium-sized businesses can benefit from a good content strategy backed by a content management system (CMS) like Drupal. The internet is evolving fast, and a good content strategy helps a business keep close pace with the trend.

More and more people are using mobile phones to get information and connect with others. A CMS website can quickly be turned into a responsive design, and a responsive website provides a better user experience for mobile users; hence, Google ranks responsive websites higher than non-responsive ones.

It is beneficial for a business to have long-term and short-term digital plans. It saves money in the long run: if a company has a consistent plan for the next 5 to 15 years, it helps avoid costly overhauls of previously built software, and avoids redoing anything just because it did not fit into the big picture.

Here is an example from one of my customers, who is doing great in the insurance business. Their consistent content strategy helped them take the lion's share of a niche market: the Chinese insurance market.

They focus on the Chinese insurance market. From the very beginning, the owner of the business had an excellent long-term goal. He built a comprehensive Drupal-based system for his insurance business. With Drupal's powerful multilingual support, he built a website in three languages, which serves as the primary marketing tool. He published unique content that is valuable to Chinese travelers to Canada. There is an online insurance quotation system built from a Drupal contributed module; with it, people can easily compare insurance policies from different insurance companies and place an insurance order online. On top of that, the backend system captures other customer leads.

A system built on Drupal is well SEO-tuned, and the website ranks high in search results. As I write this article, their keyword "Canada insurance" in Chinese ranks #1 in Google search results and is also on the first page of Baidu. The site brings in thousands of organic search visits and hundreds of high-quality leads every week. Without spending any other marketing dollars, the company is doing great with this solid content strategy.

The owner of the business had great vision at the beginning: he built his insurance business on top of a Drupal-based software system. Supported by an active and diverse community of people around the world, Drupal is enterprise-standard open source software. The system serves as a marketing tool that brings in hundreds of quality leads every week, and the content management system lets employees easily publish blogs and articles. Recently, they hired us, a Toronto Drupal shop, for a major Drupal version upgrade.

If software is a pillar of a successful business, a system built on Drupal is its cornerstone, and a good content strategy secures a profitable business.

Jul 27 2015

As one of Canada’s most successful integrated media and entertainment companies, Corus has multiple TV channels and a website for each channel.

It had been a challenge to display multiple channels' live schedule data on the websites. All the data come from a central repository, which made things a little difficult since the repository is not always available. We had used the Feeds module to import all the schedule data, with each channel website keeping a live copy of it. Things got worse because of the way we updated the program items: we deleted all the current schedule data in the system and then imported from the central repository. Sometimes our schedule pages went empty because the central repository was unavailable.

Pedram Tiv, the director of digital operations at Corus Entertainment, had a vision of building a robust schedule for all channels. He wanted to establish a Drupal website as a schedule service provider, content as a service. The service website downloads and synchronizes all channels' schedule data. Our content managers can also log in to the website and edit any schedule item, and the site keeps revisions of all the changes. Since the central repository only provides raw data, it is helpful that we can edit a scheduled show title or series name.

I loved this brilliant idea as soon as he explained it to me. We were building a Drupal website as a content service provider; in other words, a CMS for other CMS websites. Scalability is always challenging for a modern website. To make it scalable, Pedram added another layer of cache protection: we put an S3 cache between the schedule service and the front-end web servers. With it, the schedule service can handle more channels and millions of requests each day, because the front-end websites download schedule data from the Amazon S3 bucket only. What we do is create and upload seven days' worth of schedule data to S3: a cron job uploads thousands of JSON schedule files every day, covering the next seven days for the different channels in their different time zones. Roughly, the cron task can be pictured as in the sketch below.
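This is a sketch only: the bucket name, key layout and the schedule loader are illustrative, not the production code, and it assumes the AWS SDK for PHP (v3) is available through Composer or Libraries.

/**
 * Implements hook_cron(): pushes seven days of schedule JSON to S3.
 */
function schedule_export_cron() {
  // Assumes AWS SDK v3; credentials come from the environment.
  $s3 = new Aws\S3\S3Client(array(
    'region' => 'us-east-1',
    'version' => 'latest',
  ));
  // Illustrative channel list; the real site loads these from config.
  $channels = array('cmt', 'teletoonlanuit', 'abcspark');
  foreach ($channels as $channel) {
    for ($day = 0; $day < 7; $day++) {
      $date = date('Y-m-d', strtotime("+$day day"));
      // schedule_export_load_day() stands in for whatever assembles the
      // schedule data for one channel, one day, in its own time zone.
      $data = schedule_export_load_day($channel, $date);
      $s3->putObject(array(
        'Bucket' => 'schedule-service-cache',
        'Key' => $channel . '/' . $date . '.json',
        'Body' => drupal_json_encode($data),
        'ContentType' => 'application/json',
      ));
    }
  }
}

Front-end websites then fetch only these static S3 objects, never the schedule server itself.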

This setup offloaded the pressure from the schedule server and let it serve an unlimited number of front-end users. It also gives a seven-day grace period, allowing the schedule server to be offline without interrupting the service. One time, our schedule server was down for three days, and the schedule service was not affected because we had seven days of schedule data in the S3 bucket. Using S3 as another layer of protection provides excellent high availability.

Our schedule service has been up and running for many months without a problem. There are over 100,000 active nodes in the system. For more detail about importing large amounts of content and building an efficient system, we have some other blog posts about this project.

Sites that are using the schedule service now:
http://www.cmt.ca/schedule
http://www.teletoonlanuit.com/horaire
http://www.abcspark.ca/schedule/daily

Jul 18 2015

Like Yin and Yang in Chinese philosophy, coding and testing support each other. Here I want to introduce a method to increase the rate of software project success. It is critical thinking in programming, an agile methodology at the micro-scale level. What is it? For every line of code, every loop or conditional block, we look for multiple solutions; for each solution, we evaluate the possibilities of going wrong, and we pick the most reliable one. It helps us remove most code vulnerabilities at the beginning.

Knowing how to test is more important than knowing how to code. A good function is stable and fast, and requires less memory and computing resources. It does not need to be complicated or to have difficult API functions under the hood. Usually it has less code and seems simpler than the rest, and it is the result of various tests and of trying different methods. We choose the best way to compose the code based on those tests. That ensures a robust software system with fewer vulnerabilities.

Let us say there is a project to send human beings to Mars, and our responsibility is to land the spaceship safely on the surface of the planet. How do we make sure the 'landing function' works without any deadly 'bug'? What is the most important thing? It is not the landing gear; it is not the parachute. It is the tests. Under different situations, we perform hundreds or thousands of tests to validate each piece of equipment. Finding those thousands of test cases is the most difficult and important task. We need to recreate conditions as close as possible to those of landing the spaceship on Mars: the temperature, the speed of the spaceship, the chemicals in the air, etc. We test materials under all sorts of combined circumstances. Anything that can go wrong, we let it fail in the test first.

When we solve a problem or fix bugs, we do the same thing. We do some research, and from that research we guess the cause of the problem. Then we try to prove it with many tests. If the test results are positive, we come to a conclusion; otherwise, we continue guessing and testing. Insufficient tests may lead us to a wrong conclusion.

Again, the tricky thing is how we test it. It is the same when building new functions or modules: whenever there is a solution, we put it under different conditions and validate it, covering it with enough test cases. Like Yin and Yang in Chinese philosophy, coding and validating support each other.

When building a complex function, we may divide it into many baby steps, each of which is important to the overall success. Exhaust the ways of testing each step to make sure each one is good, so the result of the previous step can safely serve the next one. Finding a good way to test each step is crucial, because there are many ways to do a thing and we need to choose the best one through tests. For each baby step, there can be a dozen ways to take; being able to find the most reliable one is a merit of a good software developer or project leader. When code is pushed to production, things can be far more complicated and likely to break; as Murphy's law says, anything that can go wrong will go wrong. So have multiple test scenarios prepared for the smallest functional unit. It is essential to making our code safe and sound, and we can never overestimate its importance.

When it comes to a complex function, divide it into small functional units that are easy to verify. Take the CYouTube project: we divided it into multiple steps and sub-steps, each with an easy and clear goal to achieve.

  • Pull JSON data from YouTube.
  • Create a Drupal queue task for each item.
  • Create a video file entity for each task.

Each of those major steps has multiple sub-steps. For example, pulling data from YouTube has the following sub-steps:

  • Collect the YouTube channel names from a Drupal node field.
  • Get each channel's upload playlist ID from YouTube with the collected channel name.
  • Cache the playlist ID.
  • Download the YouTube video list JSON data with the playlist ID.
  • Parse the downloaded JSON into an array for the next major step, creating the Drupal queue tasks.

Each sub-task or major task is easy to verify. Whenever something goes wrong, we can follow the steps and find the broken one. We can use this methodology from large scale to small. The more we practice it, the more capable our minds become of finding steps with clear goals; we foresee the problems when choosing how to implement each step, before we do any real coding. That is how we develop software that requires less maintenance.

Test-driven development (TDD) relies on the repetition of a very short development cycle: the developer writes an automated test case before writing code. Our approach is a step ahead of it. When thinking about the solution, we find the test cases first; passing those tests is the goal of the next step, coding. We start thinking about test cases when we read the requirements document. When we write code, we are thinking about how to break it. Whenever we finish a function, it has already gone through many verification tests of our own, so we have a much better chance of ending up with a fully functioning module. If it does not work out, we can always return to each step and find the broken link by reiterating the verification process. I believe it is the most efficient way to develop software.

Jul 04 2015

Recently, I was working with a team on the project for Country Music Television (CMT) at Corus Entertainment. We needed to synchronize videos from two different sources, one of them YouTube. There are over 200 YouTube channels on CMT, and we needed to pull videos from all those channels regularly. That way, videos published on YouTube become available on CMT automatically, and the in-house editors do not need to spend time uploading them.

We store videos as file entities. The videos come from different sources, but all imported videos behave the same across the site; among them, only the MIME type differs when two videos come from different sources. Each imported video is a file entity on CMT, and for the front end, we built views pages and blocks based on those video file entities.

Some videos are imported from another source, MPX thePlatform. We used "Media: thePlatform mpx" as the main module to import and update those videos, and to deal with customized video fields, we contributed a module, "Media: thePlatform MPX entity fields sync", that works with the main module.

After finishing with the MPX thePlatform videos, we had experience handling video imports. So how do we import the YouTube videos, with over 200 channels on YouTube?

We gave up on making it work with the Feeds module. At first, we had planned to use Feeds. Since Google had just deprecated YouTube API v2.0, the old RSS channel feeds no longer worked. Thanks to the community, we found a module called feeds_youtube, whose 7.x-3.x branch works with the latest YouTube API v3.0; that gave us a Feeds parser. We still needed a Feeds processor, and thanks to the community again, we found the feeds_files module. We installed those modules and their dependencies, and spent a couple of days configuring Feeds. It did work. In the end, though, we decided to build a lightweight custom module that does everything from downloading the YouTube JSON data to creating the local file entities. Each video imported from a channel has an uplink to an artist or a hosting show.

What do we want from the module? We want it to check multiple YouTube channels, and if there is a new video uploaded to a channel, create an associated file entity on CMT. In the created entity, we save some of the metadata from YouTube to entity fields, along with local metadata such as show and artist information. We want the module to handle over 200, and maybe thousands of, channels in the future, and to handle the tasks gracefully without burning out the system when importing thousands of videos. It sounds quite intimidating, but it is not, really.

We ended up building a module and contributing it to Drupal.org: YouTube Channel Videos Sync V3. Here is how we came up with the module.

First of all, we gather a list of channel names, along with the relevant local metadata such as the artist ID or show ID. Then we send a request to YouTube and get a list of videos for each channel. For each video, we create a system queue task with the video's data and the local metadata. At last, a queue processor creates the video file entities one by one. See the diagram below for the complete process; a sketch of the queue wiring follows it.

[Diagram: the CYouTube video sync process]
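Here is a minimal sketch of the queue wiring the diagram describes, using Drupal 7's queue API. The function and queue names are illustrative, not the contributed module's actual code.

/**
 * Creates one queue task per video found in the YouTube JSON data.
 */
function cyoutube_enqueue_videos() {
  $queue = DrupalQueue::get('cyoutube_video_import');
  // cyoutube_get_channel_videos() stands in for the code that walks the
  // channels and merges each video's JSON with the local metadata
  // (show or artist nid) collected from the node fields.
  foreach (cyoutube_get_channel_videos() as $video) {
    $queue->createItem($video);
  }
}

/**
 * Implements hook_cron_queue_info().
 *
 * Lets cron drain the queue a little at a time, so importing thousands
 * of videos never overloads a single run.
 */
function cyoutube_cron_queue_info() {
  return array(
    'cyoutube_video_import' => array(
      'worker callback' => 'cyoutube_create_video_entity',
      'time' => 60,
    ),
  );
}

/**
 * Queue worker: creates or updates the file entity for one video.
 */
function cyoutube_create_video_entity($video) {
  // Map the YouTube metadata and the local metadata onto the file
  // entity fields, then save the entity.
}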

A little more technical detail below:

How do we get the list of YouTube channel names and the local metadata for the channels? On the module configuration page, we set the fields used to store the channel name. On CMT, we have that field on both show and artist nodes. The module goes through the nodes and builds an array of channel names, with the node ID keyed by field (and thus by content type), like this:

  array(
    'channel name' => array(
      'field_artist_referenced' => array('nid', ...),
    ),
  );
How do we map YouTube metadata fields to entity fields? On the module configuration page, we can set up a mapping between each YouTube metadata field and a local entity field.

In the imported videos, how do we set the local artist and show information? The module has a hook that a custom module can implement to provide that information. In the array above, "field_artist_referenced" is a field machine name on the video entity, and "array('nid')" is the value for that field. By doing that, each imported video gets an entity reference pointing back to the artist or show node, which is one of the best ways to set up the relation.

That is the overall process the module follows to import thousands of videos from YouTube.

Apr 03 2015

Fields have been part of Drupal core since version 7. Fields extend Drupal's ability to build different kinds of systems; since they are the basic units of each entity, they are one of the most important parts of the software. But when it comes to the SQL storage engine, fields could still be more efficient, and I sincerely believe we cannot afford to ignore that. Let us put field SQL storage under a microscope and have a close look.

Case study:

I built a patient scheduling system for a couple of clinic offices. The project itself is not complicated; I have attached the patient profile form to this article. We built a patient profile node type for the form. It is not a complicated form, but there are over 40 fields. It was not difficult to set up a nice patient profile node form. I also created an appointment node type that connects the patient profile and the doctor profile with entity reference fields, and used views with exposed filters for the various reports.

That project is where I found the issue. I was a little uncomfortable after taking a close look at the database: each field has two almost identical tables, and I think fields take too much unnecessary database space. I have dumped one field's database information to explain my concern.

1) Base table: field_data_field_initial

+----------------------+------------------+------+-----+---------+-------+
| Field                | Type             | Null | Key | Default | Extra |
+----------------------+------------------+------+-----+---------+-------+
| entity_type          | varchar(128)     | NO   | PRI |         |       |
| bundle               | varchar(128)     | NO   | MUL |         |       |
| deleted              | tinyint(4)       | NO   | PRI | 0       |       |
| entity_id            | int(10) unsigned | NO   | PRI | NULL    |       |
| revision_id          | int(10) unsigned | YES  | MUL | NULL    |       |
| language             | varchar(32)      | NO   | PRI |         |       |
| delta                | int(10) unsigned | NO   | PRI | NULL    |       |
| field_initial_value  | varchar(255)     | YES  |     | NULL    |       |
| field_initial_format | varchar(255)     | YES  | MUL | NULL    |       |
+----------------------+------------------+------+-----+---------+-------+

Base table SQL script:

CREATE TABLE `field_data_field_initial` (
`entity_type` varchar(128) NOT NULL DEFAULT '',
`bundle` varchar(128) NOT NULL DEFAULT '',
`deleted` tinyint(4) NOT NULL DEFAULT '0',
`entity_id` int(10) unsigned NOT NULL,
`revision_id` int(10) unsigned DEFAULT NULL,
`language` varchar(32) NOT NULL DEFAULT '',
`delta` int(10) unsigned NOT NULL,
`field_initial_value` varchar(255) DEFAULT NULL,
`field_initial_format` varchar(255) DEFAULT NULL,
PRIMARY KEY (`entity_type`,`entity_id`,`deleted`,`delta`,`language`),
KEY `entity_type` (`entity_type`),
KEY `bundle` (`bundle`),
KEY `deleted` (`deleted`),
KEY `entity_id` (`entity_id`),
KEY `revision_id` (`revision_id`),
KEY `language` (`language`),
KEY `field_initial_format` (`field_initial_format`)
);

2) Revision table: field_revision_field_initial

+----------------------+------------------+------+-----+---------+-------+
| Field                | Type             | Null | Key | Default | Extra |
+----------------------+------------------+------+-----+---------+-------+
| entity_type          | varchar(128)     | NO   | PRI |         |       |
| bundle               | varchar(128)     | NO   | MUL |         |       |
| deleted              | tinyint(4)       | NO   | PRI | 0       |       |
| entity_id            | int(10) unsigned | NO   | PRI | NULL    |       |
| revision_id          | int(10) unsigned | NO   | PRI | NULL    |       |
| language             | varchar(32)      | NO   | PRI |         |       |
| delta                | int(10) unsigned | NO   | PRI | NULL    |       |
| field_initial_value  | varchar(255)     | YES  |     | NULL    |       |
| field_initial_format | varchar(255)     | YES  | MUL | NULL    |       |
+----------------------+------------------+------+-----+---------+-------+

Revision table SQL script:

CREATE TABLE `field_revision_field_initial` (
  `entity_type` varchar(128) NOT NULL DEFAULT '',
  `bundle` varchar(128) NOT NULL DEFAULT '',
  `deleted` tinyint(4) NOT NULL DEFAULT '0',
  `entity_id` int(10) unsigned NOT NULL,
  `revision_id` int(10) unsigned NOT NULL,
  `language` varchar(32) NOT NULL DEFAULT '',
  `delta` int(10) unsigned NOT NULL,
  `field_initial_value` varchar(255) DEFAULT NULL,
  `field_initial_format` varchar(255) DEFAULT NULL,
  PRIMARY KEY (`entity_type`,`entity_id`,`revision_id`,`deleted`,`delta`,`language`),
  KEY `entity_type` (`entity_type`),
  KEY `bundle` (`bundle`),
  KEY `deleted` (`deleted`),
  KEY `entity_id` (`entity_id`),
  KEY `revision_id` (`revision_id`),
  KEY `language` (`language`),
  KEY `field_initial_format` (`field_initial_format`)
);

Here are my concerns.

1) Normalization.

Here is one of the fields' data record.

+-------------+-----------------+---------+-----------+-------------+----------+-------+---------------------+----------------------+
| entity_type | bundle          | deleted | entity_id | revision_id | language | delta | field_initial_value | field_initial_format |
+-------------+-----------------+---------+-----------+-------------+----------+-------+---------------------+----------------------+
| node        | patient_profile |       0 |      1497 |        1497 | und      |     0 | w                   | plain_text           |
+-------------+-----------------+---------+-----------+-------------+----------+-------+---------------------+----------------------+

We have value "W" in the Initial field. One character took 51 bytes for storage that had not included index yet. It took another 51 byte in the revision table and more for index. In this case here, only less than two percents of space are used for real data the initial 'W', and over 98% of space is for other purposes.

For the sake of space, I think we should not use varchar for the entity_type, bundle, language and field_format columns. A smallint, tinyint or int[1] takes only one to four bytes. The field is the basic unit of a Drupal website, and a medium-sized website can hold millions of field records; each byte saved equals multiple megabytes in a precious MySQL database.

2) Too complicated a primary key

Each field table has a complicated primary key. The base table uses `entity_type`, `entity_id`, `deleted`, `delta`, `language` as its primary key; the revision table uses `entity_type`, `entity_id`, `revision_id`, `deleted`, `delta`, `language`. "In InnoDB, having a long PRIMARY KEY wastes a lot of disk space because its value must be stored with every secondary index record."[2] It may be worth adding an auto-incrementing int as the primary key instead.

3) An unneeded field column

I found that the bundle column is not necessary; the system could run well without it. In my clinic project, I named the node type "patient profile", and the machine name patient_profile appears in every field record's bundle column. As a varchar, it uses 16 bytes per table record. A quick calculation: with 100,000 nodes of 40 fields each, 100,000 x 40 x 2 x 16 = 128,000,000 bytes, about 122 MB, go to this column alone. At the very least, we could use a 2-byte smallint, which takes only one-eighth of the space.

4) Just use the revision table

Remove one of each field's two data tables. It may take a little more query power to read field data, but it saves time on inserts, updates and deletes. By doing so, we maintain one less table per field and edit content faster; it brings a better editor experience and saves database storage space. A sketch of where these changes could lead is below.
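To make the ideas concrete, here is a hypothetical lean field table expressed with Drupal's Schema API. The integer lookup columns and all the names are my illustration of concerns 1 to 4 combined; this is neither core's schema nor exactly what field_sql_lean does.

// Excerpt from a hypothetical hook_schema() implementation.
// entity_type, language and the text format become small integer
// references into lookup tables, the bundle column is dropped, a serial
// id replaces the six-column primary key, and a single table holds all
// revisions instead of a base/revision pair.
$schema['field_lean_field_initial'] = array(
  'fields' => array(
    'id' => array('type' => 'serial', 'unsigned' => TRUE, 'not null' => TRUE),
    'entity_type_id' => array('type' => 'int', 'size' => 'tiny', 'unsigned' => TRUE, 'not null' => TRUE),
    'entity_id' => array('type' => 'int', 'unsigned' => TRUE, 'not null' => TRUE),
    'revision_id' => array('type' => 'int', 'unsigned' => TRUE),
    'language_id' => array('type' => 'int', 'size' => 'tiny', 'unsigned' => TRUE, 'not null' => TRUE),
    'delta' => array('type' => 'int', 'unsigned' => TRUE, 'not null' => TRUE),
    'deleted' => array('type' => 'int', 'size' => 'tiny', 'not null' => TRUE, 'default' => 0),
    'field_initial_value' => array('type' => 'varchar', 'length' => 255),
    'field_initial_format_id' => array('type' => 'int', 'size' => 'small', 'unsigned' => TRUE),
  ),
  'primary key' => array('id'),
  'indexes' => array(
    'entity' => array('entity_type_id', 'entity_id'),
    'revision' => array('revision_id'),
  ),
);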

A contributed module, field_sql_lean[3], addresses some of the concerns raised here. It still needs a lot of work, both on the module itself and on making other contributed modules compatible with it; after all, it changes the field table structure.

References:

1: http://dev.mysql.com/doc/refman/5.1/en/integer-types.html
2: http://dev.mysql.com/doc/refman/5.0/en/innodb-tuning.html
3: Field SQL storage lean solution
4: Patient profile form (medical form)

Mar 13 2015

I have worked on many Drupal projects over the years. Most of the projects had a version control system, but everyone had a different process. Today I want to share one that I think is great: because of it, there were no accidents in more than five deployments over a half-year period.

The biggest headache in Drupal deployment is the conflict between configuration and content. Content moves downward, from production to staging to development, but configuration moves upward, from development to staging to production. Both configuration and content live in the same database, sometimes in the same tables, so we cannot cleanly separate them and move only the configuration up to production. We used the Features module and packed most of the configuration into several features, but there was still some manual configuration we had to do. The CTO did not want developers to have administrator access to the production server; I agree that is a good idea, since it helps stabilize the production environment, but then someone who knows Drupal has to configure the production site. So they appointed me to a configuration manager position to do that job. The good news is that Drupal 8 has moved configuration into code; hopefully, that will solve the problem gracefully.

It was a typical Drupal website for a small content publisher. We had 5 Drupal developers, 2 QAs, a project manager and a business analyst, plus a group of in-house editors who would be very upset if our system had something wrong during a deployment. We needed a good strategy to ensure a successful deployment within the maintenance window; usually, the downtime was 2 hours.

We used Jira for the issue queue, and a Jira expert helped us set up the process. Issues went through the various stakeholders according to the designed workflow. The project manager decided whether to approve each ticket for the next release, and developers saw all the approved tickets in a working pool. After solving a problem, a developer marked the ticket as done, and it moved into the configuration manager's queue. The configuration manager would then make a quick snapshot of the dev branch, mark all the related tickets as QA-ready, and work with the system admin to push the code to staging and do any necessary manual configuration. We used the Features module extensively, which kept manual configuration to a minimum, and we put all the necessary manual configuration steps in the Jira tickets. QA then got onto the staging server and verified and approved each ticket. A ticket that failed QA was disapproved, and the dev team had to deal with it again. The whole process was reiterated until every ticket passed QA, and then QA marked all the tickets as passed.

At last, the configuration manager used the latest release tag and merged the Git dev branch into the production branch, made a release version tag on production, and afterwards merged all the hotfix branches back into the dev branch. There is a great article about the Git branching model; I think it is worth every developer's time to read it.
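
As a rough sketch of that release step (the branch and tag names here are assumptions, not the project's real ones):

# Merge the QA-approved dev branch into the production branch.
git checkout production
git merge --no-ff dev

# Tag the release on the production branch.
git tag -a v1.2.0 -m "Release 1.2.0"
git push origin production --tags

# Merge any hotfix branch back into dev so the fixes are not lost.
git checkout dev
git merge --no-ff hotfix/cache-fix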

Mar 09 2015
Mar 09

In my previous blog, I talked about importing a large quantity of nodes without the Feeds module. Today, I will show you how to build a complicated listing block without using the Views module.

Have you ever faced a situation that Views cannot solve? I had a unique request from my client http://www.wview.com/ this week. We needed a related-videos block on each video node page, with three levels of sorting: first, show the episode videos from the same series; then, show the videos sharing a tag; finally, show the rest of the videos that are neither in the same series nor have any tags in common. What do those three requirements really mean? We need to sort all video nodes according to the current node's taxonomy tags. So a Views block is not the solution, since we are selecting all the nodes. Even though Views is one of the most used modules on Drupal websites, it is not a medicine for everything. As a rule of thumb, if a problem cannot be solved by one SQL query, we should look for a solution somewhere else. The problem we are solving involves complicated sorting that Views cannot do; this is one of those cases.

It took me and another Drupal developer almost two days trying to build a view for it, and we could not find a good way to sort the results. In the end, we gave up on Views and decided to write a custom module. Within a couple of hours, a customized block was built. I found the custom block easier to build, and it is more efficient than a Views block. Here I would like to show you how it works.

There are three parts to the module. First, get the list of nodes. Second, render the list of nodes. Third, use the Ajax framework to add a "Load More" button.

The most difficult part of the module is getting the list of nodes, since all three levels of sorting are applied to that list. It is the part Views was not able to accomplish, but it is also mostly business logic, and different projects will face different logic. There is one thing I want to mention here: we need to be careful when building the query. We keep adding more and more videos to the website, so avoiding loading too many videos into memory is the key to keeping the system running smoothly after the site grows big. Check the blog "build website with a million nodes" for more detail.

Here is the code for the other two parts.

First, render the list of nodes into a block. We use the node teaser display for each node in the block.

/**
 * Render the list of nodes into HTML.
 */
function related_videos_render_json($count = 7, $position = 0, $cur_nid = 0) {
  $nodes = related_videos_get_list($count, $position, $cur_nid);
  $nids = array();
  foreach ($nodes as $key => $node) {
    $nids[] = $node->nid;
  }
  if (!empty($nids)) {
    if ($count > count($nids)) {
      $count = 0;
    }
    $nodes = node_load_multiple($nids);
    $build = node_view_multiple($nodes);
    if ($count != 0) {
      $build['loadmore'] = array(
        '#prefix' => '<ul id="related_videos_link" class="load-more"><li>',
        '#suffix' => '</li></ul>',
        '#type' => 'link',
        '#title' => 'Load More',
        '#href' => 'related_videos/nojs/' . $count . '/' . ($position + $count) . '/' . $cur_nid,
        '#id' => 'videos_ajax_link',
        '#options' => array('attributes' => array('class' => array('use-ajax'))),
        '#ajax' => array(
          'wrapper' => 'related_videos_link',
          'method' => 'json',
        ),
      );
    }
  }
  else {
    $build['no_content'] = array(
      '#prefix' => '<p>',
      '#markup' => t('There is currently no content.'),
      '#suffix' => '</p>',
    );
  }
  return render($build);
}

The function related_videos_get_list($count, $position, $cur_nid) fetches the list of nodes; it is where we put our node sorting logic.
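
The sorting logic is project-specific, so it is not shown here. As a hypothetical sketch (the content type name 'video' and the published filter are assumptions), the function could pull just one page of node IDs with a ranged query:

/**
 * A hypothetical sketch: fetch one page of video node IDs.
 *
 * Only $count rows starting at $position come back from the database,
 * so memory use stays flat no matter how many videos the site holds.
 */
function related_videos_get_list($count, $position, $cur_nid) {
  $query = db_select('node', 'n')
    ->fields('n', array('nid'))
    ->condition('n.type', 'video')
    ->condition('n.status', NODE_PUBLISHED)
    ->condition('n.nid', $cur_nid, '<>')
    ->range($position, $count);
  // The three-level ordering (same series, shared tags, everything else)
  // would be added here as joins and ORDER BY expressions.
  return $query->execute()->fetchAll();
}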

Then we use two block hooks to create a block for the list of videos.

/**
* Implements hook_block_info().
*/
function related_videos_block_info() {
  $blocks['related_videos_block'] = array(
    'info'    => t('Related videos block'),
    'cache' => DRUPAL_NO_CACHE,
  );
  return $blocks;
}

/**
* Implements hook_block_view().
*/
function related_videos_block_view($delta = '') {
  switch ($delta) {
    case 'related_videos_block':
      $cur = related_videos_current_node();
      drupal_add_library('system', 'drupal.ajax');
      $block['subject'] = t("Related Videos");
      $block['content'] = '<section class="col-xs-12 col-sm-8 wv-contnet video-related-wrapper"><h2>Related Videos</h2>';
      $block['content'] .= related_videos_render_json(7, 0, $cur->nid);
      $block['content'] .= '</section>';
      break;
  }
  return $block;
}

So now we have a block that can be assigned anywhere with the Drupal block system, Panels, or the Context module.

The last part uses the Drupal Ajax framework to add a "Load More" button, so we can load more videos without reloading the page.

First, implement hook_menu() to define the callback paths for the Ajax request.

/**
* Implements hook_menu().
*/
function related_videos_menu() {
  $items = array();
  $items['related_videos/ajax'] = array(
    'page callback' => 'related_videos_ajax',
    'access callback' => 'user_access',
    'delivery callback' => 'ajax_deliver',
    'access arguments' => array('access content'),
    'type' => MENU_CALLBACK,
  );
  $items['related_videos/nojs'] = array(
    'page callback' => 'related_videos_nojs',
    'access callback' => 'user_access',
    'access arguments' => array('access content'),
    'type' => MENU_CALLBACK,
  );
  return $items;
}

The second menu item, $items['related_videos/nojs'], is the fallback for browsers with JavaScript disabled. For more detail, check the Ajax framework documentation.
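
The related_videos_nojs() callback is not shown in the post. A minimal sketch, assuming the fallback simply returns the same rendered HTML as a regular page:

/**
 * Page callback: non-JavaScript fallback for the "Load More" link.
 *
 * A hypothetical sketch: without Ajax, deliver the rendered list as a
 * normal page so the link still works.
 */
function related_videos_nojs($count = 7, $position = 0, $cur = 0) {
  return related_videos_render_json($count, $position, $cur);
}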

Then we build the callback function related_videos_ajax(), the one we referenced in the hook_menu() implementation.

/**
* Ajax callback for related videos content.
*/
function related_videos_ajax($count, $position, $cur) {
  $data = related_videos_render_json($count, $position, $cur);
  $commands = array();

  // Replace the wrapper that holds the "Load More" link with the new batch.
  $commands[] = ajax_command_replace('#related_videos_link', $data);
  $page = array('#type' => 'ajax', '#commands' => $commands);
  return $page;
}

The function related_videos_render_json() is the same function we used to render the initial block content. Here, it produces the HTML for the list of nodes, which is delivered to the browser in JSON format for the "Load More" button. The parameters $count, $position, and $cur are passed in from the URL, and that URL for the "Load More" link is built inside related_videos_render_json().

These three parts form a custom block displaying uniquely ordered video nodes. With the powerful Drupal theme functions and the Ajax framework, it is easy to build a block like this.

Feb 27 2015
Feb 27

Design a Drupal website with a million nodes in mind from the beginning. We build a Drupal website, and it runs well at first, until one day the system holds hundreds of thousands of nodes. The site becomes slow, and we have to wait many seconds before a new page opens. Not only is it slow, but sometimes we also get errors such as memory exhausted.

Most of the time, the problem was planted at the design stage of the system. When designing a site, there are things we as developers have to take care of: we need to bear in mind that the site will grow, and more and more nodes will come. Every time we create a function, we need to make sure it will still work fine when there are hundreds of thousands of nodes in the system. Otherwise, those functions may time out or exhaust all the memory as the number of nodes keeps increasing.

PHP has a maximum memory limit for each request. Sometimes it is 128 MB; sometimes it is 256 MB. The number is limited, and it is certainly not infinitely large. There is no limit, however, on how many nodes can exist on our website. As our system gets larger and larger with more nodes created, we will hit the memory limit sooner or later if we did not take it into consideration at the beginning.

Here is a quick example. Drupal has the function node_load_multiple(). Called with FALSE instead of an array of node IDs, it loads every node in the database into memory. Here is some code from a contributed module.

foreach (node_load_multiple(FALSE) as $node) {
  // Modify node objects to be consistent with Revisioning being
  // uninstalled, before updating the {taxonomy_index} table accordingly.
  unset($node->revision_moderation);
  revisioning_update_taxonomy_index($node, FALSE);
}

This code is in an implementation of hook_uninstall(). It runs into a problem when there are over 10,000 nodes in the system, and as a result we cannot uninstall the module. Here is the error message:

Fatal error: Allowed memory size of 268435456 bytes exhausted (tried to allocate 36 bytes) in ...

It used up all 256 MB of memory before it could load all the nodes. As a result, the module could never be uninstalled from the site.
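
A safer pattern is to process the nodes in chunks. Here is a hedged sketch of how the same loop could be rewritten (the chunk size of 200 is an assumption):

// Load and process nodes 200 at a time instead of all at once.
$nids = db_query('SELECT nid FROM {node}')->fetchCol();
foreach (array_chunk($nids, 200) as $chunk) {
  foreach (node_load_multiple($chunk) as $node) {
    unset($node->revision_moderation);
    revisioning_update_taxonomy_index($node, FALSE);
  }
  // Clear the static entity cache so memory is released between chunks.
  entity_get_controller('node')->resetCache();
}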

It is an extreme case, but as we troubleshoot an existing site, we may notice similar cases here and there. I also noticed that we can do something about the field_sql_storage module to make Drupal run faster and keep the SQL database smaller.

Jan 08 2015
Jan 08

When we talk about Drupal performance, the first thing that comes to my mind is caching. But today I found another way to make Drupal run a little faster. It is not a profound thing, but it is something many may overlook. At work, I need to process 56,916 records constantly with an automated cron process. It took 13 minutes 30 seconds to process all those records. By adding a new database field index, I reduced the processing time to just one minute 33 seconds. That is more than eight times faster.

Here are the details. I have about fifty thousand records that are updated daily. For each record, I create a hash and store it in a field. Whenever a record is inserted or updated, I check whether its hash already exists in the database. The project requires searching on the field revision table. Here is the code in my custom module.

$exist = db_query("SELECT EXISTS(SELECT entity_id FROM {field_revision_field_version_hash} WHERE field_version_hash_value = :hash)", array(':hash' => $hash))->fetchField();
// Return if we have imported this schedule item before.
if ($exist) {
  return;
}

So checking the hash in the database became one of the heaviest operations; it consumed a lot of system resources. Adding a single index to the field revision table made the process eight times faster. Here is the code I put in the module's install file.

// Add version-hash indexes.
if (!db_index_exists('field_revision_field_version_hash', 'version_hash')) {
  db_add_index('field_revision_field_version_hash', 'version_hash', array('field_version_hash_value'));
}

When we build a Drupal website, we are not dealing with the database directly. But even though Drupal creates the tables for us, we can still alter them and make them better.

Jan 07 2015
Jan 07

One of the common Drupal tasks is importing data into a system. Importing a large number of articles, or any type of entity, is a little challenging. There are two tricks here. One is obvious: we need to find a way to break one mega task into small tasks. The other is not so obvious: we have to minimize the damage of every mistake and keep a small failure (an uncaught exception) from breaking the whole import process. To achieve that, we isolate the processing of each node.

Here is a case study from one of my previous projects. The customer needed to import and constantly update 55,000 schedule records every day. We imported TV channel schedule data as node entities. There are two ways to achieve the goal: we can either use the Feeds module or build our own custom modules.

The Feeds module provides a full set of functions to do the job, and it took less than an hour to set everything up. But it hit a timeout error every time after importing around 2,000 nodes. The Feeds module did not provide a way to divide a big task into many small chunks, so we would have had to implement the Feeds fetch hook ourselves. I had used the Feeds module when I was working at TVO.ORG, and it was not as efficient as we had expected. A problem in a single node would impact the whole import process, and since we just cannot guarantee clean source data, it is difficult to be error-free for all nodes. Human involvement means a content manager may make a mistake at some point.

So I chose to build custom modules for it. Following the Feeds module's architecture, there are three parts in the importing process: the fetcher, the parser, and the processor. We set up a module to fetch the data and create a Drupal system queue item for each node. But we would still hit the timeout issue if we tried to put over 50,000 nodes into the queue at once, so we handle 5,000 nodes each time and use an internal counter to track the position. Depending on the kind of data source, we put different program logic in the fetcher. The fetcher is responsible for building the queue; it is also responsible for making sure there are no missing or duplicated records. Then, in the queue worker callback function, we built the parser and processor for each queue task. One task deals with one node only: it creates that node. By doing this, we successfully detached the fetcher from the parser and processor, so a breakdown in parsing or processing would not affect the others anymore.

Here is a little more about the fetcher. We used cron to transfer the source data into the Drupal system queue. Surely, it would time out if we put all 55,000 records into the queue at one time; in most cases, it is not possible for Drupal to handle over 20,000 nodes in one task. So we break the job into small tasks: a single cron run handles just 5,000 records, importing them from the source data and putting them into the queue. We have not started creating any nodes yet. At the end of this stage, all the source data has been transferred into the system queue, which holds 55,000 tasks, each carrying the data to create one node. Use the contributed Queue UI module to see the created tasks.
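
Here is a hypothetical sketch of such a fetcher. The module name schedule_import and the source-reading helper schedule_import_fetch_source() are stand-ins, not real code from the project:

/**
 * Implements hook_cron().
 *
 * Each cron run moves up to 5,000 source records into the queue,
 * tracking its position with a persistent counter.
 */
function schedule_import_cron() {
  $queue = DrupalQueue::get('schedule_import');
  $position = variable_get('schedule_import_position', 0);
  // Stand-in for the project-specific code that reads one slice of
  // the source data.
  $records = schedule_import_fetch_source($position, 5000);
  foreach ($records as $record) {
    // One queue item per node keeps a bad record from breaking the rest.
    $queue->createItem($record);
  }
  variable_set('schedule_import_position', $position + count($records));
}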

Now the import of each node is a separate task by itself; the failure of one will not affect the others.

Since we have already put our parser and processor code in the queue worker callback function, the last thing to do is to run through the queue and finish all the tasks in it. Drupal cron runs queue tasks by default, but I needed better control over creating and cleaning the queue and wanted a different program to handle it. There is a mob_queue module: it finishes the queued tasks with a Drush command and keeps the queue from running during the default cron execution. mob_queue also allows assigning how much time to spend executing the queue. Drush has a default queue execution command, but it does not deal with the cron queue execution. We can invoke the Drush command from the Linux crontab.
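
A sketch of the worker side might look like the following. The 'skip on cron' flag in hook_cron_queue_info() is the standard way to keep the default cron run from draining the queue; the content type and field mapping here are assumptions:

/**
 * Implements hook_cron_queue_info().
 */
function schedule_import_cron_queue_info() {
  $queues['schedule_import'] = array(
    'worker callback' => 'schedule_import_worker',
    // Leave the queue alone on cron so Drush can process it instead.
    'skip on cron' => TRUE,
  );
  return $queues;
}

/**
 * Queue worker: the parser and processor for one record.
 */
function schedule_import_worker($record) {
  // Map the raw record onto a new node; the type and fields here are
  // hypothetical.
  $node = new stdClass();
  $node->type = 'schedule_item';
  $node->language = LANGUAGE_NONE;
  node_object_prepare($node);
  $node->title = $record['title'];
  node_save($node);
}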

In the end, we built our customized modules to import and update all those nodes. They are much more lightweight, with less overhead, than the Feeds module, and they give us more control over the importing process.
