Aug 23 2016

GSoC is almost over and I’m now submitting the project I started nearly 4 months ago with my awesome mentors Dick Olsson, Andrei Jechiu and Tim Millwood. I knew it would be a great journey, but I had no idea it would be this awesome. Everything about GSoC was perfect: the organisation, my mentors and the project itself, which was challenging and interesting at the same time. Weekly blog posts, daily updates and communication with my mentors over Slack made sure we stayed on track. The first half was easy and I struggled a bit in the second half, but my mentors were always there to help with every issue I had. I used to feel a bit embarrassed that I was not getting things done as easily as another student might have, but they never saw it that way. They always had a smile on their faces and a solution for me to work on. I’m really glad I got such supportive mentors. Thanks a lot, Drupal.

Drupal makes me happy

Project Description: The task was to resolve the conflicts which may occur when two users modify the same piece of content/entity on their respective ends and then try to push it. Initially it looked like a very difficult task and I was really afraid when I chose the project. But Dick Olsson told me that it was a very interesting project, that nothing like it had been done in Drupal for a long time, and that it might get a lot of attention and support from other people in the community, so I was more than happy to go with it. It took me some time to actually understand how we would solve the problem and, to be very honest, during the community bonding period I was not sure I understood it clearly. I decided to go step by step and focus on what I had to do at any one time. Once we started coding, I could see where the code I was writing at that moment would be used later. Soon I had a very clear picture of our project and the path we had to follow to make it a success. All thanks to my mentors again for their constant guidance.

How we solved it: I started by understanding what actually happens in the Drupal backend when a node is updated. I learnt that a revision is created every time a node is updated. Multiple revisions form a graph with the actual entity as the root node. In the image below, you can see a graph with multiple nodes. All these nodes are revisions and the root is the actual entity. In simpler words, when we create a new entity we start a graph, and when we update that entity we add a node to that graph. Note that even when we remove an entity, we add a node to the graph rather than removing one.

The complete solution depended on two steps:

  1. Find the lowest common ancestor (LCA) of 2 revisions.

  2. Run the recursive 3-way merge algorithm on them.

I will start by explaining the 2nd step, as it pretty much covers the 1st one too. The recursive 3-way merge algorithm takes 3 entities as input (local, remote and base) and compares two of them (local and remote) with the base entity.

It then makes sure that no content changed in local is also changed in remote, and vice versa. For example: if local changes line 19, then line 19 of remote must be unchanged, i.e. identical to line 19 of the base entity. Remote may change any other line, but not the ones modified in local. The same holds the other way round. If no changes in local and remote overlap, there are no merge conflicts; otherwise there is a conflict.
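The rule above can be sketched in a few lines of Python. This is only an illustration, not the relaxedws/merge code: the function name `three_way_merge`, the sample lines and the assumption that all three versions have the same number of lines are mine. It also treats identical edits on both sides as agreement, which is a common convention rather than something stated above.

```python
def three_way_merge(base, local, remote):
    """Merge three equal-length lists of lines.

    Returns (merged, conflicts), where conflicts holds the indexes of
    lines that local and remote both changed in different ways.
    """
    merged, conflicts = [], []
    for index, (b, l, r) in enumerate(zip(base, local, remote)):
        if l == r:
            # Both sides agree (unchanged, or changed identically).
            merged.append(l)
        elif l == b:
            # Only remote changed this line.
            merged.append(r)
        elif r == b:
            # Only local changed this line.
            merged.append(l)
        else:
            # Both changed it differently: keep base and report a conflict.
            merged.append(b)
            conflicts.append(index)
    return merged, conflicts


# Hypothetical sample data: line 0 changed remotely, line 1 changed on both sides.
base = ["title = Hello", "loops = 5", "body = text"]
local = ["title = Hello", "loops = 20", "body = text"]
remote = ["title = Hi", "loops = 10", "body = text"]

merged, conflicts = three_way_merge(base, local, remote)
print(merged)
print(conflicts)
```

Here the remote-only change to the first line is merged automatically, while the second line is reported as a conflict because local and remote disagree with each other and with base.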

Let’s take another example: in the image below, line 51 is changed by both developers and its content differs from the base entity. In this scenario the system cannot decide on its own whether we want the loop to run 10 times as per the remote entity, 20 times as per the local entity, or just 5 times as the base entity says. So we let the user decide manually. In our case, the developer wanted it to run 25 times.

Figure 1: Explaining how merge conflicts occur.
Image by: Dr. Dobb's (for informational use only)

For further information on how it all works, you can also have a look at the 3rd week blog post at my blog.

Now coming back to the first step, “Finding LCA”: the base entity we use in the recursive 3-way merge is the LCA of the two revisions. It serves as the base because it contains the latest content that both revisions we are comparing have in common. So we find the common parent closest to both revisions and compare the revisions with that parent only. That parent, the LCA, acts as the base entity.

Example: Below is a graph of revisions with revision IDs and statuses. As we can see, there are multiple revisions. Now let’s suppose someone (Developer A) starts editing the revision with revision_id 4-8B10006… and, at the very same time, another person (Developer B) starts editing 4-82460F0…. When both developers try to save their edits, we don’t know which changes should be kept in the final revision (the new revision created after the merge). For that purpose, we need a system which can compare the contents of the revisions and make sure that there is no conflict.

As discussed above, we first find the lowest common ancestor of the two revisions the developers have modified. Since revision 3-6F7F875… is the closest to both of them, it is the lowest common ancestor. The relaxedws/lca library does this task of finding the LCA of two revisions. Now we have three important parts we can use from the graph:

  • Revision 1 (4-8B10006…) which was modified by developer A.

  • Revision 2 (4-82460F0…)  which was modified by developer B.

  • Lowest Common Ancestor of both the revisions (3-6F7F875…). 

Figure 2: Revision graph created by the Workspace module.

The next step is to find out whether there are any merge conflicts. For that purpose, we compare the contents of revision 4-8B10006… with revision 3-6F7F875… and inspect all the changes made in revision 4-8B10006…. We do the same for revision 4-82460F0…. Once we have compared both revisions with their LCA (revision 3-6F7F875…), we compare the changes made in both revisions. If both sets of changes touch the same content, we know there is a merge conflict, as the system cannot decide which changes are more important. At that point we have to show a 3-pane window to the developer and let them decide what they want to keep. They are shown the content of all three parts, i.e. revision 1, revision 2 and the LCA. The manually chosen content then becomes a new revision. The three-pane window would look like this.

Figure 3: 3-pane window to resolve merge conflicts manually (containing content from the Local, Base and Remote entities).
Image by: CodeMirror demo (for informational use only)

Once the conflicts are resolved, or if there were no conflicts in the first place, the graph would look something like the following image:

Expected graph after conflicts are resolved.

Initially the biggest concern was to get my system set up and running properly. I had to install Drupal 8 locally and get an IDE with a good debugger to start working. As per my mentors’ suggestion, I used the PhpStorm IDE from JetBrains and Xdebug for debugging. My mentors also kept telling me to learn how to test the code I would write, as they wanted a “write tests first, code later” approach. It was one of the most challenging tasks, as I had no exposure to testing and was totally new to it. So, with a little knowledge of everything, I started my journey, and I’m really happy we kept it going till the end.

Once the problem was clear, we started coding our way into it. We divided the whole process into subtasks and completed them one at a time; in the end they were integrated to complete the project.

Let’s take a look at all the subtasks and how we implemented them.

Task 1: The first task was to create a PHP library which could return the lowest common ancestor from a directed acyclic graph.

Implementation: We used the clue/graph library and ran a breadth-first search on the graph from both revisions, then found the common nodes and returned the one closest to both.
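To make the idea concrete, here is a small Python sketch of the same approach: a breadth-first search from each revision towards the root, followed by picking the closest common node. It is not the relaxedws/lca code and does not use clue/graph; the function names, the parent-map representation and the shortened revision IDs are assumptions for illustration. "Closest to both" is read here as the smallest sum of distances, which is one possible interpretation.

```python
from collections import deque


def ancestor_distances(parents, start):
    """BFS from start towards the root; map each reachable revision to its distance."""
    distances = {start: 0}
    queue = deque([start])
    while queue:
        node = queue.popleft()
        for parent in parents.get(node, []):
            if parent not in distances:
                distances[parent] = distances[node] + 1
                queue.append(parent)
    return distances


def lowest_common_ancestor(parents, rev_a, rev_b):
    """Return the common ancestor with the smallest total distance, or None."""
    dist_a = ancestor_distances(parents, rev_a)
    dist_b = ancestor_distances(parents, rev_b)
    common = set(dist_a) & set(dist_b)
    if not common:
        return None
    # Ties are broken by revision ID so the result is deterministic.
    return min(common, key=lambda rev: (dist_a[rev] + dist_b[rev], rev))


# Hypothetical revision graph: each revision maps to the list of its parents.
parents = {
    "2-b": ["1-a"],
    "3-c": ["2-b"],
    "4-d": ["3-c"],
    "4-e": ["3-c"],
}

print(lowest_common_ancestor(parents, "4-d", "4-e"))
```

In this sketch a revision counts as its own ancestor, so asking for the LCA of a revision and one of its direct ancestors returns that ancestor.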

Code: The code is published on GitHub and on Packagist.

List of all my commits: Commits made by Rakesh Verma in the relaxedws/lca library.

A pull request may contain multiple commits.

Task 2: Once we were done with the relaxedws/lca library, we needed another library which could use the LCA returned by relaxedws/lca to implement the recursive 3-way merge algorithm described above. For this purpose we created the relaxedws/merge library, which takes 3 arrays as input and merges them according to the 3-way merge algorithm.

Implementation: We implemented the whole algorithm ourselves for this library. We converted the contents of each file into an array and then compared each element of that array with the array from the other revision. If a value at the same index had changed in one array, we made sure it had not also been changed in the other array relative to the base; otherwise we threw a custom exception alerting the user to the merge conflict.
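A rough Python sketch of that behaviour on nested arrays is shown below. It is not the relaxedws/merge implementation: `MergeConflict`, `merge_arrays` and the sample field names are hypothetical, and the sketch assumes string keys and treats a missing key the same as a `None` value.

```python
class MergeConflict(Exception):
    """Raised when local and remote change the same value in different ways."""


def merge_arrays(base, local, remote, path=()):
    """Recursively 3-way merge nested dicts, raising MergeConflict on overlap."""
    merged = {}
    for key in sorted(set(base) | set(local) | set(remote)):
        b, l, r = base.get(key), local.get(key), remote.get(key)
        if isinstance(b, dict) and isinstance(l, dict) and isinstance(r, dict):
            # Nested array on all three sides: merge it recursively.
            value = merge_arrays(b, l, r, path + (key,))
        elif l == r:
            value = l
        elif l == b:
            value = r  # only remote changed this value
        elif r == b:
            value = l  # only local changed this value
        else:
            raise MergeConflict("conflict at " + "/".join(path + (key,)))
        if value is not None:
            merged[key] = value
    return merged


# Hypothetical entity arrays: local edits the title, remote edits the body.
base = {"title": "Hello", "body": {"value": "text", "format": "plain"}}
local = {"title": "Hello world", "body": {"value": "text", "format": "plain"}}
remote = {"title": "Hello", "body": {"value": "new text", "format": "plain"}}

result = merge_arrays(base, local, remote)
print(result)

# Both sides changing the title differently raises the custom exception.
try:
    merge_arrays(base, local, dict(remote, title="Hi"))
    conflict_message = None
except MergeConflict as error:
    conflict_message = str(error)
print(conflict_message)
```

Raising an exception instead of returning a partial result leaves the decision to the caller, who can then show the three versions to the user.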

List of all commits: Commits made by Rakesh Verma in Relaxedws/merge Library.

Task 3: After successfully completing both libraries, we were ready to use them in the Multiversion module. But then we realized we also needed a module which does not use these libraries and instead uses a simpler approach to find the LCA and perform the merge in Drupal, since not all Drupal sites use Multiversion and so not all of them would use these libraries.

Implementation: As discussed earlier, Drupal creates revisions in a linear graph by default, so we created a module named “Conflict” which takes 2 node IDs as input and returns the parent of whichever was created first.

For example: LCA(4,5) is 3, LCA(12,16) is 11, and LCA(1,8) is 1, as 1 is the root node and has no parent.

Similarly, merge takes 3 node IDs and returns the last-created of the 3, since in a linear graph the last-created node is the latest one and contains all the latest changes. So Merge(3,4,5) returns the contents of the node with id = 5 and Merge(4,6,9) returns the contents of the node with id = 9.
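The linear case is simple enough to write down directly. The Python sketch below only mirrors the examples given above; it is not the Conflict module's code. The names `simple_lca` and `simple_merge` are made up, IDs are assumed to be sequential integers starting at 1, and the merge returns the winning ID rather than the node's contents.

```python
def simple_lca(rev_a, rev_b):
    """Parent of the earlier revision in a linear history, or the root itself."""
    first = min(rev_a, rev_b)
    return first - 1 if first > 1 else 1


def simple_merge(rev_a, rev_b, rev_c):
    """In a linear history the latest revision already contains every change."""
    return max(rev_a, rev_b, rev_c)


print(simple_lca(4, 5), simple_lca(12, 16), simple_lca(1, 8))
print(simple_merge(3, 4, 5), simple_merge(4, 6, 9))
```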

We created a pluggable pattern with services in the container which can be used by other Drupal modules as well. The services we created in the Conflict module were also used in the Multiversion module to detect and resolve merge issues in complex graphs where a node can have multiple branches (a nonlinear graph).

Code: The code is available at Drupaldeploy/drupal-conflict.

The list of commits made by me is available here: Commits made by Rakesh Verma.

Task 4: After implementing the simple LCA resolver and the simple merge resolver in the Conflict module, we were ready to start integrating the relaxedws/lca library into the Multiversion module. All websites which use the Multiversion module can use this library to find the lowest common ancestor in the graph of revisions and use it as input to the merge library.

The feature is already completed and merged in Multiversion module.

The list of all the commits made by me is here: Commits made by Rakesh Verma to integrate relaxedws/lca with the Multiversion module.

Task 5: Now we were left with the integration of the relaxedws/merge library with the Multiversion module. We have implemented the functionality and are writing tests for it. Some tests are failing because the revision ID is different for each revision: the merge library throws an exception because the revision ID has changed in all three revisions and the library cannot decide on its own which value should be kept in the final revision. We’re trying to solve this issue.

Implementation: We created a method to convert the tree returned by the Multiversion module into a graph, and then used that graph and its contents in the PHP libraries we created in the first phase. We integrated those libraries with the Multiversion module and accessed them via services defined in the Conflict module.

Code: All the commits made by me can be seen here: Link to pull request for integrating relaxedws/merge into multiversion module.

Task 6: The last task of the project is to integrate CodeMirror with the Multiversion module so that, in case of a merge conflict, the user is shown all 3 nodes (local, remote and base) and can decide which changes are kept in the final revision. This is where CodeMirror comes in. This is the last part of the project and we ran out of time.

Weekly Breakdown: The work we did weekly is mentioned below:

  • Community Bonding period:  Familiarize myself with the organizations, mentors and the codebase.

  • Coding Phase Week 1: Finished the PHP library to find LCA. (Blog Post)

  • Week 2: Wrote tests for the PHP library. (Blog Post)

  • Week 3: Finished the code for PHP library to perform recursive 3-way merge (Blog Post)

  • Week 4: Wrote tests for the PHP library to ensure it performs the algorithms correctly (Blog post)

  • Week 5: Started code to create module to find LCA in linear graph (Blog post).

  • Week 6: Wrote tests for module and Finished the LCA part of the Conflict module (Blog Post).

  • Week 7: Implemented Simple Merge Resolver in Conflict Module (Blog Post).

  • Week 8: Extensive tests written for Simple Merge Resolver (Blog Post).

  • Week 9: Finished the Conflict module (Blog Post).

  • Week 10: Started Integration of PHP library which finds LCA from a graph into Multiversion module. (Blog Post)

  • Week 11: Wrote tests to ensure integration has been done correctly in multiversion module (Blog post).

  • Week 12: Finished Integration of PHP library to perform recursive 3-way merge with multiversion module (Blog Post).

  • Week 13: Wrote Tests for the integration part of the PHP library with multiversion Module.

  • Week 14: Fixed bugs, added documentation and fixed other issues.

Future Work: I’ll keep contributing to the organisation as much as possible and I’ll finish the tasks which I couldn’t complete in the given time. I’ll also try to mentor students in Google Code-in. This was such a great experience, and my learning curve grew exponentially. I would be really fortunate if I could experience something like this again, or if I could get another chance to work full time with my mentors.

Conclusion: Participating in Google Summer of Code has brought me closer to the Drupal community, which is made up of an amazing group of developers. Working with them taught me much more than just writing code: essential skills like effective communication and extensive testing. I learnt not only about full-scale development but also about writing industry-level code, which I believe I couldn’t have learnt on my own. Every single time I tried to write code with as few mistakes as possible, but there were always some. In trying not to make new mistakes, I sometimes made the same mistakes multiple times, but my mentors (especially Andrei) always pointed them out and explained why I shouldn’t repeat them. Writing tests was one of the major lessons. I never thought they were this important. After writing code, when I thought the work was done, it was always just the start. Fixing errors took twice as long as writing the code. When my mentor Dick first told me that writing code ideally takes 1/3rd of the total time, I didn’t take him very seriously until I experienced it myself. Other than this, I learnt the importance of having a good, working debugger. A task of a few hours can take more than a week if you don’t have a working debugger, as sometimes you simply cannot identify the problems in your code. No matter how fine your code is or how obvious it seems that it should work, if tests are failing there is an issue on your end, and only a debugger can help you find it. There were many other little things I got to learn, like working to a deadline and not sticking with a method that isn’t working instead of spending more time trying to make it work. All these things will help me a lot in shaping my future as a software developer. I will be forever indebted to my mentors for providing such a nurturing environment.

I would sincerely like to thank my mentors Dick Olsson, Andrei Jechiu and Tim Millwood, my co-mentor Matthew Lechleider and the whole Drupal community for the support over IRC. A big thanks to Drupalize.me for giving us a chance to learn about Drupal through their website for free. Thanks a lot, Google, for providing us the opportunity, and thanks, Drupal, for choosing me. I am really grateful.

Aug 17 2016

Woah!! We're in the last week of Google Summer of Code. It's hard to believe that the journey we started on 22nd April is about to come to an end. Soon I’ll be publishing a post summarizing my experience, the development work and the other great achievements of this time. It was a really awesome journey, and I’m really happy that the bond we share with the community and open source is not tied to GSoC only. The thought that I will be able to contribute after GSoC as well is a really good motivation. There are many things that I learned during this time and I would be happy if I could ever teach any of them to someone. My mentors were the best part of this journey. I’m really happy I got mentors like them; they played a huge role in keeping this journey interesting and really enjoyable. I feel very lucky to have 3 mentors, each of them a master in their field. On top of that, what I really like about them is their behaviour, their nature and the respect they give me, despite the fact that I make mistakes almost all the time and they have to find the mistakes in my code every single week in every task. I always feel embarrassed, but they always tell me that they wouldn’t have signed up to be mentors if they had any problem with this. My words cannot capture even a bit of their awesomeness. I’d really recommend you do at least a single (even small) project under their guidance, and write to me if you don’t fall in love with them. You can contact them on IRC or through their Drupal profiles (Dick Olsson, Andrei Jechiu and Tim Millwood). No matter how busy they are, they will definitely make some time for you, and you’ll see them working almost all the time. No matter what time you message them, they always seem to be online to solve your issues.

Coming back to the Project details,

Project Details: The task is to resolve the conflicts which may occur when two users modify the same piece of content/entity on their respective ends and then try to push it. So far, we have a way to detect conflicts but no way to resolve them.

Solution we're working on: When a node is updated, a revision is created. Multiple revisions form a graph with the actual entity as the root node. In the image below, you can see a graph with multiple nodes. All these nodes are revisions and the root is the actual entity.

We detect conflicts in 2 steps:

  1. Find Lowest common ancestor of 2 revisions.

  2. Implement recursive 3-way merge algorithm on them.

I will start by explaining the 2nd step, as it pretty much covers the 1st one too. The recursive 3-way merge algorithm takes 3 parameters as input, representing 3 entities (local, remote and base), and compares two of them (local and remote) with the base entity.

It then makes sure that no content changed in local is also changed in remote, and vice versa. For example: if local changes line 19, then line 19 of remote must be unchanged, i.e. identical to line 19 of the base entity. Remote may change any other line, but not the ones modified in local. The same holds the other way round. If no changes in local and remote overlap, there are no merge conflicts; otherwise there is a conflict.

Let’s take another example: in the image below, line 51 is changed by both developers and its content differs from the base entity. In this scenario the system cannot decide on its own whether we want the loop to run 10 times as per the remote entity, 20 times as per the local entity, or just 5 times as the base entity says. So we let the user decide manually. In our case, the developer wanted it to run 25 times. ;)

Explaining how merge conflicts occur.
Image by: Dr. Dobb's (for informational use only)
 

For further information on how it all works, you can also have a look at the 3rd week blog post at my blog.


Now coming back to the first step, “Finding LCA”: the base entity we use in the recursive 3-way merge is the LCA of the two revisions. It serves as the base because it contains the latest content that both revisions we are comparing have in common. So we find the common parent closest to both revisions and compare the revisions with that parent only. That parent, the LCA, acts as the base entity.

Example: Below is a graph of revisions with revision IDs and statuses. As we can see, there are multiple revisions. Now let’s suppose someone (Developer A) starts editing the revision with revision_id 4-8B10006… and, at the very same time, another person (Developer B) starts editing 4-82460F0…. When both developers try to save their edits, we don’t know which changes should be kept in the final revision (the new revision created after the merge). For that purpose, we need a system which can compare the contents of the revisions and make sure that there is no conflict.

As discussed above, we first find the lowest common ancestor of the two revisions the developers have modified. Since revision 3-6F7F875… is the closest to both of them, it is the lowest common ancestor. The relaxedws/lca library does this task of finding the LCA of two revisions. Now we have three important parts we can use from the graph:

  • Revision 1 (4-8B10006…) which was modified by developer A.

  • Revision 2 (4-82460F0…)  which was modified by developer B.

  • Lowest Common Ancestor of both the revisions (3-6F7F875…) .                                         

Revision graph created by the Workspace module.

The next step is to find out whether there are any merge conflicts. For that purpose, we compare the contents of revision 4-8B10006… with revision 3-6F7F875… and inspect all the changes made in revision 4-8B10006…. We do the same for revision 4-82460F0…. Once we have compared both revisions with their LCA (revision 3-6F7F875…), we compare the changes made in both revisions. If both sets of changes touch the same content, we know there is a merge conflict, as the system cannot decide which changes are more important. At that point we have to show a 3-pane window to the developer and let them decide what they want to keep. They are shown the content of all three parts, i.e. revision 1, revision 2 and the LCA. The manually chosen content then becomes a new revision. The three-pane window would look like this.

3-pane window to resolve merge conflicts manually (containing content from the Local, Base and Remote entities).
Image by: CodeMirror demo (for informational use only)

Once the conflicts are resolved, or if there were no conflicts in the first place, the graph would look something like this:

Just a prototype:

          3-6F7F875…
          /        \
4-8B10006…          4-82460F0…
          \        /
          5-XXXXXX…

Progress so far: We have created two PHP libraries: one to find the LCA in a directed acyclic graph and another to perform the recursive 3-way merge algorithm. We have also created a module named "Conflict" which finds the LCA and performs the merge on Drupal entities. The Conflict module doesn’t use the libraries we created in the first phase of GSoC, because by default Drupal 8 uses a linear graph of revisions. For example:

1 -----> 2 -----> 3 -----> 4 -----> 5

Here “1” is the actual node and all the other numbers are its revisions. Finding the lowest common ancestor in such a linear graph is easy. For example, LCA(2,3) is 1 and LCA(4,5) is 3. Also, LCA(1,3) is 1. So if neither of the revisions is the root, we return the parent of the revision which was created first; otherwise we return the root (the actual entity). Similarly, for the merge method we just find the last-created of the 3 revisions and return it as the result, since the last-created revision is the latest one in a linear graph. For example, Merge(3,4,5) returns 5. As said above, this module doesn't use the libraries we've created.

However, websites which use the Multiversion module will not use this approach, as their revision graph is quite different, as described above.

Now we are writing code in the Multiversion module which will use these libraries. Once we've integrated these APIs with the module, we'll be able to find the LCA in complex graphs and resolve merge conflicts using Multiversion. Of course, these features will only be available to sites using the Multiversion module. At the moment, we have successfully integrated the LCA library with the Multiversion module.

Work we did last week: We finished our work on ComplexLcaResolver.php, which uses the relaxedws/lca library and returns the lowest common ancestor from a directed acyclic graph. After we got that merged, we were all set to develop ComplexMergeResolver.php, which uses relaxedws/merge to merge 2 arrays into one after comparing them with a third array. The whole merging process follows the 3-way merge algorithm. You can read more about it on the Dr. Dobb's website. You can also read this blog if you want a clear picture of how we are using it in Drupal 8 and Multiversion.

Work we did this week: We started our work on ComplexMergeResolver and have successfully implemented the functionality. We convert entity objects into arrays using the Serialization API and then use the relaxedws/merge library to detect any merge conflicts, merging the arrays if there are none. The functionality is implemented and now we are writing tests to ensure it works correctly. I’m really happy that we’re on the verge of completing our last development task. All the code we've written so far can be found here and the tests we've written are available here.

Aim for the next week: AAhhhhhhh!!!! Unfortunately, this is the last week. The only aim for next week is to plan how I can keep contributing to what I have created so far and to other parts of this module.

I’d really like to thank Matthew Lechleider. He talks to us once every week but has taught us one of the most important things, i.e. to think thoroughly and define our tasks in very simple language. As developers, we know what we have developed, but most of the time we’re not really able to make non-devs understand what our code is doing and what the actual project is. From the very start, when we began writing proposals, to the very end, he used to tell me to “explain it in detail”. All of a sudden, it all makes sense now. Thanks a lot, Drupal community. Thanks a lot. You've given me an experience I'll never forget.

Aug 10 2016

Project Details: The task is to resolve the conflicts which may occur when two users modify the same piece of content/entity on their respective ends and then try to push it. So far, we have a way to detect conflicts but no way to resolve them.

Solution we're working on: When a node is updated, a revision is created. Multiple revisions form a graph with the actual entity as the root node. The simplest way to detect a merge conflict is to compare entities with their parent entity and figure out whether both users have modified the same content. If both of them have modified the same content in an entity, there is a merge conflict; otherwise we can simply merge their updates, which results in a new entity. To compare these entities, we also need a base entity against which the two updated revisions can be compared. For this purpose, we find the lowest common ancestor (LCA) in the graph. After finding the LCA, we compare each of the two revisions with this common parent revision, one by one. This process is known as the 3-way merge algorithm. According to this algorithm, we compare two entities/revisions with a third entity/revision to detect the modifications. While comparing, we just make sure that the same content is not modified in both new revisions. If it is, there is a merge conflict. To resolve this merge conflict, a 3-pane window is shown to the user so they can decide which code to keep; the other code (from the other developer) is discarded. You can find more about the project and the approach in the 3rd week blog post at my blog.

Progress so far: We have created two PHP libraries: one to find the LCA in a directed acyclic graph and another to perform the recursive 3-way merge algorithm. We have also created a module named "Conflict" which finds the LCA and performs the merge on Drupal entities. This module doesn't use the libraries we've created. Now we are writing code in the Multiversion module which will use these libraries. Once we've integrated these APIs with the module, we'll be able to find the LCA in complex graphs and resolve merge conflicts using Multiversion. Of course, these features will only be available to sites using the Multiversion module. At the moment, we have successfully integrated the LCA library with the Multiversion module. The next task is to integrate the merge library with the Multiversion module as well.

Work we did last week: We implemented the basic functionality for finding the lowest common ancestor in a graph of revisions in the Multiversion module. The first task was to implement a method which could return a graph of revision IDs from the tree containing all the information about an entity and its revisions. We implemented `getGraph()` in the Multiversion module for this purpose. It takes a UUID as input and returns a graph containing nodes keyed by revision ID. Once we had the graph of revision IDs, we were able to use the relaxedws/lca library to find the nearest common parent of two revisions in it. We also implemented basic tests for the ComplexLcaResolver class to ensure that it returns the correct LCA.
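The tree-to-graph step can be pictured with a short Python sketch. This is not the actual `getGraph()` method: the nested `{"rev": ..., "children": [...]}` tree shape, the function name `tree_to_graph` and the shortened revision IDs are all assumptions made for illustration, since the real tree format is not shown here.

```python
def tree_to_graph(tree, parent=None, graph=None):
    """Flatten a nested revision tree into {revision_id: [parent_ids]}."""
    if graph is None:
        graph = {}
    graph.setdefault(tree["rev"], [])
    if parent is not None:
        graph[tree["rev"]].append(parent)
    for child in tree.get("children", []):
        tree_to_graph(child, tree["rev"], graph)
    return graph


# Hypothetical revision tree with one branch point at "2-b".
tree = {
    "rev": "1-a",
    "children": [
        {"rev": "2-b", "children": [{"rev": "3-c"}, {"rev": "3-d"}]},
    ],
}

graph = tree_to_graph(tree)
print(graph)
```

Keying the graph by revision ID makes it straightforward to look up a revision's parents when searching for a common ancestor.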

Work we did this week: Some tests were failing last week because multiple nodes had the same revision IDs in the graph returned by the code we wrote in ComplexLcaResolver.php. This week, we tested our code in ComplexLcaResolver.php against multiple graphs and made sure that all the tests were passing. Because of some issues with my debugger, I was not able to debug the program, and although everything seemed to work fine, the tests kept failing. I spent almost 4 days trying to solve the issue but in the end I was unable to solve it on my own. A big thanks to my super awesome mentors Dick Olsson, Andrei Jechiu and Tim Millwood: they pointed me in the right direction and the issue was resolved within the next hour. However, the issue this time was not caused by a silly mistake in my code. The most important thing I learnt this week was the importance of a working debugger and how much easier it can make the life of a software developer. The other thing I learnt is to follow the standards, even in documentation. Although it is said that good code should explain itself, where it doesn't we must add proper documentation. Documentation standards for Drupal modules are different from the ones we used in the PHP libraries. I got to know about this after we were done with the tests issue.

Aim for the next week: The next step is to implement code which uses the relaxedws/merge library to detect merge conflicts or to merge updates from multiple revisions into one. We've started work on it already. We have to convert the entity objects into arrays first, and then the merge library loops over them to detect merge conflicts. Once we're sure that there are no merge conflicts, we merge the entities and convert them back to objects. We will be using the Serialization API for this. We'll also implement tests for this class this week to make sure it's working fine.
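As a rough illustration of what "looping over the arrays" means, here is a minimal Python sketch of a recursive 3-way merge over nested dictionaries. The dictionaries stand in for the arrays produced by normalizing entities. The names `merge3` and `ConflictError` are assumptions made for this example and are not the API of the relaxedws/merge library.

```python
class ConflictError(Exception):
    """Raised when both revisions changed the same value in different ways."""


_MISSING = object()  # marks a key that is absent from one of the arrays


def merge3(base, local, remote):
    """Recursively merge local and remote, using base as the common ancestor."""
    if local == remote:
        return local   # both sides agree (or neither changed anything)
    if local == base:
        return remote  # only remote changed
    if remote == base:
        return local   # only local changed
    if all(isinstance(value, dict) for value in (base, local, remote)):
        merged = {}
        for key in sorted(set(base) | set(local) | set(remote)):
            value = merge3(
                base.get(key, _MISSING),
                local.get(key, _MISSING),
                remote.get(key, _MISSING),
            )
            if value is not _MISSING:  # a key removed on one side stays removed
                merged[key] = value
        return merged
    raise ConflictError("both revisions modified the same value")


# Hypothetical normalized entities.
base = {"title": "A", "body": {"value": "x", "format": "html"}}
local = {"title": "B", "body": {"value": "x", "format": "html"}}
remote = {"title": "A", "body": {"value": "y", "format": "html"}}

merged = merge3(base, local, remote)
print(merged)
```

Here the title change from local and the body change from remote both end up in the merged result. If remote had also changed the title to something else, `merge3` would raise `ConflictError` instead of guessing.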

Aug 05 2016

Project Details: With the introduction of Multiversion module in Drupal 8, we have a very powerful content revision API that can handle branching and conflict DETECTION. But there is not yet a way to SOLVE revision conflicts.

Proposed Solution: We store a graph of all the updated revisions of an entity and find a base revision in this graph with which two entities can be compared. This process is based on the 3-way merge algorithm. According to this algorithm, we compare two revisions of an entity with another revision, or the actual node, to detect what has been modified in those revisions. If the new revisions have modified content on different lines, the updated content is merged into the new revision along with the unchanged data from previous revisions. This is why we compare two entities with a base entity: to find out which part has been modified in the new revisions. If both new revisions have modified the same data, a merge conflict occurs and the user is allowed to solve the conflict in a 3-pane window. You can find more about the project and the approach in the 3rd week blog post at my blog.

By last week, we had already implemented the `getGraph()` function in the Multiversion module. The tree of revisions is passed to the function and a graph is returned containing vertices keyed by revision ID. After multiple tests, we made sure that the getGraph() method was returning the correct output. The next task was to implement finding the Lowest Common Ancestor of 2 nodes in the graph returned by getGraph().

We’ve created a class ‘ComplexLcaResolver’ in the Multiversion module which uses relaxedws/lca to find the LCA. This feature of finding the LCA from a tree of revisions will be available only on sites which use the Multiversion module. We have implemented the functionality and are now in the test phase. For now, there are bugs in this implementation and we are working on them. My mentors Andrei Jechiu, Dick Olsson and Tim Millwood have been a real support and source of motivation for me so far. Not only do they assist me with implementation ideas, they also teach me a lot when I am making silly mistakes or don’t know about some feature of Drupal. All the code we’ve written can be found here.

Now I’ve been working with automated testing for quite some time and the most important thing that I’ve learnt is that no matter how confident you are about your code, it will fail under certain circumstances. This matters most when your code is going to be used by many people out there. So, it’s better to test your program thoroughly before pushing it. It’s time consuming and sometimes you have no idea where and why the tests are failing. This is one of the best things GSoC and my mentors have taught me. Running one test at my end takes nearly 45-60 minutes, which makes the development process a lot slower. We cannot work on multiple parts of the project in parallel, as we are in the last stage and every part uses the results of the previous one. For example, the merge library uses results from the LCA library to perform the 3-way merge. Once you are done with testing, you can be fairly sure that very few bugs are left to be discovered when the code goes live.

This week, we’ll be done with the ComplexLcaResolver implementation after a few bug fixes and more tests. After that, we shall start the ComplexMergeResolver implementation, which will use the relaxedws/merge library we created in the first phase of GSoC16. This class will be used to detect merge conflicts and to merge the updated entities. It will also use the Serialization module to normalize the given revisions into arrays; we can then loop over those arrays to find any merge conflicts, or to find where an entity has been updated, and merge those changes into the main branch.

Also, I’d like to announce that we’ve launched relaxedws/lca library on packagist.org and it is ready to be implemented into any project. It can be found here: https://packagist.org/packages/relaxedws/lca

Jul 27 2016

It's been 3 months now and we're finally approaching the last steps of our project - "Solving Merge conflict in Drupal 8" with Multiversion module. So far, we have launched two open source libraries and implemented a simple LCA resolver and a simple merge resolver in the Conflict module, to find the lowest common ancestor and merge the updates respectively. The development phase is now almost done and the integration part starts, where we have to integrate the libraries we created in the first phase of Google Summer of Code 2016. The libraries we created are relaxedws/lca and relaxedws/merge.

Let's take a look at the project's target and the solution we are implementing:

Project Details: With the introduction of Multiversion module in Drupal 8, we have a very powerful content revision API that can handle branching and conflict DETECTION. But there is not yet a way to SOLVE revision conflicts.

Proposed Solution: We store a graph of all the updated revisions of an entity and find a base revision in this graph with which two entities can be compared. This process is based on the 3-way merge algorithm. According to this algorithm, we compare two revisions of an entity with another revision, or the actual node, to detect what has been modified in those revisions. If the new revisions have modified content on different lines, the updated content is merged into the new revision along with the unchanged data from previous revisions. This is why we compare two entities with a base entity: to find out which part has been modified in the new revisions. If both new revisions have modified the same data, a merge conflict occurs and the user is allowed to solve the conflict in a 3-pane window. You can find more about the project and the approach in the 3rd week blog post at my blog.

Work we did last week: The Multiversion module stores the revision status and other data in the form of trees. To detect and solve conflicts, we need to convert this data into graphs. Last week, we worked on writing code which creates a graph from a given tree.

We wrote a recursive function to store all revision IDs from the tree in an array. Then we created nodes from those revision IDs. Once the recursive function extracted the revision IDs correctly, we wrote a function which creates edges between these nodes in exactly the same manner in which they were connected in the tree. The code is pushed here.
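The two steps above can be sketched in Python as follows. This is only an illustration: the tree layout used here (nested lists of items with `rev` and `children` keys) is an assumption made for the example and is not the actual structure returned by the Multiversion module, and the function names are made up.

```python
def collect_revision_ids(tree):
    """Recursively gather every revision ID found in the nested tree."""
    ids = []
    for node in tree:
        ids.append(node["rev"])
        ids.extend(collect_revision_ids(node["children"]))
    return ids


def collect_edges(tree, parent=None):
    """Recursively gather (parent, child) pairs mirroring the links in the tree."""
    edges = []
    for node in tree:
        if parent is not None:
            edges.append((parent, node["rev"]))
        edges.extend(collect_edges(node["children"], node["rev"]))
    return edges


def tree_to_graph(tree):
    """Create one node per revision ID, then connect the nodes with edges."""
    graph = {rev: [] for rev in collect_revision_ids(tree)}
    for parent, child in collect_edges(tree):
        graph[parent].append(child)
    return graph


# Hypothetical revision tree: revision 1 was edited twice, creating two branches.
tree = [
    {"rev": "1-aaa", "children": [
        {"rev": "2-bbb", "children": [
            {"rev": "3-ccc", "children": []},
        ]},
        {"rev": "2-ddd", "children": []},
    ]},
]

graph = tree_to_graph(tree)
print(graph)
```

The result is an adjacency map keyed by revision ID, which is the kind of shape an LCA search can walk directly.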

Work we did this week: We tested the code we developed last week for the Conflict module, which converts the tree returned by the getTree method of the Multiversion module into a graph. After we made sure it returns the expected output, we merged the code into the Multiversion module as a new method named getGraph. We have also written tests for it in the Multiversion module to ensure it functions correctly there. We also set up Travis CI integration for the Conflict module, and we’re making sure that our code works on versions of Drupal other than 8.2. All the code is pushed here.

Target for the next week: We will start by creating a complex LCA resolver which uses the relaxedws/lca library for finding the LCA. This resolver will be available only to sites which use the Multiversion module, so we will implement it in the Multiversion module itself. We’ll also be writing tests for the complex LCA resolver this week.

Jul 25 2016

Project Details: With the introduction of Multiversion module in Drupal 8, we have a very powerful content revision API that can handle branching and conflict DETECTION. But there is not yet a way to SOLVE revision conflicts.

Proposed Solution: We store a graph of all the updated revisions of an entity and find a base revision in this graph with which two entities can be compared. This process is based on the 3-way merge algorithm. According to this algorithm, we compare two revisions of an entity with another revision, or the actual node, to detect what has been modified in those revisions. If the new revisions have modified content on different lines, the updated content is merged into the new revision along with the unchanged data from previous revisions. This is why we compare two entities with a base entity: to find out which part has been modified in the new revisions. If both new revisions have modified the same data, a merge conflict occurs and the user is allowed to solve the conflict in a 3-pane window. You can find more about the project and the approach in the 3rd week blog post at my blog.

Work we did last week: We are done with the simple merge resolver as well now. The task was to resolve merge conflicts in linear entities, which Drupal implements by default. The approach we took was to return the most recently created revision of the entity, since it is the latest one. We have also written tests for it. The code for the library can be found on the GitHub repository.

Work we did this week: The Multiversion module stores the revision status and other data in the form of trees. To detect and solve conflicts, we need to convert this data into graphs. This week, we have been working on writing code which creates a graph from a given array.

We wrote a recursive function which stores all revision IDs from the array in a new array, and then we created nodes from those revision IDs. Next we are working on a function which creates edges between these nodes. The code is pushed here in the getGraphTest.php file.

We are done with the code and testing it to make sure it’s returning the expected output. Once we are sure it’s returning the correct data, we can implement the code into the multiversion module.

Task for the next week: We will complete the tests, get the code implemented in the Multiversion module to return a graph from the tree, and also try to complete as much of the complex LCA resolver code as possible. We need to complete the graph-creating code first as it will be used in all the next steps, i.e., to find the LCA using the relaxedws/lca library and to resolve merge conflicts using the relaxedws/merge library.

Jul 19 2016

Project Details: With the introduction of Multiversion module in Drupal 8, we have a very powerful content revision API that can handle branching and conflict DETECTION. But there is not yet a way to SOLVE revision conflicts.

Proposed Solution: We store a graph of all the updated revisions of an entity and find a base revision in this graph with which two entities can be compared. This process is based on the 3-way merge algorithm. According to this algorithm, we compare two revisions of an entity with another revision, or the actual node, to detect what has been modified in those revisions. If the new revisions have modified content on different lines, the updated content is merged into the new revision along with the unchanged data from previous revisions. This is why we compare two entities with a base entity: to find out which part has been modified in the new revisions. If both new revisions have modified the same data, a merge conflict occurs and the user is allowed to solve the conflict in a 3-pane window. You can find more about the project and the approach in the 3rd week blog post at my blog.

Work we did last week: We finished writing code for the simple LCA resolver along with tests, and after that we started work on a simple merge resolver for linear entities in Drupal. We started by writing basic services and creating containers, and I spent a lot more time reading about the Symfony framework and Drupal 8 module development.

Work we did this week: We are done with the simple merge resolver as well now. The task was to resolve merge conflicts in linear entities, which Drupal implements by default. The approach we took was to return the most recently created revision of the entity, since it is the latest one. We have also written tests for it. The code for the library can be found on the GitHub repository.

Next week’s task: After we’re done with the extensive testing of this simple merge resolver, we will start our major task with the complicated lca resolver where we’d integrate the library we created in coding phase 1 to find LCA with Drupal8 Conflict module. Our approach will be the same - write tests first and then code.

My mentors Dick Olsson, Andrei Jechiu and Tim Millwood have been really supportive. Not only have they taught me how to code in a better way, they have also helped me understand why it is important to test software extensively. It’s only because of them that I have come this far. All the credit goes to them. I know my project is very interesting, but they have made the whole journey interesting.

Jul 07 2016

Recap: We need to detect and resolve merge conflicts in Drupal 8 which might arise when two users push the same entity, modified at their respective ends, to the server without pulling the latest changes. We are implementing a “conflict” module which detects conflicts and otherwise merges the changes. For a complete description of what the project is about, please refer to the 3rd week’s blog post, where I have given a complete walkthrough of the project. Before the mid term we finished two libraries: one to find the Lowest Common Ancestor in a directed acyclic graph, and the other to use that LCA as a base for a recursive 3-way merge that compares the local, remote and base entities. Now we’re creating a module which will use those libraries.

For the progress till now, please refer to the article from last week. All the work till last week has been written in that article.

Target for this week was to create a simple Lowest Common Ancestor resolver in Drupal 8 as a service in the “Conflict” module.

Working: Entities in Drupal store revisions in a linear graph, for example if a node “A” has been edited 4 times, a linear graph would be formed looking something like:

A → B → C → D → E

This simple LCA resolver would find the parent of last edited revisions. In this case, it’d find and return “C” for revisions “D” and “E” as they both (D & E) have been modified from “C” and hence, “C” is their common parent. 

Although it looks like a pretty simple feature, it was ironically very tough for me to implement. Not the actual implementation but the testing part. After creating two libraries, I was starting to think that the “Testing” part is not so difficult after all, but it proved me wrong *pun*. I made some silly mistakes and tried to solve them on my own because I didn’t want to bug my mentors with such small mistakes. Contrary to what I was thinking, I had to ask my mentors after spending nearly 2 days on a single error. My mentor Andrei Jechiu took hardly 2 minutes to point me in the right direction and I was back on track.

One thing I noticed while creating the module is that you don’t really just create the module, you learn a lot more than what you need and this is what I believe is the best part of working with Drupal and Google summer of code. The graph of learning increases exponentially.

So after solving the error, when I thought the rest shouldn’t take much time, I was right back where I started, with another error which wouldn’t let me through. This time I didn’t make the same mistake of staying stuck on it: after trying for about half a day, I asked my mentor Tim Millwood. Just like Andrei, not only did he point out where I was making the mistake (Services.yml was not properly defined), he also taught me how the service container actually works, in a much simpler way than many of the articles I had read.

All the code we created can be found in the drupal-conflict repository on GitHub. We are still working on the tests, and once we are sure that the code passes them, we shall move to the next step of creating a simple merge resolver which works for linear entities and merges the updates in them. I was expecting to finish this task this week, but it took much longer than I expected. I will take it, though, as long as I’m learning new things.

Other than this part of the project, we had an awesome open session this week to encourage students to start contributing to open source. Students from various organizations like Fossasia, Drupal, Mifos Initiative and KD were there. Not only did we meet really talented people, we also talked about their projects and their approach.

Jun 29 2016

After finishing our work on the PHP libraries to find the Lowest Common Ancestor in a graph and then to merge the changes from updated entities, we finally moved to the next step of the project, i.e., making these libraries work with Drupal.

For those who are not familiar with the project, I would strongly recommend going through my 3rd week blog post of GSoC16, where I have given a complete and detailed description of the project.

Coming back to where we started, to make these libraries work with Drupal, we have started work on creating a D8 module "Conflict" which would take care of merging and conflicts.

I spent this week studying dependency injection, developing pluggable services and other Drupal development topics rather than starting with the implementation right away. The goal is to keep the module pluggable and not to depend on the Multiversion or Replication modules. The initial stab at the Conflict module for Drupal 8 will only do a few simple things:

  1. Implement two different ancestor resolvers: 

Since the LCA resolution is pluggable, we shall provide two different implementations for it:

  • A simple resolver which finds the LCA for revisions that use the default linear revision system provided by core. This resolver will work with any Drupal 8 site.
  • A more complicated resolver that uses the relaxedws/lca library and RevisionTreeIndex from the Multiversion module. This step depends on #2748403: Add getGraph() to RevisionTreeIndex. This resolver will only work with sites that use the Multiversion module.
  2. Implement two conflict resolvers: 
  • A simple resolver that just returns the revision with the largest revision ID. This resolver will work with any Drupal 8 site.
  • A more complex resolver that uses the Serialization module to normalize the given revisions into arrays and then uses the relaxedws/merge library to perform the merge. This resolver will only work with sites that use the Multiversion module.

We set up an issue to track the progress of this project, and the code will be pushed to the drupal-conflict public repository on GitHub. It has the D7 code already, and we are now creating the code for D8 on the "dev-8.x" branch.

Note that the main focus is to make the services pluggable so that users who don't use the Multiversion module can also use this module. Also, developers who want to use some other approach to find the LCA or solve conflicts can easily add their own services if we make the module's services pluggable.
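To show the idea of pluggable resolvers outside of Drupal, here is a small Python sketch. In Drupal 8 this would be done with tagged services and a service collector; the class names, the `applies`/`resolve` methods and the priority handling below are assumptions made for the example, not the Conflict module's actual interfaces.

```python
class LargestRevisionIdResolver:
    """Simple strategy: keep whichever revision has the largest revision ID."""

    def applies(self, revisions):
        return True  # works for any site, so it is a good fallback

    def resolve(self, revisions):
        return max(revisions, key=lambda revision: revision["revision_id"])


class ConflictResolverManager:
    """Keeps resolvers sorted by priority and uses the first one that applies."""

    def __init__(self):
        self._resolvers = []

    def add_resolver(self, resolver, priority=0):
        self._resolvers.append((priority, resolver))
        self._resolvers.sort(key=lambda pair: pair[0], reverse=True)

    def resolve(self, revisions):
        for _, resolver in self._resolvers:
            if resolver.applies(revisions):
                return resolver.resolve(revisions)
        raise LookupError("no resolver applies to these revisions")


class PublishedFirstResolver:
    """A made-up custom strategy a site developer could plug in."""

    def applies(self, revisions):
        return any(revision.get("published") for revision in revisions)

    def resolve(self, revisions):
        return next(revision for revision in revisions if revision.get("published"))


manager = ConflictResolverManager()
manager.add_resolver(LargestRevisionIdResolver())
manager.add_resolver(PublishedFirstResolver(), priority=10)

revisions = [
    {"revision_id": 3, "published": True},
    {"revision_id": 7, "published": False},
]
print(manager.resolve(revisions)["revision_id"])
```

The point of the design is that the manager never needs to know which strategies exist. A site without Multiversion gets the simple fallback, and anyone can register a different strategy with a higher priority.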

To understand the use of services, I went through Services and dependency injection in Drupal 8 and the article about Service Tags over drupal.org. After going through these articles, I could easily see the clear picture of the approach my mentors told me about. 

We will start by implementing a method to find the Lowest Common Ancestor which works with linear entities and does not require the Multiversion module. It will work with some predefined approach, like picking the last entity updated. We still have to decide which approach is best, as there are many to choose from. Once we decide, we will start with the implementation. The best way is to write the unit test first and build the service until it passes the test. This is what we have decided this week. By next week, we will have started the implementation and will probably have finished the LCA implementation in the module.

Jun 22 2016

Google Summer of Code is halfway through and it’s now time for mid term evaluations. What that means is, we will be judged on the basis of our efforts and the targets achieved, and our mentors will decide whether we pass or fail. If someone fails the mid term evaluation, they are removed from the program immediately.

Let us sum up all work we’ve done in the GSoC-coding phase 1.

Project: Solving content conflicts with merge algorithms in Drupal 8.

Proposed Solution: Merge the updated content into a new array, or throw an Exception if there is a conflict. The real challenge was to detect the merge conflicts and to merge otherwise. We decided to use the 3-way merge algorithm to compare the updated nodes with a parent node. Out of all parent nodes, the one closest to them (the LCA) was our best option to compare them with, as it has all the most recent updates. We needed a library to find the closest common parent, or Lowest Common Ancestor, which is then compared with the updated nodes to find out whether there is a conflict.

We needed one more library to merge the updated content if there is no conflict.

Deliverables before mid term evaluations:

  1. Library to find LCA from a Directed acyclic graph.

  2. Library to perform a recursive 3-way merge algorithm.

Week 1: Library to find Lowest common ancestor(LCA) in a Directed acyclic graph.

LCA: The nearest common parent to 2 nodes. For ex:

        1
      /   \
     2     3
    / \   /
   4   5 6

LCA of (2,3) is 1 and LCA of (4,5) is 2 in the graph above. LCA(4,5) = 2 because it is a common parent of both nodes and is closer to them than any other common parent.

Similarly LCA(5,6) = 1

Approach: The approach we used to find the LCA is a breadth-first traversal in the opposite direction. We traversed from node1 to the root, storing all the elements encountered in an array, and then traversed from node2 to the root doing the same thing. Thanks to the clue/graph and graphp libraries, we could create the graphs and traverse them using BFS in the reverse direction. The first element in the intersection of these arrays is the LCA.
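A minimal Python sketch of this reverse BFS, using the example graph from above, could look like this. It uses a plain dictionary mapping each node to its parents instead of the clue/graph classes, and the function names are made up for the example. In this sketch a node counts as its own ancestor.

```python
from collections import deque


def ancestors_bfs(parents, start):
    """Walk from start towards the root(s), breadth first; nearest ancestors first."""
    order = [start]
    seen = {start}
    queue = deque([start])
    while queue:
        node = queue.popleft()
        for parent in parents.get(node, []):
            if parent not in seen:
                seen.add(parent)
                order.append(parent)
                queue.append(parent)
    return order


def lowest_common_ancestor(parents, node1, node2):
    """Return the first node on node1's walk that is also reachable from node2."""
    reachable_from_node2 = set(ancestors_bfs(parents, node2))
    for candidate in ancestors_bfs(parents, node1):
        if candidate in reachable_from_node2:
            return candidate
    return None


# The example graph: 1 is the root, 2 and 3 are its children, and so on.
parents = {2: [1], 3: [1], 4: [2], 5: [2], 6: [3]}

print(lowest_common_ancestor(parents, 4, 5))  # 2
print(lowest_common_ancestor(parents, 5, 6))  # 1
print(lowest_common_ancestor(parents, 2, 3))  # 1
```

Because each node maps to a list of parents, the same walk also works when a revision has more than one parent, although in such graphs there can be more than one valid answer and this sketch simply returns the first one it meets.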

Week 2: Writing tests to assert the correct functionality.

Once we had the working prototype ready to find the LCA, we needed tests to make sure that the code works well in all the cases. We created multiple graphs based on their complexity such as Simple graph, a bit complex graph (Few Multiple parents), very complex graph (Many multiple parents). The visual representation of the graphs is available in the source code.

Then we implemented these graphs in our code and ran multiple tests on them to make sure that they were returning the correct LCA. It was a very crucial part of our project and we had to make sure that no wrong LCA were returned as it’d have caused data loss and probably would have resulted in unnecessary merge conflicts. We wrote tests for:

  • Single Parent in simple graph.

  • Multiple Parents in simple graph.

  • Single Parent in Complex graph.

  • Multiple Parents in Complex graph.

Week 3: Writing Library to perform recursive 3-Way Merge algorithm

After the completion of the LCA library, we were all set to start with the library to perform a recursive 3-way merge algorithm.

3-way merge algorithm: It turns a manual conflict into an automatic resolution. Part of the magic here relies on the VCS locating the original version of the entity (Base entity). This original version is better known as the "Lowest Common Ancestor (LCA)". The VCS then passes the common ancestor and the two contributors to the three-way merge tool that will use all three to calculate the result. The tool compares the local and remote entities with the LCA (base entity) and updates the base entity according to the modified entity. For a clear understanding, please read the article I wrote for 3rd week of GSoC.

Approach: It was a very complicated task and we had to make sure it worked for all edge cases, including exceptions (yes, there were many). To implement this library, I first had to get a very clear understanding of how the algorithm works.

There were multiple cases I had to take care of while implementing the library. Some of them are:

  • Lines are added in Remote.

  • Lines are added in Local.

  • Lines are added in both Remote and Local.

  • Lines are removed in Remote.

  • Lines are removed in Local.

  • Lines are removed in both Remote and Local.

  • Lines are modified in Remote.

  • Lines are modified in Local.

  • Lines are modified in both Remote and Local.

  • Pre-existing Lines are modified and new lines are added in Remote.

  • Pre-existing Lines are modified and new lines are added in Local.

  • Pre-existing Lines are modified and new lines are added in both Remote and Local.

  • Pre-existing Lines are modified and some lines are removed from Remote.

  • Pre-existing Lines are modified and some lines are removed from Local.

  • Pre-existing Lines are modified and lines are removed from either Remote or Local.

  • Lines are added and removed at the same time.

The basic approach used for most of the cases is to count the number of lines and then we used:

  1. A for loop which runs over as many lines as the shortest value has. For example: if the ancestor array has a value at some key with 3 lines, the remote array has a value with 4 lines and the local array has a value with 2 lines, then the first loop runs for 2 lines, as the minimum number of lines is 2. This loop compares all 3 arrays and stores the changes under the same key=>value in a new array, which is returned after a successful merge.

  2. A second for loop which runs over the remaining lines of the array with the second-largest number of lines. In our case, the ancestor array has 3 lines, 2 of which were compared in the first loop, so the 3rd line is compared in this second loop, which therefore runs for 1 line only. It compares the 3rd line of the ancestor with the 3rd line of the remote array to make sure they are the same. If they are, only the local revision changed the 3rd line (by removing it), so the line is removed from the final revision as well; otherwise a conflict exception is thrown.

  3. A third for loop which just adds new lines to the new revision. In our case, line number 4 exists only in remote and hence is added to the new array.

In most of the cases, only 1 or 2 of the loops run. All 3 loops run when lines have been both added and removed across remote, local and ancestor.
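The three loops can be sketched in Python as below. This is a simplified, position-by-position illustration and not the library's actual code: it assumes each value has already been split into a list of lines, and `merge_lines`, `pick` and `ConflictError` are names invented for the example.

```python
class ConflictError(Exception):
    """Raised when local and remote disagree about the same line."""


_MISSING = object()  # stands for "this array has no line at this position"


def line_at(lines, index):
    return lines[index] if index < len(lines) else _MISSING


def pick(base, local, remote):
    """Standard 3-way rule applied to a single line."""
    if local == remote:
        return local
    if local == base:
        return remote
    if remote == base:
        return local
    raise ConflictError("line changed differently in local and remote")


def merge_lines(base, local, remote):
    shortest, middle, longest = sorted([len(base), len(local), len(remote)])
    merged = []

    def merge_range(start, stop):
        for i in range(start, stop):
            line = pick(line_at(base, i), line_at(local, i), line_at(remote, i))
            if line is not _MISSING:  # a removed line is simply not copied over
                merged.append(line)

    merge_range(0, shortest)        # loop 1: lines present in all three arrays
    merge_range(shortest, middle)   # loop 2: lines present in two of the arrays
    merge_range(middle, longest)    # loop 3: lines present only in the longest one
    return merged


# Same shape as the example above: ancestor 3 lines, remote 4, local 2.
ancestor = ["a", "b", "c"]
remote = ["a", "B", "c", "d"]
local = ["a", "b"]

print(merge_lines(ancestor, local, remote))  # ['a', 'B', 'd']
```

In this run the first loop takes remote's change to line 2, the second loop drops line 3 because local removed it and remote left it untouched, and the third loop adds remote's new line 4. If remote had also edited line 3, the second loop would raise `ConflictError`.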

Week 4: Writing Tests, Custom Exception and Updated documentation.

At the end of the 3rd week, I thought I was done with almost everything until I started testing. With every new test case, I discovered a new edge case where the code was not working as expected and this is the part where I actually learnt about the real importance of testing.

The challenge was to ensure the code's readability. With every edge case, the length and complexity of the code kept increasing. After +1,179 // −32 LOC, we were finally done with as many test cases as possible and with the code and its proper documentation.

We covered the following cases in our tests:

  • Simple Merge in same number of lines in all 3 arrays.

  • Merge with same number of lines in 2 arrays (Addition or removal in 3rd).

  • Merge with different number of lines in all 3 arrays (Addition or removal in all 3 arrays).

  • Catching Merge Conflicts.

  • Key removal from an array (Remote or Local).

Other than writing tests this week, we created our own exception class (ConflictException) which extends the core Exception class. We also updated the documentation, made the code more readable and updated the Readme file with the features and examples.

All the code I have written has been merged into the core libraries from my Github repositories. The code is ready for use, modification and feedback. Please feel free to create a pull request with any issues, suggestions or bug patches.

A huge thanks to my mentors Dick Olsson, Tim Millwood and Andrei Jechiu for their guidance. They are always available to answer to my silly queries in an appropriate manner. They keep telling me about all those little mistakes I repeat and make sure that I don't repeat them multiple times. I must say they are equally inclined towards me learning more as well as the completion of this project. Never thought this summer would be this much fun. Learning curve is now going up exponentially.
