The popularization of the Web as a platform for publishing content is a major resource for the dissemination and creation of new ideas, projects, applications, virtual shops and other ways to have an online presence. At a time when many believe that content is king and that the transmission medium is the message conveyed, much of what is produced in different types of web sites like blogs, institutional sites, online stores, applications, portals and others is supported by a single web technology pillar: the platforms for generating and publishing content.
Driven by the free software movement, the main platforms for content generation and publication, also known as CMS (Content Management System), are gaining more followers either through the catalog of services provided by hosting companies or by users who decide build their your own web hosting infrastructure. Among the main features that have popularized the adoption of CMS we can cite: the free and open licensing model, the ease of use for both users browsing the website and for those who create content, the management and dissemination content and the possibility of almost infinite customization by inserting components and templates freely available and produced by the community.
Currently there are several solutions for content management freely available for those wishing to publish web content. To simplify the nomenclature, these tools will be referenced only as CMS throughout this article. The most popular CMS are: WordPress, Joomla, Drupal and Magento. These take the form of software frameworks ready for use after the customization and configuration of certain features. Its main elements are the web page design templates ready for use, the administrative panel, access security, what you see is what you get editor for content, configuration files, images, and also a data model, whose analysis is the main objective of this article.
Knowing the data model of a CMS is important in various situations. For instance: when you want to do some customization, improve performance, perform maintenance on the database, migrate the platform to ensure that resources are being well used and other tasks that support the web site built with CMS. Also, knowing the technical details related to the CMS data model used can facilitate the execution of tasks on a daily basis and also increase the level of experience and expertise of the professionals involved.
This third part will concentrate in the details of the Drupal data model. Although this CMS is more complex than the previous, WordPress and Joomla, the Drupal data model hides some details that could be improved as we are going to see.
The Drupal data model
Just as WordPress and Joomla, Drupal is also an open source CMS written in PHP whose main focus is the rapid creation of blogs. However, it has some unique features that make it a qualified candidate to applications more robust than just a blog. Among the features it highlights the fact that it has an additional layer to access the database. This additional layer allows the data model to be implemented in different DBMS and makes the task of accessing the stored data independent of a specific SQL dialect. According to the official documentation of Drupal, the supported databases are MySQL, MariaDB, SQLite and PostgreSQL.
During the installation process it is possible to choose a version with fewer features or the full version. In the version with fewer features 45 tables are created in and in the full has 75 tables. The Drupal installer also creates the foreign key constraints and primary key relationships implemented directly in the database. For reasons of space and visualization, Figure 1 presents the data model of Drupal 7 with only the name of tables without their columns and relationships.
Figure 1. The data model of Drupal 7.
One of the interesting points of Drupal, and any data model with a lot of tables, is the separation of groups of related tables into subsystems. For example, in Drupal there is a cache control mechanism implemented directly in the CMS. This cache control subsystem is represented in the database tables named cache, cache_form, cache_menu, cache_page, etc. The same goes for the security subsystem, which includes the tables users, users_roles, role, role_permission and others.
From the standpoint of data typing, there are some observations that can be made: there is no column that uses the bit data type for storing boolean values. For example, the column has_children of the table menu_links clearly should store only two values, but it is implemented with the tinyint data type. In contrast, there is an excessive use of columns of the BLOB and LONGBLOB data types for storing binary data without any context, because these data types can store any binary information from one image to a video file, for instance. Moreover, it is emphasized that there is no column with the data types date time or date on all the tables of the model. Drupal internally control the date format with two additional tables: date_formats and date_format_type. This feature is linked to the independence of specific features of the database abstraction layer and also provides strong support for different languages that can be used with Drupal.
From the standpoint of modeling, there are some features that can confuse programmers not used to working with this model. A clear example of these features is the table called variable. This table has two columns: name, data type VARCHAR(128); and value, data type LONGBLOB. Obviously, this table can store any type of information and one does not need a lot of modeling experience to know that such entities are not intuitive and go against all the best practices of modeling, abstraction, reusability and analysis. Another interesting detail is presented in the table menu_links, shown in detail in Figure 2.
Figure 2. Table menu_links from the data model of Drupal 7.
The menu_links table has columns numbered p1 to p9 of the INT data type which occupies 10 bytes. From the standpoint of modeling does not make much sense to put columns with these names because they do not communicate what can be stored. There is also the question of numbers: What happen if you need to store more than nine values? Or less than nine values? In these situations it is recommended to create entities and relationships in order to be able to correctly and properly associate the values that are stored in columns numbered p1 to p9.
Finally, there is a lot of repetition of columns in some tables. The tables field_data_body, field_data_comment_body, field_data_field_image, field_data_field_tags, field_revision_body, field_revision_comment_body, field_revision_field_image, and field_revision_field_tags have seven repeated columns in each table, as Figure 3 shows. Without going into details of the need for these tables, there is room for some improving by reducing the amount of repeated columns in order to take up less disk space and optimize the data model.
Figure 3. Group of 8 tables with repetition of the 7 first columns.
In summary, the model of Drupal 7 is quite robust with respect to the amount of elements and subsystems, in particular the characteristics of cache management, links, blocking, content filtering, searching, and sorting by tags. However, this model still needs some changes in order to be more optimized without losing the ability to be ported to different DBMS such as MySQL and PostgreSQL.
The production and content management for the Internet is currently carried out by content management tools known as CMS. These tools are ready platforms that can be customized to provide a suitable layout and also to meet other needs not covered by CMS or its components.
Because CMS is a ready platform to be used, they need a database for storing data related to the content, layout and its internal configurations. Based on the currently used major CMS (WordPress, Joomla, Drupal and Magento) this article presents their data models and discusses some technical aspects that may be of interest to those who work using these platforms for content management.
The article details various aspects of models such as typing, naming, organization, relationship, use of constraints, modeling techniques, storage and availability of each of the data models. From the explanations and criticisms of the models provided by this article, readers can begin to familiarize yourself with the layout and the data storage mechanism used by each CMS because this knowledge is useful when there is a need to implement new features in CMS or if it is necessary to perform some administrative task in the database that contains the CMS' data model.
The objective of this third part of the article was to discuss the data model used by the Drupal CMS, a more robust and complete CMS than WordPress and Joomla.
In part three, we will see the Magento model.
To see the second part, access: http://mrbool.com/p/The-Joomla-data-model-Analysis-of-the-WordPress-Joomla-Drupal-and-Magento-data-models-Part-2/23184