In the last couple of years, I developed all of my codes in the way that all members and statics of every class were sorted by Eclipse. I recently joined a new project and it was surprising to me how much the other members of the team were against sorting the members. As always, I tried to understand their motivation and find the differences in our opinions.
Their main reason was the following:
I want to put the functions and variables that belong together next to each other, so I can see them together.
I saw their point. However, I did not agree with it and I could not convince them about my right. I told them:
If you used Git and code review before accepting pull requests, you would find it very annoying if non-fucntional code changes would divert your attention. By non-functional, I mean changing the order of functions or doing some code formatting.
That was not enough. How do you explain the benefit of code reviews and the drawbacks of non-functional changes in the code to someone, who has never used Git neither code reviews?
I still felt that there was something else, too. What did I miss in the argument? Why do I not care if the functions that are called from each other are not close to each other in the source code?
You can call me uppish, but in every development team in my life, I was much faster at code investigation than others. It had one reason. While the other programmers started to analyze the code as a book, I wanted to know only what I had to.
Imagine you are looking for the reason why a label has a specific value on the screen. While the others see the code snippet below:
I see only the following:
You see the difference? I care only about the code I need to know. After I find this code snippet, I will push CTRL+ALT+H to see where the current function is called from. I do not care if the caller function is before or behind this one. I will jump there and continue my investigation.
I look around myself, and see that most of the developers do not even know the most basic keyboard shortcuts of Eclipse:
- CTRL+SHIFT+T: Open a type
- F3: jump to definition
- CTRL+T: Type hierarchy in a pop-up. This is very useful during an investigation if we want to know where a method of an interface is implemented.
- CTRL+ALT+H: Call hierarchy. Where is this variable used or where is this method called from?
- CTRL+L: Jump to line. Very useful if we have a stacktrace and we want to start investigating the issue by line number.
I might have missed a couple of useful shortcut keys. What I wanted to express is that the ones who do code investigations without these, cannot call them experts. Their effectivity is way beyond the ones who know the tricks.
As soon as you start reading the code effectively, you will not care about the grouping of variables and functions at all. However, it will be very annoying if you see that others spend much energy on grouping, instead of concentrating on the business logic they want to implement. If you start thinking of grouping, you waste time and effectivity. You will never be as effective as others.
So why do I care about sorting the members at all? Why should I sort the members if it does not matter during code investigation? Because I want to make only those changes visible in my commits that have actual functional meaning. Without automated-sorting and very strict automated code-formatting in my IDE, I should spend a high amount of energy on reviewing my own commits before pushing them for code review.
We are happy to announce the first public release of Blobstore components. Blobstore allows the programmer to read and write binary large objects within atomic transactions.
Everit Blobstore has in-memory and JDBC-based implementations. While the in-memory implementation can be used to write unit tests, the other one supports several relational database types: Apache Derby, Hsqldb, MySQL, PostgreSQL, Microsoft SQLServer, Oracle
As always, the artifacts are available on maven-central.
For more information, see the site of Blobstore: http://www.everit.org/blobstore.html
After lots of work, ECM (Everit Component Model) is finally released. In the scope of the project several modules were developed:
- org.everit.osgi.capabilitycollector: A simple library that helps tracking references (currently OSGi services and Bundles). ECM uses this library to be able to track the references of components.
- org.everit.osgi.ecm.util.method: Helper methods to generate and parse method signatures and to locate methods based on the 112.9.4 Locating Component Methods chapter of OSGi Compendium specification.
- org.everit.osgi.ecm.metadata: To create a component, the engine needs a Class type and metadata that belongs to it.
- org.everit.osgi.ecm.annotation: Annotations of ECM, that can be used to describe the component functionality of a class
- org.everit.osgi.ecm.annotation.metadatabuilder: A simple utility library that can be used to generate Metadata (that can be passed to ECM Container Factory) from annotated classes.
- org.everit.osgi.ecm.component.api: API for ECM based Component development. This module contains interfaces like ComponentContext.
View original post 187 more words
Code generation improvements
- Property renaming possibility in lqmg xml files
- schema name is generated into metadata classes
- schema parameter is renamed to be capability
- unsatisfied wires can be modified to be optional during code generation (hackWires)
There are many libraries out there that contain reusable code, but only a few of them persist data in relational database. Even if they do so, they behave normally as standalone project. It is hard to use their tables in custom queries of the project that embeds them.
Our goal is to create a technology stack that allows to embed logically related table sets into separate modules. There are many use-cases in which reusable persistent data structure can be defined. With a well defined API, complicated queries can be eased.
A reusable module with relational database tables must meet the following requirements:
- The module must provide an API that makes it possible to easily create or modify the records of the tables
- It must be possible to reference the tables with foreign keys from other modules
- The module should provide API to easily extend SQL queries
During a project many parts can be separated that can be implemented in a reusable way. This post shows some practical use-cases and the way how they can be implemented.
Imagine that you have a User entity with the following fields:
Task: Show a filtered and paginated list of users on a website. The list should be ordered by the localized country name
- Select all the users that meet the filter criteria. If we have 1 million results, than download all of them into memory.
- Translate the country name based on a ResourceBundle for each user
- Order the list of users
- Show the current page (Twenty records from the million)
High performance solution with the re-usable Localization module:
Create a table called localized_data with the following fields:
- default_locale (boolean)
Fill this table with the country codes and with their translations. Example:
We have the original SQL:
SELECT ..., c.country_name, ... FROM user u
We would like to change the SQL in the way that:
- Country name is selected from the localized_data table
- The query gives back 20 rows beginning with the row index 10.000 of the result set.
- If the country name is not available in the locale of the user (e.g.: en_US), select the country name with language of the user (e.g.: en)
- If the country name is not available in the language of the user, select the country name with a default language
- If the country name not available at all in the localized_data table, get the country code
The modified SQL would look like this:
SELECT ..., COALESCE( (SELECT ld.value FROM localized_data ld WHERE ld.key = u.country_code AND ld.locale = :userLocale), (SELECT ld.value FROM localized_data ld WHERE ld.key = u.country_code AND ld.locale = :userFallbackLocale), (SELECT ld.value FROM localized_data ld WHERE ld.key = u.country_code AND ld.default_locale), u.country_code) AS country_name, ... FROM user u ORDER BY country_name LIMIT 20 OFFSET 10000;
The SQL above looks difficult. Re-implementing this query at each place is almost impossible. It is time to find a technology that supports Object-Oriented SQL queries to be able to separate complex query parts to reusable functions. Some solutions that allow this:
In former times we used JPA Criteria API. Recently we changed to QueryDSL due to its benefits. The original query would look like this with QueryDSL:
QUser user = new QUser("u"); SQLQuery query = new SQLQuery(...); resultList = query.from(user).list(..., user.countryCode, ...);
It would be hard to write down the COALESCE logic every place where we want to have localized data as part of the result set. Let’s make a function called createLocalizedCoalesce(DSLExpression<String> key, Locale locale). With that function, it will be easy to write the robust query:
QUser user = new QUser("u"); Coalesce countryNameExpression = createLocalizedCoalesce(user.countryCode, Locale locale); DslExpression<String> countryNameAs = countryName.as(); SQLQuery query = new SQLQuery(...); resultList = query.from(user) .orderBy(countryNameAs.asc()) .limit(20).offset(10000) .list(..., countryNameExpression, ...);
Building a query with the complex localized_data logic is not that difficult anymore.
Initialize database schema
Defining tables, views, versions
We chose QueryDSL to write queries. However, we need a tool that can populate the database schema.
We investigated lots of the technologies that supports creating and modifying tables and Liquibase won. In our solution, each module contains a liquibase changelog file that defines the database schema. In the localization module it looks like the following:
<changeSet id="1.0.0" author="everit"> <createTable tableName="localized_data" schemaName="org.everit.osgi.localization.schema"> <column name="localized_data_id" type="bigint" autoIncrement="true"> <constraints primaryKey="true" nullable="false" /> </column> <column name="default_locale" type="boolean" > <constraints nullable="false" /> </column> <column name="key_" type="varchar(255)" /> <column name="locale_" type="varchar(10)" /> <column name="value_" type="varchar(2000)" /> </createTable> <addUniqueConstraint columnNames="key_, locale_" constraintName="unique_key_locale" schemaName="org.everit.osgi.localization.schema" tableName="localized_data" /> </changeSet>
Initializiation of the database
Our stack is based on OSGi and Declarative Services. Using the right modules and components we are able initialize the database before the business logic is activated. This goal can be achieved by the configuration of the following components:
- First of all, we need a JDBC driver that registers a DataSourceFactory as an OSGi service
- Everit DSF component picks up the DataSourceFactory service and registers an XADataSource OSGi service
- Everit – Commons DBCP component picks up the XADataSource service and by instantiating a BasicManagedDataSource it registers a DataSource OSGi service
- Everit Liquibase DataSource component
- picks up the DataSource service that was registered by Everit – Commons DBCP component
- tries to initialize/validate the database with the specified Liquibase changelog file
- if the initialization/validation is successful, the DataSource object will be registered again as an OSGi service but with different service properties
- Based on service properties, business components can pick up the DataSource that was registered by Liquibase DataSource component
With a configuration like this, we can be sure that the database is up-to-date when the business components are activated.
How does the Liquibase DataSource component know, where it should find the changelog file?
Each bundle that contains one or more Liquibase changelog files. They must also provide one or moreliquibase.schema capability. The capability can have the following attributes:
- name: The “name” of the schema that identifies the changelog file. By “schema” I mean the identifier of a logicaly related set of tables, views and sequences. The meaning of “schema” capability attribute is not the same as a database schema.
- resource: The path of the changelog file within the bundle
- custom attributes: Any custom attribute may be specified. They can be useful if we want to allow special filters on the consumer side (e.g.: version, dbtype, …)
In case of localization, the capability MANIFEST header looks like the following:
Everit – Liquibase DataSource component should be configured to pull this capability. To do that, simply specify the following at the liquibase.schema configuration property: org.everit.osgi.localization.schema.
It is possible to add directives in the configuration of the Liquibase DataSource component. E.g.: org.everit.osgi.localization.schema;filter:=(version>=1.0.0)
Referencing tables and views from other modules
By using Liquibase it is possible to include a changelog file from other changelogs. In case we use the Liquibase jar that is bundled by Everit, it is also possible to include changelog files from other bundles. To do that:
- The consumer bundle must have a Require-Capability MANIFEST header that refers to the capability offered by the producer bundle. By doing that, a wire will be created between the bundles. Our solution finds inclusions based on these wires.
- the include tag must define the path with the “eosgi:” prefix and add the name of the liquibase.schema capability as value instead of the path
If we want to use the Localization module in our project, we should include the liquibase changelog in the following way in the changelog of the application:
And specify the following requirement in the MANIFEST of the application bundle:
The wiring and inclusion above works transitively. E.g.: If you include authorization tables in your application, you include its dependencies transitively as well. As resource is included in authorization, the resource table will be created/validated during the application initialization.
Generating the QueryDSL metadata classes
Writing QueryDSL metadata classes manually would be a hard task. There is a tool called LQMG that generates the Metadata classes from the Liquibase capabilities for us. This tool has the following steps:
- Starts an embedded OSGi container (Equinox)
- Deploys the specified bundles. Those bundles should be specified that contain liquibase capabilities.
- Starts an embedded H2 server
- Based on the specified liquibase capabilities, initializes the embedded H2 database. This is done in the same way as Liquibae DataSource works
- Generates the QueryDSL metadata classes from the H2 databsae
It is also possible to specify renaming rules in LQMG to generate class names and properties that are easier to read in the source code.
If we use lqmg-maven-plugin, the dependencies of the projects will be handled as the OSGi bundles.
The Localization module will be available on GitHub soon.
The Resource project is already available at GitHub. Resource has no own functionality. It is simply a table that has a generated id. Resource is often used in other modules to wire their functionality to each other.
Databsae Sequence could have been used as well, but we wanted to let the developers find out if any other module uses the same resource id. In case a foreign key references the resource record, it cannot be deleted until all module allows it.
The authorization module has already been implemented by Everit based on JPA-OSGi technology stack. As we are changing our technology stack, it will be available soon based on Liquibase-QueryDSL at GitHub.
To have a reusable authorization module, we can use the following relational schema:
Defines a permission between two resources.
- authorized_resource: The one, who has a permission on something. E.g.: user, role, user group, external system, etc.
- target_resource: Resources which other resources can run action on. It might be a document that a user can open and edit or a user who can be deleted or modified
- action: The action of the permission. E.g.: If the authorized resource was a user and the target resource was a document, actions could be ‘open’, ‘edit’, ‘delete’, …
It is possible to define inheritance between authorized resources. The child resource inherits all of the permissions of the parent. A parent could be a user group, role or organization, while a child could be a user or user group. The table is abstract enough to let the designer of the application decide, how to describe roles, groups, users and the relation between them.
Functionality of authorization
First of all, the authorization module contains functions that make it possible to manipulate permissions between resources and the permission inheritance DAG. It also contains a permissionChecker that can
- calculate the inheritance and provide it from cache next time
- provide permission information for an action between two resources
- from the database
- from cache
And last but not least, the authorization component provides functionality to extend QueryDSL based queries with permission restrictions. Imagine the following query:
SELECT ... FROM document d JOIN document.attachment a;
We can define permissions on every entity that contains a foreign key to the resource table. Every entities can be found in a SQL statement behind FROM and JOIN parts may be authorized or target resource. In this case we can define permissions on documents and on their attachments. We define the following function:
Predicate createPermissionPredicate( Expression<Long> authorizedResourceIdExpr, Expression<Long> targetResourceIdExpr, String action);
The function generates the following predicate (in QueryDSL of course):
EXISTS(SELECT 1 FROM permission p WHERE p.authorized_resource_id IN (?, ...) AND p.target_resource_id = d.resource_id AND action IN (?, ...))
By using QueryDSL and the function above, it is easy to extend every queries with authorization restrictions.
This is more a concept for us at the moment. We realized that full-text searching together with permission checks cannot be implemented effectively if the index is outside of the database. We saw huge document management systems that did the following:
- Doing a full-text search with a technology like Lucene
- Checking permissions on each record one-by-one
- Showing twenty records on the user interface from the middle of the result set
And saw them die on requests that had larger result sets :).
Every modern DBMS has built in full-text search support:
- MySQL – Sphinx
- PostgreSQL – TSearch2
- MSSQL – Some Microsoft stuff 🙂
- Oracle – Oracle Text
All of these allow to have FTS as part of a complex SQL statement. We believe that by using the FTS and the authorization modules together the two logic can be migrated into one SQL statement easily. By doing that, only the number of final results have to be transported from the database.
After the developer starts to design projects with this concept, he/she will find many reusable parts. Some ideas: historical tables with data mining functions, audit records, currencies with exchange rates, geolocations, gtfs, …