Paul Dissected – Part 1
John paved the way in his great introductory post and explained that we brought Paul the Octopus back to life. Kiril and Petar showed how to set up the application locally and what it looks like. Now it’s time to take a look under the hood (or rather, the tentacles) and see what makes the Octopus swim.
Even though we didn’t have much time, we tried hard not to resurrect him in a Frankensteinian or brainless zombie-like way, but to give him a decent comeback with capabilities that can easily be extended in the future on an as-needed basis. The following series of blog posts will deep-dive into the technical details and illustrate some of the main pillars of the app.
To get the expectations right: this is not a step-by-step, build-it-from-the-ground-up tutorial. There are great ones already. Check out the super-detailed End-2-End scenario by Jens or the great TechEd Tutorial by Matthias. We crawl the opposite way: we take a finished, published, real-life and externally used application (by the way, get it from the app store to play around!) and dissect it piece by piece. You are strongly encouraged to have the source code available and to study it. Because the source is so easily accessible, we will hardly include any code snippets here but merely links to them, and we aim for the prize for “most links in a single blog post” 😉
With this being said, let’s put on our snorkels, grab the sources from github for reference, take a deep breath and let’s get wet.
What’s To Be Covered
- Project Overview
- REST API
- Mail Service
- Data Handling
- Document Service (ECM)
- Background Jobs
- User Authentication
- User & Admin Front-End (UI5)
- Performance Optimization
- iOS App
Project Overview
|- adapters – classes for accessing the SAP HANA Cloud Platform’s services
|- dao – methods for storing, updating and retrieving data from the database
|- entities – JPA entities describing the data schema and relations
|- importers – classes used for retrieving information from the data provider
|- jobs – background jobs that are being executed repeatedly
|- paul – logic for Paul’s betting behavior based on a crowd-sourced approach
|- services – REST services
|- marshalling – code for marshaling and JSON manipulation
|- startup – classes related to the initialization of the application
|- util – commonly used utility functionality such as constants, user utilities, file-upload helpers, etc.
|- configuration – classes used for configuring properties
- We use the following layers:
- UI (in the webapp folder)
- Business Logic (REST services in the services package)
- Database Access (DAO package)
- For simplicity we reuse the JPA entities as data transfer objects and hand them out directly in JSON format over the REST API.
- All the date and time handling & calculations are done using Joda-Time, which provides a much simpler and more elegant API for time handling than the JDK.
ℹ Some of the functionality is only available in the iOS application (coming soon to GitHub) or with a commercial Opta data feed. The most prominent example is playing together with friends in leagues. The detailed game statistics you can get for teams and players are also not visible: only a fraction is currently shown, and the fake data provider does not generate this data as of now. We are working on making this available in the example application in the future. The REST API is fully functional, though.
REST API
The overall goal was to have front-end clients connect to the server and interact with it by retrieving data about the user, matches (called fixtures), predictions etc. In order to expose this data we created a REST API using Apache CXF. All data is exchanged in JSON format. The path starts with a /b to signal that this is the endpoint for BASIC authentication. In a future post we will probably extend the example to SAML2 authentication, which will then be wired to /s. We clustered the functionality into services tied to specific URLs. They can be found in the services package. There are three kinds of access levels for the services:
- Accessible to everybody (anonymous): /anonuserservice (registering a new user, e-mail verification, forgot-password functionality) and /systemservice (general app info like the used date format, version…)
- Accessible to an administrator only: /adminservice (creating dummy data, introspecting data, triggering jobs…)
- Accessible to a registered user: /userservice, /leagueservice, /teamservice, /fixtureservice and /predictionservice (business logic for retrieving user & game information)
The CRUD functionality is modeled using the following HTTP-method pattern:
- GET for retrieving existing resources
- POST for creating new resources or triggering actions
- PUT for updating existing resources
- DELETE for deleting existing resources
Retrieving information about the app
Issue a GET request to http://localhost:8080/server/b/api/systemservice/info
Getting information about the currently logged in user
Issue a GET request to http://localhost:8080/server/b/api/userservice/user
Submitting a new prediction
Send a new prediction object in JSON format via POST
- JSON content:
- Destination URL: http://localhost:8080/server/b/api/predictionservice/predictions
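The post does not show the request body itself. Purely for illustration, a prediction payload could look like the following; the actual field names are defined by the Prediction JPA entity in the sources, so treat these as placeholders:

```json
{
  "fixtureId": 4711,
  "homeGoals": 2,
  "awayGoals": 1
}
```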
We initialize the Apache CXF services by including the CXFServlet in our web.xml. We use the application registration approach so that we do not have to list the service classes in the web.xml, which could lead to double maintenance when we rename them. Instead we use a service registry and can profit from the refactoring commands of our IDE in the future without needing to touch the web.xml. Each service class extends the abstract BasicService, which provides a small number of helper methods. If there are errors while processing, a corresponding error code is returned, e.g. “400 Bad Request” for incorrect parameters or “404 Not Found” when updating an element that does not exist. See the createPrediction() method in the prediction service for an example where we use the throwBadRequest() helper call.
For some additional control we use our own marshaller, which lets us easily shape the output we create. We use GSON for that purpose. With it we can define exclusion strategies, the date format, pretty printing for easier debugging and any additionally needed converters. To mark certain fields as not to be exposed via REST we introduced a custom JsonIgnore annotation and check for it in the exclusion strategy. This gives us more control (compared to @Transient) over when data should be excluded and when not (e.g. a normal user will not see certain fields that an administrator will).
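The mechanism can be sketched in plain Java. The names below are illustrative (the real annotation and strategy live in the sources): a runtime-retained marker annotation plus a check on the field. With GSON, this check would sit in ExclusionStrategy.shouldSkipField(), registered on the GsonBuilder.

```java
import java.lang.annotation.ElementType;
import java.lang.annotation.Retention;
import java.lang.annotation.RetentionPolicy;
import java.lang.annotation.Target;
import java.lang.reflect.Field;

public class JsonIgnoreSketch {

    // Marker annotation; must be retained at runtime so reflection can see it
    @Retention(RetentionPolicy.RUNTIME)
    @Target(ElementType.FIELD)
    public @interface JsonIgnore {}

    // Example entity: the password hash should never leave the server
    static class User {
        String email = "paul@example.com";
        @JsonIgnore
        String passwordHash = "secret";
    }

    // The core of the exclusion strategy: skip fields carrying the annotation.
    // With GSON this logic would go into ExclusionStrategy.shouldSkipField().
    static boolean shouldSkipField(Field f) {
        return f.getAnnotation(JsonIgnore.class) != null;
    }

    public static void main(String[] args) {
        for (Field f : User.class.getDeclaredFields()) {
            System.out.println(f.getName() + " skipped=" + shouldSkipField(f));
        }
    }
}
```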
So far we only have the API. Now we need data. We signed a contract with Opta Sports to get up-to-date Champions League data, and this is what we want to import now. Since for obvious legal reasons we cannot republish the commercial data, we provide our own data provider which supplies us with (of course high-quality 😉 ) compatible fake data. Hint: if you want to get some real data in, you can use the Admin UI and enter it manually.
Accessing the data is straightforward in our case. Opta regularly pushes XML files onto a pre-defined FTP server. We expose the content of our server via HTTP and can then easily fetch the files with a destination and our special tentacle called the Connectivity Service. Using a destination has the advantage that we don’t have to store the credentials inside the source code and that we can change the URL on the fly without restarting the application.
In order to connect to the data server a destination called “opta” needs to be created as described in the previous blog. In the ConnectivityAdapter we then load data from this destination. See the ImporterJob for an example. What Opta delivers is a set of differently structured XML files. We parse the XMLs (example) and save the data in the database. The importers package contains the classes doing the actual imports, each representing an importer for one of the XML file types (e.g. teams, fixtures, statistics…).
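For local development, a destination is essentially a small properties file (on the cloud the same data is entered in the account console). Purely as a sketch, with placeholder URL and credentials, and with the exact keys as described in the previous blog, an “opta” destination could look like this:

```
Name=opta
Type=HTTP
URL=http://localhost:8090/opta-feeds/
Authentication=BasicAuthentication
User=feeduser
Password=secret
```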
Mail Service
There are currently three cases where e-mails are sent to users by the back-end:
- Welcome mail upon first login
- In case Shiro is used for authentication:
- Verify E-Mail upon first sign-up
- Send new password if old one was forgotten
Using the Mail Service is extremely easy. As shown in the previous blog, the configuration stating which host and credentials to use has already been uploaded. We can now get a fully configured mail session object with two lines:
InitialContext ctx = new InitialContext();
Session session = (Session) ctx.lookup("java:comp/env/mail/Session");
From this session we retrieve a transport, build a javax.mail.Message and send it away. The content of the mail is loaded from templates which are stored in the resources. They contain some variables which are replaced with the actual values before sending. See the triggerForgotPassword() method for an example. Hint: If you call this service in your local development environment and have not configured otherwise, the mail will be saved locally in your Server/work directory for easier testing (you also don’t have to upload a Session file).
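The template mechanism itself is plain string work and can be sketched like this. The template text, the ${…} placeholder syntax and the variable names are made up for illustration; the real templates and the marker they use live in the resources folder:

```java
import java.util.HashMap;
import java.util.Map;

public class MailTemplateSketch {

    // Replace each ${key} placeholder in the template with its value
    static String fillTemplate(String template, Map<String, String> values) {
        String result = template;
        for (Map.Entry<String, String> e : values.entrySet()) {
            result = result.replace("${" + e.getKey() + "}", e.getValue());
        }
        return result;
    }

    public static void main(String[] args) {
        String template = "Hello ${name}, your new password is ${password}.";
        Map<String, String> values = new HashMap<>();
        values.put("name", "Paul");
        values.put("password", "octo123");
        System.out.println(fillTemplate(template, values));
        // Hello Paul, your new password is octo123.
    }
}
```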
We have a total of 15 tables modeled with JPA. Since we will later use the JPA entities directly in the REST API, we modeled them relatively flat and independent to keep things simple and the JSON feed small. All inherit from a BasicEntity which automatically provides the created and modified timestamps. On top of these plain entities we use corresponding data access objects (DAOs) which offer common basic functionality for all entities, like retrieving and deleting all or individual entries, in addition to entity-specific methods like retrieving a user by e-mail. Some DAOs don’t have entity-specific methods (yet), so they only extend the BasicDAO class and remain empty.
During local development we use Derby, which is automatically provided by the Persistence Service. By default the service will create an in-memory database and we will lose all data upon restart. To get a permanent database we remove the “memory:” flag from the connection.properties file in the Server/config_master/connection_data folder.
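For illustration, after removing the flag the JDBC URL inside connection.properties changes roughly like this; database name and credentials are placeholders, so compare with the file generated by your local server:

```
javax.persistence.jdbc.driver=org.apache.derby.jdbc.EmbeddedDriver
# in-memory (default): jdbc:derby:memory:DemoDB;create=true
# persistent:
javax.persistence.jdbc.url=jdbc:derby:DemoDB;create=true
javax.persistence.jdbc.user=demo
javax.persistence.jdbc.password=demo
```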
Having a first version of an app in the water is great but a second version is often soon on the horizon. We need a means to handle changes to the database schema since JPA will (unfortunately) not automatically alter tables (it will do some upgrades as of 2.4 which will be covered in a later blog). JPA will only create them from scratch if they don’t exist yet. In order to add or alter columns we use Liquibase.
To get Liquibase up and running we create a db folder in our resources folder and put the individual change logs there. An example from the last update is below, where we introduced seasons and competitions and also had to adjust the content of the external ID column:
<?xml version="1.0" encoding="UTF-8"?>
<databaseChangeLog xmlns="http://www.liquibase.org/xml/ns/dbchangelog"
    xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
    xsi:schemaLocation="http://www.liquibase.org/xml/ns/dbchangelog http://www.liquibase.org/xml/ns/dbchangelog/dbchangelog-2.0.xsd">
  <changeSet id="5" author="rw">
    <preConditions onFail="MARK_RAN">
      <tableExists tableName="USERS" />
    </preConditions>
    <sql>ALTER TABLE FIXTURES ADD (COMPETITION_ID INT)</sql>
    <sql>ALTER TABLE FIXTURES ADD (SEASON_ID INT)</sql>
    <sql>ALTER TABLE FIXTURES ADD (VENUE VARCHAR(255))</sql>
    <sql>ALTER TABLE FIXTURES ADD (CITY VARCHAR(255))</sql>
    <sql>UPDATE FIXTURES SET EXTID=SUBSTR(EXTID, 2)</sql>
  </changeSet>
</databaseChangeLog>
It basically adds some new columns and removes the first digit of the EXTID column. A little more interesting is the preConditions part: it will only execute the upgrades if the tables already exist. Imagine a fresh deployment, e.g. when testing locally. If the preCondition were not there, the start of the app would fail since Liquibase would not be able to perform an upgrade (the tables don’t exist yet). Instead, JPA will now create the tables in their most current state and only new updates will be executed afterwards. Details on this topic can be found in another blog of mine. We now only have to register the listener in our web.xml and are done. Whenever we make a change to an existing entity, we create a corresponding Liquibase XML file and don’t have to worry about adjusting any databases manually.
ℹ To get further details on wiring Liquibase to your application I recommend reading the nice introduction by Michael Wenz.
Document Service (ECM)
Using the Document Service we can store documents in a folder-like structure. The service itself stores everything in a MongoDB database and exposes an easy-to-use CMIS API, which we access with Apache Chemistry. In our application we use it for storing the user’s profile picture. Note that when running the app locally, you also need to install MongoDB.
To make the service available, a resource reference is declared in the web.xml:
<resource-ref>
  <res-ref-name>EcmService</res-ref-name>
  <res-type>com.sap.ecm.api.EcmService</res-type>
</resource-ref>
Now it is possible to connect and to initialize the repository as shown in the getCmisSession() call in the DocumentService adapter. We freely pick some unique credentials (picking different ones later will create a new repository). When you look through the code of the DocumentAdapter class you will see how to create folders and store documents with just a few calls.
To see it in action let’s take a look at the uploadPicture() method in the UserService. We first get the uploaded content as a byte array (using Apache Commons FileUpload to keep things simple), clean up any previously existing uploaded images for this user and store the new image afterwards.
In order to display such a stored picture again check out the pictureVisualizer() call. It will load the document as a byte array and return it directly, suitable for being displayed by an img tag.
Background Jobs
There are currently two tasks that are done asynchronously in the background:
- Importing data
- Having Paul update his predictions
We use the Quartz framework to schedule cron jobs for that. The code for scheduling the jobs is rather short. First, the new job class is created in the jobs package. If it must not run in parallel, make sure the @DisallowConcurrentExecution annotation is present. Second, the new job is registered in the AppInitializer class with an appropriate trigger.
ℹ Quartz also supports a clustered scheduler. This has the advantage that we can fire up several instances of the app (using the elasticity features of HANA Cloud and the automatic load balancing) to serve more users. When several instances are active at once, critical jobs such as the importer must only run on one machine at any given time so that we don’t get any data corruption. The clustered mode of Quartz takes care of that by using a table in the database for locks. To activate it, create the StdSchedulerFactory with the appropriate settings.
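As a sketch, the core settings for such a clustered, database-backed scheduler go into the quartz.properties (or are passed programmatically to the StdSchedulerFactory). The data source wiring is omitted here; see the Quartz configuration reference for the complete set:

```
# every node must generate its own id
org.quartz.scheduler.instanceId=AUTO
# store jobs and locks in the database instead of RAM
org.quartz.jobStore.class=org.quartz.impl.jdbcjobstore.JobStoreTX
org.quartz.jobStore.isClustered=true
# how often (ms) a node checks in so others can detect failed nodes
org.quartz.jobStore.clusterCheckinInterval=20000
```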
For display in the UI we gather all active jobs and their states and return them through the AdminService REST API.
In the code we use quite a number of hard-coded constants for simplicity’s sake. Often it makes sense to make some of them configurable from the outside. We have a small ConfigUtil which helps us do exactly this. It allows us to load String and Boolean values and supports several sources (system variable, file, database) from which these values can come. Fallbacks are also possible: if one source does not supply the requested value, the next source is queried.
In the PaulPredicts app we load properties to set for example the sender of the mails:
String from = ConfigUtil.getProperty("mail", "mail.from");
The property mail.from is defined in the resources file mail.properties.
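A minimal sketch of such a fallback chain in plain Java follows. The real ConfigUtil in the sources additionally supports a database source and Boolean values, and its method signature differs; the names and the explicit default parameter here are illustrative:

```java
import java.io.InputStream;
import java.util.Properties;

public class ConfigSketch {

    // Query the sources in order; the first one that yields a value wins.
    static String getProperty(String file, String key, String defaultValue) {
        // 1. system property, e.g. set via -Dmail.from=...
        String value = System.getProperty(key);
        if (value != null) {
            return value;
        }
        // 2. properties file on the classpath, e.g. mail.properties
        try (InputStream in = ConfigSketch.class.getResourceAsStream("/" + file + ".properties")) {
            if (in != null) {
                Properties props = new Properties();
                props.load(in);
                value = props.getProperty(key);
                if (value != null) {
                    return value;
                }
            }
        } catch (Exception e) {
            // ignore and fall through to the default
        }
        // 3. hard-coded default
        return defaultValue;
    }

    public static void main(String[] args) {
        System.setProperty("mail.from", "paul@example.com");
        System.out.println(getProperty("mail", "mail.from", "noreply@example.com"));
    }
}
```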
ℹ The utility also supports advanced concepts that will be touched on in a later blog post. It is for example possible to define ConfigSections, which can be used to describe a UI representation of the configuration options that is then shown to the user. If modified, these sections can be saved back to the database and will from then on override any previously specified values from the resource file, since the database is queried first in the fallback chain. This way it is very easy to create an app with sensible defaults in config files and expose some of them to the user so that they can be tweaked at runtime.
In a test-driven approach we used JUnit tests to specify the desired API and functionality and then implemented the needed calls to make the tests green. You can execute the tests by right-clicking on the src/test/java folder and selecting “Run As/JUnit Test”.
During local testing we use the Derby in-memory database, which is automatically provided during local development by the Persistence Service. Our test utility class takes care of starting up the database. Also, we delete all data before each test is executed; this way we start clean and don’t have any side effects. There are two basic ways to clean the database: throwing it away and recreating it, or only deleting the contents. We used the first approach initially but soon switched to the second, since recreating the database adds roughly one second of overhead per test, while deleting all entries is a matter of a few milliseconds.
What we described so far should enable you to go through the code, understand how the parts are connected and to experiment with it. As a matter of fact, that is highly encouraged. If you need an idea, why not create a new REST service which allows you to cheat and predict after a game is finished? You could then also flag such a prediction with an additional attribute by extending the prediction entity and writing an upgrade script for Liquibase.
Whatever you do, we hope you had fun reading this post and got some new insights. We would love to get your feedback and hope to see you back when we publish the next part covering user authentication and more.
Continue with Part 2 – User Authentication