Technical Articles
Optimize DDL statements generated for type system update
I ran recently in the situation, where the type system update ran longer than the entire time allocated for the cutover and wanted to share our findings and solution with the community as I received echos from other projects facing the same problem.
How runs the type system update?
The type system, configured through the *-items.xml
files, defines the data model, translated into a database schema during the initialization or update of the platform. This translation is performed by the DdlUtils library from the Apache DB project: it generates the DDL statements required for creating or updating the database schema associated to a data model. While the library design allows supporting virtually any database software or version (see o.a.d.Platform
and o.a.d.p.SqlBuilder
), the implementation uses ANSI SQL to support a maximum of database software versions. Consequently, advanced options or capabilities available for only particular database software versions are often not supported by the library. For instance, ANSI SQL only allows adding one field to a table per statement while MySQL allows adding multiple fields to a table within one statement. The MySQL support in the DdlUtils library (see o.a.d.p.m.MySqlBuilder
class) uses ANSI SQL instead of the MySQL specific SQL, generating one statement per field to add instead of one statement per table to modify. Note also that the library project is retired and not further maintained. Consequently, any functionality added recently by database vendors will very likely not be supported by the library.
The type system update can be ran either via hAC or via Ant. When ran through Ant, couple options are available: dry-run, type system only update or full system update. Although these different ways use at the end the same mechanism for generating the DDL statements updating the database schema, they use different technologies, complicating the design and implementation of customizations. hAC uses Spring MVC while the Ant target updatesystem
uses a custom Ant task for the dry-run and type system only update options (see d.h.a.t.i.TypeSystemUpdaterTask
) and a macro-based task otherwise.
Why did the type system update take so long?
The execution of the DDL statements generated by the type system update took a very long time. In my particular situation, new fields were added to itemtypes stored in very large tables (50+ millions records) and the type system update generated ALTER TABLE ADD
statements for each new field. The database used with SAP Commerce was MySQL 5.6, which handles by default the addition of new field to a table as following: it creates a new table based on the old table definition, add the new field, copies then all the records from the old table into the new table and finally remove the old table. The copy of all records can take a significant amount of time, especially for large tables. Moreover, the DdlUtils library generates one statement per new field. In my situation, multiple times, several fields were added to the same itemtype, causing at the end MySQL to copy over and over the same large tables.
The first optimization was to add all new fields to a table in one statement, so that MySQL would only copy once a very large table. For example, instead of generating the following DDL statements:
ALTER TABLE stocklevels ADD p_sapwarehouseid VARCHAR(255);
ALTER TABLE stocklevels ADD p_sapwarehousename VARCHAR(255);
It would generate the following DDL statement:
ALTER TABLE stocklevels
ADD p_sapwarehouseid VARCHAR(255),
p_sapwarehousename VARCHAR(255);
The second optimization after digging in the MySQL 5.6 documentation was to use the INPLACE
instead of the COPY
algorithm when adding a new field:
ALTER TABLE stocklevels
ADD p_sapwarehouseid VARCHAR(255),
p_sapwarehousename VARCHAR(255),
ALGORITHM=INPLACE;
Note that the INPLACE
algorithm cannot always be used. Fortunately, in our situation, it was possible to use for all new fields.
How to customize the generated DDL statements?
The proposed approach is the following:
- Customize the
SqlBuilder
implementation supporting the SAP Commerce database to generate the desired optimized DDL statements - Expose the custom
SqlBuilder
via customPlatform
andHybrisPlatformFactory
classes - Adapt SAP Commerce to ensure the custom
HybrisPlatformFactory
is invoked from hAC as well as Ant.
Let us see now more in details the code and configuration changes to perform for implementing this approach.
Customize the default SqlBuilder
implementation
Instead of implementing from scratch the SqlBuilder
class, it is preferable to extend the default implementation used by SAP Commerce for your database. The class d.h.b.d.HybrisPlatformFactory
implements in the method createInstance
the logic mapping SqlBuilder
implementations to databases. Actually, it maps Platform
implementations to databases but by navigating into the build
method of the Platform
implementation, you can easily find which SqlBuilder
implementation is used.
As MySQL 5.6 was the database in our scenario, I extended the HybrisMySqlBuilder
class as following:
public class MyHybrisMySqlBuilder extends HybrisMySqlBuilder {
private final DatabaseSettings databaseSettings;
public MyHybrisMySqlBuilder(final Platform platform, final DatabaseSettings databaseSettings) {
super(platform, databaseSettings);
this.databaseSettings = databaseSettings;
}
@Override
public void processTableStructureChanges(final Database currentModel, final Database desiredModel, final Table sourceTable, final Table targetTable, final Map parameters, final List changes) throws IOException {
final Iterator changesIterator = changes.iterator();
final List addColumnChanges = new ArrayList<>(changes.size());
while (changesIterator.hasNext()) {
final Object change = changesIterator.next();
if (change instanceof AddColumnChange) {
addColumnChanges.add((AddColumnChange) change);
changesIterator.remove();
}
}
if (!addColumnChanges.isEmpty()) {
final Map<Table, List> addColumnChangesByTables =
addColumnChanges.stream().collect(Collectors.groupingBy(AddColumnChange::getChangedTable));
for (final Map.Entry<Table, List> entry : addColumnChangesByTables.entrySet()) {
processAddColumnChanges(currentModel, desiredModel, entry.getKey(), entry.getValue());
}
}
super.processTableStructureChanges(currentModel, desiredModel, sourceTable, targetTable, parameters, changes);
}
protected void processAddColumnChanges(final Database currentModel, final Database desiredModel, final Table changedTable, final List changes) throws IOException {
this.print("ALTER TABLE ");
this.printlnIdentifier(this.getTableName(changedTable));
this.printIndent();
boolean isFirstAddColumnChange = true;
for (final AddColumnChange change : changes) {
if (isFirstAddColumnChange) {
isFirstAddColumnChange = false;
} else {
this.print(", ");
}
this.print("ADD COLUMN ");
this.writeColumn(change.getChangedTable(), change.getNewColumn());
if (change.getPreviousColumn() != null) {
this.print(" AFTER ");
this.printIdentifier(this.getColumnName(change.getPreviousColumn()));
} else {
this.print(" FIRST");
}
change.apply(currentModel, this.getPlatform().isDelimitedIdentifierModeOn());
}
final Optional algorithm = determineAlgorithmForAddColumnChanges(currentModel, desiredModel, changedTable, changes);
if (algorithm != null) {
this.print(", ALGORITHM=" + algorithm.get());
}
final Optional lock = determineLockForAddColumnChanges(currentModel, desiredModel, changedTable, changes);
if (lock != null) {
this.print(", LOCK=" + lock.get());
}
this.printEndOfStatement();
}
protected Optional determineAlgorithmForAddColumnChanges(final Database currentModel, final Database desiredModel, final Table table, final List changes) {
final List keys = Arrays.asList(
String.format("mysql.dt.ddl.alterTable.%s.addColumn.algorithm", getTableName(table)),
String.format("mysql.dt.ddl.alterTable.%s.algorithm", getTableName(table)),
"mysql.dt.ddl.alterTable.addColumn.algorithm",
"mysql.dt.ddl.alterTable.algorithm"
);
for (final String key: keys) {
final String algorithm = getDatabaseSettings().getProperty(key);
if (algorithm != null) {
return Optional.of(algorithm.toUpperCase(Locale.ROOT));
}
}
return Optional.empty();
}
protected Optional determineLockForAddColumnChanges(final Database currentModel, final Database desiredModel, final Table table, final List changes) {
final List keys = Arrays.asList(
String.format("mysql.dt.ddl.alterTable.%s.addColumn.lock", getTableName(table)),
String.format("mysql.dt.ddl.alterTable.%s.lock", getTableName(table)),
"mysql.dt.ddl.alterTable.addColumn.lock",
"mysql.dt.ddl.alterTable.lock"
);
for (final String key: keys) {
final String lock = getDatabaseSettings().getProperty(key);
if (lock != null) {
return Optional.of(lock.toUpperCase(Locale.ROOT));
}
}
return Optional.empty();
}
protected DatabaseSettings getDatabaseSettings() {
return databaseSettings;
}
}
Notice that the ALGORITHM
and LOCK
options can be configured via properties to give more flexibility. By default, both options are not configured and won’t appear in the DDL statements. In our case, the following properties were added to the local.properties
to make use of them:
mysql.dt.ddl.alterTable.lock=NONE
mysql.dt.ddl.alterTable.algorithm=INPLACE
Be aware that ALGORITHM=INPLACE
and/or LOCK=NONE
are not always possible. Check therefore your changes and your database configuration allow it before using the properties.
Expose the custom SqlBuilder
implementation
The SqlBuilder
is exposed through the Platform
instance (see getSqlBuilder()
method), which is instantiated by the the factory HybrisPlatformFactory
(see createInstance
method). The factory calls the static method build()
from the Platform
implementation class to get a new Platform
instance. Ideally, it should be possible with AspectJ to intercept calls to the build()
method and inject the custom SqlBuilder
in the new Platform
instance. Unfortunately, it is not that simple since the setSqlBuilder()
method is protected. Moreover, as detailed later, it will be needed to get a Platform
instance setup with the custom SqlBuilder
without using AspectJ. The chosen approach is consequently to:
- Extend the default
Platform
implementation for the SAP Commerce database to inject the customSqlBuilder
- Extend the
HybrisPlatformFactory
to return the customPlatform
Like SqlBuilder
, check the implementation of the createInstance()
method in the HybrisPlatformFactory
class to find the default implementation used by SAP Commerce for the database. For MySQL 5.6, it is the HybrisMySqlPlatform
. Unfortunately, the class has a private constructor and doesn’t allow to be extended, which forces to copy & paste its code to extend it. Notice that the implementations for other databases do not have this limitation and it might be therefore easier than for MySQL.
public class MyHybrisMySqlPlatform extends MySql50Platform implements HybrisPlatform {
private static final String MYSQL_ALLOW_FRACTIONAL_SECONDS = "mysql.allow.fractional.seconds";
private final boolean isFractionalSecondsSupportEnabled;
protected MyHybrisMySqlPlatform(boolean isFractionalSecondsSupportEnabled) {
this.isFractionalSecondsSupportEnabled = isFractionalSecondsSupportEnabled;
}
public static HybrisPlatform build(DatabaseSettings databaseSettings) {
Objects.requireNonNull(databaseSettings);
boolean allowFractionaSeconds = Boolean.parseBoolean(databaseSettings.getProperty("mysql.allow.fractional.seconds", Boolean.TRUE.toString()));
MyHybrisMySqlPlatform instance = new MyHybrisMySqlPlatform(allowFractionaSeconds);
instance.provideCustomMapping();
instance.setSqlBuilder(new DTHybrisMySqlBuilder(instance, databaseSettings));
MySql50ModelReader reader = new MySql50ModelReader(instance);
reader.setDefaultTablePattern(databaseSettings.getTablePrefix() + '%');
instance.setModelReader(reader);
return instance;
}
private void provideCustomMapping() {
PlatformInfo platformInfo = this.getPlatformInfo();
platformInfo.setMaxColumnNameLength(30);
platformInfo.addNativeTypeMapping(-1, "TEXT");
platformInfo.addNativeTypeMapping(12002, "BIGINT", -5);
platformInfo.addNativeTypeMapping(12000, "TEXT", -1);
platformInfo.addNativeTypeMapping(12003, "LONGTEXT", -1);
platformInfo.addNativeTypeMapping(12001, "TEXT", -1);
platformInfo.addNativeTypeMapping(12, "VARCHAR", 12);
platformInfo.setDefaultSize(12, 255);
platformInfo.addNativeTypeMapping(6, "FLOAT{0}");
platformInfo.setHasPrecisionAndScale(6, true);
if (this.isFractionalSecondsSupportEnabled) {
platformInfo.setHasSize(93, true);
platformInfo.setDefaultSize(93, 6);
}
}
public String getColumnName(Column column) {
return ((HybrisMySqlBuilder)this.getSqlBuilder()).getColumnName(column);
}
public String getTableName(Table table) {
return this.getSqlBuilder().getTableName(table);
}
}
Note that the copy & paste was done from the HybrisMySqlPlatform
class present in the SAP Commerce 1808 release. I would highly recommend not to copy & paste this code but rather starts from the code of your release and adapt the line(s) setting the SqlBuilder
.
The next step is to extend the standard HybrisPlatformFactory
to expose the custom Platform
, which exposes the custom SqlBuilder
with the optimizations for the DDL statements.
public class MyHybrisPlatformFactory {
private volatile static Platform platform;
static {
PlatformFactory.registerPlatform("MySQL5", MyHybrisMySqlPlatform.class);
PlatformFactory.registerPlatform("MySQL", MyHybrisMySqlPlatform.class);
}
public static Platform createInstance(final DatabaseSettings databaseSettings, final DataSource dataSource) throws DdlUtilsException {
final Platform platform = createInstance(databaseSettings);
platform.setDataSource(dataSource);
return platform;
}
public static Platform createInstance(final DatabaseSettings databaseSettings) throws DdlUtilsException {
if (Boolean.valueOf(databaseSettings.getProperty("mysql.dt.ddl.enabled", Boolean.FALSE.toString()))
&& databaseSettings.getDataBaseProvider() == DataBaseProvider.MYSQL) {
databaseSettings.setStatementDelimiter(";");
return (platform = MyHybrisMySqlPlatform.build(databaseSettings));
} else {
return HybrisPlatformFactory.createInstance(databaseSettings);
}
}
public static synchronized Platform getInstance() {
if (platform == null) {
return HybrisPlatformFactory.getInstance();
} else {
return platform;
}
}
}
Adapt SAP Commerce to call the custom factory HybrisPlatformFactory
As mentioned earlier, the type system update and therefore the factory HybrisPlatformFactory
are accessed through different channels, using different technologies. hAC is based on the SAP Commerce platform while the Ant target updatesystem
uses Ant task and Ant macros.
Packaging all customizations in a JAR
Before digging into the required adaptations, it is important to notice that the custom HybrisPlatformFactory
class as well as other custom classes have to be in the classpath of the SAP Commerce platform as well as in the Ant classpath in order to be invoked. Consequently, it is not possible to store these custom classes within the source folder of an extension as Ant could not access them. Therefore, it shall rather be packaged in a JAR and placed in the lib
folder of an extension, which will automatically be part of the SAP Commerce platform classpath and can be referenced in Ant.
Find below the pom.xml
of my Maven project used for building this JAR library.
<?xml version="1.0" encoding="UTF-8"?>
<project xmlns="http://maven.apache.org/POM/4.0.0" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xsi:schemaLocation="http://maven.apache.org/POM/4.0.0 http://maven.apache.org/xsd/maven-4.0.0.xsd">
<modelVersion>4.0.0</modelVersion>
<groupId>[...]</groupId>
<artifactId>[...]</artifactId>
<version>1.0</version>
<properties>
<project.build.sourceEncoding>UTF-8</project.build.sourceEncoding>
<maven.compiler.source>1.8</maven.compiler.source>
<maven.compiler.target>1.8</maven.compiler.target>
</properties>
<dependencies>
<dependency>
<groupId>org.apache.ddlutils</groupId>
<artifactId>ddlutils</artifactId>
<version>1.0</version>
<scope>provided</scope>
</dependency>
<dependency>
<groupId>org.aspectj</groupId>
<artifactId>aspectjweaver</artifactId>
<version>1.9.0</version>
<scope>provided</scope>
</dependency>
<dependency>
<groupId>org.apache.ant</groupId>
<artifactId>ant</artifactId>
<version>1.9.1</version>
<scope>provided</scope>
</dependency>
<dependency>
<groupId>de.hybris</groupId>
<artifactId>ybootstrap</artifactId>
<version>1808</version>
<scope>system</scope>
<systemPath>[...]/hybris/bin/platform/bootstrap/bin/ybootstrap.jar</systemPath>
</dependency>
<dependency>
<groupId>de.hybris</groupId>
<artifactId>yant</artifactId>
<version>1808</version>
<scope>system</scope>
<systemPath>[...]/hybris/bin/platform/bootstrap/bin/yant.jar</systemPath>
</dependency>
</dependencies>
</project>
Ensure to adapt the versions of the dependencies based on your SAP Commerce release.
Adaptations for hAC
The call from hAC to the factory HybrisPlatformFactory
is more or less hard-coded and AspectJ is consequently the easiest way to adapt SAP Commerce, so that it uses the custom factory. The following aspect intercepts the calls made to the createInstance()
method from the standard HybrisPlatformFactory
and invokes the same method from the custom factory instead.
@Aspect
public class MyHybrisPlatformFactoryAspect {
@Pointcut("execution(static * de.hybris.bootstrap.ddl.HybrisPlatformFactory.createInstance(de.hybris.bootstrap.ddl.DatabaseSettings))")
public void createInstance() {}
@Pointcut("execution(static * de.hybris.bootstrap.ddl.HybrisPlatformFactory.getInstance())")
public void getInstance() {}
@Around("createInstance()")
public Object aroundCreateInstance(final ProceedingJoinPoint pjp) throws Throwable {
final DatabaseSettings databaseSettings = (DatabaseSettings) pjp.getArgs()[0];
return MyHybrisPlatformFactory.createInstance(databaseSettings);
}
@Around("getInstance()")
public Object aroundGetInstance(final ProceedingJoinPoint pjp) throws Throwable {
return MyHybrisPlatformFactory.getInstance();
}
}
The aspect should then be configured using the following META-INF/aop.xml
file.
<!DOCTYPE aspectj PUBLIC "-//AspectJ//DTD//EN" "http://www.eclipse.org/aspectj/dtd/aspectj.dtd">
<aspectj>
<weaver>
<include within="de.hybris.bootstrap.ddl.HybrisPlatformFactory"/>
<include within="MyHybrisPlatformFactoryAspect"/>
</weaver>
<aspects>
<aspect name="MyHybrisPlatformFactoryAspect"/>
</aspects>
</aspectj>
The aspect will be needed for the Ant adaptations and should therefore be packaged with the customizations in the JAR, including the AspectJ configuration. The configuration should also be placed under the resources
folder of an extension to enable the AspectJ support in SAP Commerce.
Adaptations for Ant
The details about the Ant target updatesystem
can be found in the platform/build.xml
file. It uses the macro updateTypeSystem
on one side for the dry-run and type system only update options and the macro updatesystem
on the other side for the full system update option.
The updatesystem
macro is defined in platformadministration.xml
and uses the yrun
macro to trigger the system update. The yrun
macro starts a JVM to run code and accepts additional JVM arguments as well as additional classpath entries. It is consequently possible to activate AspectJ, add the JAR containing thet customizations to the classpath and reuse the aspect implemented for hAC adaptations to get the custom factory HybrisPlatformFactory
invoked.
<yrun additionalclasspath="${ext.myext.path}/lib/mylib.jar"
jvmargs="-javaagent:${ext.core.path}/lib/${aspect.weaver.library} ${aspect.weaver.config}">
[...]
</yrun>
I recommend to copy the definition from the standard updatesystem
macro from platformadministration.xml
and paste it in the buildcallbacks.xml
of an extension. Then rename the macro definition (e.g. myupdatesystem
) and adapt the yrun
macro calls as shown above.
The updateTypeSystem
macro is defined in platformadministration.xml
file and is based on the Ant task d.h.a.t.i.TypeSystemUpdaterTask
. Since the Ant task is called immediately by Ant and Ant does not embed AspectJ, it is not possible this time to reuse the aspect. The solution is consequently to customize the standard TypeSystemUpdaterTask
class as well as the intermediate class HybrisSchemaGenerator
to call the custom HybrisPlatformFactory
class.
Find below the code for the customized HybrisSchemaGenerator
class calling the custom factory HybrisPlatformFactory
.
public class MyHybrisSchemaGenerator extends HybrisSchemaGenerator {
private final DataSourceCreator dataSourceCreator;
public MyHybrisSchemaGenerator(PlatformConfig platformConfig, PropertiesLoader propertiesLoader, DataSourceCreator dataSourceCreator, boolean dryRun) throws Exception {
super(platformConfig, propertiesLoader, dataSourceCreator, dryRun);
this.dataSourceCreator = dataSourceCreator;
}
public MyHybrisSchemaGenerator(PlatformConfig platformConfig, PropertiesLoader propertiesLoader, DataSourceCreator dataSourceCreator, DbTypeSystemProvider dbTypeSystemProvider, OverridenItemsXml overridenItemsXml, boolean dryRun) throws Exception {
super(platformConfig, propertiesLoader, dataSourceCreator, dbTypeSystemProvider, overridenItemsXml, dryRun);
this.dataSourceCreator = dataSourceCreator;
}
@Override
protected Platform createDDLUtilsPlatform() {
return MyHybrisPlatformFactory.createInstance(this.getDatabaseSettings());
}
@Override
protected Platform createConnectedDDLUtilsPlatform() {
return MyHybrisPlatformFactory.createInstance(this.getDatabaseSettings(), this.dataSourceCreator.createDataSource(this.getDatabaseSettings()));
}
}
Find below the code for the customized TypeSystemUpdaterTask
class calling the customized HybrisSchemaGenerator
class.
public class MyTypeSystemUpdaterTask extends TypeSystemUpdaterTask {
@Override
public void execute() throws BuildException {
try {
final HybrisSchemaGenerator schemaGenerator = this.getSchemaGenerator();
this.setScriptNames(schemaGenerator);
this.executeSchemaGenerator(schemaGenerator);
} catch (final Exception e) {
throw new BuildException(e);
}
}
protected String getPlatformHome() {
return this.getProject().getProperty("platformhome");
}
protected HybrisSchemaGenerator getSchemaGenerator() throws Exception {
final PlatformConfig platformConfig = PlatformConfig.getInstance(ConfigUtil.getSystemConfig(this.getPlatformHome()));
final PropertiesLoader propertiesLoader = new StandalonePropertiesLoader(platformConfig, this.getTenantId());
final DataSourceCreator dataSourceCreator = new DBCPDataSourceCreator();
final HybrisSchemaGenerator schemaGenerator = new MyHybrisSchemaGenerator(platformConfig, propertiesLoader, dataSourceCreator, this.isDryRun());
return schemaGenerator;
}
}
The custom code shall be packaged in the JAR with the other customizations, so that it can be added to the Ant task definition classpath.
Finally, the updateTypeSystem
needs to be customized to first invoke the customized TypeSystemUpdaterTask
class and second include the JAR in the classpath. I recommend copying the original updateTypeSystem
macro definition from platformadministration.xml
into the buildcallbacks.xml
of an extension and adapt the task definition as following.
<macrodef name="updateTypeSystem" ...>
[...]
<taskdef name="yMyTypeSystemUpdate" classname="MyTypeSystemUpdaterTask">
[...]
<pathelement path="${ext.myext.path}/lib/mylib.jar"/>
</taskdef>
[...]
</macrodef>
Finally, either the Ant target updatesystem
needs to be adapted to call the new macros or a new Ant target shall be introduced. I recommend introducing a new Ant target by copying the definition of the original one in order to keep the standard Ant target.
Conclusion
Optimizing the DDL statements helped in my situation to cut down the type system update from over a day to couple hours and fit easily inside the maintenance window allocated for Go-Live. However, it took time to implement and validate as multiple executions of the system update were required to ensure the customizations work properly. My advise out this experience for other projects would be to check at the end of each development sprint the execution time of the DDL statements generated by the type system update. It’s easy to collect the DDL statements from hAC or Ant and measure the execution time on a pre-production database. It will help to detect early if an optimization of the DDL statements is needed and implement it during development phase.