Analyzing Custom Geographic Regions
A common problem in business mapping is the use of custom regions. They go by many names (territories, districts, areas, and so on), but their use and the related mapping problem are generally the same. So for our purposes here, I will simply refer to them as regions.
Business users want to see and use these special geometries on a map just like they see states, postal codes or counties. However, it is often not easy to do. Unfortunately, when business users are accustomed to thinking and acting in terms of certain geographic areas, not being able to show these on a map can be a major reason why it is hard to engage those analysts in the power of geographic analysis. For this reason I wanted to take the next couple of paragraphs to explain various approaches to solving this problem. So if you have struggled with this issue before, or think that you might in the future, I hope that this overview will prove useful.
This is what you have
This is what your users want to see
There are a few common ways to try to work around this limitation in traditional mapping tools.
The traditional options
One of the most common ways to deal with custom regions is to just create them manually in a hardcore GIS editing tool. If you have the resources of a Fortune 500 company, you can probably afford a professional cartographer to use expensive tools to draw each facet of your business’s regions by hand. Of course, someone first needs to research the proper alignment for the region boundaries and get them into a form that the cartographer can use. Then he or she can spend hours or days making a pixel-perfect map layer specifically for your organization. The output of this exercise would typically be a Shapefile or a GeoJSON file that can be used in your business mapping tool.
Turning unknown areas into known regions has historically been a costly endeavor
Aside from the up-front work, another problem with this solution is ongoing maintenance. Every time a change is made to the business region definitions, the cartographer has to go back to work adjusting the custom map layer. Then that updated layer has to be deployed to all of the business analysts somehow so that they can see the updated view of their business world. Obviously, the more frequently the regions change, the more of a burden this option becomes. Also, updating the business system or source with the new regions can itself be an error-prone and time-consuming process. Depending upon the tools and formats used, it may be necessary to go through several format and data conversion steps just to match the business data to the geographic shapes.
Another approach is to try to manually group states, counties, or postal codes in such a way as to allow group-level analysis. If the regions are based on large geographic entities that are few in number, then this option could be a viable solution. However, imagine trying to manage the manual grouping of thousands of postal codes as the region definitions change over time.
Does anyone know where ZIP code 20190 fits into our regioning scheme?!?
One additional problem with this approach is that BI tools vary widely in their ability to make these “custom groups” as well as in the analysis that can be done with them. In many cases, these manually grouped entities limit the BI analysis that can be done, and in other cases external extensions such as mapping tools can’t make use of them at all.
A better way
After seeing the problem at many customers, we at Visual Crossing set out to solve it in a way that we believe to be far superior as well as far simpler to use. That way is our automatic region creation. The Visual Crossing map extension constantly scans the business datasets for dimensions that may relate to something geographic. For common geographic areas the matching is easy. For example, if it sees US states, UK postal codes, or German Kreise, it can match them to known, standard geographic entities and display those on the map automatically. That is the easy part.
The harder part is when the business data contains something that does not easily match to standard geographic boundaries. Such a dimension may be entirely non-geographic, such as products in a store, or it may be a geographic layer that simply is not defined by a direct match to standard geographic entities. When the Visual Crossing engine finds one of these “higher-level” geographic “regions”, the real work begins.
The mapping engine looks across the dataset to determine what other geographic layers may relate to this potential geographic level. If it finds a related, known “lower-level” geographic layer, it can then use those known geometries to automatically and instantly build a set of custom region geometries based on the business data. There is no need to make any assignments by hand, nor is there a need to make and distribute manual updates when the business regions change. Your business data already knows how your custom regions should look. Visual Crossing simply built a mapping solution to unlock that knowledge.
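To make the idea concrete, here is a minimal sketch of the first half of that process: working out which known lower-level areas belong to each custom region, purely from the business rows. This is an illustration under my own assumptions, not Visual Crossing's actual implementation. The field names (`zip`, `region`, `sales`), the sample rows, and the function name `build_region_membership` are all hypothetical.

```python
from collections import defaultdict

# Hypothetical business rows: each sale carries a ZIP code and a custom region.
rows = [
    {"zip": "43004", "region": "Central", "sales": 120.0},
    {"zip": "43016", "region": "Central", "sales": 80.0},
    {"zip": "44102", "region": "North", "sales": 200.0},
    {"zip": "43004", "region": "Central", "sales": 30.0},
]


def build_region_membership(rows):
    """Map each region name to the set of ZIP codes that appear with it."""
    members = defaultdict(set)
    for row in rows:
        members[row["region"]].add(row["zip"])
    return dict(members)


regions = build_region_membership(rows)
```

Once the membership is known, a mapping tool would merge the known ZIP code shapes in each set into a single region shape. That geometric step is not shown here.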
The Complete Picture
One potential issue that might cross your mind when considering the system above is that the definition of the regions will be limited to the data that is available in the current dataset. For example, if your dataset has sales by ZIP code and some ZIP codes don’t have any sales attached to them, those ZIP codes won’t show up in the dataset at all. Because of this, those ZIP codes won’t appear on the map, nor will they be used to build the regions. This will result in regions that have some visual “holes” in them.
Region “holes” can point to business opportunities or be seen as visual distractions
One way to look at these holes is as a useful analysis in their own right. Why is there no data in these ZIP codes? Perhaps we are missing a business opportunity there that merits further investigation. This is an example of a business insight that you can gain from mapping without even looking for it.
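If you want to list those holes rather than just spot them on the map, the check amounts to a set difference. The sketch below assumes you have a reference list of every ZIP code in the area of interest. The names `all_zips`, `rows`, and `find_hole_candidates` are hypothetical, as are the sample values.

```python
def find_hole_candidates(all_zips, rows):
    """Return ZIP codes from the reference list that never appear in the data."""
    seen = {row["zip"] for row in rows}
    return sorted(set(all_zips) - seen)


# Hypothetical reference list and sales rows.
all_zips = ["43004", "43016", "43017", "44102"]
rows = [
    {"zip": "43004", "sales": 150.0},
    {"zip": "44102", "sales": 200.0},
]

holes = find_hole_candidates(all_zips, rows)
```

Each ZIP code in `holes` is either a gap worth investigating or a candidate for the filling approaches described next.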
However, in other cases you may want to see these regions completely filled. There are two common ways to accomplish this. One way is to run the generated region through geographic algorithms that fill these “holes” by making guesses based on the available data. Conceptually, the basic algorithm is easy to understand. If a “hole” is completely surrounded by a single region, then that hole can be entirely filled in with its surrounding region. If, however, the “hole” touches multiple regions, then a computation based on proximity, shared border, and other factors can be used to make a best guess at how to assign the hole. In Visual Crossing, a version of this algorithm can be automatically applied to improve the appearance of a region. Note, however, that since the tool is making an educated guess in some situations, the generated output is not guaranteed to match some hand-drawn paper map region hanging on a wall somewhere. For many people, though, the benefit of being fully automated more than outweighs the minor inexactness.
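The following is a simplified sketch of that conceptual algorithm, not the version used in Visual Crossing. It assumes you already know, for each hole, which areas border it and how long each shared border is. It uses shared border length as the only factor and leaves out proximity and the other factors mentioned above. The data layout and the name `assign_hole` are hypothetical.

```python
def assign_hole(hole, neighbors, region_of):
    """Guess a region for a hole from the regions of the areas that border it.

    neighbors maps each hole to {bordering_area: shared_border_length}.
    region_of maps each known area to its region.
    """
    totals = {}
    for area, length in neighbors[hole].items():
        region = region_of.get(area)
        if region is None:
            continue  # the bordering area is itself unassigned
        totals[region] = totals.get(region, 0.0) + length
    if not totals:
        return None  # nothing to base a guess on
    # Pick the region sharing the most border; ties go to the first name alphabetically.
    return max(sorted(totals), key=totals.get)


# Hypothetical areas: A1 and A2 are in region "A", B1 and B2 in region "B".
region_of = {"A1": "A", "A2": "A", "B1": "B", "B2": "B"}
neighbors = {
    "Z1": {"A1": 2.0, "A2": 1.5},             # surrounded by a single region
    "Z2": {"A1": 1.0, "B1": 3.0, "B2": 0.5},  # touches two regions
}

guess_z1 = assign_hole("Z1", neighbors, region_of)
guess_z2 = assign_hole("Z2", neighbors, region_of)
```

A hole surrounded by one region falls out as the trivial case, since only one region collects any border length.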
The second option is to pre-run a dataset that does contain a full set of the underlying known geographic elements. We call this a “complete dataset.” So, for example, if you have modeled a full ZIP code list into your business data, you can make a dataset and a map that contains the complete set of ZIP codes. The region layer produced in this case will not contain geographic “holes.” You can then save this region layer for future use on maps that may not contain the entire ZIP code set. While initially providing this dataset requires some extra work, this option has the benefit of producing a region layer that exactly matches the known business definitions.
Filled regions automatically generated from the business data
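The “complete dataset” option can be sketched in the same spirit. Here a plain ZIP-to-region lookup stands in for the saved region layer, which in a real mapping tool would hold geometry. The lookup is built once from a dataset that covers every ZIP code and then reused on a sparse one. All names and sample values are hypothetical, and JSON is just one convenient way to store the result.

```python
import json

# Hypothetical complete dataset: every ZIP code appears, with or without sales.
complete_rows = [
    {"zip": "43004", "region": "Central"},
    {"zip": "43016", "region": "Central"},
    {"zip": "44102", "region": "North"},
]


def build_region_layer(complete_rows):
    """Build a ZIP-to-region lookup from the complete dataset."""
    return {row["zip"]: row["region"] for row in complete_rows}


def attach_regions(sparse_rows, layer):
    """Tag each row of a sparse dataset with its region from the saved layer."""
    return [dict(row, region=layer.get(row["zip"])) for row in sparse_rows]


# Save the layer once, then reload it later for a dataset with missing ZIP codes.
saved = json.dumps(build_region_layer(complete_rows), sort_keys=True)
layer = json.loads(saved)

sparse_rows = [{"zip": "44102", "sales": 200.0}]
tagged = attach_regions(sparse_rows, layer)
```

Because the layer comes from the complete list, the region definitions do not depend on which ZIP codes happen to have sales in any given dataset.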
Let’s try it in the real world!
We’ll consider the example of a chain of stores in Ohio. These stores are managed using regions based on the store locations. A simple map of these stores would look like this.
Our stores in the Buckeye State
Our store chain collects loyalty data for customers and analyzes data by ZIP codes to better understand store coverage. Here you can see a map that shows customers in the loyalty program as well as the ZIP codes to which those customers belong.
Stores, customers, and ZIP codes all together: the truth is in there… somewhere
Our business data also models the custom regions used by the store management. Since each store is associated with a specific region and stores also have a connection to ZIP codes, all we need to do is add the Region dimension to our map. Visual Crossing does the work of finding the relationship between collections of ZIP codes and the regions that contain them. It can then automatically aggregate those ZIP codes into entirely new shapes that represent our stores’ custom regions. The best part is that we didn’t need to know anything about cartography or GIS. All we needed to do was drop the business data onto the map, and Visual Crossing figured out exactly what to do.
Easy-to-understand, interactive regions
Where do we go from here?
One key thing to notice is that these regions are their own layer on the map and are available for any type of geographic analysis that you may wish to do. For example, you can do selections and comparisons of these custom regions just like you can with any other map entity. Maybe you want to compare the demographics of customers who live in different districts. Or if you want to see the resulting statistics when considering a merger of two custom regions, you can do so with the standard selection tools. There are countless options.
I hope that this brief introduction has helped to show you a way to make a very valuable but previously difficult task far more simple and automatic. I am very interested in hearing more about your own experiences relating to mapping custom geographic regions and other mapping topics. Please feel free to send me a private message or comment directly below. I look forward to hearing and learning from you!