Skip to Content

I’m running a petition on causes.com, and I wanted to do some geographic analysis of the results.  Causes.com gives me information about the signers with regard to their name, email address, country and zip code.  The country information was useful for doing the first phase of the analysis, which I wrote about in the Lumira space.  However, I wanted to do a bit more fine grained analysis, and that means I needed to convert that zip code information into something more directly useful (e.g., city, county and state for the United States).  There’s 1500 data points though, so I need an automated method.  Google Geogoding API to the rescue.

The Google Geocoding API is a REST services, so I created a REST client in PowerBuilder.Net using the following URL:

http://maps.google.com/maps/api/geocode/xml?components=country:USA|postal_code:{zip}&sensor=false

Where {zip} represents the zip code that I’ll be passing in as an argument.  Most REST services don’t have a WSDL you can use to create client data types from.  For the PowerBuilder.Net REST client perhaps the simplest way to get it to be able to determine what data types to create is to provide it with a sample response from the service.  What I provided was the following:

<?xml version="1.0" encoding="UTF-8"?>
<GeocodeResponse>
 <status>OK</status>
 <result>
  <type>postal_code</type>
  <formatted_address>Azusa, CA 91702, USA</formatted_address>
  <address_component>
   <long_name>91702</long_name>
   <short_name>91702</short_name>
   <type>postal_code</type>
  </address_component>
  <address_component>
   <long_name>Azusa</long_name>
   <short_name>Azusa</short_name>
   <type>locality</type>
   <type>political</type>
  </address_component>
  <address_component>
   <long_name>Los Angeles</long_name>
   <short_name>Los Angeles</short_name>
   <type>administrative_area_level_2</type>
   <type>political</type>
  </address_component>
  <address_component>
   <long_name>California</long_name>
   <short_name>CA</short_name>
   <type>administrative_area_level_1</type>
   <type>political</type>
  </address_component>
  <address_component>
   <long_name>United States</long_name>
   <short_name>US</short_name>
   <type>country</type>
   <type>political</type>
  </address_component>
  <geometry>
   <location>
    <lat>34.2667473</lat>
    <lng>-117.8545867</lng>
   </location>
   <location_type>APPROXIMATE</location_type>
   <viewport>
    <southwest>
     <lat>34.1031590</lat>
     <lng>-118.0513950</lng>
    </southwest>
    <northeast>
     <lat>34.3791980</lat>
     <lng>-117.6556881</lng>
    </northeast>
   </viewport>
   <bounds>
    <southwest>
     <lat>34.1031590</lat>
     <lng>-118.0513950</lng>
    </southwest>
    <northeast>
     <lat>34.3791980</lat>
     <lng>-117.6556881</lng>
    </northeast>
   </bounds>
  </geometry>
 </result>
</GeocodeResponse>

Note that the information I want comes in the “administrative_area_level_1” (State), “administrative_area_level_2” (County) and “locality” (City) address component types.  I’m also going to grab the approximate lattitude and longitude coordinates from the geometry portion of the response.

With the REST proxy created, I create a WPF application that has a datawindow that reads the CSV data that causes.com gives me and then loops through it adding the information from the Google Geocoding API.  The main code of interest is as follows.

long     ll_index, ll_count
string     ls_zip, ls_lat, ls_long, ls_country, ls_county, ls_city, ls_state
googlegeo_proxy     proxy
GeocodeResponse response
 ll_count = dw_1.RowCount()
 proxy = create googlegeo_proxy
 FOR ll_index = 1 to ll_count
     ls_country = dw_1.Object.country[ll_index]
     if ls_country <> "United States" THEN CONTINUE
     ls_zip = dw_1.Object.zip[ll_index]
     if Len ( ls_zip ) = 4 then
          ls_zip = '0' + ls_zip
     end if
     response = proxy.GetMessage ( ls_zip )
     if response.status = "OK" then
          ls_lat = String ( response.result.geometry.location.lat )
          ls_long = string ( response.result.geometry.location.lng )
           System.Collections.IEnumerator enum
          enum = response.result.address_component.GetEnumerator()
          GeocodeResponseResultAddress_component address
          do while enum.MoveNext()
               address = enum.Current
               Choose CASE address.@type[1]
                    CASE "administrative_area_level_1"
                         ls_state = address.long_name
                    CASE "administrative_area_level_2"
                         ls_county = address.long_name
                    CASE "administrative_area_level_3"
                         ls_city = address.long_name
                    CASE "locality"
                         ls_city = address.long_name
               END choose
          loop
          dw_1.Object.lattitude[ll_index] = ls_lat
          dw_1.Object.longitude[ll_index] = ls_long
          dw_1.Object.state[ll_index] = ls_state
          dw_1.Object.county[ll_index] = ls_county
          dw_1.Object.city[ll_index] = ls_city
     else
          dw_1.Object.lattitude[ll_index] = response.status
     end if
     // Wait 2 seconds or Google will put us into OVER_QUERY_LIMIT condition
     Sleep ( 2 )
NEXT

The zip codes were imported as numbers, so I’m prefixing them with 0 of they are only four digits long.  I’m using an enum to loop through the address_components because PowerBuilder.Net doesn’t offer a particularly easy way of requesting a specific element in the collection.  And referencing address.@type[1] is a bit of a hack, as there are a number of type attributes returned and I’m assuming the one I want is always the first one.  That code will break if they don’t always come back in the order I expect.

The last thing you might note is that I do a Sleep(2) between calls to the API.  Google places some restrictions on calls to their APIs to prevent abuse, often in terms of total number of calls or calls per day.  In the case of the Geocoding API, the restriction is that you can’t call it more often than once every 2 seconds.  The Sleep makes sure that I don’t exceed that limit and start getting overy query limit error response.

And here’s the results on a state by state basis for the United States.

/wp-content/uploads/2013/08/petitionsignersbystate_268312.png

Unfortuantely, Lumira was unable to recognize enough of the city and county names that Google provided that it make trying to do the analysis on any finer detail rather difficult.

To report this post you need to login first.

3 Comments

You must be Logged on to comment or reply to a post.

      1. Mark Libner

        Sounds like a business opportunity for someone. 😀

        Nice analysis nonetheless. Very interesting to see where the other PB’ers are. It’s a shame I couldn’t even get everyone in my shop to sign. We could have gotten to red. 🙁

        (0) 

Leave a Reply