If you’ve ever been to New York, you’ve surely noticed the blue-branded Citibikes around town. The program lets riders unlock bikes at hundreds of stations across New York and return them to other stations for a fee. When I’m in New York, I always seem to see them around Midtown, with a young man riding and fighting through rush-hour traffic. But as an analytics geek, I needed data to prove or dispel this impression.

 

Did You Know That These Bikes Are Data-Collecting IoT Machines?

While it shouldn’t be a surprise, the bike stations are internet-enabled data collection devices. They capture all sorts of data about each trip: trip duration, start and stop date/time, start and stop station, bike ID, and some descriptive information about the user. Furthermore, these bikes have been collecting data since mid-2013. But like most machine-collected data, this data has its fair share of challenges. The first challenge is that each month is a separate data set, so you need to append the months on top of each other to form a history. The second is that the column names periodically change, so they have to be remapped and relabeled. The third is that there are numerous codes with no descriptive information, so many of them need to be mapped to actual locations to make sense. The final challenge is data quality: subscribers can enter “fake” information about themselves, such as their gender and age. It took a few hours, but I was finally able to get a usable data set for my analytics.
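As a rough illustration of the first two challenges (appending monthly files and relabeling columns), here is a minimal Python sketch. It is not the workflow actually used for this post; the header spellings, the `COLUMN_ALIASES` mapping, and the sample rows are all made up for the example.

```python
import csv
import io

# Hypothetical header variants mapped to one canonical name.
# The real monthly files may use different spellings.
COLUMN_ALIASES = {
    "tripduration": "trip_duration",
    "trip duration": "trip_duration",
    "starttime": "start_time",
    "start time": "start_time",
    "bikeid": "bike_id",
    "bike id": "bike_id",
}


def normalize_header(name):
    """Lower-case a column name and map known variants to a canonical one."""
    key = name.strip().lower()
    return COLUMN_ALIASES.get(key, key.replace(" ", "_"))


def read_month(text):
    """Parse one month's CSV text into dicts keyed by canonical column names."""
    reader = csv.reader(io.StringIO(text))
    header = [normalize_header(h) for h in next(reader)]
    return [dict(zip(header, row)) for row in reader]


def combine_months(monthly_texts):
    """Append every month's rows into a single history."""
    history = []
    for text in monthly_texts:
        history.extend(read_month(text))
    return history


# Two made-up monthly extracts whose headers are spelled differently.
jan = "tripduration,starttime,bikeid\n600,2014-01-01 08:00:00,101\n"
feb = "Trip Duration,Start Time,Bike ID\n900,2014-02-01 09:30:00,102\n"
history = combine_months([jan, feb])
```

After this step every row shares the same keys, so the months can be analyzed as one table regardless of how each file labeled its columns.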

 

Let’s Take a Look At The Data

The data set goes from January 2014 to March 2018. Overall, we can see that New Yorkers have become quite fond of these bikes, and their popularity has grown each year.

 

Do People Ride In the Winter?

New Yorkers are quite tough and battle through the elements on their bikes. While fewer people bike in the winter, many brave folks soldier through.

 

What Stations Get Used The Most?

If we look at the map, we can see that the most popular pick-up and drop-off locations are in Midtown and Lower Manhattan. And if we zoom in further, we can see that the largest bubbles are around Grand Central Station.

 

 

Where Are They Going from These Stations?

If we filter on the top station (Pershing Square), we can see that riders typically aren’t biking very far (10-15 blocks away). This makes sense, since the average ride duration is around 15 minutes.
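The station filter above boils down to counting end stations for trips that start at one place. A minimal Python sketch of that idea follows; the `top_destinations` helper and the destination names ("Station A" and so on) are hypothetical placeholders, not values from the real data.

```python
from collections import Counter


def top_destinations(trips, start_station, n=3):
    """Return the n most common end stations for trips leaving start_station."""
    ends = Counter(end for start, end in trips if start == start_station)
    return ends.most_common(n)


# Made-up (start station, end station) pairs for illustration only.
sample = [
    ("Pershing Square", "Station A"),
    ("Pershing Square", "Station B"),
    ("Pershing Square", "Station A"),
    ("Station C", "Station A"),
]
top = top_destinations(sample, "Pershing Square")
```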

 

When Do People Ride?

Wednesday is the most common day to ride, but people ride longer on the weekends.
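A comparison like this needs two numbers per weekday: how many rides there were and how long they lasted on average. Here is a small Python sketch of that calculation, assuming each trip comes as a start timestamp plus a duration in seconds; the timestamp format and the sample trips are assumptions for illustration.

```python
from collections import defaultdict
from datetime import datetime

WEEKDAYS = ["Monday", "Tuesday", "Wednesday", "Thursday",
            "Friday", "Saturday", "Sunday"]


def weekday_summary(trips):
    """Return {weekday name: (ride count, average duration in minutes)}."""
    totals = defaultdict(lambda: [0, 0.0])  # weekday -> [count, total minutes]
    for start_time, duration_seconds in trips:
        started = datetime.strptime(start_time, "%Y-%m-%d %H:%M:%S")
        day = WEEKDAYS[started.weekday()]  # weekday() returns 0 for Monday
        totals[day][0] += 1
        totals[day][1] += duration_seconds / 60
    return {day: (count, minutes / count)
            for day, (count, minutes) in totals.items()}


# Made-up trips: two on a Wednesday, one on a Saturday.
sample = [
    ("2018-03-07 08:05:00", 600),
    ("2018-03-07 17:40:00", 840),
    ("2018-03-10 11:00:00", 1800),
]
summary = weekday_summary(sample)
```

Keeping count and average side by side is what lets the busiest day and the day with the longest rides turn out to be different days.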

What Types of People Ride?

While the demographic data on riders is very limited, it does capture age and gender. The majority of riders are millennial men.
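Grouping riders into generations means turning each birth year into a label. The Python sketch below shows one way to do that. The birth-year cutoffs are an assumption (one commonly cited set of ranges), since the post doesn’t say which cutoffs were used, and the sample riders are made up.

```python
from collections import Counter

# Assumed cutoffs: the last birth year belonging to each generation.
GENERATION_CUTOFFS = [
    (1945, "Traditionalist"),
    (1964, "Baby Boomer"),
    (1980, "Generation X"),
    (1996, "Millennial"),
]


def generation(birth_year):
    """Map a birth year to a generation label; missing years are 'Unknown'."""
    if birth_year is None:
        return "Unknown"
    for last_year, label in GENERATION_CUTOFFS:
        if birth_year <= last_year:
            return label
    return "Generation Z"


# Made-up (birth year, gender) pairs for illustration only.
riders = [(1985, "male"), (1990, "male"), (1958, "female"), (None, "unknown")]
counts = Counter((generation(year), gender) for year, gender in riders)
```

Because riders can enter whatever birth year they like, a real pipeline would probably also want to flag implausible years rather than trust them.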

When Do These Different Generations Ride and Why?

Here’s where things get a bit more interesting. In the visual on the left, we can see that millennials use these bikes the most, but primarily during the week. In the visual on the right, we can see that these millennials are riding during afternoon and evening hours (likely on the way back home from work).

However, if we pivot this and look at the average ride time (instead of the number of rides), we can see a different trend. In the visual on the left, we can see that while traditionalists (e.g. seniors) and baby boomers ride much less, they tend to take much longer rides, and these rides are primarily in the afternoon and at night.

 

What’s the Profile of These Bike Riders?

Based on the answers above, it’s very easy to paint a picture of who these riders are. The typical Citibike rider:

* Is a young man who lives and works around the city.

* Takes short-haul rides between the train station and work.

* Prefers to ride home, when he has more time, maybe more energy, or wants to burn off a few calories.

 

What Does This All Mean?

The power of analytics is that it gives us the whole story behind the data, and it can help us validate our thinking or gain new insights into our business.

How People Use Your Product Tells You More Than Who Uses It. Citibike riders provide very little information about themselves, but their usage tells you a lot. The fact that the number of riders has grown dramatically and that people are using these bikes to ride to/from work tells us that they value the cost and convenience of these bikes.

Citibike Is Targeting the Right Customers. Their tagline, “Faster than walking, cheaper than a taxi, and more fun than the subway,” paired with pictures of young adults, targets the exact demographic that’s using their bikes. The only exception is that it seems to be mostly young men who are riding.

People Need Analytics. Not Data. While New York’s Citibike does an excellent job providing timely, up-to-date “raw data”, it provides very little analytics and insight into this data, like what we’re seeing above. This type of analytics lets us see the whole story in the data.


2 Comments


  1. Nabheet Madan

    Thanks Jason for the interesting insights. The sky is the limit with the power of IoT, data, and analytics. One small question: are they also using IoT to predict bike maintenance, etc.? What else are they using IoT for?

    Thanks

    Nabheet

    1. Jason Yeung Post author

      Hi Nabheet, great question, and that would be an awesome use case. While the dataset doesn’t include maintenance data, it does include a bike ID. You could identify which bikes are more prone to needing maintenance based on their distance, usage, and types of rides. For example, I would think that riding the streets of NYC, with tons of potholes and lots of braking, is much harder on a bike than a weekend ride. Another good use case would be predicting inventory shortages. A lot of bikes are used for short hauls around the city and around the train stations. My guess is that NYC just stockpiles bikes around there, whereas they could use predictive capabilities to estimate the number of bikes required at different stations at different times. Again, another super cool use case. It just shows the unlimited number of things that analytics can solve.
