While looking at Ontario Power Generation’s official web site, I noticed this number in the bottom right corner of the page:
It shows the amount of power being generated as well as the date and time of the last update. I refreshed a few times and realized that updates occur every five minutes. Curious, I thought I’d whip up a quick module to scrape this information from the web site and produce some nice graphs with RRDTool. I used the open source RRDTool::OO module to do this, which is freely available on CPAN.
Recognizing that web scraping is not the most reliable means of getting data from a web site, I contacted OPG via e-mail and requested an API for this data. The latest iteration of WWW::OPG (version 1.004, already on CPAN) now reads a smaller machine-readable text file that provides the same data. Thanks to someone I know only as “Rose” from OPG for providing this file, which is much easier to parse and less likely to change than the web page.
As OPG supplies roughly 70% of Ontario’s electric power demand, these statistics provide a relatively good reflection of our behaviour patterns over time. In the course of this project, I learned how to work with Round Robin Databases (and wrote an article about it) and was able to observe some interesting trends even in the first week of operation:
The graph begins Saturday, December 26th, 2009 (Boxing Day) and continues through the week approaching the new year 2010. These particular trends are interesting because, while two observable peaks occur each day, the overall power consumption (including 95th percentile consumption) seems much lower than usual.
By comparison, consider this graph of the week ending 14 January 2010 (there were some rather long-lasting outages in the data collection which I’m trying to track down, but it still gives a sense of the general trends):
In this case, the 95th percentile consumption is much higher, at about 14 GW rather than 10 GW. Note that the 95th percentile gives a rather good approximation of an infrastructure’s utilization rate, since it works by indicating peak power after removing the highest 5% of data points. This means that 95% of the time, power consumption was at or below the given line.
The percentile is more useful than the average because it indicates the minimum capacity needed to satisfy demand 95% of the time, which gives us a simple way to determine whether more infrastructure is needed.
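To make the percentile idea concrete, here is a minimal sketch in Python of the calculation described above, using the nearest-rank method: sort the samples, discard the highest 5%, and take the largest value that remains. This is not how RRDTool computes it internally; the function name `percentile_nearest_rank` and the sample readings are made up purely for illustration.

```python
import math
from statistics import mean


def percentile_nearest_rank(samples, pct):
    """Return the smallest sample such that at least pct% of samples are <= it."""
    if not samples:
        raise ValueError("samples must not be empty")
    if not 0 < pct <= 100:
        raise ValueError("pct must be in the range (0, 100]")
    ordered = sorted(samples)
    # 1-based rank of the sample that sits at the requested percentile
    rank = math.ceil(pct * len(ordered) / 100)
    return ordered[rank - 1]


# Hypothetical five-minute readings in MW (not real OPG data):
# a steady ramp from 9000 MW to 13500 MW, plus one short spike.
readings_mw = [9000 + 250 * i for i in range(19)] + [16000]

p95 = percentile_nearest_rank(readings_mw, 95)
average = mean(readings_mw)
peak = max(readings_mw)

print(f"mean: {average} MW, 95th percentile: {p95} MW, peak: {peak} MW")
```

With these invented readings, the single spike is among the top 5% and is ignored by the 95th percentile, while the mean sits well below the level that demand actually reaches for much of the period.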
Electric power utilities are a special case, however. Because electricity is so important for both industrial and commercial use, legal requirements stipulate that demand must always be supplied, barring exceptional circumstances such as failures of distribution transformers. For utilities, then, maximum power consumption is a more useful measure for infrastructure planning.