Introduction and crowdsourcing weather data
Shall we talk about the weather? The eternal conversation starter is fast becoming a classic test-bed for the spread of big data, the Internet of Things and cloud storage. Data thus far mostly comes from dedicated weather stations, observation systems at airports, meteorological satellites, ocean buoys, ships, aircraft and – most recently – personal weather stations in people’s homes that are connected to the internet.
If the latter are on the rise, so are sensor-packed smartphones, which are not only creating more demand for hyper-local weather data, but could themselves help meet that demand.
What role does the IoT play in weather prediction?
The spread of sensor-packed IoT devices is giving the weather industry more real-time, hyper-local data that could be used – and already is being used – to develop more sophisticated weather prediction models. “More and more devices such as smartphones, thermostats or even sensors in vehicles are providing real-time data about temperature, humidity, environmental conditions and other variables,” says Hendrik Hamann, manager of the Physical Analytics Group at IBM Research. “This data is the source for improving weather prediction systems.”
The IoT clearly offers new ways in which to collect observations, and more cheaply, too. “Existing observation networks are limited in number but highly calibrated, because they’re expensive to run and operate,” says Charles Ewen, chief information officer at the Met Office. “However, the IoT offers a high density of observations of an unknown quality.”
Is crowdsourced weather data reliable?
The forecasting models are incredibly reliable, but crowdsourcing the raw data – often neither validated nor verified – comes with obvious risks in terms of accuracy. “We need to build systems which ensure the integrity of the data they feed in is valid, not just assume that more data will automatically make models more accurate,” says Nick Clarke, Head of Analytics at data analytics company Tessella. “But ensuring the data which feeds into forecasting models is useful requires careful use of advanced analytics techniques.”
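To make the validation problem concrete, here is a minimal sketch of one common sanity check: comparing each station's reading with those of its neighbours and flagging the ones that stray too far. It is not Tessella's method or anyone's production system; the function name, the input format and the threshold are all assumptions chosen for illustration.

```python
from statistics import median


def flag_suspect_readings(readings, threshold=3.5):
    """Return the ids of stations whose reading is far from the group median.

    readings: dict mapping station id -> temperature in Celsius
              (a hypothetical format, assumed for this sketch).
    threshold: cut-off for the robust score (an assumed, tunable value).
    """
    values = list(readings.values())
    if len(values) < 3:
        return set()  # too few neighbours to judge anything

    centre = median(values)
    # Median absolute deviation: a spread measure that a few bad values cannot skew.
    mad = median(abs(v - centre) for v in values)

    if mad == 0:
        # At least half the readings sit exactly on the median; flag anything that differs.
        return {sid for sid, v in readings.items() if v != centre}

    # 0.6745 rescales the MAD so the score behaves like a z-score for normal data.
    return {
        sid
        for sid, v in readings.items()
        if 0.6745 * abs(v - centre) / mad > threshold
    }


if __name__ == "__main__":
    nearby = {"a": 14.8, "b": 15.1, "c": 15.0, "d": 14.9, "e": 31.0}
    print(flag_suspect_readings(nearby))
```

A check like this only catches readings that disagree with their neighbours; a real system would also have to account for genuine local variation, such as a sensor sitting in direct sunlight or a sheltered valley.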
Introducing the IoT into weather data collection relies on all of us chipping in. “People like to monitor their environment and their performances,” says Christelle Manach, Marketing Product Manager for the Weather Station from Paris-based IoT startup Netatmo, a device that measures temperature, humidity, barometric pressure, CO2, sound and rainfall, and pairs with smartphones.
“They’re willing to be surrounded by smart devices if it helps them … it’s specifically important to measure air quality as the indoor air pollution has dangerous effects on health conditions including asthma, allergies and respiratory problems.”
Netatmo, whose data is displayed on the Netatmo WeatherMap, is partnering with Weather Underground to add its data points to the Weather Underground Personal Weather Station (PWS) Network. The aim? Accurate forecasts for each specific data point location.
Weather prediction and deep machine learning
Predicting the weather
If Netatmo is trying to provide micro-weather data on what’s currently happening, the Met Office is most concerned with forecasting – and that means macro-data. “The role of the Met Office is to complement data affecting people’s everyday life, enlarging its possibilities to dimensions where hyper-local information is not enough, such as weather predictions,” says Ewen.
However, the Met Office is also getting involved in crowdsourcing data from the same ‘amateur observing community’ that Netatmo is encouraging, most recently through its Weather Observation Website (WOW) campaign, which is trying to get families and children to input data on what the weather is like where they live. “These snippets of information help to build up a real-time and realistic picture of the current landscape, but also help us to build a more precise forecasting for the coming days,” says Ewen.
With a dataset that changes every 15 minutes or less, weather is a big-numbers game, and computational capacity is rising exponentially as data volumes and data variety increase, too. “Traditionally, the weather industry used to take data to the problems,” says Ewen. “We’re looking at technology that brings small data and algorithms to all the big data.”
Cloud on clouds
Predicting the weather has always been a lot about cloud, but it’s now also about the cloud. Once a cable TV company, The Weather Channel – recently bought by IBM to power its forthcoming Watson AI apps – gradually built up its infrastructure to provide local weather for 2.4 billion locations worldwide, using 13 data centres and generating four terabytes of data every hour.
Ironically, given its purchase by IBM, it has recently redesigned its entire big data platform on Amazon Web Services (AWS). “The Weather Channel is a case study in the removal of constraints of working in the cloud,” says Dr. Matt Wood, Product Strategy at AWS. “It was able to take data from its instruments to predict the weather for three million points, but since it’s moved to AWS it’s increased the resolution to three billion.”
By using the cloud, The Weather Channel now processes a prediction in milliseconds, and does so every 15 minutes rather than once per hour. “It runs 20,000 cores on AWS and has several 100 TBs of shared memory across the cluster,” says Wood, and it needs to. If you ask Siri or Google what the weather is, the answer comes from The Weather Channel. “That’s about 700 million people who depend on it,” says Wood.
Deep machine learning
There are two different approaches to improving forecasts: high-resolution models and deep machine learning. The first relies on new, more detailed satellite mapping to capture local conditions more accurately.
The second involves “cognitive technologies such as deep machine learning to understand which weather model was more accurate when, where and under what weather situation, using historical weather forecasts and historical weather data,” says Hamann, who is working on exactly this with the Department of Energy and the University of Michigan. “That learning is then applied for future forecasts … both tech approaches often coexist.”
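The idea of learning which model to trust in which situation can be illustrated with a far simpler stand-in than deep learning. The sketch below is not IBM's technique: it swaps in plain inverse-error weighting, and the regime labels, model names and data layout are all hypothetical. It scores each model's past forecasts against what was actually observed, per weather regime, then blends new forecasts so that the historically better model counts for more.

```python
from collections import defaultdict


def learn_weights(history):
    """Learn per-regime model weights from past forecasts.

    history: iterable of (regime, model_name, forecast, observed) tuples
             (a hypothetical layout, assumed for this sketch).
    Returns {regime: {model_name: weight}}, weights summing to 1 per regime.
    """
    errors = defaultdict(lambda: defaultdict(list))
    for regime, model, forecast, observed in history:
        errors[regime][model].append(abs(forecast - observed))

    weights = {}
    for regime, per_model in errors.items():
        # Inverse mean absolute error; the epsilon guards against division by zero.
        inverse = {
            model: 1.0 / (sum(errs) / len(errs) + 1e-6)
            for model, errs in per_model.items()
        }
        total = sum(inverse.values())
        weights[regime] = {model: w / total for model, w in inverse.items()}
    return weights


def blend(weights, regime, forecasts):
    """Combine current forecasts {model_name: value} using the regime's weights."""
    regime_weights = weights[regime]
    return sum(regime_weights[model] * value for model, value in forecasts.items())


if __name__ == "__main__":
    past = [
        ("frontal", "A", 10.0, 12.0),  # model A missed by 2 degrees
        ("frontal", "B", 11.5, 12.0),  # model B missed by 0.5 degrees
        ("calm", "A", 20.0, 20.5),
        ("calm", "B", 18.0, 20.5),
    ]
    learned = learn_weights(past)
    print(learned["frontal"])
    print(blend(learned, "frontal", {"A": 10.0, "B": 20.0}))
```

In the toy history above, model B was the more accurate of the two under frontal conditions, so the blended frontal forecast leans towards B; under calm conditions the weighting tips the other way.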
The second approach involves a lot of supercomputer power being thrown at weather data in the name of hyper-local personalisation. “Our observations are combined with unmatched supercomputing power, meaning we can deliver forecasts with more accuracy and personalisation,” says Ewen of the Met Office’s supercomputer. “This new supercomputer will deliver some £30 billion (around $45 billion, or AU$64 billion) worth of benefits in a five-year period.”
When it comes to bleeding-edge new tech, it never rains but it pours in the world of weather forecasting.