Statistical technique for automatically cleaning erroneous data from weather-balloon observations

Weather balloons carrying disposable radiosondes are released twice a day at 700 locations around the world to make observations of the upper atmosphere. Credit: Alamy 

Twice a day, weather balloons are released into the atmosphere from 700 locations around the world to observe conditions in the upper atmosphere. Since the 1920s, there have been tens of millions of these radiosonde launches, producing an enormous archive of data that is critical to weather forecasting and climate modeling. In such a large data set, inevitable errors can significantly affect modeling outcomes.

Ying Sun, Assistant Professor of Applied Mathematics and Computational Science at Saudi Arabia's King Abdullah University of Science and Technology (KAUST), collaborated with researchers from the Colorado School of Mines and Baylor University in the US to develop a method that removes these errors using a robust statistical analysis of the data.

"A radiosonde is a small, expendable instrument package that is suspended below a two-meter-wide balloon filled with hydrogen or helium," explained Sun. "Sensors on the radiosonde measure height, pressure, temperature and dew point; they also calculate wind speed and direction by tracking the position of the radiosonde in flight. Radiosonde observations are the only direct measurements of the Earth's upper atmosphere, making them vital for satellite data, weather forecasting and climatology research."

The errors in the data are "far too many to correct by hand, so we need an automatic method for identifying such random errors," explained Sun.

There are automatic methods for removing systematic errors from the data, such as those introduced by changes in location or measurement units. However, there has been no way to remove genuinely erroneous data, such as data-entry mistakes, transmission errors or imprecise tracking of the balloon, without also deleting extreme but real measurements, which are some of the most important data for forecasting. Looking specifically at wind data, Sun and her co-workers developed a statistical approach that robustly distinguishes extreme values from random errors.

"Our approach considers a more realistic distribution of the wind vector that is skewed with a long tail of rare extreme values," said Sun. "This makes it possible to flag observations that are very likely to be erroneous as potential outliers without removing extreme values."
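The general idea of allowing for skewness can be illustrated with a small Python sketch. This is not the authors' method, which works on the two-dimensional wind vector; it is a simplified one-dimensional stand-in that applies a "double MAD" rule (a separate median absolute deviation on each side of the median) to made-up wind speeds. The sample values, the function names and the cutoff `k=3.0` are all assumptions chosen for illustration.

```python
from statistics import median

# Scaling factor that makes the MAD comparable to a standard deviation
# when the data are normally distributed.
SCALE = 1.4826


def flag_symmetric(values, k=3.0):
    """Flag values more than k scaled MADs from the median (same cutoff on both sides)."""
    med = median(values)
    mad = SCALE * median([abs(v - med) for v in values])
    return [v for v in values if abs(v - med) > k * mad]


def flag_double_mad(values, k=3.0):
    """Flag values using a separate MAD below and above the median.

    A long upper tail widens only the upper cutoff, so rare but
    plausible high values are less likely to be flagged.
    """
    med = median(values)
    lower_mad = SCALE * median([med - v for v in values if v <= med])
    upper_mad = SCALE * median([v - med for v in values if v >= med])
    flagged = []
    for v in values:
        spread = lower_mad if v < med else upper_mad
        if abs(v - med) > k * spread:
            flagged.append(v)
    return flagged


# Hypothetical wind speeds in m/s: a right-skewed sample with a few strong
# but plausible winds (18, 23) and one gross error (250).
speeds = [2, 3, 3, 4, 4, 4, 5, 5, 5, 5, 6, 6, 7, 8, 9, 11, 14, 18, 23, 250]

print(flag_symmetric(speeds))   # [18, 23, 250]
print(flag_double_mad(speeds))  # [250]
```

On this toy sample the symmetric rule discards the two strongest real winds along with the gross error, while the skew-aware rule flags only the gross error. A real application would need a model of both wind components together, as the quoted description indicates.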

In addition to its application to new daily data, this error-detection scheme can also be used on the huge volumes of radiosonde observations held in archives around the world.

"We are developing an outlier-detection method that is fast and automatic. We will be able to use this method to quickly process the millions of records in the archive," said Sun. "We are also considering the possible effect of climatic change when developing the new method."

More information: Ying Sun et al. Robust bivariate error detection in skewed data with application to historical radiosonde winds, Environmetrics (2017). DOI: 10.1002/env.2431
Citation: Statistical technique for automatically cleaning erroneous data from weather-balloon observations (2017, February 28) retrieved 19 September 2019 from

User comments

Mar 08, 2017
In the mid and late 1960s I was an enlisted radiosonde operator for the US Air Force Weather Service. My first assignment was to the 6th Mobile Weather Detachment at Tinker Air Force Base, Oklahoma. The observations in that day were made by a tracking "radar" which provided azimuth and range information. The radiosonde transmitted temperature, dew point, and pressure. From this data, altitude, winds, etc. were calculated by hand on two large (2' x 4') ruled plotting boards resembling fine graph paper. I worked in the "checking" division, manually checking math calculations and upper air coded observation transmissions. Each radiosonde was "base-line" checked before flight to eliminate a bad unit. Human errors were generally due to mis-coding correct data. Corrections sent by the observers were sometimes also wrong. Other errors occurred, such as a balloon breaking prematurely or an instrument failing during flight. Since then the calculations have been computerized.
