Temperature is a universal topic in daily conversation. It can be described simply as hot, cold, or in between. It can be given as a range, as in a weather forecast. It can be as precise as a body temperature reading. Because temperature is used so widely and measuring devices are so affordable, a temperature reading is usually trusted without much scrutiny of calibration or comparison against a reference. A datasheet or an explicit statement by the manufacturer is often taken as sufficient evidence of accuracy.
Previously, I built a low-cost weather unit that measures basic inputs including temperature, relative humidity, and light intensity. The unit costs under $20, with some sensors installed in duplicate. The data from this unit is acquired and stored locally on a home server. The unit is located in Hanoi, Vietnam, at 21.02°N, 105.83°E.
Public sources are also available. Nearby weather stations in Hanoi, such as Lang station, are assumed to have the best setup for measuring air temperature. Open APIs such as Darksky.net and openweathermap.org offer great resources for weather forecasting. Darksky.net, for example, lists several data sources from large, well-known models. Air quality stations, such as the one at UNIS school, display temperature and relative humidity values that appear to be forecast data rather than measurements, which needs a closer look.
A third source is a "reanalysis" product. I don't know the behind-the-scenes details of reanalysis, and I assume its products are the best available data for representing large-scale, comprehensive data sets. For this post, I used MERRA-2, published by NASA, as the reference data.
One goal of this post is to take a close look at these data sources using analysis packages such as pandas with Python, as a showcase for combining open, free tools with available data. In addition, knowing roughly how close the measurements of a low-cost unit are to those of an official station and to the different numerical products helps in assessing the accuracy of each source.
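To make the idea of comparing sources concrete, here is a minimal sketch of how two temperature series could be aligned and compared with pandas. It is not the actual analysis from this post: the function name `compare_to_reference`, the 10-minute and hourly sampling intervals, and the synthetic values are all assumptions chosen for illustration. The sketch averages both series over the same time bins, keeps only the bins where both have data, and reports the mean difference (bias), mean absolute error, and root-mean-square error.

```python
import numpy as np
import pandas as pd


def compare_to_reference(sensor, reference, freq="60min"):
    """Resample both series to a common interval and summarize their difference.

    `sensor` and `reference` are pandas Series of temperatures in deg C,
    each indexed by timestamp. The name and signature are hypothetical.
    """
    # Average each series over the same time bins so the timestamps line up.
    sensor_avg = sensor.resample(freq).mean()
    reference_avg = reference.resample(freq).mean()

    # Keep only the bins where both sources have a value.
    paired = pd.concat(
        {"sensor": sensor_avg, "reference": reference_avg}, axis=1
    ).dropna()

    diff = paired["sensor"] - paired["reference"]
    return {
        "n": int(len(paired)),
        "bias": float(diff.mean()),
        "mae": float(diff.abs().mean()),
        "rmse": float(np.sqrt((diff ** 2).mean())),
    }


# Synthetic data for illustration only (not real measurements).
# Reference: hourly values following a daily cycle over two days.
hours = pd.date_range("2020-01-01", periods=48, freq="60min")
ref_values = 20 + 5 * np.sin(2 * np.pi * np.arange(48) / 24)
reference = pd.Series(ref_values, index=hours)

# Sensor: 10-minute readings that run 0.5 deg C warmer than the reference.
ten_minutes = pd.date_range("2020-01-01", periods=288, freq="10min")
sensor = pd.Series(np.repeat(ref_values, 6) + 0.5, index=ten_minutes)

stats = compare_to_reference(sensor, reference)
print(stats)
```

With real data, the same function could be applied once per source (station, API, reanalysis) against a single chosen reference, provided all series share one time zone before resampling.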