Comparing temperature sensors (Dallas DS18B20, Microchip MCP9808, PT-100 Resistance Temperature Detector (RTD), Sensirion SHT3x and SHT21).
Cleaning and reformatting complex data for pandas. The data was stored in a text file as a list of dictionaries, where the value under each sensor key is itself a dictionary.
Using seaborn to display the data with its hue feature, which colors the points by category.
Processing records from an experiment with low-cost temperature probes
{'ds18_1': {'temp': 25.0625},
'ds18_2': {'temp': 25.4375},
'epoch': 1619566926,
'hostname': 'temperature__6faf24',
'mcp_1': {'temp': -11},
'restart': False,
'rtd_1': {'rtd': 8255, 'temp': 21.40634},
'rtd_2': {'rtd': 8449, 'temp': 27.96803},
'rtd_3': {'rtd': 8432, 'temp': 27.39254},
'sensor': 'temperature',
'sht21_1': {'humid': 118.9924, 'temp': 23.58172},
'sht3x_1': {'humid': 98.06, 'temp': 24.1},
'sht3x_2': {'humid': 96.53, 'temp': 24.29},
'time': '2021-04-27 23:42:11'}
import matplotlib
%matplotlib inline
import matplotlib.pyplot as plt
import datetime
import pandas as pd
import numpy as np
import seaborn as sns
# plt.rcParams['axes.titlesize']= 'large'
# plt.rcParams['axes.titleweight'] = 'bold'
# plt.rcParams['font.size']=13
# plt.rcParams['font.sans-serif'] = 'Open Sans'
# # plt.rcParams['font.family'] = 'sans-serif'
# plt.rcParams['text.color'] = '#4c4c4c'
# plt.rcParams['axes.labelcolor']= '#4c4c4c'
# plt.rcParams['xtick.color'] = '#4c4c4c'
# plt.rcParams['ytick.color'] = '#4c4c4c'
import json
plt.style.use('default')
plt.rcParams["figure.figsize"] = (8,6)
from IPython.display import HTML
import base64
def convert_img_base64(fpath=None):
    '''Read an image file and return it as an inline HTML image for Jupyter.'''
    with open(fpath, "rb") as image_file:
        encoded_string = base64.b64encode(image_file.read()).decode('utf-8')
    # embed the image bytes as a data URI so no separate file is needed
    img = f'data:image/png;base64,{encoded_string}'
    return HTML(f"<img src='{img}'>")
Here are photos of the experiment. I wanted to compare several temperature probes, and placed ice inside the box to lower the temperature.
The temperature probes are: Dallas DS18B20, Microchip MCP9808, PT-100 RTD, and Sensirion SHT3x and SHT21.
An ESP8266 MCU read the data from the probes. Each set of readings was packed into a dictionary and sent to an MQTT server.
The breadboards, probes, and battery were placed inside a 2 L plastic box with a pack of ice, insulated with a layer of cotton shirt and then a thick blanket.
convert_img_base64('rtd1.jpg')
convert_img_base64('rtd2.jpg')
# file = 'http://localhost:8000/media/files/shared/temperature.txt'
file = 'https://b-io.info/media/files/shared/temperature.txt'
import requests
req = requests.get(file)
req.status_code
lines = req.text.split('\n')
# with open(file) as f:
# lines = f.readlines()
len(lines)
lines[0]
lines[-2]
from pprint import pprint
pprint(json.loads(lines[-2]))
# we will convert a list of records into a dataframe
def clean_data_sensor(data, debug=False):
    '''Parse a list of JSON lines and pack the records into a dataframe.'''
    cdata = list()
    for line in data:
        # parse each line on its own so one bad line does not stop the loop
        try:
            cdata.append(json.loads(line))
        except json.JSONDecodeError as e:
            print(e, line)
    df = pd.DataFrame(data=cdata)
    if debug:
        # print the first record that has the largest number of keys
        max_keys = max(len(rec.keys()) for rec in cdata)
        for rec in cdata:
            if len(rec.keys()) == max_keys:
                print(rec.keys())
                print(rec)
                break
        df.info()
    return df
df = clean_data_sensor(lines)
df.head(3)
df.info()
# we can drop the columns that are the same for every record,
# and the rows where epoch is NaN
df.drop(columns=['sensor', 'hostname', 'time', 'restart'], inplace=True)
df.dropna(subset=['epoch'], inplace=True)
df.head()
# column names of the sensors (everything except epoch)
cols = [c for c in df.columns if c != 'epoch']
cols
# use epoch as timestamp
df.epoch = df.epoch.astype(int)
df.set_index('epoch', inplace=True)
df.head()
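Epoch seconds are compact but hard to read. The following is a minimal sketch of converting them to datetimes with `pd.to_datetime(..., unit='s')`, assuming the epoch values are seconds in UTC. The frame `toy` is a hypothetical stand-in for `df`; the first epoch comes from the sample record and the second is made up for illustration.

```python
import pandas as pd

# Toy frame standing in for df. The first epoch is from the sample record,
# the second epoch and both values are illustrative only.
toy = pd.DataFrame(
    {"value": [21.40634, 21.5]},
    index=pd.Index([1619566926, 1619566986], name="epoch"),
)

# Interpret the integer index as seconds since 1970-01-01 (assumed UTC).
toy["time"] = pd.to_datetime(toy.index, unit="s")
print(toy)
```

Keeping the integer epoch as the index and adding the datetime as a separate column leaves the rest of the notebook unchanged.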
The column names are the sensor types: rtd (Resistance Temperature Detector), SHT from Sensirion, MCP from Microchip, and DS18B20 from Dallas.
Each row is a timestamp in epoch time (number of seconds since 1970), and each cell holds a dictionary with as many key:value pairs as the sensor needs.
The goal is a table where every cell holds a single value (numeric or categorical) and every column holds one named variable.
In essence, we turn a wide table into a long table.
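To make the wide-to-long idea concrete, here is a small sketch on a toy table. The frame `wide` and its values are hypothetical, and the expansion step uses `pd.DataFrame` on a list of dictionaries, which is only one of several ways to unpack such cells. It assumes every cell is a dictionary (no missing readings).

```python
import pandas as pd

# Toy wide table: one row per timestamp, one column per sensor,
# and a dictionary in each cell. Values are illustrative only.
wide = pd.DataFrame(
    {"rtd_1": [{"rtd": 8255, "temp": 21.4}, {"rtd": 8260, "temp": 21.5}]},
    index=pd.Index([1619566926, 1619566986], name="epoch"),
)

# Expand the dictionaries into one column per key (assumes every cell is a dict).
expanded = pd.DataFrame(wide["rtd_1"].tolist(), index=wide.index)

# Move the key columns into rows: one row per (parameter, epoch) pair.
long = expanded.unstack().reset_index()
long.columns = ["parameter", "epoch", "value"]
long["sensor"] = "rtd_1"
print(long)
```

The two-row, one-column wide table becomes four rows, each holding a single value together with the names that describe it.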
Let's try unpacking one column into a new dataframe, where the columns are the parameters and the cells are the values
df['rtd_1']
# unpack dictionary
df['rtd_1'].apply(pd.Series)
# pivot table, turn columns to rows
df['rtd_1'].apply(pd.Series).unstack().reset_index()
# now we can write a function to process the other columns
def rearrange_col(df, col):
    '''Unpack one sensor column into long format: parameter, epoch, value, sensor.'''
    dft = df[col].apply(pd.Series).unstack().reset_index()
    dft.columns = ['parameter', 'epoch', 'value']
    dft['sensor'] = col
    return dft
ls_df = list()
for col in cols:
    print(col)
    dft = rearrange_col(df, col)
    ls_df.append(dft)
# concatenate the list of dataframes into one df
dfs = pd.concat(ls_df)
dfs.head()
# the sensor column now lists every sensor
dfs['sensor'].unique()
# the parameter column holds the keys of the dictionaries
dfs['parameter'].unique()
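As an alternative sketch, and not the approach used in this notebook, the same long table could probably be built straight from the parsed records with `pd.json_normalize` followed by `melt`. The list `records` is hypothetical: the first entry reuses values from the sample record, the second is made up. The sketch assumes sensor names contain no dots, since the dot is used as the separator.

```python
import pandas as pd

# Two toy records in the same shape as the logged data.
# The first reuses values from the sample record; the second is illustrative.
records = [
    {"epoch": 1619566926,
     "rtd_1": {"rtd": 8255, "temp": 21.40634},
     "sht3x_1": {"humid": 98.06, "temp": 24.1}},
    {"epoch": 1619566986,
     "rtd_1": {"rtd": 8260, "temp": 21.5},
     "sht3x_1": {"humid": 98.0, "temp": 24.0}},
]

# Flatten nested dictionaries into columns such as "rtd_1.temp".
flat = pd.json_normalize(records)

# Turn every sensor column into rows keyed by epoch.
long = flat.melt(id_vars="epoch", var_name="key", value_name="value")

# Split "rtd_1.temp" into the sensor name and the parameter name.
parts = long["key"].str.rsplit(".", n=1, expand=True)
long["sensor"] = parts[0]
long["parameter"] = parts[1]
long = long.drop(columns="key")
print(long)
```

This avoids the per-column loop, at the cost of relying on the flattened column names to recover the sensor and parameter.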
# now we use seaborn for plotting, with hue to color the points by sensor
sns.scatterplot(data=dfs, x='epoch', y='value', hue='sensor')
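The long table keeps temperature, humidity, and raw RTD readings in the same `value` column, so a plot of all rows mixes very different scales. A minimal sketch of selecting one parameter before plotting is shown here; `select_parameter` is a hypothetical helper, and `toy` is a small stand-in for `dfs` built from values in the sample record.

```python
import pandas as pd

# Toy long table with the same columns as dfs, using values from the sample record.
toy = pd.DataFrame({
    "parameter": ["rtd", "temp", "humid", "temp"],
    "epoch": [1619566926, 1619566926, 1619566926, 1619566926],
    "value": [8255, 21.40634, 98.06, 24.1],
    "sensor": ["rtd_1", "rtd_1", "sht3x_1", "sht3x_1"],
})

def select_parameter(long_df, parameter="temp"):
    """Return only the rows of one parameter, with a datetime column added."""
    out = long_df[long_df["parameter"] == parameter].copy()
    # Epoch values are assumed to be seconds in UTC.
    out["time"] = pd.to_datetime(out["epoch"], unit="s")
    return out

temps = select_parameter(toy)
print(temps)
```

With the real data, something like `sns.scatterplot(data=select_parameter(dfs), x='time', y='value', hue='sensor')` should then draw one color per sensor on a common temperature scale.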