Separating traces when two monitors are in the same venue

Our monitors are very dumb: if two sensor units are switched on within radio range, a monitor can’t tell their readings apart. We will fix this for the next project!

If you want to use several monitors in the same building and can retrieve the data by smartphone or laptop every week or so, it’s best to make them standalone. There’s a hidden configuration option for this - please ask while we get that documented.

This page is just a convenient way of exploring data so we know how to separate it.

Venue 10 has already borrowed venue 2’s monitor, and you can kind of eyeball the two curves if you zoom in enough.

# Using plotly.express

# import ipywidgets as widgets
import pandas as pd
import numpy as np
import plotly.express as px
import plotly.graph_objects as go   
#from IPython.display import display


df = pd.read_csv("./venue-10/venue_10_with_device_58BF25DB81A1.csv")
df["timestamp"] = pd.to_datetime(df['timestamp'])
df = df.drop('voltage', axis=1) # often NaN, just get rid of it
df = df.dropna(axis=0, how="any") 
df = df[(df.temperature > -10)].copy() # eliminate rogue data; copy so later column assignments don't raise SettingWithCopyWarning
    
trace = go.Scatter(customdata=df, 
                    y=df['temperature'], 
                    x = df['timestamp'], 
                    mode='markers', 
                    hoverinfo='all', 
                    #name=x
                    )
    


g = go.FigureWidget(data=trace,
                    layout = go.Layout(
                        yaxis=dict(range=[-3,18])
                    ))

# example syntax for two plots on same x-axis - I'd like to show the boiler temperature in
# parallel - but haven't had time to sort the syntax.
#fig = make_subplots(rows=2, cols=1, shared_xaxes=True)

# for i, col in enumerate(cols, start=1):
#     fig.add_trace(go.Scatter(x=df[col].index, y=df[col].values), row=i, col=1)

fig = go.Figure(g)
fig.update_layout(showlegend=True, 
              autosize = True, 
              width=1000, 
              height=500,
)




fig.update_layout(
    hovermode='x unified',
   # range=[range_min, range_max],
    hoverlabel=dict(
        bgcolor="white",
        # font_size=16,
        font_family="Rockwell"
    )
)

#Add range slider
fig.update_layout(
    xaxis=dict(
        rangeselector=dict(
            buttons=list([
                dict(
                     label="All",
                     step="all"
                     ),
                                dict(count=1,
                     label="Hour",
                     step="hour",
                     stepmode="todate"),
                dict(count=1,
                     label="Day",
                     step="day",
                     stepmode="backward"),
                dict(count=7,
                     label="Week",
                     step="day",
                     stepmode="backward"),
                dict(count=1,
                     label="Year",
                     step="year",
                     stepmode="backward")
            ])
        ),
        rangeslider=dict(
            visible=True,
        ),
        type="date"
    )
)


#fig.add_hline(y=16, annotation_text='16C - usual minimum for children', annotation_font_color="blue", line_color='red', layer='above', line_dash='dash')
# fig.update_yaxes(range = [-5, dfCollatedDataSet['temperature'].max()+5])
fig.show()

Readings from the same sensor should be around 5 minutes apart. At first we were lucky: the radio range is good, and the second sensor wasn’t turned on too close to the half-way point between the first sensor’s readings. That gave us a completely separated bimodal distribution of elapsed times, so we could easily split the readings by the time since the preceding reading. If the gap is short (2.5 minutes or less), the reading is from sensor 1; if it is long (more than 2.5 minutes), it is from sensor 2, unless it is more than 4.5 minutes, in which case we’ve missed a reading. It’s not worth worrying about those: we just mark them as “risk points” (green) so the human analyst can consider whether the traces might have swapped. There was also a cluster of gaps around 5 minutes. Those readings probably came from the same sensor as the previous reading, but there’s enough chance something got power-cycled to treat them as at risk.

After a reboot, there’s a 50/50 chance the sensor traces will swap round, and the best cut-offs between the bins change, because they depend on how long it was between the two devices being switched on. Ideally, people would always keep that to around 2-3 minutes, but they don’t. That’s why the histogram is now a mess. We could re-analyse the data, segmenting it by reboots, but in practice what we have is good enough for the venue to work with.
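
If we ever do want to segment by reboots, a rough sketch might look like the one below. It assumes a reboot shows up as a gap of more than 20 minutes between consecutive readings, borrowing the outlier cut-off used for the histogram on this page. That is only a guess: a quick power-cycle could be much shorter. The function name `label_segments` and the toy timestamps are made up for illustration.

```python
import pandas as pd


def label_segments(timestamps, gap_minutes=20.0):
    """Number the readings so the count goes up by one after every long gap."""
    ts = pd.Series(pd.to_datetime(timestamps)).reset_index(drop=True)
    # Minutes since the preceding reading; the first reading has no delta (NaN).
    delta = ts.diff() / pd.Timedelta(minutes=1)
    # Assumption: a gap longer than gap_minutes means something was rebooted.
    new_segment = delta > gap_minutes
    return pd.DataFrame(
        {"timestamp": ts, "delta": delta, "segment": new_segment.cumsum()}
    )


# Toy timestamps: two runs of readings separated by a 53 minute gap.
toy = [
    "2024-01-01 00:00", "2024-01-01 00:02", "2024-01-01 00:05",
    "2024-01-01 00:07", "2024-01-01 01:00", "2024-01-01 01:03",
    "2024-01-01 01:05",
]
segments = label_segments(toy)
```

Each segment could then get its own histogram and its own cut-offs, for example by starting from `segments.groupby("segment")["delta"].describe()`.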

The variability in the time interval is only about 18 s, so we should usually be able to use this approach for separation. The bins will get fuzzier the longer the system runs, though, because the devices keep time poorly and their clocks drift, especially in the cold.

# delta is elapsed time between df['timestamp'].iloc[x-1] and x 
df['delta'] = df['timestamp'].diff(periods=1) /np.timedelta64(1, 'm')
outlier_cutoff_in_mins = 20
outliers = df[df['delta'] >= outlier_cutoff_in_mins]
trimmed_df = df[df['delta'] < outlier_cutoff_in_mins]


#print(len(outliers), " outliers")

hist = go.Histogram(customdata=trimmed_df['delta'], 
                    x=trimmed_df['delta'], 
                    name='elapsed time (minutes)',
                    #xbins = {'size': 1}
                    #nbinsx = 15
)
 
hist_g = go.FigureWidget()
hist_g.layout.title = 'Histogram of elapsed time between readings, omitting very high outliers'
hist_g.layout.xaxis.title= 'elapsed time (in minutes)'
hist_g.layout.yaxis.title = "number of readings"
hist_g.layout.width = 1000
hist_g.layout.height = 500

hist_fig = go.Figure(hist_g)
hist_fig.add_trace(hist)
hist_fig.show()
# assign a trace based on the elapsed time since the preceding data point.


df.loc[(df['delta'] >4.5), 'assignment'] = 'switch'
df.loc[(df['delta'] <= 4.5) & (df['delta'] >2.5), 'assignment'] = 'sensor2'
df.loc[(df['delta'] <= 2.5), 'assignment'] = 'sensor1'
# could do something fancier to assign 4.5-6 to the same sensor as the previous reading.

# print("switch length:", len(df[df['assignment'] == 'switch'] ))
# print("sensor1 length:", len(df[df['assignment'] == 'sensor1']))

trace_sensor1 = go.Scatter(
                    y=df[df['assignment'] == 'sensor1']['temperature'], 
                    x = df[df['assignment'] == 'sensor1']['timestamp'], 
                    mode='markers', 
                    hoverinfo='all', 
                    name="first trace"
                    )

trace_sensor2 = go.Scatter(
                    y=df[df['assignment'] == 'sensor2']['temperature'], 
                    x = df[df['assignment'] == 'sensor2']['timestamp'], 
                    mode='markers', 
                    hoverinfo='all', 
                    name="second trace"
                    )

trace_switch = go.Scatter(
                    y=df[df['assignment'] == 'switch']['temperature'], 
                    x = df[df['assignment'] == 'switch']['timestamp'], 
                    mode='markers', 
                    hoverinfo='all', 
                    name="risk here"
                    )


g3 = go.FigureWidget(data=[trace_sensor1, trace_sensor2, trace_switch],
                    layout = go.Layout(
                        yaxis=dict(range=[-3,25])
                    ))


fig3 = go.Figure(g3)
fig3.update_layout(showlegend=True, 
              autosize = True, 
              width=1000, 
              height=500,
)

fig3.update_layout(
    hovermode='x unified',
   # range=[range_min, range_max],
    hoverlabel=dict(
        bgcolor="white",
        # font_size=16,
        font_family="Rockwell"
    )
)

#Add range slider
fig3.update_layout(
    xaxis=dict(
        rangeselector=dict(
            buttons=list([
                dict(
                     label="All",
                     step="all"
                     ),
                                dict(count=1,
                     label="Hour",
                     step="hour",
                     stepmode="todate"),
                dict(count=1,
                     label="Day",
                     step="day",
                     stepmode="backward"),
                dict(count=7,
                     label="Week",
                     step="day",
                     stepmode="backward"),
                dict(count=1,
                     label="Year",
                     step="year",
                     stepmode="backward")
            ])
        ),
        rangeslider=dict(
            visible=True,
        ),
        type="date"
    )
)


#fig.add_hline(y=16, annotation_text='16C - usual minimum for children', annotation_font_color="blue", line_color='red', layer='above', line_dash='dash')
# fig.update_yaxes(range = [-5, dfCollatedDataSet['temperature'].max()+5])
fig3.show()
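
The comment in the assignment cell suggests something fancier: giving gaps of 4.5-6 minutes to the same sensor as the previous reading. Here is one way that might be sketched. It is not what the plots on this page use. The function name `assign_with_carry_forward` and the `carry_max` parameter are assumptions, and the toy deltas are made up.

```python
import numpy as np
import pandas as pd


def assign_with_carry_forward(delta, low=2.5, high=4.5, carry_max=6.0):
    """Label readings from the minutes elapsed since the preceding reading."""
    delta = pd.Series(delta, dtype="float64")
    # First pass: same cut-offs as the assignment cell, plus a "carry" band.
    raw = np.select(
        [delta <= low, delta <= high, delta <= carry_max],
        ["sensor1", "sensor2", "carry"],
        default="switch",
    )
    labels = pd.Series(raw, index=delta.index, dtype="object")
    # Blank the carry rows and copy down the most recent label above them.
    filled = labels.mask(labels == "carry").ffill()
    # A carry row with nothing above it stays a risk point.
    filled = filled.fillna("switch")
    # The first reading has no preceding reading, so leave it unassigned.
    return filled.mask(delta.isna())


# Toy deltas in minutes; NaN stands for the very first reading.
toy_delta = [np.nan, 2.0, 3.0, 5.0, 2.0, 8.0, 5.2, 3.1]
toy_labels = assign_with_carry_forward(toy_delta)
```

A carried label is only as good as the reading above it, so a carry row that follows a "switch" stays a risk point. Long runs of carried labels probably still deserve a look from the human analyst.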