How to track how late buses are:

  • Download bus route data
  • Insert raw bus positions into a TimescaleDB/PostGIS table?
    • "Fan-out" JSON responses into one DB row per datapoint.
  • Download routes/stops.
    • There is an endpoint called Stops.
    • There is an endpoint called Timetable, which takes a route number (along with day of week etc).
    • The timetable response includes a list of stops in the route, along with the route ID.
    • There is an endpoint called BusLocationByStop, which takes a stop ID as a parameter.
      • The response lists upcoming arrivals, including the estimated arrival time and the current position of the bus.
    • Record stop info every few minutes, notably the arrivals.
    • This will be annoying to do only in DB since we have a trigger for bus positions.
  • Analyze actual arrival vs what the stop endpoint said about the arrival.
    • Use PostGIS to compute "bus arrived" event: When bus is within X meters of a stop on its route, mark that as an arrival event.
    • Sanity check potential arrivals by removing values that are weird:
      • Bus does not belong to the route on the stop.
      • Bus does not linger at the stop long enough (e.g. the driver goes right past).
      • ???
    • Once we have bus arrival events, we can compare them to arrivals throughout the day.
    • We can then discard the raw bus position data, since there is no need to keep it: delete every raw data point between the last arrival and the newly computed one.
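
A minimal sketch of the "actual arrival vs what the stop endpoint said" comparison, in Python. Everything named here is an assumption for illustration: the Prediction and Arrival shapes, the lateness function, and the choice to compare against the most recent prediction recorded before the bus arrived (ignoring predictions older than max_age, which likely belong to an earlier trip). Other choices, such as comparing against every recorded prediction, would be equally valid.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta


@dataclass(frozen=True)
class Prediction:
    """One recorded row from the stop endpoint (hypothetical shape)."""
    bus_id: str
    stop_id: str
    recorded_at: datetime   # when we polled the stop endpoint
    estimated_at: datetime  # arrival time the endpoint reported


@dataclass(frozen=True)
class Arrival:
    """One computed "bus arrived" event (hypothetical shape)."""
    bus_id: str
    stop_id: str
    arrived_at: datetime


def lateness(arrival, predictions, max_age=timedelta(minutes=30)):
    """Return actual minus predicted arrival time, or None if nothing matches.

    Uses the most recent prediction for the same bus and stop that was
    recorded no earlier than max_age before the arrival.
    """
    candidates = [
        p for p in predictions
        if p.bus_id == arrival.bus_id
        and p.stop_id == arrival.stop_id
        and arrival.arrived_at - max_age <= p.recorded_at <= arrival.arrived_at
    ]
    if not candidates:
        return None
    latest = max(candidates, key=lambda p: p.recorded_at)
    return arrival.arrived_at - latest.estimated_at
```

A positive result means the bus arrived later than the endpoint said; a negative one means it was early.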

Downloading route data:

Arrival computation:

  • Can use edge functions for this?
  • Function takes a bus route number, start time, end time.
  • Cron in DB calls the edge functions, using state as function inputs.
    • state = last interval end time, so we can properly calculate time interval to check.
    • or rather use last ID in the raw table, so we don't skip entries.
  • Query DB to get raw bus positions in the interval.
  • Query DB to get all stops on the route (during the interval? Some stops are not always in use.)
  • For each bus ID, find the raw positions whose lat/lon are within X meters of a stop's lat/lon.
  • These are arrival times for this bus ID, at each stop.
  • Insert into arrivals.
  • After all arrivals computed, drop all raw positions for this bus from DB.
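
The steps above could be sketched roughly as follows, in plain Python rather than PostGIS so the logic is easy to check. All names are assumptions (detect_arrivals, haversine_m, the dict keys, the 30 m radius, the 10 s minimum dwell). An arrival is taken to be the first position of a run of consecutive in-radius positions that lasts at least min_dwell_s, which filters out a bus that drives straight past. The returned cursor is the highest raw row ID seen, following the "use last ID" idea for the cron state.

```python
import math
from collections import defaultdict

EARTH_RADIUS_M = 6_371_000


def haversine_m(lat1, lon1, lat2, lon2):
    """Great-circle distance in meters between two lat/lon points."""
    p1, p2 = math.radians(lat1), math.radians(lat2)
    dphi = p2 - p1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(p1) * math.cos(p2) * math.sin(dlmb / 2) ** 2
    return 2 * EARTH_RADIUS_M * math.asin(math.sqrt(a))


def detect_arrivals(positions, stops, radius_m=30.0, min_dwell_s=10.0):
    """Return (arrivals, new_cursor) for one batch of raw positions.

    positions: dicts with id, bus_id, ts (epoch seconds), lat, lon.
    stops: dicts with stop_id, lat, lon for the route being processed.
    """
    by_bus = defaultdict(list)
    for pos in positions:
        by_bus[pos["bus_id"]].append(pos)

    arrivals = []
    for bus_id, track in by_bus.items():
        track.sort(key=lambda p: p["ts"])
        for stop in stops:
            run = []  # current run of consecutive positions near this stop
            for pos in track + [None]:  # None sentinel flushes the last run
                near = pos is not None and haversine_m(
                    pos["lat"], pos["lon"], stop["lat"], stop["lon"]) <= radius_m
                if near:
                    run.append(pos)
                    continue
                # Run ended: keep it only if the bus lingered long enough.
                if run and run[-1]["ts"] - run[0]["ts"] >= min_dwell_s:
                    arrivals.append({"bus_id": bus_id, "stop_id": stop["stop_id"],
                                     "arrived_at": run[0]["ts"]})
                run = []

    # Highest raw row ID processed; the next batch starts after it.
    new_cursor = max((p["id"] for p in positions), default=None)
    return arrivals, new_cursor
```

One caveat with this sketch: a bus that is still at a stop when the batch ends is flushed as a finished run, so a dwell that straddles two batches would need extra handling (for example, keeping the rows of the open run instead of deleting them).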

Useful stuff:

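-- Buses within 30 units of a stop, newest first (meters only if coords are geography).
-- The first DISTINCT ON expression is missing in these notes; h.bus_id is an assumption.
SELECT buses_at_stops.* FROM (
  SELECT DISTINCT ON (h.bus_id, h.measured_at) s.stop_name, ST_AsText(s.coords) AS stop_coords,
         h.measured_at, h.bus_id, ST_AsText(h.coords) AS bus_coords, ST_Distance(s.coords, h.coords) AS distance
  FROM bus_stops s
    JOIN raw_bus_positions h ON ST_DWithin(s.coords, h.coords, 30)
  ORDER BY h.bus_id, h.measured_at, ST_Distance(s.coords, h.coords)) AS buses_at_stops
WHERE buses_at_stops.bus_id LIKE '1%'
ORDER BY buses_at_stops.measured_at DESC;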