# Jokullbase

How to track how late buses are:

- [x] Download bus route data
- [x] Insert raw bus positions into TimescaleDB/PostGIS table?
- [x] "Fan-out" JSON responses into one DB row per datapoint.
- [ ] Download routes/stops.
  - There is an endpoint called Stops.
  - There is an endpoint called Timetable, which takes a route number (along with day of week etc).
    - The timetable response includes a list of stops in the route, along with the route ID.
  - There is an endpoint called **BusLocationByStop**, which takes a stop ID as a parameter.
    - The response has upcoming arrivals, including estimated arrival time, along with the current position of the bus.
  - Record stop info every few minutes, notably the arrivals.
    - This will be annoying to do only in DB since we have a trigger for bus positions.
- [ ] Analyze actual arrival vs what the stop endpoint said about the arrival.
  - Use PostGIS to compute "bus arrived" event: when a bus is within X meters of a stop on its route, mark that as an arrival event.
  - Once we have bus arrival events, we can compare them to the estimated arrivals recorded throughout the day.
  - We can then discard the raw bus position data, since there is no need to keep it: delete every raw data point between the last arrival and the newly computed one.

Arrival computation:

- Can use edge functions for this?
  - Function takes a bus route number, start time, end time.
  - Cron in DB calls the edge functions, using state as function inputs.
    - state = last interval end time, so we can properly calculate the time interval to check.
    - or rather use last ID in the raw table, so we don't skip entries.
- Query DB to get raw bus positions in the interval.
- Query DB to get all stops on the route (during the interval? some stops not in use sometimes)
- For each bus ID, find the raw position where its lat,lon is within X meters of a stop's lat,lon.
  - These are arrival times for this bus ID, at each stop.
- Insert into arrivals.
- After all arrivals computed, drop all raw positions for this bus from DB.

Useful stuff:

```
SELECT buses_at_stops.*
FROM (
    SELECT DISTINCT ON (s.id, h.measured_at)
        s.id,
        s.stop_name,
        ST_AsText(s.coords) AS stop_coords,
        h.measured_at,
        h.bus_id,
        ST_AsText(h.coords) AS bus_coords,
        ST_Distance(s.coords, h.coords) AS distance
    FROM bus_stops s
    JOIN raw_bus_positions h ON ST_DWithin(s.coords, h.coords, 30)
) AS buses_at_stops
WHERE buses_at_stops.bus_id LIKE '1%'
ORDER BY buses_at_stops.measured_at DESC;
```
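
The arrival computation steps listed earlier could be prototyped outside the database before committing to an edge function. Below is a minimal Python sketch of that logic, not the project's implementation. It swaps PostGIS's `ST_DWithin` for a plain haversine distance, and the `Stop` and `Position` types, the `row_id` field, `compute_arrivals`, and the 30 m default radius (borrowed from the query above) are all assumptions made for illustration.

```python
from dataclasses import dataclass
from datetime import datetime
from math import asin, cos, radians, sin, sqrt

EARTH_RADIUS_M = 6_371_000  # mean Earth radius in meters


@dataclass(frozen=True)
class Stop:
    # Hypothetical shape of a row from the stops table.
    stop_id: str
    lat: float
    lon: float


@dataclass(frozen=True)
class Position:
    # Hypothetical shape of a row from the raw positions table.
    row_id: int  # assumed monotonically increasing ID of the raw row
    bus_id: str
    measured_at: datetime
    lat: float
    lon: float


def haversine_m(lat1, lon1, lat2, lon2):
    """Great-circle distance between two lat/lon points, in meters."""
    dlat = radians(lat2 - lat1)
    dlon = radians(lon2 - lon1)
    a = sin(dlat / 2) ** 2 + cos(radians(lat1)) * cos(radians(lat2)) * sin(dlon / 2) ** 2
    return 2 * EARTH_RADIUS_M * asin(sqrt(a))


def compute_arrivals(positions, stops, radius_m=30.0):
    """Return (bus_id, stop_id, measured_at) tuples, one per bus and stop,
    using the first position (by row_id) that lies within radius_m of the stop."""
    arrivals = {}
    for pos in sorted(positions, key=lambda p: p.row_id):
        for stop in stops:
            key = (pos.bus_id, stop.stop_id)
            if key in arrivals:
                continue  # keep only the earliest hit per bus and stop
            if haversine_m(pos.lat, pos.lon, stop.lat, stop.lon) <= radius_m:
                arrivals[key] = pos.measured_at
    return [(bus, stop, at) for (bus, stop), at in arrivals.items()]
```

This keeps only one arrival per bus and stop for the batch it is given, so it assumes each call covers an interval short enough that a bus does not pass the same stop twice. Processing rows in `row_id` order mirrors the "use last ID in the raw table" idea: the caller could store the highest `row_id` it has seen as the state for the next run.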