Discounted Dining Finder
This post describes how I developed the Discounted Dining Finder, a map-based lookup tool for the Eat Out to Help Out scheme, in my spare time. I currently work with Equal Experts and HMRC. The aim of this post is to provide an insight into how problems of scaling services can be solved by having no servers and not using “serverless services” either.
Aperitif
A really nice side effect of working in a high-functioning environment is that you sometimes get to bounce ideas off each other. The delivery teams at HMRC were working on releasing yet another service to the public in less time than it takes you to say “Agile”. This scheme was called Eat Out to Help Out - internally known as Discounted Dining because the powers that be did not want to risk leaking the name prior to the Chancellor’s announcement.
The scheme would consist of different journeys:
- registering a restaurant,
- searching for registered establishments (for the public), and
- claiming payment.
Out of these three, the biggest unknown in terms of expected volume was the “search journey” to be used by the general public. In this journey, a user would enter a postcode, and registered establishments inside an X mile radius would be displayed. There were many unknowns about how much traffic to expect on the HMRC service:
- Would there be a peak at lunchtime?
- What if Martin Lewis goes on TV, recommends visiting the site and two minutes later 10% of the country want to find information about their local eateries?
- Could it impact other HMRC services (the “tax platform” hosts a multitude of services)?
Now, the “tax platform” is a very scalable and robust platform and I am not for one minute suggesting that there was going to be a problem using microservices and geo-location in Mongo at scale, but one of the ideas that I floated centred around the fact that the information is fairly static. Sure enough, “Eat Out” businesses register their premises with HMRC, but once registered, the bulk of the information does not change. Postcodes and the distances between them are not that changeable either. So that’s when I wondered whether this could be delivered as a static site.
Starter
I went away and found that freemaptools provides me with a list of UK postcodes and their associated latitude/longitude. In that file, there are 1,767,875 postcodes. Searching almost 2 million records sounds like the job for a server and a database, doesn’t it? Erm, no.
Looking at the postcode file
Instead of searching a single ukpostcodes.csv (95 MB) every time, I decided to “shard” or “partition” my CSV file into smaller files:
Each file is split into directories by the first letters of its outcode. So if I want to find out about postcode AB12 4TS, I’d take the outcode (AB12) and turn it into the path /A/B/AB12.csv. That file has only 799 entries, and searching those is much more palatable.
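As a rough illustration of that mapping, here is a minimal sketch in JavaScript. The function name shardPath is hypothetical, and it assumes that the outcode is everything except the last three characters of the postcode and that the two directory levels are simply the first two characters of the outcode; the real tool may handle short outcodes differently.

```javascript
// Hypothetical helper: map a full postcode to the shard file that should hold it.
function shardPath(postcode) {
  // Normalise: upper-case and strip whitespace, e.g. "ab12 4ts" -> "AB124TS".
  const compact = postcode.toUpperCase().replace(/\s+/g, "");
  // The incode is always the last three characters; the rest is the outcode.
  const outcode = compact.slice(0, -3);
  // Assumption: directory levels are the first two characters of the outcode.
  return `/${outcode[0]}/${outcode[1]}/${outcode}.csv`;
}

console.log(shardPath("AB12 4TS")); // "/A/B/AB12.csv"
```

Because the path is derived purely from the text the user typed, the browser can request exactly one small file without any server-side lookup.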
So I’ve got my main page, where the user enters their postcode. I can then search for the postcode simply by using a bit of JavaScript inside the user’s browser.
D3 is a great library for visualisations, but I also found it very useful for reading and processing CSVs in JavaScript, and the files can be served up by a static web server.
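The lookup itself can be sketched without any library. The snippet below is not the D3-based code from the tool; it is a plain JavaScript stand-in that works on CSV text that has already been downloaded. The function name findPostcode and the assumed column layout (a header row of id,postcode,latitude,longitude) are illustrative assumptions, and the coordinates in the sample are made up.

```javascript
// Assumed shard layout: header row, then "id,postcode,latitude,longitude" rows.
function findPostcode(csvText, postcode) {
  const wanted = postcode.toUpperCase().replace(/\s+/g, "");
  const [header, ...rows] = csvText.trim().split(/\r?\n/);
  const columns = header.split(",");
  const pc = columns.indexOf("postcode");
  const lat = columns.indexOf("latitude");
  const lon = columns.indexOf("longitude");
  for (const row of rows) {
    const fields = row.split(",");
    // Compare postcodes ignoring case and spacing.
    if (fields[pc].toUpperCase().replace(/\s+/g, "") === wanted) {
      return { postcode: fields[pc], lat: Number(fields[lat]), lon: Number(fields[lon]) };
    }
  }
  return null; // postcode not present in this shard
}

// Made-up sample shard, for illustration only.
const sampleShard = [
  "id,postcode,latitude,longitude",
  "1,AB12 4TS,57.10,-2.10",
  "2,AB12 4TT,57.11,-2.11",
].join("\n");

console.log(findPostcode(sampleShard, "ab124tt"));
```

A linear scan is perfectly adequate here, since a shard only holds a few hundred to a few thousand rows.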
Great! But how do I get my directory structure? I did not fancy manually copy/pasting the file. You think, surely now it’s time to unleash some NoSQL database or at least some Python. But no, I decided to keep it simple and use a combination of shell scripts and AWK:
The split_outcodes.awk script does the hard work of creating new files in the correct directory.
This resulted in 2980 files - the biggest of those was 145K which corresponded to 2701 postcodes. Now that’s much better to search than 1.7 million!
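The partitioning idea can also be shown in JavaScript. To be clear, this is not the shell and AWK pipeline described above: it is an in-memory sketch where each “file” is an array in a Map keyed by outcode. The function name groupByOutcode, the column order (id, postcode, lat, lon) and the assumption that every postcode contains a space are all illustrative.

```javascript
// Group raw CSV rows by outcode. Each Map entry stands in for one shard file.
function groupByOutcode(rows) {
  const shards = new Map();
  for (const row of rows) {
    const postcode = row.split(",")[1];               // assumed column order: id,postcode,lat,lon
    const outcode = postcode.trim().split(/\s+/)[0];  // "AB12 4TS" -> "AB12"
    if (!shards.has(outcode)) shards.set(outcode, []);
    shards.get(outcode).push(row);
  }
  return shards;
}

// Made-up rows, for illustration only.
const shards = groupByOutcode([
  "1,AB12 4TS,57.10,-2.10",
  "2,AB12 4TT,57.11,-2.11",
  "3,SW1A 1AA,51.50,-0.14",
]);
console.log([...shards.keys()]);
```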
Soup
I didn’t mention the Discounted Dining Finder had a map. A quick diversion on setting that up!
I used LeafletJS - an open source JavaScript mapping library. Here’s how:
And I had a map!
Fish
That map didn’t have anything on it yet! I am able to convert a postcode into lat/lon though. The next step was to look up the restaurants. I decided to keep running with the idea of doing all my computation on the user’s browser (desktop or phone).
First of all, I found that the UK postcodes were covering an area of:
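Finding that area amounts to taking the minimum and maximum latitude and longitude over all postcodes. A minimal JavaScript sketch follows; the function name boundingBox and the record shape ({ lat, lon }) are assumptions, and the three sample points are just the corner values and the Buckingham Palace coordinates quoted in this post.

```javascript
// Compute the smallest lat/lon rectangle that contains every point.
function boundingBox(points) {
  const box = { minLat: Infinity, maxLat: -Infinity, minLon: Infinity, maxLon: -Infinity };
  for (const { lat, lon } of points) {
    box.minLat = Math.min(box.minLat, lat);
    box.maxLat = Math.max(box.maxLat, lat);
    box.minLon = Math.min(box.minLon, lon);
    box.maxLon = Math.max(box.maxLon, lon);
  }
  return box;
}

const ukBox = boundingBox([
  { lat: 60.80, lon: -8.16 },
  { lat: 49.18, lon: 1.76 },
  { lat: 51.5, lon: -0.14 },
]);
console.log(ukBox);
```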
I calculated that the rectangle (60.80 N/-8.16 W) - (49.18 N/1.76 E) covered about 400 miles from West to East and 800 miles North to South. My aim was to provide a lookup that can find all restaurants in a 5 mile radius, so I split my search area up into tiles of roughly 5x5 miles. Here’s my translation function:
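One way such a translation could work is sketched below. This is an assumption-laden reconstruction, not the original function: it takes the rectangle and the 400 by 800 mile estimate from the text, divides them into 80 columns and 160 rows of roughly 5 miles each, and floors the result. The names BOUNDS, COLS, ROWS and toTile are hypothetical, and because the constants are guesses, the tile numbers it produces need not match the (33/64) quoted in this post.

```javascript
// Rectangle covering the UK postcodes, as quoted in the text.
const BOUNDS = { minLat: 49.18, maxLat: 60.80, minLon: -8.16, maxLon: 1.76 };
// Assumption: about 400 miles west to east and 800 miles south to north, in 5 mile tiles.
const COLS = 400 / 5; // 80 columns
const ROWS = 800 / 5; // 160 rows

// Translate a lat/lon into integer tile coordinates (row from latitude, col from longitude).
function toTile(lat, lon) {
  const row = Math.floor(((lat - BOUNDS.minLat) / (BOUNDS.maxLat - BOUNDS.minLat)) * ROWS);
  const col = Math.floor(((lon - BOUNDS.minLon) / (BOUNDS.maxLon - BOUNDS.minLon)) * COLS);
  return { row, col };
}

console.log(toTile(51.5, -0.14)); // Buckingham Palace, under this sketch's constants
```

Treating degrees as a flat grid ignores the fact that a degree of longitude shrinks towards the north, which is acceptable when the tiles only need to be “roughly” 5x5 miles.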
That would give me a coordinate set for a tile. So Buckingham Palace (51.5 N/-0.14 W) would be at coordinates (33/64). Based on that, I can build another set of files:
Whereby all the eateries that are in coordinates (33/64) would be in the file pubgrid/33/33-64.csv. That file would look like this:
The JavaScript can then find the suitable restaurants like so:
The above code does a few things:
- It calculates the distance between the selected lat/lon and the lat/lon for the restaurant
- It filters out anything that is further away than 5 miles
- It sorts by distance, so that the closest are first
- It takes up to 250 results
- It dynamically creates a table that shows the results (IMHO, this is very neat using D3)
- It clears and recreates all the markers on the map.
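The first four of those steps can be sketched in plain JavaScript. This is an illustration rather than the code from the tool: it assumes the haversine formula for distance (the original may use a different calculation), the names distanceMiles and nearest are hypothetical, the sample restaurants are made up, and the table and map marker steps are left out because they need a browser.

```javascript
const EARTH_RADIUS_MILES = 3958.8; // mean Earth radius

// Great-circle distance between two lat/lon points, using the haversine formula.
function distanceMiles(lat1, lon1, lat2, lon2) {
  const toRad = (deg) => (deg * Math.PI) / 180;
  const dLat = toRad(lat2 - lat1);
  const dLon = toRad(lon2 - lon1);
  const a =
    Math.sin(dLat / 2) ** 2 +
    Math.cos(toRad(lat1)) * Math.cos(toRad(lat2)) * Math.sin(dLon / 2) ** 2;
  return 2 * EARTH_RADIUS_MILES * Math.asin(Math.sqrt(a));
}

// Annotate with distance, drop anything beyond maxMiles, sort closest first, cap the list.
function nearest(origin, restaurants, maxMiles = 5, limit = 250) {
  return restaurants
    .map((r) => ({ ...r, distance: distanceMiles(origin.lat, origin.lon, r.lat, r.lon) }))
    .filter((r) => r.distance <= maxMiles)
    .sort((a, b) => a.distance - b.distance)
    .slice(0, limit);
}

// Made-up data, for illustration only.
const found = nearest({ lat: 51.5, lon: -0.14 }, [
  { name: "Fox and Badger", lat: 51.51, lon: -0.13 },
  { name: "Otter and Heron", lat: 51.5, lon: -0.14 },
  { name: "Stoat and Weasel", lat: 52.5, lon: -0.14 },
]);
console.log(found.map((r) => r.name));
```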
The end result looks a little like this:
Meat
Now, the next tricky bit is to ensure that my coordinate grid system, which simplifies (lat/lon) into tile coordinates, contains all the relevant information about the closest eating establishments. Each tile is designed to be about 5x5 miles, so a restaurant within 5 miles of a point in one tile may actually sit in a neighbouring tile. To make sure we find every such restaurant, each restaurant goes into the tile it is in as well as the surrounding tiles. This is done using trusty AWK:
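The same idea, expressed in JavaScript rather than AWK, might look like the sketch below. The names surroundingTiles and buildGrid are hypothetical, the restaurant records are assumed to already carry their tile row and column, and the Map keys mirror the 33-64 style of file name used in this post.

```javascript
// Keys for a tile and its eight neighbours, e.g. "33-64".
function surroundingTiles(row, col) {
  const tiles = [];
  for (let dr = -1; dr <= 1; dr++) {
    for (let dc = -1; dc <= 1; dc++) {
      tiles.push(`${row + dr}-${col + dc}`);
    }
  }
  return tiles;
}

// Put every restaurant into its own tile and the surrounding tiles.
function buildGrid(restaurants) {
  const grid = new Map();
  for (const r of restaurants) {
    for (const key of surroundingTiles(r.row, r.col)) {
      if (!grid.has(key)) grid.set(key, []);
      grid.get(key).push(r.name);
    }
  }
  return grid;
}

const grid = buildGrid([{ name: "Fox and Badger", row: 33, col: 64 }]);
console.log([...grid.keys()]);
```

The price of this duplication is that each restaurant is stored nine times, but in return the browser only ever needs to fetch the single tile file for the user’s own location.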
But wait a minute, that presupposes that I have a list of “pubs” and their coordinates. That’s not the case: all we’ve got is the establishment name and its postcode. Thankfully there’s a shell command that I can use to “join” my existing postcode file and a file of establishments and their postcodes:
The above does the following:
- sorts pub_postcode.csv (containing name and postcode),
- sorts ukpostcodes.csv (containing the postcode and lat/lon), and
- “joins” the two files, creating one whereby the lines are joined by the postcode.
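For comparison, the same result can be reached in JavaScript with a hash lookup instead of sorting both files first. This is a swapped-in technique, not what the shell command does internally; the function name joinOnPostcode and the record shapes are assumptions, and the sample data is made up.

```javascript
// Attach lat/lon to each pub by looking its postcode up in a Map.
// Pubs whose postcode is unknown are dropped, like an inner join.
function joinOnPostcode(pubs, postcodes) {
  const coords = new Map(postcodes.map(({ postcode, lat, lon }) => [postcode, { lat, lon }]));
  return pubs
    .filter((pub) => coords.has(pub.postcode))
    .map((pub) => ({ ...pub, ...coords.get(pub.postcode) }));
}

const joined = joinOnPostcode(
  [
    { name: "Fox and Badger", postcode: "AB12 4TS" },
    { name: "Otter and Heron", postcode: "ZZ99 9ZZ" }, // no matching postcode
  ],
  [{ postcode: "AB12 4TS", lat: 57.1, lon: -2.1 }]
);
console.log(joined);
```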
Palate Cleanser
You will have noticed above that my examples aren’t giving real pub or restaurant names. While HMRC had not yet published the list of registered restaurants, I used my shell scripting knowledge (and a lot of Google) to create a fairly neat way of generating random pub/restaurant names.
I took a list of animal names and randomly combined them with “and”, the aim being to get the “Fox and Badger” and endless variations.
Here’s the shell script to allow you to do this:
This creates
- 100000 random postcodes
- 100000 random animal names
- another 100000 random animal names (in a different order)
- 100000 “and”s
- and combines them all, resulting in my randomly generated pub names:
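The name generation can be approximated in a few lines of JavaScript. Note that this sketch differs from the shell approach described above: instead of shuffling two lists and pasting them together, it picks two animals independently, so it can occasionally produce something like “Fox and Fox”. The function name randomPubName and the animal list are illustrative, and the random source is a parameter so the behaviour can be made repeatable.

```javascript
// Combine two randomly picked animal names with "and".
// `random` defaults to Math.random and must return a number in [0, 1).
function randomPubName(animals, random = Math.random) {
  const pick = () => animals[Math.floor(random() * animals.length)];
  return `${pick()} and ${pick()}`;
}

const animals = ["Fox", "Badger", "Otter", "Heron"];
for (let i = 0; i < 3; i++) {
  console.log(randomPubName(animals));
}
```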
Dessert
All of the above is very good, but I’ve still not hosted my tool anywhere, and I don’t want to use my own servers. Thankfully, github.com provides GitHub Pages and GitHub Actions which can be combined to provide a build pipeline and a hosting solution!
Cheese
Thanks for reading, I hope you found the Discounted Dining Finder and the above tale interesting. The source code is available on github.com/beny23/static-distance/ and released using the Apache-2.0 open source licence.