The Importance Of Data Ethics – For Alan Turing and Castlight

Back in the 1940s, when Alan Turing was developing the technology to break the Enigma code, he faced a serious data ethics dilemma. If his code-breaking machine were used to prevent every potential bombing of a ship, the German operators would know straight away that the code had been cracked, and the technology would be redundant. What Turing did was develop an ethical algorithm designed to maximise the number of lives saved whilst sustaining the viability of the code.

This is part of Turing’s story as told on the website of The Alan Turing Institute (turing.ac.uk), which, more than seven decades on, is committed to exploring how data science and AI can be used for the good of society and to supporting initiatives which bring innovation and ethics together.

It’s so good to know that there’s a UK institute dedicated to promoting the understanding and integration of ethics into the work of the data community.

Digitally Hardwired Ethics

At Castlight, we have always “digitally hardwired” ethics into everything we do with data. First and foremost is our ethical commitment to respect our customers’ data, to treat it with care, protect it with the highest level of security, ensure its anonymity and honour its restricted use within our business to improve our product for our customers.

But our commitment to the ethical use of data is also integral to our product development processes. When we categorise a customer’s transactional data into over 180 categories of discretionary and non-discretionary spending, we believe it’s crucial that every item is correctly evaluated and categorised.

We drill right down to the detail when we make a categorisation decision because our customers have entrusted us with their data, and sometimes very important life decisions are made on the results of the transactional analysis we conduct for them.

Finer Points of Categorisation

We really do sit around the table and debate the finer points of categorisation, such as whether Holland and Barrett is primarily a retailer of food or supplements. Our customers are probably spending a tiny fraction of their income at Holland and Barrett, but it’s important to us that we get it right. If we categorise Holland and Barrett as a food outlet, then for our customer it’s a non-discretionary outlay; if we categorise the shop as a supplements retailer, then the outlay is discretionary. It matters.
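To make the Holland and Barrett point concrete, here is a minimal sketch of how a single categorisation decision moves spending between the two totals. It is an illustration only, not how CaaS® works: the category labels, the merchant "Corner Grocer", the amounts and the function names are all assumptions made for the example.

```python
# Illustrative sketch only. Category labels, merchants and amounts are
# hypothetical; the article does not list the 180+ real categories.
NON_DISCRETIONARY = {"food"}
DISCRETIONARY = {"supplements"}


def spending_type(category):
    """Map a category label to discretionary or non-discretionary spending."""
    if category in NON_DISCRETIONARY:
        return "non-discretionary"
    if category in DISCRETIONARY:
        return "discretionary"
    raise ValueError(f"Unknown category: {category}")


def summarise(transactions, merchant_categories):
    """Total spend (in pence) per spending type.

    transactions: list of (merchant, amount_in_pence) pairs
    merchant_categories: dict mapping merchant name to category label
    """
    totals = {"discretionary": 0, "non-discretionary": 0}
    for merchant, amount in transactions:
        totals[spending_type(merchant_categories[merchant])] += amount
    return totals


transactions = [("Holland and Barrett", 1250), ("Corner Grocer", 4000)]

# The same purchases, summarised under the two candidate decisions.
as_food = summarise(
    transactions,
    {"Holland and Barrett": "food", "Corner Grocer": "food"},
)
as_supplements = summarise(
    transactions,
    {"Holland and Barrett": "supplements", "Corner Grocer": "food"},
)
```

Under the first decision the whole 5,250 pence counts as non-discretionary; under the second, 1,250 pence moves into the discretionary total.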

The importance of the devil in the detail was brought home to me this week when I ran my own transactional data through our CaaS® (Categorisation as a Service) engine. I was delighted that CaaS® was able to correctly recognise that my payment to the Grosvenor was to see a film at Glasgow’s Grosvenor cinema and not to make an online bet. If I had been a foreign film junkie and CaaS® had mis-filed my Grosvenor tickets in a betting category, I might have jeopardised a mortgage application.

Honesty Default

But our code of ethics isn’t only rooted in ensuring our customers have the best, most accurate and fairest categorisation analysis. We have also trained our categorisation engine to have a built-in honesty default. I love this aspect of CaaS®. It would fit right into the utopian world of the film The Invention of Lying, before Ricky Gervais’s character invented the lie.

CaaS® is honest to its very core. When the engine comes across a transaction it isn’t 100% sure about, it assesses its own level of uncertainty and, if it doesn’t hit the very high bar of classification certainty, it refers itself for verification. It recognises that it is, after all, not human and needs some human intervention. At this point a customer will get the opportunity, for example, to clarify whether their M&S purchases were for food or clothes, or whether the Morrisons forecourt debit was for fuel or de-icing kit. The CaaS® engine is then taught by these experiences, and by the millions of categorisation decisions it makes, and gradually becomes more and more certain of these less clear-cut classifications, referring itself less often for verification.
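The refer-when-unsure behaviour described above can be sketched in a few lines. This is a toy illustration under stated assumptions, not the CaaS® implementation: the 0.95 threshold, the smoothing weight, the class name and the method names are all hypothetical, and a real engine would draw on far richer signals than counts of customer confirmations.

```python
from dataclasses import dataclass, field

# Hypothetical values: the article says only that the bar is "very high".
CONFIDENCE_THRESHOLD = 0.95
PRIOR_WEIGHT = 1  # Smoothing so a single confirmation is not treated as certainty.


@dataclass
class Categoriser:
    """Toy categoriser that refers low-confidence transactions for verification."""

    # Maps a merchant description to {category: number of confirmations seen}.
    confirmations: dict = field(default_factory=dict)

    def predict(self, description):
        """Return (best_category, confidence) based on past confirmations."""
        counts = self.confirmations.get(description, {})
        if not counts:
            return None, 0.0
        best = max(counts, key=counts.get)
        total = sum(counts.values())
        return best, counts[best] / (total + PRIOR_WEIGHT)

    def categorise(self, description):
        """Accept the prediction only if it clears the confidence bar."""
        category, confidence = self.predict(description)
        return {
            "category": category,
            "needs_verification": confidence < CONFIDENCE_THRESHOLD,
        }

    def learn(self, description, confirmed_category):
        """Record a customer's answer so future confidence can rise."""
        counts = self.confirmations.setdefault(description, {})
        counts[confirmed_category] = counts.get(confirmed_category, 0) + 1
```

In this sketch an unseen merchant is always referred to the customer. Each consistent confirmation passed to `learn` nudges the confidence upwards, so the same merchant is eventually categorised without a referral, while conflicting answers (food one week, clothes the next) keep the confidence below the bar.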

So in a climate where digital security can sometimes seem to pose insurmountable challenges, it is worth taking note of The Alan Turing Institute’s assessment when it says: “…there is reason to be optimistic. As it turns out, the challenges posed by these modern fields of data science and AI can be addressed by one of the oldest – ethics.”