AiLab Interview with Dr Ross Gayler
AiLab interviewed Dr Ross Gayler, a Consultant Data Scientist and Applied Statistician from Melbourne, Australia, to hear his expert insights into Artificial Intelligence, learn about his work, and find out his thoughts on what the future may hold.
This interview took place in September 2018 with Dr John Flackett.

AiLab: Hi Ross. Can you tell us a bit about yourself and your current work?

RG: Thanks John. I’m Ross Gayler and I work as a freelance data science consultant. Given my work history, most of my work is within retail finance, so I look at models of customer behaviour for individuals or small businesses that are used to inform mass-market lending. Although my undergraduate work was in Psychology and Computer Science, in retrospect what I was trying to do was put together a do-it-yourself Cognitive Science degree at a university that did not offer one, because Cognitive Science did not exist as a named discipline at the time. From there I wandered my way through psychology, studying areas such as psycholinguistics and the neuroscience end of psychology, with a fair amount of time spent on methodological work, which is inherently statistical in psychology. This led me into becoming an applied statistician with a keen interest in AI and data science. I then moved into a job working with expert systems, predominantly lending systems amongst other things. That company eventually closed, so from there I literally walked around the corner to another company that worked with statistical models for lending. I’ve now been working in this area for coming up on 30 years.

AiLab: Was it a conscious decision to get into AI, especially given your desire to do a Cognitive Science degree?

RG: Yes, it’s certainly an area I’ve been interested in since my teens, and I suspect it may well have been influenced by reading too much Isaac Asimov. Somewhere along the way I got very interested in how it could be possible to build a device which is able to act like a person and think at a human level. I ended up getting involved in high school science projects, and my Year 12 project was basically building what was, really, a very ratty robot. So it’s an area I’ve been interested in for a very long time. When I was at university, the kinds of projects I would do were things like neural network models and so on. This was back in the neural network winter, when they were dreadfully unfashionable.

AiLab: Haha, so you picked something no one else was doing!

RG: Yes, sometimes there’s a good reason that no one else is doing it! I consider the notion of career planning to be laughable. Maybe in some very specific careers there is a path you have to take, but in many cases it’s more or less defined by random events. The things that I have ended up being involved with have been driven by my interests and what opportunities happened to turn up. I was initially very interested in neural network models and, because of my statistical background, I moved into pattern recognition and the machine learning edge of statistics and classical AI. Somewhere in the late 1970s or early 1980s, I decided I couldn’t do the kind of stuff I wanted to do with neural nets. It wasn’t until Paul Smolensky’s Tensor Product Variable Binding paper [pdf] came out in 1990 that the topics I was really interested in were addressed.

AiLab: Can you explain the specific area of AI that interests you the most?

RG: Yes, so without descending into too much jargon, what I’m interested in is neural networks that can solve problems involving compositional structure.
This is the notion that a system has models of various little sub-components of the overall problem, and those components are able to be reassembled into new composites on the fly, such that the behaviour of the overall composite is some reasonable function of the behaviours of the components. Therefore, if the system has chosen the right set of components and the right way to compose them, you can then predict and model the behaviour of an exponentially wide range of things, most of which have never been seen before. Classic neural network modelling, by contrast, is very much like classic statistics: it’s a very shallow and flat kind of modelling that essentially captures the range of behaviours you have seen so far. However, because it is not getting at the underlying mechanisms, its ability to generalise to novel situations is rather limited.

AiLab: And these limitations are still an issue in AI?

RG: Yes, absolutely.

AiLab: Some people hold the opinion that deep learning has solved a lot of the issues in AI. How do you see the current progress, along with the challenges and issues still faced in AI?

RG: Recently we’ve had confirmation from the likes of Geoff Hinton saying that people should not see Deep Learning as the be all and end all of modelling. There’s also a recent paper from Yuille that talks about the limitations of deep learning, where he demonstrates that the performance of deep net systems can be very much dependent on the examples used for training. For instance, a deep net system can be very good at recognising a sofa from some specific view angles, but absolutely rubbish at recognising the sofa from other angles. This just reflects the statistics of the image sets that are used, and the hope is that if compositional systems are done properly they can overcome this problem by allowing generalisation far outside the training set. Ideally, you’d like a system that can learn from a very small number of exposures: if it already has the appropriate components in stock, then it should be able to look at a novel situation and recognise it as a novel composition of familiar components, so in one exposure it’s done. Whereas the number of training examples you have access to is the limiting factor with Deep Learning and similar kinds of models.
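To make the binding-and-composition idea concrete, here is a minimal sketch in the spirit of Smolensky’s tensor product variable binding: roles and fillers are random vectors, binding a role to a filler is an outer product, a structure is the sum of its bindings, and probing the structure with a role approximately recovers the filler. The dimensionality, the role and filler names, and the use of Python/NumPy are illustrative assumptions, not a description of any system Ross has built.

```python
# Illustrative sketch of tensor product variable binding (after Smolensky, 1990).
# Roles ("slots") and fillers ("components") are random vectors; binding is an
# outer product; a composite structure is the superposition (sum) of bindings;
# probing with a role recovers an approximation of the bound filler.
import numpy as np

rng = np.random.default_rng(0)
dim = 256

def random_vector() -> np.ndarray:
    """A random unit vector standing in for a learned component representation."""
    v = rng.standard_normal(dim)
    return v / np.linalg.norm(v)

fillers = {name: random_vector() for name in ("sofa", "lamp", "cat")}
roles = {name: random_vector() for name in ("left_of", "right_of")}

# Compose a novel structure on the fly: cat in the left_of slot, sofa in the right_of slot.
scene = (np.outer(roles["left_of"], fillers["cat"])
         + np.outer(roles["right_of"], fillers["sofa"]))

# Query the composite: what fills the left_of role?
probe = roles["left_of"] @ scene          # ~ fillers["cat"] plus a little crosstalk
answer = max(fillers, key=lambda name: probe @ fillers[name])
print(answer)                             # prints "cat"
```

Because high-dimensional random vectors are nearly orthogonal, the probe recovers the bound filler up to a small amount of crosstalk, which is what lets a familiar component be recognised inside a composite the system has never seen before.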
AiLab: The huge interest in AI at the moment is great, but with that also comes hype within the media. Do you have any valuable insights on what people should be aware of when it comes to dealing with the hype?

RG: Yes, the hype is a real thing, and I don’t think any of the systems are capable of performing at the level of expectation that’s been raised by the amount and type of coverage AI typically gets in the open press. Unfortunately, as you move away from the technical area, there is a tendency for people to get more enthusiastic about having magic pixie dust. Merely having AI, however defined, is not going to guarantee that there will be a good practical outcome. It’s very much like my opinion on anything to do with Bitcoin, cryptocurrency and the like: there is a strong tendency to view it as magic pixie dust that you sprinkle on any arbitrary problem and the result is profit. Given the reality of the hype, the danger is that the true performance is going to be seriously out of line with the expectations, and anybody who is working with or around AI has to have a strong commitment to understanding what the limits are. Sometimes the limits actually arise from the context the system is being embedded into, so for a practical application the constraints are most likely going to come from the implementation context, and that includes the cultural aspects of the context as well. In general, it would be really helpful if everybody who is working in this area had some degree of competence in decision theory under their belt and thought very carefully about the payoff matrix associated with these sorts of systems. The tendency for businesses, unless they are working with people who are very sophisticated in using probabilistic models, is to assume that a system that makes good decisions is making the correct decision 100% of the time. Whereas, if you look at these systems from a statistical and decision theory viewpoint, they are guaranteed to be making a variety of wrong decisions, and it’s very important to quantify how often they make those wrong decisions and also what the costs and payoffs associated with those wrong decisions are. For example, if AI is being used to make marketing decisions, the worst that can happen is that somebody gets an advertisement that was inappropriate for them, and you can live with the consequences of that. However, if the end result is the anti-terrorist police come barging into your house in the middle of the night, shoot you, and it turns out to be the wrong house, then the consequences are somewhat different. Ethical awareness is also important, because in many cases the people that are authorising these AI systems don’t wear the consequences if things go wrong. Their view of what is an acceptable loss may not be the same as the general public’s view.
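As a toy illustration of that payoff-matrix view, the sketch below combines how often a hypothetical lending model is wrong with what each kind of error costs. Every number is invented for the example; none of it comes from Ross’s work.

```python
# Toy expected-payoff calculation for a hypothetical lending model.
# Rows: actual good / bad applicant. Columns: model approves / declines.
# All rates and dollar figures are invented for illustration.
import numpy as np

outcome_rates = np.array([
    [0.92, 0.03],    # good applicant: approved correctly / wrongly declined
    [0.02, 0.03],    # bad applicant:  wrongly approved   / declined correctly
])

payoffs = np.array([
    [  300,  -50],   # profit on a good loan / lost margin on a wrongly declined good applicant
    [-2000,    0],   # write-off on a wrongly approved bad loan / nothing gained or lost
])

error_rate = outcome_rates[0, 1] + outcome_rates[1, 0]
cost_of_errors = (outcome_rates[0, 1] * payoffs[0, 1]
                  + outcome_rates[1, 0] * payoffs[1, 0])
expected_payoff = float((outcome_rates * payoffs).sum())

print(f"wrong decisions: {error_rate:.0%} of all decisions")
print(f"average cost of those errors: ${-cost_of_errors:.2f} per decision")
print(f"overall expected payoff:      ${expected_payoff:.2f} per decision")
```

Even a model that is right 95% of the time is guaranteed a steady stream of wrong decisions; whether that is acceptable depends entirely on the payoffs attached to each kind of error, not on the headline accuracy.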
AiLab: As someone that solves real-world problems, do you have any advice for companies that are looking to employ AI within their business?

RG: It’s good to be aware that the major changes that may need to be undertaken are not necessarily technical and could well be cultural changes within the management of processes. With regard to investment to support AI, I would be wary of committing vast amounts of money to infrastructure, because that infrastructure may not actually be what is needed, and after the money is spent is not the ideal time to find this out. Eugene Dubossarsky has the view that businesses should be investing in people before software. The point is that some companies purchase software thinking it will make life so much easier by making sophisticated decisions. The difficulty is that once the commitment is made to buy the software, there is automatically a political debt within the company, so there needs to be a rapid return on investment to justify the money spent on the software. Eugene’s view, and I agree, is that you’re better off investing in people to do that kind of work and letting them find out incrementally whether the business needs to invest in heavy-duty infrastructure to roll out a particular kind of solution. These days, on the analytics side, most of what you need to do can be done using open source software. It’s better to start small and explore the space of problems where the software can add value, rather than throwing in tools first and thinking you can hire people afterwards.

AiLab: Do you see companies focusing on AI technology, rather than identifying problems and asking whether AI is suited to solving them?

RG: Yes, I’ve worked on projects in the past where, literally, the view held by senior management is that they have a new sophisticated model, therefore their profit will increase. I would ask: ‘So how are you going to use the results of this model and what are you going to do differently?’ They would say they will not do anything differently, as they believe just having possession of this new model will increase revenue.

AiLab: Could you explain further for readers who may not know about investing in heavy-duty AI infrastructure?

RG: Yes, I am thinking in terms of heavy-duty analytical hardware: perhaps vast databases that can support analytics directly, or systems for big data computing in the cloud, and possibly software that has expensive licensing fees. For most analytics and AI work, you can effectively get the software for free, although there are still other costs to consider regardless of licensing fees. The other thing is that there is a bit of a fetish about big data. For a start, it depends on the problem. For the kind of work that I do in credit risk modelling, I can usually get by quite cheerfully with a sample. There are three case outcomes to be concerned with: (1) loans that had been issued and ended up being a good outcome for the lender; (2) loans that had been issued and ended up being a bad outcome for the lender; and (3) loans which hadn’t been issued, that is, applications that had been rejected. Back when I started doing this work, the rule of thumb was that if you had 1,500 samples of each of those case types, you were good to go. A perfectly good model can be built on a few thousand cases, and can also be built on a much smaller number of cases; it’s just more difficult the smaller the number gets. I take the view that if there is some effect that I can’t see in a sample of, say, 10,000 cases, but that I can see in a sample of 1,000,000 cases, then I would be really dubious about incorporating it in a model. If the effect is sufficiently subtle that it can only be seen with that larger sample, then is it likely to be sufficiently stable to use over time? One of the things about credit risk modelling is allowing time for the outcome of interest to occur. You can’t say immediately after someone has been given a credit line how it turned out. A couple of years needs to be given to see how things are going and, in principle, if it’s a mortgage this might mean waiting 20 years, and you don’t want to do that. If your definition of a bad outcome is being 3 months behind on payments, then by definition you need to wait at least 3 months for that to happen. Therefore, typically the data is at least 1 to 2 years out of date at the point when analysis starts, and by the time the analysis is complete and the models are built and implemented, it might be another 6 months to a year. Also, these models are not causal models; they are based on correlations, so nobody really believes that the time someone spent at their previous address is causally related to whether the person will do well on a particular loan. It’s about finding patterns that are sufficiently stable over time that they will last the expected lifetime of the model.
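A rough back-of-envelope calculation makes the 10,000-versus-1,000,000 point concrete. The 5% bad rate and the two-standard-error rule of thumb below are assumptions chosen purely for illustration.

```python
# Roughly how small an effect on the bad rate can be seen at a given sample size?
# Uses the standard error of a proportion and a ~2 standard error visibility rule.
import math

def roughly_detectable_shift(bad_rate: float, n: int) -> float:
    """Approximate smallest shift in the bad rate that is visible in a sample of size n."""
    standard_error = math.sqrt(bad_rate * (1 - bad_rate) / n)
    return 2 * standard_error

for n in (10_000, 1_000_000):
    shift = roughly_detectable_shift(0.05, n)
    print(f"n = {n:>9,}: shifts in the bad rate smaller than ~{shift:.2%} are invisible")
```

An effect that only shows up at a million cases is on the order of a few hundredths of a percentage point in the bad rate, which is exactly the kind of tiny and possibly unstable pattern being questioned above.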
AiLab: That’s a very interesting and important point: there is no one-size-fits-all with these systems. It always comes down to the problem needing to be solved?

RG: Yes, absolutely. There are even conceptual issues in how to frame the problem so that it can be attacked analytically. Careful attention needs to be given to all sorts of properties of the predictions that may well not be the things that get heavy airplay in the academic or industry marketing literature.

AiLab: Previously, you and I have had discussions about the mind-body problem. Could you explain what this is and why it is important in AI?

RG: You mean our discussions on what is known as embodiment in AI?

AiLab: Yes! :-)

RG: This actually feeds into a whole range of issues which have been raised in philosophy and comes under a range of different labels, all with slightly different flavours. The core thing here is that if you look back at traditional AI, the 1970s AI and symbolic AI, a lot of the early work in this area took what was almost a logical approach. Researchers effectively said: we’ll build a logical model, and if the system knows A and B, then we’ll conclude C. This approach effectively assumes a closed world, in that everything you need to know is already captured in the information and rules you have in the system. If your problem domain can be defined in such terms, then great, you can use that approach. For instance, if you’re reasoning about chess games that is OK, because you can completely capture everything you need to know in terms of those rules. However, if a system is working in an open-ended world, that’s no longer a reasonable assumption. It’s the equivalent of navigation by dead reckoning, because the system is completely locked in: it can’t look out the window or know anything about the bits of the outside world that are constantly changing. The system has nothing to actually connect it to the world. So, while a programmer might say, ‘my system believes this’, in no sense does the system actually have that belief; it’s just moving symbols around and has no notion of what they correspond to in the real world. If you want a system that actually knows things about the world in a realistic sense, then essentially it has to have some sort of body that allows it to interact with the world. It needs to have the possibility of being wrong, and the possibility of sensing and representing that its beliefs about the world are wrong. The world may be entirely synthetic, the world may be simulated, but there has to be some kind of external reality that the system is in communication with and can act upon for it to have even the possibility that its beliefs are somehow well-founded and are about that world. A personal pet peeve of mine is when people talk about Natural Language Processing systems and say that a system understands the questions. No! What it’s doing is coming back with answers which, some percentage of the time, a person judges as a reasonable answer to that question. There is no sense in which the system actually understands what’s going on in that world.

AiLab: That’s similar to ‘I forced my AI to watch videos’. The AI hasn’t been forced, as that would require understanding and human emotion on behalf of the computer, which it doesn’t have. Unfortunately, the general public and media hear this narrative at the moment and may believe this assignment of human traits to the systems to be true.

RG: Yes, I do believe that people have what’s probably a built-in tendency to anthropomorphise things, like giving their favourite hammer a name, and to see things in terms of causality where there isn’t necessarily any. It’s fine to do this, but you don’t want to believe these things uncritically.
AiLab: It helps us understand the world, but does it help us move forward within AI if we believe that?

RG: I think that’s the thing with a lot of recommender systems and automatic language translations. The point is that most of the time it’s the human who’s doing the work, taking the fairly horrid output from one of these systems and then wrenching some meaning out of it by applying their own human level of understanding, their knowledge of the context and everything else.

AiLab: Do you think there are any particular AI research areas that we should keep an eye on at the moment, and is there anything you’re really excited about?

RG: Well, there’s my usual pet soapbox, which is compositionality, and I certainly think there’ll be more progress in this area. Moving on to areas that other people might care about, the work that I find interesting and useful is still the work on causality. Last year Judea Pearl was trying to persuade people in the Machine Learning sector that it actually is still an important problem, and I certainly think it is. Essentially, causality is all about: if I intervene and do ‘X’, what will change as a consequence? It’s trying to get a model of which changes have what consequences, and what would be the actions to take to achieve desired outcomes. For safer systems, I think it’s inevitable we’ll need some aspects of causal reasoning. Other things I find interesting are work on the explainability of models and work on discrimination and bias in models; that comes from my work in the credit framework. For a long time there has been legislation around discrimination, with consequences for models in the credit scoring area. People have to be aware of this and ensure that models do not discriminate. Likewise with explainability: for example, if you had an enormous model that is a complete black box to you, how confident would you be in giving that model control over something you really care about?

AiLab: Do you have any other thoughts around AI and emerging tech that you would like to share?

RG: I think I’ll just put on my general curmudgeon hat and say: be excited by all means, but treat everything with a fair degree of skepticism. There’s a lot of really exciting stuff going on, but it can be a very long time between exciting thing ‘X’ turning up and it having a useful role and an applied purpose. Also, people need to be quite careful; it’s all very well coming up with a great new quantum-computer-based technique, for example, but if you’re the only person on the planet who can understand what that model is doing and how to maintain it, then I don’t think anyone should be using that model; it’s just too risky.

AiLab: Great advice. Thanks so much Ross for your time and valuable insights.

RG: Thanks John.

AiLab would like to thank Ross for his time and for sharing his awesome insights.

This interview is copyright to koolth pty ltd & AiLab © 2018 and may not be reproduced in whole or part without the express permission of AiLab.