Scale Venture Partners

The Startup Tapes #010

Starting up in an AI-first world

with Jeremy Howard

Artificial Intelligence naturally favors companies that can wield large armies of PhDs, gigantic GPU clusters, and massive datasets, so it’s unsurprising to see large tech titans shoving us head-first into an AI-first world. If the Googles, Facebooks & Baidus of this world have the game so tilted in their favor, on top of a first-mover advantage, then what’s left for startups? Is every small company laboring in the space doomed to get acquired, or forgotten like so many unfriended chatbots?

Not so, says TED speaker and founder Jeremy Howard. We met him to discuss the differences between AI, Machine Learning, Deep Learning & Representation Learning; discuss the state of AI tooling today; and explore where small, nimble teams of entrepreneurs can take advantage of the huge platform shift that’s afoot.

To be alerted to new tapes, sign up for the newsletter.

Jeremy Howard, founder: These basically represent, kind of, decreasing areas of specialization. AI, on the outside, is really anything where we’re getting a computer to make intelligent decisions, which there is a specific definition for, but it’s not really important. Then inside that, there’s machine learning, which is getting a computer to make those decisions by learning from examples. This is how most AI is done today, because it turns out it’s much easier to have a computer learn to do something by seeing examples than it is to code it by hand. Inside that there’s representation learning, which is basically saying, “Okay, here’s an image, I want you to learn some things about the image but I’m not even going to tell you what to look for. I’m not going to tell you that you have to look for edges, or shadows, or areas of color. You have to figure out what to look for.” And then, in the middle, there’s deep learning, which is a way of doing that. Deep learning uses something called a neural network, with many layers, over a thousand layers nowadays, to come up with a rich structure of representations. So, for example, for an image, at the bottom level it will learn to recognize, all by itself, edges and gradients, and in the middle, it’s going to be able to recognize noses versus eyeballs, and at the top it will recognize the difference between Arnold Schwarzenegger and Katy Perry. For me, deep learning is the really exciting thing to look at, because that’s the thing that has led to state-of-the-art results in areas that we never thought computers would be able to handle in our lifetime, and it’s been able to do it in three to five years. So, that’s totally revolutionized our understanding of what’s possible.

Tim Anglade, Executive in Residence at Scale Venture Partners: And so, you know, there’s a lot of talk around tech companies investing billions in any one of those forms, particularly deep learning. Baidu, Google, Facebook, they all seem to have some sort of effort towards that. And increasingly, early-stage startups that are just getting started seem to be investing in this in some way, shape, or form, whether it’s a consumer company that invests in having some recommendation features powered by deep learning, or pure plays, people that are building deep learning stacks and trying to sell supporting services around them. So do you feel like that hype is justified? And I guess, more personally, why do you think this is happening all of a sudden right now?

Jeremy: So, to me, the potential for deep learning, as we sit here, is deeper than the potential of the internet was in the early ’90s. And I remember people asking me the same question in the early ’90s, about the internet, like, “The internet is so hyped, it seems like everybody is talking about the internet.” And there’s no way it was overhyped, because it was very clear that it was going to impact every part of our economy, and our lives. It’s equally clear that deep learning will impact every part of our economy, and our lives, and I say that not as a wild guess, but we know from academic research that every area that people have looked at with deep learning rapidly becomes state of the art, and gets well beyond what people ever thought was possible. So, for example, better than human capability at speech recognition in Chinese and English, better than human capability at recognizing the content of photographs, cars which drive more safely than humans do. So, for anybody who’s watching this right now, whatever it is that they’re doing, they will be doing it assisted by deep learning in the not too distant future. Anybody who’s not doing it with deep learning is going to fall by the wayside, just like booksellers who didn’t use the internet fell by the wayside. So you’re not going to have a choice to ignore it, because we know mathematically that it can solve any given problem. We know empirically that it is now solving complex problems better than people or any other computer system. And we also know that doing it takes a very, very short amount of time, both to compute and to develop, because we don’t have to hand-write any programs; it learns it for you.

Tim: Yeah, you were giving some examples, and I think you were saying earlier in the conversation that it takes about six months, on average, for somebody to start applying deep learning tools to some industry, some different area, and start yielding results that are better than before.

Jeremy: Well, less. I started a medical company, there were four of us, none of us had any medical background, and within two months, we had built a better than state of the art, better than human radiology system, for identifying malignant cancer. And if you identify lung cancer early, your probability of survival is ten times higher, so lots of people have been studying this for a long time. We felt a bit like frauds, going in and studying lung cancer as non-medical people, but we knew from the research that deep learning works, and lo and behold, it worked again. So, yeah, it’s something that non-experts, non-domain experts, can quickly apply. I’ve seen it applied recently to weather forecasting by people who aren’t meteorologists. It really changes the nature of expertise, and the nature of what’s possible, by small numbers of smart people.

Tim: That makes sense, right. And we talked about this, something I never really appreciated about the opportunity with deep learning: that you don’t need to be a domain expert, that you don’t need to have pre-existing knowledge of much beyond maybe deep learning itself, to be able to go and start finding an innovative solution, whether that’s for cars, or medicine, or any one of those things.

Jeremy: Although, nowadays I’m thinking a little bit differently, again, which is that domain expertise is really valuable, for knowing what problems need to be solved, what are the constraints on solving those problems, whether it’s regulatory, or personnel based, or whatever. And one of the problems worth solving, and the thing that I would love to be able to do, is have the people who know the answers to those questions, i.e. the domain experts, have access to deep learning themselves. Because at the moment they don’t. At the moment deep learning is really pretty inaccessible, it really does require a fairly significant level of knowledge of programming and math, and a lot of the information on using it effectively is kind of locked up in people’s heads, it’s part of the kind of artisanship of designing these architectures. So my big hope, what I hope I can contribute, is to make it so that a domain expert can say, “Okay, I’m working on a system to identify crop disease,” say, “and figure out how to deal with it, to improve food availability in India.” I’d love it if that person could be like, “Okay, I’m just going to go do that right now.”

Tim: Makes sense. Yeah, and really, you know, you’re right. There are a lot of obstacles in the technology itself, which is why I personally find a lot of those pure plays, building a kind of great deep learning play, kind of very interesting, because there’s so much to do about the user experience of it, how you are able to use it and apply it to your industry and to your problem.

Jeremy: Yeah, I don’t think that there will be that much in the way of pure play deep learning companies, any more than there are many pure play internet companies. I remember, in the early days of the internet, everybody was starting up internet service providers, everybody wanted to be an internet company. I think there will be very few deep learning companies, there will be one or two, and the rest of us will be hoping to build something more like Amazon, a bookstore on the internet. So it might be a medical diagnostic company, using deep learning, or a weather forecasting company, using deep learning. Agriculture optimization company, using deep learning.

Tim: No, that makes a lot of sense. But let’s talk more about opportunities for start ups, one thing that seems striking in a lot of these machine learning, deep learning, type of conversations, is the advantage of having a large data set, when you look at Google, and Baidu, and Facebook, and all of those people that have large quantities of data, personal data, and are able to very quickly train their neural nets and deliver -

Jeremy: Can I just, I mean, I hear this so much, and it drives me crazy because it’s just not true. At my previous company, when we built our lung cancer model, we had a thousand examples of people with cancer. And this was much harder than most things, because these are three-dimensional CT scans, and also much harder because nobody had really studied these with deep learning before. The reason that this kind of false meme has appeared is a couple of things. The first is that anybody doing their PhD, by definition, is trying to push the technical envelope, so they’re trying to work on levels of scale that people haven’t worked on before, because that’s how you get a PhD. And then, after that, they go and join Google or Facebook, they have Google and Facebook sized data sets, and they try to solve Google and Facebook scale problems. To me, the opportunity for an entrepreneur is the opposite, it’s to be able to say “Okay, what do you know about…” Maybe your mom’s a real estate agent and you know a lot about that, or maybe you’ve spent your last 10 years doing network security, and you understand Denial of Service attacks. Okay, go do that, with deep learning. Like, whatever it is you do, do it with deep learning. So you could use deep learning to figure out the right amount to sell a house for, or you can use deep learning to automatically figure out that a particular packet is from a DoS attack, and figure out what the optimal set of rules to block it is. You’re not going to need ridiculous amounts of data, because what you can do is you can leverage the stuff that Google and Facebook and those guys have already done. They’ve already built networks which will be 90% of the way there, and then for that last 10%, you can use your 200 examples to close the gap.

Tim: That makes sense.

Jeremy: That’s called transfer learning. So, transfer learning and fine tuning are the critical things for entrepreneurs who are interested in using medium size data sets to solve problems they’re interested in.
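
To make the idea concrete, here is a minimal sketch of transfer learning in plain Python. It is an illustration only, not the method used by anyone in this conversation. The function pretrained_features is a hypothetical stand-in for a large network that someone else has already trained, and its weights are never updated. Only a small logistic-regression "head" is fitted, on 200 synthetic examples. In practice the frozen part would be a real pretrained model loaded from a deep learning library, and the data would be your own.

```python
import math
import random


def pretrained_features(x):
    """Hypothetical stand-in for a frozen, pretrained network.

    It maps a raw 2-D input to three fixed features. Nothing in here
    is ever trained in this sketch.
    """
    return [math.tanh(x[0] + x[1]), math.tanh(x[0] - x[1]), 1.0]


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def fine_tune_head(examples, labels, epochs=50, lr=0.5):
    """Train only a small logistic-regression head on the frozen features."""
    feats = [pretrained_features(x) for x in examples]
    weights = [0.0] * len(feats[0])
    for _ in range(epochs):
        for f, y in zip(feats, labels):
            p = sigmoid(sum(w * v for w, v in zip(weights, f)))
            # Gradient step on the log-loss for one example.
            for i, v in enumerate(f):
                weights[i] += lr * (y - p) * v
    return weights


def predict(weights, x):
    f = pretrained_features(x)
    return sigmoid(sum(w * v for w, v in zip(weights, f))) >= 0.5


# A small synthetic data set: 200 labelled examples (an assumption for
# illustration; real data would come from your own domain).
random.seed(0)
examples = [(random.uniform(-1, 1), random.uniform(-1, 1)) for _ in range(200)]
labels = [1 if a + b > 0 else 0 for a, b in examples]

head = fine_tune_head(examples, labels)
accuracy = sum(
    predict(head, x) == bool(y) for x, y in zip(examples, labels)
) / len(examples)
print(f"training accuracy with a frozen extractor: {accuracy:.2f}")
```

The design choice to notice is that the expensive part (the feature extractor) is reused as-is, and the small data set only has to fit a handful of weights on top of it.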

Tim: No, that makes sense, and then it’s not so much about building the best possible end service as about knowing what question to ask. That industry-specific knowledge is where you can shine.

Jeremy: And this is what all entrepreneurs try to learn, to use their particular knowledge. So, even for me, I spent 10 years in strategy consulting, and so one of the things I know a lot about is how domain experts work in lots and lots of types of companies, and so now I’m thinking, “How do I build something to support those people who I understand really well, and who I know?” This thing I did with medicine is kind of a bit off that standard chart, and that was because I really wanted to study what’s possible. But I think for instance, you know, I had already built three startups, so it wasn’t like I was going to try to do something easy and quick. I was trying to do something slow and hard. But I really think the right approach is still the same, which is follow your passion, or follow your interest, follow your knowledge. You know all of the bits of your industry which everybody hates, and if you could use an algorithm that could solve any problem better than it’s ever been solved before, which you probably now can, what could you do with that? You’ll have to unlearn a lot, like all the things that you thought were constraints, all the things you thought were hard. If you’d thought, for example, that weather forecasting requires 100 million dollars of fluid dynamics hardware, forget it. None of that is useful, because you can learn it, and I believe less than a second is how long it takes to do it.

Tim: It’s shocking, really, how much you get unexpected results and shortcuts through some of the early deep learning examples we’ve already seen. That makes a lot of sense. Well thank you so much.

The Startup Tapes chronicle the highs & lows of building a startup, through candid, immersive interviews with founders, operators & advisors. Tim Anglade, an Executive-in-Residence at Scale Venture Partners and formerly with Realm, Apigee, and Cloudant, leads the project with the goal of demystifying the process through which startups emerge, grow & succeed. His unfiltered interviews transcribe the conversations we often hear in the boardroom, amongst our portfolio community and with entrepreneurs and partners we engage with every day.

© Scale Venture Partners 2016, all rights reserved.