Luis Serrano: Exploring LLMs and the Journey of an AI Scientist
What's AI Podcast Episode 15 with Luis Serrano from Cohere
Welcome to another exciting episode of the What's AI Podcast! In this episode, I had the pleasure of interviewing Luis Serrano, an AI scientist, YouTuber (Serrano.Academy), and author of the popular book "Grokking Machine Learning." Currently, Luis is working at Cohere, where he is building the Large Language Model University (LLMU), a fantastic resource for learning about LLMs (Large Language Models).
The interview delves into Luis' academic background, his journey in AI, and his passion for teaching. With a PhD in math and experience as a researcher, machine learning engineer at YouTube, and educator at Udacity and Apple, Luis brings a wealth of knowledge and insights to the table. His expertise extends beyond LLMs, covering high-tech areas such as quantum computing and deep learning.
Luis shares his fascination with LLMs and why they captivated his attention (pun intended). He discusses how these models surpassed his expectations by exhibiting human-like language generation capabilities. This revelation motivated him to learn more about LLMs and eventually contribute to their development.
The conversation also touches on the prerequisites for learning natural language processing (NLP) and AI in general. Luis emphasizes the importance of a diverse background, combining different fields and perspectives to enrich the AI community. While programming skills and a basic understanding of math are valuable, he encourages aspiring learners to dive into practical hands-on experiences from the start.
Furthermore, the discussion explores the topic of pursuing a PhD in AI. Luis shares his personal experience and reflects on the value of a PhD. While he found it rewarding, he also highlights that it is a personal choice and not an absolute requirement for an AI career. Instead, he emphasizes the significance of an open and creative mind, problem-solving skills, and a passion for tinkering.
Luis provides insights into LLMs and generative AI. He explains how generative AI has evolved from simple classification tasks to the exciting realm of text and image generation. LLMs, in particular, have the power to create realistic text given minimal input, revolutionizing the way we interact with AI-generated content.
To dive deeper into this captivating discussion with Luis Serrano, listen to the full episode on my YouTube channel or stream it on Spotify. Discover the potential of LLMs, gain valuable insights into the world of AI, and explore the fascinating journey of an AI scientist and teacher (amazing teaching/communication advice towards the end!). Don't miss out on this value-intensive conversation!
Stream the episode on Spotify.
Listen to the full episode on YouTube:
Full transcript:
[00:00:00] This is an interview with Luis Serrano. Luis has a PhD in math and is an AI scientist, YouTuber, and author, all in AI. He's now building the Large Language Model University, or LLMU, at Cohere. This interview goes over the whole LLM University as well as his background, great tips for teaching, and many other things related to prompting and large language models.
I hope you enjoy this interview. So my first question is the same one I ask everyone coming on the podcast. It's basically who you are, but more precisely, maybe your academic background and the various backgrounds that led you to Cohere. Hello, my name is Luis Serrano, and I'm very happy to be on your podcast, Louis.
Thank you for the invitation. I've had a bit of a zigzag story, but I've always gravitated towards being an educator. My life started as a mathematician. My goal was to be a professor, so I did an undergrad and a PhD in math, and I was a [00:01:00] researcher for a while at the university. But then I found machine learning fascinating. I found that it was very similar to the math I was doing, but so incredibly applicable. And so I switched gears and I moved to Google, where I worked as a machine learning engineer in the recommendations team at YouTube, recommending videos.
That was a lot of fun, but I gravitated towards education, I really enjoyed teaching and everything, so I moved to Udacity. For a while I was teaching the courses there with a great team, and we were teaching machine learning, deep learning, all these things. That's a period when I learned a lot.
And that's where I started my YouTube channel, Serrano Academy, where I put all the material that I teach. I try to teach in a simple way, because I was trying to understand everything in a simple way, without formulas, without everything. And I wrote a book at that time called "Grokking Machine Learning."
And I continued in the education space, but in the tech education space. I moved to Apple to teach; there's something called Apple University, where they teach the employees internally, and I was teaching machine learning courses there as well. And then different bugs bit me. I got bitten by the quantum computing bug and got interested in that.
With the influence of a friend of mine who's a [00:02:00] quantum computing scientist, I moved to Zapata Computing, in Toronto, and I was a researcher in quantum computing. I found it very fascinating, and I started making educational videos on that as well. And finally, the last bug that bit me was large language models, because that's the big thing right now.
So after a few years as a quantum computing scientist at Zapata, I switched to Cohere, where I'm working on large language models and also creating courses and educational material and all that. So we've created something recent called LLM University, which I think we'll talk about later, but that's my story.
Yeah, we'll definitely dive into the LLM University, but first, may I ask, what was the thing that bit you about LLMs? Why did you want to learn more about them and dive into this professionally? Yeah, I mean, they really blew my mind. I was seeing the progression of machine learning, and I found that things were predictable, right?
Like, it was going fast, but I would say, okay, I think the next thing we're gonna do is be able to identify this and classify that. When I started seeing how these models were talking like a human, it blew my mind. It was more [00:03:00] steps than I thought would happen in that amount of time. And so I thought, I have to learn this stuff.
I got very curious and I started asking friends; that's how I start everything. I started asking my friend Jay, and all of a sudden I was working on it. That's super cool. You also had the luck, not the luck, but you did a PhD and you had the right kind of background to get into LLMs and natural language processing in general, because you were already familiar with the field and the math and programming. But what would you say is required to start learning about natural language processing and AI in general?
Yeah. Yeah, great question. I think any field works. I was lucky to have done a lot of math without any plan of going into AI, and it helps me visualize things in high dimensions and things like that. For things like programming, I had to learn a lot, right? Like, I have to constantly be learning, because that was not my background.
And I think that's the case with everybody. Everybody, not just for LLMs but for machine learning, has [00:04:00] some background, some things they enjoy, some way they see the world, and that's very valuable to bring into AI because it enriches the field. So I wouldn't want only math people or computer science people in AI.
I want them too, but I don't want just folks like that; I want people from all walks of life. Yeah. And in data science teams, the more diverse, the better. Because if you're trying to get these systems to act like a human, you need all kinds of humans, not just one type. So I know the more diverse the team, the better.
And that being said, you do need to know some things. Every time, knowledge gets more democratized, and every time, the technical barrier gets smaller in terms of actual technicalities: the languages get easier. Many years ago, somebody would write in assembly, and now we write in Python, and every time the packages get easier, and now there's gonna be a lot of prompt engineering.
So the technical part keeps allowing more people in, which I find fascinating. But I think knowing some basics of programming is necessary. So I would say one course in Python that gives you the idea of how programming works is good. And then the math, I think, is also important, but I differ from most experts who say, go home, learn all the [00:05:00] math, and then come back and start making a simple model for classifying house prices.
I don't agree with that. I think you should definitely get an idea of the math to be able to train these models better and to be able to fine-tune them, et cetera. But you shouldn't wait and learn the math first; you should learn it as you go. So I think people should get into machine learning by training models, by applying them to what they like, to the kind of problems they are passionate about. And as they go, they learn the math. They see a matrix, for example, in a neural network or something, and it's, okay, I'm gonna learn a little bit of that, but as you go. So that was a long answer, but the short one is: some programming, some math, but definitely going in and starting to practice hands-on.
So if people want LLMs, just take a lab on LLMs, start working and prompting, and see how it goes. Yeah, I also agree with being super practical. It's definitely more motivating, and you just end up doing more and learning more by building something that interests you. But if you want to become more of an expert, and let's say you are aiming to work at DeepMind [00:06:00] or OpenAI, or really work in AI research.
Would you say, and this is a question that personally interests me just because I'm doing a PhD and I'm always thinking about that, would you say that a PhD is worthwhile to do AI research and work in AI? And whether it is or not, what would you say about your own experience with the PhD? Was it worthwhile, or would you do it again if you were back then with the knowledge that you have now?
What do you think about the PhD? Yeah, I think a PhD is a personal choice. I enjoyed it. At the time I made the decision, I only knew pure mathematics, and so no applicable skills that I knew of. I think I had them, but I didn't know I had them. I just knew pure mathematics, and the only way to keep eating while you're in pure mathematics would be to do a PhD, then a postdoctoral fellowship, and then become a professor.
I was on that path, so I never questioned it. My colleagues and I were a small group, and our only question was where we'd end up or what kind of math we'd do. The question of whether to do a PhD never came to me.
I'm thankful for it now. It was some of the best years of my life. It was [00:07:00] stressful and it was a lot of work, but it was some of the best years I've had. The things I learned in my PhD didn't really help me concretely, like I don't use the actual things, but it gave me a few things. It gave me a very strong intuition for math, because I was just doing it all day for five years. And grinding power: I'm gonna continue with this, and, you're a PhD student as well, right? You just keep going, hitting the same wall until it breaks. So I think it gives you that, and it shows employers that, right?
If an employer sees a PhD, they go, okay, this person can focus and has determination. But I wouldn't say people should do a PhD for the purpose of a certain job, unless that job is professor, which requires a PhD. And I wouldn't say that you need a PhD for an AI job. Even for a researcher, I would say it's not super necessary.
I think an open mind, a creative mind, and a love for tinkering and solving problems are really what matters. And there are a lot of ways somebody can get into the research field without a PhD. They can start working as a data scientist and maybe get into some projects [00:08:00] and join a research team.
I've seen a lot of researchers that don't have a PhD, so I don't think it's absolutely necessary. There are other ways to get into research. Yeah. And among these other ways, would you think that one can learn fully online and never go to a university or do a master's degree or more? I think we're getting there.
For example, everything I've learned in my life that's not math, I've learned outside of the classroom, so I've learned it online. Programming: I learned some programming in university, but I forgot it, and all of the Python I know, I learned online; all the machine learning, online. I think we're getting there. I think even if you learn in university, you still need to learn some things outside.
Yeah. Because things move so fast. Maybe now you can major in machine learning, but by the time we get courses on LLMs in university, there's gonna be a new thing. Universities can't move as fast as the technology, and you have to rely on online learning for that. The great thing would be a combination of the two, if you can do some university and some online.
But I do think we're at the point where people can learn everything online, and that makes me happy, because it means people who may not have [00:09:00] the opportunity or the means to go to university are able to learn a lot of stuff and contribute. I've certainly seen great people who learned a bunch of stuff online and became geniuses in other fields, and that makes me very happy.
Yeah, it's really cool how AI also democratizes a lot of things when we expected the opposite. Lots of people were afraid of new technologies like AI because they would make the rich richer and the poor poorer. What I feel right now is that it's mostly the other way around: it allows a lot of people who don't know how to do lots of things to, for example, use the free ChatGPT application and learn to code, or use it to code, or to do anything else.
It democratizes a lot of skills that you can now learn super easily, or even apply without learning them. So I feel like it's really cool, and it also links with what you are doing at Cohere with the LLM University. But before diving into this very cool topic, I'd also maybe want to [00:10:00] cover some definitions.
So for instance, we often hear gen AI or generative AI, as well as LLM. And I'd love to first start by explaining what those two are for you. So, for example, starting with what an LLM is, and then we get to generative AI, right? I'll introduce it quickly and then talk about LLMs. I think what was cool, like 10 years ago, was classification.
If you could tell whether a text is talking about something, or whether it's happy or something, or whether an image is a dog or not, that blew our minds, right? And now it's the opposite, right? It's generation, creation. You don't give that much information; you just give a bunch of data.
For a human, for example, answering questions is easy. If I give you a picture of a dog and I say, what's that? You say it's a dog. But drawing a realistic picture of a dog, that's much harder, right? And so for computers, it's also much harder to just draw stuff or write stuff. And LLMs do just that, right?
Like, they just generate text. But an LLM, I like to see it as a bunch of little moving parts, and all those parts are pretty simple, but put them together, and also [00:11:00] throw in a lot of data and a lot of computing power, and they just do magic. So I would say the main three parts are embeddings, transformers, and attention. Embeddings are really the most important part, in my opinion, of an LLM, because they're where the rubber meets the road, right?
It's where the computer starts talking. It's kind of how you translate for the computer, right? We talk in words, computers talk in numbers, and whatever you do, however smart the human is and however smart or huge the computer is, you need a good bridge. And if you don't have a good bridge, nothing happens.
And the embedding is the bridge, right? It's what turns words into numbers, and the better the embedding, the better the models. If somebody comes up with a better embedding tomorrow, all the models get better, right? You have to find a good way to turn words into numbers. Not just one number, but a series of hundreds or thousands of numbers that are similar if the words are similar, right? If I say apple, it gives me a bunch of numbers, and if I say pear, it gives me similar numbers, right? And if I say truck, it gives me different numbers, right? So the embedding is a huge thing in a large language model. And then a transformer would be what actually generates the text.
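The idea that similar words get similar lists of numbers can be sketched with toy vectors and cosine similarity. The 4-dimensional vectors below are made up purely for illustration; real embeddings are learned and have hundreds or thousands of dimensions:

```python
import math

# Hypothetical 4-dimensional embeddings; real ones are learned,
# not hand-written, and much longer.
embeddings = {
    "apple": [0.9, 0.8, 0.1, 0.0],
    "pear":  [0.8, 0.9, 0.2, 0.1],
    "truck": [0.1, 0.0, 0.9, 0.8],
}

def cosine_similarity(a, b):
    """Similarity of two vectors: near 1.0 means same direction, near 0.0 unrelated."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(x * x for x in b))
    return dot / (norm_a * norm_b)

print(cosine_similarity(embeddings["apple"], embeddings["pear"]))   # high: similar words
print(cosine_similarity(embeddings["apple"], embeddings["truck"]))  # low: different words
```

The only design choice here is cosine similarity, which is the usual way to compare embedding vectors because it ignores vector length and looks only at direction.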
And it's [00:12:00] really a big neural network, I think with a bunch of padding on the sides. And you put in a lot of little things, like positional encoding, to tell the computer how the words are organized, et cetera. But it's just a big feedforward neural network like the ones we saw 10, 15 years ago, but with a lot of stuff.
And then the important thing is attention. I already said embeddings are the most important, so I can't say attention is the most important, but attention is really what made this thing start working really well. It was that one step, the paper "Attention Is All You Need."
That's when they started working really well. And what attention does is it adds context, right? Because if I say, for example, apple, you don't know if I'm talking about the brand or the fruit, right? And no matter how good the embedding is, it's not gonna know, because it's a word by itself. And so what attention does is it starts modifying the numbers that are attached to the word apple based on the other words in the sentence. So if I say, please get me an apple and an orange, then you know which apple I'm talking about, because the word orange pulls the word apple, right? So attention is what really gives context. And now the [00:13:00] computer knows what it's talking about, not just the words.
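The "pulling" Luis describes can be sketched as one toy attention step in pure Python. The 2-D vectors and the single softmax-weighted average are made up for illustration; real attention uses learned query/key/value projections over much longer vectors:

```python
import math

# Made-up 2-D word vectors: dimension 0 is roughly "fruit-ness",
# dimension 1 roughly "tech-ness". "apple" starts out ambiguous.
vectors = {
    "apple":  [0.5, 0.5],
    "orange": [1.0, 0.0],
    "phone":  [0.0, 1.0],
}

def attend(word, sentence):
    """Return `word`'s vector, pulled toward the other words in the sentence."""
    q = vectors[word]
    # Attention weights: softmax of dot products with every context word.
    scores = [sum(a * b for a, b in zip(q, vectors[w])) for w in sentence]
    exps = [math.exp(s) for s in scores]
    weights = [e / sum(exps) for e in exps]
    # New vector: weighted average of the context vectors.
    return [sum(w * vectors[s][i] for w, s in zip(weights, sentence))
            for i in range(len(q))]

print(attend("apple", ["apple", "orange"]))  # pulled toward the fruit meaning
print(attend("apple", ["apple", "phone"]))   # pulled toward the tech meaning
```

Same word, two different contexts, two different output vectors: that is the context-adding effect attention provides.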
This actually started gluing the words together, instead of just having a bunch of separate words. So I think, to me, transformers, attention, and embeddings are really the big three components of an LLM. And lots of people don't necessarily want to know more about large language models, but mainly want to use them, like use ChatGPT, and do prompting and query them.
But I think it's important to understand how they work. Yep. And based on the LLM University that you built and are building, I assume you think so too. But what would be the reason you think it might be interesting to learn and understand how they work, in order to maybe use them better? What's your motivation in trying to understand them?
I think it's like a car, right? You can start driving it without knowing how it works. I don't need to know anything about the motor; I can just start driving, and that's what people should do: start driving, right? I don't need to learn the infrastructure of a car before I start driving.
Now, if I start getting more [00:14:00] serious and start taking long trips, one day the car is gonna break and I'll need to know what to do. Maybe I wanna become a race car driver, like I really wanna get into it; now I really have to know how it works. If I wanna be a Formula One driver, I need to know how every single corner of the car works and how the air flows, just absolutely everything, right?
So certainly you can start the same way with, I'm gonna say LLMs, but it's the same with all machine learning. If you wanna start working with LLMs, you don't need to know their infrastructure; you just need to know how to prompt them and get stuff back, and then how to take the API call and put it in your code and pipeline and everything.
And then a little more: maybe you wanna really run with them and fine-tune them properly, so you need to know a bit of their infrastructure. And then if you wanna be a Formula One driver of LLMs, a researcher, or really build them, then you have to know every single moving part, right?
The attention goes here, the fine-tuning goes here, and all that. So I think the farther you wanna get with them, the more [00:15:00] you have to know, but the entry bar is simply to start prompting them. Yeah, I definitely agree. And I assume some very rare people haven't even used ChatGPT yet, so what would you say is the best use, or some very practical use cases, of large language models?
Yeah, definitely. What we've seen so far, if we've played with ChatGPT or with any of the models, is something that talks back to you, right? Because that's the coolest application: it's just a chatbot that you can ask any question and it comes back. So that's definitely an application.
Another very important one is search, whether you're just searching or whether you also wanna chat, because as you've seen, these models hallucinate, and sometimes they say something with full confidence but completely wrong, and you can't tell, right? So if you enhance them with search, then they can be accurate and tell you where the information came from, et cetera.
So search is definitely one of the biggest applications right now. And then things like classification, for example. For classification, 10 years ago you would have to have 10,000 examples, 10,000 data [00:16:00] points, to train a model. With a good LLM, like with a good embedding, four or five are enough to train a model, right? So that's also a huge application of LLMs in general.
And embeddings. I think if you have a really good embedding, you can know a lot about your data. You could put your text into an embedding and it tells you so much: it clusters, it gives you a lot of insight.
So I would say generation, which is talking, like chatbots, et cetera, plus search, classification, and embeddings are some pretty solid use cases. And for chat, for example, you can definitely just use ChatGPT right away, and for embeddings or classification, you can use the Cohere APIs.
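The "four or five examples are enough" idea can be sketched as a nearest-centroid classifier over embedding vectors. The 2-D vectors and labels below are hand-made so the sketch runs; in practice the vectors would come from an embedding endpoint such as Cohere's:

```python
# Toy "embeddings" for a sentiment task; in practice these would come
# from an embedding model, not be written by hand.
examples = [
    ([0.9, 0.1], "positive"),   # e.g. "I loved it"
    ([0.8, 0.2], "positive"),   # e.g. "great product"
    ([0.1, 0.9], "negative"),   # e.g. "waste of money"
    ([0.2, 0.8], "negative"),   # e.g. "terrible"
]

def centroid(points):
    """Average of a list of same-length vectors."""
    return [sum(p[i] for p in points) / len(points) for i in range(len(points[0]))]

def classify(vector):
    """Assign the label whose few-example centroid is closest (squared distance)."""
    centroids = {
        label: centroid([v for v, lbl in examples if lbl == label])
        for label in {lbl for _, lbl in examples}
    }
    return min(centroids,
               key=lambda lbl: sum((a - b) ** 2
                                   for a, b in zip(vector, centroids[lbl])))

print(classify([0.7, 0.3]))  # positive
print(classify([0.3, 0.7]))  # negative
```

The heavy lifting is done by the embedding: because similar texts already land near each other, a trivial distance rule on top of a handful of examples is often enough.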
Yeah. But if you want, for example, you have a dataset, or you have data in your company, and you want to create some kind of chatbot linked with a search and retrieval system. Yep. If someone is just an entrepreneur and knows basic Python, or not even that, can they build this kind of application using Cohere or something else?
Is it accessible to anyone? Yeah, absolutely. The [00:17:00] Cohere generate endpoint works like ChatGPT, right? It's the part that generates language, and you can apply it to your own dataset. So if an organization has a particular dataset where they want the answers to come from, then you can plug that into a search model, right?
When given a query, it searches for the right answers and gives you where the candidates are, where the information is, and then you can use the generate endpoint to actually generate an answer out of this small set. So it's just much more accurate. The search is for accuracy, and then generate is for telling the answer in a coherent sentence, right?
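The search-then-generate pipeline described here can be sketched like this. Retrieval is reduced to naive word overlap and the generation step is a template stub; a real system would search with embeddings and call an LLM generate endpoint instead:

```python
# A toy search-then-generate pipeline over a tiny company "dataset".
documents = [
    "Our refund policy allows returns within 30 days of purchase.",
    "Shipping normally takes 3 to 5 business days.",
    "Support is available by email around the clock.",
]

def search(query, docs, top_k=1):
    """Rank documents by how many query words they share (toy retrieval)."""
    q_words = set(query.lower().split())
    scored = sorted(docs,
                    key=lambda d: len(q_words & set(d.lower().split())),
                    reverse=True)
    return scored[:top_k]

def generate(query, context):
    """Stub for the generation step; a real LLM would write a fluent answer
    grounded in the retrieved context."""
    return f"Q: {query}\nBased on our records: {context[0]}"

hits = search("how long does shipping take", documents)
print(generate("how long does shipping take", hits))
```

The division of labor matches what Luis says: search pins the answer to real documents (accuracy), and generation turns the retrieved snippet into a coherent sentence.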
So generate and search. And if you are using APIs like that, basically tools that are already built, and you are merely implementing them, how is learning about them, for example going through the LLM University, helpful to you? If you are just following quick tutorials to implement them in your company, can you do anything to improve them, or what do you gain by learning more about those models?
Yeah, that's a great question. The LLM University [00:18:00] also has those tutorials, so it does tell you how to plug them in. There's a technical challenge, which is knowing how to put these endpoints together, how to deploy it, all that stuff. But yeah, as you said, you're driving the car; you don't need to know how it works, right?
So that part is no problem, but there are a lot of parameters, and there's a lot you can do to an LLM to fine-tune it and just make it work better, including working with the embedding and linking things together. So if you want it to work really well, it's good to know some part of what's happening.
So we also have tutorials on fine-tuning, and people just go as far as they want, right? Yeah. You made a comparison with cars, and for cars, what we basically need is two hands, a leg, and just a lot of practice. But what are the base skills that you need to use LLMs on your company's website or application?
Is there something that is required to learn beforehand? Yeah, so you need 10 fingers and a lot of practice. No, I'm kidding. I think the more maturity you have with coding, the [00:19:00] better. It's all about practice. If you've done data science before and you know how to train models and you know how to deploy things, then that helps a lot.
But yeah, if you don't need to get into the detailed elements, that's perfectly fine; then just maturity with code, just knowing how to plug this into your pipeline. Sometimes you have to ask, do you wanna do this type of search? Do you wanna improve it this way? So a little bit of intuition about how the models work, but no need to go deeper than that.
And would you recommend any programming language? To give a concrete example: someone wants to build an app where you give it a link to a YouTube video and it generates a cool Twitter post about it. What would you suggest people learn?
I assume Python isn't really the language you're looking for there; it's mostly front end and then just what's needed to connect the API. Can you do everything with Python? What would you recommend? No, I would say people should learn the language that they need for their application, right?
If somebody's gonna work on LLMs, definitely, [00:20:00] I always recommend Python, because the packages always come first in Python, et cetera. But if somebody is working on front end or something, then definitely that's the language they should learn. Another very important thing, or new concept, is prompting. What do you think about it?
I will first say what my intuition with prompting is, and please correct me or just say if you are not in agreement with it. A lot of people say that prompt engineering is a job of the future and it's how we will do everything, but I feel like it's not really the future.
It's more of a transition. Because right now, I feel like, as you say, the better embeddings we have, the better the models will be. And right now, since they are not perfect, we still need to do good prompt engineering to prepare the language for the model. Yeah. And so I feel like the more we advance towards better embeddings [00:21:00] and newer models, the less we will need to be good at preparing the prompts for the models.
And so I feel like, in the end, large language models and AI in general will be super easily accessible with a very simple query that anyone can do. And so I'm not sure I agree with the fact that prompt engineering is a job of the future, and I would love to know what you think about that.
These things change so much, I don't know. I find that the only thing that's constant about the future is that it's gonna be changing all the time. So it could be that prompt engineering is the job of the future for five years, until the next one, right? But maybe the same thing happened with data scientists, right? It was the job of the present, and maybe we're not gonna need it because these models are gonna be much better. So maybe it is the job of the future for a little bit of time.
I think there are two schools of thought. One says prompt engineering is the data scientist of the future. Yeah. And another school of thought says that prompt engineering is just gonna be what fine-tuning models is right now. Which is important, [00:22:00] but you don't get a job as a model fine-tuner; it's a thing you do among all the other tools in a data scientist's toolkit. So my opinion: it's gonna be big, as you say. There's gonna be a day when we may not need to prompt correctly, because the model will know much better. But I think the level at which we communicate with a computer is gonna get higher and higher.
You know what I mean? The previous generation wrote in assembly; now we write in Python, which is just much clearer, with actual words and stuff. Maybe the next generation will speak in words to the computer, and they'll have to figure out the right words to put there. But maybe further in the future, the words won't really matter, because the computer will know what you're talking about.
There's just gonna be some higher level of structure that you need to have in order to do things properly, which is how jobs that don't require computers work too, right? You're still speaking English, or French, or whatever language you speak, but you need a high level of comprehension, right?
A CEO never codes; they only speak words, but they need to have some [00:23:00] structure for the words they use. I speak the same words, but I would not know how to handle a certain level of things. I think our level is gonna get higher and higher. The next thing will be prompt engineering, and the thing after that will be someone who thinks at an even higher level than a prompt engineer.
Yeah. But we'll see. So you are still quite optimistic about prompt engineering in general? I think so. Who knows what will happen, but I do think these models are gonna be more and more ubiquitous, and somebody who knows how to prompt them correctly is gonna be sort of the next job. So I think it'll grow.
What is the secret to prompting them correctly? Is it just like driving a car, where it's just sheer practice, or do you need to learn more about the models, or about embeddings specifically? What do you need to learn or do to be a better prompt engineer? Yeah, I'm kind of learning it as we speak.
I think knowing the math inside the model may not be super helpful, but knowing how they behave with certain things is, like being good at Google searching, right? Some people phrase sentences in a way that gets much better results than somebody who maybe just bumped into Google [00:24:00] for the first time, right?
So I think it requires a bit of that, and a lot of practice, and a lot of knowledge of what you're trying to get. Whatever your application is, you also need a lot of knowledge of the task. So in the end, if you're prompting it for a medical application, you need to know some part of that too. And practice also, I mean, chaining prompts, which maybe will be lost, right, because it may be something that the computer figures out.
But for now, I think chaining prompts is very useful. You make a simple prompt for something, and then turn that into something else. If you wanna build, for example, a very elaborate story or something, maybe in a few years the models will be able to build it.
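Prompt chaining as Luis describes it can be sketched with a stubbed model call. The `call_llm` function and its canned reply are made up so the sketch runs; in practice it would call a real LLM API:

```python
# Prompt chaining: each step feeds the previous answer into the next prompt.
def call_llm(prompt):
    """Stand-in for a real LLM call, with one canned reply for the demo."""
    canned = {
        "Give me a one-line premise for a mystery story.":
            "A lighthouse keeper finds a boat with no one aboard.",
    }
    # Anything not canned gets echoed back as a "draft" continuation.
    return canned.get(prompt, f"Draft based on: {prompt}")

def chain(steps):
    """Run prompt templates in order, inserting each answer into the next."""
    answer = ""
    for template in steps:
        answer = call_llm(template.format(previous=answer))
    return answer

story = chain([
    "Give me a one-line premise for a mystery story.",
    "List three characters for this premise: {previous}",
    "Write an opening chapter using: {previous}",
])
print(story)
```

Each link carries the earlier output forward, which is exactly the premise-then-characters-then-chapters workflow described above.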
But right now, it can be that you build a story with one prompt, and then you add some things with another prompt, and then, you know, chapters and characters and stuff like that. So I think it's a lot of practice and a lot of high-level knowledge. And who do you think should learn about AI?
I will try to give a concrete example, just to challenge you a bit. I have a friend who is a recruiter at a specific company, so he's looking for people to fill roles there. [00:25:00] He uses LinkedIn, goes through CVs, and interviews people. What could he do with AI, given that he doesn't know anything about coding or Python? He studied human resources and went straight to work, but he has an interest in AI and wants to improve his work or become more efficient.
I assume it's definitely beneficial for him to learn more about AI, but what exactly could he do? Could you give him a road to follow, and what would the expected results be? Yeah, absolutely. First, you asked whether people should learn AI. I remember when I was a kid, and this will reveal my age, people were asking, oh, should we learn how to use a computer or not?
And now it's not even a question. Kids learn how to use a computer before they can talk; they just play with apps and so on. Then the question became, should I learn how to use the internet or not? And now that's not even a question either, right? And I think AI is going to be like that in the future.
The question, should I learn how to use AI, is going to be everywhere. And there are so many repetitive tasks that we're simply not going to learn how to do. [00:26:00] Right now, you and I cannot make fire with two rocks, but at some point that was the one thing everybody had to know to survive.
So in the future there are going to be a lot of repetitive tasks that people won't know how to do, and we'll rely on AI for those. That's okay, because we're going to move to higher and higher grounds in our thinking. Now, for your friend: I think recruiting is the
perfect example of a job whose holder should get into AI. Without getting very technical, a recruiter can use some model, or prompt some model, or train something simple, to be able to go through a thousand resumes instead of five or ten, right?
So that part is the easy part. Being able to train a model that surfaces good candidates is certainly something simple: you train it on all the resumes of people you've hired and all the resumes of people you haven't, and get it to find patterns.
But where your friend's skills come in very handy is actually the part that's not AI. It's making sure the model has no biases. AI for recruitment is one of the places where you need to work the [00:27:00] hardest, because the biases that have accumulated in the past, you know, sexism, racism, they
explode; basically, those behaviors get multiplied. So as a recruiter, from having seen a lot of resumes, he has to know what kinds of things are correlated, what to look out for in the model, and what mistakes it is making.
And fixing it is not just deleting the name, because a lot of the information is correlated, and the model may pick that up and perpetuate these biases. So I think your friend can not only benefit from AI to do his job much, much faster, but can also come back, give input, and say, okay, this model is making these mistakes.
Let's fix them. You mentioned that one of the simplest things would be to train a model on all the CVs to help him process them. But is there any other way that doesn't require him to train a model? Because to train a model, he would need to start learning Python, then learn how training works, regularization, and everything related [00:28:00] to training.
Yes. So is it possible to use something pre-built that either already knows what a good CV looks like, so you can just prompt it, or give it some kind of database of CVs through an easy web application where you just upload them and it works? What can he do if he doesn't have any programming background? Yeah, I'm sure your friend can do this without writing a line of code. I don't know a specific place off the top of my head, but there are definitely many places where you can use pre-trained AI models with your
files and such, and get the model to work. So I don't think he needs to write a line of code, and he certainly doesn't need to do regularization or anything along those lines. Pre-trained models are already fine-tuned. So I would say go check out APIs, check out AWS, things like that.
And that's even before LLMs; that's something you can do right away. With LLMs, one thing you can do is few-shot learning, right? As I said before, to classify things you no longer need 10,000 [00:29:00] examples. You need four or five. So with an LLM, you can just say: hey, these 10 resumes are good,
and I'm just going to paste them there; and these other ones are bad, so I'm going to paste them there. It will already start giving you results, and the more you put in, the better it gets. And as I said, with embeddings being so good right now, the model will be able to pick up what's good or bad about a resume without you writing a lot of code.
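The few-shot setup Luis describes, pasting a handful of good and bad resumes and asking the model to label a new one, amounts to assembling a prompt like the one below. This sketch only builds the prompt text; the call to an actual LLM endpoint is deliberately left out, and the resume strings are invented for illustration.

```python
def build_few_shot_prompt(good_examples, bad_examples, new_resume):
    """Assemble a few-shot classification prompt from a handful of labeled
    resumes. The resulting text would be sent to any LLM completion
    endpoint (not shown here); only the prompt construction is sketched."""
    lines = ["Classify each resume as Good or Bad.", ""]
    for text in good_examples:
        lines += [f"Resume: {text}", "Label: Good", ""]
    for text in bad_examples:
        lines += [f"Resume: {text}", "Label: Bad", ""]
    # The trailing "Label:" invites the model to complete the classification.
    lines += [f"Resume: {new_resume}", "Label:"]
    return "\n".join(lines)

prompt = build_few_shot_prompt(
    good_examples=["5 years Python, led a data team"],
    bad_examples=["No relevant experience listed"],
    new_resume="3 years ML engineering at a startup",
)
```

The more labeled examples you append, the better the completions tend to get, which matches the "the more you put in, the better it'll get" observation above.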
Yeah, it's really cool. So basically the only skill you need is, as you said, one that we all already have without questioning it: knowing how to use Google and the internet. Basically, common sense. And is there a job that does not require learning about AI?
Are there people who should not even think about artificial intelligence or try to understand what it is, or is it relevant to anyone, regardless of their background? I have a hard time finding one. I'm sure there are some, but I have a [00:30:00] hard time, and it's one of those things where, if I name one job and we listen to this podcast again in six months, people will say, Luis, shut up,
that job already has AI. I think it's like when machines arrived and we started using them for things at home or at work. We're going to end up using AI for everything. It's going to take the mechanical part of our jobs away, and every job has an intuitive side and a mechanical side.
Maybe we don't know exactly which part is which, but the answer is: whatever the machine can do is the mechanical part, and whatever only we can do is the intuition, right? So a writer can come up with the idea for the story; that's a lot of intuition.
The wording and the writing are the mechanical part, so they can use a model to help with that. An artist, same thing. We see, for example, that architects don't draw with a pencil anymore; they use modeling tools and the like, but they still think of the building in their [00:31:00] head.
So I have a hard time thinking of a job that wouldn't benefit, but I would say the hardest thing for a model is anything that requires emotional intelligence: leaders, anything that requires empathy or group thinking. I think those are the last ones to go.
Those are the highest level, and yes, you can use AI as a tool there, but AI is not going to make those decisions. AI makes decisions when it comes to maximizing or minimizing a number, right? Whatever it does, it's minimizing the number of mistakes or the amount of time, or maximizing the number of
things that went right. For decisions that require empathy and emotional intelligence, many times there's no number you're maximizing; you're just thinking as a human. So I think those will be the hardest for a machine to do. And likewise, I would add that models merely interpolate from the data; they cannot really extrapolate.
So it's also hard for them to innovate or create [00:32:00] something new. Even though the generated image or text sometimes seems new, it's mostly a blend of what already exists, which is also what humans do. But we can still innovate a bit more than that, I believe. Yeah. It's hard to tell how much of what humans do is interpolating from what we've seen before
and how much of it is new. And I think, if anything, all this technology is going to expose that, because there must be some things we'll discover the machine can't do, and that may be exactly the part we add. Yep. And speaking of large language models, we've discussed a lot of topics that are very relevant to what you're building at Cohere, so I would love to dive a little deeper into what you're actually doing with LLM U, the Large Language Model University.
And to do that, could you maybe give us a brief overview of what it is, what you're covering, and the main insights people can expect from going through this university? Absolutely. [00:33:00] Yeah, I'm so excited about it. It's just been a blast working on it. When we started, we decided we want to bring everyone
up to speed on LLMs in every possible way. Obviously it's material you can go through at your own pace; you can pick this or that, or you can go over it in order. But the idea is just to have everything. And I always like to understand things in the simplest way, so I don't start with a formula
or a matrix. I just have little words lying around, and cars and fruits and things like that. That's what module one is, right? Module one is a friendly introduction to LLMs. We talk about embeddings, attention models, similarity; basically the architecture of a transformer. We go through the architecture of a transformer and also talk about things like semantic search, which builds on embeddings.
But it's all conceptual. It's all to give you an idea of what's happening inside the computer. [00:34:00] Don't worry about the matrices and the numbers you can't multiply; just know what's under the hood, in a simplified way. Then we have modules two and three. Module two is based on text representation, which is basically anything that's not generation.
So there we have search, classification, embeddings. Then there are a lot of labs: this module has code labs with the background information. All of it has blog posts, videos, and code labs, but it teaches you how to use the endpoints and how to analyze things.
For example, it has you embed a big dataset using the endpoint, and then, once you have the embeddings, cluster them and see what insights you get, et cetera. So that's module two. Module three is generation. It teaches you how to use the generate endpoint, and then it teaches you a lot of prompt engineering.
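As a rough illustration of the embed-then-cluster lab just described, here is a library-free sketch: tiny made-up 2-D vectors stand in for the real embeddings an embed endpoint would return, and each one is grouped with its most similar centroid by cosine similarity. A real lab would run a proper clustering algorithm such as k-means on much higher-dimensional vectors.

```python
import math

def cosine(a, b):
    """Cosine similarity between two vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(x * x for x in b))
    return dot / (na * nb)

def assign_clusters(embeddings, centroids):
    """Group each embedding index with its most similar centroid.
    This is only the assignment step of clustering, for illustration."""
    clusters = {i: [] for i in range(len(centroids))}
    for idx, emb in enumerate(embeddings):
        best = max(range(len(centroids)),
                   key=lambda c: cosine(emb, centroids[c]))
        clusters[best].append(idx)
    return clusters

# Made-up 2-D "embeddings": two near one topic, two near another.
embs = [(0.9, 0.1), (0.8, 0.2), (0.1, 0.9), (0.2, 0.8)]
clusters = assign_clusters(embs, centroids=[(1.0, 0.0), (0.0, 1.0)])
```

The insight-mining part of the lab then amounts to looking at which documents landed in the same cluster.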
It has a lot of labs. One of them is building a story by chaining prompts, as I was saying before, plus a lot of use cases for generative models, et cetera. That's what we have right now, and we're adding a lot of content. The next module, which we'll be adding [00:35:00] pretty soon,
is deployment. We have a module on deployment on AWS SageMaker and a lot of other things. The Cohere endpoints are cloud-agnostic, so we have examples of how to deploy on most platforms. Then we're adding an entire module on search, because search is basically Cohere's greatest focus; it's where a lot of the innovation has happened and where it's a leader.
So we're going to add an entire module covering a lot of different search applications and technologies: retrieval-augmented generation, which is what I was saying, where you search first and then generate answers from the results, and rerank, a step that makes searching much better.
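The retrieval-augmented generation pattern mentioned here, search first and then generate from the results, can be sketched as a two-step function. The retrieval below ranks documents by naive word overlap purely for illustration; real systems rank by embedding similarity and often add the rerank step Luis mentions. The documents and question are invented.

```python
def retrieve(query_words, documents, k=2):
    """Rank documents by naive word overlap with the query and keep the
    top k. Real systems use embedding similarity plus a rerank step;
    word overlap stands in for that here."""
    def overlap(doc):
        return len(set(query_words) & set(doc.lower().split()))
    return sorted(documents, key=overlap, reverse=True)[:k]

def grounded_prompt(question, documents):
    """Search first, then generate: paste the retrieved passages into the
    prompt so the model answers from them rather than from memory."""
    hits = retrieve(question.lower().split(), documents)
    context = "\n".join(f"- {d}" for d in hits)
    return f"Answer using only these passages:\n{context}\n\nQuestion: {question}"

docs = [
    "cohere offers an embed endpoint",
    "the rerank step improves search quality",
    "bananas are yellow",
]
prompt = grounded_prompt("how does rerank improve search", docs)
```

The resulting prompt would then go to a generate endpoint, which is the "generate answers from there" half of the pattern.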
And a lot of these technologies. Another thing we're adding is more prompt engineering, because it's what everybody wants to learn, so we have a lot of great material on that, and we'll continue adding things. In the future we'll add enterprise-type content, for, say, a decision maker in a company who
doesn't need to know so much about the models but does need [00:36:00] to know, at a high level: what should I use, what kind of team should I build, et cetera. So yeah, we're always adding stuff, and we have a lot of cool events. The Discord channel is very active. We have monthly talks; I give them sometimes, and the other instructors, Jay and Meor, give them too. We have a lot of activity in the community, office hours, et cetera.
So it's been a lot of fun. Yeah, it's super cool. And I also just want to clarify something: you mentioned a lot of relatively complicated terms that people might not be familiar with, for example clustering, or rerank, or other terms related to AI and large language models. But I want to clarify that this is still a course anyone can jump into
and learn from, as a complete beginner, and anyone should be fine going through it, right? Absolutely, no matter what level of beginner. We have some introductory ML material that you don't need for the LLM content, [00:37:00] because everything we use, we introduce. But if people want to go through the basic ML material, they can as well.
And we have links for everything. For example, for clustering, which is really just grouping points, we link to a resource. We basically link to a resource on everything, and there's very little that's needed to get up to speed here.
If you want to do the code labs: basic Python, and then basic math. But I never use a variable; I just use numbers. I don't say x plus y, I say three plus four. And it's completely free? It's completely free, yeah, completely open, and you can go in.
We have a suggested order that you can just follow, next, next. But if you really want to learn one particular thing, you can jump into that one thing and go at the pace you want. We don't have cohorts, but on the Discord channel we do have people studying together.
We have study groups; we make it as community-driven as possible. And it's certainly growing. It's gotten a great reception; so many [00:38:00] people have replied and joined the community that we're definitely putting the accelerator on it. Our goal is to make it into an academy with the infrastructure of a course.
So yeah, we're getting there. I also definitely recommend at least trying or starting to read the initial courses and modules of LLM U. But I want to clarify another thing: when you say Discord, is it the Cohere Discord, or is there a separate one? Yes, great question. It's in the Cohere Discord, and then there's a smaller
community for LLM U within it. Ah, perfect. And I've seen your videos about embeddings and transformers; it's amazing how you explain things, and you are a really good teacher. So I wanted to diverge a little into the teaching world, because I feel that whether you want to teach people, or you work in a company and need to give a presentation, or
you're talking with teammates or colleagues, you always need to explain, or at least make your [00:39:00] thoughts sound simpler and clearer. So I wonder if you have any tips for explaining concepts, or the things you work on, better, given all your experience with YouTube, Cohere, your work at Apple, and everything you've done in teaching.
Yeah. Oh, great, thank you, that's a great question. Actually, everybody has their own way to learn, right? Some people build things; some people have to write out all the details. I learn by teaching, and even in the times in my life when I've tried not to teach and have done a job that doesn't require teaching at all, I always gravitate back to teaching, because that's my way of understanding things.
And as a matter of fact, it stems from difficulties. I've had difficulties learning all my life. In school I was just lost in class; I get distracted very easily in meetings and talks. If you ever see me at a conference, just watch me, because I get lost easily. And so when I see a talk that's full of formulas, either I'm lost, or I try really hard to understand it with examples, with a [00:40:00] simple example.
And so my first advice is: check out your weaknesses, because there may be a hidden strength there. When I teach, I have to talk to the slowest person in the room, because that's me, right? If I understand it myself, everybody will, because it has to be at a level of simplicity where even I don't get lost, and I get lost in everything.
So I explain to myself. As for concrete tips: definitely don't take things for granted. I think in academia and in industry there's a bit of an ego where you have to sound professorial, a little elevated, and I think nobody likes that. I mean, we pretend to, because everybody does.
But I think people are scared that if they go and say something too simple, others will laugh or think they're stupid. I guarantee that's not the case. I've said simple things for years in front of everybody. As a matter of fact, I have a hard time trying to say things in a complicated way; if I try to sound technical, I can't.
And I guarantee nobody thinks people are stupid for saying something in the simplest possible way. So always try to find [00:41:00] the simplest example. I always ask myself: what is the simplest example I can use to illustrate this concept? So I'll give you an example:
image recognition. I could explain image recognition as: you have a huge matrix of pixels, that's your image, and you multiply it by a vector, and then you have to apply an activation function, and so on. But if the image is 28 by 28, why not make it two by two? And why not make it black and white?
So all of a sudden I have four numbers, right? And then I don't need to multiply a matrix by a vector; I can actually do all the operations. I go: this is gray, it's 0.5; this is white, it's zero; this is black, it's one. I can do it with actual numbers. And somebody who doesn't know what a matrix, a vector, or matrix multiplication is can follow, because they know how to add one plus zero.
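Luis's two-by-two image example can be written out in full, which is rather the point: four pixel values, four weights, one sum, one activation. The weights and images below are made up solely to illustrate the arithmetic.

```python
# A 2x2 grayscale image is just four numbers (0 = white, 1 = black,
# 0.5 = gray). A one-neuron "classifier" multiplies each pixel by a
# weight, sums, adds a bias, and applies a step activation.
def classify(pixels, weights, bias=-0.5):
    score = sum(p * w for p, w in zip(pixels, weights)) + bias
    return 1 if score > 0 else 0  # step activation

# Detect a "dark left column" by weighting the left pixels positively.
weights = [1.0, 0.0,
           1.0, 0.0]
dark_left = [1.0, 0.0,   # black, white
             0.5, 0.0]   # gray,  white
dark_right = [0.0, 1.0,
              0.0, 1.0]
```

For `dark_left` the sum is 1.0 + 0.5 with bias -0.5, giving a positive score; for `dark_right` the weighted pixels contribute nothing, so the score stays negative. The same arithmetic, scaled up to 28 by 28 pixels and many neurons, is the matrix-times-vector version.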
So I always try to bring it down to the simplest possible example. And if I can find an analogy, that's always good, because here's something that happens: [00:42:00] people always say they're bad at math, but that's not true. People are great at math; you could not survive if you were not good at math or physics. You use that logic to go around the world and figure out, oh, if this is open from two to four on Saturdays, but not on Fridays, and today is Friday...
That logic is the same logic as math, and physics too. If I tell people to draw a kid jumping, they can draw the parabola; they know it. But if I tell them to draw y equals minus 9.8 x squared, they wouldn't know how, because what people are bad at is not math, it's abstraction.
And what most education unfortunately does is take the reality away from things, give you something really abstract, and then teach you that. And then you cannot appreciate it, right? It's as if, instead of letting you hear the music of a song, I took away the music and gave you the notes.
We wouldn't recognize it, right? Only the experts would. And so I piggyback on people's knowledge. People know how to operate in the world, so I'm not going to give them a formula; I'm going to give them a real-life scenario where that formula [00:43:00] appears hidden, and ask them: what would you do here?
And they know exactly what to do. Then I say, okay, in this formula, what you did was this. So basically, my advice for teachers: be humble with knowledge and with your peers, don't be afraid of sounding stupid by being too simple (that's not the case), and always try to piggyback on reality
and on the simplest possible example that illustrates things. Thank you, those are really good tips, especially the one about sounding too simple. I think that's one thing I still need to get over, because I recently gave a presentation for my PhD, and for those presentations it's indeed relatively hard when you don't really know exactly who the audience will be.
It's hard to know how high-level or low-level you need to be, and you're also in front of professors, postdocs, and other experts. So it feels like you don't want to look dumb, or take them for dumb people, by explaining [00:44:00] the very basics of a convolution or something that is essentially simple for them.
To me it's hard to get over that fear, not of disappointing, but of sounding like I'm over-explaining simple things instead of just getting to the next step. I think you're right that it's definitely worth still mentioning the basics, even if they sound super simple. When I was doing my PhD, I had that problem too, because I think everybody in grad school has that impostor syndrome, right?
So you feel like you're the only one not understanding, or that you're explaining something everybody already knows. But then you start talking to others. For example, at a math conference I would quietly say to somebody, I didn't understand anything, and they'd say, me neither.
But they still looked like they knew everything; they just seemed so confident. In the end, everybody feels like that, so a little bit of kindness can go far. Just talk so that the grad students, or even the undergrads, will understand.
[00:45:00] And if one professor gets offended because you were too simple, let them. I always aim my talks at undergrads and grad students. Yeah, and it's crazy how true this is at most conferences I go to, even at CVPR, which I'm going to next week.
I went there last year, and at some of the talks we attended, I watched my friends and they seemed to follow, and I seemed to be the only one who was lost. But in the end they had no idea what the speaker was talking about, and it was always like this. And on the opposite side, I've never had someone tell me a talk was too simple.
That never really happens. Worst case, it's simple and it was just a great refresher. It can't really be too simple. For example, I've seen Yann LeCun's talks many times, because he reuses relatively the same talk, and it's relatively straightforward and simple, but it's still interesting every time.
And nobody says it's too simple and useless. It's still [00:46:00] interesting, and you can then go a bit deeper. So going through the basics and starting from the beginning is definitely useful, and maybe even required, to be understood and to be interesting. Yeah, absolutely. And sometimes I even have experts say so, because experts think at a very high level and always stay up there, right?
They talk up there and they understand up there, and every now and then an expert comes and says, wow, you actually brought it down and gave me more understanding of something I already understood. If you bring it down to the simple level, they may have never done that before.
Maybe they've always thought about it at a high level, and if you give them a different spin, a different way to understand things, you're still adding something for them. So you gave the beginners a great overview of what you were working on, and the experts a great dive into something. That's how I try to aim it, so that I don't bore the experts, but I also don't lose
the beginners, if it's interesting enough. And just like your great car comparison, for example, any low-level [00:47:00] or high-level view that is novel and interesting catches your attention. That's also really good for an expert. Like you, I will basically steal your idea, just because it's easier to explain what I do to other people when you've explained it so well.
And it may also be good for researchers to think about their problem differently and maybe come up with another type of solution, just by reframing the problem, basically. Yeah, it's always good to understand things at their basic level. I find that when I do research I have to do that, because otherwise I don't understand them.
And I sometimes find that when I talk to researchers and force them to come down to my level of examples, new things come up, because you're seeing it from a different angle. Yeah. And staying in the teaching world: you've been in this machine learning world for a while.
Have you seen a difference, for example through your experience on YouTube or elsewhere, in [00:48:00] the people wanting to learn about AI? A difference in profile? For example, was it previously students wanting to become researchers, and is it now entrepreneurs wanting to create applications?
Is there a difference in the people wanting to get into the field and learn about AI? A humongous one, yeah. When I started in AI, around 10 years ago, it was only programmers and statisticians who got interested in this, and they would go in and solve problems outside the field using only their knowledge of programming and statistics.
So obviously you're missing a lot of things. If there's a problem in medicine and you approach it with machine learning alone, you can only get so far, because no doctor is looking at it. But now everybody's interested in AI; everybody wants to come with their own level of expertise and their own knowledge, to contribute to AI and to benefit from it. So I've seen that it's no longer a standalone field; it's now so ubiquitous that, whereas before it was, oh, I do AI,
now it's, oh, I do [00:49:00] AI with this. Nobody just does AI; you do AI with some flavor, or applied to this or that. The question is not whether you do AI, but what you apply AI to. And it's a two-way street: I think anybody who comes with a new field and applies AI to it also contributes to AI.
So that's one thing I love. Yeah, it's an amazing new field, just unbelievable, where now, as you mentioned earlier in this discussion, people can come from linguistics or the arts or any background and be helpful in creating a better model. That's so cool.
It's not reserved for programmers or mathematicians. It's amazing. They have their own way of thinking. I've worked with musicians, with artists, with chess players, for example; there are a lot of people like that. It's really cool. Yeah, those were all my questions, so I definitely invite anyone listening to check out AI by going through the free LLM University from Cohere, and also to join the Discord, because the events there are free as well. Otherwise, I would like to ask if you
have anything else you're working on, [00:50:00] or personal projects you would like to share with the audience? I always have my teaching projects, which I'm very passionate about, so I definitely invite everybody to check out the YouTube channel, Serrano Academy, like my last name: Serrano.Academy on YouTube.
There's a Spanish version as well, and a Chinese version; someone very kindly translated a lot of my videos. So yeah, Serrano Academy, definitely check it out. That's where I put all my knowledge. The book, Grokking Machine Learning, is where I explain all of machine learning the way I like it, the way I understand it.
And then I recently released a specialization with Coursera and DeepLearning.AI, with Andrew Ng, called the Math for Machine Learning specialization. There I teach the math that people need to know for machine learning. It's three courses, each one four weeks long.
Linear algebra is the first one, then calculus, and the third is statistics and probability, all geared towards machine learning. We picked the topics that actually get used in machine learning or give you some kind of insight: [00:51:00] gradient descent, things like matrix rank,
hypothesis testing, and probability, like maximum likelihood. So I recommend it for anyone in AI, or getting into AI and trying to learn the math on the side. So yeah, check it out. Really cool, I didn't know you had recently done that. It's really useful, and I definitely also recommend it if you are going for
a job in the AI field, whether you're aiming for a job or for doing a PhD or a master's. Personally, I went through the classic engineering university background, and even then, during university, I still took online courses about math and all those topics. So even if you're in university, I recommend checking those out and getting better by yourself.
It's definitely worth it. And I'm sure those resources are just as good, if not better, than everything you do on YouTube and Cohere. So, super. I will also check those out and maybe add them to my guide for getting into AI online. [00:52:00] Oh, thank you very much.
Of course, and I'm a big fan of your content and your channel as well, so it's an honor to be talking to you. Thank you, and thank you very much for all the amazing insights you gave. It was a super fun discussion for me, and I'm also a fan of your YouTube channel and your work, so this was really cool for me.
And yeah, thank you very much for your time and for all the great insights you shared with us. Thank you as well for the great conversation; you asked great questions, and I'm very happy to be here.