Module 1: Intro to AI, ML, DL, Gen AI & Agentic AI

L1: AI vs ML vs DL

1:15:24

Transcribe Summary

Introduction to AI
Components of AI & Use Cases
Machine Learning Overview
Supervised vs. Unsupervised Learning
ML Models
Supervised Learning in Practice
Unsupervised Learning & Reinforcement Learning

0:00:00 So what is AI? AI is basically the capability of a system or a machine to take decisions on its own, without being explicitly programmed. What does that mean?

0:00:20 So you have a computer and you want that computer to be able to take real-time decisions. If the computer is able to make decisions in real time, based on the environment, based on the actions, based on the state change, then I will say my computer

0:00:40 is an intelligent machine or intelligent system. Then what is not intelligence? Say I write code to calculate the factorial of a number, or the Fibonacci series, or an exponential moving average from my time series data. That is a well-defined set of operations,

0:01:04 everything. There is no intelligence in that, because I have a well-defined input, a well-defined sequence of calculations, and then a well-defined output, and the computer is just supposed to execute those steps and calculations to end up with the final result. So if I am giving

0:01:24 that granularity of instructions to a computer (first do this, then do this, then do this, and then stop; if you have this, then do that), there can be branching, condition-based systems also, but that's okay: every other possibility is already taken care of, okay.

0:01:48 If you get this kind of data, then do this; if you get that kind of data, then do that; if you get some other kind of data, then do this by default.

0:01:56 So I've covered all the possibilities now. Now the computer knows exactly: okay, this is a Word document, I'll use this parser.

0:02:02 If it is a PDF document, I'll use this parser. If it is a CSV file, I'll use a CSV parser.

0:02:07 If it is a JSON file coming up, I'll use a JSON parser. If there is a markdown file coming up, I'll use a markdown parser.

0:02:14 If it's an HTML file, I'll use an HTML parser. If something else is coming up, I'll use some default parser.

0:02:21 So now my system knows exactly what to do, and it is performing exactly the same sequence of steps every time, so there is no intelligence here, okay.
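The parser example above can be sketched in Python as a fixed lookup. This is only an illustration: the parser names are hypothetical placeholders, not real libraries, and the file extensions are an assumption about how the file type would be recognized. Every branch is written by the programmer in advance, which is why nothing here is learned or decided by the machine.

```python
from pathlib import Path

# Hypothetical parser names; each string stands in for a real parser function.
PARSERS = {
    ".docx": "word_parser",
    ".pdf": "pdf_parser",
    ".csv": "csv_parser",
    ".json": "json_parser",
    ".md": "markdown_parser",
    ".html": "html_parser",
}


def choose_parser(filename: str) -> str:
    """Return the parser name for a file, falling back to a default parser."""
    suffix = Path(filename).suffix.lower()
    return PARSERS.get(suffix, "default_parser")


print(choose_parser("report.pdf"))
print(choose_parser("notes.xyz"))
```

However many file types are added to the table, the behaviour stays a predefined sequence of steps.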

0:02:31 So, a simple example of an intelligent system: maybe I can think about a self-driving car. Yeah, self-driving cars, okay, maybe that is a simple example I can give.

0:02:43 So yes, your self-driving cars will mostly have this 360-degree rotating camera on the top of the car. And that is taking a 360-degree live feed of the whole environment: all the buildings, trees, objects, and vehicles coming up in front and behind, everything that is moving or stationary.

0:03:05 And while the car is moving, it is trying to find the distance to the obstacles or the vehicles in front or behind, doing lane detection, everything. Fine.

0:03:17 So, let us say your car is cruising on a highway at a speed of 120 kilometers per hour. And a big signboard was detected by the system, my computer, which says, okay, toll plaza ahead, 500 meters.

0:03:36 So now you know that there is a toll plaza coming up, which is a usual sight in India, okay; we have big toll plazas on the expressways also.

0:03:49 So the car was able to detect that, it was able to read that signboard, okay, and it still kept on going at the same speed and finally banged into the toll plaza.

0:03:59 So what is implemented and what is not implemented in the system? This system was able to recognize the billboard: object detection.

0:04:09 And it was able to recognize what was written, OCR, optical character recognition. It was able to read. So object detection is working well.

0:04:20 Object segmentation is working well. You are able to read the digits and alphabets also. So OCR is also doing well.

0:04:28 So it's a standard computer vision project. But that feedback never went to the braking system. And that is why the car didn't slow down or stop.

0:04:39 So, since the feedback never went to the braking system, it was not intelligent. It was a dumb system. I would refrain from calling it an AI project; it was maybe a graduate-level computer vision project, just to detect the billboard and what is written on it. It's not even graduate level these days; high

0:05:02 school kids are already doing these kinds of projects. So it's a computer vision project and not an AI system, okay, simple.
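A rough Python sketch of the gap described above, under loud assumptions: `read_sign` is a hypothetical stand-in for the object detection and OCR stage, and the fixed safe speed of 30 km/h is invented purely for illustration. The hard-coded rule in the second function is not itself intelligent; it only marks the place where perception has to be connected to an action such as braking.

```python
import re


def read_sign(sign_text: str):
    """Stand-in for detection + OCR: return the toll plaza distance in meters, or None."""
    match = re.search(r"toll plaza ahead (\d+) meters", sign_text.lower())
    return int(match.group(1)) if match else None


def perception_only(speed_kmph: float, sign_text: str) -> float:
    """Reads the sign but never passes the result on, so the speed is unchanged."""
    _distance = read_sign(sign_text)  # computed, then ignored
    return speed_kmph


def perception_with_action(speed_kmph: float, sign_text: str,
                           safe_speed_kmph: float = 30.0) -> float:
    """Feeds the perception result to the braking decision (assumed safe speed)."""
    distance = read_sign(sign_text)
    if distance is not None:
        return min(speed_kmph, safe_speed_kmph)
    return speed_kmph


sign = "Toll Plaza Ahead 500 meters"
print(perception_only(120.0, sign))
print(perception_with_action(120.0, sign))
```

The first function corresponds to the car in the story: it sees and reads correctly, but the speed never changes.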

0:05:09 So in a typical AI system, you'll have multiple components: machine learning, deep learning, okay, natural language processing, speech to text, text to speech, okay.

0:05:19 Robotics, optimization. So a lot of these components will be joined together to create an AI system, okay. And it should be able to perform actions; that's the main crux of it, okay.

0:05:31 That's the artificial intelligence I'm referring to. So there are some common use cases from different domains. You can see automation, data analysis, predictive analytics, personalization; your Netflix is already doing that, personalizing your feeds.

0:05:47 NLP is your natural language processing: how these AI systems are able to understand English grammar and language, and do translation from one language to another.

0:05:57 Or doing a transcription, what YouTube is already doing, if you see any video, you can turn on the closed captions, and it generates the text on the fly literally for many of these videos where the closed captions were not submitted by the creator, okay.

0:06:15 So YouTube generates it, okay, that's why it says generated CC, okay. Image and video analysis, risk management, okay, portfolio management: again, what kind of stocks you should be investing in for a given risk profile in the next six months to one year time frame, based on the technical and fundamental

0:06:34 analysis of the companies, of the economy, and the kind of risk profile that you are holding, okay.

0:06:46 So all these analyses can be made and you can get your portfolio optimized. Then decision support systems and process improvement: let's say some process is going on on a shop floor, okay.

0:07:01 So, a process on a shop floor means machines are aligned to process a particular object in a given sequence. So, we need to find the optimal usage of each of these machines, so that a few machines don't get overused while a few others lie underutilized.

0:07:20 So, optimization of the shop floor process, so that every machine gets some buffer and is optimally used. You can also make use of these AI systems for health monitoring of these machines.

0:07:34 So we can predict ahead of time, okay, when we should do the servicing of these machines, and whether there is any potential problem that might arise in the next couple of days or a week in any of these machines.

0:07:47 So, all this predictive analytics we can also do. All these things are already being done: customer service, chatbots, quality control, R&D, marketing and advertising.

0:08:00 Okay, so everywhere we have use cases. So when to use AI? When you have complex tasks, or when you want to replace humans for some of the repetitive tasks.

0:08:12 So, here the key word is repetitive. If the same task is being done manually and the task is more or less well-defined, why not automate that using an AI system?

0:08:22 So that is the overall idea. Or if the task is just too complex for a human to comprehend and execute.

0:08:29 So then you can make use of a computer, a supercomputer, and an AI system on top of it, which can solve these problems.

0:08:37 Adaptation and personalization; autonomous systems and unpredictable events, which cannot really be tracked by a human but can still be tracked through data and given to an AI system.

0:08:48 So real-time decision making, as in your real-time analytics, autonomous vehicles and so on. So these are typical use cases. Now, narrowing down to machine learning: what is machine learning?

0:09:01 So machine learning, the definition says, is a subset of AI which focuses on creating systems that learn from data. So the important thing is we need historical data.

0:09:11 So as humans we learn from our past experiences. Good experiences, bad experiences, that is where we humans get our learning from, right? Fine, then how do you teach a machine? For the machine, that experience comes in the form

0:09:38 of data, okay. If you have more experience, do you think you will make fewer mistakes or more mistakes? The chances of making a mistake reduce as your experience increases.

0:09:50 That's why people want to have an experienced data scientist or a chief data scientist, who is supposed to have more experience than the junior developers, because these people who have 10 or 15 years of experience in industry know the process well, they know the regulations well.

0:10:08 And so the chances of getting into trouble in any of the client projects become much lower. They know the nuances of the domain, both technically and business-wise.

0:10:23 So, same goes with the machines also. If you want to teach the machines that same level of knowledge and intelligence, you need that much amount of data.

0:10:32 So, maybe until 2005 or 2010, that is, just before the cloud became really popular, all the data was stored locally

0:10:44 within every company, okay. And then AWS came. GCP was a slightly late entry, as Azure had already picked up by then, okay. So first AWS, then came Azure, and then GCP came up, okay, following suit; they understood that there is a huge market. So then, once the companies started adopting cloud, all the data

0:11:07 they started putting up in the cloud, and their limitation on storing data locally was alleviated. So now they didn't have to think twice: whatever data is coming up, let's store it, let it go to the cloud, we will see what can be done later. Now they had a much more scalable solution, okay. They didn't have

0:11:28 to think twice about having limited storage on their server, because now all the data was getting stored on a remote server, a cloud server essentially, which the providers claimed was safe, secure, this and that, via a service level agreement; that's the second thing.

0:11:45 So now, when the data availability started, people thought, okay, we already have so much data and we're sitting on so much information.

0:11:53 Why not draw meaningful insights and drive our business decisions based on the insights coming from the data? What we call data-driven insights, okay. That is where predictive analytics started becoming popular and machine learning became popular.

0:12:10 People started adopting machine learning techniques, and real applications of machine learning started getting deployed within organizations, because now the ML models started making sense, as they were trained on a sensible amount of data, okay.

0:12:28 So that is where this domain and these applications started getting adopted across industry verticals, okay. So there are multiple use cases; some of the examples are already mentioned here, like Amazon recommendations, Netflix, fraud detection systems in banking, email spam filters, predictive text

0:12:50 on smartphones, your autocorrect, even search engines: the moment you type a couple of words, the entire search query gets predicted in Google.

0:12:59 I hope you have noticed that. So, all of this is making use of some sort of a model.

0:13:13 Now, what is that model? What does model mean? So, basically, if I just give a technical explanation, there are three things.

0:13:25 One is data. You are training some algorithm on data. So, some training is done, and the process of the training is outlined in the algorithm.

0:13:39 So, what is an algorithm? An algorithm is a sequence of well-defined operations or numerical computations. This is the data which goes in, then this is calculated, then this is computed, then this is calculated, and then finally you get the output, okay.

0:13:57 So, a well-defined sequence of steps of calculations, that is called an algorithm. So, the data enters into that well-defined sequence of operations.

0:14:12 This is what you call the algorithm, and then there is the output that you get from the algorithm. So data goes inside the algorithm and the algorithm does all these computations. Which computations, that is well defined. You don't really have to get into the technical details of what computation, what matrix

0:14:31 operation is going on behind the scenes. You don't have to really know everything, but just imagine that an algorithm is a well-defined sequence of operations. I'm using the words well defined because it is very well known; it is not getting predicted by any AI system. What computation will be done,

0:14:49 that is not predicted; there is no uncertainty, there is no stochastic nature in that, okay. So, we know very well what computations I am going to do with my data.

0:15:01 So, this is the whole sequence of computations that you are going to perform on your data, and that sequence of computations is called an algorithm.

0:15:14 What is the output? The output of the final step in the algorithm is your model. So, when you train some algorithm on your data, the output is your model.

0:15:26 So, this is of the form, if I can put it that way, y is a function of x. This function could be anything: it could be a linear function, could be sine, cos, could be logarithm, anything.

0:15:43 So, x is your data. You keep changing the function and you keep getting different values of y. Do you agree with me?

0:15:50 Is this making sense? This last part, y equal to f of x, okay: for the same x, y depends on the function. So y could be sine of x, okay.

0:16:03 And then y could be cos of x. And then y could be log of x. Or y could be 2x² minus 3x plus 5, and so on.

0:16:16 I can end up with infinite possibilities based on what function I'm using to operate on that same x. So x is your data.

0:16:29 What operation you want to perform on that data is your algorithm. And what is the outcome here? That outcome is your model.

0:16:39 So if you change the data, that's interesting now, even if the algorithm is the same: sine of 90 will be different from sine of zero.

0:16:48 Agreed? Okay. So x has changed, so the same sine function will give you a different value of y. What I am trying to say is very simple: just as y is f of x, in the same way the model is a function of the algorithm running on some data.
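The y = f(x) analogy can be tried out directly. The small Python sketch below applies several functions to the same x, and then applies the same sine function to two different inputs. The value x = 2.0 is an arbitrary choice for illustration, and the angles 90 and 0 are taken to be in degrees.

```python
import math

x = 2.0  # the same input (the "data") for every function

# Different functions (the "algorithms") applied to the same x.
functions = {
    "sin": math.sin,
    "cos": math.cos,
    "log": math.log,
    "quadratic": lambda v: 2 * v**2 - 3 * v + 5,
}

outputs = {name: f(x) for name, f in functions.items()}
for name, y in outputs.items():
    print(f"{name}(x) = {y:.4f}")

# Same function, different data: sine of 90 degrees versus sine of 0 degrees.
sin_90 = math.sin(math.radians(90))
sin_0 = math.sin(math.radians(0))
print(sin_90, sin_0)
```

Changing either the function or the input changes y, which mirrors how changing either the algorithm or the data changes the model.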

0:17:08 If you change the algorithm, you will get a different model, right. If you have a linear regression algorithm, you will get a linear regression model.

0:17:16 If you change it to a decision tree algorithm, you'll get a decision tree model. If you change it to a K-nearest neighbors algorithm, you'll get a KNN model.

0:17:27 If you change that algorithm to random forest, you get a random forest model. If you change your algorithm to XGBoost, you'll get an XGBoost model.

0:17:35 If you change your algorithm to LightGBM, you'll get a LightGBM model. See, the data is the same. You keep changing algorithms and you keep getting different models.

0:17:46 That part is clear. So on the same data I'm just changing different algorithms and I'm getting different different models. Perfect.
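As a small illustration of "same data, different algorithm, different model", the sketch below trains two deliberately simple algorithms on one toy dataset. Both the dataset and the two hand-written training functions are assumptions made for this example; they are simplified stand-ins for library implementations of linear regression and nearest neighbors.

```python
def fit_linear(xs, ys):
    """Least-squares line: returns a model that predicts intercept + slope * x."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return lambda x: intercept + slope * x


def fit_nearest_neighbor(xs, ys):
    """1-nearest neighbor: returns a model that copies the target of the closest training point."""
    pairs = list(zip(xs, ys))
    return lambda x: min(pairs, key=lambda p: abs(p[0] - x))[1]


# The same toy data is given to both algorithms.
xs = [1.0, 2.0, 3.0, 4.0]
ys = [2.0, 4.1, 5.9, 8.2]

linear_model = fit_linear(xs, ys)
knn_model = fit_nearest_neighbor(xs, ys)

# Two different models, so two different predictions for the same new input.
print(linear_model(2.4), knn_model(2.4))
```

Each training function plays the role of the algorithm, and the function it returns plays the role of the model.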

0:17:56 It could be the other way around also, okay. What does the other way around mean? It means: if I keep my algorithm the same and change my data slightly,

0:18:14 I will still end up with a different model, please understand that. Even if a single data point has changed, the pattern in the data has changed theoretically at least.
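The reverse case can be sketched the same way: one algorithm, two datasets that differ in a single point. The data below is invented for illustration, and `fit_line` is a minimal least-squares fit written for this example rather than any particular library's implementation.

```python
def fit_line(xs, ys):
    """Least-squares fit of y = intercept + slope * x; returns (intercept, slope)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return intercept, slope


xs = [1.0, 2.0, 3.0, 4.0]
ys_original = [2.0, 4.0, 6.0, 8.0]
ys_changed = [2.0, 4.0, 6.0, 9.0]  # only the last data point differs

model_a = fit_line(xs, ys_original)
model_b = fit_line(xs, ys_changed)

# Same algorithm, slightly different data, different learned parameters.
print(model_a)
print(model_b)
```

A single changed target value is enough to move both the intercept and the slope, so the learned model is no longer the same.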

0:18:26 So, theoretically your model has changed, okay, even if the algorithm is exactly the same. This is what you need to understand: if the data is different, even though you are using the same algorithm, let's say the same linear regression or logistic regression, but the data is different,

0:18:46 then your model will again be different, okay. Just to summarize: what is a model? A model is the learnt algorithm. Where is the algorithm learning from? The algorithm is learning from data. These two statements are clear, and please make sure that from this second onwards you will not use the words algorithm

0:19:12 and model interchangeably; otherwise you are completely disrespecting this, me at least, okay. So if you have understood this clearly,

0:19:28 you will never ever use the words model and algorithm interchangeably. Most non-technical folks don't know this technical difference, and they will say model or algorithm whenever they want.

0:19:42 I mean, I give them the benefit of the doubt. But now we have this clarity: if I'm referring to something as an algorithm, then it is a way of training. What is an algorithm? A way of training.

0:20:01 For example, gradient descent is an algorithm, backpropagation is an algorithm; they are not models, they are ways of training, okay. Fine, so that part is very clear now.

0:20:16 So what is the ML model then? An ML model has been trained on data, okay. Now, depending on what type of data you have... so this is what I said, that a model is some algorithm running on data. Fine, very well.

0:20:33 So, what is the objective for the business, let's say, okay, without getting into technical details? Why do you want to train on some data?

0:20:46 What is the training objective? What does the model want to learn, or how can I visualize the model in front of me?

0:20:56 That is the question I am asking, okay. The algorithm is clear because it is a sequence of mathematical operations, so I can always write down equations and say that, okay, these five equations are a part of my gradient descent algorithm.

0:21:09 Simple. I can always express an algorithm as a set of equations which are going to be used to compute on your data.
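To make "an algorithm is a set of equations" concrete, here is a minimal gradient descent sketch for fitting a line y = w*x + b by minimizing mean squared error. The lecture does not spell out which equations it has in mind, so the loss, learning rate, step count, and toy data are all assumptions chosen for illustration.

```python
def gradient_descent(xs, ys, learning_rate=0.05, steps=5000):
    """Fit y = w * x + b by minimizing mean squared error with gradient descent."""
    w, b = 0.0, 0.0
    n = len(xs)
    for _ in range(steps):
        # Step 1: predictions with the current parameters.
        preds = [w * x + b for x in xs]
        # Step 2: gradients of the mean squared error with respect to w and b.
        grad_w = (2 / n) * sum((p - y) * x for p, y, x in zip(preds, ys, xs))
        grad_b = (2 / n) * sum(p - y for p, y in zip(preds, ys))
        # Step 3: move the parameters a small step against the gradient.
        w -= learning_rate * grad_w
        b -= learning_rate * grad_b
    return w, b


# Toy data generated from y = 2x + 1, so the fit should approach w = 2, b = 1.
w, b = gradient_descent([0.0, 1.0, 2.0, 3.0], [1.0, 3.0, 5.0, 7.0])
print(w, b)
```

The three commented steps are the algorithm; the pair (w, b) that comes out after training is the model.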

0:21:21 So expressing or showing an algorithm to someone is very easy. If I ask, okay, Sachin, show me your algorithm. Sachin will say, Prashant, this is the first step I'm doing, this is the second step I'm doing, this is the computation in the third step and here goes my final step of computation and so that

0:21:38 will be my output and that's it. So there is a very clear visibility, thought process in the algorithm. What exact computation is going to happen in each and every step?

0:21:49 You can even mathematically write it down. That level of clarity everyone has for every algorithm otherwise it's not an algorithm.

0:21:56 You are contradicting the basic definition of algorithm. So, if you take the bookish definition of algorithm, okay, this is what exactly it is.

0:22:05 But then if I am asking, Sachin, show me your model, or what is your model? I want to imagine my model now, okay. So what does my model look like? That is the question I am trying to address here, okay. Look like means I am just trying to imagine it: does it look like a horse or an elephant? Okay, so

0:22:30 what do I imagine when I say the word model? That's a simple question I'm trying to address here, okay. So the model is the learnt algorithm, as I said. Where has it learnt from? It has learnt from data, historical data, past data,

0:22:55 you can say. How? That is what the algorithm says, the steps. Okay, can you express this model mathematically? Quickly, the question is, Prashant, can you just express the model mathematically?

0:23:14 Okay, then I say yes, it is possible, but not always. Okay. So some of the models can be expressed mathematically, some cannot.

0:23:25 Okay, so those models which I can express mathematically are called statistical learning models. The models which cannot be mathematically expressed are machine learning, deep learning, or GenAI models.

0:23:42 They cannot be mathematically expressed. What do you mean by that? Okay, a model can be as simple as an equation. It could be some number here: y equals 2.5 plus 3.5x.

0:23:52 x could be some independent variable and y is your dependent variable. This equation is a model. It can be as simple as that.

0:24:01 You don't have to think about very complex equation. Your model literally I am telling you can be as simple as a linear equation.
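The equation y = 2.5 + 3.5x can be written down as a tiny Python object to show what "the model" physically is in this case: two learned numbers plus a rule for using them. The class name `LinearModel` is a hypothetical name for this sketch, and the coefficients are simply the ones quoted in the lecture, not values fitted from any dataset.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class LinearModel:
    """A trained linear model is just its learned coefficients."""
    intercept: float
    slope: float

    def predict(self, x: float) -> float:
        return self.intercept + self.slope * x


# Coefficients taken from the lecture's example equation y = 2.5 + 3.5x.
model = LinearModel(intercept=2.5, slope=3.5)
print(model.predict(2.0))  # 2.5 + 3.5 * 2.0 = 9.5
```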

0:24:10 That is what your linear regression model is. Which again means your linear regression model is actually a statistical learning model. And the models that, because of their complexity, cannot be mathematically expressed in the form of an equation, that is what

0:24:40 a machine learning, deep learning or GenAI model would be. So, if I say decision tree model, a decision tree model is an ML model, and then you can't ask me to show you the equation of a decision tree model.

0:24:53 So that question is irrelevant now. Show me your KNN model in the form of an equation? A KNN model is a machine learning model, so that again cannot be expressed.

0:25:07 Okay, so for any machine learning algorithm, if I'm saying that this is a machine learning model, it cannot be expressed as an equation.

0:25:15 If you're able to express it in the form of an equation like this, it's not a machine learning model; it is a traditional statistical learning model, okay. But ultimately both kinds of models have a common objective. What is the common objective? This common objective

0:25:38 of the model is to find the relationship or pattern in the data.

0:25:51 And learn it, of course: find it and learn it. So statistical learning models will do the same, as will an ML model, DL model, GenAI model, whatever.

0:26:03 So whatever is the relationship or the pattern present in the data, I want that pattern to be learned that relationship to be learned.

0:26:12 So it's ultimately a learned machine which is now being referred to as a model. So any model has this objective, this broad objective; it's a very broad objective.

0:26:22 Okay. What kind of relationship, you might ask, what kind of pattern? Again, we'll discuss the finer details later, but any model wants to do this. What does it want? It wants to capture how the data is structured, okay: what is the relationship between the attributes present in the

0:26:41 data, or, if your data is composed of both independent variables and dependent variables, what is the relationship between the independent variables and the dependent variable. That relationship is what is learnt by the algorithm during training. And after learning, it becomes a model.

0:27:02 So the model would have captured that relationship. So ultimately, a model is a kind of approximation of the relationship: function approximation. Pattern recognition, exactly.

0:27:17 So the concept of pattern recognition is your model ultimately. So any model will try to find the relationship or the pattern present in the data.

0:27:25 So, that's the whole objective, the non-technical objective. Now, the technical objective would be defined in terms of a loss function or cost function and all that; we will not get into all the details here. Fine.

0:27:41 So, here the key points are: the model takes the data as the input. It learns the pattern or relationship from this data; that's exactly what I was saying.

0:27:50 And then the output: it can predict the outcomes for any future data, or for the same data. So, models can be a linear regression model, or a decision tree model, or a neural network model.

0:28:04 Okay, these are the names of the algorithms also: the linear regression algorithm, or decision tree, and so on. After training, they will become the model ultimately.

0:28:15 So, the purpose is to automate decision making and predictions based on what has been learnt from the data. So when I talk about machine learning or even deep learning (okay, what is deep learning, we will see), there is another way we define or differentiate these models, okay, and that is based on data, okay,

0:28:38 and the end objective, what we want, okay. So, let's say we have supervised learning, okay, unsupervised learning, and of course the last one is reinforcement learning, okay; in short we call it RL.

0:29:08 This is supervised learning and this is unsupervised learning, okay. So these are the three types based on data. So what do I mean by that?

0:29:22 Let me give some examples and some details, okay. So in the case of supervised learning, your data would have some independent variables, which we also call features or attributes or explanatory variables, you can say, and that is typically represented by capital X. The other thing that we

0:30:03 will have in our data is the target, a target variable. So this target variable is also your dependent variable.

0:30:15 That is typically denoted by let us say Y. So we will have x and y pairs. So how will that typical data look like?

0:30:25 Okay, so let's say I'm giving a small dataset here. Okay, think about a scenario that I have got two features x1 and x2.

0:30:33 These are my independent variables and there is a target y. Okay, so x1 is let us say weight and the x2 is let's say height.

0:30:43 Okay, and based on the weight and height, I want a model to learn and predict whether the person is a male or female.

0:30:52 So, one person with weight, let's say, 50 kg and height 150 cm, okay (this is in cm, this is in kg), that person would be a female.

0:31:04 Weight of 80 kg, 180 cm, that would probably be a male. And similarly 40 kg, 140 centimeters, maybe female, and so on.

0:31:17 So these are my data points, these are individual data points. So, essentially, in supervised learning, what I want the model to learn is the relationship between these features and the target.

0:31:32 How are these features related to the target? And the target variable in this case is a categorical variable. It can be numeric also.
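The three data points from the weight and height example can be laid out as features X and target y, with a very small classifier on top. The nearest-neighbor rule used here is an assumption picked for brevity, not a method the lecture prescribes, and three rows are far too few for a real model. It also shows a model with no equation behind it: the "model" is just the stored data plus a distance rule.

```python
import math

# Features X: (weight in kg, height in cm). Target y: the label to learn.
X = [(50, 150), (80, 180), (40, 140)]
y = ["female", "male", "female"]


def predict_nearest(query, X, y):
    """Return the label of the training point closest to the query (1-nearest neighbor)."""
    distances = [math.dist(query, row) for row in X]
    return y[distances.index(min(distances))]


print(predict_nearest((75, 175), X, y))
print(predict_nearest((48, 150), X, y))
```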

0:31:52 So, for example, a house price prediction. Another example dataset: let us say the total square feet and the number of bedrooms.

0:32:01 So, 1,200 square feet, two bedrooms, and the price in a given locality is, let's say, I'm just giving out some numbers, 1.1 Cr, okay. 1,500 square feet, 3 BHK, 1.8 Cr, okay. Cr is crores; if you want it in millions, I can give it in millions, that would be 110, no, 11 million rupees, something, yeah,

0:32:36 11 million rupees in that case, okay. These are the typical house prices in a tier-two city, or even a tier-one city is possible. So 11 million, that is the price, okay. The 3-bedroom might be like 18 million, okay. So this is the price, and so on. It might be 2,000 square feet, and maybe a 4-bedroom apartment could

0:33:00 be 3.6 million also, and so on. So now in this case these are my independent variables, but this time your target is numeric. This is your target, Y; this is X. So your target is a numeric variable, where you want to predict the house prices, okay.

0:33:23 So, if the target is to predict house prices, which is numeric, we call it a regression problem, and if the target is categorical, as in the earlier male or female case, then that is called a classification problem.
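The rule of thumb above (numeric target means regression, categorical target means classification) can be written as a small helper. The function name `task_type` is hypothetical, and the check is a simplification: class labels that happen to be stored as numbers, such as 0 and 1, would be misread as a regression target. The prices are the first two rows from the lecture, in crores.

```python
def task_type(targets):
    """Guess the supervised task from the target values: numeric -> regression, else classification."""
    numeric = all(
        isinstance(t, (int, float)) and not isinstance(t, bool) for t in targets
    )
    return "regression" if numeric else "classification"


# Targets from the two example datasets.
gender_targets = ["female", "male", "female"]
price_targets_crore = [1.1, 1.8]

print(task_type(gender_targets))
print(task_type(price_targets_crore))
```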

0:33:42 So, I will give some examples of regression and classification. A regression problem means you predict numbers, like the temperature on any given day, or you can predict, okay, what will be the value of the Nifty 50 three months down the line.

0:33:58 So you are actually predicting the number; in this case it would be 25,500, 26,000, something. So you're predicting a number.

0:34:07 So stock market prediction in terms of actual price would be a regression problem. But if you are only predicting whether on Monday the Nifty is going to give me a gap-up opening or a gap-down opening, that is a classification problem; then you're talking about categories.

0:34:24 Or whether you should buy, sell or hold a stock: you want a model to predict whether to buy, sell or hold.

0:34:30 That is clearly a category, okay, so then you are solving a classification problem. So, classification tasks: for example, an email can be classified as spam or not spam, so this model will be able to

0:34:42 classify an email as spam or not spam, okay, that is a classification problem. You can create a model to classify a credit card transaction as fraudulent or not fraudulent, a genuine transaction, okay, that's again a classification problem. You can create a classification model to predict whether an applicant,

0:35:08 a loan applicant, should be granted a loan or not, based on their probability of default. So, I applied for a home loan and my ML model says, okay, give the loan to Prashant, the probability of default is 0.1, hardly 10 percent, okay.
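The loan example turns a predicted probability into a category. A minimal sketch of that last step is shown below; the 0.5 cut-off is an assumed threshold chosen for illustration, since the lecture gives only the probability of 0.1, and a real lender would set the threshold from its own risk policy.

```python
def loan_decision(prob_default: float, threshold: float = 0.5) -> str:
    """Turn a predicted probability of default into an approve/reject class."""
    if not 0.0 <= prob_default <= 1.0:
        raise ValueError("prob_default must be between 0 and 1")
    return "reject" if prob_default >= threshold else "approve"


# The lecture's case: probability of default 0.1, i.e. 10 percent.
print(loan_decision(0.1))
print(loan_decision(0.8))
```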

0:35:27 So, that's like default or not default. Or you can create a classification model to predict customer churn in a given industry; it could be the telecom industry or the banking industry.

0:35:37 Churn means unsubscribing from the service. So for banking, this means I am closing my account with XYZ bank and moving to ABC bank because of the higher interest rate that I get on my deposits, or it could be a better app, or it could be better customer service, or there could be a number of other reasons.

0:36:01 So I have churned out from the previous bank and moved on to another bank. That's called customer churn. So the model can predict whether a customer would churn in future or not, so that appropriate action can be taken to retain such a customer.

0:36:14 We can probably give more offers, this and that. Even HR teams are using classification models to predict whether an employee is likely to leave the company or not, okay. So leave or not leave, that's churn or not churn; leaving the job is ultimately churn for the organization,

0:36:38 right? So these are all examples of classification. Whether it is going to rain tomorrow or not, that is also classification. How many millimeters of rainfall there will be tomorrow, that is a regression problem ultimately, or what the average temperature will be tomorrow, that is a regression problem.

0:36:57 So when you want to predict the actual prices, or, okay, give me the sales forecast for the next three months, that is again a regression problem, okay.

0:37:04 So time series forecasting problems are all regression problems. So, moral of the story: you have the data about the independent variables, which are features or attributes, and based on these features and attributes you are going to build a model.

0:37:20 Okay, there will be a target. So you need to have at least these two things independent variables and your dependent variable is your target.

0:37:27 So why am I calling it a dependent variable? Because the value of the dependent variable is based on the independent variables.

0:37:35 So these numbers are affecting the probability of any person being a male or female. In this case, these numbers are affecting the actual prices, okay. As you see, as the square feet increase, mostly the price will increase, or as the number of bedrooms increases, the price will increase, and so on, okay.

0:37:55 So that is why these are independent features and the target is the dependent feature, and this is all the case of supervised learning. So in supervised learning, in short, we have both X and Y, that's how I would write it down, and the two typical tasks in supervised learning are regression and classification,

0:38:15 okay. So, we have the regression algorithms and the classification algorithms; we will not get into all the details. Regression algorithms are linear regression, polynomial regression, ridge regression, lasso regression, decision tree regression, KNN regressors, XGBoost regressors.

0:38:33 These are all regression algorithms. Classification algorithms, most basically, are logistic regression, decision trees, KNN, naive Bayes, multinomial naive Bayes, Bernoulli naive Bayes, and then you have random forest, extra trees classifiers, XGBoost, gradient boosting machine (GBM), AdaBoost, LightGBM,

0:38:52 CatBoost. So a lot of these algorithms are there for classification, okay. So I have given industry use cases for both, okay.

0:39:01 That is all on the supervised side. So ML or DL, what model I will create later on, that's a different story; but let's look at it in terms of data availability.

0:39:17 So back to the same. In supervised learning, why do we have X and Y? Why is it called supervised? Because the model's prediction is typically written mathematically as Y hat.

0:39:34 So, how do you calculate the error from the model? By comparing the model prediction, this is the model prediction,

0:39:46 with the actual target; your actual target is this. So that difference becomes the error from the model.

0:39:55 And this error can be used for the model training. I'll train the model using this error. I'll say, hey model, you are making more errors for this kind of scenario but fewer errors for that kind of scenario, so please focus on that.

0:40:11 So that feedback I'm able to give based on this error. How does the model know whether it is learning well or badly? Because of the feedback it is getting through these errors.
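To make the idea of "error as feedback" concrete, here is a minimal sketch in plain Python. It assumes a one-feature linear model, a squared-error measure, and gradient descent as the training rule; none of these were named in the session, and the house sizes and prices are made-up numbers for illustration only.

```python
# Toy supervised data (made up): X = house size in thousands of square feet, Y = price.
X = [1.0, 1.5, 2.0, 2.5, 3.0]
Y = [100.0, 150.0, 200.0, 250.0, 300.0]


def predict(w, b, x):
    """Model prediction (Y hat) for a single input."""
    return w * x + b


def mean_squared_error(w, b):
    """Average squared gap between Y hat and the actual target Y."""
    return sum((predict(w, b, x) - y) ** 2 for x, y in zip(X, Y)) / len(X)


def train(steps=5000, learning_rate=0.05):
    """Fit w and b by gradient descent, using the error as the feedback signal."""
    w, b = 0.0, 0.0
    n = len(X)
    for _ in range(steps):
        # Error for each row: prediction minus actual target.
        errors = [predict(w, b, x) - y for x, y in zip(X, Y)]
        # Gradients of the mean squared error with respect to w and b.
        grad_w = 2 * sum(e * x for e, x in zip(errors, X)) / n
        grad_b = 2 * sum(errors) / n
        # Nudge the parameters in the direction that shrinks the error.
        w -= learning_rate * grad_w
        b -= learning_rate * grad_b
    return w, b


w, b = train()
```

Without Y there would be no `errors` list to compute, which is exactly why the target has to be known for supervised training.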

0:40:22 That's why it is called supervised. To make it supervised you need to have the information about both X and Y, because Y is needed to calculate that error, and that error is used to give feedback, like a teacher. So you are learning machine learning, and I am the teacher.

0:40:40 Let's say I tell you, hey, you are making this mistake. How am I able to point out that mistake? Because I know the actual answer. If I don't know it, how will I give feedback in the first place? So the actual target value has to be known. If you don't have that target value, then we are talking

0:41:01 about unsupervised learning, and in this case you have X data only. We don't have any target information.

0:41:19 So there is no Y in this case. You only work with X; that's the fundamental difference between the two. So supervised learning is of two types, regression and classification; that is very clear by now. So if you don't have a target, then you might be wondering what we will do. There

0:41:41 are a few tasks that I'll just mention; we will not get into all the details. So what are the tasks that you can do in unsupervised learning? The simplest example is clustering, or customer segmentation.

0:41:58 Okay, clustering or segmentation. So let me give one use case of clustering. Imagine you are working in a bank that has got, let's say, a hundred thousand customers or one million customers.

0:42:11 Now many of these customers also have credit cards, with the same bank and with other banks. So you have access to all the transactions that they have made over the last ten years.

0:42:27 Now, your bank wants to launch a premium signature credit card with exotic benefits like unlimited access to airport lounges.

0:42:38 You get 12 free entries to international golf courses and clubs; Corinthians Club or Pan Card Clubs access is there, complimentary 10 times a year, let's say, and so on, a number of these kinds of benefits.

0:42:59 Fine dining benefits are there: you will get an additional 20% discount at fine dining restaurants, and so on; I am just naming a few of these. And you, as a data scientist, have been assigned the task of finding out which customers would be interested in this. Now the problem here is I don't have

0:43:22 information about any of these customers holding a similar credit card from any other bank. Will they tell me, right?

0:43:33 So I only have the information about, let's say, their transactions from the existing credit card, but I don't have information about what other credit cards they have and, more specifically, whether they have already got a similar premium credit card from any other bank.

0:43:54 Now the whole task is: with this entire pool of 1 million customers and their transaction information, how will you narrow down to your target customer list?

0:44:05 So your manager is saying, Prashant, I want you to help me, because it's not feasible to telecall all the 1 million customers and start pitching them the product details, because this is a premium credit card, mostly the invite-only type.

0:44:21 There will of course be higher limits and higher requirements for issuance of this credit card. Not everyone would be interested, plus not everyone would be eligible; both complications are there. So then how do you narrow down? Because you don't have the Y information, the target;

0:44:48 you don't know whether these customers already have similar credit cards or not. So all you have is their transaction information, and probably their CIBIL score and KYC details, that's it.

0:45:05 So then, using this much information, let's say I want to create three customer profiles, three customer segments: a high profile, a medium profile, and a low profile.

0:45:19 Now, this profile doesn't mean their earning capability; the profile means suitability and probability of conversion, both.

0:45:33 So, first is suitability: whether this person is eligible in terms of the bank's requirements. And number two: how likely it is that this person would also accept that particular credit card.

0:45:47 So that is the high profile: high means more probability and more eligibility. Low means low probability and low eligibility; that's my definition, let's say.

0:45:57 So I will do some calculations with some algorithms like k-means clustering; I can go for agglomerative clustering, hierarchical clustering, DBSCAN. A lot of clustering algorithms are there.

0:46:11 And after making use of these clustering algorithms, I have narrowed down to, okay, these 10,000 customers are on my high-profile side.

0:46:20 And that is what I will give to my manager: boss, this is my list of 10,000 customers who are not only eligible but might be interested, as per my algorithm's understanding.
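The segmentation step described above can be sketched with a small k-means implementation in plain Python. This is only an illustration under stated assumptions: the two features (average monthly card spend, international transactions per year), the twelve customers, and the deterministic way of picking starting centroids are all hypothetical, and a real project would scale the features and use a library implementation.

```python
import math

# Hypothetical customers: (avg monthly card spend in thousands, international txns per year).
customers = [
    (5, 0), (6, 1), (4, 0), (7, 1),                # low spend
    (40, 4), (45, 5), (38, 3), (42, 6),            # medium spend
    (150, 20), (160, 25), (140, 18), (155, 22),    # high spend
]


def k_means(points, k, iterations=20):
    """Plain k-means for k >= 2: assign to nearest centroid, then recompute centroids."""
    # Simple deterministic start (an assumption): spread centroids across the sorted points.
    ordered = sorted(points)
    centroids = [ordered[(i * (len(ordered) - 1)) // (k - 1)] for i in range(k)]
    clusters = [[] for _ in range(k)]
    for _ in range(iterations):
        # Assignment step: each customer joins the nearest centroid. Note there is no Y here.
        clusters = [[] for _ in range(k)]
        for p in points:
            nearest = min(range(k), key=lambda c: math.dist(p, centroids[c]))
            clusters[nearest].append(p)
        # Update step: each centroid moves to the mean of its members.
        new_centroids = []
        for c, members in enumerate(clusters):
            if members:
                new_centroids.append(tuple(sum(v) / len(members) for v in zip(*members)))
            else:
                new_centroids.append(centroids[c])
        if new_centroids == centroids:
            break  # nothing moved, so the clusters are stable
        centroids = new_centroids
    return centroids, clusters


centroids, clusters = k_means(customers, k=3)
# Treat the cluster with the highest average spend as the "high profile" segment.
high_index = max(range(3), key=lambda c: centroids[c][0])
high_profile = clusters[high_index]
```

The algorithm only groups similar customers; calling one group "high profile" is an interpretation added afterwards, which is why the list still needs to be validated by the sales team.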

0:46:35 And so my manager says, okay, I'll take 10% of this and give it to my sales team, the telecalling team, and we'll check how good your clusters are. Okay, fine.

0:46:48 So now telecalling 10,000 customers is much more feasible and logical than calling 1 million customers. And if you take a narrower subset of those 10,000, maybe 1,000, then 1,000 customers can easily be reached in 10 days by sales calls, and from there I can see the conversion rate,

0:47:09 and from there I might also want to refine my clusters after getting their feedback. But whatever it is, this is the meaning of clustering or segmentation: you can divide your customers into customer segments, which we also call clusters. So clustering or segmentation is one task.

0:47:33 So, these are the typical tasks. Here, on the supervised side, the tasks are regression and classification. And here, on the unsupervised side, these are the tasks.

0:47:42 So, second could be dimensionality reduction. Let us say I have got high-dimensional data. What does high-dimensional data mean? Let us go back to my clustering example.

0:48:01 I've got all the transaction data of multiple months. I have got their demographic information: gender, age, occupation, city, whether it is a tier-one or tier-two city, or whatever.

0:48:17 I have information about their asset portfolio and a number of other things. You see, the number of columns in your data is increasing now, because each piece of information would be a separate column.

0:48:28 So gender is one column, occupation is one column; current month credit balance, previous month credit, current month debit, previous month debit, monthly average balance, monthly average balance for the previous quarter, monthly average balance for the previous two quarters, and the

0:48:45 kinds of transactions are added. So each piece of information is adding a new dimension. Imagine that now I've got 200 columns in my dataset: that is the dimensionality of your dataset. 200 is the dimensionality; the number of columns defines the dimensionality.

0:49:06 Now, if I want to plot my clusters, to see how these clusters look; let's say my manager wants to see: Prashant, can you show me how good your clusters are, or can you plot these customers on a 2D graph?

0:49:20 That is not possible, because your data is 200-dimensional, and 200-dimensional data cannot be plotted on a 2D plot unless you reduce the dimensionality.
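As a rough illustration of reducing many columns to two plottable ones, the sketch below uses principal component analysis (PCA) computed with power iteration in plain Python. PCA is one common choice and was not named in the session; the 5-column dataset is made up, and real 200-column data would normally go through a library routine instead.

```python
import math
import random


def center(rows):
    """Subtract each column's mean so every column is centered at zero."""
    n = len(rows)
    means = [sum(col) / n for col in zip(*rows)]
    return [[v - m for v, m in zip(row, means)] for row in rows]


def covariance(rows):
    """Covariance matrix of already-centered rows."""
    n, d = len(rows), len(rows[0])
    return [[sum(r[i] * r[j] for r in rows) / (n - 1) for j in range(d)]
            for i in range(d)]


def top_direction(cov, iterations=300):
    """Power iteration: approximates the direction of largest variance."""
    d = len(cov)
    rng = random.Random(0)
    v = [rng.random() + 0.1 for _ in range(d)]
    for _ in range(iterations):
        w = [sum(cov[i][j] * v[j] for j in range(d)) for i in range(d)]
        norm = math.sqrt(sum(x * x for x in w))
        if norm == 0.0:
            break
        v = [x / norm for x in w]
    return v


def reduce_to_2d(rows):
    """Project every row onto the two directions of largest variance."""
    centered = center(rows)
    cov = covariance(centered)
    d = len(cov)
    directions = []
    for _ in range(2):
        v = top_direction(cov)
        variance = sum(v[i] * cov[i][j] * v[j] for i in range(d) for j in range(d))
        directions.append(v)
        # Deflation: remove the direction just found before searching again.
        cov = [[cov[i][j] - variance * v[i] * v[j] for j in range(d)]
               for i in range(d)]
    return [[sum(a * b for a, b in zip(row, v)) for v in directions]
            for row in centered]


# Made-up data: 30 customers, 5 columns that are really driven by 2 hidden factors.
data = []
for i in range(30):
    a, b = float(i), float((i * 7) % 11)
    data.append([a, 2 * a, b, -b, a + b])

points_2d = reduce_to_2d(data)  # 30 rows, each now just an (x, y) pair
```

Each customer ends up as a pair of numbers that can be put on a 2D scatter plot, at the cost of losing whatever variation the two chosen directions do not capture.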

0:49:37 So the immediate application of dimensionality reduction is big data visualization. Why do I want to do that? Because I want to visualize that big data.

0:49:46 So big data visualization is an immediate application of this. And third, what we can also do is recommendation systems.

0:49:57 Personalized and non-personalized, both. So, for example, say you are logging in for the first time on Netflix, and that too in incognito mode.

0:50:06 What Netflix shows you in this case is a non-personalized list of recommendations based on what is more popular in your geographic location, because Netflix will see your IP, and from that IP it knows whether you are in Mumbai or Bengaluru, and from there it knows what is more popular

0:50:30 in that geographic location and what kind of language is spoken. Based on that, the initial list of TV shows, movies or series will appear, and that is non-personalized.

0:50:47 At most, they might have some information about my gender, somehow, from my other browsing activity, and that is what those cookies are meant for.

0:50:54 These tracking cookies keep recording which other sites you visit in your browser. So the list is initially non-personalized, but that changes the moment I start browsing.

0:51:08 My browsing history now starts building a customer persona in their system. Now they see, okay, Prashant is browsing Star Wars.

0:51:22 He has also browsed for the movie The Hobbit. So most likely this guy is interested in sci-fi or something like that.

0:51:35 Definitely not interested in action movies, James Bond movies maybe, and may not even be interested in romantic comedies. So now slowly, because of my browsing history and watch history, they'll start narrowing down to specific genres, and then I'll get more and more personalized recommendations: because you

0:51:56 liked this, or because you watched this, you might also be interested in this, this and this. That's the kind of mailers you get from Flipkart and Amazon: because you bought this, you may also be interested in this, this and this.

0:52:10 If you bought shoes of this color, you will be given recommendations of socks matching that. Same thing. So each part of your browsing history, watch history and purchase history goes into creating a personalized recommendation.
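The two flavours of recommendation described above can be sketched with simple counting. This is a toy illustration, not how any real platform works: the users, cities and watch histories are invented, "popular in your city" stands in for the non-personalized list, and "watched together by other users" stands in for the personalized one.

```python
from collections import Counter

# Hypothetical watch history: user -> (city, titles watched).
history = {
    "u1": ("Mumbai", ["Star Wars", "The Hobbit", "Dune"]),
    "u2": ("Mumbai", ["Star Wars", "Dune"]),
    "u3": ("Mumbai", ["Star Wars", "Notting Hill"]),
    "u4": ("Pune", ["Notting Hill", "Skyfall"]),
}


def popular_in_city(city, top_n=2):
    """Non-personalized: most-watched titles among users in the same city."""
    counts = Counter(
        title
        for user_city, titles in history.values()
        if user_city == city
        for title in titles
    )
    return [title for title, _ in counts.most_common(top_n)]


def because_you_watched(title, top_n=2):
    """Personalized: titles that most often appear alongside `title` in other histories."""
    counts = Counter()
    for _, titles in history.values():
        if title in titles:
            counts.update(t for t in titles if t != title)
    return [t for t, _ in counts.most_common(top_n)]


first_visit_list = popular_in_city("Mumbai")
after_star_wars = because_you_watched("Star Wars")
```

Neither function uses a target column: the suggestions come only from patterns in what people watched, which is why the session places this under unsupervised learning.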

0:52:26 They have no idea whether you have previously purchased a similar product from a different platform or not. So the Y information is still not there.

0:52:36 They are just guessing; it's guesswork, but more intelligent guesswork. So that's what a recommendation system does: upselling and cross-selling. So, for example, going back to a bank again: you want to increase the revenue from the existing customers by 25% in the next quarter. That is the

0:52:57 business problem given to you. Fine. So then I go back to my business team and ask for more clarity: do you want this 25% revenue from the existing customers only, or can new customers be onboarded?

0:53:10 Then my boss says, no, no, Prashant, we cannot afford to pay the additional cost of customer acquisition. So this additional revenue has to come from existing customers only.

0:53:25 So now my only option is to do upselling and cross-selling. Then I will see what products can be upsold or cross-sold based on what other products they have already bought. Whether they will really purchase them or not, I don't know; that depends on how good the sales pitch is. But again,

0:53:42 I can only recommend. So because I have, let's say, a personal loan, I will be given recommendations for an overdraft, or I will be given recommendations for a secured loan, because I already have an unsecured loan.

0:53:59 A PL is an unsecured loan, so I'll be given recommendations for a secured loan now. They will never give you another unsecured loan.

0:54:08 And so on. Or let's say you have already got insurance from Acko; what does Acko do? If I bought motor insurance from Acko, now they will start pitching me health insurance and term insurance. That is cross-selling.

0:54:30 So we'll not get into the technical details of upselling and cross-selling. The same goes for insurance policies: the moment the platform gets to know that you have such-and-such policy from such-and-such insurer, they will start pitching you other policies.

0:54:47 Of course, not a similar one: if I have bought health insurance from XYZ insurer, I will be given term insurance recommendations and pitches, and I'll get sales calls. That is exactly what is happening here.

0:55:00 So their recommendation system is working. They have no idea whether I already have term insurance or not, but they will do their job. So that is unsupervised learning, and finally the last leg is about reinforcement learning.

0:55:23 In most cases, you will not start with any data; there is no upfront data requirement, you can say. So now this is a robot, and you want to teach this robot how to climb stairs.

0:55:43 Or you want to teach this robot how to play the game of chess or tic-tac-toe, whatever. Now you cannot have the data of all the billions of combinations on the chessboard.

0:55:59 So the robot will take some action in the environment. The environment will have some set of rules, and based on the action, the environment will reward or penalize this agent.

0:56:13 So this is the agent; my robot is my agent. The agent will perform some action in the environment, and the environment will give feedback to this agent in the form of a reward or a penalty.

0:56:27 So let's say the robot takes too small a step; it will fall. So it will be given some penalty, and then the robot will take a slightly bigger step, but it will still not be sufficient, so it will fall again and be penalized again. So the robot will take a slightly bigger step again, and this time it is able

0:56:46 to get onto the first stair, let's say. So then it will get a reward. Now the robot tries to figure out what kind of actions it should take so that it gets the reward repeatedly.

0:57:01 Those actions it is learning on its own. That is the reinforcement I'm talking about here. So you don't need any data upfront.

0:57:09 You just keep training your algorithm over millions of iterations. Yes, you will actually do millions and billions of iterations,

0:57:19 because you are not starting with historical data. The data gets generated as the iterations take place.

0:57:29 So slowly the agent will learn how big or how small a step it has to take to actually climb the stair, or what kind of moves lead to its losses or wins,

0:57:44 which are the winning moves and which are the losing moves. So you don't hardcode the winning moves and the losing moves.

0:57:50 You want this agent to learn on its own based on its actions. The algorithm will only reward or penalize them. That's exactly the concept of reinforcement learning.
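The reward-and-penalty loop for the stair-climbing robot can be sketched as below. This is a deliberately simplified, single-decision version (closer to a multi-armed bandit with an epsilon-greedy rule than to full reinforcement learning), and the step sizes, stair height and reward values are hypothetical numbers chosen for illustration.

```python
import random

STEP_SIZES_CM = [5, 10, 15, 20, 25]   # actions available to the agent (hypothetical)
STAIR_HEIGHT_CM = 18                   # known only to the environment


def environment(step_cm):
    """Reward +1 if the step clears the stair without overshooting, else penalty -1."""
    return 1 if STAIR_HEIGHT_CM <= step_cm <= STAIR_HEIGHT_CM + 5 else -1


def train(episodes=2000, epsilon=0.1, seed=0):
    """Epsilon-greedy learning: no dataset up front, experience is generated by acting."""
    rng = random.Random(seed)
    value = {s: 0.0 for s in STEP_SIZES_CM}   # estimated average reward per action
    tries = {s: 0 for s in STEP_SIZES_CM}
    for _ in range(episodes):
        if rng.random() < epsilon:
            step = rng.choice(STEP_SIZES_CM)                      # explore a random step
        else:
            step = max(STEP_SIZES_CM, key=lambda s: value[s])     # exploit the best so far
        reward = environment(step)                                # feedback from environment
        tries[step] += 1
        value[step] += (reward - value[step]) / tries[step]       # running average
    return value


value = train()
best_step = max(value, key=value.get)
```

Nobody tells the agent that 20 cm is the right step; it tries small steps, gets penalized, and settles on the action that keeps earning the reward.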

0:58:03 Now, in some tutorials you might also see semi-supervised learning. In semi-supervised learning, you have X, and only a fraction of the data is labeled.

0:58:23 So you don't have a label for all the data points. Let's say you have 1 million data points in X, and the actual label is your Y target.

0:58:36 The target is not present for all the data points; it is only present for a fraction. That is why we are calling it semi-supervised.

0:58:45 So on whatever part of the data has Y available we can train a supervised model, then on the other part we can train an unsupervised model, and then we can make a combination of those. That is semi-supervised, typically. Credit risk, logistic risk,

0:59:05 cyber security threat detection: these are the kind of typical applications people have been building with machine learning, along with predictive maintenance on sensor data and all that.
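One common way to use a small labeled fraction together with a large unlabeled pool is pseudo-labeling (self-training). Note that this is a swapped-in technique, not the exact "supervised plus unsupervised combination" mentioned in the semi-supervised discussion above; the risk scores, labels and the nearest-centroid model are all hypothetical simplifications.

```python
# Hypothetical 1-D feature (say, a risk score); only a few rows carry a label.
labeled = [(1.0, "low"), (1.5, "low"), (9.0, "high"), (8.5, "high")]
unlabeled = [0.8, 2.0, 2.5, 7.5, 8.0, 9.5]


def centroids(rows):
    """'Train' a nearest-centroid model: the mean feature value per label."""
    groups = {}
    for x, label in rows:
        groups.setdefault(label, []).append(x)
    return {label: sum(xs) / len(xs) for label, xs in groups.items()}


def nearest_label(x, cents):
    """Predict the label whose centroid is closest to x."""
    return min(cents, key=lambda label: abs(x - cents[label]))


# Step 1: fit a supervised model on the labeled fraction only.
cents = centroids(labeled)
# Step 2: use it to assign pseudo-labels to the unlabeled rows.
pseudo = [(x, nearest_label(x, cents)) for x in unlabeled]
# Step 3: refit on labeled plus pseudo-labeled rows.
final_cents = centroids(labeled + pseudo)
```

In practice only confident pseudo-labels are usually kept, since wrong ones feed errors back into the model.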

0:59:18 Okay, so in deep learning we talk about neural networks. It's all about neural networks. What is a neural network? It's a network created using neurons.

0:59:26 So the building block of deep learning is the neuron. Neurons, as in your head: you have billions of neurons in your brain.

0:59:36 So the human brain is an active biological neural network.

0:59:47 We have tried to understand what computations are going on within a human brain, and we have tried to mimic the same in the machine; that is why we call it an artificial neural network.

1:00:09 So deep learning is all about these artificial neural networks, where we are trying to create artificial neurons which mimic the functionality of the brain.
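As a minimal sketch of what an artificial neuron computes, the code below assumes the standard textbook form: a weighted sum of the inputs plus a bias, passed through a sigmoid activation. The weights and biases are arbitrary illustrative numbers; in a real network they would be learned during training.

```python
import math


def sigmoid(z):
    """Squash any number into the range (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))


def neuron(inputs, weights, bias):
    """One artificial neuron: weighted sum of inputs, then a non-linear activation."""
    z = sum(x * w for x, w in zip(inputs, weights)) + bias
    return sigmoid(z)


def layer(inputs, weight_rows, biases):
    """A layer is several neurons looking at the same inputs."""
    return [neuron(inputs, w, b) for w, b in zip(weight_rows, biases)]


def tiny_network(inputs):
    """Two inputs -> hidden layer of two neurons -> one output neuron."""
    hidden = layer(inputs, [[0.5, -0.2], [0.1, 0.8]], [0.0, -0.1])
    output = layer(hidden, [[1.0, -1.0]], [0.0])
    return output[0]


score = tiny_network([1.0, 2.0])
```

Stacking more such layers between input and output is what turns a shallow network into a deep one.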

1:00:18 The brain does a lot of thinking, and ML models cannot have that thinking process. So all the thought process has to come from a neural network.

1:00:28 So that is the basic thing. Deep learning is a specialized type of machine learning, you can say, that uses neural networks with many, many layers, also called deep neural networks.

1:00:38 You can have shallow neural networks if you have just 2 to 3 layers, but if you have more than 3 layers, it becomes deep.

1:00:45 The point is to model and understand complex patterns. So that is the main takeaway. Even today I start with linear and logistic regression only, and 90 percent of the time I do not find these exotic algorithms beating linear or logistic regression, especially when the data volume is small. That is my experience even today,

1:01:13 which still motivates me to stick to ML models in most of the use cases. Here I'm not referring to competitive scenarios like Kaggle and all that.

1:01:26 Competitive programming is a completely different paradigm. I'm talking about enterprise-grade applications, applications which are deployed.

1:01:37 We are still sticking to traditional methods because they are still able to solve most of the problems, unless you have millions or billions of data points. And what could a data point be? What is the definition of a data point?

1:01:53 So you can have structured data; this is tabular data. Structured or tabular data.

1:02:11 In the case of structured or tabular data, each row is one data point.

1:02:18 So that is one definition: each row in structured or tabular data is a data point. Or, in the case of textual data, what is the meaning of a data point?

1:02:33 In the case of text data, it is each email or each SMS message, each article or blog or news item, or each book. It could be as large as a book or as small as a tweet. So basically, what is your dataset composed of? Is your dataset composed of emails, is it composed

1:03:06 of SMS messages, is it composed of articles and news collections, or is it a book collection or a tweet collection? Based on that, the definition of each data point changes. So each tweet can be one data point, or each product review or movie review, in

1:03:30 the case of sentiment analysis of product and movie reviews. So let's say you can have a product review dataset of 10,000 reviews, or you can have a movie review dataset.

1:03:44 So the IMDb (Internet Movie Database) dataset is there. It's publicly and freely available with 50,000 movie reviews. So each review becomes one data point.

1:03:59 Okay, fine. Third, if you're talking about images, each image can be a data point.

1:04:11 So, for example, in a case of classification: whether this picture belongs to Prashant or Salman Khan or Tom Cruise. Then for each image you need to provide its label: this looks like Prashant.

1:04:30 And then another image comes up: okay, this looks like Tom Cruise. Then another image: okay, this looks like someone else.

1:04:38 So these are the labels. You have to provide images with their labels. Then it is a case of a classification problem, which is supervised.

1:04:46 You need X; X is your image, one image. And Y could be the label: this image is of Prashant or Tom Cruise or Salman Khan or Shah Rukh Khan, whatever names you want to assign.

1:05:00 So data points could be images, or a collection of images. Each image can be a data point, or it could be a sequence of images.

1:05:23 So think about this. I want to create a model for hand gesture recognition. So this is like I'm swiping forward, I'm swiping backward, I'm swiping upward, I'm swiping downward, and I'm swiping this way.

1:05:40 So fast forward, backward, volume up, volume down, pause, click. So I want to create a deep learning model with hand swipes so that your TV is able to recognize your hand gestures.

1:05:55 Then you need to have the sequence of image frames. Each data point here is a sequence: a sequence of 25 frames, or of 50 or 200 frames. That sequence of 200 frames will define one gesture, that is, forward or backward.

1:06:15 That is the label. So a sequence of images could be one data point: one sequence of 200 images with a label such as fast forward or fast backward, whatever.

1:06:32 That one sequence could be one data point. It depends on the use case, and on what kind of model you want to create.
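The different meanings of "one data point" described above can be written down as small data structures. The class and field names here are hypothetical, chosen only to mirror the examples from the session (a table row, a review, a labeled image, a 200-frame gesture).

```python
from dataclasses import dataclass
from typing import List

Image = List[List[int]]  # one grayscale image as rows of pixel values


@dataclass
class TabularPoint:
    features: List[float]   # one row of the table (X)
    target: float           # its target value (Y)


@dataclass
class TextPoint:
    text: str    # one email, SMS, tweet or review
    label: str   # e.g. "spam" or "positive"


@dataclass
class ImagePoint:
    pixels: Image
    label: str   # e.g. the person in the picture


@dataclass
class GesturePoint:
    frames: List[Image]   # a whole sequence of frames is ONE data point
    label: str            # e.g. "forward" or "backward"


blank_frame = [[0, 0], [0, 0]]  # tiny placeholder image
house = TabularPoint(features=[1200.0, 3.0], target=250000.0)
review = TextPoint(text="Loved the movie", label="positive")
gesture = GesturePoint(frames=[blank_frame for _ in range(200)], label="forward")
```

In every case the pattern is the same: some X (a row, a text, an image, a sequence) paired with one label Y.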

1:06:44 So images, and then you have audio data and video data. Deep learning is capable of working very, very well with textual data, with image data, and with audio and visual data also.

1:06:56 So audio and even video data as well. For example, pose recognition, what kind of pose it was, similar to gesture recognition. Or you can think about ASR, automatic speech recognition. So I give a 5-second or 10-second audio clip to my model for training. So a 10-second

1:07:24 audio clip is given to the model for training, and the model is supposed to recognize whether this voice belongs to Prashant or Mr Amitabh Bachchan. That is speaker recognition, a sibling task of automatic speech recognition, which converts speech to text. Traditional machine learning models cannot do this nearly as well; you need deep learning models for this. So they

1:07:44 have their very niche use cases. For example, object tracking: you want to have a CCTV camera at the traffic junction which can track the cars. You might have seen those bounding boxes over the cars in several images. So object detection, object tracking, object classification: all these kinds

1:08:06 of tasks can be done very easily by deep learning algorithms. These things are not so easily possible with ML. So that is where I am extending the capabilities of ML to the DL side, without getting into technical details of ML versus DL; these are the typical additional use cases which come

1:08:28 up from the deep learning side. Now let me give an unsupervised example. All the examples which I have mentioned here are supervised.

1:08:38 So, for example, in this case I want to classify a mail as spam or not spam, an SMS as spam or not spam.

1:08:45 I want to classify my news articles: whether it is a technology article, business article, political article, sports article or finance article.

1:08:54 Classification once again. I want to classify each book into one of the genres, let's say fiction, non-fiction, self-help, this and that.

1:09:02 I want to classify each tweet as positive, negative or neutral sentiment. I want to classify each movie review as positive, negative or neutral sentiment.

1:09:11 So these are the clear tasks. Most of them are classification. Images again: Prashant or not Prashant, classification. Sequence of images: again five gestures, again classification. Audio and video recognition:

1:09:25 again, this is more or less recognition and classification. So these are mostly classification. Regression comes up in very rare cases, and those rare cases are time series.

1:09:37 When you want to do stock market forecasting, you cannot do it with an ML model. You want to predict exactly how the Nifty will behave in the next one month; you want to capture that pattern.

1:09:49 So that is where the deep learning model comes into the picture. Okay, fine. So these are supervised learning use cases for deep learning. Now I will give you one unsupervised learning use case for deep learning. So this image has got a lot of noise in the background, and

1:10:15 this same image doesn't have noise. So this is a noisy image and this is a clean image. How did I get this clean image?

1:10:25 I might have done some Photoshop work and cleaned this noisy image into a clean image. Now let's say I have got 1 million such images, or let's say 10,000 such images.

1:10:38 Of course, a lot of manual effort was involved. I could have also done it the other way around: I start with a clean image and introduce some noise deliberately, and that way I can create noisy images. But I need to have two types of images, A and B.

1:10:55 I want to have noisy images and clean images; let's say 10,000 noisy images and their corresponding 10,000 clean images.
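The "other way around" approach, starting from clean images and adding noise deliberately, can be sketched as follows. The images here are tiny made-up grids of grayscale pixel values, and uniform random noise of up to 25 intensity levels is an arbitrary assumption; the point is only to show how noisy/clean training pairs get built.

```python
import random


def add_noise(image, noise_level=25, seed=0):
    """Return a noisy copy of a grayscale image (pixel values kept within 0..255)."""
    rng = random.Random(seed)
    noisy = []
    for row in image:
        noisy.append([
            min(255, max(0, pixel + rng.randint(-noise_level, noise_level)))
            for pixel in row
        ])
    return noisy


# Two tiny hypothetical clean images (2 x 2 pixels each).
clean_images = [
    [[100, 120], [140, 160]],
    [[10, 200], [30, 220]],
]

# Each training pair is (noisy input, clean output) for a denoising model.
pairs = [(add_noise(img, seed=i), img) for i, img in enumerate(clean_images)]
```

A denoising model would then be trained to map the first element of each pair to the second.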

1:11:02 I want to teach my model the concept of denoising. So let us say this is a denoising model: a noisy image goes in, and the output of that model is a clean image.

1:11:26 So you can say, okay, Prashant, how is it unsupervised? This is our X and this is our Y; this can very much be taken as classification. But this is not classification, actually. If you see, we are trying to generate the

1:11:45 clean images; we want to get rid of the noise. In denoising we are not classifying whether the image is noisy or not. Please understand why it is not supervised.

1:11:57 We are not classifying. We are not predicting a number here either. We are training this model so that the noise is removed.

1:12:09 You get the same image, but a much cleaner version of it. So what is the advantage? The advantage is that you no longer have to use Photoshop, or Photoshop might be using this model behind the scenes.

1:12:20 Now you don't have to do anything manually. You just give a noisy image, and instantly you get the clean image.

1:12:26 Just a click of a button. So that is one typical use case. Style transfer and many more; I mean, the generative capabilities are mostly unsupervised.

1:12:39 We'll talk about that in, I think, the next slide, the last slide for today. So you can see the strengths: deep learning excels in recognition, natural language processing, speech recognition and many more.

1:12:55 We never discussed NLP today, but natural language processing means understanding the grammar, the dependency between the different components of a sentence, subject or object, the structure of the sentence, the semantics, the meaning of the words, antonyms, synonyms, everything.

1:13:13 So that is your NLP stuff. Machine learning algorithms cannot learn the meaning of words, but deep learning algorithms, yes, absolutely: they can understand words and represent the words in the form of numbers. Because a computer does not understand the English language; a computer only understands the language

1:13:31 of numbers. So I can convert any English or French text into numbers, and that language of numbers is what the computer knows very well. So how do you get that? Via deep learning.
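The very first step of turning text into numbers can be sketched as below: build a vocabulary and replace each word by an integer ID. This is only the entry point and an assumption about the pipeline; deep learning models go further and learn a vector of numbers (an embedding) for each ID, which is where the meaning of words gets captured. The sentences are made up.

```python
def build_vocabulary(sentences):
    """Assign an integer ID to every distinct word; ID 0 is reserved for unknown words."""
    vocab = {"<unk>": 0}
    for sentence in sentences:
        for word in sentence.lower().split():
            vocab.setdefault(word, len(vocab))
    return vocab


def encode(sentence, vocab):
    """Convert a sentence into a list of integer IDs."""
    return [vocab.get(word, 0) for word in sentence.lower().split()]


corpus = ["the movie was good", "the movie was bad"]
vocab = build_vocabulary(corpus)
encoded = encode("The film was good", vocab)
```

Here "film" was never seen in the corpus, so it falls back to the unknown-word ID 0.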

1:13:45 So, another advantage: translation. Google Translate is a deep learning model. You give text in English; you get the text output in French.

1:13:54 That's a deep learning model. So those additional capabilities are there with deep learning, because these models can really learn a lot of complexities in the data.