1 00:00:00,100 --> 00:00:02,350 ♪ [music] ♪ 2 00:00:03,700 --> 00:00:05,700 - [narrator] Welcome to Nobel Conversations. 3 00:00:07,000 --> 00:00:10,128 In this episode, Josh Angrist and Guido Imbens 4 00:00:10,128 --> 00:00:13,700 sit down with Isaiah Andrews to discuss and disagree 5 00:00:13,700 --> 00:00:16,580 over the role of machine learning in applied econometrics. 6 00:00:18,300 --> 00:00:19,769 - [Isaiah] So, of course, there are a lot of topics 7 00:00:19,769 --> 00:00:21,087 where you guys largely agree, 8 00:00:21,087 --> 00:00:22,313 but I'd like to turn to one 9 00:00:22,313 --> 00:00:24,240 where maybe you have some differences of opinion. 10 00:00:24,240 --> 00:00:25,728 So I'd love to hear some of your thoughts 11 00:00:25,728 --> 00:00:26,883 about machine learning 12 00:00:26,883 --> 00:00:29,900 and the role that it's playing and is going to play in economics. 13 00:00:30,200 --> 00:00:33,352 - [Guido] I've looked at some data that's proprietary, 14 00:00:33,352 --> 00:00:35,100 so there's no published paper there. 15 00:00:36,719 --> 00:00:38,159 There was an experiment that was done 16 00:00:38,159 --> 00:00:39,500 on some search algorithm. 17 00:00:39,700 --> 00:00:41,497 And the question was... 18 00:00:42,901 --> 00:00:45,600 it was about ranking things and changing the ranking. 19 00:00:45,900 --> 00:00:47,500 It was sort of clear... 20 00:00:48,400 --> 00:00:50,600 that there was going to be a lot of heterogeneity there. 21 00:00:50,600 --> 00:00:51,700 Mmm, 22 00:00:51,700 --> 00:00:58,120 you know, if you look for, say, 23 00:00:58,300 --> 00:01:00,350 a picture of Britney Spears, 24 00:01:00,350 --> 00:01:02,400 it doesn't really matter where you rank it 25 00:01:02,400 --> 00:01:05,500 because you're going to figure out what you're looking for, 26 00:01:06,200 --> 00:01:07,867 whether you put it in the first or second 27 00:01:07,867 --> 00:01:09,800 or third position of the ranking. 
28 00:01:10,100 --> 00:01:12,500 But if you're looking for the best econometrics book, 29 00:01:13,300 --> 00:01:16,500 if you put your book first or your book tenth, 30 00:01:16,500 --> 00:01:18,100 that's going to make a big difference 31 00:01:18,600 --> 00:01:21,829 in how often people are going to click on it. 32 00:01:21,829 --> 00:01:23,417 And so there you go -- 33 00:01:23,417 --> 00:01:27,218 - [Josh] Why do I need machine learning to discover that? 34 00:01:27,218 --> 00:01:29,195 It seems like I can discover it simply. 35 00:01:29,195 --> 00:01:30,435 - [Guido] So in general-- 36 00:01:30,435 --> 00:01:32,100 - [Josh] There were lots of possible... 37 00:01:32,100 --> 00:01:35,490 - You want to think about there being lots of characteristics 38 00:01:35,490 --> 00:01:37,610 of the items, 39 00:01:37,610 --> 00:01:41,682 and you want to understand what drives the heterogeneity 40 00:01:42,300 --> 00:01:43,427 in the effect of-- 41 00:01:43,427 --> 00:01:45,600 - But you're just predicting. 42 00:01:45,600 --> 00:01:47,700 In some sense, you're solving a marketing problem. 43 00:01:48,400 --> 00:01:49,580 - [inaudible] it's causal effect, 44 00:01:49,580 --> 00:01:51,800 - It's causal, but it has no scientific content. 45 00:01:51,800 --> 00:01:53,300 Think about... 46 00:01:54,100 --> 00:01:57,300 - No, but there are similar things in medical settings. 47 00:01:58,000 --> 00:02:01,300 If you do an experiment, you may actually be very interested 48 00:02:01,300 --> 00:02:03,900 in whether the treatment works for some groups or not. 49 00:02:03,900 --> 00:02:06,500 And you have a lot of individual characteristics, 50 00:02:06,500 --> 00:02:08,000 and you want to systematically search. 51 00:02:08,000 --> 00:02:09,500 - Yeah. 
I'm skeptical about that -- 52 00:02:09,500 --> 00:02:12,603 that sort of idea that there's this personal causal effect 53 00:02:12,603 --> 00:02:13,900 that I should care about, 54 00:02:14,000 --> 00:02:16,063 and that machine learning can discover it 55 00:02:16,063 --> 00:02:17,596 in some way that's useful. 56 00:02:17,596 --> 00:02:21,400 So think about -- I've done a lot of work on schools, 57 00:02:21,400 --> 00:02:23,950 going to, say, a charter school, 58 00:02:23,950 --> 00:02:25,225 a publicly funded private school, 59 00:02:25,225 --> 00:02:26,500 effectively, you know, that's free to structure 60 00:02:26,500 --> 00:02:29,300 its own curriculum, for context there. 61 00:02:29,300 --> 00:02:31,000 Some types of charter schools 62 00:02:31,000 --> 00:02:32,700 generate spectacular achievement gains, 63 00:02:32,700 --> 00:02:36,400 and in the data set that produces that result, 64 00:02:36,400 --> 00:02:37,800 I have a lot of covariates. 65 00:02:37,800 --> 00:02:41,200 So I have baseline scores, and I have family background, 66 00:02:41,200 --> 00:02:45,800 the education of the parents, the sex of the child, the race of the child. 67 00:02:45,800 --> 00:02:48,300 And, well, as soon as I put 68 00:02:48,400 --> 00:02:51,900 half a dozen of those together, I have a very high-dimensional space. 69 00:02:52,300 --> 00:02:54,900 I'm definitely interested in sort of coarse 70 00:02:54,900 --> 00:02:59,400 features of that treatment effect, like whether it's better for people who 71 00:02:59,900 --> 00:03:02,100 come from lower-income families. 
72 00:03:02,600 --> 00:03:06,000 I have a hard time believing that there's an application, 73 00:03:06,400 --> 00:03:10,300 you know, for the very high-dimensional version of that, where 74 00:03:10,500 --> 00:03:13,200 I discover that for non-white children who have 75 00:03:13,800 --> 00:03:17,800 high family incomes but baseline scores in the third quartile, 76 00:03:18,300 --> 00:03:23,000 and only went to public school in the third grade but not the sixth grade... 77 00:03:23,000 --> 00:03:25,500 So that's what that high-dimensional analysis produces: 78 00:03:25,800 --> 00:03:28,100 this very elaborate conditional statement. 79 00:03:28,300 --> 00:03:31,000 There are two things that are wrong with that, in my view. First, 80 00:03:31,000 --> 00:03:34,000 I don't see it as... I just can't imagine why it's actionable. 81 00:03:34,600 --> 00:03:36,600 I don't know why you'd want to act on it. 82 00:03:36,600 --> 00:03:41,200 And I know also that there's some alternative model that fits almost as well 83 00:03:41,800 --> 00:03:43,000 that flips everything, 84 00:03:43,200 --> 00:03:47,500 right? Because machine learning doesn't tell me that this is really the predictor 85 00:03:47,900 --> 00:03:48,100 that 86 00:03:48,400 --> 00:03:52,300 matters; it just tells me that this is a good predictor. And so, 87 00:03:52,800 --> 00:03:55,900 you know, I think there is something different about the 88 00:03:56,000 --> 00:03:58,400 social science context. - [Guido] So I think 89 00:03:58,500 --> 00:04:02,600 the social science applications you're talking about are ones where 90 00:04:03,400 --> 00:04:08,100 I think there's not a huge amount of heterogeneity in the effects. 91 00:04:08,400 --> 00:04:14,000 - [Josh] Well, there might be if you allow me to fill that space. - No, 92 00:04:14,600 --> 00:04:18,100 not even then. I think for a lot of those 93 00:04:18,300 --> 00:04:22,000 interventions, you would expect that 
the effect is the same sign 94 00:04:22,100 --> 00:04:22,900 for everybody. 95 00:04:23,400 --> 00:04:27,600 There may be small differences in the magnitude, but it's not... 96 00:04:28,200 --> 00:04:31,700 for a lot of these education interventions, they're good for everybody. 97 00:04:31,800 --> 00:04:32,300 They're... 98 00:04:32,900 --> 00:04:37,600 it's not that they're bad for some people and good for other people, and 99 00:04:37,600 --> 00:04:40,800 there are kind of very small pockets where they're bad, 100 00:04:40,900 --> 00:04:43,900 but there may be some variation in the magnitude, 101 00:04:44,000 --> 00:04:48,200 but you would need very, very big data sets to find those, and I 102 00:04:48,400 --> 00:04:51,400 agree that in those cases, they probably wouldn't be very actionable anyway. 103 00:04:51,700 --> 00:04:53,800 But I think there's a lot of other settings 104 00:04:54,100 --> 00:04:56,600 where there is much more heterogeneity. 105 00:04:57,400 --> 00:05:01,600 - [Josh] Well, I'm open to that possibility, and I think the example you gave 106 00:05:01,900 --> 00:05:05,000 is essentially a marketing example. 107 00:05:06,400 --> 00:05:08,400 - No, that may... there's a... it does 108 00:05:08,500 --> 00:05:10,700 have implications for industrial organization, 109 00:05:10,700 --> 00:05:13,900 how you actually... whether you need to worry about 110 00:05:14,000 --> 00:05:17,900 the market power... - Well, I'd like to see that paper. 111 00:05:18,400 --> 00:05:21,200 - [Isaiah] So the sense... the sense I'm getting is that 112 00:05:21,500 --> 00:05:23,500 - we still disagree on something. - Yes. 113 00:05:24,100 --> 00:05:26,700 - We haven't converged on everything. - I'm getting that sense. 114 00:05:27,200 --> 00:05:31,000 - Actually, we've diverged on this because this wasn't around to argue about. 115 00:05:33,200 --> 00:05:38,000 - Is it getting a little warm here? - Yeah. - Warmed up. Warmed up is good. 116 00:05:38,100 --> 00:05:40,800 - [Isaiah] The sense 
I'm getting is, Josh, sort of, you're not, you're not 117 00:05:40,900 --> 00:05:43,400 saying that you're confident that there is no way 118 00:05:43,400 --> 00:05:45,400 that there is an application where this stuff is useful. 119 00:05:45,400 --> 00:05:48,200 You are saying you're unconvinced by the existing 120 00:05:48,300 --> 00:05:52,200 applications to date. - Fair. - That I'm very confident about. - [Guido] Yeah, 121 00:05:54,200 --> 00:05:55,000 in this case, 122 00:05:55,300 --> 00:05:57,500 I think Josh does have a point that today, 123 00:05:58,000 --> 00:06:02,100 even in the prediction cases, where 124 00:06:02,300 --> 00:06:05,000 a lot of the machine learning methods really shine is 125 00:06:05,000 --> 00:06:06,600 where there's just a lot of heterogeneity. 126 00:06:07,300 --> 00:06:10,600 - You don't really care much about the details there, right? 127 00:06:10,900 --> 00:06:15,000 - Yes. - It doesn't have a policy angle or something. 128 00:06:15,200 --> 00:06:18,100 - Kind of recognizing handwritten digits and stuff. 129 00:06:18,300 --> 00:06:24,000 It does much better there than building some complicated model. 130 00:06:24,400 --> 00:06:28,100 But in a lot of the social science, a lot of the economic applications, 131 00:06:28,300 --> 00:06:32,100 we actually know a huge amount about the relationship between various variables. 132 00:06:32,100 --> 00:06:34,600 A lot of the relationships are strictly monotone. 133 00:06:35,400 --> 00:06:39,400 Education is going to increase people's earnings, 134 00:06:39,800 --> 00:06:44,100 irrespective of the demographic, irrespective of the level of education 135 00:06:44,100 --> 00:06:47,800 you already have. - Until they get to a PhD. - Yeah. There is a graduate school. 136 00:06:49,500 --> 00:06:50,700 - Over a reasonable range, 137 00:06:51,600 --> 00:06:55,900 it's not going to go down very much. Whereas 138 00:06:56,100 --> 00:06:59,700 in a lot of the settings 
where these machine learning methods shine, 139 00:06:59,700 --> 00:07:01,900 there's a lot of non-monotonicities, 140 00:07:02,100 --> 00:07:04,900 kind of multimodality in these relationships, 141 00:07:05,300 --> 00:07:11,500 and they're going to be very powerful. But I still stand by that. 142 00:07:11,700 --> 00:07:16,100 Kind of... these methods just have a huge amount to offer 143 00:07:16,400 --> 00:07:18,100 for economists, and they're going 144 00:07:18,200 --> 00:07:21,700 to be a big part of the future. 145 00:07:23,400 --> 00:07:25,800 - [Isaiah] It feels like there's something interesting to be said about machine learning here. 146 00:07:25,800 --> 00:07:27,700 So, Guido, I was wondering, could you give some more, 147 00:07:28,000 --> 00:07:29,000 maybe some examples 148 00:07:29,000 --> 00:07:32,500 of the sorts of examples you're thinking about with applications at the moment? 149 00:07:32,500 --> 00:07:34,100 - [Guido] So one area is where, 150 00:07:34,700 --> 00:07:36,400 instead of looking for average 151 00:07:36,500 --> 00:07:42,200 causal effects, we're looking for individualized estimates and predictions 152 00:07:42,400 --> 00:07:47,500 of causal effects, and there, machine learning algorithms have been very effective. 153 00:07:48,000 --> 00:07:48,100 Traditionally, 154 00:07:48,300 --> 00:07:51,500 we would have done these things using kernel methods. 155 00:07:51,600 --> 00:07:54,500 And theoretically they work great, and 156 00:07:54,600 --> 00:07:57,400 there are sort of some arguments that you formally can't do any better. 157 00:07:57,600 --> 00:08:00,500 But in practice, they don't work very well, and 158 00:08:00,900 --> 00:08:05,400 random forest, random causal forest-type things that Stefan Wager and Susan Athey 159 00:08:05,400 --> 00:08:09,500 have been working on are used very widely. 
160 00:08:09,600 --> 00:08:12,200 They've been very effective, kind of, in these settings 161 00:08:12,400 --> 00:08:18,100 to actually get causal effects that vary by 162 00:08:18,200 --> 00:08:19,900 covariates, and this kind of... 163 00:08:20,700 --> 00:08:25,700 I think this is still just the beginning of these methods. But in many cases, 164 00:08:26,400 --> 00:08:31,600 these algorithms are very effective at searching over big spaces 165 00:08:31,800 --> 00:08:35,600 and finding the functions that fit 166 00:08:35,900 --> 00:08:41,100 very well in ways that we couldn't really do beforehand. 167 00:08:41,500 --> 00:08:45,300 - [Josh] I don't know of an example where machine learning has generated insights 168 00:08:45,300 --> 00:08:48,100 about a causal effect that I'm interested in. And I 169 00:08:48,300 --> 00:08:51,300 do know of examples where it's potentially very misleading. 170 00:08:51,300 --> 00:08:53,700 So I've done some work with Brigham Frandsen 171 00:08:54,100 --> 00:08:55,100 using, for example, 172 00:08:55,100 --> 00:08:59,900 random forest to model covariate effects in an instrumental variables problem, 173 00:09:00,200 --> 00:09:01,200 where you need, 174 00:09:01,600 --> 00:09:03,500 you need to condition on covariates 175 00:09:04,400 --> 00:09:08,200 and you don't particularly have strong feelings about the functional form for that. 176 00:09:08,200 --> 00:09:10,000 So maybe you should, 177 00:09:10,500 --> 00:09:10,900 you think, 178 00:09:10,900 --> 00:09:14,500 be open to flexible curve fitting, and that leads you down a path 179 00:09:14,500 --> 00:09:18,000 where there's a lot of nonlinearities in the model, and 180 00:09:18,200 --> 00:09:23,000 that's very dangerous with IV because any sort of excluded nonlinearity 181 00:09:23,300 --> 00:09:27,600 potentially generates a spurious causal effect, and Brigham and I showed that 182 00:09:27,900 --> 00:09:32,200 very powerfully, 
I think, in the case of two instruments 183 00:09:32,700 --> 00:09:36,000 that come from a paper of mine with Bill Evans, where if you, 184 00:09:36,500 --> 00:09:37,600 you know, replace 185 00:09:38,100 --> 00:09:42,600 a traditional two-stage least squares estimator with some kind of random forest, 186 00:09:42,900 --> 00:09:48,000 you get very precisely estimated nonsense estimates. And 187 00:09:49,000 --> 00:09:51,100 you know, I think that's a big caution. 188 00:09:51,100 --> 00:09:53,400 And, you know, in view of those findings, 189 00:09:53,700 --> 00:09:57,100 in an example I care about, where the instruments are very simple 190 00:09:57,400 --> 00:09:59,100 and I believe that they're valid, 191 00:09:59,300 --> 00:10:01,600 you know, I would be skeptical of that. So 192 00:10:02,900 --> 00:10:06,800 nonlinearity and IV don't mix very comfortably. - [Guido] Now, I said, 193 00:10:07,200 --> 00:10:11,400 you know, in some sense that's already a more complicated... - Well, it's IV. 194 00:10:11,600 --> 00:10:11,900 - Yeah, 195 00:10:12,500 --> 00:10:16,700 - ...but then we work on that. - Fair enough. 196 00:10:18,600 --> 00:10:22,300 - [Guido] As editor of Econometrica, a lot of these papers cross my desk, 197 00:10:22,700 --> 00:10:29,500 but the motivation is not clear and, in fact, really lacking. 198 00:10:29,800 --> 00:10:35,100 And they're not the old-type semiparametric foundational papers. 199 00:10:35,400 --> 00:10:37,100 So that's a big problem, 200 00:10:38,000 --> 00:10:42,400 and a kind of related problem is that we have this tradition in econometrics 201 00:10:42,600 --> 00:10:47,500 of being very focused on these formal asymptotic results. Kind of, we 202 00:10:48,800 --> 00:10:52,600 just have a lot of papers where people propose 203 00:10:52,800 --> 00:10:55,700 a method and then establish the asymptotic properties 204 00:10:56,300 --> 00:11:01,900 in a very kind of standardized way. - Is that bad? 
205 00:11:02,900 --> 00:11:07,200 - Well, I think it's sort of closed the door for a lot of work 206 00:11:07,200 --> 00:11:11,600 that doesn't fit into that, whereas in the machine learning literature, 207 00:11:11,900 --> 00:11:14,300 a lot of things are more algorithmic. People 208 00:11:15,700 --> 00:11:18,500 had algorithms for coming up with predictions 209 00:11:18,800 --> 00:11:23,600 that turn out to actually work much better than, say, nonparametric kernel regression. 210 00:11:24,000 --> 00:11:26,800 For a long time, we were doing all the nonparametrics in econometrics, 211 00:11:26,800 --> 00:11:31,100 and we did it using kernel regression, and that was great for proving theorems. 212 00:11:31,300 --> 00:11:34,800 You could get confidence intervals and consistency and asymptotic normality, 213 00:11:34,800 --> 00:11:37,000 and it was all great, but it wasn't very useful. 214 00:11:37,300 --> 00:11:40,900 And the things they did in machine learning are just way, way better, 215 00:11:41,000 --> 00:11:45,100 but they didn't have the proper... - That's not my beef with machine learning, that the theory 216 00:11:45,300 --> 00:11:51,200 is weak. - No, I mean, I'm saying there, for the prediction part, 217 00:11:51,400 --> 00:11:54,500 it does much better. - Yeah, it's a better curve-fitting tool. 218 00:11:54,900 --> 00:11:56,500 - But it did so 219 00:11:57,100 --> 00:12:02,700 in a way that would not have made those papers initially easy to get into 220 00:12:03,000 --> 00:12:06,300 the econometrics journals because it wasn't proving the type of things... 221 00:12:06,400 --> 00:12:11,200 You know, when Breiman was doing his regression trees, that just didn't fit in, 222 00:12:11,800 --> 00:12:15,100 and I think he would have had a very hard time 223 00:12:15,200 --> 00:12:18,400 publishing these things in econometrics journals. 
224 00:12:18,900 --> 00:12:24,400 So I think we limited ourselves too much, and 225 00:12:24,700 --> 00:12:27,900 that left us closed off 226 00:12:28,000 --> 00:12:30,800 from a lot of these machine learning methods that are actually very useful. 227 00:12:30,900 --> 00:12:34,000 Hmm. I mean, I think, in general, 228 00:12:34,900 --> 00:12:36,200 in that literature, the computer 229 00:12:36,200 --> 00:12:39,300 scientists have brought a huge number of these algorithms... 230 00:12:39,600 --> 00:12:43,900 have proposed a huge number of these algorithms that are actually very useful 231 00:12:44,000 --> 00:12:44,700 and that are 232 00:12:45,500 --> 00:12:49,100 affecting the way we're going to be doing empirical work, 233 00:12:49,800 --> 00:12:55,100 but we've not fully internalized that because we're still very focused on getting 234 00:12:55,300 --> 00:12:57,500 point estimates and getting standard errors 235 00:12:58,600 --> 00:13:01,200 and getting p-values in a way that 236 00:13:01,700 --> 00:13:03,100 we need to move beyond 237 00:13:03,300 --> 00:13:04,300 to fully harness 238 00:13:04,300 --> 00:13:10,700 the force, the... the benefits from the machine learning literature. 239 00:13:10,900 --> 00:13:15,100 - [Isaiah] Hmm. On the one hand, I guess I very much take your point that sort of the 240 00:13:15,200 --> 00:13:18,600 traditional econometrics framework of sort of propose a method, 241 00:13:18,600 --> 00:13:22,600 prove a limit theorem under some asymptotic story, story, story, 242 00:13:22,600 --> 00:13:26,900 story, story, publish a paper is constraining, 243 00:13:26,900 --> 00:13:29,700 and that in some sense, by thinking more 244 00:13:29,700 --> 00:13:33,200 broadly about what a methods paper could look like, we may... right, in some sense, 
245 00:13:33,200 --> 00:13:35,900 certainly the machine learning literature has found a bunch of things 246 00:13:35,900 --> 00:13:38,300 which seem to work quite well for a number of problems 247 00:13:38,300 --> 00:13:42,400 and are now having substantial influence in economics. I guess a question 248 00:13:42,400 --> 00:13:44,800 I'm interested in is, how do you think about 249 00:13:45,200 --> 00:13:47,600 the role of theory... 250 00:13:47,900 --> 00:13:51,200 Sort of, do you think there's no value in the theory part of it? 251 00:13:51,600 --> 00:13:54,800 Because I guess it's sort of a question that I often have on seeing 252 00:13:54,800 --> 00:13:56,900 the output from a machine learning tool, 253 00:13:56,900 --> 00:13:59,400 and actually a number of the methods that you talked about 254 00:13:59,400 --> 00:14:01,800 actually do have inferential results developed for them. 255 00:14:02,600 --> 00:14:06,400 Something that I always wonder about is sort of uncertainty quantification, and just, 256 00:14:06,500 --> 00:14:08,000 you know, I have my prior, 257 00:14:08,000 --> 00:14:11,000 I come into the world with my view, I see the result of this thing. 258 00:14:11,000 --> 00:14:14,500 How should I update based on it? And in some sense, if I'm in a world where 259 00:14:14,600 --> 00:14:15,100 things are 260 00:14:15,200 --> 00:14:18,200 normally distributed, I know how to do it. Here, I don't. 261 00:14:18,200 --> 00:14:21,400 And so I'm interested to hear how you think about that. - [Guido] So 262 00:14:21,500 --> 00:14:24,300 I don't see this as sort of saying, well, 263 00:14:24,400 --> 00:14:26,500 these results are not interesting, 264 00:14:26,600 --> 00:14:27,700 but there are going to be a lot of cases 265 00:14:28,000 --> 00:14:31,200 where it's going to be incredibly hard to get those results, and we may not be able 266 00:14:31,200 --> 00:14:33,200 to get there, and 267 00:14:33,400 --> 00:14:37,700 we may need to do it in stages, 
where first someone says, "Hey, I have this 268 00:14:39,600 --> 00:14:44,800 interesting algorithm for doing something, and it works well by some 269 00:14:45,600 --> 00:14:49,900 criterion on this particular data set, 270 00:14:51,000 --> 00:14:53,400 and I'm going to put it out there," and 271 00:14:53,700 --> 00:14:58,000 maybe someone will figure out a way that you can later actually still do inference 272 00:14:58,000 --> 00:14:59,100 under some conditions, 273 00:14:59,100 --> 00:15:02,100 and maybe those are not particularly realistic conditions. 274 00:15:02,100 --> 00:15:05,500 Then we kind of go further. But I think we've been 275 00:15:06,700 --> 00:15:11,400 constraining things too much, where we said, you know, "This is the type of things 276 00:15:12,100 --> 00:15:14,400 that we need to do." And in some sense, 277 00:15:15,700 --> 00:15:18,200 that goes back to kind of the way Josh and I 278 00:15:19,700 --> 00:15:21,900 thought about things for the local average treatment effect. 279 00:15:21,900 --> 00:15:24,600 That wasn't quite the way people were thinking about these problems 280 00:15:24,600 --> 00:15:29,200 before. There was a sense that some of the people said, you know, 281 00:15:29,500 --> 00:15:31,900 the way you need to do these things is you first say 282 00:15:32,200 --> 00:15:36,300 what you're interested in estimating, and then you do the best job you can 283 00:15:36,500 --> 00:15:37,700 in estimating that, 284 00:15:38,100 --> 00:15:44,200 and what you guys are doing is, you guys are doing it backwards. 285 00:15:44,300 --> 00:15:46,700 You're going to say, "Here, I have an estimator, 286 00:15:47,300 --> 00:15:49,600 and now I'm going to figure out what 287 00:15:49,800 --> 00:15:51,400 it's estimating," and then, ex post, 
288 00:15:51,400 --> 00:15:53,900 you're going to say why you think that's interesting 289 00:15:53,900 --> 00:15:56,600 or maybe why it's not interesting, and that's not okay. 290 00:15:56,600 --> 00:15:58,600 You're not allowed to do it that way. 291 00:15:59,000 --> 00:16:04,100 And I think we should just be a little bit more flexible in thinking about 292 00:16:04,300 --> 00:16:06,300 how to look at 293 00:16:06,400 --> 00:16:11,300 problems because I think we've missed some things by not doing that. 294 00:16:13,000 --> 00:16:16,600 - [Josh] So you've heard our views, Isaiah. You've seen that we have 295 00:16:17,000 --> 00:16:20,400 some points of disagreement. Why don't you referee this dispute for us? 296 00:16:22,500 --> 00:16:28,100 - [Isaiah] Oh, it's so nice of you to ask me a small question. So I guess, for one, 297 00:16:28,200 --> 00:16:33,200 I very much agree with something that Guido said earlier of... 298 00:16:36,000 --> 00:16:36,300 So, 299 00:16:36,500 --> 00:16:37,900 where it seems... where 300 00:16:37,900 --> 00:16:41,400 the case for machine learning seems relatively clear is in settings where, 301 00:16:41,500 --> 00:16:45,100 you know, we're interested in some version of a nonparametric prediction problem. 302 00:16:45,100 --> 00:16:49,700 So I'm interested in estimating a conditional expectation or conditional probability, 303 00:16:50,000 --> 00:16:52,100 and in the past, maybe I would have run a kernel... 304 00:16:52,100 --> 00:16:55,800 I would have run a kernel regression, or I would have run a series regression, or 305 00:16:56,100 --> 00:16:57,400 something along those lines. 
306 00:16:57,700 --> 00:16:58,000 Sort of, 307 00:16:58,000 --> 00:16:58,700 it seems like 308 00:16:58,700 --> 00:17:02,000 at this point we have a fairly good sense that in a fairly wide range 309 00:17:02,000 --> 00:17:06,300 of applications, machine learning methods seem to do better for, 310 00:17:06,400 --> 00:17:06,800 you know, 311 00:17:06,800 --> 00:17:08,800 estimating conditional mean functions 312 00:17:08,800 --> 00:17:12,000 or conditional probabilities or various other nonparametric objects 313 00:17:12,400 --> 00:17:16,600 than more traditional nonparametric methods that were studied in econometrics 314 00:17:16,600 --> 00:17:19,100 and statistics, especially in high-dimensional settings. 315 00:17:19,500 --> 00:17:23,100 - So you're thinking of maybe the propensity score or something like that? 316 00:17:23,100 --> 00:17:25,300 - Exactly, so nuisance functions. - Yeah. 317 00:17:25,300 --> 00:17:28,900 - So things like propensity scores, or, I mean, even objects 318 00:17:28,900 --> 00:17:30,100 of more direct inference... 319 00:17:30,200 --> 00:17:32,400 interest, like conditional average treatment effects, right? 320 00:17:32,400 --> 00:17:35,100 Which are the difference of two conditional expectation functions, 321 00:17:35,100 --> 00:17:36,300 potentially. Things like that. 322 00:17:36,500 --> 00:17:40,400 Of course, even there, right, the theory 323 00:17:40,500 --> 00:17:43,700 for inference, or the theory for sort of how to interpret, 324 00:17:43,700 --> 00:17:45,900 how to make large-sample statements about some of these things, is 325 00:17:46,000 --> 00:17:50,100 less well developed, depending on the machine learning estimator used. 326 00:17:50,100 --> 00:17:53,800 And so I think something that is tricky is that we 327 00:17:53,900 --> 00:17:55,700 can have these methods which work a lot... 328 00:17:55,700 --> 00:17:58,000 which seem to work a lot better for some purposes, 
329 00:17:58,000 --> 00:18:01,600 but which we need to be a bit careful in how we plug them in or how 330 00:18:01,600 --> 00:18:03,300 we interpret the resulting statements. 331 00:18:03,600 --> 00:18:06,200 But of course, that's a very, very active area right now, where 332 00:18:06,400 --> 00:18:10,400 people are doing tons of great work. And so I fully expect and hope 333 00:18:10,400 --> 00:18:12,800 to see much more going forward there. 334 00:18:13,000 --> 00:18:17,300 So one issue with machine learning that always seems a danger is, or 335 00:18:17,400 --> 00:18:20,300 that is sometimes a danger and has sometimes led to 336 00:18:20,500 --> 00:18:22,600 applications that have made less sense, is 337 00:18:22,800 --> 00:18:25,100 when folks start with a method that they're... 338 00:18:25,300 --> 00:18:28,500 start with a method that they're very excited about rather than a question, 339 00:18:28,900 --> 00:18:32,100 right? So sort of starting with a question where here's the 340 00:18:32,500 --> 00:18:36,200 object I'm interested in, here is the parameter of interest, let me, 341 00:18:36,700 --> 00:18:37,100 you know, 342 00:18:37,300 --> 00:18:39,500 think about how I would identify that thing, 343 00:18:39,500 --> 00:18:41,800 how I would recover that thing if I had a ton of data. 344 00:18:41,900 --> 00:18:44,000 Oh, here's a conditional expectation function. 345 00:18:44,000 --> 00:18:47,100 Let me plug in a machine learning estimator for that. 346 00:18:47,200 --> 00:18:48,800 That seems very, very sensible. 347 00:18:49,000 --> 00:18:53,100 Whereas, you know, if I regress quantity on price 348 00:18:53,700 --> 00:18:56,000 and say that I used a machine learning method, 349 00:18:56,300 --> 00:18:58,900 maybe I'm satisfied that that solves the endogeneity problem 
350 00:18:58,900 --> 00:19:01,200 we're usually worried about there, maybe I'm not, 351 00:19:01,500 --> 00:19:03,200 but again, that's something where 352 00:19:03,400 --> 00:19:06,300 the way to address it seems relatively clear, right? 353 00:19:06,500 --> 00:19:09,000 It's to find your object of interest and 354 00:19:09,200 --> 00:19:11,600 think about... - Is that just bringing in the economics? 355 00:19:11,700 --> 00:19:12,200 - Exactly. 356 00:19:12,200 --> 00:19:15,400 - And can I think about identification, but harness 357 00:19:15,400 --> 00:19:18,300 the power of the machine learning methods precisely 358 00:19:18,500 --> 00:19:22,800 for some of the components? - Precisely. Exactly. So sort of, you know, 359 00:19:22,900 --> 00:19:25,600 the question of interest is the same as the question of interest has always been, 360 00:19:25,600 --> 00:19:29,500 but we now have better methods for estimating some pieces of this, right? 361 00:19:29,900 --> 00:19:31,600 The place that seems harder to, uh, 362 00:19:31,900 --> 00:19:33,400 harder to forecast is, right, 363 00:19:33,400 --> 00:19:36,300 obviously, there's a huge amount going on in the machine 364 00:19:36,400 --> 00:19:37,400 learning literature, 365 00:19:37,500 --> 00:19:39,700 and the, right, sort of the limited ways 366 00:19:39,700 --> 00:19:42,900 of plugging it in that I've referenced so far are a limited piece of that. 367 00:19:43,000 --> 00:19:46,100 And so I think there are all sorts of other interesting questions about where, 368 00:19:46,300 --> 00:19:46,900 right, sort of, 369 00:19:47,100 --> 00:19:49,300 where does this interaction go? What else can we learn? 370 00:19:49,300 --> 00:19:52,000 And that's something where, you know, I think there's 371 00:19:52,200 --> 00:19:56,400 a ton going on, which seems very promising, and I have no idea what the answer is. 372 00:19:57,000 --> 00:20:01,200 - No, no. So I totally agree with that, but it's... 
373 00:20:01,800 --> 00:20:03,500 That makes it very exciting. 374 00:20:03,800 --> 00:20:06,100 And I think there's just a lot of work to be done there. 375 00:20:06,600 --> 00:20:11,400 - [Josh] All right. So Isaiah agrees with me there. - [Isaiah] I didn't say that per se. 376 00:20:14,500 --> 00:20:17,700 - [narrator] If you'd like to watch more Nobel Conversations, click here, 377 00:20:18,000 --> 00:20:20,400 or if you'd like to learn more about econometrics, 378 00:20:20,500 --> 00:20:23,100 check out Josh's Mastering Econometrics series. 379 00:20:23,600 --> 00:20:26,500 If you'd like to learn more about Guido, Josh, and Isaiah, 380 00:20:26,700 --> 00:20:28,200 check out the links in the description.