Guessing Games: Assessment

A man had trouble with his English, so his friend taught him how to say, “Apple pie and coffee,” so when on the job, he could order some food at the local restaurant during his lunch hour. This was fine with our man, and he was grateful to his friend, but after several months he wanted a little more variety in his fare. His friend was glad to oblige and taught him how to say, “Ham and cheese sandwich.”
The man proudly walked into the restaurant the next day and said to the waitress, “Ham and cheese sandwich.”
To which the waitress responded, “White, whole wheat, or rye?”
With shoulders sagging and the smile gone from his face, he answered back, “Apple pie and coffee.”
Found here

My dad told us that one when we were kids. He put on what I recall was supposed to be an Italian accent when saying the telltale words, his version of the new immigrant. Perhaps he put in some other embellishments too. I don’t think he meant it as a morality lesson. Periodically my dad would tell us jokes at dinner, those he heard at work that could be recycled to us kids. Here’s another one from that vintage, certainly no lesson about morals there. The joke above, however, can be taken that way. We’re unable to venture beyond what we already know, however dearly we’d like to, because we don’t have the capacity to learn something new. Literally, we are in a rut.

This seems all the more so for my discipline, economics. The question is why. Could it be that the native language is so strong that we become deaf to any other? Is it hubris or simply mental incapacity that explains the lack of adaptation, the mistaking of Shangri-La for Purgatory? The Chicago School, in particular, is having its comeuppance. We have witnessed some amazing confessions of late, none more remarkable than that of Richard Posner, the judge and University of Chicago Law School professor, an icon in the field of Law and Economics and a longtime advocate of the view that markets provide the best solutions for society’s travails. The review by Robert Solow makes clear just how astonishing Posner’s book is, not for its thesis, which is rather ordinary to liberal economists such as Solow, but for who articulates it. It’s as if a lifelong Fundamentalist Christian suddenly turns to Atheism, left with no other conclusion after confronting the evidence from science. Those changes don’t ever seem to happen. Posner, however, whose intellectual integrity emerges intact from this book, is forced into this reckoning because it became clear from the recent financial meltdown that such markets are not self-regulating. Indeed, the pursuit of the quick buck via ever increasing financial leverage was a primary source of instability, creating a house of cards that toppled onto itself. One can only wonder whether five or ten years hence, after the economy has rebounded somewhat and moved to its new equilibrium path, the Chicago School thinking will return to its prior orthodoxy or instead forge some new, as yet unknown synthesis.

The rest of the profession may not let them stay where they are, in which case surely the battle ground will be macroeconomics. Current formal macroeconomic models, the type that get published in the leading economic journals, feature neither unemployment nor default on loans. George Akerlof and Robert Shiller in their new book call this a fatal flaw. How can such models accurately predict the functioning of our economy when starting a priori with such lack of realism? And why is it that we’ve reached this state of affairs with scholarly thinking about macroeconomics?

It wasn’t always this way. When I started graduate school in fall of 1976, Northwestern prided itself on teaching an up to date curriculum. Sure, we began our first quarter of macro with Robert Eisner, an unabashed Keynesian. There we read The General Theory, though I must say we probably didn’t understand it very well and Professor Eisner knew full well that we wouldn’t. So we also read the Hicks paper, Mr. Keynes and the Classics. That paper introduced the IS-LM (Investment Demand-Savings, Liquidity Demand-Money Supply) framework, which typified intermediate macroeconomics textbooks at the time and continues to do so to this day. The IS-LM model produced something akin to the familiar supply and demand curves from microeconomics, which largely explains its popularity.

The next quarter course, taught by Robert J. Gordon, offered a bridge between the Keynesian mainstream and the then new “rational expectations” approach. In the former category we read Patinkin’s Money, Interest, and Prices and learned about the Phillips curve, which posits an inverse relationship between the rate of unemployment and the rate of inflation. (Gordon taught us from his then not yet complete undergraduate textbook about something he called the ee curve, which produced a direct relationship between unemployment and inflation. In conjunction with the Phillips curve we could then get the familiar scissors that “determines” the equilibrium unemployment and inflation rates.) This theory seemed more ad hoc than IS-LM. It would have survived, however, if it had done a good job empirically. But on that score the approach was treading on thin ice.

The Wage and Price Controls of the Nixon Administration had been lifted and the first OPEC price shock had been weathered. The result, which ultimately doomed the Carter Presidency, was stagflation, a mystery incompatible with Phillips curve thinking. For an econ grad student desperately in search of terra firma, so entirely unprepared for the idea that what we were taught in one quarter would be repudiated in the next, there were even worse things in store. The notion that an activist fiscal policy could counter the economic malaise was under intellectual attack. As if to signify the ascendancy of the Chicago School, Milton Friedman had recently won the Nobel Prize in Economics. We read a paper by Robert Barro, no Chicago School economist himself, Are Government Bonds Net Wealth?, which implied government spending could not boost aggregate demand, because of the implied future tax burden needed to finance the spending. And we read Sargent and Wallace’s paper about optimal monetary policy in light of rational expectations by the public. The paper argued that only surprises can matter. Anticipated policy changes were neutral.

Expectations about the future, particularly about inflation, are crucial for macroeconomic models since such expectations drive key variables like the interest rate. The rational expectations idea, dating back to John Muth’s 1961 paper, is that the agents in the economy shouldn’t be dumber than the model used to generate their behavior. Instead they should use the model itself and all available current information to formulate their expectations. Prior to the rational expectations idea, expectations had been modeled as “adaptive,” meaning that future values were predicted by some weighted average of past values, with the weights declining as we go further into the past. Adaptive approaches can produce inconsistencies between the expectations so generated and the prediction of the underlying model. Rational expectations are coveted because no such inconsistencies are generated.
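To make the inconsistency concrete, here is a minimal sketch. It assumes a made-up "model" in which inflation rises by one point every period, and it assumes geometrically declining weights for the adaptive rule. The function names, the weight parameter lam, and the inflation numbers are all illustrative and come from no particular paper.

```python
def adaptive_forecast(history, lam=0.5):
    """Weighted average of past values; weights decline geometrically with age."""
    recent_first = list(reversed(history))
    weights = [lam * (1 - lam) ** age for age in range(len(recent_first))]
    total = sum(weights)  # normalize so the weights sum to one
    return sum(w * x for w, x in zip(weights, recent_first)) / total


def model_consistent_forecast(history, step=1.0):
    """Forecast that uses the (assumed) model itself: last value plus the known step."""
    return history[-1] + step


# Hypothetical inflation path generated by the assumed model: up one point each period.
inflation = [2.0, 3.0, 4.0, 5.0, 6.0, 7.0]

adaptive_errors = []
consistent_errors = []
for t in range(1, len(inflation)):
    past = inflation[:t]
    adaptive_errors.append(inflation[t] - adaptive_forecast(past))
    consistent_errors.append(inflation[t] - model_consistent_forecast(past))

print("adaptive errors:  ", [round(e, 3) for e in adaptive_errors])
print("consistent errors:", consistent_errors)
```

Because a weighted average of past values can never exceed the largest past value, the adaptive forecaster under-predicts in every period of a steady rise. The forecaster who uses the model makes no error at all, which is the sense in which rational expectations avoid the inconsistency.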

This was the state of macroeconomic thinking during my first year of graduate school, with a tension between the Keynesians and the new Chicago School. It couldn’t stay there. Partly, this was because of politics. The Reagan Revolution was still in the offing. There was a need for the economic theory that supported the revolution to come into line. But mostly it was because of the culture of the economics discipline itself. Macro economists carry a stigma with them, lampooned nicely in Axel Leijonhufvud’s Life Among The Econ. The stigma emerges from the assumption that they can write aggregate structural equations without deriving those equations rigorously from individual behavior, which is then so aggregated. In other words, macro lacks a good micro foundation. It is micro that is exalted in the temple of Econ, because the behavior is derived directly from rational choice. I will provide my own critique of that in a bit, but first I want to describe the intellectual pull of micro.

Parallel to the macro training we were receiving we also had a year of micro, one quarter of “Price Theory” from Mort Kamien followed by two quarters of General Equilibrium from John Ledyard, my advisor. Ledyard was aware of micro’s pull. I recall that somewhere in the middle of the second quarter he asked our class whether we felt we were being indoctrinated into the field (he used some other words that I can’t recall), likening our education to the exposure neo Marxists get (Marx was still a pretty hot topic when I started graduate school, though not in the Econ department). Ledyard was onto something with this question. When in graduate school, each new micro theory idea seems sensible in the context in which it is first presented. Consequently, the economics doesn’t much grab you. Yet the totality produced by all the messages mesmerizes you and creates a sense of seeing reality a certain way that is very hard to undo.

One of the books Ledyard had us read was Koopmans’ Three Essays. It was my first exposure to the real beauty in the mathematical formulation. An example is the alarmingly simple result that identifies aggregate profit maximization with individual firm profit maximization when all firms are “price takers” and there are no externalities in production. This is an argument for decentralization. Let each firm go on its own merry way, maximizing to its heart’s content. Ayn Rand’s selfishness seems to find its virtue here. Koopmans also showed us the reasons for the core economic assumptions and the powerful implications that emerged from them as a consequence. Foremost was convexity. If the production set was convex, then there would be a supporting hyperplane for any boundary point, the outward facing normal to which yields the desired price system. (In English, for every possible efficient production plan a price system could be found under which that plan is the profit maximum, so long as the set of possible production plans was convex.) Conversely, under assumptions about boundedness that amounted to the maxim, “there is no free lunch,” and about closedness that posited existence of efficient production plans, there would be a profit maximizing plan for any given price system.
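The decentralization result can be checked on a toy example. The sketch below assumes two hypothetical firms, each with a short finite list of feasible production plans, facing the same fixed prices. The firm names, plans, and prices are invented for illustration, and finite lists stand in for the production sets only to keep the search trivial.

```python
from itertools import product


def profit(prices, plan):
    """Value of a production plan at given prices (inputs negative, outputs positive)."""
    return sum(p * y for p, y in zip(prices, plan))


# Hypothetical feasible plans per firm, written as (input, output) quantities.
firm_plans = {
    "firm_a": [(-1, 2), (-2, 3), (0, 0)],
    "firm_b": [(-1, 1), (-3, 5), (0, 0)],
}
prices = (2.0, 1.5)  # both firms are price takers facing the same prices

# Decentralized: each firm picks its own profit-maximizing plan.
individual_best = {}
for name, plans in firm_plans.items():
    individual_best[name] = max(plans, key=lambda plan: profit(prices, plan))
decentralized_total = sum(profit(prices, plan) for plan in individual_best.values())

# Centralized: a planner searches every combination of one plan per firm.
def combined_profit(combo):
    return sum(profit(prices, plan) for plan in combo)

planner_choice = max(product(*firm_plans.values()), key=combined_profit)
planner_total = combined_profit(planner_choice)

print("decentralized total profit:", decentralized_total)
print("planner total profit:      ", planner_total)
```

The two totals coincide because, with fixed prices and no externalities, aggregate profit is just the sum of terms that each depend on one firm's plan alone, so maximizing the sum and maximizing each term separately are the same problem.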

From Koopmans we graduated to full general equilibrium, allowing for consumers as well as producers, and with that learned about the Arrow-Debreu model, economics’ pièce de résistance. The model combines the optimizing decisions of consumers (those rational actors whom we’d like to model to give suitable micro foundations) along with the profit maximization of firms, all price takers, the assumption that perfect competition requires. We learned about what was required for a competitive equilibrium to exist. And we learned the two Fundamental Welfare Theorems, the analogs of Koopmans’ results for an economy with consumers as well as producers. The general equilibrium approach unites the rational decision making of individual actors with behavior in the aggregate, providing a theoretical justification for market solutions and offering elegance in the modeling approach that is surely captivating to those who have mastered the model.

Perhaps the best-known articulation of the theory is in Debreu’s book, Theory of Value, which shows the approach can readily be extended to economies with commodities that are differentiated by location, or through time of availability, or allocated based on a contingent state of nature. Indeed, the model is held up as the gold standard to which actual markets are compared, in that real world markets are not so complete. Ask a theoretically trained economist why real world markets don’t perform as well as their theoretical counterparts and very likely the response (at least until quite recently) would be about market incompleteness. (Credit default swaps are securities that completed some financial markets. But I know of nobody in the public sphere who has defended them for doing so.)

As elegant as the theory is, however, it has an air of unreality to it because contrary to the way it is advertised, it is fundamentally about planning, not about behavior. Consider this simple yet revealing example. When I was in graduate school I played a lot of pinball, partly out of fascination with the game and partly as a diversion from the economics. You can think of pinball from the perspective of the ball and its path of motion, in which case physics is the right tool for analysis, or you can think of it from the perspective of the player, which is getting closer to the mark, or you can think about it still another way, as the interplay between the two with feedback loops going both ways, from player to ball and from ball to player. If planning sufficed in the analysis, all the interesting consideration would go on inside the player’s head, before the plunger is pulled. In the process pinball, that almost devilish pursuit, perhaps a mild form of gambling, would get utterly transformed into the most cerebral of endeavors. For each position and velocity of the ball there would be an associated action, perhaps a gentle shaking of the machine, or a click of the flippers, or doing nothing other than monitoring where the ball is headed. And this map from position and velocity into action would be entirely anticipated in advance, determined to get the best possible score at the game’s end. The play itself would be reduced to carrying out this elaborate plan. But this is theoretical nonsense. Nobody envisions playing pinball this way.

Pinball, when played by an experienced and able player, is a work of art. I had the good fortune as a sophomore at Cornell to witness unbelievable prowess in pursuit of this art. A guy whom I knew only as Tex set a record that couldn’t possibly have been beaten. New York State had at the time (perhaps it still does) weird “Blue Laws” that allowed the pinball games to give free balls based on achieving a certain incremental score either from the start of the game or from the acquisition of the previous free ball, but not to give free games. Tex played the same game for about two hours straight, completely obliterating the previous high score. Then, with eight balls left (you start with five at the beginning of the game and could accumulate up to a maximum of ten balls) and having had enough of it by that moment, Tex tilted out the game. I have never seen anything like it, before or since. He had complete mastery of the game. Tex did not shake the machine, ever. Instead, he slapped at it on occasion as a way of imparting some additional momentum to the ball. I learned that slapping technique from watching him.

Pinball play is an example of decision making and execution in the moment. You get better at it with experience. You learn the particular game and the particular machine, how the flippers react, how long to let the ball roll down the flipper before trying to shoot the moon. All of this is done by feel, not by calculation. When you have the right feel you and the machine become parts of a whole. Things seem in synch. When you don’t have the feel, the balls seem to go through the flippers or down the side slots with rapidity and the game ends very quickly.

I’d argue that real economic behavior, as distinct from theoretical economic planning, is a lot like the actual play of pinball. Indeed there are micro models that do strive to capture some aspect of this more realistic notion of behavior. In fact there are models that admit unemployment in equilibrium and other models that include default on loans in credit markets as equilibrium behavior, just as Akerlof and Shiller would want. But here’s the thing. We don’t know how to aggregate up these micro models into a macroeconomic theory. Aggregation is a non-trivial undertaking, once you give up rational optimizing as the principle that guides behavior. The theorist is left with three choices. The first is to model behavior in a somewhat realistic way and then attempt to understand what happens in the aggregate via simulation, a very difficult enterprise where one can’t be sure about the generality of the results nor about the lessons learned. The second is to go back to the structural equations for the macro economy that were part of the standard toolkit when I was in graduate school, even though those structural equations are not derived from individual behavior. The third is to jettison individual behavior in favor of individual planning à la Debreu and buy into most if not all the results from the general equilibrium model, so that aggregation can be achieved and mathematical elegance in the modeling retained. Many macro economists have opted for this third alternative. Rather than a reluctant choice, however, embracing the approach became a badge of honor. I conjecture that after a while those who did stopped seeing the limitations in their approach. Their models became their reality. They didn’t need another and didn’t want to contemplate that another was possible.

This is not the first time in my adult memory where the economics profession has been assaulted because the unreality in its formulations came into conflict with real world behavior. In the late 1990s with the growth of Netscape, the dot.com boom, and ultimately the Microsoft Case, some thought the fundamental economic paradigm had changed. This new economy, characterized by large increasing returns to scale (in essence, copies of information were free to produce, with the Internet essentially a huge virtual photocopier) and with strong network effects (MS Word is more valuable to me if I know you have it too and vice versa) was fundamentally non-convex. It would be dominated by a few very large players who were anything but price takers. Koopmans hadn’t taught us about this sort of environment. Schumpeter provided the more apt lessons. Sure, we had been living all along with increasing returns in certain well known segments of the economy – railroads, telecommunications, electric power transmission, for example – but all that seemed limited and manageable. After Netscape, things changed. It felt as if the entire economy was being transformed, to be overtaken by the new information economics. Potentially, it was a time to reexamine the core macroeconomic models. Had that happened, perhaps Enron would have been (correctly) interpreted as a portent of things to come. But no such rethinking of macro took place then.

I can only guess why not. The economy was booming. Unemployment rates were unbelievably low, for a while less than 4%, yet inflation was modest. This wasn’t your grandmother’s idea of monopolized economies, with Harberger Deadweight Loss Triangles there for the asking. McDonald’s was paying more than twice the minimum wage and still couldn’t get employees; the labor market for unskilled workers was that tight. Innovation was coming fast and furious and kids with computer programming knowledge were dropping out of school to become instant millionaires. It was an exhilarating time, even if ultimately it was fueled by a bubble mentality. I don’t believe economists think hard about their fundamental assumptions in a boom. It takes a bust to get us to go back to the drawing board. Then, too, there was a lot of micro theory developed about network economics. Traditionally, monopoly had been the domain of micro, not macro. Perhaps there was less impetus toward reexamination for that reason.

Personally, in spite of having taught general equilibrium theory to graduate students at Illinois, I’ve never been comfortable thinking about macroeconomics via general equilibrium or any other way, because my intellectual center of gravity focuses on the decision maker first and foremost, or the pair of decision makers in some contractual relationship, such as the interaction between a sponsor and a contractor. When I was writing economic theory papers, I could get interesting behavior out of such a focus (this piece on Cost Overruns done with my doctoral student Antonio is an example), but I couldn’t roll that up in a nice way into an aggregate. At best, aggregation could happen within the one market where the particular relationship was happening, so I favored the partial equilibrium we studied that first quarter of grad school, duly modified to include the insights from the type of interactions that were of interest to me.

Partial equilibrium is full of ad hoc assumptions. (One is allowed to envision the market under consideration to be out of equilibrium but all other markets must necessarily be in equilibrium because income and the prices in these other markets are fixed.) Yet it does offer its own particular insight on aggregation. The question is: if some prices go up and other prices go down, what can be said overall? Has the price level gone up or gone down? This is an example of the index number problem. Economics is at its best when it shows there are no perfect answers to the issue at hand. Alas, that is the case with index numbers. Can’t live with them, well, you know the rest. The Dow, the S&P 500, the CPI, they are all part of the vernacular now. Social Security payments have a Cost of Living Adjustment (COLA) in them. The COLA aims to compensate beneficiaries whose purchasing power would otherwise erode due to inflation. Does the COLA provide the Goldilocks ideal, offering up the right temperature of porridge? Perhaps for some beneficiaries it does. For others it overcompensates and for still others it undercompensates. It is one size, which naturally doesn’t fit all. This is the lesson about aggregation from partial equilibrium. You can come up with a number, sure. But you can’t hang your hat on it.
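A small numerical sketch shows how slippery the answer is. It uses two standard index formulas, Laspeyres (base-period basket) and Paasche (current-period basket), on invented data: the goods, prices, baskets, and the two beneficiaries are all hypothetical, and treating the Laspeyres figure as the adjustment is an assumption made for illustration, not a description of how the actual COLA is computed.

```python
def basket_cost(prices, basket):
    """Cost of buying a basket of quantities at the given prices."""
    return sum(prices[good] * qty for good, qty in basket.items())


def laspeyres(p0, p1, q0):
    """Price index that holds the base-period basket q0 fixed."""
    return basket_cost(p1, q0) / basket_cost(p0, q0)


def paasche(p0, p1, q1):
    """Price index that holds the current-period basket q1 fixed."""
    return basket_cost(p1, q1) / basket_cost(p0, q1)


# Hypothetical economy: one price rises, the other falls.
p0 = {"rent": 10.0, "gadgets": 10.0}
p1 = {"rent": 12.5, "gadgets": 8.0}
q0 = {"rent": 5, "gadgets": 5}  # average basket before the price change
q1 = {"rent": 4, "gadgets": 7}  # average basket after buyers shift toward gadgets

index_l = laspeyres(p0, p1, q0)
index_p = paasche(p0, p1, q1)
print("Laspeyres:", round(index_l, 4))
print("Paasche:  ", round(index_p, 4))

# One adjustment for everyone, here assumed to follow the Laspeyres index.
cola = index_l

# Two hypothetical beneficiaries with different personal baskets.
renter = {"rent": 8, "gadgets": 2}
gadget_fan = {"rent": 2, "gadgets": 8}
renter_change = laspeyres(p0, p1, renter)
gadget_fan_change = laspeyres(p0, p1, gadget_fan)
print("renter's own cost change:    ", round(renter_change, 4))
print("gadget fan's own cost change:", round(gadget_fan_change, 4))
```

With these numbers the two indexes disagree even about the direction of the change: one says the price level rose, the other says it fell. The single adjustment sits between the two beneficiaries' own cost changes, so it falls short for the renter and overshoots for the gadget fan.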

* * * * *

I have been fortunate in my teaching to have had some unplanned yet highly revealing experiences that have profoundly shaped my views and have allowed me to reconsider what I was trying to achieve and how I should go about doing that. Some of these come from my own class; others are entirely outside the domain of instruction; still others fall somewhere in between. Here are examples, one of each.

In the late 1990s I was teaching intermediate microeconomics to a large class (about 180 students), using undergraduate TAs who’d previously taken the course to interact with the students online during scheduled office hours. This worked pretty well and was a popular thing with the students. They could get help right when they needed it, while they were working on their homework. But I got one complaint repeatedly: there should be face to face office hours with the TAs too. I didn’t think those would be heavily utilized so I resisted the suggestion initially. Ultimately I capitulated because it seemed that continuing to deny the request was having a negative impact on how the students viewed the online offering. Sometimes you do things just for the symbolism, not for the real consequences.

My office at the time was in a little cul-de-sac with other offices like mine that had a window and several “interior offices” (no window). Via the grant we had gotten to support the online learning we had arranged for one of those interior offices to be used by the TAs for their office hours. At the time, most people did dialup from where they lived and I wanted the TAs to have a good network connection when they were online. So we provided that and the computer to work on. We used the same space for the face to face office hours.

I scheduled the TAs to make sure we had coverage during all the evening slots, 7 – 11 PM. But since the face to face office hours were going to be in the afternoon and we scheduled them only as an afterthought, I was more at the mercy of the TA schedules. They had their own courses to take, so the timing for the face to face office hours was more haphazard. As it turned out, some of those overlapped my own office hours, not an efficient alternative to be sure, but the best we could come up with under the circumstances.

It was my habit at the time when in my office to leave the door open, to be closed only if somebody wanted to talk with me in confidence or if it got really noisy in the corridor. Keeping the door open was the friendly thing to do. Sitting at my desk, I could see all the floor traffic that entered my end of the cul-de-sac, which included everyone who entered the office that my TAs used. Imagine my surprise when I watched students in my class walk right past my own office so they could meet with the TA. This is the sort of “natural experiment” that most faculty don’t get a chance to witness. But I did.

If you buy into the information that U.S. News and other periodicals put out about rating colleges, then the quality of the faculty and the student-faculty ratio are two important metrics that determine the ratings. From that students would seem to value having interactions with faculty, particularly one-on-one interactions, so by that logic any student who would schlep up to the faculty member’s office to ask some questions would surely want to pose those to the faculty member, especially if the faculty member was not already engaged with another student. This line of thought makes what the students actually did a real puzzler.

Of course, the behavior is no mystery at all. Students who don’t understand something that they believe they should know are shy about it. The last person on earth they’d like to let in on their lack of understanding is their professor. It’s much easier for them to talk with undergraduate TAs, who have no responsibility whatsoever for determining their final course grades. That much seems clear enough, but I’ll push this one step further, because as an economist I will argue strongly that how people actually behave indicates what they value. Here were students doing just that. They valued interacting with the undergraduate TAs because they felt comfortable opening up with them. It didn’t matter that I was much more expert about the economics. What mattered was that the TAs were closer in age and experience to the students and that the TAs had almost no authority. That meant the students attending the office hours had less to risk by opening up about what they didn’t know. If real learning happens as someone goes from being stuck to getting unstuck and if one of the better ways to get unstuck is to ask for help, the willingness to open up about being stuck trumps the ability to access expertise. That fact appeared, right under my nose.

I am not saying that expertise doesn’t matter at all. From time to time the TA would come to my office to ask for help, unsure of how to answer the student’s question. And once in a while the TA and student would both come into my office, the curiosity getting the better of them. Once the student understood that I wasn’t going to bite his head off, then expertise surely was desired. The point is that getting comfortable comes first, before getting the expertise. If comfort isn’t achieved, the questions are never posed, and the learning doesn’t happen at all or gets put off till later when the student’s nerve might return. This is a Maslow Hierarchy of Needs argument. Self-esteem is a more basic need. That need must be addressed first. Teachers likely know this about their own learning. Nonetheless, they may be wooden to the idea when it is their own students who are in need. It is far too easy to abstract away the student’s fear of not knowing. Mostly it remains hidden from the instructor, so out of mind.

The next experience is really a set of experiences and comes from work. For a time I directed a campus Center for Educational Technologies with about ten people working under me. (The staff grew gradually in response to growth in use.) A few of these good folks were techies; they ran our servers and administered the software we supported. The rest of the staff directly assisted faculty members or did other work (like supervise our student employees) that was related to the first task. I had two scheduled weekly meetings with staff. One was with the Assistant Director, who acted as the office manager and made sure things were getting done. My job was to set general directions for the Center and identify big projects we would support, but otherwise leave the work to the staff. Through the one-on-one with the Assistant Director we could have a conversation where each of us could take the pulse on the other, learn about the current issues, and make sure things were progressing nicely. If they weren’t we’d have to do something about it.

The other scheduled meeting was an all-staff meeting. I felt it important for the staff to know what was going on in the office and for me to alert them together about campus matters that were relevant to our work. Though some in the group were naturally expansive and others more reticent, we did make an effort for everyone to get a turn. There is a natural tension in educational technology between the techies and those who directly assist faculty, because each wants to do things a certain way, with those ways being quite different. The techies are as a rule quite risk averse and want to offer services that are very controlled. Those who directly support faculty want services that are flexible so they can accommodate idiosyncratic needs when they arise. I thought it was beneficial for that tension to get a repeated public airing. Sometimes it was me on one side and the rest of the office on another. I can be pig headed about things and need the pushback from the group to change my thinking. That too was good to get aired in public.

I did have other meetings with staff, either on an as needed basis because an issue would arise or more as a mentoring type of conversation that would happen from time to time. I do like to have one-on-one conversations as a rule and would have done it more frequently with the staff, but time wouldn’t allow it. So this was the next best thing. I had numerous meetings outside the office, on committees or individually with other folks on campus. And I was still teaching the large class, where I would design experiments of new technology use to try, and occasionally writing pieces for review. So time was indeed scarce.

I wouldn’t argue that this way of running the office was the best possible approach. Ultimately, when the assistant director got more control of the office and the staff had grown larger still, she abandoned the full staff meeting in favor of meetings of staff with like roles. Those meetings were more functional and more participatory. So certainly it would have been possible to do things differently, even early on. But I would say that this experience was not paralleled by other experiences I had as a faculty member. Running the Center was unlike service on committees. Perhaps faculty who run a research lab with a bunch of post docs and graduate students have similar experiences, because so much of the discussion is about work flow. I never ran a research lab. But I did do this.

Ultimately my appointment got switched to full time administrator. Once that happened, I no longer had to teach as part of my job. When I did teach, it was because I volunteered for the activity. And understanding that I was a freebie for the Economics Department, I asked to teach small classes so I could have more interaction with students and learn from them. (Teaching was in some sense an applied research for my work directing the center. It was a chance for me to test out some of my ideas.) In these small classes we all could be seated and still be both seen and heard. There was no need for me to stand in front of the room. I found myself more comfortable conducting the live class session much as I had conducted the staff meetings, focusing on work flow much of the time, and designing the assignments for the course so that there would be project work that indeed could be discussed. My mechanism wasn’t exactly the same but it was similar and it was clearly informed by the prior experience. My taste had changed regarding what I wanted to achieve in the class. Generating conversation was the most important thing; lecturing to work through models in detail still happened from time to time, but wasn’t the steady fare. The students really liked this approach. My course evaluations were unambiguous on that score and it also showed in the impromptu conversations we had during the break. I’d ask the students about how the class compared to the rest of what they were taking. It was completely different. They liked the contrast.

My sense is that more faculty would experiment with their teaching if they had other models of how to conduct class that made sense to them. On this, faculty are creatures of habit. They do things a certain way because that’s the way they’ve done them in the past. Or they do it that way because when they were graduate students that’s the way it was done. If other models for class transactions don’t present themselves, there will be no change. In my position I’ve advocated for faculty experimentation, but when given the opportunity myself, the best I could do was imitate something I had already done before, only not in the classroom. I believe there is a big lesson in this observation. If we want faculty to change their teaching, we must give them alternatives in which they have already participated. Then they are imitating more than inventing. We will flatter them if we can come up with this set of experiences for them to copy. Then they will flatter us.

The last set of experiences I want to discuss informs what I value as a learner. My last two years as an undergraduate at Cornell I found what I was looking for. Only I stumbled into it and didn’t know it was what I wanted to find till I was well involved. My intellectual world was idyllic. Actually, it was two worlds. One revolved around classes. That world was private. I did my work on my own for the most part, took courses based on my own idiosyncratic interests. I had the occasional discussion with a classmate or TA or professor, but the topics and the entry into the discussion were determined by the class itself, not by outside influences.

The other world revolved around a rooming house where I lived, where I had social interactions with my housemates. We shared a kitchen and eating place and that was where we gathered. We had discussions there and played there and formed a bond there so that when we went out, to eat or to listen to music or to see a movie, we went as a tight knit group. These friends were diverse in their interests, in their areas of study, and in how far along they were in school. Somehow that diversity helped. It blocked certain areas of discussion (what we were studying) but enabled many other topics that might have been blocked if one of us had been expert. We were all novices so we were open among ourselves. We valued that openness a great deal. When outsiders came in they had layers of self-protection, they put on an act if you will; most everyone did. We nurtured them to peel off this false skin and let them be themselves. Then they became part of the group. It was a wonderful feeling to be with these people.

I did not know how to recreate that feeling and I lost it once I went to grad school. My social life and my class life there overlapped much more and the diversity of interest was lost. It was lonelier and less intellectual, though much more intensive about the subject matter of the classes themselves. I began to think of my Cornell experiences not as an aspiration for other experiences I would try to create on my own, but simply as part of a wonderful journey that had ended.

I found the same feeling not quite twenty years later, in an online discussion group we had on campus that focused on how to teach with technology. That was summer and then fall of 1995, pretty early on with online learning. On Friday evenings especially I was absorbed in reading the posts of colleagues in the discussion forum and making my own points. We too were a diverse community from all over campus, representing lots of different areas of study. But we had a common question to answer and some belief that the answers we came up with would cut across our fields. I can’t speak for the others who participated and what the draw was for them, but I was pretty much a homebody those evenings because we had very young kids and we spent much family time together in our living room. Going online was a diversion from that, though sometimes I had a kid with his head on my shoulder as I was looking at my computer screen.

I was back to being a novice in the conversation, though I had been a faculty member for 15 years. I did have teaching experience, but in no way did I have a pat hand about my class, where I knew that for the middling students especially it wasn’t working well at all. I think it really helped in building this sense of community that nobody was expert. Some had done prior things with technology that were interesting and useful for their students. But nobody had it all figured out. There was a need for a collective distillation of what we were learning based on the individual explorations we were all going through. That need served to form a tight bond with the group. As we began to satisfy the need the discussion turned more technical and less interesting to me. By late spring of 1996 it had essentially ended. It was a shooting star, wondrous while it lasted, unfortunately of too short duration.

* * * * *

Let’s return to “Apple pie and coffee.” The question I want to ask is whether others aside from economists act like new immigrants, unable to speak the new language and without a program for acquiring it. My guess is that it happens all over. The Bush Administration readily comes to mind. Read these pieces from 2006 in the New Yorker by George Packer, Letter from Iraq: The Lesson of Tal Afar and A Reporter at Large: Knowing the Enemy. (These were recommended by David Brooks in his Sidney Awards from that year.) They are remarkably interesting to read, even now, well after the nation has turned its attention away from Iraq. The first piece makes it seem as if we could have actually won the war and turned “the peace” over to the Iraqis, if only we had called it a counterinsurgency early on and managed it as such thereafter. Instead, the Pentagon was in denial. It “knew the answer” before the problem was solved, so no need to learn what was actually going on. Further, Rumsfeld and his minions maintained the untenable position that the original strategy was the right one from the get go, so they had a need to deny evidence that would tend to refute that position.

I recall in the press many discussions about the number of troops on the ground being inadequate. But I don’t remember much being said at the time that critiqued what it was the troops should be doing. Packer’s Tal Afar piece makes clear that there were essentially two different approaches pursued early on, the first – kill or disable all the Iraqi bad guys, the second – befriend and empower all the Iraqi good guys. The first alienates the entire population. The Americans are a foreign power and they are wreaking destruction. The second has a chance to grow the population who want to be good guys, both Shia and Sunni. It clearly is a more collaborative approach. It is also slower and requires greater patience. That is the nature of counterinsurgency. And it is waged mostly by cultural and economic means. The military part is only a small fraction.

If the second approach had been the plan across the board, it might very well have worked. But it didn’t play out that way. Commanders in the field were free to try the approach they thought best. So in some areas in Iraq there was intense fighting, as a result of the first approach, while in other areas there were efforts to give Iraq back to the Iraqis that proved successful, as a consequence of the second approach. Packer also reports about the sheer waste some years later, where many of the troops were housed on huge bases, essentially walled cities, and there was very little interaction between this occupying force and the native population. The opportunity for repair was there but it was being wasted, a consequence of the second approach not being more fully embraced early on.

We non-combatants tend to think that war is won on the battlefield. War is also waged by propaganda and the better propaganda machine tends to win. From growing up in the 1960s and from later reading, George Orwell especially but also various works about the Russian Revolution, I have implicit knowledge of that fact. Packer’s Knowing the Enemy piece makes that essential point, coupling it with modern day marketing on the Internet. Al Qaeda has learned the lessons of Malcolm Gladwell. They are winning the hearts and minds of many of the poor in the Muslim world through their approach.

The U.S. Military Machine, for all its might and its capabilities to shock and awe, is still in the Dark Ages on the propaganda/military front. Packer talks about what it would take to bring our approach up to the present, so it has a chance to compete. The key requirement is to embrace an anthropological approach, to understand “the street” from a cultural point of view. Only then is appropriate communication possible. Apparently we last tried this in Vietnam. We are loath to return to that past. By ignoring history, we turn out to be repeating it.

* * * * *

This is my Mr. Miyagi moment. Recall in The Karate Kid that after Daniel-San had waxed the cars, painted the fence, sanded the floors, and painted the house he was very angry. He confronts Mr. Miyagi, arguing that he’s supposed to be learning Karate, but he hasn’t learned anything. He’s just put in a lot of time doing a remarkable amount of uncompensated labor. Miyagi responds, in turn, that Daniel has indeed learned quite a bit. Miyagi goes further. He illustrates the point, asking Daniel to demonstrate what he has learned by first showing the proper motions from the particular activity, then by having Daniel defend himself from Miyagi’s attack where the learned motion provides the key to the defense. Miyagi is still teaching in the process, “Watch eye! Always watch eye.” It is the best scene in the movie.

Mr. Miyagi practices what Randy Pausch in his last lecture calls misdirection. The teacher gets the student to focus on one thing while the teacher is really after the student learning something different. The teacher holds back on that something different, letting the student engage in the first activity for itself, because the student can do that on his own. Eventually the time is ripe. The student has readied himself for the real point. Only he’s unaware he’s been readying himself all along. The instructor has his attention now and the instructor springs into action. This is the best way to teach important lessons. The process of misdirection signifies the importance of what is learned and the readying activity allows the student to take it all in, one big gestalt.

This chapter is about assessment. We’ve been doing readying activities for the topic right along. I led off with the economics because in talking about assessment many are making the same sort of mistake that the Chicago School macroeconomists have made. They have forgotten how difficult aggregation is to achieve. They make it seem easy by ignoring much of reality. Learning and the products that accompany it - the homework sheets, the projects, the essays, and all the rest - are extremely difficult to aggregate while still being able to make meaning of the result. It’s even harder with learning and its byproducts than it is with economics. The lure of standardized tests is that they seemingly can be aggregated, but it is fool’s gold. (The tests themselves do suffer from an index number problem that nobody talks about. Some students get a set of questions (A) right and a different set of questions (B) wrong. With other students the results are flip flopped. Then by the magic of the scoring, those students are ranked with one group outperforming the other, as if we know that one sort of mistake is less important than the other. It is becoming increasingly popular to argue for a portfolio rather than a test approach. Evaluating the portfolio by a rubric may change the nature of the evidence to evaluate but leaves the index problem intact.)
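The index number problem in the parenthetical above can be shown with a little arithmetic. What follows is only a sketch with made-up numbers: the two students, the counts of correct answers, and the three weighting schemes are all hypothetical, chosen so that the raw results are mirror images of each other. The point is that the ranking comes from the weights, not from the students.

```python
# Hypothetical data: number of questions answered correctly in each set.
# Student 1 does well on set A and poorly on set B; student 2 is the reverse.
results = {
    "student_1": {"A": 10, "B": 4},
    "student_2": {"A": 4, "B": 10},
}

def score(correct, weights):
    """Aggregate one student's results into a single number (a weighted sum)."""
    return sum(weights[q_set] * correct[q_set] for q_set in weights)

def ranking(results, weights):
    """Return student names ordered from highest to lowest aggregate score."""
    return sorted(results, key=lambda name: score(results[name], weights), reverse=True)

# Three hypothetical scoring schemes. None is "correct"; each embeds a
# judgment about which kind of mistake matters more.
equal_weights = {"A": 1, "B": 1}
favor_a = {"A": 2, "B": 1}
favor_b = {"A": 1, "B": 2}

for label, weights in [("equal", equal_weights), ("favor A", favor_a), ("favor B", favor_b)]:
    scores = {name: score(r, weights) for name, r in results.items()}
    print(label, scores, "ranking:", ranking(results, weights))
```

With equal weights the two students tie. Tilt the weights toward A and student 1 comes out ahead; tilt them toward B and student 2 does. Nothing about what the students know has changed between the three runs.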

Let’s grant that the test may have some value because it can be aggregated. From there through hubris, or intellectual laziness, or simply not understanding how much information is being ignored with this focus on the standardized test, the advocates have accepted the untruth that the tests are the total measure of learning. So they hang their hats on the numbers, though there is little or no justification in doing so. The tests are used to rate students, teachers, and schools, much as COLAs are used to adjust Social Security.

It is clear that the macroeconomics has failed, miserably. Its focus has been too narrow. It simply couldn’t explain the recent meltdown of the economy. Let’s take a lesson from that. There is much more about learning than can be seen by performance on standard tests. Let’s talk about learning in its totality and assess it accordingly.

Much assessment should occur during the process of students going about their class work. (In this piece by Malcolm Gladwell from December, the “superstar” high school math teacher (described in the middle of the page) gives individual feedback to most of the students in the class while they are working a problem.) When that class work is open ended, students need more than simple feedback. Process assessment, like my staff meetings, happens via conversation. Instructors must have ongoing conversations with students, ensemble, in small groups, and perhaps individually too. This is the simple takeaway from my message. It is an inescapable conclusion. Part of the conversation can be by correspondence. Part of the conversation can be by phone or by video conference. The medium matters less. What is most important is that these conversations happen and that they are ongoing, not once in a blue moon.

One reason for talking about Tal Afar is to show that this lesson about learning – the troops having tea with the Iraqis in this instance – is more basic than classroom instruction and therefore one will find the need for conversation wherever open ended learning takes place. Atul Gawande, in a quite recent piece in the New Yorker that attempts to explain the high cost of health care at a micro level, argues that conversation between doctor and patient and between the various doctors who devise the patient’s treatment is the essence of good health care. It happens all too rarely but does occur at exemplars of good practice, such as the Mayo Clinic.

I will not detail how those conversations should occur. I’ve written a good deal on this in my piece, Rethinking Office Hours and in an earlier piece, Killing the Puppy, linked from there. I do want to make the following point, however. Some conversations exist in name only. The participants are too guarded to have real exchange. The discussion ends up being perfunctory and unrewarding. It is necessary for everyone to open up first. That might not happen of its own accord. So the issue needs to be addressed squarely. Any mechanism for getting the class to function well will incorporate activities and structures to encourage openness of the participants. Students benefit not just from learning about the subject matter but also from developing the self-confidence that they can hold up their end of the conversation.

Conversation as assessment means the assessment is embedded in the activities of the class. This notion of embedding assessment is important and it stands in contrast with a view that says learning happens over here and assessment happens over there. With the latter view, conceptually the places must also vary temporally. Assessment happens after learning. This idea is ingrained. My campus has a Final Exams week at the conclusion of the semester. Many campuses have that sort of thing. The presence of assessment after the fact may block thinking about embedded assessment. If that is true, it is a shame. It is the embedded assessment that is fundamental. The conversations need to happen during learning.

Given that, one might ask if the final exam (or some other high stakes assessment) is necessary. The answer requires analysis rather than a simple yes or no. One has to look at the total environment and from that see what the final exam contributes. What do the students learn preparing for it? How do the students make their preparations? What does the instructor learn from the student performance? What would happen in the absence of the exam? Those questions require answers that will depend on the instructor, the students, and the subject matter. If the instructor actually does an analysis of the environment to produce such answers, it would be a very good thing for the instructor to write it up and make it explicit, for example by posting it online. I do believe that irrespective of the course students need to be responsible for some deliverable due at or near the end of term that is of substantial consequence regarding grade. The extrinsic incentive matters and in itself communicates what the instructor values in the work of the students.

The other reason for bringing up the essays by George Packer is that assessment must not restrict itself to whether the particular student is learning. It must also be about whether the mechanism itself is working. The two go hand in hand and happen simultaneously. The troops led by Colonel H.R. McMaster assessed the Iraqis in Tal Afar, who in turn assessed the Americans and their commitment to bringing order and well being to the community. Together they assessed whether their joint actions were making things safer and establishing trust between Shia and Sunni, enabling the Iraqis to take over on their own. So it is with learning in the classroom. The students assess the teacher as the teacher assesses the students. Collectively they assess whether the course “is working.” When it’s not, modifications need to be made in the approach. Those changes need to be suitably negotiated among the class members.

I fear that this norm I’ve sketched lies beyond current practice in far too many cases. One reason for this is that the instructor is isolated in his teaching and after a time goes about it as an automaton, adhering to a syllabus that was constructed long ago, basing the instruction on prior offerings of the course rather than on the current class population and recent events, in the field of study and in the world at large. The norm itself needs to be a part of campus culture and the culture within individual departments. At present the culture may not support the norm. It is the culture that must change.

I know that soon after I started college at MIT in 1972 I felt college was different from high school because in high school the school was responsible for the students’ learning but in college it was the student who was responsible. This was reflected in the difficulty of the material, the frequency with which we had classes and had to do homework, and the amount of interaction between teacher and student. I can’t say whether I would have had that feeling had I attended a different university. But let’s say it was true across the board. Then a good reason why supporting the norm is not part of the culture now is that it wasn’t part of the culture then and our individual values aren’t yet geared to change the culture.

Times, however, have changed. We are witnessing major institutions failing. We cannot let higher education fail. We need to recognize collectively the necessity of a shared responsibility for student learning. We must build a new culture with that as the basis. Gawande sketches what shared responsibility looks like in medicine. It is not hard to envision what the parallel would look like in higher education. There is a lot of work to do. We should get started.

Guessing Games

Saturday, May 30, 2009

Assessment
