Designed to Fade
What the research can and cannot tell us about scaffolding, partnership, and AI in learning
A recent piece I published looked at three datasets pointing in the same direction. Students are asking to be developed as thinkers, not trained as AI users. Employers are asking for the human capacities that develop through that kind of formation. The signal is consistent and reasonably easy to read. The harder question, the one the piece deliberately stopped short of answering, is what would actually have to happen inside courses for that signal to be honored.
Before getting to the design question, I want to spend some time with the underlying research. Not because the research is settled, but because both the parts that are clearer than most institutional responses suggest and the parts that are genuinely open matter for what comes next. The argument here is that the learning science we already have is more honest than most of the conversation about AI in education, and that the most important question it raises is one we cannot yet answer.
What scaffolding actually means
The concept of scaffolding has a specific meaning in learning science, and the meaning matters. Lev Vygotsky’s zone of proximal development, articulated in the 1930s but translated into wide use much later, described the gap between what a learner can do alone and what they can do with appropriate support from someone more capable. Wood, Bruner, and Ross gave the support a name in 1976. They called it scaffolding, by analogy to the temporary structure that lets a building rise. The analogy carried two ideas at once. The scaffold makes possible what the learner cannot yet manage alone, and the scaffold is supposed to come down. A scaffold that stays is no longer a scaffold. It becomes part of the building.
John Sweller’s cognitive load theory, developed across the 1980s and 1990s, gave the scaffolding tradition its most precise mechanism. Working memory is small and fragile. When the cognitive demands of a task exceed what working memory can hold, learning stops happening. Effective instruction reduces extraneous load (the mental work the task imposes that isn’t actually about the learning), preserves germane load (the work that builds schema), and manages intrinsic load (the complexity inherent in the material itself) by meeting the learner where their capacity actually is. When the learner is a novice, the scaffolding is heavy. As capacity grows, the scaffolding lightens, then disappears.
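A common schematic rendering of the theory’s additive assumption, not a formula from any single paper but a standard way of stating the mechanism, is that learning proceeds only while

intrinsic load + extraneous load + germane load ≤ working memory capacity.

Every unit of extraneous load the instruction strips away is a unit of capacity freed for the germane work that actually builds schema. That is why reducing the wrong kind of difficulty is not coddling; it is what makes the right kind of difficulty survivable.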
Barak Rosenshine synthesized decades of research on direct instruction into ten principles in 2012, and several of them name the same arc explicitly. Provide scaffolds for difficult tasks. Guide student practice. Reduce supports as competence increases. Move toward independent practice. The shape of effective instruction in this tradition is not flat. It is curved. The educator gives more at the start, less in the middle, almost nothing at the end. The learner ends up able to do, alone, what they could not do alone at the beginning.
This is the frame the AI conversation in education has not yet truly considered. When we talk about AI in learning, we tend to talk about the moment of use. What did the student produce, what role did AI play, was the result acceptable. The scaffolding tradition would ask a different sequence of questions. What was the learner capable of before they had access to the support? What capacity is the support helping them build? What can they do, alone, after the support is removed? Those questions are not optional in this tradition. They are what scaffolding means.
What Wang and Zhang actually measured
The most-discussed recent piece of research on cognitive offloading and AI sits awkwardly inside the scaffolding frame, and reading it carefully helps explain why. Wang and Zhang’s 2026 study of 912 students, published in the International Journal of Educational Technology in Higher Education, has been widely shared and widely interpreted. The interpretation has not always tracked what the paper actually shows.
The paper’s central variable is partnership orientation, not quantity of offloading. Partnership orientation is a relational construct. It describes the stance the learner takes toward AI: whether they treat it as a collaborator to think with, an authority to defer to, or a shortcut to use around. Wang and Zhang found that partnership orientation simultaneously predicted increased critical vigilance toward AI outputs (β = 0.335) and increased strategic delegation of substantive work to AI (β = 0.351), and that both pathways independently predicted transformative learning experience. The same orientation made students more critical and more willing to delegate, at the same time, and both contributed to learning depth.
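Schematically, and with the coefficients on the second step left unnamed because I am not reproducing the paper’s full model here, the dual-pathway structure looks like this:

critical vigilance = 0.335 × partnership orientation + ε₁
strategic delegation = 0.351 × partnership orientation + ε₂
transformative learning = γ₁ × vigilance + γ₂ × delegation + ε₃

One antecedent, two pathways that usually get framed as opposites, one outcome that both pathways feed.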
This is a different finding than the one that has often been pulled out of the paper. Several widely shared summaries, including a piece I wrote in April that drew on a thoughtful interpretive breakdown by another writer, framed the paper’s findings as a U-shaped curve with three zones of AI use. That framing came from an interpretive overlay on the paper’s exploratory quadratic analysis, not from the paper’s central findings. The quadratic analysis appears in the paper, but it is labeled as exploratory, the paper does not name zones, the paper does not designate any zone as worst, and the paper does not include a no-AI control group that would make the lowest end of the curve empirically grounded. I want to name this here, because this piece is going to do something different from what that earlier reading allowed, and the difference is the point.
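For readers who have not met the technique, an exploratory quadratic analysis adds a squared term to the regression, fitting something like

learning outcome = β₁ × AI use + β₂ × (AI use)² + ε,

and a significant β₂ bends the fitted line into a curve that can resemble a U. Reading the low, middle, and high stretches of that fitted curve as named zones, with one zone designated worst, is an interpretive step the model itself never takes, especially without a no-AI control anchoring the left end of the curve.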
What partnership orientation predicts, in Wang and Zhang’s data, is a particular kind of engagement with AI that produces deeper learning experience. It is not a quantity of use. It is a way of using. The implication for course design is meaningfully different from a zones reading. We are not trying to push students into a behavioral category. We are trying to develop a relational stance.
Partnership at Gap
The Microsoft Research field experiment with Gap Inc., published as a working paper in April 2026, gives us the clearest real-world picture I have seen of what partnership orientation looks like as a deliberate intervention and what it produces. The study had 388 full-time Gap employees, all with identical access to Microsoft Copilot. The researchers, Alex Farach, Alexia Cambon, Lev Tankelevitch, Connie Hsueh, and Rebecca Janssen, were not testing whether the tool worked. They were testing whether the structure surrounding the use of the tool mattered.
They tested two kinds of scaffolding. The first was behavioral. Pairs of employees followed a structured Create-Out-Loud protocol: meet via Microsoft Teams, discuss a strategic plan verbally, generate a transcript, prompt Copilot to draft a document from it. The second was cognitive. Individual employees received partnership training developed by Conor Grennan and AI Mindset, adapted from his Generative AI for Professionals curriculum. The training had no feature demos, no prompt tips, no use cases. It focused on three things: reframing AI as a collaborative partner rather than a search engine, replacing a single-query interaction habit with a multi-turn conversational one, and giving participants guided practice in iterative prompting.
The results split. The behavioral protocol underperformed unstructured AI use. Pairs assigned to the structured protocol produced documents averaging 10.68 out of 22, compared with 15.63 for the control group, and were more than eight times more likely to fail to produce a document at all. The protocol added coordination overhead that crowded out the actual cognitive work. The cognitive intervention went the other direction. Participants who received partnership training were more than twice as likely to produce a top-quality document compared with those who got standard Copilot feature training. Seventy-seven percent hit the maximum score versus 61.8 percent of controls.
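One note on the arithmetic, since 77 percent against 61.8 percent is plainly not double: my assumption, and it is an assumption about the paper’s statistics, is that “more than twice as likely” refers to odds rather than raw proportions. The odds of a top score work out to 0.77/0.23 ≈ 3.3 in the training group and 0.618/0.382 ≈ 1.6 among controls, an odds ratio of roughly 2.1.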
The contrast is the part worth considering. Identical tool access. Same time, same task, same evaluation rubric. The variable that mattered was not whether people had access to AI, and not whether their use was structured procedurally. The variable that mattered was how they thought about what AI is for. Conor Grennan, in a LinkedIn post about the results, put it this way: the company that makes Copilot proved the tool isn’t the bottleneck. How people think about it is.
The researchers are forthright about the study’s limitations. There is a confound between morning and afternoon sessions that they were not able to fully resolve. There was differential attrition between conditions. Documents were graded primarily by GPT-4o-mini rather than by human raters, with only moderate inter-rater agreement on the individual task. The study was not pre-registered. None of these limitations cancel the finding. They do mean that the field experiment is one piece of evidence in a longer conversation, not a settled answer. That is roughly the position partnership orientation occupies in the research base right now: a construct with real empirical support, real measurable effects, and real limitations that responsible readers should hold alongside the headline number.
What the Microsoft study and the Wang and Zhang study together support is reasonably specific. Cognitive framing matters more than feature knowledge. Relational stance matters more than quantity of delegation. The structure that helps is the kind that shapes how learners think about what they’re doing, not the kind that prescribes the steps they take.
What the research cannot yet tell us
Here is what the Wang and Zhang study, the Microsoft field experiment, and most of the current evidence base on AI and learning do not actually answer. We do not yet know what happens, in any well-measured way, when a learner who developed their thinking with AI present then faces a novel situation with the AI absent.
This is the frontier gap, and it is the part of the conversation that is missing from most institutional discussions of AI in higher education. Wang and Zhang measured transformative learning experience inside the learning task. Their outcome variable captured depth of engagement, perspective shift, and reframing of assumptions during the work itself. Microsoft measured document quality in a single session. The Pallant study from earlier in 2025 measured definition quality across twelve weeks, with AI present throughout. The 2025 MIT EEG study is one of the few that measures what happens when AI is removed mid-task, and it found that students who had been writing with ChatGPT showed reduced neural connectivity and could not recall their own earlier work. That study is small, single-session, and not yet replicated, but it is one of the only signals we have about what the absence of the support reveals.
The scaffolding tradition treats the absence of the support as the test. A learner who can do the work alone, after the scaffold is removed, has built capacity. A learner who cannot has not. This is the question the AI conversation in higher education has not been able to answer, because we have not yet built the studies that would tell us. Most of what we know about AI use in learning is about the in-task experience, in a single session, with the AI present throughout. The longitudinal question, the across-courses question, the novel-situation question, the without-AI question: all of them are open.
The picture this leaves us with is incomplete, and I want to be honest about that. The research base supports some specific claims. Partnership orientation produces deeper engagement during the task. Cognitive framing affects task quality more than feature training does. Scaffolding traditions in learning science have a long, well-documented track record across other domains. What the research base does not yet support is the claim that AI-integrated learning produces durable thinking capacity that transfers to situations where the AI is not available. That claim might turn out to be true. There are theoretical reasons to think it could be. The evidence we would need to make the claim with confidence does not exist yet. Decisions are being made on a partial record, and the partial record is not a reason to stop, but it is a reason to be careful about how confidently we speak.
The honest version of the scaffolding question, applied to AI, sounds like this. Are we building thinkers who happen to have new instruments available, or are we building proficient users of an instrument? Those are not the same project. The distinction shows up most clearly in what the learner can do when the instrument is taken away. We do not yet know how to test for that systematically inside higher education, and until we do, the institutional response to AI is operating on a confidence the evidence does not yet warrant.
Toward design
This is the post that does not solve the problem it raises. The next piece will spend its time on the design question. If we take the scaffolding tradition seriously, and we take partnership orientation seriously, and we hold the frontier gap honestly, what does that imply for the way we structure courses, assessments, and AI integration policies? That is where the practical work lives, and that is where student agency, ethical use, and the actual experience of learning come back to the center of the conversation.
What I want to leave readers with is a slightly different orientation toward the conversation. Most institutional discussions of AI in higher education are conducted as if we know more than we do. The vendor literature, the workshop circuit, and many of the rapid-rollout policies all assume a settled science that does not exist. The scaffolding tradition, applied carefully to AI, suggests something more demanding than the settled-science framing makes room for. We have a relational construct, partnership orientation, with empirical support but limited transfer evidence. We have a learning science tradition, Sweller and Vygotsky and Rosenshine, that gives us strong mechanisms but predates the technology by decades. We have a frontier gap that the field has not yet built the studies to close.
Holding all of that at once is harder than picking a side. It is also the responsibility we owe students who are asking us to develop them as thinkers. They are asking the right question. The research can tell us part of the answer. The rest of the answer is going to come from doing the work carefully, watching what happens, and being willing to revise as the evidence accumulates.
The next piece takes that into the classroom.