Only 31% of High School Graduates Prepare for STEM: Why Trebuchet Competitions Are Failing
By NovumWorld Editorial Team

The narrative that generative AI will rescue STEM education by acting as an infinite tutor is a dangerous lie that ignores the crumbling foundation of student preparation. We are witnessing a bubble where the hype surrounding large language models obscures a reality where high school graduates lack the basic calculus to understand the physics simulations they are supposedly controlling.
- Only 31% of high school graduates complete a college preparatory curriculum, rendering advanced STEM projects like trebuchet competitions futile for the majority of students who lack the requisite math skills.
- A national survey indicates 86% of teachers use AI detection tools, signaling a systemic distrust of student output and a crisis in academic integrity that undermines the learning process.
- The integration of generative AI into physics classrooms risks creating a generation of engineers who rely on GPT-4o for problem-solving rather than understanding the underlying calculus, effectively hollowing out technical competence.
The Case For: The Silicon-Enhanced Ballistics Myth
Proponents of integrating machine learning into high school engineering argue that computational tools accelerate the design iteration cycle, allowing students to bypass tedious manual calculations. This perspective suggests that by utilizing simulators like the Treb-Bot, students can model projectile trajectories and optimize counterweight ratios before cutting a single piece of wood. Theoretically, this mirrors professional workflows where engineers leverage massive compute clusters to run finite element analysis, preparing students for a world where NVIDIA H100s, with their 3.35 TB/s of memory bandwidth, do the heavy lifting. The NASA JPL ‘Candy Toss’ Competition is often cited as a success story where students apply complex principles in a gamified environment, ostensibly proving that engagement can be manufactured through high-tech challenges.
The argument posits that if a student can leverage a Transformer-based model to predict the optimal release angle or mass ratio, they are learning to wield modern tools. In this view, the AI acts as a force multiplier, handling the differential equations while the student focuses on system architecture and structural integrity. Advocates claim that this approach democratizes engineering, allowing students with weaker math skills to participate in high-level design by offloading the grunt work to the cloud. They point to the efficiency gains in industry, where Mixture of Experts (MoE) architectures like those used in Mixtral 8x7B allow for specialized routing of queries, arguing that students should learn to prompt these systems rather than memorize formulas. The promise is that by reducing the friction of failure, students become more willing to experiment, leading to deeper engagement with the material.
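To make the claim concrete, here is the kind of iteration loop proponents have in mind, reduced to a minimal sketch: drag-free ballistics, flat ground, and a hypothetical energy-transfer efficiency standing in for everything a real simulator like Treb-Bot would actually model. None of the numbers come from a real tool.

```python
import math

G = 9.81  # gravitational acceleration, m/s^2

def launch_speed(m_cw, m_proj, drop_height, efficiency=0.6):
    """Estimate launch speed from counterweight potential energy.

    Assumes a fraction `efficiency` of m_cw * g * drop_height becomes
    projectile kinetic energy. The 0.6 default is a placeholder, not a
    measured value from any real trebuchet or simulator.
    """
    kinetic = efficiency * m_cw * G * drop_height
    return math.sqrt(2 * kinetic / m_proj)

def flight_range(speed, angle_deg):
    """Flat-ground, drag-free range: v^2 * sin(2*theta) / g."""
    return speed**2 * math.sin(math.radians(2 * angle_deg)) / G

# The "design iteration loop": sweep release angles, pick the best.
v = launch_speed(m_cw=100.0, m_proj=2.5, drop_height=1.5)
best = max(range(20, 71), key=lambda a: flight_range(v, a))
print(f"launch speed ~{v:.1f} m/s, best angle {best} deg, "
      f"range ~{flight_range(v, best):.1f} m")
```

Note what the sweep actually proves: in this idealized model the answer is always 45 degrees. Sling release timing, drag, and arm dynamics move that number on a real machine, and the loop is silent on all of them.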
However, this reliance on software assumes that the output of the model is grounded in physical reality. Generative AI models, including GPT-4o and Claude 3.5 Sonnet, are probabilistic engines trained on vast datasets of text, not physics engines. They can hallucinate plausible-sounding but physically impossible solutions, such as suggesting a counterweight ratio that defies the conservation of energy. When a student relies on a 1 million token context window to ingest a textbook and regurgitate an answer, they are not performing engineering; they are acting as a human interface for a stochastic parrot. The “efficiency” gained is often an illusion, masking the fact that the student has no intuition for why the design works or fails. The cost of inference, while dropping, remains non-trivial; running high-fidelity simulations or querying premium APIs creates a financial barrier that public schools cannot sustain, unlike the venture-backed startups burning cash to subsidize these APIs.
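That sanity check is not exotic; it is a few lines any student who grasps the underlying physics can write, and a student who cannot will accept the hallucination. A minimal sketch, with illustrative numbers rather than real model output:

```python
G = 9.81  # m/s^2

def violates_conservation(m_cw, drop_height, m_proj, claimed_speed):
    """True if the claimed launch speed requires more kinetic energy
    than the counterweight's potential energy can possibly supply.

    Even a lossless machine cannot beat m_cw * g * drop_height;
    real trebuchets deliver far less.
    """
    available = m_cw * G * drop_height          # joules in
    required = 0.5 * m_proj * claimed_speed**2  # joules out
    return required > available

# Hypothetical AI suggestion: a 50 kg counterweight dropping 1 m,
# allegedly launching a 2 kg projectile at 40 m/s.
print(violates_conservation(50, 1.0, 2.0, 40.0))  # True -> impossible
```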
The Case Against: The Crutch of Computation
The integration of AI into STEM projects acts as a cognitive crutch that actively prevents the acquisition of critical problem-solving skills. Hamsa Bastani, a professor at Wharton, highlights the “paradox of generative AI,” warning that if students use the technology lazily and trust the model completely, they fail to develop the skills the model is meant to augment. When a high schooler uses a 70 billion parameter model to derive the kinematics of a trebuchet, they bypass the cognitive struggle necessary to build intuition. The result is a fragile knowledge base that collapses the moment the API goes down or the context window is exceeded. We are seeing a surge in the use of AI detection tools, with 86% of teachers employing them, yet these tools are notoriously inaccurate against modern models, creating an environment of suspicion rather than learning.
The Pedagogical Trebuchet case study emphasizes that the educational value lies in the physical construction, the historical context, and the manual adjustment of the machine, not just the theoretical optimization of the trajectory. If the AI solves the differential equation for the projectile motion, the student has not engineered anything; they have merely executed a prompt. This process strips away the “desirable difficulty” required for learning, turning education into a consumption of pre-digested answers. The over-reliance on these tools is particularly damaging in physics, where understanding the relationship between torque, angular velocity, and arm length is essential. A student who prompts an AI for the optimal dimensions of a trebuchet arm without understanding the material stress limits is likely to build a dangerous structure.
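The relationship at stake is compact enough to state outright. Below is a deliberately simplified sketch that treats the arm as a massless see-saw beam pivoting between counterweight and projectile; the real Pedagogical Trebuchet exercise goes far beyond this idealization, but even this much is what prompting skips.

```python
G = 9.81  # m/s^2

def initial_angular_acceleration(m_cw, L_cw, m_proj, L_proj):
    """Angular acceleration of a horizontal see-saw arm at release.

    Net torque:        tau = g * (m_cw * L_cw - m_proj * L_proj)
    Moment of inertia:   I = m_cw * L_cw**2 + m_proj * L_proj**2
    (massless beam)  alpha = tau / I
    """
    tau = G * (m_cw * L_cw - m_proj * L_proj)
    inertia = m_cw * L_cw**2 + m_proj * L_proj**2
    return tau / inertia

# Lengthening the throwing arm raises tip speed (v = omega * L_proj)
# but also raises inertia, so alpha falls: the trade-off students
# must feel for themselves, not just prompt for.
for L in (2.0, 3.0, 4.0):
    alpha = initial_angular_acceleration(m_cw=100, L_cw=1.0, m_proj=2.5, L_proj=L)
    print(f"L_proj = {L:.0f} m -> alpha = {alpha:.2f} rad/s^2")
```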
Furthermore, the obsession with AI literacy distracts from the fact that only 34% of high school teachers believe AI tools deliver an equal mix of benefit and harm. The classroom is becoming a battleground where teachers fight against the tide of AI-generated essays and code, wasting time on detection rather than instruction. The unit economics of this approach are disastrous; schools are expected to pay for subscriptions to AI grading tools and detection software while their budgets for actual lab equipment are slashed. The “thin content” produced by AI—generic, soulless, and often factually incorrect—is degrading the quality of student work. We are trading the deep, messy, and rewarding process of learning for the shallow, instant gratification of a generated answer. This is a scam sold to school districts by ed-tech companies promising a revolution that delivers nothing but a dependency on their proprietary APIs.
The Uncomfortable Truth: Infrastructure vs. Intuition
The fundamental issue is not the availability of AI tools, but the catastrophic lack of preparation in the student body. Only 31% of high school graduates complete a basic college preparatory curriculum, meaning the vast majority lack the algebra and calculus prerequisites to even understand the output of an AI assistant. We are trying to run Llama-3 405B-level inference on minds that haven’t even booted the operating system. The disparity between the compute power available in the cloud and the cognitive power in the classroom is widening. While NVIDIA’s new B200 GPUs promise to quadruple inference performance for trillion-parameter models, the average student’s ability to interpret the results remains stagnant. This creates a dangerous asymmetry where students wield tools they cannot control, leading to a generation of engineers who can operate software but cannot diagnose why it failed.
The economics are equally damning; the recurring cost of high-fidelity simulation and premium API access is money that school districts simply do not have. While venture capital pours billions into generative AI startups, the unit economics for public education remain unsustainable. The obsession with “AI literacy” is a bubble that ignores the reality that 35% of teachers believe these tools cause more harm than good. We are risking safety, too; without understanding the torque loads on an axle, a student relying on an AI-generated blueprint might build a machine capable of causing serious injury. Medieval trebuchets utilized beams of up to 10 meters and could throw 100 kg stones, but they relied on craftsmen who understood the material limits of wood and iron. Modern students, relying on AI, might design a similar machine without understanding that the 40:1 weight ratio required for maximum range generates immense stress on the pivot point.
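A worked number shows why. The sketch below estimates only the static load on the axle at the article’s 40:1 ratio with a 100 kg stone; the beam mass is an assumption, and dynamic loads mid-swing run several multiples higher, so treat this as a floor, not a design figure.

```python
G = 9.81  # m/s^2

def static_axle_load(m_proj, mass_ratio, m_beam=0.0):
    """Minimum force (newtons) the pivot carries before release.

    The axle supports the full weight of counterweight, projectile,
    and beam; centripetal forces mid-swing add several multiples more.
    """
    m_cw = mass_ratio * m_proj
    return (m_cw + m_proj + m_beam) * G

# 100 kg stone at a 40:1 ratio; the 500 kg beam mass is an assumption.
load = static_axle_load(m_proj=100, mass_ratio=40, m_beam=500)
print(f"static axle load ~{load/1000:.0f} kN (~{load/G/1000:.1f} tonnes)")
```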
Privacy and sovereignty are also major concerns in this rush to digitize education. Who owns the student data? Is it the school district, or is it the AI provider training their next model on student errors? The distinction between “Open Source” and “Open Weights” is lost on administrators who sign contracts with cloud providers, effectively locking their curriculum into a proprietary ecosystem. The data sovereignty angle is critical; if a school relies on a closed API like GPT-4o, it is at the mercy of OpenAI’s pricing changes and rate limits. This creates a single point of failure for the curriculum, unlike traditional textbooks, which do not require an internet connection to function. The “thin content” of AI-generated lesson plans cannot replace the “thick” understanding gained from hands-on failure. We are optimizing for the wrong metric, focusing on the speed of output rather than the depth of understanding.
The Benchmark Trap
The metrics used to justify AI in education are often as hollow as the promises of “superhuman intelligence.” We cite LMSYS Chatbot Arena scores or MMLU benchmarks to prove that models are smart, but these tests measure pattern matching, not reasoning. A model that scores highly on the MMLU (Massive Multitask Language Understanding) benchmark can still fail to solve a novel physics problem because it is overfitted to the training data. Similarly, students trained to pass standardized tests using AI tutors may score higher on the SAT but fail catastrophically in a lab setting when faced with a messy, unstructured problem. The “benchmark trap” convinces administrators that the technology is working because the numbers go up, while the actual competence of the students declines. This is the same overfitting problem that plagues machine learning in general; the model learns the test, not the concept.
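The failure mode has a textbook demonstration: give a model enough parameters to memorize its training set and it will ace the “benchmark” while collapsing on a novel input. The sketch below is generic curve-fitting, not a claim about how MMLU itself is constructed.

```python
import numpy as np

rng = np.random.default_rng(0)

# "Benchmark" data: a simple underlying law (y = 2x) plus noise.
x_train = np.linspace(0, 1, 8)
y_train = 2 * x_train + rng.normal(0, 0.1, 8)

# Eight coefficients, eight points: the memorizer can fit them exactly.
memorizer = np.polynomial.Polynomial.fit(x_train, y_train, deg=7)
honest = np.polynomial.Polynomial.fit(x_train, y_train, deg=1)

# Near-zero training error looks like mastery...
print("max train error:", np.abs(memorizer(x_train) - y_train).max())

# ...until a novel input just outside the training range arrives.
x_new = 1.2
print("memorizer:", memorizer(x_new), " honest:", honest(x_new),
      " truth ~", 2 * x_new)
```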
The competitive landscape of AI coding assistants, with tools like Cursor and Roo Code vying for dominance, further complicates the classroom. These tools are designed to maximize developer productivity, not to maximize student learning. When a student uses Cursor to generate the Python code for a trebuchet simulation, they bypass the struggle of learning syntax and logic. The “unpopular opinion” that actual machine learning work is not fun—consisting largely of data curation and parameter tuning—is lost on students who see only the polished final product. They are sold the myth of the “10x engineer” who types a prompt and deploys an app, ignoring the reality that engineering is 90% debugging and 10% coding. By hiding this reality, AI tools set students up for failure in the workforce, where they will be expected to debug the code the AI wrote.
The failure of trebuchet competitions to engage students is a symptom of this broader malaise. The Bettendorf High School Trebuchet Egg Throw Competition is a microcosm of the problem; students are asked to build complex machines without the foundational knowledge to make them work. The competition involves range and accuracy tests, but if the students cannot calculate the projectile motion themselves, they are just guessing. The risk assessments for these projects often overlook the most dangerous risk of all: ignorance. A student who does not understand the potential energy stored in a 500 lb counterweight is a safety hazard, regardless of how many safety waivers they sign. The “Engineering Beauty of the Trebuchet” is lost on students who view it as a hurdle to be cleared with the least amount of effort possible.
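The arithmetic of that hazard takes three lines. A back-of-the-envelope sketch, assuming a hypothetical 2-meter drop since the competition’s geometry is not specified in the source:

```python
G = 9.81           # m/s^2
LB_TO_KG = 0.4536  # pounds to kilograms

mass_kg = 500 * LB_TO_KG   # ~227 kg
drop_m = 2.0               # assumed drop height; not given in the source

energy_j = mass_kg * G * drop_m
print(f"stored potential energy ~{energy_j/1000:.1f} kJ")
# ~4.4 kJ: on the order of a high-powered rifle round's muzzle energy,
# waiting in a machine built by teenagers.
```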
Until we fix the foundational math deficit, handing students AI tools for engineering projects is like giving a Ferrari to someone who doesn’t know how to drive.