Curated Research on
AI-Era Education

Top-tier insights from Stanford, MIT, Harvard, and leading research institutions on preparing K-12 students for an AI-driven world.

News

Latest AI and education developments from top universities

Berkeley BAIR

Gradient-based Planning for World Models at Longer Horizons


Georgia Tech

AI is Reengineering Drug Discovery by Speeding Up Testing and Scanning Petabytes of Data for Connections Between Diseases

Posted Fri, 04/17/2026. In December, The Conversation hosted a webinar on AI’s revolutionary role in drug discovery and development. Science and technology editor Eric Smalley interviewed Jeffrey Skolnick, eminent scholar in computational systems biology at Georgia Institute of Technology, and Benjamin P. Brown, assistant professor of pharmacology at Vanderbilt University.

Berkeley News

UC Berkeley and UCSF researchers are using AI to revolutionize medical imaging

Amid a growing shortage of radiologists, a startup named Voio strives to make medical imaging more efficient — and more effective.

Georgia Tech

New Study Could Show How TikTok’s Algorithm Affects Youth Mental Health

Posted Mon, 04/13/2026. Meta CEO Mark Zuckerberg took the witness stand last week in Los Angeles County Superior Court to defend his company from accusations that social media harms children. A lawsuit filed by a 20-year-old plaintiff alleges Instagram and other social media apps are designed to make young users addicted to their platforms. Meanwhile, social media experts believe the algorithms that drive content

Berkeley BAIR

Identifying Interactions at Scale for LLMs

Understanding the behavior of complex machine learning systems, particularly Large Language Models (LLMs), is a critical challenge in modern artificial intelligence. Interpretability research aims to make the decision-making process more transparent to model builders and impacted humans, a step toward safer and more trustworthy AI. To gain a comprehensive understanding, we can analyze these systems through different lenses: feature attribution, which isolates the specific input features dri

MIT News

Can AI help predict which heart-failure patients will worsen within a year?

Researchers at MIT, Mass General Brigham, and Harvard Medical School developed a deep-learning model to forecast a patient’s heart failure prognosis up to a year in advance.

Research

Peer-reviewed papers from leading academic journals

Computers & Education: Artificial Intelligence

A framework for evaluation of large language models in essay assessment: Reliability, alignment, and causal reasoning

["Tongxi Liu","Luyao Ye","Wei Yan"]

Library & Information Science Research

An experimental study on the impact of Generative AI on university students' emotions and performance in creative problem-solving tasks

["Yu Bai"]

Computers & Education

Grounded Knowledge Graph Extraction via LLMs: An Anchor-Constrained Framework with Provenance Tracking

Knowledge graphs represent real-world facts as structured triplets and underpin a wide range of applications, including question answering, recommendation, and retrieval-augmented generation. Automatically extracting such triplets from unstructured text is essential for scalable knowledge base construction. Traditional extraction methods require task-specific training data and struggle to generalize across domains. Large language models (LLMs) offer an alternative through in-context learning, enabling flexible extraction without fine-tuning. However, LLMs frequently hallucinate—generating plausible triplets unsupported by the source text. The root cause is the lack of provenance: existing methods produce triplets without explicit links to their textual origins, making faithfulness unverifiable. This paper presents Anchor-Extraction-Verification-Supplement (AEVS), a framework that grounds every triplet element to the source text. AEVS operates in three stages: (1) anchor discovery identifies entities, relation phrases, and attribute values with precise positions, forming a constrained extraction vocabulary; (2) grounded extraction generates triplets linked to discovered anchors; and (3) restoration-based verification validates triplets through hierarchical matching, with a coverage-aware supplement ensuring comprehensive extraction. Experiments on WebNLG, REBEL, and Wiki-NRE demonstrate consistent improvements over both trained models and LLM-based baselines. Ablation studies confirm that anchor-based constraints are the primary mechanism for hallucination reduction. Dedicated analyses of anchor discovery quality, computational cost (2.83–4.28 LLM calls per sample), and hallucination rates (0.23–20.23% across model–dataset configurations) provide insights into the framework’s practical applicability and limitations.

["Yuzhao Yang","Genlang Chen","Binhua He","Yan Zhao"]

Computers & Education

ExamQ-Gen: Instructor-in-the-Loop Generation of Self-Contained Exam Questions from Course Materials and Decision-Support Grading

Reliable evaluation of large language models (LLMs) for educational use requires benchmarks that reflect exam constraints, instructor grading practices, and the operational consequences of thresholded decisions. This paper introduces ExamQ-Gen, an instructor-in-the-loop benchmark that couples two tasks: (i) an LLM answering university-style exam questions and (ii) decision-support grading aligned with an instructor reference. Automatic grading is used for triage and feedback; in practice, ExamQ-Gen supports instructor-led exam authoring and provides grading recommendations, while the instructor issues the final grade and pass/fail decision. ExamQ-Gen is constructed from the course content by using an LLM to generate exam-style questions directly from the lecture materials, producing a course-derived question set suitable for controlled experimentation. The benchmark then instantiates contrasting exam conditions, including instructor-authored (HUMAN) versus pipeline-generated (PIPELINE) artifacts, to evaluate robustness under distribution shifts that can occur when exam questions and answers are produced through different generation workflows. Using two LLM “students” (Llama3-8B-Instruct and Mistral-7B-Instruct) and an LLM-based grader, we compare automatic grading against an instructor reference on a 1–10 score scale and at the decision level induced by the operational pass policy (pass if score ≥ 9). Accordingly, our conclusions are conditioned on the two evaluated student models. Score-level agreement is strong under HUMAN conditions but degrades substantially under PIPELINE conditions, indicating condition-dependent stability. At the pass threshold, decision errors are highly asymmetric, with false fails dominating false passes, meaning that conservative grading may appear safe while producing credit denial. 
A severity-focused analysis isolates a high-stakes failure mode—denial of instructor-perfect answers—and shows that, in the most affected PIPELINE condition, the perfect-pass miss rate reaches 0.926 (50/54), consistent with systematic conservatism rather than borderline noise. Overall, the results highlight that aggregate score agreement and accuracy are insufficient for instructor-controlled exam deployment and motivate reporting practices that combine disaggregated score agreement, threshold-based error asymmetry with uncertainty, and severity-aware diagnostics under exam-relevant condition shifts.

["Catalin Anghel","Emilia Pecheanu","A. Anghel","M. Craciun","A. Cocu"]

Computers & Education

Adoption of AI in Higher Education: Engineering Faculty Perceptions of Preparation for Industry 4.0

Artificial intelligence (AI) has established itself as a key technology in the context of Industry 4.0, with direct implications for university education, especially in engineering degrees. This study analyses the degree of adoption and the main educational uses of AI-based tools in higher education, as well as teachers’ perceptions of their contribution to preparing students for the professional challenges associated with Industry 4.0. A qualitative descriptive-interpretative design was used, involving semi-structured interviews with 32 engineering teachers at the University of Seville. The results show an incipient and uneven adoption, focused mainly on instrumental uses to support planning and material development, with still limited integration in assessment and learning personalisation. Despite this, teachers perceive AI as a resource with the potential to promote the development of digital skills and improve employability, although they emphasise the need for specific teacher training and institutional support for deeper and more coherent pedagogical integration.

["José Fernández Cerero","José María Fernández Batanero","Daniel Fernández Cerero","Marta Montenegro Rueda"]

Computers & Education

Intrusion Detection in Fog Computing: A Systematic Review of Security Advances and Challenges

Fog computing extends cloud services to the network edge to support low-latency IoT applications. However, since fog environments are distributed and resource-constrained, intrusion detection systems must be adapted to defend against cyberattacks while keeping computation and communication overhead minimal. This systematic review presents research on intrusion detection systems (IDSs) for fog computing and synthesizes advances and research gaps. The study was guided by the Preferred Reporting Items for Systematic Reviews and Meta-Analyses (PRISMA) framework. Scopus and Web of Science were searched in the title field using TITLE/TI = (“intrusion detection” AND “fog computing”) for 2021–2025. The inclusion criteria were (i) 2021–2025 publications, (ii) journal or conference papers, (iii) English language, and (iv) open access availability; duplicates were removed programmatically using a DOI-first key, falling back to title, year, and author. The search identified 8560 records, of which 4905 were unique and included for qualitative grouping and bibliometric synthesis. Metadata (year, venue, authors, affiliations, keywords, and citations) were extracted and analyzed in Python to compute trends and collaboration. Intrusion detection systems in fog networks were categorized into traditional/signature-based, machine learning, deep learning, and hybrid/ensemble. Hybrid and DL approaches reported accuracy ranging from 95 to 99% on benchmark datasets (such as NSL-KDD, UNSW-NB15, CIC-IDS2017, KDD99, BoT-IoT). Notable bottlenecks included computational load relative to real-time latency on resource-constrained nodes, elevated false-positive rates for anomaly detection under concept drift, limited generalization to unseen attacks, privacy risks from centralizing data, and limited real-world validation.
Bibliometric analyses highlighted the field’s concentration in fast-turnaround, open-access journals such as IEEE Access and Sensors, as well as a small number of highly collaborative author clusters, alongside dominant terms such as “learning,” “federated,” “ensemble,” “lightweight,” and “explainability.” Emerging directions include federated and distributed training to preserve privacy, as well as online/continual learning adaptation. Future work should consist of real-world evaluation of fog networks, ultra-lightweight yet adaptive hybrid IDS, self-learning, and secure cooperative frameworks. These insights help researchers select appropriate IDS models for fog networks.

["Nyashadzashe Tamuka","T. Mathonsi","T. Olwal","Solly Maswikaneng","Tonderai Muchenje","T. Tshilongamulenzhe"]