Large Language Models (LLMs) excel at general tasks but underperform in specialized domains like economics and psychology, which require deep, principled
understanding. To address this, we introduce ACER (Automated Curriculum-Enhanced Regimen), which transforms generalist models into domain experts without sacrificing their broad capabilities. ACER first synthesizes a comprehensive,
textbook-style curriculum by generating a table of contents for a subject and then
creating question-answer (QA) pairs guided by Bloom’s taxonomy. This ensures
systematic topic coverage and progressively increasing difficulty. The resulting
synthetic corpus is used for continual pretraining with an interleaved curriculum
schedule, aligning learning across both content and cognitive dimensions.
Experiments with Llama 3.2 (1B and 3B) show significant gains on specialized
MMLU subsets. In challenging domains like microeconomics, where baselines
struggle, ACER boosts accuracy by 5 percentage points. Across all target domains,
we observe a consistent macro-average improvement of 3 percentage points. Notably, ACER not only prevents catastrophic forgetting but also facilitates positive
cross-domain knowledge transfer, improving performance on non-target domains
by 0.7 points. Beyond MMLU, ACER enhances performance on knowledge-intensive benchmarks like ARC and GPQA by over 2 absolute points, while maintaining stable performance on general reasoning tasks. Our results demonstrate
that ACER offers a scalable and effective recipe for closing critical domain gaps
in LLMs.
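
As a rough illustration of what an interleaved curriculum schedule over content and cognitive dimensions might look like, the sketch below orders synthetic QA items by ascending Bloom level and, within each level, rotates round-robin across chapters. This is only one plausible reading of the schedule described above, not the procedure used in ACER. The function name `interleaved_schedule`, the item fields (`chapter`, `bloom`, `text`), and the choice of six Bloom level labels are all assumptions made for the example.

```python
from collections import defaultdict
from itertools import zip_longest

# Assumed ordering of cognitive levels, from easiest to hardest.
BLOOM_LEVELS = ["remember", "understand", "apply", "analyze", "evaluate", "create"]


def interleaved_schedule(items):
    """Order items by Bloom level, interleaving chapters within each level.

    `items` is a list of dicts with hypothetical keys:
    'chapter' (int), 'bloom' (one of BLOOM_LEVELS), and 'text' (str).
    """
    # Group items as level -> chapter -> list of items (input order preserved).
    by_level = defaultdict(lambda: defaultdict(list))
    for item in items:
        by_level[item["bloom"]][item["chapter"]].append(item)

    ordered = []
    for level in BLOOM_LEVELS:
        chapters = by_level.get(level, {})
        queues = [chapters[c] for c in sorted(chapters)]
        # Round-robin: take one item from each chapter in turn.
        for round_items in zip_longest(*queues):
            ordered.extend(x for x in round_items if x is not None)
    return ordered


if __name__ == "__main__":
    demo = [
        {"chapter": 1, "bloom": "remember", "text": "a"},
        {"chapter": 1, "bloom": "remember", "text": "b"},
        {"chapter": 2, "bloom": "remember", "text": "c"},
        {"chapter": 1, "bloom": "apply", "text": "d"},
        {"chapter": 2, "bloom": "understand", "text": "e"},
    ]
    print([item["text"] for item in interleaved_schedule(demo)])
```

Under this assumed ordering, the model sees every chapter at a given cognitive level before any chapter advances to the next level, which is one way to keep content coverage and difficulty progression aligned.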