Chain of Thought vs Direct Answers: Cost, Latency, and Quality at Scale

Test when hidden or minimal reasoning beats verbose chains.

Large language models can be prompted to produce long chains of reasoning (chain of thought, CoT), and many academic benchmarks suggest that this improves accuracy. In practice, however, CoT increases latency and token usage, which raises costs and may not always yield better results. Developers debate whether CoT is actually useful in production, where response time and efficiency are critical.

This project provides a systematic investigation into when chain-of-thought prompting is beneficial and when it should be avoided. You will build a benchmark suite of reasoning tasks tailored to a chosen domain (e.g., medical reasoning) and test different prompting strategies: direct answers without explanation, CoT with a single reasoning chain, self-consistency with multiple reasoning paths, and minimal-rationale approaches.

Metrics will include accuracy, token usage, inference time, reproducibility, and user preference (where applicable). Experiments will be run across multiple model sizes and inference stacks to capture general trends. The study will produce practical recommendations for developers deploying LLMs in time- or cost-sensitive environments, and will also contribute to open benchmarks and reproducible protocols for reasoning evaluation.
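The prompting strategies above can be compared with a small harness that varies the prompt template, records wall-clock latency, and takes a majority vote over multiple samples for self-consistency. The sketch below is illustrative only: `call_model` is a hypothetical stand-in for a real LLM API client, and the prompt templates are assumptions, not part of the project specification.

```python
import time
from collections import Counter

# Hypothetical stub standing in for a real LLM call; replace with your
# actual API client. Deterministic here so the sketch runs offline.
def call_model(prompt: str) -> str:
    return "42"

# Illustrative prompt templates for three of the strategies.
PROMPTS = {
    "direct": "Answer with only the final result.\nQ: {q}\nA:",
    "cot": "Think step by step, then state the final result.\nQ: {q}\nA:",
    "minimal": "Give a one-sentence rationale, then the result.\nQ: {q}\nA:",
}

def run_strategy(question: str, strategy: str, n_samples: int = 1) -> dict:
    """Run one prompting strategy, recording latency and a majority-vote answer.

    With n_samples > 1 this approximates self-consistency: sample several
    reasoning paths and keep the most frequent final answer.
    """
    start = time.perf_counter()
    answers = [call_model(PROMPTS[strategy].format(q=question))
               for _ in range(n_samples)]
    latency = time.perf_counter() - start
    answer, _votes = Counter(answers).most_common(1)[0]
    return {"answer": answer, "latency_s": latency, "samples": n_samples}

# Single CoT chain vs. five-sample self-consistency over direct answers.
result = run_strategy("What is 6 * 7?", "cot")
sc = run_strategy("What is 6 * 7?", "direct", n_samples=5)
```

In a real run you would also log token counts from the API response to estimate cost per strategy.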

Goal

Provide clear guidance on when to use chain-of-thought prompting, balancing accuracy gains against cost and latency.

Learning outcome

  • Understanding of LLM reasoning methods (CoT, self-consistency, rationale pruning)
  • Skills in designing fair benchmarking protocols
  • Experience measuring cost and latency trade-offs in production setups
  • Ability to publish reproducible evaluation results

Qualifications

  • Programming in Python, basic knowledge of LLM prompting, and interest in evaluation methodology. Strong motivation :)

Supervisors

  • Steven Hicks

Associated contact