LLM base models

| Model name | Context window (tokens) | MMLU (reasoning) | HumanEval (coding) | GSM8K (arithmetic) | Spider 1.0 (SQL) |
|---|---|---|---|---|---|
| llama3.1-405b | 128,000 | 88.6 | 89 | 96.8 | - |
| reka-core | 32,000 | 83.2 | 76.8 | 92.2 | - |
| llama3.1-70b | 128,000 | 86 | 80.5 | 95.1 | - |
| mistral-large2 | 128,000 | 84 | 92 | 93 | - |
| mistral-large | 32,000 | 81.2 | 45.1 | 81 | 81 |
| reka-flash | 100,000 | 75.9 | 72 | 81 | - |
| llama3.1-8b | 128,000 | 73 | 72.6 | 84.9 | - |
| mixtral-8x7b | 32,000 | 70.6 | 40.2 | 60.4 | - |
| llama-2-70b-chat | 4,096 | 68.9 | 30.5 | 57.5 | - |
| jamba-instruct | 256,000 | 68.2 | 40 | 59.9 | - |
| jamba-1.5-mini | 256,000 | 69.7 | - | 75.8 | - |
| jamba-1.5-large | 256,000 | 81.2 | - | 87 | - |
| Snowflake Arctic | 4,096 | 67.3 | 64.3 | 69.7 | 79 |
| llama3.2-1b | 128,000 | 49.3 | - | 44.4 | - |
| llama3.2-3b | 128,000 | 69.4 | - | 77.7 | - |
| gemma-7b | 8,000 | 64.3 | 32.3 | 46.4 | - |
| mistral-7b | 32,000 | 62.5 | 26.2 | 52.1 | - |

As of 2024/12/16.
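The names in the first column are the identifiers a client would pass to the serving API. Assuming these are the base models exposed through Snowflake Cortex (only Snowflake Arctic is named explicitly above, so this is an assumption), a minimal Python sketch of calling one of them with the `snowflake.cortex` `Complete` helper might look like the following; the connection parameters are placeholders, not values from this article.

```python
from snowflake.snowpark import Session
from snowflake.cortex import Complete

# Placeholder connection parameters; replace with real account credentials.
session = Session.builder.configs({
    "account": "<account_identifier>",
    "user": "<user>",
    "password": "<password>",
}).create()

# The first argument is one of the model names from the table above.
answer = Complete(
    "mistral-large2",
    "Summarize the difference between MMLU and GSM8K in one sentence.",
    session=session,
)
print(answer)
```

Swapping the model name for another row of the table (for example `llama3.1-8b` for a smaller context-window budget) is the only change needed to compare outputs across models.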