Success rate by model

Percentage of correct answers per model

openai/gpt-5

Total: 53
Correct: 32
Accuracy: 60.38%

Id                Ok  Out
subj_present_001      cregui
subj_present_002      sapigueu
subj_present_003      vinguin

anthropic/claude-4.5-sonnet

Total: 53
Correct: 40
Accuracy: 75.47%

Id                Ok  Out
subj_present_001      cregui
subj_present_002      sapigueu
subj_present_003      vinguin

amazon/nova-premier-v1

Total: 53
Correct: 37
Accuracy: 69.81%

Id                Ok  Out
subj_present_001      cregui
subj_present_002      sàpigueu
subj_present_003      vinguin

google/gemini-2.5-pro

Total: 53
Correct: 35
Accuracy: 66.04%

Id                Ok  Out
subj_present_001      cregui
subj_present_002      sapigueu
subj_present_003      vinguin

x-ai/grok-4

Total: 53
Correct: 14
Accuracy: 26.42%

Id                Ok  Out
subj_present_001      cregui
subj_present_002      sàpigueu
subj_present_003      vinguin
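
The accuracy figures reported for each model match correct / total expressed as a percentage and rounded to two decimals. The following is a minimal sketch of that arithmetic, not the dashboard's actual code: the counts are copied from the summaries in this section, while the names `results` and `accuracy_pct` are hypothetical and introduced only for illustration.

```python
# Counts copied from the per-model summaries in this section: (correct, total).
# The dictionary name and the helper below are illustrative assumptions.
results = {
    "openai/gpt-5": (32, 53),
    "anthropic/claude-4.5-sonnet": (40, 53),
    "amazon/nova-premier-v1": (37, 53),
    "google/gemini-2.5-pro": (35, 53),
    "x-ai/grok-4": (14, 53),
}


def accuracy_pct(correct: int, total: int) -> float:
    """Return the percentage of correct answers, rounded to two decimals."""
    if total <= 0:
        raise ValueError("total must be positive")
    return round(100.0 * correct / total, 2)


# Sort models from highest to lowest accuracy.
ranking = sorted(
    results.items(),
    key=lambda item: accuracy_pct(*item[1]),
    reverse=True,
)

for model, (correct, total) in ranking:
    print(f"{model}: {accuracy_pct(correct, total):.2f}% ({correct}/{total})")
```

Sorting by the computed percentage makes the ordering explicit, since the summaries in this section are not listed by rank.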