Intelligence Test Scores for ChatBots
By Julius.Lin@AITek.com | 9/28/2024
Test Case# | Meta.AI | ChatGPT | Gemini | Test Case Name | Answer | Perplexity.AI | Liner | Claude 3.5
1 | 0.2 | 0.4 (2nd time right) | 0 | Forgotten-worry city | consistent | | |
2 | 1 | 1 | 1 | Some are happy | inconsistent | | |
3 | 0.25 (2nd time right) | 0.25 | 0.25 | God | inconsistent argument | | |
4 | 1 | 1 | 1 | fishes in water (easiest) | invalid argument | | |
5 | 1 | 0 | 0 | wealthy thief | invalid argument | | |
6A | 0.9 | 0.9 | 0.9 | sociologist's observation (3 sets) | partially valid | | |
6B | 1 | 1 | 1 | | absolutely invalid | | |
7A | 1 | 1 | 0 | Buddha (4 sets) | invalid | | |
7B | 0.9 | 0.9 | 0.9 | | sound, valid | | |
8 | 0 | 0 | 0 | ecologist (5 sets) | valid argument | | |
Average | 72.50% | 64.50% | 50.50%
None of the LLMs could solve the 5-set problem correctly, so testing stopped here. See the 'Answers Collation' sheet for details.
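The three percentages reported under the table are consistent with a plain average over the ten scored items (cases 1 to 5, 6A, 6B, 7A, 7B, and 8) for Meta.AI, ChatGPT, and Gemini. The following Python sketch reproduces them. It assumes that each "2nd time right" note is an annotation on a score rather than a score of its own; the names `scores` and `average_percent` are illustrative only.

```python
# Per-case scores transcribed from the table, in the order
# 1, 2, 3, 4, 5, 6A, 6B, 7A, 7B, 8.
# Assumption: "2nd time right" is a note, not a separate score.
scores = {
    "Meta.AI": [0.2, 1, 0.25, 1, 1, 0.9, 1, 1, 0.9, 0],
    "ChatGPT": [0.4, 1, 0.25, 1, 0, 0.9, 1, 1, 0.9, 0],
    "Gemini":  [0.0, 1, 0.25, 1, 0, 0.9, 1, 0, 0.9, 0],
}


def average_percent(values):
    """Return the mean of the scores as a percentage, rounded to 2 decimals."""
    return round(100 * sum(values) / len(values), 2)


averages = {name: average_percent(vals) for name, vals in scores.items()}

for name, pct in averages.items():
    print(f"{name}: {pct:.2f}%")
```

Every item carries equal weight in this sketch, so the two sub-cases of tests 6 and 7 each count as a full item.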
Still to be tested: DeepMind's Differentiable Reasoning system.
The author was the first to break the limitation of visual set computation by extending it to hyper-set computation without mathematical limitation, in work presented in 1984 and published in 1987. The goal was to process very complicated real-world problems beyond human comprehension, such as:
ex1. MedTech, where many symptoms have intertwined relationships with diseases, and we must reason by deduction, induction, and abduction to find root causes and cures (see sheet 'body health'); or
ex2. Settings where tens, hundreds, or even thousands of parameter states arrive and must be processed to reach goals that are either predefined or self-optimized, e.g. in A) the economy (see sheet 'stock market'), B) battlefields, or C) many other disciplines.
Some problems resemble the exercises given here, with binary values (0: ∅, 1: ∃); some involve degrees of truth, as in fuzzy logic; and some are stochastic, as in Bayesian inference, reflecting the challenges of neuro-symbolic AI.
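To make the binary case concrete, here is a minimal Python sketch of one generic way to check set-based arguments: mark every Venn region as 0 (empty) or 1 (inhabited) and enumerate all markings. This is a plain brute-force illustration, not the Venn-Lin method, and the helper names (`all_are`, `some_are`, `no_are`, `is_valid`, `is_consistent`) and the example syllogisms are assumptions introduced here, not the test cases from the table.

```python
from itertools import product


def regions(n):
    """All 2**n Venn regions; a region is a tuple of set memberships."""
    return list(product([False, True], repeat=n))


def models(n):
    """Yield every marking of the regions with 0 (empty) or 1 (inhabited)."""
    regs = regions(n)
    for bits in product([0, 1], repeat=len(regs)):
        yield dict(zip(regs, bits))


def all_are(a, b):
    # "All A are B": every region inside A but outside B is empty.
    return lambda m: all(v == 0 for r, v in m.items() if r[a] and not r[b])


def some_are(a, b):
    # "Some A are B": at least one region inside both A and B is inhabited.
    return lambda m: any(v == 1 for r, v in m.items() if r[a] and r[b])


def no_are(a, b):
    # "No A are B": every region inside both A and B is empty.
    return lambda m: all(v == 0 for r, v in m.items() if r[a] and r[b])


def is_consistent(n, premises):
    """Premises are consistent if at least one marking satisfies all of them."""
    return any(all(p(m) for p in premises) for m in models(n))


def is_valid(n, premises, conclusion):
    """An argument is valid if the conclusion holds in every marking
    that satisfies all premises."""
    return all(conclusion(m) for m in models(n) if all(p(m) for p in premises))


A, B, C = 0, 1, 2  # indices of three hypothetical sets

# All A are B, all B are C, therefore all A are C.
barbara = is_valid(3, [all_are(A, B), all_are(B, C)], all_are(A, C))
# All A are B, some B are C, therefore some A are C.
undistributed = is_valid(3, [all_are(A, B), some_are(B, C)], some_are(A, C))
# "Some A are B" together with "No A are B".
clash = is_consistent(3, [some_are(A, B), no_are(A, B)])

print(barbara, undistributed, clash)
```

With n sets there are 2^n regions and therefore 2^(2^n) markings (256 for 3 sets, about 4.3 billion for 5 sets), so this exhaustive approach is only practical for small n.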
* Venn-Lin method: | Logical Deduction and Method of Calculus in Venn Diagram | 1987 | archive
* 林智勇 (Lin, J. Y.) | Logical Deduction and Method of Calculus in Venn Diagram (邏輯推理與范恩圖解之計算法) | 1987