Intelligence Test Scores for ChatBots
By Julius.Lin@AITek.com 9/28/2024
Test Case#  Meta.AI                ChatGPT               Gemini  Test Case Name                      Answer                 Perplexity.AI  Liner  Claude 3.5
1           0.2                    0.4 (2nd time right)  0       Forgotten-worry city                consistent
2           1                      1                     1       Some are happy                      inconsistent
3           0.25 (2nd time right)  0.25                  0.25    God                                 inconsistent argument
4           1                      1                     1       fishes in water (easiest)           invalid argument
5           1                      0                     0       wealthy thief                       invalid argument
6A          0.9                    0.9                   0.9     sociologist's observation (3 sets)  Partial Valid
6B          1                      1                     1                                           absolutely Invalid
7A          1                      1                     0       Buddha (4 sets)                     invalid
7B          0.9                    0.9                   0.9                                         SOUND VALID
8           0                      0                     0       ecologist (5 sets)                  valid argument
Total       72.50%                 64.50%                50.50%
None of the LLMs can tackle the 5-set problem correctly, so I stop here. See details in the 'Answers Collation' sheet.
Still waiting to test DeepMind's Differentiable Reasoning system.
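
The percentages above are consistent with a plain average over the ten scored items, counting 6A, 6B, 7A and 7B separately. The sheet does not state its formula, so the equal weighting below is an assumption, and the names `scores` and `percent` are illustrative only. A minimal Python sketch:

```python
# Scores copied from the table; 6A/6B and 7A/7B are treated as separate items.
# Assumption: every item carries equal weight (the sheet's formula is not shown).
scores = {
    "Meta.AI": [0.2, 1, 0.25, 1, 1, 0.9, 1, 1, 0.9, 0],
    "ChatGPT": [0.4, 1, 0.25, 1, 0, 0.9, 1, 1, 0.9, 0],
    "Gemini":  [0, 1, 0.25, 1, 0, 0.9, 1, 0, 0.9, 0],
}

def percent(values):
    """Mean score expressed as a percentage, rounded to two decimals."""
    return round(100 * sum(values) / len(values), 2)

totals = {name: percent(values) for name, values in scores.items()}
print(totals)  # {'Meta.AI': 72.5, 'ChatGPT': 64.5, 'Gemini': 50.5}
```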
                           
The author was the first to break the limitation of visual computation of sets, extending it into hyper-set computation without mathematical limitation, back in
1984 (presented) ~ 1987 (published), with the goal of processing very complicated real-world problems beyond human comprehension, such as:
ex1. medTech areas, where many symptoms have intertwined relationships with certain diseases, and we need deduction, induction and abduction to find
root causes and cures (see the 'body health' sheet); or
ex2. cases where tens, hundreds or even thousands of parameter states come in to be processed in order to reach goals that are predefined or self-optimized,
e.g. in A) the economy (see the 'stock market' sheet)
and B) battlefields
or C) many other disciplines.
Some problems are like the exercises given here, with binary values 0: ∅ or 1: ∃; some involve degrees of truth, as in fuzzy logic;
and some are stochastic, as in Bayesian inference. These are among the challenges for neuro-symbolic AI.
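
The binary case can be illustrated with a small brute-force check in Python. This is not the Venn-Lin method itself, only a sketch under stated assumptions: each of the 2^n Venn regions is marked 0 (empty, ∅) or 1 (inhabited, ∃), "All A are B" carries no existential import, and every function name here is hypothetical.

```python
from itertools import product

def models(n):
    """Yield every 0/1 marking of the 2**n Venn regions of n sets."""
    regions = list(product([False, True], repeat=n))  # membership per set
    for bits in product([0, 1], repeat=len(regions)):
        yield dict(zip(regions, bits))

def all_are(a, b):
    # "All A are B": every region inside A but outside B is empty (0).
    return lambda m: all(bit == 0 for r, bit in m.items() if r[a] and not r[b])

def no_are(a, b):
    # "No A are B": every region inside both A and B is empty (0).
    return lambda m: all(bit == 0 for r, bit in m.items() if r[a] and r[b])

def some_are(a, b):
    # "Some A are B": at least one region inside both A and B is inhabited (1).
    return lambda m: any(bit == 1 for r, bit in m.items() if r[a] and r[b])

def is_consistent(n, statements):
    """Consistent if at least one marking satisfies every statement."""
    return any(all(s(m) for s in statements) for m in models(n))

def is_valid(n, premises, conclusion):
    """Valid if the conclusion holds in every marking that satisfies the premises."""
    return all(conclusion(m) for m in models(n) if all(p(m) for p in premises))

A, B, C = 0, 1, 2
print(is_valid(3, [all_are(A, B), all_are(B, C)], all_are(A, C)))       # True
print(is_valid(3, [all_are(A, B), some_are(B, C)], some_are(A, C)))     # False
print(is_consistent(3, [all_are(A, B), some_are(A, C), no_are(B, C)]))  # False
```

The enumeration grows as 2^(2^n): 256 markings for 3 sets, 65,536 for 4 and about 4.3 billion for 5, so this naive approach is only practical for small n.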
* Venn-Lin method: Logical Deduction and Method of Calculus in Venn Diagram, 1987 (archive)
* 林智勇 (Lin, J. Y.), 邏輯推理與范恩圖解之計算法 [Logical Deduction and Method of Calculus in Venn Diagram], 1987