ChatGPT 4 Research / Menglin Zhao / Jan. 2024 - Apr. 2024 / Guidance: Prof. Matt DiGirolamo
Research on Empathy Ability & Response Naturalness in ChatGPT 4
Agenda
03
Methodology
Why are we doing this?
REASON:
📈
Need for Emotional Support
👎 13%
of the Overall Dissatisfaction Rate
GOAL:
Collect qualitative and quantitative data to evaluate user satisfaction with ChatGPT's conversational responses and to assess its capability to demonstrate empathy.
Introduction
ChatGPT 4 is an AI-based conversational language model developed by OpenAI. It uses its trained model to predict the answers users are most likely seeking and assists with a wide range of issues through natural-language text.
As of September 2023, it has 150 million active users. Users include students, programmers, creators, and more.
Methodology
Competitive Analysis
01
Survey
02
User Interview
03
Empathy
Questionnaires
Five questionnaires to evaluate ChatGPT4's empathy
04
Competitive Analysis
Features | ChatGPT 4 | Gemini | Copilot |
Voice Conversation | ✅ | ❌ | ❌ |
Pin Conversation Boxes | ❌ | ✅ | ❌ |
Links in the Response | ❌ | ✅ | ✅ |
Guesses What You Want to Ask | ❌ | ✅ | ✅ |
Integrated into Programming Environment | ❌ | ❌ | ✅ |
Design Your Own GPT | ✅ | ❌ | ❌ |
Image Creation | ✅ | ❌ | ❌ |
Excel Analysis | ✅ | ❌ | ❌ |
Price | $25 per month | $19.99 per month | $20 per month |
Advantages
Value & Business Goal
Through the Google ecosystem, it reaches Google's vast user base and helps users make better daily decisions.
Integrates with Google's native ecosystem (Google Flights, Google Maps, YouTube, etc.), whose users include almost everyone in the US.
Gemini (Google)
Value & Business Goal
Primarily aimed at developers, helping them improve their work efficiency.
As an IDE plugin, it is directly integrated into the developer's programming environment.
Focus on helping programmers write code.
Insert links to support the answer
Copilot (Microsoft)
Advantages
ChatGPT4 (OpenAI)
Models such as the Transformer interpret users' intentions more accurately and therefore generate contextually appropriate responses.
Plugins, such as a literature reading plugin, cater to personalized needs.
ChatGPT can read, summarize, and analyze uploaded files.
Users can design their GPTs or utilize those designed by others.
Facilitate collaboration.
Streamline interactions, making ChatGPT more “human”.
What should we learn from them?
Value & Business Goal
Progressing towards AGI means that it not only needs to excel in specific domains but also needs to achieve more comprehensive and widespread general intelligence.
Enhance the ability to assist on various platforms and services, whether it's travel planning, learning resources, or business tools.
Provides expert-level features in specific areas, such as programming or data analysis.
Survey
Quantitative data on ChatGPT 4's emotion recognition ability and overall response satisfaction
Introduction
33
I surveyed participants to understand their experiences regarding satisfaction, including the naturalness of responses, the acquisition of new insights, and daily & emotional support performance.
The survey includes two instances of survey logic and one open-ended question.
Participants
What do participants primarily use ChatGPT for?
30.49%
Writing Assistance
30.49%
Study Assistance
28.05%
Information Searching
Survey - Brief Data (arranged from lowest to highest)
6.71 out of 10
Overall Satisfaction Rate
53.13%
Offer new Insights
6.69 out of 10
Meet needs
70.96%
Somewhat satisfied or satisfied with the response
72.88%
Using ChatGPT for 30 minutes or less each day
74.46%
User-friendly
Survey - Pattern and Insights
Naturalness in Different Tasks
There's a large standard deviation in ratings for naturalness.
Naturalness perception may vary with different tasks and needs, requiring further clarification during interviews.
01
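As a minimal sketch of how the spread in naturalness ratings could be checked per task, the snippet below compares the overall standard deviation with the per-task one. The ratings and task names are hypothetical placeholders, not the survey data; a large overall spread combined with small per-task spreads would be consistent with naturalness perception varying by task.

```python
from statistics import mean, stdev

# Hypothetical 1-10 naturalness ratings grouped by task (NOT the survey data).
ratings_by_task = {
    "writing": [8, 7, 9, 8],
    "emotional support": [3, 4, 2, 5],
}

def spread_report(groups):
    """Return {label: (mean, sample standard deviation)} overall and per task."""
    all_ratings = [r for ratings in groups.values() for r in ratings]
    report = {"overall": (mean(all_ratings), stdev(all_ratings))}
    for task, ratings in groups.items():
        report[task] = (mean(ratings), stdev(ratings))
    return report

report = spread_report(ratings_by_task)
for label, (avg, sd) in report.items():
    print(f"{label}: mean={avg:.2f}, sd={sd:.2f}")
```

If the real ratings show the same shape, the interview questions can focus on which tasks drive the low scores.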
Low Naturalness, Lower Satisfaction
Low naturalness is a major reason for low satisfaction.
Provide natural responses, including incorporating colloquial expressions and adjusting sentence structures.
02
Low Empathy
83% of participants who asked about emotional issues reported that ChatGPT showed low empathy.
Provide emotional value by giving users personalized advice and human-like reactions.
03
Survey - Pattern and Insights
Innovative Content
Users who report satisfaction with the new insights provided by ChatGPT are more likely to perceive its responses as accurate.
Ensuring the accuracy of the GPT model in generating content that is both innovative and relevant is crucial.
04
Low Credibility and Input Capacity
Users commonly desire links in replies, and the ability to input more text.
Use links to support responses' credibility and increase the text input capacity to accommodate complex programming needs.
05
Interview
Qualitative data on ChatGPT 4's response naturalness and its emotion recognition performance
Introduction
The objective is to explore the underlying reasons for the perception of low naturalness and the need for empathy.
Each interview lasts 30 minutes and comprises 10 questions
Participants
5
Interview - Findings
“Too robotic!”
“Too much unnecessary info.”
👎 Low Naturalness
All participants use ChatGPT for tasks like email correction, which require responses that are more natural, use relatively casual vocabulary, and are easy to understand.
👎 Not Casual
The level of naturalness needed depends on the use scenario.
Scenarios
Interview - Findings
“ChatGPT can't feel me..”
“Too neutral..”
Among the four participants, three sought advice from ChatGPT on matters of life, personal philosophy, or emotions.
Personal Support
👎 Low Empathy
Their friends’ opinions are often based on personal experiences, and may not necessarily apply to others.
They want nonjudgmental advice.
Why not ask friends?
Interview - Findings
“It’s tiring to type very specific requirements every time.”
👎 Lengthy and Broad
👎 Hard to Review
Half of the users expressed dissatisfaction with ChatGPT's one-question-get-one-answer format.
👎 Single format
Action Insights
Naturalness Adjustment
Add a feature to switch between formal and casual tones as needed.
2. Context Understanding Model
Enhance the context understanding model so that it auto-adjusts output naturalness for various scenarios.
01
Customize Responses
Surveys & Questions
Gain insights into users' life values and preferences.
Enhance emotional support by interacting based on individual characteristics and needs.
02
Precise Answers
When a user's question lacks specificity, the system should ask further questions to narrow down the scope of the inquiry.
2. Preset Prompts
Offer preset prompts for quick access to needed information.
03
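The "ask further questions" idea above can be sketched as a simple gate in front of the answer step. This is only an illustration under stated assumptions: the word-count threshold, the list of vague words, and the function names are all hypothetical, and a production system would more likely let the model itself judge specificity.

```python
# Hypothetical heuristic: thresholds and word list are illustrative assumptions.
MIN_WORDS = 6
VAGUE_WORDS = {"something", "stuff", "things", "anything"}

def needs_clarification(question: str) -> bool:
    """Flag questions that are very short or contain vague placeholder words."""
    words = [w.strip(".,?!").lower() for w in question.split()]
    too_short = len(words) < MIN_WORDS
    has_vague_word = any(w in VAGUE_WORDS for w in words)
    return too_short or has_vague_word

def respond(question: str) -> str:
    """Ask a follow-up question first when the request is not specific enough."""
    if needs_clarification(question):
        return "Could you tell me more about what you need?"
    return "(answer the question directly)"

print(respond("Help me write something"))
print(respond("How do I reverse a linked list in Python?"))
```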
Measuring Empathy
Most users are concerned with ChatGPT's empathic capabilities, so I’ve evaluated its empathy
Five Questionnaires
Interpersonal Reactivity Index | Fantasy: 18; Perspective Taking: 17; Empathic Concern: 11; Personal Distress: 9 |
Toronto Empathy Questionnaire | 18 / 50 |
Empathy Quotient | 37 / 100 |
Autism Spectrum Quotient | 43 / 64 |
Perth Empathy Scale | 39 / 100 |
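To make the scores above easier to read at a glance, the sketch below converts each reported score to a percentage of the maximum shown in the table. The Interpersonal Reactivity Index is left out because the table gives no maximum for its subscales. The questionnaires measure different constructs, and a higher score does not mean the same thing on every scale, so these percentages are not directly comparable with one another.

```python
# Scores and maxima exactly as reported in the table above.
scores = {
    "Toronto Empathy Questionnaire": (18, 50),
    "Empathy Quotient": (37, 100),
    "Autism Spectrum Quotient": (43, 64),
    "Perth Empathy Scale": (39, 100),
}

def as_percent(score: int, maximum: int) -> float:
    """Express a raw score as a percentage of the scale maximum."""
    return round(100 * score / maximum, 1)

percentages = {name: as_percent(s, m) for name, (s, m) in scores.items()}

for name, pct in percentages.items():
    print(f"{name}: {pct}%")
```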
ChatGPT 4's empathy exceeds ChatGPT 3.5's but remains below average human levels
How might we show emotional understanding and caring for others?
Conclusion & Limitation
Conclusion
It is vital for ease of use, boosting work efficiency, and speeding up work completion
2. Enhancing Emotional Empathy
By adjusting emotional expression according to context and user preferences, ChatGPT can better meet diverse user emotional needs, advancing toward a multi-faceted AI
Limitations
ChatGPT has a broad user base, but the sample size of this study is limited, and the interviewees all come from digital media-related industries, which may result in a lack of generalizability
2. Ethical Concerns
When participants encounter emotional or daily life questions during an interview or data collection process, some may feel uncomfortable or guarded
Next Steps in Research
Under the premise of increasing sample size, and based on existing research findings
Thank you for watching.