Sentiment Analysis with LLMs

Sentiment analysis is a technique in natural language processing (NLP) and text analysis that determines the emotional tone or opinion expressed in a piece of text. It identifies whether the sentiment of the text is positive, negative, neutral, or sometimes more nuanced categories (e.g., anger, happiness, sadness).

This process is often applied to unstructured text data sources like social media posts, customer reviews, and news articles.

Let's first take a look at the different types of sentiment analysis tasks, why they are hard to solve, and how LLMs can help.

Simple sentiment analysis

Let's say you want to analyze a product review and classify its sentiment as "positive," "negative," or "neutral." You can use the simple Python code below to do this.

import string

# Preprocessing: lowercase, split on whitespace, and strip punctuation
# so that "amazing," and "terrible." match the lexicon entries
text = "The battery life of this phone is amazing, but the camera quality is terrible."
words = [word.strip(string.punctuation) for word in text.lower().split()]

# Rule-Based Lexicon Approach
positive_words = {"amazing", "great", "excellent", "good", "fantastic"}
negative_words = {"terrible", "bad", "poor", "horrible", "awful"}

# Scoring: count how many words appear in each lexicon
positive_score = sum(1 for word in words if word in positive_words)
negative_score = sum(1 for word in words if word in negative_words)

# Classification: the larger score wins, a tie counts as neutral
if positive_score > negative_score:
    sentiment = "positive"
elif negative_score > positive_score:
    sentiment = "negative"
else:
    sentiment = "neutral"

# Output
print(f"Sentiment: {sentiment}")

This prints Sentiment: neutral for our review, because the positive and negative scores tie. It is fair to say that this approach has no understanding of context (e.g., "amazing battery life" vs. "amazing how bad the battery is"), cannot handle negations ("not amazing"), and lacks granularity: it cannot tell you that the reviewer liked the battery but disliked the camera.
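To make those failure modes concrete, here is a minimal sketch that wraps the same word lists in a function and runs it on two tricky sentences. The function name `lexicon_sentiment` and the punctuation stripping are my own assumptions for illustration, not part of any library.

```python
import string

POSITIVE_WORDS = {"amazing", "great", "excellent", "good", "fantastic"}
NEGATIVE_WORDS = {"terrible", "bad", "poor", "horrible", "awful"}


def lexicon_sentiment(text: str) -> str:
    """Classify text by counting lexicon hits (hypothetical helper)."""
    # Strip punctuation so "amazing." still matches "amazing".
    words = [w.strip(string.punctuation) for w in text.lower().split()]
    positive = sum(w in POSITIVE_WORDS for w in words)
    negative = sum(w in NEGATIVE_WORDS for w in words)
    if positive > negative:
        return "positive"
    if negative > positive:
        return "negative"
    return "neutral"


# Negation: "not amazing" still counts "amazing" as praise.
negated = lexicon_sentiment("The battery life is not amazing.")
# Context: "amazing" intensifies a complaint here, yet the two hits tie.
ironic = lexicon_sentiment("Amazing how bad the battery is.")
print(negated)  # positive
print(ironic)   # neutral
```

Both results are wrong from a human point of view: the first sentence is a complaint and the second is clearly negative.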

With LLMs, you can achieve a more nuanced understanding of the text.

Prompt for simple sentiment analysis

Analyze the sentiment of the following review. Classify it as "positive," "negative," or "neutral," and provide a reason for the classification: Review: "The battery life of this phone is amazing, but the camera quality is terrible." Provide the result in JSON format.

Then you'll get something like:

{
  "sentiment": "positive",
  "reason": "The battery life is amazing, but the camera quality is terrible."
}
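If you call the model from code, you will probably want to check the reply before trusting it. Below is a minimal sketch, assuming the reply arrives as a plain string. The helper names `build_prompt` and `parse_sentiment_response` are hypothetical, and the call to the model itself is left out because it depends on your provider.

```python
import json

ALLOWED_SENTIMENTS = {"positive", "negative", "neutral"}


def build_prompt(review: str) -> str:
    """Fill the review into the prompt shown above (hypothetical helper)."""
    return (
        "Analyze the sentiment of the following review. Classify it as "
        '"positive," "negative," or "neutral," and provide a reason for the '
        f'classification: Review: "{review}" Provide the result in JSON format.'
    )


def parse_sentiment_response(raw: str) -> dict:
    """Extract and validate the JSON object from a raw model reply."""
    # Models sometimes add text around the JSON, so keep only the outermost {...}.
    start, end = raw.find("{"), raw.rfind("}")
    if start == -1 or end < start:
        raise ValueError("no JSON object found in the reply")
    result = json.loads(raw[start:end + 1])
    sentiment = str(result.get("sentiment", "")).strip().lower()
    if sentiment not in ALLOWED_SENTIMENTS:
        raise ValueError(f"unexpected sentiment label: {sentiment!r}")
    return {"sentiment": sentiment, "reason": result.get("reason", "")}


# Hypothetical reply text; in practice this comes from your LLM client.
raw_reply = (
    "Here is the analysis:\n"
    '{"sentiment": "Positive", "reason": "The battery life is amazing, '
    'but the camera quality is terrible."}'
)
parsed = parse_sentiment_response(raw_reply)
print(parsed["sentiment"])  # positive
```

Normalizing the label to lowercase and rejecting anything outside the three allowed values keeps downstream code from silently accepting a label the prompt never asked for.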

Opinion mining, or aspect-based sentiment analysis

Opinion Mining, also known as aspect-based sentiment analysis, is a subfield of natural language processing (NLP) that focuses on identifying and extracting subjective information from text. It goes beyond general sentiment analysis by not only determining the polarity (positive, negative, or neutral) of a text but also identifying the specific opinions expressed about different aspects or entities within the text.

For example, in a product review:

Text: "The battery life of this phone is amazing, but the camera quality is terrible."
Output:
- Aspect: Battery life → Sentiment: Positive
- Aspect: Camera quality → Sentiment: Negative

This is an even harder task than before. You need to identify the target of the opinion (e.g., product, feature, service), pinpoint the specific features or attributes being discussed (e.g., battery life, camera), and classify the sentiment as positive, negative, or neutral for each aspect.

To go even further, you can identify who expressed the opinion (e.g., user, reviewer) and capture the context, such as the time, location, or conditions that influenced the opinion.

Because of the complexity of the task, I find that you need a longer, more specific and detailed prompt to get good results.

Prompt for opinion mining

Perform a detailed aspect-based sentiment analysis on the following review. Follow these specific instructions:

Aspects: Identify all distinct aspects (e.g., features, attributes, or topics) mentioned in the text.
- Ignore general statements or unrelated information.
- Combine similar phrases into a single aspect (e.g., "battery life" and "battery performance" → "battery life").
Sentiment: For each aspect, classify its sentiment as "positive," "negative," or "neutral."
- Positive: Clear praise or satisfaction (e.g., "amazing," "excellent").
- Negative: Clear criticism or dissatisfaction (e.g., "terrible," "poor").
- Neutral: Statements that are factual or lack clear sentiment (e.g., "just okay," "average").
Negation Handling: Pay attention to negations or modifiers.
- Example: "Not great" → Negative; "Not bad" → Positive.
Mixed Sentiment: If an aspect has both positive and negative sentiments, label it as "mixed" and provide details for each sentiment.
Reason: For each aspect, provide a brief explanation of why the sentiment was assigned, referencing specific words or phrases.
Ambiguity: If the sentiment is unclear or ambiguous, label it as "neutral" and explain the ambiguity.
Output Format: Return the result as a JSON array with the following fields:
- "aspect": The aspect being discussed.
- "sentiment": The sentiment ("positive," "negative," "neutral," or "mixed").
- "reason": A brief explanation for the sentiment.

Example Input: "The battery life of this phone is amazing, but the camera quality is terrible, and the screen is just okay."

Expected Output:

[
    {
        "aspect": "battery life",
        "sentiment": "positive",
        "reason": "The battery life is described as amazing."
    },
    {
        "aspect": "camera quality",
        "sentiment": "negative",
        "reason": "The camera quality is described as terrible."
    },
    {
        "aspect": "screen",
        "sentiment": "neutral",
        "reason": "The screen is described as just okay, indicating a neutral sentiment."
    }
]

Now analyze this review: "<Insert Review Here>"

For a review that only praises the battery, this should return JSON like:

[
  {
    "aspect": "battery life",
    "sentiment": "positive",
    "reason": "The battery life is described as amazing."
  }
]
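Because the prompt pins down the output fields, the result is easy to validate and aggregate in code. Here is a minimal sketch, assuming the model returns the JSON array as a string. The function name `parse_aspects` is hypothetical, and the sample data is the expected output from the prompt above.

```python
import json
from collections import Counter

ALLOWED_SENTIMENTS = {"positive", "negative", "neutral", "mixed"}
REQUIRED_FIELDS = ("aspect", "sentiment", "reason")


def parse_aspects(raw: str) -> list[dict]:
    """Parse the JSON array and check every entry against the prompt's schema."""
    items = json.loads(raw)
    if not isinstance(items, list):
        raise ValueError("expected a JSON array")
    for item in items:
        if not isinstance(item, dict):
            raise ValueError(f"expected an object, got {item!r}")
        missing = [name for name in REQUIRED_FIELDS if name not in item]
        if missing:
            raise ValueError(f"missing fields {missing} in {item!r}")
        if item["sentiment"] not in ALLOWED_SENTIMENTS:
            raise ValueError(f"unexpected sentiment {item['sentiment']!r}")
    return items


raw_reply = """[
  {"aspect": "battery life", "sentiment": "positive",
   "reason": "The battery life is described as amazing."},
  {"aspect": "camera quality", "sentiment": "negative",
   "reason": "The camera quality is described as terrible."},
  {"aspect": "screen", "sentiment": "neutral",
   "reason": "The screen is described as just okay, indicating a neutral sentiment."}
]"""

aspects = parse_aspects(raw_reply)
# Count how many aspects fall under each sentiment label.
counts = Counter(item["sentiment"] for item in aspects)
for label in ("positive", "negative", "neutral", "mixed"):
    print(label, counts[label])
```

Run over many reviews, the same counts give you a per-aspect picture of what customers like and dislike.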

Emotion detection

Emotion Detection is a subfield of natural language processing (NLP) that aims to identify and categorize the emotional tone expressed in a piece of text. Unlike sentiment analysis, which typically classifies text as positive, negative, or neutral, emotion detection identifies more granular emotional states such as happiness, anger, sadness, fear, surprise, or disgust.

There are many complexities to emotion detection. Emotions can be subjective and vary based on personal interpretation. The same text may evoke different emotions for different people. Also, emotions often rely on the context in which the words are used. "I can't believe you did this!" can express surprise, anger, or excitement depending on context.

Text may express multiple emotions simultaneously (e.g., joy and sadness in a bittersweet statement), so classifying it into a single category can oversimplify its emotional complexity. Emotions can also be implicit: "I just got promoted!" implies happiness but doesn't directly mention it.

Different cultures and languages may express emotions differently, posing challenges for universal emotion detection.

The 'big boss' of emotion detection is sarcasm. Sarcasm often contradicts the literal meaning of words, making accurate emotion detection difficult. "Oh, great! Another meeting!" likely expresses annoyance rather than genuine excitement.

For example, take the following text:

"I can't believe they forgot my birthday! It's so frustrating, but I guess it's not a big deal."

You might classify it as:

{
  "primary_emotion": "anger",
  "secondary_emotions": ["sadness"],
  "reason": "The text expresses anger ('so frustrating') about being forgotten, along with a sense of sadness implied by the disappointment ('forgot my birthday')."
}
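A structure like this, with one primary emotion and a list of secondary ones, is one way to avoid forcing the text into a single category. Below is a minimal sketch of how you might load and check it. The `EmotionResult` class and the fixed label set are my assumptions, taken from the emotions listed earlier in this section.

```python
import json
from dataclasses import dataclass, field

# Assumed label set, based on the emotions named in this section.
EMOTIONS = {"happiness", "anger", "sadness", "fear", "surprise", "disgust"}


@dataclass
class EmotionResult:
    primary_emotion: str
    secondary_emotions: list[str] = field(default_factory=list)
    reason: str = ""


def parse_emotion_response(raw: str) -> EmotionResult:
    """Load the JSON reply and reject labels outside the assumed set."""
    data = json.loads(raw)
    result = EmotionResult(
        primary_emotion=data["primary_emotion"],
        secondary_emotions=list(data.get("secondary_emotions", [])),
        reason=data.get("reason", ""),
    )
    unknown = {result.primary_emotion, *result.secondary_emotions} - EMOTIONS
    if unknown:
        raise ValueError(f"labels outside the allowed set: {sorted(unknown)}")
    return result


raw_reply = """{
  "primary_emotion": "anger",
  "secondary_emotions": ["sadness"],
  "reason": "The text expresses anger ('so frustrating') about being forgotten."
}"""

result = parse_emotion_response(raw_reply)
print(result.primary_emotion, result.secondary_emotions)
```

A closed label set is a trade-off: it makes results comparable across texts, but it will reject replies that use finer-grained emotions such as "disappointment".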

Subjectivity detection

Subjectivity Analysis is a task in Natural Language Processing (NLP) that determines whether a given piece of text is subjective (opinion-based) or objective (fact-based). It is a foundational step in applications like sentiment analysis, opinion mining, and content moderation.

This is harder than it sounds. Texts often mix subjective and objective elements, and they may imply opinions without explicitly stating them. The same statement can also read as subjective in one context or domain and as objective in another.

For example, take the following text:

"The battery life of this phone is amazing, but the camera quality is terrible."

You might classify it as:

{
  "subjectivity": "subjective",
  "reason": "The text expresses opinions about the battery life and camera quality."
}
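Since texts often mix subjective and objective elements, one option is to classify each sentence separately and then combine the labels. Here is a minimal sketch of that idea. The sentence splitter is deliberately naive, the "mixed" label is my own addition, and the per-sentence labels are hypothetical stand-ins for what an LLM might return.

```python
import re

VALID_LABELS = {"subjective", "objective"}


def split_sentences(text: str) -> list[str]:
    """Naive splitter: break after '.', '!' or '?' followed by whitespace."""
    return [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]


def overall_subjectivity(labels: list[str]) -> str:
    """Combine per-sentence labels into one label for the whole text."""
    if not labels:
        raise ValueError("no labels to combine")
    unique = set(labels)
    if unique - VALID_LABELS:
        raise ValueError(f"unexpected labels: {sorted(unique - VALID_LABELS)}")
    # All sentences agree: use that label. Otherwise the text is mixed.
    return labels[0] if len(unique) == 1 else "mixed"


text = "This phone has two rear cameras. The camera quality is terrible."
sentences = split_sentences(text)

# Hypothetical per-sentence labels, as an LLM might return them.
labels = ["objective", "subjective"]

for sentence, label in zip(sentences, labels):
    print(f"{label}: {sentence}")
overall = overall_subjectivity(labels)
print(overall)  # mixed
```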

Building blocks for big problems in generative AI

Subjectivity analysis, sentiment analysis, emotion detection, and opinion mining are integral to solving several big AI challenges.

Content Moderation

Automatically identifying harmful, offensive, or inappropriate content in user-generated text, such as social media posts, reviews, or comments. Social media platforms, like Facebook, struggle with this problem because of the sheer volume of content, the need for real-time moderation, and how difficult it is to define what is harmful, offensive, or inappropriate.

Hallucination Detection

Detecting when AI models generate false or misleading statements that appear factual (hallucinations). Subjectivity analysis can help flag statements that should be factual (e.g., in encyclopedic or medical contexts) but include subjective or fabricated content. This is particularly hard for nuanced or subtle hallucinations, such as "The Eiffel Tower was built in 1899," which sounds credible but is factually wrong (the tower was completed in 1889).

Bias Detection

Models may unintentionally favor certain sentiments, emotions, or subjective perspectives. This is a particularly big research topic, because we want to make sure that the models we build are aligned with human values and culture. Ensuring fairness across different demographics and cultural contexts without distorting genuine results is an unsolved problem.

AI quality optimisation for sentiment analysis

Big payoffs are ahead of us when we solve these problems.

To address these issues effectively, businesses and researchers need tools like Datograde, which enable them to observe, evaluate, and optimize their AI data extraction pipelines. This would not only streamline workflows but also enhance the quality and reliability of AI outputs.