Cloud Gal 42

QuickGuide: Use AWS Comprehend to perform language analysis

May 22, 2021 by admin

Step 1 – Create an Ubuntu EC2 instance and prepare it for Comprehend

sudo apt update
python3 --version
sudo apt install python3    # only if python3 is not installed
sudo apt install python3-pip -y
sudo apt install zip -y
sudo apt install awscli -y
pip3 install boto3
pip3 install --upgrade awscli
aws configure    # use the credentials of the IAM user you created for the AWS CLI
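
Before moving on, it can help to confirm that boto3 picks up what `aws configure` stored. The following is a minimal sketch, not part of the original guide: `describe_setup` is a hypothetical helper name, and it relies only on the `region_name` attribute and `get_credentials()` method of a boto3 session.

```python
def describe_setup(session):
    """Summarize what a boto3 session resolved from `aws configure`.

    `describe_setup` is a hypothetical helper; `session` is expected to be
    a boto3.Session (or any object with the same two members).
    """
    credentials = session.get_credentials()  # None when no credentials were found
    return {
        "region": session.region_name,  # None when no default region is set
        "has_credentials": credentials is not None,
    }
```

On the instance, running `import boto3` followed by `print(describe_setup(boto3.Session()))` should report a region and `has_credentials: True` if the configuration was saved. It does not verify that the IAM user is actually allowed to call Comprehend.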

Step 2 – Create a sample comprehend.py

sudo chmod 777 -R /opt
cd /opt
vi comprehend.py    # press i to edit; when done, press Esc and type :wq! to save and exit

import boto3
import json

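# Create the Comprehend client; change region_name to the region you want to use
comprehend = boto3.client(service_name='comprehend', region_name='us-east-2')
# Text to analyze; uncomment one of the alternatives to try a different sentence.
# The calls below pass LanguageCode='en', so change it to 'fr' for the French sentence.
text = "As master Yoda said - May the force be with you."
#text = "Traffic usually is not good on my way to work."
#text = "La technologie peut vous donner du bonheur."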

print('Calling DetectDominantLanguage')
print(json.dumps(comprehend.detect_dominant_language(Text=text), sort_keys=True, indent=4))

print('Calling DetectEntities')
print(json.dumps(comprehend.detect_entities(Text=text, LanguageCode='en'), sort_keys=True, indent=4))

print('Calling DetectKeyPhrases')
print(json.dumps(comprehend.detect_key_phrases(Text=text, LanguageCode='en'), sort_keys=True, indent=4))

print('Calling DetectSentiment')
print(json.dumps(comprehend.detect_sentiment(Text=text, LanguageCode='en'), sort_keys=True, indent=4))

print('Calling DetectSyntax')
print(json.dumps(comprehend.detect_syntax(Text=text, LanguageCode='en'), sort_keys=True, indent=4))

print('All done\n')
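
The script above hard-codes `LanguageCode='en'`, which does not match the commented-out French sentence. One possible variation, sketched here under assumptions, is to feed the result of `detect_dominant_language` into the other calls. `analyze` is a hypothetical helper name, and `client` is assumed to be the `comprehend` client created in the script. Note that Comprehend can detect more languages than the other operations support, so this may still fail for some inputs.

```python
def analyze(client, text):
    """Detect the dominant language, then reuse it for sentiment and key phrases.

    Hypothetical helper; `client` is assumed to be a boto3 Comprehend client.
    """
    languages = client.detect_dominant_language(Text=text)["Languages"]
    # Pick the candidate language with the highest confidence score
    language_code = max(languages, key=lambda lang: lang["Score"])["LanguageCode"]

    sentiment = client.detect_sentiment(Text=text, LanguageCode=language_code)
    phrases = client.detect_key_phrases(Text=text, LanguageCode=language_code)

    return {
        "language": language_code,
        "sentiment": sentiment["Sentiment"],
        "key_phrases": [phrase["Text"] for phrase in phrases["KeyPhrases"]],
    }
```

In comprehend.py this could be called as `print(json.dumps(analyze(comprehend, text), indent=4))`.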

Step 3 – Execute comprehend.py

python3 comprehend.py

The script prints the full JSON response of each call. Excerpts from the output for the default sentence:

"Languages": [
        {
            "LanguageCode": "en",
            "Score": 0.9962069392204285
        }
    ],

"Entities": [
        {
            "BeginOffset": 10,
            "EndOffset": 14,
            "Score": 0.9931020736694336,
            "Text": "Yoda",
            "Type": "PERSON"
        }
    ],

"KeyPhrases": [
        {
            "BeginOffset": 3,
            "EndOffset": 14,
            "Score": 0.9999999403953552,
            "Text": "master Yoda"
        },
        {
            "BeginOffset": 26,
            "EndOffset": 35,
            "Score": 0.9946843981742859,
            "Text": "the force"
        }
    ],

"Sentiment": "NEUTRAL",
    "SentimentScore": {
        "Mixed": 0.00018740717496257275,
        "Negative": 0.0032549421302974224,
        "Neutral": 0.7407105565071106,
        "Positive": 0.2558470666408539
    }
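
The excerpts above can also be read programmatically. The sketch below reuses the entity and sentiment values shown in the output; `entity_spans` and `strongest_sentiment` are hypothetical helper names, and the 0.9 score threshold is an arbitrary assumption. It also illustrates that `BeginOffset` and `EndOffset` are character positions in the input text, so slicing the text with them recovers the entity.

```python
text = "As master Yoda said - May the force be with you."

# Values copied from the output excerpts above
entities_response = {
    "Entities": [
        {"BeginOffset": 10, "EndOffset": 14, "Score": 0.9931020736694336,
         "Text": "Yoda", "Type": "PERSON"}
    ]
}
sentiment_response = {
    "Sentiment": "NEUTRAL",
    "SentimentScore": {
        "Mixed": 0.00018740717496257275,
        "Negative": 0.0032549421302974224,
        "Neutral": 0.7407105565071106,
        "Positive": 0.2558470666408539,
    },
}


def entity_spans(text, response, min_score=0.9):
    """Return (substring, type) pairs for entities at or above min_score."""
    return [
        (text[entity["BeginOffset"]:entity["EndOffset"]], entity["Type"])
        for entity in response["Entities"]
        if entity["Score"] >= min_score
    ]


def strongest_sentiment(response):
    """Return the highest-scoring sentiment label (upper-cased) and its score."""
    scores = response["SentimentScore"]
    label = max(scores, key=scores.get)
    return label.upper(), scores[label]


print(entity_spans(text, entities_response))    # [('Yoda', 'PERSON')]
print(strongest_sentiment(sentiment_response))  # ('NEUTRAL', 0.7407105565071106)
```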

Change the text in the Python script (for example, uncomment one of the alternative sentences) to see how the analysis changes with each statement.
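
Instead of editing the script for each sentence, several sentences can be compared in one run. The following is a rough sketch using the Comprehend `batch_detect_sentiment` operation; `batch_sentiment` is a hypothetical helper name, `client` is assumed to be the `comprehend` client from the script, and the chunk size of 25 reflects the per-request document limit as I understand it. Entries reported in the response's `ErrorList` are ignored here for brevity.

```python
def batch_sentiment(client, texts, language_code="en"):
    """Map each text to its detected sentiment using batch requests.

    Hypothetical helper; `client` is assumed to be a boto3 Comprehend client.
    All texts are assumed to be in the same language.
    """
    results = {}
    for start in range(0, len(texts), 25):  # assumed limit of 25 documents per call
        chunk = texts[start:start + 25]
        response = client.batch_detect_sentiment(
            TextList=chunk, LanguageCode=language_code
        )
        for item in response["ResultList"]:
            # Index is the zero-based position of the document within this request
            results[chunk[item["Index"]]] = item["Sentiment"]
    return results
```

With the two English sentences from the script, this could be called as `batch_sentiment(comprehend, ["As master Yoda said - May the force be with you.", "Traffic usually is not good on my way to work."])`.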
