Integrate generative AI capabilities into Microsoft Office using Amazon Bedrock

Generative AI is rapidly transforming the modern workplace, offering unprecedented capabilities that augment how we interact with text and data. At Amazon Web Services (AWS), we recognize that many of our customers rely on the familiar Microsoft Office suite of applications, including Word, Excel, and Outlook, as the backbone of their daily workflows. In this blog post, we showcase a powerful solution that seamlessly integrates AWS generative AI capabilities in the form of large language models (LLMs) based on Amazon Bedrock into the Office experience. By harnessing the latest advancements in generative AI, we empower employees to unlock new levels of efficiency and creativity within the tools they already use every day. Whether it’s drafting compelling text, analyzing complex datasets, or gaining more in-depth insights from information, integrating generative AI with Office suite transforms the way teams approach their essential work. Join us as we explore how your organization can leverage this transformative technology to drive innovation and boost employee productivity.

Solution overview

Figure 1: Solution architecture overview

The solution architecture in Figure 1 shows how Office applications interact with a serverless backend hosted on the AWS Cloud through an Add-In. This architecture allows users to leverage Amazon Bedrock’s generative AI capabilities directly from within the Office suite, enabling enhanced productivity and insights within their existing workflows.

Components deep-dive

Office Add-ins

Office Add-ins allow extending Office products with custom extensions built on standard web technologies. Using AWS, organizations can host and serve Office Add-ins for users worldwide with minimal infrastructure overhead.

An Office Add-in is composed of two elements:

The code snippet below demonstrates part of a function that could run whenever a user invokes the plugin, performing the following actions:

Initiate a request to the generative AI backend, providing the user prompt and available context in the request body
Integrate the results from the backend response into the Word document using Microsoft’s Office JavaScript APIs. Note that these APIs use objects as namespaces, alleviating the need for explicit imports. Instead, we use the globally available namespaces, such as Word, to directly access relevant APIs, as shown in following example snippet.

// Initiate backend request (optional context)
const response = await sendPrompt( user_message: prompt, context: selectedContext );

// Modify Word content with responses from the Backend
await Word.run(async (context) => 
  let documentBody;

  // Target for the document modifications
  if (response.location === 'Replace') 
    documentBody = context.document.getSelection(); // active text selection
   else 
    documentBody = context.document.body; // entire document body
  

  // Markdown support for preserving original content layout
  // Dependencies used: React markdown
  const content = renderToString(<Markdown> response.content  < /Markdown>);
  const operation = documentBody.insertHtml(content, response.location);

  // set properties for the output content (font, size, color, etc.)
  operation.font.set( name: 'Arial' );

  // flush changes to the Word document
  await context.sync();
);

Generative AI backend infrastructure

The AWS Cloud backend consists of three components:

Amazon API Gateway acts as an entry point, receiving requests from the Office applications’ Add-in. API Gateway supports multiple mechanisms for controlling and managing access to an API.
AWS Lambda handles the REST API integration, processing the requests and invoking the appropriate AWS services.
Amazon Bedrock is a fully managed service that makes foundation models (FMs) from leading AI startups and Amazon available via an API, so you can choose from a wide range of FMs to find the model that is best suited for your use case. With Bedrock’s serverless experience, you can get started quickly, privately customize FMs with your own data, and quickly integrate and deploy them into your applications using the AWS tools without having to manage infrastructure.

LLM prompting

Amazon Bedrock allows you to choose from a wide selection of foundation models for prompting. Here, we use Anthropic’s Claude 3.5 Sonnet on Amazon Bedrock for completions. The system prompt we used in this example is as follows:

You are an office assistant helping humans to write text for their documents.

[When preparing the answer, take into account the following text: <text>context</text>]
Before answering the question, think through it step-by-step within the <thinking></thinking> tags.
Then, detect the user's language from their question and store it in the form of an ISO 639-1 code within the <user_language></user_language> tags.
Then, develop your answer in the user’s language within the <response></response> tags.

In the prompt, we first give the LLM a persona, indicating that it is an office assistant helping humans. The second, optional line contains text that has been selected by the user in the document and is provided as context to the LLM. We specifically instruct the LLM to first mimic a step-by-step thought process for arriving at the answer (chain-of-thought reasoning), an effective measure of prompt-engineering to improve the output quality. Next, we instruct it to detect the user’s language from their question so we can later refer to it. Finally, we instruct the LLM to develop its answer using the previously detected user language within response tags, which are used as the final response. While here, we use the default configuration for inference parameters such as temperature, that can quickly be configured with every LLM prompt. The user input is then added as a user message to the prompt and sent via the Amazon Bedrock Messages API to the LLM.

Implementation details and demo setup in an AWS account

As a prerequisite, we need to make sure that we are working in an AWS Region with Amazon Bedrock support for the foundation model (here, we use Anthropic’s Claude 3.5 Sonnet). Also, access to the required relevant Amazon Bedrock foundation models needs to be added. For this demo setup, we describe the manual steps taken in the AWS console. If required, this setup can also be defined in Infrastructure as Code.

To set up the integration, follow these steps:

Create an AWS Lambda function with Python runtime and below code to be the backend for the API. Make sure that we have Powertools for AWS Lambda (Python) available in our runtime, for example, by attaching aLambda layer to our function. Make sure that the Lambda function’s IAM role provides access to the required FM, for example:


    "Version": "2012-10-17",
    "Statement": [
        
            "Effect": "Allow",
            "Action": "bedrock:InvokeModel",
            "Resource": [
                "arn:aws:bedrock:*::foundation-model/anthropic.claude-3-5-sonnet-20240620-v1:0"
            ]
        
    ]

The following code block shows a sample implementation for the REST API Lambda integration based on a Powertools for AWS Lambda (Python) REST API event handler:

import json
import re
from typing import Optional

import boto3
from aws_lambda_powertools import Logger
from aws_lambda_powertools.event_handler import APIGatewayRestResolver, CORSConfig
from aws_lambda_powertools.logging import correlation_paths
from aws_lambda_powertools.utilities.typing import LambdaContext
from pydantic import BaseModel

logger = Logger()
app = APIGatewayRestResolver(
    enable_validation=True,
    cors=CORSConfig(allow_origin="http://localhost:3000"),  # for testing purposes
)

bedrock_runtime_client = boto3.client("bedrock-runtime")


SYSTEM_PROMPT = """
You are an office assistant helping humans to write text for their documents.

context
Before answering the question, think through it step-by-step within the <thinking></thinking> tags.
Then, detect the user's language from their question and store it in the form of an ISO 639-1 code within the <user_language></user_language> tags.
Then, develop your answer in the user's language in markdown format within the <response></response> tags.
"""

class Query(BaseModel):
    user_message: str  # required
    context: Optional[str] = None  # optional
    max_tokens: int = 1000  # default value
    model_id: str = "anthropic.claude-3-5-sonnet-20240620-v1:0"  # default value

def wrap_context(context: Optional[str]) -> str:
    if context is None:
        return ""
    else:
        return f"When preparing the answer take into account the following text: <text>context</text>"

def parse_completion(completion: str) -> dict:
    response = "completion": completion
    try:
        tags = ["thinking", "user_language", "response"]
        tag_matches = re.finditer(
            f"<(?P<tag>'.join(tags))>(?P<content>.*?)</(?P=tag)>",
            completion,
            re.MULTILINE | re.DOTALL,
        )
        for match in tag_matches:
            response[match.group("tag")] = match.group("content").strip()
    except Exception:
        logger.exception("Unable to parse LLM response")
        response["response"] = completion

    return response


@app.post("/query")
def query(query: Query):
    bedrock_response = bedrock_runtime_client.invoke_model(
        modelId=query.model_id,
        body=json.dumps(
            
                "anthropic_version": "bedrock-2023-05-31",
                "max_tokens": query.max_tokens,
                "system": SYSTEM_PROMPT.format(context=wrap_context(query.context)),
                "messages": ["role": "user", "content": query.user_message],
            
        ),
    )
    response_body = json.loads(bedrock_response.get("body").read())
    logger.info("Received LLM response", response_body=response_body)
    response_text = response_body.get("content", [])[0].get(
        "text", "LLM did not respond with text"
    )
    return parse_completion(response_text)

@logger.inject_lambda_context(correlation_id_path=correlation_paths.API_GATEWAY_REST)
def lambda_handler(event: dict, context: LambdaContext) -> dict:
    return app.resolve(event, context)

Create an API Gateway REST API with a Lambda proxy integration to expose the Lambda function via a REST API. You can follow this tutorial for creating a REST API for the Lambda function by using the API Gateway console. By creating a Lambda proxy integration with a proxy resource, we can route requests to the resources to the Lambda function. Follow the tutorial to deploy the API and take note of the API’s invoke URL. Make sure to configure adequate access control for the REST API.

We can now invoke and test our function via the API’s invoke URL. The following example uses curl to send a request (make sure to replace all placeholders in curly braces as required), and the response generated by the LLM:

$ curl --header "Authorization: token" \
     --header "Content-Type: application/json" \
     --request POST \
     --data '"user_message": "Write a 2 sentence summary about AWS."' \
     https://restapi_id.execute-api.region.amazonaws.com/stage_name/query | jq .

 "completion": "<thinking>\nTo summarize AWS in 2 sentences:\n1. AWS (Amazon Web Services) is a comprehensive cloud computing platform offering a wide range of services like computing power, database storage, content delivery, and more.\n2. It allows organizations and individuals to access these services over the internet on a pay-as-you-go basis without needing to invest in on-premises infrastructure.\n</thinking>\n\n<user_language>en</user_language>\n\n<response>\n\nAWS (Amazon Web Services) is a cloud computing platform that offers a broad set of global services including computing, storage, databases, analytics, machine learning, and more. It enables companies of all sizes to access these services over the internet on a pay-as-you-go pricing model, eliminating the need for upfront capital expenditure or on-premises infrastructure management.\n\n</response>",
 "thinking": "To summarize AWS in 2 sentences:\n1. AWS (Amazon Web Services) is a comprehensive cloud computing platform offering a wide range of services like computing power, database storage, content delivery, and more.\n2. It allows organizations and individuals to access these services over the internet on a pay-as-you-go basis without needing to invest in on-premises infrastructure.",
 "user_language": "en",
 "response": "AWS (Amazon Web Services) is a cloud computing platform that offers a broad set of global services including computing, storage, databases, analytics, machine learning, and more. It enables companies of all sizes to access these services over the internet on a pay-as-you-go pricing model, eliminating the need for upfront capital expenditure or on-premises infrastructure management."

If required, the created resources can be cleaned up by 1) deleting the API Gateway REST API, and 2) deleting the REST API Lambda function and associated IAM role.

Example use cases

To create an interactive experience, the Office Add-in integrates with the cloud back-end that implements conversational capabilities with support for additional context retrieved from the Office JavaScript API.

Next, we demonstrate two different use cases supported by the proposed solution, text generation and text refinement.

Text generation

Figure 2: Text generation use-case demo

In the demo in Figure 2, we show how the plug-in is prompting the LLM to produce a text from scratch. The user enters their query with some context into the Add-In text input area. Upon sending, the backend will prompt the LLM to generate respective text, and return it back to the frontend. From the Add-in, it is inserted into the Word document at the cursor position using the Office JavaScript API.

Text refinement

Figure 3: Text refinement use-case demo

In Figure 3, the user highlighted a text segment in the work area and entered a prompt into the Add-In text input area to rephrase the text segment. Again, the user input and highlighted text are processed by the backend and returned to the Add-In, thereby replacing the previously highlighted text.

Conclusion

This blog post showcases how the transformative power of generative AI can be incorporated into Office processes. We described an end-to-end sample of integrating Office products with an Add-in for text generation and manipulation with the power of LLMs. In our example, we used managed LLMs on Amazon Bedrock for text generation. The backend is hosted as a fully serverless application on the AWS cloud.

Text generation with LLMs in Office supports employees by streamlining their writing process and boosting productivity. Employees can leverage the power of generative AI to generate and edit high-quality content quickly, freeing up time for other tasks. Additionally, the integration with a familiar tool like Word provides a seamless user experience, minimizing disruptions to existing workflows.

To learn more about boosting productivity, building differentiated experiences, and innovating faster with AWS visit the Generative AI on AWS page.

About the Authors

Martin Maritsch is a Generative AI Architect at AWS ProServe focusing on Generative AI and MLOps. He helps enterprise customers to achieve business outcomes by unlocking the full potential of AI/ML services on the AWS Cloud.

Miguel Pestana is a Cloud Application Architect in the AWS Professional Services team with over 4 years of experience in the automotive industry delivering cloud native solutions. Outside of work Miguel enjoys spending its days at the beach or with a padel racket in one hand and a glass of sangria on the other.

Carlos Antonio Perea Gomez is a Builder with AWS Professional Services. He enables customers to become AWSome during their journey to the cloud. When not up in the cloud he enjoys scuba diving deep in the waters.

Integrate generative AI capabilities into Microsoft Office using Amazon Bedrock

Solution overview

Components deep-dive

Office Add-ins

Generative AI backend infrastructure

LLM prompting

Implementation details and demo setup in an AWS account

Example use cases

Text generation

Text refinement

Conclusion

About the Authors

Recent Articles

The Roadmap for Mastering MLOps in 2025

Clustering Eating Behaviors in Time: A Machine Learning Approach to Preventive Health

Beware of phone scams demanding money for ‘missed jury duty’

NYT Connections hints and answers for May 9: Tips to solve ‘Connections’ #698.

Insights in implementing production-ready solutions with generative AI

Related Stories

Leave A Reply Cancel reply