Anomaly and Fraud Detection in AWS using Machine Learning, Artificial Intelligence and Data Insights | by William Aaron | Nov, 2024


In today’s digital age, the increasing complexity and volume of data present both opportunities and challenges for businesses. One significant challenge is the detection of anomalies and fraudulent activities, which can pose substantial financial and reputational risks. Leveraging the robust suite of tools and services provided by AWS, organizations can harness the power of Machine Learning (ML), Artificial Intelligence (AI), and Data Insights to address these issues effectively.

This article proposes a cloud infrastructure solution for deploying an anomaly and fraud detection system on AWS using containerized environments and storage systems. It also shows how to visualize the analytics output with managed data visualization services provided by AWS.

WAARON Finances Inc. is a rapidly growing financial technology company that provides innovative digital banking and payment services to millions of customers worldwide. With the rise in online transactions and digital payments, the company faces increasing risks of fraudulent activities and anomalies within their massive transaction datasets. These risks include unauthorized transactions, account takeovers, and unusual spending patterns that could indicate potential fraud.

The company aims to leverage AWS’s advanced machine learning, artificial intelligence, and data analytics services to build a solution capable of:

  1. Identifying Anomalies: Detecting irregular patterns in transaction data that deviate from typical customer behavior, helping to flag suspicious activities for further investigation.
  2. Fraud Detection: Utilizing predictive models to identify potentially fraudulent transactions in real time, thereby minimizing the financial impact and preventing losses.
  3. Real-time Monitoring: Implementing a robust monitoring system to continuously analyze incoming data streams and alert security teams to any potential threats immediately.
  4. Data Insights: Gaining deeper insights into customer behavior and transaction trends, allowing for more informed decision-making and strategic planning.

The architecture diagram accompanying this article shows the proposed infrastructure for WAARON Finances Inc. This article breaks down each service used in the architecture, why it is used, and its function in the overall workflow.

Data Source

The data source section is the storage layer: the proposed location where WAARON should keep all transaction data and CSV files for processing. An S3 bucket stores the CSV files and transaction data, while a DynamoDB table stores transactional metadata.

Data Transformation

This section is where the compute and processing happen. The application code follows the microservices pattern, with each microservice performing a single function. Let’s look at each microservice in this project scenario.

  1. Collector Microservice: This is a Java-based application that accepts the raw transaction data (most likely compressed) and extracts it into a readable, cleaned format. It reads a payment data file line by line, extracts the relevant information, processes it, and stores the result in a storage service (S3).
  2. Detector Microservice: This is a Python-based microservice that performs the anomaly detection using machine learning techniques. It reads input data from the S3 bucket, transforms it into embeddings using a Standard Scaler and a Sentence Transformer, and then compares the embeddings to model data stored in PostgreSQL. Finally, it writes an enriched dataset, with cosine similarity as the anomaly signal (between 0 and 1), back to the S3 bucket as output data.
  3. Curation Microservice: In this project, the CSV data contains a column called curation that holds the source IP address of each transaction. This microservice extracts that data along with some metadata and stores it in S3, where it is later used for analytics and visualization in another section.
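The scoring step of the Detector can be sketched in plain Python. This is a minimal illustration under stated assumptions, not the project's code: the feature columns (amount, hour of day) are hypothetical, the reference vectors live in a list rather than PostgreSQL, the Sentence Transformer embedding of text fields is left out, and the mapping of cosine similarity from [-1, 1] to [0, 1] is one possible way to obtain the 0 to 1 range the article mentions.

```python
import math
from statistics import fmean, pstdev


def fit_standard_scaler(rows):
    """Return per-column (means, stds) computed from the reference rows."""
    columns = list(zip(*rows))
    means = [fmean(col) for col in columns]
    # Population standard deviation; constant columns fall back to 1.0
    # so that scaling never divides by zero.
    stds = [pstdev(col) or 1.0 for col in columns]
    return means, stds


def scale(row, means, stds):
    """Standardize one row with previously fitted statistics."""
    return [(x - m) / s for x, m, s in zip(row, means, stds)]


def cosine_similarity(a, b):
    """Cosine of the angle between two vectors; 0.0 if either has zero length."""
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(x * x for x in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    dot = sum(x * y for x, y in zip(a, b))
    return dot / (norm_a * norm_b)


def anomaly_signal(vector, reference_vectors):
    """Best cosine similarity against the reference set, mapped to [0, 1]."""
    best = max(cosine_similarity(vector, ref) for ref in reference_vectors)
    # Assumption: rescale from [-1, 1] to [0, 1] and clamp rounding error.
    return min(1.0, max(0.0, (best + 1.0) / 2.0))


# Hypothetical numeric features per transaction: (amount, hour of day).
history = [(20.0, 9.0), (35.0, 13.0), (25.0, 18.0), (900.0, 3.0)]
means, stds = fit_standard_scaler(history)
reference = [scale(row, means, stds) for row in history]

incoming = scale((880.0, 2.0), means, stds)
score = anomaly_signal(incoming, reference)
print(f"anomaly signal: {score:.3f}")
```

Whether a high or a low score should be treated as suspicious depends on whether the reference vectors represent known fraud or normal behavior, which the article leaves open.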

A data lake is a centralized repository that allows you to store all your structured and unstructured data at any scale. In this project, S3 and RDS are the two primary managed services that make up the data lake. The PostgreSQL database on RDS stores the vectors of historical transactions, both fraudulent and legitimate.

The analytics and visualization section is where the finalized data stored in S3 is queried with Amazon Athena and then visualized with Amazon QuickSight. Athena queries the data directly in S3 using SQL statements, and the query results are turned into QuickSight dashboards for analyzing fraud in specific accounts.
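As a rough sketch of the Athena step, the snippet below assembles the arguments for Athena's `StartQueryExecution` API and submits them with boto3. The database, table, column names, results bucket, and the 0.9 threshold are all hypothetical, since the article does not define a schema.

```python
def build_athena_request(sql, database, output_location):
    """Assemble keyword arguments for Athena's StartQueryExecution API."""
    return {
        "QueryString": sql,
        "QueryExecutionContext": {"Database": database},
        "ResultConfiguration": {"OutputLocation": output_location},
    }


def run_query(athena_request):
    """Submit the query with boto3 (needs AWS credentials); return its execution ID."""
    import boto3  # imported lazily so the builder above works without boto3 installed

    athena = boto3.client("athena")
    response = athena.start_query_execution(**athena_request)
    return response["QueryExecutionId"]


# Hypothetical table and column names, and an assumed threshold of 0.9.
SQL = """
SELECT account_id, COUNT(*) AS flagged_transactions
FROM enriched_transactions
WHERE anomaly_signal > 0.9
GROUP BY account_id
ORDER BY flagged_transactions DESC
LIMIT 20
"""

athena_request = build_athena_request(
    sql=SQL,
    database="fraud_detection",  # hypothetical Athena database
    output_location="s3://example-athena-results/queries/",  # hypothetical bucket
)
# execution_id = run_query(athena_request)  # uncomment in an AWS environment
```

A QuickSight dataset can then use Athena as its data source, so a dashboard reads from the same tables this query touches.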

GitHub Actions

DevOps practices are also incorporated into the architecture, specifically a CI/CD pipeline built with GitHub Actions. The pipeline picks up changes to the application code, builds a new application Docker image and stores it in ECR, and automatically deploys a new version of the ECS task definition.
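The core of the deploy step is swapping the container image in the task definition before registering a new revision. The sketch below shows only that transformation in Python; the task definition, account ID, repository, and tags are hypothetical, and the actual pipeline would normally do this inside a GitHub Actions workflow rather than a standalone script.

```python
import copy


def with_new_image(task_definition, container_name, image_uri):
    """Return a copy of the task definition whose named container uses image_uri."""
    updated = copy.deepcopy(task_definition)
    for container in updated["containerDefinitions"]:
        if container["name"] == container_name:
            container["image"] = image_uri
            return updated
    raise ValueError(f"no container named {container_name!r} in task definition")


# Hypothetical task definition, account ID, repository, and image tags.
REGISTRY = "123456789012.dkr.ecr.us-east-1.amazonaws.com"
current = {
    "family": "detector",
    "containerDefinitions": [
        {"name": "detector", "image": f"{REGISTRY}/detector:abc123"},
    ],
}

new_image = f"{REGISTRY}/detector:def456"
revised = with_new_image(current, "detector", new_image)
print(revised["containerDefinitions"][0]["image"])
```

Tagging each image with the commit SHA, as assumed here, keeps every task definition revision traceable to the code that produced it.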

AWS ECR

Elastic Container Registry (ECR) is a managed AWS service for storing versioned Docker images. In this project, ECR stores multiple versions of the Python and Java microservice images.

IAM Roles

To follow security best practices, every service is assigned an IAM role that grants only least-privilege access.
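As an illustration of least privilege, the sketch below builds an IAM policy document that lets a task read objects under one S3 prefix and write under another, and nothing else. The bucket name and prefixes are hypothetical; the article does not list the actual permissions each role receives.

```python
import json


def s3_read_write_policy(bucket, input_prefix, output_prefix):
    """Build an IAM policy allowing reads under one prefix and writes under another."""
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Sid": "ReadInput",
                "Effect": "Allow",
                "Action": ["s3:GetObject"],
                "Resource": f"arn:aws:s3:::{bucket}/{input_prefix}*",
            },
            {
                "Sid": "WriteOutput",
                "Effect": "Allow",
                "Action": ["s3:PutObject"],
                "Resource": f"arn:aws:s3:::{bucket}/{output_prefix}*",
            },
        ],
    }


# Hypothetical bucket and prefixes for the Detector's task role.
policy = s3_read_write_policy("example-transactions-bucket", "cleaned/", "enriched/")
print(json.dumps(policy, indent=2))
```

Scoping each statement to a single action and prefix means a compromised task cannot list, delete, or read data outside its own stage of the workflow.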

Networking

A VPC, subnets, and security groups make up the networking layer of the architecture. A NAT gateway is also configured so that the ECS tasks can run in a private subnet, with outbound internet access but no inbound traffic from the internet.

Finally, an API interface triggers the fraud detection analysis whenever an HTTP request is sent to it. The interface is built with API Gateway, which is integrated directly with ECS with the aid of AWS Cloud Map to balance load across the ECS tasks behind the API Gateway.

Route 53 is used in this architecture to map a custom domain to the API Gateway invoke URL.
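A client could trigger the analysis through the API interface described above with a request along these lines. The domain, the `/detect` route, and the payload fields are hypothetical, since the article does not specify the request format.

```python
import json
import urllib.request


def build_trigger_request(base_url, payload):
    """Create a JSON POST request for the analysis endpoint without sending it."""
    body = json.dumps(payload).encode("utf-8")
    return urllib.request.Request(
        url=f"{base_url.rstrip('/')}/detect",  # hypothetical route
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


trigger = build_trigger_request(
    "https://api.example.com",  # hypothetical custom domain mapped via Route 53
    {"bucket": "example-transactions-bucket", "key": "cleaned/2024-11-01.csv"},
)

# Sending the request needs a deployed endpoint:
# with urllib.request.urlopen(trigger, timeout=30) as response:
#     print(response.status, response.read().decode("utf-8"))
```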

In an era where digital transactions are rapidly increasing, the ability to effectively detect and prevent anomalies and fraud is crucial for financial institutions like WAARON Finances Inc. By leveraging AWS’s robust suite of services, the proposed solution provides a comprehensive and scalable approach to identifying and mitigating fraudulent activities in real time.
