Replacing Traditional Coding techniques with AI Agents

Introduction

I was interested in researching how I could use AI tools and techniques, particularly AI Agents, within my software projects in place of traditional coding techniques.

I built a simple Tic-Tac-Toe program to try this out. When creating a simple game like this, you might initially think of using traditional design strategies: logic-driven rules, if-else conditions, and some clever heuristics.

However, there’s another path: leveraging Artificial Intelligence (AI). In this post, I’ll show you how I built a Tic-Tac-Toe project using a fully connected feed-forward neural network powered by the Deep Q-Network (DQN) algorithm.

Hopefully it’s a good example of how developers can use AI to solve problems that would otherwise require extensive rule-based logic.

All code for this project is available at https://github.com/garry-svg/AI-Agent-TicTacToe

Traditional Coding techniques for Tic-Tac-Toe

In a traditional implementation of Tic-Tac-Toe, the developer explicitly defines the game’s logic. For instance, you might:

  • Use if-else conditions to check for winning combinations.
  • Implement heuristics to determine strategic moves, such as prioritizing the center or blocking an opponent’s winning move.
  • Create exhaustive case-based logic to evaluate all possible board states.
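As a rough illustration of the rule-based style described above, the sketch below checks for a win and picks a move using hand-written priorities. The flat nine-cell list, the None marker for empty cells, and the function names are assumptions made for this example, not code from the project.

```python
# All eight winning lines on a 3x3 board, as index triples into a flat list.
WIN_LINES = [
    (0, 1, 2), (3, 4, 5), (6, 7, 8),  # rows
    (0, 3, 6), (1, 4, 7), (2, 5, 8),  # columns
    (0, 4, 8), (2, 4, 6),             # diagonals
]


def winner(board):
    """Return "X" or "O" if that player has three in a row, else None."""
    for a, b, c in WIN_LINES:
        if board[a] is not None and board[a] == board[b] == board[c]:
            return board[a]
    return None


def heuristic_move(board, me, opponent):
    """Pick a move by fixed rules: win, then block, then centre, then first free cell."""
    free = [i for i, cell in enumerate(board) if cell is None]
    for player in (me, opponent):  # first look for a winning move, then for a block
        for i in free:
            trial = board.copy()
            trial[i] = player
            if winner(trial) == player:
                return i
    if 4 in free:  # prefer the centre square
        return 4
    return free[0] if free else None
```

Every strategic idea has to be spelled out by hand here, which is manageable for nine cells but grows quickly for larger games.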

While this approach works for simple games like Tic-Tac-Toe, it becomes unsuitable for more complex problems with large state spaces. This is where AI can really shine.

AI Approach: AI Agents

In my project, I replaced traditional logic with an AI agent. The key components of this system were:

1. Environment

The environment represents the game’s world—the Tic-Tac-Toe board in this case. Using Python’s Gymnasium library, I defined the board as a grid of nine positions, each of which could hold an X, O, or remain empty. The environment provides the rules of the game:

  • Valid moves.
  • Win, lose, or draw conditions.
  • Rewards for each move.

2. AI Agent

The agent interacts with the environment. It observes the board state and selects actions (moves) to maximize its cumulative reward. The agent is powered by a neural network trained using the DQN algorithm.

3. Neural Network

The neural network is a fully connected feed-forward network:

  • Input Nodes: 9 nodes representing the state of the Tic-Tac-Toe board.
  • Hidden Layers: Layers that process patterns and relationships within the data.
  • Output Nodes: 9 nodes corresponding to the Q-values of possible moves.

Each Q-value represents the expected cumulative reward of making a specific move in the current board state.
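To make the role of the nine outputs concrete, here is a minimal sketch of turning Q-values into a move. It assumes the board is encoded as 0 for empty, 1 for the agent and -1 for the opponent, and that occupied cells are simply skipped. Both are assumptions for illustration, and the project may handle invalid moves differently.

```python
def best_valid_move(q_values, board):
    """Return the index of the empty cell with the highest Q-value, or None if the board is full."""
    valid = [i for i, cell in enumerate(board) if cell == 0]
    if not valid:
        return None
    return max(valid, key=lambda i: q_values[i])
```

For example, if the highest Q-value belongs to an occupied cell, the function falls back to the best-scoring empty one.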

4. DQN Algorithm

The Deep Q-Network algorithm trains the neural network to predict Q-values. Here’s how it works:

  • The agent plays games against itself, exploring different moves.
  • It receives rewards based on the outcomes of its actions (e.g., +10 for winning, -10 for losing, 0 for a draw).
  • The DQN algorithm updates the neural network’s weights to improve its predictions, reinforcing moves that lead to better outcomes.
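The weight update in the last step is driven by a one-step Q-learning target. The sketch below shows that target and the squared error a DQN minimises. The discount factor of 0.99 is an assumed value, and real implementations (including library ones) add a replay buffer and a separate target network, which are omitted here.

```python
def td_target(reward, next_q_values, done, gamma=0.99):
    """Q-learning target: the reward alone if the game ended, else reward + gamma * max Q(next state)."""
    if done:
        return reward
    return reward + gamma * max(next_q_values)


def squared_td_error(predicted_q, target):
    """Loss for a single transition: squared gap between the prediction and the target."""
    return (predicted_q - target) ** 2
```

A winning final move with a reward of +10 has a target of exactly 10, while earlier moves inherit a discounted share of the value of the best follow-up move.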

Steps to Build the Tic-Tac-Toe AI Agent

1. Define the Environment

Using Gymnasium, I implemented a custom environment for Tic-Tac-Toe. The environment included methods like reset() (to initialize the board), step() (to process moves), and render() (to display the board).
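The sketch below shows roughly what such an environment could look like. To stay self-contained it is a plain Python class that only mirrors the shape of Gymnasium's methods (reset() returning an observation and an info dict, step() returning observation, reward, terminated, truncated and info) rather than subclassing the library. The board encoding and the rule that an illegal move ends the game with -10 are assumptions, not details from the project.

```python
class TicTacToeEnv:
    """Plain-Python stand-in that follows the shape of Gymnasium's reset()/step()/render()."""

    WIN_LINES = [(0, 1, 2), (3, 4, 5), (6, 7, 8),
                 (0, 3, 6), (1, 4, 7), (2, 5, 8),
                 (0, 4, 8), (2, 4, 6)]

    def reset(self, seed=None, options=None):
        # Assumed encoding: 0 = empty, 1 = X, -1 = O. X always moves first.
        self.board = [0] * 9
        self.player = 1
        return list(self.board), {}

    def step(self, action):
        # Returns (observation, reward, terminated, truncated, info).
        if self.board[action] != 0:
            # Assumed rule: an illegal move loses the game immediately.
            return list(self.board), -10, True, False, {"invalid": True}
        self.board[action] = self.player
        if any(all(self.board[i] == self.player for i in line)
               for line in self.WIN_LINES):
            return list(self.board), 10, True, False, {}   # current player wins
        if 0 not in self.board:
            return list(self.board), 0, True, False, {}    # draw
        self.player = -self.player                         # other player's turn
        return list(self.board), 0, False, False, {}

    def render(self):
        symbols = {0: ".", 1: "X", -1: "O"}
        rows = [self.board[r * 3:r * 3 + 3] for r in range(3)]
        return "\n".join(" ".join(symbols[c] for c in row) for row in rows)
```

A real Gymnasium environment would additionally declare observation_space and action_space so that a training library knows the input and output sizes.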

2. Set Up the Neural Network

Stable Baselines3 provided the neural network architecture. I didn’t need to design the network from scratch—the library’s default implementation for DQN included a fully connected feed-forward network.

3. Train the Model

I trained the model using the DQN algorithm by running simulations:

  • The agent played 10,000 games against itself.
  • It learned which moves maximized its rewards over time.

[Figure: Training the model]

4. Integrate a User Interface

To make the game interactive, I built a simple web interface using Flask. The interface allowed a human player to compete against the AI agent, showcasing the AI’s ability to make strategic moves.

Why Use AI Agents?

Using AI for Tic-Tac-Toe might seem excessive, but it demonstrates concepts that scale to more complex problems. Unlike traditional logic, AI doesn’t require predefined rules. Instead, it learns optimal strategies through training, making it adaptable to:

  • Games with larger state spaces (e.g., Chess or Go).
  • Dynamic systems where rules evolve.
  • Complex decision-making scenarios.

Key Takeaways

  1. AI Simplifies Complexity: For simple games like Tic-Tac-Toe, traditional methods are sufficient, but AI shines when scaling to more complex environments.
  2. Reusable Frameworks: Libraries like Stable Baselines3 and Gymnasium make it easy to implement and train AI agents.
  3. Hands-On Learning: Building this project deepened my understanding of reinforcement learning, neural networks, and the interplay between agents and environments.

Conclusion

The Tic-Tac-Toe project is a foundational example of how to build an AI agent. It highlights the shift from rule-based programming to learning-based systems, opening doors to more sophisticated AI applications.

Whether you’re tackling games, robotics, or real-world optimization problems, the principles demonstrated here are a great starting point for leveraging AI effectively.

Have you built your own AI agent or explored reinforcement learning? Share your thoughts in the comments below!

What is SEPA?

SEPA stands for Single Euro Payments Area. It was created to simplify international euro transfers between EU member states. It allows you to send and receive payments in euros between bank accounts in different eurozone countries. You can read more about the Single Euro Payments Area from the ECB here.

The Single Euro Payments Area ensures that payments made across the eurozone are as simple as domestic transactions. This promotes economic integration and the mobility of people within the eurozone. A single market for payment services increases competition, thereby reducing the cost of moving money throughout the eurozone.

The primary instrument for making payments in the Single Euro Payments Area is the SEPA Credit Transfer or SCT.

To make an SCT payment, the following must be true:

  • You have the IBAN of the person you want to pay.
  • The bank receiving the payment is a SEPA member.
  • The payment is in euro.
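These preconditions can be checked in code before a payment is submitted. The sketch below validates the IBAN check digits with the standard mod-97 test and then applies the other two conditions. The country set is a deliberately small illustrative sample, and using the IBAN country code as a stand-in for "the receiving bank is a SEPA member" is a simplifying assumption.

```python
# Illustrative sample only; the authoritative list of SEPA countries is published by the EPC.
SEPA_COUNTRIES_SAMPLE = {"DE", "FR", "IE", "ES", "IT", "NL", "BE"}


def iban_checksum_ok(iban):
    """Validate IBAN check digits: move the first four characters to the end,
    map letters to numbers (A=10 ... Z=35) and test that the result mod 97 is 1."""
    s = iban.replace(" ", "").upper()
    if len(s) < 5 or not (s.isascii() and s.isalnum()):
        return False
    rearranged = s[4:] + s[:4]
    digits = "".join(str(int(ch, 36)) for ch in rearranged)
    return int(digits) % 97 == 1


def can_send_sct(iban, currency):
    """Rough pre-check of the three SCT conditions listed above."""
    s = iban.replace(" ", "").upper()
    return (iban_checksum_ok(s)
            and s[:2] in SEPA_COUNTRIES_SAMPLE
            and currency == "EUR")
```

The checksum only catches typing errors. It does not prove that the account exists or that the bank can be reached through the scheme.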

The SEPA payment schemes are managed by the European Payments Council (EPC).

Advantages of SEPA

[Figure: Advantages of SEPA]

Four types of SEPA payment

There are four different types of SEPA payment. All of these payment types have their message definitions defined in the ISO 20022 framework.

SEPA Credit Transfer

Usually used for one-off transfers. Payment service providers (PSPs) move the payment from one bank account to another within the eurozone. For more information, please see here.

SEPA Instant Credit Transfer

Unlike an SCT, an instant credit transfer can move money from one account to another in less than ten seconds.

SEPA Direct Debit Transfer Core

Used for subscription services as well as monthly items like utility bills. These are fundamentally different to credit transfers, as it is the recipient that requests the money transfer from the sender rather than the other way around. For more information, please see here.

SEPA Direct Debit Business-to-Business

Available if you are collecting Direct Debit payments from other businesses.

What is ISO 20022?

ISO 20022 is a common language and model for financial messages across the world.

It covers five financial areas:

  • Payments
  • Securities
  • Trade services
  • Cards
  • FX (foreign exchange)

More information can be found here

A key part of SEPA is its use of the XML-based ISO 20022 messaging standard during payment processing.

The ISO 20022 payments standard will apply to domestic, ACH, high value and cross-border payments and by 2025 it will be the universal standard for high value payment systems of all reserve currencies.

The standard breaks down to the following areas:

  • Account Management (acmt)
  • Cash Management (camt)
  • Payments clearing and settlement (pacs)
  • Payment initiation (pain)
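These four-letter codes form the first part of every ISO 20022 message identifier, such as pain.001.001.03. As a small sketch, the function below splits an identifier into its parts and looks up the business area using the four areas listed above. The function and dictionary names are made up for this example.

```python
import re

# The four business areas named in the list above.
BUSINESS_AREAS = {
    "acmt": "Account Management",
    "camt": "Cash Management",
    "pacs": "Payments clearing and settlement",
    "pain": "Payment initiation",
}

# area.message.variant.version, e.g. pain.001.001.03
MESSAGE_ID = re.compile(r"^([a-z]{4})\.(\d{3})\.(\d{3})\.(\d{2})$")


def parse_message_id(message_id):
    """Split an ISO 20022 message identifier into its four components."""
    match = MESSAGE_ID.match(message_id)
    if match is None:
        raise ValueError(f"not an ISO 20022 message identifier: {message_id!r}")
    area, message, variant, version = match.groups()
    return {
        "area": area,
        "area_name": BUSINESS_AREAS.get(area, "unknown"),
        "message": message,
        "variant": variant,
        "version": version,
    }
```

For instance, parse_message_id("pacs.008.001.02") reports the area as "Payments clearing and settlement".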

Over 70 countries have adopted the standard including Japan, China, and India. Currently over 200 payment types are supported. This facilitates harmonisation between different payment methods and systems around the world.

Messages are available within the standard for the complete end-to-end payments chain: customer to bank (payment), bank to bank (payment clearing and settlement) and reporting (cash management).

Advantages of ISO 20022

  • Improved straight-through processing (STP) rates through the use of a common language and format among payment systems.
  • Facilitates better analytics. This leads to better anti-money laundering (AML) controls and more effective claims and investigations.
  • Richer data should lead to an improved understanding of customer needs and to new sources of revenue such as Request to Pay.
  • Supports a much larger character set than that of MT messages. This is very important in countries such as China.
  • Greater protection against money laundering and other financial crimes.

What is its relationship to SEPA?

All banks operating within the Single Euro Payments Area must adhere to the SEPA payment standard, which conforms to the ISO 20022 standard. All SEPA payment messages are compliant with ISO 20022. However, SEPA payment messages are more restrictive in their business rules than their ISO 20022 counterparts.

What is the difference between ISO 15022 and ISO 20022?

ISO 15022 is an ISO standard for securities messaging used in transactions between financial institutions across the SWIFT network. ISO 20022 will replace the ISO 15022 standard.

How is the ISO 20022 standard defined?

The financial industry uses different terms and different message formats to describe payment processes. This creates barriers to payment integration. ISO 20022 aims to define common payment business processes and consistent message standards so that financial institutions have a common understanding of the payment information they exchange. The ISO 20022 standard is defined in three layers:

  • Business Concepts
  • Logical models
  • Syntax

Business Concepts

The ISO 20022 standard is defined using a business model which describes the roles, actors and processes that make up the flow of payment information. This business model will define concepts such as debtor, creditor, debtor agent, creditor agent, credit transfer and direct debit.

Logical Models

The logical model describes all of the parts needed to perform a business activity such as a credit transfer. It is independent of syntax. The following diagram shows a simplified data model for a subset of the CreditTransferTransactionInformation.

ISO 20022 logical model

Syntax

To generate the physical syntax, we use the logical model. The physical syntax of ISO 20022 is XML. An example of XML syntax within ISO 20022 is the pain.001 XML message.
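To give a feel for the XML syntax, the sketch below builds a heavily simplified pain.001-style fragment with Python's standard library. The element names (Dbtr, DbtrAcct, CdtTrfTxInf, InstdAmt and so on) follow the pain.001.001.03 message, but required parts such as the group header are left out, so the output is illustrative and would not validate against the schema. The function names and sample values are assumptions.

```python
import xml.etree.ElementTree as ET

NS = "urn:iso:std:iso:20022:tech:xsd:pain.001.001.03"


def _add_party(parent, party_tag, account_tag, name, iban):
    """Append a party (name) and its account (IBAN) under the given parent element."""
    ET.SubElement(ET.SubElement(parent, party_tag), "Nm").text = name
    account_id = ET.SubElement(ET.SubElement(parent, account_tag), "Id")
    ET.SubElement(account_id, "IBAN").text = iban


def build_credit_transfer(debtor, debtor_iban, creditor, creditor_iban, amount_eur):
    """Return a simplified, non-schema-valid pain.001-style document as a string."""
    doc = ET.Element("Document", xmlns=NS)
    initiation = ET.SubElement(doc, "CstmrCdtTrfInitn")
    payment = ET.SubElement(initiation, "PmtInf")
    _add_party(payment, "Dbtr", "DbtrAcct", debtor, debtor_iban)

    # One credit transfer transaction: the amount plus the creditor details.
    tx = ET.SubElement(payment, "CdtTrfTxInf")
    amount = ET.SubElement(ET.SubElement(tx, "Amt"), "InstdAmt", Ccy="EUR")
    amount.text = f"{amount_eur:.2f}"
    _add_party(tx, "Cdtr", "CdtrAcct", creditor, creditor_iban)
    return ET.tostring(doc, encoding="unicode")
```

The debtor and creditor elements correspond directly to the business concepts described earlier, which is the point of deriving the syntax from the logical model.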

The dictionary defines all the terms used in ISO 20022. This removes ambiguity between the different terms used in the payments industry.

Ripple and ISO 20022

Ripple is a technology company that leverages decentralised blockchain technology to facilitate cheaper and faster cross-border payments. ISO 20022 is the common language and model for financial messages being exchanged across the world. This blog post discusses what Ripple is and its adoption of ISO 20022.

What is Ripple?

Ripple is a real-time gross settlement (RTGS) system, currency exchange and remittance network that is available to financial institutions. It uses cryptocurrency to power cross-border transactions.

Ripple was designed to be a replacement for the SWIFT network. It facilitates the transfer of a number of fiat currencies and cryptocurrencies between financial institutions for a small amount of XRP. This fee is small in comparison to the amount charged by banks for cross-border payments.

What is XRP?

XRP is the cryptocurrency native to RippleNet, the Ripple platform. It uses a decentralised blockchain known as the XRP Ledger (XRPL). XRP uses the Ripple transaction protocol (RTXP) to process transactions.

Distributed ledger technologies (DLT) have the potential to reduce costs of transactions. They also speed up the transaction processing because they remove the need for a central authority such as central banks or financial networks like SWIFT.

RippleNet has a facility called On-Demand Liquidity (ODL), which buys XRP on the market when needed and sells it when received, effectively eliminating volatility by completing both transactions within seconds.

Ripple and ISO 20022

Ripple is part of the ISO 20022 standards body with a particular focus on distributed ledger technology. Its membership will allow customers to use RippleNet to access a network of global financial institutions and connect to one standardized API for all counterparty transactions.

Ripple can now provide RippleNet at a greater scale by conforming to the same standard that SWIFT and its members will use. SWIFT members will integrate more easily with RippleNet as they will adhere to the same message standard.

The use of distributed ledger technology within ISO 20022 will allow faster transaction processing and cross-border payments via a standardised API. ISO 20022 stands to become the global payments language and Ripple will benefit from this.

In a sense Ripple and ISO 20022 are quite complementary; both hope to make cross-border payments cheap, easy and automatic.

Who are SWIFT?

Overview

SWIFT is an acronym for the Society for Worldwide Interbank Financial Telecommunication. Based in Belgium, it is a cooperative society that runs a vast network allowing banks and financial institutions to transfer money and securities securely.

It is important to note that SWIFT is neither a settlement network nor a clearing network. It is a messaging system that sends global payment orders to be processed by a clearing or settlement system.

SWIFTNet Network

SWIFTNet is the private, secure network, owned by SWIFT, that banks and financial institutions use to communicate.

It provides a number of communication protocols which SWIFTNet members can avail of depending on their use-case.

  • InterAct – used to exchange XML financial messages such as MX and ISO 20022. It is suitable when financial messages are processed in near real time and an RTGS is involved.
  • Fin – used to exchange MT and ISO 15022 message formats. Fin is a store and forward messaging system.
  • FileAct – used to exchange financial messages in batch. Financial messages are queued and then batch processed at a certain time. FileAct is not time sensitive.
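The choice between the three services can be summarised as a small decision function. This is only a sketch of the rules of thumb in the list above. The function name and its parameters are invented for illustration and do not correspond to any SWIFT API.

```python
def choose_swiftnet_service(message_format, batch=False):
    """Map a message format and delivery style to the service described above."""
    if batch:
        return "FileAct"   # queued messages, processed in bulk, not time sensitive
    if message_format in {"MX", "ISO 20022"}:
        return "InterAct"  # XML messages processed in near real time
    if message_format in {"MT", "ISO 15022"}:
        return "Fin"       # store-and-forward messaging
    raise ValueError(f"unknown message format: {message_format!r}")
```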

Financial Messaging Standards

SWIFT provides the standards for the syntax governing the structure of financial messaging used in the network.

Two of the most important standards are ISO 15022, used for securities settlement and asset servicing messaging, and ISO 20022, used as a common standard for payment messaging worldwide.

SWIFT Interfaces, Software and Services

SWIFT provides a number of interfaces that members can integrate with in order to transmit their financial messages:

  • SWIFTNet Link – API to SWIFTNet. Used by all members to connect to SWIFTNet.
  • SWIFTAlliance Gateway (SAG) – Software service that connects to SWIFTNet. Facilitates the integration of other financial applications with SWIFTNet.
  • Alliance Access (SAA) / Alliance Messaging Hub (AMH) – Messaging software which processes all SWIFT message flows, such as Fin, FileAct, InterAct etc.

What are High and Low Value Payment Systems?

Overview

Banks generally separate their payment systems into high and low value streams:

  • High value payments (inter-bank payments)
  • Low value payments (retail payments).

High value payments are settled instantly. Low value payments are batched and settled generally at end of day.

high value/low value payment systems

High Value Payment Systems (HVPS)

HVPS transfer large value inter-bank transactions. Transactions settle instantly, in real time. HVPS are Real-Time Gross Settlement (RTGS) systems.

Generally, HVPS do not have an upper limit on the monetary value of transactions allowed. Central banks have oversight of the HVPS in individual countries, ensuring stability and reliability. HVPS also use robust security to guard against fraud, hacking and other malicious behaviour.

Examples

  • Fedwire (United States)
  • RTGS (India)
  • RITS (Australia)
  • MEPS+ (Singapore)
  • TARGET2 (EU)
  • CHAPS (UK)

Low Value Payment Systems (LVPS)

LVPS are deferred settlement systems. Payments are batched and settled at a certain point in the day. Transaction amounts are on average low in comparison to high value payments.

Low value payments constitute the vast majority of payments. Individually they pose no systemic risk and it is inefficient and expensive to process them on a real-time basis. They settle in batches at the end of the day.
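The sketch below illustrates why batching is cheaper, assuming the batch is settled on a net basis, which is common for deferred settlement systems but is an assumption here. Amounts are in cents, and the bank names and function name are invented for the example.

```python
from collections import defaultdict


def net_positions(payments):
    """Collapse a batch of (payer, payee, amount) payments into one net position per bank.

    A negative position means the bank owes money at settlement time,
    a positive position means it is owed money.
    """
    positions = defaultdict(int)
    for payer, payee, amount in payments:
        positions[payer] -= amount
        positions[payee] += amount
    return dict(positions)
```

Three payments between banks A, B and C, for example, reduce to a single debit or credit per bank at the end of the day, and the positions always sum to zero.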

LVPS are also known as batch EFT (electronic funds transfer) or ACH (Automated Clearing House) payment systems.

SWIFT supports over 25 low-value payment systems.

Examples

  • CHIPS (United States)
  • NEFT (India)
  • BECS (Australia)
  • SEPA (EU)

What is Gross Settlement?

(Real-Time) Gross Settlement (RTGS) systems are settlement systems that transfer funds instantly. They are used for high-value inter-bank payments that need to be cleared immediately.

[Figure: Payment settled through RTGS]

Examples of RTGS settlement systems are TARGET2, used in the Eurozone, the Fedwire Funds Service, used in the United States, and CHAPS, used in the UK. RTGS systems are run by central banks, e.g. CHAPS is run by the Bank of England, TARGET2 by the European Central Bank and Fedwire by the Federal Reserve.

An RTGS transaction means that:

  • Each transaction gets settled in real-time by the central bank. An RTGS transaction is not subject to any wait time.
  • Gross settlement means that the transaction is settled on a one-to-one basis and is not bundled with any other transactions.
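The two properties above can be sketched as a loop that settles each payment on its own, against the payer's current balance, as soon as it arrives. Rejecting a payment when the payer lacks funds is a simplification assumed for this example, since real RTGS systems typically queue such payments. The account names and amounts are illustrative.

```python
def settle_gross(balances, payments):
    """Settle (payer, payee, amount) payments one by one, updating balances in place.

    Returns two lists: the payments that settled and those that were rejected
    because the payer's balance was too low at that moment.
    """
    settled, rejected = [], []
    for payer, payee, amount in payments:
        if balances.get(payer, 0) >= amount:
            balances[payer] -= amount
            balances[payee] = balances.get(payee, 0) + amount
            settled.append((payer, payee, amount))
        else:
            rejected.append((payer, payee, amount))
    return settled, rejected
```

Unlike batch settlement, nothing is offset against later payments: each transfer either completes in full at once or does not happen.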

Advantages of (Real-Time) Gross Settlement (RTGS)

  • Accurate Cash Flow – As an RTGS transaction is settled immediately, customers know exactly when their accounts are debited/credited, thereby ensuring more predictable cash flows.
  • Reduced risk – RTGS transactions are settled immediately meaning that there is less risk for the receiving party. Funds will already have been credited to the receiving party by the time the payment instruction arrives.

What is CHAPS?

CHAPS is an acronym for Clearing House Automated Payment System. It is a high-value payment system, operated by the Bank of England, providing efficient, settlement risk-free and irrevocable payments in Sterling. It was first introduced in 1984.

CHAPS is an RTGS where settlement is risk-free because each payment is settled individually, in real time. It is primarily used for the same-day settlement of high-value wholesale payments as well as time-critical, high value retail payments like house purchases.

In 2019 it processed over 35 million high value payments with a total monetary value of 75 trillion GBP.

Who uses CHAPS?

There are over 35 financial institutions, known as direct participants, who make payments over CHAPS. Direct participants include high street banks, financial market infrastructures and challenger banks.

Several thousand other banks make payments through direct participants (correspondent banking).

How does CHAPS work?

[Figure: How does CHAPS work?]

What are the benefits?

  • Supports secure high value, same day payments from payment service providers to their customers.
  • Payments are highly secure, using the SWIFT payment infrastructure together with the Bank of England’s RTGS system. There are no upper or lower payment limits.
  • Among direct participants, the liquidity requirements eliminate settlement risk.
  • The system processes and settles transactions on the same day, thus ensuring that the beneficiary receives payments on the same day as the payment initiation.
  • Allows direct transfers between direct participants and financial institutions, eliminating the need for intermediaries.

What is TARGET2?

TARGET2 (Trans-European Automated Real-time Gross Settlement Express Transfer System) is the RTGS owned and operated by the Eurosystem. The Eurosystem is the monetary authority of the Eurozone; it consists of the European Central Bank (ECB) and the national central banks of the Eurozone countries.

TARGET2 is used by central banks and commercial banks to process euro payments in real-time enabling the free flow of money across the Eurozone. It is a payment system that enables EU banks to transfer money between each other in real time.

It’s mandatory to use TARGET2 when settling euro payments that involve the Eurosystem. It was introduced in 2007, replacing the older TARGET (Trans-European Automated Real-time Gross Settlement System) system. In 2020, it processed over 92 billion transactions with a total value of approximately €152 trillion.

How does it work?

TARGET2 Overview

Who uses TARGET2?

It is used by central banks and commercial banks.

  • Central banks will use it to manage liquidity and facilitate the flow of funds between financial institutions.
  • Commercial banks within the eurozone will use it to settle large value transactions. A full list of participants can be found here