Reverse engineering banking apps, an introduction to alternative account access

PSD2 is now a reality. Following its introduction, most Fintech companies now use banks' dedicated interfaces to connect with customers' payment accounts. But that doesn’t mean that pre-PSD2 techniques are no longer in use, especially in other industries. When we started Ibanity, we relied only on reverse engineering. Others were using web scraping.

In this article, we will try to shed some light on both techniques and provide more details about reverse engineering. We used this technique until the first half of 2020 on the banks that still didn’t have a compliant API back then. We have since sunset all our reverse engineered connectors.

Web scraping versus reverse engineering, what is the difference?

The two techniques we are going to cover are direct access to the web interface (web scraping) and direct access to the mobile interface of the PSU, the payment service user (reverse engineering). By direct access to the web interface, we mean interpreting the HTML and JavaScript code returned by the bank’s servers. By direct access to the mobile interface, we mean understanding how a private API or dedicated mobile interface works in order to reuse it directly for interoperability with that system’s servers.

Web scraping

What the bank’s server generates (like any other website in the world) is HTML, most probably with a little JavaScript for dynamic elements. Your web browser interprets both and renders an interface the user can point and click on. It’s literally a set of instructions that your web browser must interpret to show you a webpage. This is how all websites work.

We can catch this traffic in two distinct ways:

  1. We can intercept the HTML and JavaScript before it is rendered by the web browser and simply extract the information we need from it. This is similar to reading a text file. Then, we can show this information in our own application.
  2. We can simulate a web browser and use code to automate the browser interaction to get the information we need. We are programming a web browser.
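
As a minimal sketch of the first approach, the snippet below reads account rows out of an HTML fragment using only Python's standard library. The markup, the class names ("iban", "balance") and the sample values are invented for illustration; a real bank page would be structured differently.

```python
from html.parser import HTMLParser

# Hypothetical page fragment, made up for this example.
SAMPLE_HTML = """
<html><body>
  <table id="accounts">
    <tr><td class="iban">BE68 5390 0754 7034</td><td class="balance">1,234.56</td></tr>
    <tr><td class="iban">BE00 0000 0000 0000</td><td class="balance">78.90</td></tr>
  </table>
</body></html>
"""


class AccountTableParser(HTMLParser):
    """Collects the text of <td> cells whose class is 'iban' or 'balance'."""

    def __init__(self):
        super().__init__()
        self._field = None   # class of the cell currently being read, if any
        self._row = {}       # cells collected for the current <tr>
        self.accounts = []   # one dict per table row

    def handle_starttag(self, tag, attrs):
        if tag == "td":
            css_class = dict(attrs).get("class")
            if css_class in ("iban", "balance"):
                self._field = css_class

    def handle_data(self, data):
        # Only keep text that sits inside a cell we are interested in.
        if self._field:
            self._row[self._field] = self._row.get(self._field, "") + data.strip()

    def handle_endtag(self, tag):
        if tag == "td":
            self._field = None
        elif tag == "tr" and self._row:
            self.accounts.append(self._row)
            self._row = {}


def extract_accounts(html):
    """Return a list of {'iban': ..., 'balance': ...} dicts found in the HTML."""
    parser = AccountTableParser()
    parser.feed(html)
    parser.close()
    return parser.accounts


if __name__ == "__main__":
    for account in extract_accounts(SAMPLE_HTML):
        print(account["iban"], account["balance"])
```

The second approach, driving a real browser from code, follows the same idea but needs a browser automation tool instead of a plain HTML parser.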

The big advantage of such techniques is that they do not require very advanced development skills. For example, they are heavily used in some industries to aggregate data from travel websites or to find the best airline ticket prices.

However, this technique has a problem. If your bank requires more than a simple login and password to access your account, such as a card reader, you will not be able to access the bank account in question in an unattended way. In other words, access is not possible without an explicit action from the end-user.

In some countries prior to PSD2, this was the easiest way to connect to your users' bank account data. Several well-known incumbent Fintech companies built their business on this technique. However, if you were in a stricter country, you had to use another technique to build services that could access your users' accounts in an unattended way.

Understanding reverse engineering of mobile APIs

Most banks have mobile applications. When a customer installs one for the first time, the bank asks for the same types of credentials as on its web interface, for example using the card reader or the Digipass. Afterwards, it offers other means of authentication that no longer require those physical devices. Such means of authentication can be based on fingerprints, Face ID, or a PIN code, among others.

A bank's mobile app usually works differently from its web interface. If we could figure out how it is done in the bank's mobile app, we could use the same functionality, but from a different piece of software, connecting to the same “pipes” and sending the same valid information through them.

The act of trying to figure out how these pipes work is called “reverse engineering”. We want to understand the XML or JSON API that the bank has exposed on the internet, which is the interface its mobile app uses and the one we want to access directly.

These APIs are not the same as the ones the banks are building to comply with PSD2. We are talking here about so-called "private APIs" specifically designed for their mobile applications. These APIs existed long before PSD2 became a thing. You might wonder why these were not the APIs that were opened to TPPs, but that’s a conversation for another time.

Similar to web scraping, we are trying to intercept the information before it reaches the final interface (the web browser or the phone) to:

  • Interpret that information, and
  • Understand how to interact with the bank’s servers directly
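
To make the "interpret that information" step concrete, here is a small sketch that turns a captured JSON response body into typed records. The payload shape and field names ("accounts", "iban", "currency", "balance") are assumptions made up for this example; every private API has its own format that has to be discovered.

```python
import json
from dataclasses import dataclass
from decimal import Decimal

# Hypothetical response body; the structure is invented for illustration.
SAMPLE_RESPONSE = """
{"accounts": [
  {"iban": "BE68539007547034", "currency": "EUR", "balance": "1234.56"},
  {"iban": "BE00000000000000", "currency": "EUR", "balance": "78.90"}
]}
"""


@dataclass(frozen=True)
class Account:
    iban: str
    currency: str
    balance: Decimal  # Decimal avoids float rounding on monetary amounts


def parse_accounts(body: str) -> list[Account]:
    """Map the raw JSON payload onto Account records."""
    payload = json.loads(body)
    return [
        Account(
            iban=item["iban"],
            currency=item["currency"],
            balance=Decimal(item["balance"]),
        )
        for item in payload["accounts"]
    ]


if __name__ == "__main__":
    for account in parse_accounts(SAMPLE_RESPONSE):
        print(account)
```

Writing such a mapping for each endpoint is, in effect, a way of documenting the API one response at a time.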

The issue is that the mobile application is a binary file, an executable that runs on a mobile phone, so it is not in plain sight like the HTML in web scraping. In other words, the only thing we have is the mobile app binary file, a black box, and we are trying to figure out what the server expects from that black box. That way, we can implement our own version of it.

Banks' mobile applications allow initiation of payments as well as storing sensitive customer banking data. Imagine you are on a public Wi-Fi and your mobile banking app is not protected at all. Any hacker with basic knowledge who connected to the same public Wi-Fi and performed a few manipulations could see all the traffic between the bank's servers and the mobile application. Let’s say you initiate a payment and the hackers see that request. They could replicate that request, change the destination IBAN and then replay it. To protect their customers, banks must put in place a few critical protection measures on their applications.

The goal of such protection measures is to prevent any hacker from seeing account information or initiating a payment, provided that the hacker has no physical access to the device itself. There is indeed no possible protection for a mobile application if the hacker has physical access to the customer's device and knows that customer's personal bank credentials. None! Cracking it is just a matter of time. For example, the Pokémon Go case is an interesting story about this.

The following are a few examples of protection techniques. Please note, however, that this list is not exhaustive. 

  • Code obfuscation: prevents easy decompilation of the binary code by making the decompiled code difficult to read or impossible to reassemble.
  • Certificate pinning: code embedded in the app makes sure that only verified certificates can be used for communication between the mobile application and the servers.
  • Payload encryption: even over a secured connection, the server does not send the information in plain text but encrypts it first, which means the mobile application holds a key to decrypt it.
  • One-time passwords: when opening a session, the mobile application and the server both run the same algorithm to produce a password that is valid for only one connection.
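
To illustrate the one-time password idea, here is a minimal sketch using HOTP, the public algorithm standardized in RFC 4226. This is an assumption chosen for illustration: banks may well use proprietary schemes, and nothing here describes any specific bank's implementation. The point is only that both sides derive the same short-lived code from a shared secret and a counter.

```python
import hashlib
import hmac
import struct


def hotp(secret: bytes, counter: int, digits: int = 6) -> str:
    """Compute an HOTP value as defined in RFC 4226."""
    # HMAC-SHA-1 over the counter encoded as an 8-byte big-endian integer.
    mac = hmac.new(secret, struct.pack(">Q", counter), hashlib.sha1).digest()
    # Dynamic truncation: the low 4 bits of the last byte select an offset.
    offset = mac[-1] & 0x0F
    # Take 4 bytes at that offset and clear the top bit to get 31 bits.
    code = struct.unpack(">I", mac[offset:offset + 4])[0] & 0x7FFFFFFF
    return str(code % 10 ** digits).zfill(digits)


def server_accepts(secret: bytes, counter: int, submitted: str) -> bool:
    """Server-side check: recompute the code and compare in constant time."""
    return hmac.compare_digest(hotp(secret, counter), submitted)


if __name__ == "__main__":
    shared_secret = b"12345678901234567890"  # test secret from RFC 4226
    client_code = hotp(shared_secret, counter=0)
    print(client_code, server_accepts(shared_secret, 0, client_code))
```

Because the counter changes on every use, a code captured on the network is useless for a later session, which is exactly the replay scenario described earlier.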

What we need is to understand the API that the mobile application is using. Once we can write the documentation of that API, we can substitute ourselves for the bank's mobile application.

Reverse engineering a bank’s mobile API

Most banks have implemented at least the protection measures listed above, which means their customers should be safe regardless of where they connect from. It also means the only way to get access to that mobile API is to be a lawful customer of that bank. Basically, this means having an account at that bank and access to the bank’s mobile application.

The objective is to be able to replicate the onboarding of a new “phone” with the bank's mobile application. Indeed, the way most bank mobile APIs work is that, as a customer, you need to “add a new phone” that is allowed to access your accounts. You can then activate an authentication method such as fingerprint or Face ID. We want to do the same, except we will not use an actual physical phone for each customer.

To the bank, there will be no way to see the difference unless we include a specific word in the device name that is sent or provide additional information when sending our requests to their mobile API. We use a generated device identifier and a valid device name, which is the same for all iPhone 13s or all Google Nexus devices, for example.

In other words, we make the bank believe that the user has a new phone, when in fact it is our own code and not a physical device at all. We must also store every token or key received from the bank during that onboarding to be able to log in again when needed.
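
The bookkeeping this implies can be sketched as follows. The class name, its fields and the credential names are hypothetical, chosen only to show a generated device identifier kept together with whatever the onboarding returns; a real system would also encrypt these secrets at rest rather than keep them as plain JSON.

```python
import json
import uuid
from dataclasses import asdict, dataclass, field


@dataclass
class VirtualDevice:
    """State kept for one onboarded 'phone' that exists only in software."""

    device_name: str
    # Generated identifier standing in for a physical phone's identifier.
    device_id: str = field(default_factory=lambda: str(uuid.uuid4()))
    # Tokens or keys received during onboarding, keyed by a name we choose.
    credentials: dict = field(default_factory=dict)

    def remember(self, name: str, value: str) -> None:
        """Store a token or key so a later login can reuse it."""
        self.credentials[name] = value

    def to_json(self) -> str:
        return json.dumps(asdict(self))

    @classmethod
    def from_json(cls, raw: str) -> "VirtualDevice":
        return cls(**json.loads(raw))


if __name__ == "__main__":
    device = VirtualDevice(device_name="iPhone")
    device.remember("session_key", "example-value")  # placeholder value
    restored = VirtualDevice.from_json(device.to_json())
    print(restored.device_id == device.device_id)
```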

Now, we need to remove all the roadblocks preventing us from doing that enrolment by:

  • First, seeing the traffic exchanged between the servers of the bank and the mobile application.
  • Second, understanding how the one-time password (OTP) is generated inside the mobile application so we can replicate it for both enrolment and login.

Once this is done, we can monitor the traffic by performing what is called a Man-in-the-Middle (MITM) attack on our own phone, running a modified version of the application binary on a real physical device. A Man-in-the-Middle attack is a network attack in which a third-party computer routes the traffic between two computers through itself. That way, the traffic can be recorded and replayed.

Is this legal?

This post is certainly not legal advice, so everyone is required to do their homework in that respect. This technique was mostly used at a time when banks didn’t provide dedicated interfaces, which has no longer been the case since 2019. Some Fintech companies are most certainly still using it for banking services not covered by PSD2, though. We strongly advise you to do your own legal assessment before attempting any of this.

Conclusion

Reverse engineering, and to some extent web scraping, were much more popular before PSD2. They had the advantage of relying on interfaces that the bank itself was using, which made them less prone to major changes, since banks try to provide a stable service to their customers.

With the arrival of PSD2, most players have switched to PSD2 interfaces for payment accounts. This introduced a big architecture change for existing players, and an opportunity for new entrants.

These techniques, and reverse engineering in particular, may seem scary at first, but they are still used in several different industries to connect systems together or guarantee interoperability of formats. They are, in some cases, the only way to connect legacy systems or unsupported devices together.