Saturday 13 April 2024

The Web of Trust

 


Every day all of us type a web address into a browser or click on a link provided by a search engine and interact with the websites that are presented.

Whilst we should always be vigilant, on many occasions we will simply trust that the site we are interacting with is genuine and the data flowing between us and it is secure.

Our browsers do a good job of making sure our surfing is safe, but how exactly is that being achieved? How do we create trust between a website and its users?

SSL/TLS

Netscape first attempted to solve this trust problem by introducing the Secure Sockets Layer (SSL) protocol in the early 90s. Initial versions of the protocol had many flaws, but by the release of SSLv3.0 in 1996 it had matured into a technology able to provide a mechanism for trust on the web.

As SSL became a foundational part of the web, and because security-related protocols must constantly evolve to remain safe, the Internet Engineering Task Force (IETF) developed Transport Layer Security (TLS) in 1999 as an enhancement to SSLv3.0.

TLS has continued to be developed with TLSv1.3 being released in 2018. 

Its primary purpose is to ensure data being exchanged by a server and a client is secured, but also to establish a level of trust such that the two parties can be sure who they are exchanging the data with.

Creating this functionality relies on a few different elements.

Public Key Encryption

Public key encryption is a form of asymmetric encryption that uses a pair of related keys deemed public and private.

The mathematics behind this relationship between the keys is too complex to go into in this post, but the functionality it provides is based on the fact that the public key can be used to encrypt data that only the private key can decrypt.

This means the public key can be freely distributed and used to encrypt data that only the holder of the private key can decrypt.

The keys can also be used to produce and verify digital signatures. This involves the holder of some data using a mathematical process to "sign" this data using their private key.

The receiver of the data can use the public key to verify the signature and therefore prove that the data came from someone who has the corresponding private key.
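To make this a little more concrete, here is a minimal sketch of both operations, encryption and signing, using Python and the third-party cryptography package with an RSA key pair. The message and the padding choices are purely illustrative.

    from cryptography.hazmat.primitives import hashes
    from cryptography.hazmat.primitives.asymmetric import rsa, padding

    private_key = rsa.generate_private_key(public_exponent=65537, key_size=2048)
    public_key = private_key.public_key()

    # Anyone holding the public key can encrypt; only the private key can decrypt.
    oaep = padding.OAEP(mgf=padding.MGF1(algorithm=hashes.SHA256()),
                        algorithm=hashes.SHA256(), label=None)
    ciphertext = public_key.encrypt(b"a secret message", oaep)
    assert private_key.decrypt(ciphertext, oaep) == b"a secret message"

    # The private key signs; anyone with the public key can verify the signature.
    document = b"data whose origin we want to prove"
    pss = padding.PSS(mgf=padding.MGF1(hashes.SHA256()),
                      salt_length=padding.PSS.MAX_LENGTH)
    signature = private_key.sign(document, pss, hashes.SHA256())
    # verify() raises InvalidSignature if the data or signature has been tampered with.
    public_key.verify(signature, document, pss, hashes.SHA256())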

Public Key Infrastructure (PKI)

Public Key Infrastructure (PKI) builds on top of the functionality provided by public key encryption to provide a system for establishing trust between client and server.

This is achieved via the issuance of digital certificates from a Certificate Authority (CA).

The CA is at the heart of the trust relationship of the web. When two parties, the client and server, are trying to form a trust relationship they delegate to a third party that they both already trust: the CA.

The CA establishes the identity of the organisation the client will interact with via offline means and issues a digital certificate. This certificate records the identity of the organisation and its public key, and is signed by the CA to prove it was the one that issued the certificate.

When a client receives the certificate from the server it can use the CA's public key to verify the signature and therefore trust the data in the certificate.

It's possible to have various levels of CAs that may delegate trust to other CAs, deemed intermediate CAs. But all certificates should ultimately be traceable back to a so-called Root CA that all parties on the web have agreed to trust and whose public keys are available to all participants.

Certificates and Handshakes

All of the systems previously described are combined whenever we visit a web site to establish trust and security.
  • A user types a web address into the browser or clicks a link provided by a search engine.
  • The user's browser issues a request to the web site to establish a secure connection.
  • The server responds by sending the browser its certificate.
  • The browser validates the certificate's authenticity by verifying its signature, chaining back to a Root CA whose public key is pre-installed on the user's machine.
  • Once the certificate is validated, the browser creates a symmetric encryption key that will be used to secure future communication between the browser and the web site. It encrypts the symmetric key using the server's public key and sends it to the server (newer TLS versions instead agree the key via a key-exchange algorithm, but the outcome is the same: a shared secret key).
  • The user's browser has now established the identity of the web site, based on the data contained in its validated certificate, and both parties now have a shared symmetric key that can be used to secure the rest of their communication in the session. A minimal sketch of this exchange from the client's side follows below.
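As a rough illustration of the client's side of this process, the sketch below uses Python's standard ssl module, which performs the certificate validation and key negotiation described above against the root CAs installed on the machine. The host name is just an example.

    import socket
    import ssl

    hostname = "example.com"  # any TLS-enabled site

    # Loads the Root CA certificates pre-installed on the machine.
    context = ssl.create_default_context()

    with socket.create_connection((hostname, 443)) as sock:
        # wrap_socket performs the handshake: certificate validation and
        # negotiation of the protocol version and shared session keys.
        with context.wrap_socket(sock, server_hostname=hostname) as tls:
            print(tls.version())            # e.g. 'TLSv1.3'
            cert = tls.getpeercert()
            print(cert["subject"])          # identity asserted by the certificate
            print(cert["issuer"])           # the CA that signed it
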
There are certain pieces of functionality that are fundamental to allowing the web to operate in the way it does.

Without the functionality provided by SSL/TLS it wouldn't be possible to use the web as freely as we do whilst also trusting that we can do so in a safe and secure manner.   

Monday 1 April 2024

Imagining the Worst

 


In the modern technological landscape the list of possible security threats can seem endless. The breadth of potential attackers and potential vectors for their attacks has never been so large. Does this mean we are all just helpless, waiting for an attack and its terrible consequences to befall us?

One way to be proactive in the face of these dangers is to try and anticipate what form these threats might take, what damage they could do and what countermeasures it might be possible to take.

Threat modelling is a technique for enumerating the threats a system might face, identifying whether or not safeguards might exist and analysing the consequences of these attacks succeeding. 

To help developers and engineers with the threat modelling process Microsoft developed the STRIDE mnemonic in 1999 to serve as a checklist of things for teams to consider when analysing the potential impact of threats to their system.

STRIDE

The STRIDE mnemonic attempts to categorise potential threats in terms of the impact they may have. This allows teams to analyse whether any part of a system may be susceptible, and if so how this might be mitigated.

Spoofing is the process of falsely identifying yourself within a system. This might be by using stolen user credentials, leaked access tokens or cookies, or any other form of session hijacking.

Tampering involves the malicious manipulation of data either at rest, for example altering data within a database, or in transit, for example by acting as a man in the middle.

Repudiation relates to an attacker being able to cover their tracks by exploiting any lack of logging or ability to trace actions within a system. This might also include an attacker having the ability to falsify an audit trail to hide malicious activity.

Information Disclosure occurs when information is available to users who shouldn't be able to view it. This might cover a system returning database records a user has no entitlement to view, or the ability of an attacker to intercept data in transit, again for example by acting as a man in the middle.

Denial of Service is any attack that denies users the ability to legitimately use a system; the most common form is to overwhelm a system with requests or otherwise cause it to become unresponsive or unusable.

Elevation of Privilege occurs when an attacker is able to elevate their permissions within a system under attack. Normally this would mean obtaining administrator privileges or otherwise penetrating a network sufficiently to be trusted more than a normal external user.

Threat Analysis

Many tools and processes exist for implementing threat modelling, but most revolve around a team of system experts brainstorming potential threats that a system or sub-system might be susceptible to.

This involves using analysis helpers such as STRIDE to put yourself in the mindset of an attacker. For example you may assess whether an authentication system could be exploited via spoofing. The answer might be no because of certain mitigations, or yes because of certain flaws.

When applying this style of analysis to all the aspects of STRIDE it is unlikely that you will find the system is completely protected against all possible attacks. Instead you're looking to demonstrate that it is adequately protected given the likelihood of an attack being successful and the benefit that would be gained by an attacker if it were.
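If it helps to picture the output of such a session, the tiny sketch below shows one way a team might record findings against the STRIDE categories. The components, threats and mitigations are entirely made up for illustration.

    # Hypothetical record of a STRIDE brainstorming session for a couple of components.
    STRIDE = ["Spoofing", "Tampering", "Repudiation",
              "Information Disclosure", "Denial of Service", "Elevation of Privilege"]

    findings = {
        ("login service", "Spoofing"): "Mitigated: MFA and account lockout",
        ("login service", "Repudiation"): "Gap: failed logins are not logged centrally",
        ("orders API", "Information Disclosure"): "Mitigated: TLS in transit, encryption at rest",
    }

    for component in sorted({c for c, _ in findings}):
        for category in STRIDE:
            status = findings.get((component, category), "Not yet assessed")
            print(f"{component:15} | {category:22} | {status}")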

Security is not a design activity that is ever truly complete; instead it is something that evolves over time. You can either choose to learn from mistakes when attackers are successful, or you can attempt to pre-empt this by performing some self-critical internal analysis to ensure security levels are the highest they can be.

Sunday 24 March 2024

Having a Distrustful Mindset

 


I've talked previously about the concept of zero trust as it relates to security. When trying to apply an overarching philosophy like this it can feel quite daunting to know where to start.

Rather than having to create a comprehensive design for how to re-engineer your existing applications and network, it can help to start with a shift in mindset in how you view the security properties of your system.

Security is an ever-evolving discipline with many different facets, so I wouldn't want to give the impression that the items I mention below are the only things you need to think about, but I think they do help foster the kind of mindset you need: a healthy distrust of the world around you.

Authenticate

Authentication is the process of identifying the actors within a system. Traditionally this means authenticating users via them supplying a username/password or other such shared secret.

But this can also be extended to cover many other links in the chain.

This might include identifying the client or application the user is using to make the requests, the network the requests are coming from or even the physical device that is being used to communicate.

It's possible for all of these aspects of the flow of data to be used to indicate that something malicious may be happening, and therefore properly identifying all these elements will enable you to assess whether you can trust them.

Properly identifying all elements also enables comprehensive logging of any actions performed.

Authorise

Once you've identified these elements you can move on to authorisation. This is the process of deciding whether the requested operation should be allowed. Normally this means asking whether this user is allowed to view this data or perform this action.

But again the concept can be extended to cover more aspects of the system. Should this client or application be allowed to perform this action? Should these kinds of requests be allowed from this area of the network? Should this physical device be used to perform this action?

Going beyond authorising just the user again increases your ability to both detect malicious actions and prevent them.

Authorisation can also be linked to the method being used for authentication. Some actions a user may want to perform might require a higher level of authentication. As an example, some admin actions might require multi-factor authentication on top of a username and password.
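As a toy illustration of authorisation that looks beyond just the user, the sketch below checks the client, network, device and authentication strength before allowing an action. All of the names and rules are invented purely for illustration.

    from dataclasses import dataclass

    @dataclass
    class RequestContext:
        user_roles: set[str]
        client_id: str          # which application made the request
        network_zone: str       # e.g. "corporate" or "internet"
        device_trusted: bool    # e.g. a managed, compliant device
        mfa_completed: bool     # strength of the authentication performed

    def authorise(ctx: RequestContext, action: str) -> bool:
        # Admin actions require stronger authentication, a trusted device and network.
        if action.startswith("admin:"):
            return ("admin" in ctx.user_roles and ctx.mfa_completed
                    and ctx.device_trusted and ctx.network_zone == "corporate")
        # Ordinary reads only need an authenticated user on a known client.
        return "user" in ctx.user_roles and ctx.client_id in {"web-app", "mobile-app"}

    ctx = RequestContext({"user", "admin"}, "web-app", "internet", False, True)
    print(authorise(ctx, "read:orders"))    # True
    print(authorise(ctx, "admin:delete"))   # False - untrusted device, wrong network zone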

Encrypt

Pretty much all applications involve the movement and manipulation of data, and a large number of threats will relate to trying to expose that data to parties that shouldn't be able to view it.

One defence against this is to make sure that at the points where the data doesn't need to be viewed or worked with it is encrypted.

This will broadly come down to encrypting the data at rest and in transit. When the data is being stored and when it is being moved it should always be encrypted; the only time it should be in plain text is when it is being shown to the user. A user that has been properly authenticated and authorised to view that data, via that application, from that network, using that device.
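As a small sketch of encrypting data at rest, the example below uses the Fernet recipe from the third-party cryptography package for symmetric encryption. In a real system the key would come from a key management service rather than being generated alongside the data.

    from cryptography.fernet import Fernet

    # In practice this key would live in a key management service,
    # never hard-coded or stored next to the data it protects.
    key = Fernet.generate_key()
    f = Fernet(key)

    record = b"card_number=4111111111111111"
    stored = f.encrypt(record)      # what actually goes into the database
    print(stored)                   # ciphertext - useless without the key

    # Only decrypted at the point it genuinely needs to be shown or processed.
    print(f.decrypt(stored))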

This mindset should also be applied to other aspects of data sensitivity, identifying which data items are sensitive and how they are being transported. An example would be considering what data items are included in the query string of requests; because query strings might be logged in various places, the data within them should be assessed for sensitivity.

Authentication, authorisation and encryption aren't the only security related factors you need to be thinking about. But embracing them and thinking about them in a greater depth will help you dive deeper into the security of your software and system and be aware of a greater breadth of possible threats. 

Saturday 16 March 2024

Hacking Humanity

 


The modern technological world is a dangerous place, many evil actors are lurking around every dark corner with designs on your data, or simply wishing to impact your business just to show they can.

Many of these potential hackers will look to exploit flaws in your software, we should all be aware of the OWASP top ten and how code can be made to do things it wasn't intended to do. 

However there is a class of attacks that have nothing to do with exploiting flaws in software and instead are about taking advantage of flaws in human nature. Social Engineering is the process of using human psychology to manipulate someone into undermining the security of a system and playing an unwitting role in an attack.

The nature of these attacks is very different from that of technology-based attacks, and so are the possible defences against them.

Cognitive Bias

All Social Engineering attacks exploit some aspect of cognitive bias: the human tendency to make incorrect decisions based on flaws in how we interpret the information being presented to us.

The exploitation of these biases can take many forms but most are trying to persuade us to take an action that will ultimately lead to harm. The below is not an exhaustive list but demonstrates some of the techniques and why they work.

The Halo Effect attempts to get you to concentrate on a particular aspect of a communication whilst ignoring information that should lead you to question what is on offer. An example would be receiving a notification that you've won a prize, the positivity of that news is designed to distract you from thinking about the fact that you never entered any kind of competition or how your contact details were obtained.

Recency bias exploits the tendency to place more importance on recent events over historical ones. Attacks of this nature are timed to appear to align with recent experiences, many examples of this would have been seen during the COVID pandemic.

Authority bias takes advantage of our unwillingness to challenge someone or something that has perceived authority over us. Examples of this would be emails that claim to come from a senior work colleague or a government department.

There are many more examples of cognitive bias all of which use some aspect of human psychology to get us to ignore information that should make us suspicious in favour of information that makes us feel we should act.

Vectors of Attack

Social Engineering can be exploited in many different forms of social interaction and communication. Again the below is not an exhaustive list but gives examples of the vectors these attacks may use.

By far the biggest vector is phishing and its variations. Whether it be an email, instant message, SMS or phone call, all phishing attacks are designed to get the victim to expose information or take an action because they are duped into thinking the instruction is coming from someone it isn't.

This might be someone exposing their credit card information because they think they are communicating with their bank or someone clicking on a link that installs malware because they think it relates to a planned delivery.

Spear phishing attempts to make these attacks even more convincing by crafting the attack to be specific to an individual rather than generic in nature. 

A similar vector is that of impersonation: access may be granted to a building because an individual is dressed in an official-looking manner or because they meet a stereotype of how we expect certain individuals to present themselves.

Tailgating is another example of a physical social engineering attack, where an attacker will follow someone into a secure building, exploiting the human tendency to avoid conflict and not openly question an individual's actions.

Possible Defences

The first and most effective defence against social engineering is education. Teaching people that these techniques exist and the impact they can have will hopefully foster a natural suspicion of unsolicited communication before taking action.

Combined with this education many organisations undertake regular testing of employees. This will often take the form of them being exposed to emails or other communication that exploits the same aspects of cognitive biases to get them to take a certain action. This represents a safe way for people to realise they have been exploited and learn for next time when the attack may be real.

Some defences are technical in nature, for example employing principles of least privilege and zero trust can help to ensure that the blast radius of any attack is kept to a minimum. An example would be ensuring employees have the minimum level of system access needed to fulfil their roles, meaning if their account is compromised the attack gains limited access and influence.

Social Engineering is at least as big a cyber threat as attacks that exploit technical flaws. The aims are usually the same: information exposure and taking control of a system. Where code can be scanned and tested on a continual and automated basis to find and rectify technical flaws, it is often a harder problem to make individuals aware of their cognitive biases and how they can be exploited.

But knowing they exist is a pretty good step in the right direction.

Sunday 25 February 2024

Resting in Style

 


The majority of software engineers who have had any exposure to server side applications, either by building them or consuming their functionality, will be familiar with the idea of a RESTful API.

Sometimes the concept of an API being RESTful is framed just in terms of its interface: its use of URIs, HTTP verbs and so on.

However Representational State Transfer (REST) actually goes beyond just a definition of an interface and describes an architectural style for how server side software can be built and consumed by clients.

Early Days of the Web

By the early 90s the web was starting to become prevalent in people's lives, with websites becoming available beyond the academic institutions that had pioneered the web's invention.

As it became clear that the web's adoption was growing at pace, pressure started to grow to formalise its as yet relatively loose architecture. This flexibility was one of the web's strengths but without some formal standards to govern how its functionality could be consumed there was a risk it would become fractured.

To address these concerns organisations such as the World Wide Web Consortium (W3C) and the Internet Engineering Task Force (IETF) put together a series of working groups to document more formal approaches to some of the key web technologies such as HTTP and HTML.

One of the people involved in these working groups was a computer scientist called Roy Fielding. As part of this work Fielding developed the concept of a REST architecture, which culminated in his 2000 PhD thesis "Architectural Styles and the Design of Network-based Software Architectures".

This thesis defined a series of constraints that when followed would create a system with a RESTful architecture.

REST Constraints

The first REST constraint is the one we are maybe most familiar with, that of a Uniform Interface. The constraint dictates that your system should have a uniform API interface where data within your system is represented by a collection of resources.

Resources might be customers, products, user reviews or any other data your system provides access to. Request URIs identify the resources a client wants to work with. The data returned in API responses is a representation of a resource rather than necessarily being tied to how the resource might be stored in the backend.

Resource representations are self-describing, giving the client all the required metadata to process the data; an example of this would be the Content-Type header that describes the format of the representation. Resources should also demonstrate Hypermedia as the Engine of Application State (HATEOAS), whereby a representation provides links that can be used to obtain related resources, such as a customer linking to their previous orders.
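To make this concrete, the sketch below shows roughly what a client sees when fetching a resource from a hypothetical API using the Python requests library. The host, paths and fields are invented, and the link structure is just one common HATEOAS convention rather than a fixed standard.

    import requests

    # Hypothetical resource URI - the customer is identified by the URI itself.
    resp = requests.get("https://api.example.com/customers/42",
                        headers={"Accept": "application/json"})

    print(resp.headers["Content-Type"])   # self-describing: e.g. application/json

    # One common convention for a representation with hypermedia links:
    # {
    #   "id": 42,
    #   "name": "Jane Doe",
    #   "_links": {
    #       "self":   {"href": "/customers/42"},
    #       "orders": {"href": "/customers/42/orders"}
    #   }
    # }
    customer = resp.json()
    orders_uri = customer["_links"]["orders"]["href"]   # follow the link, not a hard-coded path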

The next constraint is that of Client/Server. This states that clients and servers should be able to evolve independently without a dependency on each other. A client should know only about the API interface being offered by a system and have no knowledge of or dependency on how that interface is being implemented.

A constraint which follows from the client/server approach is that of a Layered System. Building a system as a series of layers allows for a loosely coupled architecture where multiple systems may be involved in data processing and storage whilst a uniform interface is presented to clients.

The next constraint says that all interactions between a client and server should be Stateless. A server should not store state related to previous requests from a client. If a client application's state is relevant to the processing of a request then the request should contain all the necessary information the server needs in order to process it. An example of this would be the fact that all requests should contain authentication and authorisation information to prove a user of the client has previously logged in.

Resources will often not change on a frequent basis. To address this and make sure interactions are efficient, each resource should indicate whether or not it is Cacheable. This involves both the resource representation indicating if it's cacheable and for how long, as well as clients being able to query whether the representation they currently have is still accurate.
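A sketch of what that caching conversation might look like from the client's side, again against a hypothetical API, assuming it supports ETags and conditional requests:

    import requests

    url = "https://api.example.com/products/7"     # hypothetical resource

    first = requests.get(url)
    print(first.headers.get("Cache-Control"))      # e.g. "max-age=3600": cacheable, and for how long
    etag = first.headers.get("ETag")               # fingerprint of this representation

    # Later: ask the server whether our cached copy is still current.
    check = requests.get(url, headers={"If-None-Match": etag})
    if check.status_code == 304:                   # Not Modified - reuse the cached representation
        product = first.json()
    else:
        product = check.json()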

The final constraint was deemed optional and is not one that you will see frequently applied. Mostly when consuming APIs they will be data driven and provide static representations of data in formats such as JSON, HTML or XML. However a situation was envisioned where it might be necessary for an API to return functionality to deal with data as part of the response. The Code on Demand constraint described how an API could return executable code alongside data to increase the functionality that could be offered via APIs.

Modern Relevance

The use of the web in the modern world is probably far beyond even that which was envisioned in the early 90s when the need for engineering standards was first being identified. 

It is testament to the work done by the working groups involved that the REST constraints are as relevant today as they were then.

You will not see many server applications that completely follow all these constraints, some pragmatism is often required to adapt them to your particular application and its nuances. 

What is more important is to value the architectural properties they promote. Scalability, modifiability, portability and reliability are examples of properties that are never going to fall in or out of fashion, even if the technologies being employed are subject to these changes in favourability.

Good engineering, and an understanding of what good looks like for an application, will always remain relevant even as the world the application exists in changes.

Sunday 4 February 2024

Leading in Software Engineering

 


Aspiring software engineers are often keen to understand how they can progress in their career to a leadership position within their team or their organisation.

Clearly a certain level of technical competence is going to be a requirement, but there are also a number of soft skills that are often the make or break factor.

If you have the correct aptitude then correcting any omissions in technical knowledge can be easily addressed, but developing the softer skills can often require a more conscious effort.

Software within Context

The majority of the companies you will work for aren't in the software engineering business. Software is instead a tool being employed to achieve other business goals.

Understanding the business context within which you are creating software is important, the software itself isn't the end goal, the things it can achieve are. Allowing this context to influence the engineering decisions you make will increase the effectiveness of your engineering output.

You don't necessarily have to become an expert in how your company operates. But an understanding of its source of profitability, its customer base and its goals and motivations will enable you to become a more rounded operator.

Don't just see yourself as an engineer, see yourself as someone with engineering skills working towards a common goal.

The Reality of Technical Debt

Many engineers will frequently become frustrated at business decisions that they feel are impinging on the quality of their software.

This passion to be involved in quality engineering is an asset and is much better than a "that'll do" attitude. But a pragmatic outlook is going to be required to avoid becoming too disheartened.

Producing quality software is unlikely to be a stated goal of your company. Going back to the idea of business context, the business will have goals that are wider than pure engineering excellence.

This means decisions will be made, for legitimate business reasons, that are counter to what you might like to see within an engineering context. This means technical debt is just a reality of life, but there are good and bad ways to deal with and manage technical debt.

Technical debt is only technical debt if you understand you are incurring it; if you are oblivious to it then you're just writing bad software. Often the skill as a leader is to identify where technical debt is unavoidable, trying to limit its scope, making sure a plan exists for paying the debt back in the future and ensuring the fact it is being incurred, and its impacts, are properly communicated.

Being a zealot and refusing to compromise or co-operate with the business is going to make you less effective. 

Helping People Help Themselves

When you move into a leadership position it's often tempting to think you have to do everything yourself; this lack of delegation can lead to a tendency to do rather than teach.

As a leader others will often come to you with problems they are struggling to solve. It may be tempting to simply solve the problem for them, but in doing this you are restricting their ability to learn and grow.

Sometimes a problem might be urgent or have severe implications meaning you need to take ownership of the situation. But on other occasions rather than solve the problem for somebody you can instead guide them to the right solution and give them the skills to repeat the process in the future.

This also extends to letting others know it's OK not to know the answer. Others will often assume that you know the answer to all problems; making them understand that you don't, and that in fact you have just developed a set of skills to help you find the answer, will help those individuals acquire the same skills.

It's tempting to think that a leader in a software engineering team is equivalent to the chief geek and the best engineer. While it's likely that someone with the aspiration to lead will also have strong technical skills, that is not the only qualifying factor in being effective in a leadership role.

There are other softer factors that can require us to re-think the way we view our role but will ultimately make you a more well rounded engineer and successful leader.

Sunday 28 January 2024

The Problem with Castles and Moats

 


Pretty much any system accessible via the internet can expect to come under regular attacks of varying sophistication. This may range from the simply curious to those that mean to cause harm and damage.

Protecting yourself from these intrusions is therefore a key activity in the day to day operation of any team.

But is it realistic to expect to always be able to keep attackers at bay on the edge of your infrastructure? Are external threats the only thing you should be concerned about?

Zero Trust Security takes an approach that answers no to both those questions. It tries to instil defence in depth to ensure you protect yourself from many different attack vectors and actors.

Castles and Moats

A traditional approach to security, often termed castle and moat, is one where access to a network is hard to obtain, but once access is granted there is an implicit trust of anyone and anything inside the network perimeter.

The source of this implicit trust probably comes from a desire for convenience but also a belief that attackers should be kept outside the network at all times.

Of course keeping attackers outside should be the goal, but the problem with castle and moat is that if an attacker does gain access, which is unfortunately likely to happen given the abundance and skill of some attackers, they then have free rein within the network to do what they like.

Principles of Zero Trust

Zero trust security is based on a set of principles designed to remove the implicit trust that comes with a castle and moat approach. These principles assume that attackers are both inside and outside the network, therefore no user or device should be trusted unless they are verified and their access validated.

The fact that both users and machines are part of the trust evaluation is key. Rather than a network being open with access permitted from any part to any other part, the network is segmented into different areas with rules enforced over which parts of a network can connect to which other parts.

Another important consideration is that of least privilege: even after a user or device has been authenticated, they are only authorised to have the lowest level of access required to fulfil their role.

Zero trust will often also employ mechanisms to limit the risk of the exposure of credentials. This might be the regular rotation of passwords, implementing multi-factor authentication and a requirement for regular re-authentication rather than long running sessions.

Advantages and Benefits  

All of these measures are designed to limit what an attacker on the inside of the network can achieve, and crucially to prevent them being able to roam the network at will.

Rather than fighting one high-stakes battle with attackers at the perimeter, we assume that at some point we will lose and try to defend our assets and resources on multiple levels.

Zero trust also acknowledges that threats don't just come from the outside world. Someone who has legitimate access to the network might also have malicious intent. These so-called malicious insiders can cause as much damage as any external attacker, and have the added advantage of understanding the network topology and operation.

It's an unfortunate reality of the modern technology landscape that no system or part of a system can be deemed completely safe. The battle with would-be attackers often becomes an arms race; placing your faith in your ability to always win this race can leave you open to large amounts of damage from any momentary slip in your ability to repel attackers.

Assume it's possible they might get in and protect your network and your data from all possible angles. In this instance it isn't paranoia, they really are out to get you.
      

Sunday 14 January 2024

Backend for Frontend

 


Any software engineer that has worked within the client server model will be familiar with the generalised categorisation of applications as frontend or backend.

Backend applications provide access to data and apply the necessary business logic around this access and the updating of these data sources. Front end applications provide an interface for users to view and interact with this data.
In most situations these applications are running in separate environments, such as a user's browser or mobile phone in the case of the frontend application, and a remote server for the backend application. Interaction between these two applications is then generally achieved via some sort of API interface.

Having this separation allows both applications to be developed and deployed independently, but the design of the API that binds the two together is key to realising this efficiency. One approach is of course to have a traditional generic API interface designed to serve many possible uses, but a Backend for Frontend (BFF) takes a different road in order to provide an interface specific to the needs of the frontend it is serving.

The Problem of Many Frontends

Let's imagine we start working on a web frontend application to provide functionality to users via the browser. We develop a backend API to provide access to the necessary data and functions with the development of both apps proceeding in tandem.

Our product is a success so we are asked to produce a mobile app to provide access to the same functionality, so we drive this mobile app from the same backend API. Clearly it will be possible to build the app using this API, but is it the optimal approach? A mobile device comes with very different limitations to a desktop browser. This is in terms of performance, network access, screen real estate and just the general way in which users tend to interact with it.

We are then asked to provide access via a voice based app for a digital virtual assistant where we have to deal with the problem of having a very different medium to communicate with our users.

These competing needs put a lot of pressure on the team developing the backend application, creating a bottleneck in development and making it difficult to maintain a consistent and coherent interface to the API.

But what about if we took a different approach?

Backend for Frontend

That different approach is the concept of a Backend for Frontend (BFF).

A BFF is a backend application where the API interface is tailored specifically for the needs of the frontend it is designed to serve. This tailoring includes the data it exposes, both in terms of depth and shape, as well as the orchestration of business processes the user may wish to trigger.

In our above example we would build separate BFFs to serve web, mobile and voice.

The web BFF would expose larger amounts of data and provide access to more complicated business flows requiring multiple interactions. The mobile BFF provides access to a more compact dataset to reduce the amount of data passing between frontend and backend, as well as providing an increased level of orchestration to reduce the number of API calls involved in implementing an outcome. The voice BFF returns a very different data schema to provide for the unique user interface that's required.

Most likely all three BFFs are built on top of an internal enterprise API layer, meaning their sole responsibility is to provide the optimised interface for the frontends they are aligned to.
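As a toy sketch of what a mobile BFF endpoint might do, the function below calls a hypothetical internal API and shapes the result into the compact payload a mobile app needs. The URLs, field names and trimming rules are all invented for illustration.

    import requests

    INTERNAL_API = "https://internal-api.example.com"   # shared enterprise layer (hypothetical)

    def mobile_customer_summary(customer_id: str) -> dict:
        """Shape internal data into the compact payload the mobile app needs."""
        customer = requests.get(f"{INTERNAL_API}/customers/{customer_id}").json()
        orders = requests.get(f"{INTERNAL_API}/customers/{customer_id}/orders").json()

        # One round trip for the app, and a trimmed-down representation of the data.
        return {
            "name": customer["displayName"],
            "loyaltyPoints": customer["loyalty"]["points"],
            "recentOrders": [
                {"id": o["id"], "status": o["status"]}
                for o in orders[:3]          # the mobile screen only shows the last few
            ],
        }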

Development of the BFF can sit alongside the team developing the paired frontend application, leading to fewer bottlenecks in development and allowing each channel to operate on its own release cadence.

Pitfalls and Considerations

So does this mean a BFF is an obvious choice when developing within the client server model? There are very few absolutes in engineering, so like any pattern it's never the case that it should be applied in every situation, and careful consideration still needs to be given to its implementation.

Firstly you do need to confirm that your frontends do have different needs. Above I've made the generalisation that mobile and web frontends can't efficiently be driven by the same API. In general that might be a reasonable assumption to make, but you should take the time to assess whether the optimal API surfaces would actually be different for your use case.

Secondly you should consider how to implement multiple BFFs whilst maintaining strict separation of concerns to avoid duplication. A BFF's implementation should be solely concerned with the shaping of data and functionality for the frontend it serves. Internal business rules should be implemented in a different shared layer rather than duplicated across the BFFs; failure to do this will lead to inconsistencies in experience as well as creating inefficiencies in development.

Lastly consideration should be given to the fact that a BFF approach leads to more applications being developed and deployed. Failure to have teams structured to develop within this model and to have a good DevOps story to manage the increased number of deployments will stop you achieving the promised increase in efficiency.

So many systems demonstrate the frontend/backend split that the devil in the detail of how these applications will interact is one of the most important factors in how successful your efforts will turn out to be. The performance, usability and development efficiency of your applications is in large part going to be related to how you implement this interaction. BFFs should be one of the tools at your disposal to ensure frontend and backend can work in harmony to deliver great outcomes.

Sunday 7 January 2024

Teaching the Machine



In my previous post on Large Language Models (LLMs) I referred several times to the models being trained, but what does this actually mean?


The process I was referencing is known as Machine Learning (ML), it comes in many different forms and is probably the most important and time consuming aspect of building Artificial Intelligence (AI).


Much like our own learning, the training of an AI model is an iterative process that must be fine tuned and repeated in order to ensure a workable and useful model is delivered.


Logic vs Maths


Traditionally engineered software is a collection of logical statements, such as if-else, do-while and for-each, that form fixed predetermined paths from input to output. The problem the software deals with must essentially be solved in the minds of the engineers producing it so they can determine these logical paths.


However there are many classes of problem where it is impractical to solve them by writing software in this way. These might be complex medical problems, problems of processing complex data such as natural language, or problems where the actual solution isn't known in advance.


The process of Machine Learning exposes statistical algorithms to large amounts of data in order to find relationships. This allows the trained model to generalize and make predictions or observations about unseen data without the need for explicit instructions on how the data should be processed.


Supervised Learning


Broadly speaking Machine Learning falls into two categories, supervised and unsupervised with the difference between the two being whether the required output is known in advance.


Supervised learning uses datasets that consist of many input data items, referred to as features, and a desired output known as a label. As an example we might have a medical dataset covering many different aspects of a person's health and a marker of whether or not they have a disease such as diabetes. The goal of the training is to develop a model that, given someone's health data, can predict if they are likely to develop diabetes.


The model learns by processing the training data and identifying the relationships between the various data items, with its accuracy being assessed by how well it can produce the required output. When, via a process of iteration and tweaks to the mathematical algorithms being used, the model is deemed to be trained, it is used to process previously unseen data.
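A compact sketch of that supervised loop using scikit-learn is shown below. The features stand in for the kind of health measurements described above, and the data is synthetic rather than a real medical dataset.

    import numpy as np
    from sklearn.linear_model import LogisticRegression
    from sklearn.model_selection import train_test_split

    # Synthetic stand-ins for health features (e.g. age, BMI, blood glucose)
    # and a label indicating whether the person developed the disease.
    rng = np.random.default_rng(0)
    X = rng.normal(size=(500, 3))
    y = (X[:, 1] + 0.5 * X[:, 2] + rng.normal(scale=0.5, size=500) > 0).astype(int)

    X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=0)

    model = LogisticRegression().fit(X_train, y_train)            # learn from labelled examples
    print("accuracy on unseen data:", model.score(X_test, y_test))

    # Predict for a previously unseen individual.
    print(model.predict([[0.1, 1.2, -0.3]]))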


Many types of supervised learning use some form of mathematical regression to define trend lines for datasets where this trend line forms the basis for prediction and definition of the output label.


Human experts in the problem space the model is dealing with are key in the process of identifying the data features that the model should work with and ensuring the data it is trained with is of sufficient quality and diversity to produce a workable and accurate model. 


Unsupervised Learning


Unsupervised learning involves datasets that don't have a predefined label defining what the output should be. The model identifies previously unknown relationships in the data in order to cluster datapoints and find commonalities; this then enables the model to look for the presence of these commonalities in new data it is exposed to.


Examples of this type of learning might be analysis of customer buying behaviour in order to predict future purchases, the ability to recognise the content of images, or in the case of an LLM like ChatGPT the ability to predict the natural language that forms an answer to a question.
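As a small sketch of the clustering flavour of unsupervised learning, the example below groups unlabelled customer data with scikit-learn's KMeans. The two features and the numbers are invented purely to illustrate the idea.

    import numpy as np
    from sklearn.cluster import KMeans

    # Unlabelled data: [orders per month, average basket value] per customer.
    X = np.array([[1, 20], [2, 25], [3, 30],
                  [14, 280], [15, 300], [16, 310]])

    model = KMeans(n_clusters=2, n_init=10, random_state=0).fit(X)
    print(model.labels_)                 # groups the model discovered on its own
    print(model.predict([[2, 22]]))      # which group a new customer most resembles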


Unsupervised learning of this kind is often based on neural networks. Originally based on the organisation of the human brain, a neural network consists of a number of interconnected neurons. However rather than being biological in nature, these neurons are mathematical models that take a number of inputs and produce a single output to pass onto the next neuron in the chain.


Although the types of problems addressed by unsupervised learning are generally those where there is no preconceived wisdom on the relationships between the data items, human experts still play a crucial role in analysing the outputs of the model in order to drive the iterative training process that will increase the model's accuracy.


The portrayal of AI in science fiction would have us believe that a switch is flipped and a fully formed intelligence springs into life holding infinite knowledge. The reality is that it's a painstaking, costly and time-consuming process that requires many cycles to perfect. Machine Learning is essentially the nuts and bolts of AI; services such as ChatGPT live or die based on the expertise and engineering that is applied in this phase. The fundamentals of the learning process apply equally to machines as to humans.

Monday 1 January 2024

The Power of Maths and Data

 


Artificial Intelligence (AI) is one of those relatively rare terms that has made the jump from software engineering into the lexicon of everyday life.

Many of the usages of the term are really nothing more than marketing ploys to endow products with a greater technological cachet. However the public's exposure to Large Language Models (LLMs), most notably in the form of ChatGPT, has given a glimpse into the potential of AI.

There is however an unfortunate anthropomorphic effect that comes with the term intelligence. When we observe LLMs like ChatGPT in action we tend to equate its operation with our own intelligence and imagine the machine thinking and reasoning for itself.

While you could have a philosophical debate about what it means to "think", I don't believe viewing the technology in this way is helpful, and is what leads to many of the perceived doomsday scenarios we are told it could lead us towards.

So what is actually happening inside an LLM?

Patterns Leading to Understanding

AI is a very broad term covering many different technologies and applications. But in general it can be viewed as using mathematics to find patterns in data and using this knowledge to predict, classify or in the case of LLMs generate output.

Whilst some applications of AI may look at a narrow dataset, such as credit card transactions to classify activity as fraud or health data to predict the likelihood of disease, LLMs are trained on extremely large amounts of human language.

The vast size of the datasets used means the patterns the model is able to identify give it a general-purpose understanding of natural language. This enables it to understand language supplied to it as well as to generate language in response.

The key aspect to keep in mind here is that this understanding of language relates to knowing the patterns and relationships between words, based on the observations of a large dataset, rather than an understanding of the actual words themselves: given this set of input words, what is the most likely set of output words that would statistically form an answer to that query?

Transformers

The most common architectural pattern used to build LLMs is called a Transformer Model.
Transformer Models are an example of something called a neural network. Originally based on the organisation of the human brain, a neural network consists of a number of interconnected neurons.

However rather than being biological in nature, these neurons are mathematical models that take a number of inputs and produce a single output to pass onto the next neuron in the chain.

A Transformer consists of an encoder and a decoder.

The encoder takes input language and divides it up into a number of tokens, which we can think of as the constituent parts of words. Mathematical equations are then applied to these tokens to understand the relationships between them. This produces a mathematical representation of the input language allowing the model to predict the potential output language.

The decoder then runs this process in reverse to move from the mathematical representation of the output back into tokens to form the language to provide to the user.
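Tokenisation is easy to see in practice. The sketch below uses the tiktoken package, a tokeniser used by some OpenAI models, simply to show how text becomes the numbers a model actually works with; it isn't the full encoder or decoder described above.

    import tiktoken

    enc = tiktoken.get_encoding("cl100k_base")   # a tokeniser used by some OpenAI models

    tokens = enc.encode("Transformers turn language into numbers")
    print(tokens)          # a list of integer token ids
    print(len(tokens))     # how many tokens the sentence became

    # The decoder side of the process maps numbers back to text.
    print(enc.decode(tokens))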

When trained on a significantly large amount of varied data this allows the model to provide answers to questions on many subjects.

Use Cases and Downsides

The generative nature of LLMs makes them ideal for use cases such as chat bots, document generation, and if trained on appropriate data sets even specialised tasks such as code generation.

Their interactive nature also enables a conversational approach to information retrieval from large datasets.

The LLM approach can also be applied to data sources other than language, allowing for audio and image generation as well as language.

However the nature of how LLMs are trained can lead to some downsides it's important to be aware of. LLMs will reflect the nature of the data they are trained on; if that data contains natural human bias then this will be reflected in the output language the model produces.

LLMs can also display a behaviour called hallucination. This is where the model produces output language that, while coherent, isn't factually accurate or doesn't relate to the input language. There are many reasons for this but most relate to the earlier point that the model's output is based on mathematical analysis rather than an in-built understanding of the language it is given or returning.

The AI revolution is real, and its potential impacts are made visible to the majority of us via LLMs such as ChatGPT or Google's Bard. It is also the interactions with these models that drive a lot of people's fears about the direction the technology will take us. But it's important to appreciate how these models are doing what they do before becoming overly fearful or pessimistic.