Imagine you are buying a car. What essential features do you need? A vehicle should deliver a person from point A to point B. But you also check its safety, comfort, maintainability, ease of repair, and mileage. You may also look for an electric version or better speed. Why should you check these additional characteristics that aren't critical to its main duty? To limit the surprises that may occur when delivering the primary function: taking you from point A to point B.
Software has similar nonfunctional requirements (NFRs), which are also called architectural characteristics. Whether you're working on a website, a mobile app, or a desktop program, software should have a set of quality-oriented attributes to meet end-user needs.
What are nonfunctional requirements?
Briefly, functional requirements define what a system is supposed to do. In the case of a car, that's taking a person from A to B. Nonfunctional requirements stipulate how a system is supposed to be.
Here is a cheat sheet for understanding nonfunctional requirements:
These top 10 architectural characteristics cover most aspects of a large-scale project. You don't need to accommodate all of them in your project; pick the most essential ones and address those first. This article does not provide solutions for meeting these NFRs but instead explains areas to consider when designing a system.
Scalability refers to the system's ability to perform and operate as the number of users or requests increases. It is achievable by scaling machines horizontally or vertically, or by attaching autoscaling capabilities (for example, an AutoScalingGroup).
Here are three areas to consider when architecting scalability into your system:
- Traffic pattern: Understand the system's traffic pattern. Spawning as many machines as possible is not cost-efficient, because many of them will sit underutilized. Here are three sample patterns:
  - Diurnal: Traffic increases in the morning and decreases in the evening for a particular region.
  - Global/regional: Heavy usage of the application in a particular region.
  - Thundering herd: Many users request resources, but only a few machines are available to serve the burst of traffic. This could occur during peak times or in densely populated areas.
- Elasticity: This relates to the ability to quickly spawn a few machines to handle the burst of traffic and gracefully shrink when the demand is reduced.
- Latency: This is the system's ability to serve a request as quickly as possible. This also includes optimizing the algorithms and using edge computing to replicate the system near users to reduce the round-trip time of a request.
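As a rough illustration of elasticity, the sketch below picks an instance count from the current request rate. The function name, the per-instance capacity, and the minimum and maximum bounds are all hypothetical values chosen for the example, not recommendations from this article.

```python
import math


def desired_instances(requests_per_second: float,
                      capacity_per_instance: float,
                      min_instances: int = 1,
                      max_instances: int = 20) -> int:
    """Return how many instances the current load needs, kept within bounds."""
    if capacity_per_instance <= 0:
        raise ValueError("capacity_per_instance must be positive")
    # Round up so that a partially loaded instance still counts as one machine.
    needed = math.ceil(requests_per_second / capacity_per_instance)
    # Never shrink below the floor or grow past the cost ceiling.
    return max(min_instances, min(max_instances, needed))


# Hypothetical loads: a quiet night, a busy morning, and a sudden burst.
for load in (50, 900, 5000):
    print(load, desired_instances(load, capacity_per_instance=100))
```

The upper bound is what keeps a burst from spawning an unbounded number of machines; the lower bound keeps the system responsive when traffic is low.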
Availability is measured as a percentage of uptime and defines the proportion of time that a system is functional and working. Availability is affected by system errors, infrastructure problems, malicious attacks, and system load. Things to consider include:
- Deployment stamps: Deploy multiple independent copies of application components, including data stores.
- Geodes: Deploy backend services into a set of geographical nodes, each of which can service any client request in any region.
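Because availability is a percentage of uptime, it can help to translate a target into a downtime budget. The sketch below does that arithmetic and also estimates the combined availability of several redundant copies. The second function assumes the copies fail independently, which real deployments rarely guarantee, so treat its result as an upper bound. Both function names are made up for this example.

```python
def allowed_downtime_minutes(availability_percent: float,
                             period_days: float = 365) -> float:
    """Minutes of downtime permitted over the period at the given availability."""
    if not 0 <= availability_percent <= 100:
        raise ValueError("availability_percent must be between 0 and 100")
    total_minutes = period_days * 24 * 60
    return total_minutes * (1 - availability_percent / 100)


def combined_availability(single_copy: float, copies: int) -> float:
    """Availability (0..1) of `copies` redundant copies, ASSUMING independent failures."""
    if not 0 <= single_copy <= 1 or copies < 1:
        raise ValueError("single_copy must be in [0, 1] and copies >= 1")
    # The service is down only when every copy is down at the same time.
    return 1 - (1 - single_copy) ** copies


for target in (99.0, 99.9, 99.99):
    print(f"{target}% -> {allowed_downtime_minutes(target):.1f} minutes per year")
print(f"two copies at 99%: {combined_availability(0.99, 2):.4%}")
```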
Extensibility measures the ability to extend a system and the effort required to implement the extension. The extension can occur by adding new functionality or modifying existing functionality. The principle provides enhancements without impairing current system functions. When architecting extensibility, consider:
- Modularity and reusability: Reusability, together with extensibility, allows technology to be transferred to another project with less development and maintenance time, as well as enhanced reliability and consistency.
- Pluggability: This is the ability to easily plug in other components, for example with microkernel architecture.
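One way to picture pluggability is a small registry that lets new components be added without editing the core. The sketch below is a minimal illustration, not a full microkernel; the exporter names and the `register_exporter` and `export` helpers are hypothetical.

```python
import json
from typing import Callable, Dict

# Hypothetical registry: maps a format name to the plug-in that handles it.
_EXPORTERS: Dict[str, Callable[[dict], str]] = {}


def register_exporter(name: str):
    """Decorator that plugs a new exporter into the registry."""
    def decorator(func: Callable[[dict], str]) -> Callable[[dict], str]:
        _EXPORTERS[name] = func
        return func
    return decorator


def export(record: dict, fmt: str) -> str:
    """Core function: it never changes when a new format is added."""
    try:
        exporter = _EXPORTERS[fmt]
    except KeyError:
        raise ValueError(f"no exporter registered for {fmt!r}") from None
    return exporter(record)


@register_exporter("csv")
def to_csv(record: dict) -> str:
    return ",".join(str(value) for value in record.values())


@register_exporter("json")
def to_json(record: dict) -> str:
    return json.dumps(record, sort_keys=True)


print(export({"id": 1, "name": "a"}, "csv"))
print(export({"id": 1, "name": "a"}, "json"))
```

Adding a new format means writing one decorated function; the existing `export` path and the existing exporters are left untouched, which is the property extensibility asks for.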
Consistency guarantees that every read returns the most recent write. This means that after an operation executes, the data is consistent across all the nodes, and thus all clients see the same data at the same time, no matter which node they connect to. Consistency improves the data's freshness.
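One common way to reason about this guarantee in replicated stores, which this article does not prescribe, is quorum overlap: with N replicas, a write acknowledged by W replicas and a read that consults R replicas must share at least one replica whenever R + W > N. The sketch below checks that property by brute force for small clusters; the function name is hypothetical.

```python
from itertools import combinations


def read_always_overlaps_write(n: int, w: int, r: int) -> bool:
    """True if every possible read set shares a replica with every write set."""
    replicas = range(n)
    return all(
        set(write_set) & set(read_set)  # an empty intersection is falsy
        for write_set in combinations(replicas, w)
        for read_set in combinations(replicas, r)
    )


# Three replicas: majority reads and writes overlap, single-replica ones may not.
print(read_always_overlaps_write(n=3, w=2, r=2))
print(read_always_overlaps_write(n=3, w=1, r=1))
```

When the sets always overlap, a read that picks the newest version among the replicas it contacts will include the latest acknowledged write.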
Resiliency is a system's ability to gracefully handle and recover from accidental and malicious failures. Detecting failures and recovering quickly and efficiently is necessary to maintain resiliency. The primary factors to consider when architecting for resiliency are:
- Recoverability: This is the preparatory processes and functionality that enable services to return to an initial functioning state after an unintended change. Unintended changes include soft or hard deletion or misconfiguration of applications.
- Disaster recovery: Disaster recovery (DR) consists of best practices designed to prevent or minimize data loss and business disruption resulting from catastrophic events—everything from equipment failures and localized power outages to cyberattacks, civil emergencies, criminal or military attacks, and natural disasters.
Following are some design patterns you might implement to build resiliency into your architecture:
- Bulkhead: This pattern isolates elements of an application into pools so that if one fails, the others will continue to function.
- Circuit breaker: This pattern handles faults that might take a variable amount of time to fix when connecting to a remote service or resource.
- Leader election: This pattern coordinates the actions performed by a collection of collaborating task instances in a distributed application by electing one instance as the leader that assumes responsibility for managing the other instances.
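To make one of these patterns concrete, here is a minimal circuit breaker sketch. It is a simplified, single-threaded illustration; the class name, the threshold, and the timeout are assumptions for the example, and production libraries add concerns such as thread safety and metrics.

```python
import time


class CircuitOpenError(Exception):
    """Raised when a call is skipped because the circuit is open."""


class CircuitBreaker:
    def __init__(self, failure_threshold=3, reset_timeout=30.0, clock=time.monotonic):
        self.failure_threshold = failure_threshold
        self.reset_timeout = reset_timeout
        self._clock = clock          # injectable so the timing can be tested
        self._failures = 0
        self._opened_at = None       # None means the circuit is closed

    def call(self, func, *args, **kwargs):
        if self._opened_at is not None:
            if self._clock() - self._opened_at < self.reset_timeout:
                # Fail fast instead of waiting on a service known to be unhealthy.
                raise CircuitOpenError("circuit is open; call skipped")
            # Timeout elapsed: fall through and allow one trial call (half-open).
        try:
            result = func(*args, **kwargs)
        except Exception:
            self._failures += 1
            # A failed trial call, or too many failures, (re)opens the circuit.
            if self._opened_at is not None or self._failures >= self.failure_threshold:
                self._opened_at = self._clock()
            raise
        # Success closes the circuit and clears the failure count.
        self._failures = 0
        self._opened_at = None
        return result
```

A caller would wrap each remote request, for example `breaker.call(fetch_profile, user_id)`, where `fetch_profile` stands in for any function that talks to a remote service.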
Usability is a system's capacity to enable users to perform tasks safely, effectively, and efficiently while enjoying the experience. It is the degree to which specified users can use software to achieve specified objectives with effectiveness, efficiency, and satisfaction in a specified context of use. Related factors include:
- Accessibility: Make the software available to people with the broadest range of characteristics and capabilities, including users with deafness, blindness, colorblindness, and more.
- Learnability: Make the software easy for users to learn.
- API contract: Internal teams need to understand the API contracts to help them plug into any system.
Observability is the ability to collect data about program execution, modules' internal states, and communication between components. To improve observability, use various logging and tracing techniques and tools, including the following:
- Logging: There are different types of logs generated within each request, such as event logs, transaction logs, message logs, and server logs.
- Alerts and monitoring: Prepare monitoring dashboards, create service-level indicators (SLIs), and set up critical alerts.
- Tiered levels of support: Set up on-call support processes for Level 1, Level 2, and Level 3 support. L1 support includes interacting with customers. L2 support manages the tickets escalated by L1 and helps troubleshoot. L3 is the last line of support and usually comprises a development team that addresses the technical issues.
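The logging point can be illustrated with Python's standard `logging` module. The sketch below emits one JSON object per log line so that a request can be traced across components; the logger name `orders`, the `request_id` field, and the `build_logger` helper are hypothetical choices for this example.

```python
import io
import json
import logging


class JsonFormatter(logging.Formatter):
    """Render each log record as a single JSON object."""

    def format(self, record: logging.LogRecord) -> str:
        payload = {
            "level": record.levelname,
            "logger": record.name,
            "message": record.getMessage(),
            # Fields passed via `extra=` become attributes on the record.
            "request_id": getattr(record, "request_id", None),
        }
        return json.dumps(payload)


def build_logger(stream) -> logging.Logger:
    logger = logging.getLogger("orders")  # hypothetical service name
    logger.setLevel(logging.INFO)
    logger.handlers.clear()               # avoid duplicate handlers on re-run
    logger.propagate = False
    handler = logging.StreamHandler(stream)
    handler.setFormatter(JsonFormatter())
    logger.addHandler(handler)
    return logger


stream = io.StringIO()  # stands in for stdout or a log shipper
log = build_logger(stream)
log.info("order created", extra={"request_id": "req-123"})
print(stream.getvalue().strip())
```

Structured lines like these are easier to index and to turn into dashboards and alerts than free-form text.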
Security is the degree to which the software protects information and data so that people, other products, or systems have data access appropriate to their types and levels of authorization. This family of characteristics includes the following five attributes:
- Confidentiality: Data is accessible only to those authorized to have access.
- Integrity: The software prevents unauthorized access to or modification of software or information.
- Nonrepudiation: Prove whether actions or events have taken place.
- Accountability: Trace user actions.
- Authenticity: Prove the user's identity.
Additional security requirements include:
- Auditability: Audit trails track system activity so that when a security breach occurs, you can determine the mechanism and extent of the breach. Storing audit trails remotely, where they can only be appended, can keep intruders from covering their tracks.
- Legality: This involves adherence to laws or other industry requirements.
- Compliance: This is adherence to data protection laws and compliance frameworks such as GDPR, CCPA, SOC 2, PIPL, or FedRAMP.
- Privacy: This is the ability to hide transactions from internal company employees, such as encrypting transactions so that even database administrators and network architects cannot see them.
- Authentication: Security requirements ensure users are who they say they are.
- Authorization: Security requirements ensure users can access only certain functions within the application (by use case, subsystem, web page, business rule, field level, and so forth).
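Authorization and auditability can be sketched together: each access decision is checked against the user's roles and recorded. The role names, permission strings, and the in-memory `audit_log` below are hypothetical; a real system would store the trail remotely in append-only storage, as noted above.

```python
from dataclasses import dataclass, field
from typing import List, Set, Tuple

# Hypothetical role-to-permission mapping for the example.
ROLE_PERMISSIONS = {
    "viewer": {"report:read"},
    "editor": {"report:read", "report:write"},
}

# In-memory stand-in for an append-only audit trail.
audit_log: List[Tuple[str, str, bool]] = []


@dataclass
class User:
    name: str
    roles: Set[str] = field(default_factory=set)


def is_authorized(user: User, permission: str) -> bool:
    """Allow the action only if one of the user's roles grants the permission."""
    allowed = any(
        permission in ROLE_PERMISSIONS.get(role, set()) for role in user.roles
    )
    # Record every decision, including denials, so a breach can be traced later.
    audit_log.append((user.name, permission, allowed))
    return allowed


alice = User("alice", {"viewer"})
print(is_authorized(alice, "report:read"))
print(is_authorized(alice, "report:write"))
```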
Durability relates to software's serviceability and ability to meet users' needs for a relatively long time. Things to consider include:
- Replication: Share information to ensure consistency between redundant resources to improve reliability, fault-tolerance, or accessibility.
- Fault tolerance: This enables a system to continue operating correctly in the event of one or more faults within some of its components.
- Archivability: This determines whether the data needs to be archived or deleted after a period of time. For example, customer accounts might be deleted after three months, or marked as obsolete and archived in a secondary database for future access.
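The archivability example can be sketched as a simple retention rule. The code below splits accounts into those to keep and those to archive; it approximates "three months" as 90 days, and the record layout and the `partition_accounts` name are assumptions made for illustration.

```python
from datetime import datetime, timedelta, timezone

# Assumption: "three months" is approximated as 90 days of inactivity.
RETENTION = timedelta(days=90)


def partition_accounts(accounts, now):
    """Split accounts into (active, to_archive) based on last activity."""
    active, to_archive = [], []
    for account in accounts:
        if now - account["last_active"] > RETENTION:
            to_archive.append(account)   # candidate for the secondary database
        else:
            active.append(account)
    return active, to_archive


now = datetime(2024, 6, 1, tzinfo=timezone.utc)
accounts = [
    {"id": 1, "last_active": datetime(2024, 5, 1, tzinfo=timezone.utc)},
    {"id": 2, "last_active": datetime(2024, 1, 1, tzinfo=timezone.utc)},
]
active, to_archive = partition_accounts(accounts, now)
print([a["id"] for a in active], [a["id"] for a in to_archive])
```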
Agility is the ability of a team and its software to respond to changes quickly. Software development is all about modification, so agility is a key NFR. Key factors include:
- Maintainability: How easy is it to apply changes and enhance the system? Maintainability represents the degree to which developers can effectively and efficiently modify the software to improve, correct, or adapt it to changes in the environment and requirements.
- Testability: How easily can developers and others test the software?
- Ease of development: Can developers modify the software without introducing defects or degrading existing product quality?
- Deployability: This is the time it takes to get code into production.
- Installability: How easy is system installation on all necessary platforms?
- Upgradeability: How quick and easy is it to upgrade from a previous version of an application or solution to a newer version on servers and clients?
- Portability: Does the system need to run on more than one platform?
- Configurability: How easily can end users change aspects of the software's configuration (through usable interfaces)?
- Compatibility: How well can a product, system, or component exchange information with other products, systems, or components and perform its required functions while sharing the same hardware or software environment?
Now that you are familiar with the architectural characteristics or NFRs, you may be wondering which ones will fit your project needs. Or maybe all of them are required in your project. So how can you adapt these characteristics to your needs?
Once you understand the functional requirement, try to find any bottlenecks in the system that may add obstacles to primary functions. How do you find the bottleneck? Try answering a few of these questions:
- Will the system perform with a user base of 100 million or 1 billion users?
- Will the system handle 10,000 concurrent requests?
- Am I handling the data securely?
- Can I add more features easily without impacting the existing working features?
There are many other similar questions to help you determine the characteristics that will aid your project. Some of these questions can help identify a bottleneck or lower-performing areas, which are potential starting points for improving the system's overall reliability.
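A question such as "Will the system handle 10,000 concurrent requests?" can be turned into a rough number with Little's law, which relates average concurrency, throughput, and latency (concurrency = throughput × latency). The sketch below is a back-of-the-envelope estimate only; the 0.2-second latency and the per-instance throughput are hypothetical inputs, not measurements.

```python
import math


def required_throughput(concurrent_requests: float, avg_latency_seconds: float) -> float:
    """Requests per second needed to sustain the given concurrency (Little's law)."""
    if avg_latency_seconds <= 0:
        raise ValueError("avg_latency_seconds must be positive")
    return concurrent_requests / avg_latency_seconds


def instances_needed(throughput_rps: float, rps_per_instance: float) -> int:
    """Instances required if each one sustains `rps_per_instance` requests per second."""
    if rps_per_instance <= 0:
        raise ValueError("rps_per_instance must be positive")
    return math.ceil(throughput_rps / rps_per_instance)


# Hypothetical inputs: 10,000 in-flight requests, 0.2 s average latency,
# and 2,000 requests per second per instance.
rps = required_throughput(10_000, 0.2)
print(f"{rps:.0f} requests/second, {instances_needed(rps, 2_000)} instances")
```

If the estimate exceeds what the current design can serve, that gap points to the bottleneck to investigate first.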
This article is based on Top 10 architecture characteristics / nonfunctional requirements with cheatsheet on Devgenius and is republished with permission.