Thursday, March 9, 2023

Securing Public Website Domain - Part 3 - DNSSEC

The third part of this series covers DNSSEC. Every time the browser requests a website, the request goes through DNS. Since DNS is mostly an outsourced component, we hardly ever worry about its scalability and security. On evaluating a client's website against modern and reliable internet standards for DNS security, the first error reported the following information:


Too bad! Your domain is not signed with a valid signature (DNSSEC). Therefore visitors with enabled domain signature validation, are not protected against manipulated translation from your domain into rogue internet addresses. 


What is DNSSEC?


The Domain Name System Security Extensions (DNSSEC) is a critical technology to secure the Domain Name System (DNS). DNSSEC provides a layer of authentication and integrity checking to the DNS, ensuring that the information transmitted is trustworthy and has not been tampered with. 

DNSSEC is typically enabled by the domain registrar or the DNS hosting provider.
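
As a quick illustration (not part of the original evaluation), the sketch below uses the third-party dnspython package to check whether a domain publishes DNSKEY records, which is a prerequisite for DNSSEC. It only shows that the zone is signed; it does not validate the full chain of trust, and the domain names are placeholders.

import dns.resolver  # third-party "dnspython" package, assumed to be installed

def publishes_dnskey(domain: str) -> bool:
    """Return True if the zone publishes DNSKEY records (a sign it is DNSSEC-signed)."""
    try:
        dns.resolver.resolve(domain, "DNSKEY")
        return True
    except (dns.resolver.NoAnswer, dns.resolver.NXDOMAIN, dns.resolver.NoNameservers):
        return False

if __name__ == "__main__":
    for name in ("internet.nl", "example.com"):  # placeholder domains
        print(name, "->", "DNSKEY found" if publishes_dnskey(name) else "no DNSKEY")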


Vulnerabilities when a domain is not signed with a valid signature


While DNSSEC has been available for many years, it is surprising that many well-known websites still lack this signature from their DNS provider. These include major cloud providers like https://www.amazon.com and https://www.microsoft.com, and DNS providers like https://www.godaddy.com. 


As per https://internet.nl/faqs/dnssec/, there are several real-world incidents that DNSSEC could have prevented. The main classes of attack are described below.


One of the primary issues with not implementing DNSSEC is DNS cache poisoning. It occurs when an attacker can manipulate the DNS lookup process and direct users to a fraudulent website that looks identical to the legitimate site. It is achieved by intercepting DNS queries and responding with fake records modified to redirect the user elsewhere. DNSSEC prevents this attack by providing a mechanism to verify the authenticity of DNS records.


Another issue is DNS spoofing, which is similar to DNS cache poisoning but occurs when an attacker can inject false DNS records into the cache of a DNS resolver. It can allow the attacker to redirect traffic to malicious servers and intercept user communication. DNSSEC mitigates this by adding a layer of validation to the DNS records returned to the resolver.


Furthermore, DNSSEC can prevent man-in-the-middle (MITM) attacks by ensuring that the DNS records are authentic and have not been tampered with during transit. MITM, explained in Part-2 of this series, is when an attacker intercepts communication between the client and the server and alters the data. DNSSEC can prevent this by adding digital signatures to the DNS records.


Conclusion


The reasons why several websites do not implement this feature range from lack of awareness and cost to risk of breakage and a complex setup. But the adoption of DNSSEC is essential for ensuring the security and integrity of the DNS. 

Tuesday, March 7, 2023

Securing Public Website Domain - Part 2 - Implementing HSTS

Most of the time, the HTTP to HTTPS redirection for a website happens at the DNS, Edge, or Application layer. So when a user types the naked (non-www) domain of the website, there is still a window of vulnerability between the user's browser and these layers, because the browser does not know on its own to redirect the URL to HTTPS. 


SSL ensures an encrypted connection between the browser and the website. HSTS makes web browsers enforce this themselves, so that an encrypted HTTPS connection is used from the browser layer onwards.


The first thing to understand is what kind of vulnerabilities are present if HSTS (HTTP Strict Transport Security) is not implemented. These are very fundamental and similar to vulnerabilities without HTTPS implemented.


Vulnerabilities without HSTS


a) Man in the Middle attack is when attackers can intercept traffic between a user's browser and the website and manipulate the connection to use an unencrypted HTTP protocol instead of a secure HTTPS protocol. This way, attackers can read or modify the content of the communication, leading to potential data breaches, session hijacking, or phishing attacks.


b) Cookie hijacking: Without HSTS, attackers can intercept or tamper with session cookies sent over an unencrypted HTTP connection. This way, attackers gain access to user accounts and steal sensitive information, such as personal data or financial details.


c) SSL stripping: Attackers can use SSL stripping attacks to downgrade the communication from HTTPS to HTTP and intercept sensitive data sent over an insecure connection. This technique is often combined with phishing attacks, where users are redirected to fake websites designed to steal login credentials or other personal information.


d) DNS hijacking: Attackers can perform DNS hijacking attacks to redirect users to a malicious server instead of the intended website. It allows the attacker to intercept and manipulate the communication between the user's browser and the fake website, leading to potential data theft or malware infection.


e) SSL certificate fraud: Without HSTS, attackers can use fraudulent SSL certificates to impersonate a legitimate website and deceive users into sharing sensitive information. It is an issue when users are not aware of the legitimate website's SSL certificate or security indicators, as they may unknowingly trust a fraudulent certificate.


Implementing HSTS


To implement HSTS, the Strict-Transport-Security response header just needs to be sent as part of the website's HTTPS responses. The header should include the HSTS policy, which specifies how long browsers should remember to use HTTPS instead of HTTP when communicating with the website. 


A simple example of the header that needs to be added to the response is shown below:


Strict-Transport-Security: max-age=31536000; includeSubDomains


The max-age parameter specifies the number of seconds that the HSTS policy will be in effect. In this case, it is set to one year (31536000 seconds). The includeSubDomains parameter tells the browser to apply the HSTS policy to all subdomains of the website as well.
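
As an illustrative sketch (assuming a Python/Flask application; any web framework or reverse proxy can set the same header, and the route is a placeholder), the snippet below adds the Strict-Transport-Security header to every response:

from flask import Flask  # assumes the Flask package is available

app = Flask(__name__)

@app.after_request
def add_hsts(response):
    # Tell browsers to use HTTPS for one year, including subdomains.
    # The header is only honored when the response is served over HTTPS.
    response.headers["Strict-Transport-Security"] = "max-age=31536000; includeSubDomains"
    return response

@app.route("/")
def index():
    return "Hello over HTTPS"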


Once you've added the response header to the website's HTTPS responses, any time a user visits the site, their browser will remember to use HTTPS instead of HTTP for the specified period of time, one year as per the above example.


In summary, without HSTS, websites are vulnerable to a wide range of attacks that can lead to data breaches, identity theft, and other serious consequences. Implementing HSTS is the first crucial step in ensuring the security and privacy of website users.


Securing your public-facing website domain - Part 1

One of the often-overlooked aspects of a website or domain is whether it implements proper SSL/TLS standards, and gaps here leave it insufficiently secure. Casually browsing through free online tools like https://www.ssllabs.com/ssltest/ and https://www.internet.nl/ made me realize how far behind we were from the realities of modern website security and privacy. 

Serving the website over SSL does provide a secure layer that encrypts traffic between the web server and the browser. However, we need to stay updated to keep ourselves away from newer vulnerabilities.

The free online tool gave our website a RED rating, and I had to take the entire report and address each item. Below are all the items we started with and eventually addressed. I will go through each of them in detail in subsequent posts.

1. Implement HTTP Strict Transport Security (HSTS) 

2. Implement Domain Name System Security Extensions (DNSSEC) 

3. Avoid mixed content: Ensure that all resources on your website, including images, scripts, and stylesheets, are served over HTTPS. Avoid including content from non-HTTPS sources.

4. Implement secure cookie settings: Use the Secure and HttpOnly flags on cookies to ensure that they are only transmitted over HTTPS and cannot be accessed by malicious scripts.

5. Use security headers: Implement security headers, such as Content Security Policy (CSP), X-Frame-Options, X-Content-Type-Options, and Referrer-Policy, to protect against cross-site scripting (XSS) and clickjacking attacks (a quick check for these headers is sketched after this list).

6. Ensure the website has the latest TLS version enabled, and that old, unsupported ciphers are removed.

7. Proper redirection from HTTP to HTTPS on the same domain, for both the www and naked domains.
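
To make items 4 and 5 concrete, here is a minimal sketch using only the Python standard library that fetches a URL and reports which of these security headers (and cookie flags) are present. The URL is a placeholder and the header list is not exhaustive.

import urllib.request

SECURITY_HEADERS = (
    "Strict-Transport-Security",
    "Content-Security-Policy",
    "X-Frame-Options",
    "X-Content-Type-Options",
    "Referrer-Policy",
)

def report_security_headers(url: str) -> None:
    """Print which common security headers and cookie flags a site returns."""
    with urllib.request.urlopen(url) as response:
        for header in SECURITY_HEADERS:
            print(f"{header}: {response.headers.get(header) or 'MISSING'}")
        # Cookies should carry the Secure and HttpOnly flags.
        for cookie in response.headers.get_all("Set-Cookie") or []:
            flags = [f for f in ("Secure", "HttpOnly") if f.lower() in cookie.lower()]
            print("Set-Cookie flags:", flags or "none set")

if __name__ == "__main__":
    report_security_headers("https://www.example.com")  # placeholder URL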


Saturday, February 25, 2023

Building a Chatbot using WebSockets on Azure Services - Part 1

I got an opportunity to build a chatbot application on Azure, and there was a dilemma over whether to use HTTP polling or WebSockets for interacting with the client. The moment WebSockets were discussed, we got a lot of caution from the organization's architecture team regarding the complexity and the technical drawbacks. 

Integrating WebSockets for a chatbot using Microsoft Azure Services can greatly enhance the user experience by providing real-time, bidirectional communication between the chatbot and the user. Microsoft Azure provides a number of services that can be used to implement a WebSockets-based chatbot solution, including Azure SignalR Service, Azure Application Gateway, and Azure Functions.
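
For illustration only (Azure SignalR Service abstracts this plumbing away in the actual solution), the sketch below shows the bare bidirectional WebSocket pattern using the third-party websockets package; the port and the canned reply are placeholders, and a recent version of the package is assumed.

import asyncio
import websockets  # third-party "websockets" package, assumed installed (recent version)

async def chat_handler(websocket):
    # Every message from the user gets an immediate reply on the same open connection.
    async for message in websocket:
        await websocket.send(f"bot: you said '{message}'")

async def main():
    # Serve WebSocket connections on localhost:8765 (placeholder host and port).
    async with websockets.serve(chat_handler, "localhost", 8765):
        await asyncio.Future()  # run until cancelled

if __name__ == "__main__":
    asyncio.run(main())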

One of the key advantages of using Azure SignalR Service is that it provides a fully managed service for adding real-time functionality to applications, making it easy to implement WebSockets in a chatbot solution. By using Azure Application Gateway, incoming traffic from users can be redirected to the chatbot's backend, and the WebSockets connection between the chatbot and the user can be secured using SSL offloading and authentication.

In addition to these services, Microsoft Azure also provides a number of tools for monitoring and troubleshooting, such as Azure Monitor and Azure Log Analytics, to ensure that the chatbot solution is performing optimally and that any issues are quickly identified and resolved.

To ensure a seamless user experience, it is important to plan for scalability from the start, so that the chatbot solution can handle increasing traffic and user numbers. This can be achieved using Azure's auto-scaling and load-balancing features.

Finally, thorough testing is critical to ensure that the chatbot implementation meets the requirements and expectations. This includes testing the WebSockets connection through the Azure Application Gateway, as well as any other components of the chatbot solution.

In conclusion, by using Microsoft Azure Services, it is possible to implement a highly performant and scalable WebSockets-based chatbot solution that provides a real-time, responsive experience for users.

Sunday, January 8, 2023

Tackling and Mitigating a Distributed Network attack

A Distributed Denial of Service (DDoS) attack on a public website can have severe consequences for businesses, including lost revenue, reputational damage, and even legal liability. DDoS attacks involve overwhelming a website with traffic from multiple sources, making it inaccessible to legitimate users.

There was a high-volume DDoS attack recently on one of the public sites that I am responsible for. I was in the midst of all the action around the clock and helped mitigate this issue. There was an initial glitch, but the site has been running stable for the customers, with a lot of work done behind the scenes.

We did have a finely tuned WAF layer, but this was a pure layer 7 volumetric attack from across the globe, purposely targeting the application layer. The attack came in staggered waves, with throughput in the millions of requests per minute. Most well-known WAFs available in the market with advanced DDoS protection and ML-based pattern detection have limitations at such high volumes.  

Some of the best practices to prevent a DDoS attack include:

Implement DDoS protection measures
The first step in tackling a DDoS attack is to implement DDoS protection measures. These measures can include web application firewalls (WAF), load balancers, and intrusion prevention systems (IPS). Additionally, businesses can use cloud-based DDoS protection services, which can automatically detect and mitigate attacks. Protection needs to be applied at different layers of the OSI model. Also, most modern WAF products have managed rules along with advanced bot protection that needs to be fine-tuned.  

Develop a response plan
Organizations should have a plan in place for how to respond to a DDoS attack. This kind of attack can happen at any time, and a proper plan should include a response team, communication protocols, and steps to address the attack. Another key element is that teams should regularly test the plan to ensure it is effective. 

Monitor network traffic
Monitoring network traffic is critical in identifying a DDoS attack on a public website. This can be achieved by using network traffic analysis tools that can identify spikes in traffic and alert the security team in real time. Capturing logs along with ready-made queries can help identify and monitor malicious traffic. 

Block malicious traffic
One of the most effective ways to mitigate a DDoS attack is to block malicious traffic. This can be achieved by using access control lists (ACLs) and firewalls to block traffic from known malicious IP addresses or Geo Locations. Additionally, teams can use rate limiting, geo-blocks, automatic pattern detection, and URL blocks to limit the amount of traffic coming from specific sources.
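
As a hedged illustration of per-source rate limiting (real deployments would enforce this in the WAF, CDN, or edge layer rather than in application code), here is a minimal sliding-window limiter; the limit and window values are arbitrary examples.

import time
from collections import defaultdict, deque

class SlidingWindowRateLimiter:
    """Allow at most `limit` requests per `window_seconds` for each client key (e.g. source IP)."""

    def __init__(self, limit: int, window_seconds: float):
        self.limit = limit
        self.window = window_seconds
        self.hits = defaultdict(deque)

    def allow(self, client_key: str) -> bool:
        now = time.monotonic()
        timestamps = self.hits[client_key]
        # Discard timestamps that have aged out of the window.
        while timestamps and now - timestamps[0] > self.window:
            timestamps.popleft()
        if len(timestamps) >= self.limit:
            return False  # block, challenge, or tarpit this request
        timestamps.append(now)
        return True

limiter = SlidingWindowRateLimiter(limit=100, window_seconds=60)  # 100 requests/minute per IP
print(limiter.allow("203.0.113.7"))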

Use Content Delivery Networks (CDNs)
CDNs can help to mitigate the impact of a DDoS attack by distributing traffic across multiple servers. This can help to absorb the attack and keep the website online. Additionally, CDNs can offer DDoS protection services that can automatically detect and mitigate attacks.

Tooling and Testing
Having appropriate alerts and notifications along with pattern detection can help identify malicious traffic from time to time. Performing a shadow DDoS test along with timely load testing the infrastructure can help to size the application appropriately. 

Educate users
Finally, teams should educate their users about the risks of DDoS attacks and how to prevent them. This can be achieved through regular training and awareness campaigns. Additionally, businesses can encourage users to report any suspicious activity or traffic they observe.

In conclusion, a DDoS attack on a public website can have significant consequences for businesses. There is no one solution to prevent an attack and the mitigation plan varies from application to application. But the key here is to understand the events of malicious traffic and address specific attack vectors. Also, having a WAF and continuous tuning of the WAF is required to maximize app protection without causing false positives. 

Sunday, January 1, 2023

Distributed Denial of Service attacks on different OSI layers - Part 3

A Distributed Denial of Service (DDoS) attack is a malicious attempt to make a network or website unavailable to its users by overwhelming it with traffic from multiple sources. These attacks can occur at different layers of the Open Systems Interconnection (OSI) model, and each layer presents different challenges for mitigation. In this article, we will discuss DDoS attacks on different OSI layers with examples.

Layer 3 (Network Layer)

DDoS attacks at the network layer target the routing of IP packets. These attacks aim to consume network bandwidth, making the targeted service unavailable to legitimate users. An example of a network layer DDoS attack is the Ping of Death attack, where an attacker sends oversized ping packets to a target, causing the system to crash or become unavailable.

Mitigation strategies for network layer DDoS attacks include implementing access control lists (ACLs) to filter out unwanted traffic and deploying routers with built-in DDoS protection features.

Layer 4 (Transport Layer)

DDoS attacks at the transport layer target the transport protocol, such as Transmission Control Protocol (TCP) or User Datagram Protocol (UDP). These attacks aim to consume server resources, making the targeted service unavailable to legitimate users. An example of a transport layer DDoS attack is the SYN flood attack, where an attacker sends a flood of TCP SYN requests to a server, consuming server resources and causing the service to become unavailable.

Mitigation strategies for transport layer DDoS attacks include implementing rate limiting and implementing SYN cookies to prevent SYN flood attacks.
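
As a small, Linux-specific illustration (the path and values below apply to the Linux kernel's sysctl interface, exposed via procfs), this sketch checks whether TCP SYN cookies are enabled on a server:

from pathlib import Path

SYNCOOKIES = Path("/proc/sys/net/ipv4/tcp_syncookies")  # Linux sysctl via procfs

def syn_cookies_enabled() -> bool:
    """Return True if SYN cookies are enabled (1 = used when needed, 2 = always on)."""
    return SYNCOOKIES.read_text().strip() in ("1", "2")

if __name__ == "__main__":
    print("TCP SYN cookies enabled:", syn_cookies_enabled())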

Layer 7 (Application Layer)

DDoS attacks at the application layer target the application protocol, such as HTTP or HTTPS. These attacks aim to consume server resources, making the targeted service unavailable to legitimate users. An example of an application layer DDoS attack is the HTTP Flood attack, where an attacker sends a large number of HTTP requests to a server, consuming server resources and causing the service to become unavailable.

Mitigation strategies for application layer DDoS attacks include implementing web application firewalls (WAFs) to filter out unwanted traffic, implementing rate limiting, and using CDN services to distribute the load across multiple servers.

In conclusion, DDoS attacks can occur at different layers of the OSI model, and each layer presents unique challenges for mitigation. Mitigation strategies for DDoS attacks include implementing ACLs, deploying routers with built-in DDoS protection features, implementing rate limiting, using SYN cookies, implementing WAFs, and using CDN services. By being prepared and implementing these strategies, businesses can mitigate the risks of DDoS attacks and ensure their systems remain available to legitimate users.




Sunday, December 11, 2022

Fine Tuning a WAF to avoid False Positives - Part 2

This week has been an action-packed week with some high-volume DDoS attacks on one of our web applications. We have been spending a lot of time understanding the importance of having a WAF for all our client-facing public domains. In today's cloud architecture, Web Application Firewalls (WAFs) are a crucial part of any organization's security posture. They protect web applications from DoS and DDoS attacks, as well as from attacks such as SQL injection, cross-site scripting (XSS), and other malicious activities. However, WAFs need to be fine-tuned regularly to ensure they provide maximum protection without causing false positives. In this article, we will discuss some best practices we followed to fine-tune a WAF and prevent multiple attacks on our application.

1. Understand the web application: The first step in fine-tuning a WAF is to understand the web application it is protecting. This includes identifying the application's components, such as the web server, application server, and database. Additionally, it is essential to identify the web application's behavior, including the type of traffic it receives, the HTTP methods it uses, and the expected user behavior. Understanding the web application will help to identify which rules should be enabled or disabled in the WAF.

2. Configure WAF logging: WAF logging is a critical component of fine-tuning. It allows security teams to analyze WAF events and understand which rules generate false positives. WAF logs should be enabled for all rules, and log data should be retained for an extended period, such as 90 days or more (a small log-triage sketch follows this list).

3. Start with a default configuration: WAFs come with a default configuration that provides a good starting point for fine-tuning. Start with the default configuration and enable or disable rules as necessary. Additionally, some WAFs have pre-built templates for specific applications, such as WordPress or Drupal. These templates can be an excellent starting point for fine-tuning.

4. Test the WAF: Once the WAF is configured, it is essential to test it thoroughly. The WAF should be tested with a variety of traffic, including legitimate traffic and malicious traffic. This will help identify any false positives or negatives generated by the WAF.

5. Tune the WAF: Based on the results of testing, the WAF should be fine-tuned. This may include enabling or disabling rules, adjusting rule thresholds, or creating custom rules to address specific attack vectors. Additionally, WAFs may have machine learning or AI capabilities that can help to reduce false positives.

6. Monitor the WAF: After fine-tuning, the WAF should be monitored regularly to ensure it is providing maximum protection without causing false positives. WAF logs should be analyzed regularly, and any anomalies should be investigated immediately.
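
Building on step 2 above, here is an illustrative sketch only (the field names 'action' and 'ruleId' and the JSON-lines layout are placeholders; every WAF product has its own log schema) that counts which rules block traffic most often, a useful starting point for false-positive triage.

import json
from collections import Counter

def top_blocking_rules(log_path: str, top_n: int = 10):
    """Count how often each WAF rule blocked a request in a JSON-lines log file."""
    counts = Counter()
    with open(log_path) as log_file:
        for line in log_file:
            event = json.loads(line)
            if event.get("action") == "BLOCK":               # placeholder field name
                counts[event.get("ruleId", "unknown")] += 1  # placeholder field name
    return counts.most_common(top_n)

if __name__ == "__main__":
    for rule, hits in top_blocking_rules("waf-events.jsonl"):  # placeholder file name
        print(rule, hits)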

In conclusion, fine-tuning a WAF is a critical component of any organization's security posture. It requires a thorough understanding of the web application, careful configuration, and extensive testing. Additionally, WAFs should be regularly monitored and fine-tuned to ensure they provide maximum protection without generating false positives. By following these best practices, organizations can ensure their WAFs provide maximum protection against web application attacks.


Thursday, December 8, 2022

Demystifying the hidden costs after moving to the Cloud

The web application at a client was hosted using a combination of services on Azure. The architecture was quite simple and used the following services: Front Door, API Management, App Service, SQL Database, Service Bus, Redis Cache, and Azure Functions. As the application matured, we realized how little we had thought about the hidden costs of the cloud at the start of the project.

Azure Front Door was used for efficient load balancing, WAF, content delivery (CDN), and as a DNS. However, the global routing of requests through Microsoft's network incurred data transfer and routing costs. What started as a seamless solution for an enhanced user experience turned into a realization that global accessibility came at a price. Also, the complexity of configuring backend pools, health probes, and routing rules can lead to unintended expenses if not optimized.

App Service had a modest cost to begin with on low-scale Premium servers. But as the application garnered more hits, the number of users grew and, subsequently, so did the resources consumed. The need for auto-scaling to handle increased traffic and custom domains brought unforeseen expenses, turning the initially reasonable hosting costs into a growing concern. So, keep an eye on the server configuration and the frequency of scaling events.

Azure SQL Database brought both power and complexity. Scaling to meet performance demands led to increased DTU consumption and storage requirements. The once manageable monthly expenses now reflected the intricate dance between database size, transaction units, and backup storage. Not scaling down the backups also incurred costs, especially for databases with high transaction rates. Inefficient queries and suboptimal indexing can increase resource consumption, impacting DTU usage and costs.

Azure Service Bus, the messenger between the application's distributed components, began with reasonable costs for message ingress and egress. Yet, as the communication patterns grew, the charges for additional features like transactions and dead-lettering added expenses to the budget. Also, long message TTLs can lead to increased storage costs. 

Azure Cache for Redis, used for in-memory data storage, initially provided high-performance benefits. However, as the application scaled to accommodate larger datasets, the costs associated with caching capacity and data transfer began to rise, challenging the notion that performance comes without a price. Eviction of data from the cache may result in increased data transfer costs, especially if the cache is frequently repopulated from the data source. Also, fine-tuning cache expiration policies is crucial to avoid unnecessary storage costs for stale or rarely accessed data.
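
As a hedged example of the expiration point above (assuming the redis-py client; the host, key, and TTL values are placeholders), setting an explicit TTL when writing to the cache keeps stale data from accumulating:

import redis  # third-party "redis" (redis-py) package, assumed installed

# Placeholder connection details; Azure Cache for Redis typically requires SSL and an access key.
cache = redis.Redis(host="example.redis.cache.windows.net", port=6380, password="<access-key>", ssl=True)

# Cache the rendered product page for one hour instead of keeping it forever.
cache.set("product:42:html", "<html>...</html>", ex=3600)
print(cache.ttl("product:42:html"))  # remaining lifetime in seconds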

Lastly, Azure Functions, with its pay-as-you-go model, was supposed to be the cheapest of all the services, as it allowed functions to be invoked only when needed. But the cumulative charges for executions, execution time, and additional resources reminded me that serverless, too, has its hidden costs. Including unnecessary dependencies in your functions can inflate execution times and costs.

Demystifying the expenses after moving to Azure required a keen understanding of its pricing models and a strategic approach to balancing innovation with fiscal responsibility.

Sunday, November 27, 2022

Choosing the right WAF for your Enterprise-wide Applications - Part 1

This is a multi-part series on how to protect a web application using a WAF. To start with, this part explains how to choose the right WAF for an Enterprise-wide web application. 

Web Application Firewalls (WAFs) are a crucial part of any organization's security infrastructure, protecting web applications from cyber threats. With so many WAFs available in the market, choosing the best one can be a daunting task. I have been reading Gartner reports, performing POCs, and trying to choose a tool that best suits the client. Below are the different criteria to consider when choosing the best WAF for your organization.

Security Features

When choosing a WAF, the first and most crucial criterion is its security features. The WAF should have strong protection against various cyber threats, including  DDoS, SQL injection, cross-site scripting (XSS), and other common OWASP web application vulnerabilities. Additionally, the WAF should offer threat intelligence services that provide continuous updates on the latest security threats and attack patterns.

Customization and Configuration

The WAF should be easily customizable and configurable to suit an organization's specific security needs. It should allow for custom rule creation, custom signature creation, and other customization options that let you fine-tune the WAF's security policies according to the organization's requirements. The ability to perform extensive rate limiting or geo-blocking is among the common requirements of a WAF.

Performance and Scalability

The WAF should offer excellent performance and scalability, especially for high-traffic websites or applications. It should be able to handle a large number of concurrent connections without compromising performance or introducing latency. Additionally, the WAF should be scalable, allowing an organization to expand and grow without requiring a complete WAF overhaul. In simple words, it should not be a single point of failure. 

Integration with Existing Security Infrastructure

The WAF should be easy to integrate with the organization's existing security infrastructure, including firewalls, intrusion detection and prevention systems (IDPS), and Security Information and Event Management (SIEM) systems. This integration should also allow for seamless communication and collaboration between the different security systems, providing a holistic approach to security.

Compliance and Regulations

The WAF should comply with various regulatory standards, such as the Payment Card Industry Data Security Standard (PCI DSS) or the General Data Protection Regulation (GDPR). Additionally, the WAF should be auditable, providing detailed logs and reports allowing compliance verification and audit trails.

Ease of Use and Management

The WAF should be easy to use and manage, with a user-friendly interface that allows security administrators to monitor and manage the WAF effectively. Additionally, the WAF should offer automation and orchestration capabilities, allowing for seamless deployment and management of the WAF across different environments.

In conclusion, choosing the best WAF for an organization requires careful consideration of various criteria, including security features, customization and configuration, performance and scalability, integration with existing security infrastructure, compliance and regulations, and ease of use and management. Selecting the right WAF that meets an organization's specific security needs can protect web applications from various cyber threats and ensure the organization's continued success.


Wednesday, August 3, 2022

Instilling the idea of Sustainability into Development Teams

Inculcating green coding practices and patterns within a development team is a modern-day challenge. It can go a long way toward reducing an organization's carbon footprint and meeting its long-term sustainability goals. 

Good green coding practices improve the quality of the software application and directly impact the energy efficiency of the infrastructure on which the software applications run. However, software developers in today's agile work environment seldom focus on anything beyond rapid solution building in reduced sprint cycles. They have all the modern frameworks and libraries at their behest, and writing energy-efficient code is not always the focus. Furthermore, modern data centers and cloud infrastructure provide developers with seemingly unlimited resources, resulting in high energy consumption and impacting the environment. 

Below are some of the factors that improve programming practices and can have a drastic impact on the Green Index.

a) Fundamental Programming Practices

Some of the fundamental programming practices start with proper Error and Exception handling. It also includes paying extra attention to the modularity and structure of the code and being prepared for unexpected deviation and behavior, especially when integrating with a different component or system.

b) Efficient Code Development

Efficient code development helps to make the code more readable and maintainable. Efficient code writing includes avoiding memory leaks and high CPU cycles, and managing network and infrastructure storage proficiently. It also includes avoiding expensive calls and unnecessary loops, and eliminating unessential operations. 
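
A tiny illustrative example (not from the original article) of how a small change eliminates unnecessary work: the same lookup performed against a list and against a set.

# Inefficient: membership tests against a list make this O(n * m).
orders = list(range(100_000))
flagged = list(range(0, 100_000, 7))
slow_matches = [o for o in orders if o in flagged]

# Efficient: a set gives O(1) average membership tests, same result with far fewer CPU cycles.
flagged_set = set(flagged)
fast_matches = [o for o in orders if o in flagged_set]

assert slow_matches == fast_matches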

c) Secured Programming Mindset

A secured programming mindset ensures that the software application has no weak security features or vulnerabilities. Secured programming also includes protecting data, data encoding, and encryption. OWASP vulnerability list awareness and performing timely Penetration testing assessments ensure the application code is compliant with the required level of security.

d) Avoidance of Complexity

Complex code is the least modified and the hardest to follow. A piece of code may in the future be modified by several different developers, and hence avoiding complexity when writing code goes a long way toward keeping the code maintainable. Reducing the cyclomatic complexity of methods by dividing the code and logic into smaller reusable components helps the code remain simple and easy to understand. 
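
As a small, hypothetical illustration of this point, extracting decisions into small named functions keeps the top-level flow flat and easy to follow; the order fields and discount rate are made up for the example.

def is_eligible(order: dict) -> bool:
    # One decision per helper keeps each unit trivial to test.
    return order["total"] > 0 and order["status"] == "confirmed"

def apply_discount(order: dict, rate: float = 0.1) -> dict:
    order["total"] *= (1 - rate)
    return order

def process(orders: list) -> list:
    # The top-level flow reads as a simple pipeline instead of nested if/else blocks.
    return [apply_discount(o) for o in orders if is_eligible(o)]

print(process([{"total": 100.0, "status": "confirmed"}, {"total": 50.0, "status": "new"}]))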

e) Clean Architecture concepts

Understanding Clean Architecture concepts is essential to allow changes to codebases. In a layered architecture, understanding the concerns of tight coupling and weak cohesion helps with reusability, minimizes disruption from changes, and avoids rewriting code and capabilities. 

Conclusion

As Architects and developers, it is essential to collect Green metrics on a timely basis and evaluate the compliance and violations of the code. Measuring these coding practices can be done using various static code analysis tools. The tools can further be integrated into the IDE, at the code compilation or deployment layer, or even as a standalone tool. 

With organizations in several industries now focusing on individual sustainability goals, green coding practices have become an integral part of every software developer's work. The little tweaks to our development approach can contribute immensely to reducing environmental impact in the long run.

Monday, July 18, 2022

Using the Well-Architected Framework to Address Technical Debt - Part 1

Since getting my Well-Architected Framework proficiency certification a year back, I have become a massive fan of the framework and have used it extensively at work. The Well-Architected Framework is a tool with a set of standards and questionnaires that illustrates design patterns, key concepts, design principles, and best practices for designing, architecting, and running workloads in the cloud.

All major cloud providers like AWS, Azure, Google, and Oracle have defined the framework foundation, and they continue to evolve them with their platforms and services. 

Organizations that have moved to the cloud have a different set of challenges. As all workloads are running in the cloud, the typical requirement from the business is for more agility and a focus on shipping functionality to production. Teams invest far less in reducing technical debt. This leads to reactive fixes rather than proactive, continuous improvement, and a huge pile of epics to resolve.

The Well-Architected Framework (WAF) suits teams that are unsure where to start with technical debt in terms of priority. The fundamental pillars of the WAF are:

a) System design b) Operational Excellence c) Security d) Reliability e) Performance f) Cost optimization and the newly added pillar g) Sustainability.



The framework can be fine-tuned to fit custom requirements based on the application domain. The framework is also apt to address typical Cloud challenges like the high cost of cloud subscriptions, Application Performance tuning, Cloud security, Operation Challenges in a Cloud or Hybrid setup, Quick recoveries from failure, and improvement on organizations' Green Index.

A dashboard helps to view the technical debt once the questionnaire is filled in against the WAF pillars. The diagram below illustrates the WAF dashboard heatmap and the technical debt based on prioritization and impact. The dashboard stresses the needed improvements and helps to measure the changes implemented by comparing them against all the possible best practices. 



Performing these reviews on a timely basis helps the team identify unknown risks and mitigate problems very early. The WAF reviews fit well with Agile ways of working and the principle of continuous improvement. 

Below are the links to Well-Architected Frameworks described by different cloud vendors.





Monday, June 13, 2022

AWS Migration and Modernization Gameday Experience

I was at the AWS GameDay, and it was a fun learning experience partnering with fellow colleagues and competitors. AWS has created this concept quite differently from a typical hackathon; it is more like a gamified activity in a much more stress-free environment.

For the Migration and Modernization GameDay, it was a 6-hour activity with an hour's break for lunch (most of us had it by our desks). We were asked to migrate a 2-tier e-commerce application to AWS in a specific region with all AWS services at our behest. This specific GameDay was level 300 and required at least an associate certification, but I felt even non-experts with some AWS hands-on knowledge could also contribute immensely to the team.

The first part of the day went into setting up the basic infrastructure on new VPCs and following certain guidelines to migrate databases (using DMS) and web servers (using App Migration service). We followed the AWS documentation for the migration part.

The fun part of the gameday was in the latter half of the session post-lunch when the basic migration was completed and we had to switch the DNS from the on-premise to the cloud infrastructure. That’s when the application is throttled with real-world traffic, volumetric attacks, fault injections, etc. The better your application performed the more points you got and vice versa.


Here are some of the learnings for folks wanting to participate in the next Gameday.

a) Be thorough with the networking concept in AWS. Outline your end-state network architecture view and naming conventions to begin with. As you will be on the console this will help avoid confusion.

b) Plan all the AWS services that would be the right fit for the requirements. Since it's a real-world environment scenario, extra points are awarded to teams that include all different AWS services.

E.g. Cloudfront as CDN, AWS WAF for firewall, AWS Guard Duty for threat detection, AWS Cloudwatch for monitoring, etc.

c) Ensure that the architecture follows the well-architected framework pillars:

Operation Excellence, Security, Reliability, Performance Efficiency, Cost Optimization, Sustainability

d) Segregate tasks between team members and ensure to review each other's changes, especially the networking part which is typically confusing. 

e) Last but not least, if stuck in any step for long, try to get the application running first and then try implementing the right principles.


Monday, April 4, 2022

AWS managed Blockchain Blog

I have been part of an interesting case study on AWS-managed blockchain. Glad to be part of authoring the new AWS blog post on AWS Managed Blockchain.

--> https://aws.amazon.com/blogs/apn/capgemini-simplifies-the-letter-of-credit-process-with-amazon-managed-blockchain/

Sunday, March 20, 2022

The Sustainable Enterprise - Why cloud is key to business sustainability

I have been writing several articles on this topic and am pleased to contribute to this newly released white paper on the topic of "How Enterprises can achieve sustainable IT via the cloud", teaming up with Microsoft. It was nice to share an Architect's view and work with some of the market-leading experts on this topic.

Download the white paper here: 

 https://www.capgemini.com/se-en/resources/the-sustainable-enterprise-why-cloud-is-key-to-business-sustainability/ 

Friday, February 4, 2022

Harnessing Green Cloud computing to achieve Sustainable IT evolution

A few months back, I had written an article about Sustainability explaining what it is all about when it comes to software development. Since then I have come across this topic in several forums, including discussions with multiple client organizations that have pledged to quantify and improve on this subject. 


Organizations that move their applications to cloud services tremendously improve their IT environmental impact and their goals of being sustainable. There are several factors that an enterprise has to consider, beyond just selecting a cloud provider, to be considered environmentally sustainable.


Focusing on the following six areas can help organizations kick-start their Green IT revolution on the cloud.




a) Cost Aware Architecture thinking

In applications built on cloud infrastructure, there are several moving parts with innumerable services. Organizations that have moved to the cloud often find it very difficult to be cost-aware and to ensure optimal usage of these services.

They are so engrossed in building their core business applications that they don't invest in cost-aware architecture teams that focus on optimizing spending by eliminating unprovisioned infrastructure, resizing or terminating underutilized resources, and using lifecycle management. Practices like energy audits, alerts, and IT cloud analysis help to identify costs and to pinpoint systems that need to be greened.


Cloud-provided services like Azure Advisor and AWS Trusted Advisor help to optimize and reduce overall cloud expenditure by recommending solutions to improve cost-effectiveness. Services like Azure Cost Management and Billing, AWS Cost Explorer, and AWS Budgets can be used to analyze, understand, calculate, monitor, and forecast costs. 

b) Sustainable development

Building applications using modern technologies and cloud services helps optimize development code and ensures faster deployments. It also enables a reduction in redundant storage and in end users' energy consumption. 

Sustainable development on the cloud has many parts. It involves an end-to-end, holistic view of how the data traverses the system. Improving load times by optimizing caching strategies reduces data size, data transfer quantity, and bandwidth. With new, innovative edge service solutions, and by serving content from the appropriate systems, energy-efficient applications can be built that reduce the distance the data travels. 

c) Agile Architecture

One of the core Agile principles is to promote sustainable development and improve ways of working by making the development teams deliver at a consistent pace.

Cloud services provide tools like Azure DevOps and the AWS DevOps tooling, which give development teams a common place to organize, plan, collaborate on code development, and build and deploy applications. They allow organizations to create and improve products faster than traditional software development approaches.

d) Increase Observability

There is a direct correlation between an organization's Observability maturity and Sustainability. In Observability, the focus is to cultivate ways of working within development teams to have a holistic data-driven mindset when solving system issues. The concept of Observability is becoming more and more prominent with the emergence and improvement of AI and ML-based services.

Services that improve automated diagnostics and automatic infrastructure healing, and the advent of myriad services used for deep code and infrastructure drill-downs, real-time analysis, debugging and profiling, alerts and notifications, logging and tracing, and so on, indirectly help an organization's return on investment and increase productivity.

e) Consumption-based Utilization

Rightly sized applications, enhanced deployment strategies, automated backup plans, and designing systems using the cloud's well-architected frameworks result in better utilization of the underlying hardware and its energy. It also serves the organization's long-term goals of reducing consumption and power usage, improving network efficiency, and securing systems. Utilizing the right cloud computing service also helps applications scale up or out appropriately. 

Using cloud-provided Carbon tracking calculators helps gauge systems or applications that require better optimization in terms of performance or better infrastructure. 

Conclusion

With AWS introducing Sustainability as the sixth pillar, green cloud computing has become an interesting topic for organizations across different domains. While we have all come across tons of articles predicting how to save the world from natural catastrophes and climate change, when it comes to software development on the cloud, it is the foundational changes that one can start with to bring about the transformation.

Monday, November 15, 2021

The fundamental principles for using microservices for modernization

Over the last few years, I have spent a lot of time building new applications on microservices and moving parts of monoliths to microservices. I have researched and tried to share my practical experience in several articles on this topic.

This week, my second blog on some foundational principles of microservices was published on the Capgemini website.

https://www.capgemini.com/se-en/2021/11/the-fundamental-principles-for-using-microservices-for-modernization/

Wednesday, October 20, 2021

How to manage the move to microservices in a mature way

Over the last few years, I have spent a lot of time building new applications on microservices and moving parts of monoliths to microservices. I have researched and tried to share my practical experience in several articles on this topic.

This week, my very first blog on this topic was published on the Capgemini website.

 https://www.capgemini.com/se-en/2021/10/how-to-manage-the-move-to-microservices-in-a-mature-way/

Friday, September 3, 2021

The advent of Observability Driven Development

A distributed application landscape with high cardinality makes it difficult for dedicated operations teams to monitor system behavior via a dashboard or react promptly to system alerts and notifications. In a microservices architecture with several moving parts, detecting failures becomes cumbersome, and developers end up looking for errors like finding a needle in a haystack.

What is Observability?

Observability is more than a quality attribute and one level above monitoring, where the focus applies more to cultivating ways of working within development teams to have a holistic data-driven mindset when it comes to solving system issues.

An observability thought process enables development teams to embed the monitoring aspect right at the nascent stage of development and testing.

Observability in a DevSecOps ecosystem

Several organizations are adopting a DevSecOps culture, and it has become essential for development teams to become self-reliant and take a proactive approach to identifying, healing, and preventing system faults. DevOps focuses on giving development teams the ability to make rapid decisions and more control over infrastructure assets. Observability enhances this by empowering development teams to be more instinctive when it comes to identifying system faults.

Furthermore, the modern ways of working with Agile, Test Driven Development, and Automation enable development teams to get deep insights into operations that can potentially be prone to failures.

Observability on Cloud platforms

Applications deployed on the cloud provide development teams with a myriad of out-of-the-box system measurements. Developers can gauge and derive the quality attributes of a system even before the code goes into production. Cloud services make it easy to collate information like metrics, diagnostics, logs, and traces for analysis, and they are available at the developer's behest. AI-based automated diagnostics along with real-time data give developers deep insight into their system's semantics and characteristics.

Conclusion

Observability is more of an open-ended process of inculcating modern development principles to increase the reliability of complex distributed systems. The Observability mindset helps organizations resolve production issues speedily and reduces the dependency on, and cost of, manual operations. It also helps development teams build dependable systems, giving end customers a seamless user experience.

Building Microservices by decreasing Entropy and increasing Negentropy - Series Part 5

Microservice’s journey is all about gradually overhaul, every time you make a change you need to keep the system in a better state or the ...