Thursday, March 9, 2023

Securing Public Website Domain - Part 3 - DNSSEC

The third part of this series covers DNSSEC. Every time the browser requests a website, the request goes through DNS. Since DNS is mostly an outsourced component, we hardly ever worry about its scalability and security. On evaluating a client's website against modern and reliable internet standards for DNS security, the first error reported the following information:


Too bad! Your domain is not signed with a valid signature (DNSSEC). Therefore visitors with enabled domain signature validation, are not protected against manipulated translation from your domain into rogue internet addresses. 


What is DNSSEC?


The Domain Name System Security Extensions (DNSSEC) is a critical technology to secure the Domain Name System (DNS). DNSSEC provides a layer of authentication and integrity checking to the DNS, ensuring that the information transmitted is trustworthy and has not been tampered with. 

DNSSEC is typically enabled by the domain registrar or the DNS hosting provider.
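
As a quick illustration (not part of the original evaluation), the sketch below uses the third-party dnspython package to check whether a domain publishes DNSKEY records, which is a prerequisite for DNSSEC. It only shows that the zone is signed; it does not validate the full chain of trust, and the domain names are placeholders.

import dns.resolver  # third-party "dnspython" package, assumed to be installed

def publishes_dnskey(domain: str) -> bool:
    """Return True if the zone publishes DNSKEY records (a sign it is DNSSEC-signed)."""
    try:
        dns.resolver.resolve(domain, "DNSKEY")
        return True
    except (dns.resolver.NoAnswer, dns.resolver.NXDOMAIN, dns.resolver.NoNameservers):
        return False

if __name__ == "__main__":
    for name in ("internet.nl", "example.com"):  # placeholder domains
        print(name, "->", "DNSKEY found" if publishes_dnskey(name) else "no DNSKEY")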


Vulnerabilities when a domain is not signed with a valid signature


While DNSSEC has been available for many years, it is surprising that many well-known websites still lack this signature from their DNS provider. These include major cloud providers like https://www.amazon.com and https://www.microsoft.com, and DNS providers like https://www.godaddy.com. 


As per https://internet.nl/faqs/dnssec/, there are several real-world incidents that DNSSEC could have prevented. The main classes of attack are described below.


One of the primary issues with not implementing DNSSEC is DNS cache poisoning. It occurs when an attacker can manipulate the DNS lookup process and direct users to a fraudulent website that looks identical to the legitimate site. It is achieved by intercepting DNS queries and responding with fake records modified to redirect the user elsewhere. DNSSEC prevents this attack by providing a mechanism to verify the authenticity of DNS records.


Another issue is DNS spoofing, which is similar to DNS cache poisoning but occurs when an attacker can inject false DNS records into the cache of a DNS resolver. It can allow the attacker to redirect traffic to malicious servers and intercept user communication. DNSSEC mitigates this by adding a layer of validation to the DNS records returned to the resolver.


Furthermore, DNSSEC can prevent man-in-the-middle (MITM) attacks by ensuring that the DNS records are authentic and have not been tampered with during transit. MITM, explained in Part-2 of this series, is when an attacker intercepts communication between the client and the server and alters the data. DNSSEC can prevent this by adding digital signatures to the DNS records.


Conclusion


The reasons why several websites do not implement this feature range from lack of awareness and cost to risk of breakage and a complex setup. But the adoption of DNSSEC is essential for ensuring the security and integrity of the DNS. 

Tuesday, March 7, 2023

Securing Public Website Domain - Part 2 - Implementing HSTS

Most of the time, the HTTP to HTTPS redirection for a website happens at the DNS, Edge, or Application layer. So when a user types the naked (non-www) domain of the website, there is still a window of vulnerability between the user's browser and these layers, because the browser does not know on its own to redirect the URL to HTTPS. 


SSL ensures an encrypted connection between the browser and the website. HSTS makes web browsers enforce this themselves, so that an encrypted HTTPS connection is used from the browser layer onwards.


The first thing to understand is what kind of vulnerabilities are present if HSTS (HTTP Strict Transport Security) is not implemented. These are very fundamental and similar to vulnerabilities without HTTPS implemented.


Vulnerabilities without HSTS


a) Man in the Middle attack is when attackers can intercept traffic between a user's browser and the website and manipulate the connection to use an unencrypted HTTP protocol instead of a secure HTTPS protocol. This way, attackers can read or modify the content of the communication, leading to potential data breaches, session hijacking, or phishing attacks.


b) Cookie hijacking: Without HSTS, attackers can intercept or tamper with session cookies sent over an unencrypted HTTP connection. This way, attackers gain access to user accounts and steal sensitive information, such as personal data or financial details.


c) SSL stripping: Attackers can use SSL stripping attacks to downgrade the communication from HTTPS to HTTP and intercept sensitive data sent over an insecure connection. This technique is often combined with phishing attacks, where users are redirected to fake websites designed to steal login credentials or other personal information.


d) DNS hijacking: Attackers can perform DNS hijacking attacks to redirect users to a malicious server instead of the intended website. It allows the attacker to intercept and manipulate the communication between the user's browser and the fake website, leading to potential data theft or malware infection.


e) SSL certificate fraud: Without HSTS, attackers can use fraudulent SSL certificates to impersonate a legitimate website and deceive users into sharing sensitive information. It is an issue when users are not aware of the legitimate website's SSL certificate or security indicators, as they may unknowingly trust a fraudulent certificate.


Implementing HSTS


To implement HSTS, the Strict-Transport-Security response header just needs to be sent as part of the website's HTTPS responses. The header should include the HSTS policy, which specifies how long browsers should remember to use HTTPS instead of HTTP when communicating with the website. 


A simple example of the header that needs to be added to the response is shown below:


Strict-Transport-Security: max-age=31536000; includeSubDomains


The max-age parameter specifies the number of seconds that the HSTS policy will be in effect. In this case, it is set to one year (31536000 seconds). The includeSubDomains parameter tells the browser to apply the HSTS policy to all subdomains of the website as well.
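
As an illustrative sketch (assuming a Python/Flask application; any web framework or reverse proxy can set the same header, and the route is a placeholder), the snippet below adds the Strict-Transport-Security header to every response:

from flask import Flask  # assumes the Flask package is available

app = Flask(__name__)

@app.after_request
def add_hsts(response):
    # Tell browsers to use HTTPS for one year, including subdomains.
    # The header is only honored when the response is served over HTTPS.
    response.headers["Strict-Transport-Security"] = "max-age=31536000; includeSubDomains"
    return response

@app.route("/")
def index():
    return "Hello over HTTPS"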


Once you've added the response header to the website's HTTPS responses, any time a user visits the site, their browser will remember to use HTTPS instead of HTTP for the specified period of time, one year as per the above example.


In summary, without HSTS, websites are vulnerable to a wide range of attacks that can lead to data breaches, identity theft, and other serious consequences. Implementing HSTS is the first crucial step in ensuring the security and privacy of website users.


Securing your public-facing website domain - Part 1

One of the often-overlooked aspects of a website or domain is whether it implements proper SSL/TLS standards, and gaps here leave it insufficiently secure. Casually browsing through free online tools like https://www.ssllabs.com/ssltest/ and https://www.internet.nl/ made me realize how far behind we were from the realities of modern website security and privacy. 

Serving the website over SSL does provide a secure layer that encrypts traffic between the web server and the browser. However, we need to stay updated to keep ourselves away from newer vulnerabilities.

The free online tool gave our website a RED rating, and I had to take the entire report and address each item. Below are all the items we started with and eventually addressed. I will go through each of them in detail in subsequent posts.

1. Implement HTTP Strict Transport Security (HSTS) 

2. Implement Domain Name System Security Extensions (DNSSEC) 

3. Avoid mixed content: Ensure that all resources on your website, including images, scripts, and stylesheets, are served over HTTPS. Avoid including content from non-HTTPS sources.

4. Implement secure cookie settings: Use the Secure and HttpOnly flags on cookies to ensure that they are only transmitted over HTTPS and cannot be accessed by malicious scripts.

5. Use security headers: Implement security headers, such as Content Security Policy (CSP), X-Frame-Options, X-Content-Type-Options, and Referrer-Policy, to protect against cross-site scripting (XSS) and clickjacking attacks (a quick check for these headers is sketched after this list).

6. Ensure the website has the latest TLS version enabled, and that old, unsupported ciphers are removed.

7. Proper redirection from HTTP to HTTPS on the same domain, for both the www and naked domains.
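
To make items 4 and 5 concrete, here is a minimal sketch using only the Python standard library that fetches a URL and reports which of these security headers (and cookie flags) are present. The URL is a placeholder and the header list is not exhaustive.

import urllib.request

SECURITY_HEADERS = (
    "Strict-Transport-Security",
    "Content-Security-Policy",
    "X-Frame-Options",
    "X-Content-Type-Options",
    "Referrer-Policy",
)

def report_security_headers(url: str) -> None:
    """Print which common security headers and cookie flags a site returns."""
    with urllib.request.urlopen(url) as response:
        for header in SECURITY_HEADERS:
            print(f"{header}: {response.headers.get(header) or 'MISSING'}")
        # Cookies should carry the Secure and HttpOnly flags.
        for cookie in response.headers.get_all("Set-Cookie") or []:
            flags = [f for f in ("Secure", "HttpOnly") if f.lower() in cookie.lower()]
            print("Set-Cookie flags:", flags or "none set")

if __name__ == "__main__":
    report_security_headers("https://www.example.com")  # placeholder URL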


Saturday, February 25, 2023

Building a Chatbot using WebSockets on Azure Services - Part 1

I got an opportunity to build a chatbot application on Azure, and there was a dilemma over whether to use HTTP polling or WebSockets for interacting with the client. The moment WebSockets were discussed, we got a lot of caution from the organization's architecture team regarding the complexity and the technical drawbacks. 

Integrating WebSockets for a chatbot using Microsoft Azure Services can greatly enhance the user experience by providing real-time, bidirectional communication between the chatbot and the user. Microsoft Azure provides a number of services that can be used to implement a WebSockets-based chatbot solution, including Azure SignalR Service, Azure Application Gateway, and Azure Functions.
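
For illustration only (Azure SignalR Service abstracts this plumbing away in the actual solution), the sketch below shows the bare bidirectional WebSocket pattern using the third-party websockets package; the port and the canned reply are placeholders, and a recent version of the package is assumed.

import asyncio
import websockets  # third-party "websockets" package, assumed installed (recent version)

async def chat_handler(websocket):
    # Every message from the user gets an immediate reply on the same open connection.
    async for message in websocket:
        await websocket.send(f"bot: you said '{message}'")

async def main():
    # Serve WebSocket connections on localhost:8765 (placeholder host and port).
    async with websockets.serve(chat_handler, "localhost", 8765):
        await asyncio.Future()  # run until cancelled

if __name__ == "__main__":
    asyncio.run(main())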

One of the key advantages of using Azure SignalR Service is that it provides a fully managed service for adding real-time functionality to applications, making it easy to implement WebSockets in a chatbot solution. By using Azure Application Gateway, incoming traffic from users can be redirected to the chatbot's backend, and the WebSockets connection between the chatbot and the user can be secured using SSL offloading and authentication.

In addition to these services, Microsoft Azure also provides a number of tools for monitoring and troubleshooting, such as Azure Monitor and Azure Log Analytics, to ensure that the chatbot solution is performing optimally and that any issues are quickly identified and resolved.

To ensure a seamless user experience, it is important to plan for scalability from the start, so that the chatbot solution can handle increasing traffic and user numbers. This can be achieved using Azure's auto-scaling and load-balancing features.

Finally, thorough testing is critical to ensure that the chatbot implementation meets the requirements and expectations. This includes testing the WebSockets connection through the Azure Application Gateway, as well as any other components of the chatbot solution.

In conclusion, by using Microsoft Azure Services, it is possible to implement a highly performant and scalable WebSockets-based chatbot solution that provides a real-time, responsive experience for users.

Sunday, January 8, 2023

Tackling and Mitigating a Distributed Network attack

A Distributed Denial of Service (DDoS) attack on a public website can have severe consequences for businesses, including lost revenue, reputational damage, and even legal liability. DDoS attacks involve overwhelming a website with traffic from multiple sources, making it inaccessible to legitimate users.

There was a high-volume DDoS attack recently on one of the public sites that I am responsible for. I was in the midst of all the action around the clock and helped mitigate this issue. There was an initial glitch, but the site has been running stable for the customers, with a lot of work done behind the scenes.

We did have a finely tuned WAF layer, but this was a pure layer 7 volumetric attack from across the globe, purposely targeting the application layer. The attack came in staggered waves, with throughput in the millions of requests per minute. Most well-known WAFs available in the market with advanced DDoS protection and ML-based pattern detection have limitations at such high volumes.  

Some of the best practices to prevent a DDoS attack include:

Implement DDoS protection measures
The first step in tackling a DDoS attack is to implement DDoS protection measures. These measures can include web application firewalls (WAF), load balancers, and intrusion prevention systems (IPS). Additionally, businesses can use cloud-based DDoS protection services, which can automatically detect and mitigate attacks. Protection needs to be applied at different layers of the OSI model. Also, most modern WAF products have managed rules along with advanced bot protection that needs to be fine-tuned.  

Develop a response plan
Organizations should have a plan in place for how to respond to a DDoS attack. This kind of attack can happen at any time, and a proper plan should include a response team, communication protocols, and steps to address the attack. Another key element is that teams should regularly test the plan to ensure it is effective. 

Monitor network traffic
Monitoring network traffic is critical in identifying a DDoS attack on a public website. This can be achieved by using network traffic analysis tools that can identify spikes in traffic and alert the security team in real time. Capturing logs along with ready-made queries can help identify and monitor malicious traffic. 

Block malicious traffic
One of the most effective ways to mitigate a DDoS attack is to block malicious traffic. This can be achieved by using access control lists (ACLs) and firewalls to block traffic from known malicious IP addresses or Geo Locations. Additionally, teams can use rate limiting, geo-blocks, automatic pattern detection, and URL blocks to limit the amount of traffic coming from specific sources.
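
As a hedged illustration of per-source rate limiting (real deployments would enforce this in the WAF, CDN, or edge layer rather than in application code), here is a minimal sliding-window limiter; the limit and window values are arbitrary examples.

import time
from collections import defaultdict, deque

class SlidingWindowRateLimiter:
    """Allow at most `limit` requests per `window_seconds` for each client key (e.g. source IP)."""

    def __init__(self, limit: int, window_seconds: float):
        self.limit = limit
        self.window = window_seconds
        self.hits = defaultdict(deque)

    def allow(self, client_key: str) -> bool:
        now = time.monotonic()
        timestamps = self.hits[client_key]
        # Discard timestamps that have aged out of the window.
        while timestamps and now - timestamps[0] > self.window:
            timestamps.popleft()
        if len(timestamps) >= self.limit:
            return False  # block, challenge, or tarpit this request
        timestamps.append(now)
        return True

limiter = SlidingWindowRateLimiter(limit=100, window_seconds=60)  # 100 requests/minute per IP
print(limiter.allow("203.0.113.7"))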

Use Content Delivery Networks (CDNs)
CDNs can help to mitigate the impact of a DDoS attack by distributing traffic across multiple servers. This can help to absorb the attack and keep the website online. Additionally, CDNs can offer DDoS protection services that can automatically detect and mitigate attacks.

Tooling and Testing
Having appropriate alerts and notifications along with pattern detection can help identify malicious traffic from time to time. Performing a shadow DDoS test along with timely load testing the infrastructure can help to size the application appropriately. 

Educate users
Finally, teams should educate their users about the risks of DDoS attacks and how to prevent them. This can be achieved through regular training and awareness campaigns. Additionally, businesses can encourage users to report any suspicious activity or traffic they observe.

In conclusion, a DDoS attack on a public website can have significant consequences for businesses. There is no one solution to prevent an attack and the mitigation plan varies from application to application. But the key here is to understand the events of malicious traffic and address specific attack vectors. Also, having a WAF and continuous tuning of the WAF is required to maximize app protection without causing false positives. 

Sunday, January 1, 2023

Distributed Denial of Service attacks on different OSI layers - Part 3

A Distributed Denial of Service (DDoS) attack is a malicious attempt to make a network or website unavailable to its users by overwhelming it with traffic from multiple sources. These attacks can occur at different layers of the Open Systems Interconnection (OSI) model, and each layer presents different challenges for mitigation. In this article, we will discuss DDoS attacks on different OSI layers with examples.

Layer 3 (Network Layer)

DDoS attacks at the network layer target the routing of IP packets. These attacks aim to consume network bandwidth, making the targeted service unavailable to legitimate users. An example of a network layer DDoS attack is the Ping of Death attack, where an attacker sends oversized ping packets to a target, causing the system to crash or become unavailable.

Mitigation strategies for network layer DDoS attacks include implementing access control lists (ACLs) to filter out unwanted traffic and deploying routers with built-in DDoS protection features.

Layer 4 (Transport Layer)

DDoS attacks at the transport layer target the transport protocol, such as Transmission Control Protocol (TCP) or User Datagram Protocol (UDP). These attacks aim to consume server resources, making the targeted service unavailable to legitimate users. An example of a transport layer DDoS attack is the SYN flood attack, where an attacker sends a flood of TCP SYN requests to a server, consuming server resources and causing the service to become unavailable.

Mitigation strategies for transport layer DDoS attacks include implementing rate limiting and implementing SYN cookies to prevent SYN flood attacks.
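
As a small, Linux-specific illustration (the path and values below apply to the Linux kernel's sysctl interface, exposed via procfs), this sketch checks whether TCP SYN cookies are enabled on a server:

from pathlib import Path

SYNCOOKIES = Path("/proc/sys/net/ipv4/tcp_syncookies")  # Linux sysctl via procfs

def syn_cookies_enabled() -> bool:
    """Return True if SYN cookies are enabled (1 = used when needed, 2 = always on)."""
    return SYNCOOKIES.read_text().strip() in ("1", "2")

if __name__ == "__main__":
    print("TCP SYN cookies enabled:", syn_cookies_enabled())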

Layer 7 (Application Layer)

DDoS attacks at the application layer target the application protocol, such as HTTP or HTTPS. These attacks aim to consume server resources, making the targeted service unavailable to legitimate users. An example of an application layer DDoS attack is the HTTP Flood attack, where an attacker sends a large number of HTTP requests to a server, consuming server resources and causing the service to become unavailable.

Mitigation strategies for application layer DDoS attacks include implementing web application firewalls (WAFs) to filter out unwanted traffic, implementing rate limiting, and using CDN services to distribute the load across multiple servers.

In conclusion, DDoS attacks can occur at different layers of the OSI model, and each layer presents unique challenges for mitigation. Mitigation strategies for DDoS attacks include implementing ACLs, deploying routers with built-in DDoS protection features, implementing rate limiting, using SYN cookies, implementing WAFs, and using CDN services. By being prepared and implementing these strategies, businesses can mitigate the risks of DDoS attacks and ensure their systems remain available to legitimate users.




Sunday, December 11, 2022

Fine Tuning a WAF to avoid False Positives - Part 2

This week has been an action-packed week with some high-volume DDoS attacks on one of our web applications. We have been spending a lot of time understanding the importance of having a WAF for all our client-facing public domains. In today's cloud architecture, Web Application Firewalls (WAFs) are a crucial part of any organization's security posture. They protect web applications from DoS and DDoS attacks, as well as from attacks such as SQL injection, cross-site scripting (XSS), and other malicious activities. However, WAFs need to be fine-tuned regularly to ensure they provide maximum protection without causing false positives. In this article, we will discuss some best practices we followed to fine-tune a WAF and prevent multiple attacks on our application.

1. Understand the web application: The first step in fine-tuning a WAF is to understand the web application it is protecting. This includes identifying the application's components, such as the web server, application server, and database. Additionally, it is essential to identify the web application's behavior, including the type of traffic it receives, the HTTP methods it uses, and the expected user behavior. Understanding the web application will help to identify which rules should be enabled or disabled in the WAF.

2. Configure WAF logging: WAF logging is a critical component of fine-tuning. It allows security teams to analyze WAF events and understand which rules generate false positives. WAF logs should be enabled for all rules, and log data should be retained for an extended period, such as 90 days or more (a small log-triage sketch follows this list).

3. Start with a default configuration: WAFs come with a default configuration that provides a good starting point for fine-tuning. Start with the default configuration and enable or disable rules as necessary. Additionally, some WAFs have pre-built templates for specific applications, such as WordPress or Drupal. These templates can be an excellent starting point for fine-tuning.

4. Test the WAF: Once the WAF is configured, it is essential to test it thoroughly. The WAF should be tested with a variety of traffic, including legitimate traffic and malicious traffic. This will help identify any false positives or negatives generated by the WAF.

5. Tune the WAF: Based on the results of testing, the WAF should be fine-tuned. This may include enabling or disabling rules, adjusting rule thresholds, or creating custom rules to address specific attack vectors. Additionally, WAFs may have machine learning or AI capabilities that can help to reduce false positives.

6. Monitor the WAF: After fine-tuning, the WAF should be monitored regularly to ensure it is providing maximum protection without causing false positives. WAF logs should be analyzed regularly, and any anomalies should be investigated immediately.
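
Building on step 2 above, here is an illustrative sketch only (the field names 'action' and 'ruleId' and the JSON-lines layout are placeholders; every WAF product has its own log schema) that counts which rules block traffic most often, a useful starting point for false-positive triage.

import json
from collections import Counter

def top_blocking_rules(log_path: str, top_n: int = 10):
    """Count how often each WAF rule blocked a request in a JSON-lines log file."""
    counts = Counter()
    with open(log_path) as log_file:
        for line in log_file:
            event = json.loads(line)
            if event.get("action") == "BLOCK":               # placeholder field name
                counts[event.get("ruleId", "unknown")] += 1  # placeholder field name
    return counts.most_common(top_n)

if __name__ == "__main__":
    for rule, hits in top_blocking_rules("waf-events.jsonl"):  # placeholder file name
        print(rule, hits)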

In conclusion, fine-tuning a WAF is a critical component of any organization's security posture. It requires a thorough understanding of the web application, careful configuration, and extensive testing. Additionally, WAFs should be regularly monitored and fine-tuned to ensure they provide maximum protection without generating false positives. By following these best practices, organizations can ensure their WAFs provide maximum protection against web application attacks.


Thursday, December 8, 2022

Demystifying the hidden costs after moving to the Cloud

The web application at a client was hosted using a combination of services on Azure. The architecture was quite simple and used the following services: Front Door, API Management, App Service, SQL Database, Service Bus, Redis Cache, and Azure Functions. As the application matured, we realized how little we had thought about the hidden costs of the cloud at the start of the project.

Azure Front Door was used for efficient load balancing, WAF, content delivery (CDN), and as a DNS. However, the global routing of requests through Microsoft's network incurred data transfer and routing costs. What started as a seamless solution for an enhanced user experience turned into a realization that global accessibility came at a price. Also, the complexity of configuring backend pools, health probes, and routing rules can lead to unintended expenses if not optimized.

App Service had a modest cost to begin with on low-scale Premium servers. But as the application garnered more hits, the number of users grew and, subsequently, so did the resources consumed. The need for auto-scaling to handle increased traffic and custom domains brought unforeseen expenses, turning the initially reasonable hosting costs into a growing concern. So, keep an eye on the server configuration and the frequency of scaling events.

Azure SQL Database brought both power and complexity. Scaling to meet performance demands led to increased DTU consumption and storage requirements. The once manageable monthly expenses now reflected the intricate dance between database size, transaction units, and backup storage. Not scaling down the backups also incurred costs, especially for databases with high transaction rates. Inefficient queries and suboptimal indexing can increase resource consumption, impacting DTU usage and costs.

Azure Service Bus, the messenger between the application's distributed components, began with reasonable costs for message ingress and egress. Yet, as the communication patterns grew, the charges for additional features like transactions and dead-lettering added expenses to the budget. Also, long message TTLs can lead to increased storage costs. 

Azure Cache for Redis, used for in-memory data storage, initially provided high-performance benefits. However, as the application scaled to accommodate larger datasets, the costs associated with caching capacity and data transfer began to rise, challenging the notion that performance comes without a price. Eviction of data from the cache may result in increased data transfer costs, especially if the cache is frequently repopulated from the data source. Also, fine-tuning cache expiration policies is crucial to avoid unnecessary storage costs for stale or rarely accessed data.
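
As a hedged example of the expiration point above (assuming the redis-py client; the host, key, and TTL values are placeholders), setting an explicit TTL when writing to the cache keeps stale data from accumulating:

import redis  # third-party "redis" (redis-py) package, assumed installed

# Placeholder connection details; Azure Cache for Redis typically requires SSL and an access key.
cache = redis.Redis(host="example.redis.cache.windows.net", port=6380, password="<access-key>", ssl=True)

# Cache the rendered product page for one hour instead of keeping it forever.
cache.set("product:42:html", "<html>...</html>", ex=3600)
print(cache.ttl("product:42:html"))  # remaining lifetime in seconds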

Lastly, Azure Functions, with its pay-as-you-go model, was supposed to be the cheapest of all the services, as it allowed functions to be invoked only when needed. But the cumulative charges for executions, execution time, and additional resources reminded me that serverless, too, has its hidden costs. Including unnecessary dependencies in your functions can inflate execution times and costs.

Demystifying the expenses after moving to Azure required a keen understanding of its pricing models and a strategic approach to balancing innovation with fiscal responsibility.

Sunday, November 27, 2022

Choosing the right WAF for your Enterprise-wide Applications - Part 1

This is a multi-part series on how to protect a web application using a WAF. To start with, this part explains how to choose the right WAF for an Enterprise-wide web application. 

Web Application Firewalls (WAFs) are a crucial part of any organization's security infrastructure, protecting web applications from cyber threats. With so many WAFs available in the market, choosing the best one can be a daunting task. I have been reading Gartner reports, performing POCs, and trying to choose a tool that best suits the client. Below are the different criteria to consider when choosing the best WAF for your organization.

Security Features

When choosing a WAF, the first and most crucial criterion is its security features. The WAF should have strong protection against various cyber threats, including  DDoS, SQL injection, cross-site scripting (XSS), and other common OWASP web application vulnerabilities. Additionally, the WAF should offer threat intelligence services that provide continuous updates on the latest security threats and attack patterns.

Customization and Configuration

The WAF should be easily customizable and configurable to suit an organization's specific security needs. It should allow for custom rule creation, custom signature creation, and other customization options that let you fine-tune the WAF's security policies according to the organization's requirements. The ability to perform extensive rate limiting or geo-blocking is among the common requirements of a WAF.

Performance and Scalability

The WAF should offer excellent performance and scalability, especially for high-traffic websites or applications. It should be able to handle a large number of concurrent connections without compromising performance or introducing latency. Additionally, the WAF should be scalable, allowing an organization to expand and grow without requiring a complete WAF overhaul. In simple words, it should not be a single point of failure. 

Integration with Existing Security Infrastructure

The WAF should be easy to integrate with the organization's existing security infrastructure, including firewalls, intrusion detection and prevention systems (IDPS), and Security Information and Event Management (SIEM) systems. This integration should also allow for seamless communication and collaboration between the different security systems, providing a holistic approach to security.

Compliance and Regulations

The WAF should comply with various regulatory standards, such as the Payment Card Industry Data Security Standard (PCI DSS) or the General Data Protection Regulation (GDPR). Additionally, the WAF should be auditable, providing detailed logs and reports allowing compliance verification and audit trails.

Ease of Use and Management

The WAF should be easy to use and manage, with a user-friendly interface that allows security administrators to monitor and manage the WAF effectively. Additionally, the WAF should offer automation and orchestration capabilities, allowing for seamless deployment and management of the WAF across different environments.

In conclusion, choosing the best WAF for an organization requires careful consideration of various criteria, including security features, customization and configuration, performance and scalability, integration with existing security infrastructure, compliance and regulations, and ease of use and management. Selecting the right WAF that meets an organization's specific security needs can protect web applications from various cyber threats and ensure the organization's continued success.


Wednesday, August 3, 2022

Instilling the idea of Sustainability into Development Teams

Inculcating green coding practices and patterns within a development team is a modern-day challenge. It can go a long way toward reducing an organization's carbon footprint and meeting its long-term sustainability goals. 

Good green coding practices improve the quality of the software application and directly impact the energy efficiency of the infrastructure on which the software applications run. However, software developers in today's agile work environment seldom focus on anything beyond rapid solution building in reduced sprint cycles. They have all the modern frameworks and libraries at their behest, and writing energy-efficient code is not always the focus. Furthermore, modern data centers and cloud infrastructure provide developers with seemingly unlimited resources, resulting in high energy consumption and impacting the environment. 

Below are some of the factors that improve programming practices and can have a drastic impact on the Green Index.

a) Fundamental Programming Practices

Some of the fundamental programming practices start with proper Error and Exception handling. It also includes paying extra attention to the modularity and structure of the code and being prepared for unexpected deviation and behavior, especially when integrating with a different component or system.

b) Efficient Code Development

Efficient code development helps to make the code more readable and maintainable. Efficient code writing includes avoiding memory leaks and high CPU cycles, and managing network and infrastructure storage proficiently. It also includes avoiding expensive calls and unnecessary loops, and eliminating unessential operations. 
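
A tiny illustrative example (not from the original article) of how a small change eliminates unnecessary work: the same lookup performed against a list and against a set.

# Inefficient: membership tests against a list make this O(n * m).
orders = list(range(100_000))
flagged = list(range(0, 100_000, 7))
slow_matches = [o for o in orders if o in flagged]

# Efficient: a set gives O(1) average membership tests, same result with far fewer CPU cycles.
flagged_set = set(flagged)
fast_matches = [o for o in orders if o in flagged_set]

assert slow_matches == fast_matches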

c) Secured Programming Mindset

A secured programming mindset ensures that the software application has no weak security features or vulnerabilities. Secured programming also includes protecting data, data encoding, and encryption. OWASP vulnerability list awareness and performing timely Penetration testing assessments ensure the application code is compliant with the required level of security.

d) Avoidance of Complexity

Complex code is the least modified and the hardest to follow. A piece of code may in the future be modified by several different developers, and hence avoiding complexity when writing code goes a long way toward keeping the code maintainable. Reducing the cyclomatic complexity of methods by dividing the code and logic into smaller reusable components helps the code remain simple and easy to understand. 
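
As a small, hypothetical illustration of this point, extracting decisions into small named functions keeps the top-level flow flat and easy to follow; the order fields and discount rate are made up for the example.

def is_eligible(order: dict) -> bool:
    # One decision per helper keeps each unit trivial to test.
    return order["total"] > 0 and order["status"] == "confirmed"

def apply_discount(order: dict, rate: float = 0.1) -> dict:
    order["total"] *= (1 - rate)
    return order

def process(orders: list) -> list:
    # The top-level flow reads as a simple pipeline instead of nested if/else blocks.
    return [apply_discount(o) for o in orders if is_eligible(o)]

print(process([{"total": 100.0, "status": "confirmed"}, {"total": 50.0, "status": "new"}]))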

e) Clean Architecture concepts

Understanding Clean Architecture concepts is essential to allow changes to codebases. In a layered architecture, understanding the concerns of tight coupling and weak cohesion helps with reusability, minimizes disruption from changes, and avoids rewriting code and capabilities. 

Conclusion

As Architects and developers, it is essential to collect Green metrics on a timely basis and evaluate the compliance and violations of the code. Measuring these coding practices can be done using various static code analysis tools. The tools can further be integrated into the IDE, at the code compilation or deployment layer, or even as a standalone tool. 

With organizations in several industries now focusing on individual sustainability goals, green coding practices have become an integral part of every software developer's work. The little tweaks to our development approach can contribute immensely to reducing environmental impact in the long run.

Monday, July 18, 2022

Using the Well-Architected Framework to Address Technical Debt - Part 1

Since getting my Well-Architected Framework proficiency certification a year back, I have become a massive fan of the framework and have used it extensively at work. The Well-Architected Framework is a tool with a set of standards and questionnaires that illustrates design patterns, key concepts, design principles, and best practices for designing, architecting, and running workloads in the cloud.

All major cloud providers like AWS, Azure, Google, and Oracle have defined the framework foundation, and they continue to evolve them with their platforms and services. 

Organizations that have moved to the cloud have a different set of challenges. As all workloads are running in the cloud, the typical requirement from the business is for more agility and a focus on shipping functionality to production. Teams invest far less in reducing technical debt. This leads to reactive fixes rather than proactive, continuous improvement, and a huge pile of epics to resolve.

The Well-Architected Framework (WAF) suits teams that are unsure where to start with technical debt in terms of priority. The fundamental pillars of the WAF are:

a) System design b) Operational Excellence c) Security d) Reliability e) Performance f) Cost optimization and the newly added pillar g) Sustainability.



The framework can be fine-tuned to fit custom requirements based on the application domain. The framework is also apt to address typical Cloud challenges like the high cost of cloud subscriptions, Application Performance tuning, Cloud security, Operation Challenges in a Cloud or Hybrid setup, Quick recoveries from failure, and improvement on organizations' Green Index.

A dashboard helps to view the technical debt once the questionnaire is filled in against the WAF pillars. The diagram below illustrates the WAF dashboard heatmap and the technical debt based on prioritization and impact. The dashboard stresses the needed improvements and helps to measure the changes implemented by comparing them against all the possible best practices. 



Performing these reviews on a timely basis helps the team identify unknown risks and mitigate problems very early. The WAF reviews fit well with Agile ways of working and the principle of continuous improvement. 

Below are the links to Well-Architected Frameworks described by different cloud vendors.





Monday, June 13, 2022

AWS Migration and Modernization Gameday Experience

I was at the AWS GameDay, and it was a fun learning experience partnering with fellow colleagues and competitors. AWS has created this concept quite differently from a typical hackathon; it is more like a gamified activity in a much more stress-free environment.

For the Migration and Modernization GameDay, it was a 6-hour activity with an hour's break for lunch (most of us had it by our desks). We were asked to migrate a 2-tier e-commerce application to AWS in a specific region with all AWS services at our behest. This specific GameDay was level 300 and required at least an associate certification, but I felt even non-experts with some AWS hands-on knowledge could also contribute immensely to the team.

The first part of the day went into setting up the basic infrastructure on new VPCs and following certain guidelines to migrate databases (using DMS) and web servers (using App Migration service). We followed the AWS documentation for the migration part.

The fun part of the gameday was in the latter half of the session post-lunch when the basic migration was completed and we had to switch the DNS from the on-premise to the cloud infrastructure. That’s when the application is throttled with real-world traffic, volumetric attacks, fault injections, etc. The better your application performed the more points you got and vice versa.


Here are some of the learnings for folks wanting to participate in the next Gameday.

a) Be thorough with the networking concept in AWS. Outline your end-state network architecture view and naming conventions to begin with. As you will be on the console this will help avoid confusion.

b) Plan all the AWS services that would be the right fit for the requirements. Since it's a real-world environment scenario, extra points are awarded to teams that include all different AWS services.

E.g. Cloudfront as CDN, AWS WAF for firewall, AWS Guard Duty for threat detection, AWS Cloudwatch for monitoring, etc.

c) Ensure that the architecture follows the well-architected framework pillars:

Operation Excellence, Security, Reliability, Performance Efficiency, Cost Optimization, Sustainability

d) Segregate tasks between team members and ensure to review each other's changes, especially the networking part which is typically confusing. 

e) Last but not least, if stuck in any step for long, try to get the application running first and then try implementing the right principles.


Monday, April 4, 2022

AWS managed Blockchain Blog

I have been part of an interesting case study on AWS-managed blockchain. Glad to be part of authoring the new AWS blog post on AWS Managed Blockchain.

--> https://aws.amazon.com/blogs/apn/capgemini-simplifies-the-letter-of-credit-process-with-amazon-managed-blockchain/

Sunday, March 20, 2022

The Sustainable Enterprise - Why cloud is key to business sustainability

I have been writing several articles on this topic and am pleased to contribute to this newly released white paper on the topic of "How Enterprises can achieve sustainable IT via the cloud", teaming up with Microsoft. It was nice to share an Architect's view and work with some of the market-leading experts on this topic.

Download the white paper here: 

 https://www.capgemini.com/se-en/resources/the-sustainable-enterprise-why-cloud-is-key-to-business-sustainability/ 

Friday, February 4, 2022

Harnessing Green Cloud computing to achieve Sustainable IT evolution

A few months back, I had written an article about Sustainability explaining what it is all about when it comes to software development. Since then I have come across this topic in several forums, including discussions with multiple client organizations that have pledged to quantify and improve on this subject. 


Organizations that move their applications to cloud services tremendously improve their IT environmental impact and their goals of being sustainable. There are several factors that an enterprise has to consider, beyond just selecting a cloud provider, to be considered environmentally sustainable.


Focusing on the following six areas can help organizations kick-start their Green IT revolution on the cloud.




a) Cost Aware Architecture thinking

In applications built on cloud infrastructure, there are several moving parts with innumerable services. Organizations that have moved to the cloud often find it very difficult to be cost-aware and to ensure optimal usage of these services.

They are so engrossed in building their core business applications that they don't invest in cost-aware architecture teams that focus on optimizing spending by eliminating unprovisioned infrastructure, resizing or terminating underutilized resources, and using lifecycle management. Practices like energy audits, alerts, and IT cloud analysis help to identify costs and to pinpoint systems that need to be greened.


Cloud-provided services like Azure Advisor and AWS Trusted Advisor help to optimize and reduce overall cloud expenditure by recommending solutions to improve cost-effectiveness. Services like Azure Cost Management and Billing, AWS Cost Explorer, and AWS Budgets can be used to analyze, understand, calculate, monitor, and forecast costs. 

b) Sustainable development

Building applications using modern technologies and cloud services helps optimize development code and ensures faster deployments. It also enables a reduction in redundant storage and in end users' energy consumption. 

Sustainable development on the cloud has many parts. It involves an end-to-end, holistic view of how the data traverses the system. Improving load times by optimizing caching strategies reduces data size, data transfer quantity, and bandwidth. With new, innovative edge service solutions, and by serving content from the appropriate systems, energy-efficient applications can be built that reduce the distance the data travels. 

c) Agile Architecture

One of the core Agile principles is to promote sustainable development and improve ways of working by making the development teams deliver at a consistent pace.

Cloud services provide tools like Azure DevOps and the AWS DevOps tooling, which give development teams a common place to organize, plan, collaborate on code development, and build and deploy applications. They allow organizations to create and improve products faster than traditional software development approaches.

d) Increase Observability

There is a direct correlation between an organization's Observability maturity and Sustainability. In Observability, the focus is to cultivate ways of working within development teams to have a holistic data-driven mindset when solving system issues. The concept of Observability is becoming more and more prominent with the emergence and improvement of AI and ML-based services.

Services that improve automated diagnostics and automatic infrastructure healing, and the advent of myriad services used for deep code and infrastructure drill-downs, real-time analysis, debugging and profiling, alerts and notifications, logging and tracing, and so on, indirectly help an organization's return on investment and increase productivity.

e) Consumption-based Utilization

Rightly sized applications, enhanced deployment strategies, automated backup plans, and designing systems using the cloud's well-architected frameworks result in better utilization of the underlying hardware and its energy. It also serves the organization's long-term goals of reducing consumption and power usage, improving network efficiency, and securing systems. Utilizing the right cloud computing service also helps applications scale up or out appropriately. 

Using cloud-provided Carbon tracking calculators helps gauge systems or applications that require better optimization in terms of performance or better infrastructure. 

Conclusion

With AWS introducing Sustainability as the sixth pillar, green cloud computing has become an interesting topic for organizations across different domains. While we have all come across tons of articles predicting how to save the world from natural catastrophes and climate change, when it comes to software development on the cloud, it is the foundational changes that one can start with to bring about the transformation.

Monday, November 15, 2021

The fundamental principles for using microservices for modernization

Over the last few years, I have spent a lot of time building new applications on microservices and moving parts of monoliths to microservices. I have researched and tried to share my practical experience in several articles on this topic.

This week, my second blog on some foundational principles of microservices was published on the Capgemini website.

https://www.capgemini.com/se-en/2021/11/the-fundamental-principles-for-using-microservices-for-modernization/

Wednesday, October 20, 2021

How to manage the move to microservices in a mature way

Over the last few years, I have spent a lot of time building new applications on microservices and moving parts of monoliths to microservices. I have researched and tried to share my practical experience in several articles on this topic.

This week, my very first blog on this topic was published on the Capgemini website.

 https://www.capgemini.com/se-en/2021/10/how-to-manage-the-move-to-microservices-in-a-mature-way/

Friday, September 3, 2021

The advent of Observability Driven Development

A distributed application landscape with high cardinality makes it difficult for dedicated operations teams to monitor system behavior via a dashboard or react promptly to system alerts and notifications. In a microservices architecture with several moving parts, detecting failures becomes cumbersome, and developers end up looking for errors like finding a needle in a haystack.

What is Observability?

Observability is more than a quality attribute and one level above monitoring, where the focus applies more to cultivating ways of working within development teams to have a holistic data-driven mindset when it comes to solving system issues.

An observability thought process enables development teams to embed the monitoring aspect right at the nascent stage of development and testing.

Observability in a DevSecOps ecosystem

Several organizations are adopting a DevSecOps culture, and it has become essential for development teams to become self-reliant and take a proactive approach to identifying, healing, and preventing system faults. DevOps focuses on giving development teams the ability to make rapid decisions and more control over infrastructure assets. Observability enhances this by empowering development teams to be more instinctive when it comes to identifying system faults.

Furthermore, the modern ways of working with Agile, Test Driven Development, and Automation enable development teams to get deep insights into operations that can potentially be prone to failures.

Observability on Cloud platforms

Applications deployed on the cloud provide development teams with a myriad of out-of-the-box system measurements. Developers can gauge and derive the quality attributes of a system even before the code goes into production. Cloud services make it easy to collate information like metrics, diagnostics, logs, and traces for analysis, and they are available at the developer's behest. AI-based automated diagnostics along with real-time data give developers deep insight into their system's semantics and characteristics.

Conclusion

Observability is more of an open-ended process of inculcating modern development principles to increase the reliability of complex distributed systems. The Observability mindset helps organizations resolve production issues speedily and reduces the dependency on, and cost of, manual operations. It also helps development teams build dependable systems, giving end customers a seamless user experience.

Building Microservices by decreasing Entropy and increasing Negentropy - Series Part 5

Microservice’s journey is all about gradually overhaul, every time you make a change you need to keep the system in a better state or the ...