Wednesday, January 10, 2024

Edge Tech Realities - Part 1 - Web Performance

Edge technology has improved severalfold in the last decade or so. In the late 2000s, I used Akamai to serve static content for several market-leading clients. These platforms typically reduced the number of calls reaching the application and improved the performance and latency of the websites.

For the last year or so, I have been working with Microsoft Azure Front Door, and this is when I realized in detail that, like any technological advancement, edge platforms have their fair share of disadvantages and practical challenges.

First and foremost is application performance. Recently, after we cached all static pages on the CDN, the average page load time of a global website I was working on surprisingly increased. That's when I realized that distributing resources across multiple edge locations can lead to inconsistencies in processing power and network connectivity. Page load times also differ depending on how close the nearest edge location is to the user.

I also learned that, even on the Microsoft Azure backbone network, edge servers are not created equal. They differ in processing power, storage capacity, and network bandwidth. Based on the logs, some users had slower response times because their requests were directed to an underperforming edge server.
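To make the log analysis above concrete, here is a minimal sketch that groups response times by edge location and flags locations whose median is well above the rest. The record layout, the location codes, and the 1.5x threshold are all assumptions for illustration, not the format of any particular CDN log.

```python
from collections import defaultdict
from statistics import median

def slow_edges(records, factor=1.5):
    """Return edge locations whose median response time exceeds
    `factor` times the median of all per-edge medians."""
    by_edge = defaultdict(list)
    for edge, millis in records:
        by_edge[edge].append(millis)
    edge_medians = {edge: median(vals) for edge, vals in by_edge.items()}
    baseline = median(edge_medians.values())
    return {edge: m for edge, m in edge_medians.items() if m > factor * baseline}

# Hypothetical (edge location, response time in ms) pairs extracted from access logs.
sample = [
    ("AMS", 110), ("AMS", 120), ("AMS", 130),
    ("SIN", 100), ("SIN", 140), ("SIN", 120),
    ("GRU", 300), ("GRU", 340), ("GRU", 320),
]
print(slow_edges(sample))  # {'GRU': 320}
```

Medians are used instead of averages so that a handful of very slow requests does not make a healthy location look bad.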

In a web application whose static content needs frequent updates, the cache is purged very often. Every purge has to propagate to all the edge servers. It often takes time for the edge locations to get back in sync, and the more frequent the purges, the more inconsistencies users see.

Lastly, sudden spikes in traffic, say due to promotional campaigns or seasonal demand, can be a problem. Scaling of edge servers may not always be instantaneous, leading to performance bottlenecks. Though this may last only for a short period, it still impacts the end user.


Wednesday, November 29, 2023

Mastering the Art of Reading Thread Dumps

For years, I have been trying to find a structured way to read thread dumps in production whenever there is an issue. I have often found myself on a wild goose chase, deciphering the cryptic language of thread dumps. These snapshots of thread activity within a running application hold a lot of information, providing insights into performance bottlenecks, resource contention, and high memory or CPU usage.


In this article, I'll share tips and tricks from my experience of reading production thread dumps across multiple projects, demystifying the process for my fellow engineers.


Tip 1: Understand the Thread States

Thread states, such as RUNNABLE, WAITING, or TIMED_WAITING, offer a quick glimpse into what a thread is currently doing. Mastering these states helps in identifying threads that might be causing performance issues. For instance, a thread stuck in a WAITING state can be a candidate for further investigation.
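A quick first pass over a dump is to count threads per state. The sketch below assumes a HotSpot-style text dump (as produced by jstack) in which every thread has a `java.lang.Thread.State:` line; the sample dump is made up for illustration.

```python
import re
from collections import Counter

STATE_RE = re.compile(r"java\.lang\.Thread\.State: (\w+)")

def count_states(dump_text):
    """Count how many threads are in each state."""
    return Counter(STATE_RE.findall(dump_text))

# Made-up excerpt in the usual jstack layout.
SAMPLE_DUMP = '''
"http-nio-8080-exec-1" #31 daemon prio=5 os_prio=0 tid=0x00007f1c2c0a1000 nid=0x1a2b runnable [0x00007f1c0c5f4000]
   java.lang.Thread.State: RUNNABLE
        at java.net.SocketInputStream.socketRead0(Native Method)

"pool-2-thread-1" #40 prio=5 os_prio=0 tid=0x00007f1c2c0b2000 nid=0x1a2c waiting on condition [0x00007f1c0c4f3000]
   java.lang.Thread.State: WAITING (parking)
        at sun.misc.Unsafe.park(Native Method)

"scheduler-1" #41 prio=5 os_prio=0 tid=0x00007f1c2c0c3000 nid=0x1a2d waiting on condition [0x00007f1c0c3f2000]
   java.lang.Thread.State: TIMED_WAITING (sleeping)
        at java.lang.Thread.sleep(Native Method)
'''

for state, count in count_states(SAMPLE_DUMP).most_common():
    print(state, count)
```

Comparing these counts across two or three dumps taken a few seconds apart usually says more than any single snapshot.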


Tip 2: Identify High CPU Threads

Threads consuming a significant amount of CPU time are often the culprits behind performance degradation. Look for the top threads by CPU time (for example, the top five) and dig into their stack traces. The full stack trace pinpoints the exact method or task responsible for the CPU spike.
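One common way to find those threads on Linux is to take the busiest thread IDs from `top -H -p <pid>` and match them against the `nid` field in the dump, which holds the same ID in hexadecimal. The sketch below does that lookup; the sample dump and the thread ID 6699 are made up for illustration.

```python
import re

# Captures the thread name and native thread id from each thread header line.
HEADER_RE = re.compile(r'^"(?P<name>[^"]+)".*?\bnid=(?P<nid>0x[0-9a-fA-F]+)', re.MULTILINE)

def thread_for_os_tid(dump_text, os_tid):
    """Return the name of the thread whose nid equals the decimal OS thread id."""
    for match in HEADER_RE.finditer(dump_text):
        if int(match.group("nid"), 16) == os_tid:
            return match.group("name")
    return None

# Made-up excerpt in the usual jstack layout.
SAMPLE_DUMP = '''
"http-nio-8080-exec-1" #31 daemon prio=5 os_prio=0 tid=0x00007f1c2c0a1000 nid=0x1a2b runnable [0x00007f1c0c5f4000]
   java.lang.Thread.State: RUNNABLE
        at com.example.ReportService.render(ReportService.java:118)

"pool-2-thread-1" #40 prio=5 os_prio=0 tid=0x00007f1c2c0b2000 nid=0x1a2c waiting on condition [0x00007f1c0c4f3000]
   java.lang.Thread.State: WAITING (parking)
        at sun.misc.Unsafe.park(Native Method)
'''

# 6699 is the decimal form of 0x1a2b, as `top -H` would show it.
print(thread_for_os_tid(SAMPLE_DUMP, 6699))  # http-nio-8080-exec-1
```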


Tip 3: Leverage Thread Grouping

Grouping threads by their purpose or functionality can simplify the analysis. In complex applications, the sheer number of threads can be confusing, so collating them into groups helps. For example, group threads related to database connections, HTTP requests, or background tasks together. This approach often provides a more coherent view of the application's concurrent activities.
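Thread names usually carry the pool name followed by a counter, so a rough grouping can be derived by stripping the trailing number. This is only a sketch under that naming assumption; the thread names below are made up.

```python
import re
from collections import Counter

NAME_RE = re.compile(r'^"([^"]+)"', re.MULTILINE)

def group_by_pool(dump_text):
    """Count threads per pool, where the pool is the thread name minus its trailing counter."""
    groups = Counter()
    for name in NAME_RE.findall(dump_text):
        # "http-nio-8080-exec-12" -> "http-nio-8080-exec"; names without a counter stay as-is.
        pool = re.sub(r"[-#\s]*\d+$", "", name) or name
        groups[pool] += 1
    return groups

# Made-up thread header lines.
SAMPLE_HEADERS = '''
"http-nio-8080-exec-1" #31 daemon prio=5
"http-nio-8080-exec-2" #32 daemon prio=5
"HikariPool-1 housekeeper" #20 daemon prio=5
"pool-2-thread-1" #40 prio=5
'''

for pool, count in group_by_pool(SAMPLE_HEADERS).most_common():
    print(count, pool)
```

Giving executors meaningful thread names in the application makes this grouping far more useful than the default `pool-N-thread-M` names.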


Tip 4: Pay Attention to Deadlocks

Deadlocks are the nightmares of multithreaded applications. Thread dumps provide clear indications of deadlock scenarios. Look for threads marked as "BLOCKED" and investigate their dependencies to identify the circular dependencies causing the deadlock.
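jstack normally reports monitor deadlocks itself under a "Found one Java-level deadlock" section, so check for that first. To show what "circular dependency" means in the dump, the sketch below rebuilds the cycle from the `- locked <...>` and `- waiting to lock <...>` lines. It only covers monitor locks (synchronized blocks), not java.util.concurrent locks, and the sample dump is made up.

```python
import re

def find_deadlock(dump_text):
    """Return the thread names forming a lock cycle, or [] if there is none."""
    owner = {}    # lock address -> thread holding it
    waiting = {}  # thread -> lock address it is blocked on
    current = None
    for line in dump_text.splitlines():
        header = re.match(r'^"([^"]+)"', line)
        if header:
            current = header.group(1)
            continue
        if current is None:
            continue
        held = re.search(r"- locked <(0x[0-9a-f]+)>", line)
        wanted = re.search(r"- waiting to lock <(0x[0-9a-f]+)>", line)
        if held:
            owner[held.group(1)] = current
        elif wanted:
            waiting[current] = wanted.group(1)
    for start in waiting:
        path, thread = [], start
        # Follow "waits for the lock held by" until the chain ends or loops back.
        while thread in waiting and thread not in path:
            path.append(thread)
            thread = owner.get(waiting[thread])
        if thread in path:
            return path[path.index(thread):]
    return []

# Made-up excerpt: two threads each hold the lock the other one wants.
SAMPLE_DUMP = '''
"worker-A" #12 prio=5 os_prio=0 tid=0x00007f1c2c0d4000 nid=0x1b01 waiting for monitor entry [0x00007f1c0c2f1000]
   java.lang.Thread.State: BLOCKED (on object monitor)
        at com.example.Transfer.run(Transfer.java:42)
        - waiting to lock <0x000000076ab622b0> (a java.lang.Object)
        - locked <0x000000076ab622a0> (a java.lang.Object)

"worker-B" #13 prio=5 os_prio=0 tid=0x00007f1c2c0d5000 nid=0x1b02 waiting for monitor entry [0x00007f1c0c1f0000]
   java.lang.Thread.State: BLOCKED (on object monitor)
        at com.example.Transfer.run(Transfer.java:42)
        - waiting to lock <0x000000076ab622a0> (a java.lang.Object)
        - locked <0x000000076ab622b0> (a java.lang.Object)
'''

print(find_deadlock(SAMPLE_DUMP))  # ['worker-A', 'worker-B']
```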


Tip 5: Explore External Dependencies

Modern applications often rely on external services or APIs. Threads waiting for responses from these external dependencies can significantly impact performance. Identify threads in WAITING states and trace their dependencies to external services.
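A thread blocked in a native socket read is often reported as RUNNABLE rather than WAITING, so it helps to look at stack frames as well as states. The sketch below lists threads whose stack contains a network I/O frame. The frame prefixes are an illustrative, non-exhaustive assumption, and the sample dump is made up.

```python
import re

# Frame prefixes that usually indicate network I/O (illustrative, not exhaustive).
IO_FRAME_PREFIXES = ("java.net.Socket", "sun.nio.ch.", "java.net.http.")

# Matches "at pkg.Class.method(" with an optional "module@version/" prefix.
FRAME_RE = re.compile(r"^\s*at (?:[^/\s]+/)?([\w.$<>]+)\(", re.MULTILINE)
STATE_RE = re.compile(r"java\.lang\.Thread\.State: (\w+)")

def threads_in_network_io(dump_text):
    """Map thread name -> (state, first network frame) for threads doing network I/O."""
    result = {}
    # Thread blocks in a HotSpot dump are separated by blank lines.
    for block in re.split(r"\n\s*\n", dump_text.strip()):
        header = re.match(r'"([^"]+)"', block)
        if not header:
            continue
        state = STATE_RE.search(block)
        for frame in FRAME_RE.findall(block):
            if frame.startswith(IO_FRAME_PREFIXES):
                result[header.group(1)] = (state.group(1) if state else "UNKNOWN", frame)
                break
    return result

# Made-up excerpt in the usual jstack layout.
SAMPLE_DUMP = '''
"http-nio-8080-exec-1" #31 daemon prio=5 os_prio=0 tid=0x00007f1c2c0a1000 nid=0x1a2b runnable [0x00007f1c0c5f4000]
   java.lang.Thread.State: RUNNABLE
        at java.net.SocketInputStream.socketRead0(Native Method)
        at com.example.PaymentClient.charge(PaymentClient.java:77)

"pool-2-thread-1" #40 prio=5 os_prio=0 tid=0x00007f1c2c0b2000 nid=0x1a2c waiting on condition [0x00007f1c0c4f3000]
   java.lang.Thread.State: WAITING (parking)
        at sun.misc.Unsafe.park(Native Method)
'''

print(threads_in_network_io(SAMPLE_DUMP))
```

The application frame just below the I/O frame (here the hypothetical `PaymentClient.charge`) is what tells you which external service the thread is waiting on.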


Tip 6: Utilize Profiling Tools

While thread dumps offer a snapshot of the application state, profiling tools like VisualVM (shipped with older JDKs as jvisualvm) or YourKit provide a dynamic and interactive way to analyze thread behavior. These tools allow you to trace thread activities in real time, making it easier to pinpoint performance bottlenecks.


Tip 7: Contextualize with Application Logs

Thread dumps are more powerful when correlated with application logs. Integrate logging within critical sections of your code to capture additional context. This fusion of thread dump analysis and log inspection provides a holistic view of your application's behavior.


In conclusion, reading thread dumps is both an art and a science. It requires a keen eye, a deep understanding of the application's architecture, and the ability to connect the dots between threads and their activities. By mastering this skill, one can unravel the intricacies of their applications, ensuring optimal performance and a seamless user experience.

Monday, November 27, 2023

Pitching IaC to Stakeholders

As a Cloud Architect, I have explained Infrastructure as Code (IaC) to our stakeholders several times, and how it is a no-brainer for our cloud project, especially for applications running on the cloud.

In every discussion, I keep explaining how our development work can become much faster, less error-prone, and reusable, saving us tons of time and money in the future.

The first questions I always get are: what is wrong with the current manual ways, which have served us well so far? And will this increase our short-term budget?

I take a deep breath and reiterate that having a blueprint always saves time, increases accuracy, and saves costs in the long run. IaC simplifies future changes, and environments can be replicated without any major rework.

The gap between Business and IT often arises not due to the incapability of IaC but the challenge of translating its intricacies into a language both realms can comprehend. An Architect has to be persistent and repetitive. 

With Cloud first implementations, surely there will be a time when Businesses in large organizations will take efficiency and automation seriously.

Saturday, November 25, 2023

The Plight of an Architect in an Agile Project

Agile methodology in software development has emerged as a guiding light, promising flexibility, collaboration, and adaptability. But organizations have mistaken it for a luxury cruise liner while treating it like the Black Pearl from Pirates of the Caribbean on the high seas of chaos.

Agile, with its sprints, stand-ups, and user stories, was supposed to be the antidote to the rigid and often cumbersome Waterfall methodology. However, in the real world, Agile is sometimes wielded like a double-edged sword – misused by developers and misunderstood by business leaders.

The Agile coaches are like the Pirate Captain: they are the ones mainly responsible for steering the meetings, and they navigate the ship without an ounce of technical know-how. Picture the Agile stand-up meetings as meetings of the Brethren Court, which typically turn into recitations of individual developers' achievements. Each developer tries to resolve epics and makes up stories for their everyday chores, trying their best to please the captain.

Then there are the Product Owners who act as the Pirate Lords, holding the keys to the treasure chest of project priorities. These lords of prioritization often struggle to let go of the old ways of Waterfall, treating Agile like a mere parrot on their shoulder rather than a shipboard companion. They treat technology debt as the Dead Man's Chest, which is not supposed to be opened or seen.

Amidst all of them, the Architects are often left in the lurch as Agile teams treat their decisions as an afterthought. Their long-term vision gets lost in the relentless pursuit of project priorities and sprint goals. Good Architects are aware of the so-called mirage on the horizon, but often find themselves relegated to the back of the ship, much like a passenger turned mere spectator, watching their maps of successful navigation become damp and tattered in this unpredictable Agile storm.

Friday, April 21, 2023

Funnel-based Architecture for Application Security on the Cloud - Part 1 - The Framework

As a Solution Architect, I've had a few opportunities to work with organizations facing security challenges on the cloud, especially with public-facing applications. One of the most common issues I've encountered is a lack of visibility and control over their cloud environments.


To solve these security issues, I've implemented a funnel-based framework for enhancing security on the cloud. This framework involves identifying the data flow within the cloud platform and implementing funnel points, which act as choke points at each layer for security controls. The last steps of the framework include increasing observability and continuous security improvements.


Below are the steps:




Step 1: Identify the Data Flow within the Cloud Platform


The first step in implementing a funnel-based framework for security on the cloud is to identify the data flow within the platform. This means understanding the types of data processed by the platform and identifying the various stages in the data flow. It also includes getting to know every service or layer through which the data flows.


Step 2: Implement Funnel Points


Based on the data flow, the next step involves implementing funnel points throughout the platform. Funnel points are choke points in the data flow where security controls are added at each layer to protect from threats. These funnel points are part of the Network, Transport, and Application Layers. These funnel points in the system may include network gateways, data storage, web and application services, and other components. 


Step 3: Implement Security Controls at Each Funnel Point


At each funnel point, security controls are applied at each layer or service to protect the cloud environment. These include access controls, encryption and decryption processes, network security controls, monitoring and logging mechanisms, vulnerability management, and incident response processes. Each security control is designed to address a specific threat or vulnerability, and the controls work together to provide comprehensive protection for the cloud environment.
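As a rough illustration of funnel points with a control at each layer, the sketch below passes a request through an ordered list of checks and reports the first layer that blocks it. The layer names, the blocked network (a documentation-only IP range), the placeholder country code, and the path rule are all hypothetical; in a real platform these controls live in managed services rather than in application code.

```python
from dataclasses import dataclass
from ipaddress import ip_address, ip_network
from typing import Callable, List, Optional

@dataclass
class Request:
    source_ip: str
    country: str
    path: str

@dataclass
class FunnelPoint:
    name: str
    allows: Callable[[Request], bool]

def first_blocking_layer(funnel: List[FunnelPoint], request: Request) -> Optional[str]:
    """Return the name of the first funnel point that rejects the request, or None."""
    for point in funnel:
        if not point.allows(request):
            return point.name
    return None

# Hypothetical controls, for illustration only.
BLOCKED_NETWORKS = [ip_network("203.0.113.0/24")]  # reserved documentation range
BLOCKED_COUNTRIES = {"ZZ"}                         # placeholder country code

funnel = [
    FunnelPoint("network: IP filter",
                lambda r: not any(ip_address(r.source_ip) in n for n in BLOCKED_NETWORKS)),
    FunnelPoint("edge: geo block",
                lambda r: r.country not in BLOCKED_COUNTRIES),
    FunnelPoint("application: path rule",
                lambda r: not r.path.startswith("/admin")),
]

print(first_blocking_layer(funnel, Request("203.0.113.7", "NL", "/")))
print(first_blocking_layer(funnel, Request("198.51.100.5", "NL", "/")))
```

Recording which layer blocked a request is what later feeds the monitoring step: a spike in blocks at one funnel point is a signal worth alerting on.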


Step 4: Regularly Monitor and Update the Security Controls


Once the security controls are implemented in each layer, it is critical to regularly monitor and update them to ensure they are working effectively. It involves monitoring the platform for suspicious activity, regularly reviewing access controls, updating software and security patches, and testing the security controls to identify any weaknesses or vulnerabilities.


Step 5: Continuously Improve the Framework


Finally, to continuously improve the funnel-based framework for security on the cloud, it is critical to stay ahead of emerging threats and vulnerabilities. It involves staying up-to-date on the latest security trends and best practices, regularly reviewing the security controls to identify areas for improvement, and working with clients to identify new threats and risks.


By following these steps, I was able to implement a comprehensive funnel-based framework for security on the cloud that provided good protection against a wide range of threats and vulnerabilities. I will take a deep dive into the Funnel-based Architecture with examples in Part 2.

Funnel-based Architecture for Website Security on the Cloud - Part 2 - Using Microsoft Azure Services

In Part 1 of the article, I described the Funnel-based framework and various steps to improve web application security on the cloud. In this article, I will cite a real-world example of how I used the funnel-based framework and designed a Funnel-based architecture to filter and analyze malicious traffic for a web application.


The layered approach of Funnel-based Architecture is essential in providing multiple levels of security to web applications. By having multiple layers of security, each layer is responsible for detecting and blocking various attacks, making it more challenging for attackers to breach several layers at once. If an attacker bypasses one layer of defense, the other layers can still provide protection, making it harder for them to compromise the web application.


Below is an example of a multi-layered funnel that blocks malicious web requests, with each layer providing an increased level of security. The diagram illustrates:





a) The data or request flow from the browser, DNS, across edge layers, and all Azure services in the background. 

b) Layered funnel points, each with an independent layer to choke malicious traffic through IP filtering, geo-blocks, custom WAF rules, rate limiting, content caching, etc.

c) Security controls at each layer or funnel point, with access controls and restrictions enforced through user authentication, authorization, audit trails, data encryption at rest and in transit, and an Intrusion Detection and Prevention System.

d) Deep monitoring and alerting at each layer, with custom automation to update infrastructure and WAF rules, analyze logs, detect threats, and trigger application protection via scaling, CAPTCHAs, static sites, etc.

e) Finally, continuous improvement by providing regular security assessments and benchmarking, performing penetration testing, security awareness training, incident response planning, etc.
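Of the controls listed above, rate limiting is the easiest to show in a few lines. The sketch below is a sliding-window limiter keyed by client IP. In practice a managed WAF would enforce this at the edge; the class name, limits, and timestamps here are assumptions chosen for illustration.

```python
from collections import defaultdict, deque

class SlidingWindowLimiter:
    """Allow at most `max_requests` per client within any `window_seconds` span."""

    def __init__(self, max_requests, window_seconds):
        self.max_requests = max_requests
        self.window_seconds = window_seconds
        self._hits = defaultdict(deque)  # client key -> timestamps of accepted requests

    def allow(self, client, now):
        hits = self._hits[client]
        # Drop timestamps that have fallen out of the window.
        while hits and now - hits[0] >= self.window_seconds:
            hits.popleft()
        if len(hits) >= self.max_requests:
            return False
        hits.append(now)
        return True

# Three requests per 60 seconds; timestamps are passed in to keep the example deterministic.
limiter = SlidingWindowLimiter(max_requests=3, window_seconds=60)
results = [limiter.allow("198.51.100.5", t) for t in (0, 1, 2, 3, 61)]
print(results)  # [True, True, True, False, True]
```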


Here are some examples of security tools that we used to create a Funnel-based Architecture on Azure:


  1. Azure Firewall: A network layer security tool that provides a managed, cloud-based firewall service to protect Azure virtual networks and resources from network-based threats.
  2. Azure Front Door: A global, scalable, and secure entry point that provides routing, caching, and load balancing of web traffic at the application layer (layer 7).
  3. Azure Application Gateway: A layer-7 load balancer that provides WAF and SSL termination capabilities to protect web applications from application-layer attacks.
  4. Marketplace WAF: An Advanced WAF that provides robust in-house web application firewall protection by securing applications against layer 7 DDoS attacks, malicious bot traffic, all OWASP top 10 threats, and API protocol vulnerabilities.
  5. Azure DDoS Protection: A layer 3/4 protection service that protects against DDoS attacks by automatically mitigating them in the Azure network before they reach the targeted resource.
  6. Azure Key Vault: A cloud-based service that provides secure storage and management of cryptographic keys and secrets used by cloud applications and services.
  7. Azure Sentinel (since renamed Microsoft Sentinel): A cloud-native SIEM and SOAR solution that provides intelligent security analytics and threat intelligence across the enterprise.




Sunday, March 26, 2023

Role of a Solution Architect in Modern Organizations

This week, I came across a discussion on how the market lacks good Solution Architects and how the role of a Solution Architect is often misused in the IT industry, especially with the advent of cloud certifications. In Agile organizations, the role of the Solution Architect also seems to be diminishing. This has also led organizations to create other architecture titles for specific tasks.

Having played the role of Solution Architect for about a decade now, I feel the title typically requires years of experience designing and implementing complex software solutions in different projects and domains. 

Solution Architect Mindset

It involves a deep understanding of Business requirements, Technical constraints, Architecture Design and Principles, and staying up-to-date on Emerging Technologies. Solution Architects must also possess strong communication and leadership skills to work effectively with Stakeholders, Developers, Business Architects, and other team members.

With the rise of cloud certifications, many individuals get the title of Solution Architect without the necessary experience and expertise. This is probably the main reason for the proliferation of Solution Architects who lack the skills required to design and implement complex solutions.

Also, Organizations do not clearly define the roles and responsibilities of a Solution Architect. The role has evolved in the past decade with emerging technologies and ways of working, but the fundamentals have always remained the same. 

By definition, a Solution Architect's role is to design and oversee the implementation of software solutions that meet business requirements. To think like a modern Solution Architect on any project, one has to develop in the five areas below.



a) Understand the business requirements

b) Evaluate the requirements against the available capabilities

c) Define the architecture principles of the solution

d) Define the architecture and design the overall solution

e) Participate in the development of the end-to-end solution


Conclusion

In conclusion, to have a successful Solution Architect mindset, one needs problem-solving ability, proficiency in strategic thinking, clear communication for collaboration, and the urge for continuous learning.


Wednesday, March 15, 2023

Choosing between Azure Function App or Azure App Service for your application

This week, to create a client-facing mobile application, I had a choice of using either a PaaS or a Serverless implementation on Microsoft Azure Cloud.


One of the development teams wanted to use an Azure Function, and the other wanted to go with a simple Azure App Service plan. It was not the first use case in which I had to choose between the two services. Serverless technology has evolved and is no longer used just for independent worker-based functionalities. I have used it for several different use cases, including building a part of an e-commerce site running purely on serverless technology.


I had to take in all the pros and cons and take a logical view to choose between the two. 


Step 1


I started by understanding the capabilities needed, based on the granularity of the business requirements.


What are the requirements?

a. Business logic - Does the application require a full-fledged middleware with data and business logic?

b. Lightweight system - Is the application more of a lightweight web API system?

c. Expected traffic - What will be the dedicated traffic towards the application?

d. Code complexity - Does the application require a small piece of code or function for every request?

e. Performance - Does the application require quick responses?

f. Scalability - What are the scalability requirements?

g. Ease of development/deployment - What is the level of ease of deployment and development?

h. Security - What are the security requirements?

i. Cost - What are the budget or cost requirements?

j. Application design - How complex is the application design?

k. Governance - Is governance an issue? Are there multiple teams owning and deploying modules to production regularly?

l. Maintenance - Who will maintain the application, and what are the SLAs?

m. Redundancy - What are the application redundancy requirements?

n. Technology - Is there a technology restriction?

o. Learning curve - What are the technology adoption and learning curve requirements?



Step 2: 


As the requirements are understood, the next step is to understand and map the capabilities and challenges and compare the two services in the context of the development teams.


Azure Functions has three hosting plans: 1. Consumption, 2. Premium, and 3. App Service plan.


a. Plan fitment: The Consumption plan costs less but has cold-start issues and weaker security. This has to be considered before choosing it for a production-ready product. Neither cold starts nor security is an issue when building applications with App Service.

b. Cost: The Premium and App Service plans can avoid cold starts. They also provide better security (with VNet) but at a higher cost, in the same range as an App Service plan.

c. Complexity: With Azure Functions, if several moving parts are created and only a few people maintain them, things can get complex.

d. Future governance: If we build independent, decentralized functions, standardization can become an issue in the future.

e. Latency: If a function later needs to wait for another function to execute, latency becomes an issue, and so does complexity.

f. Future business logic: If writing the business logic requires building several functions or classes, then App Service will be the way to go.

g. Learning curve and support: Maintaining and supporting an application made up of multiple functions can be a steep learning curve for someone new.
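One way to turn a comparison like the one above into something stakeholders can review is a weighted decision matrix. The sketch below is only an illustration: the criteria are a subset of the points above, and the weights and ratings are made-up numbers, not the scoring used on this project.

```python
def weighted_score(weights, ratings):
    """Sum of weight * rating over all criteria; every criterion must be rated."""
    missing = set(weights) - set(ratings)
    if missing:
        raise ValueError(f"missing ratings for: {sorted(missing)}")
    return sum(weights[criterion] * ratings[criterion] for criterion in weights)

# Hypothetical weights (importance, 1-5) and ratings (how well the option fits, 1-5).
weights = {"cold start": 5, "cost": 3, "governance": 4, "learning curve": 2}
options = {
    "Functions (Consumption)": {"cold start": 2, "cost": 5, "governance": 2, "learning curve": 3},
    "App Service": {"cold start": 5, "cost": 3, "governance": 4, "learning curve": 4},
}

scores = {name: weighted_score(weights, ratings) for name, ratings in options.items()}
best = max(scores, key=scores.get)
print(scores)  # {'Functions (Consumption)': 39, 'App Service': 58}
print(best)    # App Service
```

The value of the exercise is less in the final number than in forcing the teams to agree on the weights before arguing about the services.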


To conclude, both services offer several similar features. In our case, the application was simple and had some future business requirements that needed processing with no end-user impact, so App Service seemed to be an apt fit. However, the choice could be totally different for another set of requirements.


                                                     

Building Microservices by decreasing Entropy and increasing Negentropy - Series Part 5

Microservice’s journey is all about gradually overhaul, every time you make a change you need to keep the system in a better state or the ...