Wednesday, January 29, 2025

Looking into Azure AI Foundry: Building a Smart Response System

Over the past few weeks, I had the opportunity to dive deep into Azure AI Foundry and work on a proof of concept (PoC) that combined prompt flow, large language models (LLMs), and database indexing. The idea was to build a system that evaluates different LLMs, retrieves relevant information from a database, selects the right email template, and generates a personalized email response.

Azure AI Foundry

Azure AI Foundry is a powerful platform designed to simplify the development of AI applications by orchestrating flows involving LLMs, prompts, and Python tools. The best things about AI Foundry are its visual, graph-based interface for creating flows and its ability to test and debug them seamlessly. After setting up my project in the Azure AI Foundry portal, the major part of the work was setting up the prompt flows.

Designing the Prompt Flow

The first step was to create a Prompt Flow that could evaluate multiple LLMs based on specific input parameters. Here’s how I structured it (a simplified sketch follows the list):

Input Parameters: The flow began by taking user inputs such as query type, historical data context, and additional metadata.

LLM Evaluation: Using Azure OpenAI GPT models, I evaluated several LLMs for their performance on these inputs. This step involved crafting multiple prompt variants using Jinja templating and comparing their outputs.

Index Lookup: Once the best-performing model was selected, I integrated an Index Lookup tool to query a vector database for relevant historical data.

Template Selection: Based on the retrieved data, the system dynamically chose one of several pre-uploaded email templates.
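To make this concrete, here is a highly simplified Python sketch of the kind of logic the flow performed. The function names, keyword-based scoring, and template file names are made up for illustration; the real flow used the portal's built-in LLM and Index Lookup nodes rather than hand-rolled code.

# Illustrative only: pick the best candidate output and an email template key.
from typing import Dict, List


def score_output(output: str, expected_keywords: List[str]) -> float:
    """Naive relevance score: fraction of expected keywords present in the output."""
    hits = sum(1 for kw in expected_keywords if kw.lower() in output.lower())
    return hits / max(len(expected_keywords), 1)


def select_best_model(candidates: Dict[str, str], expected_keywords: List[str]) -> str:
    """Return the name of the model whose output scores highest."""
    return max(candidates, key=lambda name: score_output(candidates[name], expected_keywords))


def select_template(query_type: str) -> str:
    """Map a query type to one of the pre-uploaded email templates (hypothetical names)."""
    templates = {
        "customer_feedback": "feedback_followup.jinja2",
        "support": "support_response.jinja2",
    }
    return templates.get(query_type, "generic_response.jinja2")


if __name__ == "__main__":
    outputs = {
        "gpt-4o": "Thanks for your feedback on the delayed delivery...",
        "gpt-35-turbo": "Hello, we received your message.",
    }
    best = select_best_model(outputs, ["feedback", "delivery"])
    print(best, select_template("customer_feedback"))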

Database Indexing & Retrieval

The Index Lookup tool in Azure AI Foundry made it easy to search through my vector database for relevant results. This tool uses embeddings generated by LLMs to find the most contextually appropriate matches for a given query.

For example:

If the input query was related to customer feedback, the system would retrieve historical feedback records.

For support-related queries, it would fetch relevant support ticket summaries.

This indexing mechanism ensured that every email response was grounded in accurate and relevant data.
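For illustration, here is a toy, in-memory version of what the lookup step boils down to: embed the query, rank stored records by similarity, and return the top matches. In the actual flow the built-in Index Lookup tool does this against a managed vector index, so the code below is only a conceptual sketch with hard-coded embeddings.

import math
from typing import List, Tuple


def cosine(a: List[float], b: List[float]) -> float:
    """Cosine similarity between two equal-length embedding vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def top_k(query_vec: List[float], records: List[Tuple[str, List[float]]], k: int = 3) -> List[str]:
    """Rank stored (text, embedding) records by similarity to the query embedding."""
    ranked = sorted(records, key=lambda r: cosine(query_vec, r[1]), reverse=True)
    return [text for text, _ in ranked[:k]]


if __name__ == "__main__":
    store = [
        ("Order 1042 arrived two days late", [0.9, 0.1, 0.0]),
        ("Password reset ticket resolved", [0.1, 0.8, 0.1]),
    ]
    print(top_k([0.85, 0.15, 0.0], store, k=1))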

Generating the Response

Once the right template was selected, I used the chosen LLM to fill in placeholders in the template with dynamic content. The final email response was not only accurate but also personalized based on historical interactions.

For instance:

A customer asking about delayed shipping would receive an email referencing their previous order details.

A user requesting technical support would get an email tailored to their issue history.

This seamless integration of templates with real-time data retrieval made the system highly effective.
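In my flow the chosen LLM did the final merge, but the underlying idea is plain template substitution. Here is a minimal sketch using Jinja (the same templating engine used for the prompt variants); the template text and fields are invented for the example.

from jinja2 import Template

# Hypothetical email template; the real ones were pre-uploaded to the flow.
template = Template(
    "Hi {{ customer_name }},\n\n"
    "We looked into your order {{ order_id }} placed on {{ order_date }}. "
    "{{ resolution }}\n\nBest regards,\nCustomer Care"
)

context = {
    "customer_name": "Anna",
    "order_id": "SE-10234",
    "order_date": "2025-01-12",
    "resolution": "Your shipment was delayed at the carrier and is now expected on Friday.",
}

print(template.render(**context))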

Three main areas to look into when turning the PoC into a proper use case:

1. LLM Evaluation: Comparing multiple LLMs required careful tuning of prompts and iterative testing. What was frustrating at times was that different LLMs gave different results for the same query.

2. Data Integration: Ensuring that database indexing worked smoothly with diverse datasets took some effort.

3. Template Management: Designing flexible templates that could adapt to various contexts required creativity. Since I had limited data, this impacted the quality of the output.

Final Thoughts

I think Azure AI Foundry can revolutionize workflows by combining retrieval-augmented generation (RAG) techniques with powerful LLMs. The ability to evaluate models dynamically, retrieve relevant data efficiently, and generate personalized outputs has immense potential across industries, from customer support to marketing automation.


Monday, November 25, 2024

In search of a cost effective strategy for modernizing a monolith - Part 3

It's been a while since I last updated on the modernization journey of the monolithic Java application.
As I highlighted in Part 2, the initial domain identification process is proving to be more time-consuming than anticipated. The lack of readily available stakeholders and domain experts presented a significant hurdle, making it challenging to gain a comprehensive understanding of the application's intricate workings. Moreover, the discovery of hidden interdependencies between seemingly distinct domains is further complicating the decomposition strategy.

Budget constraints continue to be a major factor shaping our decisions. With the initial plan of migrating to the cloud no longer feasible, we had to explore alternative approaches that would deliver substantial value without breaking the bank. The strangler fig pattern is emerging as a promising solution, offering a way to incrementally modernize the monolith by gradually replacing specific functionalities with microservices built using the same technology stack.
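For anyone unfamiliar with the pattern, the heart of it is a routing facade that sends already-migrated functionality to the new services and everything else to the monolith. A minimal sketch, with hypothetical path prefixes and internal URLs:

# Illustrative strangler-fig routing facade: requests whose path prefix has been
# migrated go to the new service, everything else still hits the monolith.
MIGRATED_PREFIXES = {
    "/reports": "http://reporting-service.internal",
    "/notifications": "http://notification-service.internal",
}
MONOLITH_URL = "http://legacy-monolith.internal"


def route(path: str) -> str:
    """Return the backend base URL that should serve this request path."""
    for prefix, target in MIGRATED_PREFIXES.items():
        if path.startswith(prefix):
            return target
    return MONOLITH_URL


if __name__ == "__main__":
    print(route("/reports/monthly"))  # routed to the new reporting service
    print(route("/orders/42"))        # still served by the monolith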

To mitigate risk and ensure business continuity, we identified a few non-critical modules as our initial targets for strangulation. These modules, while not directly impacting core business operations, would provide valuable insights and experience as we refine our modernization approach.

The Next Steps
With a clearer understanding of the challenges and a viable modernization strategy in place, the next crucial step is to develop a detailed roadmap that will guide our efforts moving forward. This roadmap will prioritize modules for decomposition based on three key factors:

Business value: Modules that deliver significant business value and have a high potential for improvement will be prioritized.
Technical complexity: Modules with lower technical complexity will be targeted first to minimize risk and accelerate the modernization process.
Potential for immediate impact: Modules that can demonstrate tangible benefits quickly will be given preference to showcase the value of the modernization initiative.

Creating a comprehensive business capability matrix is essential for effectively mapping out the modernization journey. This matrix will provide a clear overview of the application's functionalities and their relationships to business processes.
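As a back-of-the-envelope illustration of how the three factors above could be combined into a ranking, here is a small sketch. The weights, module names, and scores are hypothetical; in practice they would come from the business capability matrix and stakeholder input.

# Rank candidate modules on business value, technical complexity, and immediate impact.
WEIGHTS = {"business_value": 0.5, "technical_complexity": 0.2, "immediate_impact": 0.3}

modules = {
    "reporting":     {"business_value": 4, "technical_complexity": 2, "immediate_impact": 5},
    "notifications": {"business_value": 3, "technical_complexity": 1, "immediate_impact": 4},
    "billing":       {"business_value": 5, "technical_complexity": 5, "immediate_impact": 3},
}


def priority(scores: dict) -> float:
    # Lower complexity should rank higher, so invert it on the 1-5 scale.
    return (WEIGHTS["business_value"] * scores["business_value"]
            + WEIGHTS["technical_complexity"] * (6 - scores["technical_complexity"])
            + WEIGHTS["immediate_impact"] * scores["immediate_impact"])


for name, scores in sorted(modules.items(), key=lambda kv: priority(kv[1]), reverse=True):
    print(f"{name}: {priority(scores):.1f}")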

Looking Ahead
The road ahead is undoubtedly long and complex, but I'm a bit optimistic about the progress we've made so far. By embracing an iterative approach and focusing on delivering value incrementally, I feel that we can successfully modernize this application and unlock its full potential.

Tuesday, November 19, 2024

The Hidden Carbon Cost when Working Remote

As part of an organization that has sustainability as one of its top agenda items, I've been closely examining the environmental impact of remote work, and the results are not as straightforward as many might assume. While initial studies painted a rosy picture of reduced emissions due to eliminated commutes, a more nuanced personal analysis reveals that working from home may, in a lot of cases, lead to higher CO2 emissions than traditional office work.


Since I live in Sweden, a cold country, this analysis seems even more relevant, considering the high number of dark and cold months in a calendar year. Let me try to break this down into real-world numbers, taking as an example how sustainable I am as an architect when working from the office versus from home.


Carbon Emission Numbers when working from the office


For a day when I am working from the office with proper sustainable practices (walking commute, single screen use, no overtime, reusable coffee mug, no printing), I calculated my daily CO2 emissions at 14.3 kg.


This includes approximately 6 kg from office energy use, 2.4 kg from computer usage, and other small contributions from digital activities and shared resources like sending emails, taking video calls, etc.

My reference data comes from articles that argue remote working reduces the CO2 footprint:


It also helps that, like most modern offices, the one I work at offers green building technologies and energy-efficient workspaces, making for sustainable corporate facilities.

 

https://www.sciencedaily.com/releases/2023/09/230918153242.htm  

https://allwork.space/2023/09/digest-remote-works-green-potential-can-working-from-home-slash-carbon-emissions/  

https://www.anthropocenemagazine.org/2023/09/remote-work-is-better-for-the-climate-but-mainly-in-large-doses/


Carbon Emission Numbers when Working Remote


In contrast, working from home initially appeared to have lower emissions, at 10.7 kg CO2 per day, based on the factors above. However, when I factor in the limitations and long-term impacts of remote work, the numbers change dramatically. Let me detail them out:


1. Basic home energy use: Electricity for computer and multiple screens: 2.0 kWh/day, CO2 emissions: 0.512 kg CO2/day

2. Heating/Cooling (year-round average; Sweden is a cold country, but I am still keeping the numbers low): 50 kWh/day (assuming a less efficient home HVAC), CO2 emissions: 9.2 kg CO2/day

3. Additional residential energy use: Cooking, appliances, increased device usage. Estimated additional 2 kWh/day, CO2 emissions: 0.512 kg CO2/day

4. Inefficiencies due to work-from-home disadvantages:

  • Communication delays (1.5 hours/day lost): CO2 emissions: 0.0576 kg CO2/day
  • Reduced collaboration efficiency (Appx 1.5 hours/day lost): CO2 emissions: 0.0576 kg CO2/day
  • Misunderstandings in digital communication (Appx 1 hour/day): CO2 emissions: 0.0384 kg CO2/day 
  • Limited access to physical resources (1 hour/day of inefficiency): CO2 emissions: 0.0384 kg CO2/day

5. Longer working hours due to blurred boundaries:

Extra energy use: 0.3 kWh with CO2 emissions: 0.0768 kg CO2/day

6. Increased use of digital tools for team bonding: Video calls, and virtual social events: 0.5 kWh/day

CO2 emissions: 0.128 kg CO2/day

7. Additional lighting needs for home office:

0.5 kWh/day CO2 emissions: 0.128 kg CO2/day

8. Increased non-work energy use during breaks:

TV, Mobile, additional cooking: 2 kWh/day

CO2 emissions: 0.512 kg CO2/day

9. Less efficient home office equipment:

Additional 1 kWh/day

CO2 emissions: 0.256 kg CO2/day

10. Frequent additional trips during work hours:

CO2 emissions: 4.04 kg CO2/day (using 0.404 kg CO2/mile)


These additional factors bring my total daily emissions as a remote worker to 15.7 kg CO2, surpassing the office-based scenario by 1.4 kg, and I have gone easy with the numbers for point 4 (inefficiencies due to work-from-home disadvantages).
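For transparency on the arithmetic: the electricity line items above all use the same implied factor of 0.512 kg CO2 per 2.0 kWh, i.e. roughly 0.256 kg CO2 per kWh. A quick sketch of that conversion for a few of the items:

# Back-calculated electricity emission factor from item 1: 0.512 kg / 2.0 kWh.
EMISSION_FACTOR = 0.512 / 2.0  # kg CO2 per kWh

items_kwh = {
    "computer and screens": 2.0,
    "extra home-office lighting": 0.5,
    "breaks (TV, mobile, cooking)": 2.0,
}

for name, kwh in items_kwh.items():
    print(f"{name}: {kwh * EMISSION_FACTOR:.3f} kg CO2/day")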


Conclusion


In conclusion, this data may differ from case to case and for different working models. But in my case, I feel working from the office isn't just a traditional approach; it's turning out to be a strategic environmental choice. By centralizing my work, optimizing shared resources, and fostering collaborative environments, I think I am being more sustainable than when I am working from home.


What I am still missing when working from home is a spontaneous, innovative environment that fosters immediate problem-solving, people-to-people knowledge transfer, and, most importantly, enhanced mental health through social interactions.


Imagine how these sustainable numbers will improve if all my colleagues were also working from the office.

Sunday, October 20, 2024

Session Management: B2C vs B2B Customer Facing Applications

As I was thinking about a good topic to write about this week, I couldn’t help but reflect on user session management. It is a critical topic that plays a pivotal role in shaping user experiences on customer-facing websites. Having been part of both B2C and B2B applications, I have noticed how different approaches to session management can either enhance or hinder user engagement.


In B2C applications, we have used a browser-level cookie with a 30-minute timeout, while the B2B applications employ server pinging every few minutes. Both approaches have their merits and challenges.


When designing a session management system, these are the architectural qualities we considered:


Scalability: Our solution must handle varying user loads efficiently.

Security: Protecting user data and preventing unauthorized access.

User Experience: The system should balance security with ease of use. 

Flexibility: How flexibly does the session management solution cater to different timeouts?

Performance: We must minimize the impact on server resources and network traffic.

Compliance: Our implementation should adhere to relevant data protection regulations.


Based on these considerations, our ideal implementation was:


For B2C

Since the number of users accessing the application was very high and demanded high concurrency, better user experience and performance were key requirements. Hence, secure HTTP-only cookies for session tokens were the better fit.


Also, since the application had several public pages, security was achieved by storing limited data in cookies and ensuring all secured information was on the server side. 


Distributed sessions could have been achieved, as there was a Redis cache layer. However, we wanted to keep the session stateless; sticky sessions were available, and scalability was not an issue.
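The actual application stack isn't shown here, but as a minimal sketch of the cookie attributes this approach relies on (HttpOnly, Secure, a 30-minute lifetime, and no user data in the cookie itself), here is what the Set-Cookie header boils down to, built with the Python standard library:

from http.cookies import SimpleCookie
import secrets

cookie = SimpleCookie()
cookie["session_token"] = secrets.token_urlsafe(32)  # opaque token, no user data inside
cookie["session_token"]["httponly"] = True   # not readable from JavaScript
cookie["session_token"]["secure"] = True     # only sent over HTTPS
cookie["session_token"]["samesite"] = "Lax"
cookie["session_token"]["max-age"] = 30 * 60  # 30-minute timeout
cookie["session_token"]["path"] = "/"

print(cookie.output())  # the Set-Cookie header the server would emit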


For B2B


Since concurrency was not an issue, server-side session management with periodic client-side pinging was the more apt solution. It did not hamper the user-experience SLAs and was more secure.


Also, different B2B functionalities required different session timeouts. Keeping the logic on the server side made it more flexible and more controlled in terms of monitoring.
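A stripped-down sketch of the B2B idea, with hypothetical timeout values: the session lives on the server, the client pings every few minutes, and each functional area can enforce its own timeout.

import time

# Per-functionality timeouts (seconds); values here are made up.
TIMEOUTS = {"ordering": 15 * 60, "reporting": 60 * 60, "default": 30 * 60}
sessions = {}  # session_id -> {"last_seen": float, "area": str}


def ping(session_id: str) -> bool:
    """Called by the client every few minutes; returns False if the session has expired."""
    entry = sessions.get(session_id)
    if entry is None:
        return False
    timeout = TIMEOUTS.get(entry["area"], TIMEOUTS["default"])
    if time.time() - entry["last_seen"] > timeout:
        del sessions[session_id]
        return False
    entry["last_seen"] = time.time()
    return True


sessions["abc123"] = {"last_seen": time.time(), "area": "reporting"}
print(ping("abc123"))  # True while the session is still alive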


In conclusion, this is a good topic to revisit continuously: the application security landscape keeps evolving, and so will our session management approach.

Wednesday, October 2, 2024

In search of a cost effective strategy for modernizing a monolith - Part 2

As I delved deeper into the discussions over the last few weeks, the challenges became increasingly apparent. The tightly coupled modules and the massive Oracle database continued to be the focal points of our discussions. With only one developer possessing limited knowledge of the application, we found ourselves navigating issues such as getting time from that developer, unknown assumptions, and several undocumented dependencies.

The initial domain identification process is still ongoing and is getting time-consuming. Engaging with stakeholders and domain experts has not gone as planned, as there are none :-(. I also discovered that what appeared to be distinct domains often had hidden interdependencies, further complicating our decomposition strategy.

Budget constraints also loom large as we progress. Every potential solution has to be weighed against its cost implications, forcing us to think creatively about resource allocation. The initial plan of migrating to the cloud has now been squashed.

One promising approach that I think could be a good fit is the strangler fig pattern. This strategy will hopefully allow for incremental modernization by gradually replacing specific functionalities of the monolith with microservices using the same technology. We identified a few non-critical modules that could serve as our initial targets so as not to risk core business operations.

As we move forward, the next crucial step is to create a business capability matrix and a detailed modernization roadmap. This roadmap will prioritize modules for decomposition based on their business value, technical complexity, and potential for immediate impact. 

The next steps seem to be a long process, but I am getting much more optimistic than before. Hopefully, the next time I write I will have the approach and roadmap finalized. 

To be continued...

Sunday, September 8, 2024

In search of a cost effective strategy for modernizing a monolith - Part 1

This week, I was asked to delve into a complex monolithic Java application that seems to be more than a decade old. With its tightly coupled modules and heavy reliance on a massive Oracle database, this application is a classic example of a system that has evolved into a tangled web of dependencies with years of tech debt.

The task is to modernize this behemoth without disrupting business operations or incurring high costs. On top of this, only one developer knows the application, and even that knowledge comes with a lot of assumptions. The application's architecture, characterized by tightly coupled modules, poses significant challenges in terms of scalability, maintainability, technology fit, and modernization effort.

Like my previous monolith-to-microservices journey, the first step right now is to understand the intricacies of the application. It's pretty evident that the modules are so interdependent that even minor changes could have ripple effects across systems. All of this also needs to happen with tight budget constraints in mind, which is a unique challenge.

The main focus is to prioritize high-impact areas, and I have started to identify the domains, mainly core, supporting, and generic. To accurately identify these domains, stakeholders and domain experts are being engaged. I guess it will be several days until we even think of starting on the decomposition strategy.

To be continued....


Monday, August 12, 2024

The Burden of Network on Architecture - Part 1 Network Bandwidth

IT infrastructure in most organizations is a separate department that operates and manages the IT environment of an enterprise. We as Solution Architects are confined to our landscape and very seldom dive deep into the hardware and network services that directly impact our applications. 

One of the critical factors is network bandwidth, which can make or break an architecture's performance. Network bandwidth is not infinite and directly impacts the cost, speed, and performance of applications. It is essential to understand how much bandwidth is allocated to a given set of applications and which applications share the network. As network traffic increases, the available bandwidth gets saturated, clogging the applications.

At one client, I encountered a situation in an on-premise environment where a massive 10 TB file transfer brought the entire network to its knees. The transfer was initiated without proper planning, and it quickly saturated the available bandwidth. As a result, critical business applications slowed down, and some systems even crashed due to timeout errors. Employees couldn't access essential resources, and customer-facing services experienced significant delays.

This incident taught me the importance of implementing robust traffic shaping and prioritization mechanisms. After that incident, the client's network team always had bandwidth alerts in place. They also ensured that large data transfers were scheduled during off-peak hours and that critical services had guaranteed bandwidth allocations.
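As a rough illustration of such a bandwidth alert, here is a single-host approximation using psutil; the threshold and interval are made up, and a real network team would of course monitor at the switch or firewall level rather than per host.

import time
import psutil

THRESHOLD_MBPS = 800  # made-up value: alert when sustained throughput nears link capacity


def current_mbps(interval: float = 5.0) -> float:
    """Measure total in+out throughput over a short interval, in megabits per second."""
    before = psutil.net_io_counters()
    time.sleep(interval)
    after = psutil.net_io_counters()
    delta_bytes = (after.bytes_sent + after.bytes_recv) - (before.bytes_sent + before.bytes_recv)
    return delta_bytes * 8 / interval / 1_000_000


if __name__ == "__main__":
    mbps = current_mbps()
    if mbps > THRESHOLD_MBPS:
        print(f"ALERT: {mbps:.0f} Mbps sustained; large transfers should move to off-peak hours")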

Perils of Holiday Code Freezes

The holiday code freeze has become an age-old practice where deployments are stopped, aiming to reduce the risk of new bugs or issues in the system while a majority of the support staff is away. However, this also means the servers and infrastructure are left untouched for several weeks, and surprisingly, in my experience, this can be a bit chaotic.

Application and Database Leaks

First and foremost, the most common issue I have noticed is the application memory leak. I recall one particular instance where an e-commerce application began to slow down significantly a week into the holiday break. The application retained references to several objects that were no longer required, causing the heap memory to fill up gradually. As memory usage increased, the application became sluggish, eventually leading to crashes and out-of-memory errors.

Leaks can also happen when connecting to a database. In another example, the same e-commerce application began experiencing intermittent outages. The root cause was traced to a connection leak: the application was not releasing database connections after they were used. As the number of open connections grew, the database server eventually refused new connections, causing the application to crash.
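The fix for this class of leak is usually unglamorous: make sure the connection is released on every code path. A minimal sketch of the leaky pattern versus the safe one, using sqlite3 purely for brevity (the file and table names are hypothetical; the real incident involved a pooled database on an e-commerce stack):

from contextlib import closing
import sqlite3


def leaky_lookup(order_id: int):
    conn = sqlite3.connect("shop.db")
    # If this query raises, conn is never closed and the pool slowly drains.
    return conn.execute("SELECT status FROM orders WHERE id = ?", (order_id,)).fetchone()


def safe_lookup(order_id: int):
    # closing() guarantees the connection is released even if the query raises.
    with closing(sqlite3.connect("shop.db")) as conn:
        return conn.execute("SELECT status FROM orders WHERE id = ?", (order_id,)).fetchone()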

Similarly, I have also experienced code freeze situations, where thread leaks were consuming system resources and slowing down the application. This typically happens when threads are created but not terminated. 

Array index out-of-bounds errors

Another issue I encountered during a recent freeze was the array index out-of-bounds error. The application was a CMS, and the downtime started in the middle of a week when the application tried to access an index in an array that didn't exist. It happened due to unexpected input and data changes not accounted for in the custom code.

Array index out-of-bounds exceptions can also be caused by data mismatches when interacting with external services or APIs that are not under the code freeze. Once, during a holiday season, a financial reporting application began throwing array index out-of-bounds exceptions. The root cause was traced back to an external data feed that had changed its format. The application was expecting a certain number of fields, but the external feed had added additional fields, causing the application to attempt to access non-existent indices. It led to errors that took the application offline until a patch was deployed after the freeze.
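The defensive version of such a feed parser is straightforward: validate the field count before indexing and tolerate extra fields instead of crashing. A small sketch with a hypothetical field layout:

EXPECTED_FIELDS = 5  # hypothetical layout: account;amount;currency;date;status


def parse_record(line: str) -> dict:
    fields = line.rstrip("\n").split(";")
    if len(fields) < EXPECTED_FIELDS:
        raise ValueError(f"feed record has {len(fields)} fields, expected {EXPECTED_FIELDS}: {line!r}")
    # Ignore any extra fields the provider adds later instead of indexing blindly.
    account, amount, currency, date, status = fields[:EXPECTED_FIELDS]
    return {"account": account, "amount": float(amount), "currency": currency,
            "date": date, "status": status}


print(parse_record("SE123;199.50;SEK;2024-12-23;SETTLED;unexpected-extra-field"))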

Cache Corruption

Cache corruption is another potential way of bringing down heavily cache-dependent applications. In online real-time applications, caches improve performance, but on several occasions I have seen that, if not cleared over time, caches can become corrupt, leading to stale or incorrect data being served.
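One simple safeguard is to give every cache entry a time-to-live so nothing survives, stale, for the whole freeze period. A minimal sketch with an arbitrary TTL:

import time

TTL_SECONDS = 15 * 60  # arbitrary example value
_cache = {}  # key -> (value, stored_at)


def cache_put(key, value):
    _cache[key] = (value, time.time())


def cache_get(key):
    entry = _cache.get(key)
    if entry is None:
        return None
    value, stored_at = entry
    if time.time() - stored_at > TTL_SECONDS:
        _cache.pop(key, None)  # evict the stale entry rather than serving it
        return None
    return value


cache_put("product:42", {"price": 199})
print(cache_get("product:42"))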

Conclusion

While it's funny that IT stakeholders think the code freeze maintains stability, in most cases it exposes underlying issues that might not be apparent during regular operations. The even funnier thing is that a majority of the time, these issues are resolved by a simple server restart.

Wednesday, August 7, 2024

Extracting running data out of NRC/Nike+ (Nike Run Club) using APIs

For the past few weeks, I have been struggling to see my running kilometers getting updated in my Nike+ app. It could be a bug or a weird feature of the app, and since this was kind of demotivating, I decided to go ahead and create my own dashboard to calculate the results. Also, for some reason, Nike discontinued viewing and editing activities on the web.

Considering I had about 8 years of data, and you never know when these kinds of apps stop existing or become paid versions, it's always better to persist your data to a known source and, if required, feed it into another application. I also went ahead and uploaded my data to Under Armour's "MapMyFitness" app, which has much better open documentation.

It turns out that the NRC app captures a lot of additional information that is typically not shown in the mobile app. A few examples include:

  1. Total Steps during the workout including detail split between intervals
  2. Weather Details during the workout 
  3. Amount of the time the workout was halted for 
  4. Location details including latitude and longitude information that can help you plot your own Map

Coming to the API part, I could not get hold of any official Nike documentation, but I came across an older gist (https://gist.github.com/niw/858c1ecaef89858893681e46db63db66) that mentions a few API endpoints to fetch the historic activities. I ended up creating a Spring Boot version that fetches the activities and stores them in CSV format in my Google Drive.

The code can be downloaded here ->  https://github.com/shailendrabhatt/Nike-run-stats (currently unavailable)

The code also includes a Postman collection that can be used to fetch one's activities. Just update the {{access_token}} and run the GET requests.

While the gist describing the API was good enough, here are a few tips that can be helpful (a minimal Python sketch of the calls follows the list):

  • Fetching the authorization token can be tricky, and it has an expiry time. You will need a https://www.nike.com/se/en/nrc-app account; fetch the authorization token from the XHR request headers for requests to api.nike.com. There are a few requests hitting this URL, and the token can be fetched from any of them.
  • The API described in the gist covers after_time; one can also fetch before_time information:
/sport/v3/me/activities/after_time/${time}
/sport/v3/me/activities/before_time/${time} 
  • Pagination can be easily achieved using the before_id and after_id parameters. These ids are of different formats, ranging from GUIDs to a single-digit number, and can be confusing.
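My own implementation is the Spring Boot project linked above; purely as an illustration of the calls described in these tips, here is a minimal Python sketch. The endpoint paths come from the gist, while the bearer-token header format and the response fields ("activities", paging ids) are assumptions on my part.

import requests

BASE_URL = "https://api.nike.com"
ACCESS_TOKEN = "<paste the token from the XHR request headers>"
HEADERS = {"Authorization": f"Bearer {ACCESS_TOKEN}"}


def fetch_after(after_time_ms: int) -> dict:
    """Fetch one page of activities recorded after the given epoch timestamp (ms)."""
    url = f"{BASE_URL}/sport/v3/me/activities/after_time/{after_time_ms}"
    resp = requests.get(url, headers=HEADERS, timeout=30)
    resp.raise_for_status()
    return resp.json()


if __name__ == "__main__":
    page = fetch_after(0)  # 0 = from the beginning; use before_time/{ms} for the other direction
    print(len(page.get("activities", [])), "activities on the first page")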

Saturday, August 3, 2024

Instilling the idea of Sustainability into Development Teams

Instilling green coding practices and patterns in a development team is a modern-day challenge. It can go a long way toward reducing the carbon footprint and meeting the long-term sustainability goals of an organization.

Good green coding practices improve the quality of a software application and directly impact the energy efficiency of the infrastructure the software runs on. However, software developers in today's agile work environment seldom focus on anything beyond rapid solution building in reduced sprint cycles. They have all the modern frameworks and libraries at their disposal, and writing energy-efficient code is not always the focus. Furthermore, modern data centers and cloud infrastructure provide developers with seemingly unlimited resources, resulting in high energy consumption and impacting the environment.

Below are some of the factors that improve programming practices and can have a drastic impact on the green index.

a) Fundamental Programming Practices

Some of the fundamental programming practices start with proper error and exception handling. They also include paying extra attention to the modularity and structure of the code and being prepared for unexpected deviations and behavior, especially when integrating with a different component or system.

b) Efficient Code Development

Efficient code development helps make the code more readable and maintainable. It includes avoiding memory leaks and high CPU cycles, and managing network and infrastructure storage proficiently. It also includes avoiding expensive calls and unnecessary loops, and eliminating unessential operations.
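A tiny example of the "avoid expensive calls and unnecessary loops" point: the same pricing lookup moved out of the loop and batched into a single call. The pricing_service methods here are hypothetical.

def total_naive(order_lines, pricing_service):
    total = 0.0
    for line in order_lines:
        # One remote call per order line: slow and energy-hungry.
        total += pricing_service.get_price(line.sku) * line.qty
    return total


def total_efficient(order_lines, pricing_service):
    # One batched call, then a cheap in-memory loop.
    prices = pricing_service.get_prices({line.sku for line in order_lines})
    return sum(prices[line.sku] * line.qty for line in order_lines)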

c) Secured Programming Mindset

A secure programming mindset ensures that the software application has no weak security features or vulnerabilities. Secure programming also includes protecting data, data encoding, and encryption. Awareness of the OWASP vulnerability list and performing timely penetration testing assessments ensure the application code is compliant with the required level of security.

d) Avoidance of Complexity

Complex code is the least modified and the hardest to follow. A piece of code may in the future be modified by several different developers, so avoiding complexity when writing it goes a long way toward keeping the code maintainable. Reducing the cyclomatic complexity of methods by dividing the code and logic into smaller reusable components helps the code remain simple and easy to understand.

e) Clean Architecture concepts

Understanding clean architecture concepts is essential to allow changes to the codebase. In a layered architecture, understanding the concerns of tight coupling and weak cohesion helps with reusability, minimizes disruption from changes, and avoids rewriting code and capabilities.

Conclusion

As architects and developers, it is essential to collect green metrics on a timely basis and evaluate the code's compliance and violations. Measuring these coding practices can be done using various static code analysis tools, which can be integrated into the IDE, at the code compilation or deployment stage, or even used standalone.

With organizations in several industries now focusing on individual sustainability goals, green coding practices have become an integral part of every software developer's role. The little tweaks to our development approach can contribute immensely to reducing environmental impact in the long run.

Wednesday, July 24, 2024

AI Aspirations but lacking the Automation Foundation

I am witnessing a growing need for clarity among IT teams regarding AI and automation. They see competitors touting AI initiatives and feel pressured to follow suit, often without grasping the fundamental differences between AI and automation. Everyone wants to implement AI, but they do not realize that they have yet to scratch the surface of basic automation.


At a recent event at a client, the management heads announced an AI workshop day and their plans to implement AI in their development process. However, as the workshop started, I observed a lack of technical know-how regarding AI. Even developers struggled to differentiate between rule-based automation and the more complex, adaptive nature of AI. This knowledge gap has led to unrealistic expectations and misaligned strategies.


Let me cite another example from a client and elaborate. A year back, business management was pushing to implement an AI-driven customer service chatbot, which was the need of the hour, and went live with some cutting-edge services and technology. However, since its implementation, the chatbot has not seen much traffic. As I tried to understand why, the reasons turned out to be several:


  1. Poor integration with existing systems like CRM, customer service tools, or even marketing automation. This meant the chatbot could not access or update customer information in real time; everything was done manually.
  2. It lacked typical customer interaction functionalities like personalization, order tracking, appointment scheduling, and even FAQs, because the underlying processes were not automated.
  3. It could not seamlessly hand off to a human agent.
  4. Finally, the bot engine lacked sufficient training and updates.

All of the above reasons are directly related to the lack of automation in various aspects of IT and business.


One initiative that hopefully works is to begin by asking teams to map out their current automated processes. This exercise usually reveals significant gaps and helps shift the focus from AI to the necessary automation steps.


As we read and learn from others, successful AI implementation is a journey, not a destination. It requires a solid foundation of automated processes, clean data, and a clear understanding of organizational goals. Until this reality is grasped, AI initiatives will continue to fall short of expectations.

Building Microservices by decreasing Entropy and increasing Negentropy - Series Part 5

Microservice’s journey is all about gradually overhaul, every time you make a change you need to keep the system in a better state or the ...