A few years back, in an earlier article (https://www.shailendrabhatt.com/2020/11/extracting-running-data-out-of-nrcnike.html), I explained how to extract running data from Nike NRC using public endpoints and basic API scripts. At the time, this was a step up from manual tracking: a way to view runs, pace, and totals through dashboards and exported files. Seeing week-to-week progress and analyzing simple trends locally felt empowering and made me want to head out for the next run. Back then the running APIs were also publicly available, which they no longer are, and that was another reason to store this information locally for future reference.
| Part 1 - Feeding Historical Data to OpenAI |
In this article, I move beyond static tracking toward an autonomous coach that suggests plans in real time based on historical data. Instead of just visualizing past metrics in dashboards, I have tried using OpenAI's models (GPT-5) and a vector search database to analyze my performance. The idea was to use these tools to correlate training variables and create predictions for upcoming runs, all based on my own historical data.
| Part 2 - Storing Run Data in a Vector Database |
Leveraging OpenAI's LLMs, I added chat-based analysis, so the app answers training questions in natural language instead of just showing numbers.

Step 1: The first step was bringing the Nike NRC and Strava data together into a unified dataset. After several rounds of manually exporting activity logs, I finally automated the pipeline for both Nike NRC and Strava using their APIs. The updated codebase is available here --> https://github.com/shailendrabhatt/runanalysis-usingopenai
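The repository holds the full pipeline; as a rough sketch of the unification step, both sources can be normalized into one record type before anything is embedded. The DTOs and field names below are hypothetical placeholders, not the actual shapes returned by the Nike or Strava APIs:

using System;

// Hypothetical source DTOs; the real JSON shapes come from each API.
public record NikeActivity(long StartEpochMs, double DistanceKm, long ActiveDurationMs);
public record StravaActivity(DateTime StartDateUtc, double DistanceMeters, long MovingTimeSec);

// One normalized record per run, regardless of source.
public record RunRecord(DateTime StartUtc, double DistanceKm, TimeSpan Duration, string Source)
{
    public double PaceMinPerKm => DistanceKm > 0 ? Duration.TotalMinutes / DistanceKm : 0;
}

public static class RunMapper
{
    public static RunRecord FromNike(NikeActivity a) => new(
        DateTimeOffset.FromUnixTimeMilliseconds(a.StartEpochMs).UtcDateTime,
        a.DistanceKm,
        TimeSpan.FromMilliseconds(a.ActiveDurationMs),
        "nike");

    public static RunRecord FromStrava(StravaActivity a) => new(
        a.StartDateUtc,
        a.DistanceMeters / 1000.0,   // Strava reports distance in meters
        TimeSpan.FromSeconds(a.MovingTimeSec),
        "strava");
}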
Step 2: The next step was feeding the data into a vector database. A summary of each run is sent to OpenAI's embeddings endpoint:
// Embed one run summary. The original snippet left the model name blank;
// "text-embedding-3-small" is used here as an example. This assumes
// httpClient already carries the Authorization: Bearer <API key> header.
var embeddingPayload = new { input = runSummary, model = "text-embedding-3-small" };
var response = await httpClient.PostAsJsonAsync("https://api.openai.com/v1/embeddings", embeddingPayload);
var embedding = await response.Content.ReadFromJsonAsync<EmbeddingResponse>();
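EmbeddingResponse is not a library type; it is just a DTO mirroring the JSON that the embeddings endpoint returns. A minimal version, assuming System.Text.Json, could look like this:

using System.Collections.Generic;
using System.Text.Json.Serialization;

// Mirrors the response shape:
// { "data": [ { "embedding": [ 0.01, ... ], "index": 0 } ], "model": "..." }
public record EmbeddingResponse(
    [property: JsonPropertyName("data")] List<EmbeddingData> Data,
    [property: JsonPropertyName("model")] string Model);

public record EmbeddingData(
    [property: JsonPropertyName("embedding")] float[] Embedding,
    [property: JsonPropertyName("index")] int Index);

The vector itself is then embedding.Data[0].Embedding.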
For semantic analysis, the runs and their embeddings are indexed in a vector database so that similar training periods can be retrieved later.
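The post does not name a specific vector database, so as an illustrative stand-in, here is the same nearest-neighbour idea as a minimal in-memory index using cosine similarity; a hosted vector database replaces this with an approximate search that scales:

using System;
using System.Collections.Generic;
using System.Linq;

public record IndexedRun(string Summary, float[] Vector);

public class RunVectorIndex
{
    private readonly List<IndexedRun> _runs = new();

    public void Add(string summary, float[] vector) =>
        _runs.Add(new IndexedRun(summary, vector));

    // Returns the k summaries whose embeddings are most similar to the query.
    public IEnumerable<string> Search(float[] query, int k = 5) =>
        _runs.OrderByDescending(r => Cosine(r.Vector, query))
             .Take(k)
             .Select(r => r.Summary);

    private static double Cosine(float[] a, float[] b)
    {
        double dot = 0, normA = 0, normB = 0;
        for (int i = 0; i < a.Length; i++)
        {
            dot += a[i] * b[i];
            normA += a[i] * a[i];
            normB += b[i] * b[i];
        }
        return dot / (Math.Sqrt(normA) * Math.Sqrt(normB) + 1e-9);
    }
}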
Step 3: The application features a chat window. When I ask about my running trends or request advice, the chat backend sends context data along with the query to the OpenAI endpoint:
var gptRequest = new
{
    model = "gpt-4-turbo",
    messages = new[]
    {
        new { role = "system", content = "You are a running coach." },
        new { role = "user", content = "Analyze my last four days of training and suggest improvements." }
    }
};
var resp = await httpClient.PostAsJsonAsync("https://api.openai.com/v1/chat/completions", gptRequest);
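To keep the advice grounded in my own history rather than generic coaching, the nearest run summaries retrieved in Step 2 are folded into the prompt before the call. A sketch of that glue, using the in-memory index from above and illustrative variable names (question, questionEmbedding, index):

// Retrieve the most relevant run summaries and prepend them as context.
var context = string.Join("\n", index.Search(questionEmbedding, k: 5));

var messages = new[]
{
    new { role = "system", content = "You are a running coach. Base your advice only on the runs provided." },
    new { role = "user", content = $"My recent runs:\n{context}\n\nQuestion: {question}" }
};

Grounding the model on retrieved summaries is what lets it compare training blocks from years apart without the entire history being stuffed into every prompt.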
By integrating OpenAI’s GPT-based models as a chat interface on top of my fitness vector database, I can now do more than just browse charts:
I can ask, "When did I last have a similar training block to this month?", and the LLM summarizes and compares periods, even if I don't remember the dates or the specific workouts.
It also handles questions like "Did my pace improve after switching shoes?" or "Which kind of recovery week led to my highest VO2max gains?" The LLM pulls contextually relevant embeddings from the database, which holds almost 10 years of my running information, draws parallels and trends, and gives coherent, actionable feedback.
I have now started triggering the query just before a workout to decide how long that specific run should be. The next step is to fetch the Garmin data, which seems to be a tricky case, and then integrate heart rate and stress levels to get more accurate feedback.