Log stats details for long running SQL queries#5376
Log stats details for long running SQL queries#5376apurvabhaleMS wants to merge 17 commits intomainfrom
Conversation
src/Microsoft.Health.Fhir.SqlServer/Features/Search/SqlServerSearchService.cs
Fixed
Show fixed
Hide fixed
src/Microsoft.Health.Fhir.SqlServer/Features/Search/SqlServerSearchService.cs
Fixed
Show fixed
Hide fixed
There was a problem hiding this comment.
Pull request overview
Adds best-effort logging of Query Store runtime stats for long-running SQL search queries in the SQL Server search layer, controlled via dbo.Parameters feature flag + threshold.
Changes:
- Adds a runtime feature flag (
Search.LongRunningQueryDetails.IsEnabled) and numeric threshold (Search.LongRunningQueryDetails.Threshold, default 60s). - Measures query execution time around
ExecuteReaderAsyncin both search and include-search paths. - Introduces query-text normalization and a Query Store lookup query to log recent runtime stats for matching statements.
src/Microsoft.Health.Fhir.SqlServer/Features/Search/SqlServerSearchService.cs
Outdated
Show resolved
Hide resolved
src/Microsoft.Health.Fhir.SqlServer/Features/Search/SqlServerSearchService.cs
Outdated
Show resolved
Hide resolved
src/Microsoft.Health.Fhir.SqlServer/Features/Search/SqlServerSearchService.cs
Outdated
Show resolved
Hide resolved
src/Microsoft.Health.Fhir.SqlServer/Features/Search/SqlServerSearchService.cs
Show resolved
Hide resolved
src/Microsoft.Health.Fhir.SqlServer/Features/Search/SqlServerSearchService.cs
Show resolved
Hide resolved
src/Microsoft.Health.Fhir.SqlServer/Features/Search/SqlServerSearchService.cs
Outdated
Show resolved
Hide resolved
src/Microsoft.Health.Fhir.SqlServer/Features/Search/SqlServerSearchService.cs
Outdated
Show resolved
Hide resolved
src/Microsoft.Health.Fhir.SqlServer/Features/Search/SqlServerSearchService.cs
Outdated
Show resolved
Hide resolved
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
Co-authored-by: Copilot Autofix powered by AI <62310815+github-advanced-security[bot]@users.noreply.github.com>
src/Microsoft.Health.Fhir.SqlServer/Features/Search/SqlServerSearchService.cs
Fixed
Show fixed
Hide fixed
src/Microsoft.Health.Fhir.SqlServer/Features/Search/SqlServerSearchService.cs
Fixed
Show fixed
Hide fixed
src/Microsoft.Health.Fhir.SqlServer/Features/Search/SqlServerSearchService.cs
Fixed
Show fixed
Hide fixed
src/Microsoft.Health.Fhir.SqlServer/Features/Search/SqlServerSearchService.cs
Fixed
Show fixed
Hide fixed
src/Microsoft.Health.Fhir.SqlServer/Features/Search/SqlServerSearchService.cs
Fixed
Show fixed
Hide fixed
src/Microsoft.Health.Fhir.SqlServer/Features/Search/SqlServerSearchService.cs
Fixed
Show fixed
Hide fixed
src/Microsoft.Health.Fhir.SqlServer/Features/Search/SqlServerSearchService.cs
Fixed
Show fixed
Hide fixed
src/Microsoft.Health.Fhir.SqlServer/Features/Storage/CachedParameter.cs
Dismissed
Show dismissed
Hide dismissed
There was a problem hiding this comment.
Pull request overview
Copilot reviewed 2 out of 2 changed files in this pull request and generated 9 comments.
Comments suppressed due to low confidence (1)
src/Microsoft.Health.Fhir.SqlServer/Features/Search/SqlServerSearchService.cs:570
- The stopwatch starts before ExecuteReaderAsync but stops after the entire reader processing completes, so the measured "execution" time includes result materialization/decompression in addition to SQL execution. If the intent is to detect long-running SQL, consider measuring only the DB call (e.g., around ExecuteReaderAsync / DB read loop) or clearly renaming this to reflect end-to-end processing time.
var st = DateTime.UtcNow;
var executionStopwatch = Stopwatch.StartNew();
try
{
using (var reader = await sqlCommand.ExecuteReaderAsync(CommandBehavior.SequentialAccess, cancellationToken))
{
src/Microsoft.Health.Fhir.SqlServer/Features/Search/SqlServerSearchService.cs
Outdated
Show resolved
Hide resolved
src/Microsoft.Health.Fhir.SqlServer/Features/Search/SqlServerSearchService.cs
Outdated
Show resolved
Hide resolved
src/Microsoft.Health.Fhir.SqlServer/Features/Search/SqlServerSearchService.cs
Show resolved
Hide resolved
src/Microsoft.Health.Fhir.SqlServer/Features/Search/SqlServerSearchService.cs
Outdated
Show resolved
Hide resolved
src/Microsoft.Health.Fhir.SqlServer/Features/Storage/CachedParameter.cs
Outdated
Show resolved
Hide resolved
src/Microsoft.Health.Fhir.SqlServer/Features/Search/SqlServerSearchService.cs
Show resolved
Hide resolved
src/Microsoft.Health.Fhir.SqlServer/Features/Search/SqlServerSearchService.cs
Show resolved
Hide resolved
src/Microsoft.Health.Fhir.SqlServer/Features/Search/SqlServerSearchService.cs
Outdated
Show resolved
Hide resolved
src/Microsoft.Health.Fhir.SqlServer/Features/Search/SqlServerSearchService.cs
Outdated
Show resolved
Hide resolved
src/Microsoft.Health.Fhir.SqlServer/Features/Search/SqlServerSearchService.cs
Outdated
Show resolved
Hide resolved
src/Microsoft.Health.Fhir.SqlServer/Features/Search/SqlServerSearchService.cs
Outdated
Show resolved
Hide resolved
src/Microsoft.Health.Fhir.SqlServer/Features/Storage/CachedParameter.cs
Outdated
Show resolved
Hide resolved
src/Microsoft.Health.Fhir.SqlServer/Features/Search/SqlServerSearchService.cs
Outdated
Show resolved
Hide resolved
src/Microsoft.Health.Fhir.SqlServer/Features/Search/SqlServerSearchService.cs
Show resolved
Hide resolved
src/Microsoft.Health.Fhir.SqlServer/Features/Search/SqlServerSearchService.cs
Show resolved
Hide resolved
src/Microsoft.Health.Fhir.SqlServer/Features/Search/SqlServerSearchService.cs
Outdated
Show resolved
Hide resolved
src/Microsoft.Health.Fhir.SqlServer/Features/Search/SqlServerSearchService.cs
Outdated
Show resolved
Hide resolved
...oft.Health.Fhir.SqlServer.UnitTests/Features/Search/SqlServerSearchServiceQueryStoreTests.cs
Show resolved
Hide resolved
src/Microsoft.Health.Fhir.SqlServer/Features/Search/SqlServerSearchService.cs
Fixed
Show fixed
Hide fixed
src/Microsoft.Health.Fhir.SqlServer/Features/Search/SqlServerSearchService.cs
Dismissed
Show dismissed
Hide dismissed
src/Microsoft.Health.Fhir.SqlServer/Features/Search/SqlServerSearchService.cs
Fixed
Show fixed
Hide fixed
...oft.Health.Fhir.SqlServer.UnitTests/Features/Search/SqlServerSearchServiceQueryStoreTests.cs
Outdated
Show resolved
Hide resolved
...oft.Health.Fhir.SqlServer.UnitTests/Features/Search/SqlServerSearchServiceQueryStoreTests.cs
Dismissed
Show dismissed
Hide dismissed
| catch (Exception ex) | ||
| { | ||
| _logger.LogWarning( | ||
| "Long-running SQL ({ElapsedMilliseconds}ms). Query: {QueryText}. Query Store lookup failed for long-running query.", | ||
| executionTimeSnapshot, | ||
| queryTextSnapshot); | ||
| _logger.LogDebug(ex, "Query Store lookup failed for long-running query."); | ||
| } |
Check notice
Code scanning / CodeQL
Generic catch clause Note
Show autofix suggestion
Hide autofix suggestion
Copilot Autofix
AI 2 days ago
In general, fix this by replacing catch (Exception) with one or more specific exception types that represent expected/benign failures in this context, and by optionally handling OperationCanceledException separately when using cancellation tokens. Leave genuinely fatal exceptions (e.g., OutOfMemoryException, ThreadAbortException) unhandled, so they are not silently swallowed.
For this code block, the exception can only be thrown from the Task.Run delegate, specifically from LogQueryStoreByTextAsync or from the CancellationTokenSource. Since this is an auxiliary SQL logging path, the most appropriate exceptions to handle are SqlException, IOException, and OperationCanceledException:
OperationCanceledException: expected if the 2-second logging timeout is hit; we should treat this as benign and either ignore or log at very low severity.SqlException: expected failures while querying Query Store.IOException: possible failures when writing logs, or ifLogQueryStoreByTextAsyncinteracts with I/O.
We then remove the outer catch (Exception ex) around the whole block, or at least narrow it similarly, because the only code inside is the conditional and the Task.Run scheduling, both of which should not throw in normal operation; any error there should be visible rather than silently debug-logged.
Concretely:
- Inside the
Task.Rundelegate:- Replace
catch (Exception ex)with:catch (OperationCanceledException)that does nothing (or logs atDebug).catch (SqlException ex)andcatch (IOException ex)that log at the sameWarning/Debuglevels as before.
- Replace
- Outside (below line 859), either remove the outer
try/catchor narrow it similarly. Since all heavy work is inside the delegate and already protected, and schedulingTask.Runshould not fail in typical ways we want to hide, we can safely remove the outertry/catch. If you prefer to keep a safety net, narrow it tocatch (OperationCanceledException)to cover any unexpected cancellation source usage.
No new methods are required. We may need using System.IO; and using Microsoft.Data.SqlClient;, but both are already present at the top of the file, so no import changes are necessary.
| catch (Exception ex) | ||
| { | ||
| _logger.LogWarning( | ||
| "Long-running SQL ({ElapsedMilliseconds}ms). Query: {QueryText}. Query Store lookup failed for long-running query.", | ||
| executionTimeSnapshot, | ||
| queryTextSnapshot); | ||
| _logger.LogDebug(ex, "Query Store lookup failed for long-running query."); | ||
| } |
Check notice
Code scanning / CodeQL
Generic catch clause Note
Show autofix suggestion
Hide autofix suggestion
Copilot Autofix
AI 2 days ago
In general, the fix is to replace broad catch (Exception) clauses with more specific exception types that represent expected, recoverable failures. Only those exceptions should be handled and suppressed; others should be allowed to propagate so they can be diagnosed and fixed.
Here, the outer try covers logic that checks elapsed time and, if necessary, schedules a background task to log query details. The main foreseeable exceptions are:
SqlExceptionfrom interacting with SQL-related objects.OperationCanceledExceptionin case of cancellation (though the cancellation token here is internal, this is still benign to ignore in this logging context).
We should change the outer catch (Exception ex) at lines 2020–2022 to two specific catch blocks: one for SqlException and one for OperationCanceledException. Both should keep the existing debug logging behavior (same message and exception variable), so functionality is preserved: non-critical failures in logging still don't affect the main operation, but unexpected exceptions (like NullReferenceException) will now propagate instead of being silently swallowed.
Concretely, within src/Microsoft.Health.Fhir.SqlServer/Features/Search/SqlServerSearchService.cs, in the area around lines 1987–2023, we will:
- Replace the single
catch (Exception ex)with:catch (SqlException ex)catch (OperationCanceledException ex)
- Keep the logging calls the same inside each catch.
No new imports are required; SqlException is already imported from Microsoft.Data.SqlClient, and OperationCanceledException is available via System.
| @@ -2017,10 +2017,14 @@ | ||
| }); | ||
| } | ||
| } | ||
| catch (Exception ex) | ||
| catch (SqlException ex) | ||
| { | ||
| _logger.LogDebug(ex, "Failed to initiate long-running SQL query logging."); | ||
| } | ||
| catch (OperationCanceledException ex) | ||
| { | ||
| _logger.LogDebug(ex, "Failed to initiate long-running SQL query logging."); | ||
| } | ||
| } | ||
| } | ||
| }, |
|
/azp run |
|
Azure Pipelines successfully started running 1 pipeline(s). |
Description
Feature Flag for Long-Running Query Details
• Added _longRunningQueryDetails (ProcessingFlag) to control whether Query Store logging is enabled at runtime via the Search.LongRunningQueryDetails.IsEnabled parameter in dbo.Parameters.
• The flag defaults to enabled (true), so logging is active unless explicitly disabled.
Configurable Threshold
• Added LongRunningQueryDetailsThresholdId ("Search.LongRunningQueryDetails.Threshold") to allow the threshold to be configured via dbo.Parameters at runtime.
• Added LongRunningThresholdMillisecondsDefault = 60000ms (1 minute) as the fallback when no database parameter exists.
Execution Time Measurement
• Added Stopwatch-based timing (executionStopwatch) around ExecuteReaderAsync in both SearchImpl and SearchIncludeImpl.
• The elapsed time is compared against the threshold; if exceeded and the feature flag is enabled, LogQueryStoreByTextAsync is called.
Query Store Stats Logging
• StripQueryPreambleLines —Strips diagnostic and parameter-declaration preamble lines from a SQL command's text so the remaining query body can be used as a search key for Query Store lookups via LIKE.
• LogQueryStoreByTextAsync — Queries sys.query_store_query_text joined to runtime stats DMVs.
Error Handling
• If the Query Store lookup itself fails with a SqlException, it's caught and logged as a warning (not rethrown), so it never impacts the original search operation.
Related issues
Addresses AB180196
Testing
Describe how this change was tested.
FHIR Team Checklist
Semver Change (docs)
Patch|Skip|Feature|Breaking (reason)