Passing entire lists in sql query, best practice
I am currently looking at ways to pass lists of integers in a SQL query and am trying to decide which one is best in which situation, what are the benefits of each and what are the pitfalls that should be avoided :)
Now I know 3 ways that we are currently using in our application.
1) Parameter evaluated by table: Create a new table parameter in sql server:
CREATE TYPE [dbo].[TVP_INT] AS TABLE(
[ID] [int] NOT NULL
)
Then run the query:
using (var conn = new SqlConnection(DataContext.GetDefaultConnectionString))
{
var comm = conn.CreateCommand();
comm.CommandType = CommandType.Text;
comm.CommandText = @"
UPDATE DA
SET [tsLastImportAttempt] = CURRENT_TIMESTAMP
FROM [Account] DA
JOIN @values IDs ON DA.ID = IDs.ID";
comm.Parameters.Add(new SqlParameter("values", downloadResults.Select(d => d.ID).ToDataTable()) { TypeName = "TVP_INT" });
conn.Open();
comm.ExecuteScalar();
}
The main drawbacks of this method is the fact that Linq does not support table-valued parameters (if you create an SP with a TVP parameter, linq will not be able to run it) :(
2) Convert the list to binary and use it in Linq! This is slightly better. Create a SP and you can run it in linq :)
For this, SP will have an IMAGE parameter and we will use a user-defined function (udf) to convert it to a table. Currently we have implementations of this function, written in C ++ and in assembly, both have almost the same performance :) Basically, each integer is represented by 4 bytes and is passed to SP. In .NET, we have an extension method that converts IEnumerable to byte array
Extension Method: public static Byte [] ToBinary (this IEnumerable intList) {return ToBinaryEnum (intList) .ToArray (); }
private static IEnumerable<Byte> ToBinaryEnum(IEnumerable<Int32> intList)
{
IEnumerator<Int32> marker = intList.GetEnumerator();
while (marker.MoveNext())
{
Byte[] result = BitConverter.GetBytes(marker.Current);
Array.Reverse(result);
foreach (byte b in result)
yield return b;
}
}
SP:
CREATE PROCEDURE [Accounts-UpdateImportAttempts]
@values IMAGE
AS
BEGIN
UPDATE DA
SET [tsLastImportAttempt] = CURRENT_TIMESTAMP
FROM [Account] DA
JOIN dbo.udfIntegerArray(@values, 4) IDs ON DA.ID = IDs.Value4
END
And we can use it by running SP directly or in any linq query we need
using (var db = new DataContext())
{
db.Accounts_UpdateImportAttempts(downloadResults.Select(d => d.ID).ToBinary());
// or
var accounts = db.Accounts
.Where(a => db.udfIntegerArray(downloadResults.Select(d => d.ID).ToBinary(), 4)
.Select(i => i.Value4)
.Contains(a.ID));
}
This method has the advantage of using compiled queries in linq (which will have the same sql definition and query plan, so will be cached as well) and can also be used in SP.
Both of these methods are theoretically unlimited, so you can pass millions of ints at a time :)
3) Simple linq.Contains () This is a simpler approach and is ideal for simple scripts. But of course this is a limitation.
using (var db = new DataContext())
{
var accounts = db.Accounts
.Where(a => downloadResults.Select(d => d.ID).Contains(a.ID));
}
The biggest drawback of this method is that each integer in the downloadResults variable will be passed as a separate int. In this case, the query is limited to sql (the maximum allowed parameters in the sql query, which is a couple thousand if I remember correctly).
So I would like to ask ... which one do you think is the best, and what other methods and approaches have I missed?
a source to share