RE: How to run your CTE just once, and re-use the output

There’s an excellent article ā€œHow to run your CTE just once, and re-use the outputā€ over at sqlsunday.com, but a few people on Reddit were wondering whether inserting into a #temp table or a @table variable would perform any better.

As Brent Ozar is fond of saying (paraphrasing!), ā€œIf you don’t know, you’re wasting your time tuning!ā€

Test setup: I spun up a Hyper-V VM with 6 CPUs at 4 GHz, 8 GB of RAM, and SQL Server 2019 Developer Edition; set the cost threshold for parallelism to 50 and server MAXDOP to 6; attached a copy of the Stack Overflow March 2016 database on a 7200 RPM 1 TB hard drive; set the compatibility level to 150 and database MAXDOP to 6; created the index given on sqlsunday.com (which took 9 minutes); and ran ā€œset statistics io on;ā€ before each query. Here are the results I see on this virtual test server.
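For reference, that setup boils down to roughly this T-SQL (a sketch; the database name StackOverflow is an assumption, so adjust it to match your copy):

EXEC sp_configure 'show advanced options', 1;
RECONFIGURE;
EXEC sp_configure 'cost threshold for parallelism', 50;
EXEC sp_configure 'max degree of parallelism', 6;
RECONFIGURE;

ALTER DATABASE [StackOverflow] SET COMPATIBILITY_LEVEL = 150;

USE [StackOverflow];
ALTER DATABASE SCOPED CONFIGURATION SET MAXDOP = 6;

SET STATISTICS IO ON;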


Query 1, “CTE + Union All”
(8926714 rows affected)

Table 'Users'. Scan count 28, logical reads 322600.
Table 'Posts'. Scan count 30, logical reads 333030.

Total logical reads: 655,630. This is our baseline.


Query 1-B, “Query 1 + Suggested Index”

Running Query 1 produces a suggestion to create another index, in addition to the index from the blog post that it already uses.

CREATE NONCLUSTERED INDEX ix_suggested_index
ON [dbo].[Users] ([Reputation])
INCLUDE ([DisplayName]);

This took 4 seconds to create. Let’s see if Query 1 runs any better with this new suggested index. (The suggested index is now used 4 times alongside the previous index.)

(8926714 rows affected)

Table 'Users'. Scan count 28, logical reads 99860.
Table 'Posts'. Scan count 30, logical reads 333030.

Logical reads on the Users table have dropped significantly, and the query returned results sooner.

Total logical reads: 432,890. (Better than Query 1.)


Query 2, “Query 1 – Expanded”
(8218141 rows affected)

Table 'Users'. Scan count 28, logical reads 323384.
Table 'Posts'. Scan count 28, logical reads 333030.

Already, we can see that the row counts are different. Query 1 is returning 708,573 more rows than Query 2. Without spending more time digging into why, my guess would be that the query was incorrectly expanded [from the CTE to subqueries] in the post.
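To illustrate what ā€œexpandedā€ means here: each UNION ALL branch gets its own copy of the aggregation as a derived table. A sketch of just the Views branch (the real Query 2 is in the sqlsunday.com post):

SELECT DisplayName, 'Views' AS Metric, ViewCount AS [Value]
FROM (
    SELECT u.DisplayName, u.Reputation,
           SUM(p.ViewCount) AS ViewCount
    FROM dbo.Users AS u
    LEFT JOIN dbo.Posts AS p ON p.OwnerUserId=u.Id AND p.PostTypeId IN (1, 2)
    GROUP BY u.DisplayName, u.Reputation
) AS q
WHERE ViewCount>0
--- UNION ALL ... three more branches like this, one per metric.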

Total logical reads: 656,414. (Worse. And the row count is now lower.)


Query 2-B, ā€œQuery 2 + Suggested Indexā€

And again, the same query as Query 2, now with the suggested index in place.

(8218141 rows affected)

Table 'Users'. Scan count 28, logical reads 99924.
Table 'Posts'. Scan count 29, logical reads 332248.

Logical reads on the Users table have dropped significantly, and this query also returned results sooner.

Total logical reads: 432,172. (Slightly better than the baseline. But I do not trust the results as the row count is still lower.)


Query 3, “CTE + Cross Apply + Union All”
(8926714 rows affected)

Table 'Posts'. Scan count 10, logical reads 109998.
Table 'Users'. Scan count 7, logical reads 24996.

Warning: Null value is eliminated by an aggregate or other SET operation.

Same row count, and fewer logical reads on both tables. Except now we have a NULL warning!
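The warning itself comes from SUM() aggregating NULL inputs (FavoriteCount, for example, is NULL on posts that have never been favorited). If you wanted to silence it, one option would be wrapping the aggregated columns in ISNULL, at the cost of turning NULL totals into zeros. A sketch against the aggregation used in Query 5 below:

SELECT u.DisplayName, u.Reputation,
        SUM(ISNULL(p.ViewCount, 0)) AS ViewCount,
        SUM(ISNULL(p.CommentCount, 0)) AS CommentCount,
        SUM(ISNULL(p.FavoriteCount, 0)) AS FavoriteCount
FROM dbo.Users AS u
LEFT JOIN dbo.Posts AS p ON p.OwnerUserId=u.Id AND p.PostTypeId IN (1, 2)
GROUP BY u.DisplayName, u.Reputation;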

I would prefer to rewrite all versions of the query to be properly sorted and then run a compare using WinMerge on the resulting text files.

Total logical reads: 134,994. (A lot better than the baseline!)
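As a quicker in-database alternative to the WinMerge idea above, running EXCEPT in both directions would also surface any rows that differ (a sketch; #q1 and #q3 are hypothetical temp tables holding each query’s output). One caveat: EXCEPT de-duplicates, so it won’t catch results that differ only in duplicate counts.

SELECT DisplayName, Metric, [Value] FROM #q1
EXCEPT
SELECT DisplayName, Metric, [Value] FROM #q3;

SELECT DisplayName, Metric, [Value] FROM #q3
EXCEPT
SELECT DisplayName, Metric, [Value] FROM #q1;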


Query 4, “CTE + Cross Apply”
(8926714 rows affected)

Table 'Posts'. Scan count 10, logical reads 110212.
Table 'Users'. Scan count 7, logical reads 24996.

It did run a few seconds faster than Query 3, with about the same logical reads, and the execution plan is also much cleaner. Remember: these tests are still being run with the SQL Server 2019 suggested index on the Users table, created after the original Query 1.

Total logical reads: 135,208. (A lot better than baseline!)


I’ll update this post with more information about using #temp tables and @table variables when I have the time (and ambition). For now, I have to get back to “working” šŸ˜‰.


…Later the next day…

Query 5, “Select into #temp + Cross Apply Values”
SELECT u.DisplayName, u.Reputation,
        SUM(p.ViewCount) AS ViewCount,
        SUM(p.CommentCount) AS CommentCount,
        SUM(p.FavoriteCount) AS FavoriteCount
INTO #temp
FROM dbo.Users AS u
LEFT JOIN dbo.Posts AS p ON p.OwnerUserId=u.Id AND p.PostTypeId IN (1, 2)
GROUP BY u.DisplayName, u.Reputation;

SELECT DisplayName, x.Metric, x.[Value]
FROM #temp cte
CROSS APPLY (
    VALUES ('Reputation', cte.Reputation),    --- 1
           ('Views',      cte.ViewCount),     --- 2
           ('Comments',   cte.CommentCount),  --- 3
           ('Favorited',  cte.FavoriteCount)  --- 4
    ) AS x(Metric, [Value])
WHERE x.[Value]>0;

And the results of selecting into a #temp table, and then selecting from that #temp table (otherwise the same query as Query 4).

(4569697 rows affected)

Table 'Posts'. Scan count 9, logical reads 110144.
Table 'Users'. Scan count 7, logical reads 24940.
Warning: Null value is eliminated by an aggregate or other SET operation.

(8926714 rows affected)

Table '#temp'. Scan count 7, logical reads 28206.

Total logical reads: 163,290. (Better than the baseline, but more than Query 4. Plus it took ~20 seconds longer to return results.)


Query 6, “insert @table then cross apply”
declare @temp table(
	DisplayName nvarchar(40),
	Reputation int,
	ViewCount int,
	CommentCount int,
	FavoriteCount int
)

insert into @temp( DisplayName, Reputation, ViewCount, CommentCount, FavoriteCount )
SELECT u.DisplayName, u.Reputation,
        SUM(p.ViewCount) AS ViewCount,
        SUM(p.CommentCount) AS CommentCount,
        SUM(p.FavoriteCount) AS FavoriteCount
FROM dbo.Users AS u
LEFT JOIN dbo.Posts AS p ON p.OwnerUserId=u.Id AND p.PostTypeId IN (1, 2)
GROUP BY u.DisplayName, u.Reputation;

SELECT DisplayName, x.Metric, x.[Value]
FROM @temp cte
CROSS APPLY (
    VALUES ('Reputation', cte.Reputation),    --- 1
           ('Views',      cte.ViewCount),     --- 2
           ('Comments',   cte.CommentCount),  --- 3
           ('Favorited',  cte.FavoriteCount)  --- 4
    ) AS x(Metric, [Value])
WHERE x.[Value]>0;

And the results of this @table query.

(4569697 rows affected)

Table 'Posts'. Scan count 2, logical reads 109728.
Table 'Users'. Scan count 1, logical reads 24782.
Warning: Null value is eliminated by an aggregate or other SET operation.

(8926714 rows affected)

Table '#B63020B0'. Scan count 7, logical reads 28203.

Total logical reads: 162,713. (Slightly better than Query 5, but still 27,505 more than Query 4. At 8 KB per page, that’s roughly 215 MB of extra reads, on top of the writes to populate the table variable, and a slower query!)


Conclusion

The article “How to run your CTE just once, and re-use theĀ output” over at sqlsunday.com had the correct idea.

Using #temp tables when you need the same data in multiple separate queries is usually a good idea (better than repeatedly querying the same source tables). But when the same data needs to be referenced multiple times within a single query, go with Query 4’s ā€œCTE + cross applyā€ method.

Why the @table variables sucked: as far as I’ve read, table variables don’t get column statistics, and before SQL Server 2019’s table variable deferred compilation the optimizer also assumed they held just one row. Either way, that can lead to less-than-ideal query plans. If I’m behind on my readings about this, please let me know with a link to some updated reading material!
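If you’re stuck with a @table variable, the classic mitigation is adding OPTION (RECOMPILE) to the statement that reads from it, so the optimizer at least compiles with the actual row count. A sketch, reusing Query 6’s variable:

SELECT DisplayName, x.Metric, x.[Value]
FROM @temp cte
CROSS APPLY (
    VALUES ('Reputation', cte.Reputation),
           ('Views',      cte.ViewCount),
           ('Comments',   cte.CommentCount),
           ('Favorited',  cte.FavoriteCount)
    ) AS x(Metric, [Value])
WHERE x.[Value]>0
OPTION (RECOMPILE);

(On SQL Server 2019 under compatibility level 150, table variable deferred compilation already fixes the row-count estimate, so this matters most on older versions and compatibility levels.)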