Categories
Protiguous Quote

Get Angry

Get angry. Do something positive about it.
Just don’t stay mad.

Protiguous, 2021

Categories
Anything

Don’t Rush Me

Good riddance, you vile, racist, cigar-smoking scumbag.

Categories
SQL Server

Microsoft SQL Server Updates and Patches

https://sqlserverbuilds.blogspot.com/

Categories
truth

Cult Warning Signs

  • The Chosen One is always right.
  • Criticism of the Chosen One is wrong and shameful.
  • Anything the Chosen One does is justified; ignore the consequences.
  • The Chosen One is the only source of Truth; everybody else is lying.
  • Anyone not loyal to the Chosen One is the enemy.
Categories
Protiguous Quote Quotes

Once I was wrong

If you can learn that you were wrong, then you won’t be anymore.

Protiguous, 2021
Categories
coronavirus politics

How many people would still be alive if Donald Trump had never been President?

Categories
Numbers SQL SQL Script Strings

SQL Server Function: How to format bytes into greater units like MB, GB, TB, etc

Here’s a link to my gist with a SQL scalar function that formats a byte count into a string using a larger unit.

Examples

select [dbo].[FormatBytes]( 1324.13, 'byte' );
select [dbo].[FormatBytes]( 13241322.567567, 'bytes' );
select [dbo].[FormatBytes]( 1324132255.567567, 'tb' );

Link: https://gist.github.com/Protiguous/c40ef32ee4fa0f4e93c9413d6cc3d6bd

select [dbo].[FormatBytes]( 13241322.567567, 'bytes' );
select [dbo].[FormatBytes]( 132413225.567567, 'bytes' );
select [dbo].[FormatBytes]( 1324132255.567567, 'bytes' );
select [dbo].[FormatBytes]( 13241322551.567567, 'bytes' );
select [dbo].[FormatBytes]( 13241322.567567, 'mb' );
select [dbo].[FormatBytes]( 132413225.567567, 'gb' );
select [dbo].[FormatBytes]( 1324132255.567567, 'tb' );
select [dbo].[FormatBytes]( 13241322551.567567, 'zb' );
select [dbo].[FormatBytes]( 13241322551.567567, 'yb' );
select [dbo].[FormatBytes]( 132413225512.567567, 'yb' );
select [dbo].[FormatBytes]( 132413225, 'bb' );
select [dbo].[FormatBytes]( 1324132253, 'bb' ); -- too big!
select [dbo].[FormatBytes]( 1324.13, 'byte' );
select [dbo].[FormatBytes]( 1324135, 'geopbyte' ); -- too big!
create or alter function [dbo].[FormatBytes] ( @bytes decimal(38, 0), @toUnit nvarchar(15) = N'bytes' )
returns sysname
with schemabinding
as
begin
declare @prefix decimal(38, 0); -- Adjust the precision to your needs: the lower the precision, the higher the byte values that can be scaled.
set @toUnit = trim(@toUnit);
if @toUnit is null return null;
set @bytes = @bytes *
case @toUnit collate SQL_Latin1_General_CP1_CI_AI
when N'b' then 1
when N'byte' then 1
when N'bytes' then 1
when N'kb' then 1024
when N'kilobyte' then 1024
when N'kilobytes' then 1024
when N'mb' then 1048576
when N'megabyte' then 1048576
when N'megabytes' then 1048576
when N'gb' then 1073741824
when N'gigabyte' then 1073741824
when N'gigabytes' then 1073741824
when N'tb' then 1099511627776
when N'terabyte' then 1099511627776
when N'terabytes' then 1099511627776
when N'pb' then 1125899906842624
when N'petabyte' then 1125899906842624
when N'petabytes' then 1125899906842624
when N'eb' then 1152921504606846976
when N'exabyte' then 1152921504606846976
when N'exabytes' then 1152921504606846976
when N'zb' then 1180591620717411303424
when N'zettabyte' then 1180591620717411303424
when N'zettabytes' then 1180591620717411303424
when N'yb' then 1208925819614629174706176
when N'yottabyte' then 1208925819614629174706176
when N'yottabytes' then 1208925819614629174706176
when N'bb' then 1237940039285380274899124224
when N'brontobyte' then 1237940039285380274899124224
when N'brontobytes' then 1237940039285380274899124224
when N'geopbyte' then 1267650600228229401496703205376
when N'geopbytes' then 1267650600228229401496703205376
end;
set @prefix =
case
when abs(@bytes) < 1024 then @bytes --bytes
when abs(@bytes) < 1048576 then (@bytes / 1024) --kb
when abs(@bytes) < 1073741824 then (@bytes / 1048576) --mb
when abs(@bytes) < 1099511627776 then (@bytes / 1073741824) --gb
when abs(@bytes) < 1125899906842624 then (@bytes / 1099511627776) --tb
when abs(@bytes) < 1152921504606846976 then (@bytes / 1125899906842624) --pb
when abs(@bytes) < 1180591620717411303424 then (@bytes / 1152921504606846976) --eb
when abs(@bytes) < 1208925819614629174706176 then (@bytes / 1180591620717411303424) --zb
when abs(@bytes) < 1237940039285380274899124224 then (@bytes / 1208925819614629174706176) --yb
when abs(@bytes) < 1267650600228229401496703205376 then (@bytes / 1237940039285380274899124224) --bb
else (@bytes / 1267650600228229401496703205376) --geopbytes
end;
return convert(sysname, @prefix) +
case
when abs(@bytes) < 1024 then N' Bytes'
when abs(@bytes) < 1048576 then N' KB'
when abs(@bytes) < 1073741824 then N' MB'
when abs(@bytes) < 1099511627776 then N' GB'
when abs(@bytes) < 1125899906842624 then N' TB'
when abs(@bytes) < 1152921504606846976 then N' PB'
when abs(@bytes) < 1180591620717411303424 then N' EB'
when abs(@bytes) < 1208925819614629174706176 then N' ZB'
when abs(@bytes) < 1237940039285380274899124224 then N' YB'
when abs(@bytes) < 1267650600228229401496703205376 then N' BB'
else N' geopbytes'
end;
end;
Categories
2020 politics science trump virus

Bubonic Freedoms

Imagine people in Europe during the time of the Bubonic plague.
On one side, people with a strong rational fear of rats.
On the other, a bunch of morons in red hats waving around their pet rats and talking about freedom.

Categories
Benchmarking Numbers script

Jeff Moden’s Tally Table Function (Generate Large Numbers)

Testing Jeff’s excellent number-table generator with the examples below takes ~8 to ~13 minutes on this 6-CPU, 8 GB RAM, 3.8 GHz virtual machine running SQL Server 2019 Developer Edition CU8. Server maxdop is set to 6, the parallelism cost threshold is set to 50, and database maxdop is also set to 6.

Disclaimer: This virtual machine is not optimized for performance. The 6 equally sized tempdb data files are purposely located on slow storage (only ~6200 IOPS) through the same storage controller in an attempt to exaggerate the effect of any “bad” parts of queries. There are also backup (and various other) jobs possibly running during these tests.


Test Query 1: Select Into #temp version, 1 billion numbers.

drop table if exists #temp;
select [N] into #temp from [dbo].[fnTallyBig]( 1073741824 );

Result: Test query 1 took ~13 minutes, went parallel and used all 6 cores; hovered around 15% CPU usage – according to the Hyper-V Manager.


Test Query 2: Select @n=N version, 1 billion numbers.

declare @n bigint;
select @n=[N] from [dbo].[fnTallyBig]( 1073741824 );

Result: Test query 2 took ~8 minutes and also went parallel and used all 6 cores; hovered around 9% CPU usage – according to the Hyper-V Manager.


Testing conclusion: There are no known bad parts of Jeff Moden’s script. (Yay!)
Also: The tempdb performance on this virtual machine is horrible! 😉


This is the modified version of Jeff Moden’s “fnTally” script to create the number generating function. I’ve removed the @ZeroOrOne parameter in favor of always starting at zero.

If you read through the comments, there are alternate versions, including one that lets you specify the starting number!

CREATE OR ALTER FUNCTION [dbo].[fnTallyBig]( @MaxN BIGINT )
/**********************************************************************************************************************
 Purpose:
 Return a column of BIGINTs from 0 up to and including @MaxN, with a max value of 4,294,967,296.

 Usage:
--===== Syntax example
 SELECT t.[N]
   FROM [dbo].[fnTallyBig](@MaxN) t;
 
 select t.[N] into #numbers from [dbo].[fnTallyBig](4294967296) t;

 @MaxN has an operational domain from 0 to 4,294,967,296. Silent truncation occurs for larger numbers.

 Please see the following notes for other important information.

Original script can be found at https://www.sqlservercentral.com/scripts/create-a-tally-function-fntally

 Jeff's Notes:
 1. This code works for SQL Server 2008 and up.
 2. Based on Itzik Ben-Gan's cascading CTE (cCTE) method for creating a "readless" Tally Table source of BIGINTs.
    Refer to the following URL for how it works.
    https://www.itprotoday.com/sql-server/virtual-auxiliary-table-numbers
 3. To start a sequence at 0, @ZeroOrOne must be 0. Any other value that's convertible to the BIT data-type will cause the sequence to start at 1.
 4. If @ZeroOrOne = 1 and @MaxN = 0, no rows will be returned.
    (The notes mentioning @ZeroOrOne describe Jeff's original version; this modified version removes that parameter and always starts at 0.)
 5. If @MaxN is negative or NULL, a "TOP" error will be returned.
 6. @MaxN must be a positive number from >= the value of @ZeroOrOne up to and including 4,294,967,296. If a larger number is used, the function will silently truncate after that max. If you actually need a sequence with that many or more values, you should consider using a different tool. 😉
 7. There will be a substantial reduction in performance if "N" is sorted in descending order. If a descending sort is required, use code similar to the following. Performance will decrease by about 27% but it's still very fast especially compared with just doing a simple descending sort on "N", which is about 20 times slower.
    If @ZeroOrOne is a 0, in this case, remove the "+1" from the code.

    DECLARE @MaxN BIGINT; 
     SELECT @MaxN = 1000;
     SELECT DescendingN = @MaxN-N+1 
       FROM dbo.fnTally2(@MaxN);

 8. There is no performance penalty for sorting "N" in ascending order because the output is implicitly sorted by ROW_NUMBER() OVER (ORDER BY (SELECT NULL))
 9. This will return 1-10,000,000 to a bit-bucket variable in about 986ms.
    This will return 0-10,000,000 to a bit-bucket variable in about 1091ms.
    This will return 1-4,294,967,296 to a bit-bucket variable in about 9:12(mi:ss).

 Revision History:
 Rev 00 - Unknown     - Jeff Moden 
        - Initial creation with error handling for @MaxN.
 Rev 01 - 09 Feb 2013 - Jeff Moden 
        - Modified to start at 0 or 1.
 Rev 02 - 16 May 2013 - Jeff Moden 
        - Removed error handling for @MaxN because of exceptional cases.
 Rev 03 - 07 Sep 2013 - Jeff Moden 
        - Change the max for @MaxN from 10 Billion to 10 Quadrillion to support an experiment. 
          This will also make it much more difficult for someone to actually get silent truncation in the future.
 Rev 04 - 04 Aug 2019 - Jeff Moden
        - Enhance performance by making the first CTE provide 256 values instead of 10, which limits the number of CrossJoins to just 2. Notice that this changes the maximum range of values to "just" 4,294,967,296, which is the entire range for INT and just happens to be an even power of 256. Because of the use of the VALUES clause, this code is "only" compatible with SQLServer 2008 and above.
        - Update old link from "SQLMag" to "ITPro". Same famous original article, just a different link because they changed the name of the company (twice, actually).
        - Update the flower box notes with the other changes.
**********************************************************************************************************************/
      
RETURNS TABLE WITH SCHEMABINDING AS 
 RETURN WITH
  H2(N) AS ( SELECT 1 
               FROM (VALUES
                     (1),(1),(1),(1),(1),(1),(1),(1),(1),(1),(1),(1),(1),(1),(1),(1)
                    ,(1),(1),(1),(1),(1),(1),(1),(1),(1),(1),(1),(1),(1),(1),(1),(1)
                    ,(1),(1),(1),(1),(1),(1),(1),(1),(1),(1),(1),(1),(1),(1),(1),(1)
                    ,(1),(1),(1),(1),(1),(1),(1),(1),(1),(1),(1),(1),(1),(1),(1),(1)
                    ,(1),(1),(1),(1),(1),(1),(1),(1),(1),(1),(1),(1),(1),(1),(1),(1)
                    ,(1),(1),(1),(1),(1),(1),(1),(1),(1),(1),(1),(1),(1),(1),(1),(1)
                    ,(1),(1),(1),(1),(1),(1),(1),(1),(1),(1),(1),(1),(1),(1),(1),(1)
                    ,(1),(1),(1),(1),(1),(1),(1),(1),(1),(1),(1),(1),(1),(1),(1),(1)
                    ,(1),(1),(1),(1),(1),(1),(1),(1),(1),(1),(1),(1),(1),(1),(1),(1)
                    ,(1),(1),(1),(1),(1),(1),(1),(1),(1),(1),(1),(1),(1),(1),(1),(1)
                    ,(1),(1),(1),(1),(1),(1),(1),(1),(1),(1),(1),(1),(1),(1),(1),(1)
                    ,(1),(1),(1),(1),(1),(1),(1),(1),(1),(1),(1),(1),(1),(1),(1),(1)
                    ,(1),(1),(1),(1),(1),(1),(1),(1),(1),(1),(1),(1),(1),(1),(1),(1)
                    ,(1),(1),(1),(1),(1),(1),(1),(1),(1),(1),(1),(1),(1),(1),(1),(1)
                    ,(1),(1),(1),(1),(1),(1),(1),(1),(1),(1),(1),(1),(1),(1),(1),(1)
                    ,(1),(1),(1),(1),(1),(1),(1),(1),(1),(1),(1),(1),(1),(1),(1),(1)
                    ) V(N))           --16^2 or 256 rows
, H4(N) AS (SELECT 1 FROM H2 a, H2 b) --16^4 or 65,536 rows
, H8(N) AS (SELECT 1 FROM H4 a, H4 b) --16^8 or 4,294,967,296 rows
            SELECT N = 0 UNION ALL
            SELECT TOP(@MaxN)
                   N = ROW_NUMBER() OVER (ORDER BY N)
              FROM H8;
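
A couple of quick sanity checks (my own additions, not part of Jeff’s script) to confirm the always-start-at-zero behavior described above:

select count_big(*) from [dbo].[fnTallyBig]( 10 );  -- 11 rows: 0 through 10, inclusive
select min([N]) from [dbo].[fnTallyBig]( 1000000 ); -- 0
select max([N]) from [dbo].[fnTallyBig]( 1000000 ); -- 1000000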
Categories
Benchmarking Developers SQL Tuning

RE: How to run your CTE just once, and re-use the output

There’s an excellent article “How to run your CTE just once, and re-use the output” over at sqlsunday.com, but a few people on Reddit are wondering if inserting into a #temp table or @table variable would be any better [for performance].

As Brent Ozar is fond of saying [paraphrasing!], “If you don’t know, you’re wasting your time tuning!”

Spinning up a 4 GHz, 6-CPU Hyper-V virtual machine with 8 GB RAM and SQL Server 2019 Developer Edition (parallelism cost threshold set to 50, server maxdop set to 6), attaching a copy of the Stack Overflow March 2016 database on a 7200 RPM 1 TB hard drive, setting the compatibility level to 150, setting database maxdop to 6, creating the index given on sqlsunday.com (which took 9 minutes), and turning on “set statistics io on;”, here are the results I see on this virtual test server for each query.


Query 1, “CTE + Union All”
(8926714 rows affected)

Table 'Users'. Scan count 28, logical reads 322600.
Table 'Posts'. Scan count 30, logical reads 333030.

Total logical reads: 655,630. This is our baseline.
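
(The full query is over at sqlsunday.com; as a hedged sketch for readers following along, Query 1 presumably looks something like the following, inferred from its “CTE + Union All” title and from the aggregation shown in Query 5 below. Each UNION ALL branch re-reads the CTE, so the base tables get scanned multiple times, which is consistent with the high scan counts above.)

WITH cte AS (
    SELECT u.DisplayName, u.Reputation,
           SUM(p.ViewCount) AS ViewCount,
           SUM(p.CommentCount) AS CommentCount,
           SUM(p.FavoriteCount) AS FavoriteCount
    FROM dbo.Users AS u
    LEFT JOIN dbo.Posts AS p ON p.OwnerUserId=u.Id AND p.PostTypeId IN (1, 2)
    GROUP BY u.DisplayName, u.Reputation)

SELECT DisplayName, 'Reputation' AS Metric, Reputation AS [Value] FROM cte WHERE Reputation>0
UNION ALL
SELECT DisplayName, 'Views', ViewCount FROM cte WHERE ViewCount>0
UNION ALL
SELECT DisplayName, 'Comments', CommentCount FROM cte WHERE CommentCount>0
UNION ALL
SELECT DisplayName, 'Favorited', FavoriteCount FROM cte WHERE FavoriteCount>0;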


Query 1-B, “Query 1 + Suggested Index”

Running Query 1, SQL Server suggests creating an index, in addition to the index already created from the blog post.

CREATE NONCLUSTERED INDEX ix_suggested_index
ON [dbo].[Users] ([Reputation])
INCLUDE ([DisplayName]);

This took 4 seconds to create. Let’s see if Query 1 runs any better with this new suggested index. (The suggested index is now used 4 times alongside the previous index.)

(8926714 rows affected)

Table 'Users'. Scan count 28, logical reads 99860.
Table 'Posts'. Scan count 30, logical reads 333030.

Logical reads on the Users table have dropped significantly, and the query returned results sooner.

Total logical reads: 432,890. (Better than Query 1.)


Query 2, “Query 1 – Expanded”
(8218141 rows affected)

Table 'Users'. Scan count 28, logical reads 323384.
Table 'Posts'. Scan count 28, logical reads 333030.

Already, we can see that the row counts are different. Query 1 is returning 708,573 more rows than Query 2. Without spending more time digging into why, my guess would be that the query was incorrectly expanded [from the CTE to subqueries] in the post.

Total logical reads: 656,414. (Worse. And the row count is now lower.)


And again, now with the suggested index. (Same query as Query 2, just with index.)

(8218141 rows affected)

Table 'Users'. Scan count 28, logical reads 99924.
Table 'Posts'. Scan count 29, logical reads 332248.

Logical reads on the Users table have dropped significantly, and this query also returned results sooner.

Total logical reads: 432,172. (Slightly better than the baseline. But I do not trust the results as the row count is still lower.)


Query 3, “CTE + Cross Apply + Union All”
(8926714 rows affected)

Table 'Posts'. Scan count 10, logical reads 109998.
Table 'Users'. Scan count 7, logical reads 24996.

Warning: Null value is eliminated by an aggregate or other SET operation.

Same row count, fewer logical reads on both tables. Except now we have a null-warning!

I would prefer to rewrite all versions of the query to be properly sorted and then run a compare using WinMerge on the resulting text files.

Total logical reads: 134,994. (A lot better than the baseline!)


Query 4, “CTE + Cross Apply”
(8926714 rows affected)

Table 'Posts'. Scan count 10, logical reads 110212.
Table 'Users'. Scan count 7, logical reads 24996.

It did run a few seconds faster than Query 3 with about the same logical reads. The execution plan is also much cleaner. Remember: these tests are still being run with the SQL Server 2019 suggested index created after the original Query 1 on the Users table.

Total logical reads: 135,208. (A lot better than baseline!)
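
(Again, the exact query is in the sqlsunday.com article. Going by Query 5 below, which is described as the same query except for the #temp table, Query 4 should look roughly like this hedged sketch: the aggregation stays in a CTE and is unpivoted in place with CROSS APPLY VALUES.)

WITH cte AS (
    SELECT u.DisplayName, u.Reputation,
           SUM(p.ViewCount) AS ViewCount,
           SUM(p.CommentCount) AS CommentCount,
           SUM(p.FavoriteCount) AS FavoriteCount
    FROM dbo.Users AS u
    LEFT JOIN dbo.Posts AS p ON p.OwnerUserId=u.Id AND p.PostTypeId IN (1, 2)
    GROUP BY u.DisplayName, u.Reputation)

SELECT DisplayName, x.Metric, x.[Value]
FROM cte
CROSS APPLY (
    VALUES ('Reputation', cte.Reputation),
           ('Views',      cte.ViewCount),
           ('Comments',   cte.CommentCount),
           ('Favorited',  cte.FavoriteCount)
    ) AS x(Metric, [Value])
WHERE x.[Value]>0;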


I’ll update this post with more information about using #temp tables and @table variables when I have the time (and ambition). For now, I have to get back to “working” 😉.


…Later the next day…

Query 5, “Select into #temp + Cross Apply Values”
SELECT u.DisplayName, u.Reputation,
        SUM(p.ViewCount) AS ViewCount,
        SUM(p.CommentCount) AS CommentCount,
        SUM(p.FavoriteCount) AS FavoriteCount
into #temp
FROM dbo.Users AS u
LEFT JOIN dbo.Posts AS p ON p.OwnerUserId=u.Id AND p.PostTypeId IN (1, 2)
GROUP BY u.DisplayName, u.Reputation;

SELECT DisplayName, x.Metric, x.[Value]
FROM #temp cte
CROSS APPLY (
    VALUES ('Reputation', cte.Reputation),    --- 1
           ('Views',      cte.ViewCount),     --- 2
           ('Comments',   cte.CommentCount),  --- 3
           ('Favorited',  cte.FavoriteCount)  --- 4
    ) AS x(Metric, [Value])
WHERE x.[Value]>0;

And the results of selecting into a #temp table, and then selecting from that #temp table (otherwise the same query as Query 4).

(4569697 rows affected)

Table 'Posts'. Scan count 9, logical reads 110144.
Table 'Users'. Scan count 7, logical reads 24940.
Warning: Null value is eliminated by an aggregate or other SET operation.

(8926714 rows affected)

Table '#temp'. Scan count 7, logical reads 28206.

Total logical reads: 163,290. (Better than baseline, but more than Query 4. Plus, it took ~20 seconds longer to return results.)


Query 6, “insert @table then cross apply”
declare @temp table(
	DisplayName nvarchar(40),
	Reputation int,
	ViewCount int,
	CommentCount int,
	FavoriteCount int
)

insert into @temp( DisplayName, Reputation, ViewCount, CommentCount, FavoriteCount )
SELECT u.DisplayName, u.Reputation,
        SUM(p.ViewCount) AS ViewCount,
        SUM(p.CommentCount) AS CommentCount,
        SUM(p.FavoriteCount) AS FavoriteCount
FROM dbo.Users AS u
LEFT JOIN dbo.Posts AS p ON p.OwnerUserId=u.Id AND p.PostTypeId IN (1, 2)
GROUP BY u.DisplayName, u.Reputation;

SELECT DisplayName, x.Metric, x.[Value]
FROM @temp cte
CROSS APPLY (
    VALUES ('Reputation', cte.Reputation),    --- 1
           ('Views',      cte.ViewCount),     --- 2
           ('Comments',   cte.CommentCount),  --- 3
           ('Favorited',  cte.FavoriteCount)  --- 4
    ) AS x(Metric, [Value])
WHERE x.[Value]>0;

And the results of this @table query.

(4569697 rows affected)

Table 'Posts'. Scan count 2, logical reads 109728.
Table 'Users'. Scan count 1, logical reads 24782.
Warning: Null value is eliminated by an aggregate or other SET operation.

(8926714 rows affected)

Table '#B63020B0'. Scan count 7, logical reads 28203.

Total logical reads: 162,713. (Slightly better than Query 5, but still 27,505 more than Query 4. That’s an extra 225 MB in writes+reads and a slower query!)


Conclusion

The article “How to run your CTE just once, and re-use the output” over at sqlsunday.com had the correct idea.

Using #temp tables when you need to use the same data in multiple separate queries is usually a good idea (better than repeatedly querying the same source tables), but if the same data needs to be queried multiple times within a single query, then go with Query 4’s “CTE + cross apply” method.

Why the @table variable sucked: as far as I’ve read, we don’t get up-to-date statistics when using @table variables, and that causes less-than-ideal query plans. If I’m behind on my readings about this, please let me know with a link to some updated reading material!
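
A quick way to see the estimate problem for yourself is the sketch below (my own hypothetical demo; it reuses the fnTallyBig function from the earlier post). Run it with the actual execution plan enabled and compare the estimated vs. actual rows on the @table variable scan: without a recompile hint, the optimizer typically guesses very few rows, while OPTION (RECOMPILE) lets it see the true row count. (SQL Server 2019’s table-variable deferred compilation also improves the estimate under compatibility level 150.)

declare @t table ( [N] bigint not null );

insert into @t ( [N] )
select [N] from [dbo].[fnTallyBig]( 100000 ); -- 100,001 rows: 0 through 100000

-- Legacy behavior: the @table variable is estimated at a tiny, fixed row count.
select count_big(*) from @t as t where t.[N] % 2 = 0;

-- With OPTION (RECOMPILE), the optimizer sees the true row count at compile time.
select count_big(*) from @t as t where t.[N] % 2 = 0 option ( recompile );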