THE SQL Server Blog Spot on the Web


Greg Low (The Bit Bucket: IDisposable)

Ramblings of Greg Low (SQL Server MVP, MCM and Microsoft RD) - SQL Down Under

  • Power Query Webinar this week

    Hi Folks,

    That Power Query webinar that I’m doing for the global Power BI team is on this week.

    Love to see you attend.

    Registration is here:

  • Database on a diet–final section (part 4)

    Hi Folks,

    I’ve been writing articles for SQL Server Magazine as part of my Data Tales series.

    Over the last couple of months, I’ve been writing a mini-series about a database on a diet.

    Earlier parts described the need for size reduction, how ROW and PAGE compression come into play, how to choose between them, etc.

    Part 4 (the final part) is now online here:

    It covers GZIP compression of large string fields, PDF size reduction in BLOBS, and more.
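    For the GZIP part, SQL Server 2016 added the COMPRESS and DECOMPRESS built-in functions, which use the GZIP algorithm. As a minimal sketch of the technique (the string here is just made-up sample data):

    ```sql
    -- COMPRESS returns a GZIP-compressed varbinary(max);
    -- DECOMPRESS reverses it. Both are new in SQL Server 2016.
    DECLARE @Original nvarchar(max) = REPLICATE(N'A large, repetitive string field. ', 1000);
    DECLARE @Squeezed varbinary(max) = COMPRESS(@Original);

    SELECT DATALENGTH(@Original) AS OriginalBytes,
           DATALENGTH(@Squeezed) AS CompressedBytes,
           CAST(DECOMPRESS(@Squeezed) AS nvarchar(max)) AS RoundTripped;
    ```

    Note that the compressed value must be stored in a varbinary(max) column, and you must DECOMPRESS (and CAST back) before the string can be searched or displayed.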

    Enjoy !

  • Power Query Webinar

    Hi Folks,

    Want to get your head around Power Query in both Excel and Power BI Desktop?

    I've got a webinar happening for the Power BI team later this month:

  • SQL Server index updates vs PostgreSQL index updates

    The link below is to quite a fascinating article that, on the surface, is about why Uber moved from PostgreSQL to MySQL.

    What it is really about is a severe limitation in how PostgreSQL indexes are updated.

    With SQL Server, if you update a few columns in a row, only the indexes that include those columns are updated. In PostgreSQL, all indexes on the table still need to be updated. I understand why they've done that, but it's not a good outcome.
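    As a simple illustration (the table and index here are hypothetical), an update to a column that isn't part of a nonclustered index leaves that index untouched in SQL Server:

    ```sql
    CREATE TABLE dbo.Customers
    (
        CustomerID int NOT NULL PRIMARY KEY,
        CustomerName nvarchar(100) NOT NULL,
        Balance decimal(18,2) NOT NULL
    );
    CREATE INDEX IX_Customers_CustomerName ON dbo.Customers (CustomerName);

    -- Balance isn't part of IX_Customers_CustomerName, so in SQL Server this
    -- update touches only the base table. In PostgreSQL, every index on the
    -- table would generally need a new entry - the heart of the article's complaint.
    UPDATE dbo.Customers SET Balance += 100.00 WHERE CustomerID = 1;
    ```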

    SQL Server shines in this regard.

  • Happy Birthday Power BI from all of us

    It’s hard to believe that it’s only a year, given how far Power BI has come.

    Many of us in the community got together under the lead of Paul Turley and Adam Saxton and created a short video for Power BI’s birthday.

    Looks good:



    So to James Phillips and all the Power BI team: happy birthday from Down Under, Power BI!


    (Good to see some locals in there too.)

  • SQL Server 2016 Polybase Services stuck in Change Pending state

    I’ve been using the Polybase services in SQL Server 2016 all throughout the pre-release period without any issue.

    When I tried using them in the GA (General Availability) release, I could never get the two Polybase services to start. They were always stuck in a “Change Pending” state, and trying to start them in SQL Server Configuration Manager made SSCM appear to hang.

    Trying to use the Polybase services to create an external data source returned:

    TCP Provider: No connection could be made because the target machine actively refused it.


    Fellow MVP John Paul Cook came to the rescue today. He previously had an issue with TCP Chimney support and noted that it looked similar.

    True enough, the problem was that TCP wasn’t an enabled protocol on my SQL Server installation. Enabling TCP and restarting the services fixed the problem.

    Thanks John !

  • Search for a String in all Tables of a Database

    When I work at customer sites, there is quite a difference in how different sites get data for testing. But it basically comes down to a variation on one of these five:

    • Testing? That’s for wimps. (Well perhaps for users)
    • I type in all the test data that I need.
    • We use a data generator to produce our test data.
    • We just periodically get a copy of production data.
    • We get production data and it gets masked before we can use it.

    It’s the last option that I’m concerned with today. Masking data is challenging, and usually much more difficult than imagined. This week, I was working with a masked database and we ended up finding all the client details tucked inside a chunk of XML that was used to hold the metadata for a series of images. They had done a good job of removing the details from all the normal columns in all the tables in the database, but it’s so easy to miss extras like this.

    So after we fixed the issue, I was left wondering how effectively it really had all been masked.

    The following script takes a while to run on a large database. It takes a string to search for, and looks to find any locations in the database that hold that string. Optionally, it dumps out a copy of all rows containing the string. It does this for char, nchar, varchar, nvarchar, varchar(max), nvarchar(max), and xml data types.

    I hope someone finds it useful.

    -- Search for a string anywhere in a database
    -- v1.0 Dr Greg Low, 11 June 2016

    DECLARE @StringToSearchFor nvarchar(max) = N'Jones';
    DECLARE @IncludeActualRows bit = 1;

    DECLARE @SchemaName sysname;
    DECLARE @TableName sysname;
    DECLARE @ColumnName sysname;
    DECLARE @IsNullable bit;
    DECLARE @TableObjectID int;
    DECLARE @Message nvarchar(max);
    DECLARE @FullTableName nvarchar(max);
    DECLARE @BaseDataTypeName sysname;
    DECLARE @WereStringColumnsFound bit;
    DECLARE @Predicate nvarchar(max);
    DECLARE @SQL nvarchar(max);
    DECLARE @SummarySQL nvarchar(max) = N'';
    DECLARE @NumberOfTables int;
    DECLARE @TableCounter int = 0;

    IF OBJECT_ID(N'tempdb..#FoundLocations') IS NOT NULL
    BEGIN
        DROP TABLE #FoundLocations;
    END;

    CREATE TABLE #FoundLocations
    (
        FullTableName nvarchar(max),
        NumberOfRows bigint
    );

    SET @NumberOfTables = (SELECT COUNT(*) FROM sys.tables AS t
                           WHERE t.is_ms_shipped = 0
                           AND t.type = N'U');

    DECLARE TableList CURSOR FAST_FORWARD READ_ONLY
    FOR
    SELECT SCHEMA_NAME(schema_id) AS SchemaName, name AS TableName, object_id AS TableObjectID
    FROM sys.tables AS t
    WHERE t.is_ms_shipped = 0
    AND t.type = N'U'
    ORDER BY SchemaName, TableName;

    OPEN TableList;
    FETCH NEXT FROM TableList INTO @SchemaName, @TableName, @TableObjectID;

    WHILE @@FETCH_STATUS = 0
    BEGIN
        SET @TableCounter += 1;
        SET @FullTableName = QUOTENAME(@SchemaName) + N'.' + QUOTENAME(@TableName);
        SET @Message = N'Checking table '
                     + CAST(@TableCounter AS nvarchar(20))
                     + N' of '
                     + CAST(@NumberOfTables AS nvarchar(20))
                     + N': '
                     + @FullTableName;
        PRINT @Message;

        SET @WereStringColumnsFound = 0;
        SET @Predicate = N'';

        DECLARE ColumnList CURSOR FAST_FORWARD READ_ONLY
        FOR
        SELECT c.name AS ColumnName, t.name AS BaseDataTypeName
        FROM sys.columns AS c
        INNER JOIN sys.types AS t
        ON t.system_type_id = c.system_type_id
        AND t.user_type_id = c.system_type_id -- note: want the base type not the actual type
        WHERE c.object_id = @TableObjectID
        AND t.name IN (N'text', N'ntext', N'varchar', N'nvarchar', N'char', N'nchar', N'xml')
        AND (c.max_length >= LEN(@StringToSearchFor) OR c.max_length < 0) -- allow for max types
        ORDER BY ColumnName;

        OPEN ColumnList;
        FETCH NEXT FROM ColumnList INTO @ColumnName, @BaseDataTypeName;

        WHILE @@FETCH_STATUS = 0
        BEGIN
            SET @WereStringColumnsFound = 1;
            IF @Predicate <> N''
            BEGIN
                SET @Predicate += N' OR ';
            END;
            SET @Predicate += CASE WHEN @BaseDataTypeName = N'xml'
                                   THEN N'CAST(' + QUOTENAME(@ColumnName) + N' AS nvarchar(max))'
                                   ELSE QUOTENAME(@ColumnName)
                              END
                            + N' LIKE N''%' + @StringToSearchFor + N'%''';
            FETCH NEXT FROM ColumnList INTO @ColumnName, @BaseDataTypeName;
        END;

        CLOSE ColumnList;
        DEALLOCATE ColumnList;

        IF @WereStringColumnsFound <> 0
        BEGIN
            SET @SQL = N'SET NOCOUNT ON;
                         INSERT #FoundLocations (FullTableName, NumberOfRows)
                         SELECT N''' + @FullTableName + N''', COUNT_BIG(*) FROM '
                     + @FullTableName
                     + N' WHERE '
                     + @Predicate
                     + N';';
            EXECUTE (@SQL);

            IF (SELECT NumberOfRows FROM #FoundLocations WHERE FullTableName = @FullTableName) > 0
            BEGIN
                SET @SummarySQL += N'SELECT * FROM ' + @FullTableName + N' WHERE ' + @Predicate + N';' + NCHAR(13) + NCHAR(10);
            END;
        END;

        FETCH NEXT FROM TableList INTO @SchemaName, @TableName, @TableObjectID;
    END;

    CLOSE TableList;
    DEALLOCATE TableList;

    SELECT *
    FROM #FoundLocations
    WHERE NumberOfRows > 0
    ORDER BY FullTableName;

    DROP TABLE #FoundLocations;

    IF @SummarySQL <> N'' AND @IncludeActualRows <> 0
    BEGIN
        EXECUTE (@SummarySQL);
    END;



  • Auto-Injection of Enterprise Edition Features

    There’s an interesting scenario that’s come up in creating some new samples for SQL Server 2016.

    I know that for many, many years people have asked for a way to limit developer edition to only use standard edition features, or at least to have a way to flag them. I’ve often thought that I’d like a way to set a “target environment” and have the colorization change anything that I can’t use in standard to an obviously different color.

    However, previously, if you used developer edition to create a database, as long as you didn’t use any enterprise features, you could then backup and restore that database to a standard edition server.

    That’s no longer the case with SQL Server 2016.

    If you create a temporal table in a database on Developer Edition, PAGE compression is automatically chosen for the history table, and there is no DDL to avoid this. The database can then no longer be restored on a Standard Edition server. To work around it, you need to rebuild the history table without compression after it’s created.
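    To illustrate with a minimal sketch (the table names here are hypothetical), this is the shape of a temporal table and the rebuild needed to strip the auto-injected compression:

    ```sql
    CREATE TABLE dbo.Accounts
    (
        AccountID int NOT NULL PRIMARY KEY,
        Balance decimal(18,2) NOT NULL,
        ValidFrom datetime2 GENERATED ALWAYS AS ROW START NOT NULL,
        ValidTo datetime2 GENERATED ALWAYS AS ROW END NOT NULL,
        PERIOD FOR SYSTEM_TIME (ValidFrom, ValidTo)
    )
    WITH (SYSTEM_VERSIONING = ON (HISTORY_TABLE = dbo.AccountsHistory));

    -- dbo.AccountsHistory is created with PAGE compression automatically.
    -- Rebuild it without compression so the database can be restored
    -- on a Standard Edition server:
    ALTER TABLE dbo.AccountsHistory REBUILD WITH (DATA_COMPRESSION = NONE);
    ```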

    I see this as quite a change in behavior. I don’t think that features that are only available in Enterprise (or perhaps Azure) editions should be “auto-injected”.

    Clearly, if you script out the database and create a new database on standard edition using the script, it will create the history table without compression. But I suspect that many people create DBs as starter DBs using developer edition, and then deploy them to other editions that don’t support compression.

    I’d be interested in thoughts on how common this practice currently is.

  • Should my database have foreign key constraints?

    There is an endless discussion in the development community about whether databases should include declared foreign key constraints or not.

    As a consultant, I regularly visit a wide variety of client sites. Many of these have applications designed without constraints. When I ask why there are no constraints, the response is invariably one of the following:

    • The app ensures that it’s right
    • They don’t work well with our application development
    • They are too slow
    • What’s a foreign key constraint?

    Most of the sites that I work with have sizeable databases. This discussion is important in those situations.

    The app ensures that it’s right

    In most enterprises, large databases end up being used by more than a single application, and each application is often made up of many sub-applications or modules. That means that every place that accesses the data needs to apply the same rules. (The same issue applies to other forms of constraints as well.)

    Worse, the databases often end up being accessed by applications from different authors, and often with different technology stacks.

    ETL processes are often used to move data into the databases or to update that data.

    Even worse, in real production scenarios, data-fixes often get applied directly to the database.

    A view of the world that says it will all be ok because everything goes through a single application is a very narrow one.

    They don’t work well with our application development

    What this really boils down to in most cases is that the developers didn’t want to work out the order for updates and it’s just easier for them without constraints as they can update any table they want, in any order.

    That means that the data is periodically in invalid interim states and only (hopefully) eventually consistent. What happens with concurrent access at that point? What happens to a reporting application that finds invoices for customers that don’t exist?

    It would be helpful if SQL Server supported deferred constraint checking but today it doesn’t. I’ve been formally asking them for it for over 10 years:

    I still think it’s one of the most important enhancements that could be made to SQL Server from a development point of view.

    They are too slow

    So often I’m told “we can’t do it because it would be too slow”. Yet almost every time I ask if they’ve actually tested it, I’m told “no”.

    The reality is usually that someone’s brother’s cousin’s friend read it somewhere on the Internet so they decided it would be a problem.

    Whenever I really test it, I find very little impact as long as appropriate indexing is in place, and sensible options are chosen when bulk importing data.

    I do occasionally find specific constraints that I decide to disable, but they are few and far between. And even then, I like to see the constraint still in place (so it can be discovered by tools, etc.) but just not checked. It certainly never applies to all constraints in a database.
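    Disabling rather than dropping is straightforward (the constraint and table names here are hypothetical):

    ```sql
    -- Disable checking, but keep the constraint discoverable in metadata:
    ALTER TABLE dbo.Invoices NOCHECK CONSTRAINT FK_Invoices_Customers;

    -- Later, re-enable it; WITH CHECK revalidates existing rows so the
    -- constraint becomes trusted again:
    ALTER TABLE dbo.Invoices WITH CHECK CHECK CONSTRAINT FK_Invoices_Customers;
    ```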

    What’s a foreign key constraint?

    Sadly this is also a common question, fortunately mostly only on smaller databases and applications. Some developers really just don’t understand the issue but this is not most developers.

    Bottom Line

    When I do a detailed check on a system that has run without constraints for quite a while, I almost always find data integrity issues.

    When I show the integrity issues to the customer, they look surprised and then say something like “ah, that’s right, we had that bug a few weeks back”. Pretty much every time, there are issues somewhere.

    That’s ok if you’re building a toy application; not ok if you’re building a large financial application.

    At the very least, sites that decide not to have constraints should have a process in place that periodically checks data integrity. Most sites that don’t have constraints don’t do this either.
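    One sketch of such a periodic check is DBCC CHECKCONSTRAINTS, which reports rows that violate foreign key and check constraints:

    ```sql
    -- Check every enabled AND disabled constraint in the current database;
    -- any rows returned identify violations to investigate.
    DBCC CHECKCONSTRAINTS WITH ALL_CONSTRAINTS;
    ```

    This only helps, of course, where the constraints are at least declared (even if disabled); a database with no declared constraints has nothing for the check to verify.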

    It’s important to find issues as soon as they occur. Finding issues well after they have been created always makes them much harder to fix. At least with constraints, you get instant feedback that something’s wrong. If you have a bug in your ETL process that is messing up your data in some subtle way, you don’t want to find that out several weeks later after other processing has occurred on that data.


    I like to see constraints in place until it’s proven that a particular constraint can’t be in place for a performance reason. Then, I like to see it (and only it) disabled rather than removed. A bonus is that it can usually still be discovered by tools.

  • Data Tales #8: Database on a Diet (Part 3)

    Hi Folks,

    My series of articles for SQL Server Magazine continues, with a short series about a large database that needed to go on a diet. Last time, I looked at the internals of ROW and PAGE compression to see what happens when they are used, and we saw the significant differences in how they are implemented. So how do you decide which to use?

    This time, we look at when ROW and PAGE compression make sense, and provide detailed guidance on how to decide which should be used for which tables and indexes, or even for which partitions of which tables and indexes. A blended approach is usually the appropriate outcome.
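    One practical aid when making that decision is sp_estimate_data_compression_savings, which samples an object and estimates its size under a given compression setting (the table name here is hypothetical):

    ```sql
    EXEC sys.sp_estimate_data_compression_savings
        @schema_name = N'dbo',
        @object_name = N'Invoices',
        @index_id = NULL,             -- all indexes
        @partition_number = NULL,     -- all partitions
        @data_compression = N'PAGE';  -- run again with N'ROW' to compare
    ```

    Comparing the estimated sizes for ROW and PAGE, table by table (or partition by partition), is a good starting point for the blended approach described in the article.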


  • Avoiding drives going offline when adding nodes to availability groups or creating / validating Windows clusters

    Several of my customers have mentioned lately that when they’ve added a node to a Windows failover cluster to support an Availability Group, they’ve ended up with an outage because the non-shared drives were taken offline by the installation process. All commented that it didn’t happen on Windows Server 2008 R2, and they were taken by surprise when it happened on Windows Server 2012 R2.

    I mentioned it on the MVP list, and one of my US buddies, Allan Hirt, said that we need to uncheck the relevant checkbox during the install.

    I stepped through another one today and this is the culprit:


    The default is now to add all eligible storage to the cluster.

    That is perhaps a reasonable default when setting up most Windows failover clusters, but it is really not a sensible default when working with Availability Groups. It ends up adding all your other drives and then puts them offline.

    Unchecking that box while performing installs/validations should avoid the issue.

    Thanks to Allan for pointing it out.

  • SQL Down Under Podcast 68–Guest Joe Yong–SQL Server 2016 StretchDB

    Hi Folks,

    One of the intriguing options that’s coming as part of SQL Server 2016 is StretchDB. I’ve been spending some time working with it and learning about it, and I’ve ended up with lots of questions.

    It was great to get an opportunity to speak to Joe Yong to get all my questions sorted.

    Hope you’ll enjoy it too. You’ll find it here:

  • SQL Down Under podcast Show 67–Kasper DeJonge and SQL Server 2016 Analysis Services

    Last week I had an opportunity to record a new podcast with Kasper DeJonge from the SQL Server Analysis Services team. Kasper has also been working on Power Pivot and many other areas.

    SQL Server 2016 is lining up to be a great release for Analysis Services with significant enhancements for both tabular and multi-dimensional data models.

    I hope you enjoy the show. You’ll find it here:

  • Very pleased to see free developer edition of SQL Server

    I spend a lot of my time working with ISVs (or software houses) and with individual developers. For SQL Server to have a long-term future, we need to be appealing to more and more new developers, to get them to understand that SQL Server is a great platform for building their applications on.

    One of the most important aspects of this right up front, is making it really frictionless to get started with SQL Server.

    In the past, it has at times been very difficult to purchase SQL Server Developer Edition. To me, that’s been ridiculous if we want to attract new developers to the platform. I have a friend who, a year or so back, wanted to get Developer Edition and spent weeks (literally) trying to find out how to buy it. In the USA, it was available in the Microsoft store. You could pay $40 and download it. If you had an MSDN subscription, it was also just available for download. But for some reason that I’ll never fathom, it wasn’t available in the Australian Microsoft store and you had to purchase it from a license vendor instead.

    Because Developer Edition was a low-cost item, the license vendors weren’t interested in it. Even when I spoke to the local Microsoft sales subsidiary, after two weeks I couldn’t find out how to buy it either. The local sales team seems to be basically goaled on Enterprise Edition sales, so a Developer Edition license didn’t even feature on their radar.

    Sadly, by comparison, my friend downloaded PostgreSQL and was using it 15 minutes later. This situation could not continue.

    The Visual Studio team had been rightly proud of their Visual Studio Community Edition and how well it was going. It was just a free download.

    I was a very vocal member of a crowd asking “Where is the SQL Server Community Edition?”.

    Well this week, our pleas have been answered. Any developer can just join the Visual Studio Dev Essentials program and then download SQL Server Developer Edition for free.

    I’ve got much more to post about what else we need to do to help get the new developers but credit must be given where it’s due. Thanks for listening Microsoft !

  • Learning Mandarin vs Cantonese

    I had a former colleague ask me the other day if he should learn Cantonese or Mandarin. He was going to spend a few months in Hong Kong and southern China.

    Here are my opinions. I’m sure there are many who might differ but this is how I see it today.

    Hong Kong is a tricky one. A large number of people there can speak Mandarin but, given the difficult relationship between “mainland” China and Hong Kong, even locals who can speak it don’t treat you the same as if you speak Cantonese. Many just aren’t happy about aspects of how China now runs Hong Kong. But the way I see it, resist as they might, they will be “integrated”.

    In the meantime, and probably for another generation, Cantonese is certainly where it’s at. Already though, China makes all children in Hong Kong learn Mandarin.

    People don’t seem to understand how determined the Chinese government is to achieve standardisation. In the end, it’s the only way you can run such a large country without anarchy. They even have a single time zone for the whole of China. I can’t see any chance of them supporting 900 or so dialects going forward. It would probably scare the Cantonese speakers today to think about it, but I think that eventually Cantonese will be like Aboriginal languages are in Australia: more of a curiosity. Clearly, that will take some time.

    Personally, I wouldn’t put any effort into Cantonese, but you might decide differently if you are intending to be in a majority Cantonese-speaking region for quite a while. You certainly get treated better by locals if you use at least some of their dialect. The same thing happens in Shanghai. Add some Shanghai-ese into your Mandarin, and they are so happy.

    Another thing I’ve liked about learning Chinese is that the writing system is essentially the same for all dialects. So even when I see signs written in Hong Kong, I can still read most of it. I have no idea what the words are in Cantonese (although I can easily guess some), but I know what the sign means, and that’s what counts the most.

    The challenge with Chinese reading and writing is Simplified vs Traditional systems. In the 1950s and 1960s, the Chinese government decided to simplify the writing system, as so many people were illiterate. So they took a bunch of characters and made simplified versions of them. The simplified ones are now widely used. However, once again, Hong Kong is a hold-out. Most people in Hong Kong or Taiwan still use Traditional characters. So there are some characters that I have to stop to try to work out.

    The younger people in Hong Kong are learning Simplified characters. Other countries like Singapore have standardised on Simplified. Even in the middle of China, however, there are movements to try to reinstate Traditional characters, as they are seen as more meaningful. I can’t see that prevailing; Simplified characters will be the future.

    Most older Chinese writing that I see in Australia is Traditional as many of the Chinese that originally came here did so from Hong Kong, and many came before Simplified writing was introduced. Most new Chinese that you see in Australia is Simplified. That’s what all the Chinese tourists and students tend to use. When I ride the trains in Melbourne, I used to predominantly hear Cantonese. Now I predominantly hear Mandarin, largely due to an influx of people from other areas of China, and particularly due to younger students.

    I’ve been using a variety of methods for learning Mandarin over the years. My main effort at present is to attend three one-hour one-on-one lessons from Hanbridge Mandarin each week.

    I’ve got other posts that I’ll update soon, on the other methods that I’m using, but the online lessons (my current teachers are from ShenZhen) have increased my abilities the fastest of any method that I have tried.

    And I just speak to every Chinese person that I get an opportunity to.
