SQLblog.com - The SQL Server blog spot on the web

Louis Davidson

  • Speaking on 7/25 for the Nashville SQL Server User Group: How In-Memory Database Objects Affect Database Design

    So I have blogged about it, and I have prepared for it, and next Friday at lunch time I will be unveiling my new presentation. The location/other details can be found here: http://nashville.sqlpass.org/Home.aspx, but the abstract is:

    How In-Memory Database Objects Affect Database Design
       

    With SQL Server 2014, Microsoft has added a major new feature to help optimize OLTP database implementations by persisting your data primarily in RAM. Of course it isn't that simple; internally, everything that uses this new feature is completely new. While the internals of this feature may be foreign to you, accessing the data that uses the structures very much resembles T-SQL as you already know it. As such, the first important question for the average developer will be how to adapt an existing application to make use of the technology to achieve enhanced performance. In this session, I will introduce the concepts of In-Memory Database objects, discussing how the design is affected by the new technology. As a basis for the presentation, I will start with a normalized database and adapt the logical and physical database model/implementation in several manners, performance testing the tables and code changes along the way.

    It is not exactly what I have envisioned for the presentation for the slightly distant future, but I am pretty pleased with where it is right now. I decided that since this was such a new feature, it is very likely that people would not be well enough acquainted with the subject for me to ignore the introductory aspects. So while I originally planned to dive right in, I have added a lot of introductory material to explain the features first, enough to make sure that the design aspects I will cover make sense no matter your level of experience with the in-memory features. I plan to use the same format, with some flexibility, if I do this for a SQL Saturday later this year, and certainly so when I do the presentation at Devlink. Luckily, at Devlink I have another 15 minutes to work with, so 15 more minutes of code comparison will hopefully fit the needs of the more programming-oriented attendees.

    Of course, I am not done blogging about a few additional aspects I have come up with, but with a week and a half to go before I present it, more writing on the presentation has been the goal.

     

  • How In-Memory Database Objects Affect Database Design: Uniqueness

    This post is part of an ongoing series of blog entries I am writing while preparing to give a presentation based on the prefix of this post's title. It is a work in progress as I explore the effects of the new in-memory model. I would love comments that tell me that I am wrong (especially if it turns out that I am!)

    The first issue that arose in my tests was with the new concurrency model. I started out with the stored procedure I will show later in this entry. It checks whether a customer address exists and, if not, creates it. The table is the CustomerAddress table, with the following structure:

    [Image: CustomerAddress table structure]

    With on-disk structures, I have a uniqueness constraint on the address columns, along with the CustomerId. The CustomerId is the first column in the constraint declaration, so the performance was great. I don’t use a transaction or error handling (yet) because I want to simulate what a normal application might do. I put the code in stored procedure form because, well, it will just be easier for me to manage. (Plus we can change that code to be proper a lot more easily, and when I build the natively compiled procedures, things will be easier to simulate.)
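
    For reference, the on-disk version of the table looks roughly like this. It is a sketch reconstructed from the procedure parameters below, so the datatypes, nullability, and constraint names are my assumptions rather than the exact DDL from the presentation download:

    CREATE TABLE Customers.CustomerAddress
    (
        CustomerAddressId INT IDENTITY(1,1) NOT NULL
            CONSTRAINT PKCustomerAddress PRIMARY KEY,
        CustomerId        INT NOT NULL, --references Customers.Customer in the full model
        Line1             NVARCHAR(30) NULL,
        Line2             NVARCHAR(30) NULL,
        City              NVARCHAR(30) NULL,
        ZipCode           VARCHAR(20) NULL,
        USStateCode       CHAR(2) NOT NULL,
        AddressTypeCode   NVARCHAR(20) NOT NULL,
        --CustomerId leads the alternate key so the lookup in the procedure can seek on it
        CONSTRAINT AKCustomerAddress UNIQUE
            (CustomerId, Line1, Line2, City, ZipCode, USStateCode, AddressTypeCode)
    );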

    Basically, the gist is: look up the address using all of the address columns that are in the uniqueness constraint. If you get a CustomerAddressId back, return it; if not, create the row and then return the key, using the alternate key values:

    CREATE PROCEDURE Customers.CustomerAddress$CreateAndReturn
    (
        @customerId INT,
        @Line1  NVARCHAR(30),
        @Line2  NVARCHAR(30),
        @City   NVARCHAR(30),
        @ZipCode VARCHAR(20),
        @USStateCode CHAR(2),
        @AddressTypeCode NVARCHAR(20)
    )
    AS
    BEGIN
        SET NOCOUNT ON
        DECLARE @customerAddressId INT = (
                                         SELECT CustomerAddressId
                                          FROM  Customers.CustomerAddress
                                          WHERE  CustomerId = @CustomerId
                                            AND  (Line1 = @Line1 OR (Line1 IS NULL AND @Line1 IS NULL))
                                            AND  (Line2 = @Line2 OR (Line2 IS NULL AND @Line2 IS NULL))
                                            AND  (City = @City OR (City IS NULL AND @City IS NULL))
                                            AND  (ZipCode = @ZipCode OR (ZipCode IS NULL AND @ZipCode IS NULL))
                                            AND  USStateCode = @USStateCode
                                            AND  AddressTypeCode = @AddressTypeCode ) --we are assuming validation is done elsewhere

        IF @customerAddressId IS NULL
            BEGIN
                INSERT  INTO [Customers].[CustomerAddress] ( [CustomerId], [Line1], [Line2], [City], [ZipCode], [USStateCode], [AddressTypeCode] )
                VALUES  ( @CustomerId, @Line1, @Line2, @City, @ZipCode, @USStateCode, @AddressTypeCode )

            SET @customerAddressId = (
                                              SELECT CustomerAddressId
                                              FROM  Customers.CustomerAddress
                                              WHERE  CustomerId = @CustomerId
                                                AND  (Line1 = @Line1 OR (Line1 IS NULL AND @Line1 IS NULL))
                                                AND  (Line2 = @Line2 OR (Line2 IS NULL AND @Line2 IS NULL))
                                                AND  (City = @City OR (City IS NULL AND @City IS NULL))
                                                AND  (ZipCode = @ZipCode OR (ZipCode IS NULL AND @ZipCode IS NULL))
                                                AND  USStateCode = @USStateCode
                                                AND  AddressTypeCode = @AddressTypeCode )

            END

          RETURN COALESCE(@customerAddressId, -100)
    END
    GO

    My test data resides in a side database, with a lot of duplicated customers and addresses. The goal is to do a good number of reads, but not make the code so complex that we start to deal with slight variances in addresses. These are the kinds of things one might do in an ETL process, or certainly as a background process in their OLTP system.  For this example, I want to avoid duplication, but only if it is 100% duplicated.

    The code is called in batches of four interconnected sets of procedure calls. Each returns a surrogate key via a RETURN statement (it isn’t 100% finished, as I haven’t really handled stuff like –100 being returned, even though it really couldn’t occur with my current data):

    DECLARE @CustomerId INT
    EXEC @CustomerId = Customers.Customer$CreateAndReturn @firstName = 'Tyler', @MiddleName = 'R', @LastName = 'David', @EmailAddress = 'Tyler@email.com'

    DECLARE @CustomerAddressId INT
    EXEC @CustomerAddressId = Customers.CustomerAddress$CreateAndReturn @customerId = @CustomerId,@Line1='9 Gordon Highway',@line2='Apt 296',@city='Buffalo',@ZipCode='81254',@USStateCode='MO',@AddressTypeCode='Office'

    DECLARE @SalesOrderId int
    EXEC @SalesOrderId = Sales.SalesOrder$Create @CustomerId=@CustomerId,@CustomerAddressId=@CustomerAddressId,@SalesOrderStatusCode='New'

    EXEC Sales.SalesOrderItem$Create @SalesOrderId=@SalesOrderId,@SalesOrderItemNumber=48904,@Quantity=3.6386,@UnitPrice=14.57,@ProductPriceId=3779
    EXEC Sales.SalesOrderItem$Create @SalesOrderId=@SalesOrderId,@SalesOrderItemNumber=98015,@Quantity=3.0596,@UnitPrice=5.31,@ProductPriceId=1043

    I have 4 files of these stored procedure batches, each file having 10,000 batches in it. I start them all at the same time using the WaitFor Sync Framework I blogged about a few years back. When I was using the on-disk tables, not once did this code fail to work, and not once did I get an error, even though I certainly could have if a duplicated row had been inserted:

    DECLARE @customerAddressId INT = (
                                         SELECT CustomerAddressId
                                          FROM  Customers.CustomerAddress
                                          WHERE  CustomerId = @CustomerId
                                            AND  (Line1 = @Line1 OR (Line1 IS NULL AND @Line1 IS NULL))
                                            AND  (Line2 = @Line2 OR (Line2 IS NULL AND @Line2 IS NULL))
                                            AND  (City = @City OR (City IS NULL AND @City IS NULL))
                                            AND  (ZipCode = @ZipCode OR (ZipCode IS NULL AND @ZipCode IS NULL))
                                            AND  USStateCode = @USStateCode
                                            AND  AddressTypeCode = @AddressTypeCode ) --we are assuming validation is done elsewhere

        IF @customerAddressId IS NULL
            BEGIN
                INSERT  INTO [Customers].[CustomerAddress] ( [CustomerId], [Line1], [Line2], [City], [ZipCode], [USStateCode], [AddressTypeCode] )
                VALUES  ( @CustomerId, @Line1, @Line2, @City, @ZipCode, @USStateCode, @AddressTypeCode )

    Time is the enemy. Time passes between the fetch of the CustomerAddressId and the insert. If another connection inserts the row in those moments between these statements, you could get a duplicate key error, which I accepted as a possibility. If I finished this code for on-disk, I would handle that error by refetching the key. No problem.
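
    If I were finishing the on-disk version, the retry would be something along these lines (just a sketch of the approach; error 2627 is a unique constraint violation and 2601 a duplicate key in a unique index):

    BEGIN TRY
        INSERT INTO Customers.CustomerAddress (CustomerId, Line1, Line2, City, ZipCode, USStateCode, AddressTypeCode)
        VALUES (@CustomerId, @Line1, @Line2, @City, @ZipCode, @USStateCode, @AddressTypeCode);
    END TRY
    BEGIN CATCH
        --another connection won the race and inserted the same address;
        --swallow only the duplicate key errors and fall through to refetch the key
        IF ERROR_NUMBER() NOT IN (2627, 2601)
            THROW;
    END CATCH;

    --then refetch @customerAddressId using the alternate key values, exactly as in the procedure above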

    When I changed my table to an in-mem table (the actual structure of which I will talk more about in a blog or two), I immediately got duplicated data. Why? Three reasons.

    1. The in-memory code ran 300% faster with little optimization.

    2. No UNIQUE key constraint. I put a hash index on the CustomerId so the WHERE clause would use a simple seek operation, but with only a PRIMARY KEY constraint, there was nothing to stop the duplicates.

    3. The lock-free, optimistic concurrency control. (I won’t go into too much detail, but read this if you haven’t yet: http://sqlblog.com/blogs/kalen_delaney/archive/2013/10/20/sql-server-2014-in-memory-oltp-hekaton-whitepaper-for-ctp2.aspx)

    Each of these factors figured into the issue. Since the code is faster, it is more likely that we will have collisions. This was certainly a factor; in my first tests, I forgot to index the CustomerId, the code ran twice as slowly as the on-disk version, and there were no collisions.

    Without a uniqueness constraint, the rows will get created with no issue, even if you accidentally create duplicates. The lack of constraints is one of my least favorite parts of the whole in-memory structures.
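
    To be concrete, the in-memory version of the table I am describing looks something like this (a sketch of the SQL Server 2014 syntax; the BUCKET_COUNT values and the index name are placeholders I chose for illustration, not tuned values):

    CREATE TABLE Customers.CustomerAddress
    (
        CustomerAddressId INT IDENTITY(1,1) NOT NULL
            PRIMARY KEY NONCLUSTERED HASH WITH (BUCKET_COUNT = 100000),
        CustomerId        INT NOT NULL,
        Line1             NVARCHAR(30) NULL,
        Line2             NVARCHAR(30) NULL,
        City              NVARCHAR(30) NULL,
        ZipCode           VARCHAR(20) NULL,
        USStateCode       CHAR(2) NOT NULL,
        AddressTypeCode   NVARCHAR(20) NOT NULL,

        --the hash index that lets the WHERE clause seek on CustomerId; note that there is
        --no way to declare a UNIQUE constraint on the address columns in 2014
        INDEX IX_CustomerAddress_CustomerId
            NONCLUSTERED HASH (CustomerId) WITH (BUCKET_COUNT = 100000)
    )
    WITH (MEMORY_OPTIMIZED = ON, DURABILITY = SCHEMA_AND_DATA);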

    Using Multi-Version Concurrency Control (MVCC), even while the row is being inserted (or really, as long as the transaction has not been committed), other connections can execute the search for the address and get back no results. In the default isolation level, access uses basic SNAPSHOT isolation: the second connection sees the table as it was when its transaction started. Even if I escalated to REPEATABLE READ or SERIALIZABLE, it wouldn’t eliminate duplication, since the new row’s uniqueness is defined on a surrogate key and there would technically be no collision. Is using a surrogate key the best solution? It isn’t feeling like it for these tables, but you know what, this is the norm for most people. I could easily just put the primary key on the natural key here, and let the identity be the possible (and far less likely) duplication point. But that seems like cheating, and what if a table has 2 natural key choices, or 3, or more? You would be stuck with this issue again (and it will not be easy to solve).

    Fortunately, this code here is going to fail IF multiple rows are committed by the time it executes:

    SET @customerAddressId = (SELECT CustomerAddressId
                                      FROM  Customers.CustomerAddress
                                      WHERE  CustomerId = @CustomerId
                                        AND  (Line1 = @Line1 OR (Line1 IS NULL AND @Line1 IS NULL))
                                        AND  (Line2 = @Line2 OR (Line2 IS NULL AND @Line2 IS NULL))
                                        AND  (City = @City OR (City IS NULL AND @City IS NULL))
                                        AND  (ZipCode = @ZipCode OR (ZipCode IS NULL AND @ZipCode IS NULL))
                                        AND  USStateCode = @USStateCode
                                        AND  AddressTypeCode = @AddressTypeCode )

    This is because a scalar subquery raises an error as soon as it returns more than one value. But with MVCC, this too could be an issue if we want to put it in a transaction and roll back on an error. Sadly, however, it will not be easy to fix, because it would only fail if the transaction with the insert has already been committed.

    As I start to think about how to solve this problem, the solution is going to have to be very different than with disk-based tables. There, if I threw a transaction around the search and the insert (and possibly a few locking hints to block multiple readers), I could stop the duplication easily. However, in all of the MVCC isolation levels, readers are never blocked. I could use an application lock to single-thread the inserts, but that would definitely not be the most efficient operation, and it would not work at all in natively compiled code.
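
    For the record, the application lock approach would look roughly like the following (a sketch only; the resource name is one I made up, it serializes every caller on a single lock, and sp_getapplock is not available from natively compiled code anyway):

    BEGIN TRANSACTION;

    DECLARE @lockResult INT;
    --single-thread all address creation behind one exclusive application lock
    EXEC @lockResult = sp_getapplock @Resource = 'Customers.CustomerAddress$Create',
                                     @LockMode = 'Exclusive',
                                     @LockOwner = 'Transaction',
                                     @LockTimeout = 5000;
    IF @lockResult < 0
    BEGIN
        ROLLBACK TRANSACTION;
        RETURN -100; --could not acquire the lock in time
    END;

    --...search for the address and insert it if it does not exist, as in the procedure above...

    COMMIT TRANSACTION; --committing releases the transaction-owned application lock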

    In reality, there is another more common solution (as I expect a lot of people wouldn’t even have the alternate key on the address to start with). Just give in and accept the duplicates as part of high performance business. We can minimize the damage by changing the two subqueries to:

    SET @customerAddressId = (SELECT MIN(CustomerAddressId)

    So we always try to use the first one. With a low-cardinality table like a customer’s address, you will only get back a few rows at most, so the aggregate isn’t going to be terribly costly. And you build cleanup routines to reallocate duplicated addresses. This is something that you will need to do with the data in any case, since it is altogether possible that the user types '9 Gordon Hwy' instead of '9 Gordon Highway', and you get duplicates you need to clean up anyhow. My current time to enter the 40,000 rows across 4 connections using the command-line SQLCMD interface (the files crashed SSMS!) is around 12 seconds.
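
    The cleanup routine I have in mind is nothing fancier than this sketch (run as interop T-SQL; a real version would first repoint any SalesOrder rows that reference the duplicates before deleting them):

    --keep the lowest surrogate key for each fully duplicated address and remove the rest
    WITH Dups AS
    (
        SELECT CustomerAddressId,
               ROW_NUMBER() OVER (PARTITION BY CustomerId, Line1, Line2, City,
                                               ZipCode, USStateCode, AddressTypeCode
                                  ORDER BY CustomerAddressId) AS RowNum
        FROM   Customers.CustomerAddress
    )
    DELETE FROM Dups
    WHERE  RowNum > 1;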

    When I get back to refining the code tomorrow, this is going to be my new tack for this (and perhaps the customer create) procedure. Minimize duplication, but go ahead and allow it. That problem is for the ETL Architect to deal with. (Now if much of my daily job wasn’t writing ETL, that would actually be comforting.)

  • How In-Memory Database Objects Affect Database Design: Or does it?

    This post is part of an ongoing series of blog entries I am writing while preparing to give a presentation based on the prefix of this post's title. It is a work in progress as I explore the effects of the new in-memory model. I would love comments that tell me that I am wrong (especially if it turns out that I am!)

    Before I start describing what I have discovered so far along the way, let's make a few educated guesses. I have my data model from the previous blog entry (here), and I have a plan of attack for entering data. Basically, I plan to take the "normal" way that people deal with data like this, and enter data into each table separately, outside of an overarching transaction.

    So I will create (or retrieve) the Customer first, then the CustomerAddress, then I will create the SalesOrder and however many SalesOrderItems the customer (or my data generation using RedGate's Data Generator!) wants. I may dress it up over time, but for my initial experiments, this is the plan.

    For a database with UNIQUE constraints on alternate keys, and FOREIGN KEY constraints on relationships, this is all pretty easy and safe. I know I can't duplicate a customer, or violate foreign key constraints. How this will play out in the code is still just a bunch of guesses, with my only goal basically being to not have the ETL architect trying to decide whether running me over with a car is punishment enough for my introducing more bad data to deal with.

    But the title of the session ends in "...Database Design". The code isn't database design. Rather, as an architect, it is important to realize that the database design affects the code. So will we need different data structures, or will the normal normalized model suffice? My initial feeling is that the logical model doesn't change. The actual implementation details (such as the lack of foreign key, unique, and check constraints) will differ, the hardware/implementation layer changes immensely, and this affects your code in ways you had best understand.

    I "think" that what I determine will be that the basic data architects output remains rather constant. The rules of normalization (briefly described here), don't change at all. You still design the tables and columns based on the same factors of cardinality you would before. The physical model will be the same, and if your data modeling tool supports the in-memory structures, it is just a matter of choosing which tables should be in-mem and which don't really merit it, followed by a lot of testing (a lot).

    For my first tests, I only made the four "hot" tables in-mem:

     

    This design will work fine when dealing with interop code, even if I want to implement data integrity checks against the domain tables. If I want to use native code, then all of the tables used will need to be in-memory. Are data integrity checks that important? Well, they are to me, but not so much to a lot of folks who trust their other layers to get it right. My scenario, where all of the domain tables are "static", enables this to work just fine. As long as ProductType never changes, and the other code layers have only the right values, you can easily say "this works" (as long as it does... and your code has been tested for what happens if anything crashes on any given statement to the database... which is not an easy task).
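
    To give a feel for why everything has to come along for the ride, here is the shell of a natively compiled procedure in 2014 (a minimal sketch with a name I made up; every table it references, domain tables included, must be memory-optimized):

    CREATE PROCEDURE Customers.CustomerAddress$GetForCustomer
        @CustomerId INT
    WITH NATIVE_COMPILATION, SCHEMABINDING, EXECUTE AS OWNER
    AS
    BEGIN ATOMIC WITH (TRANSACTION ISOLATION LEVEL = SNAPSHOT, LANGUAGE = N'us_english')
        --native code can only touch memory-optimized tables, so any domain table used
        --for integrity checks in here would have to be in-memory as well
        SELECT CustomerAddressId, Line1, Line2, City, ZipCode, USStateCode, AddressTypeCode
        FROM   Customers.CustomerAddress
        WHERE  CustomerId = @CustomerId;
    END;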

    In my next post, I will share at least one of the effects the change to in-memory code has had on the code I am using, and what I am considering doing about it.

  • How In-Memory Database Objects Affect Database Design: The Logical and Physical Models

    This post is part of an ongoing series of blog entries I am writing while preparing to give a presentation based on the prefix of this post's title. It is a work in progress as I explore the effects of the new in-memory model. I would love comments that tell me that I am wrong (especially if it turns out that I am!)

    Now that I have the conceptual model I want to work with (and a bunch of "life" tasks out of the way), I head into fleshing out the model. I go ahead and add attributes, along with domains that I will use to build my actual database. The model continues on with a minimalist approach, just the basics that are needed, nothing more, nothing less: some types, names, etc.

     [Image: Logical Model]

    As this is a very simple model, I won't drag it out any more than I need to, and I am going to go ahead and make this a physical model:

     

     

    Basically, I set datatypes, added domain tables for some of the simple domains (to make data checking easier when (if?) I build manual foreign keys in the code, since in-memory tables don't support the concept), and added row create and modify times to every editable table (the four tables Customer, CustomerAddress, SalesOrder, and SalesOrderItem that I will be loading as part of my stress testing). Nothing too fancy, but at the same time, not so simplistic that I felt like it would lose any and all meaning. The other tables are not really "transactional", so I will be loading them all at once as a configuration step. My not adding physical columns like row modify times to those other tables isn't a suggestion that they aren't needed at all in a real system (who added that $.10 fare from Boston to LA?), but rather an indication that I wasn't planning on dealing with that here. The four major tables will be loaded at full speed ahead, and as fast as we can take orders from new and existing customers, we will. All other data will simply be configuration data, for me.

    On the other hand, if I were going to do a full-day test, a process that might be interesting to include would be a price hike. Who chooses the price? The database code, or the front end? If the price changes while the client has the old price in hand, should the transaction fail and say "Sorry, our prices went up, try again! Thank you, come again." or just accept the previous price? And then, how much time might acceptably pass before the fail notice comes up? Could that be hacked by a crafty 10-year-old to get pricing from a year ago? Discussions like this are why actual design sessions take simply forever to complete, and are very boring to people who strictly code... It is also why we are avoiding the topic here, for the most part.

    In terms of our implementation, some of the more interesting columns to deal with are going to be CustomerNumber and SalesOrderNumber. Generating unique values for customer consumption is a fairly simple concept, but it will change when we start working with in-memory code, certainly for the 2014 version.
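
    On the disk-based side, a SEQUENCE object is the obvious tool, something like the following sketch (names and formatting are mine; the wrinkle, as I understand the 2014 restrictions, is that NEXT VALUE FOR cannot be used inside natively compiled procedures, so the value has to be generated in interop code and passed in):

    CREATE SEQUENCE Customers.CustomerNumber_Sequence
        AS INT
        START WITH 1000
        INCREMENT BY 1;
    GO

    --generate the customer-facing number in interop T-SQL and hand it to the insert
    DECLARE @CustomerNumber CHAR(10) =
        'CU' + RIGHT('00000000'
               + CAST(NEXT VALUE FOR Customers.CustomerNumber_Sequence AS VARCHAR(8)), 8);

    SELECT @CustomerNumber AS CustomerNumber;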

    The next steps are already underway (and I am still looking for interesting things to blog about along the way). In the next entry in this session-writing process, I will discuss my methods of testing the structures, the issues I have seen along the way, and anything else that comes up.

     

  • Speaking this weekend at SQL Saturday 286 - Louisville

    Wow, where does the time go? Hope I see every reader there in Louisville this weekend for a lovely time. Say you live in Alaska and it would be too far to go in one day? Hey, we all have our problems (and don't tell anyone, but I am going to try to finish up video versions of these sessions by the end of the year.. shh.)

    I will be doing my sessions on Database Design Fundamentals and How to Write a DML Trigger, both sessions I have done quite often of late, and I really like how they work now. I have a bit of a reputation as a slide/code tinkerer and this time is no different, but the biggest thing I have settled on is how I deal with presentations like these, where honestly the subject is more than a 1-hour session can handle. Homework.

    I will cover the fundamentals of whatever topic in slides, and then demo as much code as I have time for, but the rest is homework. I got the idea when I was prepping to speak to the Richmond User Group, as I was flip-flopping between showing some slides and showing code. I realized that the slides would explain the fundamentals better than slogging through the internals of the code directly, and that when I attend sessions with a lot of code, all I am sitting there thinking is "man, I want that code". So I try to comment the code to make it self-explanatory, run the code before each presentation, enhance it as I have time to, show you enough of the code to get you started, and then give it to you to play with (and if you want to pick it apart, email me at drsql@hotmail.com, I LOVE constructive criticism).

    I post all of my slides and code on my website (http://drsql.org) not just because it is part of a presentation, or to make myself feel cool, but mostly so I can reference it as I need it. I use my code snippets all of the time when coding, and I try to keep them up to the version I am using (or writing about as it were.) So hopefully, I see you and get to explain the fundamentals, then the (rather wordy at times) slides are there for reference, and the code is there to get you started practicing on your own. Homework.

    The abstracts for this weekend's presentations: 

    Database Design Fundamentals

    In this session I will give an overview of how to design a database, including the common normal forms and why they should matter to you if you are creating or modifying SQL Server databases. Data should be easy to work with in SQL Server if the database has been organized as close as possible to the standards of normalization that have been proven for many years. Many common T-SQL programming "difficulties" are the result of struggling against the way data should be structured, can be avoided by applying the basic normalization techniques, and are obvious things that you find yourself struggling with time and again (e.g., using the SUBSTRING function in a WHERE clause, meaning you can't use an index efficiently).
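
    (To make that last parenthetical concrete, the difference is between a predicate that hides the column inside a function and one the optimizer can match to an index; table and column names here are just illustrative.)

    --non-sargable: the function wrapped around the column prevents an index seek
    SELECT CustomerId
    FROM   Customers.Customer
    WHERE  SUBSTRING(LastName, 1, 1) = 'D';

    --sargable: a prefix search can seek an index on LastName
    SELECT CustomerId
    FROM   Customers.Customer
    WHERE  LastName LIKE 'D%';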

    How to Write a DML Trigger

    Triggers are extremely powerful and useful (if somewhat rarely needed) objects that are coded very similarly to a common stored procedure. Yet for all their similarity, there are some very important differences that need to be understood. In this session, I will walk through what goes into writing a robust DML trigger, starting with a simple version of a trigger, and working through some very useful applications of DML triggers.

     

  • Speaking this week at Richmond SQL Server User Group

    Thursday night, at 6:00 (or so) I will be speaking in Richmond (http://richmondsql.org/cs2007/ ), talking about How to Implement a Hierarchy using SQL Server. The abstract is:

    One of the most common structures you will come across in the real world is a hierarchy (either a single parent "tree" or a multi-parent "graph"). Many systems will implement the obvious examples, such as a corporate managerial structure or a bill of materials. It turns out that almost any many-to-many relationship can be used as a hierarchy to find the relationship of parent rows to child rows (for example, the relationship of actors to movies they’ve been in). In this session, we’ll discuss several different implementations of hierarchies that you can use to optimize your hierarchy implementations and put into practice immediately, with ready-made T-SQL examples.

    Hierarchies are one of my favorite "fun" topics, as they are interesting for a few reasons. First, they have very common usages that most people come across, and second, they make for fairly interesting example code and performance testing. In my slides (and downloads), I will have examples where I generate several trees, including 3,400-, 41,000-, 52,000-, and even 1.2 million-node trees (the last of which, using my slightly slow generator, took something like 16 hours on my i7 laptop), along with 5 rows of sales data for every root node of the tree. It is kind of interesting to me to see how well the different tree implementations behave using each sized tree. I may even get a chance this week to toss the tree structures into in-memory tables and check their performance (but if not, it certainly will be included in what I am going to announce in the next paragraph).

    The only downside is that (not unlike most of my presentations) I have way too much material for 1 hour (or even 2). So I will be working in the upcoming months (hopefully by Devlink) to put out my director's cut video of this and several other presentations I have that are just too unwieldy for a non-precon-sized session. I will officially announce this effort soon (along with a realistic schedule!), but it has been beset with many life issues. I had a few weeks set aside for this task, but the weekend I sat down to record videos, I got sick and have had to push things back.

    However, all of the code will be available for download, and my email address is no secret (drsql@Hotmail.com). I am always happy (if sometimes slow) to answer questions, take criticism, or accept PayPal payments at that address, so feel free to do any of those with the code when it is posted at http://www.drsql.org/Pages/Presentations.aspx


  • How In-Memory Database Objects Affect Database Design: The Conceptual Model

    This post is part of an ongoing series of blog entries I am writing while preparing to give a presentation based on the prefix of this post's title. It is a work in progress as I explore the effects of the new in-memory model. I would love comments that tell me that I am wrong (especially if it turns out that I am!)

    After a rather long break in the action to get through some heavy tech editing work (paid work before blogging, I always say!) it is time to start working on this presentation about In-Memory Databases. I have been trying to decide on the scope of the demo code in the back of my head, and I have added more and taken away bits and pieces over time trying to find the balance of "enough" complexity to show data integrity issues and joins, but not so much that we get lost in the process of trying to actually get rows generated for display.

    To that end, I came up with the following basic model:  

     [Image: Conceptual Model]

    We will define customers, and define a simple US address with a domain set only for US states (my client here can only sell to the US in any case). We will define Products, and over time the price can change. I had at one time considered including discounts and offer codes, but it all started to seem to get way far away from what was necessary to complete the task at hand. Start with a set of tables using on-disk structures pre-loaded with a few thousand rows of data, then several more thousand transactions, and look at the performance. Then morph the design through a few cycles (both with and without RI, with and without stored procedures). In the end, the hot spot of my design will be twofold:

    1. Looking up and creating new customers (will I duplicate orders from the same customer? Probably; I figure I may use AdventureWorks data for the loading, though I am not 100% sure).

    2. Creating new orders

    The rest of the process would just be tedious and harder to visualize for the attendees (and I will have 1:15 at Devlink, and 1 hour if either of the two SQL Saturdays picks up the session, and that isn't much time).

    If you think this is (or isn't) adequate, I am interested to hear from you.

    The next blog will be the Logical Model, where I will start to fill out the design. I will use these diagrams in the session to demonstrate the design, and it is the process I do anytime I am given a database to design (other than the fact that I get to adjust the requirements to meet the amount of work I want to do!)


     

  • Speaking this weekend at SQL Saturday 277 - Richmond

     

    One of my favorite locations to speak or attend is when Richmond has a SQL Saturday (though if you are an organizer of another SQL Saturday I have submitted to, note that I said "one of my favorites" :)). This will be the third time I have gone to Richmond. I like it for several reasons:

     

    • The people - I have coworkers coming up from Virginia Beach to attend, and from Maryland, and our leader lives in Richmond; I have a cowriter who is also speaking (Jessica Moss), and other good friends who are amongst the speakers and organizers
    • The location - Richmond is a cool city to visit (even if it is just for a day), and we are having a team gathering after the event
    • The trip - I am driving in from Nashville, starting out tomorrow night, taking the slow road with time to stop and smell the roses. On the way back I am taking a vacation day and stopping by Dollywood, so that is awesome.

     

    Of course, none of this fits into the "why should *you* care that I will be there" category. I will be speaking on Triggers, a subject that I equally love and loathe. Love because they are awesomely powerful tools to help manage data integrity. I loathe them because they are so misused by many people. That was why I initially put together this abstract.

     

    How to Write a DML Trigger

    Triggers are extremely powerful and useful (if somewhat rarely needed) objects that are coded very similarly to a common stored procedure. Yet for all their similarity, there are some very important differences that need to be understood. In this session, I will walk through what goes into writing a robust DML trigger, starting with a simple version of a trigger, and working through some very useful applications of DML triggers.

     

    It is set at the intermediate level so I have done some editing of the introductory material, and will focus more on several sets of examples with plenty of code to download when you head back home and need to write your own trigger.

     

    Will I see you there?

  • How In-Memory Database Objects Affect Database Design: Getting Started

    Weird, somehow I lost text in the middle of this post after it was posted... I created it a few days back and set it to post later... Corrected.

    So I put in the following abstract for Devlink (and may for SQL Saturday Orlando and/or PASS Summit.) I don't know if I will get accepted, but I am pretty excited to work on this topic one way or another...

    "With SQL Server 2014, Microsoft has added a major new feature to help optimize OLTP database implementations by persisting your data primarily in RAM. Of course it isn't that simple, internally everything that uses this new feature is completely new.  While the internals of this feature may be foreign to you, accessing the data that uses the structures very much resembles T-SQL as you already know it. As such, the first important question for the average developer will be how to adapt an existing application to make use of the technology to achieve enhanced performance. In this session, I will start with a normalized database, and adapt the logical and physical database model/implementation in several manners, performance testing the tables and code changes along the way. "

    My plan is to start with a moderately straightforward table design, and implement a set of scripts that create a bunch of data. Starting out with normal on-disk tables using both ad-hoc calls and stored procedures, I will morph the design from just changing some of the tables to in-memory, then change the procedures to use in-memory wherever possible and test any changes we need to make to the physical structures. Another aspect to the process will be errant data. Part of the experiment will be to have a smattering of bad data to simulate real live programmers and users doing real live stuff.

    I think I know the main content already, but I expect to pick up a lot of knowledge between now and then (which I will attribute, naturally!) I have at least 5 months before I would need to present it, so who knows if I will change my mind. Along the way, I will blog semi-frequently about the progress, including what I have learned, code samples, philosophies, etc.

     

     

  • Abstract Submission Day for Speaking 2014…

    Today is my least favorite part of the whole speaking thing. It is the day that I put myself out there and wait to see if I get rejected or accepted. Rejection stinks, but at the same time, being accepted means more work. Neither is perfect, and as I stated in my last post, I also have a goal of not getting myself stuck spending 7 hours a day doing work outside of my day job.

     

    I plan to put in abstracts to most of the conferences I expect to attend this year. And when I say attend, I mean I plan to attend pretty much regardless of whether my submissions are accepted. I have just as much desire to learn as I have to teach, and accumulating knowledge for my ongoing writing efforts can be just as valuable as speaking. (The only downside being that when you don't speak, you don't get to go to the speaker dinner, which can be a lot of fun, particularly at a SQL Saturday event.)

     

    My goal for this year is to not get myself stuck writing a lot of new sessions. I do have a few sessions that are "done", which I would just have to update a bit, that I may submit:

     

    Fundamentals of Relational Database Design - Just what it sounds like, really. Normalization, documentation, etc.

     

    DML Triggers - The basics of writing triggers with examples. I am doing this session in Richmond in 3 weeks at SQL Saturday.

     

    Hierarchies - A deep session that covers the different ways a hierarchy can be implemented in SQL without favoring one method over the other in a manner other than empirically. I plan to base a book on the subject later this year.

     

    Sequences - All about the sequence object and its performance characteristics. Not a big hit in the past, since 2012 adoption hadn't quite gotten there when I first did it; perhaps its time has come.

     

    Finally, I do have a new session that I am going to put in and start writing in my spare time. 

     

    How In-Memory Database Objects Affect Database Design - In this session, I will be demonstrating a normalized database design using disk based tables, and then  adjusting the design to make best use of in memory objects. To demonstrate, I will use test data that will simulate multiple connections of orders being taken.

     

    In the biggest change of my writing/speaking career, I plan to start writing this presentation way before it is accepted. I will be starting on this presentation immediately in my free time, blogging about it as I go. I also have a lot to learn about in-memory database structures, so this should reinforce the learning I am and will be doing (more about all of this when I write my introductory blog about it). Even if it never gets presented, I will certainly be using the material when I put together my next database design book, whenever that turns out to be.

     

    Unless someone just really wants me to (and I don't know the meaning of the word "no"), I doubt I will put in a pre-con session this year. It is a lot of work and makes PASS a crazy time of year. Realistically, I may just attend a pre-con if one strikes my fancy. I am in the middle of building a T-SQL-from-the-ground-up pre-con (or larger) class that I am doing for my company (with rights to share secured in the process).

     

    Other than Atlanta, which has a deadline of Tuesday (March 3) for submission, I have a bit of time to change my mind and add/subtract an abstract, but I think that I am going to stick with:

     

    SQL Saturday Atlanta - May 3 - Hierarchies, Triggers, Database Design

    SQL Saturday Louisville - Jun 21 - Hierarchies, Triggers, Database Design

    Devlink - Aug 27 - In Memory DB, Hierarchies

    SQL PASS Summit - Nov 4 - In Memory DB, Hierarchies, Triggers, Database Design , Sequences

     

    Now nothing to do but sit back and wait... And prepare the In-Mem session... And hey, if none are chosen, it will give me something else to blog about!

  • Holy Cow, It has been a while

     

    Well, it has been over a year since my last insanely optimistic blog about the next year, but this year it is going to be (actually) different (not like the last time I said it would be different).

     

    Usually, like pretty much everyone on earth, I overestimate just how much I can get done in the following twelve months by some impressive amount. Writing, speaking, blogging, exercising, etc etc. Not this year. I am specifically going the other direction, saying I am doing less. Specifically more writing and less of everything else, particularly things with oppressive deadlines that preclude personal activities like exercising, eating right, going to Disney World, seeing the kid and her kids, and perhaps most importantly, sleeping!

     

    (Ironically it is 12:35 am as I write this. Yeah, I will start tomorrow…ish)

     

    I am not giving up speaking entirely, as I do love the challenge and teaching people.  As such, I do expect to put in sessions for Devlink, Atlanta, Richmond, Louisville, for sure, plus probably PASS.  But my goal is to not do what I did last year, which was to put in a bunch of new sessions that get chosen which I then have to write (the fun part), and then practice to death until I am ready to present in front of 5-100 people (the not fun part) and then travel to the conference and speak (the less fun than writing but more fun than sitting at home wishing I was at the conference part.) Of course, I may just pop down to a conference or two and just attend and learn.  That might be fun too.

     

    The biggest thing is that I plan to write more steadily again. This blog should see more life, and I will try to maintain a healthy stream of articles on my What Counts for a DBA Simple-Talk blog. The What Counts blog is fun because I have to really stretch myself to fill in a topic in a way that is both interesting and informative. Sometimes it works great, and sometimes the editor smacks me in the mouth and makes me start again. In contrast, this blog is great because I do what I want with no editor but myself, though I am annoying to write for at times. With an editor, I can trust that someone who isn't making a bunch of in-jokes to himself has looked over the blog.

     

    Along the lines of writing, another challenge I love is technical editing. A great way to sharpen your knowledge of a subject is to tech edit a book/article/presentation. It is very much like writing a book, without the need for so much creativity. Just like when you write, you have to fill in the blanks that you don't know/haven't done before; you have to do the same in tech editing. Everything you don't know, you need to look up and validate with other authorities, leading you to learn more and more. And even when you do know everything in the material, for those hours you spend, you take on a role that is incredibly important to the author: make sure that the material is correct, and help fill in the blanks. I know I have learned a lot from my tech editors over the years, because some preconceived notions you have are possibly wrong.

     

    Writing is what I really like to do, because it can be done at my pace. I have 20+ ideas in my OneNote list just to the right of where I am writing, and I just have to fill in the blanks (notably the area between the title and the end of the blog, but that is just work). I have another 50 ideas for the What Counts blog, and possibly a book idea that I will be working on this year.

     

    I am definitely getting ready to wind the Why We Write series back up, something I paused around PASS time last year before I had family and health stuff going on that precluded me from keeping on. I got 9 entries in last year (plus one where I interviewed myself), and I would like to get more this year.

     

    Lastly, I do intend to get back to my roots in the SQL Server forums, something I keep promising I will do and keep getting dragged away from. I really miss the interactive nature of the process of solving immediate problems. Yet, the biggest issue there is that my interest lies in problem prevention by proper design, and that is very hard to do. Even in my day job, too often the situation dictates that things are done subpar for time constraints, or code constraints, or tool constraints (ironically, this often leads to a lack of constraints and a surplus of data issues :))

  • SQL in the City (Charlotte) Wrap Up

    Ok, it has been quite a while since the event, two weeks and a day to be exact, but I needed a rest before hitting Windows Live Writer again. Speaking is exhausting, traveling is exhausting, and well, I replaced my laptop and had to get all of my software back together. (Between Windows 8.1 sync features, Dropbox and Skydrive, it has never been easier…but I digress.)

    There are plenty of great vendors out there, but one of my favorites has always been Red-Gate. I have written half of a book with them, have a blog series on What Counts for a DBA, and have convinced at least one company to purchase their tool belt product for all of the developers/dbas (the company I work for!). I have also spoken on their behalf a few times, this time being the most fun.

    If you didn’t get a chance to attend any of their events yet (full list here), it is a highly recommended experience. It is very much like a SQL Saturday, except perhaps that Red-Gate has a better budget than most, so the venue and food are at least a wee bit better.  At a minimum you generally get two of my favorite speakers: Steve Jones and Grant Fritchey, plus lots of the Red Gate team looking to get and give feedback about their tools. They had two concurrent sessions, and for the most part, about half of the sessions were at least somewhat Red Gate tools specific. I was personally interested in the Red Gate sessions, so I missed the sessions by Nigel Sammy, Kevin Boles, Mickey Steuwe, and Kevin Hazzard.

    Of course, the one non-Red Gate-centric session I attended was the one I gave. I was a headline speaker for the conference, which was very gratifying I must admit. I try to seem very humble, but really I have a raging ego that must be served. But seriously, it was very much an honor to have my name just under Steve and Grant, who are extremely good, very natural speakers. Lucky for me, they picked from the list of abstracts I submitted my favorite session: Fundamentals of Relational Database Design (slides available here). The combination of this being a free event, not being recorded, a wonderful, energetic audience who were into giving feedback (like the fact that the user is what matters in designing a database!), and a delightful lunch made for a very nice session indeed. (It was quite well rated even, so for once I felt good and everyone seemed to like it simultaneously.) I learned later that it was in fact being recorded, so if you want to see it again, or for the first time, I will tweet the address once I get it.

    All in all, I just have to say that it was a great time, and next year I hope to at least attend their events if I don’t get a chance to speak again.

    Now I just need to get to blogging a wrap up of the PASS Conference!

  • How to Design a Relational Database; PASS Precon Swag and its “Symbolism”

    Update! 10 more books added to the cadre from my friends at Red-Gate.

    With less than a week to go, I am starting to pack up for Charlotte and PASS 2013. I love that it is in Charlotte this year so I can drive and bring along some goodies to give away. Books and toys mostly, a variety of which were chosen rather specifically for some manner of symbolism with a tie in to database design for the most part. (Okay, symbolism is perhaps a bit of a stretch, but I have tied everything, even the goofy stuff, to the meaning of the class!)

    1. 5 printed and 3 ebook copies of my Pro 2012 Relational Database Design and Implementation. Well, obviously the symbolism here is that I hope the rest of the class each buys 5 copies and gives them away to their coworkers so my royalties will buy me more than a model airplane.

    2. Lego Architecture Leaning Tower of Pisa – Last year I gave away the Farnsworth (glass) House because we database folks can’t throw stones, since we aren’t perfect either. (Luckily, none of us actually live in glass houses either!) The Leaning Tower of Pisa reminds me that not every masterpiece is perfect. And well, you really need to start with a good foundation!

    3. Lego DeLorean Time Machine – Because, if you had a time machine, up front database design wouldn’t be that necessary.  Just go back and fix the database from the start without rework. Of course we don’t have time machines so, the more right you get it done the first time the better.  (And it is a really cool set!)

    4. Lego Space Needle – For anyone pining for Seattle, they can get this and put it in their hotel room and pretend it is next year.

    5. 5 copies of the book The Imagineering Workout – A book by some Disney Imagineers about how to stretch and use your imagination. Imagination is very important when you are designing a database (or any code). If you compare the vision of engineers of early computers (no one will want one of these in our house) to science fiction writers (wrist radio, eh Dick Tracy?), who was more right? Neither was particularly right, but look at the companies that flourish: they have a great mix of technology, customer service, and imagination on their side.

    As architects, being able to envision the final product in action is of great help. Being able to code and execute code mentally means you don’t have to write a lot of code to test out every idea you have. Not that your mental query processor is going to find all architecture errors ahead of time, but the more it does eliminate the better.

    6. 1 copy of MVP Deep Dives 1 – A great book by a lot of great authors. I had an extra copy in my swag stash, so it is coming along too.  There is a chapter or two on database design in there penned by myself and Paul Nielsen, so it fits!

    7. Lego Star Wars X-Wing Fighter – Well, no deep symbolism here, and I don’t mention Star Wars in the slides, but Star Wars is cool and a job (or class) in technology ought to be fun. (And don’t start a debate about prequels in the comments. The prequels were not as great as the original trilogy, but I still like them and the Clone Wars series! A bad day in the Star Wars universe is better than a day at work in any case.)

    8. 2 Lego Brickleys – I use pictures of Brickley Lego dragons to illustrate the magnitude of design and requirements needed early in the session. Brickley is a great example to me because he comes in three sizes. This tiny size, an order of magnitude greater size (in number of pieces), and the “life” sized version at Downtown Disney in Orlando. The process to design and then explain to someone else how to build these is very similar, but the time and documentation investment is very very different.

    9. A Lego set of a Demolition Man blowing something up – Sometimes, you just have to blow things up and start afresh. Many databases out there make Frankenstein’s monster seem like a very natural creature, with bits and pieces bolted on to something that ought to have been dead for years.

    10. 4 small Lego “Swamp Creature” sets – This set is really in there for two reasons: one, because this is October, and two, to make a simply awful joke that (if you get this set and put it on your desk) should remind you what the job of the data architect really is, using a bit of rhyme association.

    11. A copy of Performance Tuning with SQL Server Dynamic Management Views, my book with Tim Ford for Red-Gate, because while in 7 and a half hours (less lunch and breaks), we just won’t have time to spend a lot of time on the physical side of things.

    12. A copy of The Manga Guide to Databases. A fun book with some “interesting” art, but with a lot of decent explanations of database design principles. In some manners, it is very comparable to what I want to do, to strip away the complexities of theory and leave only the practical application. Of course, it is a lot lower level and feels like it was written for a 12 year old, but it was kind of fun too.

    13. 5 copies of SQL Server Transaction Log Management – I have read 1/2 of this book by Gail Shaw and Tony Davis, and it is excellent. I learned quite a few things about the transaction log that I didn’t know (nor had ever really thought about :)

    14. 5 copies of SQL Server Concurrency – A book by one of my SQL heroes, Kalen Delaney. Wish she was going to be here at the Summit this year.

    So while it does in fact look like just a bunch of toys and books (for good reason, actually, since that is what it is), all of the items were chosen for reasons beyond just being cool or fun, to help the person who ends up taking them home flash back to the class where they got them. Am I spoiling the fun by sharing the list before the session? Maybe, but it will help me to determine how many people read the blog about such things, and well, it will keep me from changing my mind on #3 and #7 (both sets I haven’t bought myself yet and really do quite want.)

    So I hope to see you next Tuesday, and perhaps you get something cool to ship home because your luggage is already too full as it is!

  • SQL In The City Charlotte - Fundamentals of Database Design

    Next Monday, October 14, at Red-Gate's SQL In The City conference in Charlotte, NC (one day before PASS), I will be presenting my Fundamentals of Database Design session. It is my big-time chestnut session, the one that I do the most and have the most fun with. This will be the "single" version of the session, weighing in at just under an hour, and it is a lot of material to go over (even with no code samples to go awry to take up time.) 

    In this hour-long session (presented in widescreen format for the very first time!), I cover the process of database design, from gathering requirements, to assembling a model, to finally checking to make sure that the model does the best job of meeting the standards that Codd and friends have been refining for the past 30 years, plus some discussion about testing and implementation. You can find out more about me and the other speakers here on the SQL In the City site.

    (Note: The following day (Oct 15) I will be doing the LP version of this session at the PASS conference, also in Charlotte, blogged about here, with the big difference being that we will do some design and I will only have to talk while breathing out.)

    I have no idea what to expect as this is my first SQL in the City, but I love Red Gate tools, and everyone I have met from RedGate has always been fun to hang around with. If you haven't signed up for one of the Monday precons for PASS (my fellow pre-con writers worked hard on their sessions :)), please register and come over to the Ritz Carlton (the one where the conference is being held, of course) and get your learn on!

     

    On a barely related note, the name SQL In The City always reminds me of Pete Townshend's song "Street in the City".

    "I watch as Fleet Street makes new heroes rise and fall.
    The news is written in the eyes of us all.
    One is a sinner. One is a saint, but most of us worry about showing up late.

    Street in the city. Street in the city on a working day."

    Somehow, I doubt this was heavy in the mind of the people choosing a name, but I march to the beat of a drum that is decidedly different than even many of my nerdy SQL friends (which would make a great Saturday morning cartoon show, though I digress in my digression.)

  • Why We Write #9 - An Interview With Laerte Junior Poltronieri

    For my next interview, I am going to a person whom I do not know all that well, but whose name I have seen many times. Laerte Junior Poltronieri (@LaerteSQLDBA) is a very active tweeter, with over forty-four thousand tweets to his name. His blog, http://shellyourexperience.com/, focuses heavily on one of my least favorite SQL Server topics, PowerShell, and it is in Portuguese. If, like me, you don't speak Portuguese, you can use Bing Translator here to check out his blog.

    Laerte has quite a few articles on SimpleTalk, and is a contributor to the SQL Server PowerShell Extensions, a project you can find here on Codeplex, which (without knowing anything about PowerShell myself) looks quite interesting. He is also a Senior PowerShell Expert Consultant for a very cool organization, Confio. Oh yeah, and he is also a SQLPASS Regional Mentor. Quite a writing resume...

    1.  There was a point in time when you didn't have a blog, didn't tweet, and had no public presence whatsoever. And then, one day, you made the decision to put yourself out there. What prompted you to get started writing, and what was the first public work you produced?

    I always enjoyed reading technical articles, and SQL Server was a passion from the first time, back in version 6.5. My first book was the internals book for 7.0 (from Kalen Delaney) and then the one for 2000 (also from Kalen). When the internet became a reality and blogs began to appear, I read a lot about SQL Server. I can say I grew up in SQL Server reading Buck Woody, Kalen Delaney, Kevin Kline, Bob Beauchemin, Steve Jones, Grant Fritchey, Allen White, Thomas La Rock, Robert Davis, Christian Bolton, Simon Sabin, Ron Dameron (the first article about PowerShell that I read), Chad Miller and many other Jedi that exist in our community. But for you to write about something, in my opinion, you have to at least have a good knowledge of the subject and like it. That's when I started thinking about creating a blog for SQL Server. At the same time I started looking at PowerShell (I had watched the Buck Woody session here at TechEd Brazil) and I spent a lot of time learning it. But what's the point if you learn but do not share? That is the feeling. What I feel when sharing is sometimes even bigger than when I learn. What could I have learned without the content written by the names that I mentioned before? So I can say that all these amazing people from our community (actually they started the idea of the community) inspired me to do what they did for me: give the opportunity to learn. Then I started to blog about PowerShell.

    2. We all have influencers who affect our trajectory as writers. It may be a teacher who told you that you had great potential, or another writer who impressed you so much that you wanted to be like them. Or perhaps, on the other end of the spectrum, it was a teacher who told you that you were too stupid to write well enough to spell your own name, much less to one day impress people with your writing. Who were the influences that stand out as essential parts of your journey to the level of writer you have become?

    All the names I mentioned before, but there is one guy who really impressed me. I was starting with PowerShell and had just read Ron Dameron's article about it. Then I wrote my script, but nothing was working. For several days I was trying and failing. I was almost giving up when I read the blog of Shay Levy (PowerShell Master Jedi) and sent an email to him. To my surprise, a couple of minutes later he replied. We exchanged some emails and he was trying to help me, but it was still not working. I know it was 2 or 3 AM his time, and then he asked me if he could remote into my machine to check. He did, solved it, and taught me. C'mon... a guy I had never seen in my life, at 3 AM, spending his time teaching a guy he had never seen in his life (LOL), solving the problem and teaching. It impressed me a LOT, and I always remember that and am grateful to Shay for it. He was one of my influencers too, not only for his technical content, but for his goodwill to teach and share.

    3. What would you say has been your greatest success story as a writer, even if it was not a commercial success? And conversely, have there been any projects that were just complete disasters that, looking back, you probably could have seen coming before you got started?

    Hahaha. There is a disaster from before it all started. Today he is a great friend, and at the time he was very careful with his words (he is a gentleman) and, to be honest, he was right. I tried to send a script to SQL Server Central, and Steve Jones replied to my email saying that my content was good, but my English was not (LOL), and at the time he could not review it for me. I could have taken this badly, but hahaha, I was much happier that I was talking with Steve Jones than upset about what he told me, and I thought, "I will improve my English as best I can and I will send it again." After some time I got my first script published at SQL Server Central. I still remember my feeling when it happened, because it is the same feeling I still have today when I see my name at SQL Server Central and Simple-Talk.

    I had luck in my writing career. Simple-Talk and SQL Server Central were always my sources for study, and when I read Buck, Steve, Kevin, and Grant I thought, "I have some good content to share. I know that, and I want to be where my gurus are. I want to know them and share with them." Then I sent an email to Simple-Talk and, to my happiness, Andrew replied that I could write for them. Oh boy. It was AMAZING!!!! I have a deep respect for, and gratitude to, Steve, Andrew, Chris, and all the people from Simple-Talk/Redgate for giving me that opportunity. Everything in my life started with them.

    4. I saw from your SQLBits biography that you are a huge Star Wars fan (me too). Does Star Wars, or any other pop culture, affect your writing? I know I frequently want to infuse Monty Python-esque tidbits into my writing (even if the occasional editor strips them out as non-essential).

    Yes... I am a HUGE Star Wars fan. I still remember being in the movie theater for "Luke... I am your father" and everybody going "WOWWWWWWWWWWWWWWWWWW!!!!!!". It is an amazing story and I just love it. Of course it affects my writing. Not only Star Wars. I am a Marvel/DC fan, I love Batman, Superman, Spawn, Ghost Rider... and in a certain way it affects all my life, not only my writing. If any Star Wars is showing on TV, do not ask me to go out hehehe.

    5. And now for something completely different... Assume a time machine has been created, and you are allowed to go back in time to speak to a group of potential writers, in which you and I are in attendance. What would be the most important point you want to make sure that we hear and follow up on?

    Two words: share and courage. You don't need to know how to link SQL Server with aliens, or how to talk about internals in binary, to write about it. You just need to have the passion to share and the courage to do it. It does not matter whether it is ultra-super internals content or basic content. Just for having the courage to share, you have my respect.

    6. Finally, beyond the "how" questions, now the big one that defines the interview series. Why do you write?

    First because I like it, second because I like it, third because I like it hehehehe. I guess the most important reason is that through writing I could get to know my gurus and be part of something big that really matters: the SQL Server/PowerShell community. Today when people like Bob, Argenis, Steve Jones, Grant, Kevin Kline, Allen, Edwin, Mladen, Bolton, Thomas La Rock, Neil, Simon, Buck, Kalen, and so many other well-known names see me, they know my name and say hello to me. I am a simple guy from a country town in Brasil. This is priceless for me.

     

    Thank you to Laerte for another fine installment in the Why We Write series (anyone into Star Wars can't be all bad!). I particularly like his answer about why he writes: because he likes it. I think in future installments the "why" question will also include that question: do you like it? I know I personally waver daily between loving to write and hating it. Sometimes the hate is simply because I can't find the words to reflect the material I want to write about. Other times I just enjoy spending hours at this keyboard typing about technology.
