THE SQL Server Blog Spot on the Web


Kalen Delaney

Did you know? -- Altering the length of a fixed-length column

You may be well aware that if you alter the length of a column to a larger value, SQL Server doesn't go through all the rows at the time you issue the ALTER. The ALTER is a metadata-only change, and the actual movement of data doesn't happen until a row is updated.

However, SQL Server does not reuse the original bytes in the row when a column's length is altered. You may end up unable to use the full allowable row length if you've altered column lengths.

Here's an example. Create a table with two large char columns and a smallint. The initial length of each row will be just a bit more than 3000 bytes, out of a maximum row length of 8060 bytes.

CREATE TABLE bigchange
(col1 smallint, col2 char(2000), col3 char(1000));
GO

 

Now alter the second column to increase its length to 3000 bytes. The length of a row should now be just a bit more than 4000 bytes.

 

ALTER TABLE bigchange
   ALTER COLUMN col2 char(3000);
GO

 

Now try to add another 3000-byte column, which should bring the row length to just over 7000 bytes. It will fail:

 

ALTER TABLE bigchange
      ADD col4 char(3000);

 

Msg 1701, Level 16, State 1, Line 1
Creating or altering table 'bigchange' failed because the minimum row size would be 9009, including 7 bytes of internal overhead. This exceeds the maximum allowable table row size of 8060 bytes.
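
The 9009 bytes in the error message can be accounted for directly, because the abandoned space from the original col2 is still counted toward the row size:

      7 (internal overhead)
   +  2 (col1, smallint)
   + 2000 (original col2 bytes, no longer used but not reclaimed)
   + 3000 (col2 after the ALTER)
   + 1000 (col3)
   + 3000 (col4 being added)
   = 9009 bytes, which exceeds the 8060-byte maximum.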

 

However, if you just create a table with two 3000-byte columns and a 1000-byte column, there will be no problem.

 

CREATE TABLE nochange
(col1 smallint, col2 char(3000), col3 char(1000), col4 char(3000));
GO

 

Here’s a query you can run to see the metadata for where each column will start. Run it against the bigchange table:

 

SELECT c.name AS column_name, column_id, max_inrow_length,
       pc.system_type_id, leaf_offset
  FROM sys.system_internals_partition_columns pc
    JOIN sys.partitions p
        ON p.partition_id = pc.partition_id
    JOIN sys.columns c
        ON column_id = partition_column_id
           AND c.object_id = p.object_id
 WHERE p.object_id = OBJECT_ID('bigchange');
GO

 

Here are my results:

column_name   column_id   max_inrow_length system_type_id leaf_offset
------------- ----------- ---------------- -------------- -----------
col1          1           2                52             4
col2          2           3000             175            3006
col3          3           1000             175            2006

 

Note that col1 is only 2 bytes long and starts at offset 4 (right after the row header information), but the next column doesn't start until offset 2006. Bytes 6 – 2005, the original 2000 bytes for col2, are no longer used for anything.

 

So be careful when using large data types, especially if you want to make them fixed length instead of variable length.

 

-- Kalen

Published Friday, October 13, 2006 10:33 AM by Kalen Delaney


Comments

 

Peter W. DeBetta said:

So then the question is: how do you "recover" that unused space?

I found one way to do it. First, you'll need a NOT NULL column; you can either add one or, preferably, use an existing column.

Using an existing column
1. Create and drop a Clustered Primary Key

   ALTER TABLE bigchange ADD CONSTRAINT pk_bigchange PRIMARY KEY (col1);
   GO
   ALTER TABLE bigchange DROP CONSTRAINT pk_bigchange;
   GO

2. Run the query that shows the leaf offset and see that the offsets are now "fixed". In this case, the leaf offsets of col1, col2, and col3 are respectively: 4, 6, 3006

Creating a new column
1. Create the new column. If the table has data, the column must have some value. In this case, since there wasn't an identity defined, I just added the new column as an identity.

   ALTER TABLE bigchange ADD col4 int not null identity(1, 1);
   GO

2. Create and drop a Clustered Primary Key

   ALTER TABLE bigchange ADD CONSTRAINT pk_bigchange PRIMARY KEY (col1);
   GO
   ALTER TABLE bigchange DROP CONSTRAINT pk_bigchange;
   GO

3. Drop the column you added.

   ALTER TABLE bigchange DROP COLUMN col4;

4. Run the query that shows the leaf offset and see that the offsets are now "fixed". In this case, the leaf offsets of col1, col2, and col3 are respectively: 8, 10, 3010. You'll notice that all the leaf offsets are 4 more than in the first example - a remnant of col4 being there.

I'm sure there are other techniques, and if you have them, do tell.

--Peter
October 13, 2006 3:34 PM
 

Peter W. DeBetta said:

Oops - I forgot to mention that in my first example (using an existing column), I changed col1 to be NOT NULL so that I could do the example. So, if you want to try this yourself, replace the original CREATE TABLE code with:

CREATE TABLE bigchange
(col1 smallint not null, col2 char(2000), col3 char(1000));

--Peter
October 13, 2006 3:40 PM
 

Manu said:

I remember having this problem once, and I reclaimed the space by recreating the clustered key
October 16, 2006 10:26 AM
 

Peter W. DeBetta said:

Manu,

I should have made the generalization that any clustered index would do the trick. Thanks for pointing it out.

So if you already have a clustered index or key, you can recreate it. Of course, if the clustered key is being referenced by any foreign keys, you will have to drop those and recreate those references as well.

Finally, another method would be (assuming there is no clustered index on the table)...

   CREATE CLUSTERED INDEX ic_bigchange ON bigchange(col1);
   GO
   DROP INDEX bigchange.ic_bigchange ;
   GO

--Peter
October 17, 2006 11:10 AM
 

Ewart said:

Hi there. I've tried the clustered index trick and it didn't work for me when changing data types in SQL Server 2000. If anyone has any ideas:

http://groups.google.co.nz/group/microsoft.public.sqlserver.server/browse_frm/thread/fabeb25ca4c15b89/6f47e93ff9d446b5?hl=en&

regards
ewart
November 29, 2006 11:37 PM
 

Shaun said:

Changing a type from fixed decimal to vardecimal does not follow the principle of delayed change. The data is modified at the time of the change.

June 1, 2007 5:48 PM
 

david wei said:

Kalen,

I know it's an old thread, but I am just reviewing all the articles that interest me (especially yours!).

After doing my own research, I found that the following statement is not correct:

"You may be well aware that if you alter the length of a column to a larger value, that SQL Server doesn’t go through all the rows at the time you issue the ALTER. The ALTER is a metadata only change, and the actual movement of data doesn’t happen until a row is updated."

When you alter a column (char or varchar), SQL Server actually does move the data inside the page. You can see this by using DBCC PAGE to check that the column is listed twice, with one copy marked as dropped (the total number of columns also increases by 1).

You mentioned this in your storage book.

The interesting thing is that if you change the length to a SMALLER (not larger) value, SQL Server does not move the data.

David Wei

July 25, 2007 7:25 PM
 

Salim Fayad said:

Hi Kalen,

I want to congratulate you on the very nice book "Microsoft SQL Server 2008 Internals". But I have two questions:

1. If I have two columns of type "bit" in one table, how are they stored? I noticed that they have the same offset and are saved in the same byte. Is this correct?

2. I ran your query above and it produced some duplicate columns. I noticed that sys.partitions returned two records for the same table. Is there some specific criterion I should add to make it return all the columns without duplication?

Thank you very much.

Regards,

Salim Fayad

salimfayad@hotmail.com

March 22, 2010 11:08 AM
 

Kalen Delaney said:

Hello Salim

Up to 8 bit columns will be combined into a single byte, so they will show the same offset.
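
As a quick check (a sketch; the table name bitdemo is made up for illustration), you can create a table with several bit columns and point the offset query from the post at it; all the bit columns should report the same leaf_offset:

   -- Hypothetical table, just to illustrate bit packing
   CREATE TABLE bitdemo (b1 bit, b2 bit, b3 bit, b4 bit);
   GO
   -- Same metadata query as in the post, pointed at bitdemo;
   -- up to 8 bit columns share one byte in the row, so b1..b4
   -- should all show the same leaf_offset
   SELECT c.name AS column_name, pc.max_inrow_length, pc.leaf_offset
     FROM sys.system_internals_partition_columns pc
       JOIN sys.partitions p
           ON p.partition_id = pc.partition_id
       JOIN sys.columns c
           ON c.column_id = pc.partition_column_id
              AND c.object_id = p.object_id
    WHERE p.object_id = OBJECT_ID('bitdemo');
   GO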

I'm confused about your second question. My output shows one row for every column. Are you saying you get something different, running the same script?

Or are you using a different table? You may need to read more about partitions and sys.partitions view. If you have indexes on the table, each index has its own partitions.

March 22, 2010 11:26 AM
 

Salim Fayad said:

Hi Kalen,

You are right. There are two partitions on the table I am querying. My question was: is there some criterion to get the default partition that generates all the columns?

Thank you.

March 23, 2010 2:24 AM
 

Salim Fayad said:

Hi Kalen again,

Can you please help me understand how the transaction log stores LOB data (varbinary(MAX), text, ntext, image, ...)?

Thank you in advance

March 23, 2010 5:23 AM
 

Kalen Delaney said:

Salim

There is no such thing as a 'default' partition. All the partitions contain rows from the table and all the rows have all the columns. However, if you are seeing multiple rows for each column, you can filter the output for only one partition_number...something like: AND partition_number = 1
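
Applied to the query from the post, the filter would look something like this (a sketch; if the duplicate rows come from multiple indexes rather than multiple partitions, restricting index_id to the base data may be what's needed instead):

   SELECT c.name AS column_name, column_id, max_inrow_length,
          pc.system_type_id, leaf_offset
     FROM sys.system_internals_partition_columns pc
       JOIN sys.partitions p
           ON p.partition_id = pc.partition_id
       JOIN sys.columns c
           ON c.column_id = pc.partition_column_id
              AND c.object_id = p.object_id
    WHERE p.object_id = OBJECT_ID('bigchange')
      AND p.partition_number = 1     -- only one partition per index
      AND p.index_id IN (0, 1);      -- heap or clustered index only
   GO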

As for your transaction log question, Microsoft does not make public exactly how LOB data is stored in log records.

~Kalen

March 24, 2010 7:48 PM
 

Mark Patrick said:

I used Peter's technique to fix a large custom SQL table, in Sage CRM, that required many field revisions and deletions due to poor design and haphazard maintenance. About halfway through, I was getting row-size errors and couldn't finish my clean-up job. Thanks Peter...and Kalen, for pointing out this somewhat obscure fact about SQL Server. I was down to considering deleting the table, and everything related to it, and starting over. You both saved me at least 3-4 solid days of work.

July 13, 2010 11:14 AM
 

Martin said:

Hi Kalen,

I was trying to write a query that I could run against each database to detect tables and indexes with this issue.

In doing so it came to my attention that the JOIN ON column_id = partition_column_id is not necessarily valid.

An example where this is the case is below...

DROP TABLE Foo
CREATE TABLE Foo (A CHAR(65), B CHAR(66), C CHAR(67), D CHAR(68), E CHAR(69));
CREATE CLUSTERED INDEX ixA ON Foo(A);
CREATE NONCLUSTERED INDEX ixB ON Foo(B) INCLUDE (E);

SELECT CASE max_inrow_length
         WHEN 4 THEN 'uniquifier'
         ELSE CHAR(max_inrow_length)
       END AS actual_column_name,
       column_id,
       name AS column_name_according_to_join,
       max_inrow_length,
       leaf_offset
FROM   sys.system_internals_partition_columns pc
       JOIN sys.partitions p
         ON p.partition_id = pc.partition_id
       LEFT JOIN sys.columns c
         ON column_id = partition_column_id
        AND c.object_id = p.object_id
WHERE  p.object_id = OBJECT_ID('Foo') AND index_id > 1

Which returns

actual_column_name column_id   column_name_according_to_join  max_inrow_length leaf_offset
------------------ ----------- ------------------------------ ---------------- -----------
B                  1           A                              66               1
A                  3           C                              65               67
uniquifier         4           D                              4                -1
E                  2           B                              69               132

June 15, 2011 6:32 AM
