Oracle has had a feature called index skip scan since version 9i.
When I first came across it, it seemed like a clever trick, but not a critical capability.
More recently, I have been advocating data warehouses on SSD in appropriate situations,
and I now think this is a valuable feature for keeping the number of nonclustered
indexes to a minimum.
Briefly, suppose we have an index with key columns: Col1,
Col2, in that order.
Obviously, a query with a search argument (SARG) on Col1 can use this index,
assuming the data distribution is favorable.
However, a query with the SARG on Col2
but not Col1 cannot use the index in a seek operation.
Now suppose that the cardinality of Col1
(the number of distinct values of Col1) is relatively low.
The database engine could then seek on each distinct value of Col1
combined with the specified SARG on Col2.
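The idea can be sketched in T-SQL. The table and column names below are hypothetical, chosen only to illustrate the index shapes; the final query spells out the seeks that a skip scan would perform implicitly.

```sql
-- Hypothetical table with a composite index on (Col1, Col2).
-- Assume Col1 has low cardinality (a handful of distinct values).
CREATE TABLE dbo.Example (
    Col1    int NOT NULL,
    Col2    int NOT NULL,
    Payload varchar(100) NULL
);
CREATE INDEX IX_Example_Col1_Col2 ON dbo.Example (Col1, Col2);

-- Can seek: the SARG is on the leading key column.
SELECT Col2 FROM dbo.Example WHERE Col1 = 3;

-- Cannot seek in SQL Server (no skip scan): the SARG is on Col2 only.
SELECT Col1 FROM dbo.Example WHERE Col2 = 12345;

-- What a skip scan does logically: one seek per distinct Col1 value,
-- each combined with the Col2 predicate.
SELECT e.Col1, e.Col2
FROM (SELECT DISTINCT Col1 FROM dbo.Example) AS d
JOIN dbo.Example AS e
  ON e.Col1 = d.Col1 AND e.Col2 = 12345;
```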
Microsoft SQL Server currently does not have the Oracle Index Skip-Scan feature,
but the capability can be achieved with a work-around.
In this example, the LINEITEM table has a clustered index leading with a date column
but does not have an index leading with L_ORDERKEY.
Our query is to find a specific Order Key in the LineItem table.
If there is a table with the distinct date values, DimDate,
we could force a loop join from the DimDate table to LineItem
(even though only columns from LineItem are required)
to get the execution plan below.
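A sketch of that workaround is below, assuming a TPC-H style LINEITEM table clustered with L_SHIPDATE as the leading key and a DimDate table holding one row per distinct date (the DateValue column name is an assumption). The LOOP JOIN and FORCE ORDER hints keep DimDate as the outer source, so the plan performs one clustered-index seek per date.

```sql
-- Force a loop join from DimDate into LINEITEM, even though
-- only LINEITEM columns are returned; DimDate exists solely to
-- drive one clustered-index seek per distinct date.
SELECT li.L_ORDERKEY, li.L_LINENUMBER, li.L_QUANTITY
FROM dbo.DimDate AS d
INNER LOOP JOIN dbo.LINEITEM AS li
    ON li.L_SHIPDATE = d.DateValue
WHERE li.L_ORDERKEY = 1234567
OPTION (FORCE ORDER);
```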
The question is now: how effective is this technique?
The most efficient execution plan is, of course, to have an index leading with the Order Key column.
But the situation calls for keeping the number of nonclustered indexes to an absolute minimum.
So how does the above execution plan compare with a table scan?
A table scan for this type of query, in which only a few rows meet an easy-to-evaluate SARG,
might run at about 1GB/s per core.
Note this is far higher than the 200MB/sec cited in the Microsoft Fast Track Data Warehouse
documents. This is because the FTDW baseline is a table scan that aggregates several columns
of every row. And not only that, a Hash Match is also required to group the results.
Basically, a needle in haystack table scan runs much faster than the more complex
aggregate and group scan.
At most, a 1GB table scan might be acceptable for a non-parallel execution plan,
and even a 50GB table scan could be tolerable on a powerful 32-core system
with an unrestricted parallel execution plan
(50GB at 1GB/s per core across 32 cores is roughly 1.6 seconds).
A loop join can run somewhere in the range of 100,000-200,000 seeks/sec to the inner source.
Realistically, a data warehouse with 10 years of data has 3,652 distinct days
(depending on how the leap years fall).
A loop join with 3,652 rows from the outer source should then run
somewhere around 3,652 ÷ 100,000 ≈ 36ms.
Even if the DW had 20 years of data, this is still acceptable, on the assumption that
searches on the non-lead Order Key column are in the minority,
with the benefit that one less index is required on the big table.
If the query could be bounded to a single year or quarter,
then we approach the efficiency of having the extra nonclustered index.