<?xml version="1.0" encoding="UTF-8" ?>
<?xml-stylesheet type="text/xsl" href="http://www2.sqlblog.com/utility/FeedStylesheets/rss.xsl" media="screen"?><rss version="2.0" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:slash="http://purl.org/rss/1.0/modules/slash/" xmlns:wfw="http://wellformedweb.org/CommentAPI/"><channel><title>Search results matching tag 'Big Data'</title><link>http://www2.sqlblog.com/search/SearchResults.aspx?o=DateDescending&amp;tag=Big+Data&amp;orTags=0</link><description>Search results matching tag 'Big Data'</description><dc:language>en-US</dc:language><generator>CommunityServer 2.1 SP2 (Build: 61129.1)</generator><item><title>PASS Business Analytics Conference (BAC) Recap</title><link>http://www2.sqlblog.com/blogs/kevin_kline/archive/2013/04/14/pass-business-analytics-conference-bac-recap.aspx</link><pubDate>Sun, 14 Apr 2013 14:15:00 GMT</pubDate><guid isPermaLink="false">21093a07-8b3d-42db-8cbf-3350fcbf5496:48667</guid><dc:creator>KKline</dc:creator><description>&lt;p style="font-family:Georgia, 'Times New Roman', 'Bitstream Charter', Times, serif;font-size:13.333333969116211px;line-height:18.99305534362793px;"&gt;&lt;/p&gt;&lt;p style="font-size:13.333333969116211px;line-height:18.99305534362793px;"&gt;The PASS Business Analytics Conference (&lt;a href="http://www.passbaconference.com/"&gt;PASS BAC&lt;/a&gt;) is&amp;nbsp;&lt;a href="http://www.sqlpass.org/"&gt;PASS&lt;/a&gt;' first&amp;nbsp;foray&amp;nbsp;into an event that is dedicated to business intelligence, big data, data visualization, and business analytics. &amp;nbsp;And it totally makes sense for PASS to move in this direction, over and above the flagship community work centered on database management and application development. &amp;nbsp;Why? &amp;nbsp;Because business analytics is all about how to&amp;nbsp;&lt;em&gt;apply&amp;nbsp;&lt;/em&gt;the data being collected and managed by all of those developers and DBAs. &amp;nbsp;And, at the end of the day, how we use and apply our data is really the nexus of its value. 
&amp;nbsp;That's what matters to business. &amp;nbsp;You can&amp;nbsp;&lt;a href="http://passbaconference.com/Connect/Blog/entryid/542/Taking-Business-Analytics-to-the-Next-Level.aspx#.UWZVyFeJuzE"&gt;read the speech from the standing president&lt;/a&gt;, Bill Graziano (&lt;a href="https://twitter.com/#!/billgraziano"&gt;Twitter&lt;/a&gt;&amp;nbsp;|&amp;nbsp;&lt;a href="http://weblogs.sqlteam.com/billg/rss.aspx"&gt;Blog&lt;/a&gt;), or watch it online at the PASS website.&lt;/p&gt;&lt;p style="font-size:13.333333969116211px;line-height:18.99305534362793px;"&gt;&lt;img class="alignnone" alt="" width="640" height="386" style="border:1px solid black;cursor:default;margin:2px;" src="https://sphotos-b.xx.fbcdn.net/hphotos-snc6/892805_435264543230101_1655024948_o.png"&gt;&lt;/p&gt;&lt;p style="font-size:13.333333969116211px;line-height:18.99305534362793px;"&gt;The day one highlight, introduced by the SQL Server team's best presenter - Amir Netz (&lt;a href="https://twitter.com/AmirNetz"&gt;Twitter&lt;/a&gt;), is the release of a new BI data visualization tool called&amp;nbsp;Project “GeoFlow” for Excel. &amp;nbsp;GeoFlow is a 3D visualization and storytelling tool that helps you&amp;nbsp;map, explore and interact with both geographic and chronological data for visualizing data which is difficult to identify in traditional 2D tables and charts. With GeoFlow, you can plot up to a million rows of data in 3D on Bing Maps, see data changes over time and share findings through appealing screenshots and cinematic, guided video tours of the data. It's really something you have to see to understand – check out the video demo and screenshots below. You can also&amp;nbsp;&lt;a href="http://spr.ly/getgeoflow"&gt;download&amp;nbsp;&lt;/a&gt;and try it out firsthand today. It’s an entirely new way to experience and share insights – one you’ll probably enjoy. 
&amp;nbsp;&lt;span style="line-height:19px;"&gt;For more information on GeoFlow, check out the&amp;nbsp;&lt;/span&gt;&lt;a target="_blank" style="line-height:19px;" href="http://blogs.office.com/b/microsoft-excel/archive/2013/04/11/dallas-utilities-electricity-seasonal-use-simulation-with-geoflow-preview-and-powerview.aspx"&gt;Excel team’s blog&lt;/a&gt;&lt;span style="line-height:19px;"&gt;&amp;nbsp;and visit the&amp;nbsp;&lt;/span&gt;&lt;a style="line-height:19px;" href="http://www.microsoft.com/en-us/bi/Products/Office.aspx"&gt;BI website.&lt;/a&gt;&lt;/p&gt;&lt;p style="font-size:13.333333969116211px;line-height:18.99305534362793px;"&gt;&lt;img class="alignright" alt="" width="150" height="200" style="border:0px;cursor:default;float:right;" src="http://pricetheory.uchicago.edu/levitt/images/Photo-of-Steven-Levitt.png"&gt;&lt;/p&gt;&lt;p style="font-size:13.333333969116211px;line-height:18.99305534362793px;"&gt;The highlight for me, aside from connecting with so many friends and colleagues in the exhibit hall at the&amp;nbsp;&lt;a href="http://www.sqlsentry.net/"&gt;SQL Sentry&lt;/a&gt;&amp;nbsp;booth, was the day 2 keynote address by&amp;nbsp;&lt;a href="http://pricetheory.uchicago.edu/levitt/home.html"&gt;Dr. Steve Levitt&lt;/a&gt;&amp;nbsp;of&amp;nbsp;&lt;a href="http://www.freakonomics.com/"&gt;Freakonomics&lt;/a&gt;&amp;nbsp;fame. &amp;nbsp;Freakonomics is both&amp;nbsp;&lt;a href="http://www.freakonomics.com/blog/"&gt;a brilliant blog&lt;/a&gt;&amp;nbsp;and&amp;nbsp;&lt;a href="http://www.amazon.com/Freakonomics-Economist-Explores-Hidden-Everything/dp/0060731338/ref=sr_1_1?ie=UTF8&amp;amp;qid=1365774766&amp;amp;sr=8-1&amp;amp;keywords=freakonomics"&gt;the number one business book in America&lt;/a&gt;. &amp;nbsp;His insights are well documented in a variety of places, not just in his own channels, but also in places such as&amp;nbsp;&lt;a href="http://www.ted.com/speakers/steven_levitt.html"&gt;TEDtalks&lt;/a&gt;. 
&amp;nbsp;I'm also really enjoying his new website,&amp;nbsp;&lt;a href="https://www.freakonomicsexperiments.com/"&gt;https://www.freakonomicsexperiments.com/&lt;/a&gt;.&lt;/p&gt;&lt;p style="font-size:13.333333969116211px;line-height:18.99305534362793px;"&gt;Steve presented an outstanding keynote, full of funny anecdotes and insights into the world of data analytics and interpretation. A couple of his comments really resonated with me and are worth repeating. In one story, he pointed out how some of the greatest insights came from corporate data which was collected incidentally or coincidentally. The greatest and most valuable revelations came from data that was basically a corporate afterthought. &amp;nbsp;Another revelation - he's only now starting to make much use of relational databases. &amp;nbsp;He primarily uses spreadsheets, flat files, and the&amp;nbsp;&lt;a href="http://www.stata.com/"&gt;Stata&lt;/a&gt;&amp;nbsp;statistical analysis tool. &amp;nbsp;Another insight, which I've known and&amp;nbsp;proselytized&amp;nbsp;as "the Fresh Pair of Eyes" approach, is that it really helps him to gain insights into a problem by knowing as little about the problem as possible. &amp;nbsp;As it turns out, if you know the industry or the challenge at the core of the problem, you make a lot of assumptions that limit your means of interpreting data. &amp;nbsp;By knowing nothing or next to nothing about a particular problem, you can ask the questions that insiders never ask. &amp;nbsp;Here's an example (not from the keynote though) - let's say you're an energy company CEO. &amp;nbsp;You might spend a lot of time thinking about how to accommodate the expected huge increase in energy consumption due to lots of people driving electric cars. &amp;nbsp;You might tell your data analysts to figure out when and how to ensure peak electrical capacity is available at the times when consumers are recharging their electric vehicles. 
&amp;nbsp;But a fresh pair of eyes would point out that electric cars, in their present form, are a&amp;nbsp;&lt;a href="http://www.caranddriver.com/features/decade-in-review-electric-cars"&gt;huge energy boondoggle&lt;/a&gt;&amp;nbsp;compared to hybrid and plain ol' cheap, high-mileage models like the Honda Civic. &amp;nbsp;Consumers will never recoup their investment in a high-priced, all-electric car compared to a cheap, gas-sipping model.&lt;/p&gt;&lt;p style="font-size:13.333333969116211px;line-height:18.99305534362793px;"&gt;&lt;img class="size-medium wp-image-5629 alignright" alt="IMG_0287 - Copy" width="300" height="164" style="border:0px;cursor:default;float:right;" src="http://kevinekline.com/wp-content/uploads/2013/04/IMG_0287-Copy-300x164.jpg"&gt;At the heart of his presentation is the fact that data is meaningless when it doesn't answer important questions! &amp;nbsp;Many times, data professionals spend so much time devising elegant SQL statements and clever user interfaces that they forget about using a fresh pair of eyes when they look at business questions. &amp;nbsp;Our session,&amp;nbsp;&lt;em&gt;Operational Excellence for the BI Pro,&amp;nbsp;&lt;/em&gt;focused on the trials and travails of successfully implementing and growing the footprint of a business intelligence project.&lt;/p&gt;&lt;p style="font-size:13.333333969116211px;line-height:18.99305534362793px;"&gt;In addition, we had a fun and very informative panel discussion breakfast on Thursday of the PASS BAC. At right is a picture of Nick Harshbarger, Justin Randal, and me prior to the session. &amp;nbsp;The audience was very engaged and, despite having no slides, there was a whole lot of wisdom goin' on. 
&amp;nbsp;The panel included Chris Webb&amp;nbsp;(&lt;a href="https://twitter.com/#!/Technitrain"&gt;Twitter&lt;/a&gt;&amp;nbsp;|&amp;nbsp;&lt;a href="http://cwebbbi.spaces.live.com/feed.rss"&gt;Blog&lt;/a&gt;),&amp;nbsp;Craig Utley,&amp;nbsp;Jen Stirrup&amp;nbsp;(&lt;a href="https://twitter.com/#!/jenstirrup"&gt;Twitter&lt;/a&gt;&amp;nbsp;|&amp;nbsp;&lt;a href="http://www.jenstirrup.com/"&gt;Blog&lt;/a&gt;), Paul Turley (&lt;a target="_blank" href="http://sqlserverbiblog.com/"&gt;Blog&lt;/a&gt;), &amp;nbsp;and Stacia Misner&amp;nbsp;(&lt;a href="https://twitter.com/#!/StaciaMisner"&gt;Twitter&lt;/a&gt;&amp;nbsp;|&amp;nbsp;&lt;a href="http://blog.datainspirations.com/"&gt;Blog&lt;/a&gt;). I served as the moderator and facilitator of the session. &amp;nbsp;We recorded the session, with a little HD Flip camera, and although I haven't checked out the file yet, we're hopeful we can post it or at least a transcript soon.&lt;/p&gt;&lt;p style="font-size:13.333333969116211px;line-height:18.99305534362793px;"&gt;Do you have a "fresh eyes" story? I'd love to hear it! 
&amp;nbsp;Post a comment here!&lt;/p&gt;&lt;p style="font-size:13.333333969116211px;line-height:18.99305534362793px;"&gt;Many thanks,&lt;/p&gt;&lt;p style="font-size:13.333333969116211px;line-height:18.99305534362793px;"&gt;-Kevin&lt;/p&gt;&lt;p style="font-size:13.333333969116211px;line-height:18.99305534362793px;"&gt;&lt;a href="http://twitter.com/kekline"&gt;-Follow me on Twitter!&lt;/a&gt;&lt;br&gt;&lt;a href="https://plus.google.com/u/1/113032055249023350257?rel=author"&gt;- Google Author&lt;/a&gt;&lt;/p&gt;&lt;p&gt;&lt;/p&gt;</description></item><item><title>[New England] Mark Souza on Big Data and Cloud at NESQL</title><link>http://www2.sqlblog.com/blogs/adam_machanic/archive/2013/01/13/new-england-mark-souza-on-big-data-and-cloud-at-nesql.aspx</link><pubDate>Sun, 13 Jan 2013 18:33:00 GMT</pubDate><guid isPermaLink="false">21093a07-8b3d-42db-8cbf-3350fcbf5496:47138</guid><dc:creator>Adam Machanic</dc:creator><description>&lt;p&gt;This Thursday, January 17, at &lt;a href="http://nesql.org/"&gt;New England SQL Server&lt;/a&gt; we'll be featuring &lt;a href="http://twitter.com/mark_sqlcat"&gt;Mark Souza&lt;/a&gt;, General Manager of the Data Platform Group at Microsoft. Most of you are probably familiar with that name; &lt;b&gt;he's the guy who founded the CAT team&lt;/b&gt;, and he's been in a number of key roles in the SQL Server organization for the past several years. &lt;/p&gt;&lt;p&gt;Mark's topic for Thursday is &lt;b&gt;Big Data and Cloud at Microsoft&lt;/b&gt;. The talk should be an interesting look into how the company is approaching these key areas.&lt;/p&gt;&lt;p&gt;If you're in New England and aren't already a member of New England SQL Server, &lt;a href="http://www.nesql.org/SignUp/tabid/100/Default.aspx"&gt;you can sign up here&lt;/a&gt;. &lt;b&gt;RSVP is required&lt;/b&gt; to attend our meetings, and we will send out the invitation e-mail on Tuesday morning.&lt;/p&gt;&lt;p&gt;Hope to see many of you there! 
&lt;br&gt;&lt;/p&gt;</description></item><item><title>Using Hadoop (HDInsight) with Microsoft - Two (OK, Three) Options</title><link>http://www2.sqlblog.com/blogs/buck_woody/archive/2012/12/04/using-hadooop-hdinsight-with-microsoft-two-ok-three-options.aspx</link><pubDate>Tue, 04 Dec 2012 15:28:23 GMT</pubDate><guid isPermaLink="false">21093a07-8b3d-42db-8cbf-3350fcbf5496:46509</guid><dc:creator>BuckWoody</dc:creator><description>&lt;p&gt;Microsoft has many tools for &amp;ldquo;Big Data&amp;rdquo;. In fact, you need many tools &amp;ndash; there&amp;rsquo;s no product called &amp;ldquo;Big Data Solution&amp;rdquo; in a shrink-wrapped box &amp;ndash; if you find one, you probably shouldn&amp;rsquo;t buy it. It&amp;rsquo;s tempting to want a single tool that handles everything in a problem domain, but with large, complex data, that isn&amp;rsquo;t a reality. You&amp;rsquo;ll mix and match several systems, open and closed source, to solve a given problem.&lt;/p&gt;
&lt;p&gt;But there are tools that help with handling data at large, complex scales. Normally the best way to do this is to break up the data into parts, and then put the calculation engine for each chunk of data right on the node where that chunk is stored. These systems are in a family called &amp;ldquo;Distributed File and Compute&amp;rdquo;. Microsoft has a couple of these, including the &lt;a href="http://www.microsoft.com/hpc/en/us/default.aspx"&gt;High Performance Computing edition of Windows Server&lt;/a&gt;. Recently we partnered with &lt;a href="http://hortonworks.com/"&gt;Hortonworks&lt;/a&gt; to bring the &lt;a href="http://hadoop.apache.org/"&gt;Apache Foundation&amp;rsquo;s release of Hadoop&lt;/a&gt; to Windows. And as it turns out, there are actually two (technically three) ways you can use it.&lt;/p&gt;
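&lt;p&gt;To make the &amp;ldquo;split the data, compute next to each chunk&amp;rdquo; idea concrete, here&amp;rsquo;s a minimal sketch in Python. It is not how Hadoop itself is implemented; the function names and sample log lines are invented for illustration, and each list simply stands in for the data held on one node. The per-chunk function plays the role of the calculation that runs beside the data, and a final step merges the partial results.&lt;/p&gt;

```python
from collections import Counter
from functools import reduce


def split_into_chunks(records, chunk_count):
    """Deal records round-robin into chunk_count lists (hypothetical stand-ins for data nodes)."""
    return [records[i::chunk_count] for i in range(chunk_count)]


def count_words_on_node(chunk):
    """The compute step that would run next to one chunk of data."""
    counts = Counter()
    for line in chunk:
        counts.update(line.lower().split())
    return counts


def merge_counts(left, right):
    """Combine two partial results into one."""
    return left + right


def distributed_word_count(records, chunk_count=3):
    """Split the records, count words per chunk, then merge the partial counts."""
    chunks = split_into_chunks(records, chunk_count)
    partials = [count_words_on_node(chunk) for chunk in chunks]
    return reduce(merge_counts, partials, Counter())


if __name__ == "__main__":
    # Made-up sample data, only for demonstration.
    log_lines = ["GET /home", "GET /cart", "POST /cart", "GET /home"]
    print(distributed_word_count(log_lines))
```

&lt;p&gt;In a real cluster the per-chunk step would run on the machine that stores that chunk, so only the small partial results need to travel over the network.&lt;/p&gt;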
&lt;p style="padding-left:30px;"&gt;&lt;span style="color:#993300;"&gt;&lt;em&gt;(There&amp;rsquo;s a more detailed set of information here: &lt;a href="http://www.microsoft.com/sqlserver/en/us/solutions-technologies/business-intelligence/big-data.aspx"&gt;&lt;span style="color:#993300;"&gt;http://www.microsoft.com/sqlserver/en/us/solutions-technologies/business-intelligence/big-data.aspx&lt;/span&gt;&lt;/a&gt;, I&amp;rsquo;ll cover the options at a general level below)&amp;nbsp; &lt;/em&gt;&lt;/span&gt;&lt;/p&gt;
&lt;h1&gt;First Option: Windows Azure HDInsight Service&lt;/h1&gt;
&lt;p&gt;Your first option is to simply log on to a Hadoop control node and begin running Pig or Hive statements against data that you have stored in Windows Azure. There&amp;rsquo;s nothing to set up (although you can configure things where needed), and you can send the commands, get the output of the job(s), and stop using the service when you are done &amp;ndash; and repeat the process later if you wish.&lt;/p&gt;
&lt;p&gt;(There are also connectors to run jobs from Microsoft Excel, but that&amp;rsquo;s another post)&lt;/p&gt;
&lt;p&gt;&amp;nbsp;&lt;a href="http://sqlblog.com/cfs-file.ashx/__key/communityserver-blogs-components-weblogfiles/00-00-00-79-79/0572.option_2D00_1.png"&gt;&lt;img src="http://sqlblog.com/resized-image.ashx/__size/550x0/__key/communityserver-blogs-components-weblogfiles/00-00-00-79-79/0572.option_2D00_1.png" alt="" width="367" height="212" border="0" /&gt;&lt;/a&gt;&lt;/p&gt;
&lt;p&gt;This option is useful when you have a periodic burst of work for a Hadoop workload, or when the data is already being collected in Windows Azure storage anyway. That might be data from a web application, the logs from a web application, &lt;a href="http://en.wikipedia.org/wiki/Telemetry"&gt;telemetry&lt;/a&gt; (remote sensor input), or other modes of constant collection.&lt;/p&gt;
&lt;p&gt;You can read more about this option here: &amp;nbsp;&lt;a href="http://sqlblog.com/b/windowsazure/archive/2012/10/24/getting-started-with-windows-azure-hdinsight-service.aspx"&gt;http://blogs.msdn.com/b/windowsazure/archive/2012/10/24/getting-started-with-windows-azure-hdinsight-service.aspx&lt;/a&gt;&lt;/p&gt;
&lt;h1&gt;Second Option: Microsoft HDInsight Server&lt;/h1&gt;
&lt;p&gt;Your second option is to use the Hadoop distribution for on-premises Windows called Microsoft HDInsight Server. You set up the Name Node(s), Job Tracker(s), and Data Node(s), among other components, and you have control over the entire ecosystem.&lt;/p&gt;
&lt;p&gt;&lt;a href="http://sqlblog.com/cfs-file.ashx/__key/communityserver-blogs-components-weblogfiles/00-00-00-79-79/7041.option_2D00_2.png"&gt;&lt;img src="http://sqlblog.com/resized-image.ashx/__size/550x0/__key/communityserver-blogs-components-weblogfiles/00-00-00-79-79/7041.option_2D00_2.png" alt="" width="152" height="179" border="0" /&gt;&lt;/a&gt;&amp;nbsp;&lt;/p&gt;
&lt;p&gt;This option is useful if you want to have complete control over the system, leave it running all the time, or you have a huge quantity of data that you have to bulk-load constantly &amp;ndash; something that isn&amp;rsquo;t going to be practical with a network transfer or disk-mailing scheme.&lt;/p&gt;
&lt;p&gt;You can read more about this option here: &lt;a href="http://www.microsoft.com/sqlserver/en/us/solutions-technologies/business-intelligence/big-data.aspx"&gt;http://www.microsoft.com/sqlserver/en/us/solutions-technologies/business-intelligence/big-data.aspx&lt;/a&gt;&lt;/p&gt;
&lt;h1&gt;Third Option (unsupported): Installation on Windows Azure Virtual Machines&lt;/h1&gt;
&lt;p&gt;Although unsupported, you could simply use a Windows Azure Virtual Machine (we support both Windows and Linux servers) and install Hadoop yourself &amp;ndash; it&amp;rsquo;s open-source, so there&amp;rsquo;s nothing preventing you from doing that.&lt;/p&gt;
&lt;p&gt;&amp;nbsp;&lt;a href="http://sqlblog.com/cfs-file.ashx/__key/communityserver-blogs-components-weblogfiles/00-00-00-79-79/0121.option_2D00_3.png"&gt;&lt;img src="http://sqlblog.com/resized-image.ashx/__size/550x0/__key/communityserver-blogs-components-weblogfiles/00-00-00-79-79/0121.option_2D00_3.png" alt="" width="326" height="188" border="0" /&gt;&lt;/a&gt;&lt;/p&gt;
&lt;p&gt;Aside from being unsupported, there are other issues you&amp;rsquo;ll run into with this approach &amp;ndash; primarily involving performance and the amount of configuration you&amp;rsquo;ll need to do to access the data nodes properly. But for a single-node installation (where all components run on one system) used for learning, demos, training and the like, this isn&amp;rsquo;t a bad option.&lt;/p&gt;
&lt;p&gt;Did I mention that&amp;rsquo;s unsupported? :) &lt;/p&gt;
&lt;p&gt;You can learn more about Windows Azure Virtual Machines here: &lt;a href="http://www.windowsazure.com/en-us/home/scenarios/virtual-machines/"&gt;http://www.windowsazure.com/en-us/home/scenarios/virtual-machines/&lt;/a&gt;&lt;/p&gt;
&lt;p&gt;And more about Hadoop and the installation/configuration (on Linux) here: &lt;a href="http://en.wikipedia.org/wiki/Apache_Hadoop"&gt;http://en.wikipedia.org/wiki/Apache_Hadoop&lt;/a&gt;&lt;/p&gt;
&lt;p&gt;And more about the HDInsight installation here: &lt;a href="http://www.microsoft.com/web/gallery/install.aspx?appid=HDINSIGHT-PREVIEW"&gt;http://www.microsoft.com/web/gallery/install.aspx?appid=HDINSIGHT-PREVIEW&lt;/a&gt;&lt;/p&gt;
&lt;h1&gt;Choosing the right option&lt;/h1&gt;
&lt;p&gt;Since you have two or three routes you can go, the best thing to do is evaluate the need you have, and place the workload where it makes the most sense.&amp;nbsp; My suggestion is to install the HDInsight Server locally on a test system, and play around with it. Read up on the best ways to use Hadoop for a given workload, understand the parts, write a little Pig and Hive, and get your feet wet. Then sign up for a test account on HDInsight Service, and see how that leverages what you know. If you're a true tinkerer, go ahead and try the VM route as well. &lt;/p&gt;
&lt;p&gt;Oh - there&amp;rsquo;s another great reference on the Windows Azure HDInsight that just came out, here: &lt;a href="http://sqlblog.com/b/brunoterkaly/archive/2012/11/16/hadoop-on-azure-introduction.aspx"&gt;http://blogs.msdn.com/b/brunoterkaly/archive/2012/11/16/hadoop-on-azure-introduction.aspx&lt;/a&gt; &amp;nbsp;&lt;/p&gt;</description></item><item><title>Big Data Learning Resources</title><link>http://www2.sqlblog.com/blogs/lara_rubbelke/archive/2012/09/10/big-data-learning-resources.aspx</link><pubDate>Mon, 10 Sep 2012 21:38:00 GMT</pubDate><guid isPermaLink="false">21093a07-8b3d-42db-8cbf-3350fcbf5496:45124</guid><dc:creator>Lara Rubbelke</dc:creator><description>&lt;p&gt;I have recently had several requests from people asking for resources to learn about Big Data and Hadoop.&amp;nbsp; Below is a list of resources that I typically recommend.&amp;nbsp; I'll update this list as I find more resources.&amp;nbsp; Let's crowdsource this... Tell me your favorite resources and I'll get them on the list!&lt;/p&gt;&lt;p&gt;&amp;nbsp;&lt;/p&gt;&lt;p&gt;&lt;font face="Times New Roman" size="3"&gt;

&lt;/font&gt;&lt;/p&gt;&lt;p class="MsoNormal" style="margin:0in 0in 0pt;"&gt;&lt;b&gt;&lt;span&gt;&lt;font size="3"&gt;&lt;font face="Calibri"&gt;Books and Whitepapers&lt;o:p&gt;&lt;/o:p&gt;&lt;/font&gt;&lt;/font&gt;&lt;/span&gt;&lt;/b&gt;&lt;/p&gt;&lt;p&gt;&lt;font face="Times New Roman" size="3"&gt;

&lt;/font&gt;&lt;/p&gt;&lt;p class="MsoNormal" style="margin:0in 0in 0pt 0.5in;"&gt;&lt;span&gt;&lt;font face="Calibri" size="3"&gt;&lt;a href="http://oreil.ly/y579n3" target="_blank"&gt;Planning
for Big Data Free e-book&lt;/a&gt;&lt;/font&gt;&lt;/span&gt;&lt;/p&gt;&lt;p&gt;&lt;font face="Times New Roman" size="3"&gt;

&lt;/font&gt;&lt;/p&gt;&lt;p class="MsoNormal" style="margin:0in 0in 0pt 1in;"&gt;&lt;i&gt;&lt;span&gt;&lt;font size="3"&gt;&lt;font face="Calibri"&gt;Great
primer on the general Big Data space.&amp;nbsp; This is always my recommendation
for people who are new to Big Data and are trying to understand it.&lt;o:p&gt;&lt;/o:p&gt;&lt;/font&gt;&lt;/font&gt;&lt;/span&gt;&lt;/i&gt;&lt;/p&gt;&lt;p&gt;&lt;font face="Times New Roman" size="3"&gt;

&lt;/font&gt;&lt;/p&gt;&lt;p class="MsoNormal" style="margin:0in 0in 0pt 0.5in;"&gt;&lt;span&gt;&lt;font size="3"&gt;&lt;font face="Calibri"&gt;&lt;a href="http://www.barnesandnoble.com/w/hadoop-tom-white/1015558328" target="_blank"&gt;Hadoop:
The Definitive Guide&lt;/a&gt; by Tom White&amp;nbsp; &lt;o:p&gt;&lt;/o:p&gt;&lt;/font&gt;&lt;/font&gt;&lt;/span&gt;&lt;/p&gt;&lt;p&gt;&lt;font face="Times New Roman" size="3"&gt;

&lt;/font&gt;&lt;/p&gt;&lt;p class="MsoNormal" style="margin:0in 0in 0pt 1in;"&gt;&lt;i&gt;&lt;span&gt;&lt;font size="3"&gt;&lt;font face="Calibri"&gt;This
will dive deep under the hood of Hadoop.&amp;nbsp; This should not be a first book
for someone who is just starting with Hadoop, Map Reduce or Big Data.&amp;nbsp;
Make sure you don’t get the first edition.&amp;nbsp; The third edition is the best
as it also dedicates a chapter to HBase, Hive, and other tools in the ecosystem
that are important to understand.&lt;o:p&gt;&lt;/o:p&gt;&lt;/font&gt;&lt;/font&gt;&lt;/span&gt;&lt;/i&gt;&lt;/p&gt;&lt;p&gt;&lt;font face="Times New Roman" size="3"&gt;

&lt;/font&gt;&lt;/p&gt;&lt;p class="MsoNormal" style="margin:0in 0in 0pt 0.5in;"&gt;&lt;span&gt;&lt;font size="3"&gt;&lt;font face="Calibri"&gt;&lt;a href="http://www.barnesandnoble.com/w/programming-pig-alan-gates/1102508834?ean=9781449302641" target="_blank"&gt;Programming
Pig&lt;/a&gt; by Alan Gates&lt;o:p&gt;&lt;/o:p&gt;&lt;/font&gt;&lt;/font&gt;&lt;/span&gt;&lt;/p&gt;&lt;p&gt;&lt;font face="Times New Roman" size="3"&gt;

&lt;/font&gt;&lt;/p&gt;&lt;p class="MsoNormal" style="margin:0in 0in 0pt 1in;"&gt;&lt;i&gt;&lt;span&gt;&lt;font size="3"&gt;&lt;font face="Calibri"&gt;Great
(and entertaining) book about Pig.&amp;nbsp; The first chapter is a really good
primer on Hadoop.&lt;o:p&gt;&lt;/o:p&gt;&lt;/font&gt;&lt;/font&gt;&lt;/span&gt;&lt;/i&gt;&lt;/p&gt;&lt;p&gt;&lt;font face="Times New Roman" size="3"&gt;

&lt;/font&gt;&lt;/p&gt;&lt;p class="MsoNormal" style="margin:0in 0in 0pt 0.5in;"&gt;&lt;span&gt;&lt;font size="3"&gt;&lt;font face="Calibri"&gt;&lt;a href="http://www.barnesandnoble.com/w/programming-hive-jason-rutherglen/1112590470?ean=9781449319335" target="_blank"&gt;Programming
Hive&lt;/a&gt; By Edward Capriolo, Dean Wampler, Jason Rutherglen (est publication date
10/9/2012)&lt;o:p&gt;&lt;/o:p&gt;&lt;/font&gt;&lt;/font&gt;&lt;/span&gt;&lt;/p&gt;&lt;p&gt;&lt;font face="Times New Roman" size="3"&gt;

&lt;/font&gt;&lt;/p&gt;&lt;p class="MsoNormal" style="margin:0in 0in 0pt 0.5in;text-indent:0.5in;"&gt;&lt;i style="mso-bidi-font-style:normal;"&gt;&lt;span&gt;&lt;font size="3"&gt;&lt;font face="Calibri"&gt;Nothing to say
about this book yet – it isn’t yet released.&lt;span style="mso-spacerun:yes;"&gt;&amp;nbsp;
&lt;/span&gt;I will add a quick blurb when I have a chance to read it.&lt;o:p&gt;&lt;/o:p&gt;&lt;/font&gt;&lt;/font&gt;&lt;/span&gt;&lt;/i&gt;&lt;/p&gt;&lt;p&gt;&lt;font face="Times New Roman" size="3"&gt;

&lt;/font&gt;&lt;/p&gt;&lt;p class="MsoNormal" style="margin:0in 0in 0pt 0.5in;"&gt;&lt;span&gt;&lt;font face="Calibri" size="3"&gt;&lt;a href="http://cacm.acm.org/magazines/2011/6/108666-if-you-have-too-much-data-then-good-enough-is-good-enough/abstract" target="_blank"&gt;“If You
Have Too Much Data, then ‘Good Enough’ Is Good Enough”&lt;/a&gt; by Pat Helland &lt;/font&gt;&lt;/span&gt;&lt;/p&gt;&lt;p&gt;&lt;font face="Times New Roman" size="3"&gt;

&lt;/font&gt;&lt;/p&gt;&lt;p class="MsoNormal" style="margin:0in 0in 0pt 1in;"&gt;&lt;i&gt;&lt;span&gt;&lt;font size="3"&gt;&lt;font face="Calibri"&gt;Great
whitepaper to discuss the tenets behind distributed systems.&lt;o:p&gt;&lt;/o:p&gt;&lt;/font&gt;&lt;/font&gt;&lt;/span&gt;&lt;/i&gt;&lt;/p&gt;&lt;p&gt;&lt;font face="Times New Roman" size="3"&gt;

&lt;/font&gt;&lt;/p&gt;&lt;p class="MsoNormal" style="margin:0in 0in 0pt 0.5in;"&gt;&lt;span&gt;&lt;o:p&gt;&lt;font face="Calibri" size="3"&gt;&amp;nbsp;&lt;/font&gt;&lt;/o:p&gt;&lt;/span&gt;&lt;/p&gt;&lt;p&gt;&lt;font face="Times New Roman" size="3"&gt;

&lt;/font&gt;&lt;/p&gt;&lt;p class="MsoNormal" style="margin:0in 0in 0pt;"&gt;&lt;b&gt;&lt;span&gt;&lt;font size="3"&gt;&lt;font face="Calibri"&gt;Websites&lt;o:p&gt;&lt;/o:p&gt;&lt;/font&gt;&lt;/font&gt;&lt;/span&gt;&lt;/b&gt;&lt;/p&gt;&lt;p&gt;&lt;font face="Times New Roman" size="3"&gt;

&lt;/font&gt;&lt;/p&gt;&lt;p class="MsoNormal" style="margin:0in 0in 0pt 0.5in;"&gt;&lt;span&gt;&lt;font face="Calibri" size="3"&gt;Apache
Hadoop: &lt;/font&gt;&lt;a href="http://hadoop.apache.org/" target="_blank"&gt;&lt;font color="#0000ff" face="Calibri" size="3"&gt;http://hadoop.apache.org/&lt;/font&gt;&lt;/a&gt;&lt;font size="3"&gt;&lt;font face="Calibri"&gt; &lt;o:p&gt;&lt;/o:p&gt;&lt;/font&gt;&lt;/font&gt;&lt;/span&gt;&lt;/p&gt;&lt;p&gt;&lt;font face="Times New Roman" size="3"&gt;

&lt;/font&gt;&lt;/p&gt;&lt;p class="MsoNormal" style="margin:0in 0in 0pt 0.5in;"&gt;&lt;span&gt;&lt;font face="Calibri" size="3"&gt;Microsoft
Big Data Solution: &lt;/font&gt;&lt;font color="#0000ff" face="Calibri" size="3"&gt;www.microsoft.com/bigdata&lt;/font&gt;&lt;font size="3"&gt;&lt;font face="Calibri"&gt;
&lt;o:p&gt;&lt;/o:p&gt;&lt;/font&gt;&lt;/font&gt;&lt;/span&gt;&lt;/p&gt;&lt;p&gt;&lt;font face="Times New Roman" size="3"&gt;

&lt;/font&gt;&lt;/p&gt;&lt;p class="MsoNormal" style="margin:0in 0in 0pt 0.5in;"&gt;&lt;span&gt;&lt;font face="Calibri" size="3"&gt;Windows
Azure: &lt;/font&gt;&lt;font color="#0000ff" face="Calibri" size="3"&gt;www.windowsazure.com/en-us/home/scenarios/big-data&lt;/font&gt;&lt;font size="3"&gt;&lt;font face="Calibri"&gt;
&lt;o:p&gt;&lt;/o:p&gt;&lt;/font&gt;&lt;/font&gt;&lt;/span&gt;&lt;/p&gt;&lt;p&gt;&lt;font face="Times New Roman" size="3"&gt;

&lt;/font&gt;&lt;/p&gt;&lt;p class="MsoNormal" style="margin:0in 0in 0pt;"&gt;&lt;span&gt;&lt;o:p&gt;&lt;font face="Calibri" size="3"&gt;&amp;nbsp;&lt;/font&gt;&lt;/o:p&gt;&lt;/span&gt;&lt;/p&gt;&lt;p&gt;&lt;font face="Times New Roman" size="3"&gt;

&lt;/font&gt;&lt;/p&gt;&lt;p class="MsoNormal" style="margin:0in 0in 0pt;"&gt;&lt;b&gt;&lt;span&gt;&lt;font size="3"&gt;&lt;font face="Calibri"&gt;Webcasts&lt;o:p&gt;&lt;/o:p&gt;&lt;/font&gt;&lt;/font&gt;&lt;/span&gt;&lt;/b&gt;&lt;/p&gt;&lt;p&gt;&lt;font face="Times New Roman" size="3"&gt;

&lt;/font&gt;&lt;/p&gt;&lt;p class="MsoNormal" style="margin:0in 0in 0pt 0.5in;"&gt;&lt;span&gt;&lt;font face="Calibri" size="3"&gt;Hadoop
Videos on Microsoft TechNet: &lt;/font&gt;&lt;a href="http://social.technet.microsoft.com/wiki/contents/articles/6204.hadoop-based-services-for-windows-en-us.aspx#videos" target="_blank"&gt;&lt;font color="#0000ff" face="Calibri" size="3"&gt;http://social.technet.microsoft.com/wiki/contents/articles/6204.hadoop-based-services-for-windows-en-us.aspx#videos&lt;/font&gt;&lt;/a&gt;&lt;o:p&gt;&lt;/o:p&gt;&lt;/span&gt;&lt;/p&gt;&lt;p&gt;&lt;font face="Times New Roman" size="3"&gt;

&lt;/font&gt;&lt;/p&gt;&lt;p class="MsoNormal" style="margin:0in 0in 0pt 0.5in;"&gt;&lt;span&gt;&lt;font face="Calibri" size="3"&gt;Hortonworks
Video Series: &lt;/font&gt;&lt;a href="http://hortonworks.com/videos/" target="_blank"&gt;&lt;font color="#0000ff" face="Calibri" size="3"&gt;http://hortonworks.com/videos/&lt;/font&gt;&lt;/a&gt;&lt;o:p&gt;&lt;/o:p&gt;&lt;/span&gt;&lt;/p&gt;&lt;p&gt;&lt;font face="Times New Roman" size="3"&gt;

&lt;/font&gt;&lt;/p&gt;&lt;p class="MsoNormal" style="margin:0in 0in 0pt 0.5in;"&gt;&lt;span&gt;&lt;font face="Calibri" size="3"&gt;Cloudera
Video Series: &lt;/font&gt;&lt;a href="http://www.cloudera.com/resource-types/video/" target="_blank"&gt;&lt;font color="#0000ff" face="Calibri" size="3"&gt;http://www.cloudera.com/resource-types/video/&lt;/font&gt;&lt;/a&gt;&lt;/span&gt;&lt;/p&gt;&lt;span&gt;&lt;o:p&gt;&lt;font face="Calibri"&gt;&lt;p class="MsoNormal" style="margin:0in 0in 0pt 0.5in;"&gt;&amp;nbsp;&lt;/p&gt;&lt;span&gt;&lt;p class="MsoNormal" style="margin:0in 0in 0pt 0.5in;"&gt;&amp;nbsp;&lt;/p&gt;&lt;font face="Calibri"&gt;&lt;p class="MsoNormal" style="margin:0in 0in 0pt 0.5in;"&gt;&lt;font color="#000000" face="Times New Roman" size="3"&gt;

&lt;/font&gt;&lt;span&gt;&lt;font size="3"&gt;&lt;a href="http://www.oreillynet.com/pub/e/2290" target="_blank"&gt;Tim
O'Reilly and Dave Campbell Explore How to Accelerate Insights from Data &lt;/a&gt;&lt;/font&gt;&lt;/span&gt;&lt;/p&gt;&lt;/font&gt;&lt;p class="MsoNormal" style="margin:0in 0in 0pt 0.5in;"&gt;&amp;nbsp;&lt;/p&gt;&lt;/span&gt;&lt;p class="MsoNormal" style="margin:0in 0in 0pt 0.5in;"&gt;&lt;span&gt;&lt;font size="3"&gt;&lt;a href="http://borntolearn.mslearn.net/btl/b/bethenext/archive/2012/08/03/what-do-big-data-amp-speeding-tix-have-in-common-guest-judge-denny-lee.aspx" target="_blank"&gt;Denny Lee talks about Big Data&lt;/a&gt;&lt;/font&gt;&lt;/span&gt;&lt;/p&gt;&lt;/font&gt;&lt;/o:p&gt;&lt;/span&gt;&lt;p class="MsoNormal" style="margin:0in 0in 0pt;"&gt;&lt;font face="Times New Roman" size="3"&gt;&lt;/font&gt;&amp;nbsp;&lt;/p&gt;&lt;p class="MsoNormal" style="margin:0in 0in 0pt;"&gt;&lt;b&gt;&lt;span&gt;&lt;font size="3"&gt;&lt;font face="Calibri"&gt;Blogs&lt;o:p&gt;&lt;/o:p&gt;&lt;/font&gt;&lt;/font&gt;&lt;/span&gt;&lt;/b&gt;&lt;/p&gt;&lt;p&gt;&lt;font face="Times New Roman" size="3"&gt;

&lt;/font&gt;&lt;/p&gt;&lt;p class="MsoNormal" style="margin:0in 0in 0pt 0.5in;"&gt;&lt;span&gt;&lt;font face="Calibri" size="3"&gt;Andrew
Brust on ZDNet: &lt;/font&gt;&lt;a href="http://www.zdnet.com/blog/big-data/" target="_blank"&gt;&lt;font color="#0000ff" face="Calibri" size="3"&gt;http://www.zdnet.com/blog/big-data/&lt;/font&gt;&lt;/a&gt;&lt;o:p&gt;&lt;/o:p&gt;&lt;/span&gt;&lt;/p&gt;&lt;p&gt;&lt;font face="Times New Roman" size="3"&gt;

&lt;/font&gt;&lt;/p&gt;&lt;p class="MsoNormal" style="margin:0in 0in 0pt 0.5in;"&gt;&lt;span&gt;&lt;font face="Calibri" size="3"&gt;Denny
Lee: &lt;/font&gt;&lt;a href="http://dennyglee.com/" target="_blank"&gt;&lt;font color="#0000ff" face="Calibri" size="3"&gt;http://dennyglee.com/&lt;/font&gt;&lt;/a&gt;&lt;o:p&gt;&lt;/o:p&gt;&lt;/span&gt;&lt;/p&gt;&lt;p&gt;&lt;font face="Times New Roman" size="3"&gt;

&lt;/font&gt;&lt;/p&gt;&lt;p class="MsoNormal" style="margin:0in 0in 0pt 0.5in;"&gt;&lt;font size="3"&gt;&lt;font face="Calibri"&gt;&lt;span&gt;Carl
Nolan: &lt;/span&gt;&lt;span style="mso-bidi-font-weight:bold;"&gt;&lt;a href="http://blogs.msdn.com/b/carlnol/archive/tags/hadoop+streaming/" target="_blank"&gt;&lt;font color="#0000ff"&gt;http://blogs.msdn.com/b/carlnol/archive/tags/hadoop+streaming/&lt;/font&gt;&lt;/a&gt;
&lt;/span&gt;&lt;/font&gt;&lt;/font&gt;&lt;/p&gt;&lt;p class="MsoNormal" style="margin:0in 0in 0pt 0.5in;"&gt;&lt;font size="3"&gt;&lt;font face="Calibri"&gt;&lt;span style="mso-bidi-font-weight:bold;"&gt;&lt;/span&gt;&lt;/font&gt;&lt;/font&gt;&amp;nbsp;&lt;/p&gt;&lt;p class="MsoNormal" style="margin:0in 0in 0pt 0.5in;"&gt;&lt;font size="3"&gt;&lt;font face="Calibri"&gt;&lt;span style="mso-bidi-font-weight:bold;"&gt;Cindy Gross:&amp;nbsp;&lt;font color="#1f497d"&gt;&lt;a href="http://blogs.msdn.com/b/cindygross/"&gt;http://blogs.msdn.com/b/cindygross/&lt;/a&gt;&lt;/font&gt;&lt;/span&gt;&lt;span&gt;&lt;o:p&gt;&amp;nbsp;&lt;/o:p&gt;&lt;/span&gt;&lt;/font&gt;&lt;/font&gt;&lt;/p&gt;&lt;p&gt;&lt;font face="Times New Roman" size="3"&gt;

&lt;/font&gt;&lt;/p&gt;&lt;p class="MsoNormal" style="margin:0in 0in 0pt 0.5in;"&gt;&lt;span&gt;&lt;font face="Calibri" size="3"&gt;Oakleaf
Blogs (good for Hadoop on Azure): &lt;/font&gt;&lt;a href="http://oakleafblog.blogspot.com/" target="_blank"&gt;&lt;font color="#0000ff" face="Calibri" size="3"&gt;http://oakleafblog.blogspot.com/&lt;/font&gt;&lt;/a&gt;&lt;o:p&gt;&lt;/o:p&gt;&lt;/span&gt;&lt;/p&gt;&lt;p&gt;&lt;font face="Times New Roman" size="3"&gt;

&lt;/font&gt;&lt;/p&gt;&lt;p class="MsoNormal" style="margin:0in 0in 0pt 0.5in;"&gt;&lt;span&gt;&lt;font face="Calibri" size="3"&gt;Buck
Woody: Big Data: A Microsoft Tools Approach &lt;/font&gt;&lt;a href="http://sqlblog.com/blogs/buck_woody/archive/2012/02/20/big-data-a-microsoft-tools-approach.aspx" target="_blank"&gt;&lt;font color="#0000ff" face="Calibri" size="3"&gt;http://sqlblog.com/blogs/buck_woody/archive/2012/02/20/big-data-a-microsoft-tools-approach.aspx&lt;/font&gt;&lt;/a&gt;&lt;font size="3"&gt;&lt;font face="Calibri"&gt;
&lt;o:p&gt;&lt;/o:p&gt;&lt;/font&gt;&lt;/font&gt;&lt;/span&gt;&lt;/p&gt;&lt;p&gt;&lt;font face="Times New Roman" size="3"&gt;

&lt;/font&gt;&lt;/p&gt;&lt;p class="MsoNormal" style="margin:0in 0in 0pt 0.5in;"&gt;&lt;span&gt;&lt;font face="Calibri" size="3"&gt;Forrester
Blogs: &lt;/font&gt;&lt;a href="http://blogs.forrester.com/category/big_data" target="_blank"&gt;&lt;font color="#0000ff" face="Calibri" size="3"&gt;http://blogs.forrester.com/category/big_data&lt;/font&gt;&lt;/a&gt;&lt;font size="3"&gt;&lt;font face="Calibri"&gt;
&lt;o:p&gt;&lt;/o:p&gt;&lt;/font&gt;&lt;/font&gt;&lt;/span&gt;&lt;/p&gt;&lt;p&gt;&lt;font face="Times New Roman" size="3"&gt;

&lt;/font&gt;&lt;/p&gt;&lt;p class="MsoNormal" style="margin:0in 0in 0pt;"&gt;&lt;b&gt;&lt;span&gt;&lt;o:p&gt;&lt;font face="Calibri" size="3"&gt;&amp;nbsp;&lt;/font&gt;&lt;/o:p&gt;&lt;/span&gt;&lt;/b&gt;&lt;/p&gt;&lt;p&gt;&lt;font face="Times New Roman" size="3"&gt;

&lt;/font&gt;&lt;/p&gt;&lt;p class="MsoNormal" style="margin:0in 0in 0pt;"&gt;&lt;b&gt;&lt;span&gt;&lt;font size="3"&gt;&lt;font face="Calibri"&gt;Try Now&lt;o:p&gt;&lt;/o:p&gt;&lt;/font&gt;&lt;/font&gt;&lt;/span&gt;&lt;/b&gt;&lt;/p&gt;&lt;p&gt;&lt;font face="Times New Roman" size="3"&gt;

&lt;/font&gt;&lt;/p&gt;&lt;p class="MsoNormal" style="margin:0in 0in 0pt;text-indent:0.5in;"&gt;&lt;span&gt;&lt;font face="Calibri" size="3"&gt;Preview
of the Hadoop-based service for Windows Azure: &lt;/font&gt;&lt;a href="https://www.hadooponazure.com/"&gt;&lt;font color="#0000ff" face="Calibri" size="3"&gt;https://www.hadooponazure.com&lt;/font&gt;&lt;/a&gt;&lt;font size="3"&gt;&lt;font face="Calibri"&gt;&amp;nbsp;&amp;nbsp;
&lt;o:p&gt;&lt;/o:p&gt;&lt;/font&gt;&lt;/font&gt;&lt;/span&gt;&lt;/p&gt;&lt;p&gt;&lt;font face="Times New Roman" size="3"&gt;
&lt;/font&gt;&lt;/p&gt;</description></item><item><title>Hadoop growing pains</title><link>http://www2.sqlblog.com/blogs/piotr_rodak/archive/2012/06/21/hadoop-growing-pains.aspx</link><pubDate>Thu, 21 Jun 2012 20:45:56 GMT</pubDate><guid isPermaLink="false">21093a07-8b3d-42db-8cbf-3350fcbf5496:44001</guid><dc:creator>rodak.p@gmail.com</dc:creator><description>&lt;p&gt;This post is not going to be about SQL Server. Recently I have been reading more and more about “Big Data” – a very catchy term that describes the untamed growth of the data that mankind produces each day and the struggle to capture the meaning of that data. Ten years ago, and perhaps even three years ago, this need was not so widely recognized. The increasing number of smartphones, and the discernible trend of mainstream Internet traffic shifting to smartphone-generated traffic, mean that there is a bigger and bigger stream of information that has to be stored, transformed, analysed and perhaps monetized. The nature of this traffic makes it very difficult to fit within the boundaries of relational database engines. The amount of data makes it nearly impossible to process in relational databases within a reasonable time. This is where ‘cloud’ technologies come into play.&lt;/p&gt;  &lt;p&gt;I just read a good article about the &lt;a href="http://ovum.com/2012/06/21/tooling-is-starting-to-tame-hadoop/" target="_blank"&gt;growing pains of Hadoop&lt;/a&gt;, which has become one of the leading players in the distributed processing arena within the last year or two. Tony Baer concludes in it that the lack of enterprise-ready toolsets hinders Hadoop’s adoption in the enterprise world. While this is true, something else drew my attention. According to the article &lt;u&gt;there are already about half a dozen commercially supported distributions of Hadoop&lt;/u&gt;. For me, as someone who has not been involved in the intricacies of the open-source world, this is quite an interesting observation. 
On one hand, it is good that there is competition, as in the end it benefits the customer. On the other hand, the customer is faced with the difficulty of choosing the right distribution. In the future, when Hadoop distributions fork even more, this choice will be even harder. The distributions will have overlapping sets of features, yet will be quite incompatible with each other. I suppose it will take a few years until leaders emerge and the market begins to resemble what we see in the Linux world. There are myriads of distributions, but only a few are acknowledged by the industry as the enterprise standard. Others are honed by bearded individuals with too much time to spend.&lt;/p&gt;  &lt;p&gt;In any case, the third fact I can’t help but notice about the proliferation of distributions of Hadoop is that IT professionals will have jobs.&lt;/p&gt;  &lt;p&gt;&amp;#160;&lt;/p&gt;  &lt;p&gt;   &lt;div style="padding-bottom:0px;margin:0px;padding-left:0px;padding-right:0px;display:inline;float:none;padding-top:0px;" id="scid:0767317B-992E-4b12-91E0-4F059A8CECA8:1c504cf7-cd4a-41b6-a76e-896b7e8a80ea" class="wlWriterEditableSmartContent"&gt;BuzzNet Tags: &lt;a href="http://www.buzznet.com/tags/Hadoop" rel="tag"&gt;Hadoop&lt;/a&gt;,&lt;a href="http://www.buzznet.com/tags/Big+Data" rel="tag"&gt;Big Data&lt;/a&gt;,&lt;a href="http://www.buzznet.com/tags/Enterprise+IT" rel="tag"&gt;Enterprise IT&lt;/a&gt;&lt;/div&gt;&lt;/p&gt;</description></item><item><title>Two TechNet Radio Sessions You Don't Want to Miss</title><link>http://www2.sqlblog.com/blogs/kevin_kline/archive/2012/06/20/two-technet-radio-sessions-you-don-t-want-to-miss.aspx</link><pubDate>Wed, 20 Jun 2012 20:12:00 GMT</pubDate><guid isPermaLink="false">21093a07-8b3d-42db-8cbf-3350fcbf5496:43983</guid><dc:creator>KKline</dc:creator><description>&lt;p&gt;I was recently honored to speak on TechNet Radio in two separate sessions about BigData &amp;amp; Hadoop and cloud databases (specifically SQL Azure).  
The show debuted on the &lt;a title="TechNet Edge" href="http://technet.microsoft.com/en-us"&gt;TechNet homepage&lt;/a&gt; under “Today’s News” and on the &lt;a title="TechNet Edge" href="http://technet.microsoft.com/en-US/edge/default"&gt;TechNet Edge homepage&lt;/a&gt;.  In each of these shows, I did what I like to do for all the parties I attend - bring a friend.  To make my life easier, I simply reposted the verbiage that TechNet used, rather than write my own.&lt;/p&gt;&lt;h2&gt;About the BigData/Hadoop video:&lt;/h2&gt;&lt;p&gt;Microsoft SQL Server MVP Kevin Kline and Vice President of Database Development at Quest Software Guy Harrison (&lt;a title="Guy Harrison's Blog" href="http://www.guyharrison.net/"&gt;blog &lt;/a&gt;| &lt;a title="Guy Harrison's Twitter Feed" href="http://twitter.com/guyharrison"&gt;twitter&lt;/a&gt;) join us for today’s episode, where we discuss Big Data and Hadoop: what it is, why it’s important, and what role it plays in cloud computing.  &lt;/p&gt;&lt;p&gt;&lt;strong&gt;Watch the video: &lt;a title="Microsoft TechNet with Kevin Kline and Guy Harrison" href="http://technet.microsoft.com/en-us/edge/technet-radio-community-corner-microsoft-mvp-kevin-kline-and-guy-harrison-on-big-data-and-hadoop.aspx"&gt;HERE&lt;/a&gt; &lt;/strong&gt; &lt;/p&gt;&lt;p&gt;Use the following short link to share the word on Twitter, Facebook, and LinkedIn: &lt;a href="http://bit.ly/In8uu8"&gt;http://bit.ly/In8uu8&lt;/a&gt; &lt;/p&gt;&lt;h2&gt;About the SQL Azure video: &lt;/h2&gt;&lt;p&gt;Microsoft SQL Server MVP Kevin Kline is back and brings with him Director of Development at Quest Software, Patrick O’Keefe. Tune in as they chat about the latest enhancements of SQL Server 2012, SQL Azure, as well as Project Lucy – a unique data analytics service in the cloud which offers insight on system and data performance through analytical presentations.  
&lt;/p&gt;&lt;p&gt;&lt;strong&gt;Watch the video: &lt;a title="Microsoft TechNet with Kevin Kline and Patrick O'Keefe" href="http://technet.microsoft.com/en-us/edge/technet-radio-community-corner-microsoft-mvp-kevin-kline-and-patrick-o-keefe-on-sql-server-2012-and-project-lucy.aspx"&gt;HERE  &lt;/a&gt;&lt;/strong&gt;&lt;/p&gt;&lt;p&gt;Use the following short link to share the word on Twitter, Facebook, and LinkedIn: &lt;a href="http://bit.ly/Hypc6z"&gt;http://bit.ly/Hypc6z&lt;/a&gt;&lt;/p&gt;</description></item><item><title>PASS Summit 2011, Day 3 - A Tribute to Wayne Snyder</title><link>http://www2.sqlblog.com/blogs/kevin_kline/archive/2011/10/14/pass-summit-2011-day-3-a-tribute-to-wayne-snyder.aspx</link><pubDate>Fri, 14 Oct 2011 15:51:00 GMT</pubDate><guid isPermaLink="false">21093a07-8b3d-42db-8cbf-3350fcbf5496:39047</guid><dc:creator>KKline</dc:creator><description>&lt;p style="text-align:center;"&gt;&lt;a href="http://www.microsoft.com/presspass/events/sqlpass/images/0671_low.jpg"&gt;&lt;img class="aligncenter" title="Wayne Snyder" alt="" src="http://www.microsoft.com/presspass/events/sqlpass/images/0671_low.jpg" width="373" height="557"&gt;&lt;br&gt; &lt;/a&gt;First things first, Wayne Snyder is rolling off the board of directors for PASS this year.  We'd worked together, shoulder to shoulder along with Joe Webb (&lt;a href="http://webbtechsolutions.com/" target="_blank"&gt;blog&lt;/a&gt; | @&lt;a href="http://twitter.com/joewebb" target="_blank"&gt;joewebb&lt;/a&gt;) and other outstanding members of the SQL Server community, for many years on the PASS board of directors, and I'm certain that my tenure on the board and as president of the organization would've been nothing but trouble had Wayne not been there, covering my blind side(s), at every turn.  Here's my tribute to Wayne Snyder:&lt;/p&gt;&lt;p style="padding-left:30px;"&gt;&lt;span&gt;If you were to mention “Wayne Snyder” to me, I’d instantly start to grin and, probably, nod a little bit.  
Wayne is the kind of leader who always comes to mind with overpowering and emotional warmth.  Sometimes when you visualize a memory of a person, you see them in your mind’s eye stooped over a console deep in thought or pontificating at a meeting somewhere deep in corporate America.  But when I recall Wayne, I always see an image of Wayne smiling with his arms out wide as if he’s going to wrap you in the biggest, most comforting, Southern-fried, big brother  hug you’ve had all year.  And that image is loaded with all kinds of deep positive connotations: supportive, enthusiastic, sincere, and ready to offer you thoughtful conversation, honest convictions, and straight answers. &lt;/span&gt;&lt;/p&gt;&lt;p style="padding-left:30px;"&gt;&lt;span&gt;To use an analogy, some leaders are only the “thermostat” of their organization – they set the temperature for everyone else.  But Wayne was the “thermometer” as well – he showed the temperature at which our organization was running.  And that temperature is &lt;em&gt;warm&lt;/em&gt;. As a PASS member, you knew within a heartbeat that it was ok to give a shout-out back to the speaker in a crowded auditorium, that there were no stupid questions, that it was ok to be the one who knew the least in the room because, in fact, &lt;em&gt;he &lt;/em&gt;was the guy who knew the least in the room once and here he was to help you become the one who knew the &lt;em&gt;most&lt;/em&gt; in the room! I honestly can’t count the number of people who Wayne recruited into the ranks of PASS simply by being Wayne.&lt;/span&gt;&lt;/p&gt;&lt;p style="padding-left:30px;"&gt;&lt;span&gt;Thank you, Wayne, for your many years of service to our community.  And thank you most of all for acting as the wellspring of our community’s exuberant, uplifting, and just plain fun attitude embodied in our motto of “Learn. Grow. Share”.  No one does it better than you.&lt;/span&gt;&lt;/p&gt;&lt;p&gt;Now, it goes without saying that Dr. 
Dewitt's keynote is one of the single most anticipated sessions of the entire event.  Why?  As Dr. Dewitt mentions himself, the hallmark of his sessions is a semester of graduate school IT learning distilled into one hour of awesomeness.  There are lots of great resources discussing NoSQL on the internet (and I've pointed out a lot of them in the past).  But who wouldn't rather leapfrog months of on-the-side research learning about NoSQL by enjoying Dr. Dewitt's keynote?  Watch the streaming video at this &lt;a title="Livestreaming SQLPASS keynote" href="http://www.sqlpass.org/summit/2011/Live/LiveStreaming/LiveStreamingFriday.aspx" target="_blank"&gt;SQLPASS link&lt;/a&gt;.&lt;/p&gt;&lt;p&gt;And if you're here at the PASS Summit on Day 3, I hope to see you in my two sessions this afternoon:&lt;/p&gt;&lt;p style="padding-left:30px;"&gt;&lt;strong&gt;&lt;a href="http://www.sqlpass.org/summit/2011/Speakers/CallForSpeakers/SessionDetail.aspx?sid=2006" target="_blank"&gt;Crash! Boom! Bang! 10 Ways to Blow Up Castle SQL Server and the Techniques that Catch Them&lt;/a&gt;&lt;/strong&gt; (DBA-318)&lt;br&gt; &lt;em&gt;Enterprise Database, Administration and Deployment, &lt;/em&gt;Regular Session (75 minutes) in 3AB&lt;/p&gt;&lt;p style="padding-left:30px;"&gt;&lt;strong&gt;&lt;a href="http://www.sqlpass.org/summit/2011/Speakers/CallForSpeakers/SessionDetail.aspx?sid=1509" target="_blank"&gt;Are you a Linchpin? Career management lessons to help you become indispensable. &lt;/a&gt;&lt;/strong&gt; (PD-200)&lt;br&gt; &lt;em&gt;Professional Development, &lt;/em&gt;Regular Session (75 minutes) in 4C4&lt;/p&gt;&lt;p&gt; Follow me on &lt;a title="C'mon. You know you want to." 
href="http://twitter.com/kekline" target="_blank"&gt;Twitter&lt;/a&gt;!&lt;/p&gt;</description></item><item><title>PASS Summit 2011, Day 1 </title><link>http://www2.sqlblog.com/blogs/kevin_kline/archive/2011/10/13/pass-summit-2011-day-1.aspx</link><pubDate>Thu, 13 Oct 2011 13:40:00 GMT</pubDate><guid isPermaLink="false">21093a07-8b3d-42db-8cbf-3350fcbf5496:39029</guid><dc:creator>KKline</dc:creator><description>I've already had a few good days in Seattle/Redmond this week, meeting with the Microsoft SQL Server program teams and with other Microsoft SQL Server MVPs.  I was as excited as a squealing Justin Bieber fangirl waiting for his new video, wishing I could tell you all of the cool things I learned at Redmond about the future of SQL Server.  But as you'd expect, all of that cool stuff is presently under NDA.  I'm sure there'll be some cool announcements from Microsoft this week.  So be on the lookout for the good word from Microsoft.
&lt;h2&gt;Keynote&lt;/h2&gt;
Rushabh Mehta, the PASS president, spent a few moments extolling the value of community and the achievements of the professional association.  And he's got a lot to be proud of.  PASS has come &lt;span style="text-decoration:underline;"&gt;such&lt;/span&gt; a long way.  One of the most telling facts about the significance of PASS, to me, is that important SQL Server announcements now happen at the PASS Summit.  There was a time, and not very long ago too, in which Microsoft made important SQL Server announcements at other Microsoft events like PDC and TechEd.  No longer!  PASS is the nexus for Microsoft's data management users.  And it shows.

Ted Kummert, Microsoft's top data executive, had a lot of exciting talking points about how the community has grown.  PASS now has hundreds of chapters worldwide and nearly ninety thousand members.  The event has over 4000 paying attendees this year, which means probably around 6000 total attendees including press, exhibitors, speakers, etc.  That's big!  In fact, that's just about the peak capacity for the Washington State Convention Center here in Seattle.  No wonder PASS will be at other locations in the future.
&lt;h2&gt;It's Officially called SQL Server 2012&lt;/h2&gt;
SQL Server "Denali" is officially rolling out as &lt;span style="text-decoration:underline;"&gt;SQL Server 2012&lt;/span&gt;.  There are a lot of interesting new developments with SQL12 regarding the way the product is splitting into multiple types of appliances designed for specific workloads and customer needs.  Need a massive processing appliance, check! That's PDW.  Need a hybrid solution for data housed both on premises and in the cloud?  Check.  Need processing power for BigData?  Need processing for non-relational and unstructured data?  Check.

Microsoft's improving tools will culminate in a new release of development tools called "SQL Server Data Tools", formerly known as Project Juneau, while the business intelligence side of the house will have a new set of tools in "Power View", formerly known as Project Crescent.  Hadoop figured large in the keynote, as Microsoft acknowledges that many BigData problems are best served by non-relational data stores.  Denny Lee, of SQLCAT, proposed an in-house data marketplace during his demos.  My face lit up like a kid at a surprise 10th birthday party.  Really?!?  FOR ME?!!?  I laugh because I'd been doing that at jobs throughout my career, offering up what I used to call the "data feedstore" to managers within my team.  +1 for validation of your ideas.
&lt;h2&gt;First Session of the Day&lt;/h2&gt;
From there I headed out to my first presentation of the conference, which I was delivering with my pal Buck Woody (&lt;a title="Buck Wouldn't, Woody?" href="http://blogs.msdn.com/b/buckwoody/" target="_blank"&gt;blog&lt;/a&gt; | &lt;a title="Inventor of the BuckmeisterwoodyfullerIne" href="http://twitter.com/buckwoody" target="_blank"&gt;twitter&lt;/a&gt;) of Microsoft. Our session was all about Cloud 101 - when it's appropriate to use the cloud and where you can learn more about the specific technologies like IaaS, PaaS, and SaaS.  Many IT pros don't know the difference and are being subjected to the "implement it!" decrees of their bosses who simply read an article on an airplane saying that the cloud is the future.  The best quote from the Twittersphere about our session?  "Elastic is fantastic"  I couldn't have said it better!

Speaking of conference sessions, my buddy Brent Ozar (&lt;a title="One of the few, the proud, the MCMs" href="http://brentozar.com/" target="_blank"&gt;blog&lt;/a&gt; | &lt;a title="Tro-lo-lo with BrentO" href="http://twitter.com/brento" target="_blank"&gt;twitter&lt;/a&gt;) pointed out this great mobile schedule planning resource:
&lt;p style="padding-left:30px;"&gt;Go to &lt;a href="http://guidebookapp.com/getit/"&gt;Guidebook&lt;/a&gt; and download the app for your iPhone, Windows Phone 7, Android, or Blackberry.  After launching it, you’ll be prompted to download a guide.  Type in PASS Summit, and we’re near the bottom of the list.&lt;/p&gt;
Voila! Instant mobile schedule guidebook to the PASS Summit.
&lt;h2&gt;The Energy is Nuts!&lt;/h2&gt;
After delivering my session, it was off to the Exhibit Hall, where I played the role of booth jockey for Quest Software for the rest of the proceedings that day.  I noticed two things of significance.  First, the crowds were thicker and more energetic than I've seen in years.  Wow!  I knew attendance was our highest ever, but the crowd was near to bursting at the seams like a 14-year old kid wearing last season's clothes.  So either the Washington State Convention Center is no longer big enough or more planning is needed to make this venue work.  When I was in leadership for PASS, planning and properly utilizing the venue was always a logistical nightmare.  So I don't envy the current leadership in figuring out how to make the PASS Summit scale to an even larger size.  The second thing I noticed was how focused the crowd was.  Usually, you get a lot of tire-kickers in the booth who, deep down inside, only want your vendor swag.  Yes, we had some cute swag this year (a &lt;a title="The TOAD IDE" href="http://www.toadworld.com" target="_blank"&gt;Toad&lt;/a&gt; beanie baby and some cool ribbons for your badge).  But we also had huge crowds even &lt;em&gt;after &lt;/em&gt;we ran out of swag.  And, in case you didn't detect the important part of the previous sentence, &lt;em&gt;we ran out of swag!&lt;/em&gt; That's right, we gave out everything on day 1 of a 3 day event.  I nearly &lt;a title="My daughters love Victoria Justice" href="http://www.youtube.com/watch?v=K6oE23XeZPM" target="_blank"&gt;freaked the freak out&lt;/a&gt;. What is going on here, folks?  Haven't you heard that there's a recession going on?

&amp;nbsp;

&amp;nbsp;</description></item><item><title>ETL Demo With Data from Data.Gov</title><link>http://www2.sqlblog.com/blogs/kevin_kline/archive/2011/08/05/etl-demo-with-data-from-data-gov.aspx</link><pubDate>Fri, 05 Aug 2011 21:45:00 GMT</pubDate><guid isPermaLink="false">21093a07-8b3d-42db-8cbf-3350fcbf5496:37542</guid><dc:creator>KKline</dc:creator><description>&lt;p&gt;A little over a month ago, I wrote an article (&lt;a href="http://kevinekline.com/2011/06/30/is-there-such-a-thing-as-easy-etl/" title="ETL, Expressor, and Data.Gov" target="_blank"&gt;&lt;em&gt;Is There Such a Thing as Easy ETL&lt;/em&gt;&lt;/a&gt;) about expressor software and their desktop ETL application, expressor Studio.  I wrote about how it seemed much easier than the native ETL tools in SQL Server when I was reading up on the tool, but that the "proof would be in the pudding" so to speak when I actually tried it out loading some free (and incredibly useful) data from the US federal data clearinghouse, &lt;a href="http://data.gov" title="The US Federal Data Clearinghouse" target="_blank"&gt;Data.Gov&lt;/a&gt;.
&lt;/p&gt;&lt;p&gt;
If you'd rather not read my entire previous article, here's a quick recap: expressor Studio uses “semantic types” to manage and abstract mappings between sources and targets. In essence, these types are used for describing data in terms that humans can understand—instead of describing data in terms that computers can understand. The idea of semantic abstraction is quite intriguing and it gave me an excuse to use data from data.gov to build a quick demo. You can download the complete data set I used from the following location: &lt;a href="http://explore.data.gov/International-Statistics/International-Data-Base/qm22-4smj" title="Data.Gov International Statistics" target="_blank"&gt;International Statistics&lt;/a&gt;.  (Note: I have this dream that I'm going to someday download all of these free statistical data sets, build a bunch of amazing and high-value analytics, and make a mint.  If, instead, YOU do all of those things, then please pay to send at least one of my seven kids to college in repayment for the inspiration.  I'm not kidding.  I have SEVEN kids. God help me).

&lt;/p&gt;&lt;p&gt;The federal government, to their credit, has made great progress in making data available.  However, there is a big difference between accessing data and understanding data. When I first looked at one of the data files I downloaded, I figured it was going to take me years to decrypt the field names. Luckily, I did notice an Excel file with field names and descriptions. Seriously, there are single letter field names in these files where the field name “G” has a description of “Age group indicator” (Oh Wow).  See the figure below.

&lt;/p&gt;&lt;p&gt;&lt;a href="http://kevinekline.com/?attachment_id=1763" rel="attachment wp-att-1763"&gt;&lt;img src="http://kevinekline.com/wp-content/uploads/2011/08/expressor-2-01.png" class="aligncenter size-full wp-image-1763" title="expressor, 2, 01" alt="" width="623" height="334"&gt;&lt;/a&gt;
&lt;/p&gt;&lt;p&gt;&amp;nbsp;&lt;/p&gt;&lt;p&gt;
It's stuff like this that reminds me why we have data quality and &lt;a href="http://en.wikipedia.org/wiki/Master_data_management" title="Wikipedia::Master Data Management" target="_blank"&gt;master data management tools&lt;/a&gt;.  Ok, back to expressor Studio. I quickly mapped a couple of files into expressor Studio using their “Read File” operator. It was fairly simple and easy to use. My data included files with country area information, population, and gender information by year. Once I mapped these files I quickly wanted to shed the default cryptic, nay, nonsensical names. I could have just renamed the fields when I initially mapped them into the system but that would mean I would have to manage the names in three separate locations. Bah! It made more sense to create a common semantic type and reuse it across all three files.&lt;/p&gt;&lt;p&gt;

&lt;a href="http://kevinekline.com/?attachment_id=1764" rel="attachment wp-att-1764"&gt;&lt;img src="http://kevinekline.com/wp-content/uploads/2011/08/expressor-2-02.png" class="aligncenter size-full wp-image-1764" title="expressor, 2, 02" alt="" width="624" height="389"&gt;&lt;/a&gt;

&lt;/p&gt;&lt;p&gt;There are two flavors of semantic types within expressor Studio to handle your mappings: atomic types and composite types. An atomic type is simply a single field name, whereas a composite type is a combination of one or more atomic types. Since the data files had many common fields, I decided to create a core set of atomic types that I could then roll up into composite types based on the files I was mapping. This kept the mappings simple and easy to understand and, most importantly, the whole exercise took about 5 minutes. Once the types were created I simply mapped the cryptic names from the files to the business-friendly names in my semantic type.  (I can't even begin to imagine how long this would've taken using native tools, but certainly not 5 minutes).&lt;/p&gt;&lt;p&gt;

&lt;a href="http://kevinekline.com/?attachment_id=1765" rel="attachment wp-att-1765"&gt;&lt;img src="http://kevinekline.com/wp-content/uploads/2011/08/expressor-2-03.png" class="aligncenter size-full wp-image-1765" title="expressor, 2, 03" alt="" width="624" height="389"&gt;&lt;/a&gt;

&lt;/p&gt;&lt;p&gt;&amp;nbsp;&lt;/p&gt;&lt;p&gt;Now I was ready to move my data. I took the data from three files and combined them into one master dataset. From there, my international statistics from Data.Gov were pumped right into my waiting SQL Server database.  Note that I could've used Excel or just about any other database as my target instead of SQL Server.

&lt;/p&gt;&lt;p&gt;Now, you might be saying to yourself "That looks easy because you read all the help files first."  Actually, no.  In fact, some of my buddies like to lovingly tell me to "RTFM" from time to time.  It's not that it offends my masculinity to read a manual.  I just usually like to have a go first and then, if needed, go back to the manual.  In fact, all I really used was &lt;a href="http://community.expressor-software.com/blogs/hsheng/14-new-5-minute-demo-expressor-studio.html" title="5-minute video of expressor Studio" target="_blank"&gt;this 5-minute demo video&lt;/a&gt; that I noticed when I was downloading the tool.
&lt;/p&gt;&lt;p&gt;
If you're tackling ETL and you want it fast and easy, then you might want to check out their website, &lt;a href="http://www.expressor-software.com/"&gt;www.expressor-software.com&lt;/a&gt;, to learn more about the expressor company and products.

&lt;/p&gt;&lt;p&gt;Enjoy!
&lt;/p&gt;&lt;p&gt;
-Kev

&lt;/p&gt;&lt;p&gt;P.S. &lt;a href="http://twitter.com/kekline" title="C'mon. You know you want to!" target="_blank"&gt;Follow me on Twitter!&lt;/a&gt;

&amp;nbsp;

&amp;nbsp;&lt;/p&gt;</description></item><item><title>What I'm Reading, July 22 2011</title><link>http://www2.sqlblog.com/blogs/kevin_kline/archive/2011/07/21/what-i-m-reading-july-22-2011.aspx</link><pubDate>Thu, 21 Jul 2011 14:43:00 GMT</pubDate><guid isPermaLink="false">21093a07-8b3d-42db-8cbf-3350fcbf5496:37152</guid><dc:creator>KKline</dc:creator><description>&lt;br&gt;I read too much, and that, my friends, is an entirely separate topic for a blog post. But I thought I'd share with you a little more about what I'm reading because sometimes, if I'm lucky, it might be something you'd enjoy too.

So I'm going to start sharing what I'm reading at least once per week, partly so that I don't firehose too many reading links directly into your brain (were I to do it, say, once per month) and partly to solidify in my own mind the information that I'm reviewing. So here are a few good links for the seven days leading up to July 22, 2011:
&lt;ul&gt;
	&lt;li&gt;&lt;a href="http://www.whitehouse.gov/blog/2011/07/18/big-data-new-insights" title="Whitehouse: From Big Data to New Insights" target="_blank"&gt;Microsoft and Whitehouse partnership on BigData&lt;/a&gt;: BigData isn't a particularly new concept.  But I was intrigued to learn that the National Science Foundation, Microsoft, and 13 other teams were partnering on developing better BigData analytics for lots of government data from activities such as healthcare, economic development, education, transportation, and the power grid.  Cool stuff!  Plus, Microsoft has developed a new tool called &lt;a href="http://research.microsoft.com/en-us/projects/azure/daytona.aspx" title="Microsoft Research's Project Daytona" target="_blank"&gt;Project Daytona&lt;/a&gt; to better harness the power of the cloud, in general, and Windows Azure, specifically.&lt;/li&gt;
	&lt;li&gt;While we're on the topic of &lt;a href="http://www.computerworld.com/s/article/357387/Feds_begin_race_to_the_cloud" title="ComputerWorld: Feds race to the cloud" target="_blank"&gt;Federal IT in the Cloud&lt;/a&gt; be sure to read this linked article from &lt;a href="http://www.computerworld.com" title="ComputerWorld Magazine" target="_blank"&gt;ComputerWorld&lt;/a&gt;.  Say what you will about our government, but putting government IT in the cloud and increasing both its transparency and availability will make a huge difference in how the Federal government will be able to service the public.  We're talking as big a difference as corporations experienced between the "catalog on the web" experience of the 1990's to the Web2.0 experience of today.&lt;/li&gt;
	&lt;li&gt;If you're the social media type, give this article a read discussing the &lt;a href="http://searchengineland.com/the-power-of-hashtags-on-twitter-84408" title="The Power of Hashtags in Social Media" target="_blank"&gt;Power of Hashtags in Social Media&lt;/a&gt;.&lt;/li&gt;
	&lt;li&gt;The Register, of the UK, whose tagline is "Biting the hand that feeds IT", has a great article on a &lt;a href="http://www.theregister.co.uk/2011/07/13/mike_stonebraker_versus_facebook/" title="The Register" target="_blank"&gt;spat over database technologies between the IT sage Michael Stonebraker and Facebook&lt;/a&gt;.  It's a great read if for no other reason than to prove that databases are worth fighting over.&lt;/li&gt;
	&lt;li&gt;And if you think Microsoft is still towing the relational database barge without thinking about other technologies, you need to read up on Projects &lt;a href="http://research.microsoft.com/en-us/projects/dryad/" title="Microsoft Project Dryad" target="_blank"&gt;Dryad&lt;/a&gt; and &lt;a href="http://research.microsoft.com/en-us/news/headlines/daytona-071811.aspx" title="Microsoft Project Daytona" target="_blank"&gt;Daytona&lt;/a&gt;.&lt;/li&gt;
	&lt;li&gt;Finally, I'm still getting lots of questions about when and where to limit SQL Server's Max Degree of Parallelism.  Be sure to read &lt;a href="http://sqlblog.com/ControlPanel/Blogs/and%20Guidelines%20for%20%27max%20degree%20of%20parallelism%27%20configuration%20option" title="Microsoft SQL Server MAXDOP" target="_blank"&gt;Microsoft's Recommendations and Guidelines for the 'max degree of parallelism' configuration option&lt;/a&gt;.&lt;/li&gt;
&lt;/ul&gt;&lt;p&gt;
And just because so many of us in IT are closet or former musicians, there's &lt;a href="http://www.ustream.tv/gibson-learn-and-master-live-lessons" title="Gibson Learn and Master Series" target="_blank"&gt;Live Guitar Lessons with Steven Krenz&lt;/a&gt;, sponsored by my hometown boyz at &lt;a href="http://www2.gibson.com/Gibson.aspx" title="Gibson Guitars, in my hometown of Nashville, TN" target="_blank"&gt;Gibson Guitar&lt;/a&gt;.

Got a favorite article or tool tip? Let me know!  Enjoy,

&lt;/p&gt;&lt;p&gt;-Kev

&lt;/p&gt;&lt;p&gt;&amp;nbsp;Follow me on &lt;a href="http://twitter.com/kekline" title="C'mon. You know you want to!" target="_blank"&gt;Twitter&lt;/a&gt;.&lt;/p&gt;</description></item></channel></rss>