Over the weekend my wife and I decided to go pay our last respects to Borders Books. I could write a blog entry on why they failed (which includes missing the Internet, then missing e-Readers), or why we bought nothing (going-out-of-business discounts didn’t even bring their prices down enough to equal Amazon and the lines were so long there was no reason to wait for a bad deal), but I’d rather write about my aha moment. I now know a major reason we have so many website breaches.
While browsing through Borders’ selection of computer books I decided I wanted to see what some of them said about security. I picked up an introductory book on building websites and looked for Security in the index. This ~750-page “all in one” guide had just TWO pages on security, and they contained no details, just some general things you’d need to worry about if you set up an e-commerce site. Then I grabbed another website book and discovered it had ZERO pages on security. And then another with the same result. It seems that we are training website developers that security is of no importance.
Next I moved on to SQL programming. For those who don’t know it, something called “SQL Injection” has been amongst the top couple of ways to breach a website for the last several years. SQL Injection isn’t a bug in database products; it is the result of application programming mistakes. The first book I picked up on programming SQL (Microsoft SQL Server specifically) didn’t talk about SQL Injection at all. So I decided to look at books that were specifically aimed at web database programming rather than SQL specifically. I picked up a book on jQuery and found no mention of SQL Injection. Then I checked a book on ADO.NET, and it had no mention of SQL Injection either. Wow, this was disturbing. We aren’t training programmers to take the proper precautions when writing access to databases either.
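To make the mistake concrete, here is a minimal sketch of the problem and its standard fix. It is illustrative only (Python with the standard library’s sqlite3 module standing in for a real web application’s database); the point is simply the difference between pasting user input into SQL text and passing it as a bound parameter.

```python
import sqlite3

# Illustrative only: an in-memory database standing in for a real web app's store.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (name TEXT, password TEXT)")
conn.execute("INSERT INTO users VALUES ('alice', 'secret')")

def login_vulnerable(name, password):
    # Builds SQL by pasting user input into the statement text.
    # Input like  ' OR '1'='1  turns the WHERE clause into a tautology.
    query = "SELECT * FROM users WHERE name = '%s' AND password = '%s'" % (name, password)
    return conn.execute(query).fetchall()

def login_safe(name, password):
    # Parameterized query: user input is bound as data, never treated as SQL text.
    query = "SELECT * FROM users WHERE name = ? AND password = ?"
    return conn.execute(query, (name, password)).fetchall()

# The classic injection string "logs in" without a valid password on the vulnerable path...
print(login_vulnerable("alice", "' OR '1'='1"))  # returns alice's row
# ...but matches nothing when it is bound as a parameter.
print(login_safe("alice", "' OR '1'='1"))        # returns []
```

That two-line difference is the entire lesson those introductory books skip.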
After returning home I used Amazon to look at the indexes for, and perform searches on, a number of other website and database programming books, and the results were little better. Introductory website books rarely discuss good security practices. With introductory database programming books the situation is a bit more hit or miss. Some don’t mention SQL Injection at all, while I did run across a few that gave good guidance on how to avoid it. But the overall situation is pretty clear: when someone starts out building websites and web applications (be they doing so for their spouse’s small business or working for a large enterprise) they have no awareness of, nor training in, how to build a secure website or application. None.
Next I thought, well there are always people who hold a Certification (from Microsoft, Cisco, etc.) and surely they have the appropriate training in security. So I set about looking at some of the training materials for certification. My first look was at the Self-Paced training for MCTS Exam 70-516, “Accessing Data with Microsoft .NET Framework 4”. This is a pretty obvious place to be testing knowledge about SQL Injection, but a search on this book yielded ZERO mentions of it. The training book for 70-515, “Microsoft .NET Framework 4 – Web Applications Development”, also contains no references to SQL Injection. Almost all its discussions about security are related to authentication and authorization, with none on how to write a secure application. Other exams, and the training material for them, may indeed cover these topics. However, these are the two main exams around web application development (including with data access) and they do not. For example, someone who wanted a SQL Server certification would find some training on SQL Injection in the materials for exam 70-433 (a full ONE page). However, this exam isn’t part of the web application development (technically a Visual Studio) certification and someone using a database other than SQL Server certainly wouldn’t bother taking it. Once again we see that we aren’t training web developers in how to build a secure website or web application, nor are we expecting them (via certification) to know how to do it.
Once a developer is aware of, and interested in (often because their site has been compromised), creating secure websites and web applications there is plenty of documentation, training, and help to be had. There are books ranging from specific topics (e.g., books on how to protect against SQL Injection attacks going back to 2002) to general ones on building secure web applications. OWASP (The Open Web Application Security Project) has extensive documentation, training, assessment materials, conferences, and tools for building secure websites. But all of this assumes that a developer knows he or she needs to learn about these topics, which means that only the most experienced developers tend to have a focus on security.
Now I think we know why there are so many breaches of websites: the people developing them are simply not being trained in how to create secure sites. The way to address this is to put security front and center during the initial training of web developers. Basic security practices can’t be an add-on; they have to be part of the fundamental knowledge base of everyone operating in the IT arena, from the very first website or application they create.
I’ve been asked many times over the years about a port of Microsoft SQL Server to *nix (as we used to call it, since Unix was the primary offering in the Enterprise while Linux was just gaining traction). Most recently someone asked in a reply to one of my posts if Microsoft had ever seriously considered it. While I can’t speak for any recent thinking, if you go back to over a decade ago it was given some very serious consideration. There were three reasons for this. First (and primarily), a number of Microsoft’s key partners (both software and hardware) lived in a multi-platform world and had a strong interest in seeing Microsoft SQL Server on *nix. Second, this was the period when the highest end Windows hardware platforms were of the 8 processor variety and much of the competition amongst database engines was moving into the 16-32 processor range. There simply was no Windows-based platform to compete against the Sun E10000 that had become the mainframe of the Internet (bubble) era. As some may recall, even a couple of Microsoft’s acquired properties (e.g., Hotmail) used Oracle on big Unix boxes long after everything else had moved to Windows because there were no Windows equivalents (until the Unisys ES7000) they could move to. Third, customers kept telling us they were happy to use SQL Server but didn’t want to use Windows NT 4. And so, on a couple of occasions serious thought was given to porting SQL Server to *nix.
So why didn’t Microsoft take SQL Server to *nix? On one occasion a partner commitment that might have made it viable failed to materialize. On another occasion I initiated the investigation on the basis of a partner request but then decided it was a bad idea. Here is why:
There are five things you have to consider when evaluating whether Microsoft should take SQL Server to *nix:
1. What exactly is the product offering you intend to bring to *nix, does it have a real market there, and can you position the offering to succeed?
2. What is the impact of going multi-platform on the product family, engineering methodology, organization, and partner engineering organizations?
3. What is the business model, including how do you partner, market, and (very importantly) sell into the Enterprise *nix world when you are a company that has no expertise in doing so?
4. How do you provide Enterprise-class service for SQL Server when it is running on a platform that your services organization has no expertise with?
5. What is the negative business impact on the entire Windows platform associated with making a key member of the server product family available on *nix?
The product is always what people think of first so let me address it first. When someone says Microsoft SQL Server they could mean one of two things. One is the relational server (sqlservr.exe) that has its origins at Sybase and was re-written by Microsoft to produce Microsoft SQL Server 7.0 and later versions. The other is that plus all the BI (Analysis Services, Reporting Services) and tools that are also part of the product family. When someone talks about porting SQL Server to *nix the difference between just porting the former and porting the latter is at least an order of magnitude. Maybe two.
Fortunately what people were asking for (again, over a decade ago) was just the relational server. The first-order engineering of making that happen, assuming you disable some Windows-specific functionality, was rather small (on the order of a few man-weeks). But would a reduced functionality relational engine, without some popular features, be accepted as a serious offering by customers? What of future planned (and now delivered) features like CLR Functions and Stored Procedures? Would the lack of that functionality bother customers? Would they insist we support Java Stored Procedures instead? What about management infrastructure? Would SQL Server have to support different management infrastructures on *nix and on Windows? Would we place new hooks in sqlservr.exe or would WMI/WBEM handle it all? *nix DBAs used shell scripts as their primary management tool, but the SQL Server of that day was not scriptable. Would those DBAs accept the use of GUI tools? Would *nix users accept “Windowsisms” that made it quite clear to them that SQL Server was not a native *nix product? On these last few points I had lots of historical evidence to suggest that *nix customers would not be happy about the situation. They wanted a product that showed a full commitment to the *nix platform. Which meant that far more engineering work than simple porting was required. And then there was performance and scalability. The effort to tune SQL Server to run well on a 32 processor E10000 running Solaris was going to dwarf the work to port it there in the first place.
The second thing to think about is how this is going to impact everything else around it, including the engineering methodology and organizations. For example, do you follow a philosophy that says the core team just worries about Windows and throws the Windows product over the wall to a team that adapts it to *nix? Or do you reorganize everything so you have a core team that builds multi-platform software and then teams that adapt it to the various platforms (which is what true multi-platform companies do)? Do you stop putting in Windows-centric features such as CLR support to make multi-platform support easier? Or can you get the CLR team to become multi-platform with you? What about the Visual Studio and (what became) the Systems Center teams? SQL Server takes components from many other teams, so what do you do about those? Do you reverse the original decision to have a single SQL Server product (that included OLAP and the rest) and split out a relational-only product on Windows since that is all you are going to offer on *nix? While a small contained project was possible, the changes likely required to succeed were earth shattering.
Third, engineering is one thing, but everything else involved in bringing a product to market is even bigger. Would customers really consider buying *nix products from “the Windows company”? Particularly mission-critical Enterprise products? How do you sell SQL Server on *nix when in fact you have no sales capability in the *nix world? Not only that but how do you even get a sales rep to return the phone call when someone wants to buy $100K of SQL Server but no other Microsoft products? What if it is only a $25K sale? $5K? Given how much energy goes into an enterprise-class sale, and particularly a database sale, a sales rep who went after these deals would not only never make their quota but they’d be losing money for Microsoft. So you have to rely on partners to do the heavy lifting, but does that really help? I was assured by some contacts that Sun would have welcomed SQL Server onto Solaris. But could you have imagined a Sun sales rep bringing anyone from Microsoft into their account to help with the sale of SQL Server on a Solaris system? I couldn’t. It goes against all the rules of account control. Perhaps if Microsoft created a dedicated SQL Server *nix sales team, who agreed never to pass information to the Windows Server sales guys, you could overcome this, but that greatly complicates the problem and raises the costs of the undertaking. At the time Microsoft did not have dedicated sales teams for anything, so you’d be blazing yet another trail. And then there is IBM. An IBM sales rep is going to lead with DB2 and then bring in someone else if the customer insists on it. Oracle’s existing market share means they are already in the account and are likely the ones encouraging the customer to tell IBM they want to run Oracle on the IBM Server. But Microsoft didn’t have the same level of account presence, particularly with the *nix-oriented parts of IT organizations, to insert itself into these sales situations without a lot of incremental effort. And that puts you back into the problem of the value of the sale being too low to justify that effort. So this is another questionable partner situation. While that left lots of other players (HP, Compaq/Digital, Dell, etc.) who could have been great partners, Sun and IBM were the two leading *nix players. Microsoft’s ability to penetrate the *nix database market without them would have been greatly impeded.
Fourth, since the target of my investigation was really mission critical Enterprise systems you have to address how they will be serviced. How do you go after the most demanding service and support situation with an organization that has no expertise in servicing the environment? Can you do it mostly with partners? Again, this is just complex and potentially expensive to solve. It also flew in the face of something else we were experiencing. Big Enterprise customers want your senior executives to shake their senior executives’ hands and promise you’ll stand behind them and make them succeed with your product. How do you do that if you outsource service to a partner? Would Microsoft have had to acquire a *nix-oriented services company in order to succeed?
Lastly, what is the business impact on the overall Windows Server business (or on Windows client as well) if you port SQL Server to *nix? One could say this bullet is a duplicate of number three and it all washes out in the business plan. But I doubt a SQL Server business plan could have fully captured the question or the impact. I had full executive support in investigating a port, but had I brought forth a proposal to proceed I would have faced arguments from many that I was undermining Microsoft’s entire business plan.
I started to work through all of the above and realized that the cost of porting SQL Server to *nix, and succeeding with it, was enormous. And more importantly, it would distract from our ability to move the product forward. It would also distract from Microsoft’s ability to push Windows Server to support high-end hardware and address other Enterprise requirements. And when all was said and done, it was going to be a huge net negative for the business. So I dropped the idea.
Fortunately for Microsoft the collapse of the Internet bubble, and its assumption that businesses would grow faster than Moore’s Law for an indefinite time period, coincided with the release of Windows 2000 Server. Windows 2000 Server addressed many of the problems (reliability, scalability, manageability) with Windows NT 4 Server in the Enterprise. The combination caused customers to apply a price-for-value analysis (that they’d ignored during the bubble) to their server purchase decisions. And high-end hardware such as the Unisys ES7000, HP Superdome, and offerings from Fujitsu, NEC, and others essentially wiped out the single system *nix scalability advantage. Windows Server then continued its Enterprise-oriented improvements in subsequent releases, helping pave the way for SQL Server to grow its success in the Enterprise without a *nix port.
Has Microsoft taken another look at a port since I left the SQL Server team? I don’t know. Some of the pressures to port SQL Server to *nix have receded over the years while others (e.g., MySQL) have emerged. But I think it is an idea whose time has passed. In a Cloud Services (as opposed to just hosting a standard VM in the cloud) environment no one really knows nor cares what the underlying OS is. SQL Azure thus becomes Microsoft’s answer for those who don’t want to run an in-house Windows Server just so they can run SQL Server.
There are times I think Microsoft would be better off as a pure software company rather than one oriented primarily around a single platform (i.e., Windows). On the other hand, when I look at the database industry I see that most of the database companies that became leaders based (to a large extent) on their multi-platform support are gone or are niche players. Ingres, Sybase, Informix, and Oracle all used multi-platform as their means to displace the systems vendors’ own offerings (DEC Rdb, IBM DB2, and non-relational offerings from many others). Of those only Oracle remains a significant supplier of database software. And while some may consider multi-platform orthogonal to the problems the other database companies faced I think it played a significant role in their inability to keep up. Doing multi-platform right is a huge and ongoing expense, a tax on their ability to invest in improving their own database technology and focus their sales and marketing efforts. By sticking with a single platform Microsoft, SQL Server included, gets to focus its engineering, marketing, sales, and services efforts on adding synergistic value instead of thinly spreading a least common denominator over multiple platforms. This has been key both to Microsoft’s overall success, and specifically to SQL Server’s success.
In retrospect I can say I’m very happy that we never went the multi-platform route with SQL Server. Even though I was the primary advocate in the management team for doing so.
I couldn’t let the day go by without commenting on Steve Jobs’ resignation as Apple CEO. But rather than write something totally new I’m going to post two entries I made on Facebook that capture my views of the situation:
#1:
While my friends and I were playing around on ASR33 teletypes connected to BOCES LIRICS’ DECsystem-10, Bill Gates and Paul Allen were doing the same (the Lakeside School in Seattle had its own DECsystem-10). That is what got us all hooked on computers. Steve Jobs and Steve Wozniak followed a slightly different route but ended up in a similar place. Many SHS alumni went on to careers in computing, and we’ve made some major contributions (although obviously nothing on the same scale as these other guys). In particular Jobs and Gates became the two key leaders of the computer industry as we transitioned to “personal computing”, and just as importantly the two most important industrialists of the baby boom generation. I went to work for Bill, but have no less respect for Steve’s importance to our industry and our generation. When Bill decided to move his full-time focus from Microsoft to charity work I was saddened that I wouldn’t be working with him but happy for what it meant for the world overall. But Steve being forced from his CEO role at Apple over health reasons saddens me tremendously. It is, as Stephen Gyetko points out, pretty ominous and I am taking it quite personally. And so in the spirit of this group, you really know you’re from Syosset if you used the ASR33s when there was one each in adjoining closets on the second floor of the E wing (around the corner from Mr. T’s) when the teachers were clueless (or even scared) about what to do with them and everyone else in the school thought you were nuts; And you can trace that experience to the news of the day.
#2:
The interesting thing is that I don’t consider Jobs that unique in the “visionary” camp. Everything Apple did came from somewhere else. Jobs’ real contribution was in being the perfectionist who turned those concepts into products that people wanted, and wanted as passionately as he wanted to make them. And what he did more brilliantly in his second life at Apple than in his first was to time the arrival of new products to the moment the underlying technologies reached sufficient maturity to allow mass market adoption. The iPod wasn’t the first music player, the iPhone wasn’t the first smartphone, the iPad wasn’t the first tablet. Nor was the Mac (or even the Lisa) the first GUI OS. But in each case the execution (from the end-user standpoint) was superb. Gates had similar, or even more far reaching, vision but Microsoft’s software-only business model, reliance on OEMs and other partners to complete even the basic product offering, breadth of activities, and even Bill’s management style couldn’t realize those visions in the same polished way that Apple (as a vertically integrated systems vendor) could. Bill’s huge business innovation was the OEM model for software. Steve’s was in getting both media companies and telecom carriers to alter their business models to accommodate the experiences he wanted to create. And overcoming Apple’s inability to get shelf space in retail stores by creating his own (again, not the first computer manufacturer to try it but the first to truly succeed at it) retail stores. Apple will of course continue on and do well for now. The real test will be when the next sea-change comes along and there is internal friction over how to respond. That’s when Jobs will really be missed.
After a couple of years of leaks, and a few carefully selected releases of information, we’ll finally see the big reveal of Microsoft’s Windows 8 at the BUILD conference this coming week. Make no mistake that Windows 8 is the most important release of Windows since 1990’s Windows 3.0, and Microsoft’s future as a platform company is dependent on it being a runaway success. Let’s do a little trip back in time to explore some eerie parallels and then speculate on what we’ll see (or rather, what we have to see) at BUILD.
We spent the 1990s and 2000s living in a world dominated by computers with a Graphical User Interface (GUI) based on WIMP (Windows, Icons, Menus, and a Pointing device). This style of user interaction was pioneered in the 1960s at SRI and 1970s at Xerox PARC. Many companies then started working on commercialization though it was Apple that really brought this work to public attention with the Lisa and then to commercial success with the Macintosh computers. Microsoft entered the market for GUI-based UI around the same time as Apple; however, it did so with an add-on shell to MS-DOS rather than with a complete GUI environment. Throughout the Windows 1/2 era we had a situation much like the one that exists today. Microsoft had a GUI offering, but it was mostly used as a launcher for MS-DOS apps and was a poor representation of the GUI paradigm. Apple, with the Macintosh, had an excellent representative of the GUI paradigm and one that excited end users. As 1990 approached Microsoft was in danger of being relegated to a legacy business around MS-DOS while Apple was poised to become the dominant provider of PCs for both applications (e.g., desktop publishing) that worked best in a GUI paradigm and for the vast majority of people who still didn’t own a PC. Then came Windows 3.0.
With Windows 3.x (that is, 3.0/3.1/3.11) Microsoft made the leap to a full and competitive GUI OS and changed the game. Now it could bring all the benefits of its business model, such as a wealth of OEMs, retail and other distribution channels, and a large application portfolio, into a GUI-centric world. Microsoft Windows soon eclipsed Macintosh (in perhaps every dimension except elegance) and went on to become the dominant operating system of the 1990s and 2000s.
The GUI paradigm served users very well for about 20 years because the computer form factor remained little changed in that time. Computers consisted of a system unit, a monitor, a keyboard, and a pointing device. This evolved from separate components into integrated systems, particularly in the form of the Notebook computer. But the essence didn’t change. Meanwhile, attempts to introduce new form factors met with limited success; Tablet form factors, for example, went nowhere.
Development of Personal Digital Assistants (PDAs), which would evolve into today’s smartphones, began in the late 1980s. Once again Apple was a leader, but this time their Newton OS failed. Others succeeded, but in limited form. Finally Apple re-entered this market in 2007 with the iPhone and its IOS operating system. IOS itself was derived from the Macintosh OS but introduced a new finger-friendly Natural User Interface (NUI) paradigm. Naturally Apple wasn’t the only company working on NUI, but once again they were the first to bring it to full fruition in a commercially successful product. Once again they enjoyed great success, but are in the process of being eclipsed in market share by Google’s Android. Microsoft is trying to play catchup, but it is unclear how much of a chance they have. This battle may not follow exactly the path of the earlier battle in PCs, but it is eerily similar so far. Consider even Apple’s use of patents to try to stop Android. You may recall that Apple used the courts in an, ultimately futile, attempt to stop Microsoft Windows. But this posting isn’t about mobile phones, it is about the future of Microsoft Windows. So let’s move on.
With a clear hit in the NUI OS space in IOS, Apple moved on to introduce the iPad in 2010. At its introduction, and in some cases even a year later, analysts saw the iPad as a niche offering. What happened instead was the iPad caught on as a primary computing device that has started to encroach on the traditional Notebook computer. Many people, particularly when not on business trips, leave the Notebook at home and take just their iPad. This is a somewhat nuanced discussion in that for any given scenario one form factor (Tablet or Notebook PC) is actually better than the other. You can read a newspaper or book, create a spreadsheet, watch a movie, make airline reservations, lay out a brochure, etc. on either. But the iPad is a much better experience for consumption-oriented activities like reading or watching while the Notebook is a better experience for creation-oriented activities like spreadsheets, brochures, editing photographs, etc. Something like making an airline reservation splits the difference, with someone who does lots of reservations finding a Notebook somewhat more productive (e.g., fewer required screen transitions and easier data entry) while someone who infrequently performs this activity may be perfectly happy doing it on an iPad. Most importantly, the number of people who are heavy consumers of content far outweighs the number of people who are heavy creators of content. That gives the long-term edge in computing devices to those that are best at Consumption, placing the dominance of traditional Notebook (and even desktop) PCs in jeopardy.
Just as with the original Macintosh, Apple has redefined the tablet category and staked out a strong lead in the NUI-based computing world. While to date this has primarily been based on the iPad’s superiority for content consumption, the truth is that people are working hard to improve its usefulness in content creation. For example, the addition of a Bluetooth keyboard greatly improves the user’s ability to create spreadsheets, documents, etc. Capacitive Pens allow for higher resolution drawing than you can do with your finger. And how long until Apple creates a docking station that effectively turns your iPad into a full-blown analogy to a Macintosh PC? I imagine we’ll see that (and the demise of the traditional Mac line) within this decade.
Microsoft has introduced many NUI elements into Windows over the years, but the basic usage paradigm remains GUI. So, as in 1990, Microsoft is faced with either pulling off a full transition to the new NUI world or being relegated to a legacy supplier while Apple runs away with the next generation of computing. Microsoft has lucked out in one respect; unlike in the mobile phone space Android has failed to gain traction in the tablet space. And Apple’s legal assault on Android (where, for example, they have been able to block sales of some Android tablets in some jurisdictions) promises to keep the opportunity open long enough for Microsoft to step in with the only real IOS alternative. And so this comes down to a classic repeat of the Windows 3.0/3.1/3.11 vs Macintosh battle of the early 1990s. If Microsoft succeeds we’ll be looking back a decade from now at a world in which it is still the dominant supplier of end-user computing devices. If it fails, a decade from now we’ll be lumping Windows in with MVS and VMS as living historical artifacts that matter only to a few people.
Since this blog entry has gotten so long I’m going to split the discussion of what we should expect to see next week into a separate entry. See you on the other side.
Following up from my previous post on why Windows 8 is so important I wanted to speculate on what we’ll see next week, or rather what we have to see in order to believe that Microsoft can succeed. First we’ll talk about Microsoft’s strengths and weaknesses and how they need to exploit and/or correct them. Then we’ll talk about the key characteristics of the (general purpose) NUI OS world. And finally we’ll talk about a few key Windows 8 things we need to see.
Microsoft has two big strengths that they really need to exploit in order to make Windows 8 succeed. The first is their classic strength around being a multi-vendor platform. They need to get a large number of hardware manufacturers creating a substantial number of differentiated and interesting devices and pushing them heavily through all available channels (web/mail order, retail, distribution, direct corporate sales, etc.). This is a key strength against Apple, though if done poorly (as with Vista) it can turn into a key weakness. They have to give the OEMs a lot of freedom to innovate and differentiate on hardware, but they have to keep enough control to make sure the OEMs don’t create devices that show off Windows 8 poorly. This is something that the Windows Phone 7 guys addressed with their very restrictive “Chassis” definition. Windows 8 can’t be as restrictive as Windows Phone 7, but they need to make some attempt to keep things from turning into the wild wild west. On the channel front, this is where I think Android has stumbled in the tablet space. Android Tablet manufacturers tied themselves too closely to the mobile phone sales channels, making it difficult for customers to find and purchase appropriate devices. For example, for a long time Best Buy kept Android tablets displayed in a back corner of their mobile phone area. This made them hard to find, and hard to find expertise to help you with them. Or I know someone who wanted a WiFi-only Samsung Galaxy Tab. Samsung withheld this device for many months in favor of 3G-enabled devices that you could only buy with 3G service. Finally my friend gave up waiting and purchased a WiFi-only iPad. Microsoft has well-developed channels in both the PC and Mobile spaces, with vendors like Dell able to work magic in the PC distribution space (even though it has bombed in the Mobile Phone distribution space). If Samsung, Toshiba, and others use both their PC and Mobile businesses to create and sell Windows 8 tablets then Microsoft has a huge advantage over Apple or Android.
The second strength Microsoft has is the one that has really differentiated Microsoft’s and Apple’s end-user successes over the years. Apple targets multi-media consumption (and creation with the Mac) while Microsoft targets the Information Worker. If Microsoft pulls off the trick of being truly competitive with Apple for consumption-oriented users while being the clear offering of choice for Information Workers, then it can recreate the success it had with Windows 3.0 and beyond in the new world. And Microsoft brings many advantages to the table in trying to do this. For example, the ability of a Windows 8 tablet to join an Enterprise’s Domain (and have all its management and security benefits) would immediately make Windows 8 the tablet favored by Corporate IT (including the Chief Information Security Officer) for all internal use. But this won’t matter if end-users don’t love the devices, so Microsoft can’t count on Domains to overcome a weak end-user experience. But if the end-user is excited about Windows 8, this becomes a huge differentiating feature. Another factor will be how seriously Microsoft’s other products, particularly Office, embrace Windows 8 tablets. Every one of my friends and family who has switched to the Mac also runs Microsoft Office for Mac on it. Imagine now that a full-fledged version of Microsoft Office (including Outlook) comes out that is fully usable on Windows 8 Tablets; suddenly carrying just a tablet with you on business trips becomes a real option. So Microsoft has a lot to bring to the table, and bring it they must!
One more strength is Microsoft’s view of Windows as a general purpose platform. Whereas Apple has been somewhat hostile to third-party e-book readers (e.g., they won’t let them actually sell you a book through their app) Microsoft is more likely to be telling Amazon et al., “Please come make Windows 8 Tablets the best e-book readers on the market; what more can we do to help you succeed?” That attitude, spread across the entire application space, could be a huge advantage.
A final strength? That docking station that turns your iPad into a Mac replacement I mentioned? That could be a truly trivial thing for a Microsoft OEM to do with Windows 8.
I’ve already alluded to one weakness, that the same ecosystem that is such a strength can kill you by producing bad products. Or by ignoring you. One of the problems Microsoft initially had with Windows was that its ecosystem (e.g., Lotus) wasn’t developing for it. Windows 8 has to be exciting enough that the ecosystem clearly favors it over Android.
Another weakness is Windows bloat. This one has probably caused more advance criticism about the idea of a Windows 8 tablet than anything else. It is usually couched in terms of “why would you want something as bloated as Windows on a tablet”? But then people forget that IOS is a reworked Mac OS. So the real question is, has Microsoft reworked Windows sufficiently so that a Windows 8 tablet doesn’t suffer from Windows’ overall bloat? There are promising signs. In Windows 7 Microsoft introduced MinWin (part of a multi-version restructuring cleanup they’ve had underway) and made changes that allowed many services to remain unstarted until they were needed. Windows 7 was the first version of Windows to run well on smaller hardware configurations than its predecessor. Assuming they’ve continued to invest in this restructuring work it would be easy to see how they could keep bloat from killing the Windows 8 tablet experience. Even a recent reveal, that Windows 8 COLD boot time may be as low as 8 seconds, is evidence that Windows 8 is lean enough for tablets rather than suffering from the bloat we became accustomed to a decade ago. But still, until we see otherwise most people will continue to worry that Windows is too bloated for tablets.
Another weakness is that from an end-user perspective Windows has been way too non-prescriptive and confusing as a platform compared to IOS. For example, media experiences are spread across Windows Media Player, Windows Media Center, Zune, Silverlight, Flash, HTML5 and others. While Windows 8 will no doubt continue to support all of these (and Windows 8 tablets most of them), is there a clear primary streaming media story for Windows 8 tablets? This is a space where for decades Apple has really shined and Microsoft has continually shot itself in the foot. Microsoft has no more toes to blow off and needs to have a clear preferred experience, in this area and in many others, to gain the consumer love that Apple currently enjoys. And yes, I do realize that I’ve made openness of the platform both a strength and weakness. One that Microsoft will have to navigate carefully.
Lastly I’ll mention “3 screens and a cloud” as both a strength and weakness for Microsoft. Microsoft has talked about this for many years, but to date hasn’t shown much in terms of their offerings. The 3 screens are the PC (which would include tablets), the phone, and the TV. Unifying these can be a critical advantage for Microsoft, or an Achilles’ heel. If we see more unification around Windows 8 it becomes a powerful advantage. If not, Apple (and Google) are pursuing their own “3 screens and a cloud” strategies that will eclipse Microsoft. Fortunately there are both hard signs (e.g., XBox Live on Windows Phone 7) and many rumors that suggest Microsoft is finally getting its act together in this space. Hopefully BUILD will offer us some more evidence this is true.
Ok, so what are the key elements that we now associate with a NUI environment that Microsoft must address? The first is the most obvious, which is a modern visual and interactive style that takes advantage of TOUCH and GESTURES as the primary interaction method. This must extend throughout (e.g., you can’t have people trying to touch little X boxes to close things or drag scroll bars as you would do with a pointer; either at the OS or app level). One of my test experiences playing with a Windows 7 tablet was the NY Times Newsreader App. On my iPad I just swipe to go to the next page. On a Windows 7 tablet I have to find and tap (née click) on an arrow to go to the next page. On a Windows 8 tablet just swiping has to work. Incorporation of voice recognition, use of the camera and other sensors, etc. are all pluses that Microsoft can (must?) use to differentiate. Microsoft has good enough voice recognition to do free form speech-to-text. Will we see that finally achieve widespread usage in a Windows 8 tablet? Will Microsoft, or its OEMs, build support around Windows 8 for virtual projected keyboards? Or 3D video conferencing? Or….
A second element is a more locked down application environment. You may recall that apps were dead prior to the introduction of the second version of IOS and the App Store. This was because Windows (and Mac OS and Linux and…) had such a wild west attitude towards applications that they made systems unreliable, slow, and non-secure. Phrases like “DLL Hell” may still ring a bell. And certainly you’ve experienced the inability to fully uninstall an application. Any modern OS has to have an application model that can be sandboxed for reliability and security, can install apps simply and quickly, can uninstall apps just as simply and quickly, and doesn’t have side effects on unrelated apps. Windows 8 must have such an app model or the end-user experience will suffer greatly compared to IOS. In fact, the existence (and enforcement?) of such a model would do a great deal to eliminate most of the major issues that Windows has suffered over the last 20 years.
An “app store”. Apps have been around forever. But until the iPhone’s App Store came out there was no easy way to find them, know they weren’t laced with malware, know they weren’t likely to reduce system reliability, purchase them easily, download them easily, and install or uninstall them easily. Attempts to create marketplaces for existing applications didn’t really work because they addressed few of these characteristics. But with a new app model, and of course its own experience with the Windows Phone 7 Marketplace, Microsoft could introduce a Windows 8 “app store” that has all the characteristics necessary for a modern NUI-based OS to succeed.
Lastly, I’ll mention a design center for Consumption-oriented experiences. The truth about the iPhone, iPad, etc. is that they are replacing newspapers, books, DVD players, portable game players, etc. They are how you keep your kids entertained on a long road trip, and yourself on a long plane trip. They have become our companions when dining out alone, and our personal shopping assistants when we are in a store. Their larger screens make them more suitable for this than a Smartphone, yet they still are of a size and weight that you can carry in a purse or your hand. And so it is critical that any modern OS put consumption experiences ahead of creation experiences (where such tradeoffs are required).
Microsoft has already revealed some key elements of Windows 8. We know it will support the tablet form factor, including the use of ARM-based chipsets (a practical if not absolute necessity). We know it will offer both a modern NUI-style user experience evolved from the Metro experience designed for Windows Phone 7 as well as the traditional GUI experience. On the rumor level we’ve heard about a new app model reportedly called AppX, along with an associated “app store”. There are also plenty of rumors about relationships with XBox and Windows Phone 7 (e.g., that Windows 8 will run WP7 apps, which is technologically trivial to accomplish). I imagine tablets will always use the NUI interface, but that’s one thing we’ll have to wait on. For example, will Microsoft do anything to enforce this (e.g., a tablet edition that doesn’t include the old interface while the Pro or Enterprise edition includes both)? I think we can assume this is all true, including the rumors.
But there is much we don’t know. How deeply will the NUI experience extend? Will new applications be NUI through and through? What happens when you run them from within the GUI (aka, traditional) shell? What happens when you run an existing GUI-based app in the NUI environment? Did they alter the common dialogs and graphical elements to make them more finger friendly (as Windows Mobile did with 6.5)? We should get answers to these questions next week. And they better be good.
What is the new App model? This is going to be the most revolutionary change to the Windows ecosystem since Windows itself. What is happening with graphics? This is one of the most awaited disclosures we’ve seen in a long time. And tell us please about that “app store”. This is the most important discussion next week because it impacts the expertise of the entire Windows developer ecosystem. How much of their existing knowledge and skills is still applicable vs. how much they have to start from scratch will impact both their enthusiasm for Windows 8 and the time to market for apps that conform to the new model.
There is a lot we may or may not find out next week. This is a developer conference, so Microsoft may withhold much in the way of end-user feature information. Will they, for example, disclose what the primary media strategy for Windows 8 is? I don’t know. They will certainly save as much news as they can for nearer the launch of Windows 8, but it will be small compared to what we learn this coming week.
I’ve run out of steam so I’m going to leave things here. Windows 8 will either be the release that propels Microsoft to leadership of the next two decades of computing or that confirms it is on the road to oblivion. Yes, I think it is that important. Are you looking forward to the big reveal as much as I am?
One of the end-user oriented features revealed in yesterday’s BUILD keynote that I’m particularly excited about is the expansion of Windows Defender capabilities. Microsoft has had the limited Defender anti-malware capability built into Windows since Vista. Defender, although it uses the same anti-Malware engine as Microsoft Security Essentials (MSE), is primarily targeted at preventing Spyware and contains just a small fraction of MSE’s anti-malware signatures (and monitors fewer Windows pathways than MSE). This was an idea leftover from a decade ago when Anti-Virus and Anti-Spyware were considered two different problem spaces and users had to purchase two different solutions. Microsoft purchased the Giant Anti-Spyware product and made it available to users, later releasing an equivalent capability set as Windows Defender. Users still had to install an anti-Virus product if they wanted protection from more than Spyware. Over the years both Anti-Virus and Anti-Spyware capabilities were subsumed into integrated Anti-Malware products, but Windows Defender stayed targeted at its original Anti-Spyware mission. Since all Anti-Malware products now contain anti-Spyware capabilities, and with threats now focused largely outside of its original scope, Windows Defender had become superfluous. Microsoft essentially had two choices: remove Windows Defender entirely or bring it into the current age and move it from being just an anti-Spyware feature to being a more complete Anti-Malware offering. With Windows 8 Microsoft is taking the Anti-Malware route with Windows Defender. This means every Windows 8 system will have excellent basic Anti-Malware capability out of the box. Finally!
How will Windows 8 Defender change the Anti-Malware landscape? Well, along with other Windows 8 security changes, it makes it much harder for the bad guys to attack the Windows system universe. If you look at the numbers each release of Windows (since XP SP2) has been less subject to Malware than its predecessor. One of the biggest issues remaining is that a very large percentage (I don’t recall the number, but perhaps half) of PCs do not run Anti-Malware software (or have let their subscription lapse so that they don’t get updates). Windows 8 mostly eliminates that situation. Not only does this leave Windows systems better protected, it may actually shrink the opportunity for malware authors to profit from their work so substantially that they turn their focus elsewhere (Mac, Android, Linux, etc.). This has already started to happen with Windows 7; Windows 8 should dramatically accelerate the trend.
For Anti-Malware product vendors I don’t think that Windows 8 Defender really impacts their strategies. They will continue to appeal to consumers by providing what they position as premium capabilities compared to Windows Defender (or MSE). And they will still primarily make their way onto systems by paying OEMs to pre-install trial versions. They’ll continue to tweak their products to make them more attractive for certain types of users. For example, I’d love to see an offering that lets me monitor the security status of my mother’s PC (2000 miles away) and alerts me when her system has a security issue. That might be enough to get me to install a paid product on not just her PC, but every PC in my household.
So welcome to the world of fully built-in anti-Malware with Windows 8.
One of the things we all struggle with is why do computers have bugs? Why do they run perfectly well for months and then slow down? Worse, why are they slow one day and fast another? Or why does something work a dozen times in a row and then suddenly stop working? Well, I had an experience yesterday that let me explain this to a Doctor, and he suggested I share it.
I was in the Doctor’s office and he had his IT support person in looking at why something worked from the computer in his office but not the computer in the exam room. Actually, it had been working in the exam room but had suddenly stopped working correctly. He rhetorically directed this question at me and I rhetorically asked him why a cell suddenly goes crazy and starts reproducing out of control (i.e., Cancer) or why a drug can help 99.9% of patients and kill .1% of them? Then I explained that software (and hardware) have become so complex, with so much state information lying around, that we can no longer completely understand nor control their behavior. You could see the lightbulb go off and he said “I got it, they are Biological Systems”. He, like all of us, thinks of computers as being ruled by the laws of physics (as he put it, or mathematics as I tend to think of it) and of course at some level (just as with biological systems) they are. But when you look at things at the higher systems level they really have started to resemble a biological system in which no two instances (just like no two people) are exactly alike.
Now I understand why a doctor would so easily jump to this understanding of modern computer systems, but I’ll dive into it in case you aren’t comfortable with the analogy.
People (as an example of biological systems) are each unique individuals. They receive their basic programming (e.g., DNA) from their parents and while each Homo sapiens inherits mostly the same programming we also inherit a bunch of unique programming. There are hundreds of BRCA1 gene mutations. If a woman inherits one of the wrong ones then she has a 60% chance of developing Breast Cancer. There are 20-25,000 genes in the human body, and I don’t know how many variations of each, and then the interactions between genes. So our variety and complexity are quite high. Well, you say, this doesn’t happen with computers? And I say BS! Nearly every computer out there is at least a slight variation from every other computer. They have different CPU chips, different graphics boards, different BIOS authors, different versions of that BIOS, different hard drive models, different collections and versions of software installed on them, etc. A PC is a PC in a similar way that a Person is a Person. They are the same, yet actually quite different, even in the aspects we think of as invariant.
What is even more interesting is that both People and Computers carry around both temporary and persistent State information, and that this state information alters their systems’ behaviors in seemingly unpredictable ways. For example, there are many drugs which make people photosensitive. For those not taking the drug 30 minutes in the sun helps their tan and produces nice amounts of Vitamin D. For those taking the drug 30 minutes in the sun produces a sunburn. Smoking produces all kinds of persistent state changes. Combine all the persistent (over a lifetime) and temporary state changes and you get strokes, heart disease, cancer, etc. Or take Vioxx. Like most drugs it induced a temporary state change to fight inflammation (e.g., from Arthritis). Unfortunately it turned out that in some patients the state change it caused interacted with other state (and perhaps genetic programming) in the body to cause heart damage. So how does this work in computers?
Let’s take something as simple as an e-mail message. Each message has a tremendous amount of state, and you are constantly altering that state. Read the message and the computer switches the state from Unread to Read. Reply and it records that the message has been replied to. Flag or otherwise categorize the message and that’s recorded too. Sync your Droid, your iPad, Outlook on your PC, access that same message from IE7, IE8, IE9, Firefox, Chrome, Safari, etc. and you have a tremendous amount of both temporary and persistent state involved. Things would likely be simple if programming always dealt with one state at a time, but often you deal with multiple states simultaneously. Since the amount of state being kept in a typical computer is now so large, from a practical standpoint the variation is approaching dangerously close to infinity. The programming of how to behave in light of all that state, and how to modify that state information, is different for each of the ways of accessing your mail. And so you very quickly end up with things from minor bugs, like the iPad’s (and iPhone’s) email application not being able to correctly maintain the Read/Unread count, to more serious problems like having your iPhone or Outlook lose the ability to correctly sync with the email server without deleting and re-adding the email account, to disasters like email being completely lost. Now multiply this idea through everything running on your system, and consider that even seemingly independent things can have state interactions, and you start to see the picture. Why does killing and restarting an application, or rebooting your computer, often resolve problems? Because it clears temporary state information. Sometimes you never get back to that same set of temporary state and thus the problem never recurs. Sometimes you get back to it eventually. Occasionally you can reproduce it quickly, implying an interaction with more persistent state. As annoying as rebooting is, think of it as an advantage Computers have over People. We can’t just throw away our temporary state, we have to alter it quite tediously using drugs, nutrition, lifestyle change, etc. But if anything this strengthens the analogy.
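To put a rough number on how quickly that state multiplies, here is a toy sketch. The flags, their values, and the list of clients are made up purely for illustration; the point is just the arithmetic of combinations once several clients each hold their own copy of a message’s state.

```python
# A toy model of one e-mail message's state. Real mail systems track far more.
flags = {
    "read":     ("unread", "read"),
    "replied":  (False, True),
    "flagged":  (False, True),
    "category": ("none", "work", "personal"),
    "synced":   ("fresh", "stale", "conflict"),
}
clients = ["phone", "tablet", "desktop Outlook", "webmail"]

# Number of states a single copy of the message can be in.
states_per_copy = 1
for values in flags.values():
    states_per_copy *= len(values)

# Each client keeps its own copy of that state, so the system-wide state of
# ONE message is the cross product of the per-client states.
system_wide_states = states_per_copy ** len(clients)

print(states_per_copy)     # 72 states for one copy of one message
print(system_wide_states)  # 26,873,856 combinations across four clients
```

And that is a single message, before you factor in everything else running on each of those devices.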
Now you know why, after decades of trying to make software bug-free, it is still so unreliable. 30 years ago most bugs were straightforward coding mistakes, and those now rarely make it out of the software development process. 20 years ago most bugs were about localized mishandling of a single piece of state, and once again those rarely make it through the software development process. But since then we’ve been struggling with the explosion of both temporary and persistent state on both a local and global basis. The trend to new app models (IOS, Windows Phone 7, and now Windows 8’s Metro app model) is largely driven by the realization that the industry needed greater isolation of state-sharing between applications (and greater control over the application’s impact on system state). That’s progress and explains a lot of why an iPhone or Windows Phone 7 device feels so much more reliable than either a Windows or Mac PC. At the same time the move to cloud computing, and thus the greater amount of state sharing between various clients and the cloud, is increasing the amount of distributed state. An explosion in the number of cores in a typical computer processor, and the growth in heterogeneity of cores (or auxiliary processors like GPUs), also is dramatically increasing complexity. So when we look back 10 years from now I don’t expect the overall reliability of computer systems to have improved. Doesn’t the fight to make computer systems reliable feel a lot like the fight to cure Cancer?
So the next time you wonder why computers aren’t more reliable, or try to explain it to a friend, keep the biological system analogy in mind. Because those are the rules computer systems are now following.
(Let me apologize before you start for the length of this blog entry. If it were a magazine article I’d spend hours more trying to edit it to perhaps half its current length. But this is a blog, and the thing about blogs is that they are usually stream of consciousness rather than highly thought through and edited. And when I stream my thoughts, well…)
One of my recent postings brought up a reply that essentially says “touch and gestures is old thinking, I want a speech-based user interface”. Ah, wouldn’t we all? Generalized speech recognition is one of the “Holy Grails” of Computer Science. I can still remember one of my friends returning from Carnegie Mellon University for a summer break in the mid-1970s and going on about how generalized speech recognition (he’d been working on Hearsay-II) was right around the corner. 35ish years later and we are still not quite there. I still pick on him about it. A couple of years ago I teased Microsoft’s Chief Research Officer (and former CMU professor), Rick Rashid, about this. Rick correctly pointed out that we have come a long way and that speech recognition is now entering widespread, if more targeted, use. So I’m going to talk about the evolution of computer User Interface, where we seem to be with speech, and why speech may never become the primary mode of computer interaction.
When it comes to direct human interaction with computers the way it all started was by taking existing tools and figuring out how to wire them up to the computer. We had typewriters, so by hooking a typewriter to the computer you could input commands and data to it and the computer could print its output. We had oscilloscopes, so by hooking one up to the computer we could output more graphical information. We had to create a language you talked to the computer in and those command line (aka command shell) languages became the primary means of interacting with computers in the 1960s, 70s, and 80s. Even today Linux, Windows, Mac OS, etc. all have command line languages and they are often used to perform more esoteric operations on the systems. The nice thing about command line languages is that they are dense and precise. The bad thing is that they are unnatural (requiring wizard-level experts who have trained on and utilized them for years).
These three attributes, density (how much information can be conveyed in a small space), precision (how unambiguous the information conveyed is), and naturalness (how well it matches the way humans think and work), can be used to evaluate any style of computer interaction. The ideal would be for interactions to be very dense, very precise, and very natural. The reality is that these three attributes work against one another, and so all interaction styles are a compromise.
As far back as the 1960s researchers were looking for a more natural style of computer interaction than command lines. And obviously Science Fiction writers were there too. For example, in the original Star Trek we see interactive graphic displays, tablet-style computers, sensor-based computers (e.g., the Tricorder), and computers with full speech recognition. Who can forget Teri Garr's amazement at seeing a speech-controlled typewriter in 1968's "Assignment: Earth" episode? Yet these were all truly science fiction at the time. Interestingly, Star Trek never showed use of a computer mouse, and in the Star Trek movie "The Voyage Home", when Scotty sees one he has no idea what it is. I find that interesting because the computer mouse was invented in 1963, although most people would never see one until the 1990s.
The command line world wasn't static and continued to evolve. As video terminals began to replace typewriter-style terminals (or "teletypes") they evolved from being little more than glass teletypes to being capable of displaying forms for data input and crude graphics for output. Some more human-oriented command languages, such as the Digital Command Language (DCL), appeared. Some command line processors (most notably that of DEC's TOPS-20) added auto-completion and in-line help, making command lines much easier for non-experts to use. Of all these, only Forms altered the basic Density, Precision, Naturalness equation, by allowing Task Workers (e.g., order entry clerks) to make use of computers. After all, filling out forms is something that humans have been doing for at least a couple of centuries.
In the 1960s and 1970s Stanford Research Institute's ARC and Xerox's PARC continued to work on better ways to interact with computers and produced what we now know as the Graphical User Interface (GUI), based on Windows, Icons, Menus, and Pointers (WIMP). While WIMP is far less dense than command line based systems, it maintains their precision. Density is still important, however, which is why keyboard shortcuts were added to Microsoft Windows. But most importantly, WIMP is far more natural to use than command lines because of the desktop paradigm and the visual cues it provides. It was GUI/WIMP that allowed computers to fully transition from the realm of computer specialists to "a computer on every desk and in every home".
Work continued on how to make computers even more natural to use. One of the first big attempts was Pen Computing and Handwriting Recognition, which had its roots in the 1940s (or as far back as 1888 if you want to stretch things). There was a big push to bring this style to the mainstream in the late 1980s and early 1990s, but it failed. High costs, poor handwriting recognition, and other factors kept pen computing from catching on. It was neither dense nor precise enough. This style enjoyed a bit of a renaissance in the late 1990s with the introduction of the Palm Pilot, which eschewed general handwriting recognition in favor of a stylized pen input technique known as Graffiti. The Palm Pilot was also a limited-function device, which allowed it to be well tuned for Pen use. This led to further use of a Pen (aka Stylus) in many PDAs and Smartphones. However, the more general purpose the platform (e.g., Smartphones or a PC) the more tedious (lack of density) Pen use became. In other words, the use of a Pen as just another pointer in a WIMP system was just not very interesting.
This finally brings us to the user interface paradigm that will dominate this decade, Touch and Gestures (Touch). Touchscreens have been around for many years, at least back to the 70s. But they generally had limited applicability (e.g., the check-in kiosk at the airport). When Apple introduced the iPhone, dropping WIMP and bypassing Pen Computing in favor of a Touch-based UI, it really did change the world. To be fair, Microsoft introduced these at the same time, but in a very limited production product known as Surface. So Apple gets the real credit. Touch trades away density and precision to achieve a massive leap in how natural it is for a human to interact with the computer. The tradeoff works really well for content consumption, but not for content creation. So WIMP, which is a great content creation paradigm, is likely to live on despite the rise of Touch. The place most users probably notice Touch's precision problems is when a series of links on a web page are stacked on top of one another. Your finger can't quite touch the right one (there it is, lack of precision). If you are lucky you can use a gesture to expand and then position the page so you can touch the right link (requiring more operations, which is less dense than WIMP would allow), but sometimes even this doesn't work. Now expand this to something like trying to draw a schematic, or a blueprint, and you can see the problems with Touch and why WIMP will continue to survive. For another example, consider how much easier it is to book a complex travel itinerary (tons of navigation and data input) on your PC versus doing the same on your iPad. It is one of the few activities where I feel compelled to put down my iPad and move to my PC. Writing this blog is another. Touch is great for quick high-level navigation to content you want to view. It is painful for performing precise and/or detailed input.
Speech-based user interface research dates back to the 1950s, but took off in the 1970s. You can really split this into Speech output and Speech recognition. As I pointed out earlier, the big joke here is that generalized speech recognition is always right around the corner. And has been for almost 40 years. But speech synthesis output has been commercially successful since the introduction of DECtalk in 1984. DECtalk was a huge hit, and 27 years later you can still hear "Perfect Paul" (or "Carlos" as he was known to WBCN listeners, which included so many DECies that most of us forgot the official name), DECtalk's default voice, from time to time. But what about Speech recognition?
If you own a Windows XP, Windows Vista, or Windows 7 PC then you have built-in speech recognition. Ditto for the last few versions of Office. How many of you know that? How many of you have tried it? How many use it on a regular basis? I'd love it if Microsoft would publish the usage statistics, but I already know they would indicate insignificant usage. My father used to call me up and say "hey, I saw a demo of this thing called Dragon that would let me write letters by just talking into the computer". He did this more than once, and each time I told him he had that capability in Microsoft Word, but to my knowledge he never actually tried it. I did meet a lawyer who threw away her tape recorder and began using Dragon NaturallySpeaking for dictation, but I think she was a special case. Frankly, in all the years I've heard about speech recognition she is the only layperson (or non-physically challenged person) I've met who uses it on such a general and regular basis. More on her situation later. Meanwhile my own attempts to use this feature demonstrated its weakness. It works great until you have to correct something; then its use becomes extremely tedious (lack of precision and density), and complex changes require the use of a pointing device (or, better put, you go back to WIMP).
It's not just that you can do dictation in Microsoft Word and other applications; you can also control your Microsoft Windows machine with speech. However, I can't see many people doing this, for two reasons. One is speech's lack of both density and precision. The other is that layering speech on top of a WIMP system makes everything about speech's lack of density and precision worse. File->Save As->… is just too tedious a command structure to navigate with speech. But the most important indictment of speech as the primary form of computer interaction is that it is far less natural than people assume.
Think about how annoying it is for someone to take a cell phone call in a restaurant. Or why do you suppose that most U.S. Airlines have decided not to install microcells on their planes so you can use your cell phone in flight (and even those with in-flight WiFi are blocking Skype and VOIP services)? And how proper is it for you to whip out your cell phone and take a call in the middle of a meeting? Or think about how hard it is to understand someone in a crowded bar, at a rock concert, in an amusement park, or on a manufacturing floor. Now imagine talking to your computer in those same circumstances. Your co-workers, fellow diners, or seatmates will want to clobber you if you sit around talking to your computer. And you will want to slit your own throat after a few experiences trying to get your computer to understand you in a noisy environment. Speech is a highly flawed communications medium that is made acceptable, in human to human interaction, by a set of compensating mechanisms that don’t exist in a human to computer interaction.
I recently read about a study that showed that in a human to human conversation comprehension rises dramatically when you can see the face of the person you are talking to. Our brains use lip-reading as a way to autocorrect what we are hearing. Now maybe our computers will eventually do that using their cameras, but today they are missing this critical clue. In a human to human interaction body language is also being used as a concurrent secondary communication channel along with speech. Computers don’t currently see this body language, nor could they merge it with the audio stream if they did. In human to human communications the lack of visual cues is what makes an audio conference so much less effective than a video conference, and a video conference so much less effective than an immersive experience like Cisco’s Telepresence system, and Telepresence somewhat less effective than in-person meetings. And when you are sitting in a meeting and need to say something to another participant you don’t speak to them, you slip them a note (or email, instant message, or txt them even though they are sitting next to you).
I use speech recognition on a regular basis in a few limited cases. One of the ones I marvel at is United Airlines' voice response system (VRS). It is almost flawless. In this regard it proves something we've long known: you can do generalized speech recognition (that is, where the system hasn't been trained to recognize an individual's voice) on a restricted vocabulary, or you can do individualized recognition on a broader vocabulary. For example, getting dictation to work requires that you spend 15 or more minutes training the software to recognize your voice. I imagine that specialized dictation (a la medical or legal) takes longer. United has a limited vocabulary and so it works rather well. My other current usage is Windows Phone 7's Bing search. I try to use speech recognition with it all the time, and it works maybe 70% of the time. There are two problems. The first is that if there is too much noise (e.g., other conversation) around me then it can't pick up what I'm saying. The bigger one is that if I say a proper noun it will often not come close to the word I'm trying to search on. Imagine all the weird autocorrect behaviors you've seen, on steroids. Autocorrect is a great way to think about speech recognition, because after the software converts raw sound into words that sound similar it uses dictionary lookups and grammatical analysis to guess at what the right words are. I suggest a visit to http://damnyouautocorrect.com/ for a humorous (and, warning, sometimes offensive) look at just how off course these techniques can take you.
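To make the restricted-vocabulary point a bit more concrete, here is a toy sketch in Java. It is purely illustrative and is not how any shipping recognizer works internally: it assumes the acoustic stage has already produced a rough, error-prone text guess, and simply snaps that guess to the closest phrase in a small fixed menu of commands. With only a handful of candidates even a sloppy guess lands on the right one; with an open vocabulary full of proper nouns there is nothing to snap to, which is the Bing search problem described above. The vocabulary and the sample guess are invented for the example.

```java
import java.util.List;

// Toy illustration of why a restricted vocabulary helps: given a rough,
// error-prone transcription from the acoustic stage, pick the closest
// phrase from a small set of allowed commands using edit distance.
public class RestrictedVocabularyDemo {
    // Hypothetical command menu of the kind an airline phone system might use.
    static final List<String> VOCABULARY = List.of(
            "flight status", "book a flight", "baggage", "agent");

    // Standard Levenshtein edit distance between two strings.
    static int editDistance(String a, String b) {
        int[][] d = new int[a.length() + 1][b.length() + 1];
        for (int i = 0; i <= a.length(); i++) d[i][0] = i;
        for (int j = 0; j <= b.length(); j++) d[0][j] = j;
        for (int i = 1; i <= a.length(); i++) {
            for (int j = 1; j <= b.length(); j++) {
                int cost = a.charAt(i - 1) == b.charAt(j - 1) ? 0 : 1;
                d[i][j] = Math.min(Math.min(d[i - 1][j] + 1, d[i][j - 1] + 1),
                                   d[i - 1][j - 1] + cost);
            }
        }
        return d[a.length()][b.length()];
    }

    // Pick the vocabulary entry closest to the rough guess.
    static String bestMatch(String roughGuess) {
        return VOCABULARY.stream()
                .min((x, y) -> Integer.compare(editDistance(roughGuess, x),
                                               editDistance(roughGuess, y)))
                .orElseThrow();
    }

    public static void main(String[] args) {
        // "flight stratus" is the kind of near-miss the acoustic stage produces.
        System.out.println(bestMatch("flight stratus")); // prints "flight status"
    }
}
```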
Let’s get to the bottom line. Speech has horrible precision, poor density, and there are social factors that make it natural in only certain situations.
So what is the future of speech? Well, first of all I think the point uses of it will continue to grow dramatically. Things like United Airlines' VRS. Or the lawyer I mentioned. She used to dictate into a tape recorder and then pay a transcription service to transcribe the tape. She would then go back over the transcript and make corrections. The reason that a switch to Dragon NaturallySpeaking worked for her is that the correction process took her no more time than did fixing the errors the transcription service introduced. And it was a lot cheaper to have Dragon do the initial transcription than to pay a service. So certainly there are niches where speech recognition will continue to make inroads.
The bigger future for speech is not as a standalone user interface technology but rather as part of a full human-to-humanoid style of interaction. I can say "play" or touch "play" to play a video. I can merge sensory inputs, just as humans do, to figure out what is really being communicated. I can use a keyboard and/or pointer when greater precision is required, just as humans grab whiteboards and other tools when they can't communicate with words and gestures alone. And I can project output on any display or speaker (the same one you use as your TV, your phone, a dedicated monitor, the display panel on your oven, the speakers on your TV or audio components, etc.). This is the totality of a Natural User Interface (NUI). Speech doesn't become truly successful as a user interface paradigm of its own. It shines as part of the NUI that will dominate the next decade.
I really think it will take another 8-10 years for a complete multi-sensor NUI (née Humanoid UI) to become standard fare, but Microsoft has certainly kicked off the move with the introduction of Kinect. It's primitive, but it's the best prototype of the future of computing that most of us can get our hands on. Soon we'll be seeing it on PCs, Tablets, and Phones. And a decade from now we'll all be wondering how we ever lived without it.
Jason Garms, the Group Program Manager at Microsoft responsible for Windows 8's security features, has written an overview of Windows 8's added malware protection. If you are on the techie side then it's a great read, but otherwise your eyes will probably glaze over. So I'll do a little bit of a summary for those who are curious, but if this is a topic of deep interest then I highly recommend reading Jason's blog entry.
First let's get the part that might make your eyes glaze over out of the way. Malware authors are often trying to exploit a vulnerability (i.e., a flaw) to install their malware on your system. There are things (known as mitigations) you can do in software that make it very difficult to exploit any vulnerabilities they may find. Microsoft started introducing these techniques in Windows XP SP2 and has been expanding them in each release since. This is a key reason why, for example, Windows 7 is so much less subject to malware than Windows XP. And Windows 8 contains yet another set of major mitigation improvements.
Last month the Microsoft SQL Server team effectively sounded the death knell for Microsoft’s OLE DB. I say “effectively” because while SQL Server isn’t the only implementor of OLE DB, it is (or rather was) Microsoft’s flagship for this data access technology. Since I was both a godfather of the OLE DB strategy and responsible for the SQL Server implementations that have now been deprecated I thought now would be a good time to reveal the overall strategy and why it never succeeded as envisioned.
Before we time travel into OLE DB’s origins let’s survey the current state of data access in SQL Server. The first person who contacted me after Microsoft announced SQL Server’s deprecation of OLE DB basically said “there goes Microsoft, changing data access strategies again”. Well, Microsoft does indeed have a history of replacing its data access APIs all too frequently. But the truth is that OLE DB has been with us for 15 years and, although deprecated, will be supported for another 7. 22 years is a heck of a long lifespan for a technology, particularly one that was only partially successful. And the truth is that OLE DB is well past its prime, with many other data access technologies (both older and newer) in far greater use.
The reason Microsoft has seemingly changed horses so often on the data access front is the rapid evolution that the market has demanded. Initially SQL Server used the DB-Library API that Sybase had invented (and then deprecated around the time Microsoft and Sybase ended their relationship). Microsoft had come up with ODBC as a way for Excel to import data from various data sources, but the primary client database actually in use at the time was the JET database inside Microsoft Access and Visual Basic. A programming model called DAO was provided to access JET. DAO could access SQL Server and databases supporting ODBC, but only through JET's distributed query processor (RJET/QJET), making that access slow. For SQL Server 6.0 Microsoft created a native ODBC driver for SQL Server to replace DB-Library, and for Visual Basic 4 the Developer Division introduced RDO as an object model that lived directly on top of ODBC and thus didn't have to go through RJET/QJET. RDO/ODBC quickly became the native and preferred data access story for applications written with Microsoft technology. When OLE DB came along we introduced ADO as the object model directly on top of OLE DB. With the introduction of .NET we needed an object model that was optimized for the .NET world (which could have been just a minor evolution of ADO) but, more importantly, one that was specifically tuned for the Internet. This latter requirement was one of the factors that led the ADO.NET team to create their own data provider model, one of which could be a connector to OLE DB data sources. But for optimal performance they chose to implement a data provider that natively spoke SQL Server's TDS network protocol. Later programming advances, such as LINQ and the Entity Framework, also use the ADO.NET native SQL Server data provider. During the development of SQL Server 2000 it became apparent that SQL Server was at a huge disadvantage when our customers chose to build applications using Java with either IBM's WebSphere or BEA's WebLogic, because we didn't have a JDBC driver. I initiated an effort to add a Microsoft-supported JDBC driver to SQL Server's bag of tricks. More recently a PHP driver, which actually layers on top of ODBC, was added to SQL Server's supported data access methods. So for nearly a decade now the primary ways to write applications that access SQL Server have NOT involved OLE DB! No wonder the SQL Server team feels comfortable deprecating it.
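To make the point concrete, here is a minimal sketch of the kind of non-OLE DB access path described above, using the Microsoft JDBC Driver for SQL Server. It assumes the driver jar is on the classpath; the server, database, credentials, and the dbo.Orders table are placeholders invented for the example, not anything from the original post.

```java
import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.PreparedStatement;
import java.sql.ResultSet;

// Minimal sketch: accessing SQL Server through JDBC rather than OLE DB.
// Connection string values and table names are illustrative placeholders.
public class JdbcSketch {
    public static void main(String[] args) throws Exception {
        String url = "jdbc:sqlserver://localhost:1433;databaseName=Orders;"
                   + "encrypt=true;trustServerCertificate=true";
        try (Connection conn = DriverManager.getConnection(url, "appUser", "appPassword");
             PreparedStatement stmt = conn.prepareStatement(
                     "SELECT OrderId, Total FROM dbo.Orders WHERE CustomerId = ?")) {
            stmt.setInt(1, 42); // parameterized query, no string concatenation
            try (ResultSet rs = stmt.executeQuery()) {
                while (rs.next()) {
                    System.out.println(rs.getInt("OrderId") + " " + rs.getBigDecimal("Total"));
                }
            }
        }
    }
}
```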
With that background out of the way, let's time travel back and look at the real OLE DB strategy and why it never achieved its goals. When I joined Microsoft in April of 1994 there was already an OLE DB effort underway. In fact, the very first meeting I remember attending was an evening meeting of a design team working on the OLE DB spec. It was an evening meeting because all the participants also had "day jobs" that their management pressured them to work on. The key driver of OLE DB at the time was the Cairo Object File System (OFS). Soon thereafter we'd press the reset button and assign a pair of newly hired Partner Architects to the OLE DB effort as their day jobs. OFS, though still a factor for a while, soon departed the scene. With OLE DB we were trying to accomplish two things. One was a vision of Universal Data Access that went beyond relational databases; the other was the idea of a componentized DBMS. OLE DB was to partially succeed at the first, but fail horribly at the second.
It is hard to remember that back in the early 90s, when things like OFS and OLE DB were conceived, relational databases were still in their youth and not very widely accepted. Most corporate data was still stuck in hierarchical (IMS) and network (Codasyl) databases or flat files (VSAM, RMS). Vast amounts of data were stored on the desktop, usually in tools like Microsoft Excel. The most popular data store for applications on the desktop was Btrieve, an ISAM-type offering. Microsoft also realized that email, then still in its infancy, would turn out to be the largest information store of all. Microsoft Exchange was envisioned as a mail system on top of OFS, but ultimately implemented its own (again distinctly non-relational) store. And many people thought that Object Databases were the wave of the future, though ultimately they never achieved much success outside the CAD/CAM/CAE world. So it seemed clear that Microsoft needed a data access strategy that would work across the relational, object, and legacy data worlds.
One proposal was to extend ODBC to handle the new requirements; however, this approach was ultimately rejected. Since this was before my time I don't know exactly what happened, but what I recall being told was that they tested the extensions out with other companies involved with ODBC and found significant resistance to them. Deciding that if the industry wasn't going to accept the idea of extending ODBC they might as well go for a more optimal solution, Microsoft went down the path that led to OLE DB.
Beyond better legacy data access, the fact that Microsoft was working on non-relational stores makes it kind of obvious why we thought we needed OLE DB. But we can take it to another level: we thought that even in the relational world we would evolve to add more object and navigation capabilities. And we would implement this by creating a componentized DBMS that let an application use various capabilities depending on its needs. There would be a Storage Engine, a Query Processor, and one or more Navigation Engines. In the most primitive form the Navigation Engine would implement SQL's Cursors, but in an extended form it would be an in-memory database that projected data as objects you could access via pointer chasing (a la Object Databases). An application could go against data in the database directly with ISAM-style access against the Storage Engine, or it could use the Query Processor to access data in the Storage Engine or other stores, or it could use a Navigation Engine (or other not-yet-conceived components) for additional capabilities. It was this strategy that really drove Microsoft down the OLE DB path, and this strategy that never came to fruition.
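To visualize the layering, here is a purely hypothetical sketch of the component boundaries that strategy envisioned. None of these interfaces existed under these names, and the real contracts were OLE DB COM interfaces rather than Java; the sketch only shows how an application might have bound at whichever layer it needed.

```java
// Purely illustrative sketch of the componentized DBMS idea described above.
// These interface names and methods are invented for illustration; the real
// design expressed the boundaries as OLE DB COM interfaces.
import java.util.Iterator;
import java.util.Map;

// Tabular data flowing between components (an OLE DB "Rowset" analogue).
interface Rowset extends Iterator<Map<String, Object>> {}

interface StorageEngine {
    Rowset scan(String table);                       // ISAM-style direct access
    void insert(String table, Map<String, Object> row);
}

interface QueryProcessor {
    // Plans over one or more Rowset sources, local (StorageEngine) or external.
    Rowset execute(String sql);
}

interface NavigationEngine {
    // Cursor / object-navigation layer projecting rows as navigable objects.
    Object materializeAsObjects(Rowset rows);
}
```

An application could, in principle, bind at any of these layers: raw StorageEngine access, full SQL through the QueryProcessor, or pointer-style navigation through a NavigationEngine sitting above either of them. That optionality is exactly what never materialized.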
By 1994 Microsoft had realized it wanted to be a serious contender in the database arena but had not finalized its strategy for doing so. In preparation for this the company had negotiated a split with Sybase giving us rights to an older version of their source code, joint ownership of things like the TDS protocol, and freedom (for both parties) to pursue our respective strategies. While the SQL Server team had launched an effort to build the first independently developed version of the product (SQL95, aka SQL Server 6.0), there was tremendous debate going on around how to proceed in the long term. The organization behind the JET database engine in Access (also known as JET-RED) had embarked on an effort to create a new JET-compatible server database known as JET-BLUE (fyi, it is JET-BLUE that is used in Microsoft Exchange and not JET-RED; most people just say JET and don't realize they are different). However, there was no query processor being built to work with JET-BLUE and no customer for it other than Microsoft Exchange. The Exchange team, faced with delays in OFS, had opted to build their own interim store using JET-BLUE for the low-level database capabilities. This "interim" store is still in use today. The discussion throughout 1994 was: do we take JET-BLUE and build a full new RDBMS around it, or do we start with SQL Server and basically gut it and replace its insides while maintaining a highly compatible exterior? There was a lot of back and forth, but ultimately we decided that if we were going to succeed in the Enterprise then evolving the SQL Server product was the more likely route to success (because we had a large customer base and it was popular with Enterprise ISVs). This didn't sit well with the SQL Server team, because they realized we were forcing them down a risky re-write path when they preferred a more straightforward evolution of their code base. And it really didn't sit well with the JET-BLUE team, whose leader (one of Microsoft's earliest employees) made one last appeal to Bill Gates before the strategy was finalized. As everyone now realizes, the strategy to go with SQL Server was chosen, and it did succeed. But it ultimately doomed the vision of a componentized DBMS.
Work started in 1994 on design for a Query Processor (QP) for the new componentized DBMS, and after we made the decision that SQL Server would be our future product focus we moved the QP team to the SQL Server organization. But it wasn’t until we shipped SQL Server 6.5, a minor update to SQL95, in the spring of 1996 that work on re-architecting and re-writing SQL Server got fully underway. The internal architecture was to follow the componentized DBMS idea and use OLE DB to connect its components. While it largely did this, the choice to gut and build on an existing product introduced some realities that hadn’t really been anticipated in the original plan.
There were two factors that the componentized DBMS idea hadn't fully taken into account. The first was rapid innovation in Query Processor technology that made the then state-of-the-industry split of responsibilities between the Storage Engine and Query Processor obsolete. The second was that it didn't account for all the aspects of a traditional Relational Engine that didn't fall into the Query Processor or Navigation Engine categories. For example, OLE DB said nothing about management interfaces across the components. Two other factors would also come into play. The first was that we couldn't rewrite all of SQL Server in a single release, and so we'd have to maintain some legacy code that violated the new architecture. The second was that to maintain compatibility with older versions of SQL Server, and to exceed their performance, we'd have to violate the architecture. Although we thought both of these latter factors would be temporary, they ultimately contributed greatly to the abandonment of the componentized DBMS idea.
As we re-built SQL Server (Sphinx, later known as SQL Server 7.0) we did use OLE DB internally. The Query Processor and Storage Engine do talk to one another using it. In one regard this architecture proved out nicely in that we were able to add Heterogeneous Query directly into the product. There is a lot of unique connection and metadata management work, but once you use OLE DB to materialize a Rowset from an external data source the Query Processor can just party on the data without regard to whether it came from the Storage Engine or an external source. But that is about the only part of using OLE DB internally that I can point to as a success. For all the things that OLE DB doesn't cover we had to use a lot of private interfaces to talk between the Storage and Relational engines. And then there is that rapidly evolving Query Processor thing. It turned out we could never allow access directly to the Storage Engine (SE) because the QP ended up (for performance reasons) taking over responsibilities that had previously been in the SE. For example, the maintenance of Referential Integrity. Or, for a later example, materialization of the Inserted and Deleted tables used by Triggers. We'd debated doing this in SQL Server 7.0, but for expediency went with the traditional solution, in which these virtual tables are created by having the Storage Engine scan backwards through the log file. However, as you get to higher performance systems the log becomes your key bottleneck, and so removing these backward scans is an important performance boost. So now the Query Processor could look to see if an update would cause a Trigger to fire and build the creation of the Inserted and Deleted tables into the Query Plan along the way. But this also meant that you could never allow an application to directly update a table through the Storage Engine if a Trigger existed. As we put all our energy into building SQL Server 7.0 and then SQL Server 2000, these various realities pushed the componentized DBMS idea further and further away from ever becoming a reality.
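For readers who have never seen the heterogeneous query capability from the outside, here is an illustrative sketch of the kind of statement it enables: the query processor joins a local table against a rowset materialized through an OLE DB provider via OPENROWSET. The provider name, servers, tables, and credentials are placeholders, the statement is executed here over JDBC only for continuity with the earlier sketch, and running something like this typically requires the server's ad hoc distributed query option to be enabled; treat it as a sketch rather than a recipe.

```java
import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.ResultSet;
import java.sql.Statement;

// Illustrative only: a heterogeneous query where an external rowset,
// materialized through an OLE DB provider, is joined like a local table.
public class HeterogeneousQuerySketch {
    public static void main(String[] args) throws Exception {
        String sql =
            "SELECT o.OrderId, r.Region " +
            "FROM dbo.Orders AS o " +
            "JOIN OPENROWSET('MSOLEDBSQL', " +                     // placeholder provider
            "     'Server=RemoteServer;Database=Sales;Trusted_Connection=yes;', " +
            "     'SELECT OrderId, Region FROM dbo.Regions') AS r " +
            "  ON o.OrderId = r.OrderId";
        String url = "jdbc:sqlserver://localhost:1433;databaseName=Orders;"
                   + "encrypt=true;trustServerCertificate=true";
        try (Connection conn = DriverManager.getConnection(url, "appUser", "appPassword");
             Statement stmt = conn.createStatement();
             ResultSet rs = stmt.executeQuery(sql)) {
            while (rs.next()) {
                System.out.println(rs.getInt("OrderId") + " " + rs.getString("Region"));
            }
        }
    }
}
```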
It wasn't just SQL Server's difficulty with implementing the concept that caused the death of the componentized DBMS plan, it was also its success in the market and the acceptance of relational databases throughout the industry. Basically it became more important to have a great RDBMS, and extend it as needed, than to componentize. Without spending any more time on this point, let's just leave it with the idea that this key force behind OLE DB was effectively dead.
So what of the Universal Data Access part of the OLE DB strategy? Well, that was more successful but it also had three flaws, plus it had run its course. It ran its course because the success of relational databases meant that the need to access other database sources diminished fairly rapidly. One flaw is related to the componentized database flaw: the non-relational database solutions that OLE DB envisioned never really took off. Another is that it was too tied to the COM world, and thus anyone who had to target platforms other than Windows couldn't fully embrace it. And the final one is that interop vendors basically followed the same strategy with OLE DB that they had followed with ODBC. They put an OLE DB interface on a lightweight Query Processor and then used a proprietary interface between their Query Processor and drivers for different data sources. It was their QP and Drivers that turned every data source into a relational data source, thus eliminating any differentiation between OLE DB and ODBC for accessing that data. OLE DB's unique advantages in this space were thus never fully exploited.
OLE DB has had other issues during its life, of course. It is a rather complicated architecture, partially because it was intended to do so much and partially because it was designed by a committee and then had to be rapidly reworked. The architect who did a lot of that rework admitted to me years later that it was the one piece of work in his career he was embarrassed about. Also, a lot of the OLE DB components we shipped as part of a package called MDAC were caught up in multiple controversies, such as a lockdown of who could update things shipped in Windows. We wasted a lot of time and effort trying to figure out how and when updates to OLE DB could ship, how to maintain compatibility between versions, etc. But I think these tactical issues account for far less of OLE DB's limited success than the failure of our original strategic imperatives to take hold. Without those, OLE DB became a solution looking for a problem.
The second big change is the expansion of the built-in Windows Defender into a more complete anti-Malware solution. Jason revealed that when Windows 7 shipped the telemetry Microsoft was seeing indicated that close to 100% of systems had up-to-date anti-malware, but that a year later at least 27% did not. This is likely because many people do not pay for subscriptions to the anti-malware software pre-installed by computer manufacturers once the trial subscription runs out. Windows Defender addresses this problem.
A really exciting development is the inclusion of Application Reputation in Windows 8 itself. This feature first appeared in Internet Explorer 9's SmartScreen and has now been extended to any file that is downloaded from the Internet (via other browsers, for example) and then run. If the file has a known good reputation then Windows lets it run. If it does not have an established reputation then Windows warns you that it is risky to run. You will now see fewer warnings than in the past (Microsoft estimates that typical users will see only 2 warnings a year), and you should take those warnings very seriously.
The last set of changes Jason talks about are changes to how Windows boots, which protect against newer types of malware called Bootkits and Rootkits. One technique malware authors have begun targeting is installing their malware so that it runs before any anti-malware software is started, i.e., somewhere between when you press the power button and when you log on to Windows. If malware can take control during this period then it can hide from, or disable, anti-malware software. Microsoft has secured this path, particularly when you are using a new PC that includes the latest firmware implementing "Secure Boot". I can't tell you how many conversations I've been in with security experts where the summary has been "we can't really tell if a computer is healthy because the boot path is vulnerable". With Windows 8 (and modern computers) that will no longer be true.
That's the summary of Windows 8's malware-protection improvements. For more details please see Jason's blog posting.