Mostrando entradas con la etiqueta Software. Mostrar todas las entradas
Mostrando entradas con la etiqueta Software. Mostrar todas las entradas

10/4/11

Microsoft: Happy 36th birthday!

The company's story is an important part of Americana. But how much do you really know about it?
By Microsoft Subnet on Mon, 04/04/11 - 4:24pm.

Microsoft was founded on April 4, 1975, as a partnership between Bill Gates and Paul Allen. The company's history is an embedded part of Americana. Its leaders are household names. Its products grace just about every household in the land. It's story is the stuff of American legend and myth (a scrappy startup that turned into an international powerhouse).

Smile worthy too: Steve Ballmer as emoticons

But how much do you really know about the legendary software maker? Here's a quiz to test you. (Answers can be found on page 2 of this article.)

1. In what city and state was Microsoft founded? a. Bellevue, Wash.
b. Redmond, Wash.
c. Albuquerque, N.M
d. Tucson, Ariz.

2. How old were Bill Gates and Paul Allen when they founded Microsoft?'
a. Gates was 19. Allen was 22.
b. Gates was 17; Allen was 26.
c. Gates was 24; Allen was 30.
d. Gates was 26; Allen was 19.

3. On November 2, 2001, Microsoft and the Department of Justice came to an agreement on the DOJ's antitrust lawsuit against Microsoft. What was the product that originally sparked the lawsuit?
a. DOS
b. Excel
c. Internet Explorer
d. Windows

4. In what year did Steve Ballmer join Microsoft?
a. 1990
b. 2000
c. 1965
d. 1980

5. What year was the flagship Windows 3.0 released?
a. 1984
b. 2000
c. 1988
d. 1990

Answers:

1. C. Microsoft was founded in Albuquerque, New Mexico. It didn't move to Washington until 1979 (Bellevue). It moved to its Redmond HQ's in 1986.

2. A. The teenaged Gates was paired with a bushy-bearded Allen. Gates was only 19 but looked like he was 15. Allen was 22 but looked about 35.

3. C. Internet Explorer was the cause for the DOJ case against Microsoft in a case that began in 1998. Not only was it argued that bundling IE for free with Windows gave it an unfair advantage in the browser market, but that Microsoft fiddled with Windows to make IE perform better than third-party browsers. After extensions, government oversight of Microsoft as a result of the case is set to expire in May.

4. D. Steve Ballmer joined Microsoft in 1980 in an executive operations role. He was responsible for personnel, finance, and legal areas of the business. Although he became CEO in 2000, Bill Gates didn't retire from day-to-day operations until 2008, so Ballmer has only had solo reign of the company since that time. In 1980, Microsoft had year-end sales of $8M and 40 employees.

5. D. Windows 3.0 was released in 1990 and was the first wildly successful version of Windows. This version of Windows was the first to be pre-installed on hard drives by PC-compatible manufacturers. Two years later, Microsoft would release Windows 3.1. Together, these two versions of Windows would sell 10 million copies in their first two years. When Windows 95 was to launch in 1995, people stood in lines to buy their copy. Sphere: Related Content

13/3/11

How The Kernel Manages Your Memory

After examining the virtual address layout of a process, we turn to the kernel and its mechanisms for managing user memory. Here is gonzo again:
mm_struct
Linux processes are implemented in the kernel as instances of task_struct, the process descriptor. The mm field in task_struct points to the memory descriptor, mm_struct, which is an executive summary of a program’s memory. It stores the start and end of memory segments as shown above, the number of physical memory pages used by the process (rss stands for Resident Set Size), the amount of virtual address space used, and other tidbits. Within the memory descriptor we also find the two work horses for managing program memory: the set of virtual memory areas and the page tables. Gonzo’s memory areas are shown below:
Memory Descriptor and Memory Areas
Each virtual memory area (VMA is a contiguous range of virtual addresses; these areas never overlap. An instance of vm_area_struct fully describes a memory area, including its start and end addresses, flags to determine access rights and behaviors, and the vm_file field to specify which file is being mapped by the area, if any. A VMA that does not map a file is anonymous. Each memory segment above (e.g., heap, stack) corresponds to a single VMA, with the exception of the memory mapping segment. This is not a requirement, though it is usual in x86 machines. VMAs do not care which segment they are in.

A program’s VMAs are stored in its memory descriptor both as a linked list in the mmap field, ordered by starting virtual address, and as a red-black tree rooted at the mm_rb field. The red-black tree allows the kernel to search quickly for the memory area covering a given virtual address. When you read file /proc/pid_of_process/maps, the kernel is simply going through the linked list of VMAs for the process and printing each one.

In Windows, the EPROCESS block is roughly a mix of task_struct and mm_struct. The Windows analog to a VMA is the Virtual Address Descriptor (VAD); they are stored in an AVL tree. You know what the funniest thing about Windows and Linux is? It’s the little differences.

The 4GB virtual address space is divided into pages. x86 processors in 32-bit mode support page sizes of 4KB, 2MB, and 4MB. Both Linux and Windows map the user portion of the virtual address space using 4KB pages. Bytes 0-4095 fall in page 0, bytes 4096-8191 fall in page 1, and so on. The size of a VMA must be a multiple of page size. Here’s 3GB of user space in 4KB pages:
Paged Virtual Space
The processor consults page tables to translate a virtual address into a physical memory address. Each process has its own set of page tables; whenever a process switch occurs, page tables for user space are switched as well. Linux stores a pointer to a process’ page tables in the pgd field of the memory descriptor. To each virtual page there corresponds one page table entry (PTE) in the page tables, which in regular x86 paging is a simple 4-byte record shown below:
x86 Page Table Entry 4KB
Linux has functions to read and set each flag in a PTE. Bit P tells the processor whether the virtual page is present in physical memory. If clear (equal to 0), accessing the page triggers a page fault. Keep in mind that when this bit is zero, the kernel can do whatever it pleases with the remaining fields. The R/W flag stands for read/write; if clear, the page is read-only. Flag U/S stands for user/supervisor; if clear, then the page can only be accessed by the kernel. These flags are used to implement the read-only memory and protected kernel space we saw before.

Bits D and A are for dirty and accessed. A dirty page has had a write, while an accessed page has had a write or read. Both flags are sticky: the processor only sets them, they must be cleared by the kernel. Finally, the PTE stores the starting physical address that corresponds to this page, aligned to 4KB. This naive-looking field is the source of some pain, for it limits addressable physical memory to 4 GB. The other PTE fields are for another day, as is Physical Address Extension.

A virtual page is the unit of memory protection because all of its bytes share the U/S and R/W flags. However, the same physical memory could be mapped by different pages, possibly with different protection flags. Notice that execute permissions are nowhere to be seen in the PTE. This is why classic x86 paging allows code on the stack to be executed, making it easier to exploit stack buffer overflows (it’s still possible to exploit non-executable stacks using return-to-libc and other techniques). This lack of a PTE no-execute flag illustrates a broader fact: permission flags in a VMA may or may not translate cleanly into hardware protection. The kernel does what it can, but ultimately the architecture limits what is possible.

Virtual memory doesn’t store anything, it simply maps a program’s address space onto the underlying physical memory, which is accessed by the processor as a large block called the physical address space. While memory operations on the bus are somewhat involved, we can ignore that here and assume that physical addresses range from zero to the top of available memory in one-byte increments. This physical address space is broken down by the kernel into page frames. The processor doesn’t know or care about frames, yet they are crucial to the kernel because the page frame is the unit of physical memory management. Both Linux and Windows use 4KB page frames in 32-bit mode; here is an example of a machine with 2GB of RAM:
Physical Address Space
In Linux each page frame is tracked by a descriptor and several flags. Together these descriptors track the entire physical memory in the computer; the precise state of each page frame is always known. Physical memory is managed with the buddy memory allocation technique, hence a page frame is free if it’s available for allocation via the buddy system. An allocated page frame might be anonymous, holding program data, or it might be in the page cache, holding data stored in a file or block device. There are other exotic page frame uses, but leave them alone for now. Windows has an analogous Page Frame Number (PFN) database to track physical memory.

Let’s put together virtual memory areas, page table entries and page frames to understand how this all works. Below is an example of a user heap:
Heap Mapped
Blue rectangles represent pages in the VMA range, while arrows represent page table entries mapping pages onto page frames. Some virtual pages lack arrows; this means their corresponding PTEs have the Present flag clear. This could be because the pages have never been touched or because their contents have been swapped out. In either case access to these pages will lead to page faults, even though they are within the VMA. It may seem strange for the VMA and the page tables to disagree, yet this often happens.

A VMA is like a contract between your program and the kernel. You ask for something to be done (memory allocated, a file mapped, etc.), the kernel says “sure”, and it creates or updates the appropriate VMA. But it does not actually honor the request right away, it waits until a page fault happens to do real work. The kernel is a lazy, deceitful sack of scum; this is the fundamental principle of virtual memory. It applies in most situations, some familiar and some surprising, but the rule is that VMAs record what has been agreed upon, while PTEs reflect what has actually been done by the lazy kernel. These two data structures together manage a program’s memory; both play a role in resolving page faults, freeing memory, swapping memory out, and so on. Let’s take the simple case of memory allocation:
Heap Allocation
When the program asks for more memory via the brk() system call, the kernel simply updates the heap VMA and calls it good. No page frames are actually allocated at this point and the new pages are not present in physical memory. Once the program tries to access the pages, the processor page faults and do_page_fault() is called. It searches for the VMA covering the faulted virtual address using find_vma(). If found, the permissions on the VMA are also checked against the attempted access (read or write). If there’s no suitable VMA, no contract covers the attempted memory access and the process is punished by Segmentation Fault.

When a VMA is found the kernel must handle the fault by looking at the PTE contents and the type of VMA. In our case, the PTE shows the page is not present. In fact, our PTE is completely blank (all zeros), which in Linux means the virtual page has never been mapped. Since this is an anonymous VMA, we have a purely RAM affair that must be handled by do_anonymous_page(), which allocates a page frame and makes a PTE to map the faulted virtual page onto the freshly allocated frame.

Things could have been different. The PTE for a swapped out page, for example, has 0 in the Present flag but is not blank. Instead, it stores the swap location holding the page contents, which must be read from disk and loaded into a page frame by do_swap_page() in what is called a major fault.

This concludes the first half of our tour through the kernel’s user memory management. In the next post, we’ll throw files into the mix to build a complete picture of memory fundamentals, including consequences for performance. Sphere: Related Content

29/11/10

10 mistakes every programmer makes

Admit it, you've made mistakes like these…
By Julian M Bucknall

When you start programming, you get disillusioned quickly. No longer is the computer the allinfallible perfect machine – "do as I mean, not as I say" becomes a frequent cry.

At night, when the blasted hobgoblins finally go to bed, you lie there and ruminate on the errors you made that day, and they're worse than any horror movie. So when the editor of PC Plus asked me to write this article, I reacted with both fear and knowing obedience.

I was confident that I could dash this off in a couple of hours and nip down to the pub without the usual resultant night terrors. The problem with such a request is, well, which language are we talking about?

I can't just trot out the top 10 mistakes you could make in C#, Delphi, JavaScript or whatever – somehow my top ten list has to encompass every language. Suddenly, the task seemed more difficult. The hobgoblins started cackling in my head. Nevertheless, here goes…

1. Writing for the compiler, not for people
When they use a compiler to create their applications, people tend to forget that the verbose grammar and syntax required to make programming easier is tossed aside in the process of converting prose to machine code.

A compiler doesn't care if you use a single-letter identifier or a more human-readable one. The compiler doesn't care if you write optimised expressions or whether you envelop sub-expressions with parentheses. It takes your human-readable code, parses it into abstract syntax trees and converts those trees into machine code, or some kind of intermediate language. Your names are by then history.

So why not use more readable or semantically significant identifiers than just i, j or x? These days, the extra time you would spend waiting for the compiler to complete translating longer identifiers is minuscule. However, the much-reduced time it takes you or another programmer to read your source code when the code is expressly written to be self-explanatory, to be more easily understandable, is quite remarkable.

Another similar point: you may have memorised the operator precedence to such a level that you can omit needless parentheses in your expressions, but consider the next programmer to look at your code. Does he? Will he know the precedence of operators in some other language better than this one and thereby misread your code and make invalid assumptions about how it works?

Personally, I assume that everyone knows that multiplication (or division) is done before addition and subtraction, but that's about it. Anything else in an expression and I throw in parentheses to make sure that I'm writing what I intend to write, and that other people will read what I intended to say.

The compiler just doesn't care. Studies have shown that the proportion of some code's lifecycle spent being maintained is easily five times more than was spent initially writing it. It makes sense to write your code for someone else to read and understand.

2. Writing big routines

Back when I was starting out, there was a rule of thumb where I worked that routines should never be longer than one printed page of fan-fold paper – and that included the comment box at the top that was fashionable back then. Since then, and especially in the past few years, methods tend to be much smaller – merely a few lines of code.

In essence, just enough code that you can grasp its significance and understand it in a short time. Long methods are frowned upon and tend to be broken up.

The reason is extremely simple: long methods are hard to understand and therefore hard to maintain. They're also hard to test properly. If you consider that testing is a function of the number of possible paths through a method, the longer the method, the more tests you'll have to write and the more involved those tests will have to be.

There's actually a pretty good measurement you can make of your code that indicates how complex it is, and therefore how probable it is to have bugs – the cyclomatic complexity.

Developed by Thomas J. McCabe Sr in 1976, cyclomatic complexity has a big equation linked to it if you're going to run through it properly, but there's an easy, basic method you can use on the fly. Just count the number of 'if' statements and loops in your code. Add 1 and this is the CC value of the method.

It's a rough count of the number of execution paths through the code. If your method has a value greater than 10, I'd recommend you rewrite it.

3. Premature optimisation
This one's simple. When we write code, sometimes we have a niggling devil on our shoulder pointing out that this clever code would be a bit faster than the code you just wrote. Ignore the fact that the clever code is harder to read or harder to comprehend; you're shaving off milliseconds from this loop. This is known as premature optimisation.

The famous computer scientist Donald Knuth said, "We should forget about small efficiencies, say about 97 per cent of the time: premature optimisation is the root of all evil."

In other words: write your code clearly and cleanly, then profile to find out where the real bottlenecks are and optimise them. Don't try to guess beforehand.

4. Using global variables
Back when I started, lots of languages had no concept of local variables at all and so I was forced to use global variables. Subroutines were available and encouraged but you couldn't declare a variable just for use within that routine – you had to use one that was visible from all your code. Still, they're so enticing, you almost feel as if you're being green and environmentally conscious by using them. You only declare them once, and use them all over the place, so it seems you're saving all that precious memory.

But it's that "using all over the place" that trips you up. The great thing about global variables is that they're visible everywhere. This is also the worst thing about global variables: you have no way of controlling who changes it or when the variable is accessed. Assume a global has a particular value before a call to a routine and it may be different after you get control back and you don't notice.

Of course, once people had worked out that globals were bad, something came along with a different name that was really a global variable in a different guise. This was the singleton, an object that's supposed to represent something of which there can only be one in a given program.

A classic example, perhaps, is an object that contains information about your program's window, its position on the screen, its size, its caption and the like. The main problem with the singleton object is testability. Because they are global objects, they're created when first used, and destroyed only when the program itself terminates. This persistence makes them extremely difficult to test.

Later tests will be written implicitly assuming that previous tests have been run, which set up the internal state of the singleton. Another problem is that a singleton is a complex global object, a reference to which is passed around your program's code. Your code is now dependent on some other class.

Worse than that, it's coupled to that singleton. In testing, you would have to use that singleton. Your tests would then become dependent on its state, much as the problem you had in testing the singleton in the first place. So, don't use globals and avoid singletons.

5. Not making estimates
You're just about to write an application. You're so excited about it that you just go ahead and start designing and writing it. You release and suddenly you're beset with performance issues, or out-of-memory problems.

Further investigations show that, although your design works well with small number of users, or records, or items, it does not scale – think of the early days of Twitter for a good example. Or it works great on your super-duper developer 3GHz PC with 8GB of RAM and an SSD, but on a run-of-the-mill PC, it's slower than a Greenland glacier in January.

Part of your design process should have been some estimates, some back-back-of- the-envelope calculations. How many simultaneous users are you going to cater for? How many records? What response time are you targeting?

Try to provide estimates to these types of questions and you'll be able to make further decisions about techniques you can build into your application, such as different algorithms or caching. Don't run pell-mell into development – take some time to estimate your goals.

6. Off by one
This mistake is made by everyone, regularly, all the time. It's writing a loop with an index in such a way that the index incremented once too often or once too little. Consequently, the loop is traversed an incorrect number of times.

If the code in the loop is visiting elements of an array one by one, a non-existent element of the array may be accessed – or, worse, written to – or an element may be missed altogether. One reason why you might get an off-by one error is forgetting whether indexes for array elements are zero-based or one-based.

Some languages even have cases where some object is zero-based and others where the assumption is one-based. There are so many variants of this kind of error that modern languages or their runtimes have features such as 'foreach loops' to avoid the need to count through elements of an array or list.

Others use functional programming techniques called map, reduce and filter to avoid the need to iterate over collections. Use modern 'functional' loops rather than iterative loops.

7. Suppressing exceptions
Modern languages use an exception system as an error-reporting technique, rather than the old traditional passing and checking of error numbers. The language incorporates new keywords to dispatch and trap exceptions, using names such as throw, try, finally and catch.

The remarkable thing about exceptions is their ability to unwind the stack, automatically returning from nested routines until the exception is trapped and dealt with. No longer do you have to check for error conditions, making your code into a morass of error tests.

All in all, exceptions make for more robust software, providing that they're used properly. Catch is the interesting one: it allows you to trap an exception that was thrown and perform some kind of action based upon the type of the exception.

The biggest mistakes programmers make with exceptions are twofold. The first is that the programmer is not specific enough in the type of exception they catch. Catching too general an exception type means that they may be inadvertently dealing with particular exceptions that would be best left to other code, higher up the call chain. Those exceptions would, in effect, be suppressed and possibly lost.

The second mistake is more pernicious: the programmer doesn't want any exceptions leaving their code and so catches them all and ignores them. This is known as the empty catch block. They may think, for example, that only certain types of exceptions might be thrown in his code; ones that they could justifiably ignore.

In reality, other deadly runtime exceptions could happen – things such as out-of-memory exceptions, invalid code exceptions and the like, for which the program shouldn't continue running at all. Tune your exception catch blocks to be as specific as possible.

8. Storing secrets in plain text
A long time ago, I worked in a bank. We purchased a new computer system for the back office to manage some kind of workflow dealing with bond settlements. Part of my job was to check this system to see whether it worked as described and whether it was foolproof. After all, it dealt with millions of pounds daily and then, as now, a company is more likely to be defrauded by an employee than an outsider.

After 15 minutes with a rudimentary hex editor, I'd found the administrator's password stored in plain text. Data security is one of those topics that deserves more coverage than I can justifiably provide here, but you should never, ever store passwords in plain text.

The standard for passwords is to store the salted hash of the original password, and then do the same salting and hashing of an entered password to see if they match.

Here's a handy hint: if a website promises to email you your original password should you forget it, walk away from the site. This is a huge security issue. One day that site will be hacked. You'll read about how many logins were compromised, and you'll swallow hard and feel the panic rising. Don't be one of the people whose information has been compromised and, equally, don't store passwords or other 'secrets' in plain text in your apps.

9. Not validating user input
In the good old days, our programs were run by individuals, one at a time. We grew complacent about user input: after all, if the program crashed, only one person would be inconvenienced – the one user of the program at that time. Our input validation was limited to number validation, or date checking, or other kinds of verification of input.

Text input tended not to be validated particularly. Then came the web. Suddenly your program is being used all over the world and you've lost that connection with the user. Malicious users could be entering data into your program with the express intent of trying to take over your application or your servers.

A whole crop of devious new attacks were devised that took advantage of the lack of checking of user input. The most famous one is SQL injection, although unsanitised user input could precipitate an XSS attack (crosssite scripting) through markup injection.

Both types rely on the user providing, as part of normal form input, some text that contains either SQL or HTML fragments. If the application does not validate the user input, it may just use it as is and either cause some hacked SQl to execute, or some hacked HTML/JavaScript to be produced.

This in turn could crash the app or allow it to be taken over by the hacker. So, always assume the user is a hacker trying to crash or take over your application and validate or sanitise user input.

10. Not being up to date
All of the previous mistakes have been covered in depth online and in various books. I haven't discovered anything new – they and others have been known for years. These days you have to work pretty hard to avoid coming into contact with various modern design and programming techniques.

I'd say that not spending enough time becoming knowledgeable about programming – and maintaining that expertise – is in fact the biggest mistake that programmers make. They should be learning about techniques such as TDD or BDD, about what SLAP or SOLID means, about various agile techniques.

These skills are of equal or greater importance than understanding how a loop is written in your language of choice. So don't be like them: read McConnell and Beck and Martin and Jeffries and the Gang of Four and Fowler and Hunt & Thomas and so on. Make sure you stay up to date with the art and practice of programming.

And that concludes my top 10 list of mistakes programmers make, no matter what their language stripe. There are others, to be sure, perhaps more disastrous than mine, but I would say that their degree of dread is proportional to the consequences of making them.

All of the above were pretty dire for me the last time I made them. If you have further suggestions or calamities of your own, don't hesitate to contact me and let me know.

-------------------------------------------------------------------------------------------------------

First published in PC Plus Issue 300 Sphere: Related Content

11/10/10

30 Programming eBooks

Since this post got quite popular I decided to incorporate some of the excellent suggestions posted in the comments, so this list now has more than 30 books in it. [UPDATED: 2010-10-10]

Learning a new programming language always is fun and there are many great books legally available for free online. Here’s a selection of 30 of them:

Lisp/Scheme:
How to Desing Programs
Let Over Lambda
On Lisp
Practical Common Lisp
Programming in Emacs Lisp
Programming Languages. Application and Interpretation (suggested by Alex Ott)
Structure and Interpretation of Computer Programs
Teach Yourself Scheme in Fixnum Days
Visual LISP Developer’s Bible (suggested by skatterbrainz)

Ruby:
Data Structures and Algorithms with Object-Oriented Design Patterns in Ruby
Learn to Program
MacRuby: The Definitive Guide
Mr. Neighborly’s Humble Little Ruby Book (suggested by @tundal45)
Programming Ruby
Read Ruby 1.9
Ruby Best Practices
Ruby on Rails Tutorial Book (suggested by @tundal45)

Javascript:Building iPhone Apps with HTML, CSS, and JavaScript
Eloquent Javascript
jQuery Fundamentals
Mastering Node

Haskell:
Learn You a Haskell for Great Good
Real World Haskell

Erlang:
Concurrent Programming in Erlang
Learn You Some Erlang for Great Good

Python:
Dive into Python
How to Think Like a Computer Scientist – Learning with Python

Smalltalk:
Dynamic Web Development with Seaside
Pharo by Example (based on the next book in this list, suggested by Anonymous)
Squeak by Example

Misc:
Algorithms
The Art of Assembly Language
Beginning Perl
Building Accessible Websites (suggested by Joe Clark)
The C Book
Compiler Construction
Dive Into HTML 5 (suggested by @til)
Higher-Order Perl
The Implementation of Functional Programming Languages (suggested by “Def”)
An Introduction to R
Learn Prolog Now!
Objective-C 2.0 Essentials
Programming Scala

Of course there are many more free programming eBooks, but this list consists of the ones I read or want(ed) to read. This is far from comprehensive and languages that are completely missing are mostly left out on purpose (e.g. PHP, C++, Java). I’m sure somebody else made a list for them somewhere. Sphere: Related Content

7/9/10

Are microkernels the future of secure OS design ?

MINIX 3, and microkernel OSs in general, might show us the way toward the future of secure OS design.

Date: September 6th, 2010
Author: Chad Perrin
Category: Application Security, Security

--------------------------------------------------------------------------------
Andrew S. Tanenbaum created the MINIX operating system in the 1980s as an instructional tool. He wanted to provide an aid to helping people learn about the design of operating systems, and did so by creating an incredibly simple — in some respects, too simple for any serious use — Unix-like OS from scratch, so that his students could examine and modify the source code and even run it on the commodity hardware of the time. He and Al Woodhull co-authored a textbook that focused on the MINIX OS.

Since then, 1.5 and 2.0 versions of MINIX have been created as well, with the same goal: to serve as a study aid. By contrast, the current iteration of the MINIX project, known as MINIX 3, is being developed with the aim of providing a production-ready, general purpose Unix-like operating system. In Tanenbaum-Torvalds debate, Part II, Andrew Tanenbaum describes the relationship of MINIX 3 to its forebears:

MINIX 1 and MINIX 3 are related in the same way as Windows 3.1 and Windows XP are: same first name. Thus even if you used MINIX 1 when you were in college, try MINIX 3; you’ll be surprised. It is a minimal but functional UNIX system with X, bash, pdksh, zsh, cc, gcc, perl, python, awk, emacs, vi, pine, ssh, ftp, the GNU tools and over 400 other programs.

Among Tanenbaum’s key priorities in the development of MINIX 3 is the desire to improve operating system reliability to the level people have come to expect from televisions and automobiles:

It is about building highly reliable, self-healing, operating systems. I will consider the job finished when no manufacturer anywhere makes a PC with a reset button. TVs don’t have reset buttons. Stereos don’t have reset buttons. Cars don’t have reset buttons. They are full of software but don’t need them. Computers need reset buttons because their software crashes a lot.

He acknowledges the vast differences in the types of software required by these different products, but is optimistic that the shortcomings imposed by the necessarily greater complexity of software in a desktop computer can be mitigated to the point where end users will no longer have to deal with the crashing behavior of operating system software. As he puts it, “I want to build an operating system whose mean time to failure is much longer than the lifetime of the computer so the average user never experiences a crash.”

There are already OSs that meet that goal under extremely limited, tightly constrained circumstances, but he intends to provide the same stability and reliability under the harsh, changeable circumstances of general-purpose consumer computer use. It is an ambitious goal, but his approach is well thought out and shows great promise. Given that many of the stability issues we see in operating systems using monolithic kernels result from the fact that a crashing driver can crash the OS kernel itself, his intention to solve the problem by moving drivers out of kernel space may be on the right track.

In fact, if a driver fails to respond to what he calls a “reincarnation server”, which monitors the status of active drivers, the reincarnation server can silently swap out the hung or otherwise failing driver process for a fresh process that picks up where the original left off. In most — if not all — cases, the user may not even know anything has happened, where a monolithic kernel OS would likely crash immediately.

While his primary interest is in reliability, rather than security per se, there are obvious security benefits to his microkernel design for MINIX 3. Without going into too much technical detail, there are mechanisms in place that restrict a driver’s access to memory addresses, limiting it to its own assigned address space in a manner that in theory cannot be circumvented by the driver. Drivers all run in user space, keeping them out of kernel space, whereas kernels loaded as part of a monolithic kernel design can typically touch any memory addresses at all.

A number of countermeasures have been invented and employed in various OS designs over the years, but keeping drivers out of kernel space entirely certainly seems like the most effective and inviolate protection against driver misbehavior. As a result, this may all but eliminate the possibility of buffer overruns and similar problems in kernel space.

Further, mediated contact between drivers and other user space systems in MINIX 3 extends the strict privilege separation of Unix-like systems to protect other processes than the kernel process, as well. In short, whole classes of common system vulnerabilities may become extinct within the family of OSs that may develop in the wake of MINIX 3.

MINIX 3 itself is still in development, but it is currently a working OS with many of Tanenbaum’s intended reliability assurance features already implemented. You can download it from the MINIX 3 Website and boot it from a LiveCD though, as Tanenbaum states, you should install it to a partition on the hard drive of a computer if you want to do anything useful with it.

It will not replace MS Windows, MacOS X, Ubuntu Linux, or FreeBSD right now, but it may well be the future of general purpose OS design — especially secure general
purpose OS design. Sphere: Related Content