
11/11/10

Top 5 IT Essays You Should Read

Certainly, there are many landmark books in software development that have shaped our industry. Design Patterns: Elements of Reusable Object-Oriented Software (the GOF book) is one really good example. Unfortunately, a good share of people don't have the opportunity to read some of these great works because, well… it can get expensive pretty quickly stocking a bookshelf. But there exists a treasure trove of published content available online that is equally impactful. Here, in no particular order, are 5 essays that have helped shape our industry, for better or worse.
The Cathedral and the Bazaar by Eric Raymond - Discusses the evolution of Linux and provides amazing insight into the lessons learned.
Code as Design by Jack Reeves - Presents the notion that programming is fundamentally a design activity and that the only final and true representation of “the design” is the source code itself.
Managing the Development of Large Software Systems (pdf) by Winston Royce - The paper widely regarded as the one that gave birth to the waterfall development lifecycle.
No Silver Bullet by Frederick Brooks - We're still looking, but as this paper points out, there is no silver bullet. The essential complexity Brooks speaks of is largely why we continue to struggle with the same problems today that we did a decade ago.
On the Criteria to be Used in Decomposing Systems into Modules by David Parnas - Discusses the important design decisions that impact how we modularize our software systems. Important because modularity is coming to the Java platform, and we need to know how to use it effectively.

I’ll give an honorable mention to Design Principles and Design Patterns by Bob Martin, which discusses key principles of object-oriented design. Many of the patterns in the GOF book adhere to these principles.

These essays are sure to provide a positive and lasting influence. But I’m sure there are more. What am I missing? What do you consider the most impactful software development essays? What would you add to this list?


11/10/10

The Schematics Scheme Cookbook

The Schematics Scheme Cookbook is a collaborative effort to produce documentation and recipes for using Scheme for common tasks. See the Book Introduction for more information on the Cookbook's goals, and the important ContributorAgreement statement.

Cookbook Starting Points
• Table of Contents (with recipes)
• Getting started with PltScheme
• The FAQ answers common questions about the Cookbook
• Common Scheme idioms
• Getting started with macros
• Using strings
• Using files

Editing Pages
• Register as an author
• Information on TWiki, the software running this site
• AuthorChapter -- A guide to contributions

Other Popular Pages:
The following pages are autogenerated:
• RecipeIndex - Recipes by chapter (similar to TOC)
• ParentTopics - List of Chapters & Sections
• AllRecipes - List of all recipes
• LibraryIndex - List of external software libraries

Administrative Pages
• AdminTopics - Summary of all administrative topics
• CookbookBrainStorm - Ideas for the structure of the cookbook
• ToDo - Items to be done (a very incomplete list)
• RecipeStubs - Recipes that need to be written or completed
• HelpNeeded - Pages with markup problems
• OriginalCookbook - Links to a converted version of the original cookbook
• RecentSiteChanges (last changed 09 Apr 2007 - 00:15)
• SectionIndex - This page has errors but somebody may want to fix it

Notes:
• You are currently in the Cookbook web. The color code for this web is this background, so you know where you are.
• If you are not familiar with the TWiki collaboration platform, please visit WelcomeGuest first.

Schematics Cookbook Web Site Tools
• (More options in WebSearch)
• WebChanges: Display recent changes to the Cookbook web
• WebIndex: List all Cookbook topics in alphabetical order. See also the faster WebTopicList
• WebNotify: Subscribe to an e-mail alert sent when something changes in the Cookbook web
• WebStatistics: View access statistics of the Cookbook web
• WebPreferences: Preferences of the Cookbook web (TWikiPreferences has site-wide preferences)

PLT Scheme is now Racket, a new programming language. The above page is kept for compatibility and historical reference only. See the Racket site for up-to-date information.

30 Programming eBooks

Since this post got quite popular I decided to incorporate some of the excellent suggestions posted in the comments, so this list now has more than 30 books in it. [UPDATED: 2010-10-10]

Learning a new programming language is always fun, and there are many great books legally available for free online. Here's a selection of 30 of them:

Lisp/Scheme:
How to Design Programs
Let Over Lambda
On Lisp
Practical Common Lisp
Programming in Emacs Lisp
Programming Languages. Application and Interpretation (suggested by Alex Ott)
Structure and Interpretation of Computer Programs
Teach Yourself Scheme in Fixnum Days
Visual LISP Developer’s Bible (suggested by skatterbrainz)

Ruby:
Data Structures and Algorithms with Object-Oriented Design Patterns in Ruby
Learn to Program
MacRuby: The Definitive Guide
Mr. Neighborly’s Humble Little Ruby Book (suggested by @tundal45)
Programming Ruby
Read Ruby 1.9
Ruby Best Practices
Ruby on Rails Tutorial Book (suggested by @tundal45)

Javascript:
Building iPhone Apps with HTML, CSS, and JavaScript
Eloquent Javascript
jQuery Fundamentals
Mastering Node

Haskell:
Learn You a Haskell for Great Good
Real World Haskell

Erlang:
Concurrent Programming in Erlang
Learn You Some Erlang for Great Good

Python:
Dive into Python
How to Think Like a Computer Scientist – Learning with Python

Smalltalk:
Dynamic Web Development with Seaside
Pharo by Example (based on the next book in this list, suggested by Anonymous)
Squeak by Example

Misc:
Algorithms
The Art of Assembly Language
Beginning Perl
Building Accessible Websites (suggested by Joe Clark)
The C Book
Compiler Construction
Dive Into HTML 5 (suggested by @til)
Higher-Order Perl
The Implementation of Functional Programming Languages (suggested by “Def”)
An Introduction to R
Learn Prolog Now!
Objective-C 2.0 Essentials
Programming Scala

Of course there are many more free programming eBooks, but this list consists of the ones I read or want(ed) to read. This is far from comprehensive and languages that are completely missing are mostly left out on purpose (e.g. PHP, C++, Java). I'm sure somebody else made a list for them somewhere.

7/9/10

Some random thoughts on programming

As I near the end of my third decade of programming, I keep finding myself coming back to pondering the same kinds of questions.

■ Why are some people good at this and some are not?
■ Can the latter be trained to be the former?
■ What is the role of language, development environment, methodology, etc. in improving productivity?

I've always been partial to the "hacker": the person who can just flat out code. For some people this seems to come easy. I have been lucky enough to work with a few people like this, and it has been awesome. As you talk with them about a problem, they are already breaking it down and have an idea of how the final system will work. I don't mean that other aspects of professionalism, experience, etc. aren't important; it's just that I believe in the end there is a raw kind of talent that you can't teach. My wife can sing, I can't, and not for lack of practice. She practices, but she can also belt out a perfect tune almost without trying.

What are the core talents that make someone a good hacker? I think there are two:
1. The ability to break down an idea/system into an abstract set of "concepts and actions"
2. The ability to understand how these "concepts and actions" execute in a particular environment.

Note that I didn't say anything about coding. Coding is actually the least interesting part. If you can do the others well, coding is just another kind of typing. Coding is the thing your fingers do while your mind is flipping between 1) and 2). Learning a new language is less about learning the syntax than it is about learning the execution model.

Writing good, correct code involves not just conceiving of how something should be 1) but also understanding all that can happen 2).

I honestly don’t know if these things can be taught.

In 30 years I have only seen a few examples where a mediocre programmer became good or a good one became great and zero examples of a mediocre programmer becoming great.

What does this mean for people who want to build large software systems? The easy answer is to only hire great people. Which is fine when you can do it, but not everyone can be (or hire) the best, so what is left for the majority?

What should be done to make things easier for people who aren't in the top 5%?

There are lots of nice things that have been invented like source code control, higher level languages, garbage collection, etc. which reduce or abstract away some of the problems. And while sometimes these abstractions are leaky, they are really one of the few weapons we have against the beast called Complexity. Complexity is the key problem we face, especially if we are not wizards because:

A person's talent with regard to 1) and 2) above determines how much complexity they can deal with.

When I was in gradual school at UNC-CH, Fred Brooks (the Mythical Man-Month guy) used to talk about Complexity as being a major thing that makes programming hard. It's not something easily cast aside.
Quoting Dr. Brooks:
The complexity of software is an essential property, not an accidental one. Hence, descriptions of a software entity that abstract away its complexity often abstract away its essence.

I wish I had appreciated everything Fred said when I was a cocky grad student as much as I appreciate it today… sigh.

Some things are hard because they are hard, and not much can be done about them. Still, our goal needs to be to not make it worse than it is.

To this end, some people say: Do the simplest thing that will work.

I agree but prefer this formulation: Strive for the simplest thing that can’t possibly fail.

If you think about what can go wrong you are more likely to find the problems than if you think about what will go right.

I wish I could say we were making progress in reducing complexity but we seem to be adding it into our programming environments rather than removing it. One of the things I have thought about recently is the complexity of these abstractions themselves. Programming languages of today are so much more abstracted and complex than what I learned on (Assembler, Basic, Pascal, C).

For example, the world (or at least the web) is currently engaged in writing or rewriting everything in Javascript. Not because JS is so awesome, but because 15 years ago it got stuck into browsers; then we all got stuck with browsers, and thus we are all stuck with JS. Sure, there's lots of server stuff done in various other languages, but really, how long can that last? How long can one justify having to write 2 parts of an application in 2 different languages?

The future looks like node.js.

I am not a JS hater. It's just that I used to think Javascript was a simple language. As someone who wrote 3D game engine code in C++, my thought was "if these webkids can do JS then it can't be that complicated". After building a webapp using Javascript/Dojo/SVG to create a realtime browser for genetic information, I realize how wrong I was.

Sure, it's all nice and simple when you are doing this:
$("#imhungry").click(function() { alert("food"); });

However, a lot of complexity lurks just under the hood. And it's the way it all interacts that concerns me. Crockford called it "Lisp in C's clothing". This should excite a few people and scare the crap out of the rest.
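To make that under-the-hood complexity concrete, here is a minimal sketch (the function names are mine, not from the post) of the classic closure-in-a-loop surprise, where innocent-looking callbacks all share one variable:

```javascript
// Closures capture variables, not values: every callback below closes over
// the same `i`, so by the time the callbacks run they all see its final value.
function makeCallbacks() {
  var callbacks = [];
  for (var i = 0; i < 3; i++) {
    callbacks.push(function () { return i; });
  }
  // Invoke each callback and collect what it sees.
  return callbacks.map(function (cb) { return cb(); });
}

// The usual pre-ES6 fix: an immediately-invoked function whose parameter
// captures the current value of `i` in a fresh scope.
function makeCallbacksFixed() {
  var callbacks = [];
  for (var i = 0; i < 3; i++) {
    callbacks.push((function (n) {
      return function () { return n; };
    })(i));
  }
  return callbacks.map(function (cb) { return cb(); });
}

console.log(makeCallbacks());      // [ 3, 3, 3 ] -- probably not what was intended
console.log(makeCallbacksFixed()); // [ 0, 1, 2 ]
```

Nothing here is exotic: it is exactly the kind of interaction between scoping and first-class functions that reads as simple but behaves like Lisp.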

JS has lots of opportunity for expressing yourself in interesting ways. Expressiveness is good, but too much of it can create more problems than it solves. Back when C++ was a bunch of source code you downloaded from AT&T, and you had to compile the compiler yourself before you could use it, we grad students stayed up late eating bad fried chicken and debating the merits of O-O programming. It was at this point I began to wonder if the productivity gains of O-O were more than offset by the time spent debating what good O-O practice is. This tape has played in my head more than a few times in my career.

It's not all bad, of course. On balance, garbage collection is a win, and I see lots of benefits to using closures in JS, but the underlying complexities of these things escape most people, and their interactions are understood by even fewer. If you have to slog through this (and you should) to understand what happens when you call a function, you are cutting out a lot of people from fully understanding what is going on when their code executes.
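A sketch of what actually happens when you call a function (the names are illustrative, not from the post): each call creates a fresh scope, and a closure returned from it quietly keeps that scope alive, none of which is visible at the call site:

```javascript
// Each call to makeCounter() creates a brand-new activation record; the
// returned function closes over it, so `count` survives after the call
// returns and cannot be garbage-collected while the closure is reachable.
function makeCounter() {
  var count = 0;                        // private, per-closure state
  return function () { return ++count; };
}

var a = makeCounter();
var b = makeCounter();                  // a separate scope, a separate count

console.log(a()); // 1
console.log(a()); // 2 -- a's state persists between calls
console.log(b()); // 1 -- b's scope is independent of a's
```

This is the interaction between calling conventions, scope chains, and the garbage collector that most people never have to think about until it bites them.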

I hope that some of this will be addressed by standards of practice. Things like Google's guidelines seem to be a good start. As with C++, I think that the best hope for improving productivity will actually be to agree on limits on how we use these tools rather than seeing how far we can push the envelope.

Well except for wizards and we know who we are.


15 Responses to Some random thoughts on programming
Jason P Sage says:
September 7, 2010 at 7:49 am
Right On. Nice Article… I’m on year 28 of programming – and I really liked this article.

P.S. I am a JS hater and .Net hater too (though good at both) because P-Code and Compile On demand are Slower and Make it necessary to obfuscate your code if you have intellectual property concerns.

Jonathan Palethorpe says:
September 7, 2010 at 7:52 am
22 years ago, when I started in programming, I could mess about at home doing a bit of BASIC, and at work I wrote green-screen sales and invoicing apps in COBOL '74 using ISAM files. Nowadays programming is many times more difficult: ISAM files have been replaced with all-singing, all-dancing SQL databases with a host of frameworks designed to access them (no in-language stuff as in COBOL with ISAM), the green screens have been replaced with your weird and wonderful GUI framework of choice (WPF in my case), and the languages themselves are now many, varied, and unreadable to those not in the know.

I much prefer doing development work now but there are times when I still hanker after a bit of green screen/ISAM/MOVE W01-FOO TO W02-BAR

Michael Fever says:
September 7, 2010 at 9:24 am
Yeah, I’ll have to admit that was pretty random. I am an ok programmer, not great, but better than mediocre. I approach every job the same however by looking at what needs to get done and then by what is going to be required later to maintain it. Sometimes being OK is all you need to be.

Bob Gibson says:
September 7, 2010 at 9:47 am
Very good points.

I can relate… I started programming, in Basic & Fortran, … 40 years ago.

Jeffrey Lanham says:
September 7, 2010 at 10:13 am
Right on. After working in computers and programming for 27+ years, I have to agree. Programming is, unfortunately, something you inherently understand or you don't. You can teach semantics, but not the thought processes that go along with it. That being said, great programmers start as mediocre programmers, but quickly rise. Mediocre programmers tend to rise to somewhat competent and then stay there. That's, unfortunately, the way it is.

Great article.

Alan Balkany says:
September 7, 2010 at 10:15 am
Sometimes the “hacker” is just someone who’s very familiar with the problem domain. An average programmer may appear to be a “hacker” to their coworkers when they’re re-implementing a system very similar to what they implemented at two other jobs.

They already know what works best, and the mistakes they’ve made in previous implementations.

I think most people don’t program as well as their potential because the best practices haven’t been communicated to them. For example, I see many programmers who don’t seem to know that reducing function length is a good way to keep complexity manageable.

The best practices are still being worked out; the field of software development is still in its infancy, compared to more mature disciplines such as mathematics and mechanical engineering.

Oisín says:
September 7, 2010 at 10:47 am
Alan is spot on here. Perhaps the idea that many programmers overlook useful, basic heuristics for programming (such as “shorter functions imply more manageable code, so break down your functions until they are small”) ties in with one theme of the OP, that programming has become more complex in terms of frameworks and maybe language expressiveness.

When I started programming, it was on the horrible bare-bones no-frills Commodore 64 Basic interpreter, followed a couple of years later by a lovely language called GFA Basic on the Atari ST, and then by 68k assembly and C. Maybe those formative years of just playing about, writing programs with minimal focus on libraries, industry standards and “frameworks”, were helpful in learning how to solve problems and write code.
But these days I spend more time reading articles about flashy programming/library techniques than actually thinking about and writing programs.

It could be the case that novice programmers nowadays jump too early onto this wheel, without spending the time doing simple, pure programming, so that they don’t sufficiently develop these key abstractions and heuristic tricks for managing complexity.

Joel Wilson says:
September 7, 2010 at 10:54 am
Couldn’t agree more about programming skill or JS. The best programmers I’ve ever known generally don’t have a comp-sci education (myself included). In fact, the single best I’ve ever known personally was an English major a couple of papers away from his Doctorate. He found programming and never looked back. As far as JS, I agree that its “appeal” (for lack of a better word) is simply because it’s the only trick in town. As a language it sucks for anything more than basic functionality. Although I use jQuery professionally, I find it sad that we have to resort to bloated frameworks and layers of abstraction to try and hide the basic weakness of the underlying language.

Leon says:
September 7, 2010 at 11:39 am
Oh, I guess for that post you’ll earn a lot of flames from mediocre guys who think that they will be great one day. Especially nowadays when everyone and his mother see themselves as coders because they can hack up some javascript/ruby code

Maintenance Man says:
September 7, 2010 at 1:12 pm
Say it ain’t so. Us mediocre programmers need some hope for the future. Heheh.

CollegeGrad says:
September 7, 2010 at 1:38 pm
I majored in Mathematics (not applied) and only became interested in programming in my junior year of college. I only started to really learn quality programming skills after leaving college. I find that I often know exactly how to solve a problem before the veteran programmers at my job even have a chance to really think about it. The problem for me is spending the time making sure my code is well structured and maintainable. When working with other programmers, I end up being the one explaining the problem and how we’re going to solve it over and over. My own experience, which I’ll be the first to point out is very limited, has indicated that it’s my experience with the theoretical aspects of computer science and complexity theory, as well as all my time spent proving abstract theorems, that gives me an advantage over the people who just know how to code. Maybe if programmers focused less on languages and more on things like linear algebra, abstract algebra, graph theory, etc. then they would be much better at their job. just my novice opinion.

Ryan says:
September 7, 2010 at 2:29 pm
“Do the simplest thing that will work.”

Not sure if you will even read this, but:

That is a great statement *but* I think it is critical to mention what I consider a serious and way too often overlooked problem. Let's say you start with a great code base. Then you amend/change it 'as simply as possible' to add a new feature. And again for a new feature, and again. Every step being done 'simply' actually adds unnecessarily to the complexity of the program. What really needs to happen is the simplest implementation that incorporates *all* features, including the ones that had previously been implemented. And this takes more time and more effort in the short term, but boy does it save on time and effort in the long term.

Giles Bowkett says:
September 7, 2010 at 2:46 pm
Since my comment disappeared into a black hole with no feedback, I can only assume it’s gone into some moderation zone, which means I should assume people will read it, which means I should probably be less obnoxious about it, so: Java was designed to prevent people from fucking up. It did not; instead it prevented people from writing great software. That’s in the nature of design approaches based on minimizing the damage that the incompetent can do. The incompetent can fuck up anything; the only thing you achieve with standards and practices is restraining the so-called “wizards,” who create all the advances that save everybody else so much time. Exercising control to prevent incompetent mistakes is a sucker’s game; freeing up the language to “give them enough rope” results in great libraries which make everybody more productive (or at least, everybody who uses them).

I’m rehashing a debate for you here from 2006. It comes from the days when people were saying Rails would never be accepted in corporations, and, more to the point, your advocacy of standards and practices to prevent people from fucking up is the same argument that was coming from the people who said Rails would never be accepted in corporations. Their logic was clean and their conclusions were false; to anyone who was reading those debates, your argument was, I’m sorry, completely discredited almost five years ago now.

Sorry to go off in such detail on your final paragraph, which was kind of a tangent to your overall post, it just drives me fucking crazy that people settle these arguments definitively, only to see them revived a few years later and started over completely from scratch. “Those who don’t study history are doomed to repeat it,” etc.

admin says:
September 7, 2010 at 2:47 pm
I think it depends on how draconian the limits are. I don't think the Google guidelines, for instance, are any kind of crushing blow to innovation. I just don't think that just because a language allows something you should necessarily do it.

admin says:
September 7, 2010 at 3:03 pm
Ok. Thanks for the longer comment. I'm learning a lot about how bad I am at making my point.

Putting people on rails (sorry, couldn't resist) isn't going to make bad programmers good. Crap can come from anything. That wasn't what I wanted to say.

I'm more interested in keeping some kind of boundary so that what results is maintainable enough by other people.

I actually pushed for doing a recent project in python/django rather than java etc. because I value the benefits which some of these newer, better languages give you.

I just think with great power comes great responsibility. The cool meta-programming trick you did is awesome and personally I will give you props for it. I just hope the next guys can figure it out too.

Also I think that not all "rope", as you say, is equal. Some things allow for great expressiveness and freedom while not adding much to complexity. I don't use ruby so I'm not an expert on that, but in python there are function decorators; it's a cool syntax for people to add some functionality. But it's pretty explicit when it's used: @login_required etc. Javascript's closures seem to be able to happen almost by accident, or at least they're not as explicit to someone looking at the code for the first time. Doesn't mean we shouldn't use them. I did today, but we need to be aware of what we are doing. And no, I am not trying to start a python/javascript debate.

7/6/10

The Road Ahead for UML

By Ivar Jacobson and Steve Cook, May 12, 2010

UML comes of age -- but now what?

Ivar Jacobson is founder and CTO of Ivar Jacobson International and co-developer of UML. Steve Cook is a software architect at Microsoft and represents Microsoft at the OMG.
--------------------------------------------------------------------------------
More than 12 years have passed since the Unified Modeling Language (UML) became a standard. During these years, opinions of UML have varied between delight and distaste. In this article, we discuss the deficiencies of the current UML specification and propose how to make it agile, leaner, smarter, and more flexible -- in short, how to prepare it for the future so that users can be confident that their investments in UML today will increase in value going forward.

At the beginning of the '90s there were 26 published methods on object-orientation, most with their own notation and their own set of icons. It was in this environment that UML was born. Although Grady Booch, Jim Rumbaugh, and Ivar Jacobson initiated the design of what became UML, many other contributors (including Steve Cook) quickly joined the effort, and the Object Management Group (OMG) launched the result. UML quickly made most other methods, along with their notations, obsolete. UML eventually became the standard we had hoped for, and toolbuilders and practitioners rapidly adopted the new approach.

Because UML's initial success was so outstanding, we all knew that the pendulum would swing in the opposite direction some day -- and we were right. After a few years the setback arrived, and admittedly for good reasons. For instance, at the outset there weren't many good UML tools. Some were very advanced, but hard to use. That disappointed many users and hindered the wide adoption of UML. The language received criticism from the academic world; for example, David Parnas nicknamed it the "Undefined Modeling Language". The criticism was exaggerated, but not unfounded. Likewise, the original leaders of the agile movement were negative about modeling. They said "no modeling -- just code". Many people were skeptical about the tools, so they worked more with UML sketches on the whiteboard than formally with the tools themselves.

However, the pendulum is now swinging back. The tools have become better. Criticism from academics has mostly stopped. Agility has come to big companies and modeling is agile if done sensibly (and not as a "silver bullet"). Microsoft, for instance, has implemented UML in Visual Studio 2010, alongside domain-specific languages. Other important standards such as SysML are implemented as extensions to UML.

Thus it seems that today the world looks upon UML with a more balanced view. UML is not the panacea that it was sometimes sold as 10 years ago. Nor is it as bad as some academics, agilistas, and competitors have claimed. It is a practical tool to raise the level of abstraction of software from code to system level. Many organizations claim benefits from their use of UML, ranging from improved communication among developers to productivity gains using code generation from UML models.

UML is a good language for describing software systems at a higher level than code. Properly used, UML can improve productivity and quality. After several years of consolidation, the OMG (the owner of UML) is taking the initiative to improve it, having issued a Request for Information (RFI) on the Future Development of UML in 2009 under the leadership of Steve Cook.

The results of this RFI showed that users and vendors want UML to be leaner, more expressive, easier to learn, easier to integrate with other modeling and programming languages, and more relevant to today’s technologies. For example, some users want to use UML to drive the interactive behavior of a mobile application. Others want to use UML diagrams to automatically visualize the structure and dependencies within a massive distributed service-oriented application. Some would like models to be deeply integrated with modern programming languages without any semantic conflicts. Increasingly, users would like to use UML to model applications deployed in the Cloud. To address these needs, the OMG and UML vendors are working together towards making UML smarter and more agile.

Size and Complexity
One of the biggest complaints about UML is that it is too large and too complex. Typically a project that uses UML only uses 20% of the specification to help build 80% of the code and other deliverables. This 20% can vary according to the type of project: real-time systems, telecommunications, web applications, business applications, etc. What is considered essential may vary according to the kind of project, but in all cases the unused 80% obscures and complicates the essential.

To address this complaint, UML should be described differently for different user groups. There are ordinary users such as analysts, designers, website builders, database designers, developers, operators, architects, and testers, each bringing a different -- but valid -- perspective that uses different but overlapping subsets of UML. A particular class of users comprises the designers of UML itself and UML tool builders. It goes without saying that if the language is complex, these designers will have a hard time creating a language that is complete, consistent, extensible, and able to integrate with other languages, and the number of specification defects will become high.

Figure 1 depicts the main components of UML 2 and the dependencies between them. Although there is some layering, the overall structure contains a lot of cyclic dependencies, which makes it difficult to define useful subsets of the language. The UML specification does define formal "compliance points" which supposedly describe legal subsets of UML, but UML tool vendors have taken little or no notice of these, because they do not correspond to important use cases for UML users.

Figure 1: Components and Dependencies of UML 2.

A key point with the current UML is that there is no way in a compliant implementation to use the simple version of a concept without having the complicated version; for example, take Class. Most users think of a Class as a simple thing that has attributes, operations, inheritance etc. But a UML 2 class also has Ports, Parts, Connectors, and Receptions -- concepts only useful in specialized domains. There is no way to have just the simple one, so all users are burdened with the understanding required by advanced users. This can be -- and is -- mitigated to some extent by good tools. However, we believe that the simple options should be inherent in the language definition itself. Furthermore, the UML Class differs in detail from the concept of Class found in any particular programming language, which introduces additional conceptual barriers between UML and those of its users who are software developers. Again, flexibility of these definitions should be inherent in the language, so that it can be fine-tuned to match to modern programming technologies.

In summary, there are two major challenges to be addressed: complexity of the UML specification itself, and the need to describe UML in coherent subsets that address the actual needs of users in particular domains.

To address the first challenge, as a direct response to the feedback from the RFI, the OMG has embarked on a program of simplification for the UML specification. By the middle of 2011, a much simplified version of the UML specification should be available in which cyclic dependencies and redundancy have been greatly reduced. This specification will be compatible with existing tools and models, but described in a way that makes it much more amenable to further simplification and integration.

Refactoring UML
Once the simplification of the UML specification is complete in 2011, we will be able to move onto the next phase, which will be to refactor UML so that it can more effectively address the changing needs of many different classes of users. This section proposes some techniques that we can apply for this. It will be very important to retain backwards compatibility with the existing UML while doing this: we must not introduce changes that invalidate existing investments in tools, models or training.

We suggest it is possible to create a very small kernel of no more than 20 elements, such as objects, links, types, and actions, that almost anyone can learn in a few hours. The kernel will contain only what is necessary to understand it on its own terms, so that it is in this sense complete. By itself the kernel is not particularly useful to any developer, although it might suffice for abstract conceptual modeling; it is intended to serve as a basis for describing useful elements.

On top of the kernel we now add more useful concepts. The most essential UML use cases can be defined as extensions to the kernel. An example of an essential use case of UML would be: "Developing from requirements to test".

Figure 2: Suggested simplification of UML through separation of concerns.

The key idea here is that the kernel stays small, containing only the generic elements and attributes needed by most use cases of UML. Elements, or attributes of existing kernel elements, that are needed only to support the use case "Developing from requirements to test" are added along with that use case; they do not belong to the kernel. Everything new arrives with the use case that needs it. This is what is called an "aspect-oriented structure": the kernel can be kept clean and simple, with no knowledge of how it will be extended. Additions to the kernel arrive with the new elements that come with the use cases that need them.

Returning to the example of "Developing from requirements to test": there are many specializations of this use case, such as "Developing a web application from requirements to test", and variants for complex or distributed systems. None of these would be essential use cases; they are extensions to the essential ones.

Figure 3: Adding a non-essential use case.
Since the use case "Developing from requirements to test" is itself a large use case, we need to find its constituent smaller use cases: use-case modeling, designing, implementing, testing, deploying. They are all different aspects, or separate concerns, of UML (see "Aspect-Oriented Software Development with Use Cases"). Note that there is a danger in breaking the complete use case down into constituent use cases: it might create silos and increase the risk of waste, contradicting the principles of lean development. We mitigate this by using the framework design principle that "each component should only do what only that component can do", which forces a clean, minimal factoring.

Figure 4: Constituent use cases of the larger use case "Developing from requirements to test".
Technically, we need some enhancements to the mechanisms used to extend UML. Today there is a mechanism called Profiles which provides some of the required capability but is not well integrated with the rest of the UML architecture. A simple new mechanism for extending UML -- multiple dynamic classification of model elements -- is currently under development at the OMG, and is scheduled to be available as an OMG standard in 2011. This enables, for example, the specification of Ports to be completely separated from the specification of Classes, so that Ports can be added to Classes only for those advanced use cases where they are required.

Structuring UML like this will help eliminate today’s dilemma of choosing between UML and one or more domain-specific languages (DSLs). When UML is structured as a simple kernel plus extensions, new domains can be addressed by crafting further extensions that can be seamlessly integrated with the existing structures. Another mechanism currently under development at the OMG -- Diagram Definition -- will enable new standard diagram types to be specified and integrated with existing ones. Diagram Definition is also scheduled to be available as an OMG standard in 2011, and will be applied to UML and to other modeling languages including BPMN (Business Process Model and Notation).

To help users get started we should define a basic subset of UML, here called "Essential UML", which can be learned in a few days. This would include the simple aspects of most of the familiar UML diagrams. The rest of UML can then be added as a set of seamlessly interwoven, deeply integrated yet completely independent extensions, without altering what has already been described and taught. This is smart!

During the last 10 years UML has not adapted quickly enough to developments in the software world. Ten years ago, for example, we didn’t have so many frameworks; today we are inundated with them. The work of programming has increasingly become a matter of connecting already-developed components or building on existing frameworks. The amazing production of apps for mobile phones is a perfect example. We have not yet found a place for UML in this context. That does not mean that such a place does not exist, only that we have not yet adapted UML to this context or made UML appetizing to use there.

As we move forward with improving the UML, however, the first step is to simplify the existing UML, so we get it to become what UML should have been. We need to shrink UML first to lay the foundation for radical expansion of UML into many different new and exciting domains. As the basis for this work is being laid, we can begin to enrich UML with the possible new constructs we need today and tomorrow.

As a consequence of these developments, users can carry on investing in UML with confidence that the value of their investment will grow as UML moves into new domains, and becomes easier to apply and to integrate with other technologies.

The Boost.Threads Library

By Bill Kempf, May 01, 2002

Standard C++ threads are imminent and will derive from the Boost.Threads library, explored here by the library's author.

Important Update
For an update of the Boost.Threads library, see the article What's New in Boost Threads? by Anthony Williams, maintainer of the Boost.Threads Library. This update appears in the November 2008 issue of Dr. Dobb's Journal.
--------------------------------------------------------------------------------
Just a few years ago it was uncommon for a program to be written with multiple threads of execution. Today Internet server applications run multiple threads of execution to efficiently handle multiple client connections. To maximize throughput, transaction servers execute services on separate threads. GUI applications perform lengthy operations in a separate thread to keep the user interface responsive. The list goes on.
The C++ Standard doesn’t mention threads, leaving programmers to wonder whether it’s even possible to write multithreaded C++ programs. Though it is not possible to write standards-compliant multithreaded programs, programmers nonetheless write multithreaded C++ programs using the libraries provided by their OS that expose the system’s support for threads. However, there are at least two major problems with doing so: these libraries are almost universally C libraries that require careful use from C++, and each OS provides its own set of libraries for handling multithreading. The resulting code is therefore not only non-standard, but also non-portable [1]. Boost.Threads is a library designed to address both problems.
Boost [2] is an organization started by members of the C++ Standards Committee Library Working Group to develop new source libraries for C++. It currently has approximately 2,000 members. Many libraries can be found in the Boost source distribution [3]. Boost.Threads was created to make these libraries thread-safe.
Many C++ experts provided input to the design of Boost.Threads. The interface was designed from the ground up and is not just a simple wrapper around any C threading API. Many features of C++ (such as the existence of constructors/destructors, function objects, and templates) were fully utilized to make the interface more flexible. The current implementation works for POSIX, Win32, and Macintosh Carbon platforms.

Thread Creation
The boost::thread class represents a thread of execution in the same way the std::fstream class represents a file. The default constructor creates an instance representing the current thread of execution. An overloaded constructor takes a function object called with no arguments and returning nothing. This constructor starts a new thread of execution, which in turn calls the function object.
At first it appears that this design is less useful than the typical C approach to creating a thread where a void pointer can be passed to the routine called by the new thread, which allows data to be passed. However, because Boost.Threads uses a function object instead of just a function pointer, it is possible for the function object to carry data needed by the thread. This approach is actually more flexible and is type safe. When combined with functional libraries, such as Boost.Bind, this design actually allows you to easily pass any amount of data to the newly created thread.
Currently, not a lot can be done with a thread object created in Boost.Threads. In fact only two operations can be performed. Thread objects can easily be compared for equality or inequality using the == and != operators to verify if they refer to the same thread of execution, and you can wait for a thread to complete by calling boost::thread::join. Other threading libraries allow you to perform other operations with a thread (for example, set its priority or even cancel it). However, because these operations don’t easily map into portable interfaces, research is being done to determine how they can be added to Boost.Threads.
Listing 1 illustrates a very simple use of the boost::thread class. A new thread is created that simply writes “Hello World” out to std::cout, while the main thread waits for it to complete.

Mutexes
Anyone who has written a multithreaded program understands how critical it is for multiple threads not to access shared resources at the same time. If one thread tries to change the value of shared data at the same time as another thread tries to read the value, the result is undefined behavior. To prevent this from happening, multithreaded programs make use of special primitive types and operations. The most fundamental of these types is the mutex (short for “mutual exclusion”). A mutex allows only a single thread access to a shared resource at one time. When a thread needs to access the shared resource, it must first “lock” the mutex. If any other thread has already locked the mutex, this operation waits for the other thread to unlock it first, thus ensuring that only a single thread has access to the shared resource at a time.
The mutex concept has several variations. The two broad categories of mutexes that Boost.Threads supports are the simple mutex and the recursive mutex. A simple mutex can only be locked once; if the same thread tries to lock it twice, it deadlocks: the thread waits forever. With a recursive mutex, a single thread may lock the mutex several times, and must unlock it the same number of times before another thread can lock it.
Within these two broad categories of mutexes, there are other variations on how a thread can lock the mutex. A thread may attempt to lock a mutex in three ways:
1. Lock the mutex, waiting as long as another thread holds the lock.
2. Try to lock the mutex, returning immediately if another thread holds the lock.
3. Try to lock the mutex with a timeout, waiting until either no other thread holds the lock or a specified amount of time has elapsed.
It might appear that the best possible mutex type is a recursive one that allows all three forms of locking. However, each variation carries overhead, so Boost.Threads lets you pick the most efficient mutex type for your specific needs. This leaves Boost.Threads with six mutex types, listed in order of preference based on efficiency: boost::mutex, boost::try_mutex, boost::timed_mutex, boost::recursive_mutex, boost::recursive_try_mutex, and boost::recursive_timed_mutex.
Deadlock may occur if a mutex, once locked, is not subsequently unlocked. Because this is such a common error, Boost.Threads is designed to make it impossible (or at least very difficult): no direct access to operations for locking and unlocking any of the mutex types is available. Instead, mutex classes define nested typedefs for types that implement the RAII (Resource Acquisition Is Initialization) idiom for locking and unlocking a mutex. This is known as the Scoped Lock [4] pattern. To construct one of these types, pass in a reference to a mutex. The constructor locks the mutex and the destructor unlocks it. C++ language rules ensure the destructor will always be called, so even when an exception is thrown, the mutex will always be unlocked properly.
This pattern helps to ensure proper usage of a mutex. However, be aware that although the Scoped Lock pattern ensures that the mutex is unlocked, it does not ensure that any shared resources remain in a valid state if an exception is thrown; so just as with programming for a single thread of execution, ensure that exceptions don’t leave the program in an inconsistent state. Also, the locking objects must not be passed to another thread, as they maintain state that’s not protected from such usage.
Listing 2 illustrates a very simple use of the boost::mutex class. Two new threads are created, which loop 10 times, writing out an id and the current loop count to std::cout, while the main thread waits for both to complete. The std::cout object is a shared resource, so each thread uses a global mutex to ensure that only one thread at a time attempts to write to it.
Many users will note that passing data to the thread in Listing 2 required writing a function object by hand. Although the code is trivial, it can be tedious writing this code every time. There is an easier solution, however. Functional libraries allow you to create new function objects by binding another function object with data that will be passed to it when called. Listing 3 shows how the Boost.Bind library can be used to simplify the code from Listing 2 by removing the need for a hand-coded function object.

Condition Variables
Sometimes it’s not enough to lock a shared resource and use it. Sometimes the shared resource needs to be in some specific state before it can be used. For example, a thread may try and pull data off of a stack, waiting for data to arrive if none is present. A mutex is not enough to allow for this type of synchronization. Another synchronization type, known as a condition variable, can be used in this case.
A condition variable is always used in conjunction with a mutex and the shared resource(s). A thread first locks the mutex and then verifies that the shared resource is in a state that can be safely used in the manner needed. If it’s not in the state needed, the thread waits on the condition variable. This operation causes the mutex to be unlocked during the wait so that another thread can actually change the state of the shared resource. It also ensures that the mutex is locked when the thread returns from the wait operation. When another thread changes the state of the shared resource, it needs to notify the threads that may be waiting on the condition variable, enabling them to return from the wait operation.
Listing 4 illustrates a very simple use of the boost::condition class. A class is defined implementing a bounded buffer, a container with a fixed size allowing FIFO input and output. This buffer is made thread-safe internally through the use of a boost::mutex. The put and get operations use a condition variable to ensure that a thread waits for the buffer to be in the state needed to complete the operation. Two threads are created, one that puts 100 integers into this buffer and the other pulling the integers back out. The bounded buffer can only hold 10 integers at one time, so the two threads wait for the other thread periodically. To verify that it is happening, the put and get operations output diagnostic strings to std::cout. Finally, the main thread waits for both threads to complete.

Thread Local Storage
Many functions are not implemented to be reentrant. This means that it is unsafe to call the function while another thread is calling the same function. A non-reentrant function holds static data over successive calls or returns a pointer to static data. For example, std::strtok is not reentrant because it uses static data to hold the string to be broken into tokens.
A non-reentrant function can be made into a reentrant function using two approaches. One approach is to change the interface so that the function takes a pointer or reference to a data type that can be used in place of the static data previously used. For example, POSIX defines strtok_r, a reentrant variant of std::strtok, which takes an extra char** parameter that’s used instead of static data. This solution is simple and gives the best possible performance; however, it means changing the public interface, which potentially means changing a lot of code. The other approach leaves the public interface as is and replaces the static data with thread local storage (sometimes referred to as thread-specific storage).
Thread local storage is data that’s associated with a specific thread (the current thread). Multithreading libraries give access to thread local storage through an interface that allows access to the current thread’s instance of the data. Every thread gets its own instance of this data, so there’s never an issue with concurrent access. However, access to thread local storage is slower than access to static or local data; therefore it’s not always the best solution. However, it’s the only solution available when it’s essential not to change the public interface.
Boost.Threads provides access to thread local storage through the smart pointer boost::thread_specific_ptr. The first time a thread tries to access an instance of this smart pointer, it has a NULL value, so code should check for this and initialize the pointer on first use. The Boost.Threads library ensures that the data stored in thread local storage is cleaned up when the thread exits.
Listing 5 illustrates a very simple use of the boost::thread_specific_ptr class. Two new threads are created to initialize the thread local storage and then loop 10 times incrementing the integer contained in the smart pointer and writing the result to std::cout (which is synchronized with a mutex because it is a shared resource). The main thread then waits for these two threads to complete. The output of this example clearly shows that each thread is operating on its own instance of data, even though both are using the same boost::thread_specific_ptr.

Once Routines
There’s one issue left to deal with: how to make initialization routines (such as constructors) thread-safe. For example, when a “global” instance of an object is needed as a singleton, the usual idiom (to avoid order-of-instantiation problems) is a function that returns a static instance, ensuring the instance is created the first time the function is called. The problem here is that if multiple threads call this function at the same time, the constructor for the static instance may be called multiple times as well, with disastrous results.
The solution to this problem is what’s known as a “once routine.” A once routine is called only once by an application. If multiple threads try to call the routine at the same time, only one actually does so, while all others wait until that thread has finished executing it. To ensure that the routine is executed only once, it is called indirectly through a function that’s passed a pointer to the routine and a reference to a special flag type used to check whether the routine has been called yet. The flag is statically initialized, which ensures that it is initialized at compile time rather than run time and is therefore not subject to multithreaded initialization problems. Boost.Threads supports once routines through boost::call_once; it also defines the flag type boost::once_flag and the macro BOOST_ONCE_INIT used to statically initialize the flag.
Listing 6 illustrates a very simple use of boost::call_once. A global integer is statically initialized to zero and an instance of boost::once_flag is statically initialized using BOOST_ONCE_INIT. Then main starts two threads, both trying to “initialize” the global integer by calling boost::call_once with a pointer to a function that increments the integer. Next main waits for these two threads to complete and writes out the final value of the integer to std::cout. The output illustrates that the routine truly was only called once because the value of the integer is only one.

The Future of Boost.Threads
There are several additional features planned for Boost.Threads. There will be a boost::read_write_mutex, which will allow multiple threads to read from the shared resource at the same time, but will ensure exclusive access to any threads writing to the shared resource. There will also be a boost::thread_barrier, which will make a set of threads wait until all threads have “entered” the barrier. A boost::thread_pool is also planned to allow for short routines to be executed asynchronously without the need to create or destroy a thread each time.
Boost.Threads has been presented to the C++ Standards Committee’s Library Working Group for possible inclusion in the Standard’s upcoming Library Technical Report, as a prelude to inclusion in the next version of the Standard. The committee may consider other threading libraries; however, they viewed the initial presentation of Boost.Threads favorably, and they are very interested in adding some support for multithreaded programming to the Standard. So, the future is looking good for multithreaded programming in C++.

Listing 1: The boost::thread class
#include <boost/thread/thread.hpp>
#include <iostream>

void hello()
{
    std::cout << "Hello world, I'm a thread!" << std::endl;
}

int main(int argc, char* argv[])
{
    boost::thread thrd(&hello);
    thrd.join();
    return 0;
}
— End of Listing —


Listing 2: The boost::mutex class
#include <boost/thread/thread.hpp>
#include <boost/thread/mutex.hpp>
#include <iostream>

boost::mutex io_mutex;

struct count
{
    count(int id) : id(id) { }

    void operator()()
    {
        for (int i = 0; i < 10; ++i)
        {
            boost::mutex::scoped_lock lock(io_mutex);
            std::cout << id << ": " << i << std::endl;
        }
    }

    int id;
};

int main(int argc, char* argv[])
{
    boost::thread thrd1(count(1));
    boost::thread thrd2(count(2));
    thrd1.join();
    thrd2.join();
    return 0;
}
— End of Listing —


Listing 3: Using the Boost.Bind library
#include <boost/thread/thread.hpp>
#include <boost/thread/mutex.hpp>
#include <boost/bind.hpp>
#include <iostream>

boost::mutex io_mutex;

void count(int id)
{
    for (int i = 0; i < 10; ++i)
    {
        boost::mutex::scoped_lock lock(io_mutex);
        std::cout << id << ": " << i << std::endl;
    }
}

int main(int argc, char* argv[])
{
    boost::thread thrd1(boost::bind(&count, 1));
    boost::thread thrd2(boost::bind(&count, 2));
    thrd1.join();
    thrd2.join();
    return 0;
}
— End of Listing —

Listing 4: The boost::condition class
#include <boost/thread/thread.hpp>
#include <boost/thread/mutex.hpp>
#include <boost/thread/condition.hpp>
#include <iostream>

const int BUF_SIZE = 10;
const int ITERS = 100;

boost::mutex io_mutex;

class buffer
{
public:
    typedef boost::mutex::scoped_lock scoped_lock;

    buffer()
        : p(0), c(0), full(0)
    {
    }

    void put(int m)
    {
        scoped_lock lock(mutex);
        if (full == BUF_SIZE)
        {
            {
                boost::mutex::scoped_lock lock(io_mutex);
                std::cout << "Buffer is full. Waiting..." << std::endl;
            }
            while (full == BUF_SIZE)
                cond.wait(lock);
        }
        buf[p] = m;
        p = (p+1) % BUF_SIZE;
        ++full;
        cond.notify_one();
    }

    int get()
    {
        scoped_lock lk(mutex);
        if (full == 0)
        {
            {
                boost::mutex::scoped_lock lock(io_mutex);
                std::cout << "Buffer is empty. Waiting..." << std::endl;
            }
            while (full == 0)
                cond.wait(lk);
        }
        int i = buf[c];
        c = (c+1) % BUF_SIZE;
        --full;
        cond.notify_one();
        return i;
    }

private:
    boost::mutex mutex;
    boost::condition cond;
    unsigned int p, c, full;
    int buf[BUF_SIZE];
};

buffer buf;

void writer()
{
    for (int n = 0; n < ITERS; ++n)
    {
        {
            boost::mutex::scoped_lock lock(io_mutex);
            std::cout << "sending: " << n << std::endl;
        }
        buf.put(n);
    }
}

void reader()
{
    for (int x = 0; x < ITERS; ++x)
    {
        int n = buf.get();
        {
            boost::mutex::scoped_lock lock(io_mutex);
            std::cout << "received: " << n << std::endl;
        }
    }
}

int main(int argc, char* argv[])
{
    boost::thread thrd1(&reader);
    boost::thread thrd2(&writer);
    thrd1.join();
    thrd2.join();
    return 0;
}
— End of Listing —


Listing 5: The boost::thread_specific_ptr class
#include <boost/thread/thread.hpp>
#include <boost/thread/mutex.hpp>
#include <boost/thread/tss.hpp>
#include <iostream>

boost::mutex io_mutex;
boost::thread_specific_ptr<int> ptr;

struct count
{
    count(int id) : id(id) { }

    void operator()()
    {
        if (ptr.get() == 0)
            ptr.reset(new int(0));

        for (int i = 0; i < 10; ++i)
        {
            (*ptr)++;
            boost::mutex::scoped_lock lock(io_mutex);
            std::cout << id << ": " << *ptr << std::endl;
        }
    }

    int id;
};

int main(int argc, char* argv[])
{
    boost::thread thrd1(count(1));
    boost::thread thrd2(count(2));
    thrd1.join();
    thrd2.join();
    return 0;
}
— End of Listing —


Listing 6: A very simple use of boost::call_once
#include <boost/thread/thread.hpp>
#include <boost/thread/once.hpp>
#include <iostream>

int i = 0;
boost::once_flag flag = BOOST_ONCE_INIT;

void init()
{
    ++i;
}

void thread()
{
    boost::call_once(&init, flag);
}

int main(int argc, char* argv[])
{
    boost::thread thrd1(&thread);
    boost::thread thrd2(&thread);
    thrd1.join();
    thrd2.join();
    std::cout << i << std::endl;
    return 0;
}
— End of Listing —


NOTES
[1] The POSIX standard defines multithreaded support in what’s commonly known as the pthread library. This provides multithreaded support for a wide range of operating systems, including Win32 through the pthreads-win32 port. However, this is a C library that fails to address some C++ concepts and is not available on all platforms.
[2] Visit the Boost website at http://www.boost.org.
[3] See Bjorn Karlsson’s article, “Smart Pointers in Boost,” in C/C++ Users Journal, April 2002.
[4] Douglas Schmidt, Michael Stal, Hans Rohnert, and Frank Buschmann. Pattern-Oriented Software Architecture Volume 2 — Patterns for Concurrent and Networked Objects (Wiley, 2000).

William E. Kempf received his BS in CompSci/Math from Doane College. He’s been in the industry for 10 years and is currently a senior application developer for First Data Resources, Inc. He is the author of the Boost.Threads library, and an active Boost member. He can be contacted at wekempf@cox.net.