18/12/10

Build A Basic Web Crawler To Pull Information From A Website

Web Crawlers, sometimes called scrapers, automatically scan the Internet attempting to glean context and meaning of the content they find. The web wouldn’t function without them. Crawlers are the backbone of search engines which, combined with clever algorithms, work out the relevance of your page to a given keyword set.

The Google web crawler will enter your domain and scan every page of your website, extracting page titles, descriptions, keywords, and links – then report back to Google HQ and add the information to their huge database.

Today, I’d like to teach you how to make your own basic crawler – not one that scans the whole Internet, though, but one that is able to extract all the links from a given webpage.

Generally, you should make sure you have permission before scraping random websites, as most people consider it to be a very grey legal area. Still, as I say, the web wouldn't function without these kinds of crawlers, so it's important you understand how they work and how easy they are to make.

To make a simple crawler, we’ll be using the most common programming language of the internet – PHP. Don’t worry if you’ve never programmed in PHP – I’ll be taking you through each step and explaining what each part does. I am going to assume an absolute basic knowledge of HTML though, enough that you understand how a link or image is added to an HTML document.

Before we start, you will need a server to run PHP. You have a number of options here:
• If you host your own blog using WordPress, you already have one, so upload the files you write via FTP and run them from there. Matt showed us some free FTP clients for Windows you could use.
• If you don’t have a web server but do have an old PC sitting around, then you could follow Dave’s tutorial here to turn an old PC into a web server.
• Just one computer? Don’t worry – Jeffry showed us how we can run a local server inside of Windows or Mac.

Getting Started
We’ll be using a helper class called Simple HTML DOM. Download this zip file, unzip it, and upload the simple_html_dom.php file contained within to your website first (in the same directory you’ll be running your programs from). It contains functions we will be using to traverse the elements of a webpage more easily. That zip file also contains today’s example code.

First, let’s write a simple program that will check if PHP is working or not. We’ll also import the helper file we’ll be using later. Make a new file in your web directory, and call it example1.php – the actual name isn’t important, but the .php ending is. Copy and paste this code into it:
<?php
include_once('simple_html_dom.php');
phpinfo();
?>

Access the file through your internet browser. If you don’t have a server set up, you can still run the program from my server if you want. If everything has gone right, you should see a big page of random debug and server information printed out like below – all from the little line of code! It’s not really what we’re after, but at least we know everything is working.
The first and last lines simply tell the server we are going to be using PHP code. This is important because we can actually include standard HTML on the page too, and it will render just fine. The second line pulls in the Simple HTML DOM helper we will be using. The phpinfo(); line is the one that printed out all that debug info, but you can go ahead and delete that now. Notice that in PHP, every statement must be finished with a semicolon (;). The most common mistake of any PHP beginner is to forget that little bit of punctuation.
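Just to illustrate that mix, here's a minimal sketch of what a file containing both HTML and PHP might look like (the heading and echoed message are placeholders of mine, not part of the tutorial's examples):

<h1>Crawler test page</h1> <!-- plain HTML is passed straight through to the browser -->
<?php
include_once('simple_html_dom.php'); // pull in the helper for the later examples
echo 'PHP is working';               // anything echoed is inserted into the HTML sent to the browser
?>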

One typical task that Google performs is to pull all the links from a page and see which sites they are endorsing. Try the following code next, in a new file if you like.
<?php
include_once('simple_html_dom.php');
$target_url = "http://www.tokyobit.com/";
$html = new simple_html_dom();
$html->load_file($target_url);
foreach($html->find('a') as $link){
echo $link->href."<br />";
}
?>

Again, you can run that from my server too if you don’t have your own set up. You should get a page full of URLs! Wonderful. Most of them will be internal links, of course. In a real world situation, Google would ignore internal links and simply look at what other websites you’re linking to, but that’s outside the scope of this tutorial.
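If you're curious how that external-link filtering might look, here's a rough sketch of my own (not part of the tutorial) that compares each link's host against the target's host using PHP's parse_url():

<?php
include_once('simple_html_dom.php');
$target_url  = "http://www.tokyobit.com/";
$target_host = parse_url($target_url, PHP_URL_HOST); // e.g. "www.tokyobit.com"
$html = new simple_html_dom();
$html->load_file($target_url);
foreach($html->find('a') as $link){
    $host = parse_url($link->href, PHP_URL_HOST);     // null for relative (internal) links
    if($host && $host != $target_host){               // keep only links pointing to other sites
        echo $link->href."<br />";
    }
}
?>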

If you're running on your own server, go ahead and change the $target_url variable to your own webpage or any other website you'd like to examine.

That code was quite a jump from the last example, so let’s go through in pseudo-code to make sure you understand what’s going on.

Include the Simple HTML DOM helper file once.
Set the target URL as http://www.tokyobit.com.
Create a new Simple HTML DOM object to store the target page.
Load our target URL into that object.
For each link (<a> element) that we find on the target page:
- Print out the href attribute


That's it for today, but if you'd like a bit of a challenge – try to modify the second example so that instead of searching for links (<a> elements), it grabs images (<img> elements) instead. Remember, the src attribute of an image specifies the URL for that image, not href.

Would you like to learn more? Let me know in the comments if you're interested in reading a part 2 (complete with homework solution!), or even if you'd like a back-to-basics PHP tutorial – and I'll rustle one up next time for you. I warn you though – once you get started with programming in PHP, you'll start making plans to create the next Facebook, and all those latent desires for world domination will soon consume you.
Programming is fun.

This is part 2 in a series I started last time about how to build a web crawler in PHP. Previously I introduced the Simple HTML DOM helper file, as well as showing you how incredibly simple it was to grab all the links from a webpage, a common task for search engines like Google.

If you read part 1 and followed along, you’ll know I set some homework to adjust the script to grab images instead of links.

I dropped some pretty big hints, but if you didn’t get it or if you couldn’t get your code to run right, then here is the solution. I added an additional line to output the actual images themselves as well, rather than just the source address of the image.
<?php
include_once('simple_html_dom.php');
$target_url = "http://www.tokyobit.com";
$html = new simple_html_dom();
$html->load_file($target_url);
foreach($html->find('img') as $img)
{
echo $img->src."<br />";
echo $img."<br />";
}
?>

This should output something like this:
Of course, the results are far from elegant, but it does work. Notice that the script is only capable of grabbing images that are in the content of the page in the form of <img> tags – a lot of the page design elements are hard-coded into the CSS, so our script can't grab those. Again, you can run this through my server at this URL if you wish, but to enter your own target site you'll have to edit the code and run it on your own server as I explained in part 1. At this point, you should bear in mind that downloading images from a website puts significantly more stress on the server than simply grabbing text links, so only try the script on your own blog or mine, and try not to refresh it lots of times.
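One way to be kinder to the target server – my own addition, not part of the homework solution – is to cache the downloaded HTML locally, so refreshing your script doesn't re-fetch the page every time:

<?php
include_once('simple_html_dom.php');
$target_url = "http://www.tokyobit.com";
$cache_file = "cache.html";            // hypothetical local cache file
// re-download only if the cache is missing or more than an hour old
if(!file_exists($cache_file) || time() - filemtime($cache_file) > 3600){
    file_put_contents($cache_file, file_get_contents($target_url));
}
$html = new simple_html_dom();
$html->load_file($cache_file);         // load_file accepts a local path as well as a URL
foreach($html->find('img') as $img){
    echo $img->src."<br />";
}
?>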

Let's move on and be a little more adventurous. We're going to build upon our original file, and instead of just grabbing all the links randomly, we're going to make it do something more useful by getting the post content instead. We can do this quite easily because standard WordPress wraps the post content within a <div class="post"> tag, so all we need to do is grab any div with that class, and output them – effectively stripping out everything except the main content of the original site. Here is our initial code:
<?php
include_once('simple_html_dom.php');
$target_url = "http://www.tokyobit.com";

$html = new simple_html_dom();

$html->load_file($target_url);
foreach($html->find('div[class=post]') as $post)
{
echo $post."<br />";
}

?>

You can see the output by running the script from here (forgive the slowness, my site is hosted at GoDaddy and they don’t scale very well at all), but it doesn’t contain any of the original design – it is literally just the content.

Let me show you another cool feature now – the ability to delete elements of the page that we don’t like. For instance, I find the meta data quite annoying – like the date and author name – so I’ve added some more code that finds those bits (identified by various classes of div such as post-date, post-info, and meta). I’ve also added a simple CSS style-sheet to format the output a little. Daniel covered a number of great places to learn CSS online if you’re not familiar with it.

As I mentioned in part 1, even though the file contains PHP code, we can still add standard HTML or CSS to the page and the browser will understand it just fine – the PHP code is run on the server, then everything is sent to the browser, to you, as standard HTML. Anyway, here’s the whole final code:
<head>
<style type="text/css">
div.post{background-color: gray;border-radius: 10px;-moz-border-radius: 10px;padding:20px;}
img{float:left;border:0px;padding-right: 10px;padding-bottom: 10px;}
body{width:60%;font-family: verdana,tahoma,sans-serif;margin-left:20%;}
a{text-decoration:none;color:lime;}
</style>
</head>

<?php
include_once('simple_html_dom.php');

$target_url = "http://www.tokyobit.com";

$html = new simple_html_dom();

$html->load_file($target_url);
foreach($html->find('div[class=post]') as $post)
{
$post->find('div[class=post-date]',0)->outertext = '';
$post->find('div[class=post-info]',0)->outertext = '';
$post->find('div[class=meta]',0)->outertext = '';
echo $post."<br />";
}

?>

You can check out the results here. Pretty impressive, huh? We've taken the content of the original page, got rid of a few bits we didn't want, and completely reformatted it in the style we like! And more than that, the process is now automated, so if new content were to be published, it would automatically show up in our script's output.

That's only a fraction of the power available to you, though – you can read the full manual online here if you'd like to explore the PHP Simple HTML DOM helper a little more and see how it greatly aids and simplifies the web crawling process. It's a great way to take your knowledge of basic HTML up to the next, dynamic level.

What could you use this for though? Well, let’s say you own lots of websites and wanted to gather all the contents onto a single site. You could copy and paste the contents every time you update each site, or you could just do it all automatically with this script. Personally, even though I may never use it, I found the script to be a useful exercise in understanding the underlying structure of modern internet documents. It also exposes how simple it is to re-use content when everything is published on a similar system using the same semantics.
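As a rough sketch of that aggregation idea (the second URL is just a placeholder of mine, and this isn't from the original article), you could loop over an array of your sites and pull the posts from each in turn:

<?php
include_once('simple_html_dom.php');
// hypothetical list of your own sites to aggregate
$target_urls = array("http://www.tokyobit.com", "http://example.com/blog");
foreach($target_urls as $target_url){
    $html = new simple_html_dom();
    $html->load_file($target_url);
    foreach($html->find('div[class=post]') as $post){
        echo $post."<br />";
    }
    $html->clear(); // free the DOM object's memory before moving to the next site
}
?>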

What do you think? Again, do let me know in the comments if you’d like to learn some more basic web programming, as I feel like I’ve started you off on level 5 and skipped the first 4! Did you follow along and try yourself, or did you find it a little too confusing? Would you like to learn more about some of the other technologies behind the modern internet browsing experience?

If you’d prefer learning to program on the desktop side of things, Bakari covered some great beginner resources for learning Cocoa Mac OSX desktop programming at the start of the year, and our featured directory app CodeFetch is useful for any programming language. Remember, skills you develop programming in any language can be used across the board.


James is a web developer and SEO consultant who currently lives in the quaint little English town of Surbiton with his Chinese wife. He speaks fluent Japanese and PHP, and when he isn't burning the midnight oil on MakeUseOf articles, he's burning it on iPad and iPhone board game reviews or random tech tutorials instead.

6/12/10

JSON vs. XML

By Klint Finley / November 30, 2010 6:00 PM

James Clark, technical lead for the World Wide Web Consortium's XML activity, published a blog post today about the perceived competition between JSON and XML. Twitter and Foursquare both recently dropped support for XML, opting to use JSON exclusively. Clark doesn't see XML going away, but sees it less and less as a Web technology. "I think the Web community has spoken," Clark concludes. "And it's clear that what it wants is HTML5, JavaScript and JSON." Clark cites a few particular reasons why JSON is winning the hearts and minds of web developers:
• JSON provides better language-independent representation of data structures.
• JSON has a simpler spec.
• JSON handles mixed content adequately. (Update: That's not to say XML's handling of mixed content is inadequate, only that JSON's minimal mixed-content support isn't a deal breaker.)
• XML seems "enterprisey."
Clark, however, laments the fact that the web community will be missing out on the power of XML. He suggests that the best way forward for XML in the near future is improving XML's integration with HTML5.
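To make the comparison a little more concrete, here's a small PHP sketch of my own (not from Clark's post) serialising the same record as JSON and as XML:

<?php
$user = array("name" => "Ada", "languages" => array("JavaScript", "PHP"));

// JSON: one built-in call, and the output maps directly onto the array structure
echo json_encode($user); // {"name":"Ada","languages":["JavaScript","PHP"]}

// XML: the document has to be built up element by element
$xml = new SimpleXMLElement('<user/>');
$xml->addChild('name', $user['name']);
$langs = $xml->addChild('languages');
foreach($user['languages'] as $lang){
    $langs->addChild('language', $lang);
}
echo $xml->asXML();
?>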

What do you think? Do you prefer XML or JSON, or do you use different ones for different purposes? What future does XML have?

Learning JavaScript Visually with Diagrams

By Klint Finley / November 9, 2010 9:00 PM

"One of the secrets to being a super effective JavaScript developer is to truly understand the semantics of the language," writes developer Tim Caswell. That's why he's created a series of JavaScript lessons based on diagrams, each one illustrating a piece of example code. Caswell's lessons aren't geared towards new programmers. Those with no experience would be better served looking towards an introductory book on programming (or at least this tutorial) to learn the terminology and basic concepts. However, those wanting a deeper understanding of the JavaScript language will be well served by Caswell's tutorials. "My hope is that this helps those of us that are visual learners to get a better grasp of JavaScript semantics," he writes.
So far there are three lessons:
Learning Javascript with Object Graphs - Explains references, closures, and basic inheritance.
Learning Javascript with Object Graphs (Part II) - Compares different ways of doing object-oriented programming in JavaScript
Learning Javascript with Object Graphs (Part III) - Compares Ruby's object model with the way JavaScript works

I know that there have been various attempts to teach programming concepts using visual programming languages, but has anyone seen other examples of this technique being used to teach non-visual programming languages?

JavaScript: 6 Free e-Books and Tutorials

By Klint Finley / December 4, 2010 11:20 AM

JavaScript has never been hotter, thanks to projects like Node.js, jQuery and PhoneGap. You can now use JavaScript for scripting in the browser, for creating desktop and mobile applications, and for creating server-side web applications. But how do you get started? We've compiled a list of six free books and tutorials for beginning programmers, but those with programming experience may find some of these resources valuable as well. Feel free to add more free resources in the comments.

Experienced programmers may also want to take a look at this StackOverflow thread on the subject, but most of the resources there aren't free.

Eloquent JavaScript
Eloquent JavaScript is a JavaScript book by Marijn Haverbeke. The digital form is free, and a print version is forthcoming. The book assumes no prior programming experience. Of the books listed here, this one might have the most to offer experienced programmers looking to get started with JavaScript.

Sams Teach Yourself JavaScript in 24 Hours
Sams Teach Yourself JavaScript in 24 Hours by Michael Moncur is part of the well known series of Sams books.

Learn to Program with Javascript
Learn to Program with Javascript is a series of tutorials from About.com. As the title implies, no programming experience is required.

W3Schools JavaScript Tutorial
W3Schools is one of the most venerable and respected resources for online tutorials, and of course it has its own JavaScript tutorial. It also has a stash of JavaScript examples, including some advanced scripts.

Wikibooks JavaScript
The Wikibooks book on JavaScript isn't finished yet, but it's one to watch. If you're an experienced JavaScript programmer looking for a way to give back to the community, maybe you could contribute to this book.

SitePoint's JavaScript Tutorials
SitePoint is another well respected online source for tutorials and books. It has a few free tutorials, mostly as previews for its books. Its JavaScript Guru List is a good place to start.

Bonus: Douglas Crockford's JavaScript Lectures
Well known JavaScript expert and JavaScript: The Good Parts author Douglas Crockford has a series of free lectures available on the history of JavaScript, its features, and its use.

See Also:
- 3 Augmented Reality Tutorials
- How to Import JSON, YAML or CSV into FluidDB Using Flimp
- Learn Ruby and Rails From the Comfort of Your Browser with Try Ruby and Rails for Zombies
- Tips for Speedy JavaScript from the Google Instant Previews Team
- How To Organize a jQuery Application with JavaScriptMVC

Game Design: The Tools You Need

AndrewParsons | 4 Dec 2010 | 10:27 PM

In my last blog post about Imagine Cup, I mentioned that we provide you all the tools you need to get started on your own Game Design, so I thought I’d fill you in on what you need, and where you can get it all from.

XNA Game Development
Building games in XNA is incredibly easy, and getting the technology set up is just as straightforward.

Firstly, you need a PC – preferably with a decent graphics card. Particularly for 3D games, where we take advantage of Direct3D shading and other capabilities, you need a card supporting DirectX 9 or later (I just don't want someone puzzling over why their shading isn't working, like I was when I ran a hands-on lab on a laptop with a not-so-great graphics card).

Next up, you need Windows. Yes, the developer tools you'll need only run on Windows. Shock, horror. And to make matters even more specific, XNA 4.0 will only run on Windows Vista or Windows 7. Again, from experience – having had a student turn up with a Mac running Boot Camp with only Windows XP installed – it's a sad day when you can't install the actual tool you need to build your own awesome games because you're using an OS that is, in technology years, ancient.

(Ah, and this is why you shouldn't write blog posts at 2am on a Saturday... thanks to one of my awesome student buddies back in Oz, I have been corrected. You CAN install the standalone version of XNA 4.0 on Windows XP - it's just when you install it as part of the WP7 tools that you'll hit the problem. Thanks Michael!)

And we’re halfway there.

Next – the actual development environment. For XNA 4.0, you should install Visual Studio 2010. If you can't get your hands on the proper version of Visual Studio (more on that in a moment), you can always get the Express version of Visual C# 2010 for free. Whether you get the free-to-everyone Express edition or one of the professional-level versions of Visual Studio, you'll be armed with one of the best development environments I've ever worked in, and it will allow you to create applications for Windows, the web, Xbox 360, Phone and more, along with supporting technologies like web services and WCF services.

And that’s it for getting it all set up and ready – the final piece in the puzzle is XNA itself. The latest version is XNA Game Studio 4.0 which allows you to build games and game components for Windows, Xbox 360 and Windows Phone 7.

When XNA is installed, it adds in the XNA .NET Framework extensions, and integrates into Visual Studio or Visual C# Express, including project templates and the extras you need for things like debugging, deploying and project management of multiple project types. It will also install a Device Center for managing connections to actual Xbox 360 and Phone hardware.

As an “optional” extra, if you want to develop for Windows Phone 7, you’ll need to install the Windows Phone 7 tools, including emulator. I put quotes around optional, because the easiest way you’ll get XNA 4.0 installed is to download the WP7 developer tools.

Silverlight Development
You can build Silverlight games for the web browser, or for Windows Phone 7, in a couple of different ways: Visual Studio or Expression Studio. If you’re content building your games in Silverlight in Visual Studio, follow the above instructions until you have Visual Studio installed, and you’re done. You don’t need any extras unless you want to build Silverlight WP7 games, in which case you’ll need the WP7 developer tools as well.

The other way to build Silverlight applications and games is to use Expression Studio, specifically Expression Blend. Blend lends itself to more design-oriented solutions than development-heavy ones. And, of course, you're able to leverage the power of both tools in the one solution, going back and forth between them as best suits your needs.

Getting the Tools
So, that’s all you need. But how do you get it? Hopefully, it’s just as easy to get your hands on everything you need, as it is running through the list of what you need.

If you’re a student, you can get Visual Studio 2010 Professional and Expression Studio 4 Ultimate for free at DreamSpark: www.dreamspark.com

If you’re faculty, you can get Visual Studio 2010 Professional and Expression Studio for free at Faculty Resource Center: www.facultyresourcecenter.com

If you're a university, college, or school and want to set up Visual Studio in your labs, you can get Visual Studio 2010 Ultimate and Expression Studio through MSDNAA: www.msdnaa.net

If you’re not in the academic space but still want to try your hand at game development, you can either buy Visual Studio 2010, or get Visual C# 2010 Express at our main Express website: http://www.microsoft.com/express/downloads/

To get XNA and the Phone tools, head over to the App Hub and download everything in one go: http://create.msdn.com/en-us/home/getting_started
If you're just after the XNA Game Studio add-in without Expression, etc., you can use the Microsoft Download Center: http://www.microsoft.com/downloads/en/details.aspx?FamilyID=9ac86eca-206f-4274-97f2-ef6c8b1f478f

Deployment
One last note. If you're building games for the Xbox 360 or Windows Phone 7, you need to be able to connect to your devices. An XNA Creators Club membership and membership of the Windows Phone 7 Marketplace are what you need. Students get access to both for free through the DreamSpark program.

5/12/10

Cracking Passwords In The Cloud: Amazon’s New EC2 GPU Instances

Update: Great article about this at Threatpost! This also got slashdotted, featured on Tech News Today and there's a ZDNet article about this.

Update: Because of the huge impact I have clarified some things here

As of today, Amazon EC2 is providing what they call "Cluster GPU Instances": An instance in the Amazon cloud that provides you with the power of two NVIDIA Tesla “Fermi” M2050 GPUs. The exact specifications look like this:
- 22 GB of memory
- 33.5 EC2 Compute Units (2 x Intel Xeon X5570, quad-core “Nehalem” architecture)
- 2 x NVIDIA Tesla “Fermi” M2050 GPUs
- 1690 GB of instance storage
- 64-bit platform
- I/O Performance: Very High (10 Gigabit Ethernet)
- API name: cg1.4xlarge
GPUs are known to be the best hardware accelerator for cracking passwords, so I decided to give it a try: How fast can this instance type be used to crack SHA1 hashes?

Using the CUDA-Multiforcer, I was able to crack all hashes from this file with a password length of 1–6 in only 49 minutes (one hour of instance time costs $2.10, by the way):
1. Compute done: Reference time 2950.1 seconds
2. Stepping rate: 249.2M MD4/s
3. Search rate: 3488.4M NTLM/s
This just shows one more time that SHA1 for password hashing is deprecated – you really don't want to use it anymore! Instead, use something like scrypt or PBKDF2! Just imagine a whole cluster of these machines (which is now easy to set up for anybody thanks to Amazon) cracking passwords for you, pretty comfortable ;=) Large-scale password cracking for everybody!
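For anyone wondering what the salted, stretched alternative looks like in practice, here is a minimal PHP sketch of mine (assuming a PHP version that ships hash_pbkdf2() and random_bytes(); the password and iteration count are just examples):

<?php
$password   = 'correct horse battery staple';  // example password only
$salt       = random_bytes(16);                // fresh random salt per user
$iterations = 100000;                          // many rounds make brute force far slower
// derive a 32-byte key using PBKDF2 over SHA-256
$hash = hash_pbkdf2('sha256', $password, $salt, $iterations, 32, true);
// store the salt, iteration count and hash together, e.g. base64-encoded
echo base64_encode($salt) . ':' . $iterations . ':' . base64_encode($hash);
?>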

Some more details:
Multiforcer Output
Hashes
Charset
Makefile
cpuinfo
meminfo
nvsmi
If I find the time, I'll write a tool which uses the AWS-API to launch on-demand password-cracking instances with a preconfigured AMI. Stay tuned either via RSS or via Twitter.

Installation Instructions:
I used the "Cluster Instances HVM CentOS 5.5 (AMI Id: ami-aa30c7c3)" machine image as provided by Amazon (I chose that image because it was the only one with CUDA support built in) and selected "Cluster GPU (cg1.4xlarge, 22GB)" as the instance type. After launching the instance and SSHing into it, you can continue by installing the cracker:

I decided to install the "CUDA-Multiforcer" in version 0.7, as it's the latest version of which the source is available. To compile it, you first need to download the "GPU Computing SDK code samples":
# wget http://developer.download.nvidia.com/compute/cuda/3_2/sdk/gpucomputingsdk_3.2.12_linux.run
# chmod +x gpucomputingsdk_3.2.12_linux.run
# ./gpucomputingsdk_3.2.12_linux.run
(Just press enter when asked for the installation directory and the CUDA directory.)


Now we need to install the g++ compiler:
# yum install automake autoconf gcc-c++

The next step is compiling the libraries of the SDK samples:
# cd ~/NVIDIA_GPU_Computing_SDK/C/
# make lib/libcutil.so
# make shared/libshrutil.so


Now it's time to download and compile the CUDA-Multiforcer:
# cd ~/NVIDIA_GPU_Computing_SDK/C/
# wget http://www.cryptohaze.com/releases/CUDA-Multiforcer-src-0.7.tar.bz2 -O src/CUDA-Multiforcer.tar.bz2
# cd src/
# tar xjf CUDA-Multiforcer.tar.bz2
# cd CUDA-Multiforcer-Release/argtable2-9/
# ./configure && make && make install
# cd ../

As the Makefile of the CUDA-Multiforcer doesn't work out of the box, we need to open it up and find the line:
CCFILES := -largtable2 -lcuda
Replace CCFILES with LINKFLAGS so that the line looks like this:
LINKFLAGS := -largtable2 -lcuda
And type make. If everything worked out, you should have a file ~/NVIDIA_GPU_Computing_SDK/C/bin/linux/release/CUDA-Multiforcer right now. You can try the Multiforcer by doing something like this:
# export LD_LIBRARY_PATH=/usr/local/lib:$LD_LIBRARY_PATH
# export LD_LIBRARY_PATH=/usr/local/cuda/lib64:$LD_LIBRARY_PATH
# cd ~/NVIDIA_GPU_Computing_SDK/C/src/CUDA-Multiforcer-Release/
# ../../bin/linux/release/CUDA-Multiforcer -h SHA1 -f test_hashes/Hashes-SHA1-Full.txt --min=1 --max=6 -c charsets/charset-upper-lower-numeric-symbol-95.chr


Congratulations, you now have a fully working, CUDA-based hash-cracker running on an Amazon EC2 instance.

COMMENTS:
Deryck H
November 17th, 2010 - 01:22Best thing I’ve read or heard about all day. I love this kind of stuff. You should run a distributed.net client on a few of these clusters

Wonder how long until things like RC4-128 SSL is completely deprecated. Something that many people put blind trust into through their web browsers.
hackajar
November 17th, 2010 - 01:27Very poor numbers for the Tesla EC2:

CUDA Device Information:
Device 1: “Tesla M2050″
Number of cores: 112
Clock rate: 1.15 GHz
Performance Number: 16058

[root@ip-10-17-130-104 CUDA-Multiforcer-Linux-0.72]# ./CUDA-Multiforcer -d 0 -l -h NTLM -f hash.txt –min=8 –max=8 -c charsets/charset-upper-lower-numeric-symbol-95.chr -o outhash.txt –threads 768 –blocks 60 -m 1024
Version 0.72, length 0-14
Hash type: NTLM
Hashes loaded (22 hashes)
Launching kernel for password length 8
Done: 0.00% Step rate: 1110.7M/s Search rate: 24435.0M/sec
Bitweasil
November 17th, 2010 - 03:08… Um.

It’s a 448 core unit, but clocked slower than the GTX470 (which is a 1.21ghz).

Those numbers look OK, but not great. Try a thread count of 512 or 1024. I may play with this soon.
Cyber Killer
November 17th, 2010 - 09:48Currently passwords are recommended to have 8+ characters. Finding hashes of any type (this is not reserved only to sha1) of 1-6 chars is not that hard even for a desktop machine. That’s why (good) services don’t store only the hashed passwords in the database. They (should) add a (sometimes) random additional string to the pass to make it even longer (try reversing a hash of a 128 char string), so finding the hashes is not trivial. It’s called “salting”.

Unless you can prove to find a collision for sha1, which hasn’t been done yet, even with countless thousands of hours of a BOINC project, your 49-minute statistic doesn’t prove anything except that Amazon clusters are fast .
sbo
November 17th, 2010 - 09:53Sha 1 is broken in a matter of collision. It has a theoretical break, from 2005, that lets the attacker find pairs of messages with same hash (collision) in lower computational complexity than it should provide.

BUT this has little to do with cracking passwords which is another property of hash functions , where also the “theoritical” protection is good but is “lowered” by other things (such as policy, no salting, weak passwords, attacks that compute “words” with char set instead of trying to find ALL the combinations,shortcut for cracking doesnt exist in a matter of flaw in sha-1 design in order to reduce the complexity).

The problem with the hash functions now (because of the NIST Competition) is their design. Before the competition (and the attacks on SHA1) everyone just took MD5 (for example) made some changes and announced a NEW hashing scheme. Now it is obvious that some serious works needs to be done…

The thing we should keep from this article is that A) we have to learn how (and where) to use hashes b) Nowadays computation power (ever for the average user) is not a dream. A combination of 6 char pass for example isn’t an obstacle of many “days” of computing/calculating or an hour in a super computer ( big coorporation, agencies, NSA ,CIA etc). Its there waitng for you and with the use of distributed computing the average user may soon be able to crack even strong 7-8 char password easily without waiting and waiting…

And as we have seen from this post, many people, don’t know even the basics of password protection and hashing. That’s what should alert the community.

Good work again for the author.
Cyber Killer
November 17th, 2010 - 10:02OK, I need to correct myself – as @sbo says, there are collisions in sha1. Anyway the rest of my previous comment stays the same .
blandyuk
November 17th, 2010 - 10:16I cracked 13/14 of those hashes in 15 seconds with my large dictionaries and hashcat nVidia GPUs are slower than ATI GPUs. See benckmarks below:

http://golubev.com/gpuest.htm
Jan Dittberner
November 17th, 2010 - 11:49Normal password hashing routines (i.e. glibc’s crypt function) use (pseudo)-random salt values that increase the complexity of such brute force attempts by orders of magnitude. Is there still any serious application or OS using non-salted passwords?
Thomas Roth
November 17th, 2010 - 11:58Actually yes, I see this a lot of times. I’m also shocked on how many new software projects still use MD5.
Benjamin
November 17th, 2010 - 14:31Adding salt to a password won’t increase the complexity of a brute-force attack. The salt is stored *with* the hash, after all. So instead of searching for “aaa”, “aab”, …, I search for “salt_aaa”, “salt_aab”. Salt is intended to prevent rainbow attacks, where someone makes a huge table of all N-character passwords and lets passwords be looked up by their hash. No brute force needed!
David Schwartz
November 17th, 2010 - 14:40This shows that short passwords are broken. It’s not any defect of the hashing algorithm (except perhaps that it lacks seasoning).
Carsten
November 17th, 2010 - 15:01Cool article, and great job Thomas! It would be awesome to see this cracking weak SSL/TLS ciphers.

6 chars is maybe not that impressive, and yes there are some caveats, like no salting of the hashes. But still the pure power and prowess of those that are cracking passwords/hashes seems to be creeping closer to the current password and cryptographic key length standards…
sha1 hash
November 17th, 2010 - 17:49Wow. Quite frightening. Wonder how it would perform for longer PW? Would be interesting to see how fast it could crack MD5.
David Barr
November 17th, 2010 - 22:02Benjamin: Salt would help if you have multiple hashes with a different salt for each. If this wasn’t done, an attacker would only have to compute the hash each of his guesses one time to compare against multiple stored hashes.
Anonymous Troll
November 18th, 2010 - 04:06SO OLD!

Please ppl stop reinventing the wheel…
news.electricalchemy.net/2009/10/cracking-passwords-in-cloud.html

¬¬
Thomas Roth
November 18th, 2010 - 08:27It’s not old. The GPU instances of Amazon launched 3 days ago.
Grugnog
November 18th, 2010 - 17:34Properly stretched passwords – which would include most linux passwd formats, and even several web applications (such as Drupal) that have adopted the phpass library would have effectively prevented this attack. Hashes are designed to be fast, as well as secure – and hence a single round of hashing is a very poor defence. Stretching applies many repeated rounds of salting and hashing, which make an attack that checks many passwords extremely slow (many years), but is still fast enough that a single password can be checked in a reasonable time interactively. Read http://www.openwall.com/articles/PHP-Users-Passwords for more details.
scriptkiddieloveslamers
November 18th, 2010 - 19:08What about SSHA, and what clown would use SHA-1 for passwords anyways?
淘知识
November 19th, 2010 - 02:00Very Cool.SHA-1 is over.
Platinum
November 19th, 2010 - 04:04shit, your password is too short, most of them can be searched by google

you can put all sha1 of 6byte password in a 30$ harddisk

and if you want guess a 8byte password in ec2, you need pay 36 ^ 2 * 2$

wtf, stupid article
sbo
November 19th, 2010 - 08:52SHA-1 was over long time ago, but for other reasons. i think that even other “new” and more “secure” hash algorithms would have about the same results, because we are talking about DISTRIBUTED brute force, which mean, “TRY TO CALCULATE MORE IN LESS TIME”. So The only thing we can talk about is the “CALCULATE” phase which has to do at most with a)How many ops has one hash application b)How many ops would the “cracker” be able o run.

About the sha-1 and collision, one project about the collision was already stoped, and most people say thats it is mainly a “theoritical break”, but a) Colliding pairs have been announced from 2005 b) the computational complexity of proposed method is really low (2^52 or 2^59 i think) , always in contrast to what is the theoritical boundary (2^80).

Salting has to do with Bruteforce also. a) it makes weak and small passwords stronger b)Salt hs to e secret b) You will have to compute the over all hash but have the knowledge of the salt and Pass part. c) Most crackers (i talk about the programms) bruteforce passwords and dont mess with salt (so i wont really help). d) There is a small possibility that you will break the password with Pass1.Salt1 , so you will go try pass1 as password, BUT what happens if the salt was SaltX? This means you have another password (passX). So hash(Pass1.salt1) != hash(pass1.saltX) != hash(pass2.SaltX).Complete failure and wast of time. (without hash, even we could find another pass it wouldnt be a problem , it would be a collision, but then you can go try your luck in a casino too…Or maybe you have exhausted it…)

Oh! and always remember that we are talking about complexity but also probalities…

Finally i have to mention that this article is good and i dont some people here, it presents a practical application, using GPUs ,potential of distributed computed, and ALL these with the help of amazon (no, you dont have to go to the “nearby” univ./research center, or hack into it and “run” your process, or you can just continue say that distributed computing and the use of GPUs is something you can do everyday for fun and profit with low cost, please send me an email, to inform me to i am jealous and i need it for work).
sbo
November 19th, 2010 - 09:04Also, many people said about the 6byte (i prefer the 6-char -beacause of the “charset used”) password, that are quite small and weak. I agree with that,as i mentioned before, but many people continue using these kind of passwords. Also the results where good,and can be applied to even bigger passwords (you may spend some days but not decades). Think of a completely distributed CPU/GPU cracked (with good and optimized implementation), the total computation time needed would be pretty much lower that ordinary bruteforcing, or even bruteforcing with the “help” of a single GPU. instead calculating for example 500mil hashes per sec you could calculate MANY more.

And if someone think that this methods are “junk”, it is better not to be interested in BRUTEFORCING, or in general exhaustive search in many problems.
sysiu
November 19th, 2010 - 10:33yes, i don’t see the point of salting too
Carlos
November 19th, 2010 - 10:38A benchmark with pyrit for hacking WPA/2 would be interessting…8 Tesla C1060 cards can try 88.000 PMK/s from a wordlist.. 2 Tesla C2050 maybe 25.000 keys/s? Is it possible to use pyrit with these cards in the amazon cloud?
sysiu_wtf
November 19th, 2010 - 16:45@sysiu- is this because your lacking in knowledge or braincells- or are you another monkey from uk?
sbo
November 19th, 2010 - 18:32It is not a matter of lacking knowledge. Its a matter of boredom…
Some people are bored to THINK, to read (other posts, articles books), to ask (someone else) and to search.

Thats the problem.

Salting. The idea behind salting is “extending” a password by a set of bytes (a word, some bits, some characters etc). In this way the PASSWORD becomes more complex. IF someone doesnt know the salt then he/she will have to crack it to, even worse if he doesnt know the existance of the salt he wont find a good combination. (look at my previous comment).

If the salt is known, then its a matter of bruteforcing the password only by “concating” the salt. (it may take some ops more because of computing the salt’s hash, by that i mean the “blocks” needed to somputed by the hash algo.)

And a very good answer about the salt, is that, IT WORKS. as simple as that, it just make it difficult to many people to crack a pass. May be it will send some scriptkiddies away. Maybe even some security analyst or hacker will prefer not spending his time on a pass with hash.

But again, read the post, read some comments, and search ,ask ,think, whatever . just DONT say something just to say it…
dfdt
November 20th, 2010 - 11:26This is pretty stupid. Something is considered “broken” in cryptography, if you find an attack that is faster than bruteforce. Bruteforce is always a valid option and there’s quite nothing one can do to prevent it. While SHA-1 has some other problems, this is definitively not one of it.
If you expect weak passwords, consider employing key strengthening. But this is not a problem of the hash algorithms but of the expected passwords.

I mean, wtf should this prove? Dont use SHA-3 either (when it’s official) because I can brute-force 1 char passwords on my mobile phone in seconds and some people still use 1 char passwords?

Also, what is the contribution of this article? You did not even facilitate the “cloud”, or did I miss this part? I’ve been able to ask my neighbor to use his GPU for 49 minutes since quite some time, do not need Amazon for this. You mean I can buy more instances and do it faster? Well, I doubt you’ll be able to speed up your silly 6-char experiment significantly because all instances will test the same passwords.
You are welcome to report back when you managed to synchronize GPU cracking for a larger amount of instances and crack some 8-12 byte passwords in negligible time.
Cloud Freak
November 20th, 2010 - 11:47I agree with sbo. I would always implent a minimum requirement of 12 characters for the password and by mixing it up with a specific Salt SHA-1 should Be still good to go.

Allthough I’m a big fan of the cloud, as it makes powerful ressources available, even for the small guy, but all the stuff you can do with it in the criminal perspective sounds scary.

In my opinion the access to those Services is way too easy. You can sign up in 1min and you are good to go. No Verification, nothing.. Maybe that is something to think about.
Thomas Roth
November 20th, 2010 - 12:12Hello, this is not stupid and I did not say that SHA1 is broken. I just say that you should not use SHA1 for passwordhashes, as there are much better ways to do it like PBKDF2.
And, as stated in the article, it was just a benchmark using one instance to do some calculations on how many nodes you’d have to utilize to break stronger passwords.
“Well, I doubt you’ll be able to speed up your silly 6-char experiment significantly because all instances will test the same passwords.” That’s wrong. It’s no problem to split the task of cracking passwords onto several instances. Especially cracking hashes is extremely easy to distribute. The experiment is not ‘silly’ if you would’ve understand what the article is about.
Thomas Roth
November 20th, 2010 - 15:32Read the article, understand it, and then rethink what you’ve written. I didn’t claim that I cracked SHA1 or anything. I just showed how fast one instance can be used to brute force SHA1 password hashes and that one can easily use a lot of nodes in the cloud to make password cracking faster. Maybe you should learn what a scriptkiddie actually is.
Gonggo Ballak
November 20th, 2010 - 18:10could you please change the font? this cursive font is not good to read, THANKS!
dfdt
November 20th, 2010 - 18:21“It’s no problem to split the task of cracking passwords onto several instances.” Yes, this aint no problem, obviously. But splitting the crack of max 6byte passwords is, because the workloads are so small you won’t benefit from distribution very much.

Also, PBKDF2 may employ SHA-1 — those are two different things your are talking about.

There are several benchmarks on GPU cracking already. CUDA-Multiforcer is far from new either. It has been run a thousand times already in a set-up similar to the one you have described. The only thing new is that you are now able to rent the set-up from Amazon.

On your “Benchmark”:
At a rate of 250M/sec for one node, 7chars would take 77hours in this set-up. 8chars would already require 307 days (worst-case, half the time in average for uniform distribution).
Consequently, if we’d use 307 instances (and suppose perfect linear scalability) to crack an 8-char hash in a single day, we’d have to spend 307*24*$2.10, that are $15472. Neat; but people who have that money to spend, probably buy their own clusters.
Oh, and guess what, a 10 character password already requires 7594 years. Or almost 140 Million US Dollar to rent time from Amazon to crack in a single day.

So what we learn from this is not “Don’t use SHA-1″, but rather “Dont use short passwords”.
Thomas Roth
November 20th, 2010 - 21:08“Also, PBKDF2 may employ SHA-1 — those are two different things your are talking about.” – Yes, it may employs SHA1, and the recommendation is that it does that at least 1000 times. What’s your point?

“There are several benchmarks on GPU cracking already. CUDA-Multiforcer is far from new either. It has been run a thousand times already in a set-up similar to the one you have described. The only thing new is that you are now able to rent the set-up from Amazon.” – I never stated anything else. I just gave it a try on the cloud instance. So what?

“At a rate of 250M/sec for one node, 7chars would take 77hours in this set-up. 8chars would already require 307 days (worst-case, half the time in average for uniform distribution).” – Yes, that’s right. (By the way I’m down to 25 minutes, using some tweaks from the author of the Multiforcer.

“Consequently, if we’d use 307 instances (and suppose perfect linear scalability) to crack an 8-char hash in a single day, we’d have to spend 307*24*$2.10, that are $15472. Neat; but people who have that money to spend, probably buy their own clusters.” – You’re missing that we’re talking about criminals who would want to use that. It’s no problem to get credit card data to buy a cluster of $15k. (Yes, it really is no problem.)

“So what we learn from this is not “Don’t use SHA-1″, but rather “Dont use short passwords”. – Right conclusion. Read the article at Threatpost, I told them exactly the same.
hardcore
November 22nd, 2010 - 07:18It is more important than just passwords.
courts also rely on MD5 and SHA checksums to ‘ensure’ that evidence is not tampered with.
Daniel
November 22nd, 2010 - 12:51Amazing job

=]
Name(required)
November 24th, 2010 - 14:58In the defense of Thomas Roth, I find his article fair and balanced, he never claimed to have invented a new hack of SHA and he posted the limitations in clear.

I found that a lot of comments left by angry trolls smell like “too bad I did not thought about that earlier”

For every argument they will be a counter argument, So what?
damaskino
November 25th, 2010 - 18:06I’m totally agreed with the last comment. Some people are just haters. Read the article, you not agree with it, post your point in a constructive way. WTF, stupid article etc… Are not constructive at all. I found this article pretty interesting on the importance of using long password instead of short. Some comments are interesting too. They complete well the article.
Troll took day off
November 27th, 2010 - 19:01This is awesome and simple. Thanks Tomas.
Anonym
November 28th, 2010 - 13:04Is there any possibility to do that with RAR files?
I have one RAR File I password protected 1 year ago. Inside there are sensitive financial data. But I forgot the password and I need some of them.

OpenCL - The open standard for parallel programming of heterogeneous systems

OpenCL™ is the first open, royalty-free standard for cross-platform, parallel programming of modern processors found in personal computers, servers and handheld/embedded devices. OpenCL (Open Computing Language) greatly improves speed and responsiveness for a wide spectrum of applications in numerous market categories from gaming and entertainment to scientific and medical software.

OpenCL supports a wide range of applications, from embedded and consumer software to HPC solutions, through a low-level, high-performance, portable abstraction. By creating an efficient, close-to-the-metal programming interface, OpenCL will form the foundation layer of a parallel computing ecosystem of platform-independent tools, middleware and applications.

OpenCL is being created by the Khronos Group with the participation of many industry-leading companies and institutions including 3DLABS, Activision Blizzard, AMD, Apple, ARM, Broadcom, Codeplay, Electronic Arts, Ericsson, Freescale, Fujitsu, GE, Graphic Remedy, HI, IBM, Intel, Imagination Technologies, Los Alamos National Laboratory, Motorola, Movidius, Nokia, NVIDIA, Petapath, QNX, Qualcomm, RapidMind, Samsung, Seaweed, S3, ST Microelectronics, Takumi, Texas Instruments, Toshiba and Vivante.

OpenCL 1.1
OpenCL 1.1 includes significant new functionality, including:
• Host-thread safety, enabling OpenCL commands to be enqueued from multiple host threads;
• Sub-buffer objects to distribute regions of a buffer across multiple OpenCL devices;
• User events to enable enqueued OpenCL commands to wait on external events;
• Event callbacks that can be used to enqueue new OpenCL commands based on event state changes in a non-blocking manner;
• 3-component vector data types;
• Global work-offset which enable kernels to operate on different portions of the NDRange;
• Memory object destructor callback;
• Read, write and copy a 1D, 2D or 3D rectangular region of a buffer object;
• Mirrored repeat addressing mode and additional image formats;
• New OpenCL C built-in functions such as integer clamp, shuffle and asynchronous strided copies;
• Improved OpenGL interoperability through efficient sharing of images and buffers by linking OpenCL event objects to OpenGL fence sync objects;
• Optional features in OpenCL 1.0 have been brought into core OpenCL 1.1, including: writes to a pointer of bytes or shorts from a kernel, and conversion of atomics to 32-bit integers in local or global memory.
The OpenCL 1.1 specification and header files are available in the Khronos Registry
The OpenCL 1.1 Quick Reference card.
The OpenCL 1.1 Online Man pages.

OpenCL 1.0
OpenCL 1.0 at a glance
OpenCL (Open Computing Language) is the first open, royalty-free standard for general-purpose parallel programming of heterogeneous systems. OpenCL provides a uniform programming environment for software developers to write efficient, portable code for high-performance compute servers, desktop computer systems and handheld devices using a diverse mix of multi-core CPUs, GPUs, Cell-type architectures and other parallel processors such as DSPs.

The OpenCL 1.0 specification and header files are available in the Khronos Registry
The OpenCL 1.0 Quick Reference card.
The OpenCL 1.0 Online Man pages.

OpenCL PDF Overview (June 2010)

Khronos OpenCL API Registry

The OpenCL API registry contains specifications of the core API; specifications of Khronos- and vendor-approved OpenCL extensions; header files corresponding to the specifications; and other related documentation.

OpenCL Core API Specification, Headers, and Documentation
The current version of OpenCL is OpenCL 1.1.
• OpenCL 1.1 Specification (revision 36, September 30, 2010).
• OpenCL 1.1 C++ Bindings Specification (revision 4, June 14, 2010).
• OpenCL 1.1 Online Manual Pages.
• All of the following headers should be present in a directory CL/ (or OpenCL/ on MacOS X). The single header file opencl.h includes all of the other headers as appropriate for the target platform, and simply including opencl.h should be all that applications need to do.
◦opencl.h - OpenCL 1.1 Single Header File for Applications.
◦cl_platform.h - OpenCL 1.1 Platform-Dependent Macros.
◦cl.h - OpenCL 1.1 Core API Header File.
◦cl_ext.h - OpenCL 1.1 Extensions Header File.
◦cl_d3d10.h - OpenCL 1.1 Khronos OpenCL/Direct3D 10 Extensions Header File.
◦cl_gl.h - OpenCL 1.1 Khronos OpenCL/OpenGL Extensions Header File.
◦cl_gl_ext.h - OpenCL 1.1 Vendor OpenCL/OpenGL Extensions Header File.
◦cl.hpp - OpenCL 1.1 C++ Bindings Header File, implementing the C++ Bindings Specification.
◦Extension template for writing an OpenCL extension specification. Extensions in the registry (listed below) follow the structure of this document, which describes the purpose of each section in an extension specification.

Older Specifications
Older versions of OpenCL provided for reference.
• OpenCL 1.0 Specification (revision 48, October 6, 2009).
• OpenCL 1.0 Online Manual Pages.
• OpenCL 1.0 headers are structured in exactly the same fashion as the OpenCL 1.1 headers described above.
◦opencl.h - OpenCL 1.0 Single Header File for Applications.
◦cl_platform.h - OpenCL 1.0 Platform-Dependent Macros.
◦cl.h - OpenCL 1.0 Core API Header File.
◦cl_ext.h - OpenCL 1.0 Extensions Header File.
◦cl_d3d10.h - OpenCL 1.0 Khronos OpenCL/Direct3D 10 Extensions Header File.
◦cl_gl.h - OpenCL 1.0 Khronos OpenCL/OpenGL Extensions Header File.
◦cl_gl_ext.h - OpenCL 1.0 Vendor OpenCL/OpenGL Extensions Header File.
◦cl.hpp - OpenCL 1.1 C++ Bindings Header File, implementing the C++ Bindings Specification. This header works for OpenCL 1.0 as well as OpenCL 1.1.
◦Extension template for writing an OpenCL extension specification. Extensions in the registry (listed below) follow the structure of this document, which describes the purpose of each section in an extension specification.
•OpenCL 1.0 C++ Bindings. These bindings correspond to OpenCL 1.0 revision 48 and are found in cl.hpp with Doxygen documentation cl.hpp-docs.zip (in ZIP format).

Providing Feedback on the Registry
Khronos welcomes comments and bug reports. To provide feedback, please create an account on the Khronos Bugzilla and file a bug. Make sure to fill in the "Product" field in the bug entry form as "OpenCL" and pick appropriate values for the other fields.

If you are already logged into Bugzilla, the following links will prepopulate the bug report fields as appropriate for:
• feedback on the OpenCL Header Files and (NOTE: you must be logged into Bugzilla before clicking on this link)
• feedback on the OpenCL C++ Bindings (NOTE: you must be logged into Bugzilla before clicking on this link)
You can also post feedback to the OpenCL Message Board in a new thread, or there is an existing thread specifically created to discuss the OpenCL C++ Bindings .

Extension Specifications
1.cl_khr_gl_sharing
2.cl_nv_d3d9_sharing
3.cl_nv_d3d10_sharing
4.cl_nv_d3d11_sharing
5.cl_khr_icd
6.cl_khr_d3d10_sharing
7.cl_amd_device_attribute_query
8.cl_amd_fp64
9.cl_amd_media_ops
10.cl_ext_migrate_memobject
11.cl_ext_device_fission

The .NET Developer's Guide to Windows Security

Summary
The Home Page for "The .NET Developer's Guide to Windows Security"

Table of Contents

Preface
Acknowledgements

Part 1: The Big Picture

Item 1: What is secure code?
Item 2: What is a countermeasure?
Item 3: What is threat modeling?
Item 4: What is the principle of least privilege?
Item 5: What is the principle of defense in depth?
Item 6: What is authentication?
Item 7: What is a luring attack?
Item 8: What is a non privileged user?
Item 9: How to develop code as a non admin
Item 10: How to enable auditing
Item 11: How to audit access to files

Part 2: Security Context

Item 12: What is a security principal?
Item 13: What is a SID?
Item 14: How to program with SIDs
Item 15: What is security context?
Item 16: What is a token?
Item 17: What is a logon session?
Item 18: What is a window station?
Item 19: What is a user profile?
Item 20: What is a group?
Item 21: What is a privilege?
Item 22: How to use a privilege
Item 23: How to grant or revoke privileges via security policy
Item 24: What is WindowsIdentity and WindowsPrincipal?
Item 25: How to create a WindowsPrincipal given a token
Item 26: How to get a token for a user
Item 27: What is a daemon?
Item 28: How to choose an identity for a daemon
Item 29: How to display a user interface from a daemon
Item 30: How to run a program as another user
Item 31: What is impersonation?
Item 32: How to impersonate a user given her token
Item 33: What is Thread.CurrentPrincipal?
Item 34: How to track client identity using Thread.CurrentPrincipal
Item 35: What is a null session?
Item 36: What is a guest logon?
Item 37: How to deal with unauthenticated clients

Part 3: Access Control

Item 38: What is role based security?
Item 39: What is ACL based security?
Item 40: What is discretionary access control?
Item 41: What is ownership?
Item 42: What is a security descriptor?
Item 43: What is an access control list?
Item 44: What is a permission?
Item 45: What is ACL inheritance?
Item 46: How to take ownership of an object
Item 47: How to program ACLs
Item 48: How to persist a security descriptor
Item 49: What is Authorization Manager?

Part 4: COM(+)

Item 50: What is the COM authentication level?
Item 51: What is the COM impersonation level?
Item 52: What is CoInitializeSecurity?
Item 53: How to configure security for a COM client
Item 54: How to configure the authentication and impersonation level for a COM app
Item 55: How to configure the authentication and impersonation level for an ASP.NET app
Item 56: How to implement role based security for a managed COM app
Item 57: How to configure process identity for a COM server app

Part 5: Network Security

Item 58: What is CIA?
Item 59: What is Kerberos?
Item 60: What is a service principal name SPN?
Item 61: How to use service principal names
Item 62: What is delegation?
Item 63: What is protocol transition?
Item 64: How to configure delegation via security policy
Item 65: What is SSPI?
Item 66: How to add CIA to a socket based app using SSPI
Item 67: How to add CIA to .NET Remoting
Item 68: What is IPSEC?
Item 69: How to use IPSEC to protect your network

Part 6: Misc

Item 70: How to store secrets on a machine
Item 71: How to prompt for a password
Item 72: How to programmatically lock the console
Item 73: How to programmatically log off or reboot the machine
Item 74: What is group policy?
Item 75: How to deploy software securely via group policy

Code Samples
Download them here.

How to read online
See the table of contents above, and click on any subject you want to read!

Note that editing has been disabled due to spam. Thanks to all the good people who have helped fix typos, and of course all the fine folks who helped port the final version of the book into this wiki!

And yes, the entire contents of the book are here for your reference, free of charge. But please support my publisher and my family by picking up a hardcopy from your nearest bookstore! If you're looking for classroom training on these topics, see the Pluralsight training page at PluralSight.com/courses. Thanks!

What is a Token

A token is a kernel object that caches part of a user's security profile, including the user SID, group SIDs, and privileges (WhatIsAPrivilege). WhatIsSecurityContext discusses the basics of how this cache is normally used, but there's a bit more to it: A token also holds a reference to a logon session (WhatIsALogonSession) and a set of default security settings that the kernel uses.
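
As a quick illustration of what that cached profile looks like from managed code, here is a minimal sketch (my own, not from the book) that prints the user SID and the group SIDs held in the current token via WindowsIdentity; note that privileges are not surfaced directly by this class.
// Minimal sketch (assumed example, not from the book): peeking at part
// of the security profile cached in the current token.
using System;
using System.Security.Principal;

class TokenContentsSketch {
    static void Main() {
        WindowsIdentity id = WindowsIdentity.GetCurrent();
        Console.WriteLine("User SID:  {0}", id.User);
        foreach (IdentityReference group in id.Groups) {
            Console.WriteLine("Group SID: {0}", group.Value);
        }
    }
}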

Tokens are propagated automatically as new processes are created. A new process naturally inherits a copy of the parent's process token. Even if the thread that creates the process is impersonating, the new process gets a copy of the parent's process token, not the thread token, which surprises most people who are new to impersonation (WhatIsImpersonation). If you want to start a new process running with some other token, see HowToRunAProgramAsAnotherUser.

The .NET Framework provides two classes that allow you to work with tokens: WindowsIdentity and WindowsPrincipal (WhatIsWindowsIdentityAndWindowsPrincipal). If you ever want to look at the token for your process, call the static method WindowsIdentity.GetCurrent. This method returns a WindowsIdentity instance that wraps the token representing the thread's security context. Normally this will give you the process token, unless your thread happens to be impersonating (a rare exception that you can read about in WhatIsImpersonation). This function is the way to discover your program's security context as far as the operating system is concerned: it answers the question "Who am I?", which is very helpful when trying to diagnose security problems such as being denied access to ACL-protected resources like files. I'd recommend including this user name with any errors that you log.
// here's a simple example of a log that includes
// information about the current security context
// (note: 'log' is assumed to be a TextWriter field owned by the
// surrounding class; it isn't defined in this snippet)
void logException(Exception x) {
    IIdentity id = WindowsIdentity.GetCurrent();
    log.WriteLine("User name: {0}", id.Name);
    log.WriteLine("Exception: {0}", x.Message);
    log.WriteLine(x.StackTrace);
}

The vast majority of information in a token is immutable, and for good reason! It would be crazy to allow an application to add new groups to its token, for example. But you can change a couple things: You can enable or disable any privileges that happen to be in your token (HowToUseAPrivilege), and you can control the default owner and DACL (WhatIsAnAccessControlList). This latter feature allows your process (or another process running in the same security context, say a parent process) to control the owner and DACL that will be applied to all new kernel objects, such as named pipes, mutexes, and sections, whenever a specific DACL is not provided explicitly to the creation function. For example, these defaults will be used if you call the Win32 function CreateMutex and pass NULL for the LPSECURITY_ATTRIBUTES argument, which is the normal and correct procedure. If you ever need to change these default settings, call the Win32 function SetTokenInformation, but this will be very rare. You see, by default the operating system will set up your token so that the default DACL grants you and SYSTEM full permissions, which is very secure indeed. Usually the only time you want to deviate from this is if you’re going to share an object between two processes running under different accounts, such as between a service process that runs as a daemon (WhatIsADaemon) and a service controller process launched by the interactive user. In that case, see HowToProgramACLs to learn how to programmatically build your own DACL.
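
To make the default-DACL behavior concrete, here is a minimal managed sketch (my own illustration, not from the book): creating a named mutex without supplying any security information is the .NET analogue of calling CreateMutex with NULL for LPSECURITY_ATTRIBUTES, so the new object picks up the default owner and DACL cached in the process token. The mutex name below is made up for the example.
// Minimal sketch (assumed example): no security descriptor is supplied,
// so the kernel applies the token's default owner and DACL
// (by default, full control for your account and SYSTEM).
using System;
using System.Threading;

class DefaultDaclSketch {
    static void Main() {
        // "Global\ExampleMutex" is a made-up name for illustration.
        using (var mutex = new Mutex(false, @"Global\ExampleMutex")) {
            Console.WriteLine("Mutex created using the token's default DACL.");
        }
    }
}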

Tokens never expire. This makes programmers happy (it would be weird if your process suddenly terminated because its token timed out), but it can be dangerous in some cases. For example, nothing stops a server from holding onto client tokens indefinitely once those clients have authenticated. Running a server with low privilege is good, but keeping a bunch of client tokens in a cache negates all that goodness, because an attacker who manages to take over the server process can use the cached client tokens to access resources (WhatIsImpersonation). Fortunately, Kerberos tickets do expire (WhatIsKerberos), so if any of those tokens had network credentials (WhatIsDelegation), they won't be valid forever.

Occasionally you might want to pass tokens between processes. Say you have factored a server into two processes, a low-privileged process listening on an untrusted network (the Internet) and a high privileged helper process that you communicate with using some form of secure interprocess communication such as COM. If you've authenticated a client in your listener process and want your helper process to see the client's token, you can pass it from one process to another by calling the Win32 API DuplicateHandle. You can obtain the token handle from a WindowsIdentity via its Token property (if you have an IIdentity reference, you'll need to cast it to WindowsIdentity first).
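
Here is a minimal sketch of the managed half of that hand-off (my own illustration, not from the book): it pulls the raw token handle out of a WindowsIdentity, casting from IIdentity first as described above. The DuplicateHandle call itself is a Win32 API you would still have to P/Invoke, and it is not shown; GetClientTokenHandle is just a hypothetical helper name.
// Minimal sketch (assumed example): obtaining the raw token handle that
// DuplicateHandle would need. The P/Invoke to DuplicateHandle is not shown.
using System;
using System.Security.Principal;

static class TokenHandleSketch {
    static IntPtr GetClientTokenHandle(IIdentity clientIdentity) {
        // The Token property lives on WindowsIdentity, so an IIdentity
        // reference has to be cast first.
        WindowsIdentity windowsId = (WindowsIdentity)clientIdentity;
        return windowsId.Token; // raw kernel handle, suitable for DuplicateHandle
    }

    static void Main() {
        IntPtr token = GetClientTokenHandle(WindowsIdentity.GetCurrent());
        Console.WriteLine("Token handle: 0x{0:X}", token.ToInt64());
    }
}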

At some point you might think about passing a token (or its wrapper, a WindowsIdentity) from one machine to another. This is a big no-no in Windows security. A token only has meaning on the machine where it was created, because the groups and privileges in it were discovered based on the combination of a centralized domain security policy and the local security policy of the machine. Local groups and privilege definitions differ across machines, and domain security policy changes if you cross domain boundaries. Even if the operating system were to provide a way to serialize a token for transmission to another machine (it does not), using this "imported" token would lead to incorrect access control decisions! Thus, if a client (Alice, say) has authenticated with a process on one machine, and you want another machine to see Alice's security context, Alice must authenticate with that other machine. Either she can do this directly or you can delegate her credentials (WhatIsDelegation). In other words, the only way to get a token for Alice on a given machine is to use her credentials to authenticate with a process running on that machine.

While I'm on the subject of the machine-sensitive nature of tokens, I should mention that you must never use a token on one machine to perform an access check on an object located on another. For example, resist the temptation to load the security descriptor (WhatIsASecurityDescriptor) for a remote object onto another machine and perform an access check against it using a local token. A token for Alice on machine FOO doesn't have exactly the same groups and privileges it would have if it were produced on machine BAR, so using a token from FOO in access checks against BAR's resources is a very bad idea and a gaping security hole. The correct procedure is to authenticate with a process on the machine hosting the resource and have that process perform the access check. In other words, keep the access checks on the same machine as the resources being protected.

The History of CSS Resets

When artists begin a new painting, they don’t immediately reach for the cadmium red and the phthalo blue. They first prime the canvas. Why? To ensure that the canvas is smooth and has a uniform white hue.

Many web designers prefer to use a CSS "reset" to "prime" the browser canvas and ensure that their design displays as uniformly as possible across the various browsers and systems their site visitors may use.

This is a three-part series of articles on the topic of CSS resets.
Part 1: The History of CSS Reset
Part 2: A Guide to CSS Resets (coming soon)
Part 3: Should You Use CSS Reset? (coming soon)

What Is CSS Reset?
When you use a CSS "reset," you’re actually overriding the basic stylesheet each individual browser uses to style a web page. If you present a website with no CSS whatsoever, the site will still be styled, to a very limited extent, by the browser’s default stylesheet.

The problem is that every browser's stylesheet has subtle but fundamental differences. By using a CSS reset, you're setting the styles of the fundamental CSS elements to a baseline value, thus rendering the browsers' varying style defaults moot.

Some of the most common elements that are styled differently among different browsers are hyperlinks (<a>), images (<img>), headings (<h1> through <h6>), and the margins and padding given to various elements.
So which browser is right, Firefox or IE? It doesn’t matter. What does matter is that the spacing in between your paragraphs and other elements will look dissimilar if you don’t set a style for a paragraph’s margin and padding. — Jacob Gube, Founder of Six Revisions
It might be useful to peruse this chart showing the various browser defaults. Unfortunately, it doesn’t go back to IE6 (which causes so much havoc among stylesheets).

Who Uses Resets?
According to a 2008 poll by Chris Coyier of CSS-Tricks, a solid majority of web designers use one variation or another of a reset. (Coyier’s parameters were fairly broad, possibly accounting for the heavy support of resets in his poll results.)

The poll, which did not purport to be particularly scientific or comprehensive, gave the following results:
- 27% use some version of Eric Meyer’s reset
- 26% indicated they didn’t know what a reset was
- 15% use a "hard reset," or some iteration of a reset using the universal selector
- 14% use a reset as part of a larger CSS framework
- 13% use their own custom reset
- 4% "purposefully do not use one"
Coyier was not surprised with Meyer’s reset coming in first in the polling, calling it "popular, thoughtful, and effective." Somewhat whimsically perhaps, Meyer replied in the comments: "Huh. I actually didn’t expect that at all; I figured the framework resets would win by a country mile. Now the pressure’s totally on! Arrrrgh!"

Early Days of CSS Reset
As far as I can tell, the first mentions of anything we would later consider to be a "reset" came in late 2004, with two very different approaches.

undohtml.css
The first was legendary developer Tantek Çelik’s UndoHTML.css stylesheet, which "strips the browser varnish" from a number of elements.
In an October 2010 email to me, Çelik confirmed he was most likely first out of the gate. "I’m pretty sure I invented/proposed/shared the concept of resetting CSS (though not by that name) in my blog post of 2004," Çelik said.

Çelik's reset removes link underlines and borders from linked images, and sets the font size to 1em for headings, <code>, and paragraphs. In 2009, author and designer Jason Cranford Teague described Çelik's reset as "part sublime and part madness," for reasons that elude me.

"hard reset"
The second was web designer and developer Andrew Krespanis’s "hard reset" to overcome browser defaults for margins and padding (in a snippet he called "the tiny addition I threw in at the last minute").
* {
  padding:0;
  margin:0;
}

When asked about the hard reset, Krespanis mused: "That single line of CSS kick-started my career in a big way, which in retrospect is amusing verging on absurd."

"Certainly no one suggested the reset idea to me, it was something I first suggested to CSS beginners on codingforums.com in early 2004 who defended their over-use of superfluous divs by the confusing rendering caused by default margins and padding on paragraphs, blockquotes, lists, fieldsets, etc." Krespanis said. "I also used it whenever providing examples to people, but wasn’t using it for sites until I started suggesting beginners do so. Eric Meyer was talking about a similar concept at the time, only his was more focused on quality sensible defaults to override those set by [browser] makers and he has continued to develop his over the years since."

Really Undoing html.css
Çelik's reset quickly drew the attention of CSS guru Eric Meyer, who used Çelik's work as a jumping-off point for his first attempt at a global reset, as well as a follow-up almost immediately thereafter. In that same conversation, Çelik said, "About a week and a half later, Eric Meyer went into a lot more detail on his blog and expanded on my work."

The Differences
Perhaps because Krespanis’s method was so simple and so basic (only addressing margins and padding, as opposed to Çelik’s and Meyer’s far more thorough reset), it seemed to attract more attention off the bat. However, this is a simplistic observation. Judging from the comments in both Meyer’s and Krespanis’s blogs, a number of people were considering something along these lines at around the same time; it’s also worth noting that several commenters in Meyer’s blog discussed the margin/padding reset weeks before Krespanis posted about it (Krespanis himself noted the technique on Meyer’s blog before posting on his own).

As the man said, it steamboats when it’s steamboat time. The idea of some sort of CSS reset had become, at least for many designers, a necessary one.

In 2004, Krespanis wrote:
"A big part of dealing with cross-browser differences is accounting for the default property values of elements in each browser; namely padding and margin. I use the following declaration in every new site I design; it has saved me many hours of nitpicking. * {padding:0; margin:0;}
It doesn’t seem like much at first, but wait till you look at your mildly styled form in 11 browsers to find the positioning identical in all of them; or your button-style lists are perfect the first time, every time."

The difference between Çelik's and Meyer's early efforts, and Krespanis's "afterthought," is, of course, Krespanis's use of the * (which, in CSS, is the universal selector that matches all elements in a web page).

Moving Away from the Universal Selector "hard reset"
Like a drop of antimatter, that single * had widespread effects. As Krespanis went on to note, the use of the universal selector canceled the padding and margin of every element in the page, sometimes to the detriment of the individual design, and it often fouled up forms and lists.

Today, it’s recognized that using the universal selector has repercussions on web page performance because of the resource tax involved in selecting and assigning styles to all elements. Co-creator of the Firefox browser, David Hyatt, advises developers to make sure "a rule doesn’t end up in the universal category," as a best practice for writing efficient CSS.

Russ Weakley, CSS book author and co-chair of the Web Standards Group, outlines a downfall of the "hard reset" method: "Once you have removed all margin and padding, this method relies on you specifically styling the margins and padding of each HTML element that you intend to use."

It didn’t take long for people to start modifying the original "hard reset" to something more specific. Steve Rider, for example, posted what he called a "no assumptions" reset on Meyer’s blog, tweaked to his own preferences:
body {margin: 0px; padding: 8px;}
body, div, ul, li, td, h1, h2, h3, h4, h5, h6 {font-size: 100%; font-family: Arial, Helvetica, sans-serif;}
div, span, img, form, h1, h2, h3, h4, h5, h6 {margin: 0px; padding: 0px; background-color: transparent;}

Web developer Seth Thomas Rasmussen tossed his hat in the ring with a variation that gives some margin back to selected elements:
h1, h2, h3, h4, h5, h6, ul, ol, dl, table, form, etc. {margin: 0 0 1em 0}
And it didn’t take long for Krespanis to come up with his own modification:
* {padding:0; margin:0;}
h1, h2, h3, h4, h5, h6, p, pre, blockquote, label, ul, ol, dl, fieldset, address {
margin:1em 5%; }
li, dd { margin-left:5%; }
fieldset { padding: .5em; }

Between the wide-ranging resets from Çelik and Krespanis, the more "targeted" resets, and the objections to the sometimes-overwhelming changes the resets made, the games were on. Before long, people were trying all kinds of different resets and posting about it.

Lost and Found: Faruk Ateş and initial.css
In July 2005, Web developer Faruk Ateş posted his initial.css on the now-defunct Kurafire.net. In subsequent discussions on Meyer’s and other design blogs, a few commenters recalled Ateş’s efforts. Through the magic of the Wayback Machine, his initial.css reset can be examined. Like others, Ateş revised his reset after discussion and commentary was received.

Ateş said he had been using his reset for a year or so in his client designs, with little or no ill effect. I believe that his was the first truly "global reset" to become publicly available, though he said in a 2010 email to me that he wasn’t at all sure that was the case. In that same exchange, Ateş wrote: "It was also deliberately kept pretty small, because I didn’t like the idea of a huge ton of CSS to reset everything in the browser, when most every site I was building at the time didn’t actually use 50-60% of the less-common elements that were being reset. … [W]here Eric did the more usable-under-any-circumstances version, exhaustive and very complete (for its time), mine was more of the ‘hey, just these handful of styles have made my life as web developer easier’ and I shared them because, well, why wouldn’t I?"
His reset included:
- Setting the html, body, form, fieldset elements to have zero margins and padding, and their fonts to 100%/120% and a Verdana-based sans-serif font stack
- Setting the h1, h2, h3, h4, h5, h6, p, pre, blockquote, ul, ol, dl, address elements to have a 1em vertical margin, no horizontal margin, and no padding
- Giving a 1em left margin to the li, dd, blockquote elements
- Setting the form label cursor to pointer
- Setting the fieldset border to none
- Setting the input, select, textarea font sizes to 100%
As others noted before, and Meyer noted afterward, Ateş eschewed the universal selector because of its detrimental effect on forms (though it was in his first iteration).

For whatever reason, Ateş’s reset received a good bit less attention than some of the others, though it’s clear that many elements in the YUI and Meyer resets that followed appeared first in Ateş’s coding.

In October 2010, Ateş wrote that he never rewrote his reset after the single revision: "Any additions I would have made to it would’ve made it quickly grow in size, at which point people could’ve and should’ve just used Eric’s more comprehensive one. Eventually I stopped using my own initial.css and nowadays I usually copy from YUI or, more recently, the HTML5 Boilerplate, which contains parts from both YUI and Eric Meyer’s latest reset." In 2007, Web designer Christian Montoya provided an updated version of Ateş’s reset that he relies on for his own work.

The Yahoo! User Interface CSS Reset
The Yahoo! User Interface (YUI) reset first came on the scene in February 2006, written by Nate Koechley, the YUI senior frontend engineer, along with his colleague Matt Sweeney.
body,div,dl,dt,dd,ul,ol,li,h1,h2,h3,h4,h5,h6,pre,form,fieldset,input,textarea,p,blockquote,th,td {
margin:0;
padding:0;
}
table {
border-collapse:collapse;
border-spacing:0;
}
fieldset,img {
border:0;
}
address,caption,cite,code,dfn,em,strong,th,var {
font-style:normal;
font-weight:normal;
}
ol,ul {
list-style:none;
}
caption,th {
text-align:left;
}
h1,h2,h3,h4,h5,h6 {
font-size:100%;
font-weight:normal;
}
q:before,q:after {
content:'';
}
abbr,acronym {
border:0;
}


The effects of this reset on any stylesheet were dramatic. While the html element was left untouched, almost every often-used HTML element had its margins and padding zeroed out. Images lost their borders. Lists lost their bullets and numbering. And every single heading was reset to the same size.

The reset was, of course, one part of a much larger framework, the Yahoo! User Interface Library (YUI), used for developing web-based user interfaces.

The first YUI CSS Reset was, I believe, the first truly "global" CSS reset that received widespread public notice. Microsoft Developer Network blogger Dave Ward said, "YUI's reset is the first that could truly be considered a CSS reset by current standards."

The idea, as Koechley said in multiple presentations and blog posts, was to provide a "clean and sturdy foundation" for an identically formatted "clean slate" that individual stylesheets can build upon for a nearly-identical look no matter what browser or machine is being used to display the site. The YUI CSS Reset "removes everything" stated by the browser’s default CSS presentation.

In a September 2006 presentation given at Yahoo’s "Hack Day" event, Koechley told the audience: "Sound foundation ensures sane development." The reset "overcome[s] browser.css" and "level[s] the playing field."

In a bit of whimsy, Koechley wrote that the YUI CSS Reset "[h]elps ensure all style is declared intentionally. You choose how you want to <em>phasize something. Allows us to choose elements based on their semantic meaning instead of their 'default' presentation. You choose how you want to render <li>sts."

In an October 2007 slideshow, Koechley reminded users that the reset is "a good reminder that HTML should be chosen based on content alone." He restated that in a 2010 interview and went on to note that most people still don’t know that the browsers provide a strong layer of presentational functionality. If nothing else, he said, resets serve to bring all browsers down to a "neutralized, normalized … lowest common denominator" state that designers can then build from. Resets, he said, force people to rethink the semantics of HTML elements.

Koechley no longer works for the YUI team, and is instead a freelance web developer. He isn’t sure what, if any, changes will be made in the YUI reset to accommodate HTML5 and CSS3.

Eric Meyer used the YUI reset as a base for his own expansive reset, garnering even more attention than the YUI code.

Eric Meyer’s CSS Reset
Why do this at all? The basic reason is that all browsers have presentation defaults, but no browsers have the same defaults. … We think of our CSS as modifying the default look of a document — but with a ‘reset’ style sheet, we can make that default look more consistent across browsers, and thus spend less time fighting with browser defaults. — Eric Meyer, Author of leading CSS books

To paraphrase the 1983 commercial, when Eric Meyer talks, people in the design and development community listen. He started with a September 2004 post that, as noted above, itself built on work by Tantek Çelik.

Both Meyer and Çelik focused on "undoing" the html.css file that controlled the way Gecko-based browsers like Firefox and SeaMonkey displayed websites on the individual computer. Meyer’s follow-up on the first post focused on (fractionally) rebuilding the html.css file to make sites relatively usable again.

Both Çelik and Meyer envisioned their work as immediately practical and applicable to web design. Çelik told me that his reset "just made sense as a foundation to simplify coding CSS and make it more predictable" — and it didn't take them, or anyone else apparently, very long to comprehend how much new control they could gain over the display across almost all browsers. The power, the power!

Meyer had other fish to fry in the ensuing years, including the birth of a daughter, the care and feeding of a radio show featuring big band and jazz music, and a truly intimidating schedule of design projects and conferences. However, he returned to the subject of CSS resets in April 2007. He brought up the topic at the 2007 An Event Apart conference in Boston, where he specifically avoided the idea of using a universal selector to reset the CSS. "Instead," he wrote, "I said the styles should list all the actual elements to be reset and exactly how they should be reset."

Meyer based his "blanket reset" on Yahoo’s YUI reset, making some significant tweaks along the way. Meyer’s reset.css code included the following:
Eric Meyer’s CSS Reset (2007)
html,body,div,span,
applet,object,iframe,
h1,h2,h3,h4,h5,h6,p,blockquote,pre,
a,abbr,acronym,address,big,cite,code,
del,dfn,em,font,img,ins,kbd,q,s,samp,
small,strike,strong,sub,sup,tt,var,
dd,dl,dt,li,ol,ul,
fieldset,form,label,legend,
table,caption,tbody,tfoot,thead,tr,th,td {
margin: 0;
padding: 0;
border: 0;
font-weight: normal;
font-style: normal;
font-size: 100%;
line-height: 1;
font-family: inherit;
text-align: left;
}
table {
border-collapse: collapse;
border-spacing: 0;
}
ol,ul {
list-style: none;
}
q:before,q:after,
blockquote:before,blockquote:after {
content: "";
}

Two things to note with regard to the 2007 version:
1. The biggest selector, which includes more HTML elements such as <applet>, also sets other CSS properties such as line-height and text-align
2. Some elements, such as <hr>, <input>, and <select>, were left out of the CSS reset
Meyer intentionally avoided the universal selector so that he would not reset "inputs and other interactive form elements." He continued: "If the :not() selector were universally supported, I'd use it in conjunction with a universal selector, but it isn't."
Eric Meyer’s reset took the concept of resetting margin and padding and took it to a whole new level, stripping styles from a number of elements, forcing you to think about what you wanted and add them back in. List items would no longer have bullets, headings would no longer be bolded, and most elements would be stripped of their margin and padding, along with some other changes. — Jonathan Snook, Author, Front-End Developer
The effects of both the YUI reset and the Meyer reset were staggering. Almost everything included in the browser's default stylesheet was reset to zero, including many elements designers rarely touched. Interestingly, the first comments on Meyer's reset suggested more additions, particularly removing link borders and setting the vertical-align property to baseline. Not everyone approved; Jens O. Meiert, a web designer/developer who works at Google, commented: "Is that a trick style sheet? Why should anyone interested in elegant code style use such a style sheet, that even in multi-client projects results in way too many overridden properties?"

Others complained about the redundancy of resetting elements to zero and then re-resetting them later in the stylesheet, adding unwanted weight to the stylesheet. Of course, others defended the reset, resulting in a lively (yet polite and mutually respectful) debate.

The fracture lines in the development/design community over the idea of the "global reset" were already beginning to appear. (More about that is forthcoming.)

Meyer made some significant changes to his original reset; two days later, he added a comment about background colors, changed the font-weight and font-style properties to inherit, reset the vertical-align property to baseline, and zeroed out the borders on linked images.
Eric Meyer’s CSS Reset (2007, revision 1)
/* Don't forget to set a foreground and background color
on the 'html' or 'body' element! */
html, body, div, span,
applet, object, iframe,
h1, h2, h3, h4, h5, h6, p, blockquote, pre,
a, abbr, acronym, address, big, cite, code,
del, dfn, em, font, img, ins, kbd, q, s, samp,
small, strike, strong, sub, sup, tt, var,
dd, dl, dt, li, ol, ul,
fieldset, form, label, legend,
table, caption, tbody, tfoot, thead, tr, th, td {
margin: 0;
padding: 0;
border: 0;
font-weight: inherit;
font-style: inherit;
font-size: 100%;
line-height: 1;
font-family: inherit;
text-align: left;
vertical-align: baseline;
}
a img, :link img, :visited img {
border: 0;
}
table {
border-collapse: collapse;
border-spacing: 0;
}
ol, ul {
list-style: none;
}
q:before, q:after,
blockquote:before, blockquote:after {
content: "";
}

And two weeks after that, he tweaked the code yet again. This time he added outline: 0 to the main rule, removed the outline on the :focus pseudo-class (with a reminder to define focus styles), and moved the line-height declaration onto the body element along with default foreground and background colors.
Eric Meyer’s CSS Reset (2007, revision 2)
html, body, div, span, applet, object, iframe,
h1, h2, h3, h4, h5, h6, p, blockquote, pre,
a, abbr, acronym, address, big, cite, code,
del, dfn, em, font, img, ins, kbd, q, s, samp,
small, strike, strong, sub, sup, tt, var,
dl, dt, dd, ol, ul, li,
fieldset, form, label, legend,
table, caption, tbody, tfoot, thead, tr, th, td {
margin: 0;
padding: 0;
border: 0;
outline: 0;
font-weight: inherit;
font-style: inherit;
font-size: 100%;
font-family: inherit;
vertical-align: baseline;
}
/* remember to define focus styles! */
:focus {
outline: 0;
}
body {
line-height: 1;
color: black;
background: white;
}
ol, ul {
list-style: none;
}
/* tables still need 'cellspacing="0"' in the markup */
table {
border-collapse: separate;
border-spacing: 0;
}
caption, th, td {
text-align: left;
font-weight: normal;
}
blockquote:before, blockquote:after,
q:before, q:after {
content: "";
}
blockquote, q {
quotes: "" "";
}

After some more cogitation and discussion, he released another version of his reset. He changed the way quotes are suppressed in the blockquote and q elements, and removed the inherit values for font-weight and font-style.
Eric Meyer’s CSS Reset (2008)
html, body, div, span, applet, object, iframe,
h1, h2, h3, h4, h5, h6, p, blockquote, pre,
a, abbr, acronym, address, big, cite, code,
del, dfn, em, font, img, ins, kbd, q, s, samp,
small, strike, strong, sub, sup, tt, var,
b, u, i, center,
dl, dt, dd, ol, ul, li,
fieldset, form, label, legend,
table, caption, tbody, tfoot, thead, tr, th, td {
margin: 0;
padding: 0;
border: 0;
outline: 0;
font-size: 100%;
vertical-align: baseline;
background: transparent;
}
body {
line-height: 1;
}
ol, ul {
list-style: none;
}
blockquote, q {
quotes: none;
}

/* remember to define focus styles! */
:focus {
outline: 0;
}

/* remember to highlight inserts somehow! */
ins {
text-decoration: none;
}
del {
text-decoration: line-through;
}

/* tables still need 'cellspacing="0"' in the markup */
table {
border-collapse: collapse;
border-spacing: 0;
}

And finally, he released an ever-so-slightly modified version of the reset as what seems to be the final iteration, warning users: "I don’t particularly recommend that you just use this in its unaltered state in your own projects. It should be tweaked, edited, extended, and otherwise tuned to match your specific reset baseline. Fill in your preferred colors for the page, links, and so on. In other words, this is a starting point, not a self-contained black box of no-touchiness."
Eric Meyer’s CSS Reset (most current)
html, body, div, span, applet, object, iframe,
h1, h2, h3, h4, h5, h6, p, blockquote, pre,
a, abbr, acronym, address, big, cite, code,
del, dfn, em, font, img, ins, kbd, q, s, samp,
small, strike, strong, sub, sup, tt, var,
b, u, i, center,
dl, dt, dd, ol, ul, li,
fieldset, form, label, legend,
table, caption, tbody, tfoot, thead, tr, th, td {
margin: 0;
padding: 0;
border: 0;
outline: 0;
font-size: 100%;
vertical-align: baseline;
background: transparent;
}
body {
line-height: 1;
}
ol, ul {
list-style: none;
}
blockquote, q {
quotes: none;
}

/* remember to define focus styles! */
:focus {
outline: 0;
}

/* remember to highlight inserts somehow! */
ins {
text-decoration: none;
}
del {
text-decoration: line-through;
}

/* tables still need 'cellspacing="0"' in the markup */
table {
border-collapse: collapse;
border-spacing: 0;
}

And it wasn’t long before designers found problems in the "final" version of the reset. So nothing is ever really final.

The evolution of Meyer's thinking, as it was affected by his own reflection and the input from many others, is fascinating. Meyer has frequently reminded designers that they shouldn't just slap his reset stylesheet into their work willy-nilly.

At the 2008 An Event Apart conference in Chicago, Meyer, according to participant Trevor Davis, told designers not to "just drop in his reset stylesheet if there is a more effective way to accomplish the same thing." He said, "Instead, turn the reset into a baseline. Modify the reset stylesheet to suit your project, don't use his reset stylesheet without modifying anything. Don't reset then re-style, reboot instead."

In a 2010 interview, Meyer told me: "I had not anticipated it being, you know, popular. I hadn’t really anticipated tons of people using it. I had sort of anticipated, ‘Hey, here’s a way of thinking about this, you know, here’s what people who really have a lot of experience with this might take as a starting point and play with it, and like, we’re all sort of professionals here and we get what this is about.’ And it kind of went everywhere and got used by tons of people who maybe haven’t thought about the Web for a decade or more as some of us have. … You have to be careful about what you throw out there in the world because you never know what’s going to catch on."

Conclusion
This in-depth inspection of the history of CSS resets should set the stage for what we will talk about next in this three-part series of articles. In part 2, we will discuss the various CSS reset options available to you for incorporation into your projects. In part 3, we will discuss the on-going debate on whether or not web designers should use CSS resets.

This is a three-part series of articles on the topic of CSS resets.
Part 1: The History of CSS Reset
Part 2: A Guide to CSS Resets (coming soon)
Part 3: Should You Use CSS Reset? (coming soon)


Michael Tuck is an educator, writer, and freelance web designer. He serves as an advisor to the Web Design forum on SitePoint. When he isn't teaching or designing sites, he is doing research for the History Commons. You can contact him through his website, Black Max Web Design.