Google released an interesting study of the content of over a billion webpages in its database. It’s neatly separated into categories of markup and identifies the most-used HTML tags and attributes. I found the comments on the most frequent mistakes especially interesting. Unsurprisingly, many tags are used for presentational rather than semantic purposes, and many pages are brimming with deprecated elements. A surprising number of pages contained specialized tags, and in some cases there’s no known online documentation that says which program generated them or for what purpose. It’s a fun read.