HTML and other markup languages such as XML describe documents as hierarchies of nested tags, which browsers expose as the Document Object Model (DOM). This structure can be visualized as a graph.
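To make the tag-hierarchy-as-graph idea concrete, here is a minimal Python sketch that turns an HTML string into a list of nodes and parent-to-child edges using only the standard library's `html.parser`. It is not how Websites as Graphs itself works. The class name `TagGraphBuilder`, the helper `build_tag_graph`, and the sample markup are all made up for illustration.

```python
from html.parser import HTMLParser

# Void elements never get a closing tag, so they must not stay on the stack.
VOID_TAGS = {"area", "base", "br", "col", "embed", "hr", "img", "input",
             "link", "meta", "source", "track", "wbr"}


class TagGraphBuilder(HTMLParser):
    """Collects (parent, child) edges between tag nodes (hypothetical helper)."""

    def __init__(self):
        super().__init__()
        self.nodes = []    # node id -> tag name
        self.edges = []    # (parent id, child id)
        self._stack = []   # ids of the elements that are currently open

    def handle_starttag(self, tag, attrs):
        node_id = len(self.nodes)
        self.nodes.append(tag)
        if self._stack:
            # The innermost open element is the parent of the new node.
            self.edges.append((self._stack[-1], node_id))
        if tag not in VOID_TAGS:
            self._stack.append(node_id)

    def handle_endtag(self, tag):
        # Pop back to the nearest open element with this name;
        # stray closing tags that match nothing are ignored.
        for i in range(len(self._stack) - 1, -1, -1):
            if self.nodes[self._stack[i]] == tag:
                del self._stack[i:]
                break


def build_tag_graph(html):
    """Return (nodes, edges) describing the tag hierarchy of an HTML string."""
    builder = TagGraphBuilder()
    builder.feed(html)
    builder.close()
    return builder.nodes, builder.edges


# Made-up sample document.
SAMPLE = ("<html><body><div><p>Hi <a href='#'>link</a></p>"
          "<img src='x.png'></div></body></html>")

nodes, edges = build_tag_graph(SAMPLE)
for parent, child in edges:
    print(f"{nodes[parent]} -> {nodes[child]}")
```

The edge list can be handed to any graph layout tool. Text nodes and attributes are deliberately left out, so the result covers only the tag skeleton rather than the full DOM.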
Websites as Graphs (by Sala of Onethousandpaintings.com) takes a web page URL as input and outputs a graph of the underlying HTML structure. Used on any large content site like CNN or BoingBoing, it reveals the underlying logic of presentation used to build those pages. Related information forms clusters, with color codes revealing a tendency towards table- or CSS-based design (the former being a no-no, obviously) as well as the density of images, links, etc.
While the graphs make for interesting images, it is still difficult to draw firm conclusions about a page from its graph alone. But a well-structured document will always reveal itself as such, as will a badly structured one. Websites as Graphs should be of interest to anyone who has tried to define a page structure, particularly if that structure conforms to the current CSS-based ideal of “logic-not-presentation” web design.
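The table-versus-CSS tendency and the density of images and links can also be roughed out as plain numbers. The Python sketch below tallies start tags into a few groups. The groupings in `CATEGORIES`, the `TagTally` class, and the sample markup are assumptions made for illustration, not the color scheme the tool actually uses.

```python
from collections import Counter
from html.parser import HTMLParser

# Hypothetical groupings, chosen only for this example.
CATEGORIES = {
    "table": {"table", "tr", "td", "th"},
    "div": {"div"},
    "link": {"a"},
    "image": {"img"},
}


class TagTally(HTMLParser):
    """Counts start tags per category (hypothetical helper)."""

    def __init__(self):
        super().__init__()
        self.counts = Counter()

    def handle_starttag(self, tag, attrs):
        for category, tags in CATEGORIES.items():
            if tag in tags:
                self.counts[category] += 1
                return
        # Anything not listed above is lumped together.
        self.counts["other"] += 1


def tally_tags(html):
    """Return a Counter of tag categories found in an HTML string."""
    tally = TagTally()
    tally.feed(html)
    tally.close()
    return tally.counts


# Made-up sample: a table-based layout followed by a single div.
SAMPLE_PAGE = ("<table><tr><td><a href='#'><img src='a.png'></a></td></tr></table>"
               "<div><p>text</p></div>")

counts = tally_tags(SAMPLE_PAGE)
print(dict(counts))
```

A page whose `table` count dwarfs its `div` count is probably laid out with tables, though a count alone cannot tell a layout table from a genuine data table.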
Update: Markavian has hacked up a remix version which allows you to browse the tag structure interactively and even follow links to new documents. To use it, point your browser to a URL in the following format:
“mysite.com” should obviously be replaced with whatever URL you want to explore.
- Websites as Graphs original post, with graph examples from popular web sites.
- Websites as Graphs (Applet), online applet which allows the input of user-specified URLs.
- Flickr: websitesasgraphs tag, user-contributed graphs of their own web sites.
- One thousand paintings, Sala’s art experiment in selling generic art objects priced according to a numeric formula based on the number of paintings sold.
- Christian Riekoff: Tree. Another visualization of HTML hierarchies, using 3D tree structures.