Websites as graphs
Every day, we look at dozens of websites. The structure of these websites is defined in HTML, the lingua franca for publishing information on the web. Your browser's job is to render the HTML according to the specs (most of the time, at least). You can look at the code behind any website by selecting the "View source" option somewhere in your browser's menu.
HTML consists of so-called tags, like the A tag for links, the IMG tag for images, and so on. Since tags are nested in other tags, they form a hierarchy, and that hierarchy can be represented as a graph. I've written a little app that visualizes such a graph, and here are some screenshots of websites that I often look at.
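To make the idea of "tags as a graph" concrete, here is a minimal sketch in Python. It is not the code of the app (which was written in Processing), just one way to turn nested tags into nodes and parent/child edges using the standard library's html.parser. The names TagGraphBuilder, build_tag_graph and VOID_TAGS are made up for this illustration, and the handling of unclosed tags is a simplifying assumption.

```python
from html.parser import HTMLParser

# Tags that never take a closing tag, so they cannot contain children.
VOID_TAGS = {"area", "base", "br", "col", "embed", "hr", "img", "input",
             "link", "meta", "source", "track", "wbr"}


class TagGraphBuilder(HTMLParser):
    """Collects one node per opening tag and one edge per parent/child pair."""

    def __init__(self):
        super().__init__()
        self.nodes = []   # nodes[i] is the (lowercased) tag name of node i
        self.edges = []   # (parent_index, child_index) pairs
        self._stack = []  # indices of the tags that are currently open

    def handle_starttag(self, tag, attrs):
        index = len(self.nodes)
        self.nodes.append(tag)
        if self._stack:
            # The innermost open tag is the parent of the new tag.
            self.edges.append((self._stack[-1], index))
        if tag not in VOID_TAGS:
            self._stack.append(index)

    def handle_endtag(self, tag):
        # Close the nearest open tag with this name; ignore stray end tags.
        for depth in range(len(self._stack) - 1, -1, -1):
            if self.nodes[self._stack[depth]] == tag:
                del self._stack[depth:]
                break


def build_tag_graph(html):
    """Return (nodes, edges) for the tag hierarchy of an HTML string."""
    builder = TagGraphBuilder()
    builder.feed(html)
    builder.close()
    return builder.nodes, builder.edges


if __name__ == "__main__":
    sample = "<html><body><p>Hi <a href='#'>link</a><br></p><img src='x.png'></body></html>"
    nodes, edges = build_tag_graph(sample)
    print(len(nodes), "tags")
    for parent, child in edges:
        print(nodes[parent], "->", nodes[child])
```

The number of nodes corresponds to the tag counts quoted with the screenshots, and the edge list is what a graph layout would draw as lines between the dots.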
I've used some color to indicate the most used tags in the following way:
blue: for links (the A tag)
red: for tables (TABLE, TR and TD tags)
green: for the DIV tag
violet: for images (the IMG tag)
yellow: for forms (FORM, INPUT, TEXTAREA, SELECT and OPTION tags)
orange: for line breaks, paragraphs and blockquotes (BR, P, and BLOCKQUOTE tags)
black: the HTML tag, the root node
gray: all other tags
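The color scheme above boils down to a lookup table with gray as the fallback. A minimal sketch in Python, assuming tag names are compared case-insensitively; TAG_COLORS, color_for_tag and color_histogram are names invented for this example, not taken from the app:

```python
from collections import Counter

# Tag name -> color, following the legend above.
TAG_COLORS = {
    "a": "blue",
    "table": "red", "tr": "red", "td": "red",
    "div": "green",
    "img": "violet",
    "form": "yellow", "input": "yellow", "textarea": "yellow",
    "select": "yellow", "option": "yellow",
    "br": "orange", "p": "orange", "blockquote": "orange",
    "html": "black",
}


def color_for_tag(tag):
    """Return the legend color for a tag name; unknown tags are gray."""
    return TAG_COLORS.get(tag.lower(), "gray")


def color_histogram(tags):
    """Count how many of the given tags fall into each color."""
    return Counter(color_for_tag(tag) for tag in tags)


if __name__ == "__main__":
    tags = ["html", "body", "div", "div", "a", "IMG", "ul", "li"]
    print(color_histogram(tags))
```

A histogram like this gives a rough, numeric version of what the screenshots show at a glance, for example whether a page leans on tables (red) or divs (green).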
Below are a couple of screenshots. I plan to make the app available as an applet, so that anybody can look at their websites in a new way.
Update: Here it is: http://www.aharef.info/static/htmlgraph/
CNN has the complicated but typical tag structure of a portal: lots of links, lots of images. Divs and tables are used to a similar extent for layout purposes. (1316 tags)

boingboing, my favorite blog, has a very simple tag structure: there seems to be one central container that holds all other tags, mostly links (lots!), images, and tags to lay out the text. A typical content-driven website. (1056 tags)

As always, simplicity rules at Apple's website. A few images and links, that's it. Note the large yellow cluster, representing a dropdown menu. (350 tags)

Yahoo seems to be stuck in the old days of HTML style: most of the tags are tables, used for layout, with no divs. Very uncommon these days. (952 tags)

The complete opposite of Yahoo - this site uses almost no tables at all, only divs (green). It's nice to see how the div tags are holding the other elements, like links and images, together. (454 tags)

Surprisingly, at least to me, Microsoft's portal is very much div-driven. Also of note is its very sparse use of images. (633 tags)

Today, Google is everywhere, but if somebody had asked me 5 years ago why I was using Google and wanted a visual answer, here it is (88 tags):

I finish with two of my own projects:
What can I say? I like it ;-) No tables, lots of links, simple structure. A typical Movable Type site, I guess. (372 tags)

My personal art project. Although I programmed the site myself, I'm surprised by the simplicity of its tag structure. It shows that you can make beautiful websites with just a few tags ;-) (88 tags)

That's it. You can play around with the app, and take a fresh look at websites - here's the applet.
And don't forget to support yours truly by checking out onethousandpaintings.com

Comments
verrry cool
Posted by: p-daddy | 26.05.06 00:59
Great work!
Posted by: Douglas | 26.05.06 03:17
Also, I'm wondering if you've thought about making the code open source? Particularly the graphing part, I'd love to try making graphs of other tree structures, particularly programming code.
Posted by: Douglas | 26.05.06 03:19
Really amazing visualizations! It really gives you a great high-level overview of a website.
Posted by: Justin Palmer | 26.05.06 03:45
Amazing work. If not open source, you can still put this app on your website, so that people can at least buy it from you.
Posted by: vinod vv | 26.05.06 07:05
This is the kind of art that I'd be prone to sticking up on my walls. Or even bleeding straight into.
Posted by: Reno | 26.05.06 10:08
@Douglas - Yeah, I am happy to make the code open source - problem is, I'm still trying to get this into an applet. As soon as I'm ready, you can find it here.
I've done the programming in Processing (i.e. Java), and I was using the traer physics library. It's very easy to use for graphing stuff.
Posted by: Sala | 26.05.06 14:11
Very good, really nice!
GOOD!
Posted by: StephenZhai | 26.05.06 14:17
Google does not like divs or what?
Posted by: elvir | 26.05.06 14:50
You probably wanna visualize mine as well? ;)
Nice work.
Posted by: Jens Meiert | 26.05.06 15:05
where is the black dot for the html tag on the google site?
Posted by: tim | 26.05.06 15:49
Tim, have a better look.
Posted by: DXL | 26.05.06 16:08
@tim & DXL: You were right, there was still an old screenshot with another color scheme. Changed now. Thanks!
Posted by: Sala | 26.05.06 16:12
Drop Java and use JavaScript and use SVG. Do that and you'll see your code everywhere on the web.
Posted by: Daniel Glazman | 26.05.06 16:20
This looks really good. Excellent work.
Posted by: jivanov | 26.05.06 16:43
How beautiful. Looking forward to the applet (maybe just release the code if you are comfortable with letting it run free?)
Posted by: Paul Watson | 26.05.06 16:44
Absolutely. I just want to finish that applet, it looks really beautiful when the network unfolds. I will post source code, no problem - I just don't like to put stuff online that doesn't work. But I'm almost there ;-)
Posted by: Sala | 26.05.06 16:53
When I read the title of the article, I thought it would be very dull. Graphs aren't my favorite subject, but this is very interesting and I look forward to the release of the applet!
Posted by: Lawsy | 26.05.06 18:10
Release the code! This is brilliant...open source?
Posted by: Anon | 26.05.06 18:29
This is really twinkle twinkle little star. Good. Can I draw the same? Please let me know how.
Posted by: acmathur | 26.05.06 19:14
Ditto on the JavaScript plus SVG idea. Java applets are a pretty bad idea: they're super slow to start up, lots of people don't have the plugin, and you're going to have to get around the restriction that applets can only access the site they are hosted on. And plus they're so 1990s. ;) Even Flash would be a better idea than Java.
Posted by: sjf | 26.05.06 19:32
I disagree. I'm not very into Flash, and SVG... I've been there (I wrote a book on the topic). One day SVG + JS will rock, but today is not that day.
Posted by: Sala | 26.05.06 19:41
Ok, applet is online: http://www.aharef.info/static/htmlgraph/
Posted by: Sala | 26.05.06 20:07
Brilliant! I'm also very interested in the source if you're up for posting it.
Posted by: Forrest | 26.05.06 20:24
It didn't work with my page :( http://www.aharef.info/static/htmlgraph/?url=http://deejayy.hu/
Posted by: deejayy | 26.05.06 20:34
interesting.
the address http://www.gasztrojob.hu/ does not exist, but this applet draws a very beautiful graph of it :D
Posted by: deejayy | 26.05.06 20:41
@deejayy
1) your site has no html tag. that's needed for the applet to work.
2) the address that does not exist probably causes your browser to display some error message - that's what you see
Cheers,
Sala
Posted by: Sala | 26.05.06 21:33
Sala, this could almost be a dev tool--a supplement to the Mozilla DOM Inspector. It'd be nice to add the ability to roll over a node to find out what it is (element name, classes, id).
Posted by: Justin Watt | 27.05.06 00:16
Yes, roll-over labels would be _very_ useful. For my needs, page title + url-beyond-basename is enough (e.g. just "opinion/" not "www.nytimes.com/opinion/"). But popping up a tiny page image thumbnail would be nice (fetched via home server so not bound by applet security?)
Posted by: bazzers | 27.05.06 00:36
Howdy. Great tool, really cool. I'm curious about something that may be a bug... When I try to model alistapart.com, after a short bit it spirals right out of the window!
Posted by: anon | 27.05.06 00:40
This is incredibly awesome. Congrats on building something this cool.
I do have one little suggestion, though -- what about a color for list tags (ul, dl, ol) and one for list items (li, dd, dt)? My personal website maps out as mostly gray due to my heavy use of lists, and I suspect a lot of others do, as well. :)
Posted by: Jeff Croft | 27.05.06 01:26
Hey, great applet! Some guy on IRC posted me the link and when I saw it was Processing based I got pretty excited.
It's a shame you've not made it 'open source' - I'd have gone right ahead and put rollover labels in, and an internal colour chart, and all that stuff.
Amazing results though. I've been running it on all the subsites and pages of my own website and the different patterns are amazing. It's fun interpreting the results visually - it makes you think about the structure of your site.
I was almost sort of proud when I ran http://mkv25.net/USy/ and got a sea of orange come up on the display.
Every page has its own personality, and the applet picks up on that brilliantly.
Posted by: Markavian | 27.05.06 01:33
Another suggestion: make the nodes 'draggable' so you can 'rearrange' the rotation of the image.
Posted by: Markavian | 27.05.06 01:36
yeah, it'd be great if you could package this and offer it as a download so that it could be used on intranets and such.
Posted by: spydrlink | 27.05.06 01:59
I tried this site I know of from a friend: www.sunshineoasistan.com and it starts to appear but then just completely disappears? What's up with that?
Posted by: spydrlink | 27.05.06 02:01
Websites as graphs for analysis. It's very interesting watching the graphs arrange themselves. It's also interesting comparing different site "structures" with your intriguing tool: few commonalities exist between sites.
Very, very cool!
And, Markavian, hit "refresh" and you will have rotation of the image.
Posted by: Sean Fraser | 27.05.06 02:05
Wonderful! I love visual representations of data and structures - I was even more wowed out by the way the applet animates the graph as it processes. Tried http://www.medialens.org/ and a yellow flower blossomed... would agree that a roll-over of the nodes with some tag info would be useful in making this a practical tool.
Posted by: flashparry | 27.05.06 02:27
Bookmarklet: copy the following into a new favourite/bookmark and place it on your browser's links bar for a quick button to visualise the page you're currently browsing:
javascript:location.href='http://www.aharef.info/static/htmlgraph/?url='+location.href
And for those using yubnub.org:
vdom URL
Posted by: flashparry | 27.05.06 03:27
Pants. That should've been:
javascript:location.href='http://www.aharef.info/static/htmlgraph/?url='+location.href
Posted by: flashparry | 27.05.06 03:28
OK. so the comments have a line length limit :)
The javascript should end in:
+location.href
Posted by: flashparry | 27.05.06 03:30
bazzers: had the same problem with my site, but as it turns out only on the front page. I think it's the Oxford English Dictionary search box there (which I'm going to remove soon anyway). It is that bit that was giving me red dots for tables even though I don't use tables myself, all div/css layout.
The Postcrossing page produces a large red, orange and purple flower :)
Posted by: webchimp | 27.05.06 05:19
Actually, a couple of other pages do it as well. Links, gigs and site map.
Curious, must have a look at the code for those pages to see if there's anything that could be making it go off like that.
Posted by: webchimp | 27.05.06 05:28
This is cool !
Posted by: mcpaige | 27.05.06 07:35
www.bbc.com seems to break the applet - possibly because it's a redirect? - but bbc.co.uk is fine
Posted by: papalaz | 27.05.06 09:25
librarything also seems to break it
Posted by: Anonymous | 27.05.06 09:44
try comparing sites that do similar things - e.g. allmovie and imdb
also wikipedia is sweet
Posted by: papalaz | 27.05.06 09:57
Hey guys - thanks so much for all your feedback. There is indeed some problem with some sites, and I have to look into that. I also think it's a great idea to extend the applet, with rollover and stuff.
I will put the source code online this afternoon (Central European Time). It's great to see all these ideas popping up, starting from such a simple idea - please keep posting!
Thanks
Sala
Posted by: Sala | 27.05.06 11:12
Great work!
Posted by: Cd0MaN | 27.05.06 12:11
Lovely stuff.. works well with my website without problem.
Posted by: Jee | 27.05.06 12:25
is there any way to export the picture so that it can be printed? that would be amazing. thanks.
Posted by: ben | 27.05.06 12:43
Source code is online. And fixed the bug that caused some networks to disappear.
@Ben: Yeah, you can use some Processing libraries to export the picture.
Posted by: Sala | 27.05.06 13:19
I've made this the Geek Toy of the Week for tomorrow over at Deep Thought. It'll show up tomorrow morning at http://www.dtgeeks.com/index.php/features/geektoy/websites_as_graphs .
I agree with one of the comments to add more tags like list tags, with their own colors. But otherwise, this is a very nice toy!
Posted by: Arden | 27.05.06 15:33
The source is up, but what is it in? That's not standard Java -- it's not in a class. What does it run under?
Very nice idea -- I was going to try to reduce its overhead, and skip or speed up the animation, since for large sites it can take a *very* long time to finish.
Posted by: Ken Arnold | 27.05.06 20:23