Mon, May. 3rd, 2004, 05:27 pm
Stream of thought exercise, incomplete

This weekend I came back from a brief vacation and saw in my RSS feed a link to CSS Pencils v3.1. (It apparently doesn't display correctly in all browsers. Safari works just fine, and the new Opera Beta supposedly works as well and is even faster.)

That page shows a way to represent images using only DIV tags in HTML along with CSS applied to those tags. I had thought of doing something similar back when I was first looking at the Perl CPAN module that takes in an image and outputs colored text to represent it (ThousandWords).
That code currently outputs font tags (or did the last time I glanced over it), so it should be relatively trivial to output div tags instead. This would give an overall effect similar to a Chuck Close painting or perhaps a photomosaic... without the photos.

The CSS Pencils page does far more - it allows you to treat the fake image as a real image, manipulating various values in "real time" (slowly since it is all done in JavaScript and DOM manipulation as CSS values are changed).

On there, he states that he compressed the image by using the "border" property of the div - which made me smack my forehead in a "why didn't I think of that" sort of moment.
That way instead of having to put down a div for every pixel, you can have a div represent as many as 3 pixels (well, technically you can have it represent as many as 5, but if you do the math, it works out to be more universally efficient to use 3).
This comes from using the background color of the div for the center pixel, and then the border-left and border-right properties in the CSS for the pixels on either side. Make the div one pixel wide and each border one pixel wide as well. That is how you get one DIV to represent 3 pixels.
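As a rough sketch of that three-pixel trick, here is a small Python helper that writes the CSS rule and the matching DIV for one group of three pixels. The function names and the "c134" class name are made up for illustration (the letter prefix is an assumption so that the selector is valid CSS), and this is not how the CSS Pencils page itself generates its markup.

```python
def triple_to_css(class_name, left, center, right):
    """Build one CSS rule that paints three horizontal pixels with one DIV.

    The background supplies the center pixel and the two 1px borders
    supply the pixels on either side.
    """
    return (
        f".{class_name} {{\n"
        f"\tbackground: {center};\n"
        f"\tborder-left: 1px solid {left};\n"
        f"\tborder-right: 1px solid {right};\n"
        f"}}"
    )


def triple_to_div(class_name):
    """Build the DIV that references the rule above."""
    return f'<div class="{class_name}">&nbsp;</div>'


# Hypothetical class name; colors are just sample values.
print(triple_to_css("c134", "#FF00FF", "#FFFFFF", "#FFCC99"))
print(triple_to_div("c134"))
```

The shared rule that sets every DIV to 1px by 1px is left out here, since it only has to be written once for the whole page.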

So my question is: can you get any size advantage from using this instead of putting an image on the page?
My immediate reaction is "no" since we obviously don't do this now, so it is very likely that someone else has already done this thought exercise.

But there are perhaps instances where you want to display an image that you don't want people to readily be able to save. There are plenty of sites that disable right-clicking via JavaScript, but you can just view the source via the menu, find the image URL in there, and load that directly into your browser. Perhaps they could use this method, which would prevent that - but it wouldn't prevent someone from taking a screenshot and cropping it to the proper size. Not to mention that this method is essentially like an inefficient PNG/GIF. Those get huge in file size the more intricate an image is, whereas this method gets larger and more inefficient simply whenever the image dimensions get larger.

All of that aside, perhaps the best use for this would be for those authorization pages on websites. Normally they have an image of text that is obscured in some way and that is then parsed by the human eye and then the text (or numbers) is entered into a form and submitted.
There are exploits that get around this readily via bots that look in the source for an image in a specific spot on the page, download that image via its URL, run analysis on it, and then submit the form with the text they found.
This way, bots can get past systems that are trying to let only human users through.
(this again doesn't take into account a bot that is actually taking screenshots and then reading those - but that is far more complicated than any bots that I know of that are used against these things)

With the DIV method you could then have it create what appears to be an image on screen, but there is no actual image for it to be downloaded by any bot. That way the human sees no difference at all, but the bot is then crippled.

I might try to write up some code later that shows a working example of this - but I would think that many people could see this fairly readily if they are familiar with what is needed.

So what sort of size difference are we talking about between images and this text only version?

Say we have an image that is 99 X 99 pixels. (I chose that because it makes better/easier use of the 3 pixel technique that we have already mentioned above)
That is a total of 9,801 pixels that we will have to represent. Divided by 3 we get 3,267 DIVs that we will have to write to the screen.
Assuming that our DIV will look something like this in the code:
<div class="9999">&nbsp;</div>

Then we know that we will have 30 bytes per DIV, and then 3,267 DIVs - so multiply those out and we get 98,010 bytes - so very roughly 100KB.
If we then upped the image size to 300 X 300 pixels, we get 90,000 total pixels, and dividing that by 3 gives us 30,000 DIVs. Representative code looking like:
<div class="99999">&nbsp;</div>

and then that times 31 gives us 930,000 bytes, which we can say is roughly 930 KB - close to 1MB.
So to go up 3 times in dimensional size, we go up roughly 10 times in size on disk. That was just a stupid comment made at the end of the day - it still goes up proportionately for the most part. The "3 times" is only on one side (it goes from roughly 100 pixels on one face to 300 pixels), and that discounts total area: tripling each side gives 9 times the pixels, so a bit over 9 times the bytes once the longer class name is counted... stupid me - I blame work.
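The arithmetic above can be checked with a few lines of Python. This is only a sketch of the same worst-case estimate: it assumes every DIV looks like the sample markup, that every class name is as long as the total DIV count written out in digits, and that the pixel count divides evenly by 3. The function name is made up for illustration.

```python
def div_markup_bytes(width, height, pixels_per_div=3):
    """Estimate the HTML size of a DIV "image", ignoring the CSS file."""
    div_count = (width * height) // pixels_per_div
    # Worst case: every class name has as many digits as the DIV count.
    class_name_length = len(str(div_count))
    # Everything in the tag except the class name itself (26 bytes).
    fixed_bytes = len('<div class="">&nbsp;</div>')
    return div_count * (fixed_bytes + class_name_length)


small = div_markup_bytes(99, 99)
large = div_markup_bytes(300, 300)
print(small)          # 98010
print(large)          # 930000
print(large / small)  # a little under 9.5
```

So the markup grows with the area of the image (times a small extra for the longer class names), not with the length of one side.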

That doesn't at all account for the CSS file that would need to be generated on the fly as well that would look something like:
div {
	height: 1px;
	width: 1px;
}
.134 {
	background: #FFFFFF;
	border-right: 1px solid #FFCC99;
	border-left: 1px solid #FF00FF;
}

Here you can see that we actually get some variable size in that not all class names are going to be as long as each other. (A CSS class selector can't legally start with a digit unless it is escaped, so each name would need a letter in front of it, something like "c134", which adds a byte to every DIV.)
You will only need the DIV definition the one time, since a plain "div" selector matches every DIV on the page - although I'm not 100% sure that every browser will honor the height/width like that.

We could make it an "id" reference instead of a "class" reference and save 3 characters/bytes on every DIV that we use, except that this forces us to have a CSS entry for every single DIV. With a class, you only need a CSS entry for each unique 3 pixel run - if you use the same color combination twice, both DIVs can reference the same class for their colors.
I am unsure how to calculate how much that could save us without just running some tests to see.
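One way to run that kind of test is to count the unique color triples in an image. The sketch below assumes the image has already been read into rows of color strings (how you load it is left out), that each row's length is a multiple of 3, and that classes are named "c0", "c1", and so on; all of those names are hypothetical.

```python
def build_classes(rows):
    """Assign one class per unique (left, center, right) color triple.

    Returns the triple-to-class mapping and the class used by each DIV,
    in the order the DIVs would be written to the page.
    """
    classes = {}
    div_classes = []
    for row in rows:
        for start in range(0, len(row), 3):
            triple = tuple(row[start:start + 3])
            # Reuse the existing class if this combination was seen before.
            name = classes.setdefault(triple, f"c{len(classes)}")
            div_classes.append(name)
    return classes, div_classes


# A tiny made-up 6 x 2 "image" for illustration.
rows = [
    ["#FFF", "#000", "#FFF", "#FFF", "#000", "#FFF"],
    ["#000", "#000", "#000", "#FFF", "#000", "#FFF"],
]
classes, div_classes = build_classes(rows)
print(len(div_classes), "DIVs share", len(classes), "CSS rules")
```

With an "id" per DIV there would be as many CSS rules as DIVs; with classes the rule count is the number of unique triples, which depends entirely on the image.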

You should be able to improve the speed a bit by gzipping the CSS sheet, and probably the page itself as well, if your server and your visitors' browsers support it (although I think that might screw with the caching that browsers do, which is something to keep in mind if you are serving something dynamic - like those text tricks that I mentioned).
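To get a feel for how much gzip might help, the markup can be compressed locally with Python's standard gzip module before involving a web server at all. This is only a sketch: the generator below is hypothetical and cycles through 50 class names in a fixed order, which is far more regular than a real image would be, so the savings it reports are likely optimistic.

```python
import gzip


def build_markup(div_count, distinct_classes):
    """Generate stand-in DIV markup that cycles through class names."""
    return "".join(
        f'<div class="c{i % distinct_classes}">&nbsp;</div>'
        for i in range(div_count)
    ).encode("ascii")


# 3,267 DIVs matches the 99 X 99 example; 50 classes is an arbitrary guess.
raw = build_markup(3267, 50)
compressed = gzip.compress(raw)
print(len(raw), "bytes raw")
print(len(compressed), "bytes gzipped")
```

Running the same comparison against markup generated from a real image would give a more honest number.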

I am going to try to write up an example tonight - I haven't seen anyone else try this before (using the DIVs to bypass character recognition). Not sure if it is a novel idea on my part or not.