FAQ 9: Handling Images

Introduction

This FAQ is not meant for all beginners, and its contents may be difficult for some to understand. However, I did try. ;)

One good FAQ on jpegs can be found at:
http://www.cis.ohio-state.edu/hypertext/faq/usenet/jpeg-faq/top.html

Questions and Answers

Before getting into the details: the general consensus is that jpegs are the most reasonable format for storing scanned photographs. Let me put forth some questions and answer them.

What Jpeg is

Q: What is Jpeg?

The JPEG acronym stands for Joint Photographic Experts Group. It is a general term applied to a family of image compression techniques.

The more common forms of jpeg use a lossy compression and are best suited for scanned photographs or realistic images with a large variety of color. The vast majority of color jpegs use 24-bit color and grayscales (black and white) use 8-bit.

Lossy Compression

Q: Jpegs normally use lossy compression. That sucks, right?

Without lossy compression, many images posted on usenet would be several megabytes in size. Experts agree that the human eye is incapable of telling the difference between a 100kb jpeg and a 1 megabyte non-compressed version of the same image.

So, if you happen to be a human being, then the compression used by jpeg will probably not reduce an image's quality enough for you to detect it. If you're not a human being, then this FAQ will probably not teach you anything that you don't already know. ;)

Gifs

Q: Why not use GIFs to store scanned photographs?

GIFs are not often used for scanned images because they lose a lot of color detail. This is because GIFs have a limit of 256 colors, so if an image has more than 256 colors, its colors must be reduced to fit (usually with the help of a process called dithering) before it can be stored as a GIF. This is unfortunate because scanned images almost always exceed the color limit.

Most Jpegs store 16.7 million colors. If done right, no human can tell that an image has been compressed into jpeg format. This is not so with GIFs in many cases, especially if the scanned image is a colorful photograph. An example would be a conversion of a Bluebird scan to GIF format, where this is clearly noticeable.
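
The 16.7 million figure follows directly from 24-bit color: each extra bit doubles the number of colors. Here is a minimal sketch of that arithmetic (the function name color_count is just something I made up for illustration):

```python
def color_count(bits_per_pixel: int) -> int:
    """Number of distinct colors that fit in the given number of bits."""
    return 2 ** bits_per_pixel

# 8-bit (GIF palette size), 16-bit, and 24-bit (typical color jpeg)
for bits in (8, 16, 24):
    print(f"{bits}-bit color: {color_count(bits):,} colors")
```

So 8 bits gives the 256 colors of a GIF, and 24 bits gives 16,777,216, which is where "16.7 million" comes from.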

For these reasons and others, GIFs are primarily for use with simple images with relatively few colors.

Visual differences

Q: I can't see any difference in the quality between GIF and Jpeg versions of the same image. Why is that?

There are several possible reasons.

1) It might be because your computer is old and can't support more than 256 colors. So, even though a jpeg might have millions of colors, your computer will only display 256 of them, making the jpeg look just as bad as a GIF (or worse, since jpegs use lossy compression). The average computer today supports up to 65536 colors, which means even the average computer will not display a jpeg at its full quality. In any case, humans can tell apart colors numbering in the millions, so the average monitor falls a couple of orders of magnitude short of this. Jpegs with billions of colors are rare because most monitors cannot display them, but they do exist for computer analysis (for medical purposes, for example).

2) It might also be because the jpeg image doesn't have more than 256 colors. In such a case, the result would be the same as case 1. One way this might happen is if you take a GIF file and convert it into a Jpeg file. Since the original file was a GIF, it could only have up to 256 colors, and so when you convert it to a jpeg it will also have only up to 256 colors.
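
If you want to check case 2 for yourself, the idea is simply to count the distinct colors in the image. The sketch below assumes you already have the pixels as a list of (red, green, blue) tuples; how you get that list depends on your image program or library, and the two sample "images" here are made up for illustration.

```python
def count_colors(pixels):
    """Count the distinct (r, g, b) colors in a list of pixels."""
    return len(set(pixels))

def fits_in_gif_palette(pixels, limit=256):
    """True if the image could be stored as a GIF without losing colors."""
    return count_colors(pixels) <= limit

# Hypothetical 2x2 image with only three distinct colors
tiny = [(255, 0, 0), (255, 0, 0), (0, 0, 255), (255, 255, 255)]

# Hypothetical smooth gradient with 1000 distinct shades
gradient = [(i % 256, i // 256, 128) for i in range(1000)]

print(count_colors(tiny), fits_in_gif_palette(tiny))
print(count_colors(gradient), fits_in_gif_palette(gradient))
```

An image that passes this check will look the same as a GIF or as a jpeg at full color, which is exactly the situation described in case 2.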

Level of compression

Q: About how much are jpegs compressed?

Jpegs are flexible in that you can compress them just about as much or as little as you want. Experts say that jpegs can have a compression ratio of roughly 10:1 to 20:1 without any quality loss that is perceptible to the human eye. In other words, you can take an image that would normally be 500 kb and turn it into a 50 kb or even 25 kb jpeg without noticing a difference.
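
The 500 kb example is just a division by the compression ratio. A tiny sketch, with a function name of my own choosing:

```python
def compressed_size_kb(original_kb: float, ratio: float) -> float:
    """Size after compressing at ratio:1 (for example ratio=10 means 10:1)."""
    return original_kb / ratio

for ratio in (10, 20):
    print(f"{ratio}:1 -> {compressed_size_kb(500, ratio)} kb")
```

Keep in mind this only tells you the filesize a given ratio implies; it says nothing about how the image will look at that ratio.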

I can attest to this; I have compared images side by side with far more compression, and I begin to see the difference at between 20:1 and 30:1, though the difference at this point is extremely small. Experts suggest a 30:1 to 50:1 compression for general purposes nonetheless, since most people aren't that picky. This is because while you may notice a difference, it is rarely worth the filesize difference.

Level of compression

Q: Suppose we took an average scanned jpeg image and converted it to GIF. What would typically happen to the filesize?

I took a typical jpeg image and converted it to a GIF. It jumped from around 124Kb to 525Kb. When I compared them, I could easily tell which was the Jpeg, since the GIF only had 256 colors. I could also see that her body - whoops I mean the texture - was soft and smooth in the jpeg. However, it was rough and coarse in the GIF. The horror! Naturally I immediately discarded the GIF, since it took up four times the space anyways. By the way, this difference is not as easy to notice if you're dealing with very small images (since they have a smaller variety of colors anyways).

Level of quality

Q: Wait, you mentioned before about 10:1 ratios and stuff, isn't that the same as a rating of 10 'quality'? Shouldn't you instead choose that 100 quality when saving the jpeg?

The "quality" of an image is a rating based on an arbitrary scale. Usually it goes from 0 to 100, but in actuality 100 is far from perfect (it's just the highest that the scale goes). Depending on the program displaying the jpeg, 100 quality could mean almost anything. It is a purely arbitrary number that has little real meaning to it. In other words, one program can define 100 to be something completely different than another program.

For this reason, 100 quality could very well mean a 10:1 compression ratio or maybe a 5:1 compression ratio. It all depends on what the programmer of the image editor decides it should be. As an example, 75 quality on LViewPro is not the same as 75 quality to another program. In addition, the number has no direct correlation with the ratio of compression (80 quality is not necessarily twice as good or twice the filesize as 40 quality).

Level of quality

Q: No way, jpegs aren't compressed at 10:1, 20:1, 30:1, etc...ratios! You're crazy.

Don't take my word for it, test it out yourself.

Keep in mind that normal jpegs use 24-bit color. So, open up a jpeg image yourself. The amount of RAM (aka memory) on your computer that the uncompressed image will take up equals 3 bytes (aka 24 bits) x the dimensions of the image (pixel length x pixel width). In other words, the formula is: filesize (in bytes) = 3 x length x width. An image that is 100 pixels wide by 100 pixels high will take up precisely 30,000 bytes of RAM (note that when you compress this as a jpeg it will probably take much less than 30kb on your hard drive).
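
The formula above is easy to try out. This is a minimal sketch assuming 24-bit color (3 bytes per pixel); the function name is mine.

```python
def uncompressed_bytes(width: int, height: int, bytes_per_pixel: int = 3) -> int:
    """Memory needed to hold the raw pixels of an image."""
    return bytes_per_pixel * width * height

print(uncompressed_bytes(100, 100))  # the 100 x 100 example from the text
```

For an 8-bit grayscale image you would pass bytes_per_pixel=1 instead.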

As an example, I just opened up a Bluebird scan of Madoka Wagure (a 102,267 byte file). Its dimensions are 541 x 768. Thus, the uncompressed image data is exactly 1,246,464 bytes (541 pixels wide x 768 pixels high x 3 bytes per pixel = 1,246,464 bytes).

Don't believe me? Well, I then save the image as a 24-bit bitmap (bmp) and the filesize is 1,247,286 bytes. As you can see, the filesize formula I gave predicted this filesize almost exactly (the filesize is a tiny bit bigger because of the header info placed at the beginning of the file, plus a byte of padding at the end of each row, since a bmp stores each row in a multiple of 4 bytes).
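
The small gap between the formula and the actual bmp can be accounted for. The sketch below assumes the most common kind of 24-bit bmp: a 54-byte header, with each row of pixels padded out to a multiple of 4 bytes. Other bmp variants have larger headers, so treat the 54 as an assumption.

```python
def bmp_file_size(width: int, height: int, header_bytes: int = 54) -> int:
    """Estimated size of an uncompressed 24-bit bmp file."""
    row_bytes = width * 3                    # 3 bytes per pixel
    padded_row = (row_bytes + 3) // 4 * 4    # round up to a multiple of 4
    return header_bytes + padded_row * height

print(bmp_file_size(541, 768))
```

For the 541 x 768 scan, each row is 1,623 bytes padded to 1,624, so the estimate comes out to 1,247,286 bytes, the same as the file I saved.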

In this particular image, the compression was roughly a 12:1 ratio (1,246,464 divided by 102,267 is about a 12:1 ratio).

I also took the same image and saved it as a jpeg at "95 quality" (remember, this number is a purely arbitrary scale and does *not* mean 95%). The filesize increased to 125,974 bytes. So, in this particular image editor, a 95 quality jpeg means a compression ratio of roughly 10:1. In other words, 1/10 of the image information was saved into the jpeg file, while 9/10 of the information was thrown away (which is where the term "lossy compression" comes from).
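
Here are the two ratios worked out from the numbers above. This is only a sketch of the arithmetic; the function name is made up.

```python
def compression_ratio(uncompressed_bytes: int, file_bytes: int) -> float:
    """How many times smaller the file is than the raw pixel data."""
    return uncompressed_bytes / file_bytes

raw = 541 * 768 * 3  # 1,246,464 bytes of raw pixel data

original = compression_ratio(raw, 102267)   # the scan as I received it
resaved = compression_ratio(raw, 125974)    # saved again at "95 quality"

print(f"original: about {original:.1f}:1")
print(f"95 quality: about {resaved:.1f}:1")
print(f"fraction kept at 95 quality: about {1 / resaved:.2f}")
```

The results come out to about 12.2:1 and 9.9:1, which is where the "roughly 12:1" and "roughly 10:1" figures come from.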

The image still looks about the same even after throwing out the 9/10 because when your computer compresses, it chooses the 1/10 of the information to be saved carefully so that it closely represents the 9/10 that was thrown away.

Level of quality

Q: I heard about a form of jpeg that uses lossless compression.

There are two methods that are out there, but they are both impractical as they offer very little compression and aren't well supported.

Level of quality

Q: I'm still not convinced about jpegs!

Well, I just did a test with a 237kb image that had many colors. I compressed it at quality 75 using LViewPro and it became a 96kb image. I put one image on top of the other so that I could switch back and forth between them and see the subtle differences. I could not tell for sure which was the better image. In fact, if I didn't place them on top of each other as I described, I would not have even been able to tell that they were different at all.

Now if my webpage didn't take up the space that it did, I'd actually post the example up for you to see. So if you still don't believe me, just keep in mind I'll be too busy enjoying Vivian to care. ;)

I took that same image and converted it to a GIF. The resulting 231kb gif file looked far worse than the 96kb jpeg original file. You can try this out yourself with any sufficiently large image.

The image I worked with was a typical image. I will explain exceptions later.

Large is not always high quality

Q: If that's so, then how come I sometimes see an awful-looking jpeg even though it's 500kb in size?

Some goofball messed around with it. I'll explain later in this FAQ.

As an example you might understand, imagine that your TV antenna is damaged and your TV displays a lot of static. If you use your VCR to record a program while using the damaged antenna, then no matter how high quality your VCR and the recording is, the static will still be recorded. Even if the VCR makes a perfect recording, it will merely be a perfect recording of a static-filled program.

Large is not always quality

Q: How come this jpeg looked so good but when I open it and then save it as a new jpeg I see jagged edges instead of smooth curves, even though I saved it at a higher quality than it started as?

I'll talk about this later too...

As an analogy, if you watch a black and white movie on a color TV, it's still going to be black and white right? Using a high quality color television won't magically turn that black and white movie into color. ;)

If you didn't understand the last paragraph, read it again very carefully. The point is that once information is lost, it cannot be regained through conventional methods. In the example, the information lost was color.

For this reason, resaving a jpeg will typically damage it even more, even if you save it at a higher quality than you loaded it with.
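
To see why a higher quality setting can't bring back lost information, here is a toy model. It is NOT how jpeg actually works; it just rounds numbers to the nearest multiple of a step size, where a big step stands in for low quality and a small step stands in for high quality. All the names and sample values are made up for illustration.

```python
def quantize(values, step):
    """Round each value to the nearest multiple of step (a stand-in for a lossy save)."""
    return [(v + step // 2) // step * step for v in values]

def total_error(original, approx):
    """Sum of absolute differences from the original values."""
    return sum(abs(a - b) for a, b in zip(original, approx))

original = [100, 37, 201, 150]          # pretend these are pixel brightnesses

low_quality = quantize(original, 32)    # saved once at "low quality"
resaved = quantize(low_quality, 5)      # then resaved at "high quality"
direct = quantize(original, 5)          # original saved at "high quality"

print("low quality error:", total_error(original, low_quality))
print("resaved error:    ", total_error(original, resaved))
print("direct error:     ", total_error(original, direct))
```

In this toy example the resaved copy ends up further from the original than the low quality copy was, while saving the original directly at the same "high quality" is nearly perfect. The damage from the first save is carried along, and the second save adds a little of its own.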

As another example of this, consider a photocopy of some important documents. Suppose you used a photocopy machine to copy the original photocopy. The new photocopies would typically be of worse quality than the original, because any imperfections in the original photocopy would be copied over to the newer copies.

BMP (bitmaps)

What it is
Bmps are a very ancient form of image storage, but they are the most intuitive and straightforward format. They are very similar to the .PCX format.

Its pros/cons
The format is inefficient in terms of image filesize, but its main advantage is a result of its simplicity. Since it is simple, a very slow computer can still display a .bmp image quickly, while the same computer would have more trouble displaying a .jpg image for example. Bmps are good for small images where filesize is negligible or for images used in general applications. Bmps are not supported by standard web browsers.

How it works
Each pixel is assigned a color (typically up to 24-bit accuracy). This would mean that every 3 bytes = 1 pixel. Pretty simple.

Its applications
Bmps have a large range of applications, few of which have any relevance to high quality scanned images. This is because a scanned image usually demands a high depth of color while still taking up a reasonable amount of space. Since scanned images demand a filesize range that only jpeg provides, bmps are not often used for that purpose.

GIF

What it is
GIF is a good format for images with few colors, particularly artificially created images. It is common and supported by all standard web browsers.

Its pros/cons
Gifs store images using indexed color. Indexing is an efficient and quick form of compression. However, if an image contains more than 256 colors, all excess colors will be eliminated (through a dithering process). Consequently, Gifs are extremely well suited for artificially generated images but cannot store photographs with accurate color. In all normal applications Gifs are significantly smaller than bmps. In addition, a conversion to a Gif will retain full image quality assuming the image is not over 256 colors. Most of the time, the increase in filesize from converting a GIF to bmp will range from 50% to 200% (various factors are involved).

How it works
GIFs store images as an index of colors using a palette of up to 256 colors. This is different because instead of each pixel being assigned an individual color, each pixel is instead given an index number, which refers to a list of up to 256 possible colors. Note that there are a lot of colors available to choose from; it's just that you can only use up to 256 of them in any one image. This way a pixel takes one byte (256 possibilities) instead of 3 bytes (16.7 million colors). It is possible, however, to reduce a GIF to far fewer colors than 256 to compress an image further.
As an analogy, with a GIF image you are allowed to paint any image using up to 256 different buckets of paint (these colors may be chosen out of a selection of 16.7 million colors, since each palette entry is itself a 24-bit color).
Unfortunately, photographs are almost never limited to 256 colors. Even a picture of a flag might have more than 256 detectable colors, since there might be 50 shades of red found in the photograph just because of a wrinkle. A tiny scan of a photograph, on the other hand, might very well have less than 256 colors (taken to extremes, it might even have less than 256 pixels!). A computer generated flag, on the other hand, will probably have very few colors, as there would likely be only one shade of red, for example.
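
The palette idea described above can be sketched in a few lines. This is only an illustration of indexed color, not a GIF encoder (real GIFs also compress the index data, which I ignore here), and the function names and the little "flag" image are made up.

```python
def to_indexed(pixels, max_colors=256):
    """Split a list of (r, g, b) pixels into a palette and a list of palette indices."""
    palette = []   # the "buckets of paint"
    lookup = {}    # color -> position in the palette
    indices = []   # one small number per pixel
    for color in pixels:
        if color not in lookup:
            if len(palette) == max_colors:
                raise ValueError("too many colors for the palette")
            lookup[color] = len(palette)
            palette.append(color)
        indices.append(lookup[color])
    return palette, indices

def from_indexed(palette, indices):
    """Rebuild the original pixels from the palette and the indices."""
    return [palette[i] for i in indices]

# Hypothetical computer generated flag: 12 pixels, 3 flat colors
flag = [(255, 0, 0)] * 4 + [(255, 255, 255)] * 4 + [(0, 0, 255)] * 4

palette, indices = to_indexed(flag)
print("palette:", palette)
print("indices:", indices)
print("raw bytes:", 3 * len(flag))
print("indexed bytes:", len(indices) + 3 * len(palette))
```

With only three colors the picture comes back exactly as it went in, and each pixel costs one byte instead of three. A photograph with thousands of shades would hit the ValueError instead, which is the point where a real program has to start throwing colors away.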

Its applications
GIFs are great and extremely efficient for artificial images. However, if you switch your computer to 256 color mode, you might see what happens when there are too many colors. This is why scanned images would be "damaged" through conversion to a GIF. GIFs are very useful for simple cartoons, images with few distinct colors, and artificial images. If your computer only supports 256 color mode, look forward to your next computer.

Jpeg

What it is
Jpeg is the most common format for storing photographs. If done correctly, the loss may be negligible to the human eye.

Its pros/cons
Jpegs use 24 bit colored pixels (3 bytes, 16.7 million colors, which is beyond the capability of many monitors), thus giving very accurate color depth, near the limit of human ability to detect it. Also, jpegs use a form of lossy compression, which can dramatically reduce the filesize of an image. However, not all images demand such accuracy, and some may be better compressed using the indexed compression GIFs use. Thus, Jpegs are best suited for photographs and high color images where detail in color is essential. Given a nice monitor resolution, an image can appear very realistic. ;)
Jpegs also have a grayscale format. I think it's typically 8 bit. 12 bit is for medical purposes when your life is at stake. ;)

How it works
Jpegs have an arbitrary quality scale up to 100, with 100 being closest to perfect (on most scales). On Lviewpro, a 95 quality jpeg means roughly a 12:1 compression ratio. So, this scale is arbitrary and varies from program to program. Never use anything near 100 to save a jpeg unless you want to be a goofball. There is more reason for this than the obvious one. Loosely speaking, the closer the quality is to 100, the more of the image's information is stored in the jpeg file. The idea is that you only have to save a fraction of the information during compression, and during decompression, the computer reconstructs an approximation of what is missing (a simple way to picture it is guessing at each missing pixel based on its neighbors, though the real method works on small blocks of the image rather than single pixels). So, if you open a jpeg (let's say at 75 quality) and then save it again (as 75 quality), it won't be as good as it was before (though most people would probably never notice).

Lossy Compression

Understanding lossy compression
If you still don't understand, try this paragraph. As an analogy, compressing an image into jpeg format is like taking a map of all the countries of the world and recording only the location of each capital. Decompressing the jpeg for viewing is like taking those capitals and then trying to figure out the shapes of the original countries by only using that information. The higher the quality of the jpeg, the higher the density of capitals. At 100 quality, imagine each country being the size of its capital so that compression loses little information.

Beware of lossy compression
If you take a 100 quality jpeg and compress it to 80 quality with LViewpro, I'd say that no one would be able to tell the difference even if the images were side by side, given decent compression techniques (many can't even tell between 100 and 75). I'm talking about scanned photographs under typical conditions, of course (I'll talk about exceptions later). Also, your computer should be decent and set up properly (at least a 1024 x 768 16 bit display with ACDSee using optimal settings). In addition, Bluebird scans, considered by many to be the best scans of their kind, are never at 100 quality, and you can bet Bluebird knows what he's doing.
Once a friend sent me three 500kb+ Bluebird images and asked me why they all looked messed up. They each had an incredible amount of static in them and would have probably looked no different had they been 20kb or 30kb files. I had the originals to those pictures, so I sent him back the original 100kb+ files which looked flawless. The point is that a few people think that if a picture is damaged, they can somehow fix it by simply increasing the jpeg quality. However, it works no differently than trying to take a black and white movie and saving it in color. It will still remain black and white. In fact, trying to increase the quality of a damaged jpeg this way will only damage it more. Yes, on most programs, taking a 30kb jpeg and saving it as a 500kb jpeg will damage it even more (unless 500kb is the full 1:1 ratio).
Lastly, because of these people, many programs don't even allow anything close to perfect quality (once again, 100 is completely arbitrary and on most scales does not mean perfect). Lviewpro has a max of 95 (which is roughly a 12:1 compression ratio). Generally speaking 75 quality is defined to be the point where an average human can start seeing the differences. So to be safe, 80 or 90 is best (Bluebirds run around 90).
Once again, this is LviewPro's scale and is not the same as the scale another program might use.

Quality Exceptions
Still, there is occasionally a visual difference between 95 on Lviewpro and anything less. Since jpegs compress by relying on guessing missing pixels using adjacent ones, unnaturally straight or hand-drawn lines will often be distorted through compression. However, this is almost always because the image was not scanned, not scanned properly, or was modified after scanning.

Other Notes

Rough Lines
What sometimes happens if the scan has been modified or your display is not optimal is that "curves" look unnatural (you can clearly see the sharp corners of pixels). Normally on a curve the color of the pixels should make it natural and blend in, but either because the scan wasn't good or your display is not good enough, this might not always happen. Often this is also caused by the sampling method. However, I have yet to see an unmodified and professionally scanned image produce this effect on the display setup that I describe.


Preserving Authenticity
With that being said, please don't try to edit images with the intentions of making them "better". It's another thing if you're simply trying to reduce that 300kb image for personal use (like for a quicker slideshow), but keep in mind that if that is the original image (with original filename), it might be more valuable to you in its original form. If you do have good reason to compress an image (such as it isn't the original), be very careful to do it right. I suggest saving the compressed image as a second file so you can undo a bad compression (I advise against relying on the Undo command unless you're careful, because you can't see the effect unless you reload/refresh the image file). If a filesize drops dramatically after you save it as a 75 quality jpg, then it might not be a bad idea (particularly if you don't really value the image). If the filesize drop isn't significant, don't. It will unnecessarily damage the image, and it might not save you a single byte of your hard drive's space (see below on file allocation).
One case study about editing originals is one person's attempt at joining two images. The person had two separate scans of the same image of Laika (same picture, scanned by two different people). In an attempt to reduce the static and color loss in the two images, he joined them, and the end result was a picture with better color depth and a lot less static. Unfortunately, this third image lost something very valuable, that being Laika's beauty marks! The point is that while an edited image might on occasion seem better after being altered, it has through the edit lost its authenticity (and thus its collectability). Of course, for personal use you are encouraged to do as you please so long as you keep this caveat in mind.

Viewing jpegs
Note that just because a jpeg is in millions of colors (16.7 million) doesn't mean you'll see millions of colors. If your computer or display does not support millions of colors, the image might have to be dithered before you view it. On really old computers (which typically use indexed coloring), this can get pretty bad (I've seen one case where the image would be nearly in black and white because the computer had to use the other 10 colors for Netscape. Heh heh, I bet five of them went just to the Netscape logo!). Yes, some people still have computers with 256 or 16 color mode. This is particularly clear if you're in Windows (some colors are reserved for use by Windows). If this is the case, look forward to your next computer.

File Allocation
Note: On a typical IBM PC, a 32kb file takes up just as much space as a file of only a few bytes. That's because of the way your computer allocates space. When you view the directory using Explorer, you'll see that 10 1kb files take up 10kb as reported by Windows Explorer. Explorer is lying. After all, it's made by Microsoft. Most computers will instead use up 320kb for those 10 files. One clear demonstration is when I had about 1000 files, each one roughly 50-60 bytes (yes, bytes, now don't ask me how this actually happened). My Explorer claimed that the 1000 files took up 80kb in all. I deleted them completely (they didn't go to the recycle bin), and my free space increased by 30+ megabytes. Most often, space is allocated in multiples of 32kb (though this is not always the case).
The moral is that trying to compress a 30kb file into a 20kb file isn't going to do much good unless your computer is so slow that the reduction would actually speed up the decompression process significantly. Of course, when you have a copy of a file for a slideshow, you might want to compress that 400kb file into something more like 100kb. ;)
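
The numbers above can be checked with a little arithmetic. This sketch assumes space is handed out in whole 32kb chunks, as described; the chunk size on your own drive may well be different, and the function name is mine.

```python
def allocated_bytes(file_size: int, chunk_size: int = 32 * 1024) -> int:
    """Disk space actually used by a file when space comes in whole chunks."""
    chunks = -(-file_size // chunk_size)   # round up to a whole number of chunks
    return chunks * chunk_size

ten_small = 10 * allocated_bytes(1024)     # ten 1kb files
thousand_tiny = 1000 * allocated_bytes(60) # a thousand 60-byte files

print(ten_small // 1024, "kb")
print(thousand_tiny / (1024 * 1024), "megabytes")

# Shrinking a 30kb jpeg to 20kb saves nothing on disk in this model
print(allocated_bytes(30 * 1024) == allocated_bytes(20 * 1024))
```

Ten 1kb files come out to 320kb, and a thousand tiny files come out to a little over 31 megabytes, which matches the 30+ megabytes I got back. Shrinking a 400kb file to 100kb, on the other hand, frees up real space.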

-Ramses

I write these FAQs in hopes that they will benefit and educate you, so as always, feel free to correct me and add your own suggestions. You may remain anonymous or allow me to credit you with the suggestion (I will assume the former but definitely feel free to volunteer the use of your nick in the credits of the FAQ).
